[PATCH] qemu-keymap.c: Fix bad printf format specifiers

2020-11-10 Thread Alex Chen
We should use printf format specifier "%u" instead of "%d" for
argument of type "unsigned int".

Reported-by: Euler Robot 
Signed-off-by: Alex Chen 
---
 qemu-keymap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-keymap.c b/qemu-keymap.c
index 536e8f2385..6797006dda 100644
--- a/qemu-keymap.c
+++ b/qemu-keymap.c
@@ -77,11 +77,11 @@ static void walk_map(struct xkb_keymap *map, xkb_keycode_t code, void *data)
 xkb_state_update_mask(state,  0, 0, 0,  0, 0, 0);
 kbase = xkb_state_key_get_one_sym(state, code);
 xkb_keysym_get_name(kbase, name, sizeof(name));
-fprintf(outfile, "# evdev %d (0x%x): no evdev -> QKeyCode mapping"
+fprintf(outfile, "# evdev %u (0x%x): no evdev -> QKeyCode mapping"
 " (xkb keysym %s)\n", evdev, evdev, name);
 return;
 }
-fprintf(outfile, "# evdev %d (0x%x), QKeyCode \"%s\", number 0x%x\n",
+fprintf(outfile, "# evdev %u (0x%x), QKeyCode \"%s\", number 0x%x\n",
 evdev, evdev,
 QKeyCode_str(qcode),
 qcode_to_number(qcode));
-- 
2.19.1
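
For illustration only, a minimal sketch of the formatting the patch moves to. The helper name format_evdev is hypothetical and not part of QEMU; it just mirrors the "%u"/"%x" pair used in the patched fprintf() calls. With a value above INT_MAX, "%d" is a type mismatch for an unsigned int and commonly prints a negative number, whereas "%u" prints the value as intended.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not QEMU code): format an unsigned evdev code
 * using "%u" for the decimal form and "%x" for the hex form, as the
 * patched fprintf() calls do.
 */
static int format_evdev(char *buf, size_t len, unsigned int evdev)
{
    return snprintf(buf, len, "# evdev %u (0x%x)", evdev, evdev);
}
```

Calling format_evdev(buf, sizeof(buf), 30u) fills buf with "# evdev 30 (0x1e)".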




Re: [PATCH v3 04/18] migration/rdma: add multifd_setup_ops for rdma

2020-11-10 Thread Zheng Chuan



On 2020/11/10 20:30, Dr. David Alan Gilbert wrote:
> * Chuan Zheng (zhengch...@huawei.com) wrote:
>> Signed-off-by: Chuan Zheng 
>> ---
>>  migration/multifd.c |  6 
>>  migration/multifd.h |  4 +++
>>  migration/rdma.c| 82 
>> +
>>  3 files changed, 92 insertions(+)
>>
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index 1f82307..0d494df 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -1210,6 +1210,12 @@ MultiFDSetup *multifd_setup_ops_init(void)
>>  {
>>  MultiFDSetup *ops;
>>  
>> +#ifdef CONFIG_RDMA
>> +if (migrate_use_rdma()) {
>> +ops = multifd_rdma_setup();
>> +return ops;
>> +}
>> +#endif
>>  ops = &multifd_socket_ops;
>>  return ops;
>>  }
>> diff --git a/migration/multifd.h b/migration/multifd.h
>> index 446315b..62a0b2a 100644
>> --- a/migration/multifd.h
>> +++ b/migration/multifd.h
>> @@ -173,6 +173,10 @@ typedef struct {
>>  void (*recv_channel_setup)(QIOChannel *ioc, MultiFDRecvParams *p);
>>  } MultiFDSetup;
>>  
>> +#ifdef CONFIG_RDMA
>> +MultiFDSetup *multifd_rdma_setup(void);
>> +#endif
>> +MultiFDSetup *multifd_setup_ops_init(void);
>>  void multifd_register_ops(int method, MultiFDMethods *ops);
>>  
>>  #endif
>> diff --git a/migration/rdma.c b/migration/rdma.c
>> index 0340841..ad4e4ba 100644
>> --- a/migration/rdma.c
>> +++ b/migration/rdma.c
>> @@ -19,6 +19,7 @@
>>  #include "qemu/cutils.h"
>>  #include "rdma.h"
>>  #include "migration.h"
>> +#include "multifd.h"
>>  #include "qemu-file.h"
>>  #include "ram.h"
>>  #include "qemu-file-channel.h"
>> @@ -4138,3 +4139,84 @@ err:
>>  g_free(rdma);
>>  g_free(rdma_return_path);
>>  }
>> +
>> +static void *multifd_rdma_send_thread(void *opaque)
>> +{
>> +MultiFDSendParams *p = opaque;
>> +
>> +while (true) {
>> +qemu_mutex_lock(&p->mutex);
>> +if (p->quit) {
>> +qemu_mutex_unlock(&p->mutex);
>> +break;
>> +}
>> +qemu_mutex_unlock(&p->mutex);
>> +qemu_sem_wait(&p->sem);
>> +}
>> +
>> +qemu_mutex_lock(&p->mutex);
>> +p->running = false;
>> +qemu_mutex_unlock(&p->mutex);
>> +
>> +return NULL;
>> +}
> 
> You might like to consider using WITH_QEMU_LOCK_GUARD, I think that
> would become:
> 
>   while (true) {
>       WITH_QEMU_LOCK_GUARD(&p->mutex) {
>           if (p->quit) {
>               break;
>           }
>       }
>       qemu_sem_wait(&p->sem);
>   }
>   WITH_QEMU_LOCK_GUARD(&p->mutex) {
>       p->running = false;
>   }
> 
OK. This also reminds me that we still call qemu_mutex_lock(&p->mutex) directly
in our multifd code; should that be optimized as well?
>> +
>> +static void multifd_rdma_send_channel_setup(MultiFDSendParams *p)
>> +{
>> +Error *local_err = NULL;
>> +
>> +if (p->quit) {
>> +error_setg(&local_err, "multifd: send id %d already quit", p->id);
>> +return ;
>> +}
>> +p->running = true;
>> +
>> +qemu_thread_create(&p->thread, p->name, multifd_rdma_send_thread, p,
>> +   QEMU_THREAD_JOINABLE);
>> +}
>> +
>> +static void *multifd_rdma_recv_thread(void *opaque)
>> +{
>> +MultiFDRecvParams *p = opaque;
>> +
>> +while (true) {
>> +qemu_mutex_lock(&p->mutex);
>> +if (p->quit) {
>> +qemu_mutex_unlock(&p->mutex);
>> +break;
>> +}
>> +qemu_mutex_unlock(&p->mutex);
>> +qemu_sem_wait(&p->sem_sync);
>> +}
>> +
>> +qemu_mutex_lock(&p->mutex);
>> +p->running = false;
>> +qemu_mutex_unlock(&p->mutex);
>> +
>> +return NULL;
>> +}
>> +
>> +static void multifd_rdma_recv_channel_setup(QIOChannel *ioc,
>> +MultiFDRecvParams *p)
>> +{
>> +QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>> +
>> +p->file = rioc->file;
>> +return;
>> +}
>> +
>> +static MultiFDSetup multifd_rdma_ops = {
>> +.send_thread_setup = multifd_rdma_send_thread,
>> +.recv_thread_setup = multifd_rdma_recv_thread,
>> +.send_channel_setup = multifd_rdma_send_channel_setup,
>> +.recv_channel_setup = multifd_rdma_recv_channel_setup
>> +};
>> +
>> +MultiFDSetup *multifd_rdma_setup(void)
>> +{
>> +MultiFDSetup *rdma_ops;
>> +
>> +rdma_ops = &multifd_rdma_ops;
>> +
>> +return rdma_ops;
> 
> Why bother making this a function - just export multifd_rdma_ops ?
> 
> Dave
> 
OK, I will consider doing it that way.

>> +}
>> -- 
>> 1.8.3.1
>>

-- 
Regards.
Chuan
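
The ops-table selection discussed in this thread can be sketched in isolation as follows. This is a simplified illustration, not the patch itself: SetupOps, socket_ops, rdma_ops and setup_ops_init are hypothetical stand-ins for MultiFDSetup and its instances, reduced to a single callback. It also shows the alternative raised in review, where the tables are plain objects and no per-transport accessor function is needed.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for MultiFDSetup: one callback only. */
typedef struct {
    const char *(*transport_name)(void);
} SetupOps;

static const char *socket_name(void) { return "socket"; }
static const char *rdma_name(void)   { return "rdma"; }

/* Each transport provides one table of function pointers. */
static const SetupOps socket_ops = { .transport_name = socket_name };
static const SetupOps rdma_ops   = { .transport_name = rdma_name };

/*
 * Pick the table once at setup time; callers then dispatch through
 * the returned pointer without knowing which transport is in use.
 */
static const SetupOps *setup_ops_init(bool use_rdma)
{
    return use_rdma ? &rdma_ops : &socket_ops;
}
```

A caller would hold the result of setup_ops_init() and invoke ops->transport_name() (or, in the real code, the thread and channel setup callbacks) through it.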



Re: [PATCH v3 03/18] migration/rdma: create multifd_setup_ops for Tx/Rx thread

2020-11-10 Thread Zheng Chuan



On 2020/11/10 20:11, Dr. David Alan Gilbert wrote:
> * Chuan Zheng (zhengch...@huawei.com) wrote:
>> Create multifd_setup_ops for TxRx thread, no logic change.
>>
>> Signed-off-by: Chuan Zheng 
>> ---
>>  migration/multifd.c | 44 +++-
>>  migration/multifd.h |  7 +++
>>  2 files changed, 46 insertions(+), 5 deletions(-)
>>
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index 68b171f..1f82307 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -383,6 +383,8 @@ struct {
>>  int exiting;
>>  /* multifd ops */
>>  MultiFDMethods *ops;
>> +/* multifd setup ops */
>> +MultiFDSetup *setup_ops;
>>  } *multifd_send_state;
>>  
>>  /*
>> @@ -790,8 +792,9 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
>>  } else {
>>  /* update for tls qio channel */
>>  p->c = ioc;
>> -qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
>> -   QEMU_THREAD_JOINABLE);
>> +qemu_thread_create(&p->thread, p->name,
>> +   multifd_send_state->setup_ops->send_thread_setup,
>> +   p, QEMU_THREAD_JOINABLE);
>> }
>> return false;
>>  }
>> @@ -839,6 +842,11 @@ cleanup:
>>  multifd_new_send_channel_cleanup(p, sioc, local_err);
>>  }
>>  
>> +static void multifd_send_channel_setup(MultiFDSendParams *p)
>> +{
>> +socket_send_channel_create(multifd_new_send_channel_async, p);
>> +}
>> +
>>  int multifd_save_setup(Error **errp)
>>  {
>>  int thread_count;
>> @@ -856,6 +864,7 @@ int multifd_save_setup(Error **errp)
>>  multifd_send_state->pages = multifd_pages_init(page_count);
>>  qemu_sem_init(&multifd_send_state->channels_ready, 0);
>>  qatomic_set(&multifd_send_state->exiting, 0);
>> +multifd_send_state->setup_ops = multifd_setup_ops_init();
>>  multifd_send_state->ops = multifd_ops[migrate_multifd_compression()];
>>  
>>  for (i = 0; i < thread_count; i++) {
>> @@ -875,7 +884,7 @@ int multifd_save_setup(Error **errp)
>>  p->packet->version = cpu_to_be32(MULTIFD_VERSION);
>>  p->name = g_strdup_printf("multifdsend_%d", i);
>>  p->tls_hostname = g_strdup(s->hostname);
>> -socket_send_channel_create(multifd_new_send_channel_async, p);
>> +multifd_send_state->setup_ops->send_channel_setup(p);
>>  }
>>  
>>  for (i = 0; i < thread_count; i++) {
>> @@ -902,6 +911,8 @@ struct {
>>  uint64_t packet_num;
>>  /* multifd ops */
>>  MultiFDMethods *ops;
>> +/* multifd setup ops */
>> +MultiFDSetup *setup_ops;
>>  } *multifd_recv_state;
>>  
>>  static void multifd_recv_terminate_threads(Error *err)
>> @@ -1095,6 +1106,7 @@ int multifd_load_setup(Error **errp)
>>  multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
>>  qatomic_set(&multifd_recv_state->count, 0);
>>  qemu_sem_init(&multifd_recv_state->sem_sync, 0);
>> +multifd_recv_state->setup_ops = multifd_setup_ops_init();
>>  multifd_recv_state->ops = multifd_ops[migrate_multifd_compression()];
>>  
>>  for (i = 0; i < thread_count; i++) {
>> @@ -1173,9 +1185,31 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
>>  p->num_packets = 1;
>>  
>>  p->running = true;
>> -qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
>> -   QEMU_THREAD_JOINABLE);
>> +multifd_recv_state->setup_ops->recv_channel_setup(ioc, p);
>> +qemu_thread_create(&p->thread, p->name,
>> +   multifd_recv_state->setup_ops->recv_thread_setup,
>> +   p, QEMU_THREAD_JOINABLE);
>>  qatomic_inc(&multifd_recv_state->count);
>>  return qatomic_read(&multifd_recv_state->count) ==
>> migrate_multifd_channels();
>>  }
>> +
>> +static void multifd_recv_channel_setup(QIOChannel *ioc, MultiFDRecvParams *p)
>> +{
>> +return;
>> +}
>> +
>> +static MultiFDSetup multifd_socket_ops = {
>> +.send_thread_setup = multifd_send_thread,
>> +.recv_thread_setup = multifd_recv_thread,
>> +.send_channel_setup = multifd_send_channel_setup,
>> +.recv_channel_setup = multifd_recv_channel_setup
>> +};
> 
> I don't think you need '_setup' on the thread function names here.
> 
> Dave
> 
OK, done in my local tree.
>> +MultiFDSetup *multifd_setup_ops_init(void)
>> +{
>> +MultiFDSetup *ops;
>> +
>> +ops = &multifd_socket_ops;
>> +return ops;
>> +}
>> diff --git a/migration/multifd.h b/migration/multifd.h
>> index 8d6751f..446315b 100644
>> --- a/migration/multifd.h
>> +++ b/migration/multifd.h
>> @@ -166,6 +166,13 @@ typedef struct {
>>  int (*recv_pages)(MultiFDRecvParams *p, uint32_t used, Error **errp);
>>  } MultiFDMethods;
>>  
>> +typedef struct {
>> +void *(*send_thread_setup)(void *opaque);
>> +void *(*recv_thread_setup)(void *opaque);
>> +void (*send_channel_setup)(MultiFDSendParams *p);
>> +void 

[PATCH] exynos: Fix bad printf format specifiers

2020-11-10 Thread Alex Chen
We should use printf format specifier "%u" instead of "%d" for
argument of type "unsigned int".

Reported-by: Euler Robot 
Signed-off-by: Alex Chen 
---
 hw/timer/exynos4210_mct.c | 4 ++--
 hw/timer/exynos4210_pwm.c | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/timer/exynos4210_mct.c b/hw/timer/exynos4210_mct.c
index 08ee3ca76c..439053acd2 100644
--- a/hw/timer/exynos4210_mct.c
+++ b/hw/timer/exynos4210_mct.c
@@ -537,7 +537,7 @@ static void exynos4210_gcomp_raise_irq(void *opaque, uint32_t id)
 /* If CSTAT is pending and IRQ is enabled */
 if ((s->reg.int_cstat & G_INT_CSTAT_COMP(id)) &&
 (s->reg.int_enb & G_INT_ENABLE(id))) {
-DPRINTF("gcmp timer[%d] IRQ\n", id);
+DPRINTF("gcmp timer[%u] IRQ\n", id);
 qemu_irq_raise(s->irq[id]);
 }
 }
@@ -1003,7 +1003,7 @@ static void exynos4210_mct_update_freq(Exynos4210MCTState *s)
 MCT_CFG_GET_DIVIDER(s->reg_mct_cfg));
 
 if (freq != s->freq) {
-DPRINTF("freq=%dHz\n", s->freq);
+DPRINTF("freq=%uHz\n", s->freq);
 
 /* global timer */
 tx_ptimer_set_freq(s->g_timer.ptimer_frc, s->freq);
diff --git a/hw/timer/exynos4210_pwm.c b/hw/timer/exynos4210_pwm.c
index 4fa3d87396..de181428b4 100644
--- a/hw/timer/exynos4210_pwm.c
+++ b/hw/timer/exynos4210_pwm.c
@@ -169,7 +169,7 @@ static void exynos4210_pwm_update_freq(Exynos4210PWMState *s, uint32_t id)
 
 if (freq != s->timer[id].freq) {
 ptimer_set_freq(s->timer[id].ptimer, s->timer[id].freq);
-DPRINTF("freq=%dHz\n", s->timer[id].freq);
+DPRINTF("freq=%uHz\n", s->timer[id].freq);
 }
 }
 
@@ -183,14 +183,14 @@ static void exynos4210_pwm_tick(void *opaque)
 uint32_t id = s->id;
 bool cmp;
 
-DPRINTF("timer %d tick\n", id);
+DPRINTF("timer %u tick\n", id);
 
 /* set irq status */
 p->reg_tint_cstat |= TINT_CSTAT_STATUS(id);
 
 /* raise IRQ */
 if (p->reg_tint_cstat & TINT_CSTAT_ENABLE(id)) {
-DPRINTF("timer %d IRQ\n", id);
+DPRINTF("timer %u IRQ\n", id);
 qemu_irq_raise(p->timer[id].irq);
 }
 
@@ -202,7 +202,7 @@ static void exynos4210_pwm_tick(void *opaque)
 }
 
 if (cmp) {
-DPRINTF("auto reload timer %d count to %x\n", id,
+DPRINTF("auto reload timer %u count to %x\n", id,
 p->timer[id].reg_tcntb);
 ptimer_set_count(p->timer[id].ptimer, p->timer[id].reg_tcntb);
 ptimer_run(p->timer[id].ptimer, 1);
-- 
2.19.1




Re: [PATCH] net/l2tpv3: Remove redundant check in net_init_l2tpv3()

2020-11-10 Thread Alex Chen
Kindly ping.

On 2020/10/30 10:46, AlexChen wrote:
> The result has been checked to be NULL before, it cannot be NULL here,
> so the check is redundant. Remove it.
> 
> Reported-by: Euler Robot 
> Signed-off-by: AlexChen 
> ---
>  net/l2tpv3.c | 9 +++--
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/net/l2tpv3.c b/net/l2tpv3.c
> index 55fea17c0f..e4d4218db6 100644
> --- a/net/l2tpv3.c
> +++ b/net/l2tpv3.c
> @@ -655,9 +655,8 @@ int net_init_l2tpv3(const Netdev *netdev,
>  error_setg(errp, "could not bind socket err=%i", errno);
>  goto outerr;
>  }
> -if (result) {
> -freeaddrinfo(result);
> -}
> +
> +freeaddrinfo(result);
> 
>  memset(&hints, 0, sizeof(hints));
> 
> @@ -686,9 +685,7 @@ int net_init_l2tpv3(const Netdev *netdev,
>  memcpy(s->dgram_dst, result->ai_addr, result->ai_addrlen);
>  s->dst_size = result->ai_addrlen;
> 
> -if (result) {
> -freeaddrinfo(result);
> -}
> +freeaddrinfo(result);
> 
>  if (l2tpv3->has_counter && l2tpv3->counter) {
>  s->has_counter = true;
> 




Re: [PATCH v2] migration/multifd: close TLS channel before socket finalize

2020-11-10 Thread Zheng Chuan
I think I have found out why.

When we create the TLS client in migration_tls_client_create(), we take a
reference on tioc->master.
The main migration thread drops that reference after
migration_channel_connect() in socket_outgoing_migration().
Non-TLS migration takes another reference in
qemu_fopen_channel_output(ioc), called from migration_channel_connect().

In conclusion, for the multifd-TLS channel we need to drop the reference on
the underlying QIOChannelSocket after the TLS handshake.
The fix has been sent and is waiting for review.
https://www.mail-archive.com/qemu-devel@nongnu.org/msg759110.html
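
To make the reference-count reasoning above concrete, here is a deliberately simplified sketch. It does not use QEMU's QOM; Obj, obj_new, obj_unref and teardown_demo are hypothetical names. A "socket" object is wrapped by a "TLS" object that takes its own reference on it. If the creator never drops its original reference, destroying the wrapper alone leaves the wrapped object alive, which corresponds to the socket staying open.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal refcounted object; QOM is far richer than this. */
typedef struct Obj {
    int refcount;
    struct Obj *master;    /* wrapped object, if any */
    int *freed_counter;    /* incremented whenever an object is freed */
} Obj;

static Obj *obj_new(Obj *master, int *freed_counter)
{
    Obj *o = calloc(1, sizeof(*o));
    assert(o != NULL);
    o->refcount = 1;
    o->master = master;
    o->freed_counter = freed_counter;
    if (master) {
        master->refcount++;    /* the wrapper takes its own reference */
    }
    return o;
}

static void obj_unref(Obj *o)
{
    if (--o->refcount > 0) {
        return;
    }
    if (o->master) {
        obj_unref(o->master);  /* finalize: release the wrapped object */
    }
    (*o->freed_counter)++;
    free(o);
}

/* Returns how many objects the destroy path managed to free. */
static int teardown_demo(int drop_creator_ref)
{
    int freed = 0;
    Obj *sock = obj_new(NULL, &freed);   /* refcount 1, held by the creator */
    Obj *tls = obj_new(sock, &freed);    /* sock refcount is now 2 */

    if (drop_creator_ref) {
        obj_unref(sock);                 /* hand ownership over to the wrapper */
    }
    obj_unref(tls);                      /* destroy path only unrefs the wrapper */

    int freed_by_destroy = freed;
    if (!drop_creator_ref) {
        obj_unref(sock);                 /* tidy up so the demo itself does not leak */
    }
    return freed_by_destroy;
}
```

With the creator's reference dropped, teardown_demo(1) reports both objects freed; without it, teardown_demo(0) reports only the wrapper freed.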

On 2020/11/10 19:56, Zheng Chuan wrote:
> 
> 
> On 2020/11/10 19:01, Daniel P. Berrangé wrote:
>> On Tue, Nov 10, 2020 at 06:45:45PM +0800, Zheng Chuan wrote:
>>>
>>>
>>> On 2020/11/10 18:12, Daniel P. Berrangé wrote:
>>>> On Fri, Nov 06, 2020 at 06:54:54PM +0800, Chuan Zheng wrote:
>>>>> Since we now support tls multifd, when we cancel migration, the TLS
>>>>> sockets will be left as CLOSE-WAIT On Src which results in socket
>>>>> leak.
>>>>> Fix it by closing TLS channel before socket finalize.
>>>>>
>>>>> Signed-off-by: Chuan Zheng
>>>>> ---
>>>>>  migration/multifd.c | 14 ++
>>>>>  1 file changed, 14 insertions(+)
>>>>>
>>>>> diff --git a/migration/multifd.c b/migration/multifd.c
>>>>> index 68b171f..a6838dc 100644
>>>>> --- a/migration/multifd.c
>>>>> +++ b/migration/multifd.c
>>>>> @@ -523,6 +523,19 @@ static void multifd_send_terminate_threads(Error *err)
>>>>>  }
>>>>>  }
>>>>>
>>>>> +static void multifd_tls_socket_close(QIOChannel *ioc, Error *err)
>>>>> +{
>>>>> +if (ioc &&
>>>>> +object_dynamic_cast(OBJECT(ioc),
>>>>> +TYPE_QIO_CHANNEL_TLS)) {
>>>>> +/*
>>>>> + * TLS channel is special, we need close it before
>>>>> + * socket finalize.
>>>>> + */
>>>>> +qio_channel_close(ioc, &err);
>>>>> +}
>>>>> +}
>>>>
>>>> This doesn't feel quite right to me.  Calling qio_channel_close will close
>>>> both the TLS layer, and the underlying QIOChannelSocket. If the latter
>>>> is safe to do, then we don't need the object_dynamic_cast() check, we can
>>>> do it unconditionally whether we're using TLS or not.
>>>>
>>>> Having said that, I'm not sure if we actually want to be using
>>>> qio_channel_close or not ?
>>>>
>>>> I would have expected that there is already code somewhere else in the
>>>> migration layer that is closing these multifd channels, but I can't
>>>> actually find where that happens right now.  Assuming that code does
>>>> exist though, qio_channel_shutdown(ioc, BOTH) feels like the right
>>>> answer to unblock waiting I/O ops.

>>> Hi, Daniel.
>>> Actually, I have tried to use qio_channel_shutdown at the same place,
>>> but it seems not work right.
>>> the socket connection is closed by observing through 'ss' command but
>>> the socket fds in /proc/$(qemu pid)/fd are still residual.
>>>
>>> The underlying QIOChannelSocket will be closed by
>>> qio_channel_socket_finalize() through object_unref(QIOChannel) later
>>> in socket_send_channel_destroy(),
>>> does that means it is safe to close both of TLS and tcp socket?
>>
>> Hmm, that makes me even more confused, because the object_unref
>> should be calling qio_channel_close() already.
>>
>> eg with your patch we have:
>>
>>multifd_tls_socket_close(p->c, NULL);
>>-> qio_channel_close(p->c)
>>  -> qio_channel_tls_close(p->c)
>>  -> qio_channel_close(master)
>>
>>socket_send_channel_destroy(p->c)
>>-> object_unref(p->c)
>>   -> qio_channel_tls_socket_finalize(p->c)
>>-> object_unref(master)
>>-> qio_channel_close(master)
>>
>> so the multifd_tls_socket_close should not be doing anything
>> at all *assuming* we releasing the last reference in our
>> object_unref call.
>>
>> Given what you describe, I think we are *not* releasing the
>> last reference. There is an active reference being held
>> somewhere else, and that is preventing the QIOChannelSocket
>> from being freed and thus the socket remains open.
>>
>> If that extra active reference is a bug, then we have a memory
>> leak of the QIOChannelSocket object, that needs fixing somewhere.
>>
>> If that extra active reference is intentional, then we do indeed
>> need to explicitly close the socket. That is possibly better
>> handled by putting a qio_channel_close() call into the
>> socket_send_channel_destroy() method.
>>
>> I wonder if we're leaking a reference to the underlying QIOChannelSocket,
>> when we create the QIOChannelTLS wrapper ? That could explain a problem
>> that only happens when using TLS.
>>
> Aha, you are right!
> The QIOChannelSocket is added by an extra reference.
> 
> Thread 1 "qemu-system-aar" hit Breakpoint 1, socket_send_channel_destroy (
> send=0xbea527f0) at migration/socket.c:44
> 44migration/socket.c: No such file or directory.
> (gdb) p 

[RESEND][PATCH] multifd/tls: fix memoryleak of the QIOChannelSocket object when canceling migration

2020-11-10 Thread Chuan Zheng
When creating a new TLS client, a reference is taken on tioc->master; we need
to drop that reference after the TLS handshake.

Signed-off-by: Chuan Zheng 
---
 migration/multifd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 68b171f..df76a8e 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -728,7 +728,8 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
gpointer opaque)
 {
 MultiFDSendParams *p = opaque;
-QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
+QIOChannelTLS *tioc = QIO_CHANNEL_TLS(qio_task_get_source(task));
+QIOChannel *ioc = QIO_CHANNEL(tioc);
 Error *err = NULL;
 
 if (qio_task_propagate_error(task, &err)) {
@@ -737,6 +738,7 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
 trace_multifd_tls_outgoing_handshake_complete(ioc);
 }
 multifd_channel_connect(p, ioc, err);
+object_unref(OBJECT(tioc->master));
 }
 
 static void multifd_tls_channel_connect(MultiFDSendParams *p,
-- 
1.8.3.1




[PATCH] multifd/tls: fix memoryleak of the QIOChannelSocket object when canceling migration

2020-11-10 Thread Chuan Zheng
When creating a new TLS client, a reference is taken on tioc->master; we need
to unref it after the TLS handshake.

Signed-off-by: Chuan Zheng 
---
 migration/multifd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 68b171f..df76a8e 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -728,7 +728,8 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
gpointer opaque)
 {
 MultiFDSendParams *p = opaque;
-QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
+QIOChannelTLS *tioc = QIO_CHANNEL_TLS(qio_task_get_source(task));
+QIOChannel *ioc = QIO_CHANNEL(tioc);
 Error *err = NULL;
 
 if (qio_task_propagate_error(task, &err)) {
@@ -737,6 +738,7 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
 trace_multifd_tls_outgoing_handshake_complete(ioc);
 }
 multifd_channel_connect(p, ioc, err);
+object_unref(OBJECT(tioc->master));
 }
 
 static void multifd_tls_channel_connect(MultiFDSendParams *p,
-- 
1.8.3.1




[PATCH] dev-uas: Fix a error of variable sized type not at end

2020-11-10 Thread Han Han
Fix the following error when compiling:

FAILED: libcommon.fa.p/hw_usb_dev-uas.c.o
clang -Ilibcommon.fa.p -I. -I.. -Iqapi -Itrace -Iui -Iui/shader 
-I/usr/include/libusb-1.0 -I/usr/include/spice-1 -I/usr/include/spice-server 
-I/usr/include/cacard -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include 
-I/usr/include/nss3 -I/usr/include/nspr4 -I/usr/include/libmount 
-I/usr/include/blkid -I/usr/include/pixman-1 -I/usr/include/vte-2.91 
-I/usr/include/pango-1.0 -I/usr/include/harfbuzz -I/usr/include/freetype2 
-I/usr/include/libpng16 -I/usr/include/fribidi -I/usr/include/libxml2 
-I/usr/include/cairo -I/usr/include/gtk-3.0 -I/usr/include/gdk-pixbuf-2.0 
-I/usr/include/gio-unix-2.0 -I/usr/include/atk-1.0 
-I/usr/include/at-spi2-atk/2.0 -I/usr/include/dbus-1.0 
-I/usr/lib64/dbus-1.0/include -I/usr/include/at-spi-2.0 -I/usr/include/SDL2 
-I/usr/include/slirp -I/usr/include/virgl -I/usr/include/capstone -Xclang 
-fcolor-diagnostics -pipe -Wall -Winvalid-pch -Werror -std=gnu99 -O2 -g 
-fsanitize=undefined -fsanitize=address -m64 -mcx16 -D_GNU_SOURCE 
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes 
-Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes 
-fno-strict-aliasing -fno-common -fwrapv -Wold-style-definition -Wtype-limits 
-Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body 
-Wnested-externs -Wendif-labels -Wexpansion-to-defined 
-Wno-initializer-overrides -Wno-missing-include-dirs -Wno-shift-negative-value 
-Wno-string-plus-int -Wno-typedef-redefinition 
-Wno-tautological-type-limit-compare -Wno-psabi -fstack-protector-strong 
-fsanitize=fuzzer-no-link -isystem /home/hhan/Software/qemu/linux-headers 
-isystem linux-headers -iquote /home/hhan/Software/qemu/tcg/i386 -iquote . 
-iquote /home/hhan/Software/qemu -iquote /home/hhan/Software/qemu/accel/tcg 
-iquote /home/hhan/Software/qemu/include -iquote 
/home/hhan/Software/qemu/disas/libvixl -pthread -fPIC -DSTRUCT_IOVEC_DEFINED 
-D_REENTRANT -Wno-undef -D_DEFAULT_SOURCE -D_XOPEN_SOURCE=600 
-DNCURSES_WIDECHAR -MD -MQ libcommon.fa.p/hw_usb_dev-uas.c.o -MF 
libcommon.fa.p/hw_usb_dev-uas.c.o.d -o libcommon.fa.p/hw_usb_dev-uas.c.o -c 
../hw/usb/dev-uas.c
../hw/usb/dev-uas.c:158:31: error: field 'status' with variable sized type 
'uas_iu' not at the end of a struct or class is a GNU extension 
[-Werror,-Wgnu-variable-sized-type-not-at-end]

Signed-off-by: Han Han 
---
 hw/usb/dev-uas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/usb/dev-uas.c b/hw/usb/dev-uas.c
index cec071d96c..5ef3f4fec9 100644
--- a/hw/usb/dev-uas.c
+++ b/hw/usb/dev-uas.c
@@ -154,9 +154,9 @@ struct UASRequest {
 
 struct UASStatus {
 uint32_t  stream;
-uas_iu  status;
 uint32_t  length;
 QTAILQ_ENTRY(UASStatus)   next;
+uas_iu  status;
 };
 
 /* - */
-- 
2.28.0
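
As a rough illustration of the layout rule behind the clang diagnostic, the sketch below uses hypothetical types (VarIU, Status, status_new), not the real uas_iu or UASStatus. A type that ends in a flexible array member has no fixed size, so when it is embedded in another struct it has to be the last member; otherwise its trailing data would overlap the fields that follow. Note that embedding such a type in a struct at all is a compiler extension accepted by GCC and Clang rather than strict ISO C.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical variable-sized type: ends in a flexible array member. */
typedef struct {
    uint8_t id;
    uint8_t payload[];   /* actual length is chosen at allocation time */
} VarIU;

/* Mirrors the patched ordering: the variable-sized member comes last. */
typedef struct {
    uint32_t stream;
    uint32_t length;
    VarIU    status;     /* last, so payload[] can extend past the struct */
} Status;

/* Allocate a Status with room for payload_len trailing bytes. */
static Status *status_new(uint32_t stream, size_t payload_len)
{
    Status *st = calloc(1, sizeof(*st) + payload_len);
    if (st) {
        st->stream = stream;
        st->length = (uint32_t)payload_len;
    }
    return st;
}
```

With this ordering, writes to st->status.payload[i] for i below payload_len land in the extra space after the fixed fields instead of on top of them.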




[RFC PATCH 24/25] WIP: i386/cxl: Initialize a host bridge

2020-11-10 Thread Ben Widawsky
This patch allows initializing the primary host bridge as a CXL capable
hostbridge.

Signed-off-by: Ben Widawsky 

--
This patch is WIP.
---
 hw/arm/virt.c|  1 +
 hw/core/machine.c| 26 ++
 hw/i386/acpi-build.c |  8 +++-
 hw/i386/microvm.c|  1 +
 hw/i386/pc.c |  1 +
 hw/ppc/spapr.c   |  2 ++
 include/hw/boards.h  |  2 ++
 include/hw/cxl/cxl.h |  4 
 8 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 27dbeb549e..9d1dafea9f 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2475,6 +2475,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
 hc->unplug_request = virt_machine_device_unplug_request_cb;
 hc->unplug = virt_machine_device_unplug_cb;
 mc->nvdimm_supported = true;
+mc->cxl_supported = false;
 mc->auto_enable_numa_with_memhp = true;
 mc->auto_enable_numa_with_memdev = true;
 mc->default_ram_id = "mach-virt.ram";
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 98b87f76cb..5f37d63da6 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -26,6 +26,7 @@
 #include "sysemu/qtest.h"
 #include "hw/pci/pci.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/cxl/cxl.h"
 #include "migration/vmstate.h"
 
 GlobalProperty hw_compat_5_1[] = {
@@ -491,6 +492,20 @@ static void machine_set_nvdimm_persistence(Object *obj, const char *value,
 nvdimms_state->persistence_string = g_strdup(value);
 }
 
+static bool machine_get_cxl(Object *obj, Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+return ms->cxl_devices_state->is_enabled;
+}
+
+static void machine_set_cxl(Object *obj, bool value, Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+ms->cxl_devices_state->is_enabled = value;
+}
+
 void machine_class_allow_dynamic_sysbus_dev(MachineClass *mc, const char *type)
 {
 strList *item = g_new0(strList, 1);
@@ -895,6 +910,16 @@ static void machine_initfn(Object *obj)
 "Valid values are cpu, mem-ctrl");
 }
 
+if (mc->cxl_supported) {
+Object *obj = OBJECT(ms);
+
+ms->cxl_devices_state = g_new0(CXLState, 1);
+object_property_add_bool(obj, "cxl", machine_get_cxl, machine_set_cxl);
+object_property_set_description(obj, "cxl",
+"Set on/off to enable/disable "
+"CXL instantiation");
+}
+
 if (mc->cpu_index_to_instance_props && mc->get_default_cpu_node_id) {
 ms->numa_state = g_new0(NumaState, 1);
 object_property_add_bool(obj, "hmat",
@@ -931,6 +956,7 @@ static void machine_finalize(Object *obj)
 g_free(ms->device_memory);
 g_free(ms->nvdimms_state);
 g_free(ms->numa_state);
+g_free(ms->cxl_devices_state);
 }
 
 bool machine_usb(MachineState *machine)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index d080e24228..465bde0196 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -53,6 +53,7 @@
 #include "sysemu/numa.h"
 #include "sysemu/reset.h"
 #include "hw/hyperv/vmbus-bridge.h"
+#include "hw/cxl/cxl.h"
 
 /* Supported chipsets: */
 #include "hw/southbridge/piix.h"
@@ -1569,8 +1570,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 build_piix4_pci0_int(dsdt);
 } else {
 sb_scope = aml_scope("_SB");
+/*
+ * XXX: CXL spec calls this "CXL0", but that would require lots of
+ * changes throughout and so even for CXL enabled, we call it "PCI0"
+ */
 dev = aml_device("PCI0");
-init_pci_acpi(dev, 0, PCIE);
+init_pci_acpi(dev, 0,
+machine->cxl_devices_state->is_enabled ? CXL : PCIE);
 aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
 aml_append(sb_scope, dev);
 
diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 5428448b70..ed2f992b2a 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -656,6 +656,7 @@ static void microvm_class_init(ObjectClass *oc, void *data)
 mc->auto_enable_numa_with_memdev = false;
 mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
 mc->nvdimm_supported = false;
+mc->cxl_supported = false;
 mc->default_ram_id = "microvm.ram";
 
 /* Avoid relying too much on kernel components */
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ecfc497f71..a962a77835 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1694,6 +1694,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
 hc->unplug = pc_machine_device_unplug_cb;
 mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
 mc->nvdimm_supported = true;
+mc->cxl_supported = true;
 mc->default_ram_id = "pc.ram";
 
 object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 227075103e..3d72bad5f2 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4422,6 +4422,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
 

[RFC PATCH 23/25] Temp: acpi/cxl: Add ACPI0017 (CEDT awareness)

2020-11-10 Thread Ben Widawsky
This represents Intel's proposal for how the system firmware can notify
Linux that the CEDT exists and provides a driver attach point. It is not
in the CXL 2.0 specification as of now.

CXL 2.0 specification adds an _HID, ACPI0016, for CXL capable host
bridges, with a _CID of PNP0A08 (PCIe host bridge). CXL aware software
is able to use this to initiate the proper _OSC method, and get the _UID
which is referenced by the CEDT. Therefore the existence of an ACPI0016
device allows a CXL aware driver to perform the necessary actions. For a
CXL capable OS, this works. For a CXL unaware OS, this works.

The motivation for ACPI0017 is to provide the possibility of having a
Linux CXL module that can work on a legacy Linux kernel. Linux core
PCI/ACPI, which won't be built as a module, will see the _CID of PNP0A08
and bind a driver to it. If we later load a driver for ACPI0016, Linux
won't be able to bind it to the hardware because it has already bound
the PNP0A08 driver. The ACPI0017 device is an opportunity to have an
object to bind a driver to; a Linux driver will use it to walk the CXL
topology and do everything that we would have preferred to do with
ACPI0016.

There is another motivation for an ACPI0017 device which isn't
implemented here. An operating system needs an attach point for a
non-volatile region provider that understands cross-hostbridge
interleaving. Since QEMU emulation doesn't support interleaving yet,
this is more important on the OS side, for now.

Signed-off-by: Ben Widawsky 
---
 hw/i386/acpi-build.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index eda62dcd6a..d080e24228 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1513,6 +1513,19 @@ static void init_pci_acpi(Aml *dev, int uid, int type)
 }
 }
 
+static void build_acpi0017(Aml *table)
+{
+Aml *dev;
+Aml *scope;
+
+scope =  aml_scope("_SB");
+dev = aml_device("CXLM");
+aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0017")));
+
+aml_append(scope, dev);
+aml_append(table, scope);
+}
+
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker,
AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -1529,6 +1542,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 int root_bus_limit = 0xFF;
 PCIBus *bus = NULL;
 TPMIf *tpm = tpm_find();
+bool cxl_present = false;
 int i;
 VMBusBridge *vmbus_bridge = vmbus_bridge_find();
 
@@ -1683,6 +1697,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
 /* Handle the ranges for the PXB expanders */
 if (type == CXL) {
+cxl_present = true;
 uint64_t base = CXL_HOST_BASE + uid * 0x1;
 crs_range_insert(crs_range_set.mem_ranges, base,
  base + 0x1 - 1);
@@ -1690,6 +1705,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 }
 }
 
+if (cxl_present) {
+build_acpi0017(dsdt);
+}
+
 /*
  * At this point crs_range_set has all the ranges used by pci
  * busses *other* than PCI0.  These ranges will be excluded from
-- 
2.29.2




[RFC PATCH 25/25] qtest/cxl: Add very basic sanity tests

2020-11-10 Thread Ben Widawsky
Signed-off-by: Ben Widawsky 
---
 tests/qtest/cxl-test.c  | 93 +
 tests/qtest/meson.build |  4 ++
 2 files changed, 97 insertions(+)
 create mode 100644 tests/qtest/cxl-test.c

diff --git a/tests/qtest/cxl-test.c b/tests/qtest/cxl-test.c
new file mode 100644
index 00..00eca14faa
--- /dev/null
+++ b/tests/qtest/cxl-test.c
@@ -0,0 +1,93 @@
+/*
+ * QTest testcase for CXL
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest-single.h"
+
+#define QEMU_PXB_CMD "-machine q35 -object memory-backend-file,id=cxl-mem1," \
+ "share,mem-path=%s,size=512M "  \
+ "-device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,"  \
+ "len-window-base=1,window-base[0]=0x4c000,memdev[0]=cxl-mem1"
+#define QEMU_RP "-device cxl-rp,id=rp0,bus=cxl.0,addr=0.0,chassis=0,slot=0"
+
+#define QEMU_T3D "-device cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
+
+static void cxl_basic_hb(void)
+{
+qtest_start("-machine q35,cxl");
+qtest_end();
+}
+
+static void cxl_basic_pxb(void)
+{
+qtest_start("-machine q35 -device pxb-cxl,bus=pcie.0,uid=0");
+qtest_end();
+}
+
+static void cxl_pxb_with_window(void)
+{
+GString *cmdline;
+char template[] = "/tmp/cxl-test-XX";
+const char *tmpfs;
+
+tmpfs = mkdtemp(template);
+
+cmdline = g_string_new(NULL);
+g_string_printf(cmdline, QEMU_PXB_CMD, tmpfs);
+
+qtest_start(cmdline->str);
+qtest_end();
+
+g_string_free(cmdline, TRUE);
+}
+
+static void cxl_root_port(void)
+{
+GString *cmdline;
+char template[] = "/tmp/cxl-test-XXXXXX";
+const char *tmpfs;
+
+tmpfs = mkdtemp(template);
+
+cmdline = g_string_new(NULL);
+g_string_printf(cmdline, QEMU_PXB_CMD " %s", tmpfs, QEMU_RP);
+
+qtest_start(cmdline->str);
+qtest_end();
+
+g_string_free(cmdline, TRUE);
+}
+
+static void cxl_t3d(void)
+{
+GString *cmdline;
+char template[] = "/tmp/cxl-test-XXXXXX";
+const char *tmpfs;
+
+tmpfs = mkdtemp(template);
+
+cmdline = g_string_new(NULL);
+g_string_printf(cmdline, QEMU_PXB_CMD " %s %s", tmpfs, QEMU_RP, QEMU_T3D);
+
+qtest_start(cmdline->str);
+qtest_end();
+
+g_string_free(cmdline, TRUE);
+}
+
+int main(int argc, char **argv)
+{
+g_test_init(&argc, &argv, NULL);
+
+qtest_add_func("/pci/cxl/basic_hostbridge", cxl_basic_hb);
+qtest_add_func("/pci/cxl/basic_pxb", cxl_basic_pxb);
+qtest_add_func("/pci/cxl/pxb_with_window", cxl_pxb_with_window);
+qtest_add_func("/pci/cxl/root_port", cxl_root_port);
+qtest_add_func("/pci/cxl/type3_device", cxl_t3d);
+
+return g_test_run();
+}
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index c19f1c8503..7c6439b45c 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -22,6 +22,9 @@ qtests_pci = \
   (config_all_devices.has_key('CONFIG_VGA') ? ['display-vga-test'] : []) + 
 \
   (config_all_devices.has_key('CONFIG_IVSHMEM_DEVICE') ? ['ivshmem-test'] : [])
 
+qtests_cxl = \
+  (config_all_devices.has_key('CONFIG_CXL') ? ['cxl-test'] : [])
+
 qtests_i386 = \
   (slirp.found() ? ['pxe-test', 'test-netfilter'] : []) + \
   (config_host.has_key('CONFIG_POSIX') ? ['test-filter-mirror'] : []) +
 \
@@ -47,6 +50,7 @@ qtests_i386 = \
   (config_all_devices.has_key('CONFIG_TPM_TIS_ISA') ? ['tpm-tis-swtpm-test'] : 
[]) +\
   (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) + 
 \
   qtests_pci + 
 \
+  qtests_cxl + 
 \
   ['fdc-test',
'ide-test',
'hd-geo-test',
-- 
2.29.2




[RFC PATCH 17/25] hw/cxl/rp: Add a root port

2020-11-10 Thread Ben Widawsky
This adds just enough of a root port implementation to be able to
enumerate root ports (creating the required DVSEC entries). Still
missing are the MMIO and the ability to write some of the DVSEC entries.

A root port can be added on the QEMU command line by attaching it to a
specific CXL host bridge. For example:
  -device cxl-rp,id=rp0,bus="cxl.0",addr=0.0,chassis=4

As with the host bridge patch, the ACPI tables aren't generated at this
point, so system software cannot use the root port.

Signed-off-by: Ben Widawsky 
---
 hw/pci-bridge/Kconfig  |   5 +
 hw/pci-bridge/cxl_root_port.c  | 231 +
 hw/pci-bridge/meson.build  |   1 +
 hw/pci-bridge/pcie_root_port.c |   6 +-
 hw/pci/pci.c   |   4 +-
 5 files changed, 245 insertions(+), 2 deletions(-)
 create mode 100644 hw/pci-bridge/cxl_root_port.c

diff --git a/hw/pci-bridge/Kconfig b/hw/pci-bridge/Kconfig
index a51ec716f5..a821b531da 100644
--- a/hw/pci-bridge/Kconfig
+++ b/hw/pci-bridge/Kconfig
@@ -27,3 +27,8 @@ config DEC_PCI
 
 config SIMBA
 bool
+
+config CXL
+bool
+default y if PCI_EXPRESS && PXB
+depends on PCI_EXPRESS && MSI_NONBROKEN && PXB
diff --git a/hw/pci-bridge/cxl_root_port.c b/hw/pci-bridge/cxl_root_port.c
new file mode 100644
index 0000000000..e3a19dee6d
--- /dev/null
+++ b/hw/pci-bridge/cxl_root_port.c
@@ -0,0 +1,231 @@
+/*
+ * CXL 2.0 Root Port Implementation
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/range.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/pcie_port.h"
+#include "hw/qdev-properties.h"
+#include "hw/sysbus.h"
+#include "qapi/error.h"
+#include "hw/cxl/cxl.h"
+
+#define CXL_ROOT_PORT_DID 0x7075
+
+/* Copied from the gen root port which we derive */
+#define GEN_PCIE_ROOT_PORT_AER_OFFSET 0x100
+#define GEN_PCIE_ROOT_PORT_ACS_OFFSET \
+(GEN_PCIE_ROOT_PORT_AER_OFFSET + PCI_ERR_SIZEOF)
+#define CXL_ROOT_PORT_DVSEC_OFFSET \
+(GEN_PCIE_ROOT_PORT_ACS_OFFSET + PCI_ACS_SIZEOF)
+
+typedef struct CXLRootPort {
+/*< private >*/
+PCIESlot parent_obj;
+
+CXLComponentState cxl_cstate;
+PCIResReserve res_reserve;
+} CXLRootPort;
+
+#define TYPE_CXL_ROOT_PORT "cxl-rp"
+DECLARE_INSTANCE_CHECKER(CXLRootPort, CXL_ROOT_PORT, TYPE_CXL_ROOT_PORT)
+
+static void latch_registers(CXLRootPort *crp)
+{
+uint32_t *reg_state = crp->cxl_cstate.crb.cache_mem_registers;
+
+cxl_component_register_init_common(reg_state, CXL2_ROOT_PORT);
+}
+
+static void build_dvsecs(CXLComponentState *cxl)
+{
+uint8_t *dvsec;
+
+dvsec = (uint8_t *)&(struct dvsec_port){ 0 };
+cxl_component_create_dvsec(cxl, EXTENSIONS_PORT_DVSEC_LENGTH,
+   EXTENSIONS_PORT_DVSEC,
+   EXTENSIONS_PORT_DVSEC_REVID, dvsec);
+
+dvsec = (uint8_t *)&(struct dvsec_port_gpf){
+.rsvd= 0,
+.phase1_ctrl = 1, /* 1μs timeout */
+.phase2_ctrl = 1, /* 1μs timeout */
+};
+cxl_component_create_dvsec(cxl, GPF_PORT_DVSEC_LENGTH, GPF_PORT_DVSEC,
+   GPF_PORT_DVSEC_REVID, dvsec);
+
+dvsec = (uint8_t *)&(struct dvsec_port_flexbus){
+.cap  = 0x26, /* IO, Mem, non-MLD */
+.ctrl = 0,
+.status   = 0x26, /* same */
+.rcvd_mod_ts_data = 0xef, /* WTF? */
+};
+cxl_component_create_dvsec(cxl, PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0,
+   PCIE_FLEXBUS_PORT_DVSEC,
+   PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0, dvsec);
+
+dvsec = (uint8_t *)&(struct dvsec_register_locator){
+.rsvd = 0,
+.reg0_base_lo = RBI_COMPONENT_REG | COMPONENT_REG_BAR_IDX,
+.reg0_base_hi = 0,
+};
+cxl_component_create_dvsec(cxl, REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
+   REG_LOC_DVSEC_REVID, dvsec);
+}
+
+static void cxl_rp_realize(DeviceState *dev, Error **errp)
+{
+PCIDevice *pci_dev = PCI_DEVICE(dev);
+PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(dev);
+CXLRootPort *crp   = CXL_ROOT_PORT(dev);
+CXLComponentState *cxl_cstate = &crp->cxl_cstate;
+ComponentRegisters *cregs = &cxl_cstate->crb;
+MemoryRegion *component_bar = &cregs->component_registers;
+Error 

[RFC PATCH 19/25] hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)

2020-11-10 Thread Ben Widawsky
A device's volatile and persistent memory are known as Host-managed
Device Memory (HDM) regions. The mechanism by which the device is
programmed to claim the addresses associated with those regions is
dedicated logic known as the HDM decoder. In order to allow the OS to
properly program the HDMs, the HDM decoders must be modeled.

There are two ways the HDM decoders can be implemented. The legacy
mechanism is PCIe DVSEC programming from CXL 1.1 (8.1.3.8), and the MMIO
mechanism is found in 8.2.5.12 of the spec. For now, 8.1.3.8 is not
implemented.

Much of the CXL device logic is implemented in cxl-utils. The HDM
decoder, however, is implemented directly by the device. The generic
cxl-utils is probably the correct place for it, since HDM decoders
aren't unique to a type3 device. For the moment it is easier, and
requires less design consideration, to implement it in the device and
figure out how to consolidate it later.
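
The commit check described above can be sketched in isolation. This is a
minimal illustration, not the code from this patch: the struct and the
function names are hypothetical, and the window is passed in as plain
base/size values instead of a QEMU Range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register snapshot; the field names are illustrative only. */
struct hdm_decoder_regs {
    uint32_t base_lo, base_hi;
    uint32_t size_lo, size_hi;
};

/* Combine a hi/lo 32-bit register pair into one 64-bit value. */
static uint64_t hdm_join(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/*
 * Return true when the decoder's [base, base + size) range is non-empty
 * and lies entirely inside the window [win_base, win_base + win_size).
 */
static bool hdm_range_fits(const struct hdm_decoder_regs *r,
                           uint64_t win_base, uint64_t win_size)
{
    uint64_t base = hdm_join(r->base_hi, r->base_lo);
    uint64_t size = hdm_join(r->size_hi, r->size_lo);

    if (size == 0 || base < win_base) {
        return false;
    }
    /* Written with subtractions so the comparison cannot overflow. */
    return base - win_base <= win_size &&
           size <= win_size - (base - win_base);
}
```

A device model would set the ERROR bit when hdm_range_fits() returns
false and the COMMITTED bit otherwise, which mirrors the flow in
hdm_decoder_commit().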

Signed-off-by: Ben Widawsky 
---
 hw/mem/cxl_type3.c | 82 +++---
 1 file changed, 77 insertions(+), 5 deletions(-)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 48c25922f3..00ab5044b1 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -57,6 +57,71 @@ static void build_dvsecs(CXLType3Dev *ct3d)
REG_LOC_DVSEC_REVID, dvsec);
 }
 
+static void hdm_decoder_commit(CXLType3Dev *ct3d, int which)
+{
+MemoryRegion *pmem = ct3d->cxl_dstate.pmem;
+MemoryRegion *mr = host_memory_backend_get_memory(ct3d->hostmem);
+Range window, device;
+ComponentRegisters *cregs = &ct3d->cxl_cstate.crb;
+uint32_t *cache_mem = cregs->cache_mem_registers;
+uint64_t offset, size;
+Error *err = NULL;
+
+assert(which == 0);
+
+ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, COMMIT, 0);
+ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, ERROR, 0);
+
+offset = ((uint64_t)cache_mem[R_CXL_HDM_DECODER0_BASE_HI] << 32) |
+ cache_mem[R_CXL_HDM_DECODER0_BASE_LO];
+size = ((uint64_t)cache_mem[R_CXL_HDM_DECODER0_SIZE_HI] << 32) |
+   cache_mem[R_CXL_HDM_DECODER0_SIZE_LO];
+
+range_init_nofail(&window, mr->addr, memory_region_size(mr));
+range_init_nofail(&device, offset, size);
+
+if (!range_contains_range(&window, &device)) {
+ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, ERROR, 1);
+return;
+}
+
+memory_region_ram_resize(pmem, size, &err);
+if (err) {
+ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, ERROR, 1);
+return;
+}
+
+offset -= mr->addr;
+memory_region_add_subregion(mr, offset, pmem);
+
+ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, COMMITTED, 1);
+}
+
+static void ct3d_reg_write(void *opaque, hwaddr offset, uint64_t value, 
unsigned size)
+{
+CXLComponentState *cxl_cstate = opaque;
+ComponentRegisters *cregs = &cxl_cstate->crb;
+CXLType3Dev *ct3d = container_of(cxl_cstate, CXLType3Dev, cxl_cstate);
+uint32_t *cache_mem = cregs->cache_mem_registers;
+bool should_commit = false;
+int which_hdm = -1;
+
+assert(size == 4);
+
+switch (offset) {
+case A_CXL_HDM_DECODER0_CTRL:
+should_commit = FIELD_EX32(value, CXL_HDM_DECODER0_CTRL, COMMIT);
+which_hdm = 0;
+break;
+default:
+break;
+}
+
+stl_le_p((uint8_t *)cache_mem + offset, value);
+if (should_commit)
+hdm_decoder_commit(ct3d, which_hdm);
+}
+
 static void ct3_instance_init(Object *obj)
 {
 /* MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(obj); */
@@ -65,7 +130,10 @@ static void ct3_instance_init(Object *obj)
 static void ct3_finalize(Object *obj)
 {
 CXLType3Dev *ct3d = CT3(obj);
+CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
+ComponentRegisters *regs = &cxl_cstate->crb;
 
+g_free((void *)regs->special_ops);
 g_free(ct3d->cxl_dstate.pmem);
 }
 
@@ -81,11 +149,12 @@ static void cxl_setup_memory(CXLType3Dev *ct3d, Error 
**errp)
 return;
 }
 
-/* FIXME: need to check mr is the host bridge's MR */
-mr = host_memory_backend_get_memory(ct3d->hostmem);
-
 /* Create our new subregion */
 ct3d->cxl_dstate.pmem = g_new(MemoryRegion, 1);
+memory_region_set_nonvolatile(ct3d->cxl_dstate.pmem, true);
+
+/* FIXME: need to check mr is the host bridge's MR */
+mr = host_memory_backend_get_memory(ct3d->hostmem);
 
 /* Find the first free space in the window */
 WITH_RCU_READ_LOCK_GUARD()
@@ -108,8 +177,6 @@ static void cxl_setup_memory(CXLType3Dev *ct3d, Error 
**errp)
 /* Register our subregion as non-volatile */
 memory_region_init_ram(ct3d->cxl_dstate.pmem, OBJECT(ct3d),
"cxl_type3-memory", ct3d->size, errp);
-memory_region_set_nonvolatile(ct3d->cxl_dstate.pmem, true);
-
 #ifdef SET_PMEM_PADDR
 memory_region_add_subregion(mr, offset, ct3d->cxl_dstate.pmem);
 #endif
@@ -148,6 +215,11 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
 

[RFC PATCH 22/25] acpi/cxl: Create the CEDT (9.14.1)

2020-11-10 Thread Ben Widawsky
The CXL Early Discovery Table is defined in the CXL 2.0 specification as
a way for the OS to get CXL specific information from the system
firmware.

As of the CXL 2.0 spec, only one substructure is defined: the CXL Host
Bridge Structure (CHBS), which is primarily useful for telling the OS
exactly where the MMIO for the host bridge is.
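
As a rough sketch of the record layout, the following encodes one CHBS
entry with the field order and widths used by cedt_build_chbs() in this
patch (1 + 1 + 2 + 4 + 4 + 4 + 8 + 4 + 4 = 32 bytes). The helper names
are hypothetical, and writing the reserved fields as zero is an
assumption of the sketch rather than what the patch does.

```c
#include <stddef.h>
#include <stdint.h>

#define CHBS_RECORD_LEN 32

/* Store `len` bytes of `value` at buf[off], least significant byte first. */
static size_t put_le(uint8_t *buf, size_t off, uint64_t value, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        buf[off + i] = (uint8_t)(value >> (8 * i));
    }
    return off + len;
}

/*
 * Encode one CHBS record into buf, which must hold CHBS_RECORD_LEN bytes.
 * Returns the number of bytes written.
 */
static size_t chbs_encode(uint8_t *buf, uint32_t uid, uint64_t base,
                          uint32_t length)
{
    size_t off = 0;

    off = put_le(buf, off, 0, 1);               /* Type (0 in this patch) */
    off = put_le(buf, off, 0, 1);               /* Reserved (assumed 0) */
    off = put_le(buf, off, CHBS_RECORD_LEN, 2); /* Record Length */
    off = put_le(buf, off, uid, 4);             /* UID */
    off = put_le(buf, off, 1, 4);               /* Version */
    off = put_le(buf, off, 0, 4);               /* Reserved (assumed 0) */
    off = put_le(buf, off, base, 8);            /* Base */
    off = put_le(buf, off, length, 4);          /* Length */
    off = put_le(buf, off, 0, 4);               /* Reserved (assumed 0) */
    return off;
}
```

The returned size matching the Record Length field is a cheap sanity
check that no field width was miscounted.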

Signed-off-by: Ben Widawsky 
---
 hw/acpi/cxl.c   | 72 +
 hw/i386/acpi-build.c|  6 ++-
 hw/pci-bridge/pci_expander_bridge.c | 21 +
 include/hw/acpi/cxl.h   |  4 ++
 include/hw/pci/pci_bridge.h | 25 ++
 5 files changed, 107 insertions(+), 21 deletions(-)

diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
index 31ceaeecc3..c9631763ad 100644
--- a/hw/acpi/cxl.c
+++ b/hw/acpi/cxl.c
@@ -18,14 +18,86 @@
  */
 
 #include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/pci_host.h"
 #include "hw/cxl/cxl.h"
+#include "hw/mem/memory-device.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/bios-linker-loader.h"
 #include "hw/acpi/cxl.h"
+#include "hw/acpi/cxl.h"
 #include "qapi/error.h"
 #include "qemu/uuid.h"
 
+static void cedt_build_chbs(GArray *table_data, PXBDev *cxl)
+{
+SysBusDevice *sbd = SYS_BUS_DEVICE(cxl->cxl.cxl_host_bridge);
+struct MemoryRegion *mr = sbd->mmio[0].memory;
+
+/* Type */
+build_append_int_noprefix(table_data, 0, 1);
+
+/* Reserved */
+build_append_int_noprefix(table_data, 0xff, 1);
+
+/* Record Length */
+build_append_int_noprefix(table_data, 32, 2);
+
+/* UID */
+build_append_int_noprefix(table_data, cxl->uid, 4);
+
+/* Version */
+build_append_int_noprefix(table_data, 1, 4);
+
+/* Reserved */
+build_append_int_noprefix(table_data, 0x, 4);
+
+/* Base */
+build_append_int_noprefix(table_data, mr->addr, 8);
+
+/* Length */
+build_append_int_noprefix(table_data, memory_region_size(mr), 4);
+
+/* Reserved */
+build_append_int_noprefix(table_data, 0x, 4);
+}
+
+static int cxl_foreach_pxb_hb(Object *obj, void *opaque)
+{
+Aml *cedt = opaque;
+
+if (object_dynamic_cast(obj, TYPE_PXB_CXL_DEVICE)) {
+PXBDev *pxb = PXB_CXL_DEV(obj);
+
+cedt_build_chbs(cedt->buf, pxb);
+}
+
+return 0;
+}
+
+void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
+BIOSLinker *linker)
+{
+const int cedt_start = table_data->len;
+Aml *cedt;
+
+cedt = init_aml_allocator();
+
+/* reserve space for CEDT header */
+acpi_add_table(table_offsets, table_data);
+acpi_data_push(cedt->buf, sizeof(AcpiTableHeader));
+
+object_child_foreach_recursive(object_get_root(), cxl_foreach_pxb_hb, 
cedt);
+
+/* copy AML table into ACPI tables blob and patch header there */
+g_array_append_vals(table_data, cedt->buf->data, cedt->buf->len);
+build_header(linker, table_data, (void *)(table_data->data + cedt_start),
+ "CEDT", table_data->len - cedt_start, 1, NULL, NULL);
+free_aml_allocator();
+}
+
 static Aml *__build_cxl_osc_method(void)
 {
 Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, 
*if_caps_masked;
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index dd1f8b39d4..eda62dcd6a 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -75,6 +75,8 @@
 #include "hw/acpi/ipmi.h"
 #include "hw/acpi/hmat.h"
 
+#include "hw/acpi/cxl.h"
+
 /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
  * a little bit, there should be plenty of free space since the DSDT
@@ -1662,7 +1664,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
 scope = aml_scope("\\_SB");
 if (type == CXL) {
-dev = aml_device("CXL%.01X", pci_bus_uid(bus));
+dev = aml_device("CXL%.01X", uid);
 } else {
 dev = aml_device("PC%.02X", bus_num);
 }
@@ -2568,6 +2570,8 @@ void acpi_build(AcpiBuildTables *tables, MachineState 
*machine)
   machine->nvdimms_state, machine->ram_slots);
 }
 
+cxl_build_cedt(table_offsets, tables_blob, tables->linker);
+
 acpi_add_table(table_offsets, tables_blob);
 build_waet(tables_blob, tables->linker);
 
diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index 75910f5870..b2c1d9056a 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -57,26 +57,6 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
 DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
  TYPE_PXB_PCIE_DEVICE)
 
-#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
-DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
- TYPE_PXB_CXL_DEVICE)
-
-struct PXBDev {
-/*< private >*/
-PCIDevice parent_obj;
-/*< public >*/
-
-uint8_t 

[RFC PATCH 21/25] acpi/cxl: Introduce a compat-driver UUID for CXL _OSC

2020-11-10 Thread Ben Widawsky
From: Vishal Verma 

Introduce a new UUID for CXL _OSC that only sets the CXL related
'Support' and 'Control' Dwords, independent of the PCI/PCIe Dwords. This
is a proposal and an example AML implementation to demonstrate what such
a compat UUID would look like.

The AML resulting from this change is:

Method (_OSC, 4, NotSerialized)  // _OSC: Operating System Capabilities
{
CreateDWordField (Arg3, Zero, CDW1)
If ((((Arg0 == ToUUID ("33db4d5b-1ff7-401c-9657-7441c03dd766") /* PCI Host Bridge Device */) || (Arg0 == ToUUID ("68f2d50b-c469-4d8a-bd3d-941a103fd3fc"))) || (
Arg0 == ToUUID ("a4d1629d-ff52-4888-be96-e5cade548db1"))))
{
If ((Arg0 == ToUUID ("a4d1629d-ff52-4888-be96-e5cade548db1")))
{
CreateDWordField (Arg3, 0x04, CDW2)
CreateDWordField (Arg3, 0x08, CDW3)
SUPC = CDW2 /* \_SB_.CXL0._OSC.CDW2 */
CTRC = CDW3 /* \_SB_.CXL0._OSC.CDW3 */
CDW3 |= One
Return (Arg3)
}
Else
{
CreateDWordField (Arg3, 0x04, CDW2)
CreateDWordField (Arg3, 0x08, CDW3)
Local0 = CDW3 /* \_SB_.CXL0._OSC.CDW3 */
CTRL &= 0x1F
If ((Arg1 != One))
{
CDW1 |= 0x08
}

If ((CDW3 != Local0))
{
CDW1 |= 0x10
}

SUPP = CDW2 /* \_SB_.CXL0._OSC.CDW2 */
CTRL = CDW3 /* \_SB_.CXL0._OSC.CDW3 */
If ((Arg0 == ToUUID 
("68f2d50b-c469-4d8a-bd3d-941a103fd3fc")))
{
CreateDWordField (Arg3, 0x0C, CDW4)
CreateDWordField (Arg3, 0x10, CDW5)
SUPC = CDW4 /* \_SB_.CXL0._OSC.CDW4 */
CTRC = CDW5 /* \_SB_.CXL0._OSC.CDW5 */
CDW5 |= One
}

CDW3 = Local0
Return (Arg3)
}
}

Return (Arg3)
Else
{
CDW1 |= 0x04
}
}
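
The UUID dispatch in the AML above can be read as a three-way
classification. The sketch below restates it in C for illustration only.
The enum and function names are hypothetical, and labeling the 68f2d50b
UUID as the CXL one is inferred from how this series uses it.

```c
#include <ctype.h>

enum osc_kind {
    OSC_UNRECOGNIZED,    /* AML sets CDW1 |= 0x04 */
    OSC_PCI_HOST_BRIDGE, /* PCIe Dwords only (CDW2, CDW3) */
    OSC_CXL,             /* PCIe Dwords plus CXL Dwords (CDW4, CDW5) */
    OSC_CXL_COMPAT,      /* CXL Dwords only, placed at CDW2 and CDW3 */
};

/* Case-insensitive comparison of two UUID strings. */
static int uuid_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b;
}

/* Map the UUID passed as Arg0 to the branch the _OSC method would take. */
static enum osc_kind osc_classify(const char *uuid)
{
    if (uuid_equal(uuid, "A4D1629D-FF52-4888-BE96-E5CADE548DB1")) {
        return OSC_CXL_COMPAT;
    }
    if (uuid_equal(uuid, "68F2D50B-C469-4D8A-BD3D-941A103FD3FC")) {
        return OSC_CXL;
    }
    if (uuid_equal(uuid, "33DB4D5B-1FF7-401C-9657-7441C03DD766")) {
        return OSC_PCI_HOST_BRIDGE;
    }
    return OSC_UNRECOGNIZED;
}
```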

Signed-off-by: Vishal Verma 
---
 hw/acpi/cxl.c | 54 ---
 1 file changed, 38 insertions(+), 16 deletions(-)

diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
index 7124d5a1a3..31ceaeecc3 100644
--- a/hw/acpi/cxl.c
+++ b/hw/acpi/cxl.c
@@ -29,6 +29,7 @@
 static Aml *__build_cxl_osc_method(void)
 {
 Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, 
*if_caps_masked;
+Aml *if_compat, *else_nocompat;
 Aml *a_ctrl = aml_local(0);
 Aml *a_cdw1 = aml_name("CDW1");
 
@@ -37,31 +38,51 @@ static Aml *__build_cxl_osc_method(void)
 
 /* 9.14.2.1.4 */
 if_uuid = aml_if(
-aml_lor(aml_equal(aml_arg(0),
+aml_lor(
+aml_lor(aml_equal(aml_arg(0),
   aml_touuid("33DB4D5B-1FF7-401C-9657-7441C03DD766")),
-aml_equal(aml_arg(0),
-  aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC"))));
-aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(4), 
"CDW2"));
-aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(8), 
"CDW3"));
-
-aml_append(if_uuid, aml_store(aml_name("CDW3"), a_ctrl));
+aml_equal(aml_arg(0),
+  aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC"))),
+aml_equal(aml_arg(0),
+  aml_touuid("A4D1629D-FF52-4888-BE96-E5CADE548DB1"))));
+
+if_compat = aml_if(aml_equal(aml_arg(0),
+  aml_touuid("A4D1629D-FF52-4888-BE96-E5CADE548DB1")));
+aml_append(if_compat,
+   aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
+aml_append(if_compat,
+   aml_create_dword_field(aml_arg(3), aml_int(8), "CDW3"));
+aml_append(if_compat, aml_store(aml_name("CDW2"), aml_name("SUPC")));
+aml_append(if_compat, aml_store(aml_name("CDW3"), aml_name("CTRC")));
+aml_append(if_compat,
+   aml_or(aml_name("CDW3"), aml_int(0x1), aml_name("CDW3")));
+aml_append(if_compat, aml_return(aml_arg(3)));
+aml_append(if_uuid, if_compat);
+
+else_nocompat = aml_else();
+aml_append(else_nocompat,
+   aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
+aml_append(else_nocompat,
+   aml_create_dword_field(aml_arg(3), aml_int(8), "CDW3"));
+
+aml_append(else_nocompat, aml_store(aml_name("CDW3"), a_ctrl));
 
 /* This is all the same as what's used for PCIe */
-aml_append(if_uuid,
+aml_append(else_nocompat,
aml_and(aml_name("CTRL"), aml_int(0x1F), aml_name("CTRL")));
 
 if_arg1_not_1 = aml_if(aml_lnot(aml_equal(aml_arg(1), 

[RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5)

2020-11-10 Thread Ben Widawsky
A CXL memory device (AKA Type 3) is a CXL component that contains some
combination of volatile and persistent memory. It also implements the
previously defined mailbox interface as well as the memory device
firmware interface.

The following example will create a 256M device in a 512M window:

-object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
-device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"

Signed-off-by: Ben Widawsky 
---
 hw/core/numa.c   |   3 +
 hw/i386/pc.c |   1 +
 hw/mem/Kconfig   |   5 +
 hw/mem/cxl_type3.c   | 262 +++
 hw/mem/meson.build   |   1 +
 hw/pci/pcie.c|  30 +
 include/hw/cxl/cxl.h |   2 +
 include/hw/cxl/cxl_pci.h |  22 
 include/hw/pci/pci_ids.h |   1 +
 monitor/hmp-cmds.c   |  15 +++
 qapi/machine.json|   1 +
 11 files changed, 343 insertions(+)
 create mode 100644 hw/mem/cxl_type3.c

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 7c4dd4e68e..3ddeb23036 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -770,6 +770,9 @@ static void numa_stat_memory_devices(NumaNodeMem node_mem[])
 node_mem[pcdimm_info->node].node_plugged_mem +=
 pcdimm_info->size;
 break;
+case MEMORY_DEVICE_INFO_KIND_CXL:
+/* FINISHME */
+break;
 case MEMORY_DEVICE_INFO_KIND_VIRTIO_PMEM:
 vpi = value->u.virtio_pmem.data;
 /* TODO: once we support numa, assign to right node */
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5e6c0023e0..ecfc497f71 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -79,6 +79,7 @@
 #include "acpi-build.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/cxl/cxl.h"
 #include "qapi/error.h"
 #include "qapi/qapi-visit-common.h"
 #include "qapi/visitor.h"
diff --git a/hw/mem/Kconfig b/hw/mem/Kconfig
index a0ef2cf648..7d9d1ced3e 100644
--- a/hw/mem/Kconfig
+++ b/hw/mem/Kconfig
@@ -10,3 +10,8 @@ config NVDIMM
 default y
 depends on (PC || PSERIES || ARM_VIRT)
 select MEM_DEVICE
+
+config CXL_MEM_DEVICE
+bool
+default y if CXL
+select MEM_DEVICE
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
new file mode 100644
index 0000000000..48c25922f3
--- /dev/null
+++ b/hw/mem/cxl_type3.c
@@ -0,0 +1,262 @@
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/error-report.h"
+#include "hw/mem/memory-device.h"
+#include "hw/mem/pc-dimm.h"
+#include "hw/pci/pci.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "qemu/range.h"
+#include "qemu/rcu.h"
+#include "sysemu/hostmem.h"
+#include "hw/cxl/cxl.h"
+
+typedef struct cxl_type3_dev {
+/* Private */
+PCIDevice parent_obj;
+
+/* Properties */
+uint64_t size;
+HostMemoryBackend *hostmem;
+
+/* State */
+CXLComponentState cxl_cstate;
+CXLDeviceState cxl_dstate;
+} CXLType3Dev;
+
+#define CT3(obj) OBJECT_CHECK(CXLType3Dev, (obj), TYPE_CXL_TYPE3_DEV)
+
+static void build_dvsecs(CXLType3Dev *ct3d)
+{
+CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
+uint8_t *dvsec;
+
+dvsec = (uint8_t *)&(struct dvsec_device){
+.cap = 0x1e,
+.ctrl = 0x6,
+.status2 = 0x2,
+.range1_size_hi = 0,
+.range1_size_lo = (2 << 5) | (2 << 2) | 0x3 | ct3d->size,
+.range1_base_hi = 0,
+.range1_base_lo = 0,
+};
+cxl_component_create_dvsec(cxl_cstate, PCIE_CXL_DEVICE_DVSEC_LENGTH,
+   PCIE_CXL_DEVICE_DVSEC,
+   PCIE_CXL_DEVICE_DVSEC_REVID, dvsec);
+
+dvsec = (uint8_t *)&(struct dvsec_register_locator){
+.rsvd = 0,
+.reg0_base_lo = RBI_COMPONENT_REG | COMPONENT_REG_BAR_IDX,
+.reg0_base_hi = 0,
+.reg1_base_lo = RBI_CXL_DEVICE_REG | DEVICE_REG_BAR_IDX,
+.reg1_base_hi = 0,
+};
+cxl_component_create_dvsec(cxl_cstate, REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
+   REG_LOC_DVSEC_REVID, dvsec);
+}
+
+static void ct3_instance_init(Object *obj)
+{
+/* MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(obj); */
+}
+
+static void ct3_finalize(Object *obj)
+{
+CXLType3Dev *ct3d = CT3(obj);
+
+g_free(ct3d->cxl_dstate.pmem);
+}
+
+static void cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
+{
+MemoryRegionSection mrs;
+MemoryRegion *mr;
+uint64_t offset = 0;
+size_t remaining_size;
+
+if (!ct3d->hostmem) {
+error_setg(errp, "memdev property must be set");
+return;
+}
+
+/* FIXME: need to check mr is the host bridge's MR */
+mr = host_memory_backend_get_memory(ct3d->hostmem);
+
+/* Create our new subregion */
+ct3d->cxl_dstate.pmem = g_new(MemoryRegion, 1);
+
+/* Find the first free space in the window */
+WITH_RCU_READ_LOCK_GUARD()
+{
+mrs = memory_region_find(mr, 

[RFC PATCH 11/25] hw/pxb: Allow creation of a CXL PXB (host bridge)

2020-11-10 Thread Ben Widawsky
This works like adding a typical pxb device, except the name is
'pxb-cxl' instead of 'pxb-pcie'. An example command line would be as
follows:
  -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1

A CXL PXB is backward compatible with PCIe. What this means in practice
is that an operating system that is unaware of CXL should still be able
to enumerate this topology as if it were PCIe.

One can create multiple CXL PXB host bridges, but a host bridge can only
be connected to the main root bus. Host bridges cannot appear elsewhere
in the topology.

Note that as of this patch, the ACPI tables needed for the host bridge
(specifically, an ACPI object in _SB named ACPI0016 and the CEDT) aren't
created. So while this patch internally creates it, it cannot be
properly used by an operating system or other system software.

Upcoming patches will allow creating multiple host bridges.

Signed-off-by: Ben Widawsky 
---
 hw/pci-bridge/pci_expander_bridge.c | 67 -
 hw/pci/pci.c|  7 +++
 include/hw/pci/pci.h|  6 +++
 3 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index 88c45dc3b5..3a8d815231 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -56,6 +56,10 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
 DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
  TYPE_PXB_PCIE_DEVICE)
 
+#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
+DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
+ TYPE_PXB_CXL_DEVICE)
+
 struct PXBDev {
 /*< private >*/
 PCIDevice parent_obj;
@@ -67,6 +71,11 @@ struct PXBDev {
 
 static PXBDev *convert_to_pxb(PCIDevice *dev)
 {
+/* A CXL PXB's parent bus is PCIe, so the normal check won't work */
+if (object_dynamic_cast(OBJECT(dev), TYPE_PXB_CXL_DEVICE)) {
+return PXB_CXL_DEV(dev);
+}
+
 return pci_bus_is_express(pci_get_bus(dev))
 ? PXB_PCIE_DEV(dev) : PXB_DEV(dev);
 }
@@ -111,11 +120,20 @@ static const TypeInfo pxb_pcie_bus_info = {
 .class_init= pxb_bus_class_init,
 };
 
+static const TypeInfo pxb_cxl_bus_info = {
+.name  = TYPE_PXB_CXL_BUS,
+.parent= TYPE_CXL_BUS,
+.instance_size = sizeof(PXBBus),
+.class_init= pxb_bus_class_init,
+};
+
 static const char *pxb_host_root_bus_path(PCIHostState *host_bridge,
   PCIBus *rootbus)
 {
-PXBBus *bus = pci_bus_is_express(rootbus) ?
-  PXB_PCIE_BUS(rootbus) : PXB_BUS(rootbus);
+PXBBus *bus = pci_bus_is_cxl(rootbus) ?
+  PXB_CXL_BUS(rootbus) :
+  pci_bus_is_express(rootbus) ? PXB_PCIE_BUS(rootbus) :
+PXB_BUS(rootbus);
 
 snprintf(bus->bus_path, 8, ":%02x", pxb_bus_num(rootbus));
 return bus->bus_path;
@@ -380,13 +398,58 @@ static const TypeInfo pxb_pcie_dev_info = {
 },
 };
 
+static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
+{
+/* A CXL PXB's parent bus is still PCIe */
+if (!pci_bus_is_express(pci_get_bus(dev))) {
+error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
+return;
+}
+
+pxb_dev_realize_common(dev, CXL, errp);
+}
+
+static void pxb_cxl_dev_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc   = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k->realize = pxb_cxl_dev_realize;
+k->exit= pxb_dev_exitfn;
+k->vendor_id   = PCI_VENDOR_ID_INTEL;
+k->device_id   = 0xabcd;
+k->class_id= PCI_CLASS_BRIDGE_HOST;
+k->subsystem_vendor_id = PCI_VENDOR_ID_INTEL;
+
+dc->desc = "CXL Host Bridge";
+device_class_set_props(dc, pxb_dev_properties);
+set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+
+/* Host bridges aren't hotpluggable. FIXME: spec reference */
+dc->hotpluggable = false;
+}
+
+static const TypeInfo pxb_cxl_dev_info = {
+.name  = TYPE_PXB_CXL_DEVICE,
+.parent= TYPE_PCI_DEVICE,
+.instance_size = sizeof(PXBDev),
+.class_init= pxb_cxl_dev_class_init,
+.interfaces =
+(InterfaceInfo[]){
+{ INTERFACE_CONVENTIONAL_PCI_DEVICE },
+{},
+},
+};
+
 static void pxb_register_types(void)
 {
 type_register_static(&pxb_bus_info);
 type_register_static(&pxb_pcie_bus_info);
+type_register_static(&pxb_cxl_bus_info);
 type_register_static(&pxb_host_info);
 type_register_static(&pxb_dev_info);
 type_register_static(&pxb_pcie_dev_info);
+type_register_static(&pxb_cxl_dev_info);
 }
 }
 
 type_init(pxb_register_types)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index db88788c4b..67eed889a4 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -220,6 +220,12 @@ static const TypeInfo pcie_bus_info = {
 .class_init = pcie_bus_class_init,
 };
 

[RFC PATCH 15/25] acpi/pxb/cxl: Reserve host bridge MMIO

2020-11-10 Thread Ben Widawsky
For all host bridges, reserve MMIO space with _CRS. The MMIO for the
host bridge lives in a magically hard coded space in the system's
physical address space. The standard mechanism to tell the OS about
regions which can't be used for host bridges is _CRS.
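
The per-bridge reservation amounts to one fixed-size block per host
bridge UID. A minimal sketch follows. Both constants are placeholder
values chosen for illustration, not the ones QEMU uses, and the names
are hypothetical.

```c
#include <stdint.h>

/* Placeholder values for illustration; not the constants QEMU defines. */
#define EXAMPLE_HOST_BASE  0xD0000000ULL
#define EXAMPLE_BLOCK_SIZE 0x10000ULL

struct mmio_range {
    uint64_t base;  /* first address of the block */
    uint64_t limit; /* last address of the block, inclusive */
};

/* Compute the MMIO block reserved for the host bridge with this UID. */
static struct mmio_range host_bridge_mmio(uint32_t uid)
{
    struct mmio_range r;

    r.base = EXAMPLE_HOST_BASE + (uint64_t)uid * EXAMPLE_BLOCK_SIZE;
    r.limit = r.base + EXAMPLE_BLOCK_SIZE - 1;
    return r;
}
```

Because the limit is inclusive, consecutive UIDs yield adjacent,
non-overlapping ranges, which is the shape crs_range_insert() is given
in the hunk below.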

Signed-off-by: Ben Widawsky 
---
 hw/i386/acpi-build.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index aaed7da7dc..fae4fa28e1 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -28,6 +28,7 @@
 #include "qemu/bitmap.h"
 #include "qemu/error-report.h"
 #include "hw/pci/pci.h"
+#include "hw/cxl/cxl.h"
 #include "hw/core/cpu.h"
 #include "target/i386/cpu.h"
 #include "hw/misc/pvpanic.h"
@@ -1486,7 +1487,7 @@ static void build_smb0(Aml *table, I2CBus *smbus, int 
devnr, int func)
 aml_append(table, scope);
 }
 
-enum { PCI, PCIE };
+enum { PCI, PCIE, CXL };
 static void init_pci_acpi(Aml *dev, int uid, int type)
 {
 if (type == PCI) {
@@ -1635,20 +1636,28 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 uint8_t bus_num = pci_bus_num(bus);
 uint8_t numa_node = pci_bus_numa_node(bus);
 int32_t uid = pci_bus_uid(bus);
+int type;
 
 /* look only for expander root buses */
 if (!pci_bus_is_root(bus)) {
 continue;
 }
 
+type = pci_bus_is_cxl(bus) ? CXL :
+ pci_bus_is_express(bus) ? PCIE : PCI;
+
 if (bus_num < root_bus_limit) {
 root_bus_limit = bus_num - 1;
 }
 
 scope = aml_scope("\\_SB");
-dev = aml_device("PC%.02X", bus_num);
+if (type == CXL) {
+dev = aml_device("CXL%.01X", pci_bus_uid(bus));
+} else {
+dev = aml_device("PC%.02X", bus_num);
+}
 aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
-init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
+init_pci_acpi(dev, uid, type);
 
 if (numa_node != NUMA_NODE_UNASSIGNED) {
 aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
@@ -1659,6 +1668,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 aml_append(dev, aml_name_decl("_CRS", crs));
 aml_append(scope, dev);
 aml_append(dsdt, scope);
+
+/* Handle the ranges for the PXB expanders */
+if (type == CXL) {
+uint64_t base = CXL_HOST_BASE + uid * 0x1;
+crs_range_insert(crs_range_set.mem_ranges, base,
+ base + 0x1 - 1);
+}
 }
 }
 
-- 
2.29.2




[RFC PATCH 10/25] hw/pci/cxl: Create a CXL bus type

2020-11-10 Thread Ben Widawsky
The easiest way to differentiate a CXL bus from a PCIe bus is with a
flag. A CXL bus, in hardware, is backward compatible with PCIe, and
therefore the code tries pretty hard to keep them in sync as much as
possible.

The other way to implement this would be to try to cast the bus to the
correct type. The flag approach is less code and is useful for
debugging, since one can simply look at the flags.
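
A standalone sketch of the flag approach is shown below. The types and
names are simplified stand-ins for PCIBus and its helpers, and the
`express` field is an assumption made so the example is self-contained
(QEMU derives that property from the bus class).

```c
#include <stdbool.h>
#include <stdint.h>

enum bus_flags {
    BUS_IS_ROOT               = 0x0001,
    BUS_EXTENDED_CONFIG_SPACE = 0x0002,
    BUS_CXL                   = 0x0004,
};

enum bus_kind { KIND_PCI, KIND_PCIE, KIND_CXL };

/* Simplified stand-in for PCIBus. */
struct bus {
    uint32_t flags;
    bool express; /* assumption: stored directly for this sketch */
};

static bool bus_is_cxl(const struct bus *b)
{
    return !!(b->flags & BUS_CXL);
}

/* CXL is checked first because a CXL bus is also a PCIe bus. */
static enum bus_kind bus_classify(const struct bus *b)
{
    if (bus_is_cxl(b)) {
        return KIND_CXL;
    }
    return b->express ? KIND_PCIE : KIND_PCI;
}
```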

Signed-off-by: Ben Widawsky 
---
 hw/pci-bridge/pci_expander_bridge.c | 9 -
 include/hw/pci/pci_bus.h| 7 +++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index 232b7ce305..88c45dc3b5 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -24,7 +24,7 @@
 #include "hw/boards.h"
 #include "qom/object.h"
 
-enum BusType { PCI, PCIE };
+enum BusType { PCI, PCIE, CXL };
 
 #define TYPE_PXB_BUS "pxb-bus"
 typedef struct PXBBus PXBBus;
@@ -35,6 +35,10 @@ DECLARE_INSTANCE_CHECKER(PXBBus, PXB_BUS,
 DECLARE_INSTANCE_CHECKER(PXBBus, PXB_PCIE_BUS,
  TYPE_PXB_PCIE_BUS)
 
+#define TYPE_PXB_CXL_BUS "pxb-cxl-bus"
+DECLARE_INSTANCE_CHECKER(PXBBus, PXB_CXL_BUS,
+ TYPE_PXB_CXL_BUS)
+
 struct PXBBus {
 /*< private >*/
 PCIBus parent_obj;
@@ -244,6 +248,9 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum 
BusType type,
 ds = qdev_new(TYPE_PXB_HOST);
 if (type == PCIE) {
 bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
+} else if (type == CXL) {
+bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
+bus->flags |= PCI_BUS_CXL;
 } else {
 bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, 
TYPE_PXB_BUS);
 bds = qdev_new("pci-bridge");
diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
index 347440d42c..eb94e7e85c 100644
--- a/include/hw/pci/pci_bus.h
+++ b/include/hw/pci/pci_bus.h
@@ -24,6 +24,8 @@ enum PCIBusFlags {
 PCI_BUS_IS_ROOT = 0x0001,
 /* PCIe extended configuration space is accessible on this bus */
 PCI_BUS_EXTENDED_CONFIG_SPACE   = 0x0002,
+/* This is a CXL Type BUS */
+PCI_BUS_CXL = 0x0004,
 };
 
 struct PCIBus {
@@ -53,6 +55,11 @@ struct PCIBus {
 Notifier machine_done;
 };
 
+static inline bool pci_bus_is_cxl(PCIBus *bus)
+{
+return !!(bus->flags & PCI_BUS_CXL);
+}
+
 static inline bool pci_bus_is_root(PCIBus *bus)
 {
 return !!(bus->flags & PCI_BUS_IS_ROOT);
-- 
2.29.2




[RFC PATCH 16/25] hw/pxb/cxl: Add "windows" for host bridges

2020-11-10 Thread Ben Widawsky
In a bare metal CXL capable system, system firmware will program
physical address ranges on the host. This is done by programming
internal registers that aren't typically known to the OS. These address
ranges might be contiguous or interleaved across host bridges.

For a QEMU guest, a new construct is introduced that allows passing a
memory backend to the host bridge for this same purpose. Each memory
backend needs to be passed to the host bridge as well as to any device
that will be emulating that memory (not implemented here).
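
As a rough sketch of the consistency rule this implies (every window
that has a memory backend also needs a base address), assuming a
hypothetical struct demo_host_bridge and an arbitrary DEMO_WINDOW_MAX
that are not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_WINDOW_MAX 4   /* hypothetical limit for this sketch */

/* Stand-in for the per-bridge window state; names are illustrative. */
struct demo_host_bridge {
    const void *memory_window[DEMO_WINDOW_MAX]; /* NULL means "not set" */
    uint64_t window_base[DEMO_WINDOW_MAX];
    uint32_t num_bases;                         /* bases given by the user */
};

/* Count the windows that actually have a backend attached. */
static uint32_t demo_count_windows(const struct demo_host_bridge *hb)
{
    uint32_t count = 0;

    assert(hb != NULL);
    for (size_t i = 0; i < DEMO_WINDOW_MAX; i++) {
        if (hb->memory_window[i]) {
            count++;
        }
    }
    return count;
}

/* The number of backends must match the number of bases supplied. */
static bool demo_windows_consistent(const struct demo_host_bridge *hb)
{
    return demo_count_windows(hb) == hb->num_bases;
}
```

A bridge with no windows at all passes this check; in that case it
would only operate in legacy PCIe mode.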

I'm hopeful the interleaving work in the link can be re-purposed here
(see Link).

An example that creates a host bridge with a 512M window at 0x4c000:
 -object memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M
 -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-memory-base=1,memory-base\[0\]=0x4c000,memory\[0\]=cxl-mem1

Link: https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg03680.html
Signed-off-by: Ben Widawsky 
---
 hw/pci-bridge/pci_expander_bridge.c | 65 +++--
 include/hw/cxl/cxl.h|  1 +
 2 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index eca5c71d45..75910f5870 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -69,12 +69,19 @@ struct PXBDev {
 uint8_t bus_nr;
 uint16_t numa_node;
 int32_t uid;
+struct cxl_dev {
+HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
+
+uint32_t num_windows;
+hwaddr *window_base[CXL_WINDOW_MAX];
+} cxl;
 };
 
 typedef struct CXLHost {
 PCIHostState parent_obj;
 
 CXLComponentState cxl_cstate;
+PXBDev *dev;
 } CXLHost;
 
 static PXBDev *convert_to_pxb(PCIDevice *dev)
@@ -213,16 +220,31 @@ static void pxb_cxl_realize(DeviceState *dev, Error 
**errp)
 SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
 PCIHostState *phb = PCI_HOST_BRIDGE(dev);
 CXLHost *cxl = PXB_CXL_HOST(dev);
+struct cxl_dev *cxl_dev = &cxl->dev->cxl;
 CXLComponentState *cxl_cstate = &cxl->cxl_cstate;
 struct MemoryRegion *mr = &cxl_cstate->crb.component_registers;
+int uid = pci_bus_uid(phb->bus);
 
 cxl_component_register_block_init(OBJECT(dev), cxl_cstate,
   TYPE_PXB_CXL_HOST);
 sysbus_init_mmio(sbd, mr);
 
-/* FIXME: support multiple host bridges. */
-sysbus_mmio_map(sbd, 0, CXL_HOST_BASE +
-memory_region_size(mr) * pci_bus_uid(phb->bus));
+sysbus_mmio_map(sbd, 0, CXL_HOST_BASE + memory_region_size(mr) * uid);
+
+/*
+ * A CXL host bridge can exist without a fixed memory window, but it would
+ * only operate in legacy PCIe mode.
+ */
+if (!cxl_dev->memory_window[uid]) {
+warn_report(
+"CXL expander bridge created without window. Consider using %s",
+"memdev[0]=");
+return;
+}
+
+mr = host_memory_backend_get_memory(cxl_dev->memory_window[uid]);
+sysbus_init_mmio(sbd, mr);
+sysbus_mmio_map(sbd, 1 + uid, *cxl_dev->window_base[uid]);
 }
 
 static void pxb_cxl_host_class_init(ObjectClass *class, void *data)
@@ -328,6 +350,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum 
BusType type,
 } else if (type == CXL) {
 bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
 bus->flags |= PCI_BUS_CXL;
+PXB_CXL_HOST(ds)->dev = PXB_CXL_DEV(dev);
 } else {
 bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, 
TYPE_PXB_BUS);
 bds = qdev_new("pci-bridge");
@@ -389,6 +412,8 @@ static Property pxb_dev_properties[] = {
 DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
 DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
 DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
+DEFINE_PROP_ARRAY("window-base", PXBDev, cxl.num_windows, cxl.window_base,
+  qdev_prop_uint64, hwaddr),
 DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -460,7 +485,9 @@ static const TypeInfo pxb_pcie_dev_info = {
 
 static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
 {
-PXBDev *pxb = convert_to_pxb(dev);
+PXBDev *pxb = PXB_CXL_DEV(dev);
+struct cxl_dev *cxl = &pxb->cxl;
+int count = 0;
 
 /* A CXL PXB's parent bus is still PCIe */
 if (!pci_bus_is_express(pci_get_bus(dev))) {
@@ -476,6 +503,23 @@ static void pxb_cxl_dev_realize(PCIDevice *dev, Error 
**errp)
 /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
 
 pxb_dev_realize_common(dev, CXL, errp);
+
+for (unsigned i = 0; i < CXL_WINDOW_MAX; i++) {
+if (!cxl->memory_window[i]) {
+continue;
+}
+
+count++;
+}
+
+if (!count) {
+warn_report("memory-windows should be set when creating CXL host 
bridges");
+}
+
+if (count != cxl->num_windows) {
+error_setg(errp, "window bases count (%d) must match window count 

[RFC PATCH 20/25] acpi/cxl: Add _OSC implementation (9.14.2)

2020-11-10 Thread Ben Widawsky
The CXL 2.0 specification adds two new dwords to the existing _OSC
definition from PCIe. The new dwords are accessed with a new UUID. This
implementation supports what is in the specification.
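
The dword handling can be modeled in plain C roughly as follows. This is
only a sketch: demo_osc() and its boolean UUID arguments are
hypothetical, and the logic is loosely based on the AML built in this
patch rather than a faithful translation of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Status bits reported in DWORD1, as used by the AML in this patch. */
#define OSC_UNRECOGNIZED_UUID 0x04
#define OSC_UNKNOWN_REVISION  0x08
#define OSC_CAPS_MASKED       0x10

/* Control bits this sketch is willing to grant (same 0x1F mask as PCIe). */
#define OSC_PCIE_CTRL_MASK    0x1F
/* CXL 2.0 Port/Device Register access: bit 0 of the CXL control dword. */
#define OSC_CXL_REG_ACCESS    0x01

/*
 * Hypothetical helper: cdw[0..2] are the PCIe dwords, cdw[3..4] are the
 * two CXL dwords. uuid_known/uuid_is_cxl stand in for the UUID compare.
 */
static void demo_osc(uint32_t *cdw, size_t ndw, uint32_t revision,
                     bool uuid_known, bool uuid_is_cxl)
{
    uint32_t ctrl;

    assert(cdw != NULL && ndw >= 3);

    if (!uuid_known) {
        cdw[0] |= OSC_UNRECOGNIZED_UUID;
        return;
    }

    ctrl = cdw[2] & OSC_PCIE_CTRL_MASK;

    if (revision != 1) {
        cdw[0] |= OSC_UNKNOWN_REVISION;
    }
    if (ctrl != cdw[2]) {
        cdw[0] |= OSC_CAPS_MASKED;
    }
    if (uuid_is_cxl) {
        assert(ndw >= 5);
        cdw[4] |= OSC_CXL_REG_ACCESS;
    }

    cdw[2] = ctrl; /* DWORD3 carries the granted control bits back */
}
```

A caller using the CXL UUID passes a five-dword buffer; a caller using
the PCIe UUID only needs the first three.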

We are currently trying to come up with a new definition for _OSC. See
later work for an explanation.

Signed-off-by: Ben Widawsky 
---
 hw/acpi/Kconfig   |   5 ++
 hw/acpi/cxl.c | 104 ++
 hw/acpi/meson.build   |   1 +
 hw/i386/acpi-build.c  |  12 -
 include/hw/acpi/cxl.h |  23 ++
 5 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 hw/acpi/cxl.c
 create mode 100644 include/hw/acpi/cxl.h

diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 1932f66af8..b27907953e 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -5,6 +5,7 @@ config ACPI_X86
 bool
 select ACPI
 select ACPI_NVDIMM
+select ACPI_CXL
 select ACPI_CPU_HOTPLUG
 select ACPI_MEMORY_HOTPLUG
 select ACPI_HMAT
@@ -42,3 +43,7 @@ config ACPI_VMGENID
 depends on PC
 
 config ACPI_HW_REDUCED
+
+config ACPI_CXL
+bool
+depends on ACPI
diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
new file mode 100644
index 00..7124d5a1a3
--- /dev/null
+++ b/hw/acpi/cxl.c
@@ -0,0 +1,104 @@
+/*
+ * CXL ACPI Implementation
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "qemu/osdep.h"
+#include "hw/cxl/cxl.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/acpi/bios-linker-loader.h"
+#include "hw/acpi/cxl.h"
+#include "qapi/error.h"
+#include "qemu/uuid.h"
+
+static Aml *__build_cxl_osc_method(void)
+{
+Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, 
*if_caps_masked;
+Aml *a_ctrl = aml_local(0);
+Aml *a_cdw1 = aml_name("CDW1");
+
+method = aml_method("_OSC", 4, AML_NOTSERIALIZED);
+aml_append(method, aml_create_dword_field(aml_arg(3), aml_int(0), "CDW1"));
+
+/* 9.14.2.1.4 */
+if_uuid = aml_if(
+aml_lor(aml_equal(aml_arg(0),
+  aml_touuid("33DB4D5B-1FF7-401C-9657-7441C03DD766")),
+aml_equal(aml_arg(0),
+  
aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC";
+aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(4), 
"CDW2"));
+aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(8), 
"CDW3"));
+
+aml_append(if_uuid, aml_store(aml_name("CDW3"), a_ctrl));
+
+/* This is all the same as what's used for PCIe */
+aml_append(if_uuid,
+   aml_and(aml_name("CTRL"), aml_int(0x1F), aml_name("CTRL")));
+
+if_arg1_not_1 = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(0x1;
+/* Unknown revision */
+aml_append(if_arg1_not_1, aml_or(a_cdw1, aml_int(0x08), a_cdw1));
+aml_append(if_uuid, if_arg1_not_1);
+
+if_caps_masked = aml_if(aml_lnot(aml_equal(aml_name("CDW3"), a_ctrl)));
+/* Capability bits were masked */
+aml_append(if_caps_masked, aml_or(a_cdw1, aml_int(0x10), a_cdw1));
+aml_append(if_uuid, if_caps_masked);
+
+aml_append(if_uuid, aml_store(aml_name("CDW2"), aml_name("SUPP")));
+aml_append(if_uuid, aml_store(aml_name("CDW3"), aml_name("CTRL")));
+
+if_cxl = aml_if(aml_equal(
+aml_arg(0), aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC")));
+/* CXL support field */
+aml_append(if_cxl, aml_create_dword_field(aml_arg(3), aml_int(12), 
"CDW4"));
+/* CXL capabilities */
+aml_append(if_cxl, aml_create_dword_field(aml_arg(3), aml_int(16), 
"CDW5"));
+aml_append(if_cxl, aml_store(aml_name("CDW4"), aml_name("SUPC")));
+aml_append(if_cxl, aml_store(aml_name("CDW5"), aml_name("CTRC")));
+
+/* CXL 2.0 Port/Device Register access */
+aml_append(if_cxl,
+   aml_or(aml_name("CDW5"), aml_int(0x1), aml_name("CDW5")));
+aml_append(if_uuid, if_cxl);
+
+/* Update DWORD3 (the return value) */
+aml_append(if_uuid, aml_store(a_ctrl, aml_name("CDW3")));
+
+aml_append(if_uuid, aml_return(aml_arg(3)));
+aml_append(method, if_uuid);
+
+else_uuid = aml_else();
+
+/* unrecognized uuid */
+aml_append(else_uuid,
+   aml_or(aml_name("CDW1"), aml_int(0x4), aml_name("CDW1")));
+aml_append(else_uuid, aml_return(aml_arg(3)));
+aml_append(method, else_uuid);
+
+return method;

[RFC PATCH 09/25] hw/pxb: Use a type for realizing expanders

2020-11-10 Thread Ben Widawsky
This opens up the possibility for more types of expanders (other than
PCI and PCIe). We'll need this to create a CXL expander.

Signed-off-by: Ben Widawsky 
---
 hw/pci-bridge/pci_expander_bridge.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index aedded1064..232b7ce305 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -24,6 +24,8 @@
 #include "hw/boards.h"
 #include "qom/object.h"
 
+enum BusType { PCI, PCIE };
+
 #define TYPE_PXB_BUS "pxb-bus"
 typedef struct PXBBus PXBBus;
 DECLARE_INSTANCE_CHECKER(PXBBus, PXB_BUS,
@@ -214,7 +216,8 @@ static gint pxb_compare(gconstpointer a, gconstpointer b)
0;
 }
 
-static void pxb_dev_realize_common(PCIDevice *dev, bool pcie, Error **errp)
+static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
+   Error **errp)
 {
 PXBDev *pxb = convert_to_pxb(dev);
 DeviceState *ds, *bds = NULL;
@@ -239,7 +242,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, bool 
pcie, Error **errp)
 }
 
 ds = qdev_new(TYPE_PXB_HOST);
-if (pcie) {
+if (type == PCIE) {
 bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
 } else {
 bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, 
TYPE_PXB_BUS);
@@ -287,7 +290,7 @@ static void pxb_dev_realize(PCIDevice *dev, Error **errp)
 return;
 }
 
-pxb_dev_realize_common(dev, false, errp);
+pxb_dev_realize_common(dev, PCI, errp);
 }
 
 static void pxb_dev_exitfn(PCIDevice *pci_dev)
@@ -339,7 +342,7 @@ static void pxb_pcie_dev_realize(PCIDevice *dev, Error 
**errp)
 return;
 }
 
-pxb_dev_realize_common(dev, true, errp);
+pxb_dev_realize_common(dev, PCIE, errp);
 }
 
 static void pxb_pcie_dev_class_init(ObjectClass *klass, void *data)
-- 
2.29.2




[RFC PATCH 14/25] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)

2020-11-10 Thread Ben Widawsky
CXL host bridges themselves may have MMIO. Since host bridges don't have
a BAR they are treated as special for MMIO.
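
A small sketch of how a fixed, BAR-less MMIO block per host bridge can
be laid out, one block per UID. DEMO_HOST_BASE and DEMO_REG_BLOCK_SIZE
are placeholder values chosen for illustration; they are assumptions,
not the constants used by this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Placeholder values: both the base and the per-bridge register block
 * size are assumptions made for this sketch.
 */
#define DEMO_HOST_BASE      UINT64_C(0xD0000000)
#define DEMO_REG_BLOCK_SIZE UINT64_C(0x10000)

/* Each host bridge gets its own block, placed back to back by UID. */
static uint64_t demo_host_bridge_mmio_base(int32_t uid)
{
    assert(uid >= 0);
    return DEMO_HOST_BASE + DEMO_REG_BLOCK_SIZE * (uint64_t)uid;
}
```

With these placeholder numbers, the bridge with UID 2 would have its
registers two blocks above the base.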

Signed-off-by: Ben Widawsky 

--

0xD000 is chosen arbitrarily here as the base for the host bridge MMIO.
I'm not sure what the right way is to find free space for platform
hardcoded things like this.
---
 hw/pci-bridge/pci_expander_bridge.c | 53 -
 include/hw/cxl/cxl.h|  2 ++
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index d5b43a8a31..eca5c71d45 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -17,6 +17,7 @@
 #include "hw/pci/pci_host.h"
 #include "hw/qdev-properties.h"
 #include "hw/pci/pci_bridge.h"
+#include "hw/cxl/cxl.h"
 #include "qemu/range.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
@@ -70,6 +71,12 @@ struct PXBDev {
 int32_t uid;
 };
 
+typedef struct CXLHost {
+PCIHostState parent_obj;
+
+CXLComponentState cxl_cstate;
+} CXLHost;
+
 static PXBDev *convert_to_pxb(PCIDevice *dev)
 {
 /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
@@ -85,6 +92,9 @@ static GList *pxb_dev_list;
 
 #define TYPE_PXB_HOST "pxb-host"
 
+#define TYPE_PXB_CXL_HOST "pxb-cxl-host"
+#define PXB_CXL_HOST(obj) OBJECT_CHECK(CXLHost, (obj), TYPE_PXB_CXL_HOST)
+
 static int pxb_bus_num(PCIBus *bus)
 {
 PXBDev *pxb = convert_to_pxb(bus->parent_dev);
@@ -198,6 +208,46 @@ static const TypeInfo pxb_host_info = {
 .class_init= pxb_host_class_init,
 };
 
+static void pxb_cxl_realize(DeviceState *dev, Error **errp)
+{
+SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+PCIHostState *phb = PCI_HOST_BRIDGE(dev);
+CXLHost *cxl = PXB_CXL_HOST(dev);
+CXLComponentState *cxl_cstate = &cxl->cxl_cstate;
+struct MemoryRegion *mr = &cxl_cstate->crb.component_registers;
+
+cxl_component_register_block_init(OBJECT(dev), cxl_cstate,
+  TYPE_PXB_CXL_HOST);
+sysbus_init_mmio(sbd, mr);
+
+/* FIXME: support multiple host bridges. */
+sysbus_mmio_map(sbd, 0, CXL_HOST_BASE +
+memory_region_size(mr) * pci_bus_uid(phb->bus));
+}
+
+static void pxb_cxl_host_class_init(ObjectClass *class, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(class);
+PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(class);
+
+hc->root_bus_path = pxb_host_root_bus_path;
+dc->fw_name = "cxl";
+dc->realize = pxb_cxl_realize;
+/* Reason: Internal part of the pxb/pxb-pcie device, not usable by itself 
*/
+dc->user_creatable = false;
+}
+
+/*
+ * This is a device to handle the MMIO for a CXL host bridge. It does nothing
+ * else.
+ */
+static const TypeInfo cxl_host_info = {
+.name  = TYPE_PXB_CXL_HOST,
+.parent= TYPE_PCI_HOST_BRIDGE,
+.instance_size = sizeof(CXLHost),
+.class_init= pxb_cxl_host_class_init,
+};
+
 /*
  * Registers the PXB bus as a child of pci host root bus.
  */
@@ -272,7 +322,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum 
BusType type,
 dev_name = dev->qdev.id;
 }
 
-ds = qdev_new(TYPE_PXB_HOST);
+ds = qdev_new(type == CXL ? TYPE_PXB_CXL_HOST : TYPE_PXB_HOST);
 if (type == PCIE) {
 bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
 } else if (type == CXL) {
@@ -466,6 +516,7 @@ static void pxb_register_types(void)
 type_register_static(&pxb_pcie_bus_info);
 type_register_static(&pxb_cxl_bus_info);
 type_register_static(&pxb_host_info);
+type_register_static(&cxl_host_info);
 type_register_static(&pxb_dev_info);
 type_register_static(&pxb_pcie_dev_info);
 type_register_static(&pxb_cxl_dev_info);
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 362cda40de..6bc344f205 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -17,5 +17,7 @@
 #define COMPONENT_REG_BAR_IDX 0
 #define DEVICE_REG_BAR_IDX 2
 
+#define CXL_HOST_BASE 0xD000
+
 #endif
 
-- 
2.29.2




[RFC PATCH 08/25] hw/cxl/device: Add memory devices (8.2.8.5)

2020-11-10 Thread Ben Widawsky
Memory devices implement extra capabilities on top of CXL devices. This
patch adds support for them.

Signed-off-by: Ben Widawsky 
---
 hw/cxl/cxl-device-utils.c   | 48 -
 hw/cxl/cxl-mailbox-utils.c  | 48 -
 include/hw/cxl/cxl_device.h | 15 
 3 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index aec8b0d421..6544a68567 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -158,6 +158,45 @@ static void mailbox_reg_write(void *opaque, hwaddr offset, 
uint64_t value,
 process_mailbox(cxl_dstate);
 }
 
+static uint64_t mdev_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+uint64_t retval = 0;
+
+retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
+retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MBOX_READY, 1);
+
+switch (size) {
+case 4:
+if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+return 0;
+}
+break;
+case 8:
+if (unlikely(offset & (sizeof(uint64_t) - 1))) {
+qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+return 0;
+}
+break;
+}
+
+return ldn_le_p(&retval, size);
+}
+
+static const MemoryRegionOps mdev_ops = {
+.read = mdev_reg_read,
+.write = NULL,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.valid = {
+.min_access_size = 4,
+.max_access_size = 8,
+},
+.impl = {
+.min_access_size = 4,
+.max_access_size = 8,
+},
+};
+
 static const MemoryRegionOps mailbox_ops = {
 .read = mailbox_reg_read,
 .write = mailbox_reg_write,
@@ -213,6 +252,9 @@ void cxl_device_register_block_init(Object *obj, 
CXLDeviceState *cxl_dstate)
   "device-status", CXL_DEVICE_REGISTERS_LENGTH);
 memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
   "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
+memory_region_init_io(&cxl_dstate->memory_device, obj, &mdev_ops,
+  cxl_dstate, "memory device caps",
+  CXL_MEMORY_DEVICE_REGISTERS_LENGTH);
 
 memory_region_add_subregion(&cxl_dstate->device_registers, 0,
 &cxl_dstate->caps);
@@ -221,6 +263,9 @@ void cxl_device_register_block_init(Object *obj, 
CXLDeviceState *cxl_dstate)
 &cxl_dstate->device);
 memory_region_add_subregion(&cxl_dstate->device_registers,
 CXL_MAILBOX_REGISTERS_OFFSET, &cxl_dstate->mailbox);
+memory_region_add_subregion(&cxl_dstate->device_registers,
+CXL_MEMORY_DEVICE_REGISTERS_OFFSET,
+&cxl_dstate->memory_device);
 }
 
 static void mailbox_init_common(uint32_t *mbox_regs)
@@ -233,7 +278,7 @@ static void mailbox_init_common(uint32_t *mbox_regs)
 void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 {
 uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
-const int cap_count = 1;
+const int cap_count = 3;
 
 /* CXL Device Capabilities Array Register */
 ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
@@ -242,6 +287,7 @@ void cxl_device_register_init_common(CXLDeviceState 
*cxl_dstate)
 
 cxl_device_cap_init(cxl_dstate, DEVICE, 1);
 cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
+cxl_device_cap_init(cxl_dstate, MEMORY_DEVICE, 0x4000);
 
 mailbox_init_common(cxl_dstate->mbox_reg_state32);
 }
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 2d1b0ef9e4..5d2579800e 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -12,6 +12,12 @@
 #include "hw/pci/pci.h"
 #include "hw/cxl/cxl.h"
 
+enum cxl_opcode {
+CXL_EVENTS  = 0x1,
+CXL_IDENTIFY= 0x40,
+#define CXL_IDENTIFY_MEMORY_DEVICE = 0x0
+};
+
 /* 8.2.8.4.5.1 Command Return Codes */
 enum {
 RET_SUCCESS = 0x0,
@@ -40,6 +46,43 @@ enum {
 RET_MAX = 0x17
 };
 
+/* 8.2.9.5.1.1 */
+static int cmd_set_identify(CXLDeviceState *cxl_dstate, uint8_t cmd,
+uint32_t *ret_size)
+{
+struct identify {
+char fw_revision[0x10];
+uint64_t total_capacity;
+uint64_t volatile_capacity;
+uint64_t persistent_capacity;
+uint64_t partition_align;
+uint16_t info_event_log_size;
+uint16_t warning_event_log_size;
+uint16_t failure_event_log_size;
+uint16_t fatal_event_log_size;
+uint32_t lsa_size;
+uint8_t poison_list_max_mer[3];
+uint16_t inject_poison_limit;
+uint8_t poison_caps;
+uint8_t qos_telemetry_caps;
+} __attribute__((packed)) *id;
+_Static_assert(sizeof(struct identify) == 0x43, "Bad identify size");
+
+if (memory_region_size(cxl_dstate->pmem) < (256 << 

[RFC PATCH 13/25] hw/pci: Plumb _UID through host bridges

2020-11-10 Thread Ben Widawsky
Currently, QEMU makes _UID equivalent to the bus number (_BBN). While
there is nothing wrong with doing it this way, the CXL spec relies
heavily on _UID to identify host bridges, and there is no link to the
bus number. Having a distinct UID solves two problems. First, it gets
us around the limit of 256 (the current max bus number). Second, it
allows us to replicate hardware configurations where the bus number and
UID aren't equivalent. The latter benefits our development and
debugging using QEMU.

The other way to do this would be to implement the expanded bus
numbering, but having an explicit uid makes more sense when trying to
replicate real hardware configurations.
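
A standalone sketch of the idea: the UID is looked up through a
per-class hook, so a plain bus reports "no UID" while an expander bus
reports the user-supplied value, independent of its bus number. All
demo_* names are hypothetical and only loosely modeled on the hook this
patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_bus;

/* Hypothetical per-class method table with a 'uid' hook. */
struct demo_bus_class {
    int32_t (*uid)(const struct demo_bus *bus);
};

struct demo_bus {
    const struct demo_bus_class *klass;
    uint8_t bus_nr;   /* what _BBN would report */
    int32_t uid;      /* what _UID would report; independent of bus_nr */
};

/* Default for plain PCI buses: no UID assigned. */
static int32_t demo_default_uid(const struct demo_bus *bus)
{
    (void)bus;
    return -1;
}

/* Expander buses return the user-supplied value. */
static int32_t demo_expander_uid(const struct demo_bus *bus)
{
    return bus->uid;
}

static const struct demo_bus_class demo_pci_class = {
    .uid = demo_default_uid,
};
static const struct demo_bus_class demo_pxb_class = {
    .uid = demo_expander_uid,
};

/* Dispatch through the class, so callers never care about the bus type. */
static int32_t demo_bus_uid(const struct demo_bus *bus)
{
    assert(bus != NULL && bus->klass != NULL && bus->klass->uid != NULL);
    return bus->klass->uid(bus);
}
```

A negative return value can then be rejected at realize time for bus
types that require a valid UID.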

The QEMU commandline to utilize this would be:
  -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1,uid=x

Signed-off-by: Ben Widawsky 

--

I'm guessing this patch will be somewhat controversial. For early CXL
work, this can be dropped without too much heartache.
---
 hw/i386/acpi-build.c|  3 ++-
 hw/pci-bridge/pci_expander_bridge.c | 19 +++
 hw/pci/pci.c| 11 +++
 include/hw/pci/pci.h|  1 +
 include/hw/pci/pci_bus.h|  1 +
 5 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 99b3088c9e..aaed7da7dc 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1634,6 +1634,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 QLIST_FOREACH(bus, &bus->child, sibling) {
 uint8_t bus_num = pci_bus_num(bus);
 uint8_t numa_node = pci_bus_numa_node(bus);
+int32_t uid = pci_bus_uid(bus);
 
 /* look only for expander root buses */
 if (!pci_bus_is_root(bus)) {
@@ -1647,7 +1648,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 scope = aml_scope("\\_SB");
 dev = aml_device("PC%.02X", bus_num);
 aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
-init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
+init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
 
 if (numa_node != NUMA_NODE_UNASSIGNED) {
 aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index 3a8d815231..d5b43a8a31 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -67,6 +67,7 @@ struct PXBDev {
 
 uint8_t bus_nr;
 uint16_t numa_node;
+int32_t uid;
 };
 
 static PXBDev *convert_to_pxb(PCIDevice *dev)
@@ -98,12 +99,20 @@ static uint16_t pxb_bus_numa_node(PCIBus *bus)
 return pxb->numa_node;
 }
 
+static int32_t pxb_bus_uid(PCIBus *bus)
+{
+PXBDev *pxb = convert_to_pxb(bus->parent_dev);
+
+return pxb->uid;
+}
+
 static void pxb_bus_class_init(ObjectClass *class, void *data)
 {
 PCIBusClass *pbc = PCI_BUS_CLASS(class);
 
 pbc->bus_num = pxb_bus_num;
 pbc->numa_node = pxb_bus_numa_node;
+pbc->uid = pxb_bus_uid;
 }
 
 static const TypeInfo pxb_bus_info = {
@@ -329,6 +338,7 @@ static Property pxb_dev_properties[] = {
 /* Note: 0 is not a legal PXB bus number. */
 DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
 DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
+DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
 DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -400,12 +410,21 @@ static const TypeInfo pxb_pcie_dev_info = {
 
 static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
 {
+PXBDev *pxb = convert_to_pxb(dev);
+
 /* A CXL PXB's parent bus is still PCIe */
 if (!pci_bus_is_express(pci_get_bus(dev))) {
 error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
 return;
 }
 
+if (pxb->uid < 0) {
+error_setg(errp, "pxb-cxl devices must have a valid uid 
(0-2147483647)");
+return;
+}
+
+/* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
+
 pxb_dev_realize_common(dev, CXL, errp);
 }
 
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 67eed889a4..f728975d32 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -168,6 +168,11 @@ static uint16_t pcibus_numa_node(PCIBus *bus)
 return NUMA_NODE_UNASSIGNED;
 }
 
+static int32_t pcibus_uid(PCIBus *bus)
+{
+return -1;
+}
+
 static void pci_bus_class_init(ObjectClass *klass, void *data)
 {
 BusClass *k = BUS_CLASS(klass);
@@ -182,6 +187,7 @@ static void pci_bus_class_init(ObjectClass *klass, void 
*data)
 
 pbc->bus_num = pcibus_num;
 pbc->numa_node = pcibus_numa_node;
+pbc->uid = pcibus_uid;
 }
 
 static const TypeInfo pci_bus_info = {
@@ -528,6 +534,11 @@ int pci_bus_numa_node(PCIBus *bus)
 return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
 }
 
+int pci_bus_uid(PCIBus *bus)
+{
+return PCI_BUS_GET_CLASS(bus)->uid(bus);
+}
+
 static int get_pci_config_device(QEMUFile *f, void *pv, 

[RFC PATCH 07/25] hw/cxl/device: Implement basic mailbox (8.2.8.4)

2020-11-10 Thread Ben Widawsky
This is the beginning of implementing mailbox support for CXL 2.0
devices.

Signed-off-by: Ben Widawsky 
---
 hw/cxl/cxl-device-utils.c   | 131 
 hw/cxl/cxl-mailbox-utils.c  |  93 +
 hw/cxl/meson.build  |   1 +
 include/hw/cxl/cxl.h|   3 +
 include/hw/cxl/cxl_device.h |  10 ++-
 5 files changed, 237 insertions(+), 1 deletion(-)
 create mode 100644 hw/cxl/cxl-mailbox-utils.c

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index 78144e103c..aec8b0d421 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -55,6 +55,123 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, 
unsigned size)
 return ldn_le_p(, size);
 }
 
+static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+CXLDeviceState *cxl_dstate = opaque;
+
+switch (size) {
+case 4:
+if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+return 0;
+}
+break;
+case 8:
+if (unlikely(offset & (sizeof(uint64_t) - 1))) {
+qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+return 0;
+}
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "%uB component register read\n", size);
+return 0;
+}
+
+return ldn_le_p(cxl_dstate->mbox_reg_state + offset, size);
+}
+
+static void mailbox_mem_writel(uint32_t *reg_state, hwaddr offset,
+   uint64_t value)
+{
+switch (offset) {
+case A_CXL_DEV_MAILBOX_CTRL:
+/* fallthrough */
+case A_CXL_DEV_MAILBOX_CAP:
+/* RO register */
+break;
+default:
+qemu_log_mask(LOG_UNIMP,
+  "%s Unexpected 32-bit access to 0x%" PRIx64 " (WI)\n",
+  __func__, offset);
+break;
+}
+
+stl_le_p((uint8_t *)reg_state + offset, value);
+}
+
+static void mailbox_mem_writeq(uint64_t *reg_state, hwaddr offset,
+   uint64_t value)
+{
+switch (offset) {
+case A_CXL_DEV_MAILBOX_CMD:
+break;
+case A_CXL_DEV_BG_CMD_STS:
+/* BG not supported */
+/* fallthrough */
+case A_CXL_DEV_MAILBOX_STS:
+/* Read only register, will get updated by the state machine */
+return;
+case A_CXL_DEV_MAILBOX_CAP:
+case A_CXL_DEV_MAILBOX_CTRL:
+default:
+qemu_log_mask(LOG_UNIMP,
+  "%s Unexpected 64-bit access to 0x%" PRIx64 " (WI)\n",
+  __func__, offset);
+return;
+}
+
+stq_le_p((uint8_t *)reg_state + offset, value);
+}
+
+static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
+  unsigned size)
+{
+CXLDeviceState *cxl_dstate = opaque;
+
+/*
+ * Lock is needed to prevent concurrent writes as well as to prevent writes
+ * coming in while the firmware is processing. Without background commands
+ * or the second mailbox implemented, this serves no purpose since the
+ * memory access is synchronized at a higher level (per memory region).
+ */
+RCU_READ_LOCK_GUARD();
+
+switch (size) {
+case 4:
+if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+return;
+}
+mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
+break;
+case 8:
+if (unlikely(offset & (sizeof(uint64_t) - 1))) {
+qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+return;
+}
+mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
+break;
+}
+
+if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
+ DOORBELL))
+process_mailbox(cxl_dstate);
+}
+
+static const MemoryRegionOps mailbox_ops = {
+.read = mailbox_reg_read,
+.write = mailbox_reg_write,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.valid = {
+.min_access_size = 4,
+.max_access_size = 8,
+},
+.impl = {
+.min_access_size = 4,
+.max_access_size = 8,
+},
+};
+
 static const MemoryRegionOps dev_ops = {
 .read = dev_reg_read,
 .write = NULL,
@@ -94,12 +211,23 @@ void cxl_device_register_block_init(Object *obj, 
CXLDeviceState *cxl_dstate)
   "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
 memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
   "device-status", CXL_DEVICE_REGISTERS_LENGTH);
+memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
+  "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
 
 memory_region_add_subregion(&cxl_dstate->device_registers, 0,
 &cxl_dstate->caps);
 memory_region_add_subregion(&cxl_dstate->device_registers,
   

[RFC PATCH 03/25] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)

2020-11-10 Thread Ben Widawsky
A CXL 2.0 component is any entity in the CXL topology. All components
have an analogous function in PCIe. Except for the CXL host bridge, all
have a PCIe config space that is accessible via the common PCIe
mechanisms. CXL components are enumerated via DVSEC fields in the
extended PCIe header space. CXL components will minimally implement some
subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
2.0 specification. Two headers and a utility library are introduced to
support the minimum functionality needed to enumerate components.

The cxl_pci.h header manages bits associated with PCI, specifically the
DVSEC and related fields. The cxl_component.h variant has data
structures and APIs that are useful for drivers implementing any of the
CXL 2.0 components. The library takes care of making use of the DVSEC
bits and the CXL.[mem|cache] registers.

None of the mechanisms required to enumerate a CXL capable host bridge
are introduced at this point.

Note that the CXL.mem and CXL.cache registers used are always 4B wide.
It's possible in the future that this constraint will not hold.
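
The 4B-only constraint can be sketched as below: reads of any other
width return zero (RAZ) and writes of any other width are ignored (WI).
struct demo_regs, DEMO_NUM_REGS and the bounds check are assumptions
added for the sketch; unlike the patch, this version also stores plain
writes into the backing array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_REGS 16   /* arbitrary size for the sketch */

/* Hypothetical register file: an array of 32-bit registers. */
struct demo_regs {
    uint32_t cache_mem[DEMO_NUM_REGS];
};

/* Reads that are not exactly 4 bytes wide return zero (RAZ). */
static uint64_t demo_reg_read(const struct demo_regs *r, uint64_t offset,
                              unsigned size)
{
    assert(r != NULL);
    if (size != 4 || (offset >> 2) >= DEMO_NUM_REGS) {
        return 0;
    }
    /* Byte offset to register index: divide by the 4-byte width. */
    return r->cache_mem[offset >> 2];
}

/* Writes of any other width are ignored (WI). */
static void demo_reg_write(struct demo_regs *r, uint64_t offset,
                           uint64_t value, unsigned size)
{
    assert(r != NULL);
    if (size != 4 || (offset >> 2) >= DEMO_NUM_REGS) {
        return;
    }
    r->cache_mem[offset >> 2] = (uint32_t)value;
}
```

If wider registers are ever needed, the size check is the one place
that would have to change.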

Signed-off-by: Ben Widawsky 

--
It's tempting to have a more generalized DVSEC infrastructure. As far as
I can tell, the amount this would actually save in terms of code is
minimal because most of DVSEC is vendor specific.
---
 MAINTAINERS|   6 ++
 hw/Kconfig |   1 +
 hw/cxl/Kconfig |   3 +
 hw/cxl/cxl-component-utils.c   | 192 +
 hw/cxl/cxl-device-utils.c  |   0
 hw/cxl/meson.build |   3 +
 hw/meson.build |   1 +
 include/hw/cxl/cxl.h   |  17 +++
 include/hw/cxl/cxl_component.h | 181 +++
 include/hw/cxl/cxl_pci.h   | 133 +++
 10 files changed, 537 insertions(+)
 create mode 100644 hw/cxl/Kconfig
 create mode 100644 hw/cxl/cxl-component-utils.c
 create mode 100644 hw/cxl/cxl-device-utils.c
 create mode 100644 hw/cxl/meson.build
 create mode 100644 include/hw/cxl/cxl.h
 create mode 100644 include/hw/cxl/cxl_component.h
 create mode 100644 include/hw/cxl/cxl_pci.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c1d16026ba..02b8e2274d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2184,6 +2184,12 @@ F: qapi/block*.json
 F: qapi/transaction.json
 T: git https://repo.or.cz/qemu/armbru.git block-next
 
+Compute Express Link
+M: Ben Widawsky 
+S: Supported
+F: hw/cxl/
+F: include/hw/cxl/
+
 Dirty Bitmaps
 M: Eric Blake 
 M: Vladimir Sementsov-Ogievskiy 
diff --git a/hw/Kconfig b/hw/Kconfig
index 4de1797ffd..efed27805a 100644
--- a/hw/Kconfig
+++ b/hw/Kconfig
@@ -6,6 +6,7 @@ source audio/Kconfig
 source block/Kconfig
 source char/Kconfig
 source core/Kconfig
+source cxl/Kconfig
 source display/Kconfig
 source dma/Kconfig
 source gpio/Kconfig
diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig
new file mode 100644
index 00..8e67519b16
--- /dev/null
+++ b/hw/cxl/Kconfig
@@ -0,0 +1,3 @@
+config CXL
+bool
+default y if PCI_EXPRESS
diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
new file mode 100644
index 00..c52bd5bfc7
--- /dev/null
+++ b/hw/cxl/cxl-component-utils.c
@@ -0,0 +1,192 @@
+/*
+ * CXL Utility library for components
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/pci/pci.h"
+#include "hw/cxl/cxl.h"
+
+static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
+   unsigned size)
+{
+CXLComponentState *cxl_cstate = opaque;
+ComponentRegisters *cregs = &cxl_cstate->crb;
+uint32_t *cache_mem = cregs->cache_mem_registers;
+
+if (size != 4) {
+qemu_log_mask(LOG_UNIMP, "%uB component register read (RAZ)\n", size);
+return 0;
+}
+
+if (cregs->special_ops && cregs->special_ops->read) {
+return cregs->special_ops->read(cxl_cstate, offset, size);
+} else {
+return cache_mem[offset >> 2];
+}
+}
+
+static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
+unsigned size)
+{
+CXLComponentState *cxl_cstate = opaque;
+ComponentRegisters *cregs = &cxl_cstate->crb;
+
+if (size != 4) {
+qemu_log_mask(LOG_UNIMP, "%uB component register write (WI)\n", size);
+return;
+}
+
+if (cregs->special_ops && cregs->special_ops->write) {
+cregs->special_ops->write(cxl_cstate, offset, value, size);
+}
+}
+
+static const MemoryRegionOps cache_mem_ops = {
+.read = cxl_cache_mem_read_reg,
+.write = cxl_cache_mem_write_reg,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.valid = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
+.impl = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
+};
+

[RFC PATCH 04/25] hw/cxl/device: Introduce a CXL device (8.2.8)

2020-11-10 Thread Ben Widawsky
A CXL device is a type of CXL component. Conceptually, a CXL device
would be a leaf node in a CXL topology. From an emulation perspective,
CXL devices are the most complex and so the actual implementation is
reserved for discrete commits.

This new device type is specifically catered towards the eventual
implementation of a Type3 CXL.mem device, 8.2.8.5 in the CXL 2.0
specification.

Signed-off-by: Ben Widawsky 
---
 include/hw/cxl/cxl.h|   1 +
 include/hw/cxl/cxl_device.h | 193 
 2 files changed, 194 insertions(+)
 create mode 100644 include/hw/cxl/cxl_device.h

diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 55f6cc30a5..23f52c4cf9 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -12,6 +12,7 @@
 
 #include "cxl_pci.h"
 #include "cxl_component.h"
+#include "cxl_device.h"
 
 #endif
 
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
new file mode 100644
index 00..491eca6e05
--- /dev/null
+++ b/include/hw/cxl/cxl_device.h
@@ -0,0 +1,193 @@
+/*
+ * QEMU CXL Devices
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_DEVICE_H
+#define CXL_DEVICE_H
+
+#include "hw/register.h"
+
+/*
+ * The following is how a CXL device's MMIO space is laid out. The only
+ * requirement from the spec is that the capabilities array and the capability
+ * headers start at offset 0 and are contiguously packed. The headers themselves
+ * provide offsets to the register fields. For this emulation, registers will
+ * start at offset 0x80 (m == 0x80). No secondary mailbox is implemented which
+ * means that n = m + sizeof(mailbox registers) + sizeof(device registers).
+ *
+ * This is roughly described in 8.2.8 Figure 138 of the CXL 2.0 spec.
+ *
+ * n + PAYLOAD_SIZE_MAX  +-+
+ *   | |
+ *  ^| |
+ *  || |
+ *  || |
+ *  || |
+ *  || Command Payload |
+ *  || |
+ *  || |
+ *  || |
+ *  || |
+ *  || |
+ *  n+-+
+ *  ^| |
+ *  ||Device Capability Registers  |
+ *  ||x, mailbox, y|
+ *  || |
+ *  m+-+
+ *  ^| Device Capability Header y  |
+ *  |+-+
+ *  || Device Capability Header Mailbox|
+ *  |+- 
+ *  || Device Capability Header x  |
+ *  |+-+
+ *  || |
+ *  || |
+ *  ||  Device Cap Array[0..n] |
+ *  || |
+ *  || |
+ *  || |
+ *  0+-+
+ */
+
+#define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* Figure 138 */
+#define CXL_DEVICE_CAP_REG_SIZE 0x10 /* 8.2.8.2 */
+
+#define CXL_DEVICE_REGISTERS_OFFSET 0x80 /* Read comment above */
+#define CXL_DEVICE_REGISTERS_LENGTH 0x8 /* 8.2.8.3.1 */
+
+#define CXL_MAILBOX_REGISTERS_OFFSET \
+(CXL_DEVICE_REGISTERS_OFFSET + CXL_DEVICE_REGISTERS_LENGTH)
+#define CXL_MAILBOX_REGISTERS_SIZE 0x20
+#define CXL_MAILBOX_PAYLOAD_SHIFT 11
+#define CXL_MAILBOX_MAX_PAYLOAD_SIZE (1 << CXL_MAILBOX_PAYLOAD_SHIFT)
+#define CXL_MAILBOX_REGISTERS_LENGTH \
+(CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
+
+typedef struct cxl_device_state {
+/* Boss container and caps registers */
+MemoryRegion device_registers;
+
+MemoryRegion caps;
+MemoryRegion device;
+MemoryRegion mailbox;
+
+MemoryRegion *pmem;
+MemoryRegion *vmem;
+
+bool active;
+uint16_t command;
+uint16_t payload_size;
+union {
+uint8_t caps_reg_state[CXL_DEVICE_CAP_REG_SIZE * 4]; /* ARRAY + 3 CAPS */
+uint32_t caps_reg_state32[0];
+};
+} CXLDeviceState;
+
+/* Initialize the register block for a device */
+void cxl_device_register_block_init(Object *obj, 

[RFC PATCH 06/25] hw/cxl/device: Add device status (8.2.8.3)

2020-11-10 Thread Ben Widawsky
This implements the CXL device status registers from 8.2.8.3.1 in the
CXL 2.0 specification. It is capability ID 0001h.

Signed-off-by: Ben Widawsky 
---
 hw/cxl/cxl-device-utils.c   | 45 +-
 include/hw/cxl/cxl_device.h | 49 -
 2 files changed, 60 insertions(+), 34 deletions(-)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index a391bb15c6..78144e103c 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -33,6 +33,42 @@ static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
 return ldn_le_p(cxl_dstate->caps_reg_state + offset, size);
 }
 
+static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+uint64_t retval = 0;
+
+switch (size) {
+case 4:
+if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+return 0;
+}
+break;
+case 8:
+if (unlikely(offset & (sizeof(uint64_t) - 1))) {
+qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+return 0;
+}
+break;
+}
+
+return ldn_le_p(&retval, size);
+}
+
+static const MemoryRegionOps dev_ops = {
+.read = dev_reg_read,
+.write = NULL,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.valid = {
+.min_access_size = 4,
+.max_access_size = 8,
+},
+.impl = {
+.min_access_size = 4,
+.max_access_size = 8,
+},
+};
+
 static const MemoryRegionOps caps_ops = {
 .read = caps_reg_read,
 .write = NULL,
@@ -56,18 +92,25 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
 
 memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
   "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
+memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
+  "device-status", CXL_DEVICE_REGISTERS_LENGTH);
 
 memory_region_add_subregion(&cxl_dstate->device_registers, 0,
 &cxl_dstate->caps);
+memory_region_add_subregion(&cxl_dstate->device_registers,
+CXL_DEVICE_REGISTERS_OFFSET,
+&cxl_dstate->device);
 }
 
 void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 {
 uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
-const int cap_count = 0;
+const int cap_count = 1;
 
 /* CXL Device Capabilities Array Register */
 ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
 ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
 ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
+
+cxl_device_cap_init(cxl_dstate, DEVICE, 1);
 }
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 491eca6e05..2c674fdc9c 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -127,6 +127,22 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, 
CXL_DEVICE_CAP_HDR1_OFFSET)
 CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
CXL_DEVICE_CAP_REG_SIZE)
 
+#define cxl_device_cap_init(dstate, reg, cap_id) \
+do { \
+uint32_t *cap_hdrs = dstate->caps_reg_state32; \
+int which = R_CXL_DEV_##reg##_CAP_HDR0; \
+cap_hdrs[which] = \
+FIELD_DP32(cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_ID, cap_id); \
+cap_hdrs[which] = FIELD_DP32( \
+cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_VERSION, 1); \
+cap_hdrs[which + 1] = \
+FIELD_DP32(cap_hdrs[which + 1], CXL_DEV_##reg##_CAP_HDR1, \
+   CAP_OFFSET, CXL_##reg##_REGISTERS_OFFSET); \
+cap_hdrs[which + 2] = \
+FIELD_DP32(cap_hdrs[which + 2], CXL_DEV_##reg##_CAP_HDR2, \
+   CAP_LENGTH, CXL_##reg##_REGISTERS_LENGTH); \
+} while (0)
+
 REG32(CXL_DEV_MAILBOX_CAP, 0)
 FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
 FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
@@ -138,43 +154,10 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
 FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 2)
 FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
 
-enum {
-CXL_CMD_EVENTS  = 0x1,
-CXL_CMD_IDENTIFY= 0x40,
-};
-
 REG32(CXL_DEV_MAILBOX_CMD, 8)
 FIELD(CXL_DEV_MAILBOX_CMD, OP, 0, 16)
 FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
 
-/* 8.2.8.4.5.1 Command Return Codes */

[RFC PATCH 12/25] acpi/pci: Consolidate host bridge setup

2020-11-10 Thread Ben Widawsky
This cleanup will make it easier to add support for CXL to the mix.

Signed-off-by: Ben Widawsky 
---
 hw/i386/acpi-build.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 4f66642d88..99b3088c9e 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1486,6 +1486,20 @@ static void build_smb0(Aml *table, I2CBus *smbus, int 
devnr, int func)
 aml_append(table, scope);
 }
 
+enum { PCI, PCIE };
+static void init_pci_acpi(Aml *dev, int uid, int type)
+{
+if (type == PCI) {
+aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
+aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
+} else {
+aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
+aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
+aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
+aml_append(dev, build_q35_osc_method());
+}
+}
+
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker,
AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -1514,9 +1528,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 if (misc->is_piix4) {
 sb_scope = aml_scope("_SB");
 dev = aml_device("PCI0");
-aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
+init_pci_acpi(dev, 0, PCI);
 aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
-aml_append(dev, aml_name_decl("_UID", aml_int(0)));
 aml_append(sb_scope, dev);
 aml_append(dsdt, sb_scope);
 
@@ -1530,11 +1543,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 } else {
 sb_scope = aml_scope("_SB");
 dev = aml_device("PCI0");
-aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
-aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
+init_pci_acpi(dev, 0, PCIE);
 aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
-aml_append(dev, aml_name_decl("_UID", aml_int(0)));
-aml_append(dev, build_q35_osc_method());
 aml_append(sb_scope, dev);
 
 if (pm->smi_on_cpuhp) {
@@ -1636,15 +1646,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
 scope = aml_scope("\\_SB");
 dev = aml_device("PC%.02X", bus_num);
-aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
 aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
-if (pci_bus_is_express(bus)) {
-aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
-aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
-aml_append(dev, build_q35_osc_method());
-} else {
-aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
-}
+init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
 
 if (numa_node != NUMA_NODE_UNASSIGNED) {
 aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
-- 
2.29.2




[RFC PATCH 02/25] hw/pci/cxl: Add a CXL component type (interface)

2020-11-10 Thread Ben Widawsky
A CXL component is a hardware entity that implements CXL component
registers from the CXL 2.0 spec (8.2.3). Currently these represent 3
general types.
1. Host Bridge
2. Ports (root, upstream, downstream)
3. Devices (memory, other)

A CXL component can be conceptually thought of as a PCIe device with
extra functionality when enumerated and enabled. For this reason, CXL
support hooks into existing PCI code paths here, and will continue to do so.

Host bridges will typically need to be handled specially and so they can
implement this newly introduced interface or not. All other components
should implement this interface. Implementing this interface allows the
core pci code to treat these devices as special where appropriate.

Signed-off-by: Ben Widawsky 
---
 hw/pci/pci.c | 10 ++
 include/hw/pci/pci.h |  8 
 2 files changed, 18 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 0131d9d02c..db88788c4b 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -192,6 +192,11 @@ static const TypeInfo pci_bus_info = {
 .class_init = pci_bus_class_init,
 };
 
+static const TypeInfo cxl_interface_info = {
+.name  = INTERFACE_CXL_DEVICE,
+.parent= TYPE_INTERFACE,
+};
+
 static const TypeInfo pcie_interface_info = {
 .name  = INTERFACE_PCIE_DEVICE,
 .parent= TYPE_INTERFACE,
@@ -2113,6 +2118,10 @@ static void pci_qdev_realize(DeviceState *qdev, Error 
**errp)
 pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
 }
 
+if (object_class_dynamic_cast(klass, INTERFACE_CXL_DEVICE)) {
+pci_dev->cap_present |= QEMU_PCIE_CAP_CXL;
+}
+
 pci_dev = do_pci_register_device(pci_dev,
  object_get_typename(OBJECT(qdev)),
  pci_dev->devfn, errp);
@@ -2839,6 +2848,7 @@ static void pci_register_types(void)
 type_register_static(_bus_info);
 type_register_static(_bus_info);
 type_register_static(_pci_interface_info);
+type_register_static(_interface_info);
 type_register_static(_interface_info);
 type_register_static(_device_type_info);
 }
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 72ce649eee..4e6fd59fdd 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -194,6 +194,8 @@ enum {
 QEMU_PCIE_LNKSTA_DLLLA = (1 << QEMU_PCIE_LNKSTA_DLLLA_BITNR),
 #define QEMU_PCIE_EXTCAP_INIT_BITNR 9
 QEMU_PCIE_EXTCAP_INIT = (1 << QEMU_PCIE_EXTCAP_INIT_BITNR),
+#define QEMU_PCIE_CXL_BITNR 10
+QEMU_PCIE_CAP_CXL = (1 << QEMU_PCIE_CXL_BITNR),
 };
 
 #define TYPE_PCI_DEVICE "pci-device"
@@ -201,6 +203,12 @@ typedef struct PCIDeviceClass PCIDeviceClass;
 DECLARE_OBJ_CHECKERS(PCIDevice, PCIDeviceClass,
  PCI_DEVICE, TYPE_PCI_DEVICE)
 
+/*
+ * Implemented by devices that can be plugged on CXL buses. In the spec, this is
+ * actually a "CXL Component", but we name it device to match the PCI naming.
+ */
+#define INTERFACE_CXL_DEVICE "cxl-device"
+
 /* Implemented by devices that can be plugged on PCI Express buses */
 #define INTERFACE_PCIE_DEVICE "pci-express-device"
 
-- 
2.29.2
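
As an illustration of the cap_present flag added in this patch, here is a
minimal standalone sketch. The two bit numbers are copied from the
include/hw/pci/pci.h hunk above; SketchPCIDevice, sketch_mark_cxl() and
sketch_is_cxl() are hypothetical names invented for this example and are
not part of QEMU. It only shows how a bit set at realize time could later
be tested by core PCI code, assuming callers test it with a plain mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers copied from the include/hw/pci/pci.h hunk. */
#define QEMU_PCIE_EXTCAP_INIT_BITNR 9
#define QEMU_PCIE_CXL_BITNR 10

enum {
    QEMU_PCIE_EXTCAP_INIT = (1 << QEMU_PCIE_EXTCAP_INIT_BITNR),
    QEMU_PCIE_CAP_CXL = (1 << QEMU_PCIE_CXL_BITNR),
};

/* Hypothetical stand-in for the cap_present field of PCIDevice. */
typedef struct SketchPCIDevice {
    uint32_t cap_present;
} SketchPCIDevice;

/*
 * Mirrors the realize-time step in the patch: the flag is set only when
 * the device class implements the CXL interface. The boolean stands in
 * for the object_class_dynamic_cast() check.
 */
static void sketch_mark_cxl(SketchPCIDevice *dev, bool implements_cxl)
{
    if (implements_cxl) {
        dev->cap_present |= QEMU_PCIE_CAP_CXL;
    }
}

/* How later code could ask whether a device is a CXL component. */
static bool sketch_is_cxl(const SketchPCIDevice *dev)
{
    return (dev->cap_present & QEMU_PCIE_CAP_CXL) != 0;
}
```

Because the flag occupies its own bit, setting it leaves the other
cap_present bits untouched.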




[RFC PATCH 05/25] hw/cxl/device: Implement the CAP array (8.2.8.1-2)

2020-11-10 Thread Ben Widawsky
This implements all device MMIO up to the first capability. That
includes the CXL Device Capabilities Array Register, as well as all of
the CXL Device Capability Header Registers. The latter are filled in as
they are implemented in the following patches.

Signed-off-by: Ben Widawsky 
---
 hw/cxl/cxl-device-utils.c | 73 +++
 hw/cxl/meson.build|  1 +
 2 files changed, 74 insertions(+)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index e69de29bb2..a391bb15c6 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -0,0 +1,73 @@
+/*
+ * CXL Utility library for devices
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/cxl/cxl.h"
+
+static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+CXLDeviceState *cxl_dstate = opaque;
+
+switch (size) {
+case 4:
+if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+return 0;
+}
+break;
+case 8:
+if (unlikely(offset & (sizeof(uint64_t) - 1))) {
+qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+return 0;
+}
+break;
+}
+
+return ldn_le_p(cxl_dstate->caps_reg_state + offset, size);
+}
+
+static const MemoryRegionOps caps_ops = {
+.read = caps_reg_read,
+.write = NULL,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.valid = {
+.min_access_size = 4,
+.max_access_size = 8,
+},
+.impl = {
+.min_access_size = 4,
+.max_access_size = 8,
+},
+};
+
+void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
+{
+/* This will be a BAR, so needs to be rounded up to pow2 for PCI spec */
+memory_region_init(
+&cxl_dstate->device_registers, obj, "device-registers",
+pow2ceil(CXL_MAILBOX_REGISTERS_LENGTH + CXL_MAILBOX_REGISTERS_OFFSET));
+
+memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
+  "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
+
+memory_region_add_subregion(&cxl_dstate->device_registers, 0,
+&cxl_dstate->caps);
+}
+
+void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
+{
+uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
+const int cap_count = 0;
+
+/* CXL Device Capabilities Array Register */
+ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
+ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
+ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
+}
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
index 00c3876a0f..47154d6850 100644
--- a/hw/cxl/meson.build
+++ b/hw/cxl/meson.build
@@ -1,3 +1,4 @@
 softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
   'cxl-component-utils.c',
+  'cxl-device-utils.c',
 ))
-- 
2.29.2
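
To make the sizing and read path of this patch concrete, here is a rough
standalone sketch. The constants are the ones defined in
include/hw/cxl/cxl_device.h earlier in the series. sketch_pow2ceil() and
sketch_ldn_le() are hypothetical stand-ins for QEMU's pow2ceil() and
ldn_le_p(), written here only so the example compiles on its own;
sketch_caps_read() follows the alignment checks of caps_reg_read() but
omits the logging.

```c
#include <assert.h>
#include <stdint.h>

/* Constants as defined in include/hw/cxl/cxl_device.h in this series. */
#define CXL_DEVICE_REGISTERS_OFFSET 0x80
#define CXL_DEVICE_REGISTERS_LENGTH 0x8
#define CXL_MAILBOX_REGISTERS_OFFSET \
    (CXL_DEVICE_REGISTERS_OFFSET + CXL_DEVICE_REGISTERS_LENGTH)
#define CXL_MAILBOX_REGISTERS_SIZE 0x20
#define CXL_MAILBOX_PAYLOAD_SHIFT 11
#define CXL_MAILBOX_MAX_PAYLOAD_SIZE (1 << CXL_MAILBOX_PAYLOAD_SHIFT)
#define CXL_MAILBOX_REGISTERS_LENGTH \
    (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)

/* Hypothetical stand-in for pow2ceil(): round up to a power of two. */
static uint64_t sketch_pow2ceil(uint64_t v)
{
    uint64_t p = 1;

    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* Hypothetical stand-in for ldn_le_p(): little-endian load of size bytes. */
static uint64_t sketch_ldn_le(const uint8_t *p, unsigned size)
{
    uint64_t v = 0;

    for (unsigned i = 0; i < size; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Same shape as caps_reg_read(): unaligned 4B/8B reads return zero. */
static uint64_t sketch_caps_read(const uint8_t *state, uint64_t offset,
                                 unsigned size)
{
    if ((size == 4 || size == 8) && (offset & (size - 1))) {
        return 0;
    }
    return sketch_ldn_le(state + offset, size);
}
```

With these constants the mailbox ends at 0x88 + 0x820 = 0x8a8, so the
container region is rounded up to 0x1000 bytes.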




[RFC PATCH 01/25] Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy.

2020-11-10 Thread Ben Widawsky
From: Jonathan Cameron 

This hasn't yet been added to the linux kernel tree, so for purposes
of this RFC just add it locally.

Signed-off-by: Jonathan Cameron 
Signed-off-by: Ben Widawsky 
---
 include/standard-headers/linux/pci_regs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/standard-headers/linux/pci_regs.h 
b/include/standard-headers/linux/pci_regs.h
index a95d55f9f2..5d0b79b9da 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -723,6 +723,7 @@
 #define PCI_EXT_CAP_ID_DPC 0x1D/* Downstream Port Containment */
 #define PCI_EXT_CAP_ID_L1SS0x1E/* L1 PM Substates */
 #define PCI_EXT_CAP_ID_PTM 0x1F/* Precision Time Measurement */
+#define PCI_EXT_CAP_ID_DVSEC   0x23/* Designated Vendor-Specific */
 #define PCI_EXT_CAP_ID_DLF 0x25/* Data Link Feature */
 #define PCI_EXT_CAP_ID_PL_16GT 0x26/* Physical Layer 16.0 GT/s */
 #define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT
-- 
2.29.2




[RFC PATCH 00/25] Introduce CXL 2.0 Emulation

2020-11-10 Thread Ben Widawsky
Introduce emulation of Compute Express Link 2.0, which was released
today at https://www.computeexpresslink.org/.

I've pushed a branch here: https://gitlab.com/bwidawsk/qemu/-/tree/cxl-2.0

The emulation has been critical to get the Linux enabling started
(https://lore.kernel.org/linux-cxl/). It would also be an ideal place to land
regression tests for different topology handling, and there may be applications
for this emulation as a way for a guest to manipulate its address space relative
to different performance memories. I am new to QEMU development, so please
forgive and point me in the right direction if I severely misinterpreted where a
piece of infrastructure belongs.

Three of the five CXL component types are emulated with some level of 
functionality:
host bridge, root port, and memory device. Upstream ports and downstream ports
aren't implemented (the two components needed to make up a switch).

CXL 2.0 is built on top of PCIe (see spec for details). As a result, much of the
implementation utilizes existing PCI paradigms. To implement the host bridge,
I've chosen to use PXB (PCI Expander Bridge). It seemed to be the most natural
fit even though it doesn't directly map to how hardware will work. For
persistent capacity of the memory device, I utilized the memory subsystem
(hw/mem).

We have 3 reasons why this work is valuable:
1. OS driver development and testing
2. OS driver regression testing
3. Possible guest support for HDMs

As mentioned above there are three benefits to carrying this enabling in
upstream QEMU:

1. Linux driver feature development benefits from emulation both due to
a lack of initial hardware availability, but also, as is seen with
NVDIMM/PMEM emulation, there is value in being able to share
topologies with system-software developers even after hardware is
available.

2. The Linux kernel's unit test suite for NVDIMM/PMEM ended up injecting fake
resources via custom modules (nfit_test). In retrospect a QEMU emulation of
nfit_test capabilities would have made the test environment more portable, and
allowed for easier community contributions of example configurations.

3. This is still being fleshed out, but in short it provides a standardized
mechanism for the guest to provide feedback to the host about size and placement
needs of the memory. After the host gives the guest a physical window mapping to
the CXL device, the emulated HDM decoders allow the guest a way to tell the host
how much it wants and where. There are likely simpler ways to do this, but
they'd require inventing a new interface and you'd need to have diverging driver
code in the guest programming of the HDM decoder vs. the host. Since we've
already done this work, why not use it?

There is quite a long list of work to do for full spec compliance, but I don't
believe that any of it precludes merging. Off the top of my head:
- Main host bridge support (WIP)
- Interleaving
- Better Tests
- Huge swaths of firmware functionality
- Hot plug support
- Emulating volatile capacity

The flow of the patches in general is to define all the data structures and
registers associated with the various components in a top down manner. Host
bridge, component, ports, devices. Then, the actual implementation is done in
the same order.

The summary is:
1-8: Put infrastructure in place for emulation of the components.
9-11: Create the concept of a CXL bus and plumb into PXB
12-16: Implement host bridges
17: Implement a root port
18: Implement a memory device
19: Implement HDM decoders
20-24: ACPI bits
25: Start working on enabling the main host bridge

Ben Widawsky (23):
  hw/pci/cxl: Add a CXL component type (interface)
  hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  hw/cxl/device: Introduce a CXL device (8.2.8)
  hw/cxl/device: Implement the CAP array (8.2.8.1-2)
  hw/cxl/device: Add device status (8.2.8.3)
  hw/cxl/device: Implement basic mailbox (8.2.8.4)
  hw/cxl/device: Add memory devices (8.2.8.5)
  hw/pxb: Use a type for realizing expanders
  hw/pci/cxl: Create a CXL bus type
  hw/pxb: Allow creation of a CXL PXB (host bridge)
  acpi/pci: Consolidate host bridge setup
  hw/pci: Plumb _UID through host bridges
  hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
  acpi/pxb/cxl: Reserve host bridge MMIO
  hw/pxb/cxl: Add "windows" for host bridges
  hw/cxl/rp: Add a root port
  hw/cxl/device: Add a memory device (8.2.8.5)
  hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
  acpi/cxl: Add _OSC implementation (9.14.2)
  acpi/cxl: Create the CEDT (9.14.1)
  Temp: acpi/cxl: Add ACPI0017 (CEDT awareness)
  WIP: i386/cxl: Initialize a host bridge
  qtest/cxl: Add very basic sanity tests

Jonathan Cameron (1):
  Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy.

Vishal Verma (1):
  acpi/cxl: Introduce a compat-driver UUID for CXL _OSC

 MAINTAINERS   |   6 +
 hw/Kconfig|   1 +
 hw/acpi/Kconfig   |   5 +

[Bug 1874678] Re: [Feature request] python-qemu package

2020-11-10 Thread Thomas Huth
** Changed in: qemu
 Assignee: (unassigned) => John Snow (jnsnow)

** Changed in: qemu
   Status: New => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1874678

Title:
  [Feature request] python-qemu package

Status in QEMU:
  In Progress

Bug description:
  It would be useful to have the python/qemu/ files published as a
  Python pip package, so users from distribution can also use the QEMU
  python methods (in particular for testing) without having to clone the
  full repository.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1874678/+subscriptions



Re: [PATCH 4/5 v4] KVM: VMX: Fill in conforming vmx_x86_ops via macro

2020-11-10 Thread Xu, Like

On 2020/11/11 3:02, Krish Sadhukhan wrote:


On 11/9/20 5:49 PM, Like Xu wrote:

Hi Krish,

On 2020/11/10 9:23, Krish Sadhukhan wrote:
@@ -1192,7 +1192,7 @@ void vmx_set_host_fs_gs(struct vmcs_host_state 
*host, u16 fs_sel, u16 gs_sel,

  }
  }
  -void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
+void vmx_prepare_guest_switch(struct kvm_vcpu *vcpu)


What do you think of renaming it to

void vmx_prepare_switch_for_guest(struct kvm_vcpu *vcpu);



In my opinion, it sounds a bit odd as we usually say, "switch to 
something". :-)


From that perspective, {svm|vmx}_prepare_switch_to_guest is probably the 
best name to keep.


Ah, I'm fine with the original one and thank you.






?

Thanks,
Like Xu


  {
  struct vcpu_vmx *vmx = to_vmx(vcpu);
  struct vmcs_host_state *host_state;

@@ -311,7 +311,7 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int 
cpu,

  int allocate_vpid(void);
  void free_vpid(int vpid);
  void vmx_set_constant_host_state(struct vcpu_vmx *vmx);
-void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu);
+void vmx_prepare_guest_switch(struct kvm_vcpu *vcpu);
  void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 
gs_sel,

  unsigned long fs_base, unsigned long gs_base);
  int vmx_get_cpl(struct kvm_vcpu *vcpu);







RE: [PATCH 2/2] i386/cpu: Make the Intel PT LIP feature configurable

2020-11-10 Thread Kang, Luwei
> -Original Message-
> From: Kang, Luwei 
> Sent: Wednesday, October 14, 2020 4:05 PM
> To: pbonz...@redhat.com; r...@twiddle.net; ehabk...@redhat.com
> Cc: qemu-devel@nongnu.org; Kang, Luwei 
> Subject: [PATCH 2/2] i386/cpu: Make the Intel PT LIP feature configurable
> 
> The current implementation will disable the guest Intel PT feature if the
> Intel PT LIP feature is supported on the host, but the LIP feature is coming
> soon (e.g. SnowRidge and later).
> 
> This patch makes the guest LIP feature configurable, so the Intel PT feature
> can be enabled in the guest when the guest LIP status matches the host's.

Ping. 

Thanks,
Luwei Kang

> 
> Signed-off-by: Luwei Kang 
> ---
>  target/i386/cpu.c | 29 +++--  target/i386/cpu.h
> |  4 
>  2 files changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c index
> 24644abfd4..aeabdd5bd4 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -672,6 +672,7 @@ static void x86_cpu_vendor_words2str(char *dst,
> uint32_t vendor1,  #define TCG_XSAVE_FEATURES (CPUID_XSAVE_XSAVEOPT
> | CPUID_XSAVE_XGETBV1)
>/* missing:
>CPUID_XSAVE_XSAVEC, CPUID_XSAVE_XSAVES */
> +#define TCG_14_0_ECX_FEATURES 0
> 
>  typedef enum FeatureWordType {
> CPUID_FEATURE_WORD,
> @@ -1301,6 +1302,26 @@ static FeatureWordInfo
> feature_word_info[FEATURE_WORDS] = {
>  }
>  },
> 
> +[FEAT_14_0_ECX] = {
> +.type = CPUID_FEATURE_WORD,
> +.feat_names = {
> +NULL, NULL, NULL, NULL,
> +NULL, NULL, NULL, NULL,
> +NULL, NULL, NULL, NULL,
> +NULL, NULL, NULL, NULL,
> +NULL, NULL, NULL, NULL,
> +NULL, NULL, NULL, NULL,
> +NULL, NULL, NULL, NULL,
> +NULL, NULL, NULL, "intel-pt-lip",
> +},
> +.cpuid = {
> +.eax = 0x14,
> +.needs_ecx = true, .ecx = 0,
> +.reg = R_ECX,
> +},
> +.tcg_features = TCG_14_0_ECX_FEATURES,
> +},
> +
>  };
> 
>  typedef struct FeatureMask {
> @@ -5743,6 +5764,9 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t
> index, uint32_t count,
>  *eax = INTEL_PT_MAX_SUBLEAF;
>  *ebx = INTEL_PT_MINIMAL_EBX;
>  *ecx = INTEL_PT_MINIMAL_ECX;
> +if (env->features[FEAT_14_0_ECX] & CPUID_14_0_ECX_LIP) {
> +*ecx |= CPUID_14_0_ECX_LIP;
> +}
>  } else if (count == 1) {
>  *eax = INTEL_PT_MTC_BITMAP | INTEL_PT_ADDR_RANGES_NUM;
>  *ebx = INTEL_PT_PSB_BITMAP | INTEL_PT_CYCLE_BITMAP; @@ -6416,8
> +6440,9 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
> ((eax_1 & INTEL_PT_ADDR_RANGES_NUM_MASK) >=
> INTEL_PT_ADDR_RANGES_NUM) &&
> ((ebx_1 & (INTEL_PT_PSB_BITMAP | INTEL_PT_CYCLE_BITMAP)) ==
> -(INTEL_PT_PSB_BITMAP | INTEL_PT_CYCLE_BITMAP)) &&
> -   !(ecx_0 & INTEL_PT_IP_LIP)) {
> +(INTEL_PT_PSB_BITMAP | INTEL_PT_CYCLE_BITMAP)) &&
> +   ((ecx_0 & CPUID_14_0_ECX_LIP) ==
> +(env->features[FEAT_14_0_ECX] &
> + CPUID_14_0_ECX_LIP))) {
>  if (cpu->intel_pt_auto_level) {
>  x86_cpu_adjust_level(cpu, >env.cpuid_min_level, 
> 0x14);
>  } else if (cpu->env.cpuid_min_level < 0x14) { diff --git
> a/target/i386/cpu.h b/target/i386/cpu.h index 51c1d5f60a..1fcd93e39a 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -541,6 +541,7 @@ typedef enum FeatureWord {
>  FEAT_VMX_EPT_VPID_CAPS,
>  FEAT_VMX_BASIC,
>  FEAT_VMX_VMFUNC,
> +FEAT_14_0_ECX,
>  FEATURE_WORDS,
>  } FeatureWord;
> 
> @@ -797,6 +798,9 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
>  /* AVX512 BFloat16 Instruction */
>  #define CPUID_7_1_EAX_AVX512_BF16   (1U << 5)
> 
> +/* Packets which contain IP payload have LIP values */
> +#define CPUID_14_0_ECX_LIP  (1U << 31)
> +
>  /* CLZERO instruction */
>  #define CPUID_8000_0008_EBX_CLZERO  (1U << 0)
>  /* Always save/restore FP error pointers */
> --
> 2.18.4
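
For readers following the thread, the compatibility rule in the quoted
patch can be sketched in isolation as below. CPUID_14_0_ECX_LIP is the
definition from the patch; sketch_lip_matches() and sketch_guest_ecx()
are hypothetical helper names used only in this example, and the
"minimal ECX" value passed in is an arbitrary placeholder rather than
QEMU's INTEL_PT_MINIMAL_ECX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* From the patch: CPUID.(EAX=14H,ECX=0):ECX bit 31, IP payloads are LIP. */
#define CPUID_14_0_ECX_LIP (1U << 31)

/*
 * Intel PT stays enabled for the guest only when the guest's configured
 * LIP bit equals the host's LIP bit, as in x86_cpu_expand_features().
 */
static bool sketch_lip_matches(uint32_t host_ecx_0, uint64_t guest_feat)
{
    return (host_ecx_0 & CPUID_14_0_ECX_LIP) ==
           (guest_feat & CPUID_14_0_ECX_LIP);
}

/* Guest-visible ECX: the minimal bits plus LIP if the guest has it set. */
static uint32_t sketch_guest_ecx(uint32_t minimal_ecx, uint64_t guest_feat)
{
    uint32_t ecx = minimal_ecx;

    if (guest_feat & CPUID_14_0_ECX_LIP) {
        ecx |= CPUID_14_0_ECX_LIP;
    }
    return ecx;
}
```

The check is symmetric: a guest with LIP on a host without it is rejected
in the same way as the reverse case.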



[PATCH 3/3] virtiofsd: check whether strdup lo.source return NULL in main func

2020-11-10 Thread Haotian Li
In main(), the strdup() that sets lo.source may fail, so check whether
it returned NULL before using lo.source.

Signed-off-by: Haotian Li 
Signed-off-by: Zhiqiang Liu 
---
 tools/virtiofsd/passthrough_ll.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 3e9bbc7a04..0c11134fb5 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -3525,6 +3525,10 @@ int main(int argc, char *argv[])
 }
 } else {
 lo.source = strdup("/");
+if (!lo.source) {
+fuse_log(FUSE_LOG_ERR, "failed to strdup source\n");
+goto err_out1;
+}
 }

 if (lo.xattrmap) {
-- 



[PATCH 2/3] virtiofsd: check whether lo_map_reserve returns NULL in, main func

2020-11-10 Thread Haotian Li
In main(), lo_map_reserve() is called without a NULL check.
If the realloc of new_elems fails in lo_map_grow(),
lo_map_reserve() may return NULL. We should check whether
lo_map_reserve() returned NULL before using the result.

Signed-off-by: Haotian Li 
Signed-off-by: Zhiqiang Liu 
---
 tools/virtiofsd/passthrough_ll.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index ec1008bceb..3e9bbc7a04 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -3433,6 +3433,7 @@ int main(int argc, char *argv[])
 .proc_self_fd = -1,
 };
 struct lo_map_elem *root_elem;
+struct lo_map_elem *reserve_elem;
 int ret = -1;

 /* Don't mask creation mode, kernel already did that */
@@ -3452,8 +3453,17 @@ int main(int argc, char *argv[])
  * [1] Root inode
  */
 lo_map_init(_map);
-lo_map_reserve(_map, 0)->in_use = false;
+reserve_elem = lo_map_reserve(_map, 0);
+if (!reserve_elem) {
+fuse_log(FUSE_LOG_ERR, "failed to alloc reserve_elem.\n");
+goto err_out1;
+}
+reserve_elem->in_use = false;
 root_elem = lo_map_reserve(_map, lo.root.fuse_ino);
+if (!root_elem) {
+fuse_log(FUSE_LOG_ERR, "failed to alloc root_elem.\n");
+goto err_out1;
+}
 root_elem->inode = 

 lo_map_init(_map);
-- 
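
As a rough illustration of the pattern this patch applies, the sketch below uses simplified, hypothetical stand-ins (struct map, map_grow(), map_reserve(), map_setup()) rather than the real virtiofsd types. It only shows the general idea: a reserve function that can return NULL when realloc() fails, and a caller that checks each result before dereferencing it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the virtiofsd map types. */
struct map_elem {
    bool in_use;
    void *payload;
};

struct map {
    struct map_elem *elems;
    size_t nelems;
};

/* Grow the table to new_nelems slots; returns false if realloc() fails. */
static bool map_grow(struct map *m, size_t new_nelems)
{
    struct map_elem *new_elems;
    size_t i;

    if (new_nelems <= m->nelems) {
        return true;
    }
    if (new_nelems > SIZE_MAX / sizeof(*new_elems)) {
        return false; /* the size computation would overflow */
    }
    new_elems = realloc(m->elems, new_nelems * sizeof(*new_elems));
    if (!new_elems) {
        return false; /* the old table is untouched and still owned by m */
    }
    for (i = m->nelems; i < new_nelems; i++) {
        new_elems[i].in_use = false;
        new_elems[i].payload = NULL;
    }
    m->elems = new_elems;
    m->nelems = new_nelems;
    return true;
}

/* Reserve slot key; returns NULL if the table cannot grow or the slot is taken. */
static struct map_elem *map_reserve(struct map *m, size_t key)
{
    if (key == SIZE_MAX || !map_grow(m, key + 1)) {
        return NULL;
    }
    if (m->elems[key].in_use) {
        return NULL;
    }
    m->elems[key].in_use = true;
    return &m->elems[key];
}

/* Caller side: every map_reserve() result is checked before it is used. */
static int map_setup(struct map *m, size_t root_key, void *root)
{
    struct map_elem *elem;

    elem = map_reserve(m, 0);
    if (!elem) {
        return -1;
    }
    elem->in_use = false; /* as in the patch, slot 0 is marked not in use */

    elem = map_reserve(m, root_key);
    if (!elem) {
        return -1;
    }
    elem->payload = root;
    return 0;
}
```

In the patch itself the failure path is "goto err_out1" rather than a return value; the sketch returns -1 only to stay self-contained.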



[PATCH 1/3] tools/virtiofsd/buffer.c: check whether buf is NULL in fuse_bufvec_advance func

2020-11-10 Thread Haotian Li
In fuse_bufvec_advance func, calling fuse_bufvec_current func
may return NULL, so we should check whether buf is NULL before
using it.

Signed-off-by: Haotian Li 
Signed-off-by: Zhiqiang Liu 
---
 tools/virtiofsd/buffer.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/virtiofsd/buffer.c b/tools/virtiofsd/buffer.c
index 27c1377f22..bdc608c221 100644
--- a/tools/virtiofsd/buffer.c
+++ b/tools/virtiofsd/buffer.c
@@ -246,6 +246,10 @@ static int fuse_bufvec_advance(struct fuse_bufvec *bufv, size_t len)
 {
 const struct fuse_buf *buf = fuse_bufvec_current(bufv);

+if (!buf) {
+return 0;
+}
+
 bufv->off += len;
 assert(bufv->off <= buf->size);
 if (bufv->off == buf->size) {
-- 
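
The effect of the added guard can be sketched with a cut-down buffer vector. The types and names below (struct buf, struct bufvec, bufvec_current(), bufvec_advance()) are hypothetical simplifications, not the virtiofsd definitions; the point is only that once the vector is exhausted, the "current" lookup yields NULL and advancing must return early instead of dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for fuse_buf / fuse_bufvec. */
struct buf {
    size_t size;
};

struct bufvec {
    size_t count;       /* number of valid entries in buf[] */
    size_t idx;         /* index of the current buffer */
    size_t off;         /* offset inside the current buffer */
    struct buf buf[4];
};

/* Returns the current buffer, or NULL once idx has run past the end. */
static const struct buf *bufvec_current(const struct bufvec *bufv)
{
    if (bufv->idx < bufv->count) {
        return &bufv->buf[bufv->idx];
    }
    return NULL;
}

/* Advance by len bytes; returns 1 while data remains, 0 when exhausted. */
static int bufvec_advance(struct bufvec *bufv, size_t len)
{
    const struct buf *buf = bufvec_current(bufv);

    if (!buf) {
        return 0;       /* nothing left: do not touch buf->size */
    }

    bufv->off += len;
    assert(bufv->off <= buf->size);
    if (bufv->off == buf->size) {
        bufv->idx++;
        if (bufv->idx == bufv->count) {
            return 0;
        }
        bufv->off = 0;
    }
    return 1;
}
```

Calling bufvec_advance() again after it has returned 0 is then harmless: the NULL check returns before any field of the missing buffer is read.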



[PATCH v3 0/3] virtiofsd: fix some accessing NULL pointer problem

2020-11-10 Thread Haotian Li
Hi,
  We found some potential NULL pointer dereferences in tools/virtiofsd.
The following three patches fix them.

Haotian Li (3):
  tools/virtiofsd/buffer.c: check whether buf is NULL in
fuse_bufvec_advance func
  virtiofsd: check whether lo_map_reserve returns NULL in main func
  virtiofsd: check whether strdup lo.source return NULL in main func.

 tools/virtiofsd/buffer.c |  4 
 tools/virtiofsd/passthrough_ll.c | 16 +++-
 2 files changed, 19 insertions(+), 1 deletion(-)

-- 



[ANNOUNCE] QEMU 5.2.0-rc1 is now available

2020-11-10 Thread Michael Roth
Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
second release candidate for the QEMU 5.2 release.  This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-5.2.0-rc1.tar.xz
  http://download.qemu-project.org/qemu-5.2.0-rc1.tar.xz.sig

You can help improve the quality of the QEMU 5.2 release by testing this
release and reporting bugs on Launchpad:

  https://bugs.launchpad.net/qemu/

The release plan, as well as documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/5.2

Please add entries to the ChangeLog for the 5.2 release below:

  http://wiki.qemu.org/ChangeLog/5.2

Thank you to everyone involved!

Changes since rc0:

c6f28ed507: Update version for v5.2.0-rc1 release (Peter Maydell)
b6c56c8a9a: target/arm/translate-neon.c: Handle VTBL UNDEF case before VFP access check (Peter Maydell)
8006c9842b: tests/qtest/npcm7xx_rng-test: count runs properly (Havard Skinnemoen)
0e5dc77573: hw/arm/nseries: Check return value from load_image_targphys() (Peter Maydell)
44cbf34975: hw/arm/musicpal: Only use qdev_get_gpio_in() when necessary (Philippe Mathieu-Daudé)
498661dd22: hw/arm/musicpal: Don't connect two qemu_irqs directly to the same input (Philippe Mathieu-Daudé)
bdad3654d3: hw/arm/nseries: Remove invalid/unnecessary n8x0_uart_setup() (Philippe Mathieu-Daudé)
2108e5092a: hw/misc/stm32f2xx_syscfg: Remove extraneous IRQ (Philippe Mathieu-Daudé)
509602eed4: hw/arm/armsse: Correct expansion MPC interrupt lines (Philippe Mathieu-Daudé)
604cef3e57: target/arm: Fix neon VTBL/VTBX for len > 1 (Richard Henderson)
bec3c97e0c: hw/arm/virt: Remove dependency on Cortex-A15 MPCore peripherals (Philippe Mathieu-Daudé)
0339c2a86f: docs: add some notes on the sbsa-ref machine (Alex Bennée)
7f350a87e3: target/arm: add space before the open parenthesis '(' (Xinhao Zhang)
6eb55edbab: target/arm: Don't use '#' flag of printf format (Xinhao Zhang)
bdc3b6f570: target/arm: add spaces around operator (Xinhao Zhang)
9df0a97298: ssi: Fix bad printf format specifiers (AlexChen)
9ad5f6b05f: hw/arm/Kconfig: ARM_V7M depends on PTIMER (Andrew Jones)
a58cabd0e3: s390x: Avoid variable size warning in ipl.h (Daniele Buono)
074df27f74: s390x: fix clang 11 warnings in cpu_models.c (Daniele Buono)
ad57e2b1f5: qtest: Update references to parse_escape() in comments (Peter Maydell)
d4e279141b: fuzz: add virtio-blk fuzz target (Dima Stepanov)
704a256da8: docs: add "page source" link to sphinx documentation (Daniel P. Berrangé)
d0f26e68a0: gitlab: force enable docs build in Fedora, Ubuntu, Debian (Daniel P. Berrangé)
4daa9055be: gitlab: publish the docs built during CI (Daniel P. Berrangé)
2deca810d8: configure: surface deprecated targets in the help output (Alex Bennée)
aba378dee6: fuzz: Make fork_fuzz.ld compatible with LLVM's LLD (Daniele Buono)
bb451d2487: scripts/oss-fuzz: give all fuzzers -target names (Alexander Bulekov)
e6a3e1322b: docs/fuzz: update fuzzing documentation post-meson (Alexander Bulekov)
f3a0208f24: docs/fuzz: rST-ify the fuzzing documentation (Alexander Bulekov)
3758e88bb8: MAINTAINERS: Add gitlab-pipeline-status script to GitLab CI section (Philippe Mathieu-Daudé)
c3ab5df2f5: linux-user/sparc: Don't zero high half of PC, NPC, PSR in sigreturn (Peter Maydell)
266b41582e: linux-user/sparc: Correct set/get_context handling of fp and i7 (Peter Maydell)
b8ae597f0e: linux-user/sparc: Fix errors in target_ucontext structures (Peter Maydell)
96338fefc1: hw/intc/ibex_plic: Clear the claim register when read (Alistair Francis)
7687537ab0: target/riscv: Split the Hypervisor execute load helpers (Alistair Francis)
743077b35b: target/riscv: Remove the hyp load and store functions (Alistair Francis)
1c1c060aa8: target/riscv: Remove the HS_TWO_STAGE flag (Alistair Francis)
3e5979046f: target/riscv: Set the virtualised MMU mode when doing hyp accesses (Alistair Francis)
c445593d30: target/riscv: Add a virtualised MMU Mode (Alistair Francis)
b1b9ab1c04: qga: fix missing closedir() in qmp_guest_get_disks() (Michael Roth)
d669ed6ab0: block: make bdrv_drop_intermediate() less wrong (Vladimir Sementsov-Ogievskiy)
313274bbd4: block: add bdrv_replace_node_common() (Vladimir Sementsov-Ogievskiy)
6c5f7b3a10: block: add forgotten bdrv_abort_perm_update() to bdrv_co_invalidate_cache() (Vladimir Sementsov-Ogievskiy)
5f14f31d2b: block: Fix some code style problems, "foo* bar" should be "foo *bar" (shiliyang)
7433a6860b: gitlab-ci: Drop generic cache rule (Philippe Mathieu-Daudé)
dccaea2514: tests/qtest/tpm: Remove redundant check in the tpm_test_swtpm_test() (AlexChen)
3dc057923d: qtest: Fix bad printf format specifiers (AlexChen)
8a47836548: device-crash-test: Check if path is actually an executable file (Eduardo Habkost)
45716765b1: tests/vm: update openbsd to release 6.8 (Brad Smith)
a3f6be81aa: meson: always include contrib/libvhost-user (Stefan Hajnoczi)
122860bae7: 

Re: [PATCH v1 for 5.1 00/10] various fixes (CI, Xen, plugins)

2020-11-10 Thread Alex Bennée


Alex Bennée  writes:

> Hi,
>
> This collects together a bunch of fixes for 5.2:

Doh, subject did not match body, I of course mean for the current
release candidate.

>   - a few resource leak fixes for plugins
>   - Xen on arm64 build fixes (from my larger Xen series)
>   - a couple of build and CI fixes
>   - a tweak to the gitlab status script


-- 
Alex Bennée



Re: [PULL 0/6] Misc fixes for QEMU 5.2-rc2

2020-11-10 Thread Peter Maydell
On Tue, 10 Nov 2020 at 11:35, Paolo Bonzini  wrote:
>
> The following changes since commit 3493c36f0371777c62d1d72b205b0eb6117e2156:
>
>   Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20201106' into 
> staging (2020-11-06 13:43:28 +)
>
> are available in the Git repository at:
>
>   https://gitlab.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 6e853291036573c8831f486fc7d76b779b0ac567:
>
>   pvpanic: Advertise the PVPANIC_CRASHLOADED event support (2020-11-10 
> 06:27:17 -0500)
>
> 
> Bug fixes
>
> 

Fails "make check" on all platforms:

MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
QTEST_QEMU_IMG=./qemu-img
G_TEST_DBUS_DAEMON=/home/petmay01/linaro/qemu-for-merges/tests/dbus-vmstate-daemon.sh
QTEST_QEMU_BINARY=./qemu-system-i386 tests/qtest/pvpanic-test --tap -k
**
ERROR:../../tests/qtest/pvpanic-test.c:23:test_panic: assertion failed (val == 1): (3 == 1)
ERROR qtest-i386/pvpanic-test - Bail out!
ERROR:../../tests/qtest/pvpanic-test.c:23:test_panic: assertion failed (val == 1): (3 == 1)

thanks
-- PMM



[PATCH for-5.2 v3 4/4] hw/net/can/ctucan_core: Use stl_le_p to write to tx_buffers

2020-11-10 Thread Pavel Pisa
From: Peter Maydell 

Instead of casting an address within a uint8_t array to a
uint32_t*, use stl_le_p(). This handles possibly misaligned
addresses which would otherwise crash on some hosts.

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Pavel Pisa 
Tested-by: Pavel Pisa 
---
 hw/net/can/ctucan_core.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/net/can/ctucan_core.c b/hw/net/can/ctucan_core.c
index f49c76261c..d171c372e0 100644
--- a/hw/net/can/ctucan_core.c
+++ b/hw/net/can/ctucan_core.c
@@ -303,11 +303,9 @@ void ctucan_mem_write(CtuCanCoreState *s, hwaddr addr, uint64_t val,
 addr -= CTU_CAN_FD_TXTB1_DATA_1;
 buff_num = addr / CTUCAN_CORE_TXBUFF_SPAN;
 addr %= CTUCAN_CORE_TXBUFF_SPAN;
-addr &= ~3;
 if ((buff_num < CTUCAN_CORE_TXBUF_NUM) &&
-(addr < sizeof(s->tx_buffer[buff_num].data))) {
-uint32_t *bufp = (uint32_t *)(s->tx_buffer[buff_num].data + addr);
-*bufp = cpu_to_le32(val);
+((addr + size) <= sizeof(s->tx_buffer[buff_num].data))) {
+stn_le_p(s->tx_buffer[buff_num].data + addr, size, val);
 }
 } else {
 switch (addr & ~3) {
-- 
2.20.1
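
To make the idea concrete, here is a minimal sketch of a size-parameterised little-endian store combined with the "(addr + size) <= buffer length" check. It is not QEMU's stn_le_p() implementation; store_le() and txbuf_write() are hypothetical helpers, and the 80-byte length is an assumption derived from the "20 32-bit registers" mentioned in patch 1/4. Writing byte by byte avoids both the misaligned uint32_t access and any dependence on host endianness.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed buffer length: 20 words of 32 bits each (hypothetical name). */
#define SKETCH_TXBUF_LEN 80

/* Store the low `size` bytes of val at dst in little-endian order. */
static void store_le(uint8_t *dst, unsigned size, uint64_t val)
{
    unsigned i;

    for (i = 0; i < size; i++) {
        dst[i] = (uint8_t)(val >> (8 * i));
    }
}

/*
 * Write `size` bytes at offset `off`, but only if the whole access fits.
 * The comparison is written so that off + size cannot overflow.
 */
static bool txbuf_write(uint8_t *buf, size_t buf_len, size_t off,
                        unsigned size, uint64_t val)
{
    if (size > 8 || off > buf_len || size > buf_len - off) {
        return false;
    }
    store_le(buf + off, size, val);
    return true;
}
```

With this shape, a 2-byte write at offset 78 of an 80-byte buffer is accepted, while a 4-byte write at the same offset is rejected, which is the boundary case the "<=" comparison in the patch is meant to get right.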




Re: [PATCH for-5.2 v2 1/4] hw/net/can/ctucan: Don't allow guest to write off end of tx_buffer

2020-11-10 Thread Pavel Pisa
Hello Peter,

On Tuesday 10 of November 2020 22:18:45 Peter Maydell wrote:
> If you've got a modified patch set that you've tested, would
> you mind sending it out to the list? That would avoid my
> possibly making mistakes in updating patches on my end and
> then requiring you to repeat the testing.

OK, I have tried to send it with your authorship, with my
Signed-off-by on the patches which I have slightly
modified and my Acked-by on those which should stay
exactly the same. If you prefer another style, send me a hint.

Thanks very much for helping us make our code better,

 Pavel Pisa



[PATCH for-5.2 v3 3/4] hw/net/can/ctucan_core: Handle big-endian hosts

2020-11-10 Thread Pavel Pisa
From: Peter Maydell 

The ctucan driver defines types for its registers which are a union
of a uint32_t with a struct with bitfields for the individual
fields within that register. This is a bad idea, because bitfields
aren't portable. The ctu_can_fd_regs.h header works around the
most glaring of the portability issues by defining the
fields in two different orders depending on the setting of the
__LITTLE_ENDIAN_BITFIELD define. However, in ctucan_core.h this
is unconditionally set to 1, which is wrong for big-endian hosts.

Set it only if HOST_WORDS_BIGENDIAN is not set. There is no need
for a "have we defined it already" guard, because the only place
that should set it is ctucan_core.h, which has the usual
double-inclusion guard.

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Acked-by: Pavel Pisa 
Tested-by: Pavel Pisa 
---
 hw/net/can/ctucan_core.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/net/can/ctucan_core.h b/hw/net/can/ctucan_core.h
index f21cb1c5ec..bbc09ae067 100644
--- a/hw/net/can/ctucan_core.h
+++ b/hw/net/can/ctucan_core.h
@@ -31,8 +31,7 @@
 #include "exec/hwaddr.h"
 #include "net/can_emu.h"
 
-
-#ifndef __LITTLE_ENDIAN_BITFIELD
+#ifndef HOST_WORDS_BIGENDIAN
 #define __LITTLE_ENDIAN_BITFIELD 1
 #endif
 
-- 
2.20.1
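
The portability problem described above can be sketched in isolation. Everything below is illustrative: union sketch_reg, its field names and widths, and the SKETCH_* macros are hypothetical, and the byte-order test relies on the GCC/Clang predefined macro __BYTE_ORDER__ instead of QEMU's HOST_WORDS_BIGENDIAN. The sketch declares the bitfields in two orders and compares the result against plain shift-and-mask accessors, which do not depend on bitfield layout at all.

```c
#include <stdint.h>

/* Pick the bitfield order from the host byte order (GCC/Clang macro). */
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define SKETCH_BIG_ENDIAN_BITFIELD 1
#else
#define SKETCH_LITTLE_ENDIAN_BITFIELD 1
#endif

/* Hypothetical register: mode in bits 3..0, prio in bits 6..4. */
union sketch_reg {
    uint32_t u32;
    struct {
#ifdef SKETCH_LITTLE_ENDIAN_BITFIELD
        uint32_t mode : 4;
        uint32_t prio : 3;
        uint32_t reserved : 25;
#else
        uint32_t reserved : 25;
        uint32_t prio : 3;
        uint32_t mode : 4;
#endif
    } s;
};

/* Bitfield-based accessors: correct only if the order above matches the host. */
static unsigned reg_mode(uint32_t raw)
{
    union sketch_reg r;

    r.u32 = raw;
    return r.s.mode;
}

static unsigned reg_prio(uint32_t raw)
{
    union sketch_reg r;

    r.u32 = raw;
    return r.s.prio;
}

/* Shift-and-mask accessors: independent of bitfield layout. */
static unsigned reg_mode_portable(uint32_t raw)
{
    return raw & 0xfu;
}

static unsigned reg_prio_portable(uint32_t raw)
{
    return (raw >> 4) & 0x7u;
}
```

If the little-endian ordering were selected unconditionally, as the header did before this patch, the bitfield accessors would disagree with the shift-and-mask ones on a big-endian host with the usual GCC/Clang layout.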




[PATCH for-5.2 v3 2/4] hw/net/can/ctucan: Avoid unused value in ctucan_send_ready_buffers()

2020-11-10 Thread Pavel Pisa
From: Peter Maydell 

Coverity points out that in ctucan_send_ready_buffers() we
set buff_st_mask = 0xf << (i * 4) inside the loop, but then
we never use it before overwriting it later.

The only thing we use the mask for is as part of the code that is
inserting the new buff_st field into tx_status.  That is more
comprehensibly written using deposit32(), so do that and drop the
mask variable entirely.

We also update the buff_st local variable at multiple points
during this function, but nothing can ever see these
intermediate values, so just drop those, write the final
TXT_TOK as a fixed constant value, and collapse the only
remaining set/use of buff_st down into an extract32().

Fixes: Coverity CID 1432869
Signed-off-by: Peter Maydell 
Acked-by: Pavel Pisa 
Tested-by: Pavel Pisa 
---
 hw/net/can/ctucan_core.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/hw/net/can/ctucan_core.c b/hw/net/can/ctucan_core.c
index 8486f429d7..f49c76261c 100644
--- a/hw/net/can/ctucan_core.c
+++ b/hw/net/can/ctucan_core.c
@@ -240,8 +240,6 @@ static void ctucan_send_ready_buffers(CtuCanCoreState *s)
 uint8_t *pf;
 int buff2tx_idx;
 uint32_t tx_prio_max;
-unsigned int buff_st;
-uint32_t buff_st_mask;
 
 if (!s->mode_settings.s.ena) {
 return;
@@ -256,10 +254,7 @@ static void ctucan_send_ready_buffers(CtuCanCoreState *s)
 for (i = 0; i < CTUCAN_CORE_TXBUF_NUM; i++) {
 uint32_t prio;
 
-buff_st_mask = 0xf << (i * 4);
-buff_st = (s->tx_status.u32 >> (i * 4)) & 0xf;
-
-if (buff_st != TXT_RDY) {
+if (extract32(s->tx_status.u32, i * 4, 4) != TXT_RDY) {
 continue;
 }
 prio = (s->tx_priority.u32 >> (i * 4)) & 0x7;
@@ -271,10 +266,7 @@ static void ctucan_send_ready_buffers(CtuCanCoreState *s)
 if (buff2tx_idx == -1) {
 break;
 }
-buff_st_mask = 0xf << (buff2tx_idx * 4);
-buff_st = (s->tx_status.u32 >> (buff2tx_idx * 4)) & 0xf;
 int_stat.u32 = 0;
-buff_st = TXT_RDY;
 pf = s->tx_buffer[buff2tx_idx].data;
 ctucan_buff2frame(pf, );
 s->status.s.idle = 0;
@@ -283,12 +275,11 @@ static void ctucan_send_ready_buffers(CtuCanCoreState *s)
 s->status.s.idle = 1;
 s->status.s.txs = 0;
 s->tx_fr_ctr.s.tx_fr_ctr_val++;
-buff_st = TXT_TOK;
 int_stat.s.txi = 1;
 int_stat.s.txbhci = 1;
 s->int_stat.u32 |= int_stat.u32 & ~s->int_mask.u32;
-s->tx_status.u32 = (s->tx_status.u32 & ~buff_st_mask) |
-(buff_st << (buff2tx_idx * 4));
+s->tx_status.u32 = deposit32(s->tx_status.u32,
+ buff2tx_idx * 4, 4, TXT_TOK);
 } while (1);
 }
 
-- 
2.20.1
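
For readers unfamiliar with the helpers, the following sketch shows what an extract/deposit pair does to a 4-bit status field. field_get32() and field_set32() are local stand-ins written for this example, intended to behave like the extract32() and deposit32() calls in the patch; the SK_TXT_* values and SK_TXBUF_NUM are hypothetical placeholders, not the values from the CTU CAN FD headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status codes and buffer count, for illustration only. */
#define SK_TXT_RDY   0x1u
#define SK_TXT_TOK   0x4u
#define SK_TXBUF_NUM 4

/* Return the `length`-bit field of value that starts at bit `start`. */
static uint32_t field_get32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Return value with the same field replaced by fieldval (excess bits dropped). */
static uint32_t field_set32(uint32_t value, int start, int length,
                            uint32_t fieldval)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length <= 32 - start);
    mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Index of the first buffer whose 4-bit status is "ready", or -1. */
static int first_ready(uint32_t tx_status)
{
    int i;

    for (i = 0; i < SK_TXBUF_NUM; i++) {
        if (field_get32(tx_status, i * 4, 4) == SK_TXT_RDY) {
            return i;
        }
    }
    return -1;
}
```

Compared with the open-coded mask and shift that the patch removes, the field position and width appear once per call, so there is no separate mask variable to keep in sync.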




[PATCH for-5.2 v3 0/4] hw/net/can/ctucan: fix Coverity and other issues

2020-11-10 Thread Pavel Pisa
Credit for finding and fixing these issues goes to Peter Maydell.

This patchset fixes a couple of issues spotted by Coverity:
 * incorrect address checks meant the guest could write off the
   end of the tx_buffer arrays
 * we had an unused value in ctucan_send_ready_buffers()
and also some I noticed while reading the code:
 * we don't adjust the device's non-portable use of bitfields
   on bigendian hosts
 * we should use stl_le_p() rather than casting uint8_t* to
   uint32_t*

Tested with "make check" only.

Changes v1->v2: don't assert() the can't-happen case in patch 1,
to allow for future adjustment of #defines that correspond to
h/w synthesis parameters.

Changes v2->v3: minor corrections of range checking,
support for unaligned and partial word writes into Tx
buffers. Tested with an x86_64 guest on an x86_64 host and a
big-endian MIPS guest on an x86_64 host (Pavel Pisa).

Peter Maydell (4):
  hw/net/can/ctucan: Don't allow guest to write off end of tx_buffer
  hw/net/can/ctucan: Avoid unused value in ctucan_send_ready_buffers()
  hw/net/can/ctucan_core: Handle big-endian hosts
  hw/net/can/ctucan_core: Use stl_le_p to write to tx_buffers

 hw/net/can/ctucan_core.c | 23 +++
 hw/net/can/ctucan_core.h |  3 +--
 2 files changed, 8 insertions(+), 18 deletions(-)

-- 
2.20.1




[PATCH for-5.2 v3 1/4] hw/net/can/ctucan: Don't allow guest to write off end of tx_buffer

2020-11-10 Thread Pavel Pisa
From: Peter Maydell 

The ctucan device has 4 CAN bus cores, each of which has a set of 20
32-bit registers for writing the transmitted data. The registers are
however not contiguous; each core's buffer starts 0x100 bytes after
the previous one.

We got the checks on the address wrong in the ctucan_mem_write()
function:
 * the first "is addr in range at all" check allowed
   addr == CTUCAN_CORE_MEM_SIZE, which is actually the first
   byte off the end of the range
 * the decode of addresses into core-number plus offset in the
   tx buffer for that core failed to check that the offset was
   in range, so the guest could write off the end of the
   tx_buffer[] array

NB: currently the values of CTUCAN_CORE_MEM_SIZE, CTUCAN_CORE_TXBUF_NUM,
etc, make "buff_num >= CTUCAN_CORE_TXBUF_NUM" impossible, but we
retain this as a runtime check rather than an assertion to permit
those values to be changed in future (in hardware they are
configurable synthesis parameters).

Fix the top level check, and check the offset is within the buffer.

Fixes: Coverity CID 1432874
Signed-off-by: Peter Maydell 
Signed-off-by: Pavel Pisa 
Tested-by: Pavel Pisa 
---
 hw/net/can/ctucan_core.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/net/can/ctucan_core.c b/hw/net/can/ctucan_core.c
index d20835cd7e..8486f429d7 100644
--- a/hw/net/can/ctucan_core.c
+++ b/hw/net/can/ctucan_core.c
@@ -303,7 +303,7 @@ void ctucan_mem_write(CtuCanCoreState *s, hwaddr addr, uint64_t val,
 DPRINTF("write 0x%02llx addr 0x%02x\n",
 (unsigned long long)val, (unsigned int)addr);
 
-if (addr > CTUCAN_CORE_MEM_SIZE) {
+if (addr >= CTUCAN_CORE_MEM_SIZE) {
 return;
 }
 
@@ -312,7 +312,9 @@ void ctucan_mem_write(CtuCanCoreState *s, hwaddr addr, uint64_t val,
 addr -= CTU_CAN_FD_TXTB1_DATA_1;
 buff_num = addr / CTUCAN_CORE_TXBUFF_SPAN;
 addr %= CTUCAN_CORE_TXBUFF_SPAN;
-if (buff_num < CTUCAN_CORE_TXBUF_NUM) {
+addr &= ~3;
+if ((buff_num < CTUCAN_CORE_TXBUF_NUM) &&
+(addr < sizeof(s->tx_buffer[buff_num].data))) {
 uint32_t *bufp = (uint32_t *)(s->tx_buffer[buff_num].data + addr);
 *bufp = cpu_to_le32(val);
 }
-- 
2.20.1
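
The two checks described in the commit message can be sketched as a standalone decode function. All SK_* constants are hypothetical placeholders for the CTUCAN_* defines (only the 0x100 spacing and the 20-word buffer size come from the text above), and decode_tx_addr() is an illustrative helper, not part of the device model.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder values for the CTUCAN_* defines. */
#define SK_MEM_SIZE    0x500u
#define SK_TXTB1_DATA  0x100u
#define SK_TXBUF_SPAN  0x100u
#define SK_TXBUF_NUM   4u
#define SK_TXBUF_LEN   80u    /* 20 words of 32 bits */

/*
 * Decode a register address into (buffer number, word-aligned offset).
 * Returns false for anything that is not a valid tx buffer word.
 */
static bool decode_tx_addr(uint64_t addr, unsigned *buff_num, size_t *offset)
{
    uint64_t rel, num, off;

    if (addr >= SK_MEM_SIZE) {
        return false;           /* note ">=": MEM_SIZE itself is out of range */
    }
    if (addr < SK_TXTB1_DATA) {
        return false;           /* ordinary register, not a tx buffer */
    }

    rel = addr - SK_TXTB1_DATA;
    num = rel / SK_TXBUF_SPAN;
    off = (rel % SK_TXBUF_SPAN) & ~(uint64_t)3;   /* round down to a word */

    /* The span is larger than the buffer, so the offset needs its own check. */
    if (num >= SK_TXBUF_NUM || off >= SK_TXBUF_LEN) {
        return false;
    }

    *buff_num = (unsigned)num;
    *offset = (size_t)off;
    return true;
}
```

Because the offset is rounded down to a multiple of 4 and the assumed buffer length is itself a multiple of 4, "off < length" is enough here to guarantee that a full 32-bit store fits.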




Re: [RFC PATCH for-QEMU-5.2] vfio: Make migration support experimental

2020-11-10 Thread Neo Jia
On Tue, Nov 10, 2020 at 08:20:50AM -0700, Alex Williamson wrote:
> External email: Use caution opening links or attachments
> 
> 
> On Tue, 10 Nov 2020 19:46:20 +0530
> Kirti Wankhede  wrote:
> 
> > On 11/10/2020 2:40 PM, Dr. David Alan Gilbert wrote:
> > > * Alex Williamson (alex.william...@redhat.com) wrote:
> > >> On Mon, 9 Nov 2020 19:44:17 +
> > >> "Dr. David Alan Gilbert"  wrote:
> > >>
> > >>> * Alex Williamson (alex.william...@redhat.com) wrote:
> >  Per the proposed documentation for vfio device migration:
> > 
> > Dirty pages are tracked when device is in stop-and-copy phase
> > because if pages are marked dirty during pre-copy phase and
> > content is transferred from source to destination, there is no
> > way to know newly dirtied pages from the point they were copied
> > earlier until device stops. To avoid repeated copy of same
> > content, pinned pages are marked dirty only during
> > stop-and-copy phase.
> > 
> >  Essentially, since we don't have hardware dirty page tracking for
> >  assigned devices at this point, we consider any page that is pinned
> >  by an mdev vendor driver or pinned and mapped through the IOMMU to
> >  be perpetually dirty.  In the worst case, this may result in all of
> >  guest memory being considered dirty during every iteration of live
> >  migration.  The current vfio implementation of migration has chosen
> >  to mask device dirtied pages until the final stages of migration in
> >  order to avoid this worst case scenario.
> > 
> >  Allowing the device to implement a policy decision to prioritize
> >  reduced migration data like this jeopardizes QEMU's overall ability
> >  to implement any degree of service level guarantees during migration.
> >  For example, any estimates towards achieving acceptable downtime
> >  margins cannot be trusted when such a device is present.  The vfio
> >  device should participate in dirty page tracking to the best of its
> >  ability throughout migration, even if that means the dirty footprint
> >  of the device impedes migration progress, allowing both QEMU and
> >  higher level management tools to decide whether to continue the
> >  migration or abort due to failure to achieve the desired behavior.
> > >>>
> > >>> I don't feel particularly badly about the decision to squash it in
> > >>> during the stop-and-copy phase; for devices where the pinned memory
> > >>> is large, I don't think doing it during the main phase makes much sense;
> > >>> especially if you then have to deal with tracking changes in pinning.
> > >>
> > >>
> > >> AFAIK the kernel support for tracking changes in page pinning already
> > >> exists, this is largely the vfio device in QEMU that decides when to
> > >> start exposing the device dirty footprint to QEMU.  I'm a bit surprised
> > >> by this answer though, we don't really know what the device memory
> > >> footprint is.  It might be large, it might be nothing, but by not
> > >> participating in dirty page tracking until the VM is stopped, we can't
> > >> know what the footprint is and how it will affect downtime.  Is it
> > >> really the place of a QEMU device driver to impose this sort of policy?
> > >
> > > If it could actually track changes then I'd agree we shouldn't impose
> > > any policy; but if it's just marking the whole area as dirty we're going
> > > to need a bodge somewhere; this bodge doesn't look any worse than the
> > > others to me.
> > >
> > >>
> > >>> Having said that, I agree with marking it as experimental, because
> > >>> I'm dubious how useful it will be for the same reason, I worry
> > >>> about whether the downtime will be so large to make it pointless.
> > >>
> >
> > Not all device state is large; for example, a NIC might only report
> > currently mapped RX buffers, which are usually not more than 1GB and could
> > be as low as tens of MB. A GPU might or might not have large data; that
> > depends on its use cases.
> 
> 
> Right, it's only if we have a vendor driver that doesn't pin any memory
> when dirty tracking is enabled and we're running without a viommu that
> we would expect all of guest memory to be continuously dirty.
> 
> 
> > >> TBH I think that's the wrong reason to mark it experimental.  There's
> > >> clearly demand for vfio device migration and even if the practical use
> > >> cases are initially small, they will expand over time and hardware will
> > >> get better.  My objection is that the current behavior masks the
> > >> hardware and device limitations, leading to unrealistic expectations.
> > >> If the user expects minimal downtime, configures convergence to account
> > >> for that, QEMU thinks it can achieve it, and then the device marks
> > >> everything dirty, that's not supportable.
> > >
> > > Yes, agreed.
> >
> > Yes, there is demand for vfio device migration and many device owners
> > started scoping and 

Re: [PATCH for-5.2 v2 1/4] hw/net/can/ctucan: Don't allow guest to write off end of tx_buffer

2020-11-10 Thread Peter Maydell
On Tue, 10 Nov 2020 at 19:32, Pavel Pisa  wrote:
>
> Hello Peter,
>
> On Tuesday 10 of November 2020 19:24:03 Peter Maydell wrote:
> > For unaligned accesses, for 6.0, I think the code for doing
> > them to the txbuff at least is straightforward:
> >
> >if (buff_num < CTUCAN_CORE_TXBUF_NUM &&
> >(addr + size) < CTUCAN_CORE_MSG_MAX_LEN) {
> >   stn_le_p(s->tx_buffer[buff_num].data + addr, size, val);
> >}
> >
> > (stn_le_p takes care of doing an appropriate-width write.)
>
> Thanks, great to know, I like that much.
> Only small nitpicking, it should be (addr + size) <= CTUCAN_CORE_MSG_MAX_LEN
>
> So whole code I am testing now
>
> if (addr >= CTU_CAN_FD_TXTB1_DATA_1) {
> int buff_num;
> addr -= CTU_CAN_FD_TXTB1_DATA_1;
> buff_num = addr / CTUCAN_CORE_TXBUFF_SPAN;
> addr %= CTUCAN_CORE_TXBUFF_SPAN;
> if ((buff_num < CTUCAN_CORE_TXBUF_NUM) &&
> ((addr + size) <= sizeof(s->tx_buffer[buff_num].data))) {
> stn_le_p(s->tx_buffer[buff_num].data + addr, size, val);
> }
> } else {
>
> So I have applied your whole series with the above update. All works correctly
> on x86_64 Linux host and with Linux x86_64 and MIPS big endian guests.
>
> Please update to this combination.

If you've got a modified patch set that you've tested, would
you mind sending it out to the list? That would avoid my
possibly making mistakes in updating patches on my end and
then requiring you to repeat the testing.

thanks
-- PMM



Re: [PULL 00/16] target-arm queue

2020-11-10 Thread Peter Maydell
On Tue, 10 Nov 2020 at 11:19, Peter Maydell  wrote:
>
> Patches for rc1: nothing major, just some minor bugfixes and
> code cleanups.
>
> -- PMM
>
> The following changes since commit f7e1914adad8885a5d4c70239ab90d901ed97e9f:
>
>   Merge remote-tracking branch 
> 'remotes/alistair/tags/pull-riscv-to-apply-20201109' into staging (2020-11-10 
> 09:24:56 +)
>
> are available in the Git repository at:
>
>   https://git.linaro.org/people/pmaydell/qemu-arm.git 
> tags/pull-target-arm-20201110
>
> for you to fetch changes up to b6c56c8a9a4064ea783f352f43c5df6231a110fa:
>
>   target/arm/translate-neon.c: Handle VTBL UNDEF case before VFP access check 
> (2020-11-10 11:03:48 +)
>
> 
> target-arm queue:
>  * hw/arm/Kconfig: ARM_V7M depends on PTIMER
>  * Minor coding style fixes
>  * docs: add some notes on the sbsa-ref machine
>  * hw/arm/virt: Remove dependency on Cortex-A15 MPCore peripherals
>  * target/arm: Fix neon VTBL/VTBX for len > 1
>  * hw/arm/armsse: Correct expansion MPC interrupt lines
>  * hw/misc/stm32f2xx_syscfg: Remove extraneous IRQ
>  * hw/arm/nseries: Remove invalid/unnecessary n8x0_uart_setup()
>  * hw/arm/musicpal: Don't connect two qemu_irqs directly to the same input
>  * hw/arm/musicpal: Only use qdev_get_gpio_in() when necessary
>  * hw/arm/nseries: Check return value from load_image_targphys()
>  * tests/qtest/npcm7xx_rng-test: count runs properly
>  * target/arm/translate-neon.c: Handle VTBL UNDEF case before VFP access check
>
> 


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.2
for any user-visible changes.

-- PMM



Re: [PATCH for-5.2 v2 2/4] hw/net/can/ctucan: Avoid unused value in ctucan_send_ready_buffers()

2020-11-10 Thread Peter Maydell
On Tue, 10 Nov 2020 at 19:37, Pavel Pisa  wrote:
>
> Hello Peter,
>
> On Tuesday 10 of November 2020 18:06:02 Peter Maydell wrote:
> > @@ -256,10 +254,7 @@ static void ctucan_send_ready_buffers(CtuCanCoreState *s)
> >  for (i = 0; i < CTUCAN_CORE_TXBUF_NUM; i++) {
> >  uint32_t prio;
> >
> > -buff_st_mask = 0xf << (i * 4);
> > -buff_st = (s->tx_status.u32 >> (i * 4)) & 0xf;
> > -
> > -if (buff_st != TXT_RDY) {
> > +if (extract32(s->tx_status.u32, i * 4, 4) != TXT_RDY) {
> >  continue;
> >  }
> >  prio = (s->tx_priority.u32 >> (i * 4)) & 0x7;
> > @@ -271,10 +266,7 @@ static void ctucan_send_ready_buffers(CtuCanCoreState *s)
> >  if (buff2tx_idx == -1) {
> >  break;
> >  }
> > -buff_st_mask = 0xf << (buff2tx_idx * 4);
> > -buff_st = (s->tx_status.u32 >> (buff2tx_idx * 4)) & 0xf;
> >  int_stat.u32 = 0;
> > -buff_st = TXT_RDY;
>
> I would prefer to add the following line there, even though it has no real effect
>
>  +s->tx_status.u32 = deposit32(s->tx_status.u32,
>  + buff2tx_idx * 4, 4, TXT_RDY);

I mentioned this in a reply to my v1 series. The buffer status
in the tx_status field is already TXT_RDY, so there is no state
change happening here to document as far as I can tell ?

thanks
-- PMM



Re: [PATCH v3 08/11] gitlab-ci: Extract common job definition as 'native_common_job'

2020-11-10 Thread Wainer dos Santos Moschetta



On 11/8/20 8:19 PM, Philippe Mathieu-Daudé wrote:

Extract the common definitions shared by '.native_build_job'
and '.native_test_job' to '.native_common_job'.

Signed-off-by: Philippe Mathieu-Daudé 
---
  .gitlab-ci.yml | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index d4526323169..f708573884e 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -13,9 +13,12 @@ include:
- local: '/.gitlab-ci.d/containers.yml'
- local: '/.gitlab-ci.d/crossbuilds.yml'
  
-.native_build_job:

-  stage: build
+.native_common_job:
image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest


Do you envision "native_common_job" gaining more common properties?

Asking because it creates another indirection just to replace two "image"s.

Anyway,

Reviewed-by: Wainer dos Santos Moschetta 



+
+.native_build_job:
+  extends: .native_common_job
+  stage: build
before_script:
  - JOBS=$(expr $(nproc) + 1)
  - sed -i s,git.qemu.org/git,gitlab.com/qemu-project, .gitmodules
@@ -35,8 +38,8 @@ include:
fi
  
  .native_test_job:

+  extends: .native_common_job
stage: test
-  image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest
script:
  - cd build
  - find . -type f -exec touch {} +





Re: [RFC v1 09/10] i386: split cpu.c and defer x86 models registration

2020-11-10 Thread Eduardo Habkost
On Tue, Nov 10, 2020 at 09:39:37PM +0100, Paolo Bonzini wrote:
> On 10/11/20 18:55, Eduardo Habkost wrote:
> > > I think we should not try to implement interfaces conditionally (i.e. have
> > > TYPE_X86_ACCEL implemented only on qemu-system-{i386,x86_64} and not
> > > qemu-system-arm), even if technically the accel/ objects are per-target
> > > (specific_ss) rather than common.
> > If the accel objects are already per target, it seems appropriate
> > to have a QOM type hierarchy that reflects that.
> > 
> > `qemu-system-x86_64 -accel kvm` would create a kvm-x86_64-accel
> > object, but `qemu-system-arm -accel kvm` would create a
> > kvm-arm-accel.
> 
> ... and fall back to kvm-accel?  So accel_find would be the only place to
> change.

Sounds good.  This way we don't need to convert all accelerators
or all targets at the same time.

-- 
Eduardo
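
The lookup order discussed here (try the target-specific type first, then fall back to the generic one) can be sketched with a plain table of type names. This is not QEMU's accel_find() or its QOM API; registered_types[], type_lookup() and accel_find_sketch() are hypothetical, and only the "kvm-x86_64-accel" / "kvm-accel" naming scheme is taken from the messages above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical registry: only x86_64 has a target-specific KVM type so far. */
static const char *const registered_types[] = {
    "kvm-x86_64-accel",
    "kvm-accel",
    "tcg-accel",
};

/* Returns the registered name if it exists, NULL otherwise. */
static const char *type_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(registered_types) / sizeof(registered_types[0]); i++) {
        if (strcmp(registered_types[i], name) == 0) {
            return registered_types[i];
        }
    }
    return NULL;
}

/* Try "<accel>-<target>-accel" first, then fall back to "<accel>-accel". */
static const char *accel_find_sketch(const char *accel, const char *target)
{
    char name[64];
    const char *found;
    int n;

    n = snprintf(name, sizeof(name), "%s-%s-accel", accel, target);
    if (n > 0 && (size_t)n < sizeof(name)) {
        found = type_lookup(name);
        if (found) {
            return found;
        }
    }

    n = snprintf(name, sizeof(name), "%s-accel", accel);
    if (n > 0 && (size_t)n < sizeof(name)) {
        return type_lookup(name);
    }
    return NULL;
}
```

With such a fallback, targets that have not yet been converted keep resolving to the generic type, which is what allows accelerators and targets to be converted one at a time.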




Re: [PATCH v3 07/11] gitlab-ci: Extract common job definition as 'cross_common_job'

2020-11-10 Thread Wainer dos Santos Moschetta



On 11/8/20 8:19 PM, Philippe Mathieu-Daudé wrote:

Extract the common definitions shared by '.cross_system_build_job'
and '.cross_user_build_job' to '.cross_common_job'.

Signed-off-by: Philippe Mathieu-Daudé 
---
  .gitlab-ci.d/crossbuilds.yml | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

Reviewed-by: Wainer dos Santos Moschetta 


diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml
index 099949aaef3..701550f028c 100644
--- a/.gitlab-ci.d/crossbuilds.yml
+++ b/.gitlab-ci.d/crossbuilds.yml
@@ -1,7 +1,9 @@
-
-.cross_system_build_job:
+.cross_common_job:
stage: build
image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest
+
+.cross_system_build_job:
+  extends: .cross_common_job
timeout: 80m
script:
  - mkdir build
@@ -14,8 +16,7 @@
  - make -j$(expr $(nproc) + 1) all check-build
  
  .cross_user_build_job:

-  stage: build
-  image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest
+  extends: .cross_common_job
script:
  - mkdir build
  - cd build





Re: [PATCH v3 06/11] gitlab-ci: Rename acceptance_test_job -> integration_test_job

2020-11-10 Thread Wainer dos Santos Moschetta
Cleber once said "acceptance" wasn't a good name for those tests.
Indeed, "integration" is widely used, so I'm okay with this renaming.


On 11/8/20 8:19 PM, Philippe Mathieu-Daudé wrote:

Signed-off-by: Philippe Mathieu-Daudé 
---
  .gitlab-ci.yml | 18 +-
  1 file changed, 9 insertions(+), 9 deletions(-)

Reviewed-by: Wainer dos Santos Moschetta 


diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 0ef814764a0..d4526323169 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -42,7 +42,7 @@ include:
  - find . -type f -exec touch {} +
  - make $MAKE_CHECK_ARGS
  
-.acceptance_test_job:

+.integration_test_job:
extends: .native_test_job
cache:
  key: "${CI_JOB_NAME}-cache"
@@ -89,8 +89,8 @@ check-system-ubuntu:
  IMAGE: ubuntu2004
  MAKE_CHECK_ARGS: check
  
-acceptance-system-ubuntu:

-  extends: .acceptance_test_job
+integration-system-ubuntu:
+  extends: .integration_test_job
needs:
  - job: build-system-ubuntu
artifacts: true
@@ -119,8 +119,8 @@ check-system-debian:
  IMAGE: debian-amd64
  MAKE_CHECK_ARGS: check
  
-acceptance-system-debian:

-  extends: .acceptance_test_job
+integration-system-debian:
+  extends: .integration_test_job
needs:
  - job: build-system-debian
artifacts: true
@@ -150,8 +150,8 @@ check-system-fedora:
  IMAGE: fedora
  MAKE_CHECK_ARGS: check
  
-acceptance-system-fedora:

-  extends: .acceptance_test_job
+integration-system-fedora:
+  extends: .integration_test_job
needs:
  - job: build-system-fedora
artifacts: true
@@ -181,8 +181,8 @@ check-system-centos:
  IMAGE: centos8
  MAKE_CHECK_ARGS: check
  
-acceptance-system-centos:

-  extends: .acceptance_test_job
+integration-system-centos:
+  extends: .integration_test_job
needs:
  - job: build-system-centos
artifacts: true





Re: [RFC v1 09/10] i386: split cpu.c and defer x86 models registration

2020-11-10 Thread Paolo Bonzini

On 10/11/20 18:55, Eduardo Habkost wrote:

I think we should not try to implement interfaces conditionally (i.e. have
TYPE_X86_ACCEL implemented only on qemu-system-{i386,x86_64} and not
qemu-system-arm), even if technically the accel/ objects are per-target
(specific_ss) rather than common.

If the accel objects are already per target, it seems appropriate
to have a QOM type hierarchy that reflects that.

`qemu-system-x86_64 -accel kvm` would create a kvm-x86_64-accel
object, but `qemu-system-arm -accel kvm` would create a
kvm-arm-accel.


... and fall back to kvm-accel?  So accel_find would be the only place 
to change.


Paolo


*-x86_64-accel and *-i386-accel would all implement
INTERFACE_X86_ACCEL.
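
The fallback being discussed (prefer a per-target accel type, otherwise use
the generic one) could look roughly like the Python sketch below. This is
only an illustration of the lookup order, not QEMU code: the registry set
and the two-argument accel_find() signature are assumptions made for the
example, and only the type names kvm-x86_64-accel and kvm-accel come from
the thread.

```python
# Hypothetical registry of QOM type names. In QEMU this would be the QOM
# class table, not a Python set.
REGISTERED_TYPES = {"kvm-x86_64-accel", "kvm-accel", "tcg-accel"}


def accel_find(name, target, registered=REGISTERED_TYPES):
    """Return the most specific registered accel type name, or None.

    The per-target type ("<name>-<target>-accel") is tried first, then the
    generic type ("<name>-accel").
    """
    for candidate in (f"{name}-{target}-accel", f"{name}-accel"):
        if candidate in registered:
            return candidate
    return None
```

In this sketch an x86_64 binary asking for kvm gets kvm-x86_64-accel, while
a target without a specialised type registered falls back to kvm-accel, so
the lookup function is the only place that needs to know about the naming
scheme.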





Re: [PATCH v3 05/11] gitlab-ci: Replace YAML anchors by extends (acceptance_test_job)

2020-11-10 Thread Wainer dos Santos Moschetta



On 11/8/20 8:19 PM, Philippe Mathieu-Daudé wrote:

'extends' is an alternative to using YAML anchors
and is a little more flexible and readable. See:
https://docs.gitlab.com/ee/ci/yaml/#extends

Signed-off-by: Philippe Mathieu-Daudé 
---
  .gitlab-ci.yml | 15 ++-
  1 file changed, 6 insertions(+), 9 deletions(-)


LGTM

Reviewed-by: Wainer dos Santos Moschetta 



diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e11f80f6d65..0ef814764a0 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -42,7 +42,8 @@ include:
  - find . -type f -exec touch {} +
  - make $MAKE_CHECK_ARGS
  
-.acceptance_template: _definition

+.acceptance_test_job:
+  extends: .native_test_job
cache:
  key: "${CI_JOB_NAME}-cache"
  paths:
@@ -89,14 +90,13 @@ check-system-ubuntu:
  MAKE_CHECK_ARGS: check
  
  acceptance-system-ubuntu:

-  extends: .native_test_job
+  extends: .acceptance_test_job
needs:
  - job: build-system-ubuntu
artifacts: true
variables:
  IMAGE: ubuntu2004
  MAKE_CHECK_ARGS: check-acceptance
-  <<: *acceptance_definition
  
  build-system-debian:

extends: .native_build_job
@@ -120,14 +120,13 @@ check-system-debian:
  MAKE_CHECK_ARGS: check
  
  acceptance-system-debian:

-  extends: .native_test_job
+  extends: .acceptance_test_job
needs:
  - job: build-system-debian
artifacts: true
variables:
  IMAGE: debian-amd64
  MAKE_CHECK_ARGS: check-acceptance
-  <<: *acceptance_definition
  
  build-system-fedora:

extends: .native_build_job
@@ -152,14 +151,13 @@ check-system-fedora:
  MAKE_CHECK_ARGS: check
  
  acceptance-system-fedora:

-  extends: .native_test_job
+  extends: .acceptance_test_job
needs:
  - job: build-system-fedora
artifacts: true
variables:
  IMAGE: fedora
  MAKE_CHECK_ARGS: check-acceptance
-  <<: *acceptance_definition
  
  build-system-centos:

extends: .native_build_job
@@ -184,14 +182,13 @@ check-system-centos:
  MAKE_CHECK_ARGS: check
  
  acceptance-system-centos:

-  extends: .native_test_job
+  extends: .acceptance_test_job
needs:
  - job: build-system-centos
artifacts: true
variables:
  IMAGE: centos8
  MAKE_CHECK_ARGS: check-acceptance
-  <<: *acceptance_definition
  
  build-disabled:

extends: .native_build_job





Re: [PATCH v3 04/11] gitlab-ci: Replace YAML anchors by extends (native_test_job)

2020-11-10 Thread Wainer dos Santos Moschetta



On 11/8/20 8:19 PM, Philippe Mathieu-Daudé wrote:

'extends' is an alternative to using YAML anchors
and is a little more flexible and readable. See:
https://docs.gitlab.com/ee/ci/yaml/#extends

Signed-off-by: Philippe Mathieu-Daudé 
---
  .gitlab-ci.yml | 26 +-
  1 file changed, 13 insertions(+), 13 deletions(-)


LGTM

Reviewed-by: Wainer dos Santos Moschetta 



diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index a96e7dd23e5..e11f80f6d65 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -34,7 +34,7 @@ include:
  make -j"$JOBS" $MAKE_CHECK_ARGS ;
fi
  
-.native_test_job_template: _test_job_definition

+.native_test_job:
stage: test
image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest
script:
@@ -80,7 +80,7 @@ build-system-ubuntu:
- build
  
  check-system-ubuntu:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-system-ubuntu
artifacts: true
@@ -89,7 +89,7 @@ check-system-ubuntu:
  MAKE_CHECK_ARGS: check
  
  acceptance-system-ubuntu:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-system-ubuntu
artifacts: true
@@ -111,7 +111,7 @@ build-system-debian:
- build
  
  check-system-debian:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-system-debian
artifacts: true
@@ -120,7 +120,7 @@ check-system-debian:
  MAKE_CHECK_ARGS: check
  
  acceptance-system-debian:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-system-debian
artifacts: true
@@ -143,7 +143,7 @@ build-system-fedora:
- build
  
  check-system-fedora:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-system-fedora
artifacts: true
@@ -152,7 +152,7 @@ check-system-fedora:
  MAKE_CHECK_ARGS: check
  
  acceptance-system-fedora:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-system-fedora
artifacts: true
@@ -175,7 +175,7 @@ build-system-centos:
- build
  
  check-system-centos:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-system-centos
artifacts: true
@@ -184,7 +184,7 @@ check-system-centos:
  MAKE_CHECK_ARGS: check
  
  acceptance-system-centos:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-system-centos
artifacts: true
@@ -282,7 +282,7 @@ build-deprecated:
  # We split the check-tcg step as test failures are expected but we still
  # want to catch the build breaking.
  check-deprecated:
-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-deprecated
artifacts: true
@@ -346,7 +346,7 @@ build-crypto-old-nettle:
- build
  
  check-crypto-old-nettle:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-crypto-old-nettle
artifacts: true
@@ -367,7 +367,7 @@ build-crypto-old-gcrypt:
- build
  
  check-crypto-old-gcrypt:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-crypto-old-gcrypt
artifacts: true
@@ -388,7 +388,7 @@ build-crypto-only-gnutls:
- build
  
  check-crypto-only-gnutls:

-  <<: *native_test_job_definition
+  extends: .native_test_job
needs:
  - job: build-crypto-only-gnutls
artifacts: true





Re: [PATCH v3 03/11] gitlab-ci: Replace YAML anchors by extends (native_build_job)

2020-11-10 Thread Wainer dos Santos Moschetta



On 11/8/20 8:19 PM, Philippe Mathieu-Daudé wrote:

'extends' is an alternative to using YAML anchors
and is a little more flexible and readable. See:
https://docs.gitlab.com/ee/ci/yaml/#extends

Signed-off-by: Philippe Mathieu-Daudé 
---
  .gitlab-ci.yml | 32 
  1 file changed, 16 insertions(+), 16 deletions(-)


LGTM

Reviewed-by: Wainer dos Santos Moschetta 



diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 5763318d375..a96e7dd23e5 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -13,7 +13,7 @@ include:
- local: '/.gitlab-ci.d/containers.yml'
- local: '/.gitlab-ci.d/crossbuilds.yml'
  
-.native_build_job_template: _build_job_definition

+.native_build_job:
stage: build
image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest
before_script:
@@ -68,7 +68,7 @@ include:
  - du -chs ${CI_PROJECT_DIR}/avocado-cache
  
  build-system-ubuntu:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: ubuntu2004
  TARGETS: aarch64-softmmu alpha-softmmu cris-softmmu hppa-softmmu
@@ -99,7 +99,7 @@ acceptance-system-ubuntu:
<<: *acceptance_definition
  
  build-system-debian:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: debian-amd64
  TARGETS: arm-softmmu avr-softmmu i386-softmmu mipsel-softmmu
@@ -130,7 +130,7 @@ acceptance-system-debian:
<<: *acceptance_definition
  
  build-system-fedora:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: fedora
  CONFIGURE_ARGS: --disable-gcrypt --enable-nettle
@@ -162,7 +162,7 @@ acceptance-system-fedora:
<<: *acceptance_definition
  
  build-system-centos:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: centos8
  CONFIGURE_ARGS: --disable-nettle --enable-gcrypt
@@ -194,7 +194,7 @@ acceptance-system-centos:
<<: *acceptance_definition
  
  build-disabled:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: fedora
  CONFIGURE_ARGS: --disable-attr --disable-avx2 --disable-bochs
@@ -219,7 +219,7 @@ build-disabled:
  MAKE_CHECK_ARGS: check-qtest SPEED=slow
  
  build-tcg-disabled:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: centos8
script:
@@ -239,7 +239,7 @@ build-tcg-disabled:
  260 261 262 263 264 270 272 273 277 279
  
  build-user:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: debian-all-test-cross
  CONFIGURE_ARGS: --disable-tools --disable-system
@@ -249,7 +249,7 @@ build-user:
  # we skip sparc64-linux-user until it has been fixed somewhat
  # we skip cris-linux-user as it doesn't use the common run loop
  build-user-plugins:
-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: debian-all-test-cross
  CONFIGURE_ARGS: --disable-tools --disable-system --enable-plugins 
--enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user
@@ -257,7 +257,7 @@ build-user-plugins:
timeout: 1h 30m
  
  build-clang:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: fedora
  CONFIGURE_ARGS: --cc=clang --cxx=clang++
@@ -267,7 +267,7 @@ build-clang:
  
  # These targets are on the way out

  build-deprecated:
-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: debian-all-test-cross
  CONFIGURE_ARGS: --disable-docs --disable-tools
@@ -292,7 +292,7 @@ check-deprecated:
allow_failure: true
  
  build-oss-fuzz:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: fedora
script:
@@ -310,7 +310,7 @@ build-oss-fuzz:
  - cd build-oss-fuzz && make check-qtest-i386 check-unit
  
  build-tci:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: fedora
script:
@@ -335,7 +335,7 @@ build-tci:
  # These jobs test old gcrypt and nettle from RHEL7
  # which had some API differences.
  build-crypto-old-nettle:
-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: centos7
  TARGETS: x86_64-softmmu x86_64-linux-user
@@ -356,7 +356,7 @@ check-crypto-old-nettle:
  
  
  build-crypto-old-gcrypt:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: centos7
  TARGETS: x86_64-softmmu x86_64-linux-user
@@ -377,7 +377,7 @@ check-crypto-old-gcrypt:
  
  
  build-crypto-only-gnutls:

-  <<: *native_build_job_definition
+  extends: .native_build_job
variables:
  IMAGE: centos7
  TARGETS: x86_64-softmmu x86_64-linux-user





Re: [PATCH v3 02/11] gitlab-ci: Replace YAML anchors by extends (cross_system_build_job)

2020-11-10 Thread Wainer dos Santos Moschetta



On 11/8/20 8:19 PM, Philippe Mathieu-Daudé wrote:

'extends' is an alternative to using YAML anchors
and is a little more flexible and readable. See:
https://docs.gitlab.com/ee/ci/yaml/#extends

Good idea!


Signed-off-by: Philippe Mathieu-Daudé 
---
  .gitlab-ci.d/crossbuilds.yml | 40 ++--
  1 file changed, 20 insertions(+), 20 deletions(-)

Reviewed-by: Wainer dos Santos Moschetta 


diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml
index 03ebfabb3fa..099949aaef3 100644
--- a/.gitlab-ci.d/crossbuilds.yml
+++ b/.gitlab-ci.d/crossbuilds.yml
@@ -1,5 +1,5 @@
  
-.cross_system_build_job_template: _system_build_job_definition

+.cross_system_build_job:
stage: build
image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest
timeout: 80m
@@ -13,7 +13,7 @@
xtensa-softmmu"
  - make -j$(expr $(nproc) + 1) all check-build
  
-.cross_user_build_job_template: _user_build_job_definition

+.cross_user_build_job:
stage: build
image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest
script:
@@ -24,91 +24,91 @@
  - make -j$(expr $(nproc) + 1) all check-build
  
  cross-armel-system:

-  <<: *cross_system_build_job_definition
+  extends: .cross_system_build_job
variables:
  IMAGE: debian-armel-cross
  
  cross-armel-user:

-  <<: *cross_user_build_job_definition
+  extends: .cross_user_build_job
variables:
  IMAGE: debian-armel-cross
  
  cross-armhf-system:

-  <<: *cross_system_build_job_definition
+  extends: .cross_system_build_job
variables:
  IMAGE: debian-armhf-cross
  
  cross-armhf-user:

-  <<: *cross_user_build_job_definition
+  extends: .cross_user_build_job
variables:
  IMAGE: debian-armhf-cross
  
  cross-arm64-system:

-  <<: *cross_system_build_job_definition
+  extends: .cross_system_build_job
variables:
  IMAGE: debian-arm64-cross
  
  cross-arm64-user:

-  <<: *cross_user_build_job_definition
+  extends: .cross_user_build_job
variables:
  IMAGE: debian-arm64-cross
  
  cross-mips-system:

-  <<: *cross_system_build_job_definition
+  extends: .cross_system_build_job
variables:
  IMAGE: debian-mips-cross
  
  cross-mips-user:

-  <<: *cross_user_build_job_definition
+  extends: .cross_user_build_job
variables:
  IMAGE: debian-mips-cross
  
  cross-mipsel-system:

-  <<: *cross_system_build_job_definition
+  extends: .cross_system_build_job
variables:
  IMAGE: debian-mipsel-cross
  
  cross-mipsel-user:

-  <<: *cross_user_build_job_definition
+  extends: .cross_user_build_job
variables:
  IMAGE: debian-mipsel-cross
  
  cross-mips64el-system:

-  <<: *cross_system_build_job_definition
+  extends: .cross_system_build_job
variables:
  IMAGE: debian-mips64el-cross
  
  cross-mips64el-user:

-  <<: *cross_user_build_job_definition
+  extends: .cross_user_build_job
variables:
  IMAGE: debian-mips64el-cross
  
  cross-ppc64el-system:

-  <<: *cross_system_build_job_definition
+  extends: .cross_system_build_job
variables:
  IMAGE: debian-ppc64el-cross
  
  cross-ppc64el-user:

-  <<: *cross_user_build_job_definition
+  extends: .cross_user_build_job
variables:
  IMAGE: debian-ppc64el-cross
  
  cross-s390x-system:

-  <<: *cross_system_build_job_definition
+  extends: .cross_system_build_job
variables:
  IMAGE: debian-s390x-cross
  
  cross-s390x-user:

-  <<: *cross_user_build_job_definition
+  extends: .cross_user_build_job
variables:
  IMAGE: debian-s390x-cross
  
  cross-win32-system:

-  <<: *cross_system_build_job_definition
+  extends: .cross_system_build_job
variables:
  IMAGE: fedora-win32-cross
  
  cross-win64-system:

-  <<: *cross_system_build_job_definition
+  extends: .cross_system_build_job
variables:
  IMAGE: fedora-win64-cross





Re: [RFC v3] VFIO Migration

2020-11-10 Thread Alex Williamson
On Tue, 10 Nov 2020 09:53:49 +
Stefan Hajnoczi  wrote:
> VFIO mdev Drivers
> -
> The following mdev type sysfs attrs are available for managing device
> instances::
> 
>   /sys/...//mdev_supported_types//
> create - writing a UUID to this file instantiates a device
> migration_info.json - read-only migration information JSON
> 
> TODO The JSON can be represented as a file system hierarchy but sysfs seems
> limited to // and / so it is not possible
> to express deeper attr groups like /migration/params//?


Complex structured formats have been proposed in other threads related
to migration compatibility and generally been dismissed as not adhering
to the standards of sysfs per:

Documentation/filesystems/sysfs.rst:
---
Attributes
~~

Attributes can be exported for kobjects in the form of regular files in
the filesystem. Sysfs forwards file I/O operations to methods defined
for the attributes, providing a means to read and write kernel
attributes.

Attributes should be ASCII text files, preferably with only one value
per file. It is noted that it may not be efficient to contain only one
value per file, so it is socially acceptable to express an array of
values of the same type.

Mixing types, expressing multiple lines of data, and doing fancy
formatting of data is heavily frowned upon. Doing these things may get
you publicly humiliated and your code rewritten without notice.
---

We'd either need to address your TODO and create a hierarchical
representation or find another means to exchange this format.


> Device models supported by an mdev driver and their details can be read from
> the migration_info.json attr. Each mdev type supports one device model. If a
> parent device supports multiple device models then each device model has an
> mdev type. There may be multiple mdev types for a single device model when 
> they
> offer different migration parameters such as resource capacity or feature
> availability.
> 
> For example, a graphics card that supports 4 GB and 8 GB device instances 
> would
> provide gfx-4GB and gfx-8GB mdev types with memory=4096 and memory=8192
> migration parameters, respectively.


I think this example could be expanded for clarity.  I think this is
suggesting we have mdev_types of gfx-4GB and gfx-8GB, which each
implement some common device model, ie. com.gfx/GPU, where the
migration parameter 'memory' for each defaults to a value matching the
type name.  But it seems like this can also lead to some combinatorial
challenges for management tools if these parameters are writable.  For
example, should a management tool create a gfx-4GB device and change the
memory parameter to 8192 or a gfx-8GB device with the default parameter?


> The following mdev device sysfs attrs relate to a specific device instance::
> 
>   /sys/...///
> mdev_type/ - symlink to mdev type sysfs attrs, e.g. to fetch 
> migration/model


We need a mechanism that translates to non-mdev vfio devices as well,
the device "model" creates a clean separation from an mdev-type, we
shouldn't reintroduce that dependency here.


> migration/ - migration related files
>- read/write migration parameter "param"
>   ...
> 
> When the device is created all migration/ attrs take their
> migration_info.json "init_value".
> 
> When preparing for migration on the source, each migration parameter from
> migration/ is read and added to the migration parameter list if its
> value differs from "off_value" in migration_info.json. If a migration 
> parameter
> in the list is not available on the destination, then migration is not
> possible. If a migration parameter value is not in the destination
> "allowed_values" migration_info.json then migration is not possible.
> 
> In order to prepare an mdev device instance for an incoming migration on the
> destination, the "off_value" from migration_info.json is written to each
> migration parameter in migration/. Then the migration parameter list
> from the source is written to migration/ one migration parameter at a
> time. If an error occurs while writing a migration parameter on the 
> destination
> then migration is not possible. Once the migration parameter list has been
> written the mdev can be opened and migration can proceed.


What's the logic behind setting the value twice?  If we have a
preconfigured pool of devices where the off_value might use less
resources, we risk that resources might be consumed elsewhere if we
release them and try to get them back.  It also seems rather
inefficient.
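
For reference, the source/destination procedure quoted above can be
sketched in Python roughly as follows. This is a minimal sketch under
stated assumptions: the dictionary layout standing in for
migration_info.json and the function names are invented for the example
(the RFC text quoted here does not show the JSON schema); only the field
names "off_value" and "allowed_values" are taken from the proposal.

```python
def build_param_list(source_values, source_info):
    """Source side: keep only parameters whose value differs from off_value."""
    return {
        name: value
        for name, value in source_values.items()
        if value != source_info[name]["off_value"]
    }


def check_destination(param_list, dest_info):
    """Destination side: return the reasons migration is not possible.

    An empty list means every parameter is known on the destination and
    its value is listed in the destination's allowed_values.
    """
    problems = []
    for name, value in param_list.items():
        if name not in dest_info:
            problems.append(f"{name}: not available on destination")
        elif value not in dest_info[name]["allowed_values"]:
            problems.append(f"{name}: value {value!r} not allowed")
    return problems
```

The step of writing "off_value" to every parameter on the destination
before applying the list is deliberately left out of the sketch, since it
is exactly the part questioned above.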

 
> An open mdev device typically does not allow migration parameters to be 
> changed
> at runtime. However, certain migration/params attrs may allow writes at
> runtime. Usually these migration parameters only affect the device state
> representation and not the hardware interface. This makes it possible to
> upgrade or downgrade the device state representation at runtime so that
> migration is possible to newer or older device 

Re: [PATCH v3 00/81] target/arm: Implement SVE2

2020-11-10 Thread Stephen Long
Hi Richard, what's the plan to get this patch series into master?

Thanks,
Stephen



Re: [PATCH for-5.2 v2 2/4] hw/net/can/ctucan: Avoid unused value in ctucan_send_ready_buffers()

2020-11-10 Thread Pavel Pisa
Hello Peter,

On Tuesday 10 of November 2020 18:06:02 Peter Maydell wrote:
> Coverity points out that in ctucan_send_ready_buffers() we
> set buff_st_mask = 0xf << (i * 4) inside the loop, but then
> we never use it before overwriting it later.
>
> The only thing we use the mask for is as part of the code that is
> inserting the new buff_st field into tx_status.  That is more
> comprehensibly written using deposit32(), so do that and drop the
> mask variable entirely.
>
> We also update the buff_st local variable at multiple points
> during this function, but nothing can ever see these
> intermediate values, so just drop those, write the final
> TXT_TOK as a fixed constant value, and collapse the only
> remaining set/use of buff_st down into an extract32().
>
> Fixes: Coverity CID 1432869
> Signed-off-by: Peter Maydell 
> ---
>  hw/net/can/ctucan_core.c | 15 +++
>  1 file changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/hw/net/can/ctucan_core.c b/hw/net/can/ctucan_core.c
> index 538270e62f9..a400ad13a43 100644
> --- a/hw/net/can/ctucan_core.c
> +++ b/hw/net/can/ctucan_core.c
> @@ -240,8 +240,6 @@ static void ctucan_send_ready_buffers(CtuCanCoreState
> *s) uint8_t *pf;
>  int buff2tx_idx;
>  uint32_t tx_prio_max;
> -unsigned int buff_st;
> -uint32_t buff_st_mask;
>
>  if (!s->mode_settings.s.ena) {
>  return;
> @@ -256,10 +254,7 @@ static void ctucan_send_ready_buffers(CtuCanCoreState
> *s) for (i = 0; i < CTUCAN_CORE_TXBUF_NUM; i++) {
>  uint32_t prio;
>
> -buff_st_mask = 0xf << (i * 4);
> -buff_st = (s->tx_status.u32 >> (i * 4)) & 0xf;
> -
> -if (buff_st != TXT_RDY) {
> +if (extract32(s->tx_status.u32, i * 4, 4) != TXT_RDY) {
>  continue;
>  }
>  prio = (s->tx_priority.u32 >> (i * 4)) & 0x7;
> @@ -271,10 +266,7 @@ static void ctucan_send_ready_buffers(CtuCanCoreState
> *s) if (buff2tx_idx == -1) {
>  break;
>  }
> -buff_st_mask = 0xf << (buff2tx_idx * 4);
> -buff_st = (s->tx_status.u32 >> (buff2tx_idx * 4)) & 0xf;
>  int_stat.u32 = 0;
> -buff_st = TXT_RDY;

I would prefer to add the following line there, even though it has no real effect

 +s->tx_status.u32 = deposit32(s->tx_status.u32,
 + buff2tx_idx * 4, 4, TXT_RDY);

But if it generates a warning or you have some other reason not to put
it there, I add my

Acked-by: Pavel Pisa 

When we separate processing into a call that submits the message for Tx
and a separate callback that confirms the arbitration win,
we will need to reintroduce this assignment. But that will involve
much more significant changes, so this small remark is not
so important.

>  pf = s->tx_buffer[buff2tx_idx].data;
>  ctucan_buff2frame(pf, );
>  s->status.s.idle = 0;
> @@ -283,12 +275,11 @@ static void ctucan_send_ready_buffers(CtuCanCoreState
> *s) s->status.s.idle = 1;
>  s->status.s.txs = 0;
>  s->tx_fr_ctr.s.tx_fr_ctr_val++;
> -buff_st = TXT_TOK;
>  int_stat.s.txi = 1;
>  int_stat.s.txbhci = 1;
>  s->int_stat.u32 |= int_stat.u32 & ~s->int_mask.u32;
> -s->tx_status.u32 = (s->tx_status.u32 & ~buff_st_mask) |
> -(buff_st << (buff2tx_idx * 4));
> +s->tx_status.u32 = deposit32(s->tx_status.u32,
> + buff2tx_idx * 4, 4, TXT_TOK);
>  } while (1);
>  }
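
To see why the deposit32()/extract32() form is equivalent to the open-coded
mask and shift it replaces, here is a small Python model of the two
helpers. It is only a sketch: the Python functions imitate the arithmetic
of QEMU's bitops helpers on 32-bit values, old_update() is a made-up name
for the removed code, and the status values used with it are arbitrary
4-bit numbers rather than the real TXT_* constants.

```python
MASK32 = 0xFFFFFFFF


def extract32(value, start, length):
    """Return the `length`-bit field of `value` that starts at bit `start`."""
    assert 0 <= start and 0 < length <= 32 - start
    return (value >> start) & (MASK32 >> (32 - length))


def deposit32(value, start, length, fieldval):
    """Return `value` with bits [start, start + length) set to `fieldval`."""
    assert 0 <= start and 0 < length <= 32 - start
    mask = (MASK32 >> (32 - length)) << start
    return ((value & ~mask) | ((fieldval << start) & mask)) & MASK32


def old_update(tx_status, idx, buff_st):
    """The open-coded update removed by the patch (4 status bits per buffer)."""
    buff_st_mask = (0xF << (idx * 4)) & MASK32
    return ((tx_status & ~buff_st_mask) | (buff_st << (idx * 4))) & MASK32
```

For any buffer index and any 4-bit status, old_update(v, i, st) and
deposit32(v, i * 4, 4, st) give the same word, and extract32() on the
result reads the status back.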


-- 
Yours sincerely

Pavel Pisa
phone:  +420 603531357
e-mail: p...@cmp.felk.cvut.cz
Department of Control Engineering FEE CVUT
Karlovo namesti 13, 121 35, Prague 2
university: http://dce.fel.cvut.cz/
personal:   http://cmp.felk.cvut.cz/~pisa
projects:   https://www.openhub.net/accounts/ppisa
CAN related:http://canbus.pages.fel.cvut.cz/




Re: [PATCH for-5.2 v2 3/4] hw/net/can/ctucan_core: Handle big-endian hosts

2020-11-10 Thread Pavel Pisa
Hello Peter,

On Tuesday 10 of November 2020 18:06:03 Peter Maydell wrote:
> The ctucan driver defines types for its registers which are a union
> of a uint32_t with a struct with bitfields for the individual
> fields within that register. This is a bad idea, because bitfields
> aren't portable. The ctu_can_fd_regs.h header works around the
> most glaring of the portability issues by defining the
> fields in two different orders depending on the setting of the
> __LITTLE_ENDIAN_BITFIELD define. However, in ctucan_core.h this
> is unconditionally set to 1, which is wrong for big-endian hosts.
>
> Set it only if HOST_WORDS_BIGENDIAN is not set. There is no need
> for a "have we defined it already" guard, because the only place
> that should set it is ctucan_core.h, which has the usual
> double-inclusion guard.
>
> Signed-off-by: Peter Maydell 
> Reviewed-by: Philippe Mathieu-Daudé 
> ---
>  hw/net/can/ctucan_core.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/hw/net/can/ctucan_core.h b/hw/net/can/ctucan_core.h
> index f21cb1c5ec3..bbc09ae0678 100644
> --- a/hw/net/can/ctucan_core.h
> +++ b/hw/net/can/ctucan_core.h
> @@ -31,8 +31,7 @@
>  #include "exec/hwaddr.h"
>  #include "net/can_emu.h"
>
> -
> -#ifndef __LITTLE_ENDIAN_BITFIELD
> +#ifndef HOST_WORDS_BIGENDIAN
>  #define __LITTLE_ENDIAN_BITFIELD 1
>  #endif

Acked-by: Pavel Pisa 

Thanks,

Pavel



Re: [PATCH for-5.2 v2 1/4] hw/net/can/ctucan: Don't allow guest to write off end of tx_buffer

2020-11-10 Thread Pavel Pisa
Hello Peter,

On Tuesday 10 of November 2020 19:24:03 Peter Maydell wrote:
> For unaligned accesses, for 6.0, I think the code for doing
> them to the txbuff at least is straightforward:
>
>if (buff_num < CTUCAN_CORE_TXBUF_NUM &&
>(addr + size) < CTUCAN_CORE_MSG_MAX_LEN) {
>   stn_le_p(s->tx_buffer[buff_num].data + addr, size, val);
>}
>
> (stn_le_p takes care of doing an appropriate-width write.)

Thanks, great to know, I like that a lot.
One small nitpick: it should be (addr + size) <= CTUCAN_CORE_MSG_MAX_LEN

So the whole code I am testing now is

if (addr >= CTU_CAN_FD_TXTB1_DATA_1) {
int buff_num;
addr -= CTU_CAN_FD_TXTB1_DATA_1;
buff_num = addr / CTUCAN_CORE_TXBUFF_SPAN;
addr %= CTUCAN_CORE_TXBUFF_SPAN;
if ((buff_num < CTUCAN_CORE_TXBUF_NUM) &&
((addr + size) <= sizeof(s->tx_buffer[buff_num].data))) {
stn_le_p(s->tx_buffer[buff_num].data + addr, size, val);
}
} else {

So I have applied your whole series with the above update. Everything works
correctly on an x86_64 Linux host with Linux x86_64 and MIPS big-endian guests.

Please update to this combination.
I do not expect byte writes in our drivers, but the real core
supports byte-enable bus signals.
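
The reason for "<=" rather than "<" in the bound check can be shown with a
small Python model of a little-endian store into a fixed-size buffer. This
is a sketch only: write_le() is an invented stand-in for the guarded
stn_le_p() call, and TXBUF_LEN is a placeholder size, not the real
sizeof(s->tx_buffer[buff_num].data).

```python
TXBUF_LEN = 80  # placeholder buffer size chosen for the example


def write_le(buf, addr, size, val):
    """Store the low `size` bytes of `val` little-endian at offset `addr`.

    Returns True if the access fits in the buffer, False if it is dropped.
    """
    if addr + size <= len(buf):  # '<' would wrongly reject the last slot
        low_bytes = val & ((1 << (8 * size)) - 1)
        buf[addr:addr + size] = low_bytes.to_bytes(size, "little")
        return True
    return False
```

A 4-byte write at offset TXBUF_LEN - 4 ends exactly at the end of the
buffer, so addr + size equals the length and the access is valid; with a
strict "<" the last word of every buffer would be unwritable.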

Thanks a lot for teaching me about QEMU's stn_le_p.
In fact, we are discussing a similar solution for peripheral
access in our https://github.com/cvut/QtMips/ education emulator
(performance-wise a total toy compared to QEMU).

It would be worth enabling byte writes into registers as well,
but I would not do that before the release. It would be more complex.
Reads support bytes by reading a 32-bit word and then shifting
and masking the right bits into the result. Cross-word unaligned reads
are not supported. Again, there is no reason for them now.

You can add

Tested-by: Pavel Pisa 

to the whole series.

Thanks,

Pavel



[PATCH v1 10/10] scripts/ci: clean up default args logic a little

2020-11-10 Thread Alex Bennée
This allows us to do:

  ./scripts/ci/gitlab-pipeline-status -w -b HEAD -p 2961854

to check our own pipeline status of a recently pushed branch.

Signed-off-by: Alex Bennée 
---
 scripts/ci/gitlab-pipeline-status | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/scripts/ci/gitlab-pipeline-status 
b/scripts/ci/gitlab-pipeline-status
index bac8233079..78e72f6008 100755
--- a/scripts/ci/gitlab-pipeline-status
+++ b/scripts/ci/gitlab-pipeline-status
@@ -31,7 +31,7 @@ class NoPipelineFound(Exception):
 """Communication is successfull but pipeline is not found."""
 
 
-def get_local_branch_commit(branch='staging'):
+def get_local_branch_commit(branch):
 """
 Returns the commit sha1 for the *local* branch named "staging"
 """
@@ -126,18 +126,16 @@ def create_parser():
 help=('The GitLab project ID. Defaults to the project '
   'for https://gitlab.com/qemu-project/qemu, that '
   'is, "%(default)s"'))
-try:
-default_commit = get_local_branch_commit()
-commit_required = False
-except ValueError:
-default_commit = ''
-commit_required = True
-parser.add_argument('-c', '--commit', required=commit_required,
-default=default_commit,
+parser.add_argument('-b', '--branch', type=str, default="staging",
+help=('Specify the branch to check. '
+  'Use HEAD for your current branch. '
+  'Otherwise looks at "%(default)s"'))
+parser.add_argument('-c', '--commit',
+default=None,
 help=('Look for a pipeline associated with the given '
   'commit.  If one is not explicitly given, the '
-  'commit associated with the local branch named '
-  '"staging" is used.  Default: %(default)s'))
+  'commit associated with the default branch '
+  'is used.'))
 parser.add_argument('--verbose', action='store_true', default=False,
 help=('A minimal verbosity level that prints the '
   'overall result of the check/wait'))
@@ -149,6 +147,10 @@ def main():
 """
 parser = create_parser()
 args = parser.parse_args()
+
+if not args.commit:
+args.commit = get_local_branch_commit(args.branch)
+
 success = False
 try:
 if args.wait:
-- 
2.20.1
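
The effect of the change can be summarised with a cut-down Python sketch:
the commit is no longer looked up while the parser is being built, but only
after parsing, and only when -c/--commit was not given. The sketch is not
the script itself: resolve_commit() and fake_lookup() are names invented
here, and fake_lookup() merely stands in for get_local_branch_commit(),
which asks git for the branch's sha1.

```python
import argparse


def create_parser():
    """Parser with just the two options touched by the patch."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-b', '--branch', type=str, default="staging")
    parser.add_argument('-c', '--commit', default=None)
    return parser


def resolve_commit(args, lookup):
    """Return the explicit commit, else the commit `lookup` finds for the branch."""
    if not args.commit:
        return lookup(args.branch)
    return args.commit


def fake_lookup(branch):
    # Placeholder for the real git query; returns a recognisable dummy value.
    return "sha1-of-" + branch
```

An explicit -c wins over -b, -b HEAD selects the current branch, and with
no options the "staging" branch is used, matching the help text in the
diff.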




Re: [PATCH-for-6.0 v4 08/17] gitlab-ci: Move linux-user debug-tcg test across to gitlab

2020-11-10 Thread Alex Bennée


Philippe Mathieu-Daudé  writes:

> Similarly to commit 8cdb2cef3f1, move the linux-user (debug-tcg)
> test to GitLab.
>
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> Cc: Laurent Vivier 
> ---
>  .gitlab-ci.yml | 7 +++
>  .travis.yml| 9 -
>  2 files changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> index 3fc3d0568c6..80082a602b8 100644
> --- a/.gitlab-ci.yml
> +++ b/.gitlab-ci.yml
> @@ -304,6 +304,13 @@ build-user:
>  CONFIGURE_ARGS: --disable-tools --disable-system
>  MAKE_CHECK_ARGS: check-tcg
>  
> +build-user-debug:
> +  <<: *native_build_job_definition
> +  variables:
> +IMAGE: debian-all-test-cross
> +CONFIGURE_ARGS: --disable-tools --disable-system --enable-debug-tcg
> +MAKE_CHECK_ARGS: check-tcg
> +
>  # Run check-tcg against linux-user (with plugins)
>  # we skip sparc64-linux-user until it has been fixed somewhat
>  # we skip cris-linux-user as it doesn't use the common run loop
> diff --git a/.travis.yml b/.travis.yml
> index 15d92291358..bee6197290d 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -293,15 +293,6 @@ jobs:
>  - ${SRC_DIR}/configure ${CONFIG} --extra-cflags="-g3 -O0 
> -fsanitize=thread" || { cat config.log meson-logs/meson-log.txt && exit 1; }
>  
>  
> -# Run check-tcg against linux-user
> -- name: "GCC check-tcg (user)"
> -  env:
> -- CONFIG="--disable-system --enable-debug-tcg"
> -- TEST_BUILD_CMD="make build-tcg"
> -- TEST_CMD="make check-tcg"
> -- CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"
> -
> -
>  # Run check-tcg against softmmu targets
>  - name: "GCC check-tcg (some-softmmu)"
>env:

I just realised I replicated this in a slightly different way - by
dropping --debug-tcg and moving the rest in one commit. I skipped over
the for-6.0 stuff when looking over your series but it's certainly worth
moving the check-tcg ones now given the stability issues.

-- 
Alex Bennée



[PATCH v1 08/10] tests/acceptance: Disable Spartan-3A DSP 1800A test

2020-11-10 Thread Alex Bennée
From: Philippe Mathieu-Daudé 

This test is regularly failing on CI:

   (05/34) 
tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_microblaze_s3adsp1800:
  Linux version 4.11.3 (th...@thuth.remote.csb) (gcc version 6.4.0 (Buildroot 
2018.05.2) ) #5 Tue Dec 11 11:56:23 CET 2018
  ...
  Freeing unused kernel memory: 1444K
  This architecture does not have kernel memory protection.
  [nothing happens here]
  Runner error occurred: Timeout reached (90.91 s)

This is a regression. Until someone figures out the problem,
disable the test to keep the CI pipeline useful.

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Thomas Huth 
Message-Id: <20201109091719.2449141-1-f4...@amsat.org>
Signed-off-by: Alex Bennée 
---
 tests/acceptance/boot_linux_console.py | 2 ++
 tests/acceptance/replay_kernel.py  | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/tests/acceptance/boot_linux_console.py 
b/tests/acceptance/boot_linux_console.py
index 8f433a67f8..cc6ec0f8c1 100644
--- a/tests/acceptance/boot_linux_console.py
+++ b/tests/acceptance/boot_linux_console.py
@@ -13,6 +13,7 @@ import lzma
 import gzip
 import shutil
 
+from avocado import skip
 from avocado import skipUnless
 from avocado_qemu import Test
 from avocado_qemu import exec_command_and_wait_for_pattern
@@ -1025,6 +1026,7 @@ class BootLinuxConsole(LinuxKernelTest):
 tar_hash = 'ac688fd00561a2b6ce1359f9ff6aa2b98c9a570c'
 self.do_test_advcal_2018('07', tar_hash, 'sanity-clause.elf')
 
+@skip("Test currently broken") # Console stuck as of 5.2-rc1
 def test_microblaze_s3adsp1800(self):
 """
 :avocado: tags=arch:microblaze
diff --git a/tests/acceptance/replay_kernel.py 
b/tests/acceptance/replay_kernel.py
index 00c228382b..772633b01d 100644
--- a/tests/acceptance/replay_kernel.py
+++ b/tests/acceptance/replay_kernel.py
@@ -14,6 +14,7 @@ import shutil
 import logging
 import time
 
+from avocado import skip
 from avocado import skipIf
 from avocado import skipUnless
 from avocado_qemu import wait_for_console_pattern
@@ -280,6 +281,7 @@ class ReplayKernelNormal(ReplayKernelBase):
 file_path = self.fetch_asset(tar_url, asset_hash=tar_hash)
 self.do_test_advcal_2018(file_path, 'sanity-clause.elf')
 
+@skip("Test currently broken") # Console stuck as of 5.2-rc1
 def test_microblaze_s3adsp1800(self):
 """
 :avocado: tags=arch:microblaze
-- 
2.20.1




[PATCH v1 09/10] gitlab: move remaining x86 check-tcg targets to gitlab

2020-11-10 Thread Alex Bennée
The GCC check-tcg (user) test in particular was very prone to timing
out on Travis. We only actually need to move the some-softmmu builds
across as we already have coverage for linux-user.

As --enable-debug-tcg increases the run time somewhat, because more
debug checks are compiled in, let's restrict it to just the plugins
build. It's unlikely that a plugins-enabled build is going to hide a
sanity failure in core TCG code, so let the plugin builds do the heavy
lifting on checking TCG sanity so the non-plugin builds can run swiftly.

Now the only remaining check-tcg builds on Travis are for the various
non-x86 arches.

Signed-off-by: Alex Bennée 
---
 .gitlab-ci.yml | 17 +
 .travis.yml| 26 --
 2 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 9a8b375188..b406027a55 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -247,6 +247,15 @@ build-user:
 CONFIGURE_ARGS: --disable-tools --disable-system
 MAKE_CHECK_ARGS: check-tcg
 
+# Only build the softmmu targets we have check-tcg tests for
+build-some-softmmu:
+  <<: *native_build_job_definition
+  variables:
+IMAGE: debian-all-test-cross
+CONFIGURE_ARGS: --disable-tools --enable-debug-tcg
+TARGETS: xtensa-softmmu arm-softmmu aarch64-softmmu alpha-softmmu
+MAKE_CHECK_ARGS: check-tcg
+
 # Run check-tcg against linux-user (with plugins)
 # we skip sparc64-linux-user until it has been fixed somewhat
 # we skip cris-linux-user as it doesn't use the common run loop
@@ -258,6 +267,14 @@ build-user-plugins:
 MAKE_CHECK_ARGS: check-tcg
   timeout: 1h 30m
 
+build-some-softmmu-plugins:
+  <<: *native_build_job_definition
+  variables:
+IMAGE: debian-all-test-cross
+CONFIGURE_ARGS: --disable-tools --disable-user --enable-plugins 
--enable-debug-tcg
+TARGETS: xtensa-softmmu arm-softmmu aarch64-softmmu alpha-softmmu
+MAKE_CHECK_ARGS: check-tcg
+
 build-clang:
   <<: *native_build_job_definition
   variables:
diff --git a/.travis.yml b/.travis.yml
index a3d78171ca..bac085f800 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -301,32 +301,6 @@ jobs:
 - ${SRC_DIR}/configure ${CONFIG} --extra-cflags="-g3 -O0 
-fsanitize=thread" || { cat config.log meson-logs/meson-log.txt && exit 1; }
 
 
-# Run check-tcg against linux-user
-- name: "GCC check-tcg (user)"
-  env:
-- CONFIG="--disable-system --enable-debug-tcg"
-- TEST_BUILD_CMD="make build-tcg"
-- TEST_CMD="make check-tcg"
-- CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"
-
-
-# Run check-tcg against softmmu targets
-- name: "GCC check-tcg (some-softmmu)"
-  env:
-- CONFIG="--enable-debug-tcg 
--target-list=xtensa-softmmu,arm-softmmu,aarch64-softmmu,alpha-softmmu"
-- TEST_BUILD_CMD="make build-tcg"
-- TEST_CMD="make check-tcg"
-- CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"
-
-
-# Run check-tcg against softmmu targets (with plugins)
-- name: "GCC plugins check-tcg (some-softmmu)"
-  env:
-- CONFIG="--enable-plugins --enable-debug-tcg 
--target-list=xtensa-softmmu,arm-softmmu,aarch64-softmmu,alpha-softmmu"
-- TEST_BUILD_CMD="make build-tcg"
-- TEST_CMD="make check-tcg"
-- CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"
-
 - name: "[aarch64] GCC check-tcg"
   arch: arm64
   dist: focal
-- 
2.20.1




[PATCH v1 03/10] meson.build: fix building of Xen support for aarch64

2020-11-10 Thread Alex Bennée
Xen is supported on ARM although weirdly using the i386-softmmu model.
Checking based on the host CPU meant we never enabled Xen support. It
would be nice to enable CONFIG_XEN for aarch64-softmmu to make it not
seem weird but that will require further build surgery.

Fixes: 8a19980e3f ("configure: move accelerator logic to meson")
Suggested-by: Paolo Bonzini 
Signed-off-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Cc: Masami Hiramatsu 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Paul Durrant 
Message-Id: <20201105175153.30489-9-alex.ben...@linaro.org>
Signed-off-by: Alex Bennée 
---
 meson.build | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index b473620321..ef197f9a6b 100644
--- a/meson.build
+++ b/meson.build
@@ -74,10 +74,15 @@ else
 endif
 
 accelerator_targets = { 'CONFIG_KVM': kvm_targets }
+if cpu in ['x86', 'x86_64', 'arm', 'aarch64']
+  # i386 emulator provides xenpv machine type for multiple architectures
+  accelerator_targets += {
+'CONFIG_XEN': ['i386-softmmu', 'x86_64-softmmu'],
+  }
+endif
 if cpu in ['x86', 'x86_64']
   accelerator_targets += {
 'CONFIG_HAX': ['i386-softmmu', 'x86_64-softmmu'],
-'CONFIG_XEN': ['i386-softmmu', 'x86_64-softmmu'],
 'CONFIG_HVF': ['x86_64-softmmu'],
 'CONFIG_WHPX': ['i386-softmmu', 'x86_64-softmmu'],
   }
-- 
2.20.1




[PATCH v1 05/10] stubs/xen-hw-stub: drop xenstore_store_pv_console_info stub

2020-11-10 Thread Alex Bennée
We should never build something that calls this without having it.

Signed-off-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20201105175153.30489-13-alex.ben...@linaro.org>
Signed-off-by: Alex Bennée 
---
 stubs/xen-hw-stub.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/stubs/xen-hw-stub.c b/stubs/xen-hw-stub.c
index 2ea8190921..15f3921a76 100644
--- a/stubs/xen-hw-stub.c
+++ b/stubs/xen-hw-stub.c
@@ -10,10 +10,6 @@
 #include "hw/xen/xen.h"
 #include "hw/xen/xen-x86.h"
 
-void xenstore_store_pv_console_info(int i, Chardev *chr)
-{
-}
-
 int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
 {
 return -1;
-- 
2.20.1




[PATCH v1 06/10] accel/stubs: drop unused cpu.h include

2020-11-10 Thread Alex Bennée
Signed-off-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20201105175153.30489-14-alex.ben...@linaro.org>
Signed-off-by: Alex Bennée 
---
 accel/stubs/hax-stub.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/accel/stubs/hax-stub.c b/accel/stubs/hax-stub.c
index 1a9da83185..49077f88e3 100644
--- a/accel/stubs/hax-stub.c
+++ b/accel/stubs/hax-stub.c
@@ -14,7 +14,6 @@
  */
 
 #include "qemu/osdep.h"
-#include "cpu.h"
 #include "sysemu/hax.h"
 
 int hax_sync_vcpus(void)
-- 
2.20.1




[PATCH v1 04/10] include/hw/xen.h: drop superfluous struct

2020-11-10 Thread Alex Bennée
Chardev is already a typedef'ed struct.

Signed-off-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20201105175153.30489-12-alex.ben...@linaro.org>
Signed-off-by: Alex Bennée 
---
 include/hw/xen/xen.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index 1406648ca5..0f9962b1c1 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -28,7 +28,7 @@ int xen_is_pirq_msi(uint32_t msi_data);
 
 qemu_irq *xen_interrupt_controller_init(void);
 
-void xenstore_store_pv_console_info(int i, struct Chardev *chr);
+void xenstore_store_pv_console_info(int i, Chardev *chr);
 
 void xen_register_framebuffer(struct MemoryRegion *mr);
 
-- 
2.20.1




[PATCH v1 07/10] hw/i386/acpi-build: Fix maybe-uninitialized error when ACPI hotplug off

2020-11-10 Thread Alex Bennée
From: Philippe Mathieu-Daudé 

GCC 9.3.0 thinks that 'method' can be left uninitialized. This code
is already in the "if (bsel || pcihp_bridge_en)" block statement,
but it isn't smart enough to figure it out.

Restrict the code to be used only in the "if (bsel || pcihp_bridge_en)"
block statement to fix (on Ubuntu):

  ../hw/i386/acpi-build.c: In function 'build_append_pci_bus_devices':
  ../hw/i386/acpi-build.c:496:9: error: 'method' may be used uninitialized
  in this function [-Werror=maybe-uninitialized]
496 | aml_append(parent_scope, method);
| ^~~~
  cc1: all warnings being treated as errors

Fixes: df4008c9c59 ("piix4: don't reserve hw resources when hotplug is off 
globally")
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20201108204535.2319870-4-phi...@redhat.com>
Signed-off-by: Alex Bennée 
---
 hw/i386/acpi-build.c | 41 +++--
 1 file changed, 19 insertions(+), 22 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 4f66642d88..1f5c211245 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -465,34 +465,31 @@ static void build_append_pci_bus_devices(Aml 
*parent_scope, PCIBus *bus,
  */
 if (bsel || pcihp_bridge_en) {
 method = aml_method("PCNT", 0, AML_NOTSERIALIZED);
-}
-/* If bus supports hotplug select it and notify about local events */
-if (bsel) {
-uint64_t bsel_val = qnum_get_uint(qobject_to(QNum, bsel));
 
-aml_append(method, aml_store(aml_int(bsel_val), aml_name("BNUM")));
-aml_append(method,
-aml_call2("DVNT", aml_name("PCIU"), aml_int(1) /* Device Check */)
-);
-aml_append(method,
-aml_call2("DVNT", aml_name("PCID"), aml_int(3)/* Eject Request */)
-);
-}
+/* If bus supports hotplug select it and notify about local events */
+if (bsel) {
+uint64_t bsel_val = qnum_get_uint(qobject_to(QNum, bsel));
 
-/* Notify about child bus events in any case */
-if (pcihp_bridge_en) {
-QLIST_FOREACH(sec, >child, sibling) {
-int32_t devfn = sec->parent_dev->devfn;
+aml_append(method, aml_store(aml_int(bsel_val), aml_name("BNUM")));
+aml_append(method, aml_call2("DVNT", aml_name("PCIU"),
+ aml_int(1))); /* Device Check */
+aml_append(method, aml_call2("DVNT", aml_name("PCID"),
+ aml_int(3))); /* Eject Request */
+}
 
-if (pci_bus_is_root(sec) || pci_bus_is_express(sec)) {
-continue;
-}
+/* Notify about child bus events in any case */
+if (pcihp_bridge_en) {
+QLIST_FOREACH(sec, >child, sibling) {
+int32_t devfn = sec->parent_dev->devfn;
+
+if (pci_bus_is_root(sec) || pci_bus_is_express(sec)) {
+continue;
+}
 
-aml_append(method, aml_name("^S%.02X.PCNT", devfn));
+aml_append(method, aml_name("^S%.02X.PCNT", devfn));
+}
 }
-}
 
-if (bsel || pcihp_bridge_en) {
 aml_append(parent_scope, method);
 }
 qobject_unref(bsel);
-- 
2.20.1
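
For illustration only, the shape of the restructuring above can be shown
in a standalone sketch. build_pcnt_steps() and its integer 'steps' are
hypothetical stand-ins for the Aml method object, not QEMU code. The
point is just that the variable is assigned and consumed inside the same
block, so the compiler no longer has to prove that two separate
conditions agree.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for build_append_pci_bus_devices(): 'steps' plays
 * the role of 'method'. It is declared, filled in and consumed inside the
 * single "if (bsel || bridge_en)" block, so no path can read it before it
 * has been assigned.
 */
static int build_pcnt_steps(bool bsel, bool bridge_en)
{
    int appended = 0;

    if (bsel || bridge_en) {
        int steps = 1;              /* models aml_method("PCNT", ...) */

        if (bsel) {
            steps += 3;             /* models the BNUM store and two DVNT calls */
        }
        if (bridge_en) {
            steps += 1;             /* models one child bus notification */
        }
        appended = steps;           /* models aml_append(parent_scope, method) */
    }
    return appended;
}
```

With both flags false nothing is built and nothing is appended, which
matches the behaviour of the patched function.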




[PATCH v1 01/10] plugins: Fix resource leak in connect_socket()

2020-11-10 Thread Alex Bennée
From: Alex Chen 

Close the fd when the connect() fails.

Reported-by: Euler Robot 
Signed-off-by: Alex Chen 
Message-Id: <20201109082829.87496-2-alex.c...@huawei.com>
Signed-off-by: Alex Bennée 
---
 contrib/plugins/lockstep.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/plugins/lockstep.c b/contrib/plugins/lockstep.c
index a696673dff..319bd44b83 100644
--- a/contrib/plugins/lockstep.c
+++ b/contrib/plugins/lockstep.c
@@ -292,6 +292,7 @@ static bool connect_socket(const char *path)
 
 if (connect(fd, (struct sockaddr *), sizeof(sockaddr)) < 0) {
 perror("failed to connect");
+close(fd);
 return false;
 }
 
-- 
2.20.1
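
A minimal standalone sketch of the pattern this patch applies, assuming
a plain AF_UNIX stream socket. connect_unix() is a hypothetical helper
written for this note and is not the lockstep plugin code; it only shows
that every failure path after socket() succeeds has to close the fd.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns a connected fd, or -1 with no descriptor
 * left open.
 */
static int connect_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    /* Reject paths that would not fit, including the trailing NUL. */
    if (strlen(path) >= sizeof(addr.sun_path)) {
        return -1;
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("create socket");
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);    /* length was checked above */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("failed to connect");
        close(fd);                  /* the step the patch adds */
        return -1;
    }
    return fd;
}
```

The caller owns the returned fd and is expected to close() it when done.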




[PATCH v1 for 5.1 00/10] various fixes (CI, Xen, plugins)

2020-11-10 Thread Alex Bennée
Hi,

This collects together a bunch of fixes for 5.2:

  - a few resource leak fixes for plugins
  - Xen on arm64 build fixes (from my larger Xen series)
  - a couple of build and CI fixes
  - a tweak to the gitlab status script

I can drop the last patch if I have to but it hopefully allows for
easier scripting of the "waiting for gitlab" experience for those that
are not using "staging".

The following need review:

 - scripts/ci: clean up default args logic a little
 - gitlab: move remaining x86 check-tcg targets to gitlab

Alex Bennée (6):
  meson.build: fix building of Xen support for aarch64
  include/hw/xen.h: drop superfluous struct
  stubs/xen-hw-stub: drop xenstore_store_pv_console_info stub
  accel/stubs: drop unused cpu.h include
  gitlab: move remaining x86 check-tcg targets to gitlab
  scripts/ci: clean up default args logic a little

Alex Chen (2):
  plugins: Fix resource leak in connect_socket()
  plugins: Fix two resource leaks in setup_socket()

Philippe Mathieu-Daudé (2):
  hw/i386/acpi-build: Fix maybe-uninitialized error when ACPI hotplug
off
  tests/acceptance: Disable Spartan-3A DSP 1800A test

 meson.build|  7 -
 include/hw/xen/xen.h   |  2 +-
 accel/stubs/hax-stub.c |  1 -
 contrib/plugins/lockstep.c |  3 ++
 hw/i386/acpi-build.c   | 41 --
 stubs/xen-hw-stub.c|  4 ---
 .gitlab-ci.yml | 17 +++
 .travis.yml| 26 
 scripts/ci/gitlab-pipeline-status  | 24 ---
 tests/acceptance/boot_linux_console.py |  2 ++
 tests/acceptance/replay_kernel.py  |  2 ++
 11 files changed, 63 insertions(+), 66 deletions(-)

-- 
2.20.1




[PATCH v1 02/10] plugins: Fix two resource leaks in setup_socket()

2020-11-10 Thread Alex Bennée
From: Alex Chen 

Whether accept() fails or returns normally, we need to close the fd.

Reported-by: Euler Robot 
Signed-off-by: Alex Chen 
Message-Id: <20201109082829.87496-3-alex.c...@huawei.com>
Signed-off-by: Alex Bennée 
---
 contrib/plugins/lockstep.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/contrib/plugins/lockstep.c b/contrib/plugins/lockstep.c
index 319bd44b83..5aad50869d 100644
--- a/contrib/plugins/lockstep.c
+++ b/contrib/plugins/lockstep.c
@@ -268,11 +268,13 @@ static bool setup_socket(const char *path)
 socket_fd = accept(fd, NULL, NULL);
 if (socket_fd < 0 && errno != EINTR) {
 perror("accept socket");
+close(fd);
 return false;
 }
 
 qemu_plugin_outs("setup_socket::ready\n");
 
+close(fd);
 return true;
 }
 
-- 
2.20.1




Re: [PATCH 4/5 v4] KVM: VMX: Fill in conforming vmx_x86_ops via macro

2020-11-10 Thread Krish Sadhukhan



On 11/9/20 5:49 PM, Like Xu wrote:

Hi Krish,

On 2020/11/10 9:23, Krish Sadhukhan wrote:
@@ -1192,7 +1192,7 @@ void vmx_set_host_fs_gs(struct vmcs_host_state 
*host, u16 fs_sel, u16 gs_sel,

  }
  }
  -void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
+void vmx_prepare_guest_switch(struct kvm_vcpu *vcpu)


What do you think of renaming it to

void vmx_prepare_switch_for_guest(struct kvm_vcpu *vcpu);



In my opinion, it sounds a bit odd as we usually say, "switch to 
something". :-)


From that perspective, {svm|vmx}_prepare_switch_to_guest is probably 
the best name to keep.





?

Thanks,
Like Xu


  {
  struct vcpu_vmx *vmx = to_vmx(vcpu);
  struct vmcs_host_state *host_state;

@@ -311,7 +311,7 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, 
int cpu,

  int allocate_vpid(void);
  void free_vpid(int vpid);
  void vmx_set_constant_host_state(struct vcpu_vmx *vmx);
-void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu);
+void vmx_prepare_guest_switch(struct kvm_vcpu *vcpu);
  void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, 
u16 gs_sel,

  unsigned long fs_base, unsigned long gs_base);
  int vmx_get_cpl(struct kvm_vcpu *vcpu);






Re: [PATCH for-5.2] virtiofsd: Announce submounts even without statx()

2020-11-10 Thread Dr. David Alan Gilbert
* Max Reitz (mre...@redhat.com) wrote:
> Contrary to what the check (and warning) in lo_init() claims, we can
> announce submounts just fine even without statx() -- the check is based
> on comparing both the mount ID and st_dev of parent and child.  Without
> statx(), we will not have the mount ID; but we always have st_dev.
> 
> The only problems we have (without statx() and its mount ID) are:
> 
> (1) Mounting the same device twice may lead to both trees being treated
> as exactly the same tree by virtiofsd.  But that is a problem that
> is completely independent of mirroring host submounts in the guest.
> Both submount roots will still show the FUSE_SUBMOUNT flag, because
> their st_dev still differs from their respective parent.
> 
> (2) There is only one exception to (1), and that is if you mount a
> device inside a mount of itself: Then, its st_dev will be the same
> as that of its parent, and so without a mount ID, virtiofsd will not
> be able to recognize the nested mount's root as a submount.
> However, thanks to virtiofsd then treating both trees as exactly the
> same tree, it will be caught up in a loop when the guest tries to
> examine the nested submount, so the guest will always see nothing
> but an ELOOP there.  Therefore, this case is just fully broken
> without statx(), whether we check for submounts (based on st_dev) or
> not.
> 
> All in all, checking for submounts works well even without comparing the
> mount ID (i.e., without statx()).  The only concern is an edge case
> that, without statx() mount IDs, is utterly broken anyway.
> 
> Thus, drop said check in lo_init().
> 
> Reported-by: Miklos Szeredi 
> Signed-off-by: Max Reitz 

OK, that seems to have been the outcome of the discussion here:
  https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg00500.html
so


Reviewed-by: Dr. David Alan Gilbert 

> ---
>  tools/virtiofsd/passthrough_ll.c | 8 
>  1 file changed, 8 deletions(-)
> 
> diff --git a/tools/virtiofsd/passthrough_ll.c 
> b/tools/virtiofsd/passthrough_ll.c
> index ec1008bceb..6c64b03f1a 100644
> --- a/tools/virtiofsd/passthrough_ll.c
> +++ b/tools/virtiofsd/passthrough_ll.c
> @@ -610,14 +610,6 @@ static void lo_init(void *userdata, struct 
> fuse_conn_info *conn)
>   "does not support it\n");
>  lo->announce_submounts = false;
>  }
> -
> -#ifndef CONFIG_STATX
> -if (lo->announce_submounts) {
> -fuse_log(FUSE_LOG_WARNING, "lo_init: Cannot announce submounts, 
> there "
> - "is no statx()\n");
> -lo->announce_submounts = false;
> -}
> -#endif
>  }
>  
>  static void lo_getattr(fuse_req_t req, fuse_ino_t ino,
> -- 
> 2.26.2
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
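
As a rough illustration of the st_dev comparison described in the commit
message, here is a minimal sketch. looks_like_submount() is a
hypothetical helper and not the virtiofsd implementation, which also
compares mount IDs when statx() is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: without a mount ID, a child is treated as a
 * submount root when its device number differs from its parent's.
 */
static bool looks_like_submount(const struct stat *parent,
                                const struct stat *child)
{
    return parent->st_dev != child->st_dev;
}
```

Case (2) from the commit message is the one this check cannot see: a
device mounted inside a mount of itself has the same st_dev as its
parent, so the helper returns false there.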




Re: QOM address space handling

2020-11-10 Thread Eduardo Habkost
On Tue, Nov 10, 2020 at 12:46:48PM -0500, Eduardo Habkost wrote:
> On Tue, Nov 10, 2020 at 04:08:16PM +0100, Paolo Bonzini wrote:
> > On 10/11/20 16:03, Eduardo Habkost wrote:
> > > > Does anyone have any arguments for which solution is preferred?
> > > I'd say (2) is preferred, as we don't expect object_new(T) to
> > > have any side effects outside the object instance state.
> > 
> > Since there are no listeners, the side effects of address_space_init() are
> > relatively limited.  So doing it in instance_init is not a big deal.
> > 
> > > Most
> > > address_space_init() calls in the code today seem to be in
> > > realize functions.
> > > 
> > > However, I wonder if we could make this simpler (and mistakes
> > > less fatal) if we make AddressSpace a QOM child of the device.
> > > Paolo, would it be too much overhead to make AddressSpace a QOM
> > > object?
> > 
> > No, it wouldn't.  AddressSpace is already quite heavyweight.
> 
> I thought this was going to be an easy job, but call_rcu()
> requires rcu_head to be the first struct field.  I assume it is
> acceptable to use call_rcu1() + container_of() manually in this
> case.

Wait.  What exactly prevents callers of address_space_destroy()
from freeing the area containing the AddressSpace struct before
do_address_space_destroy() gets a chance to be called?

-- 
Eduardo
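
For readers unfamiliar with the container_of() idiom mentioned above,
here is a minimal standalone sketch. All sketch_* names are hypothetical
and this is not QEMU's RCU code: there is no grace period or deferred
execution here, and it does not address the lifetime question raised in
this message. It only shows how a callback can recover the enclosing
struct when the head is not the first field.

```c
#include <assert.h>
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to a member. */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_rcu_head {
    void (*func)(struct sketch_rcu_head *head);
};

struct sketch_space {
    int id;                         /* a field placed before the head */
    struct sketch_rcu_head rcu;     /* deliberately not the first member */
};

static int last_reclaimed_id = -1;

/* Callback that only receives the head and must find its container. */
static void sketch_reclaim(struct sketch_rcu_head *head)
{
    struct sketch_space *as =
        sketch_container_of(head, struct sketch_space, rcu);

    last_reclaimed_id = as->id;
}
```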




Re: [PATCH for-5.2 v2 1/4] hw/net/can/ctucan: Don't allow guest to write off end of tx_buffer

2020-11-10 Thread Peter Maydell
On Tue, 10 Nov 2020 at 18:02, Pavel Pisa  wrote:
>
> Hello Peter,
>
> On Tuesday 10 of November 2020 18:06:01 Peter Maydell wrote:
> > The ctucan device has 4 CAN bus cores, each of which has a set of 20
> > 32-bit registers for writing the transmitted data. The registers are
> > however not contiguous; each core's buffer starts 0x100 bytes after
> > the previous one.
> >
> > We got the checks on the address wrong in the ctucan_mem_write()
> > function:
> >  * the first "is addr in range at all" check allowed
> >addr == CTUCAN_CORE_MEM_SIZE, which is actually the first
> >byte off the end of the range
> >  * the decode of addresses into core-number plus offset in the
> >tx buffer for that core failed to check that the offset was
> >in range, so the guest could write off the end of the
> >tx_buffer[] array
> >
> > NB: currently the values of CTUCAN_CORE_MEM_SIZE, CTUCAN_CORE_TXBUF_NUM,
> > etc, make "buff_num >= CTUCAN_CORE_TXBUF_NUM" impossible, but we
> > retain this as a runtime check rather than an assertion to permit
> > those values to be changed in future (in hardware they are
> > configurable synthesis parameters).
> >
> > Fix the top level check, and check the offset is within the buffer.
> >
> > Fixes: Coverity CID 1432874
> > Signed-off-by: Peter Maydell 
> > ---
> >  hw/net/can/ctucan_core.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/hw/net/can/ctucan_core.c b/hw/net/can/ctucan_core.c
> > index d20835cd7e9..538270e62f9 100644
> > --- a/hw/net/can/ctucan_core.c
> > +++ b/hw/net/can/ctucan_core.c
> > @@ -303,7 +303,7 @@ void ctucan_mem_write(CtuCanCoreState *s, hwaddr addr,
> > uint64_t val, DPRINTF("write 0x%02llx addr 0x%02x\n",
> >  (unsigned long long)val, (unsigned int)addr);
> >
> > -if (addr > CTUCAN_CORE_MEM_SIZE) {
> > +if (addr >= CTUCAN_CORE_MEM_SIZE) {
> >  return;
> >  }
>
> Ack
>
> > @@ -312,7 +312,8 @@ void ctucan_mem_write(CtuCanCoreState *s, hwaddr addr,
> > uint64_t val, addr -= CTU_CAN_FD_TXTB1_DATA_1;
> >  buff_num = addr / CTUCAN_CORE_TXBUFF_SPAN;
> >  addr %= CTUCAN_CORE_TXBUFF_SPAN;
> > -if (buff_num < CTUCAN_CORE_TXBUF_NUM) {
> > +if ((buff_num < CTUCAN_CORE_TXBUF_NUM) ||
> > +(addr < sizeof(s->tx_buffer[buff_num].data))) {
>
> should be &&

Whoops, that's a silly mistake on my part.

> I would use
>
> +if (buff_num < CTUCAN_CORE_TXBUF_NUM &&
> +addr < CTUCAN_CORE_MSG_MAX_LEN) {
>
> But that is equivalent. There can be a problem in that the last three
> bytes of a uint32_t access can fall past the end. The correct change to
> fully support unaligned writes is not so easy and is unnecessary for
> actual drivers and use. So I suggest

> +addr &= ~3;
> +if ((buff_num < CTUCAN_CORE_TXBUF_NUM) &&
> +(addr < sizeof(s->tx_buffer[buff_num].data))) {

Hmm, yeah, the code is currently doing a 32-bit read regardless.

> You can consider that as Acked by me

OK, let's go with your version for 5.2.

For unaligned accesses, for 6.0, I think the code for doing
them to the txbuff at least is straightforward:

   if (buff_num < CTUCAN_CORE_TXBUF_NUM &&
   (addr + size) < CTUCAN_CORE_MSG_MAX_LEN) {
  stn_le_p(s->tx_buffer[buff_num].data + addr, size, val);
   }

(stn_le_p takes care of doing an appropriate-width write.)

thanks
-- PMM
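
To make the decode-and-check logic discussed in this thread concrete,
here is a minimal standalone sketch. The SKETCH_* constants are
illustrative values taken from the description above (four transmit
buffers of 20 32-bit registers, 0x100 bytes apart) and are not copied
from the ctucan headers; txbuf_access_ok() is a hypothetical helper.
Note that the sketch accepts an access that ends exactly at the end of
the buffer, which is an assumption about the intended semantics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only, based on the description in this thread. */
#define SKETCH_TXBUF_NUM    4u      /* number of tx buffers */
#define SKETCH_TXBUF_SPAN   0x100u  /* distance between buffer starts */
#define SKETCH_MSG_MAX_LEN  80u     /* 20 registers of 4 bytes each */

/*
 * 'offset' is relative to the first data register of the first buffer.
 * On success the buffer index and the offset inside it are returned via
 * buf_num and buf_off.
 */
static bool txbuf_access_ok(uint64_t offset, unsigned size,
                            unsigned *buf_num, unsigned *buf_off)
{
    uint64_t num = offset / SKETCH_TXBUF_SPAN;
    uint64_t off = offset % SKETCH_TXBUF_SPAN;

    /* Both the buffer index and the whole access must be in range. */
    if (num >= SKETCH_TXBUF_NUM || off + size > SKETCH_MSG_MAX_LEN) {
        return false;
    }
    *buf_num = (unsigned)num;
    *buf_off = (unsigned)off;
    return true;
}
```

Both conditions have to hold together, which is why the "||" in the
first version of the patch was wrong: with "||", any in-range buffer
index let an out-of-range offset through.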



Re: [RFC v1 09/10] i386: split cpu.c and defer x86 models registration

2020-11-10 Thread Eduardo Habkost
On Tue, Nov 10, 2020 at 06:38:49PM +0100, Claudio Fontana wrote:
> On 11/10/20 4:23 PM, Eduardo Habkost wrote:
> > On Tue, Nov 10, 2020 at 11:41:46AM +0100, Paolo Bonzini wrote:
> >> On 10/11/20 11:04, Daniel P. Berrangé wrote:
> >>>
> >>> ie, we should have one class hierarchy for CPU model definitions, and
> >>> one class hierarchy  for accelerator CPU implementations.
> >>>
> >>> So at runtime we then get two object instances - a CPU implementation
> >>> and a CPU definition. The CPU implementation object should have a
> >>> property which is a link to the desired CPU definition.
> >>
> >> It doesn't even have to be two object instances.  The implementation can be
> >> nothing more than a set of function pointers.
> > 
> > A set of function pointers is exactly what a QOM interface is.
> > Could the methods be provided by a TYPE_X86_ACCEL interface type,
> > implemented by the accel object?
> > 
> 
> Looking at the 2 axes mentioned by Daniel before, on the "accelerator cpu 
> axis", we have TYPE_TCG_CPU, TYPE_KVM_CPU, TYPE_HVF_CPU,
> which look like simple subclasses of TYPE_X86_CPU to me, with basically all 
> the divergent functionality being added by composition.
> TYPE_HVF_CPU seems to do everything that TYPE_X86_CPU does on construction 
> (and device realization), and then some.

What I don't get here is: why do we need a new "accelerator CPU
axis" if we already have an accelerator QOM type hierarchy?
accelerator-specific behavior can be delegated to the (existing)
accelerator object.

> 
> On the "cpu models" axis we have all the current subclasses of TYPE_X86_CPU, 
> which include "links" to X86CPUModel objects in the form
> of class_data:
> 
> static void x86_register_cpu_model_type(const char *name, X86CPUModel *model,
> const char *parent_type)
> {
> g_autofree char *typename = x86_cpu_type_name(name);
> TypeInfo ti = {
> .name = typename,
> .parent = parent_type,
> .class_init = x86_cpu_cpudef_class_init,
> .class_data = model,
> };
> 
> type_register();
> }
> 
> so this would be close to the "link" property that Daniel you were speaking 
> about before?
> Should X86CPUmodel be the prime citizen of the "cpu models"
> axis, without constructing a separate TYPE_X86_CPU subclass for
> each cpu model?

I don't think this would be fundamentally wrong, but the
assumption that each CPU model is implemented as a separate
subclass of TYPE_CPU is encoded everywhere in the code and in
management software.

> 
> A separate topic we did not address in comments before, where
> I'd like opinions, is how should we treat cpu types "base" and
> "max" and "host"?
> 
> Just to avoid forgetting about them, currently TYPE_X86_CPU is
> the parent type of "base" and of "max", and "max" is the parent
> type of "host".
> 
> "host" is only allowed when using accelerator kvm or hvf.
> Attempts to create such a cpu without a kvm or hvf accelerator
> enabled will error out.
> "max" behaves differently when using hvf or kvm.

"base" exists only to allow us to implement
`query-cpu-model-expansion type=static` (because it requires a
"static" CPU model[1]).  It is not supposed to be used directly.

"host" is supposed to be used directly by the user, work out of
the box, and is a convenient way to get an optimal configuration
for the current host.  It is supposed to have reasonable defaults
that let you boot a guest, and enable as many features as
possible.  We don't offer it for TCG, because TCG emulation
features are not dependent on host capabilities.

Now, "max" is tricky to define, because its semantics are
overloaded:

For KVM, "max" is used for querying which features are supported
by the host (even if the feature is not enabled by default by
"host").

However, "max" is _also_ usable directly by users with TCG, if
they want all features supported by TCG enabled.  Its use case
for TCG is more similar to the use case for "host".

Probably mixing two use cases in the same "max" CPU model was a
mistake, and we should have added a separate CPU model for each
use case.

Because of the above, having separate accel-specific names for
each of those models sounds like a welcome change.

---
[1] The definition of "static CPU model" is in the documentation
for query-cpu-model-expansion.

-- 
Eduardo
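
As a rough sketch of the "set of function pointers" idea discussed in
this thread, the following standalone C shows accelerator-specific
behaviour being delegated through an ops table instead of through CPU
subclasses. Every name here (CpuSketch, AccelCpuOps, the feature bits)
is hypothetical and chosen for illustration; this is not the QOM
interface API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct CpuSketch {
    int features;                   /* hypothetical feature bitmask */
} CpuSketch;

/* Hypothetical per-accelerator hooks, standing in for a QOM interface. */
typedef struct AccelCpuOps {
    const char *name;
    void (*instance_init)(CpuSketch *cpu);
} AccelCpuOps;

static void tcg_instance_init(CpuSketch *cpu) { cpu->features |= 0x1; }
static void kvm_instance_init(CpuSketch *cpu) { cpu->features |= 0x2; }

static const AccelCpuOps accel_ops[] = {
    { "tcg", tcg_instance_init },
    { "kvm", kvm_instance_init },
};

static const AccelCpuOps *accel_ops_lookup(const char *name)
{
    for (size_t i = 0; i < sizeof(accel_ops) / sizeof(accel_ops[0]); i++) {
        if (strcmp(accel_ops[i].name, name) == 0) {
            return &accel_ops[i];
        }
    }
    return NULL;
}

/* One CPU type; the accelerator-specific part comes from the ops table. */
static int cpu_sketch_init(CpuSketch *cpu, const char *accel)
{
    const AccelCpuOps *ops = accel_ops_lookup(accel);

    if (!ops) {
        return -1;                  /* unknown accelerator */
    }
    cpu->features = 0;
    ops->instance_init(cpu);
    return 0;
}
```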




[Bug 1174654] Re: qemu-system-x86_64 takes 100% CPU after host machine resumed from suspend to ram

2020-11-10 Thread Thomas Huth
The QEMU project is currently considering moving its bug tracking to another 
system. For this we need to know which bugs are still valid and which could be 
closed already. Thus we are setting older bugs to "Incomplete" now.
If you still think this bug report here is valid, then please switch the state 
back to "New" within the next 60 days, otherwise this report will be marked as 
"Expired". Or mark it as "Fix Released" if the problem has been solved with a 
newer version of QEMU already. Thank you and sorry for the inconvenience.


** Changed in: qemu
   Status: Confirmed => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1174654

Title:
  qemu-system-x86_64 takes 100% CPU after host machine resumed from
  suspend to ram

Status in QEMU:
  Incomplete
Status in qemu package in Ubuntu:
  Invalid

Bug description:
  I have Windows XP SP3 inside a qemu VM. All works fine in 12.10. But
  after upgrading to 13.04 I have to restart the VM each time I
  resume my host machine, because the qemu process starts to take CPU
  cycles and the OS inside the VM is very slow and sluggish. However,
  it's still controllable and can be shut down by itself.

  According to the taskmgr any active process takes 99% CPU. It's not
  stuck on some single process.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1174654/+subscriptions



Re: [PATCH 0/8] qom: Use qlit to represent property defaults

2020-11-10 Thread Eduardo Habkost
On Tue, Nov 10, 2020 at 05:39:08PM +0100, Paolo Bonzini wrote:
> On 09/11/20 22:25, Eduardo Habkost wrote:
> > Based-on: 20201104160021.2342108-1-ehabk...@redhat.com
> > Git branch: 
> > https://gitlab.com/ehabkost/qemu/-/commits/work/qdev-qlit-defaults
> > 
> > This extends qlit.h to support all QNum types (signed int,
> > unsigned int, and double), and use QLitObject to represent field
> > property defaults.
> > 
> > It allows us to get rid of most type-specific .set_default_value
> > functions for QOM property types.
> > 
> > Eduardo Habkost (8):
> >qobject: Include API docs in docs/devel/qobject.html
> >qnum: Make qnum_get_double() get const pointer
> >qnum: QNumValue type for QNum value literals
> >qnum: qnum_value_is_equal() function
> >qlit: Support all types of QNums
> >qlit: qlit_type() function
> >qom: Make object_property_set_default() public
> >qom: Use qlit to represent property defaults
> > 
> >   docs/devel/index.rst  |   1 +
> >   docs/devel/qobject.rst|  11 +++
> >   include/hw/qdev-properties-system.h   |   2 +-
> >   include/qapi/qmp/qlit.h   |  16 +++-
> >   include/qapi/qmp/qnum.h   |  47 ++-
> >   include/qapi/qmp/qobject.h|  48 +++
> >   include/qom/field-property-internal.h |   4 -
> >   include/qom/field-property.h  |  26 +++---
> >   include/qom/object.h  |  11 +++
> >   include/qom/property-types.h  |  21 ++---
> >   hw/core/qdev-properties-system.c  |   8 --
> >   qobject/qlit.c|   4 +-
> >   qobject/qnum.c| 116 +++---
> >   qom/field-property.c  |  27 --
> >   qom/object.c  |   2 +-
> >   qom/property-types.c  |  36 ++--
> >   tests/check-qjson.c   |  72 ++--
> >   17 files changed, 295 insertions(+), 157 deletions(-)
> >   create mode 100644 docs/devel/qobject.rst
> > 
> 
> Acked-by: Paolo Bonzini 

Thanks!

It looks like I broke some unit tests in this series.  I will
submit v2 after submitting v3 of the field property series.

-- 
Eduardo




Re: [PATCH-for-6.0 v4 14/17] gitlab-ci: Move trace backend tests across to gitlab

2020-11-10 Thread Wainer dos Santos Moschetta



On 11/8/20 6:45 PM, Philippe Mathieu-Daudé wrote:

Similarly to commit 8cdb2cef3f1, move the trace backend
tests to GitLab.

Signed-off-by: Philippe Mathieu-Daudé 
---
Cc: Stefan Hajnoczi 
---
  .gitlab-ci.yml | 18 ++
  .travis.yml| 19 ---
  2 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 6552a832939..2f0da7b3dc1 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -557,6 +557,24 @@ check-crypto-only-gnutls:
  IMAGE: centos7
  MAKE_CHECK_ARGS: check
  
+# We don't need to exercise every backend with every front-end

+build-trace-multi-user:
+  <<: *native_build_job_definition
+  variables:
+IMAGE: ubuntu2004


Doesn't it need the lttng-ust-dev package in Ubuntu's image, as you 
did for Fedora (patch 13)?



+CONFIGURE_ARGS: --enable-trace-backends=log,simple,syslog --disable-system
+
+build-trace-ftrace-system:
+  <<: *native_build_job_definition
+  variables:
+IMAGE: ubuntu2004
+CONFIGURE_ARGS: --enable-trace-backends=ftrace --target-list=aarch64-softmmu
On Travis it builds the x86_64 softmmu target. Changed to aarch64 to 
increase coverage?

+
+build-trace-ust-system:
+  <<: *native_build_job_definition
+  variables:
+IMAGE: fedora


Similar question here, increasing coverage by using Fedora?

- Wainer


+CONFIGURE_ARGS: --enable-trace-backends=ust --target-list=x86_64-softmmu --disable-tcg
  
  check-patch:

stage: build
diff --git a/.travis.yml b/.travis.yml
index 8ef31f8d8b6..ff5d5ead579 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -182,25 +182,6 @@ jobs:
compiler: clang
  
  
-# We don't need to exercise every backend with every front-end

-- name: "GCC trace log,simple,syslog (user)"
-  env:
-- CONFIG="--enable-trace-backends=log,simple,syslog --disable-system"
-- TEST_CMD=""
-
-
-- name: "GCC trace ftrace (x86_64-softmmu)"
-  env:
-- CONFIG="--enable-trace-backends=ftrace --target-list=x86_64-softmmu"
-- TEST_CMD=""
-
-
-- name: "GCC trace ust (x86_64-softmmu)"
-  env:
-- CONFIG="--enable-trace-backends=ust --target-list=x86_64-softmmu"
-- TEST_CMD=""
-
-
  # Using newer GCC with sanitizers
  - name: "GCC9 with sanitizers (softmmu)"
dist: bionic





[Bug 1738507] Re: qemu sometimes stuck when booting windows 10

2020-11-10 Thread Thomas Huth
The QEMU project is currently considering moving its bug tracking to another 
system. For this we need to know which bugs are still valid and which could be 
closed already. Thus we are setting older bugs to "Incomplete" now.
If you still think this bug report here is valid, then please switch the state 
back to "New" within the next 60 days, otherwise this report will be marked as 
"Expired". Or mark it as "Fix Released" if the problem has been solved with a 
newer version of QEMU already. Thank you and sorry for the inconvenience.


** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1738507

Title:
  qemu sometimes stuck when booting windows 10

Status in QEMU:
  Incomplete

Bug description:
  I am using qemu-2.10.1, or actually libvirt, to create a virtual machine, 
running microsoft windows 10 pro operating system.
  It installed fine and was actually working, however sometimes when trying to 
boot the vm, the whole boot process gets stuck.
  For some reason, it seemed to happen only when enough physical memory is 
taken so that, when booting a windows vm that has 4gb of available ram, host 
starts swapping some other processes. It is not always happening there, but 
often it happens, and I do not remember seeing any case of this happening when 
not swapping, maybe a kind of a timing issue?
  When this happens, I usually try to hard reset the machine by libvirt reset 
command or equivalent system_reset on qemu monitor, however the whole reset 
does not happen, and the command is a noop. That makes me think it is a qemu 
bug, not windows refusing operation. At the time of this event, qemu monitor 
and spice server are working correctly, are not stuck, and even doing things 
like system reset does not result in a monitor hang. It is also possible to 
quit qemu normally.
  I tried to workaround the bug by guessing what may cause it. Switched from 
bios to uefi, changed virtio-scsi to ahci temporarily, and disabled 
virtio-balloon in case it would be buggy, with no visible change.
  I will attach a libvirt log, because it contains qemu command line. I will 
also attach an example qemu backtrace.
  From what I know, both vcpu threads are working normally, at least none of 
them is stuck in a vcpu, nor deadlocked, etc. So backtrace could be different 
each time I tried to get it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1738507/+subscriptions



[Bug 1737882] Re: QEMU Zaurus cannot boot 2.4.x kernels

2020-11-10 Thread Thomas Huth
** Changed in: qemu
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1737882

Title:
  QEMU Zaurus cannot boot 2.4.x kernels

Status in QEMU:
  Won't Fix

Bug description:
  I tried akita and spitz machines.
  4.x, 3.x and 2.6.x kernels boot OK, but when I try to pass any 2.4.x, qemu 
crashes with "Trying to execute code outside RAM or ROM at 0x0080".

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1737882/+subscriptions



[Bug 1738434] Re: CALL FWORD PTR [ESP] handled incorrectly

2020-11-10 Thread Thomas Huth
The QEMU project is currently considering moving its bug tracking to another 
system. For this we need to know which bugs are still valid and which could be 
closed already. Thus we are setting older bugs to "Incomplete" now.
If you still think this bug report here is valid, then please switch the state 
back to "New" within the next 60 days, otherwise this report will be marked as 
"Expired". Or mark it as "Fix Released" if the problem has been solved with a 
newer version of QEMU already. Thank you and sorry for the inconvenience.


** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1738434

Title:
  CALL FWORD PTR [ESP] handled incorrectly

Status in QEMU:
  Incomplete

Bug description:
  To keep the story short, this 32-bit code crashes on 64-bit Windows
  whereas it works fine on real system and VMware:

  push 33h
  push offset _far_call
  call fword ptr[esp]
  jmp _ret
  _far_call:
  retf
  _ret:

  32-bit code running under WoW64 on 64-bit Windows has the ability to
  switch to the 64-bit mode via so called "Heaven's gate". In order to
  do that you have to make a far call/jmp by 0x33 selector how the code
  snippet above shows. QEMU throws an access violation exception whereas
  the code snippet runs with no problems on real CPU and VMware. By the
  way, this code works fine under QEMU, I hope it gives you a hint where
  to look:

  push 23h
  push offset _far_call
  call fword ptr[esp]
  jmp _ret
  _far_call:
  retf
  _ret:

  0x23 is a default 32-bit selector for 32-bit processes running under
  WoW64.

  Environment:
  QEMU: 2.10.93, command line: qemu-system-x86_64.exe -m 2G -snapshot -cdrom 
full_path_to_iso full_path_to_img
  Guest OS: Windows 7 x64 SP1 build 7601 or Windows 10 version 1709 build 
16299.19
  Host OS: Windows 10 x64 version 1703 build 15063.786

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1738434/+subscriptions



Re: [PATCH for-5.2 v2 4/4] hw/net/can/ctucan_core: Use stl_le_p to write to tx_buffers

2020-11-10 Thread Pavel Pisa
Hello Peter,

On Tuesday 10 of November 2020 18:06:04 Peter Maydell wrote:
> Instead of casting an address within a uint8_t array to a
> uint32_t*, use stl_le_p(). This handles possibly misaligned
> addresses which would otherwise crash on some hosts.
>
> Signed-off-by: Peter Maydell 
> Reviewed-by: Philippe Mathieu-Daudé 
> Acked-by: Pavel Pisa 
> ---
>  hw/net/can/ctucan_core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/hw/net/can/ctucan_core.c b/hw/net/can/ctucan_core.c
> index a400ad13a43..0ef528eb879 100644
> --- a/hw/net/can/ctucan_core.c
> +++ b/hw/net/can/ctucan_core.c
> @@ -305,8 +305,7 @@ void ctucan_mem_write(CtuCanCoreState *s, hwaddr addr,
> uint64_t val, addr %= CTUCAN_CORE_TXBUFF_SPAN;
>  if ((buff_num < CTUCAN_CORE_TXBUF_NUM) ||
>  (addr < sizeof(s->tx_buffer[buff_num].data))) {
> -uint32_t *bufp = (uint32_t *)(s->tx_buffer[buff_num].data +
> addr); -*bufp = cpu_to_le32(val);
> +stl_le_p(s->tx_buffer[buff_num].data + addr, val);
>  }
>  } else {
>  switch (addr & ~3) {

I will test the change soon, but this seems obvious, so

Acked-by: Pavel Pisa 
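
For illustration, the idea behind replacing the pointer cast with a byte-wise store can be sketched as follows. This is a minimal sketch, not QEMU's stl_le_p() implementation; store_le32() and load_le32() are hypothetical names. The bytes are written through memcpy(), so the result does not depend on the alignment of the destination or on the host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: store a 32-bit value in little-endian byte order
 * at an arbitrary, possibly unaligned, address.
 */
static void store_le32(void *ptr, uint32_t val)
{
    uint8_t bytes[4];

    assert(ptr != NULL);
    bytes[0] = (uint8_t)(val & 0xff);
    bytes[1] = (uint8_t)((val >> 8) & 0xff);
    bytes[2] = (uint8_t)((val >> 16) & 0xff);
    bytes[3] = (uint8_t)((val >> 24) & 0xff);
    /* memcpy() makes no alignment assumption about ptr. */
    memcpy(ptr, bytes, sizeof(bytes));
}

/* Hypothetical counterpart: read the value back from the same layout. */
static uint32_t load_le32(const void *ptr)
{
    uint8_t bytes[4];

    assert(ptr != NULL);
    memcpy(bytes, ptr, sizeof(bytes));
    return (uint32_t)bytes[0] |
           ((uint32_t)bytes[1] << 8) |
           ((uint32_t)bytes[2] << 16) |
           ((uint32_t)bytes[3] << 24);
}
```

By contrast, dereferencing a uint32_t pointer formed from an odd offset into a uint8_t array is undefined behaviour in C and may trap on hosts that enforce alignment, which is the crash the commit message refers to.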



Re: [PATCH for-5.2 v2 1/4] hw/net/can/ctucan: Don't allow guest to write off end of tx_buffer

2020-11-10 Thread Pavel Pisa
Hello Peter,

On Tuesday 10 of November 2020 18:06:01 Peter Maydell wrote:
> The ctucan device has 4 CAN bus cores, each of which has a set of 20
> 32-bit registers for writing the transmitted data. The registers are
> however not contiguous; each core's buffer is 0x100 bytes after
> the last.
>
> We got the checks on the address wrong in the ctucan_mem_write()
> function:
>  * the first "is addr in range at all" check allowed
>addr == CTUCAN_CORE_MEM_SIZE, which is actually the first
>byte off the end of the range
>  * the decode of addresses into core-number plus offset in the
>tx buffer for that core failed to check that the offset was
>in range, so the guest could write off the end of the
>tx_buffer[] array
>
> NB: currently the values of CTUCAN_CORE_MEM_SIZE, CTUCAN_CORE_TXBUF_NUM,
> etc, make "buff_num >= CTUCAN_CORE_TXBUF_NUM" impossible, but we
> retain this as a runtime check rather than an assertion to permit
> those values to be changed in future (in hardware they are
> configurable synthesis parameters).
>
> Fix the top level check, and check the offset is within the buffer.
>
> Fixes: Coverity CID 1432874
> Signed-off-by: Peter Maydell 
> ---
>  hw/net/can/ctucan_core.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/hw/net/can/ctucan_core.c b/hw/net/can/ctucan_core.c
> index d20835cd7e9..538270e62f9 100644
> --- a/hw/net/can/ctucan_core.c
> +++ b/hw/net/can/ctucan_core.c
> @@ -303,7 +303,7 @@ void ctucan_mem_write(CtuCanCoreState *s, hwaddr addr,
> uint64_t val, DPRINTF("write 0x%02llx addr 0x%02x\n",
>  (unsigned long long)val, (unsigned int)addr);
>
> -if (addr > CTUCAN_CORE_MEM_SIZE) {
> +if (addr >= CTUCAN_CORE_MEM_SIZE) {
>  return;
>  }

Ack

> @@ -312,7 +312,8 @@ void ctucan_mem_write(CtuCanCoreState *s, hwaddr addr,
> uint64_t val, addr -= CTU_CAN_FD_TXTB1_DATA_1;
>  buff_num = addr / CTUCAN_CORE_TXBUFF_SPAN;
>  addr %= CTUCAN_CORE_TXBUFF_SPAN;
> -if (buff_num < CTUCAN_CORE_TXBUF_NUM) {
> +if ((buff_num < CTUCAN_CORE_TXBUF_NUM) ||
> +(addr < sizeof(s->tx_buffer[buff_num].data))) {

should be &&

I would use 

+if (buff_num < CTUCAN_CORE_TXBUF_NUM &&
+addr < CTUCAN_CORE_MSG_MAX_LEN) {

But that is equivalent. There can be a problem in that the last three bytes of
the uint32_t value can fall past the end of the buffer. The changes needed to
fully support unaligned writes are not so easy and are unnecessary for actual
drivers and use. So I suggest

+addr &= ~3;
+if ((buff_num < CTUCAN_CORE_TXBUF_NUM) &&
+(addr < sizeof(s->tx_buffer[buff_num].data))) {

You can consider that as Acked by me

>  uint32_t *bufp = (uint32_t *)(s->tx_buffer[buff_num].data +
> addr); *bufp = cpu_to_le32(val);
>  }
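
The address decode discussed above can be sketched in isolation as follows. This is a hedged sketch, not the QEMU code: txbuf_decode() is a hypothetical name, and the constants are illustrative values taken from the commit message (4 buffers, 20 32-bit registers each, 0x100 bytes apart) rather than from the ctucan headers. It combines the two range checks with && and aligns the offset down to a word boundary, as suggested in the review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real constants live in QEMU's headers. */
#define TXBUF_NUM   4u      /* number of transmit buffers (assumed) */
#define TXBUF_SPAN  0x100u  /* address distance between buffers */
#define TXBUF_LEN   80u     /* 20 registers of 4 bytes each */

/*
 * Hypothetical decode: split an address relative to the first tx buffer
 * into a buffer number and a word-aligned offset. Returns false when
 * the access would fall outside any buffer.
 */
static bool txbuf_decode(uint32_t addr, uint32_t *buff_num, uint32_t *offset)
{
    uint32_t num = addr / TXBUF_SPAN;
    uint32_t off = (addr % TXBUF_SPAN) & ~3u;  /* align down to 4 bytes */

    assert(buff_num != NULL && offset != NULL);

    /* Both conditions must hold, hence && rather than ||. */
    if (num < TXBUF_NUM && off < TXBUF_LEN) {
        *buff_num = num;
        *offset = off;
        return true;
    }
    return false;
}
```

Because the offset is aligned before the comparison, the largest accepted offset is 76, so a 4-byte store always ends inside the 80-byte buffer. With ||, an in-range buffer number alone would let an offset such as 0xfc through.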



[Bug 1736042] Re: qemu-system-x86_64 does not boot image reliably

2020-11-10 Thread Thomas Huth
Have you ever tried the suggestions from Liang Yan ?

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1736042

Title:
  qemu-system-x86_64 does not boot image reliably

Status in QEMU:
  Incomplete

Bug description:
  Booting image as root user with following command works randomly.

  ./qemu-system-x86_64 -enable-kvm -curses -smp cpus=4 -m 8192
  /root/ructfe2917_g.qcow2

  Most of the time it ends up on "800x600 Graphic mode"(been stuck there
  even for 4 hours before killed), but 1 out of ~20 it boots image
  correctly(and instantly).

  This is visible in v2.5.0 build from sources, v2.5.0 from Ubuntu
  Xenial and v2.1.2 from Debian Jessie.

  The image in question was converted from vmdk using:

  qemu-img convert -O qcow2 file.vmdk file.qcow2

  The image contains Ubuntu with grub.

  I can provide debug logs, but will need guidance how to enable
  them(and what logs are necessary).

  As a side note, it seems that booting is more certain after
  connecting(or mounting) partition using qemu-nbd/mount.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1736042/+subscriptions



[Bug 1732177] Re: SBSA ACS test freezes inside qemu-system-aarch64

2020-11-10 Thread Thomas Huth
Which version of QEMU did you test? Does it work better with the latest
version of QEMU now?

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1732177

Title:
  SBSA ACS test freezes inside qemu-system-aarch64

Status in QEMU:
  Incomplete

Bug description:
  In an effort to get Windows 10 for ARM64 (which is supposed to boot on
  SBSA/SBBR-compliant platforms) to boot inside qemu, I tried to run the
  SBSA ACS test suite. I used the UEFI image from the latest Linaro
  snapshot, and built the SBSA ACS UEFI application from
  https://github.com/ARM-software/sbsa-acs myself using a Linaro aarch64
  compiler.

  Test #8 causes an infinite exception loop, as the exception vectors
  themselves somehow become inaccessible, and accessing them triggers
  another exception to be handled by the same vector. (With some older
  Linaro UEFI images, the hard lockup is avoided, and the SBSA UEFI app
  crashes instead.) If I disable that test, the testsuite locks up in
  other tests in very similar ways. We aren't even able to get a
  pass/fail score from the app because of this.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1732177/+subscriptions



[Bug 1735082] Re: NVME pass through in th eguest VM

2020-11-10 Thread Thomas Huth
Can you reproduce the problem with the latest official upstream version
of QEMU?

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1735082

Title:
  NVME pass through in th eguest VM

Status in QEMU:
  Incomplete

Bug description:
  Hi Qemu Team

  I am new to QEMU and am trying to set up NVMe pass through.
  For that I used the git repo below for NVMe:

  https://github.com/famz/qemu/tree/nvme

  and I am trying to launch the VM with the qemu command below:

  /usr/local/bin/qemu-system-x86_64 -name sl7.0  -m 1024 -object memory-
  backend-file,id=mem,size=1G,mem-path=/dev/hugepages,share=on
  -nographic -no-user-config -nodefaults -serial
  mon:telnet:localhost:7704,server,nowait -monitor
  mon:telnet:localhost:8804,server,nowait -numa node,memdev=mem -drive
  file=/home/qemu/qcows,format=qcow2,if=none,id=disk -device ide-
  hd,drive=disk,bootindex=0 -drive
  file=nvme://:d8:00.0,if=none,id=drive0 -device virtio-
  blk,drive=drive0,id=virtio0 --enable-kvm

  I am getting a kernel panic and cannot proceed further. Please help.

  PS:-  our guest VM version is

  Scientific Linux 7.0 (Nitrogen)
  Kernel 3.10.0-123.el7.x86_64 on an x86_64

  Regards
  Nitin

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1735082/+subscriptions



Re: [PATCH v3 16/41] accel/tcg: Support split-wx for darwin/iOS with vm_remap

2020-11-10 Thread Joelle van Dyne
FWIW, it's a syscall that's been around for as long as I can remember.
In macOS 11 they added a new mach_vm_remap but kept the old one for
compatibility so I don't think it's going away any time soon.

-j

On Tue, Nov 10, 2020 at 9:37 AM Alex Bennée  wrote:
>
>
> Richard Henderson  writes:
>
> > Cribbed from code posted by Joelle van Dyne ,
> > and rearranged to a cleaner structure.  Completely untested.
> >
> > Signed-off-by: Richard Henderson 
> > ---
> >  accel/tcg/translate-all.c | 65 +++
> >  1 file changed, 65 insertions(+)
> >
> > diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> > index 1931e65365..17df6c94fa 100644
> > --- a/accel/tcg/translate-all.c
> > +++ b/accel/tcg/translate-all.c
> > @@ -1166,9 +1166,71 @@ static bool 
> > alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
> >  }
> >  #endif /* CONFIG_POSIX */
> >
> > +#ifdef CONFIG_DARWIN
> > +#include 
> > +
> > +extern kern_return_t mach_vm_remap(vm_map_t target_task,
> > +   mach_vm_address_t *target_address,
> > +   mach_vm_size_t size,
> > +   mach_vm_offset_t mask,
> > +   int flags,
> > +   vm_map_t src_task,
> > +   mach_vm_address_t src_address,
> > +   boolean_t copy,
> > +   vm_prot_t *cur_protection,
> > +   vm_prot_t *max_protection,
> > +   vm_inherit_t inheritance);
>
> Our checkpatch really doesn't like the extern being dropped in here but
> having grepped the xnu source I'm not sure we have a choice. I'm curious
> how stable the function might be given it's not in a published header.
>
> --
> Alex Bennée



Re: [RFC v1 09/10] i386: split cpu.c and defer x86 models registration

2020-11-10 Thread Eduardo Habkost
On Tue, Nov 10, 2020 at 05:05:27PM +0100, Paolo Bonzini wrote:
> On 10/11/20 16:23, Eduardo Habkost wrote:
> > On Tue, Nov 10, 2020 at 11:41:46AM +0100, Paolo Bonzini wrote:
> > > On 10/11/20 11:04, Daniel P. Berrangé wrote:
> > > > 
> > > > ie, we should have one class hierarchy for CPU model definitions, and
> > > > one class hierarchy  for accelerator CPU implementations.
> > > > 
> > > > So at runtime we then get two object instances - a CPU implementation
> > > > and a CPU definition. The CPU implementation object should have a
> > > > property which is a link to the desired CPU definition.
> > > 
> > > It doesn't even have to be two object instances.  The implementation can 
> > > be
> > > nothing more than a set of function pointers.
> > 
> > A set of function pointers is exactly what a QOM interface is.
> > Could the methods be provided by a TYPE_X86_ACCEL interface type,
> > implemented by the accel object?
> 
> I think we should not try to implement interfaces conditionally (i.e. have
> TYPE_X86_ACCEL implemented only on qemu-system-{i386,x86_64} and not
> qemu-system-arm), even if technically the accel/ objects are per-target
> (specific_ss) rather than common.

If the accel objects are already per target, it seems appropriate
to have a QOM type hierarchy that reflects that.

`qemu-system-x86_64 -accel kvm` would create a kvm-x86_64-accel
object, but `qemu-system-arm -accel kvm` would create a
kvm-arm-accel.

*-x86_64-accel and *-i386-accel would all implement
INTERFACE_X86_ACCEL.

-- 
Eduardo
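
The "set of function pointers" idea from this exchange can be sketched in plain C, outside QOM. Everything here is hypothetical and for illustration only: X86AccelOps, the kvm/tcg hook functions, and x86_accel_lookup() are made-up names, not QEMU APIs, and the real proposal would express this as a QOM interface type implemented by per-target accel classes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-target accelerator hooks; not a QEMU API. */
typedef struct X86AccelOps {
    const char *name;
    int (*cpu_init)(int cpu_index);
} X86AccelOps;

static int kvm_cpus_created;
static int tcg_cpus_created;

/* Placeholder hook standing in for KVM-specific CPU setup. */
static int kvm_x86_cpu_init(int cpu_index)
{
    if (cpu_index < 0) {
        return -1;
    }
    kvm_cpus_created++;
    return 0;
}

/* Placeholder hook standing in for TCG-specific CPU setup. */
static int tcg_x86_cpu_init(int cpu_index)
{
    if (cpu_index < 0) {
        return -1;
    }
    tcg_cpus_created++;
    return 0;
}

/* One entry per accelerator that implements the x86 hooks. */
static const X86AccelOps x86_accel_table[] = {
    { "kvm", kvm_x86_cpu_init },
    { "tcg", tcg_x86_cpu_init },
};

/* Return the hook table for an accelerator name, or NULL if unknown. */
static const X86AccelOps *x86_accel_lookup(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(x86_accel_table) / sizeof(x86_accel_table[0]); i++) {
        if (strcmp(x86_accel_table[i].name, name) == 0) {
            return &x86_accel_table[i];
        }
    }
    return NULL;
}
```

The CPU code calls through the table and never names a specific accelerator, which is the separation between CPU definitions and accelerator implementations argued for above.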




Re: QOM address space handling

2020-11-10 Thread Eduardo Habkost
On Tue, Nov 10, 2020 at 04:08:16PM +0100, Paolo Bonzini wrote:
> On 10/11/20 16:03, Eduardo Habkost wrote:
> > > Does anyone have any arguments for which solution is preferred?
> > I'd say (2) is preferred, as we don't expect object_new(T) to
> > have any side effects outside the object instance state.
> 
> Since there are no listeners, the side effects of address_space_init() are
> relatively limited.  So doing it in instance_init is not a big deal.
> 
> > Most
> > address_space_init() calls in the code today seem to be in
> > realize functions.
> > 
> > However, I wonder if we could make this simpler (and mistakes
> > less fatal) if we make AddressSpace a QOM child of the device.
> > Paolo, would it be too much overhead to make AddressSpace a QOM
> > object?
> 
> No, it wouldn't.  AddressSpace is already quite heavyweight.

I thought this was going to be an easy job, but call_rcu()
requires rcu_head to be the first struct field.  I assume it is
acceptable to use call_rcu1() + container_of() manually in this
case.

-- 
Eduardo
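
The container_of() approach mentioned above can be sketched in isolation. This is not QEMU code: struct cb_head stands in for the RCU head, run_callback() stands in for deferred invocation and simply calls the function immediately, and demo_space is a hypothetical structure whose callback head is deliberately not the first member.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an RCU callback head; hypothetical, not struct rcu_head. */
struct cb_head {
    void (*func)(struct cb_head *head);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_space {
    int parent_state;      /* placeholder for an embedded parent object */
    struct cb_head rcu;    /* callback head, not at offset 0 */
    int reclaimed;
};

/* The callback receives only the head and maps it back to the owner. */
static void demo_space_reclaim(struct cb_head *head)
{
    struct demo_space *as = container_of(head, struct demo_space, rcu);

    as->reclaimed = 1;
}

/* Stand-in for deferred execution: runs the callback right away. */
static void run_callback(struct cb_head *head,
                         void (*func)(struct cb_head *head))
{
    assert(head != NULL && func != NULL);
    head->func = func;
    head->func(head);
}
```

Since the callback only ever sees the head pointer, the owner can place the head anywhere in the structure as long as it subtracts the member offset itself, which is what makes an embedded parent object as the first field workable.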



