Re: [PATCH RFC v1 0/2] VM fork detection for RNG

2022-02-24 Thread Laszlo Ersek
(+Daniel, +Rich)

On 02/23/22 17:08, Jason A. Donenfeld wrote:
> On Wed, Feb 23, 2022 at 2:12 PM Jason A. Donenfeld  wrote:
>> second patch is the reason this is just an RFC: it's a cleanup of the
>> ACPI driver from last year, and I don't really have much experience
>> writing, testing, debugging, or maintaining these types of drivers.
>> Ideally this thread would yield somebody saying, "I see the intent of
>> this; I'm happy to take over ownership of this part." That way, I can
>> focus on the RNG part, and whoever steps up for the paravirt ACPI part
>> can focus on that.
> 
> I actually managed to test this in QEMU, and it seems to work quite well. 
> Steps:
> 
> $ qemu-system-x86_64 ... -device vmgenid,guid=auto -monitor stdio
> (qemu) savevm blah
> (qemu) quit
> $ qemu-system-x86_64 ... -device vmgenid,guid=auto -monitor stdio
> (qemu) loadvm blah
> 
> Doing this successfully triggers the function to reinitialize the RNG
> with the new GUID.

QEMU's related design is documented in
.

The Microsoft specification referenced therein,
, contains the following
statement:

"they can also use the data provided in the 128-bit identifier as a high
entropy random data source"

So reinitializing an RNG from it is an express purpose.

More info in the libvirt docs (see "genid"):

https://libvirt.org/formatdomain.html#general-metadata

QEMU's interpretation of the VMGENID specifically as a UUID (which I
believe comes from me) has received (valid) criticism since:

https://github.com/libguestfs/virt-v2v/blob/master/docs/vm-generation-id-across-hypervisors.txt

(This document also investigates VMGENID on other hypervisors, which I
think pertains to your other message.)

> (It appears there's a bug in QEMU which prevents
> the GUID from being reinitialized when running `loadvm` without
> quitting first; I suppose this should be discussed with QEMU
> upstream.)

That's not (necessarily) a bug; see the end of the above-linked QEMU
document:

"There are no known use cases for changing the GUID once QEMU is
running, and adding this capability would greatly increase the complexity."

> 
> So that's very positive. But I would appreciate hearing from some
> ACPI/Virt/Amazon people about this.

I've only made some random comments; I didn't see a question so I
couldn't attempt to answer :)

Thanks
Laszlo




Re: [PATCH RFC v1 0/2] VM fork detection for RNG

2022-02-24 Thread Alexander Graf

Hey Jason,

On 23.02.22 14:12, Jason A. Donenfeld wrote:

This small series picks up work from Amazon that seems to have stalled
out later year around this time: listening for the vmgenid ACPI
notification, and using it to "do something." Last year, that something
involved a complicated userspace mmap chardev, which seems frought with
difficulty. This year, I have something much simpler in mind: simply
using those ACPI notifications to tell the RNG to reinitialize safely,
so we don't repeat random numbers in cloned, forked, or rolled-back VM
instances.

This series consists of two patches. The first is a rather
straightforward addition to random.c, which I feel fine about. The
second patch is the reason this is just an RFC: it's a cleanup of the
ACPI driver from last year, and I don't really have much experience
writing, testing, debugging, or maintaining these types of drivers.
Ideally this thread would yield somebody saying, "I see the intent of
this; I'm happy to take over ownership of this part." That way, I can
focus on the RNG part, and whoever steps up for the paravirt ACPI part
can focus on that.

As a final note, this series intentionally does _not_ focus on
notification of these events to userspace or to other kernel consumers.
Since these VM fork detection events first need to hit the RNG, we can
later talk about what sorts of notifications or mmap'd counters the RNG
should be making accessible to elsewhere. But that's a different sort of
project and ties into a lot of more complicated concerns beyond this
more basic patchset. So hopefully we can keep the discussion rather
focused here to this ACPI business.



The main problem with VMGenID is that it is inherently racy. There will 
always be a (short) amount of time where the ACPI notification is not 
processed, but the VM could use its RNG to for example establish TLS 
connections.


Hence we as the next step proposed a multi-stage quiesce/resume 
mechanism where the system is aware that it is going into suspend - can 
block network connections for example - and only returns to a fully 
functional state after an unquiesce phase:


  https://github.com/systemd/systemd/issues/20222

Looking at the issue again, it seems like we completely missed to follow 
up with a PR to implement that functionality :(.


What exact use case do you have in mind for the RNG/VMGenID update? Can 
you think of situations where the race is not an actual concern?



Alex




Cc: d...@amazon.co.uk
Cc: aca...@amazon.com
Cc: g...@amazon.com
Cc: colmm...@amazon.com
Cc: sbl...@amazon.com
Cc: raduw...@amazon.com
Cc: ja...@google.com
Cc: gre...@linuxfoundation.org
Cc: ty...@mit.edu

Jason A. Donenfeld (2):
   random: add mechanism for VM forks to reinitialize crng
   drivers/virt: add vmgenid driver for reinitializing RNG

  drivers/char/random.c  |  58 ++
  drivers/virt/Kconfig   |   8 +++
  drivers/virt/Makefile  |   1 +
  drivers/virt/vmgenid.c | 133 +
  include/linux/random.h |   1 +
  5 files changed, 201 insertions(+)
  create mode 100644 drivers/virt/vmgenid.c

--
2.35.1





Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




Re: Analysis of slow distro boots in check-avocado (BootLinuxAarch64.test_virt_tcg*)

2022-02-24 Thread Igor Mammedov
On Wed, 23 Feb 2022 12:50:42 +0100
Gerd Hoffmann  wrote:

>   Hi,
> 
> > > Also, "make install" installs these EDK2 images, which doesn't
> > > seem like the right thing for "this is only for one test case".  
> > 
> > Well I'd prefer we never had them installed. Today I don't remember
> > why it ended that way.  
> 
> Probably to behave simliar to other firmware, which makes sense to me.
> 
> So maybe do non-debug builds for install and debug builds for the test
> cases?  Why do the test cases need debug builds btw?

wrt bios-tables-test, it doesn't need debug version and should work fine
with non-debug builds.
(if memory serves me right it's this test case that prompted to add
uefi images to qemu)


> 
> take care,
>   Gerd
> 




Re: [PATCH v8 1/3] monitor/hmp: add support for flag argument with value

2022-02-24 Thread Fabian Ebner
Am 09.02.22 um 15:12 schrieb Markus Armbruster:
> Fabian Ebner  writes:

8<

>> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
>> index 3da3f86c6a..a4cb307c8a 100644
>> --- a/monitor/monitor-internal.h
>> +++ b/monitor/monitor-internal.h
>> @@ -63,7 +63,8 @@
>>   * '.'  other form of optional type (for 'i' and 'l')
>>   * 'b'  boolean
>>   *  user mode accepts "on" or "off"
>> - * '-'  optional parameter (eg. '-f')
>> + * '-'  optional parameter (eg. '-f'); if followed by a 'V', it
>> + *  specifies an optional string param (e.g. '-fV' allows '-f 
>> foo')
>>   *
>>   */
> 
> For what it's worth, getopt() uses ':' after the option character for
> "takes an argument".
> 

Doing that leads to e.g.
.args_type  = "protocol:s,password:s,display:-d:,connected:s?",
so there's two different kinds of colons now. It's not a problem
functionality-wise AFAICT, but it might not be ideal. Should I still go
for it?

Also, wouldn't future non-string flag parameters need their own letter
too? What about re-using 's' here instead?

> Happy to make that tweak in my tree.  But see my review of PATCH 3
> first.
> 
> 




Re: [PATCH 2/3] util/main-loop: Introduce the main loop into QOM

2022-02-24 Thread Stefan Hajnoczi
On Mon, Feb 21, 2022 at 06:08:44PM +0100, Nicolas Saenz Julienne wrote:
> diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
> index 8dbc6fcb89..fea5a3e9d4 100644
> --- a/include/qemu/main-loop.h
> +++ b/include/qemu/main-loop.h
> @@ -26,9 +26,20 @@
>  #define QEMU_MAIN_LOOP_H
>  
>  #include "block/aio.h"
> +#include "qom/object.h"
> +#include "util/event-loop.h"
>  
>  #define SIG_IPI SIGUSR1
>  
> +#define TYPE_MAIN_LOOP "main-loop"
> +
> +struct MainLoop {
> +EventLoopBackend parent_obj;
> +};
> +typedef struct MainLoop MainLoop;
> +
> +DECLARE_INSTANCE_CHECKER(MainLoop, MAIN_LOOP, TYPE_MAIN_LOOP)

 * Direct usage of this macro should be avoided, and the complete
 * OBJECT_DECLARE_TYPE macro is recommended instead.

Is there a reason for using DECLARE_INSTANCE_CHECKER() instead of
OBJECT_DECLARE_TYPE()?

> @@ -882,7 +883,8 @@
>'input-barrier':  'InputBarrierProperties',
>'input-linux':{ 'type': 'InputLinuxProperties',
>'if': 'CONFIG_LINUX' },
> -  'iothread':   'IothreadProperties',
> +  'iothread':   'EventLoopBackendProperties',
> +  'main-loop':  'EventLoopBackendProperties',
>'memory-backend-epc': { 'type': 'MemoryBackendEpcProperties',
>'if': 'CONFIG_LINUX' },
>'memory-backend-file':'MemoryBackendFileProperties',

Does this commit the QAPI schema to keeping iothread and main-loop
properties identical or can they diverge over time, if necessary?

I think we have the freedom to switch the QAPI schema to different
structs for iothread and main-loop in the future because QMP clients
aren't supposed to rely on the exact type, but I wanted to double-check.

> diff --git a/qga/meson.build b/qga/meson.build
> index 1ee9dca60b..3051473e04 100644
> --- a/qga/meson.build
> +++ b/qga/meson.build
> @@ -52,7 +52,7 @@ qga_ss = qga_ss.apply(config_host, strict: false)
>  
>  qga = executable('qemu-ga', qga_ss.sources(),
>   link_args: config_host['LIBS_QGA'].split(),
> - dependencies: [qemuutil, libudev],
> + dependencies: [qemuutil, libudev, qom],

Looks like a change because the first patch added the base class to qom
instead of qemuutil. Maybe this can be undone if the base class is added
to qemuutil instead.


signature.asc
Description: PGP signature


Re: [PATCH v5 04/20] job.c: move inner aiocontext lock in callbacks

2022-02-24 Thread Emanuele Giuseppe Esposito



On 17/02/2022 14:45, Stefan Hajnoczi wrote:
> On Tue, Feb 08, 2022 at 09:34:57AM -0500, Emanuele Giuseppe Esposito wrote:
>> Instead of having the lock in job_tnx_apply, move it inside
> 
> s/tnx/txn/
> 
>> in the callback. This will be helpful for next commits, when
>> we introduce job_lock/unlock pairs.
>>
>> job_transition_to_pending() and job_needs_finalize() do not
>> need to be protected by the aiocontext lock.
>>
>> No functional change intended.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito 
>> ---
>>  job.c | 17 -
>>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> I find this hard to review. The patch reduces the scope of the
> AioContext lock region and accesses Job in places where the AioContext
> was previously held. The commit description doesn't explain why it is
> safe to do this.

I will add this to the commit description, but in my opinion the
AioContext lock was not protecting any of the job fields in this patch.

It is only taken because the callbacks assume it is always held.
No job field in this patch is protected by the AioContext lock because
they are also read/written in functions that never take the lock.

Emanuele
> 
> I may need to audit the code myself but will try reviewing a few more
> patches first to see if I get the hang of this.
> 
>>
>> diff --git a/job.c b/job.c
>> index f13939d3c6..d8c13ac0d1 100644
>> --- a/job.c
>> +++ b/job.c
>> @@ -149,7 +149,6 @@ static void job_txn_del_job(Job *job)
>>  
>>  static int job_txn_apply(Job *job, int fn(Job *))
>>  {
>> -AioContext *inner_ctx;
>>  Job *other_job, *next;
>>  JobTxn *txn = job->txn;
>>  int rc = 0;
>> @@ -164,10 +163,7 @@ static int job_txn_apply(Job *job, int fn(Job *))
>>  aio_context_release(job->aio_context);
>>  
>>  QLIST_FOREACH_SAFE(other_job, >jobs, txn_list, next) {
>> -inner_ctx = other_job->aio_context;
>> -aio_context_acquire(inner_ctx);
>>  rc = fn(other_job);
>> -aio_context_release(inner_ctx);
>>  if (rc) {
>>  break;
>>  }
>> @@ -717,11 +713,15 @@ static void job_clean(Job *job)
>>  
>>  static int job_finalize_single(Job *job)
>>  {
>> +AioContext *ctx = job->aio_context;
>> +
>>  assert(job_is_completed(job));
>>  
>>  /* Ensure abort is called for late-transactional failures */
>>  job_update_rc(job);
>>  
>> +aio_context_acquire(ctx);
>> +
>>  if (!job->ret) {
>>  job_commit(job);
>>  } else {
>> @@ -729,6 +729,8 @@ static int job_finalize_single(Job *job)
>>  }
>>  job_clean(job);
>>  
>> +aio_context_release(ctx);
>> +
>>  if (job->cb) {
>>  job->cb(job->opaque, job->ret);
>>  }
>> @@ -833,8 +835,8 @@ static void job_completed_txn_abort(Job *job)
>>  assert(job_cancel_requested(other_job));
>>  job_finish_sync(other_job, NULL, NULL);
>>  }
>> -job_finalize_single(other_job);
>>  aio_context_release(ctx);
>> +job_finalize_single(other_job);
>>  }
>>  
>>  /*
>> @@ -849,11 +851,16 @@ static void job_completed_txn_abort(Job *job)
>>  
>>  static int job_prepare(Job *job)
>>  {
>> +AioContext *ctx = job->aio_context;
>>  assert(qemu_in_main_thread());
>> +
>>  if (job->ret == 0 && job->driver->prepare) {
>> +aio_context_acquire(ctx);
>>  job->ret = job->driver->prepare(job);
>> +aio_context_release(ctx);
>>  job_update_rc(job);
>>  }
>> +
>>  return job->ret;
>>  }
>>  
>> -- 
>> 2.31.1
>>




Re: [PATCH v5 04/20] job.c: move inner aiocontext lock in callbacks

2022-02-24 Thread Emanuele Giuseppe Esposito



On 17/02/2022 14:45, Stefan Hajnoczi wrote:
> On Tue, Feb 08, 2022 at 09:34:57AM -0500, Emanuele Giuseppe Esposito wrote:
>> Instead of having the lock in job_tnx_apply, move it inside
> 
> s/tnx/txn/
> 
>> in the callback. This will be helpful for next commits, when
>> we introduce job_lock/unlock pairs.
>>
>> job_transition_to_pending() and job_needs_finalize() do not
>> need to be protected by the aiocontext lock.
>>
>> No functional change intended.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito 
>> ---
>>  job.c | 17 -
>>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> I find this hard to review. The patch reduces the scope of the
> AioContext lock region and accesses Job in places where the AioContext
> was previously held. The commit description doesn't explain why it is
> safe to do this.
> 
> I may need to audit the code myself but will try reviewing a few more
> patches first to see if I get the hang of this.
> 
>>
>> diff --git a/job.c b/job.c
>> index f13939d3c6..d8c13ac0d1 100644
>> --- a/job.c
>> +++ b/job.c
>> @@ -149,7 +149,6 @@ static void job_txn_del_job(Job *job)
>>  
>>  static int job_txn_apply(Job *job, int fn(Job *))
>>  {
>> -AioContext *inner_ctx;
>>  Job *other_job, *next;
>>  JobTxn *txn = job->txn;
>>  int rc = 0;
>> @@ -164,10 +163,7 @@ static int job_txn_apply(Job *job, int fn(Job *))
>>  aio_context_release(job->aio_context);
>>  
>>  QLIST_FOREACH_SAFE(other_job, >jobs, txn_list, next) {
>> -inner_ctx = other_job->aio_context;
>> -aio_context_acquire(inner_ctx);
>>  rc = fn(other_job);
>> -aio_context_release(inner_ctx);
>>  if (rc) {
>>  break;
>>  }
>> @@ -717,11 +713,15 @@ static void job_clean(Job *job)
>>  
>>  static int job_finalize_single(Job *job)
>>  {
>> +AioContext *ctx = job->aio_context;
>> +
>>  assert(job_is_completed(job));
>>  
>>  /* Ensure abort is called for late-transactional failures */
>>  job_update_rc(job);
>>  
>> +aio_context_acquire(ctx);
>> +
>>  if (!job->ret) {
>>  job_commit(job);
>>  } else {
>> @@ -729,6 +729,8 @@ static int job_finalize_single(Job *job)
>>  }
>>  job_clean(job);
>>  
>> +aio_context_release(ctx);
>> +
>>  if (job->cb) {
>>  job->cb(job->opaque, job->ret);
>>  }
>> @@ -833,8 +835,8 @@ static void job_completed_txn_abort(Job *job)
>>  assert(job_cancel_requested(other_job));
>>  job_finish_sync(other_job, NULL, NULL);
>>  }
>> -job_finalize_single(other_job);
>>  aio_context_release(ctx);
>> +job_finalize_single(other_job);
>>  }
>>  
>>  /*
>> @@ -849,11 +851,16 @@ static void job_completed_txn_abort(Job *job)
>>  
>>  static int job_prepare(Job *job)
>>  {
>> +AioContext *ctx = job->aio_context;
>>  assert(qemu_in_main_thread());
>> +
>>  if (job->ret == 0 && job->driver->prepare) {
>> +aio_context_acquire(ctx);
>>  job->ret = job->driver->prepare(job);
>> +aio_context_release(ctx);
>>  job_update_rc(job);
>>  }
>> +
>>  return job->ret;
>>  }
>>  
>> -- 
>> 2.31.1
>>

Maybe just for safety this should also go in patch 19, where I enable
job lock/unlock and reduce/remove AioContext locks.

Emanuele




Re: [PATCH v5 07/20] job.h: add _locked duplicates for job API functions called with and without job_mutex

2022-02-24 Thread Emanuele Giuseppe Esposito



On 17/02/2022 15:20, Stefan Hajnoczi wrote:
> On Tue, Feb 08, 2022 at 09:35:00AM -0500, Emanuele Giuseppe Esposito wrote:
>>  static void job_exit(void *opaque)
>>  {
>>  Job *job = (Job *)opaque;
>>  AioContext *ctx;
>> +JOB_LOCK_GUARD();
>>  
>>  job_ref(job);
>>  aio_context_acquire(job->aio_context);
> 
> The previous patch said:
> 
>   We don't want this, as job_lock must be taken inside the AioContext
>   lock, and taking it outside would cause deadlocks.

This was misleading (and thus removed). You have to remember that at
this point JOB_LOCK_GUARD is nop.
I will add this to the commit message too.

> 
> Therefore this looks like a deadlock.
> 




Re: [PATCH v4 12/12] KVM: Expose KVM_MEM_PRIVATE

2022-02-24 Thread Chao Peng
On Wed, Feb 23, 2022 at 07:32:37PM +0100, Maciej S. Szmigiero wrote:
> On 23.02.2022 13:00, Chao Peng wrote:
> > On Tue, Feb 22, 2022 at 02:16:46AM +0100, Maciej S. Szmigiero wrote:
> > > On 17.02.2022 14:45, Chao Peng wrote:
> > > > On Tue, Jan 25, 2022 at 09:20:39PM +0100, Maciej S. Szmigiero wrote:
> > > > > On 18.01.2022 14:21, Chao Peng wrote:
> > > > > > KVM_MEM_PRIVATE is not exposed by default but architecture code can 
> > > > > > turn
> > > > > > on it by implementing kvm_arch_private_memory_supported().
> > > > > > 
> > > > > > Also private memslot cannot be movable and the same file+offset can 
> > > > > > not
> > > > > > be mapped into different GFNs.
> > > > > > 
> > > > > > Signed-off-by: Yu Zhang 
> > > > > > Signed-off-by: Chao Peng 
> > > > > > ---
> > > > > (..)
> > > > > > static bool kvm_check_memslot_overlap(struct kvm_memslots 
> > > > > > *slots, int id,
> > > > > > - gfn_t start, gfn_t end)
> > > > > > + struct file *file,
> > > > > > + gfn_t start, gfn_t end,
> > > > > > + loff_t start_off, loff_t end_off)
> > > > > > {
> > > > > > struct kvm_memslot_iter iter;
> > > > > > +   struct kvm_memory_slot *slot;
> > > > > > +   struct inode *inode;
> > > > > > +   int bkt;
> > > > > > kvm_for_each_memslot_in_gfn_range(, slots, start, 
> > > > > > end) {
> > > > > > if (iter.slot->id != id)
> > > > > > return true;
> > > > > > }
> > > > > > +   /* Disallow mapping the same file+offset into multiple gfns. */
> > > > > > +   if (file) {
> > > > > > +   inode = file_inode(file);
> > > > > > +   kvm_for_each_memslot(slot, bkt, slots) {
> > > > > > +   if (slot->private_file &&
> > > > > > +file_inode(slot->private_file) == inode &&
> > > > > > +!(end_off <= slot->private_offset ||
> > > > > > +  start_off >= slot->private_offset
> > > > > > ++ (slot->npages >> 
> > > > > > PAGE_SHIFT)))
> > > > > > +   return true;
> > > > > > +   }
> > > > > > +   }
> > > > > 
> > > > > That's a linear scan of all memslots on each CREATE (and MOVE) 
> > > > > operation
> > > > > with a fd - we just spent more than a year rewriting similar linear 
> > > > > scans
> > > > > into more efficient operations in KVM.
> > > > 
> (..)
> > > > So linear scan is used before I can find a better way.
> > > 
> > > Another option would be to simply not check for overlap at add or move
> > > time, declare such configuration undefined behavior under KVM API and
> > > make sure in MMU notifiers that nothing bad happens to the host kernel
> > > if it turns out somebody actually set up a VM this way (it could be
> > > inefficient in this case, since it's not supposed to ever happen
> > > unless there is a bug somewhere in the userspace part).
> > 
> > Specific to TDX case, SEAMMODULE will fail the overlapping case and then
> > KVM prints a message to the kernel log. It will not cause any other side
> > effect, it does look weird however. Yes warn that in the API document
> > can help to some extent.
> 
> So for the functionality you are adding this code for (TDX) this scan
> isn't necessary and the overlapping case (not supported anyway) is safely
> handled by the hardware (or firmware)?

Yes, it will be handled by the firmware.

> Then I would simply remove the scan and, maybe, add a comment instead
> that the overlap check is done by the hardware.

Sure.

> 
> By the way, if a kernel log message could be triggered by (misbehaving)
> userspace then it should be rate limited (if it isn't already).

Thanks for mention.

Chao
> 
> > Thanks,
> > Chao
> 
> Thanks,
> Maciej



Re: [PATCH] hw/i386/pc: when adding reserved E820 entries do not allocate dynamic entries

2022-02-24 Thread Igor Mammedov
On Wed, 23 Feb 2022 17:30:34 +0530
Ani Sinha  wrote:

> On Wed, Feb 23, 2022 at 2:34 PM Igor Mammedov  wrote:
> >
> > On Thu, 10 Feb 2022 18:58:21 +0530
> > Ani Sinha  wrote:
> >  
> > > When adding E820_RESERVED entries we also accidentally allocate dynamic
> > > entries. This is incorrect. We should simply return early with the count 
> > > of
> > > the number of reserved entries added.  
> >
> > can you expand commit message to explain what's wrong and
> > how problem manifests ... etc.  
> 
> The issue has been present for the last 8 years without apparent
> visible issues. I think the only issue is that the bug allocates more
> memory in the firmware than is actually needed.

let me repeat: Why do you think it's an issue or why it's wrong

> 
> >  
> > >
> > > fixes: 7d67110f2d9a6("pc: add etc/e820 fw_cfg file")
> > > cc: kra...@redhat.com
> > > Signed-off-by: Ani Sinha 
> > > ---
> > >  hw/i386/e820_memory_layout.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/hw/i386/e820_memory_layout.c b/hw/i386/e820_memory_layout.c
> > > index bcf9eaf837..afb08253a4 100644
> > > --- a/hw/i386/e820_memory_layout.c
> > > +++ b/hw/i386/e820_memory_layout.c
> > > @@ -31,6 +31,8 @@ int e820_add_entry(uint64_t address, uint64_t length, 
> > > uint32_t type)
> > >  entry->type = cpu_to_le32(type);
> > >
> > >  e820_reserve.count = cpu_to_le32(index);
> > > +
> > > +return index;
> > >  }  
> >
> > this changes e820_table size/content, which is added by fw_cfg_add_file() 
> > to fwcfg,
> > as result it breaks ABI in case of migration.  
> 
> Ugh. So should we keep the bug? or do we add config setting to handle
> the ABI breakage.
> 




Re: [PATCH v4 13/14] hw/mem/system-memory: add a memory sysbus device

2022-02-24 Thread Igor Mammedov
On Wed, 23 Feb 2022 11:19:49 +0100
Damien Hedde  wrote:

> On 2/23/22 10:44, Igor Mammedov wrote:
> > On Wed, 23 Feb 2022 10:07:05 +0100
> > Damien Hedde  wrote:
> >   
> >> This device can be used to create a memory wrapped into a
> >> sysbus device.
> >> This device has one property 'readonly' which allows
> >> to choose between a ram or a rom.
> >>  
> >   
> >> The purpose for this device is to be used with qapi command
> >> device_add.  
> > that's the way to add a device to QEMU but a don't actual
> > purpose described here, i.e. how board will use this
> > device/actual usecase and how it will be wired to board
> > and why it does have to be a sysbus device.
> >   
> Sorry, this was unclear.
> 
> It is a sysbus device in order to use it like any other sysbus device. 
> The memory region it contains is exposed as a sysbus mmio.

aside of that sysbus is legacy fictional bus (albeit widely used),
it doesn't scale to non sysbus devices (me thinking about buss-less
pc-dimm & co) since eventually we would like to create mainstream
machine types via QMP as well.

> I can replace the commit message by the following paragraph:
> 
> Boards instantiate memories by creating memory region objects which is 
> not possible using QAPI commands.

That's not entirely true, you can use object-add with hostmem backends
which do provide a means to allocate memory_region.
(there is no rom hostmem backend probably (i.e. one that return rom memory 
region)
but that could be added).
Another benefit of approach is that one can replace backing
memory any other backend (file/memfd/pmem...) without affecting
device model.

> To create a memory, the user can instantiate and map this device by 
> issuing the following commands:
> device_add driver=sysbus-memory size=0x1000 id=someram
> sysbus-mmio-map device=someram addr=0

I'd imagine more generic approach would look like:

object-add memory-backend-ram,id=mem1,size=0x1000,other_backend_twiks
device_add memdevice_frontend,memdev=mem1,addr=0x0

where [pre]plug hooks in machine can map device to
an appropriate address space/place at device realize time.
(see memory_device_[pre_]plug() for starters).


> >> Signed-off-by: Damien Hedde 
> >> ---
> >>   include/hw/mem/sysbus-memory.h | 28 
> >>   hw/mem/sysbus-memory.c | 80 ++
> >>   hw/mem/meson.build |  2 +
> >>   3 files changed, 110 insertions(+)
> >>   create mode 100644 include/hw/mem/sysbus-memory.h
> >>   create mode 100644 hw/mem/sysbus-memory.c
> >>
> >> diff --git a/include/hw/mem/sysbus-memory.h 
> >> b/include/hw/mem/sysbus-memory.h
> >> new file mode 100644
> >> index 00..5c596f8b4f
> >> --- /dev/null
> >> +++ b/include/hw/mem/sysbus-memory.h
> >> @@ -0,0 +1,28 @@
> >> +/*
> >> + * SPDX-License-Identifier: GPL-2.0-or-later
> >> + *
> >> + * SysBusDevice Memory
> >> + *
> >> + * Copyright (c) 2021 Greensocs
> >> + */
> >> +
> >> +#ifndef HW_SYSBUS_MEMORY_H
> >> +#define HW_SYSBUS_MEMORY_H
> >> +
> >> +#include "hw/sysbus.h"
> >> +#include "qom/object.h"
> >> +
> >> +#define TYPE_SYSBUS_MEMORY "sysbus-memory"
> >> +OBJECT_DECLARE_SIMPLE_TYPE(SysBusMemoryState, SYSBUS_MEMORY)
> >> +
> >> +struct SysBusMemoryState {
> >> +/*  */
> >> +SysBusDevice parent_obj;
> >> +uint64_t size;
> >> +bool readonly;
> >> +
> >> +/*  */
> >> +MemoryRegion mem;
> >> +};
> >> +
> >> +#endif /* HW_SYSBUS_MEMORY_H */
> >> diff --git a/hw/mem/sysbus-memory.c b/hw/mem/sysbus-memory.c
> >> new file mode 100644
> >> index 00..f1ad7ba7ec
> >> --- /dev/null
> >> +++ b/hw/mem/sysbus-memory.c
> >> @@ -0,0 +1,80 @@
> >> +/*
> >> + * SPDX-License-Identifier: GPL-2.0-or-later
> >> + *
> >> + * SysBusDevice Memory
> >> + *
> >> + * Copyright (c) 2021 Greensocs
> >> + */
> >> +
> >> +#include "qemu/osdep.h"
> >> +#include "hw/mem/sysbus-memory.h"
> >> +#include "hw/qdev-properties.h"
> >> +#include "qemu/log.h"
> >> +#include "qemu/module.h"
> >> +#include "qapi/error.h"
> >> +
> >> +static Property sysbus_memory_properties[] = {
> >> +DEFINE_PROP_UINT64("size", SysBusMemoryState, size, 0),
> >> +DEFINE_PROP_BOOL("readonly", SysBusMemoryState, readonly, false),
> >> +DEFINE_PROP_END_OF_LIST(),
> >> +};
> >> +
> >> +static void sysbus_memory_realize(DeviceState *dev, Error **errp)
> >> +{
> >> +SysBusMemoryState *s = SYSBUS_MEMORY(dev);
> >> +gchar *name;
> >> +
> >> +if (!s->size) {
> >> +error_setg(errp, "'size' must be non-zero.");
> >> +return;
> >> +}
> >> +
> >> +/*
> >> + * We impose having an id (which is unique) because we need to 
> >> generate
> >> + * a unique name for the memory region.
> >> + * memory_region_init_ram/rom() will abort() (in qemu_ram_set_idstr()
> >> + * function if 2 system-memory devices are created with the same name
> >> + * for the memory region).
> >> + */
> >> +if (!dev->id) {
> >> +error_setg(errp, "system-memory device must have an 

Re: [PATCH RFCv2 3/4] i386/pc: warn if phys-bits is too low

2022-02-24 Thread Igor Mammedov
On Wed, 23 Feb 2022 17:18:55 +
Joao Martins  wrote:

> On 2/14/22 15:18, Joao Martins wrote:
> > On 2/14/22 15:03, Igor Mammedov wrote:  
> >> On Mon,  7 Feb 2022 20:24:21 +
> >> Joao Martins  wrote:
> >>  
> >>> Default phys-bits on Qemu is TCG_PHYS_BITS (40) which is enough
> >>> to address 1Tb (0xff  ). On AMD platforms, if a
> >>> ram-above-4g relocation happens and the CPU wasn't configured
> >>> with a big enough phys-bits, warn the user. There isn't a
> >>> catastrophic failure exactly, the guest will still boot, but
> >>> most likely won't be able to use more than ~4G of RAM.  
> >>
> >> how 'unable to use" would manifest?
> >> It might be better to prevent QEMU startup with broken setup (CLI)
> >> rather then letting guest run and trying to figure out what's
> >> going wrong when users start to complain. 
> >>  
> > Sounds better to be conservative here.
> > 
> > I will change from warn_report() to error_report()
> > and exit.
> >   
> 
> I was running through x86_64 qtests prior to submission
> and it seems that the inclusion of a pci_hole64_size in
> the check added by this patch would break tests if we were
> to error out. So far, I'm keeping it as a warning over
> compatibility concerns, not limited these 5 test failures
> below. Let me know otherwise if you disagree, or if you
> prefer another way.

can you check what exactly breaks tests?

> Summary of Failures:
> 
>  1/56 qemu:qtest+qtest-x86_64 / qtest-x86_64/qom-test   ERROR 
>   0.07s
>   killed by signal 6 SIGABRT
>  4/56 qemu:qtest+qtest-x86_64 / qtest-x86_64/test-hmp   ERROR 
>   0.07s
>   killed by signal 6 SIGABRT
>  7/56 qemu:qtest+qtest-x86_64 / qtest-x86_64/boot-serial-test   ERROR 
>   0.07s
>   killed by signal 6 SIGABRT
> 44/56 qemu:qtest+qtest-x86_64 / qtest-x86_64/test-x86-cpuid-compat  ERROR 
>   0.09s
>   killed by signal 6 SIGABRT
> 45/56 qemu:qtest+qtest-x86_64 / qtest-x86_64/numa-test  ERROR 
>   0.17s
>   killed by signal 6 SIGABRT
> 




Re: [PATCH] vl: transform QemuOpts device to JSON syntax device

2022-02-24 Thread Peter Krempa
On Thu, Feb 24, 2022 at 09:02:02 +, Daniel P. Berrangé wrote:
> On Thu, Feb 24, 2022 at 02:06:53PM +0800, Zhenzhong Duan wrote:
> > While there are mixed use of traditional -device option and JSON
> > syntax option, QEMU reports conflict, e.x:
> > 
> > /usr/libexec/qemu-kvm -nodefaults \
> >   -device 
> > '{"driver":"virtio-scsi-pci","id":"scsi0","bus":"pci.0","addr":"0x02.0"}' \
> >   -device virtio-scsi-pci,id=scsi1,bus=pci.0
> 
> Why are you attempting to mix JSON and non-JSON syntax at the same
> time ? The expectation is that any mgmt app adopting JSON syntax
> will do so universally and not mix old and new syntax. So in practice
> the scenario above is not one that QEMU ever intended to have used
> by apps.

Based on the previous post they are using some 'qemu:commandline'
overrides with the legacy syntax with new libvirt which uses JSON
syntax:


  


  


I suggested that they should add the required functionality to libvirt
instead of trying to hack qemu:

https://lists.gnu.org/archive/html/qemu-devel/2022-02/msg05068.html




Re: [PATCH RFCv2 3/4] i386/pc: warn if phys-bits is too low

2022-02-24 Thread Joao Martins
On 2/24/22 09:01, Igor Mammedov wrote:
> On Wed, 23 Feb 2022 17:18:55 +
> Joao Martins  wrote:
> 
>> On 2/14/22 15:18, Joao Martins wrote:
>>> On 2/14/22 15:03, Igor Mammedov wrote:  
 On Mon,  7 Feb 2022 20:24:21 +
 Joao Martins  wrote:
  
> Default phys-bits on Qemu is TCG_PHYS_BITS (40) which is enough
> to address 1Tb (0xff  ). On AMD platforms, if a
> ram-above-4g relocation happens and the CPU wasn't configured
> with a big enough phys-bits, warn the user. There isn't a
> catastrophic failure exactly, the guest will still boot, but
> most likely won't be able to use more than ~4G of RAM.  

 how 'unable to use" would manifest?
 It might be better to prevent QEMU startup with broken setup (CLI)
 rather then letting guest run and trying to figure out what's
 going wrong when users start to complain. 
  
>>> Sounds better to be conservative here.
>>>
>>> I will change from warn_report() to error_report()
>>> and exit.
>>>   
>>
>> I was running through x86_64 qtests prior to submission
>> and it seems that the inclusion of a pci_hole64_size in
>> the check added by this patch would break tests if we were
>> to error out. So far, I'm keeping it as a warning over
>> compatibility concerns, not limited these 5 test failures
>> below. Let me know otherwise if you disagree, or if you
>> prefer another way.
> 
> can you check what exactly breaks tests?
> 
The test prematuralely fails with the above check that.

And on a second look, the problem is obvious. Essentially, I
am not handling the 32-bit case correctly :(

I will revert back what I submitted in v3 to be an error_report()
and exit() and will restrict this 64-bit only (i.e. for memory above-4g)

qemu-system-x86_64: Address space above 4G at 1-1 phys-bits too 
low (32)
qemu-system-x86_64: Address space above 4G at 1-1 phys-bits too 
low (32)
qemu-system-x86_64: Address space above 4G at 1-1 phys-bits too 
low (32)
# child process (/x86/cpuid/parsing-plus-minus/subprocess [188634]) stderr:
"qemu-system-x86_64: warning: Ambiguous CPU model string. Don't mix both 
\"-mce\" and
\"mce=on\"\nqemu-system-x86_64: warning: Ambiguous CPU model string. Don't mix 
both
\"+cx8\" and \"cx8=off\"\nqemu-system-x86_64: warning: Compatibility of 
ambiguous CPU
model strings won't be kept on future QEMU versions\nqemu-system-x86_64: 
Address space
above 4G at 1-18000 phys-bits too low (32)\nBroken pipe\n"
qemu-system-x86_64: Address space above 4G at 1-18000 phys-bits too 
low (32)

>> Summary of Failures:
>>
>>  1/56 qemu:qtest+qtest-x86_64 / qtest-x86_64/qom-test   ERROR
>>0.07s
>>   killed by signal 6 SIGABRT
>>  4/56 qemu:qtest+qtest-x86_64 / qtest-x86_64/test-hmp   ERROR
>>0.07s
>>   killed by signal 6 SIGABRT
>>  7/56 qemu:qtest+qtest-x86_64 / qtest-x86_64/boot-serial-test   ERROR
>>0.07s
>>   killed by signal 6 SIGABRT
>> 44/56 qemu:qtest+qtest-x86_64 / qtest-x86_64/test-x86-cpuid-compat  ERROR
>>0.09s
>>   killed by signal 6 SIGABRT
>> 45/56 qemu:qtest+qtest-x86_64 / qtest-x86_64/numa-test  ERROR
>>0.17s
>>   killed by signal 6 SIGABRT
>>
> 



Re: Fix a potential memory leak bug in write_boot_rom() (v6.2.0).

2022-02-24 Thread Cédric Le Goater

Hello Wentao,

On 2/23/22 17:15, Philippe Mathieu-Daudé wrote:

On 23/2/22 15:39, wli...@stu.xidian.edu.cn wrote:

Hi all,

I find a memory leak bug in QEMU 6.2.0, which is in 
write_boot_rom()(./hw/arm/aspeed.c).

Specifically, at line 276, a memory chunk is allocated with g_new0() and 
assigned to the variable 'storage'. However, if the branch takes true at line 
277, there will be only an error report at line 278 but not a free operation 
for 'storage' before function returns. As a result, a memory leak bug is 
triggered.


259    BlockBackend *blk = blk_by_legacy_dinfo(dinfo);
...
276    storage = g_new0(uint8_t, rom_size);
277    if (blk_pread(blk, 0, storage, rom_size) < 0) {
278        error_setg(errp, "failed to read the initial flash content");
279        return;
280    }


I believe that the problem can be fixed by adding a g_free() before the 
function returns.


277    if (blk_pread(blk, 0, storage, rom_size) < 0) {
278        error_setg(errp, "failed to read the initial flash content");
+++    g_free(storage);
279        return;
280    }


I'm looking forward to your confirmation.


Correct.

Or using g_autofree:


yes. Could you please send a patch using  g_autofree ?

Thanks,

C.



-- >8 --
diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index d911dc904f..170e773ef8 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -257,7 +257,7 @@ static void write_boot_rom(DriveInfo *dinfo, hwaddr addr, 
size_t rom_size,
     Error **errp)
  {
  BlockBackend *blk = blk_by_legacy_dinfo(dinfo);
-    uint8_t *storage;
+    g_autofree void *storage = NULL;
  int64_t size;

  /* The block backend size should have already been 'validated' by
@@ -273,14 +273,13 @@ static void write_boot_rom(DriveInfo *dinfo, hwaddr addr, 
size_t rom_size,
  rom_size = size;
  }

-    storage = g_new0(uint8_t, rom_size);
+    storage = g_malloc0(rom_size);
  if (blk_pread(blk, 0, storage, rom_size) < 0) {
  error_setg(errp, "failed to read the initial flash content");
  return;
  }

  rom_add_blob_fixed("aspeed.boot_rom", storage, rom_size, addr);
-    g_free(storage);
  }
---





Re: [PATCH] vl: transform QemuOpts device to JSON syntax device

2022-02-24 Thread Daniel P . Berrangé
On Thu, Feb 24, 2022 at 02:06:53PM +0800, Zhenzhong Duan wrote:
> While there are mixed use of traditional -device option and JSON
> syntax option, QEMU reports conflict, e.x:
> 
> /usr/libexec/qemu-kvm -nodefaults \
>   -device 
> '{"driver":"virtio-scsi-pci","id":"scsi0","bus":"pci.0","addr":"0x02.0"}' \
>   -device virtio-scsi-pci,id=scsi1,bus=pci.0

Why are you attempting to mix JSON and non-JSON syntax at the same
time ? The expectation is that any mgmt app adopting JSON syntax
will do so universally and not mix old and new syntax. So in practice
the scenario above is not one that QEMU ever intended to have used
by apps.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2] qapi, target/i386/sev: Add cpu0-id to query-sev-capabilities

2022-02-24 Thread Daniel P . Berrangé
On Thu, Feb 24, 2022 at 06:14:05AM +, Dov Murik wrote:
> Add a new field 'cpu0-id' to the response of query-sev-capabilities QMP
> command.  The value of the field is the base64-encoded 64-byte unique ID
> of the CPU0 (socket 0), which can be used to retrieve the signed CEK of
> the CPU from AMD's Key Distribution Service (KDS).
> 
> Signed-off-by: Dov Murik 
> 
> ---
> 
> v2:
> - change encoding to Base64 (thanks Daniel)
> - rename constant to SEV_CPU_UNIQUE_ID_LEN
> ---
>  qapi/misc-target.json |  4 
>  target/i386/sev.c | 27 +++
>  2 files changed, 31 insertions(+)
> 
> diff --git a/qapi/misc-target.json b/qapi/misc-target.json
> index 4bc45d2474..c6d9ad69e1 100644
> --- a/qapi/misc-target.json
> +++ b/qapi/misc-target.json
> @@ -177,6 +177,8 @@
>  #
>  # @cert-chain:  PDH certificate chain (base64 encoded)
>  #
> +# @cpu0-id: 64-byte unique ID of CPU0 (base64 encoded) (since 7.0)
> +#
>  # @cbitpos: C-bit location in page table entry
>  #
>  # @reduced-phys-bits: Number of physical Address bit reduction when SEV is
> @@ -187,6 +189,7 @@
>  { 'struct': 'SevCapability',
>'data': { 'pdh': 'str',
>  'cert-chain': 'str',
> +'cpu0-id': 'str',
>  'cbitpos': 'int',
>  'reduced-phys-bits': 'int'},
>'if': 'TARGET_I386' }
> @@ -205,6 +208,7 @@
>  #
>  # -> { "execute": "query-sev-capabilities" }
>  # <- { "return": { "pdh": "8CCDD8DDD", "cert-chain": "888CCCDDDEE",
> +#  "cpu0-id": "2lvmGwo+...61iEinw==",
>  #  "cbitpos": 47, "reduced-phys-bits": 5}}
>  #
>  ##
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index 025ff7a6f8..d3d2680e16 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -82,6 +82,8 @@ struct SevGuestState {
>  #define DEFAULT_GUEST_POLICY0x1 /* disable debug */
>  #define DEFAULT_SEV_DEVICE  "/dev/sev"
>  
> +#define SEV_CPU_UNIQUE_ID_LEN   64

Is this fixed size actually guaranteed by AMD ?  In reading the spec
for "GET_ID" it feels like they're intentionally not specifying a
fixed length, pushing towards querying once and then re-trying with
the reported buffer size:

   "If the value of the ID_LEN field is too small, an 
INVALID_LENGTH error is returned and the minimum 
required length is written to the field"

I didn't find where it says 64 bytes exactly.

> +
>  #define SEV_INFO_BLOCK_GUID "00f771de-1a7e-4fcb-890e-68c77e2fb44e"
>  typedef struct __attribute__((__packed__)) SevInfoBlock {
>  /* SEV-ES Reset Vector Address */
> @@ -531,11 +533,31 @@ e_free:
>  return 1;
>  }
>  
> +static int
> +sev_get_id(int fd, guchar *id_buf, size_t id_buf_len, Error **errp)
> +{
> +struct sev_user_data_get_id2 id = {
> +.address = (unsigned long)id_buf,
> +.length = id_buf_len
> +};
> +int err, r;
> +
> +r = sev_platform_ioctl(fd, SEV_GET_ID2, , );
> +if (r < 0) {
> +error_setg(errp, "SEV: Failed to get ID ret=%d fw_err=%d (%s)",
> +   r, err, fw_error_to_str(err));
> +return 1;
> +}
> +
> +return 0;
> +}
> +
>  static SevCapability *sev_get_capabilities(Error **errp)
>  {
>  SevCapability *cap = NULL;
>  guchar *pdh_data = NULL;
>  guchar *cert_chain_data = NULL;
> +guchar cpu0_id[SEV_CPU_UNIQUE_ID_LEN];
>  size_t pdh_len = 0, cert_chain_len = 0;
>  uint32_t ebx;
>  int fd;
> @@ -561,9 +583,14 @@ static SevCapability *sev_get_capabilities(Error **errp)
>  goto out;
>  }
>  
> +if (sev_get_id(fd, cpu0_id, sizeof(cpu0_id), errp)) {
> +goto out;
> +}
> +
>  cap = g_new0(SevCapability, 1);
>  cap->pdh = g_base64_encode(pdh_data, pdh_len);
>  cap->cert_chain = g_base64_encode(cert_chain_data, cert_chain_len);
> +cap->cpu0_id = g_base64_encode(cpu0_id, sizeof(cpu0_id));
>  
>  host_cpuid(0x801F, 0, NULL, , NULL, NULL);
>  cap->cbitpos = ebx & 0x3f;
> -- 
> 2.25.1
> 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v5 06/20] jobs: remove aiocontext locks since the functions are under BQL

2022-02-24 Thread Emanuele Giuseppe Esposito



On 17/02/2022 15:12, Stefan Hajnoczi wrote:
> On Tue, Feb 08, 2022 at 09:34:59AM -0500, Emanuele Giuseppe Esposito wrote:
>> In preparation to the job_lock/unlock patch, remove these
>> aiocontext locks.
>> The main reason these two locks are removed here is because
>> they are inside a loop iterating on the jobs list. Once the
>> job_lock is added, it will have to protect the whole loop,
>> wrapping also the aiocontext acquire/release.
>>
>> We don't want this, as job_lock must be taken inside the AioContext
>> lock, and taking it outside would cause deadlocks.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito 
>> ---
>>  blockdev.c | 4 
>>  job-qmp.c  | 4 
>>  2 files changed, 8 deletions(-)
>>
>> diff --git a/blockdev.c b/blockdev.c
>> index 8cac3d739c..e315466914 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -3713,15 +3713,11 @@ BlockJobInfoList *qmp_query_block_jobs(Error **errp)
>>  
>>  for (job = block_job_next(NULL); job; job = block_job_next(job)) {
> 
> I'm confused. block_job_next() is supposed to be called with job_mutex
> held since it iterates the jobs list.

Ok here it is clear that the lock is taken to protect the list. So I
should squash this patch with the one that enables job lock/unlock
(patch 19).

Emanuele

> 
> The patch series might fix this later on but it's hard to review patches
> with broken invariants.
> 
> Does this mean git-bisect(1) is broken since intermediate commits are
> not thread-safe?
> 
>>  BlockJobInfo *value;
>> -AioContext *aio_context;
>>  
>>  if (block_job_is_internal(job)) {
>>  continue;
>>  }
>> -aio_context = block_job_get_aio_context(job);
>> -aio_context_acquire(aio_context);
>>  value = block_job_query(job, errp);
> 
> This function accesses fields that are protected by job_mutex, which we
> don't hold.
> 




Re: [PATCH 1/3] util & iothread: Introduce event-loop abstract class

2022-02-24 Thread Stefan Hajnoczi
On Mon, Feb 21, 2022 at 06:08:43PM +0100, Nicolas Saenz Julienne wrote:
> diff --git a/qom/meson.build b/qom/meson.build
> index 062a3789d8..c20e5dd1cb 100644
> --- a/qom/meson.build
> +++ b/qom/meson.build
> @@ -4,6 +4,7 @@ qom_ss.add(files(
>'object.c',
>'object_interfaces.c',
>'qom-qobject.c',
> +  '../util/event-loop.c',

This looks strange. I expected util/event-loop.c to be in
util/meson.build and added to the util_ss SourceSet instead of qom_ss.

What is the reason for this?

>  ))
>  
>  qmp_ss.add(files('qom-qmp-cmds.c'))
> diff --git a/util/event-loop.c b/util/event-loop.c
> new file mode 100644
> index 00..f3e50909a0
> --- /dev/null
> +++ b/util/event-loop.c

The naming is a little inconsistent. The filename "event-loop.c" does
match the QOM type or typedef name event-loop-backend/EventLoopBackend.

I suggest calling the source file event-loop-base.c and the QOM type
"event-loop-base".

> @@ -0,0 +1,142 @@
> +/*
> + * QEMU event-loop backend
> + *
> + * Copyright (C) 2022 Red Hat Inc
> + *
> + * Authors:
> + *  Nicolas Saenz Julienne 

Most of the code is cut and pasted. It would be nice to carry over the
authorship information too.

> +struct EventLoopBackend {
> +Object parent;
> +
> +/* AioContext poll parameters */
> +int64_t poll_max_ns;
> +int64_t poll_grow;
> +int64_t poll_shrink;

These parameters do not affect the main loop because it cannot poll. If
you decide to keep them in the base class, please document that they
have no effect on the main loop so users aren't confused. I would keep
them unique to IOThread for now.


signature.asc
Description: PGP signature


Re: [PATCH 3/3] whpx: Added support for breakpoints and stepping

2022-02-24 Thread Peter Maydell
On Wed, 23 Feb 2022 at 20:08, Ivan Shcherbakov  wrote:
>

>  static GDBState gdbserver_state;
> +static bool gdbserver_is_connected;
> +
> +bool gdb_is_connected(void)
> +{
> +return gdbserver_is_connected;
> +}

I haven't looked at the rest of the patch -- but can you explain where
whpx is different from how other accelerators handle debug such that
it needs to know whether gdb is connected or not ?

thanks
-- PMM



Re: [PATCH] vl: transform QemuOpts device to JSON syntax device

2022-02-24 Thread Daniel P . Berrangé
On Thu, Feb 24, 2022 at 10:09:35AM +0100, Peter Krempa wrote:
> On Thu, Feb 24, 2022 at 09:02:02 +, Daniel P. Berrangé wrote:
> > On Thu, Feb 24, 2022 at 02:06:53PM +0800, Zhenzhong Duan wrote:
> > > While there are mixed use of traditional -device option and JSON
> > > syntax option, QEMU reports conflict, e.x:
> > > 
> > > /usr/libexec/qemu-kvm -nodefaults \
> > >   -device 
> > > '{"driver":"virtio-scsi-pci","id":"scsi0","bus":"pci.0","addr":"0x02.0"}' 
> > > \
> > >   -device virtio-scsi-pci,id=scsi1,bus=pci.0
> > 
> > Why are you attempting to mix JSON and non-JSON syntax at the same
> > time ? The expectation is that any mgmt app adopting JSON syntax
> > will do so universally and not mix old and new syntax. So in practice
> > the scenario above is not one that QEMU ever intended to have used
> > by apps.
> 
> Based on the previous post they are using some 'qemu:commandline'
> overrides with the legacy syntax with new libvirt which uses JSON
> syntax:
> 
> 
>   
> 
> 
>value='virtio-net-pci,netdev=mynet0,mac=00:16:3E:68:00:10,romfile='/>
> 
> 
> I suggested that they should add the required functionality to libvirt
> instead of trying to hack qemu:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2022-02/msg05068.html

Or the other answer is to switch to use JSON syntax in the above command
line passthrough config.

Or to specify the PCI address so it doesn't clash with other devices
already present.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v8 1/3] monitor/hmp: add support for flag argument with value

2022-02-24 Thread Markus Armbruster
Fabian Ebner  writes:

> Am 09.02.22 um 15:12 schrieb Markus Armbruster:
>> Fabian Ebner  writes:
>
> 8<
>
>>> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
>>> index 3da3f86c6a..a4cb307c8a 100644
>>> --- a/monitor/monitor-internal.h
>>> +++ b/monitor/monitor-internal.h
>>> @@ -63,7 +63,8 @@
>>>   * '.'  other form of optional type (for 'i' and 'l')
>>>   * 'b'  boolean
>>>   *  user mode accepts "on" or "off"
>>> - * '-'  optional parameter (eg. '-f')
>>> + * '-'  optional parameter (eg. '-f'); if followed by a 'V', it
>>> + *  specifies an optional string param (e.g. '-fV' allows '-f 
>>> foo')
>>>   *
>>>   */
>> 
>> For what it's worth, getopt() uses ':' after the option character for
>> "takes an argument".
>> 
>
> Doing that leads to e.g.
> .args_type  = "protocol:s,password:s,display:-d:,connected:s?",
> so there's two different kinds of colons now.

Point.

>   It's not a problem
> functionality-wise AFAICT, but it might not be ideal. Should I still go
> for it?
>
> Also, wouldn't future non-string flag parameters need their own letter
> too? What about re-using 's' here instead?

Another good point.

Dave, what do you think?

>> Happy to make that tweak in my tree.  But see my review of PATCH 3
>> first.




[PATCH v2 0/2] ui/cocoa: updateUIInfo threading, autorelease pools

2022-02-24 Thread Peter Maydell
This patchset was originally provoked by Akihiko Odaki noting
that we have some unnecessary creation and deletion of autorelease
pools in the Cocoa UI code. Patch 2 deletes them; but to get there
we need to do a bit of cleanup of the updateUIInfo support,
which wasn't considering threads.

Tested only very lightly.

v1->v2 changes:
 * name method updateUIInfoLocked, to match existing handleEventLocked
 * don't call updateUIInfo in cocoa_display_init() -- this happens
   indirectly as a result of register_displaychangelistener()

thanks
-- PMM

Peter Maydell (2):
  ui/cocoa.m: Fix updateUIInfo threading issues
  ui/cocoa.m: Remove unnecessary NSAutoreleasePools

 ui/cocoa.m | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

-- 
2.25.1




Re: [PATCH RFC v1 1/2] random: add mechanism for VM forks to reinitialize crng

2022-02-24 Thread Jason A. Donenfeld
Hey Eric,

On Thu, Feb 24, 2022 at 2:27 AM Eric Biggers  wrote:
>
> On Thu, Feb 24, 2022 at 01:54:54AM +0100, Jason A. Donenfeld wrote:
> > On 2/24/22, Eric Biggers  wrote:
> > > I think we should be removing cases where the base_crng key is changed
> > > directly
> > > besides extraction from the input_pool, not adding new ones.  Why not
> > > implement
> > > this as add_device_randomness() followed by crng_reseed(force=true), where
> > > the
> > > 'force' argument forces a reseed to occur even if the entropy_count is too
> > > low?
> >
> > Because that induces a "premature next" condition which can let that
> > entropy, potentially newly acquired by a storm of IRQs at power-on, be
> > bruteforced by unprivileged userspace. I actually had it exactly the
> > way you describe at first, but decided that this here is the lesser of
> > evils and doesn't really complicate things the way an intentional
> > premature next would. The only thing we care about here is branching
> > the crng stream, and so this does explicitly that, without having to
> > interfere with how we collect entropy. Of course we *also* add it as
> > non-credited "device randomness" so that it's part of the next
> > reseeding, whenever that might occur.
>
> Can you make sure to properly explain this in the code?

The carousel keeps turning, and after I wrote to you last night I kept
thinking about the matter. Here's how it breaks down:

Injection method:
- Assumes existing pool of entropy is still sacred.
- Assumes base_crng timestamp is representative of pool age.
- Result: Mixes directly into base_crng to avoid premature-next of pool.

Input pool method:
- Assumes existing pool of entropy is old / out of date / used by a
different fork, so not sacred.
- Assumes base_crng timestamp is tied to irrelevant entropy pool.
- Result: Force-drains input pool, causing intentional premature-next.

Which of these assumptions better models the situation? I started in
the input pool method camp, then by the time I posted v1, was
concerned about power-on IRQs, but now I think relying at all on
snapshotted entropy represents the biggest issue. And judging from
your email, it appears that you do too. So v3 of this patchset will
switch back to the input pool method, per your suggestion.

As a plus, it means we go through the RDSEED'd extraction algorithm
too, which always helps.

Jason



Re: [PATCH 3/3] whpx: Added support for breakpoints and stepping

2022-02-24 Thread Alex Bennée


Peter Maydell  writes:

> On Wed, 23 Feb 2022 at 20:08, Ivan Shcherbakov  wrote:
>>
>
>>  static GDBState gdbserver_state;
>> +static bool gdbserver_is_connected;
>> +
>> +bool gdb_is_connected(void)
>> +{
>> +return gdbserver_is_connected;
>> +}
>
> I haven't looked at the rest of the patch -- but can you explain where
> whpx is different from how other accelerators handle debug such that
> it needs to know whether gdb is connected or not ?

It's backwards from how it should work. I don't know about the other
accelerators but in the KVM case the gdbserver handles the requests for
installing breakpoints and watchpoints and if KVM is enabled passes them
onto the arch specific code.

When QEMU prepares to enter the run loop it checks for the presence of
those things and then can make the arch specific decision about how to
invoke KVM (see kvm_update_guest_debug and friends).

Just the fact you have connected to the gdbserver shouldn't affect how
you run WHPX up until the point there are things you need to trap - i.e.
handling installed breakpoints.

FWIW we rejected a similar patch a few weeks ago that wanted to change
emulation behaviour based on gdbserver connection status. 

>
> thanks
> -- PMM


-- 
Alex Bennée



[PATCH v2 04/12] mos6522: switch over to use qdev gpios for IRQs

2022-02-24 Thread Mark Cave-Ayland
For historical reasons each mos6522 instance implements its own setting and
update of the IFR flag bits using methods exposed by MOS6522DeviceClass. As
of today this is no longer required, and it is now possible to implement
the mos6522 IRQs as standard qdev gpios.

Switch over to use qdev gpios for the mos6522 device and update all instances
accordingly.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Peter Maydell 
---
 hw/misc/mac_via.c | 56 +++
 hw/misc/macio/cuda.c  |  5 ++--
 hw/misc/macio/pmu.c   |  4 +--
 hw/misc/mos6522.c | 15 +++
 include/hw/misc/mac_via.h |  5 
 include/hw/misc/mos6522.h |  2 ++
 6 files changed, 31 insertions(+), 56 deletions(-)

diff --git a/hw/misc/mac_via.c b/hw/misc/mac_via.c
index 71b74c3372..80eb433044 100644
--- a/hw/misc/mac_via.c
+++ b/hw/misc/mac_via.c
@@ -325,10 +325,9 @@ static void via1_sixty_hz(void *opaque)
 {
 MOS6522Q800VIA1State *v1s = opaque;
 MOS6522State *s = MOS6522(v1s);
-MOS6522DeviceClass *mdc = MOS6522_GET_CLASS(s);
+qemu_irq irq = qdev_get_gpio_in(DEVICE(s), VIA1_IRQ_60HZ_BIT);
 
-s->ifr |= VIA1_IRQ_60HZ;
-mdc->update_irq(s);
+qemu_set_irq(irq, 1);
 
 via1_sixty_hz_update(v1s);
 }
@@ -337,44 +336,13 @@ static void via1_one_second(void *opaque)
 {
 MOS6522Q800VIA1State *v1s = opaque;
 MOS6522State *s = MOS6522(v1s);
-MOS6522DeviceClass *mdc = MOS6522_GET_CLASS(s);
+qemu_irq irq = qdev_get_gpio_in(DEVICE(s), VIA1_IRQ_ONE_SECOND_BIT);
 
-s->ifr |= VIA1_IRQ_ONE_SECOND;
-mdc->update_irq(s);
+qemu_set_irq(irq, 1);
 
 via1_one_second_update(v1s);
 }
 
-static void via1_irq_request(void *opaque, int irq, int level)
-{
-MOS6522Q800VIA1State *v1s = opaque;
-MOS6522State *s = MOS6522(v1s);
-MOS6522DeviceClass *mdc = MOS6522_GET_CLASS(s);
-
-if (level) {
-s->ifr |= 1 << irq;
-} else {
-s->ifr &= ~(1 << irq);
-}
-
-mdc->update_irq(s);
-}
-
-static void via2_irq_request(void *opaque, int irq, int level)
-{
-MOS6522Q800VIA2State *v2s = opaque;
-MOS6522State *s = MOS6522(v2s);
-MOS6522DeviceClass *mdc = MOS6522_GET_CLASS(s);
-
-if (level) {
-s->ifr |= 1 << irq;
-} else {
-s->ifr &= ~(1 << irq);
-}
-
-mdc->update_irq(s);
-}
-
 
 static void pram_update(MOS6522Q800VIA1State *v1s)
 {
@@ -1061,8 +1029,6 @@ static void mos6522_q800_via1_init(Object *obj)
 qbus_init((BusState *)>adb_bus, sizeof(v1s->adb_bus),
   TYPE_ADB_BUS, DEVICE(v1s), "adb.0");
 
-qdev_init_gpio_in(DEVICE(obj), via1_irq_request, VIA1_IRQ_NB);
-
 /* A/UX mode */
 qdev_init_gpio_out(DEVICE(obj), >auxmode_irq, 1);
 }
@@ -1150,22 +1116,20 @@ static void mos6522_q800_via2_reset(DeviceState *dev)
 ms->a = 0x7f;
 }
 
-static void via2_nubus_irq_request(void *opaque, int irq, int level)
+static void via2_nubus_irq_request(void *opaque, int n, int level)
 {
 MOS6522Q800VIA2State *v2s = opaque;
 MOS6522State *s = MOS6522(v2s);
-MOS6522DeviceClass *mdc = MOS6522_GET_CLASS(s);
+qemu_irq irq = qdev_get_gpio_in(DEVICE(s), VIA2_IRQ_NUBUS_BIT);
 
 if (level) {
 /* Port A nubus IRQ inputs are active LOW */
-s->a &= ~(1 << irq);
-s->ifr |= 1 << VIA2_IRQ_NUBUS_BIT;
+s->a &= ~(1 << n);
 } else {
-s->a |= (1 << irq);
-s->ifr &= ~(1 << VIA2_IRQ_NUBUS_BIT);
+s->a |= (1 << n);
 }
 
-mdc->update_irq(s);
+qemu_set_irq(irq, level);
 }
 
 static void mos6522_q800_via2_init(Object *obj)
@@ -1177,8 +1141,6 @@ static void mos6522_q800_via2_init(Object *obj)
   "via2", VIA_SIZE);
 sysbus_init_mmio(sbd, >via_mem);
 
-qdev_init_gpio_in(DEVICE(obj), via2_irq_request, VIA2_IRQ_NB);
-
 qdev_init_gpio_in_named(DEVICE(obj), via2_nubus_irq_request, "nubus-irq",
 VIA2_NUBUS_IRQ_NB);
 }
diff --git a/hw/misc/macio/cuda.c b/hw/misc/macio/cuda.c
index 233daf1405..693fc82e05 100644
--- a/hw/misc/macio/cuda.c
+++ b/hw/misc/macio/cuda.c
@@ -24,6 +24,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "hw/irq.h"
 #include "hw/ppc/mac.h"
 #include "hw/qdev-properties.h"
 #include "migration/vmstate.h"
@@ -96,9 +97,9 @@ static void cuda_set_sr_int(void *opaque)
 CUDAState *s = opaque;
 MOS6522CUDAState *mcs = >mos6522_cuda;
 MOS6522State *ms = MOS6522(mcs);
-MOS6522DeviceClass *mdc = MOS6522_GET_CLASS(ms);
+qemu_irq irq = qdev_get_gpio_in(DEVICE(ms), SR_INT_BIT);
 
-mdc->set_sr_int(ms);
+qemu_set_irq(irq, 1);
 }
 
 static void cuda_delay_set_sr_int(CUDAState *s)
diff --git a/hw/misc/macio/pmu.c b/hw/misc/macio/pmu.c
index 76c608ee19..b210068ab7 100644
--- a/hw/misc/macio/pmu.c
+++ b/hw/misc/macio/pmu.c
@@ -75,9 +75,9 @@ static void via_set_sr_int(void *opaque)
 PMUState *s = opaque;
 MOS6522PMUState *mps = MOS6522_PMU(>mos6522_pmu);
 MOS6522State *ms = MOS6522(mps);
-MOS6522DeviceClass *mdc = 

[PATCH v2 12/12] macio/pmu.c: remove redundant code

2022-02-24 Thread Mark Cave-Ayland
Now that the logic related to edge-triggered interrupts is all contained within
the mos6522 device the redundant implementation for the mac99 PMU device can
be removed.

Signed-off-by: Mark Cave-Ayland 
---
 hw/misc/macio/pmu.c | 33 -
 include/hw/misc/macio/pmu.h |  2 --
 2 files changed, 35 deletions(-)

diff --git a/hw/misc/macio/pmu.c b/hw/misc/macio/pmu.c
index 5b1ec100e2..336502a84b 100644
--- a/hw/misc/macio/pmu.c
+++ b/hw/misc/macio/pmu.c
@@ -57,19 +57,6 @@
 
 #define VIA_TIMER_FREQ (470 / 6)
 
-static void via_update_irq(PMUState *s)
-{
-MOS6522PMUState *mps = MOS6522_PMU(>mos6522_pmu);
-MOS6522State *ms = MOS6522(mps);
-
-bool new_state = !!(ms->ifr & ms->ier & (SR_INT | T1_INT | T2_INT));
-
-if (new_state != s->via_irq_state) {
-s->via_irq_state = new_state;
-qemu_set_irq(s->via_irq, new_state);
-}
-}
-
 static void via_set_sr_int(void *opaque)
 {
 PMUState *s = opaque;
@@ -808,28 +795,9 @@ static void mos6522_pmu_portB_write(MOS6522State *s)
 MOS6522PMUState *mps = container_of(s, MOS6522PMUState, parent_obj);
 PMUState *ps = container_of(mps, PMUState, mos6522_pmu);
 
-if ((s->pcr & 0xe0) == 0x20 || (s->pcr & 0xe0) == 0x60) {
-s->ifr &= ~CB2_INT;
-}
-s->ifr &= ~CB1_INT;
-
-via_update_irq(ps);
 pmu_update(ps);
 }
 
-static void mos6522_pmu_portA_write(MOS6522State *s)
-{
-MOS6522PMUState *mps = container_of(s, MOS6522PMUState, parent_obj);
-PMUState *ps = container_of(mps, PMUState, mos6522_pmu);
-
-if ((s->pcr & 0x0e) == 0x02 || (s->pcr & 0x0e) == 0x06) {
-s->ifr &= ~CA2_INT;
-}
-s->ifr &= ~CA1_INT;
-
-via_update_irq(ps);
-}
-
 static void mos6522_pmu_reset(DeviceState *dev)
 {
 MOS6522State *ms = MOS6522(dev);
@@ -853,7 +821,6 @@ static void mos6522_pmu_class_init(ObjectClass *oc, void 
*data)
 device_class_set_parent_reset(dc, mos6522_pmu_reset,
   >parent_reset);
 mdc->portB_write = mos6522_pmu_portB_write;
-mdc->portA_write = mos6522_pmu_portA_write;
 }
 
 static const TypeInfo mos6522_pmu_type_info = {
diff --git a/include/hw/misc/macio/pmu.h b/include/hw/misc/macio/pmu.h
index 78237d99a2..00fcdd23f5 100644
--- a/include/hw/misc/macio/pmu.h
+++ b/include/hw/misc/macio/pmu.h
@@ -193,8 +193,6 @@ struct PMUState {
 
 MemoryRegion mem;
 uint64_t frequency;
-qemu_irq via_irq;
-bool via_irq_state;
 
 /* PMU state */
 MOS6522PMUState mos6522_pmu;
-- 
2.20.1




Re: [PATCH v4 07/18] block/reqlist: reqlist_find_conflict(): use ranges_overlap()

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

Let's reuse convenient helper.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/reqlist.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Reviewed-by: Hanna Reitz 




Re: [PATCH v4 04/18] block/copy-before-write: add bitmap open parameter

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

This brings "incremental" mode to copy-before-write filter: user can
specify bitmap so that filter will copy only "dirty" areas.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  qapi/block-core.json  | 10 +++-
  block/copy-before-write.c | 51 ++-
  2 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9a5a3641d0..3bab597506 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4171,11 +4171,19 @@
  #
  # @target: The target for copy-before-write operations.
  #
+# @bitmap: If specified, copy-before-write filter will do
+#  copy-before-write operations only for dirty regions of the
+#  bitmap. Bitmap size must be equal to length of file and
+#  target child of the filter. Note also, that bitmap is used
+#  only to initialize internal bitmap of the process, so further
+#  modifications (or removing) of specified bitmap doesn't
+#  influence the filter.


Sorry, missed this last time: There should be a “since: 7.0” here.


+#
  # Since: 6.2
  ##
  { 'struct': 'BlockdevOptionsCbw',
'base': 'BlockdevOptionsGenericFormat',
-  'data': { 'target': 'BlockdevRef' } }
+  'data': { 'target': 'BlockdevRef', '*bitmap': 'BlockDirtyBitmap' } }
  
  ##

  # @BlockdevOptions:
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 799223e3fb..91a2288b66 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -34,6 +34,8 @@
  
  #include "block/copy-before-write.h"
  
+#include "qapi/qapi-visit-block-core.h"

+
  typedef struct BDRVCopyBeforeWriteState {
  BlockCopyState *bcs;
  BdrvChild *target;
@@ -145,10 +147,53 @@ static void cbw_child_perm(BlockDriverState *bs, 
BdrvChild *c,
  }
  }
  
+static bool cbw_parse_bitmap_option(QDict *options, BdrvDirtyBitmap **bitmap,

+Error **errp)
+{
+QDict *bitmap_qdict = NULL;
+BlockDirtyBitmap *bmp_param = NULL;
+Visitor *v = NULL;
+bool ret = false;
+
+*bitmap = NULL;
+
+qdict_extract_subqdict(options, _qdict, "bitmap.");
+if (!qdict_size(bitmap_qdict)) {
+ret = true;
+goto out;
+}
+
+v = qobject_input_visitor_new_flat_confused(bitmap_qdict, errp);
+if (!v) {
+goto out;
+}
+
+visit_type_BlockDirtyBitmap(v, NULL, _param, errp);
+if (!bmp_param) {
+goto out;
+}
+
+*bitmap = block_dirty_bitmap_lookup(bmp_param->node, bmp_param->name, NULL,
+errp);
+if (!*bitmap) {
+goto out;
+}
+
+ret = true;
+
+out:
+qapi_free_BlockDirtyBitmap(bmp_param);
+visit_free(v);
+qobject_unref(bitmap_qdict);
+
+return ret;
+}
+
  static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
  Error **errp)
  {
  BDRVCopyBeforeWriteState *s = bs->opaque;
+BdrvDirtyBitmap *bitmap = NULL;
  
  bs->file = bdrv_open_child(NULL, options, "file", bs, _of_bds,

 BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
@@ -163,6 +208,10 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
  return -EINVAL;
  }
  
+if (!cbw_parse_bitmap_option(options, , errp)) {

+return -EINVAL;


Hm...  Just to get a second opinion on this: We don’t need to close 
s->target here, because the failure paths of bdrv_open_inherit() and 
bdrv_new_open_driver_opts() both call bdrv_unref(), which will call 
bdrv_close(), which will close all children including s->target, right?



+}
+
  bs->total_sectors = bs->file->bs->total_sectors;
  bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
  (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
@@ -170,7 +219,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
  ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
   bs->file->bs->supported_zero_flags);
  
-s->bcs = block_copy_state_new(bs->file, s->target, NULL, errp);

+s->bcs = block_copy_state_new(bs->file, s->target, bitmap, errp);
  if (!s->bcs) {
  error_prepend(errp, "Cannot create block-copy-state: ");
  return -EINVAL;





Re: [PULL 00/10] Misc next patches

2022-02-24 Thread Peter Maydell
On Mon, 21 Feb 2022 at 19:18, Daniel P. Berrangé  wrote:
> AFAIK, the block jobs run in CI don't cover the SSH driver at all.
>
> I had a CI pipeline before submitting, which covered Free BSD 13
> which passed. To be sure I just rebased to git master and tried
> another pipeline which passed too:
>
>   https://gitlab.com/berrange/qemu/-/jobs/2119118096
>
> so I'm thinking the failure you got is a transient. Could you retry
> this PULL

Does seem to have been a transient. (We have way too many of those
at the moment for a variety of reasons.)


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.0
for any user-visible changes.

-- PMM



[PATCH] target/arm: Report KVM's actual PSCI version to guest in dtb

2022-02-24 Thread Peter Maydell
When we're using KVM, the PSCI implementation is provided by the
kernel, but QEMU has to tell the guest about it via the device tree.
Currently we look at the KVM_CAP_ARM_PSCI_0_2 capability to determine
if the kernel is providing at least PSCI 0.2, but if the kernel
provides a newer version than that we will still only tell the guest
it has PSCI 0.2.  (This is fairly harmless; it just means the guest
won't use newer parts of the PSCI API.)

The kernel exposes the specific PSCI version it is implementing via
the ONE_REG API; use this to report in the dtb that the PSCI
implementation is 1.0-compatible if appropriate.  (The device tree
binding currently only distinguishes "pre-0.2", "0.2-compatible" and
"1.0-compatible".)

Signed-off-by: Peter Maydell 
---
Based-on: 20220213035753.34577-1-akihiko.od...@gmail.com
("[PATCH v2] target/arm: Support PSCI 1.1 and SMCCC 1.0")
though note that to compile on arm hosts you'll need the
bugfix to that patch from which I describe in a reply to it.

 target/arm/kvm-consts.h |  1 +
 hw/arm/boot.c   |  5 ++---
 target/arm/kvm64.c  | 12 
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/target/arm/kvm-consts.h b/target/arm/kvm-consts.h
index e770921ddc2..faacf96fdc7 100644
--- a/target/arm/kvm-consts.h
+++ b/target/arm/kvm-consts.h
@@ -95,6 +95,7 @@ MISMATCH_CHECK(QEMU_PSCI_1_0_FN_PSCI_FEATURES, 
PSCI_1_0_FN_PSCI_FEATURES);
 
 #define QEMU_PSCI_VERSION_0_1 0x1
 #define QEMU_PSCI_VERSION_0_2 0x2
+#define QEMU_PSCI_VERSION_1_0 0x1
 #define QEMU_PSCI_VERSION_1_1 0x10001
 
 MISMATCH_CHECK(QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED, PSCI_0_2_TOS_MP);
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 0eeef94ceb5..a47f38dfc90 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -488,9 +488,8 @@ static void fdt_add_psci_node(void *fdt)
 }
 
 qemu_fdt_add_subnode(fdt, "/psci");
-if (armcpu->psci_version == QEMU_PSCI_VERSION_0_2 ||
-armcpu->psci_version == QEMU_PSCI_VERSION_1_1) {
-if (armcpu->psci_version == QEMU_PSCI_VERSION_0_2) {
+if (armcpu->psci_version >= QEMU_PSCI_VERSION_0_2) {
+if (armcpu->psci_version < QEMU_PSCI_VERSION_1_0) {
 const char comp[] = "arm,psci-0.2\0arm,psci";
 qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
 } else {
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 64d48bfb19d..ccadfbbe72b 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -849,6 +849,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
 uint64_t mpidr;
 ARMCPU *cpu = ARM_CPU(cs);
 CPUARMState *env = >env;
+uint64_t psciver;
 
 if (cpu->kvm_target == QEMU_KVM_ARM_TARGET_NONE ||
 !object_dynamic_cast(OBJECT(cpu), TYPE_AARCH64_CPU)) {
@@ -904,6 +905,17 @@ int kvm_arch_init_vcpu(CPUState *cs)
 }
 }
 
+/*
+ * KVM reports the exact PSCI version it is implementing via a
+ * special sysreg. If it is present, use its contents to determine
+ * what to report to the guest in the dtb (it is the PSCI version,
+ * in the same 15-bits major 16-bits minor format that PSCI_VERSION
+ * returns).
+ */
+if (!kvm_get_one_reg(cs, KVM_REG_ARM_PSCI_VERSION, )) {
+cpu->psci_version = psciver;
+}
+
 /*
  * When KVM is in use, PSCI is emulated in-kernel and not by qemu.
  * Currently KVM has its own idea about MPIDR assignment, so we
-- 
2.25.1




Re: [PATCH] block: fix preallocate filter: don't do unaligned preallocate requests

2022-02-24 Thread Hanna Reitz

On 15.02.22 13:16, Vladimir Sementsov-Ogievskiy wrote:

There is a bug in handling BDRV_REQ_NO_WAIT flag: we still may wait in
wait_serialising_requests() if request is unaligned. And this is
possible for the only user of this flag (preallocate filter) if
underlying file is unaligned to its request_alignment on start.

So, we have to fix preallocate filter to do only aligned preallocate
requests.

Next, we should fix generic block/io.c somehow. Keeping in mind that
preallocate is the only user of BDRV_REQ_NO_WAIT and that we have to
fix its behavior now, it seems more safe to just assert that we never
use BDRV_REQ_NO_WAIT with unaligned requests and add corresponding
comment. Let's do so.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Denis V. Lunev 
---
  include/block/block.h |  3 ++-
  block/io.c|  4 
  block/preallocate.c   | 15 ---
  3 files changed, 18 insertions(+), 4 deletions(-)


Thanks, applied to my block branch:

https://gitlab.com/hreitz/qemu/-/commits/block




[PATCH v2 1/2] ui/cocoa.m: Fix updateUIInfo threading issues

2022-02-24 Thread Peter Maydell
The updateUIInfo method makes Cocoa API calls.  It also calls back
into QEMU functions like dpy_set_ui_info().  To do this safely, we
need to follow two rules:
 * Cocoa API calls are made on the Cocoa UI thread
 * When calling back into QEMU we must hold the iothread lock

Fix the places where we got this wrong, by taking the iothread lock
while executing updateUIInfo, and moving the call in cocoa_switch()
inside the dispatch_async block.

Some of the Cocoa UI methods which call updateUIInfo are invoked as
part of the initial application startup, while we're still doing the
little cross-thread dance described in the comment just above
call_qemu_main().  This meant they were calling back into the QEMU UI
layer before we'd actually finished initializing our display and
registered the DisplayChangeListener, which isn't really valid.  Once
updateUIInfo takes the iothread lock, we no longer get away with
this, because during this startup phase the iothread lock is held by
the QEMU main-loop thread which is waiting for us to finish our
display initialization.  So we must suppress updateUIInfo until
applicationDidFinishLaunching allows the QEMU main-loop thread to
continue.

Signed-off-by: Peter Maydell 
---
v1->v2:
 * name method updateUIInfoLocked, to match existing handleEventLocked
 * don't call updateUIInfo in cocoa_display_init() -- this happens
   indirectly as a result of register_displaychangelistener()
---
 ui/cocoa.m | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/ui/cocoa.m b/ui/cocoa.m
index a8f1cdaf926..5ed1495552a 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -522,8 +522,9 @@ QemuCocoaView *cocoaView;
 }
 }
 
-- (void) updateUIInfo
+- (void) updateUIInfoLocked
 {
+/* Must be called with the iothread lock, i.e. via updateUIInfo */
 NSSize frameSize;
 QemuUIInfo info;
 
@@ -554,6 +555,25 @@ QemuCocoaView *cocoaView;
 dpy_set_ui_info(dcl.con, , TRUE);
 }
 
+- (void) updateUIInfo
+{
+if (!allow_events) {
+/*
+ * Don't try to tell QEMU about UI information in the application
+ * startup phase -- we haven't yet registered dcl with the QEMU UI
+ * layer, and also trying to take the iothread lock would deadlock.
+ * When cocoa_display_init() does register the dcl, the UI layer
+ * will call cocoa_switch(), which will call updateUIInfo, so
+ * we don't lose any information here.
+ */
+return;
+}
+
+with_iothread_lock(^{
+[self updateUIInfoLocked];
+});
+}
+
 - (void)viewDidMoveToWindow
 {
 [self updateUIInfo];
@@ -1985,8 +2005,6 @@ static void cocoa_switch(DisplayChangeListener *dcl,
 
 COCOA_DEBUG("qemu_cocoa: cocoa_switch\n");
 
-[cocoaView updateUIInfo];
-
 // The DisplaySurface will be freed as soon as this callback returns.
 // We take a reference to the underlying pixman image here so it does
 // not disappear from under our feet; the switchSurface method will
@@ -1994,6 +2012,7 @@ static void cocoa_switch(DisplayChangeListener *dcl,
 pixman_image_ref(image);
 
 dispatch_async(dispatch_get_main_queue(), ^{
+[cocoaView updateUIInfo];
 [cocoaView switchSurface:image];
 });
 [pool release];
-- 
2.25.1




Re: [PATCH v2] qapi, target/i386/sev: Add cpu0-id to query-sev-capabilities

2022-02-24 Thread Dov Murik



On 24/02/2022 10:59, Daniel P. Berrangé wrote:
> On Thu, Feb 24, 2022 at 06:14:05AM +, Dov Murik wrote:
>> Add a new field 'cpu0-id' to the response of query-sev-capabilities QMP
>> command.  The value of the field is the base64-encoded 64-byte unique ID
>> of the CPU0 (socket 0), which can be used to retrieve the signed CEK of
>> the CPU from AMD's Key Distribution Service (KDS).
>>
>> Signed-off-by: Dov Murik 
>>
>> ---
>>
>> v2:
>> - change encoding to Base64 (thanks Daniel)
>> - rename constant to SEV_CPU_UNIQUE_ID_LEN
>> ---
>>  qapi/misc-target.json |  4 
>>  target/i386/sev.c | 27 +++
>>  2 files changed, 31 insertions(+)
>>
>> diff --git a/qapi/misc-target.json b/qapi/misc-target.json
>> index 4bc45d2474..c6d9ad69e1 100644
>> --- a/qapi/misc-target.json
>> +++ b/qapi/misc-target.json
>> @@ -177,6 +177,8 @@
>>  #
>>  # @cert-chain:  PDH certificate chain (base64 encoded)
>>  #
>> +# @cpu0-id: 64-byte unique ID of CPU0 (base64 encoded) (since 7.0)
>> +#
>>  # @cbitpos: C-bit location in page table entry
>>  #
>>  # @reduced-phys-bits: Number of physical Address bit reduction when SEV is
>> @@ -187,6 +189,7 @@
>>  { 'struct': 'SevCapability',
>>'data': { 'pdh': 'str',
>>  'cert-chain': 'str',
>> +'cpu0-id': 'str',
>>  'cbitpos': 'int',
>>  'reduced-phys-bits': 'int'},
>>'if': 'TARGET_I386' }
>> @@ -205,6 +208,7 @@
>>  #
>>  # -> { "execute": "query-sev-capabilities" }
>>  # <- { "return": { "pdh": "8CCDD8DDD", "cert-chain": "888CCCDDDEE",
>> +#  "cpu0-id": "2lvmGwo+...61iEinw==",
>>  #  "cbitpos": 47, "reduced-phys-bits": 5}}
>>  #
>>  ##
>> diff --git a/target/i386/sev.c b/target/i386/sev.c
>> index 025ff7a6f8..d3d2680e16 100644
>> --- a/target/i386/sev.c
>> +++ b/target/i386/sev.c
>> @@ -82,6 +82,8 @@ struct SevGuestState {
>>  #define DEFAULT_GUEST_POLICY0x1 /* disable debug */
>>  #define DEFAULT_SEV_DEVICE  "/dev/sev"
>>  
>> +#define SEV_CPU_UNIQUE_ID_LEN   64
> 
> Is this fixed size actually guaranteed by AMD ?  In reading the spec
> for "GET_ID" it feels like they're intentionally not specifying a
> fixed length, pushing towards querying once and then re-trying with
> the reported buffer size:
> 
>"If the value of the ID_LEN field is too small, an 
> INVALID_LENGTH error is returned and the minimum 
> required length is written to the field"
> 
> I didn't find where it says 64 bytes exactly.
> 

OK.

I'll change it to dynamically allocate the ID buffer based on the length
returned by the first query, similar to sev_get_pdh_info().

I'll remove the explicit lengths from the qapi description and the
commit message.

-Dov


>> +
>>  #define SEV_INFO_BLOCK_GUID "00f771de-1a7e-4fcb-890e-68c77e2fb44e"
>>  typedef struct __attribute__((__packed__)) SevInfoBlock {
>>  /* SEV-ES Reset Vector Address */
>> @@ -531,11 +533,31 @@ e_free:
>>  return 1;
>>  }
>>  
>> +static int
>> +sev_get_id(int fd, guchar *id_buf, size_t id_buf_len, Error **errp)
>> +{
>> +struct sev_user_data_get_id2 id = {
>> +.address = (unsigned long)id_buf,
>> +.length = id_buf_len
>> +};
>> +int err, r;
>> +
>> +r = sev_platform_ioctl(fd, SEV_GET_ID2, , );
>> +if (r < 0) {
>> +error_setg(errp, "SEV: Failed to get ID ret=%d fw_err=%d (%s)",
>> +   r, err, fw_error_to_str(err));
>> +return 1;
>> +}
>> +
>> +return 0;
>> +}
>> +
>>  static SevCapability *sev_get_capabilities(Error **errp)
>>  {
>>  SevCapability *cap = NULL;
>>  guchar *pdh_data = NULL;
>>  guchar *cert_chain_data = NULL;
>> +guchar cpu0_id[SEV_CPU_UNIQUE_ID_LEN];
>>  size_t pdh_len = 0, cert_chain_len = 0;
>>  uint32_t ebx;
>>  int fd;
>> @@ -561,9 +583,14 @@ static SevCapability *sev_get_capabilities(Error **errp)
>>  goto out;
>>  }
>>  
>> +if (sev_get_id(fd, cpu0_id, sizeof(cpu0_id), errp)) {
>> +goto out;
>> +}
>> +
>>  cap = g_new0(SevCapability, 1);
>>  cap->pdh = g_base64_encode(pdh_data, pdh_len);
>>  cap->cert_chain = g_base64_encode(cert_chain_data, cert_chain_len);
>> +cap->cpu0_id = g_base64_encode(cpu0_id, sizeof(cpu0_id));
>>  
>>  host_cpuid(0x801F, 0, NULL, , NULL, NULL);
>>  cap->cbitpos = ebx & 0x3f;
>> -- 
>> 2.25.1
>>
> 
> Regards,
> Daniel



Re: [PATCH 0/8] Misc build fixes and cleanups

2022-02-24 Thread Paolo Bonzini

On 2/22/22 20:40, marcandre.lur...@redhat.com wrote:

From: Marc-André Lureau 

Hi,

A small collection of patches gleaned while working on different things.


Queued all except patch 4.

Paolo


Marc-André Lureau (8):
   meson: fix generic location of vss headers
   qga/vss-win32: check old VSS SDK headers
   qga/vss: update informative message about MinGW
   meson: drop the .fa library suffix
   meson: use chardev_ss dependencies
   char: move qemu_openpty_raw from util/ to char/
   Drop qemu_foo() socket API wrapper
   Replace GCC_FMT_ATTR with G_GNUC_PRINTF

  docs/devel/build-system.rst |   5 -
  meson.build |  25 ++---
  audio/audio.h   |   4 +-
  block/qcow2.h   |   2 +-
  bsd-user/qemu.h |   2 +-
  hw/display/qxl.h|   2 +-
  hw/net/rocker/rocker.h  |   2 +-
  hw/xen/xen_pt.h |   2 +-
  include/chardev/char-fe.h   |   2 +-
  include/disas/dis-asm.h |   2 +-
  include/hw/acpi/aml-build.h |  12 +-
  include/hw/core/cpu.h   |   2 +-
  include/hw/hw.h |   2 +-
  include/hw/virtio/virtio.h  |   2 +-
  include/hw/xen/xen-bus-helper.h |   4 +-
  include/hw/xen/xen-bus.h|   4 +-
  include/hw/xen/xen_common.h |   2 +-
  include/hw/xen/xen_pvdev.h  |   2 +-
  include/monitor/monitor.h   |   4 +-
  include/qapi/error.h|  20 ++--
  include/qapi/qmp/qjson.h|   8 +-
  include/qemu-common.h   |  21 
  include/qemu/buffer.h   |   2 +-
  include/qemu/compiler.h |  11 +-
  include/qemu/error-report.h |  24 ++--
  include/qemu/log-for-trace.h|   2 +-
  include/qemu/log.h  |   2 +-
  include/qemu/qemu-print.h   |   8 +-
  include/qemu/readline.h |   2 +-
  qga/guest-agent-core.h  |   2 +-
  qga/vss-win32/requester.h   |   2 +-
  qga/vss-win32/vss-common.h  |   3 +-
  scripts/cocci-macro-file.h  |   2 +-
  tests/qtest/libqos/libqtest.h   |  42 +++
  tests/qtest/libqtest-single.h   |   2 +-
  tests/qtest/migration-helpers.h |   6 +-
  audio/alsaaudio.c   |   4 +-
  audio/coreaudio.c   |   4 +-
  audio/dsoundaudio.c |   4 +-
  audio/ossaudio.c|   4 +-
  audio/paaudio.c |   2 +-
  audio/sdlaudio.c|   2 +-
  block/blkverify.c   |   2 +-
  block/ssh.c |   4 +-
  chardev/char-pty.c  | 104 ++
  crypto/cipher-afalg.c   |   4 +-
  crypto/hash-afalg.c |   4 +-
  fsdev/9p-marshal.c  |   2 +-
  fsdev/virtfs-proxy-helper.c |   2 +-
  gdbstub.c   |   2 +-
  hw/9pfs/9p.c|   2 +-
  hw/acpi/aml-build.c |   4 +-
  hw/mips/fuloong2e.c |   2 +-
  hw/mips/malta.c |   2 +-
  hw/net/rtl8139.c|   2 +-
  hw/virtio/virtio.c  |   2 +-
  io/channel-socket.c |   6 +-
  io/channel-websock.c|   2 +-
  monitor/hmp.c   |   4 +-
  nbd/server.c|  10 +-
  net/socket.c|  24 ++--
  qemu-img.c  |   4 +-
  qemu-io.c   |   2 +-
  qobject/json-parser.c   |   2 +-
  softmmu/qtest.c |   4 +-
  tests/qtest/e1000e-test.c   |   4 +-
  tests/qtest/libqtest.c  |   6 +-
  tests/qtest/npcm7xx_emc-test.c  |   4 +-
  tests/qtest/test-filter-mirror.c|   4 +-
  tests/qtest/test-filter-redirector.c|   8 +-
  tests/qtest/virtio-net-test.c   |  10 +-
  tests/unit/socket-helpers.c |   2 +-
  tests/unit/test-qobject-input-visitor.c |   4 +-
  util/osdep.c|   4 +-
  util/qemu-openpty.c | 139 
  util/qemu-sockets.c |  10 +-
  chardev/meson.build |   4 +-
  qga/meson.build |   2 +-
  qga/vss-win32/install.cpp   |   4 +
  qga/vss-win32/provider.cpp  |   4 +
  scripts/checkpatch.pl   |   2 +-
  tests/qtest/libqos/meson.build  |   1 -
  util/meson.build|   1 -
  83 files changed, 300 insertions(+), 370 deletions(-)
  delete mode 100644 util/qemu-openpty.c






Re: [PATCH v3 0/4] Improve integration of iotests in the meson test harness

2022-02-24 Thread Paolo Bonzini

On 2/23/22 10:38, Thomas Huth wrote:

This way, we can now finally get the output of failed
tests on the console again (unless you're running meson test in verbose
mode, where meson only puts this to the log file - for incomprehensible
reasons),


It's a bug (introduced by yours truly, which perhaps makes it less 
incomprehensible) and there's an open pull request to fix it.


Paolo



Re: [PATCH v2 2/2] block/curl.c: Check error return from curl_easy_setopt()

2022-02-24 Thread Hanna Reitz

On 22.02.22 16:23, Peter Maydell wrote:

Coverity points out that we aren't checking the return value
from curl_easy_setopt() for any of the calls to it we make
in block/curl.c.

Some of these options are documented as always succeeding (e.g.
CURLOPT_VERBOSE) but others have documented failure cases (e.g.
CURLOPT_URL).  For consistency we check every call, even the ones
that theoretically cannot fail.

Fixes: Coverity CID 1459336, 1459482, 1460331
Signed-off-by: Peter Maydell 
---
Changes v1->v2:
  * set the error string in the failure path for the
direct setopt calls in curl_open()
  * fix the failure path in curl_setup_preadv() by putting
the curl_easy_setopt() call in the same if() condition
as the existing curl_multi_add_handle()
---
  block/curl.c | 92 +---
  1 file changed, 58 insertions(+), 34 deletions(-)


Reviewed-by: Hanna Reitz 




[PATCH] hw/intc/armv7m_nvic: Fix typo in comment

2022-02-24 Thread Evgeny Ermakov
Signed-off-by: Evgeny Ermakov 
---
 hw/intc/armv7m_nvic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index 13df002ce4..a08a0fdc50 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -97,7 +97,7 @@ static int nvic_pending_prio(NVICState *s)
  * this is only different in the obscure corner case where guest
  * code has manually deactivated an exception and is about
  * to fail an exception-return integrity check. The definition
- * above is the one from the v8M ARM ARM and is also in line
+ * above is the one from the v8M ARM and is also in line
  * with the behaviour documented for the Cortex-M3.
  */
 static bool nvic_rettobase(NVICState *s)
-- 
2.35.1




[PATCH v2] target/arm: Support PSCI 1.1 and SMCCC 1.0

2022-02-24 Thread Akihiko Odaki
Support the latest PSCI on TCG and HVF. A 64-bit function called from
AArch32 now returns NOT_SUPPORTED, which is necessary to adhere to SMC
Calling Convention 1.0. It is still not compliant with SMCCC 1.3 since
they do not implement mandatory functions.

Signed-off-by: Akihiko Odaki 
---
 hw/arm/boot.c   | 12 +---
 target/arm/cpu.c|  5 +++--
 target/arm/hvf/hvf.c| 27 ++-
 target/arm/kvm-consts.h | 13 +
 target/arm/kvm64.c  |  2 +-
 target/arm/psci.c   | 35 ---
 6 files changed, 80 insertions(+), 14 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index b1e95978f26..0eeef94ceb5 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -488,9 +488,15 @@ static void fdt_add_psci_node(void *fdt)
 }
 
 qemu_fdt_add_subnode(fdt, "/psci");
-if (armcpu->psci_version == 2) {
-const char comp[] = "arm,psci-0.2\0arm,psci";
-qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
+if (armcpu->psci_version == QEMU_PSCI_VERSION_0_2 ||
+armcpu->psci_version == QEMU_PSCI_VERSION_1_1) {
+if (armcpu->psci_version == QEMU_PSCI_VERSION_0_2) {
+const char comp[] = "arm,psci-0.2\0arm,psci";
+qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
+} else {
+const char comp[] = "arm,psci-1.0\0arm,psci-0.2\0arm,psci";
+qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
+}
 
 cpu_off_fn = QEMU_PSCI_0_2_FN_CPU_OFF;
 if (arm_feature(>env, ARM_FEATURE_AARCH64)) {
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 5a9c02a2561..307a83a7bb6 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1110,11 +1110,12 @@ static void arm_cpu_initfn(Object *obj)
  * picky DTB consumer will also provide a helpful error message.
  */
 cpu->dtb_compatible = "qemu,unknown";
-cpu->psci_version = 1; /* By default assume PSCI v0.1 */
+cpu->psci_version = QEMU_PSCI_VERSION_0_1; /* By default assume PSCI v0.1 
*/
 cpu->kvm_target = QEMU_KVM_ARM_TARGET_NONE;
 
 if (tcg_enabled() || hvf_enabled()) {
-cpu->psci_version = 2; /* TCG and HVF implement PSCI 0.2 */
+/* TCG and HVF implement PSCI 1.1 */
+cpu->psci_version = QEMU_PSCI_VERSION_1_1;
 }
 }
 
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index 0dc96560d34..1701fb8bbdb 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -653,7 +653,7 @@ static bool hvf_handle_psci_call(CPUState *cpu)
 
 switch (param[0]) {
 case QEMU_PSCI_0_2_FN_PSCI_VERSION:
-ret = QEMU_PSCI_0_2_RET_VERSION_0_2;
+ret = QEMU_PSCI_VERSION_1_1;
 break;
 case QEMU_PSCI_0_2_FN_MIGRATE_INFO_TYPE:
 ret = QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED; /* No trusted OS */
@@ -721,6 +721,31 @@ static bool hvf_handle_psci_call(CPUState *cpu)
 case QEMU_PSCI_0_2_FN_MIGRATE:
 ret = QEMU_PSCI_RET_NOT_SUPPORTED;
 break;
+case QEMU_PSCI_1_0_FN_PSCI_FEATURES:
+switch (param[1]) {
+case QEMU_PSCI_0_2_FN_PSCI_VERSION:
+case QEMU_PSCI_0_2_FN_MIGRATE_INFO_TYPE:
+case QEMU_PSCI_0_2_FN_AFFINITY_INFO:
+case QEMU_PSCI_0_2_FN64_AFFINITY_INFO:
+case QEMU_PSCI_0_2_FN_SYSTEM_RESET:
+case QEMU_PSCI_0_2_FN_SYSTEM_OFF:
+case QEMU_PSCI_0_1_FN_CPU_ON:
+case QEMU_PSCI_0_2_FN_CPU_ON:
+case QEMU_PSCI_0_2_FN64_CPU_ON:
+case QEMU_PSCI_0_1_FN_CPU_OFF:
+case QEMU_PSCI_0_2_FN_CPU_OFF:
+case QEMU_PSCI_0_1_FN_CPU_SUSPEND:
+case QEMU_PSCI_0_2_FN_CPU_SUSPEND:
+case QEMU_PSCI_0_2_FN64_CPU_SUSPEND:
+case QEMU_PSCI_1_0_FN_PSCI_FEATURES:
+ret = 0;
+break;
+case QEMU_PSCI_0_1_FN_MIGRATE:
+case QEMU_PSCI_0_2_FN_MIGRATE:
+default:
+ret = QEMU_PSCI_RET_NOT_SUPPORTED;
+}
+break;
 default:
 return false;
 }
diff --git a/target/arm/kvm-consts.h b/target/arm/kvm-consts.h
index 580f1c1fee0..241e02562e1 100644
--- a/target/arm/kvm-consts.h
+++ b/target/arm/kvm-consts.h
@@ -77,6 +77,8 @@ MISMATCH_CHECK(QEMU_PSCI_0_1_FN_MIGRATE, KVM_PSCI_FN_MIGRATE);
 #define QEMU_PSCI_0_2_FN64_AFFINITY_INFO QEMU_PSCI_0_2_FN64(4)
 #define QEMU_PSCI_0_2_FN64_MIGRATE QEMU_PSCI_0_2_FN64(5)
 
+#define QEMU_PSCI_1_0_FN_PSCI_FEATURES QEMU_PSCI_0_2_FN(10)
+
 MISMATCH_CHECK(QEMU_PSCI_0_2_FN_CPU_SUSPEND, PSCI_0_2_FN_CPU_SUSPEND);
 MISMATCH_CHECK(QEMU_PSCI_0_2_FN_CPU_OFF, PSCI_0_2_FN_CPU_OFF);
 MISMATCH_CHECK(QEMU_PSCI_0_2_FN_CPU_ON, PSCI_0_2_FN_CPU_ON);
@@ -84,18 +86,21 @@ MISMATCH_CHECK(QEMU_PSCI_0_2_FN_MIGRATE, 
PSCI_0_2_FN_MIGRATE);
 MISMATCH_CHECK(QEMU_PSCI_0_2_FN64_CPU_SUSPEND, PSCI_0_2_FN64_CPU_SUSPEND);
 MISMATCH_CHECK(QEMU_PSCI_0_2_FN64_CPU_ON, PSCI_0_2_FN64_CPU_ON);
 MISMATCH_CHECK(QEMU_PSCI_0_2_FN64_MIGRATE, PSCI_0_2_FN64_MIGRATE);
+MISMATCH_CHECK(QEMU_PSCI_1_0_FN_PSCI_FEATURES, 

[PATCH v2 0/4] target/nios2: Shadow register set, EIC and VIC

2022-02-24 Thread Amir Gonnen
Implement nios2 Shadow register set, EIC and VIC.

Currently nios2 on QEMU contains an internal Interrupt Controller.
The nios2 architecture can support a more powerful External Interrupt
Controller (EIC) instead of the internal, and implements special cpu
features to support it: Shadow register set and External Interrupt
Controller Interface.

This patch series introduces the necessary changes to the nios2 cpu to
support an External Interrupt Controller, and includes a Vectored
Interrupt Controller (VIC) device that can be attach to the EIC.

Following Peter's suggestion in the previous version, I've splitted this
into several independant patches that rely on each other incrementally
and added a board that wires up the VIC:

1. Shadow Register Set support on the nios2 core
2. External Interrupt Controller interface on the nios2 core
3. Vectored Interrupt Controller
4. A board that uses the VIC instead of the default internal interrupt
   controller

Signed-off-by: Amir Gonnen 

Amir Gonnen (4):
  target/nios2: Shadow register set
  target/nios2: Exteral Interrupt Controller (EIC)
  hw/intc: Vectored Interrupt Controller (VIC)
  hw/nios2: Machine with a Vectored Interrupt Controller

 hw/intc/Kconfig   |   4 +
 hw/intc/meson.build   |   1 +
 hw/intc/nios2_vic.c   | 327 ++
 hw/nios2/10m50_devboard.c |  64 +++-
 target/nios2/cpu.c|  59 +--
 target/nios2/cpu.h|  69 +++-
 target/nios2/helper.c |  33 +++-
 target/nios2/helper.h |   3 +
 target/nios2/op_helper.c  |  31 +++-
 target/nios2/translate.c  |  32 +++-
 10 files changed, 597 insertions(+), 26 deletions(-)
 create mode 100644 hw/intc/nios2_vic.c

--
2.25.1


The contents of this email message and any attachments are intended solely for 
the addressee(s) and may contain confidential and/or privileged information and 
may be legally protected from disclosure. If you are not the intended recipient 
of this message or their agent, or if this message has been addressed to you in 
error, please immediately alert the sender by reply email and then delete this 
message and any attachments. If you are not the intended recipient, you are 
hereby notified that any use, dissemination, copying, or storage of this 
message or its attachments is strictly prohibited.



Re: [PATCH v8 1/3] monitor/hmp: add support for flag argument with value

2022-02-24 Thread Dr. David Alan Gilbert
* Markus Armbruster (arm...@redhat.com) wrote:
> Fabian Ebner  writes:
> 
> > Am 09.02.22 um 15:12 schrieb Markus Armbruster:
> >> Fabian Ebner  writes:
> >
> > 8<
> >
> >>> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
> >>> index 3da3f86c6a..a4cb307c8a 100644
> >>> --- a/monitor/monitor-internal.h
> >>> +++ b/monitor/monitor-internal.h
> >>> @@ -63,7 +63,8 @@
> >>>   * '.'  other form of optional type (for 'i' and 'l')
> >>>   * 'b'  boolean
> >>>   *  user mode accepts "on" or "off"
> >>> - * '-'  optional parameter (eg. '-f')
> >>> + * '-'  optional parameter (eg. '-f'); if followed by a 'V', it
> >>> + *  specifies an optional string param (e.g. '-fV' allows 
> >>> '-f foo')
> >>>   *
> >>>   */
> >> 
> >> For what it's worth, getopt() uses ':' after the option character for
> >> "takes an argument".
> >> 
> >
> > Doing that leads to e.g.
> > .args_type  = "protocol:s,password:s,display:-d:,connected:s?",
> > so there's two different kinds of colons now.
> 
> Point.

Yeh, : is probably a bad idea.

> >   It's not a problem
> > functionality-wise AFAICT, but it might not be ideal. Should I still go
> > for it?
> >
> > Also, wouldn't future non-string flag parameters need their own letter
> > too? What about re-using 's' here instead?
> 
> Another good point.
> 
> Dave, what do you think?

Yeh, I'd be happy reusing 's' here.

Dave

> >> Happy to make that tweak in my tree.  But see my review of PATCH 3
> >> first.
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: Analysis of slow distro boots in check-avocado (BootLinuxAarch64.test_virt_tcg*)

2022-02-24 Thread Laszlo Ersek
On 02/24/22 10:10, Igor Mammedov wrote:
> On Wed, 23 Feb 2022 12:50:42 +0100
> Gerd Hoffmann  wrote:
> 
>>   Hi,
>>
 Also, "make install" installs these EDK2 images, which doesn't
 seem like the right thing for "this is only for one test case".  
>>>
>>> Well I'd prefer we never had them installed. Today I don't remember
>>> why it ended that way.  
>>
>> Probably to behave simliar to other firmware, which makes sense to me.
>>
>> So maybe do non-debug builds for install and debug builds for the test
>> cases?  Why do the test cases need debug builds btw?
> 
> wrt bios-tables-test, it doesn't need debug version and should work fine
> with non-debug builds.
> (if memory serves me right it's this test case that prompted to add
> uefi images to qemu)

Yes, commit 77db55fc8155 came first, commit 536d2173b2b3 came later.

Thanks
Laszlo




[PATCH v2] Added parameter to take screenshot with screendump as PNG

2022-02-24 Thread Kshitij Suri
Currently screendump only supports PPM format, which is un-compressed and not
standard. Added a "format" parameter to qemu monitor screendump capabilites
to support PNG image capture using libpng. The param was added in QAPI schema
of screendump present in ui.json along with png_save() function which converts
pixman_image to PNG. HMP command equivalent was also modified to support the
feature.

Example usage:
{ "execute": "screendump", "arguments": { "filename": "/tmp/image", 
"format":"png" } }

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/718

Signed-off-by: Kshitij Suri 
---
diff to v1:
  - Removed repeated alpha conversion operation.
  - Modified logic to mirror png conversion in vnc-enc-tight.c file.
  - Added a new CONFIG_PNG parameter for libpng support.
  - Changed input format to enum instead of string.

 hmp-commands.hx| 11 +++---
 meson.build| 13 +--
 meson_options.txt  |  2 +
 monitor/hmp-cmds.c |  8 +++-
 qapi/ui.json   | 24 ++--
 ui/console.c   | 97 --
 6 files changed, 139 insertions(+), 16 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 70a9136ac2..e43c9954b5 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -244,17 +244,18 @@ ERST
 
 {
 .name   = "screendump",
-.args_type  = "filename:F,device:s?,head:i?",
-.params = "filename [device [head]]",
-.help   = "save screen from head 'head' of display device 'device' 
"
-  "into PPM image 'filename'",
+.args_type  = "filename:F,device:s?,head:i?,format:f?",
+.params = "filename [device [head]] [format]",
+.help   = "save screen from head 'head' of display device 'device'"
+  "in specified format 'format' as image 'filename'."
+  "Currently only 'png' and 'ppm' formats are supported.",
 .cmd= hmp_screendump,
 .coroutine  = true,
 },
 
 SRST
 ``screendump`` *filename*
-  Save screen into PPM image *filename*.
+  Save screen as an image *filename*.
 ERST
 
 {
diff --git a/meson.build b/meson.build
index 8df40bfac4..fd550c01ec 100644
--- a/meson.build
+++ b/meson.build
@@ -1112,13 +1112,16 @@ if gtkx11.found()
   x11 = dependency('x11', method: 'pkg-config', required: gtkx11.found(),
kwargs: static_kwargs)
 endif
-vnc = not_found
 png = not_found
+png = dependency('libpng', required: get_option('png'),
+   method: 'pkg-config', kwargs: static_kwargs)
+vnc = not_found
+vnc_png = not_found
 jpeg = not_found
 sasl = not_found
 if get_option('vnc').allowed() and have_system
   vnc = declare_dependency() # dummy dependency
-  png = dependency('libpng', required: get_option('vnc_png'),
+  vnc_png = dependency('libpng', required: get_option('vnc_png'),
method: 'pkg-config', kwargs: static_kwargs)
   jpeg = dependency('libjpeg', required: get_option('vnc_jpeg'),
 method: 'pkg-config', kwargs: static_kwargs)
@@ -1537,9 +1540,10 @@ config_host_data.set('CONFIG_TPM', have_tpm)
 config_host_data.set('CONFIG_USB_LIBUSB', libusb.found())
 config_host_data.set('CONFIG_VDE', vde.found())
 config_host_data.set('CONFIG_VHOST_USER_BLK_SERVER', 
have_vhost_user_blk_server)
+config_host_data.set('CONFIG_PNG', png.found())
 config_host_data.set('CONFIG_VNC', vnc.found())
 config_host_data.set('CONFIG_VNC_JPEG', jpeg.found())
-config_host_data.set('CONFIG_VNC_PNG', png.found())
+config_host_data.set('CONFIG_VNC_PNG', vnc_png.found())
 config_host_data.set('CONFIG_VNC_SASL', sasl.found())
 config_host_data.set('CONFIG_VIRTFS', have_virtfs)
 config_host_data.set('CONFIG_VTE', vte.found())
@@ -3579,11 +3583,12 @@ summary_info += {'curses support':curses}
 summary_info += {'virgl support': virgl}
 summary_info += {'curl support':  curl}
 summary_info += {'Multipath support': mpathpersist}
+summary_info += {'PNG support':   png}
 summary_info += {'VNC support':   vnc}
 if vnc.found()
   summary_info += {'VNC SASL support':  sasl}
   summary_info += {'VNC JPEG support':  jpeg}
-  summary_info += {'VNC PNG support':   png}
+  summary_info += {'VNC PNG support':   vnc_png}
 endif
 if targetos not in ['darwin', 'haiku', 'windows']
   summary_info += {'OSS support': oss}
diff --git a/meson_options.txt b/meson_options.txt
index 52b11cead4..88148dec6c 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -177,6 +177,8 @@ option('vde', type : 'feature', value : 'auto',
description: 'vde network backend support')
 option('virglrenderer', type : 'feature', value : 'auto',
description: 'virgl rendering support')
+option('png', type : 'feature', value : 'auto',
+   description: 'PNG support with libpng')
 option('vnc', type : 'feature', value : 'auto',
description: 'VNC server')
 option('vnc_jpeg', type : 'feature', value : 'auto',
diff --git a/monitor/hmp-cmds.c 

[PATCH v2 09/12] mos6522: record last_irq_levels in mos6522_set_irq()

2022-02-24 Thread Mark Cave-Ayland
To detect edge-triggered IRQs it is necessary to store the last state of each
IRQ in a last_irq_levels bitmap.

Note: this is a migration break for machines which use mos6522 instances which
are g3beige/mac99 (PPC) and q800 (m68k).

Signed-off-by: Mark Cave-Ayland 
---
 hw/misc/mos6522.c | 11 +--
 include/hw/misc/mos6522.h |  1 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/misc/mos6522.c b/hw/misc/mos6522.c
index 38b5877ffa..60b3b693d0 100644
--- a/hw/misc/mos6522.c
+++ b/hw/misc/mos6522.c
@@ -72,6 +72,12 @@ static void mos6522_set_irq(void *opaque, int n, int level)
 }
 
 mos6522_update_irq(s);
+
+if (level) {
+s->last_irq_levels |= 1 << n;
+} else {
+s->last_irq_levels &= ~(1 << n);
+}
 }
 
 static uint64_t get_counter_value(MOS6522State *s, MOS6522Timer *ti)
@@ -544,8 +550,8 @@ static const VMStateDescription vmstate_mos6522_timer = {
 
 const VMStateDescription vmstate_mos6522 = {
 .name = "mos6522",
-.version_id = 0,
-.minimum_version_id = 0,
+.version_id = 1,
+.minimum_version_id = 1,
 .fields = (VMStateField[]) {
 VMSTATE_UINT8(a, MOS6522State),
 VMSTATE_UINT8(b, MOS6522State),
@@ -556,6 +562,7 @@ const VMStateDescription vmstate_mos6522 = {
 VMSTATE_UINT8(pcr, MOS6522State),
 VMSTATE_UINT8(ifr, MOS6522State),
 VMSTATE_UINT8(ier, MOS6522State),
+VMSTATE_UINT8(last_irq_levels, MOS6522State),
 VMSTATE_STRUCT_ARRAY(timers, MOS6522State, 2, 0,
  vmstate_mos6522_timer, MOS6522Timer),
 VMSTATE_END_OF_LIST()
diff --git a/include/hw/misc/mos6522.h b/include/hw/misc/mos6522.h
index 846ded64b7..9ec9b57239 100644
--- a/include/hw/misc/mos6522.h
+++ b/include/hw/misc/mos6522.h
@@ -131,6 +131,7 @@ struct MOS6522State {
 uint64_t frequency;
 
 qemu_irq irq;
+uint8_t last_irq_levels;
 };
 
 #define TYPE_MOS6522 "mos6522"
-- 
2.20.1




[PATCH v2 05/12] mos6522: remove update_irq() and set_sr_int() methods from MOS6522DeviceClass

2022-02-24 Thread Mark Cave-Ayland
Now that the mos6522 IRQs are managed using standard qdev gpios these methods
are no longer required.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Peter Maydell 
---
 hw/misc/mos6522.c | 9 -
 include/hw/misc/mos6522.h | 2 --
 2 files changed, 11 deletions(-)

diff --git a/hw/misc/mos6522.c b/hw/misc/mos6522.c
index 6be6853dc2..4c3147a7d1 100644
--- a/hw/misc/mos6522.c
+++ b/hw/misc/mos6522.c
@@ -208,13 +208,6 @@ static void mos6522_timer2(void *opaque)
 mos6522_update_irq(s);
 }
 
-static void mos6522_set_sr_int(MOS6522State *s)
-{
-trace_mos6522_set_sr_int();
-s->ifr |= SR_INT;
-mos6522_update_irq(s);
-}
-
 static uint64_t mos6522_get_counter_value(MOS6522State *s, MOS6522Timer *ti)
 {
 return muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - ti->load_time,
@@ -527,10 +520,8 @@ static void mos6522_class_init(ObjectClass *oc, void *data)
 dc->vmsd = _mos6522;
 device_class_set_props(dc, mos6522_properties);
 mdc->parent_reset = dc->reset;
-mdc->set_sr_int = mos6522_set_sr_int;
 mdc->portB_write = mos6522_portB_write;
 mdc->portA_write = mos6522_portA_write;
-mdc->update_irq = mos6522_update_irq;
 mdc->get_timer1_counter_value = mos6522_get_counter_value;
 mdc->get_timer2_counter_value = mos6522_get_counter_value;
 mdc->get_timer1_load_time = mos6522_get_load_time;
diff --git a/include/hw/misc/mos6522.h b/include/hw/misc/mos6522.h
index f38ae2b0f0..f0a614898e 100644
--- a/include/hw/misc/mos6522.h
+++ b/include/hw/misc/mos6522.h
@@ -140,10 +140,8 @@ struct MOS6522DeviceClass {
 DeviceClass parent_class;
 
 DeviceReset parent_reset;
-void (*set_sr_int)(MOS6522State *dev);
 void (*portB_write)(MOS6522State *dev);
 void (*portA_write)(MOS6522State *dev);
-void (*update_irq)(MOS6522State *dev);
 /* These are used to influence the CUDA MacOS timebase calibration */
 uint64_t (*get_timer1_counter_value)(MOS6522State *dev, MOS6522Timer *ti);
 uint64_t (*get_timer2_counter_value)(MOS6522State *dev, MOS6522Timer *ti);
-- 
2.20.1




Re: [PATCH v4 03/18] block/block-copy: block_copy_state_new(): add bitmap parameter

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

This will be used in the following commit to bring "incremental" mode
to copy-before-write filter.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block-copy.h |  1 +
  block/block-copy.c | 14 +-
  block/copy-before-write.c  |  2 +-
  3 files changed, 15 insertions(+), 2 deletions(-)


Reviewed-by: Hanna Reitz 




Re: [PATCH 08/11] mos6522: add "info via" HMP command for debugging

2022-02-24 Thread Mark Cave-Ayland

On 22/02/2022 15:03, Dr. David Alan Gilbert wrote:


* Mark Cave-Ayland (mark.cave-ayl...@ilande.co.uk) wrote:

On 21/02/2022 17:11, Daniel P. Berrangé wrote:


On Sun, Feb 20, 2022 at 05:18:33PM +, Mark Cave-Ayland wrote:

On 08/02/2022 13:10, Daniel P. Berrangé wrote:


On Tue, Feb 08, 2022 at 01:06:59PM +, Mark Cave-Ayland wrote:

On 08/02/2022 12:49, Daniel P. Berrangé wrote:


I was under the impression that monitor_register_hmp_info_hrt() does all the
magic here i.e. it declares the underlying QMP command with an x- prefix and
effectively encapsulates the text field in a way that says "this is an
unreliable text opaque for humans"?


The monitor_register_hmp_info_hrt only does the HMP glue side, and
that's only needed if you must dynamically register the HMP command.
For statically registered commands set '.cmd_info_hrt' directly in
the hml-commands-info.hx for the HMP side.


If a qapi/ schema is needed could you explain what it should look like for
this example and where it should go? Looking at the existing .json files I
can't immediately see one which is the right place for this to live.


Take a look in qapi/machine.json for anyof the 'x-query-' commands
there. The QAPI bit is fairly simple.

if you want to see an illustration of what's different from a previous
pure HMP impl, look at:

  commit dd98234c059e6bdb05a52998270df6d3d990332e
  Author: Daniel P. Berrangé 
  Date:   Wed Sep 8 10:35:43 2021 +0100

qapi: introduce x-query-roms QMP command


I see, thanks for the reference. So qapi/machine.json would be the right
place to declare the QMP part even for a specific device?

Even this approach still wouldn't work in its current form though, since as
mentioned in my previous email it seems that only the target CONFIG_*
defines and not the device CONFIG_* defines are present when processing
hmp-commands-info.hx.


Yeah, that's where the pain comes in.  While QAPI schema can be made
conditional on a few CONFIG_* parameters - basically those derived
from global configure time options, it is impossible for this to be
with with target specific options like the device CONFIG_* defines.

This is why I suggested in my othuer reply that it would need to be
done with a generic 'info dev-debug' / 'x-query-dev-debug' command
that can be registered unconditionally, and then individual devices
plug into it.


After some more experiments this afternoon I still seem to be falling
through the gaps on this one. This is based upon my understanding that all
new HMP commands should use a QMP HumanReadableText implementation and the
new command should be restricted according to target.

Currently I am working with this change to hmp-commands-info.hx and
qapi/misc-target.json:


[snip]

i.e. qmp_marshal_output_HumanReadableText() isn't protected by the #if
TARGET guards and since HumanReadableText is only used by the new
qmp_x_query_via() functionality then the compiler complains and aborts the
compilation.

Possibly this is an error in the QAPI generator for types hidden behind
commands using "if"? Otherwise I'm not sure what is the best way to proceed,
so I'd be grateful for some further pointers.


Yes, this is pretty much what I expect and exactly what I hit with
other target specific commands.

That's why I suggested something like a general 'x-device-debug' command
that is NOT conditionalized in QAPI, against which dev impls can register
a callback to provide detailed reporting, instead of a device type specific
command.


Ah so this is a known issue with this approach then. David mentioned earlier
in the thread that he'd be okay with a HMP command if it was useful and
restricted to the required targets, so would it be okay to add "info via"
for now as just a (non-QMP wrapped) HMP info command if I can get that to
work?


I still am from an HMP point of view; it sounds like the right way in
the future is to get the info devices or whatever;  I suggest you keep
it as close to a QMP implementation as possible, still with the
HumanReadableText stuff.
(Others might still be nervous about an HMP special; but I don't see
it's worth holding this trivial stuff up for it).


I've just posted a v2 and what I've done there is to manually add a hmp_info_via() 
wrapper (taken almost verbatim from 
https://www.qemu.org/docs/master/devel/writing-monitor-commands.html#id1) and added 
it to both include/monitor/hmp-target.h and include/hw/misc/mos6522.h which passes a 
Gitlab run.


I think it's worth having as an in-tree reference for when a more formal HMP/QMP 
per-device as opposed to per-target infrastructure arrives.



ATB,

Mark.



Re: [PATCH v4 10/18] block/io: introduce block driver snapshot-access API

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

Add new block driver handlers and corresponding generic wrappers.
It will be used to allow copy-before-write filter to provide
reach fleecing interface in further commit.

In future this approach may be used to allow reading qcow2 interanal


(s/interanal/internal/)


snaphots, for example to export them through NBD.


Ooh, that’s indeed quite nice.

Raises the question of how users are to select a specific snapshot in 
qcow2 file, but your next patch answers that question: The snapshot 
access driver is to receive a runtime option for this, and the API is to 
be extended to allow for selecting a specific snapshot.  Sounds good!



Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block_int.h | 27 +++
  block/io.c| 69 +++
  2 files changed, 96 insertions(+)


Yes, really nice.  Thanks.

Reviewed-by: Hanna Reitz 




Re: [PATCH v4 08/18] block/dirty-bitmap: introduce bdrv_dirty_bitmap_status()

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

Add a convenient function similar with bdrv_block_status() to get
status of dirty bitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/dirty-bitmap.h |  2 ++
  include/qemu/hbitmap.h   | 12 
  block/dirty-bitmap.c |  6 ++
  util/hbitmap.c   | 33 +
  4 files changed, 53 insertions(+)


Reviewed-by: Hanna Reitz 




Re: [PATCH] hw/intc/armv7m_nvic: Fix typo in comment

2022-02-24 Thread Peter Maydell
On Thu, 24 Feb 2022 at 12:22, Evgeny Ermakov  wrote:
>
> Signed-off-by: Evgeny Ermakov 
> ---
>  hw/intc/armv7m_nvic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
> index 13df002ce4..a08a0fdc50 100644
> --- a/hw/intc/armv7m_nvic.c
> +++ b/hw/intc/armv7m_nvic.c
> @@ -97,7 +97,7 @@ static int nvic_pending_prio(NVICState *s)
>   * this is only different in the obscure corner case where guest
>   * code has manually deactivated an exception and is about
>   * to fail an exception-return integrity check. The definition
> - * above is the one from the v8M ARM ARM and is also in line
> + * above is the one from the v8M ARM and is also in line
>   * with the behaviour documented for the Cortex-M3.
>   */

This is not a typo. The "ARM ARM" (or "Arm ARM" these days if
you want to follow the official corporate name capitalization)
is the standard abbreviated way to refer to the
Arm Architecture Reference Manual. "git grep -i 'arm arm'"
finds over 50 uses of it in various comments.

thanks
-- PMM



Re: [PATCH v2] target/arm: Support PSCI 1.1 and SMCCC 1.0

2022-02-24 Thread Peter Maydell
On Sun, 13 Feb 2022 at 03:58, Akihiko Odaki  wrote:
>
> Support the latest PSCI on TCG and HVF. A 64-bit function called from
> AArch32 now returns NOT_SUPPORTED, which is necessary to adhere to SMC
> Calling Convention 1.0. It is still not compliant with SMCCC 1.3 since
> they do not implement mandatory functions.
>
> Signed-off-by: Akihiko Odaki 
> ---

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.0
for any user-visible changes.

(I noticed while reviewing this that we report KVM's PSCI via
the DTB as only 0.2 even if KVM's actually implementing better
than that; I'll write a patch to clean that up.)

-- PMM



[PATCH v3 2/2] virt: vmgenid: introduce driver for reinitializing RNG on VM fork

2022-02-24 Thread Jason A. Donenfeld
VM Generation ID is a feature from Microsoft, described at
, and supported by
Hyper-V and QEMU. Its usage is described in Microsoft's RNG whitepaper,
, as:

If the OS is running in a VM, there is a problem that most
hypervisors can snapshot the state of the machine and later rewind
the VM state to the saved state. This results in the machine running
a second time with the exact same RNG state, which leads to serious
security problems.  To reduce the window of vulnerability, Windows
10 on a Hyper-V VM will detect when the VM state is reset, retrieve
a unique (not random) value from the hypervisor, and reseed the root
RNG with that unique value.  This does not eliminate the
vulnerability, but it greatly reduces the time during which the RNG
system will produce the same outputs as it did during a previous
instantiation of the same VM state.

Linux has the same issue, and given that vmgenid is supported already by
multiple hypervisors, we can implement more or less the same solution.
So this commit wires up the vmgenid ACPI notification to the RNG's newly
added add_vmfork_randomness() function.

It can be used from qemu via the `-device vmgenid,guid=auto` parameter.
After setting that, use `savevm` in the monitor to save the VM state,
then quit QEMU, start it again, and use `loadvm`. That will trigger this
driver's notify function, which hands the new UUID to the RNG. This is
described in .
And there are hooks for this in libvirt as well, described in
.

Note, however, that the treatment of this as a UUID is considered to be
an accidental QEMU nuance, per
,
so this driver simply treats these bytes as an opaque 128-bit binary
blob, as per the spec. This doesn't really make a difference anyway,
considering that's how it ends up when handed to the RNG in the end.

This driver builds on prior work from Adrian Catangiu at Amazon, and it
is my hope that that team can resume maintenance of this driver.

Cc: Adrian Catangiu 
Cc: Laszlo Ersek 
Cc: Daniel P. Berrangé 
Cc: Dominik Brodowski 
Cc: Ard Biesheuvel 
Signed-off-by: Jason A. Donenfeld 
---
 drivers/virt/Kconfig   |   9 +++
 drivers/virt/Makefile  |   1 +
 drivers/virt/vmgenid.c | 121 +
 3 files changed, 131 insertions(+)
 create mode 100644 drivers/virt/vmgenid.c

diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index 8061e8ef449f..d3276dc2095c 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -13,6 +13,15 @@ menuconfig VIRT_DRIVERS
 
 if VIRT_DRIVERS
 
+config VMGENID
+   tristate "Virtual Machine Generation ID driver"
+   default y
+   depends on ACPI
+   help
+ Say Y here to use the hypervisor-provided Virtual Machine Generation 
ID
+ to reseed the RNG when the VM is cloned. This is highly recommended if
+ you intend to do any rollback / cloning / snapshotting of VMs.
+
 config FSL_HV_MANAGER
tristate "Freescale hypervisor management driver"
depends on FSL_SOC
diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
index 3e272ea60cd9..108d0ffcc9aa 100644
--- a/drivers/virt/Makefile
+++ b/drivers/virt/Makefile
@@ -4,6 +4,7 @@
 #
 
 obj-$(CONFIG_FSL_HV_MANAGER)   += fsl_hypervisor.o
+obj-$(CONFIG_VMGENID)  += vmgenid.o
 obj-y  += vboxguest/
 
 obj-$(CONFIG_NITRO_ENCLAVES)   += nitro_enclaves/
diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
new file mode 100644
index ..5da4dc8f25e3
--- /dev/null
+++ b/drivers/virt/vmgenid.c
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Virtual Machine Generation ID driver
+ *
+ * Copyright (C) 2022 Jason A. Donenfeld . All Rights 
Reserved.
+ * Copyright (C) 2020 Amazon. All rights reserved.
+ * Copyright (C) 2018 Red Hat Inc. All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+ACPI_MODULE_NAME("vmgenid");
+
+enum { VMGENID_SIZE = 16 };
+
+static struct {
+   u8 this_id[VMGENID_SIZE];
+   u8 *next_id;
+} state;
+
+static int vmgenid_acpi_add(struct acpi_device *device)
+{
+   struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER };
+   union acpi_object *pss;
+   phys_addr_t phys_addr;
+   acpi_status status;
+   int ret = 0;
+
+   if (!device)
+   return -EINVAL;
+
+   status = acpi_evaluate_object(device->handle, "ADDR", NULL, );
+   if (ACPI_FAILURE(status)) {
+   ACPI_EXCEPTION((AE_INFO, status, "Evaluating ADDR"));
+   return -ENODEV;
+   }
+   pss = buffer.pointer;
+   if (!pss || pss->type != ACPI_TYPE_PACKAGE || pss->package.count != 2 ||
+   pss->package.elements[0].type != 

Re: [PATCH v4 14/18] iotests.py: add qemu_io_pipe_and_status()

2022-02-24 Thread Vladimir Sementsov-Ogievskiy

24.02.2022 15:52, Hanna Reitz wrote:

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

Add helper that returns both status and output, to be used in the
following commit

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/iotests.py | 4 
  1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 6ba65eb1ff..23bc6f686f 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -278,6 +278,10 @@ def qemu_io(*args):
  '''Run qemu-io and return the stdout data'''
  return qemu_tool_pipe_and_status('qemu-io', qemu_io_wrap_args(args))[0]
+def qemu_io_pipe_and_status(*args):
+    args = qemu_io_args + list(args)
+    return qemu_tool_pipe_and_status('qemu-io', args)


Shouldn’t we use qemu_io_wrap_args() here, like above?  The next patch adds a 
caller that passes `'-f', 'raw'` to it, which kind of implies to me that 
qemu_io_wrap_args() would be better.


Will do




+
  def qemu_io_log(*args):
  result = qemu_io(*args)
  log(result, filters=[filter_testfiles, filter_qemu_io])





--
Best regards,
Vladimir



Re: [PATCH v2 03/12] mac_via: use IFR bit flag constants for VIA2 IRQs

2022-02-24 Thread Philippe Mathieu-Daudé

On 24/2/22 12:59, Mark Cave-Ayland wrote:

This allows us to easily see how the physical control lines are mapped to the
IFR bit flags.

Signed-off-by: Mark Cave-Ayland 
---
  include/hw/misc/mac_via.h | 19 +--
  1 file changed, 9 insertions(+), 10 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH v2 01/12] mos6522: add defines for IFR bit flags

2022-02-24 Thread Philippe Mathieu-Daudé

On 24/2/22 12:59, Mark Cave-Ayland wrote:

These are intended to make it easier to see how the physical control lines
are wired for each instance.

Signed-off-by: Mark Cave-Ayland 
---
  include/hw/misc/mos6522.h | 22 +++---
  1 file changed, 15 insertions(+), 7 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH v2 0/2] block/curl: check error return from curl_easy_setopt()

2022-02-24 Thread Hanna Reitz

On 22.02.22 16:23, Peter Maydell wrote:

Coverity points out that we aren't checking the return value
from curl_easy_setopt() for any of the calls to it we make
in block/curl.c.

Tested with 'make check' and with some basic smoke test command lines
suggested by Dan:

  qemu-img info 
https://cloud.debian.org/images/cloud/buster/daily/latest/debian-10-nocloud-amd64-daily.qcow2
  qemu-img info --image-opts 
driver=qcow2,file.driver=https,file.url=https://cloud.debian.org/images/cloud/buster/daily/latest/debian-10-nocloud-amd64-daily.qcow2

Changes v1->v2:
  * new patch 1 which fixes a missing "set the error string" for
when curl_init_state() returns failure, since we're about to
add more cases when that function can fail
  * set the error string in the failure path for the direct setopt
calls in curl_open()
  * fix the failure path in curl_setup_preadv() by putting
the curl_easy_setopt() call in the same if() condition
as the existing curl_multi_add_handle()


Thanks, applied to my block branch:

https://gitlab.com/hreitz/qemu/-/commits/block

Hanna




Re: [PATCH v4 17/18] qapi: backup: add immutable-source parameter

2022-02-24 Thread Vladimir Sementsov-Ogievskiy

24.02.2022 16:05, Hanna Reitz wrote:

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

We are on the way to implement internal-backup with fleecing scheme,
which includes backup job copying from fleecing block driver node
(which is target of copy-before-write filter) to final target of
backup. This job doesn't need own filter, as fleecing block driver node
is a kind of snapshot, it's immutable from reader point of view.

Let's add a parameter for backup to not insert filter but instead
unshare writes on source. This way backup job becomes a simple copying
process.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  qapi/block-core.json  | 11 ++-
  include/block/block_int.h |  1 +
  block/backup.c    | 61 +++
  block/replication.c   |  2 +-
  blockdev.c    |  1 +
  5 files changed, 69 insertions(+), 7 deletions(-)


I’m not really technically opposed to this, but I wonder what the actual 
benefit of this is.  It sounds like the only benefit is that we don’t need a 
filter driver, but what’s the problem with such a filter driver?


Hmm. Yes, that's the only benefit: less extra components -> more stability.

But I doubt now does it really worth extra parameter.. More parameters that 
actually change nothing for the user -> less stability :)

Ok, I think I at least should postpone it for now, this series is too fat even 
without this patch.

The only possible problem - will permission conflict happen in the next test 
without this patch? But if it will, the solution should exist to solve it 
without user interaction. I'll check and try to avoid this new parameter.



(And if we just want to copy data off of a immutable node, I personally would 
go for the mirror job instead, but it isn’t like I could give good technical 
reasons for that personal bias.)



I still hope that in far good future mirror will work through block/block-copy 
like backup, and there would be no difference what to use for immutable source 
copying.


Thanks a lot for reviewing!

--
Best regards,
Vladimir



Re: [PATCH RFC v1 0/2] VM fork detection for RNG

2022-02-24 Thread Daniel P . Berrangé
On Thu, Feb 24, 2022 at 09:22:50AM +0100, Laszlo Ersek wrote:
> (+Daniel, +Rich)
> 
> On 02/23/22 17:08, Jason A. Donenfeld wrote:
> > On Wed, Feb 23, 2022 at 2:12 PM Jason A. Donenfeld  wrote:
> >> second patch is the reason this is just an RFC: it's a cleanup of the
> >> ACPI driver from last year, and I don't really have much experience
> >> writing, testing, debugging, or maintaining these types of drivers.
> >> Ideally this thread would yield somebody saying, "I see the intent of
> >> this; I'm happy to take over ownership of this part." That way, I can
> >> focus on the RNG part, and whoever steps up for the paravirt ACPI part
> >> can focus on that.
> 
> > (It appears there's a bug in QEMU which prevents
> > the GUID from being reinitialized when running `loadvm` without
> > quitting first; I suppose this should be discussed with QEMU
> > upstream.)
> 
> That's not (necessarily) a bug; see the end of the above-linked QEMU
> document:
> 
> "There are no known use cases for changing the GUID once QEMU is
> running, and adding this capability would greatly increase the complexity."

IIRC this part of the QEMU doc was making an implicit assumption
about the way QEMU is to be used by mgmt apps doing snapshots.

Instead of using the 'loadvm' command on the existing running QEMU
process, the doc seems to tacitly expect the management app will
throwaway the existing QEMU process and spawn a brand new QEMU
process to load the snapshot into, thus getting the new GUID on
the QEMU command line. There are some downsides with doing this
as compared  to running 'loadvm' in the existing QEMU, most
notably the user's VNC/SPICE console session gets interrupted.
I guess the ease of impl for QEMU was more compelling though.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2] hw/i2c: flatten pca954x mux device

2022-02-24 Thread Peter Maydell
On Wed, 2 Feb 2022 at 17:57, Patrick Venture  wrote:
>
> Previously this device created N subdevices which each owned an i2c bus.
> Now this device simply owns the N i2c busses directly.
>
> Tested: Verified devices behind mux are still accessible via qmp and i2c
> from within an arm32 SoC.
>
> Reviewed-by: Hao Wu 
> Signed-off-by: Patrick Venture 
> Reviewed-by: Philippe Mathieu-Daudé 
> Tested-by: Philippe Mathieu-Daudé 
> ---
> v2: explicitly create an incrementing name for the i2c busses (channels).
> ---

Applied to target-arm.next, thanks.

Apologies for not picking this up earlier, the v2 got lost in the
side-conversation about spam filtering. (Blame gmail for not
doing email threading properly if you like :-))

-- PMM



Re: [PATCH RFC v1 0/2] VM fork detection for RNG

2022-02-24 Thread Alexander Graf


On 24.02.22 11:43, Daniel P. Berrangé wrote:

On Thu, Feb 24, 2022 at 09:53:59AM +0100, Alexander Graf wrote:

Hey Jason,

On 23.02.22 14:12, Jason A. Donenfeld wrote:

This small series picks up work from Amazon that seems to have stalled
out later year around this time: listening for the vmgenid ACPI
notification, and using it to "do something." Last year, that something
involved a complicated userspace mmap chardev, which seems frought with
difficulty. This year, I have something much simpler in mind: simply
using those ACPI notifications to tell the RNG to reinitialize safely,
so we don't repeat random numbers in cloned, forked, or rolled-back VM
instances.

This series consists of two patches. The first is a rather
straightforward addition to random.c, which I feel fine about. The
second patch is the reason this is just an RFC: it's a cleanup of the
ACPI driver from last year, and I don't really have much experience
writing, testing, debugging, or maintaining these types of drivers.
Ideally this thread would yield somebody saying, "I see the intent of
this; I'm happy to take over ownership of this part." That way, I can
focus on the RNG part, and whoever steps up for the paravirt ACPI part
can focus on that.

As a final note, this series intentionally does _not_ focus on
notification of these events to userspace or to other kernel consumers.
Since these VM fork detection events first need to hit the RNG, we can
later talk about what sorts of notifications or mmap'd counters the RNG
should be making accessible to elsewhere. But that's a different sort of
project and ties into a lot of more complicated concerns beyond this
more basic patchset. So hopefully we can keep the discussion rather
focused here to this ACPI business.


The main problem with VMGenID is that it is inherently racy. There will
always be a (short) amount of time where the ACPI notification is not
processed, but the VM could use its RNG to for example establish TLS
connections.

Hence we as the next step proposed a multi-stage quiesce/resume mechanism
where the system is aware that it is going into suspend - can block network
connections for example - and only returns to a fully functional state after
an unquiesce phase:

   https://github.com/systemd/systemd/issues/20222

The downside of course is precisely that the guest now needs to be aware
and involved every single time a snapshot is taken.

Currently with virt the act of taking a snapshot can often remain invisible
to the VM with no functional effect on the guest OS or its workload, and
the host OS knows it can complete a snapshot in a specific timeframe. That
said, this transparency to the VM is precisely the cause of the race
condition described.

With guest involvement to quiesce the bulk of activity for time period,
there is more likely to be a negative impact on the guest workload. The
guest admin likely needs to be more explicit about exactly when in time
it is reasonable to take a snapshot to mitigate the impact.

The host OS snapshot operations are also now dependant on co-operation
of a guest OS that has to be considered to be potentially malicious, or
at least crashed/non-responsive. The guest OS also needs a way to receive
the triggers for snapshot capture and restore, most likely via an extension
to something like the QEMU guest agent or an equivalent for othuer
hypervisors.



What you describe sounds almost exactly like pressing a power button on 
modern systems. You don't just kill the power line, you press a button 
and wait for the guest to acknowledge that it's ready.


Maybe the real answer to all of this is S3: Suspend to RAM. You press 
the suspend button, the guest can prepare for sleep (quiesce!) and the 
next time you run, it can check whether VMGenID changed and act accordingly.




Despite the above, I'm not against the idea of co-operative involvement
of the guest OS in the acts of taking & restoring snapshots. I can't
see any other proposals so far that can reliably eliminate the races
in the general case, from the kernel right upto user applications.
So I think it is neccessary to have guest cooperative snapshotting.


What exact use case do you have in mind for the RNG/VMGenID update? Can you
think of situations where the race is not an actual concern?

Lets assume we do take the approach described in that systemd bug and
have a co-operative snapshot process. If the hypervisor does the right
thing and guest owners install the right things, they'll have a race
free solution that works well in normal operation. That's good.


Realistically though, it is never going to be universally and reliably
put into practice. So what is our attitude to cases where the preferred
solution isn't availble and/or operative ?


There are going to be users who continue to build their guest disk images
without the QEMU guest agent (or equivalent for whatever hypervisor they
run on) installed because they don't know any better. Or where the guest
agent is 

[PATCH v2 07/12] mos6522: add register names to register read/write trace events

2022-02-24 Thread Mark Cave-Ayland
This helps to follow how the guest is programming the mos6522 when debugging.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Peter Maydell 
---
 hw/misc/mos6522.c| 10 --
 hw/misc/trace-events |  4 ++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/misc/mos6522.c b/hw/misc/mos6522.c
index 093cc83dcf..aaae195d63 100644
--- a/hw/misc/mos6522.c
+++ b/hw/misc/mos6522.c
@@ -36,6 +36,12 @@
 #include "qemu/module.h"
 #include "trace.h"
 
+
+static const char *mos6522_reg_names[16] = {
+"ORB", "ORA", "DDRB", "DDRA", "T1CL", "T1CH", "T1LL", "T1LH",
+"T2CL", "T2CH", "SR", "ACR", "PCR", "IFR", "IER", "ANH"
+};
+
 /* XXX: implement all timer modes */
 
 static void mos6522_timer1_update(MOS6522State *s, MOS6522Timer *ti,
@@ -310,7 +316,7 @@ uint64_t mos6522_read(void *opaque, hwaddr addr, unsigned 
size)
 }
 
 if (addr != VIA_REG_IFR || val != 0) {
-trace_mos6522_read(addr, val);
+trace_mos6522_read(addr, mos6522_reg_names[addr], val);
 }
 
 return val;
@@ -321,7 +327,7 @@ void mos6522_write(void *opaque, hwaddr addr, uint64_t val, 
unsigned size)
 MOS6522State *s = opaque;
 MOS6522DeviceClass *mdc = MOS6522_GET_CLASS(s);
 
-trace_mos6522_write(addr, val);
+trace_mos6522_write(addr, mos6522_reg_names[addr], val);
 
 switch (addr) {
 case VIA_REG_B:
diff --git a/hw/misc/trace-events b/hw/misc/trace-events
index 1c373dd0a4..c1ea57de31 100644
--- a/hw/misc/trace-events
+++ b/hw/misc/trace-events
@@ -95,8 +95,8 @@ imx7_gpr_write(uint64_t offset, uint64_t value) "addr 0x%08" 
PRIx64 "value 0x%08
 mos6522_set_counter(int index, unsigned int val) "T%d.counter=%d"
 mos6522_get_next_irq_time(uint16_t latch, int64_t d, int64_t delta) "latch=%d 
counter=0x%"PRId64 " delta_next=0x%"PRId64
 mos6522_set_sr_int(void) "set sr_int"
-mos6522_write(uint64_t addr, uint64_t val) "reg=0x%"PRIx64 " val=0x%"PRIx64
-mos6522_read(uint64_t addr, unsigned val) "reg=0x%"PRIx64 " val=0x%x"
+mos6522_write(uint64_t addr, const char *name, uint64_t val) "reg=0x%"PRIx64 " 
[%s] val=0x%"PRIx64
+mos6522_read(uint64_t addr, const char *name, unsigned val) "reg=0x%"PRIx64 " 
[%s] val=0x%x"
 
 # npcm7xx_clk.c
 npcm7xx_clk_read(uint64_t offset, uint32_t value) " offset: 0x%04" PRIx64 " 
value: 0x%08" PRIx32
-- 
2.20.1




Re: [PATCH v4 14/18] iotests.py: add qemu_io_pipe_and_status()

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

Add helper that returns both status and output, to be used in the
following commit

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/iotests.py | 4 
  1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 6ba65eb1ff..23bc6f686f 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -278,6 +278,10 @@ def qemu_io(*args):
  '''Run qemu-io and return the stdout data'''
  return qemu_tool_pipe_and_status('qemu-io', qemu_io_wrap_args(args))[0]
  
+def qemu_io_pipe_and_status(*args):

+args = qemu_io_args + list(args)
+return qemu_tool_pipe_and_status('qemu-io', args)


Shouldn’t we use qemu_io_wrap_args() here, like above?  The next patch 
adds a caller that passes `'-f', 'raw'` to it, which kind of implies to 
me that qemu_io_wrap_args() would be better.



+
  def qemu_io_log(*args):
  result = qemu_io(*args)
  log(result, filters=[filter_testfiles, filter_qemu_io])





Re: [PATCH v4 17/18] qapi: backup: add immutable-source parameter

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

We are on the way to implement internal-backup with fleecing scheme,
which includes backup job copying from fleecing block driver node
(which is target of copy-before-write filter) to final target of
backup. This job doesn't need own filter, as fleecing block driver node
is a kind of snapshot, it's immutable from reader point of view.

Let's add a parameter for backup to not insert filter but instead
unshare writes on source. This way backup job becomes a simple copying
process.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  qapi/block-core.json  | 11 ++-
  include/block/block_int.h |  1 +
  block/backup.c| 61 +++
  block/replication.c   |  2 +-
  blockdev.c|  1 +
  5 files changed, 69 insertions(+), 7 deletions(-)


I’m not really technically opposed to this, but I wonder what the actual 
benefit of this is.  It sounds like the only benefit is that we don’t 
need a filter driver, but what’s the problem with such a filter driver?


(And if we just want to copy data off of a immutable node, I personally 
would go for the mirror job instead, but it isn’t like I could give good 
technical reasons for that personal bias.)





Re: [PATCH v2] target/arm: Support PSCI 1.1 and SMCCC 1.0

2022-02-24 Thread Peter Maydell
On Sun, 13 Feb 2022 at 03:58, Akihiko Odaki  wrote:
>
> Support the latest PSCI on TCG and HVF. A 64-bit function called from
> AArch32 now returns NOT_SUPPORTED, which is necessary to adhere to SMC
> Calling Convention 1.0. It is still not compliant with SMCCC 1.3 since
> they do not implement mandatory functions.

>  /* PSCI v0.2 return values used by TCG emulation of PSCI */
>
>  /* No Trusted OS migration to worry about when offlining CPUs */
>  #define QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED2
>
> -/* We implement version 0.2 only */
> -#define QEMU_PSCI_0_2_RET_VERSION_0_2   2
> +#define QEMU_PSCI_VERSION_0_1 0x1
> +#define QEMU_PSCI_VERSION_0_2 0x2
> +#define QEMU_PSCI_VERSION_1_1 0x10001

Just noticed that there's a minor issue with this change -- it
deletes the definition of QEMU_PSCI_0_2_RET_VERSION_0_2, but
it is still used below:

>
>  MISMATCH_CHECK(QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED, 
> PSCI_0_2_TOS_MP);
>  MISMATCH_CHECK(QEMU_PSCI_0_2_RET_VERSION_0_2,

here ^^  which means that this breaks compilation on Arm hosts.

I'll squash in the fix:

--- a/target/arm/kvm-consts.h
+++ b/target/arm/kvm-consts.h
@@ -98,8 +98,11 @@ MISMATCH_CHECK(QEMU_PSCI_1_0_FN_PSCI_FEATURES,
PSCI_1_0_FN_PSCI_FEATURES);
 #define QEMU_PSCI_VERSION_1_1 0x10001

 MISMATCH_CHECK(QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED, PSCI_0_2_TOS_MP);
-MISMATCH_CHECK(QEMU_PSCI_0_2_RET_VERSION_0_2,
+/* We don't bother to check every possible version value */
+MISMATCH_CHECK(QEMU_PSCI_VERSION_0_2,
(PSCI_VERSION_MAJOR(0) | PSCI_VERSION_MINOR(2)));
+MISMATCH_CHECK(QEMU_PSCI_VERSION_1_1,
+   (PSCI_VERSION_MAJOR(1) | PSCI_VERSION_MINOR(1)));

 /* PSCI return values (inclusive of all PSCI versions) */
 #define QEMU_PSCI_RET_SUCCESS 0

thanks
-- PMM



Re: [PATCH v2] target/arm: Support PSCI 1.1 and SMCCC 1.0

2022-02-24 Thread Peter Maydell
On Thu, 24 Feb 2022 at 13:25, Peter Maydell  wrote:
>
> On Sun, 13 Feb 2022 at 03:58, Akihiko Odaki  wrote:
> >
> > Support the latest PSCI on TCG and HVF. A 64-bit function called from
> > AArch32 now returns NOT_SUPPORTED, which is necessary to adhere to SMC
> > Calling Convention 1.0. It is still not compliant with SMCCC 1.3 since
> > they do not implement mandatory functions.
>
> >  /* PSCI v0.2 return values used by TCG emulation of PSCI */
> >
> >  /* No Trusted OS migration to worry about when offlining CPUs */
> >  #define QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED2
> >
> > -/* We implement version 0.2 only */
> > -#define QEMU_PSCI_0_2_RET_VERSION_0_2   2
> > +#define QEMU_PSCI_VERSION_0_1 0x1
> > +#define QEMU_PSCI_VERSION_0_2 0x2
> > +#define QEMU_PSCI_VERSION_1_1 0x10001
>
> Just noticed that there's a minor issue with this change -- it
> deletes the definition of QEMU_PSCI_0_2_RET_VERSION_0_2, but
> it is still used below:
>
> >
> >  MISMATCH_CHECK(QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED, 
> > PSCI_0_2_TOS_MP);
> >  MISMATCH_CHECK(QEMU_PSCI_0_2_RET_VERSION_0_2,
>
> here ^^  which means that this breaks compilation on Arm hosts.
>
> I'll squash in the fix:
>
> --- a/target/arm/kvm-consts.h
> +++ b/target/arm/kvm-consts.h
> @@ -98,8 +98,11 @@ MISMATCH_CHECK(QEMU_PSCI_1_0_FN_PSCI_FEATURES,
> PSCI_1_0_FN_PSCI_FEATURES);
>  #define QEMU_PSCI_VERSION_1_1 0x10001
>
>  MISMATCH_CHECK(QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED, 
> PSCI_0_2_TOS_MP);
> -MISMATCH_CHECK(QEMU_PSCI_0_2_RET_VERSION_0_2,
> +/* We don't bother to check every possible version value */
> +MISMATCH_CHECK(QEMU_PSCI_VERSION_0_2,
> (PSCI_VERSION_MAJOR(0) | PSCI_VERSION_MINOR(2)));
> +MISMATCH_CHECK(QEMU_PSCI_VERSION_1_1,
> +   (PSCI_VERSION_MAJOR(1) | PSCI_VERSION_MINOR(1)));

Ha, turns out the existing check line was wrong : it ORs together
the major and minor, which only works if the major happens to be 0.
Actually working lines:

MISMATCH_CHECK(QEMU_PSCI_VERSION_0_2, PSCI_VERSION(0, 2));
MISMATCH_CHECK(QEMU_PSCI_VERSION_1_1, PSCI_VERSION(1, 1));

-- PMM



Re: [PATCH v2 2/2] block/curl.c: Check error return from curl_easy_setopt()

2022-02-24 Thread Peter Maydell
On Thu, 24 Feb 2022 at 14:11, Hanna Reitz  wrote:
>
> On 22.02.22 16:23, Peter Maydell wrote:
> > Coverity points out that we aren't checking the return value
> > from curl_easy_setopt() for any of the calls to it we make
> > in block/curl.c.
> >
> > Some of these options are documented as always succeeding (e.g.
> > CURLOPT_VERBOSE) but others have documented failure cases (e.g.
> > CURLOPT_URL).  For consistency we check every call, even the ones
> > that theoretically cannot fail.
> >
> > Fixes: Coverity CID 1459336, 1459482, 1460331
> > Signed-off-by: Peter Maydell 
> > ---
> > Changes v1->v2:
> >   * set the error string in the failure path for the
> > direct setopt calls in curl_open()
> >   * fix the failure path in curl_setup_preadv() by putting
> > the curl_easy_setopt() call in the same if() condition
> > as the existing curl_multi_add_handle()
> > ---
> >   block/curl.c | 92 +---
> >   1 file changed, 58 insertions(+), 34 deletions(-)
>
> Reviewed-by: Hanna Reitz 

For the record, I had a late thought that maybe we were setting
some optional-for-us options that were only added in later versions
of libcurl and accidentally relying on not checking the error code.
But it turns out this isn't the case: most of the options we set
are always supported, and the exceptions are:
 NOSIGNAL -- 7.10 onward
 PRIVATE -- 7.10.3 onward
 USERNAME, PASSWORD, PROXYUSERNAME, PROXYPASSWORD -- 7.19.1 onward,
  but we only set these if the user asked for them, so failing
  would be the right thing anyway
 PROTOCOLS, REDIR_PROTOCOLS -- 7.19.4 onward, guarded by a
  compile-time LIBCURL_VERSION_NUM check

And in any case our meson.build insists on at least 7.29.
(That means we could drop the LIBCURL_VERSION_NUM guards, I guess.)

-- PMM



Re: [PATCH] target/arm: Report KVM's actual PSCI version to guest in dtb

2022-02-24 Thread Marc Zyngier via

On 2022-02-24 13:46, Peter Maydell wrote:

When we're using KVM, the PSCI implementation is provided by the
kernel, but QEMU has to tell the guest about it via the device tree.
Currently we look at the KVM_CAP_ARM_PSCI_0_2 capability to determine
if the kernel is providing at least PSCI 0.2, but if the kernel
provides a newer version than that we will still only tell the guest
it has PSCI 0.2.  (This is fairly harmless; it just means the guest
won't use newer parts of the PSCI API.)

The kernel exposes the specific PSCI version it is implementing via
the ONE_REG API; use this to report in the dtb that the PSCI
implementation is 1.0-compatible if appropriate.  (The device tree
binding currently only distinguishes "pre-0.2", "0.2-compatible" and
"1.0-compatible".)

Signed-off-by: Peter Maydell 


Reviewed-by: Marc Zyngier 

M.
--
Who you jivin' with that Cosmik Debris?



Re: [PATCH] Added parameter to take screenshot with screendump as PNG

2022-02-24 Thread Kshitij Suri



On 23/02/22 4:30 pm, Dr. David Alan Gilbert wrote:

* Kshitij Suri (kshitij.s...@nutanix.com) wrote:

Currently screendump only supports PPM format, which is un-compressed and not
standard. Added an "format" parameter to qemu monitor screendump capabilites
to support PNG image capture using libpng. The param was added in QAPI schema
of screendump present in ui.json along with png_save() function which converts
pixman_image to PNG. HMP command equivalent was also modified to support the
feature.

Example usage:
{ "execute": "screendump", "arguments": { "filename": "/tmp/image", 
"format":"png" } }

Resolves: 
https://urldefense.proofpoint.com/v2/url?u=https-3A__gitlab.com_qemu-2Dproject_qemu_-2D_issues_718=DwIBAg=s883GpUCOChKOHiocYtGcg=utjv19Ej9Fb0TB7_DX0o3faQ-OAm2ypPniPyqVSoj_w=YAoUy9TYXWb8lktv5hZwRWjZEBcoHKxfdrHMd8KkTnQAESANHEcW96C1ukSngzJ2=xJP6q9IE0req5UkOFLablyJWWREdmwS1NX4Yse0pYxE=

Signed-off-by: Kshitij Suri 
---
  hmp-commands.hx|  11 ++--
  monitor/hmp-cmds.c |   4 +-
  qapi/ui.json   |   7 ++-
  ui/console.c   | 153 -
  4 files changed, 165 insertions(+), 10 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 70a9136ac2..2163337f35 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -244,17 +244,18 @@ ERST
  
  {

  .name   = "screendump",
-.args_type  = "filename:F,device:s?,head:i?",
-.params = "filename [device [head]]",
-.help   = "save screen from head 'head' of display device 'device' 
"
-  "into PPM image 'filename'",
+.args_type  = "filename:F,device:s?,head:i?,format:f?",
+.params = "filename [device [head]] [format]",
+.help   = "save screen from head 'head' of display device 'device'"
+  "in specified format 'format' as image 'filename'."
+  "Default format for screendump is PPM.",
  .cmd= hmp_screendump,
  .coroutine  = true,
  },
  
  SRST

  ``screendump`` *filename*
-  Save screen into PPM image *filename*.
+  Save screen as image *filename*.
  ERST
  
  {

diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 2669156b28..3fb1394561 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1665,9 +1665,11 @@ hmp_screendump(Monitor *mon, const QDict *qdict)
  const char *filename = qdict_get_str(qdict, "filename");
  const char *id = qdict_get_try_str(qdict, "device");
  int64_t head = qdict_get_try_int(qdict, "head", 0);
+const char *format  = qdict_get_str(qdict, "format");
  Error *err = NULL;
  
-qmp_screendump(filename, id != NULL, id, id != NULL, head, );

+qmp_screendump(filename, id != NULL, id, id != NULL, head,
+   format != NULL, format, );
  hmp_handle_error(mon, err);
  }

I think that's OK from the HMP front, see some questions below.


diff --git a/qapi/ui.json b/qapi/ui.json
index 9354f4c467..9fdb56b60b 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -76,7 +76,7 @@
  ##
  # @screendump:
  #
-# Write a PPM of the VGA screen to a file.
+# Write a screenshot of the VGA screen to a file.
  #
  # @filename: the path of a new PPM file to store the image
  #
@@ -87,6 +87,9 @@
  #parameter is missing, head #0 will be used. Also note that the head
  #can only be specified in conjunction with the device ID. (Since 2.12)
  #
+# @format: image format for screendump is specified. Currently only PNG and
+# PPM are supported.
+#
  # Returns: Nothing on success
  #
  # Since: 0.14
@@ -99,7 +102,7 @@
  #
  ##
  { 'command': 'screendump',
-  'data': {'filename': 'str', '*device': 'str', '*head': 'int'},
+  'data': {'filename': 'str', '*device': 'str', '*head': 'int', '*format': 
'str'},
'coroutine': true }
  
  ##

diff --git a/ui/console.c b/ui/console.c
index 40eebb6d2c..7813b195ac 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -37,6 +37,9 @@
  #include "exec/memory.h"
  #include "io/channel-file.h"
  #include "qom/object.h"
+#ifdef CONFIG_VNC_PNG
+#include "png.h"
+#endif
  
  #define DEFAULT_BACKSCROLL 512

  #define CONSOLE_CURSOR_PERIOD 500
@@ -289,6 +292,137 @@ void graphic_hw_invalidate(QemuConsole *con)
  }
  }
  
+#ifdef CONFIG_VNC_PNG

+/**
+ * a8r8g8b8_to_rgba: Convert a8r8g8b8 to rgba format
+ *
+ * @dst: Destination pointer.
+ * @src: Source pointer.
+ * @n_pixels: Size of image.
+ */
+static void a8r8g8b8_to_rgba(uint32_t *dst, uint32_t *src, int n_pixels)
+{
+uint8_t *dst8 = (uint8_t *)dst;
+int i;
+
+for (i = 0; i < n_pixels; ++i) {
+uint32_t p = src[i];
+uint8_t a, r, g, b;
+
+a = (p & 0xff00) >> 24;
+r = (p & 0x00ff) >> 16;
+g = (p & 0xff00) >> 8;
+b = (p & 0x00ff) >> 0;
+
+if (a != 0) {
+#define DIVIDE(c, a) \
+do { \
+int t = ((c) * 255) / a; \
+(c) = t < 0 ? 0 : t > 255 ? 255 : t; \

I can't 

Re: [PATCH] Added parameter to take screenshot with screendump as PNG

2022-02-24 Thread Kshitij Suri


On 22/02/22 10:04 pm, Daniel P. Berrangé wrote:

On Tue, Feb 22, 2022 at 03:27:58PM +, Kshitij Suri wrote:

Currently screendump only supports PPM format, which is un-compressed and not
standard. Added an "format" parameter to qemu monitor screendump capabilites
to support PNG image capture using libpng. The param was added in QAPI schema
of screendump present in ui.json along with png_save() function which converts
pixman_image to PNG. HMP command equivalent was also modified to support the
feature.

Example usage:
{ "execute": "screendump", "arguments": { "filename": "/tmp/image", 
"format":"png" } }

Resolves:https://urldefense.proofpoint.com/v2/url?u=https-3A__gitlab.com_qemu-2Dproject_qemu_-2D_issues_718=DwIBaQ=s883GpUCOChKOHiocYtGcg=utjv19Ej9Fb0TB7_DX0o3faQ-OAm2ypPniPyqVSoj_w=eOSxOmlj_wVOfj7HF2fJ1VXlNNJp4k8UAlT5STcoAguScPKeHYwb6hwEuY54ok5c=ou3KExLg2-Cx31UtMV6vXErHpesdEHJjHXnbKcE9Wdk=  


Signed-off-by: Kshitij Suri
---
  hmp-commands.hx|  11 ++--
  monitor/hmp-cmds.c |   4 +-
  qapi/ui.json   |   7 ++-
  ui/console.c   | 153 -
  4 files changed, 165 insertions(+), 10 deletions(-)

You're going to need to update configure here.

Currently libpng is only detected if VNC is enabled.

This proposed change means we need to detect libpng
any time the build of system emulators is enabled.

The CONFIG_VNC_PNG option will also need renaming
to CONFIG_PNG


Yes I have added a new CONFIG_PNG option in the updated patch. As for 
CONFIG_VNC_PNG, can we have a seperate patch set


where I can replace CONFIG_VNC_PNG usage with combination of CONFIG_VNC 
and CONFIG_PNG? The replacement strategy would be as follows


#ifdef CONFIG_VNC_PNG -> #if defined(CONFIG_VNC) && defined(CONFIG_PNG)

Along with this, can we also use the new png meson-option to denote PNG 
format for VNC as well while maintaining backward compatibility? The 
change would look like


vnc_png = dependency('libpng', required: get_option('vnc_png'), method: 
'pkg-config', kwargs: static_kwargs)


gets changed to

vnc_png = dependency('libpng', required: get_option('vnc_png') || 
get_option('png'),
method: 'pkg-config', kwargs: static_kwargs)




+.args_type  = "filename:F,device:s?,head:i?,format:f?",
+.params = "filename [device [head]] [format]",
+.help   = "save screen from head 'head' of display device 'device'"
+  "in specified format 'format' as image 'filename'."
+  "Default format for screendump is PPM.",

Mention what other formats are permitted, making it clear that
the format is in fact 'png' and 'ppm', not 'PPM'

Updated in the v2 patch.

diff --git a/qapi/ui.json b/qapi/ui.json
index 9354f4c467..9fdb56b60b 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -76,7 +76,7 @@
  ##
  # @screendump:
  #
-# Write a PPM of the VGA screen to a file.
+# Write a screenshot of the VGA screen to a file.
  #
  # @filename: the path of a new PPM file to store the image
  #
@@ -87,6 +87,9 @@
  #parameter is missing, head #0 will be used. Also note that the head
  #can only be specified in conjunction with the device ID. (Since 2.12)
  #
+# @format: image format for screendump is specified. Currently only PNG and
+# PPM are supported.
+#
  # Returns: Nothing on success
  #
  # Since: 0.14
@@ -99,7 +102,7 @@
  #
  ##
  { 'command': 'screendump',
-  'data': {'filename': 'str', '*device': 'str', '*head': 'int'},
+  'data': {'filename': 'str', '*device': 'str', '*head': 'int', '*format': 
'str'},

This 'format' should not be a string, it needs to be an enum type.

Modified to an enum with png and ppm types.



'coroutine': true }
  
  ##

diff --git a/ui/console.c b/ui/console.c
index 40eebb6d2c..7813b195ac 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -37,6 +37,9 @@
  #include "exec/memory.h"
  #include "io/channel-file.h"
  #include "qom/object.h"
+#ifdef CONFIG_VNC_PNG
+#include "png.h"
+#endif
  
  #define DEFAULT_BACKSCROLL 512

  #define CONSOLE_CURSOR_PERIOD 500
@@ -289,6 +292,137 @@ void graphic_hw_invalidate(QemuConsole *con)
  }
  }
  
+#ifdef CONFIG_VNC_PNG

+/**
+ * a8r8g8b8_to_rgba: Convert a8r8g8b8 to rgba format
+ *
+ * @dst: Destination pointer.
+ * @src: Source pointer.
+ * @n_pixels: Size of image.
+ */
+static void a8r8g8b8_to_rgba(uint32_t *dst, uint32_t *src, int n_pixels)
+{
+uint8_t *dst8 = (uint8_t *)dst;
+int i;
+
+for (i = 0; i < n_pixels; ++i) {
+uint32_t p = src[i];
+uint8_t a, r, g, b;
+
+a = (p & 0xff00) >> 24;
+r = (p & 0x00ff) >> 16;
+g = (p & 0xff00) >> 8;
+b = (p & 0x00ff) >> 0;
+
+if (a != 0) {
+#define DIVIDE(c, a) \
+do { \
+int t = ((c) * 255) / a; \
+(c) = t < 0 ? 0 : t > 255 ? 255 : t; \
+} while (0)
+
+DIVIDE(r, a);
+DIVIDE(g, a);
+DIVIDE(b, a);
+  

Re: [PATCH v4 06/18] block: intoduce reqlist

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

Split intersecting-requests functionality out of block-copy to be
reused in copy-before-write filter.

Note: while being here, fix tiny typo in MAINTAINERS.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/reqlist.h |  67 +++
  block/block-copy.c  | 116 +---
  block/reqlist.c |  76 ++
  MAINTAINERS |   4 +-
  block/meson.build   |   1 +
  5 files changed, 184 insertions(+), 80 deletions(-)
  create mode 100644 include/block/reqlist.h
  create mode 100644 block/reqlist.c


Reviewed-by: Hanna Reitz 




[PATCH v2 11/12] mos6522: implement edge-triggering for CA1/2 and CB1/2 control line IRQs

2022-02-24 Thread Mark Cave-Ayland
The mos6522 datasheet describes how the control lines IRQs are edge-triggered
according to the configuration in the PCR register. Implement the logic 
according
to the datasheet so that the interrupt bits in IFR are latched when the edge is
detected, and cleared when reading portA/portB or writing to IFR as necessary.

To maintain bisectibility this change also updates the SCSI, SCSI data, Nubus
and VIA2 60Hz/1Hz clocks in the q800 machine to be negative edge-triggered as
confirmed by the PCR programming in all of Linux, NetBSD and MacOS.

Signed-off-by: Mark Cave-Ayland 
---
 hw/m68k/q800.c|  9 +++--
 hw/misc/mac_via.c | 15 +--
 hw/misc/mos6522.c | 82 +--
 include/hw/misc/mos6522.h | 15 +++
 4 files changed, 109 insertions(+), 12 deletions(-)

diff --git a/hw/m68k/q800.c b/hw/m68k/q800.c
index 55dfe5036f..66ca5c0df6 100644
--- a/hw/m68k/q800.c
+++ b/hw/m68k/q800.c
@@ -533,10 +533,11 @@ static void q800_init(MachineState *machine)
 
 sysbus = SYS_BUS_DEVICE(dev);
 sysbus_realize_and_unref(sysbus, _fatal);
-sysbus_connect_irq(sysbus, 0, qdev_get_gpio_in(via2_dev,
-   VIA2_IRQ_SCSI_BIT));
-sysbus_connect_irq(sysbus, 1, qdev_get_gpio_in(via2_dev,
-   VIA2_IRQ_SCSI_DATA_BIT));
+/* SCSI and SCSI data IRQs are negative edge triggered */
+sysbus_connect_irq(sysbus, 0, qemu_irq_invert(qdev_get_gpio_in(via2_dev,
+  VIA2_IRQ_SCSI_BIT)));
+sysbus_connect_irq(sysbus, 1, qemu_irq_invert(qdev_get_gpio_in(via2_dev,
+  VIA2_IRQ_SCSI_DATA_BIT)));
 sysbus_mmio_map(sysbus, 0, ESP_BASE);
 sysbus_mmio_map(sysbus, 1, ESP_PDMA);
 
diff --git a/hw/misc/mac_via.c b/hw/misc/mac_via.c
index d8b35e6ca6..525e38ce93 100644
--- a/hw/misc/mac_via.c
+++ b/hw/misc/mac_via.c
@@ -327,7 +327,9 @@ static void via1_sixty_hz(void *opaque)
 MOS6522State *s = MOS6522(v1s);
 qemu_irq irq = qdev_get_gpio_in(DEVICE(s), VIA1_IRQ_60HZ_BIT);
 
-qemu_set_irq(irq, 1);
+/* Negative edge trigger */
+qemu_irq_lower(irq);
+qemu_irq_raise(irq);
 
 via1_sixty_hz_update(v1s);
 }
@@ -338,7 +340,9 @@ static void via1_one_second(void *opaque)
 MOS6522State *s = MOS6522(v1s);
 qemu_irq irq = qdev_get_gpio_in(DEVICE(s), VIA1_IRQ_ONE_SECOND_BIT);
 
-qemu_set_irq(irq, 1);
+/* Negative edge trigger */
+qemu_irq_lower(irq);
+qemu_irq_raise(irq);
 
 via1_one_second_update(v1s);
 }
@@ -917,9 +921,11 @@ static uint64_t mos6522_q800_via2_read(void *opaque, 
hwaddr addr, unsigned size)
  * On a Q800 an emulated VIA2 is integrated into the onboard logic. The
  * expectation of most OSs is that the DRQ bit is live, rather than
  * latched as it would be on a real VIA so do the same here.
+ *
+ * Note: DRQ is negative edge triggered
  */
 val &= ~VIA2_IRQ_SCSI_DATA;
-val |= (ms->last_irq_levels & VIA2_IRQ_SCSI_DATA);
+val |= (~ms->last_irq_levels & VIA2_IRQ_SCSI_DATA);
 break;
 }
 
@@ -1146,7 +1152,8 @@ static void via2_nubus_irq_request(void *opaque, int n, 
int level)
 s->a |= (1 << n);
 }
 
-qemu_set_irq(irq, level);
+/* Negative edge trigger */
+qemu_set_irq(irq, !level);
 }
 
 static void mos6522_q800_via2_init(Object *obj)
diff --git a/hw/misc/mos6522.c b/hw/misc/mos6522.c
index 60b3b693d0..2af32b68ef 100644
--- a/hw/misc/mos6522.c
+++ b/hw/misc/mos6522.c
@@ -64,14 +64,62 @@ static void mos6522_update_irq(MOS6522State *s)
 static void mos6522_set_irq(void *opaque, int n, int level)
 {
 MOS6522State *s = MOS6522(opaque);
+int last_level = !!(s->last_irq_levels & (1 << n));
+uint8_t last_ifr = s->ifr;
+bool positive_edge = true;
+int ctrl;
+
+/*
+ * SR_INT is managed by mos6522 instances and cleared upon SR
+ * read. It is only the external CA1/2 and CB1/2 lines that
+ * are edge-triggered and latched in IFR
+ */
+if (n != SR_INT_BIT && level == last_level) {
+return;
+}
 
-if (level) {
+/* Detect negative edge trigger */
+if (last_level == 1 && level == 0) {
+positive_edge = false;
+}
+
+switch (n) {
+case CA2_INT_BIT:
+ctrl = (s->pcr & CA2_CTRL_MASK) >> CA2_CTRL_SHIFT;
+if ((positive_edge && (ctrl & C2_POS)) ||
+ (!positive_edge && !(ctrl & C2_POS))) {
+s->ifr |= 1 << n;
+}
+break;
+case CA1_INT_BIT:
+ctrl = (s->pcr & CA1_CTRL_MASK) >> CA1_CTRL_SHIFT;
+if ((positive_edge && (ctrl & C1_POS)) ||
+ (!positive_edge && !(ctrl & C1_POS))) {
+s->ifr |= 1 << n;
+}
+break;
+case SR_INT_BIT:
 s->ifr |= 1 << n;
-} else {
-s->ifr &= ~(1 << n);
+break;
+case CB2_INT_BIT:
+ctrl = (s->pcr 

Re: [PATCH v4 12/18] block: copy-before-write: realize snapshot-access API

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

Current scheme of image fleecing looks like this:

[guest][NBD export]
   |  |
   |root  | root
   v  v
[copy-before-write] -> [temp.qcow2]
   | target  |
   |file |backing
   v |
[active disk] <-+

  - On guest writes copy-before-write filter copies old data from active
disk to temp.qcow2. So fleecing client (NBD export) when reads
changed regions from temp.qcow2 image and unchanged from active disk
through backing link.

This patch makes possible new image fleecing scheme:

[guest]   [NBD export]
||
| root   | root
v file   v
[copy-before-write]<--[x-snapshot-access]
|   |
| file  | target
v   v
[active-disk] [temp.img]

  - copy-before-write does CBW operations and also provides
snapshot-access API. The API may be accessed through
x-snapshot-access driver.


The “x-” prefix seems like a relic from an earlier version.

(I agree with what I assume is your opinion now, that we don’t need an 
x- prefix.  I can’t imagine why we’d need to change the snapshot-access 
interface in an incompatible way.)



Benefits of new scheme:

1. Access control: if remote client try to read data that not covered
by original dirty bitmap used on copy-before-write open, client gets
-EACCES.

2. Discard support: if remote client do DISCARD, this additionally to
discarding data in temp.img informs block-copy process to not copy
these clusters. Next read from discarded area will return -EACCES.
This is significant thing: when fleecing user reads data that was
not yet copied to temp.img, we can avoid copying it on further guest
write.

3. Synchronisation between client reads and block-copy write is more
efficient. In old scheme we just rely on BDRV_REQ_SERIALISING flag
used for writes to temp.qcow2. New scheme is less blocking:
  - fleecing reads are never blocked: if data region is untouched or
in-flight, we just read from active-disk, otherwise we read from
temp.img
  - writes to temp.img are not blocked by fleecing reads
  - still, guest writes of-course are blocked by in-flight fleecing
reads, that currently read from active-disk - it's the minimum
necessary blocking

4. Temporary image may be of any format, as we don't rely on backing
feature.

5. Permission relation are simplified. With old scheme we have to share
write permission on target child of copy-before-write, otherwise
backing link conflicts with copy-before-write file child write
permissions. With new scheme we don't have backing link, and
copy-before-write node may have unshared access to temporary node.
(Not realized in this commit, will be in future).

6. Having control on fleecing reads we'll be able to implement
alternative behavior on failed copy-before-write operations.
Currently we just break guest request (that's a historical behavior
of backup). But in some scenarios it's a bad behavior: better
is to drop the backup as failed but don't break guest request.
With new scheme we can simply unset some bits in a bitmap on CBW
failure and further fleecing reads will -EACCES, or something like
this. (Not implemented in this commit, will be in future)
Additional application for this is implementing timeout for CBW
operations.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/copy-before-write.c | 212 +-
  1 file changed, 211 insertions(+), 1 deletion(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 91a2288b66..a8c88f64eb 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c


[...]


+static int coroutine_fn
+cbw_co_snapshot_block_status(BlockDriverState *bs,
+ bool want_zero, int64_t offset, int64_t bytes,
+ int64_t *pnum, int64_t *map,
+ BlockDriverState **file)
+{
+BDRVCopyBeforeWriteState *s = bs->opaque;
+BlockReq *req;
+int ret;
+int64_t cur_bytes;
+BdrvChild *child;
+
+req = cbw_snapshot_read_lock(bs, offset, bytes, _bytes, );
+if (!req) {
+return -EACCES;
+}
+
+ret = bdrv_block_status(bs, offset, cur_bytes, pnum, map, file);


This looks like an infinite recursion.  Shouldn’t this be s/bs/child->bs/?


+if (child == s->target) {
+/*
+ * We refer to s->target only for areas that we've written to it.
+ * And we can not report unallocated blocks in s->target: this will
+ * break generic block-status-above logic, that will go to
+ * copy-before-write filtered child in this case.
+ 

Re: [PATCH] hw/i386/pc: when adding reserved E820 entries do not allocate dynamic entries

2022-02-24 Thread Igor Mammedov
On Thu, 24 Feb 2022 18:14:35 +0530
Ani Sinha  wrote:

> On Thu, Feb 24, 2022 at 2:33 PM Igor Mammedov  wrote:
> >
> > On Wed, 23 Feb 2022 17:30:34 +0530
> > Ani Sinha  wrote:
> >  
> > > On Wed, Feb 23, 2022 at 2:34 PM Igor Mammedov  
> > > wrote:  
> > > >
> > > > On Thu, 10 Feb 2022 18:58:21 +0530
> > > > Ani Sinha  wrote:
> > > >  
> > > > > When adding E820_RESERVED entries we also accidentally allocate 
> > > > > dynamic
> > > > > entries. This is incorrect. We should simply return early with the 
> > > > > count of
> > > > > the number of reserved entries added.  
> > > >
> > > > can you expand commit message to explain what's wrong and
> > > > how problem manifests ... etc.  
> > >
> > > The issue has been present for the last 8 years without apparent
> > > visible issues. I think the only issue is that the bug allocates more
> > > memory in the firmware than is actually needed.  
> >
> > let me repeat: Why do you think it's an issue or why it's wrong  
> 
> Allocating more memory than what we need unnecessarily bloats up the
> rom. We should not be allocating memory that we do not use.

see how firmware uses "etc/e820" fwcfg file first, to make up
mind on 'need' part.

> 
> >  
> > >  
> > > >  
> > > > >
> > > > > fixes: 7d67110f2d9a6("pc: add etc/e820 fw_cfg file")
> > > > > cc: kra...@redhat.com
> > > > > Signed-off-by: Ani Sinha 
> > > > > ---
> > > > >  hw/i386/e820_memory_layout.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > >
> > > > > diff --git a/hw/i386/e820_memory_layout.c 
> > > > > b/hw/i386/e820_memory_layout.c
> > > > > index bcf9eaf837..afb08253a4 100644
> > > > > --- a/hw/i386/e820_memory_layout.c
> > > > > +++ b/hw/i386/e820_memory_layout.c
> > > > > @@ -31,6 +31,8 @@ int e820_add_entry(uint64_t address, uint64_t 
> > > > > length, uint32_t type)
> > > > >  entry->type = cpu_to_le32(type);
> > > > >
> > > > >  e820_reserve.count = cpu_to_le32(index);
> > > > > +
> > > > > +return index;
> > > > >  }  
> > > >
> > > > this changes e820_table size/content, which is added by 
> > > > fw_cfg_add_file() to fwcfg,
> > > > as result it breaks ABI in case of migration.  
> > >
> > > Ugh. So should we keep the bug? or do we add config setting to handle
> > > the ABI breakage.
> > >  
> >  
> 




[PATCH v2 4/4] hw/nios2: Machine with a Vectored Interrupt Controller

2022-02-24 Thread Amir Gonnen
Demonstrate how to use nios2 VIC on a machine.
Introduce a new machine "10m50-ghrd-vic" which is based on "10m50-ghrd"
with a VIC attached and internal interrupt controller removed.

When VIC is present, irq0 connects the VIC to the cpu, intc_present
is set to false to disable the internal interrupt controller, and the
devices on the machine are attached to the VIC (and not directly to cpu).
To allow VIC update EIC fields, we set the "cpu" property of the VIC
with a reference to the nios2 cpu.

Signed-off-by: Amir Gonnen 
---
 hw/nios2/10m50_devboard.c | 64 ---
 1 file changed, 60 insertions(+), 4 deletions(-)

diff --git a/hw/nios2/10m50_devboard.c b/hw/nios2/10m50_devboard.c
index 3d1205b8bd..9f62a2993f 100644
--- a/hw/nios2/10m50_devboard.c
+++ b/hw/nios2/10m50_devboard.c
@@ -36,10 +36,23 @@

 #include "boot.h"

+#define TYPE_NIOS2_MACHINE  MACHINE_TYPE_NAME("10m50-ghrd")
+typedef struct Nios2MachineClass Nios2MachineClass;
+DECLARE_OBJ_CHECKERS(MachineState, Nios2MachineClass,
+ NIOS2_MACHINE, TYPE_NIOS2_MACHINE)
+
 #define BINARY_DEVICE_TREE_FILE"10m50-devboard.dtb"

+struct Nios2MachineClass {
+MachineClass parent_obj;
+
+bool vic;
+};
+
 static void nios2_10m50_ghrd_init(MachineState *machine)
 {
+Nios2MachineClass *nmc = NIOS2_MACHINE_GET_CLASS(machine);
+
 Nios2CPU *cpu;
 DeviceState *dev;
 MemoryRegion *address_space_mem = get_system_memory();
@@ -74,8 +87,24 @@ static void nios2_10m50_ghrd_init(MachineState *machine)

 /* Create CPU -- FIXME */
 cpu = NIOS2_CPU(cpu_create(TYPE_NIOS2_CPU));
-for (i = 0; i < 32; i++) {
-irq[i] = qdev_get_gpio_in_named(DEVICE(cpu), "IRQ", i);
+
+if (nmc->vic) {
+DeviceState *dev = qdev_new("nios2-vic");
+
+object_property_set_link(OBJECT(dev), "cpu", OBJECT(cpu), 
_fatal);
+sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), _fatal);
+cpu->intc_present = false;
+qemu_irq cpu_irq = qdev_get_gpio_in_named(DEVICE(cpu), "IRQ", 0);
+sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, cpu_irq);
+for (int i = 0; i < 32; i++) {
+irq[i] = qdev_get_gpio_in(dev, i);
+}
+MemoryRegion *dev_mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
+memory_region_add_subregion(address_space_mem, 0x18002000, dev_mr);
+} else {
+for (i = 0; i < 32; i++) {
+irq[i] = qdev_get_gpio_in_named(DEVICE(cpu), "IRQ", i);
+}
 }

 /* Register: Altera 16550 UART */
@@ -105,11 +134,38 @@ static void nios2_10m50_ghrd_init(MachineState *machine)
   BINARY_DEVICE_TREE_FILE, NULL);
 }

-static void nios2_10m50_ghrd_machine_init(struct MachineClass *mc)
+static void nios2_10m50_ghrd_machine_class_init(ObjectClass *oc, void *data)
 {
+MachineClass *mc = MACHINE_CLASS(oc);
+Nios2MachineClass *nmc = NIOS2_MACHINE_CLASS(oc);
 mc->desc = "Altera 10M50 GHRD Nios II design";
 mc->init = nios2_10m50_ghrd_init;
 mc->is_default = true;
+nmc->vic = false;
 }

-DEFINE_MACHINE("10m50-ghrd", nios2_10m50_ghrd_machine_init);
+static void nios2_10m50_ghrd_vic_machine_class_init(ObjectClass *oc, void 
*data)
+{
+MachineClass *mc = MACHINE_CLASS(oc);
+Nios2MachineClass *nmc = NIOS2_MACHINE_CLASS(oc);
+mc->desc = "Altera 10M50 GHRD Nios II design with VIC";
+mc->init = nios2_10m50_ghrd_init;
+mc->is_default = false;
+nmc->vic = true;
+}
+
+static const TypeInfo nios_machine_types[] = {
+{
+.name  = MACHINE_TYPE_NAME("10m50-ghrd"),
+.parent= TYPE_MACHINE,
+.class_size= sizeof(Nios2MachineClass),
+.class_init= nios2_10m50_ghrd_machine_class_init,
+}, {
+.name  = MACHINE_TYPE_NAME("10m50-ghrd-vic"),
+.parent= TYPE_MACHINE,
+.class_size= sizeof(Nios2MachineClass),
+.class_init= nios2_10m50_ghrd_vic_machine_class_init,
+}
+};
+
+DEFINE_TYPES(nios_machine_types)
--
2.25.1


The contents of this email message and any attachments are intended solely for 
the addressee(s) and may contain confidential and/or privileged information and 
may be legally protected from disclosure. If you are not the intended recipient 
of this message or their agent, or if this message has been addressed to you in 
error, please immediately alert the sender by reply email and then delete this 
message and any attachments. If you are not the intended recipient, you are 
hereby notified that any use, dissemination, copying, or storage of this 
message or its attachments is strictly prohibited.



[PATCH v2 3/4] hw/intc: Vectored Interrupt Controller (VIC)

2022-02-24 Thread Amir Gonnen
Implement nios2 Vectored Interrupt Controller (VIC).
VIC is connected to EIC. It needs to update rha, ril, rrs and rnmi
fields on Nios2CPU before raising an IRQ.
For that purpose, VIC has a "cpu" property which should refer to the
nios2 cpu and set by the board that connects VIC.

Signed-off-by: Amir Gonnen 
---
 hw/intc/Kconfig |   4 +
 hw/intc/meson.build |   1 +
 hw/intc/nios2_vic.c | 327 
 3 files changed, 332 insertions(+)
 create mode 100644 hw/intc/nios2_vic.c

diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig
index 528e77b4a6..8000241428 100644
--- a/hw/intc/Kconfig
+++ b/hw/intc/Kconfig
@@ -81,3 +81,7 @@ config GOLDFISH_PIC

 config M68K_IRQC
 bool
+
+config NIOS2_VIC
+default y
+depends on NIOS2 && TCG
diff --git a/hw/intc/meson.build b/hw/intc/meson.build
index 7466024402..547e16eb2d 100644
--- a/hw/intc/meson.build
+++ b/hw/intc/meson.build
@@ -61,3 +61,4 @@ specific_ss.add(when: ['CONFIG_KVM', 'CONFIG_XIVE'],
if_true: files('spapr_xive_kvm.c'))
 specific_ss.add(when: 'CONFIG_GOLDFISH_PIC', if_true: files('goldfish_pic.c'))
 specific_ss.add(when: 'CONFIG_M68K_IRQC', if_true: files('m68k_irqc.c'))
+specific_ss.add(when: 'CONFIG_NIOS2_VIC', if_true: files('nios2_vic.c'))
diff --git a/hw/intc/nios2_vic.c b/hw/intc/nios2_vic.c
new file mode 100644
index 00..a9c9b3e7ac
--- /dev/null
+++ b/hw/intc/nios2_vic.c
@@ -0,0 +1,327 @@
+/*
+ * Vectored Interrupt Controller for nios2 processor
+ *
+ * Copyright (c) 2022 Neuroblade
+ *
+ * Interface:
+ * QOM property "cpu": link to the Nios2 CPU (must be set)
+ * Unnamed GPIO inputs 0..NIOS2_VIC_MAX_IRQ-1: input IRQ lines
+ * IRQ should be connected to nios2 IRQ0.
+ *
+ * Reference: "Embedded Peripherals IP User Guide
+ * for Intel® Quartus® Prime Design Suite: 21.4"
+ * Chapter 38 "Vectored Interrupt Controller Core"
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+
+#include "hw/irq.h"
+#include "hw/qdev-properties.h"
+#include "hw/sysbus.h"
+#include "migration/vmstate.h"
+#include "qapi/error.h"
+#include "qemu/bitops.h"
+#include "qemu/log.h"
+#include "qom/object.h"
+#include "cpu.h"
+
+#define LOG_VIC(...) qemu_log_mask(CPU_LOG_INT, ##__VA_ARGS__)
+
+#define TYPE_NIOS2_VIC "nios2-vic"
+
+OBJECT_DECLARE_SIMPLE_TYPE(Nios2Vic, NIOS2_VIC)
+
+#define NIOS2_VIC_MAX_IRQ 32
+
+enum {
+INT_CONFIG0 = 0,
+INT_CONFIG31 = 31,
+INT_ENABLE = 32,
+INT_ENABLE_SET = 33,
+INT_ENABLE_CLR = 34,
+INT_PENDING = 35,
+INT_RAW_STATUS = 36,
+SW_INTERRUPT = 37,
+SW_INTERRUPT_SET = 38,
+SW_INTERRUPT_CLR = 39,
+VIC_CONFIG = 40,
+VIC_STATUS = 41,
+VEC_TBL_BASE = 42,
+VEC_TBL_ADDR = 43,
+CSR_COUNT /* Last! */
+};
+
+struct Nios2Vic {
+/*< private >*/
+SysBusDevice parent_obj;
+
+/*< public >*/
+qemu_irq output_int;
+
+/* properties */
+CPUState *cpu;
+MemoryRegion csr;
+
+uint32_t int_config[32];
+uint32_t vic_config;
+uint32_t int_raw_status;
+uint32_t int_enable;
+uint32_t sw_int;
+uint32_t vic_status;
+uint32_t vec_tbl_base;
+uint32_t vec_tbl_addr;
+};
+
+/* Requested interrupt level (INT_CONFIG[0:5]) */
+static inline uint32_t vic_int_config_ril(const Nios2Vic *vic, int irq_num)
+{
+return extract32(vic->int_config[irq_num], 0, 6);
+}
+
+/* Requested NMI (INT_CONFIG[6]) */
+static inline uint32_t vic_int_config_rnmi(const Nios2Vic *vic, int irq_num)
+{
+return extract32(vic->int_config[irq_num], 6, 1);
+}
+
+/* Requested register set (INT_CONFIG[7:12]) */
+static inline uint32_t vic_int_config_rrs(const Nios2Vic *vic, int irq_num)
+{
+return extract32(vic->int_config[irq_num], 7, 6);
+}
+
+static inline uint32_t vic_config_vec_size(const Nios2Vic *vic)
+{
+return 1 << (2 + extract32(vic->vic_config, 0, 3));
+}
+
+static inline uint32_t vic_int_pending(const Nios2Vic *vic)
+{
+return 

Re: [PATCH v4 15/18] iotests/image-fleecing: add test case with bitmap

2022-02-24 Thread Vladimir Sementsov-Ogievskiy

24.02.2022 15:58, Hanna Reitz wrote:

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

Note that reads zero areas (not dirty in the bitmap) fails, that's
correct.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/tests/image-fleecing | 32 ++--
  tests/qemu-iotests/tests/image-fleecing.out | 84 +
  2 files changed, 108 insertions(+), 8 deletions(-)


Looks good, just one general usage question:


diff --git a/tests/qemu-iotests/tests/image-fleecing 
b/tests/qemu-iotests/tests/image-fleecing
index 909fc0a7ad..33995612be 100755
--- a/tests/qemu-iotests/tests/image-fleecing
+++ b/tests/qemu-iotests/tests/image-fleecing
@@ -23,7 +23,7 @@
  # Creator/Owner: John Snow 
  import iotests
-from iotests import log, qemu_img, qemu_io, qemu_io_silent
+from iotests import log, qemu_img, qemu_io, qemu_io_silent, 
qemu_io_pipe_and_status
  iotests.script_initialize(
  supported_fmts=['qcow2', 'qcow', 'qed', 'vmdk', 'vhdx', 'raw'],
@@ -50,11 +50,15 @@ remainder = [('0xd5', '0x108000',  '32k'), # Right-end of 
partial-left [1]
   ('0xcd', '0x3ff', '64k')] # patterns[3]
  def do_test(use_cbw, use_snapshot_access_filter, base_img_path,
-    fleece_img_path, nbd_sock_path, vm):
+    fleece_img_path, nbd_sock_path, vm,
+    bitmap=False):
  log('--- Setting up images ---')
  log('')
  assert qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M') == 0
+    if bitmap:
+    assert qemu_img('bitmap', '--add', base_img_path, 'bitmap0') == 0
+
  if use_snapshot_access_filter:
  assert use_cbw
  assert qemu_img('create', '-f', 'raw', fleece_img_path, '64M') == 0
@@ -106,12 +110,17 @@ def do_test(use_cbw, use_snapshot_access_filter, 
base_img_path,
  # Establish CBW from source to fleecing node
  if use_cbw:
-    log(vm.qmp('blockdev-add', {
+    fl_cbw = {
  'driver': 'copy-before-write',
  'node-name': 'fl-cbw',
  'file': src_node,
  'target': tmp_node
-    }))
+    }
+
+    if bitmap:
+    fl_cbw['bitmap'] = {'node': src_node, 'name': 'bitmap0'}
+
+    log(vm.qmp('blockdev-add', fl_cbw))
  log(vm.qmp('qom-set', path=qom_path, property='drive', 
value='fl-cbw'))


This makes me wonder how exactly the @bitmap parameter is to be used. In this 
case here, we use an active bitmap that tracks all writes, so it looks like a 
case of trying to copy the changes since some previous checkpoint (as a 
point-in-time state).  But if there are any writes between the blockdev-add and 
the qom-set, then they will not be included in the CBW bitmap.  Is that fine?  
Or is it perhaps even intentional?

(Is the idea that one would use a transaction to disable the current bitmap 
(say “A”), and add a new one (say “B”) at the same time, then use bitmap A for 
the CBW filter, delete it after the backup, and then use B for the subsequent 
backup?)



Hmm, good question. If we do this way, we break a point-in-time of backup.. 
We'll make a copy of disk in state of the moment of qom-set, but use an 
outdated copy of bitmap..

Good solution would do blockdev-add and qom-set in one transaction. But it's 
more possible to make transaction support for my proposed blockdev-replace, 
which should substitute qom-set in this scenario..

And supporting blockdev-add in transaction is not simple too.

With usual backup we simply do blockdev-backup and all needed bitmap 
manipulations in one transaction. With filter, actual backup start is qom-set 
(or blockdev-replace), not blockdev-add.. But we can't pass bitmap parameter to 
qom-set or blockdev-replace.

We probably could support blockdev-reopen in transaction, and change the bitmap 
in reopen.. But that seems wrong to me: we should not use reopen in scenario 
where we've just created this temporary node with all arguments we want.

Keeping in mind my recent series that introduces a kind of transaction for 
bdrv_close, may be the best and more native way is really support blockdev-add 
and blockdev-del in transaction.


The only alternative way I see is to not copy the user-given bitmap, but use 
exactly what user gives. This way, we do the following:

1. User give active bitmap A to cbw_open, so bitmap continue to track dirtiness.
2. User start a new dirty bitmap B
3. On filter insertion, we have a good bitmap with all needed dirty bits
4. After filter insertion, user stops tracking in bitmap A (disable it)

This way, we'll not lose any data. The drawback, is that we may copy some extra 
data, that was not actually dirty at point [3] (because we disable bitmap A 
after it, not in transaction). As well, bitmap B which will be used for next 
incremental backup will probably contain some extra dirty bits. That's not bad, 
but that's not an ideal architecture.

--
Best regards,
Vladimir



Re: [PATCH v2] target/arm: Support PSCI 1.1 and SMCCC 1.0

2022-02-24 Thread Peter Maydell
On Thu, 24 Feb 2022 at 14:37, Akihiko Odaki  wrote:
>
> Support the latest PSCI on TCG and HVF. A 64-bit function called from
> AArch32 now returns NOT_SUPPORTED, which is necessary to adhere to SMC
> Calling Convention 1.0. It is still not compliant with SMCCC 1.3 since
> they do not implement mandatory functions.
>
> Signed-off-by: Akihiko Odaki 

You didn't need to repost this; I've already queued it.

thanks
-- PMM



Re: [PATCH 3/3] util/event-loop: Introduce options to set the thread pool size

2022-02-24 Thread Stefan Hajnoczi
On Mon, Feb 21, 2022 at 06:08:45PM +0100, Nicolas Saenz Julienne wrote:
> The thread pool regulates itself: when idle, it kills threads until
> empty, when in demand, it creates new threads until full. This behaviour
> doesn't play well with latency sensitive workloads where the price of
> creating a new thread is too high. For example, when paired with qemu's
> '-mlock', or using safety features like SafeStack, creating a new thread
> has been measured take multiple milliseconds.
> 
> In order to mitigate this let's introduce a new 'EventLoopBackend'
> property to set the thread pool size. The threads will be created during
> the pool's initialization, remain available during its lifetime
> regardless of demand, and destroyed upon freeing it. A properly
> characterized workload will then be able to configure the pool to avoid
> any latency spike.
> 
> Signed-off-by: Nicolas Saenz Julienne 
> ---
>  include/block/aio.h | 11 +++
>  qapi/qom.json   |  4 +++-
>  util/async.c|  3 +++
>  util/event-loop.c   | 15 ++-
>  util/event-loop.h   |  4 
>  util/main-loop.c| 13 +
>  util/thread-pool.c  | 41 +
>  7 files changed, 85 insertions(+), 6 deletions(-)
> 
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 5634173b12..331483d1d1 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -192,6 +192,8 @@ struct AioContext {
>  QSLIST_HEAD(, Coroutine) scheduled_coroutines;
>  QEMUBH *co_schedule_bh;
>  
> +int pool_min;
> +int pool_max;

Are these fields protected by ThreadPool->lock? Please document. This is
a clue that maybe these fields belong in ThreadPool.

Regarding the field names: the AioContext thread pool field is called
thread_pool and the user-visible parameters are thread-pool-min/max. I
suggest calling the fields thread_pool_min/max too so it's clear which
pool we're talking about and there is a correspondence to user-visible
parameters.

> @@ -350,3 +358,28 @@ void thread_pool_free(ThreadPool *pool)
>  qemu_mutex_destroy(>lock);
>  g_free(pool);
>  }
> +
> +void aio_context_set_thread_pool_params(AioContext *ctx, uint64_t min,
> +uint64_t max, Error **errp)
> +{
> +ThreadPool *pool = ctx->thread_pool;
> +
> +if (min > max || !max) {

ctx->pool_min/max are int while the min/max arguments are uint64_t.
Please add an INT_MAX check to detect overflow.

> +error_setg(errp, "bad thread-pool-min/thread-pool-max values");
> +return;
> +}
> +
> +if (pool) {
> +qemu_mutex_lock(>lock);
> +}

This code belongs in util/thread-pool.c. I guess the reason for keeping
the fields in AioContext instead of ThreadPool is because the ThreadPool
is created on demand and we'd have nowhere to store the parameter value.
I suggest we bite the bullet and keep an extra copy of the variables in
AioContext with a clean ThreadPool interface (thread_pool_set_params())
instead of letting AioContext and ThreadPool access each other's
internals.

> +
> +ctx->pool_min = min;
> +ctx->pool_max = max;
> +
> +if (pool) {
> +for (int i = pool->cur_threads; i < ctx->pool_min; i++) {
> +spawn_thread(pool);
> +}

What about the reverse: when min is lowered and there are a bunch of
idle worker threads we could wake them up so they terminate until
->pool_min is reached again?


signature.asc
Description: PGP signature


Re: [PATCH 4/5] python: qmp_shell: add -e/--exit-on-error option

2022-02-24 Thread Damien Hedde




On 2/23/22 19:20, John Snow wrote:

On Wed, Feb 23, 2022 at 12:09 PM Damien Hedde
 wrote:




On 2/23/22 17:18, John Snow wrote:

On Wed, Feb 23, 2022 at 10:44 AM Daniel P. Berrangé  wrote:


On Wed, Feb 23, 2022 at 10:41:11AM -0500, John Snow wrote:

On Wed, Feb 23, 2022 at 10:27 AM Daniel P. Berrangé  wrote:


On Wed, Feb 23, 2022 at 10:22:11AM -0500, John Snow wrote:

On Mon, Feb 21, 2022 at 10:55 AM Damien Hedde
 wrote:


This option makes qmp_shell exit (with error code 1)
as soon as one of the following error occurs:
+ command parsing error
+ disconnection
+ command failure (response is an error)

_execute_cmd() method now returns None or the response
so that read_exec_command() can do the last check.

This is meant to be used in combination with an input file
redirection. It allows to store a list of commands
into a file and try to run them by qmp_shell and easily
see if it failed or not.

Signed-off-by: Damien Hedde 


Based on this patch, it looks like you really want something
scriptable, so I think the qemu-send idea that Dan has suggested might
be the best way to go. Are you still hoping to use the interactive
"short" QMP command format? That might be a bad idea, given how flaky
the parsing is -- and how we don't actually have a published standard
for that format. We've *never* liked the bad parsing here, so I have a
reluctance to use it in more places.

I'm having the naive idea that a script file could be as simple as a
list of QMP commands to send:

[
  {"execute": "block-dirty-bitmap-add", "arguments": { ... }},
  ...
]


I'd really recommend against creating a new format for the script
file, especially one needing opening & closing  [] like this, as
that isn't so amenable to dynamic usage/creation. ie you can't
just append an extcra command to an existing file.

IMHO, the "file" format should be identical to the result of
capturing the socket data off the wire. ie just a concatenation
of QMP commands, with no extra wrapping / change in format.



Eugh. That's just so hard to parse, because there's no off-the-shelf
tooling for "load a sequence of JSON documents". Nothing in Python
does it. :\


It isn't that hard if you require each JSON doc to be followed by
a newline.

Feed one line at a time to the JSON parser, until you get a complete
JSON doc, process that, then re-init the parser and carry on feeding
it lines until it emits the next JSON doc, and so on.



There's two interfaces in Python:

(1) json.load(), which takes a file pointer and either returns a
single, complete JSON document or it raises an Exception. It's not
useful here at all.
(2) json.JSONDecoder().raw_decode(strbuf), which takes a string buffer
and returns a 2-tuple of a JSON Document and the position at which it
stopped decoding.

The second is what we need here, but it does require buffering the
entire file into a string first, and then iteratively calling it. It
feels like working against the grain a little bit. We also can't use
the QAPI parser, as that parser has intentionally removed support for
constructs we don't use in the qapi schema language. Boo. (Not that I
want more non-standard configuration files like that propagating,
either.)

It would be possible to generate a JSON-Schema document to describe a
script file that used a containing list construct, but impossible for
a concatenation of JSON documents. This is one of the reasons I
instinctively shy away from non-standard file formats, they tend to
cut off support for this sort of thing.

Wanting to keep the script easy to append to is legitimate. I'm keen
to hear a bit more about the use case here before I press extremely
hard in any given direction, but those are my impulses here.



The use case is to be able to feed qemu with a bunch of commands we
expect to succeed and let qemu continue (unlike Daniel's wrap use case,
we don't want to quit qemu after the last command).

Typically it's the use case I present in the following cover-letter:
https://lore.kernel.org/qemu-devel/20220223090706.4888-1-damien.he...@greensocs.com/



OK (Sorry for blowing this out into a bigger ordeal than you had maybe
hoped for. I want to get you happy and on your way ASAP, I promise)

So, I see comments and simple QMP commands using the short-hand
format. If I understand correctly, you want this script upstream so
that you don't have to re-engineer the hack every time I shift
something around in qmp-shell, and so that examples can be easily
shared and reproduced between parties. Good reasons, so I want to help
you out and get something merged. (An aside: In the past I have just
copy-pasted stuff into my qmp-shell terminal. It's less reproducible
for people who aren't used to using the tool, but it's been just
enough juice for me in the past. I empathize with wanting to give
someone a single-shot command they can copy-paste and have it Just
Work.)

Some observations:

(1) Comments we don't have in JSON, but we could use YAML instead, I'm
fine with that personally. 

RE: [PATCH] vl: transform QemuOpts device to JSON syntax device

2022-02-24 Thread Duan, Zhenzhong


>-Original Message-
>From: Daniel P. Berrangé 
>Sent: Thursday, February 24, 2022 5:14 PM
>To: Peter Krempa 
>Cc: Duan, Zhenzhong ; qemu-
>de...@nongnu.org; kw...@redhat.com; m...@redhat.com;
>ler...@redhat.com; pbonz...@redhat.com; ebl...@redhat.com
>Subject: Re: [PATCH] vl: transform QemuOpts device to JSON syntax device
>
>On Thu, Feb 24, 2022 at 10:09:35AM +0100, Peter Krempa wrote:
>> On Thu, Feb 24, 2022 at 09:02:02 +, Daniel P. Berrangé wrote:
>> > On Thu, Feb 24, 2022 at 02:06:53PM +0800, Zhenzhong Duan wrote:
>> > > While there are mixed use of traditional -device option and JSON
>> > > syntax option, QEMU reports conflict, e.x:
>> > >
>> > > /usr/libexec/qemu-kvm -nodefaults \
>> > >   -device '{"driver":"virtio-scsi-
>pci","id":"scsi0","bus":"pci.0","addr":"0x02.0"}' \
>> > >   -device virtio-scsi-pci,id=scsi1,bus=pci.0
>> >
>> > Why are you attempting to mix JSON and non-JSON syntax at the same
>> > time ? The expectation is that any mgmt app adopting JSON syntax
>> > will do so universally and not mix old and new syntax. So in
>> > practice the scenario above is not one that QEMU ever intended to
>> > have used by apps.
>>
>> Based on the previous post they are using some 'qemu:commandline'
>> overrides with the legacy syntax with new libvirt which uses JSON
>> syntax:
>>
>> 
>>   
>> 
>> 
>>   > value='virtio-net-pci,netdev=mynet0,mac=00:16:3E:68:00:10,romfile='/>
>> 
>>
>> I suggested that they should add the required functionality to libvirt
>> instead of trying to hack qemu:
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2022-02/msg05068.html
>
>Or the other answer is to switch to use JSON syntax in the above command
>line passthrough config.
>
>Or to specify the PCI address so it doesn't clash with other devices already
>present.

Thanks Daniel and Peter for comments.
Clear on my side. I'll synchronize your suggestions to my colleague.

Regards
Zhenzhong


Re: [PATCH] vl: transform QemuOpts device to JSON syntax device

2022-02-24 Thread Kevin Wolf
Am 24.02.2022 um 07:06 hat Zhenzhong Duan geschrieben:
> While there are mixed use of traditional -device option and JSON
> syntax option, QEMU reports conflict, e.x:
> 
> /usr/libexec/qemu-kvm -nodefaults \
>   -device 
> '{"driver":"virtio-scsi-pci","id":"scsi0","bus":"pci.0","addr":"0x02.0"}' \
>   -device virtio-scsi-pci,id=scsi1,bus=pci.0
> 
> It breaks with:
> 
> qemu-kvm: -device 
> {"driver":"virtio-scsi-pci","id":"scsi0","bus":"pci.0","addr":"0x02.0"}: PCI: 
> slot 2 function 0 not available for virtio-scsi-pci, in use by virtio-scsi-pci
> 
> But if we reformat first -device same as the second, so only same kind
> of option for all the devices, it succeeds, vice versa. e.x:
> 
> /usr/libexec/qemu-kvm -nodefaults \
>   -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=02.0 \
>   -device virtio-scsi-pci,id=scsi1,bus=pci.0
> 
> Succeed!
> 
> Because both kind of options are inserted into their own list and
> break the order in QEMU command line during BDF auto assign. Fix it
> by transform QemuOpts into JSON syntax and insert in JSON device
> list, so the order in QEMU command line kept.
> 
> Signed-off-by: Zhenzhong Duan 

This patch is incorrect and breaks several cases, which are the reason
why the QemuOpts path hasn't been changed yet.

For example, after this patch, help doesn't work any more:

$ build/qemu-system-x86_64 -device help
qemu-system-x86_64: -device help: 'help' is not a valid device model name

Any non-string property doesn't work any more in non-JSON syntax:

$ $ build/qemu-system-x86_64 -blockdev null-co,node-name=disk -device 
virtio-blk,drive=disk,physical_block_size=4096
qemu-system-x86_64: -device virtio-blk,drive=disk,physical_block_size=4096: 
Parameter 'physical_block_size' expects uint64

There may be more cases that are broken with this patch.

Kevin




Re: [PATCH v2 0/2] ui/cocoa: updateUIInfo threading, autorelease pools

2022-02-24 Thread Akihiko Odaki

Reviewed-by: Akihiko Odaki 
Tested-by: Akihiko Odaki 

On 2022/02/24 19:13, Peter Maydell wrote:

This patchset was originally provoked by Akihiko Odaki noting
that we have some unnecessary creation and deletion of autorelease
pools in the Cocoa UI code. Patch 2 deletes them; but to get there
we need to do a bit of cleanup of the updateUIInfo support,
which wasn't considering threads.

Tested only very lightly.

v1->v2 changes:
  * name method updateUIInfoLocked, to match existing handleEventLocked
  * don't call updateUIInfo in cocoa_display_init() -- this happens
indirectly as a result of register_displaychangelistener()

thanks
-- PMM

Peter Maydell (2):
   ui/cocoa.m: Fix updateUIInfo threading issues
   ui/cocoa.m: Remove unnecessary NSAutoreleasePools

  ui/cocoa.m | 31 ++-
  1 file changed, 22 insertions(+), 9 deletions(-)





[PATCH v2 10/12] mac_via: make SCSI_DATA (DRQ) bit live rather than latched

2022-02-24 Thread Mark Cave-Ayland
The VIA2 on the Q800 machine is not a separate chip as in older Macs but instead
is integrated into the on-board logic. From analysing the SCSI routines in the
MacOS toolbox ROM (and to a lesser extent NetBSD and Linux) the expectation 
seems
to be that the SCSI_DATA (DRQ) bit is live on the Q800 and not latched.

Fortunately we can use the recently introduced mos6522 last_irq_levels variable
which tracks the edge-triggered state to return the SCSI_DATA (DRQ) bit live to
the guest OS.

Signed-off-by: Mark Cave-Ayland 
---
 hw/misc/mac_via.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/hw/misc/mac_via.c b/hw/misc/mac_via.c
index 3f473c3fcf..d8b35e6ca6 100644
--- a/hw/misc/mac_via.c
+++ b/hw/misc/mac_via.c
@@ -906,9 +906,24 @@ static uint64_t mos6522_q800_via2_read(void *opaque, 
hwaddr addr, unsigned size)
 {
 MOS6522Q800VIA2State *s = MOS6522_Q800_VIA2(opaque);
 MOS6522State *ms = MOS6522(s);
+uint64_t val;
 
 addr = (addr >> 9) & 0xf;
-return mos6522_read(ms, addr, size);
+val = mos6522_read(ms, addr, size);
+
+switch (addr) {
+case VIA_REG_IFR:
+/*
+ * On a Q800 an emulated VIA2 is integrated into the onboard logic. The
+ * expectation of most OSs is that the DRQ bit is live, rather than
+ * latched as it would be on a real VIA so do the same here.
+ */
+val &= ~VIA2_IRQ_SCSI_DATA;
+val |= (ms->last_irq_levels & VIA2_IRQ_SCSI_DATA);
+break;
+}
+
+return val;
 }
 
 static void mos6522_q800_via2_write(void *opaque, hwaddr addr, uint64_t val,
-- 
2.20.1




Re: [PATCH v4 11/18] block: introduce snapshot-access filter

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

The filter simply utilizes snapshot-access API of underlying block


Nit picking: Well, it isn’t really a filter.  I understand where you’re 
coming from, but by definition it isn’t a filter driver.



node.

In further patches we want to use it like this:

[guest]   [NBD export]
||
| root   | root
v file   v
[copy-before-write]<--[snapshot-access]
|   |
| file  | target
v   v
[active-disk] [temp.img]

This way, NBD client will be able to read snapshotted state of active
disk, when active disk is continued to be written by guest. This is
known as "fleecing", and currently uses another scheme based on qcow2
temporary image which backing file is active-disk. New scheme comes
with benefits - see next commit.

The other possible application is exporting internal snapshots of
qcow2, like this:

[guest]  [NBD export]
|  |
| root | root
v   file   v
[qcow2]<-[snapshot-access]

For this, we'll need to implement snapshot-access API handlers in
qcow2 driver, and improve snapshot-access filter (and API) to make it
possibele to select snapshot by name.


s/possibele/possible/


Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  qapi/block-core.json|   4 +-
  block/snapshot-access.c | 132 
  MAINTAINERS |   1 +
  block/meson.build   |   1 +
  4 files changed, 137 insertions(+), 1 deletion(-)
  create mode 100644 block/snapshot-access.c


Again, I like this very much, not least because it provides a clean way 
to solve the long-standing question of how to nicely export qcow2 snapshots.


[...]


diff --git a/block/snapshot-access.c b/block/snapshot-access.c
new file mode 100644
index 00..77b87c1946
--- /dev/null
+++ b/block/snapshot-access.c


[...]


+static int snapshot_access_open(BlockDriverState *bs, QDict *options, int 
flags,
+Error **errp)
+{
+bs->file = bdrv_open_child(NULL, options, "file", bs, _of_bds,
+   BDRV_CHILD_DATA | BDRV_CHILD_PRIMARY,
+   false, errp);
+if (!bs->file) {
+return -EINVAL;
+}
+
+bs->total_sectors = bs->file->bs->total_sectors;


(qcow2) snapshots can have a size that differs from the image’s current 
(active layer) size.  We should accommodate for that here (I guess I’d 
be fine with a FIXME, too, but introducing FIXMEs is always not exactly 
great), I think.



+
+return 0;
+}
+





Re: [PATCH v4 13/18] iotests/image-fleecing: add test-case for fleecing format node

2022-02-24 Thread Hanna Reitz

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/tests/image-fleecing | 64 -
  tests/qemu-iotests/tests/image-fleecing.out | 76 -
  2 files changed, 120 insertions(+), 20 deletions(-)


Reviewed-by: Hanna Reitz 




Re: [PATCH v4 12/18] block: copy-before-write: realize snapshot-access API

2022-02-24 Thread Vladimir Sementsov-Ogievskiy

24.02.2022 15:46, Hanna Reitz wrote:

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

Current scheme of image fleecing looks like this:

[guest]    [NBD export]
   |  |
   |root  | root
   v  v
[copy-before-write] -> [temp.qcow2]
   | target  |
   |file |backing
   v |
[active disk] <-+

  - On guest writes copy-before-write filter copies old data from active
    disk to temp.qcow2. So fleecing client (NBD export) when reads
    changed regions from temp.qcow2 image and unchanged from active disk
    through backing link.

This patch makes possible new image fleecing scheme:

[guest]   [NBD export]
    |    |
    | root   | root
    v file   v
[copy-before-write]<--[x-snapshot-access]
    |   |
    | file  | target
    v   v
[active-disk] [temp.img]

  - copy-before-write does CBW operations and also provides
    snapshot-access API. The API may be accessed through
    x-snapshot-access driver.


The “x-” prefix seems like a relic from an earlier version.

(I agree with what I assume is your opinion now, that we don’t need an x- 
prefix.  I can’t imagine why we’d need to change the snapshot-access interface 
in an incompatible way.)


Benefits of new scheme:

1. Access control: if remote client try to read data that not covered
    by original dirty bitmap used on copy-before-write open, client gets
    -EACCES.

2. Discard support: if remote client do DISCARD, this additionally to
    discarding data in temp.img informs block-copy process to not copy
    these clusters. Next read from discarded area will return -EACCES.
    This is significant thing: when fleecing user reads data that was
    not yet copied to temp.img, we can avoid copying it on further guest
    write.

3. Synchronisation between client reads and block-copy write is more
    efficient. In old scheme we just rely on BDRV_REQ_SERIALISING flag
    used for writes to temp.qcow2. New scheme is less blocking:
  - fleecing reads are never blocked: if data region is untouched or
    in-flight, we just read from active-disk, otherwise we read from
    temp.img
  - writes to temp.img are not blocked by fleecing reads
  - still, guest writes of-course are blocked by in-flight fleecing
    reads, that currently read from active-disk - it's the minimum
    necessary blocking

4. Temporary image may be of any format, as we don't rely on backing
    feature.

5. Permission relation are simplified. With old scheme we have to share
    write permission on target child of copy-before-write, otherwise
    backing link conflicts with copy-before-write file child write
    permissions. With new scheme we don't have backing link, and
    copy-before-write node may have unshared access to temporary node.
    (Not realized in this commit, will be in future).

6. Having control on fleecing reads we'll be able to implement
    alternative behavior on failed copy-before-write operations.
    Currently we just break guest request (that's a historical behavior
    of backup). But in some scenarios it's a bad behavior: better
    is to drop the backup as failed but don't break guest request.
    With new scheme we can simply unset some bits in a bitmap on CBW
    failure and further fleecing reads will -EACCES, or something like
    this. (Not implemented in this commit, will be in future)
    Additional application for this is implementing timeout for CBW
    operations.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/copy-before-write.c | 212 +-
  1 file changed, 211 insertions(+), 1 deletion(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 91a2288b66..a8c88f64eb 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c


[...]


+static int coroutine_fn
+cbw_co_snapshot_block_status(BlockDriverState *bs,
+ bool want_zero, int64_t offset, int64_t bytes,
+ int64_t *pnum, int64_t *map,
+ BlockDriverState **file)
+{
+    BDRVCopyBeforeWriteState *s = bs->opaque;
+    BlockReq *req;
+    int ret;
+    int64_t cur_bytes;
+    BdrvChild *child;
+
+    req = cbw_snapshot_read_lock(bs, offset, bytes, _bytes, );
+    if (!req) {
+    return -EACCES;
+    }
+
+    ret = bdrv_block_status(bs, offset, cur_bytes, pnum, map, file);


This looks like an infinite recursion.  Shouldn’t this be s/bs/child->bs/?


Oh, yes, right




+    if (child == s->target) {
+    /*
+ * We refer to s->target only for areas that we've written to it.
+ * And we can not report unallocated blocks in s->target: this will
+ * break generic block-status-above logic, that will go to
+ * 

[PATCH v2 2/4] target/nios2: Exteral Interrupt Controller (EIC)

2022-02-24 Thread Amir Gonnen
Implement Exteral Interrupt Controller interface (EIC).
Added intc_present property, true by default. When set to false, nios2
uses the EIC interface when handling IRQ. When set to true (default)
it uses the internal interrupt controller.
When nios2 recieves irq, it first checks intc_present to decide whether
to use the internal interrupt controller or the EIC.

The EIC is triggered by IRQ gpio but also recieves additional data from
the external interrupt controller (such as VIC): rha, ril, rrs and rnmi.
The interrupt controller is expected to raise IRQ after setting these
fields on Nios2CPU.

rha, ril, rrs and rnmi are used when EIC handles external interrupt, in
order to decide if to take the interrupt now, which shadow register set
to use, which PC to jump to, whether to set NMI flag, etc.

Signed-off-by: Amir Gonnen 
---
 target/nios2/cpu.c   | 58 +---
 target/nios2/cpu.h   | 22 +++
 target/nios2/helper.c| 33 ---
 target/nios2/op_helper.c |  7 +++--
 4 files changed, 105 insertions(+), 15 deletions(-)

diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
index 0886705818..9bd8a6301a 100644
--- a/target/nios2/cpu.c
+++ b/target/nios2/cpu.c
@@ -55,6 +55,7 @@ static void nios2_cpu_reset(DeviceState *dev)

 memset(env->regs, 0, sizeof(uint32_t) * NUM_CORE_REGS);
 memset(env->shadow_regs, 0, sizeof(uint32_t) * NUM_REG_SETS * NUM_GP_REGS);
+env->regs[CR_STATUS] |= CR_STATUS_RSIE;
 env->regs[R_PC] = cpu->reset_addr;

 #if defined(CONFIG_USER_ONLY)
@@ -65,10 +66,28 @@ static void nios2_cpu_reset(DeviceState *dev)
 #endif
 }

+bool nios2_take_eic_irq(const Nios2CPU *cpu)
+{
+const CPUNios2State *env = >env;
+
+if (cpu->rnmi) {
+return !(env->regs[CR_STATUS] & CR_STATUS_NMI);
+}
+
+if (((env->regs[CR_STATUS] & CR_STATUS_PIE) == 0) ||
+(cpu->ril <= cpu_get_il(env)) ||
+(cpu->rrs == cpu_get_crs(env) &&
+  !(env->regs[CR_STATUS] & CR_STATUS_RSIE))) {
+
+return false;
+}
+
+return true;
+}
+
 #ifndef CONFIG_USER_ONLY
-static void nios2_cpu_set_irq(void *opaque, int irq, int level)
+static void nios2_cpu_set_intc_irq(Nios2CPU *cpu, int irq, int level)
 {
-Nios2CPU *cpu = opaque;
 CPUNios2State *env = >env;
 CPUState *cs = CPU(cpu);

@@ -83,6 +102,32 @@ static void nios2_cpu_set_irq(void *opaque, int irq, int 
level)
 cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD);
 }
 }
+
+static void nios2_cpu_set_eic_irq(Nios2CPU *cpu, int level)
+{
+CPUNios2State *env = >env;
+CPUState *cs = CPU(cpu);
+
+env->irq_pending = level;
+
+if (env->irq_pending && nios2_take_eic_irq(cpu)) {
+env->irq_pending = 0;
+cpu_interrupt(cs, CPU_INTERRUPT_HARD);
+} else if (!env->irq_pending) {
+cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD);
+}
+}
+
+static void nios2_cpu_set_irq(void *opaque, int irq, int level)
+{
+Nios2CPU *cpu = opaque;
+
+if (cpu->intc_present) {
+nios2_cpu_set_intc_irq(cpu, irq, level);
+} else {
+nios2_cpu_set_eic_irq(cpu, level);
+}
+}
 #endif

 static void nios2_cpu_initfn(Object *obj)
@@ -94,13 +139,6 @@ static void nios2_cpu_initfn(Object *obj)
 #if !defined(CONFIG_USER_ONLY)
 mmu_init(>env);

-/*
- * These interrupt lines model the IIC (internal interrupt
- * controller). QEMU does not currently support the EIC
- * (external interrupt controller) -- if we did it would be
- * a separate device in hw/intc with a custom interface to
- * the CPU, and boards using it would not wire up these IRQ lines.
- */
 qdev_init_gpio_in_named(DEVICE(cpu), nios2_cpu_set_irq, "IRQ", 32);
 #endif
 }
@@ -202,6 +240,8 @@ static Property nios2_properties[] = {
 DEFINE_PROP_UINT32("mmu_tlb_num_ways", Nios2CPU, tlb_num_ways, 16),
 /* ALTR,tlb-num-entries */
 DEFINE_PROP_UINT32("mmu_pid_num_entries", Nios2CPU, tlb_num_entries, 256),
+/* interrupt-controller (internal) */
+DEFINE_PROP_BOOL("intc_present", Nios2CPU, intc_present, true),
 DEFINE_PROP_END_OF_LIST(),
 };

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 1d3503825b..1b3d0ed25e 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -92,12 +92,14 @@ struct Nios2CPUClass {
 #define   CR_STATUS_EH   (1 << 2)
 #define   CR_STATUS_IH   (1 << 3)
 #define   CR_STATUS_IL   (63 << 4)
+#define   CR_STATUS_IL_OFFSET 6
 #define   CR_STATUS_CRS  (63 << 10)
 #define   CR_STATUS_CRS_OFFSET 10
 #define   CR_STATUS_PRS  (63 << 16)
 #define   CR_STATUS_PRS_OFFSET 16
 #define   CR_STATUS_NMI  (1 << 22)
 #define   CR_STATUS_RSIE (1 << 23)
+#define   CR_STATUS_SRS  (1 << 31)
 #define CR_ESTATUS   (CR_BASE + 1)
 #define CR_BSTATUS   (CR_BASE + 2)
 #define CR_IENABLE   (CR_BASE + 3)
@@ -189,6 +191,7 @@ struct Nios2CPU {
 CPUNios2State env;

 bool mmu_present;
+bool intc_present;
 uint32_t pid_num_bits;
 uint32_t tlb_num_ways;
 uint32_t tlb_num_entries;
@@ 

Re: [PATCH v3 5/6] i386/pc: warn if phys-bits is too low

2022-02-24 Thread Joao Martins
On 2/23/22 18:44, Joao Martins wrote:
> @@ -896,6 +897,15 @@ void pc_memory_init(PCMachineState *pcms,
>  
>  x86_update_above_4g_mem_start(pcms, pci_hole64_size);
>  
> +maxphysaddr = ((hwaddr)1 << X86_CPU(first_cpu)->phys_bits) - 1;
> +maxusedaddr = x86_max_phys_addr(pcms, pci_hole64_size);
> +if (maxphysaddr < maxusedaddr) {
> +warn_report("Address space above 4G at %"PRIx64"-%"PRIx64
> +" phys-bits too low (%u)",
> +x86ms->above_4g_mem_start, maxusedaddr,
> +X86_CPU(first_cpu)->phys_bits);
> +}
> +
And in addition to the change in patch 4, for 32-bit I will change this
to an error_report(...) and exit right after, and updating commit message
accordingly. The error message changes slightly too given that it was
too specific to the above 4G region. All qtests pass.

diffstat below:

@@ -904,6 +905,16 @@ void pc_memory_init(PCMachineState *pcms,

 x86_update_above_4g_mem_start(pcms, pci_hole64_size);

+maxphysaddr = ((hwaddr)1 << X86_CPU(first_cpu)->phys_bits) - 1;
+maxusedaddr = x86_max_phys_addr(pcms, pci_hole64_size);
+if (maxphysaddr < maxusedaddr) {
+error_report("Address space limit 0x%"PRIx64" < 0x%"PRIx64
+ " phys-bits too low (%u)",
+ maxphysaddr, maxusedaddr,
+ X86_CPU(first_cpu)->phys_bits);
+exit(EXIT_FAILURE);
+}



Re: [PATCH] mps3-an547: Add missing user ahb interfaces

2022-02-24 Thread Peter Maydell
On Thu, 10 Feb 2022 at 21:02, Jimmy Brisson  wrote:
>
> With these interfaces missing, TFM would delegate peripherals 0, 1,
> 2, 3 and 8, and qemu would ignore the delegation of interface 8, as
> it thought interface 4 was eth & USB.
>
> This patch corrects this behavior and allows TFM to delegate the
> eth & USB peripheral to NS mode.
>
> Signed-off-by: Jimmy Brisson 
> ---
>  hw/arm/mps2-tz.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
> index f40e854dec..e287ad4d06 100644
> --- a/hw/arm/mps2-tz.c
> +++ b/hw/arm/mps2-tz.c
> @@ -1078,6 +1078,10 @@ static void mps2tz_common_init(MachineState *machine)
>  { "gpio1", make_unimp_dev, >gpio[1], 0x41101000, 0x1000 
> },
>  { "gpio2", make_unimp_dev, >gpio[2], 0x41102000, 0x1000 
> },
>  { "gpio3", make_unimp_dev, >gpio[3], 0x41103000, 0x1000 
> },
> +{ /* port 4 USER AHB interface 0 */ },
> +{ /* port 5 USER AHB interface 1 */ },
> +{ /* port 6 USER AHB interface 2 */ },
> +{ /* port 7 USER AHB interface 3 */ },
>  { "eth-usb", make_eth_usb, NULL, 0x4140, 0x20, { 49 
> } },
>  },

Yes; we implemented this against the revision B of the AN547
AppNote, which had an error in the relevant table. Revision C
of the AppNote corrects the docs to follow the FPGA image behaviour,
which is to have eth-usb on port 8. (I sent a patch the other day
that updates the URL in this source file to point to the latest
pdf rather than the old rev B.)

Applied to target-arm.next, thanks.

-- PMM



Re: [PATCH RFC v1 0/2] VM fork detection for RNG

2022-02-24 Thread Daniel P . Berrangé
On Thu, Feb 24, 2022 at 09:53:59AM +0100, Alexander Graf wrote:
> Hey Jason,
> 
> On 23.02.22 14:12, Jason A. Donenfeld wrote:
> > This small series picks up work from Amazon that seems to have stalled
> > out later year around this time: listening for the vmgenid ACPI
> > notification, and using it to "do something." Last year, that something
> > involved a complicated userspace mmap chardev, which seems frought with
> > difficulty. This year, I have something much simpler in mind: simply
> > using those ACPI notifications to tell the RNG to reinitialize safely,
> > so we don't repeat random numbers in cloned, forked, or rolled-back VM
> > instances.
> > 
> > This series consists of two patches. The first is a rather
> > straightforward addition to random.c, which I feel fine about. The
> > second patch is the reason this is just an RFC: it's a cleanup of the
> > ACPI driver from last year, and I don't really have much experience
> > writing, testing, debugging, or maintaining these types of drivers.
> > Ideally this thread would yield somebody saying, "I see the intent of
> > this; I'm happy to take over ownership of this part." That way, I can
> > focus on the RNG part, and whoever steps up for the paravirt ACPI part
> > can focus on that.
> > 
> > As a final note, this series intentionally does _not_ focus on
> > notification of these events to userspace or to other kernel consumers.
> > Since these VM fork detection events first need to hit the RNG, we can
> > later talk about what sorts of notifications or mmap'd counters the RNG
> > should be making accessible to elsewhere. But that's a different sort of
> > project and ties into a lot of more complicated concerns beyond this
> > more basic patchset. So hopefully we can keep the discussion rather
> > focused here to this ACPI business.
> 
> 
> The main problem with VMGenID is that it is inherently racy. There will
> always be a (short) amount of time where the ACPI notification is not
> processed, but the VM could use its RNG to for example establish TLS
> connections.
> 
> Hence we as the next step proposed a multi-stage quiesce/resume mechanism
> where the system is aware that it is going into suspend - can block network
> connections for example - and only returns to a fully functional state after
> an unquiesce phase:
> 
>   https://github.com/systemd/systemd/issues/20222

The downside of course is precisely that the guest now needs to be aware
and involved every single time a snapshot is taken.

Currently with virt the act of taking a snapshot can often remain invisible
to the VM with no functional effect on the guest OS or its workload, and
the host OS knows it can complete a snapshot in a specific timeframe. That
said, this transparency to the VM is precisely the cause of the race
condition described.

With guest involvement to quiesce the bulk of activity for time period,
there is more likely to be a negative impact on the guest workload. The
guest admin likely needs to be more explicit about exactly when in time
it is reasonable to take a snapshot to mitigate the impact.

The host OS snapshot operations are also now dependant on co-operation
of a guest OS that has to be considered to be potentially malicious, or
at least crashed/non-responsive. The guest OS also needs a way to receive
the triggers for snapshot capture and restore, most likely via an extension
to something like the QEMU guest agent or an equivalent for othuer
hypervisors.


Despite the above, I'm not against the idea of co-operative involvement
of the guest OS in the acts of taking & restoring snapshots. I can't
see any other proposals so far that can reliably eliminate the races
in the general case, from the kernel right upto user applications.
So I think it is neccessary to have guest cooperative snapshotting.

> What exact use case do you have in mind for the RNG/VMGenID update? Can you
> think of situations where the race is not an actual concern?

Lets assume we do take the approach described in that systemd bug and
have a co-operative snapshot process. If the hypervisor does the right
thing and guest owners install the right things, they'll have a race
free solution that works well in normal operation. That's good.


Realistically though, it is never going to be universally and reliably
put into practice. So what is our attitude to cases where the preferred
solution isn't availble and/or operative ?


There are going to be users who continue to build their guest disk images
without the QEMU guest agent (or equivalent for whatever hypervisor they
run on) installed because they don't know any better. Or where the guest
agent is mis-configured or fails to starts or some other scenario that
prevents the quiesce working as desired. The host mgmt could refuse to
take a snapshot in these cases. More likely is that they are just
going to go ahead and do a snapshot anyway because lack of guest agent
is a very common scenario today and users want their snapshots.


There are 

Re: [PATCH RFC v1 0/2] VM fork detection for RNG

2022-02-24 Thread Jason A. Donenfeld
Hi Lazlo,

Thanks for your reply.

On Thu, Feb 24, 2022 at 9:23 AM Laszlo Ersek  wrote:
> QEMU's related design is documented in
> .

I'll link to this document on the 2/2 patch next to the other ones.

> "they can also use the data provided in the 128-bit identifier as a high
> entropy random data source"
>
> So reinitializing an RNG from it is an express purpose.

It seems like this is indeed meant to be used for RNG purposes, but
the Windows 10 RNG document says: "Windows 10 on a Hyper-V VM will
detect when the VM state is reset, retrieve a unique (not random)
value from the hypervisor." I gather from that that it's not totally
clear what the "quality" of those 128 bits are. So this patchset mixes
them into the entropy pool, but does not credit it, which is
consistent with how the RNG deals with other data where the conclusion
is, "probably pretty good but maybe not," erring on the side of
caution. Either way, it's certainly being used -- and combined with
what was there before -- to reinitialize the RNG following a VM fork.

>
> More info in the libvirt docs (see "genid"):
>
> https://libvirt.org/formatdomain.html#general-metadata

Thanks, noted in the 2/2 patch too.

> QEMU's interpretation of the VMGENID specifically as a UUID (which I
> believe comes from me) has received (valid) criticism since:
>
> https://github.com/libguestfs/virt-v2v/blob/master/docs/vm-generation-id-across-hypervisors.txt
>
> (This document also investigates VMGENID on other hypervisors, which I
> think pertains to your other message.)

Thank you very much for this reference! You're absolutely right here.
v3 will treat this as just an opaque 128-bit binary blob. There's no
point, anyway, in treating it as a UUID in the kernel, since it never
should be printed or exposed to anywhere except random.c (and my gifs,
of course :-P).

>
> > (It appears there's a bug in QEMU which prevents
> > the GUID from being reinitialized when running `loadvm` without
> > quitting first; I suppose this should be discussed with QEMU
> > upstream.)
>
> That's not (necessarily) a bug; see the end of the above-linked QEMU
> document:
>
> "There are no known use cases for changing the GUID once QEMU is
> running, and adding this capability would greatly increase the complexity."

I read that, and I think I might disagree? If you're QEMUing with the
monitor and are jumping back and forth and all around between saved
snapshots, probably those snapshots should have their RNG
reinitialized through this mechanism, right? It seems like doing that
would be the proper behavior for `guid=auto`, but not for
`guid={some-fixed-thing}`.

> > So that's very positive. But I would appreciate hearing from some
> > ACPI/Virt/Amazon people about this.
>
> I've only made some random comments; I didn't see a question so I
> couldn't attempt to answer :)

"Am I on the right track," I guess, and your reply has been very
informative. Thanks for your feedback. I'll have a v3 sent out not
before long.

Jason



Re: [PATCH RFC v1 0/2] VM fork detection for RNG

2022-02-24 Thread Jason A. Donenfeld
Hi Alex,

Strangely your message never made it to me, and I had to pull this out
of Lore after seeing Daniel's reply to it. I wonder what's up.

On Thu, Feb 24, 2022 at 09:53:59AM +0100, Alexander Graf wrote:
> The main problem with VMGenID is that it is inherently racy. There will 
> always be a (short) amount of time where the ACPI notification is not 
> processed, but the VM could use its RNG to for example establish TLS 
> connections.
> 
> Hence we as the next step proposed a multi-stage quiesce/resume 
> mechanism where the system is aware that it is going into suspend - can 
> block network connections for example - and only returns to a fully 
> functional state after an unquiesce phase:
> 
>    https://github.com/systemd/systemd/issues/20222
> 
> Looking at the issue again, it seems like we completely missed to follow 
> up with a PR to implement that functionality :(.
> 
> What exact use case do you have in mind for the RNG/VMGenID update? Can 
> you think of situations where the race is not an actual concern?

No, I think the race is something that remains a problem for the
situations I care about. There are simpler ways of fixing that -- just
expose a single incrementing integer so that it can be checked every
time the RNG does something, without being expensive, via the same
mechanism -- and then you don't need any complexity. But anyway, that
doesn't exist right now, so this series tries to implement something for
what does exist and is already supported by multiple hypervisors. I'd
suggest sending a proposal for an improved mechanism as part of a
different thread, and pull the various parties into that, and we can
make something good for the future. I'm happy to implement whatever the
virtual hardware exposes.

Jason



[PATCH v2 00/12] mos6522: switch to gpios, add control line edge-triggering and extra debugging

2022-02-24 Thread Mark Cave-Ayland
Here is another patchset taken from my series to enable MacOS to boot on the 
q800
machine.

Patches 1-3 define the IFR bit flags in terms of the physical control lines and
update mac_via to use them.

Patch 4 does the main switch from custom methods in MOS6522DeviceClass to using
standard gpios whilst patch 5 removes these now-obsolete methods.

Patch 6 updates mos6522 instances to use the recommended method of calling
device_class_set_parent_reset() to propagate the device reset to the parent.

Patches 7 and 8 add more support for debugging guests using the mos6522 devices
by adding register names into the trace-event output and implementing a new
"info via" HMP command to give detailed information about the registers and 
timer
states.

Patch 9 introduces a new last_irq_levels field within MOS6522State to enable 
detection
of edge transitions, which is also a migration break for the q800 and 
g3beige/mac99
machines.

Patch 10 ensures that the SCSI_DATA (DRQ) bit in VIA2 is unlatched since in the 
q800
machine VIA2 is integrated within the on-board logic, and analysis of the MacOS
toolbox ROM suggests that the DRQ bit is expected to be live compared with older
Macs which use a real (latched) VIA.

Patches 11 implement edge-triggering for the CA1/2 and CB1/2 control lines as
documented in the datasheet, including updating the relevant inputs for negative
edge-triggering if required.

Finally patch 12 removes some old code in the PMU mos6522 instance which is now 
no
longer required with these latest changes.

Signed-off-by: Mark Cave-Ayland 

v2:
- Update patches 1-3 to use the BIT() macro
- Add R-B tags from Peter
- Update "info via" patch to use a target-specific HMP command as discussed 
on-list
- Add patch 10 "mac_via: make SCSI_DATA (DRQ) bit live rather than latched"


Mark Cave-Ayland (12):
  mos6522: add defines for IFR bit flags
  mac_via: use IFR bit flag constants for VIA1 IRQs
  mac_via: use IFR bit flag constants for VIA2 IRQs
  mos6522: switch over to use qdev gpios for IRQs
  mos6522: remove update_irq() and set_sr_int() methods from
MOS6522DeviceClass
  mos6522: use device_class_set_parent_reset() to propagate reset to
parent
  mos6522: add register names to register read/write trace events
  mos6522: add "info via" HMP command for debugging
  mos6522: record last_irq_levels in mos6522_set_irq()
  mac_via: make SCSI_DATA (DRQ) bit live rather than latched
  mos6522: implement edge-triggering for CA1/2 and CB1/2 control line
IRQs
  macio/pmu.c: remove redundant code

 hmp-commands-info.hx |  15 +++
 hw/m68k/q800.c   |   9 +-
 hw/misc/mac_via.c|  87 ++
 hw/misc/macio/cuda.c |   8 +-
 hw/misc/macio/pmu.c  |  40 +--
 hw/misc/mos6522.c| 223 ---
 hw/misc/trace-events |   4 +-
 include/hw/misc/mac_via.h|  46 
 include/hw/misc/macio/pmu.h  |   2 -
 include/hw/misc/mos6522.h|  44 +--
 include/monitor/hmp-target.h |   1 +
 11 files changed, 333 insertions(+), 146 deletions(-)

-- 
2.20.1




[PATCH v2 01/12] mos6522: add defines for IFR bit flags

2022-02-24 Thread Mark Cave-Ayland
These are intended to make it easier to see how the physical control lines
are wired for each instance.

Signed-off-by: Mark Cave-Ayland 
---
 include/hw/misc/mos6522.h | 22 +++---
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/hw/misc/mos6522.h b/include/hw/misc/mos6522.h
index fc95d22b0f..be5c90d24d 100644
--- a/include/hw/misc/mos6522.h
+++ b/include/hw/misc/mos6522.h
@@ -41,13 +41,21 @@
 #define IER_SET0x80/* set bits in IER */
 #define IER_CLR0   /* clear bits in IER */
 
-#define CA2_INT0x01
-#define CA1_INT0x02
-#define SR_INT 0x04/* Shift register full/empty */
-#define CB2_INT0x08
-#define CB1_INT0x10
-#define T2_INT 0x20/* Timer 2 interrupt */
-#define T1_INT 0x40/* Timer 1 interrupt */
+#define CA2_INT_BIT0
+#define CA1_INT_BIT1
+#define SR_INT_BIT 2   /* Shift register full/empty */
+#define CB2_INT_BIT3
+#define CB1_INT_BIT4
+#define T2_INT_BIT 5   /* Timer 2 interrupt */
+#define T1_INT_BIT 6   /* Timer 1 interrupt */
+
+#define CA2_INTBIT(CA2_INT_BIT)
+#define CA1_INTBIT(CA1_INT_BIT)
+#define SR_INT BIT(SR_INT_BIT)
+#define CB2_INTBIT(CB2_INT_BIT)
+#define CB1_INTBIT(CB1_INT_BIT)
+#define T2_INT BIT(T2_INT_BIT)
+#define T1_INT BIT(T1_INT_BIT)
 
 /* Bits in ACR */
 #define T1MODE 0xc0/* Timer 1 mode */
-- 
2.20.1




[PATCH v2 02/12] mac_via: use IFR bit flag constants for VIA1 IRQs

2022-02-24 Thread Mark Cave-Ayland
This allows us to easily see how the physical control lines are mapped to the
IFR bit flags.

Signed-off-by: Mark Cave-Ayland 
---
 include/hw/misc/mac_via.h | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/hw/misc/mac_via.h b/include/hw/misc/mac_via.h
index b445565866..b0535c84da 100644
--- a/include/hw/misc/mac_via.h
+++ b/include/hw/misc/mac_via.h
@@ -18,19 +18,19 @@
 #define VIA_SIZE   0x2000
 
 /* VIA 1 */
-#define VIA1_IRQ_ONE_SECOND_BIT 0
-#define VIA1_IRQ_60HZ_BIT   1
-#define VIA1_IRQ_ADB_READY_BIT  2
-#define VIA1_IRQ_ADB_DATA_BIT   3
-#define VIA1_IRQ_ADB_CLOCK_BIT  4
+#define VIA1_IRQ_ONE_SECOND_BIT CA2_INT_BIT
+#define VIA1_IRQ_60HZ_BIT   CA1_INT_BIT
+#define VIA1_IRQ_ADB_READY_BIT  SR_INT_BIT
+#define VIA1_IRQ_ADB_DATA_BIT   CB2_INT_BIT
+#define VIA1_IRQ_ADB_CLOCK_BIT  CB1_INT_BIT
 
 #define VIA1_IRQ_NB 8
 
-#define VIA1_IRQ_ONE_SECOND (1 << VIA1_IRQ_ONE_SECOND_BIT)
-#define VIA1_IRQ_60HZ   (1 << VIA1_IRQ_60HZ_BIT)
-#define VIA1_IRQ_ADB_READY  (1 << VIA1_IRQ_ADB_READY_BIT)
-#define VIA1_IRQ_ADB_DATA   (1 << VIA1_IRQ_ADB_DATA_BIT)
-#define VIA1_IRQ_ADB_CLOCK  (1 << VIA1_IRQ_ADB_CLOCK_BIT)
+#define VIA1_IRQ_ONE_SECOND BIT(VIA1_IRQ_ONE_SECOND_BIT)
+#define VIA1_IRQ_60HZ   BIT(VIA1_IRQ_60HZ_BIT)
+#define VIA1_IRQ_ADB_READY  BIT(VIA1_IRQ_ADB_READY_BIT)
+#define VIA1_IRQ_ADB_DATA   BIT(VIA1_IRQ_ADB_DATA_BIT)
+#define VIA1_IRQ_ADB_CLOCK  BIT(VIA1_IRQ_ADB_CLOCK_BIT)
 
 
 #define TYPE_MOS6522_Q800_VIA1 "mos6522-q800-via1"
-- 
2.20.1




Re: [PATCH v5 08/20] jobs: protect jobs with job_lock/unlock

2022-02-24 Thread Emanuele Giuseppe Esposito



On 17/02/2022 15:48, Stefan Hajnoczi wrote:
> On Tue, Feb 08, 2022 at 09:35:01AM -0500, Emanuele Giuseppe Esposito wrote:
>> diff --git a/block/replication.c b/block/replication.c
>> index 55c8f894aa..a03b28726e 100644
>> --- a/block/replication.c
>> +++ b/block/replication.c
>> @@ -149,7 +149,9 @@ static void replication_close(BlockDriverState *bs)
>>  if (s->stage == BLOCK_REPLICATION_FAILOVER) {
>>  commit_job = >commit_job->job;
>>  assert(commit_job->aio_context == qemu_get_current_aio_context());
> 
> Is it safe to access commit_job->aio_context outside job_mutex?

No, but it is currently not done. Patch 18 takes care of protecting
aio_context. Remember again that job lock API is still nop.
> 
>> @@ -1838,7 +1840,9 @@ static void drive_backup_abort(BlkActionState *common)
>>  aio_context = bdrv_get_aio_context(state->bs);
>>  aio_context_acquire(aio_context);
>>  
>> -job_cancel_sync(>job->job, true);
>> +WITH_JOB_LOCK_GUARD() {
>> +job_cancel_sync(>job->job, true);
>> +}
> 
> Maybe job_cancel_sync() should take the lock internally since all
> callers in this patch seem to need the lock?

The _locked version is useful because it is used when lock guards are
already present, and cover multiple operations. There are only 3 places
where a lock guard is added to cover job_cance_sync_locked. Is it worth
defining another additional function?


> 
> I noticed this patch does not add WITH_JOB_LOCK_GUARD() to
> tests/unit/test-blockjob.c:cancel_common(). Was that an oversight or is
> there a reason why job_mutex is not needed around the job_cancel_sync()
> call there?

No, locks in unit tests are added in patch 10 "jobs: protect jobs with
job_lock/unlock".

> 
>> @@ -252,7 +258,13 @@ int block_job_add_bdrv(BlockJob *job, const char *name, 
>> BlockDriverState *bs,
>>  
>>  static void block_job_on_idle(Notifier *n, void *opaque)
>>  {
>> +/*
>> + * we can't kick with job_mutex held, but we also want
>> + * to protect the notifier list.
>> + */
>> +job_unlock();
>>  aio_wait_kick();
>> +job_lock();
> 
> I don't understand this. aio_wait_kick() looks safe to call with a mutex
> held?
You are right. It should be safe.

> 
>> @@ -292,7 +304,9 @@ bool block_job_set_speed(BlockJob *job, int64_t speed, 
>> Error **errp)
>>  job->speed = speed;
>>  
>>  if (drv->set_speed) {
>> +job_unlock();
>>  drv->set_speed(job, speed);
>> +job_lock();
> 
> What guarantees that job stays alive during drv->set_speed(job)? We
> don't hold a ref here. Maybe the assumption is that
> block_job_set_speed() only gets called from the main loop thread and
> nothing else will modify the jobs list while we're in drv->set_speed()?

What guaranteed this before? I am not sure.

> 
>> @@ -545,10 +566,15 @@ BlockErrorAction block_job_error_action(BlockJob *job, 
>> BlockdevOnError on_err,
>>  action);
>>  }
>>  if (action == BLOCK_ERROR_ACTION_STOP) {
>> -if (!job->job.user_paused) {
>> -job_pause(>job);
>> -/* make the pause user visible, which will be resumed from QMP. 
>> */
>> -job->job.user_paused = true;
>> +WITH_JOB_LOCK_GUARD() {
>> +if (!job->job.user_paused) {
>> +job_pause(>job);
>> +/*
>> + * make the pause user visible, which will be
>> + * resumed from QMP.
>> + */
>> +job->job.user_paused = true;
>> +}
>>  }
>>  block_job_iostatus_set_err(job, error);
> 
> Does this need the lock? If not, why is block_job_iostatus_reset()
> called with the hold?
> 
block_job_iostatus_set_err does not touch any Job fields. On the other
hand block_job_iostatus_reset reads job.user_paused and job.pause_count.

Emanuele




Re: [PATCH v4 04/18] block/copy-before-write: add bitmap open parameter

2022-02-24 Thread Vladimir Sementsov-Ogievskiy

24.02.2022 15:07, Hanna Reitz wrote:

On 16.02.22 20:46, Vladimir Sementsov-Ogievskiy wrote:

This brings "incremental" mode to copy-before-write filter: user can
specify bitmap so that filter will copy only "dirty" areas.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  qapi/block-core.json  | 10 +++-
  block/copy-before-write.c | 51 ++-
  2 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9a5a3641d0..3bab597506 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4171,11 +4171,19 @@
  #
  # @target: The target for copy-before-write operations.
  #
+# @bitmap: If specified, copy-before-write filter will do
+#  copy-before-write operations only for dirty regions of the
+#  bitmap. Bitmap size must be equal to length of file and
+#  target child of the filter. Note also, that bitmap is used
+#  only to initialize internal bitmap of the process, so further
+#  modifications (or removing) of specified bitmap doesn't
+#  influence the filter.


Sorry, missed this last time: There should be a “since: 7.0” here.


+#
  # Since: 6.2
  ##
  { 'struct': 'BlockdevOptionsCbw',
    'base': 'BlockdevOptionsGenericFormat',
-  'data': { 'target': 'BlockdevRef' } }
+  'data': { 'target': 'BlockdevRef', '*bitmap': 'BlockDirtyBitmap' } }
  ##
  # @BlockdevOptions:
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 799223e3fb..91a2288b66 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -34,6 +34,8 @@
  #include "block/copy-before-write.h"
+#include "qapi/qapi-visit-block-core.h"
+
  typedef struct BDRVCopyBeforeWriteState {
  BlockCopyState *bcs;
  BdrvChild *target;
@@ -145,10 +147,53 @@ static void cbw_child_perm(BlockDriverState *bs, 
BdrvChild *c,
  }
  }
+static bool cbw_parse_bitmap_option(QDict *options, BdrvDirtyBitmap **bitmap,
+    Error **errp)
+{
+    QDict *bitmap_qdict = NULL;
+    BlockDirtyBitmap *bmp_param = NULL;
+    Visitor *v = NULL;
+    bool ret = false;
+
+    *bitmap = NULL;
+
+    qdict_extract_subqdict(options, _qdict, "bitmap.");
+    if (!qdict_size(bitmap_qdict)) {
+    ret = true;
+    goto out;
+    }
+
+    v = qobject_input_visitor_new_flat_confused(bitmap_qdict, errp);
+    if (!v) {
+    goto out;
+    }
+
+    visit_type_BlockDirtyBitmap(v, NULL, _param, errp);
+    if (!bmp_param) {
+    goto out;
+    }
+
+    *bitmap = block_dirty_bitmap_lookup(bmp_param->node, bmp_param->name, NULL,
+    errp);
+    if (!*bitmap) {
+    goto out;
+    }
+
+    ret = true;
+
+out:
+    qapi_free_BlockDirtyBitmap(bmp_param);
+    visit_free(v);
+    qobject_unref(bitmap_qdict);
+
+    return ret;
+}
+
  static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
  Error **errp)
  {
  BDRVCopyBeforeWriteState *s = bs->opaque;
+    BdrvDirtyBitmap *bitmap = NULL;
  bs->file = bdrv_open_child(NULL, options, "file", bs, _of_bds,
 BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
@@ -163,6 +208,10 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
  return -EINVAL;
  }
+    if (!cbw_parse_bitmap_option(options, , errp)) {
+    return -EINVAL;


Hm...  Just to get a second opinion on this: We don’t need to close s->target 
here, because the failure paths of bdrv_open_inherit() and 
bdrv_new_open_driver_opts() both call bdrv_unref(), which will call bdrv_close(), 
which will close all children including s->target, right?


I think I just followed existing error path in cbw_open() on 
block_copy_state_new() failure. But I think you are right and bdrv_close() 
should take care of all bs children.




+    }
+
  bs->total_sectors = bs->file->bs->total_sectors;
  bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
  (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
@@ -170,7 +219,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
  ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
   bs->file->bs->supported_zero_flags);
-    s->bcs = block_copy_state_new(bs->file, s->target, NULL, errp);
+    s->bcs = block_copy_state_new(bs->file, s->target, bitmap, errp);
  if (!s->bcs) {
  error_prepend(errp, "Cannot create block-copy-state: ");
  return -EINVAL;





--
Best regards,
Vladimir



Re: [PATCH 4/8] meson: drop the .fa library suffix

2022-02-24 Thread Paolo Bonzini

On 2/23/22 10:14, Marc-André Lureau wrote:

Hi

On Wed, Feb 23, 2022 at 1:08 PM Paolo Bonzini  wrote:


On 2/22/22 20:40, marcandre.lur...@redhat.com wrote:


The .fa suffix was a temporary hack introduced in commit
1f0a1d8a51 ("build-sys hack: link with whole .fa archives") when the
build system was mixed between meson & makefiles. It is no longer
needed.


It is still needed to separate internal and system archives, otherwise
the oss-fuzz build fails.  Meson adds a --start-group/--end-group pair


Ah, but gitlab oss-fuzz build passed
https://gitlab.com/marcandre.lureau/qemu/-/pipelines/476763089


Alexander, do you remember how to reproduce this?


around all libraries, and the fork-fuzz.ld linker script should be
outside it.  If the libraries are named .a, the --start-group is placed
much earlier.

This is of course a very ugly workaround; Meson should just use the
objects instead of the archives when link_whole is used with an internal
convenience archive.


Any meson github issue we can point to?


#9292 maybe, I'm not sure if there's anything better.

Paolo




Re: [PATCH v2 10/12] mac_via: make SCSI_DATA (DRQ) bit live rather than latched

2022-02-24 Thread Philippe Mathieu-Daudé

On 24/2/22 12:59, Mark Cave-Ayland wrote:

The VIA2 on the Q800 machine is not a separate chip as in older Macs but instead
is integrated into the on-board logic. From analysing the SCSI routines in the
MacOS toolbox ROM (and to a lesser extent NetBSD and Linux) the expectation 
seems
to be that the SCSI_DATA (DRQ) bit is live on the Q800 and not latched.

Fortunately we can use the recently introduced mos6522 last_irq_levels variable
which tracks the edge-triggered state to return the SCSI_DATA (DRQ) bit live to
the guest OS.

Signed-off-by: Mark Cave-Ayland 
---
  hw/misc/mac_via.c | 17 -
  1 file changed, 16 insertions(+), 1 deletion(-)


Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH v2 1/2] block/curl.c: Set error message string if curl_init_state() fails

2022-02-24 Thread Hanna Reitz

On 22.02.22 16:23, Peter Maydell wrote:

In curl_open(), the 'out' label assumes that the state->errmsg string
has been set (either by curl_easy_perform() or by manually copying a
string into it); however if curl_init_state() fails we will jump to
that label without setting the string.  Add the missing error string
setup.

(We can't be specific about the cause of failure: the documentation
of curl_easy_init() just says "If this function returns NULL,
something went wrong".)

Signed-off-by: Peter Maydell 
---
  block/curl.c | 2 ++
  1 file changed, 2 insertions(+)


Reviewed-by: Hanna Reitz 




  1   2   3   >