date:20170725

Re: [Qemu-devel] [PATCH] migration: optimize the downtime

2017-07-25 Thread Jay Zhou



On 2017/7/24 23:35, Dr. David Alan Gilbert wrote:

* Jay Zhou (jianjay.z...@huawei.com) wrote:

Hi Dave,

On 2017/7/21 17:49, Dr. David Alan Gilbert wrote:

* Jay Zhou (jianjay.z...@huawei.com) wrote:

Qemu_savevm_state_cleanup() takes about 300ms in my ram migration tests
with a 8U24G vm(20G is really occupied), the main cost comes from
KVM_SET_USER_MEMORY_REGION ioctl when mem.memory_size = 0 in
kvm_set_user_memory_region(). In kmod, the main cost is
kvm_zap_obsolete_pages(), which traverses the active_mmu_pages list to
zap the unsync sptes.


Hi Jay,
Is this actually increasing the real downtime when the guest isn't
running, or is it just the reported time? I see that the s->downtime
value is calculated right after where we currently call
qemu_savevm_state_cleanup.


It actually increased the real downtime, I used the "ping" command to
test. Reason is that the source side libvirt sends qmp to qemu to query
the status of migration, which needs the BQL. qemu_savevm_state_cleanup
is done with BQL, qemu can not handle the qmp if qemu_savevm_state_cleanup
has not finished. And the source side libvirt delays about 300ms to notify
the destination side libvirt to send the "cont" command to start the vm.

I think the value of s->downtime is not accurate enough, maybe we could
move the calculation of end_time after qemu_savevm_state_cleanup has done.


I'm copying in Paolo, Radim and Andrea- is there anyway we can make the
teardown of KVMs dirty tracking not take so long? 300ms is a silly long time
on only a small VM.


I guess the biggest problem is that 300ms happens before we restart
the guest on the source if a migration fails.


300ms happens even if a migration succeeds.


Hmm, OK, this needs fixing then - it does explain a result I saw a while
ago where the downtime was much bigger with libvirt than it was with
qemu on it's own.


I think it can be optimized:
(1) source vm will be destroyed if the migration is successfully done,
  so the resources will be cleanuped automatically by the system
(2) delay the cleanup if the migration failed


I don't like putting it in qmp_cont; that shouldn't have migration magic
in it.


Yes, it is not a ideal place. :(


I guess we could put it in migrate_fd_cleanup perhaps? It gets called on
a bh near the end -  or could we just move it closer to the end of
migration_thread?


I have tested putting it in migrate_fd_cleanup, but the downtime is not
optimized. So I think it is the same to move it closer to the end of
migration_thread if it holds the BQL.
Could we put it in migrate_init?


Your explanation above hints as to why migrate_fd_cleanup doesn't help;
it's because we're still going to be doing it with the BQL taken.


Yes, it is.


Can you tell me which version of libvirt you're using?


I'm using 1.3.4


I thought the newer ones were supposed to use events so they did't
have to poll qemu.


After checking the codes of the newest libvirt, I think it is the same
in the qemuMigrationWaitForCompletion function, which is used to poll
qemu every 50ms.

Thanks,
Jay


  If we move qemu_savevm_state_cleanup is it still safe? Are there
some things we're supposed to do at that point which are wrong if
we don't.

I wonder about something like;  take a mutex in
memory_global_dirty_log_start, release it in
memory_global_dirty_log_stop.  Then make ram_save_cleanup start
a new thread that does the call to memory_global_dirty_log_stop.

Dave

Re: [Qemu-devel] [PATCH 1/4] target/s390x: Add ilen to unwind data

2017-07-25 Thread David Hildenbrand


>  
> @@ -5656,6 +5658,7 @@ static const DisasInsn *extract_insn(CPUS390XState 
> *env, DisasContext *s,
>  }
>  s->next_pc = s->pc + ilen;
>  s->ilen = ilen;
> +tcg_set_insn_param(s->insn_start_idx, 2, ilen);

Ah, that's a nice trick :)

>  
>  /* We can't actually determine the insn format until we've looked up
> the full insn opcode.  Which we can't do without locating the
> @@ -5890,7 +5893,10 @@ void gen_intermediate_code(CPUState *cs, struct 
> TranslationBlock *tb)
>  gen_tb_start(tb);
>  
>  do {
> -tcg_gen_insn_start(dc.pc, dc.cc_op);
> +/* ??? Alternately, delay emitting insn_start until after we
> +   have computed the insn length in extract_insn.  */

Think this is just fine for now and you can drop the comment.

> +dc.insn_start_idx = tcg_op_buf_count();
> +tcg_gen_insn_start(dc.pc, dc.cc_op, 0);
>  num_insns++;
>  
>  if (unlikely(cpu_breakpoint_test(cs, dc.pc, BP_ANY))) {
> @@ -5984,4 +5990,5 @@ void restore_state_to_opc(CPUS390XState *env, 
> TranslationBlock *tb,
>  if ((cc_op != CC_OP_DYNAMIC) && (cc_op != CC_OP_STATIC)) {
>  env->cc_op = cc_op;
>  }
> +env->int_pgm_ilen = data[2];
>  }
> 

Reviewed-by: David Hildenbrand 

-- 

Thanks,

David

Re: [Qemu-devel] [PATCH V3] rtc: fix a infinite loop in windows vm startup

2017-07-25 Thread Paolo Bonzini

On 25/07/2017 06:14, peng.h...@zte.com.cn wrote:
>> On 24/07/2017 20:35, Peng Hao wrote:
> 
> 
> 
> 
> 
>>> When a windows vm starts, periodic timer of rtc will stop several times.
>>> windows kernel will check whether REG_A_UIP is changed. REG_C's interrupt
>>> flags will not be cleared when periodic timer stops and the update timer
>>> will switch to alarm timer. So the expiration time of alarm timer is very
>>> long and REG_A_UIP will not vary.At last windows kernel will repeat to 
>>> check REG_A_UIP all the time.
> 
>> This should not happen.  REG_A_UIP is set and cleared in register A
>> every second, like this:
>>case RTC_REG_A:
> 
>>if (update_in_progress(s)) {
>>s->cmos_data[s->cmos_index] |= REG_A_UIP
>>} else {
>>s->cmos_data[s->cmos_index] &= ~REG_A_UIP
>>}
>>ret = s->cmos_data[s->cmos_index]
>>break
> 
> 
> 
> when periodic timer stop, update timer is set to a long expire time (as alarm 
> timer).

I think I see the bug now:

diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
index 1b8d3d7d4c..6184b4378e 100644
--- a/hw/timer/mc146818rtc.c
+++ b/hw/timer/mc146818rtc.c
@@ -321,9 +321,11 @@ static void check_update_timer(RTCState *s)
 s->next_alarm_time = next_update_time +
  (next_alarm_sec - 1) * NANOSECONDS_PER_SECOND;
 
-if (s->cmos_data[RTC_REG_C] & REG_C_UF) {
-/* UF is set, but AF is clear.  Program the timer to target
- * the alarm time.  */
+if ((s->cmos_data[RTC_REG_C] & REG_C_UF) &&
+!(s->cmos_data[RTC_REG_A] & REG_A_UIP)) {
+/* If UIP was latched, we need to clear it at the next update.
+ * Otherwise, if UF is set we only need to program the timer to
+ * target the alarm time.  */
 next_update_time = s->next_alarm_time;
 }
 if (next_update_time != timer_expire_time_ns(s->update_timer)) {


but I would like to have a testcase for it in tests/rtc-test.c.

Can you check if the above works and try writing a testcase (that fails 
without the patch and succeeds with it)?

Thanks,

Paolo

Re: [Qemu-devel] [PATCH 2/4] target/s390x: Use ILEN_UNWIND in trivial cases

2017-07-25 Thread David Hildenbrand


>  /* FIXME: a) LAP
> diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
> index 452b2bd902..376dbd68c2 100644
> --- a/target/s390x/misc_helper.c
> +++ b/target/s390x/misc_helper.c
> @@ -57,7 +57,7 @@ void QEMU_NORETURN runtime_exception(CPUS390XState *env, 
> int excp,
>  
>  cs->exception_index = EXCP_PGM;
>  env->int_pgm_code = excp;
> -env->int_pgm_ilen = ILEN_AUTO;
> +env->int_pgm_ilen = ILEN_UNWIND;

This is the only place where int_pgm_ilen is ever set to ILEN_UNWIND.

Think it would be better to drop setting int_pgm_ilen here completely.

>  
>  /* Use the (ultimate) callers address to find the insn that trapped.  */
>  cpu_restore_state(cs, retaddr);
> 

Reviewed-by: David Hildenbrand 

-- 

Thanks,

David

Re: [Qemu-devel] [PATCH 3/4] target/s390x: Use unwind info in STSI

2017-07-25 Thread David Hildenbrand

On 25.07.2017 04:36, Richard Henderson wrote:
> Signed-off-by: Richard Henderson 
> ---
>  target/s390x/misc_helper.c | 3 ++-
>  target/s390x/translate.c   | 1 -
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
> index 376dbd68c2..ab8551f605 100644
> --- a/target/s390x/misc_helper.c
> +++ b/target/s390x/misc_helper.c
> @@ -388,7 +388,8 @@ uint32_t HELPER(stsi)(CPUS390XState *env, uint64_t a0,
>  if ((r0 & STSI_LEVEL_MASK) <= STSI_LEVEL_3 &&
>  ((r0 & STSI_R0_RESERVED_MASK) || (r1 & STSI_R1_RESERVED_MASK))) {
>  /* valid function code, invalid reserved bits */
> -program_interrupt(env, PGM_SPECIFICATION, 2);
> +cpu_restore_state(ENV_GET_CPU(env), GETPC());

(I'd vote for a local variable ra)

> +program_interrupt(env, PGM_SPECIFICATION, ILEN_UNWIND);
>  }
>  
>  sel1 = r0 & STSI_R0_SEL1_MASK;
> diff --git a/target/s390x/translate.c b/target/s390x/translate.c
> index 9b0c35efa2..57f09154be 100644
> --- a/target/s390x/translate.c
> +++ b/target/s390x/translate.c
> @@ -3988,7 +3988,6 @@ static ExitStatus op_stpt(DisasContext *s, DisasOps *o)
>  static ExitStatus op_stsi(DisasContext *s, DisasOps *o)
>  {
>  check_privileged(s);
> -potential_page_fault(s);
>  gen_helper_stsi(cc_op, cpu_env, o->in2, regs[0], regs[1]);
>  set_cc_static(s);
>  return NO_EXIT;
> 

Reviewed-by: David Hildenbrand 

-- 

Thanks,

David

Re: [Qemu-devel] [PATCH for 2.10 14/35] vfio/ccw: fix incorrect malloc() size

2017-07-25 Thread Cornelia Huck

On Mon, 24 Jul 2017 15:27:30 -0300
Philippe Mathieu-Daudé  wrote:

> Since sizeof(struct vfio_irq_info) < sizeof(struct vfio_irq_set) a heap 
> overflow
> never occured. Still, let's use the correct size.
> 
> hw/vfio/ccw.c:170:16: warning: Cast a region whose size is not a multiple of 
> the destination type size
> irq_info = g_malloc0(sizeof(*irq_set));
>^~~
> 
> Reported-by: Clang Static Analyzer
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/vfio/ccw.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> index 12d0262336..8d97b53e77 100644
> --- a/hw/vfio/ccw.c
> +++ b/hw/vfio/ccw.c
> @@ -168,7 +168,7 @@ static void vfio_ccw_register_io_notifier(VFIOCCWDevice 
> *vcdev, Error **errp)
>  return;
>  }
>  
> -argsz = sizeof(*irq_set);
> +argsz = sizeof(*irq_info);
>  irq_info = g_malloc0(argsz);
>  irq_info->index = VFIO_CCW_IO_IRQ_INDEX;
>  irq_info->argsz = argsz;

Thanks for the patch, but I already have "vfio/ccw: allocate irq info
with the right size" queued in my s390-next branch (for which I plan to
send a pull req today).

Re: [Qemu-devel] [PATCH for 2.10 27/35] syscall: fix dereference of undefined pointer

2017-07-25 Thread Laurent Vivier

Le 24/07/2017 à 23:26, Peter Maydell a écrit :
> On 24 July 2017 at 19:27, Philippe Mathieu-Daudé  wrote:
>> Clang's scan-build-5.0 reports:
>>
>> linux-user/syscall.c:5581:9: warning: Dereference of undefined pointer value
>> if (*host_rt_dev_ptr != 0) {
>> ^~~~
>>
>> Reported-by: Clang Static Analyzer
>> Suggested-by: Laurent Vivier 
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>  linux-user/syscall.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>> index fcd20fa276..e79b5baec4 100644
>> --- a/linux-user/syscall.c
>> +++ b/linux-user/syscall.c
>> @@ -5524,7 +5524,7 @@ static abi_long do_ioctl_rt(const IOCTLEntry *ie, 
>> uint8_t *buf_temp,
>>  int target_size;
>>  void *argptr;
>>  abi_ulong *target_rt_dev_ptr;
>> -unsigned long *host_rt_dev_ptr;
>> +unsigned long *host_rt_dev_ptr = NULL;
>>  abi_long ret;
>>  int i;
>>
>> @@ -5570,6 +5570,7 @@ static abi_long do_ioctl_rt(const IOCTLEntry *ie, 
>> uint8_t *buf_temp,
>>  unlock_user(argptr, arg, 0);
>>
>>  ret = get_errno(safe_ioctl(fd, ie->host_cmd, buf_temp));
>> +assert(host_rt_dev_ptr);
> 
> There's not much point in this assert because it doesn't
> tell us anything we're not about to find out immediately
> by dereferencing the pointer...

It's just to shut off the warning.

What I said in the comment of the previous version of this patch:

I think we should "assert(host_rt_dev_ptr)" here. It's a bug if
host_rt_dev_ptr is not set.

The "for" loop scans the structure to find the rt_dev field, and we
should always enter in the first "if", so "host_rt_dev_ptr" is always set.

Thanks,
Laurent

Re: [Qemu-devel] [PATCH 1/1] s390x/css: fix ilen in IO instruction handlers

2017-07-25 Thread Cornelia Huck

On Mon, 24 Jul 2017 16:34:52 +0200
Halil Pasic  wrote:

> When initiating a program check interruption by calling program_interrupt
> the instruction length (ilen) of the current instruction is supplied as
> the third parameter.
> 
> On s390x all the IO instructions are of instruction format S and their
> ilen is 4.  The calls to program_interrupt (introduced by commits
> 7b18aad543 ("s390: Add channel I/O instructions.", 2013-01-24) and
> 61bf0dcb2e ("s390x/ioinst: Add missing alignment checks for IO
> instructions", 2013-06-21)) however use ilen == 2.
> 
> This is probably due to a confusion between ilen which specifies the
> instruction length in bytes and ILC which does the same but in halfwords.
> If kvm_enabled() this does not actually matter, because the ilen
> parameter of program_interrupt is effectively unused.
> 
> Let's provide the correct ilen to program_interrupt.
> 
> Signed-off-by: Halil Pasic 
> Fixes:  7b18aad543 ("s390: Add channel I/O instructions.")
> Fixes: 61bf0dcb2e ("s390x/ioinst: Add missing alignment checks for IO 
> instructions")
> Reviewed-by: David Hildenbrand 
> ---
>  target/s390x/ioinst.c | 42 +-
>  1 file changed, 21 insertions(+), 21 deletions(-)

Thanks, applied.

Re: [Qemu-devel] [PATCH] migration: optimize the downtime

2017-07-25 Thread Jay Zhou



On 2017/7/25 0:33, Paolo Bonzini wrote:

On 24/07/2017 17:35, Dr. David Alan Gilbert wrote:

* Jay Zhou (jianjay.z...@huawei.com) wrote:

Hi Dave,

On 2017/7/21 17:49, Dr. David Alan Gilbert wrote:

* Jay Zhou (jianjay.z...@huawei.com) wrote:

Qemu_savevm_state_cleanup() takes about 300ms in my ram migration tests
with a 8U24G vm(20G is really occupied), the main cost comes from
KVM_SET_USER_MEMORY_REGION ioctl when mem.memory_size = 0 in
kvm_set_user_memory_region(). In kmod, the main cost is
kvm_zap_obsolete_pages(), which traverses the active_mmu_pages list to
zap the unsync sptes.


Hi Jay,
Is this actually increasing the real downtime when the guest isn't
running, or is it just the reported time? I see that the s->downtime
value is calculated right after where we currently call
qemu_savevm_state_cleanup.


It actually increased the real downtime, I used the "ping" command to
test. Reason is that the source side libvirt sends qmp to qemu to query
the status of migration, which needs the BQL. qemu_savevm_state_cleanup
is done with BQL, qemu can not handle the qmp if qemu_savevm_state_cleanup
has not finished. And the source side libvirt delays about 300ms to notify
the destination side libvirt to send the "cont" command to start the vm.

I think the value of s->downtime is not accurate enough, maybe we could
move the calculation of end_time after qemu_savevm_state_cleanup has done.


I'm copying in Paolo, Radim and Andrea- is there anyway we can make the
teardown of KVMs dirty tracking not take so long? 300ms is a silly long time
on only a small VM.


Xiao Guangrong is working on something vaguely related (but different
and simpler because it's entirely contained within KVM), which is to
make log_sync faster.

The Intel folks working on clear containers also would like
MemoryListeners to have a better complexity, but that's again separate
from the "zapping" of SPTEs.


Can you tell me which version of libvirt you're using?
I thought the newer ones were supposed to use events so they did't
have to poll qemu.

  If we move qemu_savevm_state_cleanup is it still safe? Are there
some things we're supposed to do at that point which are wrong if
we don't.

I wonder about something like;  take a mutex in
memory_global_dirty_log_start, release it in
memory_global_dirty_log_stop.  Then make ram_save_cleanup start
a new thread that does the call to memory_global_dirty_log_stop.


I don't like having such a long-lived mutex (it seems like a recipe for
deadlocks with the BQL), plus memory_region_transaction_commit (the
expensive part of memory_global_dirty_log_stop) needs to be under the
BQL itself because it calls MemoryListeners.

Maybe memory_global_dirty_log_stop can delay itself to the next vm_start
if it's called while runstate_running() returns false (which should be
always the case)?


It is logical, but then we need to put some migration related codes into
the file of cpus.c(in my patch, I put some codes into the file of qmp.c),
is there any elegant way to reduce the coupling?

Thanks,
Jay


It could even be entirely enclosed within memory.c if you do it with a
VMChangeStateHandler.

Thanks,

Paolo


Dave


Thanks,
Jay


However, we would need to be a bit careful of anything that needs
cleaning up before the source restarts on failure; I'm not sure of
the semantics of all the current things wired into save_cleanup.

Dave



Signed-off-by: Jay Zhou 
---
   migration/migration.c | 16 +---
   qmp.c | 10 ++
   2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index a0db40d..72832be 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1877,6 +1877,15 @@ static void *migration_thread(void *opaque)
   if (qemu_file_get_error(s->to_dst_file)) {
   migrate_set_state(&s->state, current_active_state,
 MIGRATION_STATUS_FAILED);
+/*
+ * The resource has been allocated by migration will be reused in
+ * COLO process, so don't release them.
+ */
+if (!enable_colo) {
+qemu_mutex_lock_iothread();
+qemu_savevm_state_cleanup();
+qemu_mutex_unlock_iothread();
+}
   trace_migration_thread_file_err();
   break;
   }
@@ -1916,13 +1925,6 @@ static void *migration_thread(void *opaque)
   end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);

   qemu_mutex_lock_iothread();
-/*
- * The resource has been allocated by migration will be reused in COLO
- * process, so don't release them.
- */
-if (!enable_colo) {
-qemu_savevm_state_cleanup();
-}
   if (s->state == MIGRATION_STATUS_COMPLETED) {
   uint64_t transferred_bytes = qemu_ftell(s->to_dst_file);
   s->total_time = end_time - s->total_time;
diff --git a/qmp.c b/qmp.c
index b86201e..0e68eaa 100644
--- a/qmp.c
+++

Re: [Qemu-devel] [PATCH v2] hmp/(p)memsave: Allow >32bit file size

2017-07-25 Thread Markus Armbruster

"Dr. David Alan Gilbert (git)"  writes:

> From: "Dr. David Alan Gilbert" 
>
> memsave and pmemsave only take 32bit size arguments in HMP at the
> moment; let them take 64bit values.
>
> Reported-by: Pierre Kim 
> Signed-off-by: Dr. David Alan Gilbert 
> ---
>  hmp-commands.hx |  4 ++--
>  hmp.c   | 13 +++--
>  2 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 1941e19932..ddf77ae7ac 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -829,7 +829,7 @@ ETEXI
>  
>  {
>  .name   = "memsave",
> -.args_type  = "val:l,size:i,filename:s",
> +.args_type  = "val:l,size:l,filename:s",
>  .params = "addr size file",
>  .help   = "save to disk virtual memory dump starting at 'addr' 
> of size 'size'",
>  .cmd= hmp_memsave,

The size is wider, but still signed.  Reproduces QAPI/QMP's mistake.  We
should fix it there, and make this "size:o" here, so suffixes work as
they should.

I'll cook up a QAPI/QMP patch.

[...]

Re: [Qemu-devel] [PATCH] migration: optimize the downtime

2017-07-25 Thread Paolo Bonzini

On 25/07/2017 09:29, Jay Zhou wrote:
>>
>>
>> Maybe memory_global_dirty_log_stop can delay itself to the next vm_start
>> if it's called while runstate_running() returns false (which should be
>> always the case)?
> 
> It is logical, but then we need to put some migration related codes into
> the file of cpus.c(in my patch, I put some codes into the file of qmp.c),
> is there any elegant way to reduce the coupling?

See below:

>> It could even be entirely enclosed within memory.c if you do it with a
>> VMChangeStateHandler. 

If memory_global_dirty_log_stop runs with !runstate_running(), instead
of doing the memory transaction you set a VMChangeStateHandler and
return immediately.

The VMChangeStateHandler then does the actual work just before the VM
starts running.

memory_global_dirty_log_start also needs to delete the
VMChangeStateHandler if it is present.

Paolo

Re: [Qemu-devel] [PATCH for 2.10 04/35] ivshmem: fix incorrect error handling in ivshmem_recv_msg()

2017-07-25 Thread Markus Armbruster

Philippe Mathieu-Daudé  writes:

> If qemu_chr_fe_read_all() returns -EINTR the do {} statement continues and the
> n accumulator used to complete reads upto sizeof(msg) is decremented by 4 (the
> value of EINTR on Linux).
> To avoid that, use simpler if() statements and continue if EINTR occured.
>
> hw/misc/ivshmem.c:650:14: warning: Loss of sign in implicit conversion
> } while (n < sizeof(msg));
>  ^
>

Let's add "Screwed up in commit 3a55fc0f, v2.6.0."

> Reported-by: Clang Static Analyzer
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> get_maintainer.pl: No maintainers found!
>
>  hw/misc/ivshmem.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
> index a58f9ee579..47a015f072 100644
> --- a/hw/misc/ivshmem.c
> +++ b/hw/misc/ivshmem.c
> @@ -642,7 +642,10 @@ static int64_t ivshmem_recv_msg(IVShmemState *s, int 
> *pfd, Error **errp)
>  do {
>  ret = qemu_chr_fe_read_all(&s->server_chr, (uint8_t *)&msg + n,
> sizeof(msg) - n);
> -if (ret < 0 && ret != -EINTR) {
> +if (ret < 0) {
> +if (ret == -EINTR) {
> +continue;
> +}
>  error_setg_errno(errp, -ret, "read from server failed");
>  return INT64_MIN;
>  }

Reviewed-by: Markus Armbruster 

Paolo, you taking this through your miscellaneous queue would save me
(and possibly Peter) a bit of work.  Only if you have something queued
already.  Let me know.

Re: [Qemu-devel] [RFC PATCH 08/26] ppc/xive: add flags to the XIVE interrupt source

2017-07-25 Thread Cédric Le Goater

On 07/24/2017 11:50 AM, David Gibson wrote:
> On Mon, Jul 24, 2017 at 05:00:57PM +1000, Benjamin Herrenschmidt wrote:
>> On Mon, 2017-07-24 at 14:36 +1000, David Gibson wrote:
>>> On Wed, Jul 05, 2017 at 07:13:21PM +0200, Cédric Le Goater wrote:
 These flags define some characteristics of the source :

  - XIVE_SRC_H_INT_ESB  the Event State Buffer are controlled with a
specific hcall H_INT_ESB
>>>
>>> What's the other option?
>>
>> Direct MMIO access. Normally all interrupts use normal MMIOs,
>> each interrupts has an associated MMIO page with special MMIOs
>> to control the source state (PQ bits). This is something I added
>> to the PAPR spec (and the OPAL <-> Linux interface) to allow firmware
>> to work around broken HW (which happens on some P9 versions).
> 
> Ok.. and that's something that can be decided at runtime?
> 

This is a characteristic of an Interrupt Source and the associated 
object should be created with such a flag. But I don't think will 
ever use it in QEMU, maybe with KVM.

C.

Re: [Qemu-devel] [RFC] virtio-mem: paravirtualized memory

2017-07-25 Thread David Hildenbrand

(ping)

Hi,

this has been on these lists for quite some time now. I want to start
preparing a virtio spec for virtio-mem soon.

So if you have any more comments/ideas/objections/questions, now is the
right time to post them :)

Thanks!


On 16.06.2017 16:20, David Hildenbrand wrote:
> Hi,
> 
> this is an idea that is based on Andrea Arcangeli's original idea to
> host enforce guest access to memory given up using virtio-balloon using
> userfaultfd in the hypervisor. While looking into the details, I
> realized that host-enforcing virtio-balloon would result in way too many
> problems (mainly backwards compatibility) and would also have some
> conceptual restrictions that I want to avoid. So I developed the idea of
> virtio-mem - "paravirtualized memory".
> 
> The basic idea is to add memory to the guest via a paravirtualized
> mechanism (so the guest can hotplug it) and remove memory via a
> mechanism similar to a balloon. This avoids having to online memory as
> "online-movable" in the guest and allows more fain grained memory
> hot(un)plug. In addition, migrating QEMU guests after adding/removing
> memory gets a lot easier.
> 
> Actually, this has a lot in common with the XEN balloon or the Hyper-V
> balloon (namely: paravirtualized hotplug and ballooning), but is very
> different when going into the details.
> 
> Getting this all implemented properly will take quite some effort,
> that's why I want to get some early feedback regarding the general
> concept. If you have some alternative ideas, or ideas how to modify this
> concept, I'll be happy to discuss. Just please make sure to have a look
> at the requirements first.
> 
> ---
> 0. Outline:
> ---
> - I.General concept
> - II.   Use cases
> - III.  Identified requirements
> - IV.   Possible modifications
> - V.Prototype
> - VI.   Problems to solve / things to sort out / missing in prototype
> - VII.  Questions
> - VIII. Q&A
> 
> 
> I. General concept
> 
> 
> We expose memory regions to the guest via a paravirtualize interface. So
> instead of e.g. a DIMM on x86, such memory is not anounced via ACPI.
> Unmodified guests (without a virtio-mem driver) won't be able to see/use
> this memory. The virtio-mem guest driver is needed to detect and manage
> these memory areas. What makes this memory special is that it can grow
> while the guest is running ("plug memory") and might shrink on a reboot
> (to compensate "unplugged" memory - see next paragraph). Each virtio-mem
> device manages exactly one such memory area. By having multiple ones
> assigned to different NUMA nodes, we can modify memory on a NUMA basis.
> 
> Of course, we cannot shrink these memory areas while the guest is
> running. To be able to unplug memory, we do something like a balloon
> does, however limited to this very memory area that belongs to the
> virtio-mem device. The guest will hand back small chunks of memory. If
> we want to add memory to the guest, we first "replug" memory that has
> previously been given up by the guest, before we grow our memory area.
> 
> On a reboot, we want to avoid any memory holes in our memory, therefore
> we resize our memory area (shrink it) to compensate memory that has been
> unplugged. This highly simplifies hotplugging memory in the guest (
> hotplugging memory with random memory holes is basically impossible).
> 
> We have to make sure that all memory chunks the guest hands back on
> unplug requests will not consume memory in the host. We do this by
> write-protecting that memory chunk in the host and then dropping the
> backing pages. The guest can read this memory (reading from the ZERO
> page) but no longer write to it. For now, this will only work on
> anonymous memory. We will use userfaultfd WP (write-protect mode) to
> avoid creating too many VMAs. Huge pages will require more effort (no
> explicit ZERO page).
> 
> As we unplug memory on a fine grained basis (and e.g. not on
> a complete DIMM basis), there is no need to online virtio-mem memory
> as online-movable. Also, memory unplug support for Windows might be
> supported that way. You can find more details in the Q/A section below.
> 
> 
> The important points here are:
> - After a reboot, every memory the guest sees can be accessed and used.
>   (in contrast to e.g. the XEN balloon, see Q/A fore more details)
> - Rebooting into an unmodified guest will not result into random
>   crashed. The guest will simply not be able to use all memory without a
>   virtio-mem driver.
> - Adding/Removing memory will not require modifying the QEMU command
>   line on the migration target. Migration simply works (re-sizing memory
>   areas is already part of the migration protocol!). Essentially, this
>   makes adding

Re: [Qemu-devel] [PATCH 11/13] instrument: [qapi] Add library loader

2017-07-25 Thread Lluís Vilanova

Eric Blake writes:

> On 07/24/2017 12:46 PM, Lluís Vilanova wrote:
>> Signed-off-by: Lluís Vilanova 
>> ---
>> instrument/Makefile.objs |1 +
>> instrument/qmp.c |   71 
>> qapi-schema.json |3 ++
>> qapi/instrument.json |   92 
>> ++
>> 4 files changed, 167 insertions(+)
>> create mode 100644 instrument/qmp.c
>> create mode 100644 qapi/instrument.json

> Adding new files; but I don't see a patch to MAINTAINERS to cover
> instrument/*.

Who should I put as a maintainer? Or does this go to the general maintainer(s)?


>> +++ b/qapi/instrument.json
>> @@ -0,0 +1,92 @@
>> +# *-*- Mode: Python -*-*
>> +#
>> +# QAPI trace instrumentation control commands.
>> +#
>> +# Copyright (C) 2012-2017 Lluís Vilanova 
>> +#
>> +# This work is licensed under the terms of the GNU GPL, version 2 or later.
>> +# See the COPYING file in the top-level directory.
>> +
>> +##
>> +# @InstrLoadCode:
>> +#
>> +# Result code of an 'instr-load' command.
>> +#
>> +# @ok: Correctly loaded.
>> +# @unavailable: Service not available.
>> +# @error: Error with libdl (see 'msg').
>> +#
>> +# Since: 2.10

> This is a new feature, and you've missed soft freeze.  You'll want to
> use 2.11 throughout the file.

Thx!


>> +##
>> +{ 'enum': 'InstrLoadCode',
>> +  'data': [ 'ok', 'unavailable', 'error' ] }
>> +
>> +##
>> +# @InstrLoadResult:
>> +#
>> +# Result of an 'instr-load' command.
>> +#
>> +# @code: Result code.
>> +# @msg: Additional error message.

> Worth a comment that the message is for human consumption, and should
> not be further parsed by machine?

> Should msg be optional, present only when there is an error?

True.


>> +# @handle: Instrumentation library identifier (undefined in case of error).

> Should it be an optional member, omitted when there is an error?

Also true.


>> +#
>> +# Since: 2.10
>> +##
>> +{ 'struct': 'InstrLoadResult',
>> +  'data': { 'code': 'InstrLoadCode', 'msg': 'str', 'handle': 'int' } }
>> +
>> +##
>> +# @instr-load:
>> +#
>> +# Load an instrumentation library.
>> +#
>> +# @path: path to the dynamic instrumentation library
>> +# @args: arguments to the dynamic instrumentation library
>> +#
>> +# Since: 2.10
>> +##
>> +{ 'command': 'instr-load',
>> +  'data':{ 'path': 'str', '*args': ['String'] },

> Why are you double-nesting things?  It's a lot nicer to use ['str']
> (that is, your way requires
>  "arguments":{"path":"/some/path",
>"args": [ { "str": "string1" }, { "str": "string2" } ] }

> whereas mine only needs:
>  "arguments":{"path":"/some/path", "args":[ "string1", "string2" ]}

Aha, you mean the definition should be this instead?

  { 'command': 'instr-load',
'data':{ 'path': 'str', '*args': ['str'] },
'returns': 'InstrLoadResult' }


>> +  'returns': 'InstrLoadResult' }
>> +
>> +
>> +##
>> +# @InstrUnloadCode:
>> +#
>> +# Result code of an 'instr-unload' command.
>> +#
>> +# @ok: Correctly unloaded.
>> +# @unavailable: Service not available.
>> +# @invalid: Invalid handle.
>> +# @error: Error with libdl (see 'msg').
>> +#
>> +# Since: 2.10
>> +##
>> +{ 'enum': 'InstrUnloadCode',
>> +  'data': [ 'ok', 'unavailable', 'invalid', 'error' ] }
>> +
>> +##
>> +# @InstrUnloadResult:
>> +#
>> +# Result of an 'instr-unload' command.
>> +#
>> +# @code: Result code.
>> +# @msg: Additional error message.

> Again, should msg be optional?  Document that it is only for human
> consumption.

Ok.


>> +#
>> +# Since: 2.10
>> +##
>> +{ 'struct': 'InstrUnloadResult',
>> +  'data': { 'code': 'InstrUnloadCode', 'msg': 'str' } }
>> +
>> +##
>> +# @instr-unload:
>> +#
>> +# Unload an instrumentation library.
>> +#
>> +# @handle: Instrumentation library identifier (see #InstrLoadResult).
>> +#
>> +# Since: 2.10
>> +##
>> +{ 'command': 'instr-unload',
>> +  'data': { 'handle': 'int' },
>> +  'returns': 'InstrUnloadResult' }
>> 
>> 


Thanks,
  Lluis

Re: [Qemu-devel] [RFC PATCH 08/26] ppc/xive: add flags to the XIVE interrupt source

2017-07-25 Thread Cédric Le Goater

On 07/25/2017 07:47 AM, Benjamin Herrenschmidt wrote:
> On Tue, 2017-07-25 at 14:18 +1000, David Gibson wrote:
>>> Well, at this point I think nothing will set that flag It's there
>>> for workaround around HW bugs on some chips. At least in full emu it
>>> shouldn't happen unless we try to emulate those bugs. Hopefully direct
>>> MMIO will just work.
>>
>> Hm.  That doesn't seem like a good match for a per-irq state
>> structure.
> 
> The flag is returned to the guest on a per-IRQ basis, so are the LSI
> etc... flags, but at the HW level, indeed, they correspond to
> attributes of blocks of interrupts.
> 
> It might be easier in qemu to keep that in the per-source flags
> though.

Theses flags are at the source level :

  XIVE_SRC_H_INT_ESB 
  XIVE_SRC_TRIGGER
  XIVE_SRC_STORE_EOI  


The XIVE_SRC_LSI flag is only returned in the GET_SOURCE_INFO hcall 
but, internally, we use the ICSIRQState flags XICS_FLAGS_IRQ_LSI to 
store the information per-irq.   
 
Thanks,

C.


> Especially when we start having actual HW interrupts under the
> hood with KVM. it's easier to keep the state self contained
> for each of them.
> 
> Ben.
>

Re: [Qemu-devel] [PATCH for 2.10] ps2: fix sending of PAUSE/BREAK scancodes

2017-07-25 Thread Daniel P. Berrange

On Mon, Jul 24, 2017 at 09:55:20PM +0200, Hervé Poussineau wrote:
> Le 24/07/2017 à 18:46, Daniel P. Berrange a écrit :
> > The processing of the scancodes for PAUSE/BREAK  has been broken since
> > the conversion to qcodes in:
> > 
> >   commit 8c10e0baf0260b59a4e984744462a18016662e3e
> >   Author: Hervé Poussineau 
> >   Date:   Thu Sep 15 22:06:26 2016 +0200
> > 
> > ps2: use QEMU qcodes instead of scancodes
> > 
> > When using a VNC client, with the raw scancode extension, the client
> > will send a scancode of 0xc6 for both PAUSE and BREAK. There is mistakenly
> > no entry in the qcode_to_number table for this scancode, so
> > ps2_keyboard_event() just generates a log message and discards the
> > scancode
> > 
> > When using a SPICE client, it will also send 0xc6 for BREAK, but
> > will send 0xe1 0x1d 0x45 0xe1 0x9d 0xc5 for PAUSE. There is no
> > entry in the qcode_to_number table for the scancode 0xe1 because
> > it is a special XT keyboard prefix not mapping to any QKeyCode.
> > Again ps2_keyboard_event() just generates a log message and discards
> > the scancode. The following 0x1d, 0x45, 0x9d, 0xc5 scancodes get
> > handled correctly. Fixing this just requires special casing 0xe1 so
> > it is directly queued for sending to the guest, skipping any conversion
> > to QKeyCode.
> 
> Your commit message is divided in 2 parts, and you're touching 1 file for 
> each part of the commit message.
> Should it be 2 patches instead?
> 
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  hw/input/ps2.c| 7 +++
> >  ui/input-keymap.c | 1 +
> >  2 files changed, 8 insertions(+)
> > 
> > diff --git a/hw/input/ps2.c b/hw/input/ps2.c
> > index 3ba05efd06..a132d1ba72 100644
> > --- a/hw/input/ps2.c
> > +++ b/hw/input/ps2.c
> > @@ -607,6 +607,13 @@ static void ps2_keyboard_event(DeviceState *dev, 
> > QemuConsole *src,
> >  assert(evt->type == INPUT_EVENT_KIND_KEY);
> >  qcode = qemu_input_key_value_to_qcode(key->key);
> > 
> > +if (qcode == 0 &&
> > +key->key->type == KEY_VALUE_KIND_NUMBER &&
> > +key->key->u.number.data == 0x61) {
> > +ps2_put_keycode(s, 0xe1);
> > +return;
> > +}
> > +
> 
> You're putting some specific code for spice in ps2 emulation.
> IMO, the workaround should be moved to spice keyboard handling 
> (ui/spice-input.c),
> which needs to generate a qcode instead of a scancode.

This isn't really a spice specific hack. QEMU internal code is *not* required
to use qcodes - the KeyValue struct is a union that allows use of either qcodes
or XT scancodes, and the latter is what all the frontends (SPICE, VNC, GTk, SDL)
use. QCodes are really only input by the monitor (the sendkey command).

> Here, you're making ps2 emulation work again with spice, but you're missing 
> all
> other emulations using qcodes. Do you want to also modify them?

I checked the USB keyboard device and that works fine. AFAIK, the PS/2
code is the only code that takes XT scancodes, converts to qcode and
then converts qcodes back to scancodes - most others avoid this kind of
roundtripping via qcodes.

> I understand that it may be the easiest way to fix the regression for 2.10, 
> but
> it needs at least a comment.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH] io: Always remove an old channel watch before adding a new one.

2017-07-25 Thread Daniel P. Berrange

On Mon, Jul 24, 2017 at 11:15:44AM -0700, Brandon Carpenter wrote:
> Also set saved handle to zero when removing without adding a new watch.
> 
> Signed-off-by: Brandon Carpenter 
> ---
>  ui/vnc-auth-vencrypt.c | 3 +++
>  ui/vnc-ws.c| 6 ++
>  ui/vnc.c   | 4 
>  3 files changed, 13 insertions(+)

Reviewed-by: Daniel P. Berrange 



Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH for 2.10 09/35] ui/vnc: fix leak of SocketAddress **

2017-07-25 Thread Daniel P. Berrange

On Mon, Jul 24, 2017 at 03:27:25PM -0300, Philippe Mathieu-Daudé wrote:
> Extract the (correct) cleaning code as a new function vnc_free_addresses() 
> then
> use it to remove the memory leaks.
> 
> Reported-by: Clang Static Analyzer
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  ui/vnc.c | 36 ++--
>  1 file changed, 18 insertions(+), 18 deletions(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH] io: Improve websocket support by becoming more RFC compliant.

2017-07-25 Thread Daniel P. Berrange

On Mon, Jul 24, 2017 at 11:42:17AM -0700, Brandon Carpenter wrote:
> Remembering the opcode is sufficient for handling fragmented frames from
> the client, which may be introduced by an intermediary server/proxy.
> Respond to pings and ignore pongs rather than close the connection as
> many browsers use ping/pong to test an idle connection. Close
> connections according to the RFC, including providing reason code and
> message to aid debugging of unexpected disconnects. Empty payloads
> should not cause a disconnect.

Couuld you split this up into multiple patches, each one only fixing
a single problem at a time. It is rather hard to review this for
correctness when everything is changed at once.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH] configure: fix qemu-ga missing '.exe' extension on windows

2017-07-25 Thread Daniel P. Berrange

On Mon, Jul 24, 2017 at 10:15:30PM -0300, Philippe Mathieu-Daudé wrote:
> Reported-by: Sameeh Jubran 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> original report from Sameeh Jubran:
> http://lists.nongnu.org/archive/html/qemu-devel/2017-07/msg00906.html
> 
>  Makefile  | 2 +-
>  configure | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Daniel P. Berrange 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH V2] vhost: fix a migration failed because of vhost region merge

2017-07-25 Thread Igor Mammedov

On Mon, 24 Jul 2017 23:50:00 +0300
"Michael S. Tsirkin"  wrote:

> On Mon, Jul 24, 2017 at 11:14:19AM +0200, Igor Mammedov wrote:
> > On Sun, 23 Jul 2017 20:46:11 +0800
> > Peng Hao  wrote:
> >   
> > > When a guest that has several hotplugged dimms is migrated, on
> > > destination it will fail to resume. Because regions on source
> > > are merged and on destination the order of realizing devices
> > > is different from on source with dimms, so when part of devices
> > > are realizd some region can not be merged.That may be more than
> > > vhost slot limit.
> > > 
> > > Signed-off-by: Peng Hao 
> > > Signed-off-by: Wang Yechao 
> > > ---
> > >  hw/mem/pc-dimm.c| 2 +-
> > >  include/sysemu/sysemu.h | 1 +
> > >  vl.c| 5 +
> > >  3 files changed, 7 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
> > > index ea67b46..13f3db5 100644
> > > --- a/hw/mem/pc-dimm.c
> > > +++ b/hw/mem/pc-dimm.c
> > > @@ -101,7 +101,7 @@ void pc_dimm_memory_plug(DeviceState *dev, 
> > > MemoryHotplugState *hpms,
> > >  goto out;
> > >  }
> > >  
> > > -if (!vhost_has_free_slot()) {
> > > +if (!vhost_has_free_slot() && qemu_is_machine_init_done()) {
> > >  error_setg(&local_err, "a used vhost backend has no free"
> > > " memory slots left");  
> > that doesn't fix issue,
> >1st: number of used entries is changing after machine_init_done() is 
> > called
> > as regions continue to mapped/unmapped during runtime  
> 
> But that's fine, we want hotplug to fail if we can not guarantee vhost
> will work.
don't we want guarantee that vhost will work with dimm devices at startup
if it were requested on CLI or fail startup cleanly if it can't?

> 
> >2nd: it brings regression and allows to start QEMU with number memory
> > regions more than supported by backend, which combined with missing
> > error handling in vhost will lead to qemu crashes or obscure bugs in
> > guest breaking vhost enabled drivers.
> > i.e. patch undoes what were fixed by
> > https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg00789.html 
> >  
> 
> Why does it? The issue you fixed there is hotplug, and that means
> pc_dimm_memory_plug called after machine done.
I wasn't able to crash fc24 guest with current qemu/rhen7 kernel,
it fallbacks back to virtio and switches off vhost. 


> 
> >   
> > >  goto out;
> > > diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> > > index b213696..48228ad 100644
> > > --- a/include/sysemu/sysemu.h
> > > +++ b/include/sysemu/sysemu.h
> > > @@ -88,6 +88,7 @@ void qemu_system_guest_panicked(GuestPanicInformation 
> > > *info);
> > >  void qemu_add_exit_notifier(Notifier *notify);
> > >  void qemu_remove_exit_notifier(Notifier *notify);
> > >  
> > > +bool qemu_is_machine_init_done(void);
> > >  void qemu_add_machine_init_done_notifier(Notifier *notify);
> > >  void qemu_remove_machine_init_done_notifier(Notifier *notify);
> > >  
> > > diff --git a/vl.c b/vl.c
> > > index fb6b2ef..43aee22 100644
> > > --- a/vl.c
> > > +++ b/vl.c
> > > @@ -2681,6 +2681,11 @@ static void qemu_run_exit_notifiers(void)
> > >  
> > >  static bool machine_init_done;
> > >  
> > > +bool qemu_is_machine_init_done(void)
> > > +{
> > > +return machine_init_done;
> > > +}
> > > +
> > >  void qemu_add_machine_init_done_notifier(Notifier *notify)
> > >  {
> > >  notifier_list_add(&machine_init_done_notifiers, notify);

Re: [Qemu-devel] [RFC PATCH 04/26] ppc/xive: introduce a skeleton for the XIVE interrupt controller model

2017-07-25 Thread Cédric Le Goater

On 07/24/2017 12:03 PM, David Gibson wrote:
> On Mon, Jul 24, 2017 at 05:20:26PM +1000, Benjamin Herrenschmidt wrote:
>> On Mon, 2017-07-24 at 15:38 +1000, David Gibson wrote:
>>>
>>> Can we assign our logical numbers sparsely, or will that cause other
>>> problems?
>>
>> The main issue is that they probably needs to be the same between XICS
>> and XIVE because by the time we get the CAS call to chose between XICS
>> and XIVE, we have already handed out interrupts and constructed the DT,
>> no ? Unless we do a real CAS reboot...
> 
> A real CAS reboot probably isn't unreasonable for this case.
> 
> I definitely think we need to go one way or the other - either fully
> unify the irq mapping between xics and xive, or fully separate them.

To be able to change interrupt model at CAS time, we need to unify 
the IRQ numbering. We don't have much choice because the DT is 
already populated. We also need to share the ICSIRQState flags unless
we share the interrupt source object between the XIVE and XICS mode. 

In my current tree, I made sure that the same IRQ number ranges 
were being used in the XIVE and in the XICS allocator and that the 
ICSIRQState flags of the different sPAPR Interrupt sources (XIVE 
and XICS) were in sync. That works pretty well for reset, migration 
and hotplug, but it is bit hacky.

C.


>> Otherwise, there's no reason they can't be sparse no.
>>
>>> Note that for PAPR we also have the question of finding logical
>>> interrupts for legacy PAPR VIO devices.
>>
>> We just make them another range ? With KVM legacy today, I just use the
>> generic interrupt facility for those. So when you do the ioctl to
>> "trigger" one, I just do an MMIO to the corresponding page and the
>> interrupt magically shows up wherever the guest is running the target
>> vcpu. In fact, I'd like to add a way to mmap that page into qemu so
>> that qemu can triggers them without an ioctl.
> 
> Ok.
> 
>> The guest doesn't care, from the guest perspective they are interrupts
>> coming from the DT, so they are like PCI etc...
> 
> Ok.
> 
 We can fix the number of "generic" interrupts given to a guest. The
 only requirements from a PAPR perspective is that there should be at
 least as many as there are possible threads in the guest so they can be
 used as IPIs.
>>>
>>> Ok.  If we can do things sparsely, allocating these well away from the
>>> hw interrupts would make things easier.
>>>
 But we may need more for other things. We can make this a machine
 parameter with a default value of something like 4096. If we call N
 that number of extra generic interrupts, then the number of generic
 interrutps would be #possible-vcpu's + N, or something like that.
>>>
>>> That seems reasonable.
>>>
>> But it's fundamentally an allocator that sits in the hypervisor, so in
>> our case, I would say in the spapr "component" of XIVE, rather than the
>> XIVE HW model itself.
>
> Maybe..

 You are right in that a mapping is a better term than an allocator
 here.

>> Now what Cedric did, because XIVE is very complex and we need something
>> for PAPR quickly, is not a complete HW model, but a somewhat simplified
>> one that only handles what PAPR exposes. So in that case where the
>> allocator sits is a bit of a TBD...
>
> Hm, ok.  My concern here is that "dynamic" allocation of irqs at the
> machine type level needs extreme caution, or the irqs may not be
> stable which will generally break migration.

 Yes you are right. We should probably create a more "static" scheme.
>>>
>>> Sounds like we're in violent agreement.
>>
>> Yup :)
>>
>

[Qemu-devel] [PATCH] qemu-iotests: Fix reference output for 186

2017-07-25 Thread Kevin Wolf

Commits 70f17a1 ('error: Revert unwanted change of warning messages')
and e1824e5 ('qemu-iotests: Test 'info block'') had a semantic merge
conflict, which results in failure for qemu-iotests case 186. Fix the
reference output to consider the changes of 70f17a1.

Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/186.out | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/186.out b/tests/qemu-iotests/186.out
index b963b12..b8bf9a2 100644
--- a/tests/qemu-iotests/186.out
+++ b/tests/qemu-iotests/186.out
@@ -442,7 +442,7 @@ ide0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
 Cache mode:   writeback
 (qemu) quit
 
-warning: qemu-system-x86_64: -drive if=scsi,driver=null-co: bus=0,unit=0 is 
deprecated with this machine type
+qemu-system-x86_64: -drive if=scsi,driver=null-co: warning: bus=0,unit=0 is 
deprecated with this machine type
 Testing: -drive if=scsi,driver=null-co
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) info block
@@ -451,7 +451,7 @@ scsi0-hd0 (NODE_NAME): null-co:// (null-co)
 Cache mode:   writeback
 (qemu) quit
 
-warning: qemu-system-x86_64: -drive if=scsi,media=cdrom: bus=0,unit=0 is 
deprecated with this machine type
+qemu-system-x86_64: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
deprecated with this machine type
 Testing: -drive if=scsi,media=cdrom
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) info block
@@ -460,7 +460,7 @@ scsi0-cd0: [not inserted]
 Removable device: not locked, tray closed
 (qemu) quit
 
-warning: qemu-system-x86_64: -drive if=scsi,driver=null-co,media=cdrom: 
bus=0,unit=0 is deprecated with this machine type
+qemu-system-x86_64: -drive if=scsi,driver=null-co,media=cdrom: warning: 
bus=0,unit=0 is deprecated with this machine type
 Testing: -drive if=scsi,driver=null-co,media=cdrom
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) info block
-- 
1.8.3.1

Re: [Qemu-devel] [Qemu-block] [PATCH v3] util: remove the obsolete non-blocking connect

2017-07-25 Thread Daniel P. Berrange

On Tue, Jul 25, 2017 at 08:18:31AM +0200, Markus Armbruster wrote:
> Mao Zhongyi  writes:
> 
> > On 07/25/2017 06:07 AM, John Snow wrote:
> >> This was posted over a month ago with two R-Bs, did it get merged or 
> >> dropped?
> >>
> >> --js
> >
> > Not yet, I hope that it will.
> 
> get_maintainers.pl blames Daniel, Gerd and Paolo :)

Yeah, I totally forgot that I was listed as a maintainer of the
util/qemu-sockets.c file when I reviewed this & assumed Gerd/Paolo
would take it.

I'll put it on my merge queue right now, but unfortunately I can't
justify sending this during freeze as it isn't a bugfix, so it'll
have to wait until 2.11 opens.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[Qemu-devel] [Bug 1181354] Re: assert failed in scsi-bus.c line 1539 in SCSI_XFER_NONE

2017-07-25 Thread Thomas Huth

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1181354

Title:
  assert failed in scsi-bus.c line 1539 in SCSI_XFER_NONE

Status in QEMU:
  Incomplete

Bug description:
  Every time I format a SCSI hard disk (on ID 0) with Windows NT or DOS,
  QEMU crashes with an assertion failure on scsi-bus.c, any help?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1181354/+subscriptions

[Qemu-devel] [Bug 1473451] Re: Please support the native bios format for dec alpha

2017-07-25 Thread Thomas Huth

** Changed in: qemu
   Importance: Undecided => Wishlist

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1473451

Title:
  Please support the native bios format for dec alpha

Status in QEMU:
  New

Bug description:
  Currently qemu-system-alpha -bios parameter takes an ELF image.
  However HP maintains firmware updates for those systems.

  Some example rom files can be found here
  
ftp://ftp.hp.com/pub/alphaserver/firmware/current_platforms/v7.3_release/DS20_DS20e/

  It might allow things like using the SRM firmware.
  The ARC(nt) firmware would allow to build and test windows applications for 
that platforms without having the relevant hardware

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1473451/+subscriptions

Re: [Qemu-devel] [RFC PATCH 14/26] ppc/xive: add MMIO handlers to the XIVE interrupt presenter model

2017-07-25 Thread Cédric Le Goater

On 07/25/2017 06:20 AM, David Gibson wrote:
> On Mon, Jul 24, 2017 at 04:44:00PM +0200, Cédric Le Goater wrote:
>> On 07/24/2017 08:35 AM, David Gibson wrote:
>>> On Wed, Jul 05, 2017 at 07:13:27PM +0200, Cédric Le Goater wrote:
 The Thread Interrupt Management Area for the OS is mostly used to
 acknowledge interrupts and set the CPPR of the CPU.

 The TIMA is mapped at the same address for each CPU. 'current_cpu' is
 used to retrieve the targeted interrupt presenter object.

 Signed-off-by: Cédric Le Goater 
>>>
>>> Am I right in thinking that this shoehorns the XIVE TIMA state into
>>> the existing XICS ICP object.  That.. doesn't seem like a good idea.
>>
>> The TIMA memory region is under the XIVE object because it is 
>> unique for the system. The lookup of the ICP is simply done using 
>> 'current_cpu'. The TIMA state is under the ICPState, yes, but this 
>> model does not seem incorrect to me as this state contains the 
>> interrupt information presented to a CPU.
> 
> Yeah, that's not the point I'm making.  My point is that the TIMA
> state isn't really the same as xics ICP state.  You're squeezing one
> into the other in a pretty ugly way.

yes, well, we need to have compatible objects between the XICS and XIVE 
mode because of the CAS negotiation. for migration compatibility, it is 
much easier to extend existing objects. This approach I am taking today.


C.

Re: [Qemu-devel] [PATCH] Makefile: add all-user/all-linux-user/all-softmmu meta-targets

2017-07-25 Thread Peter Maydell

On 25 July 2017 at 07:11, Philippe Mathieu-Daudé  wrote:
> Useful to build a whole set at once.

You can already do this by not specifying a --target-list=
option and using whichever of --disable-system and
--disable-user you need to skip the ones you don't want,
can't you?

thanks
-- PMM

Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH 19/26] ppc/xive: introduce a helper to map the XIVE memory regions

2017-07-25 Thread Cédric Le Goater

On 07/25/2017 04:54 AM, Alexey Kardashevskiy wrote:
> On 06/07/17 03:13, Cédric Le Goater wrote:
>> It will be used when the guest chooses the XIVE exploitation mode in
>> CAS.
>>
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  hw/intc/xive.c| 11 +++
>>  include/hw/ppc/xive.h |  2 ++
>>  2 files changed, 13 insertions(+)
>>
>> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
>> index cda1fa18e44d..895dd2b2f61b 100644
>> --- a/hw/intc/xive.c
>> +++ b/hw/intc/xive.c
>> @@ -915,3 +915,14 @@ bool xive_eq_for_target(XIVE *x, uint32_t target, 
>> uint8_t priority,
>>  
>>  return true;
>>  }
>> +
>> +void xive_mmio_map(XIVE *x)
>> +{
>> +/* ESBs */
>> +sysbus_mmio_map(SYS_BUS_DEVICE(x), 0, x->vc_base);
>> +
>> +/* Thread Management Interrupt Areas */
>> +/* TODO: Only map the OS TIMA for the moment. Mapping the whole
>> + * region needs some rework in the handlers */
>> +sysbus_mmio_map(SYS_BUS_DEVICE(x), 1, x->tm_base + (1 << x->tm_shift));
>> +}
> 
> 
> imho it makes more sense to squash such small patches (this one and 20/26,
> 21/26) into those which actually make use of the new helpers - easier to
> review, better for bisectability.

ok. I am also realizing we should unmap.

Thanks,

C.
 
> 
>> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
>> index 288116aeb8f4..560f6ab66f73 100644
>> --- a/include/hw/ppc/xive.h
>> +++ b/include/hw/ppc/xive.h
>> @@ -68,4 +68,6 @@ typedef struct sPAPRMachineState sPAPRMachineState;
>>  void xive_spapr_init(sPAPRMachineState *spapr);
>>  void xive_spapr_populate(XIVE *x, void *fdt);
>>  
>> +void xive_mmio_map(XIVE *x);
>> +
>>  #endif /* PPC_XIVE_H */
>>
> 
>

Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH 26/26] spapr: force XIVE exploitation mode for POWER9 (HACK)

2017-07-25 Thread Cédric Le Goater

On 07/25/2017 04:43 AM, Alexey Kardashevskiy wrote:
> On 06/07/17 03:13, Cédric Le Goater wrote:
>> The CAS negotiation process determines the interrupt controller model
>> to use in the guest but currently, the sPAPR machine make uses of the
>> controller very early in the initialization sequence. The interrupt
>> source is used to allocate IRQ numbers and populate the device tree
>> and the interrupt presenter objects are created along with the CPU.
>>
>> One solution would be use a bitmap to allocate these IRQ numbers and
>> then instantiate the interrupt source object of the correct type with
>> the bitmap as a constructor parameter.
>>
>> As for the interrupt presenter objects, we could allocated them later
>> in the boot process. May be on demand, when a CPU is first notified.
>>
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  hw/ppc/spapr.c | 62 
>> ++
>>  1 file changed, 62 insertions(+)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index ca3a6bc2ea16..623fc776c886 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -237,6 +237,38 @@ error:
>>  return NULL;
>>  }
>>  
>> +static XiveICSState *spapr_xive_ics_create(XIVE *x, int nr_irqs, Error 
>> **errp)
>> +{
>> +Error *local_err = NULL;
>> +int irq_base;
>> +Object *obj;
>> +
>> +/*
>> + * TODO: use an XICS_IRQ_BASE alignment to be in sync with XICS
>> + * irq numbers. we should probably simplify the XIVE model or use
>> + * a common allocator. a bitmap maybe ?
>> + */
>> +irq_base = xive_alloc_hw_irqs(x, nr_irqs, XICS_IRQ_BASE);
>> +if (irq_base < 0) {
>> +error_setg(errp, "Failed to allocate %d irqs", nr_irqs);
>> +return NULL;
>> +}
>> +
>> +obj = object_new(TYPE_ICS_XIVE);
>> +object_property_add_child(OBJECT(x), "hw", obj, NULL);
>> +
>> +xive_ics_create(ICS_XIVE(obj), x, irq_base, nr_irqs, 16 /* 64KB page */,
>> +XIVE_SRC_TRIGGER, &local_err);
>> +if (local_err) {
>> +goto error;
>> +}
>> +return ICS_XIVE(obj);
>> +
>> +error:
>> +error_propagate(errp, local_err);
>> +return NULL;
>> +}
>> +
>>  static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
>>int smt_threads)
>>  {
>> @@ -814,6 +846,11 @@ static int spapr_dt_cas_updates(sPAPRMachineState 
>> *spapr, void *fdt,
>>  /* /interrupt controller */
>>  if (!spapr_ovec_test(ov5_updates, OV5_XIVE_EXPLOIT)) {
>>  spapr_dt_xics(xics_max_server_number(), fdt, PHANDLE_XICP);
>> +} else {
>> +xive_spapr_populate(spapr->xive, fdt);
>> +
>> +/* Install XIVE MMIOs */
>> +xive_mmio_map(spapr->xive);
> 
> 
> xive_mmio_map() could be called where sysbus_init_mmio() is called as once
> these are mmap'ed, they are never unmapped and tm_base/vc_base never
> change. And XIVE is always created on P9 anyway.

OK. So you don't think we should map/unmap depending on 
CAS negotiation of the OV5_XIVE_EXPLOIT bit ? 

Thanks,

C. 


> 
> 
>>  }
>>  
>>  offset = fdt_path_offset(fdt, "/chosen");
>> @@ -963,6 +1000,13 @@ static void spapr_dt_ov5_platform_support(void *fdt, 
>> int chosen)
>>  } else {
>>  val[3] = 0x00; /* Hash */
>>  }
>> +
>> +/* TODO: introduce a kvmppc_has_cap_xive() ? Works with
> 
> 
> Yes.
> 
>> + * irqchip=off for now
>> + */
>> +if (first_ppc_cpu->env.excp_model & POWERPC_EXCP_POWER9) {
>> +val[1] = 0x01;
>> +}
>>  } else {
>>  if (first_ppc_cpu->env.mmu_model & POWERPC_MMU_V3) {
>>  /* V3 MMU supports both hash and radix (with dynamic switching) 
>> */
>> @@ -971,6 +1015,9 @@ static void spapr_dt_ov5_platform_support(void *fdt, 
>> int chosen)
>>  /* Otherwise we can only do hash */
>>  val[3] = 0x00;
>>  }
>> +if (first_ppc_cpu->env.excp_model & POWERPC_EXCP_POWER9) {
>> +val[1] = 0x01;
>> +}
>>  }
>>  _FDT(fdt_setprop(fdt, chosen, "ibm,arch-vec-5-platform-support",
>>   val, sizeof(val)));
>> @@ -2237,6 +2284,21 @@ static void ppc_spapr_init(MachineState *machine)
>>  spapr->ov5 = spapr_ovec_new();
>>  spapr->ov5_cas = spapr_ovec_new();
>>  
>> +/* TODO: force XIVE mode by default on POWER9.
>> + *
>> + * Switching from XICS to XIVE is badly broken. The ICP type is
>> + * incorrect and the ICS is needed before the CAS negotiation to
>> + * allocate irq numbers ...
>> + */
>> +if (strstr(machine->cpu_model, "POWER9") ||
>> +!strcmp(machine->cpu_model, "host")) {
>> +spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
>> +
>> +spapr->icp_type = TYPE_XIVE_ICP;
>> +spapr->ics = ICS_BASE(
>> +spapr_xive_ics_create(spapr->xive, XICS_IRQS_SPAPR, 
>> &error_fatal));
>> +}
>> +
>>  if (smc->dr_lmb_enabled) {
>>  spapr_ovec

Re: [Qemu-devel] [PATCH 4/4] tests/pxe-test: add testcase using vhost-user-bridge

2017-07-25 Thread Jens Freimann


On Tue, Jul 25, 2017 at 01:06:51AM +0300, Michael S. Tsirkin wrote:

On Fri, Jul 21, 2017 at 11:55:53AM +0200, Jens Freimann wrote:

From: Jens Freimann 

Add a PXE testcase tunneling traffic through vhost-user-bridge process.
Create a vhost-user-bridge  process and connect it to qemu via a socket.

Signed-off-by: Jens Freimann 
---
 tests/Makefile.include |   4 +-
 tests/pxe-test.c   | 140 -
 2 files changed, 142 insertions(+), 2 deletions(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 7af278d..e40550c 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -705,7 +705,8 @@ tests/boot-order-test$(EXESUF): tests/boot-order-test.o 
$(libqos-obj-y)
 tests/boot-serial-test$(EXESUF): tests/boot-serial-test.o $(libqos-obj-y)
 tests/bios-tables-test$(EXESUF): tests/bios-tables-test.o \
tests/boot-sector.o tests/acpi-utils.o $(libqos-obj-y)
-tests/pxe-test$(EXESUF): tests/pxe-test.o tests/boot-sector.o $(libqos-obj-y)
+tests/pxe-test$(EXESUF): tests/pxe-test.o tests/boot-sector.o \
+tests/vhost-user-bridge$(EXESUF) $(libqos-obj-y)
 tests/tmp105-test$(EXESUF): tests/tmp105-test.o $(libqos-omap-obj-y)
 tests/ds1338-test$(EXESUF): tests/ds1338-test.o $(libqos-imx-obj-y)
 tests/m25p80-test$(EXESUF): tests/m25p80-test.o
@@ -834,6 +835,7 @@ $(patsubst %, check-qtest-%, $(QTEST_TARGETS)): 
check-qtest-%: $(check-qtest-y)
$(if $(CONFIG_GCOV),@rm -f *.gcda */*.gcda */*/*.gcda */*/*/*.gcda,)
$(call quiet-command,QTEST_QEMU_BINARY=$*-softmmu/qemu-system-$* \
QTEST_QEMU_IMG=qemu-img$(EXESUF) \
+   QTEST_VUBR_BINARY=./tests/vhost-user-bridge$(EXESUF) \
MALLOC_PERTURB_=$${MALLOC_PERTURB_:-$$(( $${RANDOM:-0} % 255 + 
1))} \
gtester $(GTESTER_OPTIONS) -m=$(SPEED) $(check-qtest-$*-y) 
$(check-qtest-generic-y),"GTESTER","$@")
$(if $(CONFIG_GCOV),@for f in $(gcov-files-$*-y) 
$(gcov-files-generic-y); do \
diff --git a/tests/pxe-test.c b/tests/pxe-test.c
index 34282d3..d6379a8 100644
--- a/tests/pxe-test.c
+++ b/tests/pxe-test.c
@@ -5,7 +5,8 @@
  *
  * Authors:
  *  Michael S. Tsirkin ,
- *  Victor Kaplansky 
+ *  Victor Kaplansky ,
+ *  Jens Freimann 
  *
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
@@ -13,14 +14,150 @@

 #include "qemu/osdep.h"
 #include 
+#include 
 #include "qemu-common.h"
 #include "libqtest.h"
 #include "boot-sector.h"
+#include 

+#define LPORT 
+#define RPORT 
 #define NETNAME "net0"
+#define QEMU_CMD_MEM"--enable-kvm -m %d "\
+"-object memory-backend-file,id=mem,size=%dM,"\
+"mem-path=%s,share=on -numa node,memdev=mem -mem-prealloc 
"
+#define QEMU_CMD_CHR" -chardev socket,id=%s,path=%s"
+#define QEMU_CMD_NETDEV " -device virtio-net-pci,netdev=net0 "\
+" -netdev vhost-user,id=net0,chardev=%s,vhostforce "\
+" -netdev user,id=n0,tftp=./,bootfile=%s "\
+" -netdev socket,id=n1,fd=%d"
+#define QEMU_CMD_NET" -device virtio-net-pci,netdev=n0 "\
+" -device virtio-net-pci,netdev=n1 "
+
+#define QEMU_CMDQEMU_CMD_MEM QEMU_CMD_CHR \
+QEMU_CMD_NETDEV QEMU_CMD_NET
+
+#define HUGETLBFS_MAGIC   0x958458f6
+#define VUBR_SOCK "vubr.sock"
+#define MEMSZ 1024

 static char disk[] = "tests/pxe-test-disk-XX";

+static const char *init_hugepagefs(const char *path)
+{
+struct statfs fs;
+int ret;
+
+if (access(path, R_OK | W_OK | X_OK)) {
+g_test_message("access on path (%s): %s\n", path, strerror(errno));
+return NULL;
+}
+
+do {
+ret = statfs(path, &fs);
+} while (ret != 0 && errno == EINTR);
+
+if (ret != 0) {
+g_test_message("statfs on path (%s): %s\n", path, strerror(errno));
+return NULL;
+}
+
+if (fs.f_type != HUGETLBFS_MAGIC) {
+g_test_message("Warning: path not on HugeTLBFS: %s\n", path);
+return NULL;
+}
+
+return path;
+}


Why bother?  Rest of the patchset works find on any filesystem.  User
set this for some reason, let's just roll with it.


We do it like this for the vhost-user testcase. 
Just to be sure: you're suggesting to just get rid of
init_hugepagefs()? 


regards,
Jens 


+
+static int vubr_create_socket(struct sockaddr_in *si_remote, int rport)
+{
+int sock;
+
+if (inet_aton("127.0.0.1", (struct in_addr *) &si_remote->sin_addr.s_addr)
+== 0) {
+g_test_message("inet_aton failed\n");
+return -1;
+}
+sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
+if (sock == -1) {
+g_test_message("socket creation failed\n");
+return -1;
+}
+if (connect(sock, (struct sockaddr *) si_remote, sizeof(*si_remote))) {
+g_test_message("connect failed: %s", strerror(errno));
+return

Re: [Qemu-devel] [PATCH V3] rtc: fix a infinite loop in windows vmstartup

2017-07-25 Thread peng.hao2

>On 25/07/2017 06:14, peng.h...@zte.com.cn wrote:





>>> On 24/07/2017 20:35, Peng Hao wrote:
>> 
>> 
>> 
>> 
>> 
 When a windows vm starts, periodic timer of rtc will stop several times.
 windows kernel will check whether REG_A_UIP is changed. REG_C's interrupt
 flags will not be cleared when periodic timer stops and the update timer
 will switch to alarm timer. So the expiration time of alarm timer is very
 long and REG_A_UIP will not vary.At last windows kernel will repeat to 
 check REG_A_UIP all the time.
>> 
>>> This should not happen.  REG_A_UIP is set and cleared in register A
>>> every second, like this:
>>>case RTC_REG_A:
>>>if (update_in_progress(s)) {
>>>s->cmos_data[s->cmos_index] |= REG_A_UIP
>>>} else {
>>>s->cmos_data[s->cmos_index] &= ~REG_A_UIP
>>>}
>>>ret = s->cmos_data[s->cmos_index]
>>>break
>> 
>> 
>> 
>> when periodic timer stop, update timer is set to a long expire time (as 
>> alarm timer).

>I think I see the bug now:

>diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
>index 1b8d3d7d4c..6184b4378e 100644
>--- a/hw/timer/mc146818rtc.c
>+++ b/hw/timer/mc146818rtc.c
>@@ -321,9 +321,11 @@ static void check_update_timer(RTCState *s)
> s->next_alarm_time = next_update_time +
>  (next_alarm_sec - 1) * NANOSECONDS_PER_SECOND
> 
>-if (s->cmos_data[RTC_REG_C] & REG_C_UF) {
>-/* UF is set, but AF is clear.  Program the timer to target
>- * the alarm time.  */
>+if ((s->cmos_data[RTC_REG_C] & REG_C_UF) &&
>+!(s->cmos_data[RTC_REG_A] & REG_A_UIP)) {
>+/* If UIP was latched, we need to clear it at the next update.
>+ * Otherwise, if UF is set we only need to program the timer to
>+ * target the alarm time.  */
> next_update_time = s->next_alarm_time
> }
> if (next_update_time != timer_expire_time_ns(s->update_timer)) {
>but I would like to have a testcase for it in tests/rtc-test.c.

>Can you check if the above works and try writing a testcase (that fails 
>without the patch and succeeds with it)?
>Thanks,
I don't  think it can works. REG_C_UF is totally cleared by periodic timer in 
original code.

after periodic timer stoped, the REG_C_UF  is never cleared.

rtc_update_timer has cleared REG_A_UIP,I think  the patch  does nothing.

I reproduce the bug when many windows VMs reboot and it is about the method of  
windows kernel accessing rtc .

so i don't know how to write the testcase.




>Paolo

Re: [Qemu-devel] [PATCH v12 6/8] mm: support reporting free page blocks

2017-07-25 Thread Wei Wang


On 07/24/2017 05:00 PM, Michal Hocko wrote:

On Wed 19-07-17 20:01:18, Wei Wang wrote:

On 07/19/2017 04:13 PM, Michal Hocko wrote:

[...

All you should need is the check for the page reference count, no?  I
assume you do some sort of pfn walk and so you should be able to get an
access to the struct page.

Not necessarily - the guest struct page is not seen by the hypervisor. The
hypervisor only gets those guest pfns which are hinted as unused. From the
hypervisor (host) point of view, a guest physical address corresponds to a
virtual address of a host process. So, once the hypervisor knows a guest
physical page is unsued, it knows that the corresponding virtual memory of
the process doesn't need to be transferred in the 1st round.

I am sorry, but I do not understand. Why cannot _guest_ simply check the
struct page ref count and send them to the hypervisor?


Were you suggesting the following?
1) get a free page block from the page list using the API;
2) if page->ref_count == 0, send it to the hypervisor

Btw, ref_count may also change at any time.


Is there any
documentation which describes the workflow or code which would use your
new API?



It's used in the balloon driver (patch 8). We don't have any docs yet, but
I think the high level workflow is the two steps above.


Best,
Wei

Re: [Qemu-devel] [PATCH v5 2/2 (for 2.10)] docs: document deprecated features in appendix

2017-07-25 Thread Thomas Huth

On 19.07.2017 12:08, Daniel P. Berrange wrote:
> The deprecation of features in QEMU is totally adhoc currently,
> with no way for the user to get a list of what is deprecated
> in each release. This adds an appendix to the doc that records
> when each deprecation was made and provides text explaining
> what to use instead, if anything.
> 
> Since there has been no formal policy around removal of deprecated
> features in the past, any deprecations prior to 2.10.0 are to be
> treated as if they had been made at the 2.10.0 release. Thus the
> earliest that existing deprecations will be deleted is the start
> of the 2.12.0 cycle.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  qemu-doc.texi | 168 
> ++
>  1 file changed, 168 insertions(+)
[...]
> +@subsection -monitor default=on (since 2.4.0)
> +
> +The ``default'' option to the ``-monitor'' argument is
> +now ignored. When multiple monitors were enabled, it
> +indicated which monitor would receive log messages
> +from the various subsystems. This feature is no longer
> +required as messages are now only sent to the monitor
> +in response to explicitly monitor commands.

BTW, seems like it's the "-mon" option, not the "-monitor" option that
has this "default" parameter?

 Thomas

Re: [Qemu-devel] [PATCH v5 2/4] sockets: factor out create_fast_reuse_socket

2017-07-25 Thread Daniel P. Berrange

On Sat, Jul 22, 2017 at 09:49:31AM +0200, Knut Omang wrote:
> First refactoring step to prepare for fixing the problem
> exposed with the test-listen test in the previous commit
> 
> Signed-off-by: Knut Omang 
> ---
>  util/qemu-sockets.c | 24 +---
>  1 file changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
> index 1358c81..578f25b 100644
> --- a/util/qemu-sockets.c
> +++ b/util/qemu-sockets.c
> @@ -149,6 +149,20 @@ int inet_ai_family_from_address(InetSocketAddress *addr,
>  return PF_UNSPEC;
>  }
>  
> +static int create_fast_reuse_socket(struct addrinfo *e, Error **errp)
> +{
> +int slisten = qemu_socket(e->ai_family, e->ai_socktype, e->ai_protocol);
> +if (slisten < 0) {
> +if (!e->ai_next) {
> +error_setg_errno(errp, errno, "Failed to create socket");
> +}
> +return -1;
> +}
> +
> +socket_set_fast_reuse(slisten);
> +return slisten;
> +}

As mentioned in the previous review, I don't think we should have
methods like this which sometimes report an error on failure, and
sometimes don't report an error. It makes it hard to review the
callers for correctness of error handling. 

> +
>  static int inet_listen_saddr(InetSocketAddress *saddr,
>   int port_offset,
>   bool update_addr,
> @@ -210,21 +224,17 @@ static int inet_listen_saddr(InetSocketAddress *saddr,
>  return -1;
>  }
>  
> -/* create socket + bind */
> +/* create socket + bind/listen */
>  for (e = res; e != NULL; e = e->ai_next) {
>  getnameinfo((struct sockaddr*)e->ai_addr,e->ai_addrlen,
>   uaddr,INET6_ADDRSTRLEN,uport,32,
>   NI_NUMERICHOST | NI_NUMERICSERV);
> -slisten = qemu_socket(e->ai_family, e->ai_socktype, e->ai_protocol);
> +
> +slisten = create_fast_reuse_socket(e, &err);
>  if (slisten < 0) {
> -if (!e->ai_next) {
> -error_setg_errno(errp, errno, "Failed to create socket");
> -}

So please leave this here.

>  continue;
>  }
>  
> -socket_set_fast_reuse(slisten);
> -
>  port_min = inet_getport(e);
>  port_max = saddr->has_to ? saddr->to + port_offset : port_min;
>  for (p = port_min; p <= port_max; p++) {


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH v5 3/4] sockets: factor out a new try_bind() function

2017-07-25 Thread Daniel P. Berrange

On Sat, Jul 22, 2017 at 09:49:32AM +0200, Knut Omang wrote:
> Another refactoring step to prepare for the problem
> exposed by the test-listen test.
> 
> This time simplify and reorganize the IPv6 specific extra
> measures and move it out of the for loop to increase
> code readability. No semantic changes.
> 
> Signed-off-by: Knut Omang 
> ---
>  util/qemu-sockets.c | 69 ++
>  1 file changed, 39 insertions(+), 30 deletions(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[Qemu-devel] [PULL for-2.10 02/14] vfio/ccw: fix initialization of the Object DeviceState pointer in the common base-device

2017-07-25 Thread Cornelia Huck

From: Dong Jia Shi 

Commit 7da624e2 ("vfio: Test realized when using VFIOGroup.device_list
iterator") introduced a pointer to the Object DeviceState in the VFIO
common base-device and skipped non-realized devices as we iterate
VFIOGroup.device_list. While it missed to initialize the pointer for
the vfio-ccw case. Let's fix it.

Fixes: 7da624e2 ("vfio: Test realized when using VFIOGroup.device_list
  iterator")

Cc: Alex Williamson 
Reviewed-by: Halil Pasic 
Signed-off-by: Dong Jia Shi 
Reviewed-by: Alex Williamson 
Message-Id: <20170718014926.44781-3-bjsdj...@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck 
---
 hw/vfio/ccw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 8d97b53e77..a8baadf57a 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -338,6 +338,7 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp)
 vcdev->vdev.type = VFIO_DEVICE_TYPE_CCW;
 vcdev->vdev.name = g_strdup_printf("%x.%x.%04x", cdev->hostid.cssid,
cdev->hostid.ssid, cdev->hostid.devid);
+vcdev->vdev.dev = dev;
 QLIST_FOREACH(vbasedev, &group->device_list, next) {
 if (strcmp(vbasedev->name, vcdev->vdev.name) == 0) {
 error_setg(&err, "vfio: subchannel %s has already been attached",
-- 
2.13.3

[Qemu-devel] [PULL for-2.10 00/14] A set of s390x patches

2017-07-25 Thread Cornelia Huck

The following changes since commit b5a74cd81d76cb467552f38f2b39520d07c65ea2:

  Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170724' 
into staging (2017-07-24 18:15:45 +0100)

are available in the git repository at:

  git://github.com/cohuck/qemu tags/s390x-20170725

for you to fetch changes up to 7e01376daea75e888c370aab521a7d4aeaf2ffd1:

  s390x/css: fix ilen in IO instruction handlers (2017-07-25 09:17:42 +0200)


Various changes for the s390x code:
- updates for cpu model handling
- fix compilation with --disable-tcg
- fixes in vfio-ccw and I/O instruction handling



Christian Borntraeger (1):
  s390/cpumodel: remove KSS from the default model of z14

David Hildenbrand (5):
  target/s390x: drop BE_BIT()
  target/s390x: indicate query subfunction in s390_fill_feat_block
  target/s390x: introduce (test|set)_be_bit
  s390x/kvm: better comment regarding zPCI feature availability
  target/s390x: improve baselining if certain base features are missing

Dong Jia Shi (1):
  vfio/ccw: fix initialization of the Object DeviceState pointer in the
common base-device

Halil Pasic (1):
  s390x/css: fix ilen in IO instruction handlers

Jing Zhang (1):
  vfio/ccw: allocate irq info with the right size

Thomas Huth (5):
  target/s390x: Move s390_cpu_dump_state() to helper.c
  target/s390x: Move diag helpers to a separate file
  target/s390x: Rework program_interrupt() and related functions
  target/s390x: Move exception-related functions to a new excp_helper.c
file
  target/s390x: Add remaining switches to compile with --disable-tcg

 hw/vfio/ccw.c   |   3 +-
 target/s390x/Makefile.objs  |   8 +-
 target/s390x/cpu.c  |   4 +
 target/s390x/cpu.h  |  17 +-
 target/s390x/cpu_features.c |  33 ++-
 target/s390x/cpu_features.h |   9 +-
 target/s390x/cpu_models.c   |  34 +++
 target/s390x/diag.c | 179 +++
 target/s390x/excp_helper.c  | 515 ++
 target/s390x/gen-features.c |   1 -
 target/s390x/helper.c   | 528 +---
 target/s390x/interrupt.c|  39 
 target/s390x/ioinst.c   |  42 ++--
 target/s390x/kvm.c  |  45 +---
 target/s390x/misc_helper.c  | 193 
 target/s390x/translate.c|  60 -
 16 files changed, 914 insertions(+), 796 deletions(-)
 create mode 100644 target/s390x/diag.c
 create mode 100644 target/s390x/excp_helper.c

-- 
2.13.3

[Qemu-devel] [PULL for-2.10 03/14] s390/cpumodel: remove KSS from the default model of z14

2017-07-25 Thread Cornelia Huck

From: Christian Borntraeger 

The SIE_KSS feature will allow a guest to use KSS for a nested guest.
To create a nested guest the SIE_F2 facility is still necessary.
Since SIE_F2 is not part of the default model it does not make
a lot of sense to provide the SIE_KSS feature in the default model.
Let's also create a dependency check.

Signed-off-by: Christian Borntraeger 
Reviewed-by: Jason J. Herne 
Reviewed-by: Janosch Frank 
Message-Id: <1500550051-7821-2-git-send-email-borntrae...@de.ibm.com>
Acked-by: David Hildenbrand 
Signed-off-by: Cornelia Huck 
---
 target/s390x/cpu_models.c   | 1 +
 target/s390x/gen-features.c | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index f4e5bb6611..d91b7b85f6 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -723,6 +723,7 @@ static void check_consistency(const S390CPUModel *model)
 { S390_FEAT_KLMD_SHAKE_256, S390_FEAT_MSA },
 { S390_FEAT_PRNO_TRNG_QRTCR, S390_FEAT_MSA_EXT_5 },
 { S390_FEAT_PRNO_TRNG, S390_FEAT_MSA_EXT_5 },
+{ S390_FEAT_SIE_KSS, S390_FEAT_SIE_F2 },
 };
 int i;
 
diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index cf69157610..c8dc104bc1 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -535,7 +535,6 @@ static uint16_t default_GEN14_GA1[] = {
 S390_FEAT_GROUP_MSA_EXT_6,
 S390_FEAT_GROUP_MSA_EXT_7,
 S390_FEAT_GROUP_MSA_EXT_8,
-S390_FEAT_SIE_KSS,
 };
 
 /** END FEATURE DEFS **/
-- 
2.13.3

[Qemu-devel] [PULL for-2.10 01/14] vfio/ccw: allocate irq info with the right size

2017-07-25 Thread Cornelia Huck

From: Jing Zhang 

When allocating memory for the vfio_irq_info parameter of the
VFIO_DEVICE_GET_IRQ_INFO ioctl, we used the wrong size. Let's
fix it by using the right size.

Reviewed-by: Dong Jia Shi 
Signed-off-by: Jing Zhang 
Signed-off-by: Dong Jia Shi 
Message-Id: <20170718014926.44781-2-bjsdj...@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck 
---
 hw/vfio/ccw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 12d0262336..8d97b53e77 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -168,7 +168,7 @@ static void vfio_ccw_register_io_notifier(VFIOCCWDevice 
*vcdev, Error **errp)
 return;
 }
 
-argsz = sizeof(*irq_set);
+argsz = sizeof(*irq_info);
 irq_info = g_malloc0(argsz);
 irq_info->index = VFIO_CCW_IO_IRQ_INDEX;
 irq_info->argsz = argsz;
-- 
2.13.3

[Qemu-devel] [PULL for-2.10 05/14] target/s390x: indicate query subfunction in s390_fill_feat_block

2017-07-25 Thread Cornelia Huck

From: David Hildenbrand 

We'll have to do the same for TCG, so let's just move it in there.

Reviewed-by: Christian Borntraeger 
Signed-off-by: David Hildenbrand 
Message-Id: <20170720123721.12366-3-da...@redhat.com>
Signed-off-by: Cornelia Huck 
---
 target/s390x/cpu_features.c | 31 ++-
 target/s390x/kvm.c  | 13 -
 2 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/target/s390x/cpu_features.c b/target/s390x/cpu_features.c
index fa887d9b6f..8b1491734f 100644
--- a/target/s390x/cpu_features.c
+++ b/target/s390x/cpu_features.c
@@ -336,11 +336,32 @@ void s390_fill_feat_block(const S390FeatBitmap features, 
S390FeatType type,
 S390Feat feat;
 int bit_nr;
 
-if (type == S390_FEAT_TYPE_STFL && test_bit(S390_FEAT_ZARCH, features)) {
-/* Features that are always active */
-data[0] |= 0x20;  /* z/Architecture */
-data[17] |= 0x20; /* Configuration-z-architectural-mode */
-}
+switch (type) {
+case S390_FEAT_TYPE_STFL:
+if (test_bit(S390_FEAT_ZARCH, features)) {
+/* Features that are always active */
+data[0] |= 0x20;  /* z/Architecture */
+data[17] |= 0x20; /* Configuration-z-architectural-mode */
+}
+break;
+case S390_FEAT_TYPE_PTFF:
+case S390_FEAT_TYPE_KMAC:
+case S390_FEAT_TYPE_KMC:
+case S390_FEAT_TYPE_KM:
+case S390_FEAT_TYPE_KIMD:
+case S390_FEAT_TYPE_KLMD:
+case S390_FEAT_TYPE_PCKMO:
+case S390_FEAT_TYPE_KMCTR:
+case S390_FEAT_TYPE_KMF:
+case S390_FEAT_TYPE_KMO:
+case S390_FEAT_TYPE_PCC:
+case S390_FEAT_TYPE_PPNO:
+case S390_FEAT_TYPE_KMA:
+data[0] |= 0x80; /* query is always available */
+break;
+default:
+break;
+};
 
 feat = find_first_bit(features, S390_FEAT_MAX);
 while (feat < S390_FEAT_MAX) {
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 831492f9a2..999ea570c3 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -2506,41 +2506,28 @@ static int configure_cpu_subfunc(const S390FeatBitmap 
features)
 s390_fill_feat_block(features, S390_FEAT_TYPE_PLO, prop.plo);
 if (test_bit(S390_FEAT_TOD_CLOCK_STEERING, features)) {
 s390_fill_feat_block(features, S390_FEAT_TYPE_PTFF, prop.ptff);
-prop.ptff[0] |= 0x80; /* query is always available */
 }
 if (test_bit(S390_FEAT_MSA, features)) {
 s390_fill_feat_block(features, S390_FEAT_TYPE_KMAC, prop.kmac);
-prop.kmac[0] |= 0x80; /* query is always available */
 s390_fill_feat_block(features, S390_FEAT_TYPE_KMC, prop.kmc);
-prop.kmc[0] |= 0x80; /* query is always available */
 s390_fill_feat_block(features, S390_FEAT_TYPE_KM, prop.km);
-prop.km[0] |= 0x80; /* query is always available */
 s390_fill_feat_block(features, S390_FEAT_TYPE_KIMD, prop.kimd);
-prop.kimd[0] |= 0x80; /* query is always available */
 s390_fill_feat_block(features, S390_FEAT_TYPE_KLMD, prop.klmd);
-prop.klmd[0] |= 0x80; /* query is always available */
 }
 if (test_bit(S390_FEAT_MSA_EXT_3, features)) {
 s390_fill_feat_block(features, S390_FEAT_TYPE_PCKMO, prop.pckmo);
-prop.pckmo[0] |= 0x80; /* query is always available */
 }
 if (test_bit(S390_FEAT_MSA_EXT_4, features)) {
 s390_fill_feat_block(features, S390_FEAT_TYPE_KMCTR, prop.kmctr);
-prop.kmctr[0] |= 0x80; /* query is always available */
 s390_fill_feat_block(features, S390_FEAT_TYPE_KMF, prop.kmf);
-prop.kmf[0] |= 0x80; /* query is always available */
 s390_fill_feat_block(features, S390_FEAT_TYPE_KMO, prop.kmo);
-prop.kmo[0] |= 0x80; /* query is always available */
 s390_fill_feat_block(features, S390_FEAT_TYPE_PCC, prop.pcc);
-prop.pcc[0] |= 0x80; /* query is always available */
 }
 if (test_bit(S390_FEAT_MSA_EXT_5, features)) {
 s390_fill_feat_block(features, S390_FEAT_TYPE_PPNO, prop.ppno);
-prop.ppno[0] |= 0x80; /* query is always available */
 }
 if (test_bit(S390_FEAT_MSA_EXT_8, features)) {
 s390_fill_feat_block(features, S390_FEAT_TYPE_KMA, prop.kma);
-prop.kma[0] |= 0x80; /* query is always available */
 }
 return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
 }
-- 
2.13.3

[Qemu-devel] [PULL for-2.10 06/14] target/s390x: introduce (test|set)_be_bit

2017-07-25 Thread Cornelia Huck

From: David Hildenbrand 

Using ordinary bitmap operations to set/test bits does not work properly
on architectures !s390x. Let's drop (test|set)_bit_inv and introduce
(test|set)_be_bit instead. These functions work on uint8_t array, not on
unsigned longs arrays and are for now only used in the context of
CPU features.

Signed-off-by: David Hildenbrand 
Message-Id: <20170720123721.12366-4-da...@redhat.com>
Signed-off-by: Cornelia Huck 
---
 target/s390x/cpu_features.c |  8 
 target/s390x/cpu_features.h |  8 
 target/s390x/kvm.c  | 14 ++
 3 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/target/s390x/cpu_features.c b/target/s390x/cpu_features.c
index 8b1491734f..1d3a036393 100644
--- a/target/s390x/cpu_features.c
+++ b/target/s390x/cpu_features.c
@@ -340,8 +340,8 @@ void s390_fill_feat_block(const S390FeatBitmap features, 
S390FeatType type,
 case S390_FEAT_TYPE_STFL:
 if (test_bit(S390_FEAT_ZARCH, features)) {
 /* Features that are always active */
-data[0] |= 0x20;  /* z/Architecture */
-data[17] |= 0x20; /* Configuration-z-architectural-mode */
+set_be_bit(2, data);   /* z/Architecture */
+set_be_bit(138, data); /* Configuration-z-architectural-mode */
 }
 break;
 case S390_FEAT_TYPE_PTFF:
@@ -357,7 +357,7 @@ void s390_fill_feat_block(const S390FeatBitmap features, 
S390FeatType type,
 case S390_FEAT_TYPE_PCC:
 case S390_FEAT_TYPE_PPNO:
 case S390_FEAT_TYPE_KMA:
-data[0] |= 0x80; /* query is always available */
+set_be_bit(0, data); /* query is always available */
 break;
 default:
 break;
@@ -368,7 +368,7 @@ void s390_fill_feat_block(const S390FeatBitmap features, 
S390FeatType type,
 if (s390_features[feat].type == type) {
 bit_nr = s390_features[feat].bit;
 /* big endian on uint8_t array */
-data[bit_nr / 8] |= 0x80 >> (bit_nr % 8);
+set_be_bit(bit_nr, data);
 }
 feat = find_next_bit(features, S390_FEAT_MAX, feat + 1);
 }
diff --git a/target/s390x/cpu_features.h b/target/s390x/cpu_features.h
index 770435e0cc..e306aa7ab2 100644
--- a/target/s390x/cpu_features.h
+++ b/target/s390x/cpu_features.h
@@ -93,4 +93,12 @@ const S390FeatGroupDef *s390_feat_group_def(S390FeatGroup 
group);
 
 #define BE_BIT_NR(BIT) (BIT ^ (BITS_PER_LONG - 1))
 
+static inline void set_be_bit(unsigned int bit_nr, uint8_t *array)
+{
+array[bit_nr / 8] |= 0x80 >> (bit_nr % 8);
+}
+static inline bool test_be_bit(unsigned int bit_nr, const uint8_t *array)
+{
+return array[bit_nr / 8] & (0x80 >> (bit_nr % 8));
+}
 #endif /* TARGET_S390X_CPU_FEATURES_H */
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 999ea570c3..9bec48152f 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -2430,16 +2430,6 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 abort();
 }
 
-static inline int test_bit_inv(long nr, const unsigned long *addr)
-{
-return test_bit(BE_BIT_NR(nr), addr);
-}
-
-static inline void set_bit_inv(long nr, unsigned long *addr)
-{
-set_bit(BE_BIT_NR(nr), addr);
-}
-
 static int query_cpu_subfunc(S390FeatBitmap features)
 {
 struct kvm_s390_vm_cpu_subfunc prop;
@@ -2566,7 +2556,7 @@ static int query_cpu_feat(S390FeatBitmap features)
 }
 
 for (i = 0; i < ARRAY_SIZE(kvm_to_feat); i++) {
-if (test_bit_inv(kvm_to_feat[i][0], (unsigned long *)prop.feat)) {
+if (test_be_bit(kvm_to_feat[i][0], (uint8_t *) prop.feat)) {
 set_bit(kvm_to_feat[i][1], features);
 }
 }
@@ -2585,7 +2575,7 @@ static int configure_cpu_feat(const S390FeatBitmap 
features)
 
 for (i = 0; i < ARRAY_SIZE(kvm_to_feat); i++) {
 if (test_bit(kvm_to_feat[i][1], features)) {
-set_bit_inv(kvm_to_feat[i][0], (unsigned long *)prop.feat);
+set_be_bit(kvm_to_feat[i][0], (uint8_t *) prop.feat);
 }
 }
 return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
-- 
2.13.3

[Qemu-devel] [PULL for-2.10 04/14] target/s390x: drop BE_BIT()

2017-07-25 Thread Cornelia Huck

From: David Hildenbrand 

Unused and broken, let's just get rid of it.

Acked-by: Christian Borntraeger 
Signed-off-by: David Hildenbrand 
Message-Id: <20170720123721.12366-2-da...@redhat.com>
Signed-off-by: Cornelia Huck 
---
 target/s390x/cpu_features.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/s390x/cpu_features.h b/target/s390x/cpu_features.h
index 14bc311dbe..770435e0cc 100644
--- a/target/s390x/cpu_features.h
+++ b/target/s390x/cpu_features.h
@@ -92,6 +92,5 @@ typedef struct {
 const S390FeatGroupDef *s390_feat_group_def(S390FeatGroup group);
 
 #define BE_BIT_NR(BIT) (BIT ^ (BITS_PER_LONG - 1))
-#define BE_BIT(BIT) (1ULL < BE_BIT_NR(BIT))
 
 #endif /* TARGET_S390X_CPU_FEATURES_H */
-- 
2.13.3

[Qemu-devel] [PULL for-2.10 08/14] target/s390x: improve baselining if certain base features are missing

2017-07-25 Thread Cornelia Huck

From: David Hildenbrand 

There are certain features that we put into base models, but that are
not relevant for the actual search. The most famous example are
MSA subfunctions that might be disabled on certain real hardware out
there.

While the kvm host model detection will usually detect the correct model
on such machines (as it will in the common case not pass features to check
for into s390_find_cpu_def()), baselining will fall back to a quite old
model just because some MSA subfunctions are missing.

Let's improve that by ignoring lack of these features while performing
the search for a base model.

Signed-off-by: David Hildenbrand 
Message-Id: <20170720123721.12366-6-da...@redhat.com>
Signed-off-by: Cornelia Huck 
---
 target/s390x/cpu_models.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index d91b7b85f6..fa1338fc72 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -78,6 +78,9 @@ static S390CPUDef s390_cpu_defs[] = {
 CPUDEF_INIT(0x3906, 14, 1, 47, 0x0800U, "z14", "IBM z14 GA1"),
 };
 
+/* features part of a base model but not relevant for finding a base model */
+S390FeatBitmap ignored_base_feat;
+
 void s390_cpudef_featoff(uint8_t gen, uint8_t ec_ga, S390Feat feat)
 {
 const S390CPUDef *def;
@@ -237,6 +240,11 @@ const S390CPUDef *s390_find_cpu_def(uint16_t type, uint8_t 
gen, uint8_t ec_ga,
 if (features) {
 /* see if the model satisfies the minimum features */
 bitmap_andnot(missing, def->base_feat, features, S390_FEAT_MAX);
+/*
+ * Ignore certain features that are in the base model, but not
+ * relevant for the search (esp. MSA subfunctions).
+ */
+bitmap_andnot(missing, missing, ignored_base_feat, S390_FEAT_MAX);
 if (!bitmap_empty(missing, S390_FEAT_MAX)) {
 break;
 }
@@ -1210,10 +1218,35 @@ static const TypeInfo host_s390_cpu_type_info = {
 };
 #endif
 
+static void init_ignored_base_feat(void)
+{
+static const int feats[] = {
+ /* MSA subfunctions that could not be available on certain machines */
+ S390_FEAT_KMAC_DEA,
+ S390_FEAT_KMAC_TDEA_128,
+ S390_FEAT_KMAC_TDEA_192,
+ S390_FEAT_KMC_DEA,
+ S390_FEAT_KMC_TDEA_128,
+ S390_FEAT_KMC_TDEA_192,
+ S390_FEAT_KM_DEA,
+ S390_FEAT_KM_TDEA_128,
+ S390_FEAT_KM_TDEA_192,
+ S390_FEAT_KIMD_SHA_1,
+ S390_FEAT_KLMD_SHA_1,
+};
+int i;
+
+for (i = 0; i < ARRAY_SIZE(feats); i++) {
+set_bit(feats[i], ignored_base_feat);
+}
+}
+
 static void register_types(void)
 {
 int i;
 
+init_ignored_base_feat();
+
 /* init all bitmaps from gnerated data initially */
 for (i = 0; i < ARRAY_SIZE(s390_cpu_defs); i++) {
 s390_init_feat_bitmap(s390_cpu_defs[i].base_init,
-- 
2.13.3

[Qemu-devel] [PULL for-2.10 09/14] target/s390x: Move s390_cpu_dump_state() to helper.c

2017-07-25 Thread Cornelia Huck

From: Thomas Huth 

translate.c can not be compiled with --disable-tcg, but we need
the s390_cpu_dump_state() in KVM-only builds, too. So let's move
that function to helper.c instead, which will also be compiled
when --disable-tcg has been specified.

Reviewed-by: Richard Henderson 
Signed-off-by: Thomas Huth 
Message-Id: <1500886370-14572-2-git-send-email-th...@redhat.com>
Signed-off-by: Cornelia Huck 
---
 target/s390x/helper.c| 60 
 target/s390x/translate.c | 60 
 2 files changed, 60 insertions(+), 60 deletions(-)

diff --git a/target/s390x/helper.c b/target/s390x/helper.c
index aef09e1234..30ac2a77b3 100644
--- a/target/s390x/helper.c
+++ b/target/s390x/helper.c
@@ -766,3 +766,63 @@ void s390x_cpu_do_unaligned_access(CPUState *cs, vaddr 
addr,
 program_interrupt(env, PGM_SPECIFICATION, ILEN_AUTO);
 }
 #endif /* CONFIG_USER_ONLY */
+
+void s390_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
+ int flags)
+{
+S390CPU *cpu = S390_CPU(cs);
+CPUS390XState *env = &cpu->env;
+int i;
+
+if (env->cc_op > 3) {
+cpu_fprintf(f, "PSW=mask %016" PRIx64 " addr %016" PRIx64 " cc %15s\n",
+env->psw.mask, env->psw.addr, cc_name(env->cc_op));
+} else {
+cpu_fprintf(f, "PSW=mask %016" PRIx64 " addr %016" PRIx64 " cc %02x\n",
+env->psw.mask, env->psw.addr, env->cc_op);
+}
+
+for (i = 0; i < 16; i++) {
+cpu_fprintf(f, "R%02d=%016" PRIx64, i, env->regs[i]);
+if ((i % 4) == 3) {
+cpu_fprintf(f, "\n");
+} else {
+cpu_fprintf(f, " ");
+}
+}
+
+for (i = 0; i < 16; i++) {
+cpu_fprintf(f, "F%02d=%016" PRIx64, i, get_freg(env, i)->ll);
+if ((i % 4) == 3) {
+cpu_fprintf(f, "\n");
+} else {
+cpu_fprintf(f, " ");
+}
+}
+
+for (i = 0; i < 32; i++) {
+cpu_fprintf(f, "V%02d=%016" PRIx64 "%016" PRIx64, i,
+env->vregs[i][0].ll, env->vregs[i][1].ll);
+cpu_fprintf(f, (i % 2) ? "\n" : " ");
+}
+
+#ifndef CONFIG_USER_ONLY
+for (i = 0; i < 16; i++) {
+cpu_fprintf(f, "C%02d=%016" PRIx64, i, env->cregs[i]);
+if ((i % 4) == 3) {
+cpu_fprintf(f, "\n");
+} else {
+cpu_fprintf(f, " ");
+}
+}
+#endif
+
+#ifdef DEBUG_INLINE_BRANCHES
+for (i = 0; i < CC_OP_MAX; i++) {
+cpu_fprintf(f, "  %15s = %10ld\t%10ld\n", cc_name(i),
+inline_branch_miss[i], inline_branch_hit[i]);
+}
+#endif
+
+cpu_fprintf(f, "\n");
+}
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 48b71f9604..cd96a8dee2 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -93,66 +93,6 @@ static uint64_t pc_to_link_info(DisasContext *s, uint64_t pc)
 return pc;
 }
 
-void s390_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
- int flags)
-{
-S390CPU *cpu = S390_CPU(cs);
-CPUS390XState *env = &cpu->env;
-int i;
-
-if (env->cc_op > 3) {
-cpu_fprintf(f, "PSW=mask %016" PRIx64 " addr %016" PRIx64 " cc %15s\n",
-env->psw.mask, env->psw.addr, cc_name(env->cc_op));
-} else {
-cpu_fprintf(f, "PSW=mask %016" PRIx64 " addr %016" PRIx64 " cc %02x\n",
-env->psw.mask, env->psw.addr, env->cc_op);
-}
-
-for (i = 0; i < 16; i++) {
-cpu_fprintf(f, "R%02d=%016" PRIx64, i, env->regs[i]);
-if ((i % 4) == 3) {
-cpu_fprintf(f, "\n");
-} else {
-cpu_fprintf(f, " ");
-}
-}
-
-for (i = 0; i < 16; i++) {
-cpu_fprintf(f, "F%02d=%016" PRIx64, i, get_freg(env, i)->ll);
-if ((i % 4) == 3) {
-cpu_fprintf(f, "\n");
-} else {
-cpu_fprintf(f, " ");
-}
-}
-
-for (i = 0; i < 32; i++) {
-cpu_fprintf(f, "V%02d=%016" PRIx64 "%016" PRIx64, i,
-env->vregs[i][0].ll, env->vregs[i][1].ll);
-cpu_fprintf(f, (i % 2) ? "\n" : " ");
-}
-
-#ifndef CONFIG_USER_ONLY
-for (i = 0; i < 16; i++) {
-cpu_fprintf(f, "C%02d=%016" PRIx64, i, env->cregs[i]);
-if ((i % 4) == 3) {
-cpu_fprintf(f, "\n");
-} else {
-cpu_fprintf(f, " ");
-}
-}
-#endif
-
-#ifdef DEBUG_INLINE_BRANCHES
-for (i = 0; i < CC_OP_MAX; i++) {
-cpu_fprintf(f, "  %15s = %10ld\t%10ld\n", cc_name(i),
-inline_branch_miss[i], inline_branch_hit[i]);
-}
-#endif
-
-cpu_fprintf(f, "\n");
-}
-
 static TCGv_i64 psw_addr;
 static TCGv_i64 psw_mask;
 static TCGv_i64 gbea;
-- 
2.13.3

[Qemu-devel] [PULL for-2.10 07/14] s390x/kvm: better comment regarding zPCI feature availability

2017-07-25 Thread Cornelia Huck

From: David Hildenbrand 

Acked-by: Christian Borntraeger 
Signed-off-by: David Hildenbrand 
Message-Id: <20170720123721.12366-5-da...@redhat.com>
Signed-off-by: Cornelia Huck 
---
 target/s390x/kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 9bec48152f..84416c0a64 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -2661,7 +2661,7 @@ void kvm_s390_get_host_cpu_model(S390CPUModel *model, 
Error **errp)
 clear_bit(S390_FEAT_CMM_NT, model->features);
 }
 
-/* set zpci and aen facilities */
+/* We emulate a zPCI bus and AEN, therefore we don't need HW support */
 set_bit(S390_FEAT_ZPCI, model->features);
 set_bit(S390_FEAT_ADAPTER_EVENT_NOTIFICATION, model->features);
 
-- 
2.13.3

[Qemu-devel] [PULL for-2.10 11/14] target/s390x: Rework program_interrupt() and related functions

2017-07-25 Thread Cornelia Huck

From: Thomas Huth 

misc_helper.c won't be compiled with --disable-tcg anymore, but we
still need the program_interrupt() function in that case. Move it
to interrupt.c instead, and refactor it to re-use the code from
trigger_pgm_exception() (for TCG) and enter_pgmcheck() (for KVM,
which now got renamed to kvm_s390_program_interrupt() for
clarity).

Signed-off-by: Thomas Huth 
Message-Id: <1500886370-14572-4-git-send-email-th...@redhat.com>
Reviewed-by: Richard Henderson 
Signed-off-by: Cornelia Huck 
---
 target/s390x/cpu.h |  9 +
 target/s390x/helper.c  | 10 --
 target/s390x/interrupt.c   | 39 +++
 target/s390x/kvm.c | 16 
 target/s390x/misc_helper.c | 26 --
 5 files changed, 52 insertions(+), 48 deletions(-)

diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index 7732d01784..4041bfc4bc 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -474,10 +474,6 @@ static inline bool get_per_in_range(CPUS390XState *env, 
uint64_t addr)
 }
 }
 
-#ifndef CONFIG_USER_ONLY
-void trigger_pgm_exception(CPUS390XState *env, uint32_t code, uint32_t ilen);
-#endif
-
 S390CPU *cpu_s390x_init(const char *cpu_model);
 S390CPU *s390x_new_cpu(const char *cpu_model, int64_t id, Error **errp);
 S390CPU *cpu_s390x_create(const char *cpu_model, Error **errp);
@@ -1146,10 +1142,12 @@ void handle_diag_308(CPUS390XState *env, uint64_t r1, 
uint64_t r3);
 /* automatically detect the instruction length */
 #define ILEN_AUTO   0xff
 void program_interrupt(CPUS390XState *env, uint32_t code, int ilen);
+void trigger_pgm_exception(CPUS390XState *env, uint32_t code, uint32_t ilen);
 void QEMU_NORETURN runtime_exception(CPUS390XState *env, int excp,
  uintptr_t retaddr);
 
 #ifdef CONFIG_KVM
+void kvm_s390_program_interrupt(S390CPU *cpu, uint16_t code);
 void kvm_s390_io_interrupt(uint16_t subchannel_id,
uint16_t subchannel_nr, uint32_t io_int_parm,
uint32_t io_int_word);
@@ -1170,6 +1168,9 @@ int kvm_s390_get_ri(void);
 int kvm_s390_get_gs(void);
 void kvm_s390_crypto_reset(void);
 #else
+static inline void kvm_s390_program_interrupt(S390CPU *cpu, uint16_t code)
+{
+}
 static inline void kvm_s390_io_interrupt(uint16_t subchannel_id,
 uint16_t subchannel_nr,
 uint32_t io_int_parm,
diff --git a/target/s390x/helper.c b/target/s390x/helper.c
index 30ac2a77b3..e2040d65a6 100644
--- a/target/s390x/helper.c
+++ b/target/s390x/helper.c
@@ -165,16 +165,6 @@ int s390_cpu_handle_mmu_fault(CPUState *cs, vaddr address,
 
 #else /* !CONFIG_USER_ONLY */
 
-/* Ensure to exit the TB after this call! */
-void trigger_pgm_exception(CPUS390XState *env, uint32_t code, uint32_t ilen)
-{
-CPUState *cs = CPU(s390_env_get_cpu(env));
-
-cs->exception_index = EXCP_PGM;
-env->int_pgm_code = code;
-env->int_pgm_ilen = ilen;
-}
-
 int s390_cpu_handle_mmu_fault(CPUState *cs, vaddr orig_vaddr,
   int rw, int mmu_idx)
 {
diff --git a/target/s390x/interrupt.c b/target/s390x/interrupt.c
index 9edef96795..fde3da793a 100644
--- a/target/s390x/interrupt.c
+++ b/target/s390x/interrupt.c
@@ -8,10 +8,49 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/log.h"
 #include "cpu.h"
+#include "exec/exec-all.h"
 #include "sysemu/kvm.h"
 #include "hw/s390x/ioinst.h"
 
+/* Ensure to exit the TB after this call! */
+void trigger_pgm_exception(CPUS390XState *env, uint32_t code, uint32_t ilen)
+{
+CPUState *cs = CPU(s390_env_get_cpu(env));
+
+cs->exception_index = EXCP_PGM;
+env->int_pgm_code = code;
+env->int_pgm_ilen = ilen;
+}
+
+static void tcg_s390_program_interrupt(CPUS390XState *env, uint32_t code,
+   int ilen)
+{
+#ifdef CONFIG_TCG
+trigger_pgm_exception(env, code, ilen);
+cpu_loop_exit(CPU(s390_env_get_cpu(env)));
+#else
+g_assert_not_reached();
+#endif
+}
+
+void program_interrupt(CPUS390XState *env, uint32_t code, int ilen)
+{
+S390CPU *cpu = s390_env_get_cpu(env);
+
+qemu_log_mask(CPU_LOG_INT, "program interrupt at %#" PRIx64 "\n",
+  env->psw.addr);
+
+if (kvm_enabled()) {
+kvm_s390_program_interrupt(cpu, code);
+} else if (tcg_enabled()) {
+tcg_s390_program_interrupt(env, code, ilen);
+} else {
+g_assert_not_reached();
+}
+}
+
 #if !defined(CONFIG_USER_ONLY)
 void cpu_inject_ext(S390CPU *cpu, uint32_t code, uint32_t param,
 uint64_t param64)
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 84416c0a64..c4c5791d27 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -1032,7 +1032,7 @@ void kvm_s390_service_interrupt(uint32_t parm)
 kvm_s390_floating_interrupt(&irq);
 }
 
-static void enter_pgmcheck(S390CPU *cpu, uint16_t code)
+void kvm_s390_program_interrupt(S39

[Qemu-devel] [PULL for-2.10 10/14] target/s390x: Move diag helpers to a separate file

2017-07-25 Thread Cornelia Huck

From: Thomas Huth 

misc_helper.c won't be compiled with --disable-tcg anymore, but we
still need the diag helpers in KVM builds, too, so move the helper
functions to a separate file.

Reviewed-by: Richard Henderson 
Signed-off-by: Thomas Huth 
Message-Id: <1500886370-14572-3-git-send-email-th...@redhat.com>
Signed-off-by: Cornelia Huck 
---
 target/s390x/Makefile.objs |   2 +-
 target/s390x/diag.c| 179 +
 target/s390x/misc_helper.c | 167 --
 3 files changed, 180 insertions(+), 168 deletions(-)
 create mode 100644 target/s390x/diag.c

diff --git a/target/s390x/Makefile.objs b/target/s390x/Makefile.objs
index 46f6a8c6b6..80028521ef 100644
--- a/target/s390x/Makefile.objs
+++ b/target/s390x/Makefile.objs
@@ -1,7 +1,7 @@
 obj-y += translate.o helper.o cpu.o interrupt.o
 obj-y += int_helper.o fpu_helper.o cc_helper.o mem_helper.o misc_helper.o
 obj-y += gdbstub.o cpu_models.o cpu_features.o
-obj-$(CONFIG_SOFTMMU) += machine.o ioinst.o arch_dump.o mmu_helper.o
+obj-$(CONFIG_SOFTMMU) += machine.o ioinst.o arch_dump.o mmu_helper.o diag.o
 obj-$(CONFIG_KVM) += kvm.o
 
 # build and run feature list generator
diff --git a/target/s390x/diag.c b/target/s390x/diag.c
new file mode 100644
index 00..10ac845bcd
--- /dev/null
+++ b/target/s390x/diag.c
@@ -0,0 +1,179 @@
+/*
+ * S390x DIAG instruction helper functions
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "exec/address-spaces.h"
+#include "exec/exec-all.h"
+#include "hw/watchdog/wdt_diag288.h"
+#include "sysemu/cpus.h"
+#include "hw/s390x/ipl.h"
+
+static int modified_clear_reset(S390CPU *cpu)
+{
+S390CPUClass *scc = S390_CPU_GET_CLASS(cpu);
+CPUState *t;
+
+pause_all_vcpus();
+cpu_synchronize_all_states();
+CPU_FOREACH(t) {
+run_on_cpu(t, s390_do_cpu_full_reset, RUN_ON_CPU_NULL);
+}
+s390_cmma_reset();
+subsystem_reset();
+s390_crypto_reset();
+scc->load_normal(CPU(cpu));
+cpu_synchronize_all_post_reset();
+resume_all_vcpus();
+return 0;
+}
+
+static int load_normal_reset(S390CPU *cpu)
+{
+S390CPUClass *scc = S390_CPU_GET_CLASS(cpu);
+CPUState *t;
+
+pause_all_vcpus();
+cpu_synchronize_all_states();
+CPU_FOREACH(t) {
+run_on_cpu(t, s390_do_cpu_reset, RUN_ON_CPU_NULL);
+}
+s390_cmma_reset();
+subsystem_reset();
+scc->initial_cpu_reset(CPU(cpu));
+scc->load_normal(CPU(cpu));
+cpu_synchronize_all_post_reset();
+resume_all_vcpus();
+return 0;
+}
+
+int handle_diag_288(CPUS390XState *env, uint64_t r1, uint64_t r3)
+{
+uint64_t func = env->regs[r1];
+uint64_t timeout = env->regs[r1 + 1];
+uint64_t action = env->regs[r3];
+Object *obj;
+DIAG288State *diag288;
+DIAG288Class *diag288_class;
+
+if (r1 % 2 || action != 0) {
+return -1;
+}
+
+/* Timeout must be more than 15 seconds except for timer deletion */
+if (func != WDT_DIAG288_CANCEL && timeout < 15) {
+return -1;
+}
+
+obj = object_resolve_path_type("", TYPE_WDT_DIAG288, NULL);
+if (!obj) {
+return -1;
+}
+
+diag288 = DIAG288(obj);
+diag288_class = DIAG288_GET_CLASS(diag288);
+return diag288_class->handle_timer(diag288, func, timeout);
+}
+
+#define DIAG_308_RC_OK  0x0001
+#define DIAG_308_RC_NO_CONF 0x0102
+#define DIAG_308_RC_INVALID 0x0402
+
+void handle_diag_308(CPUS390XState *env, uint64_t r1, uint64_t r3)
+{
+uint64_t addr =  env->regs[r1];
+uint64_t subcode = env->regs[r3];
+IplParameterBlock *iplb;
+
+if (env->psw.mask & PSW_MASK_PSTATE) {
+program_interrupt(env, PGM_PRIVILEGED, ILEN_AUTO);
+return;
+}
+
+if ((subcode & ~0x0ULL) || (subcode > 6)) {
+program_interrupt(env, PGM_SPECIFICATION, ILEN_AUTO);
+return;
+}
+
+switch (subcode) {
+case 0:
+modified_clear_reset(s390_env_get_cpu(env));
+if (tcg_enabled()) {
+cpu_loop_exit(CPU(s390_env_get_cpu(env)));
+}
+break;
+case 1:
+load_normal_reset(s390_env_get_cpu(env));
+if (tcg_enabled()) {
+cpu_loop_exit(CPU(s390_env_get_cpu(env)));
+}
+break;
+case 3:
+s390_reipl_request();
+if (tcg_enabled()) {
+cpu_loop_exit(CPU(s390_env_get_cpu(env)));
+}
+break;
+case 5:
+if ((r1 & 1) || (addr & 0x0fffULL

Re: [Qemu-devel] [PATCH] Makefile: add all-user/all-linux-user/all-softmmu meta-targets

2017-07-25 Thread Paolo Bonzini

On 25/07/2017 08:11, Philippe Mathieu-Daudé wrote:
> Useful to build a whole set at once.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  Makefile | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index 5f18243d05..da899522e4 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -814,6 +814,11 @@ endif
>  # Dependencies in Makefile.objs files come from our recursive subdir rules
>  -include $(wildcard *.d tests/*.d)
>  
> +.PHONY: subdir-all-user subdir-all-linux-user subdir-all-softmmu
> +subdir-all-user: $(patsubst %,subdir-%,$(filter %-user,$(TARGET_DIRS)))
> +subdir-all-linux-user: $(patsubst %,subdir-%,$(filter 
> %-linux-user,$(TARGET_DIRS)))
> +subdir-all-softmmu: $(patsubst %,subdir-%,$(filter %-softmmu,$(TARGET_DIRS)))

I'm not sure why you need subdir-all-linux-user.  I think it's okay to
add these, but please add "romsubdir-all" too.  This way "all" can use
"subdir-all-user subdir-all-softmmu romsubdir-all" and you can remove
"recurse-all".

Paolo

>  include $(SRC_PATH)/tests/docker/Makefile.include
>  
>  .PHONY: help
> @@ -827,7 +832,7 @@ help:
>   @echo  ''
>   @$(if $(TARGET_DIRS), \
>   echo 'Architecture specific targets:'; \
> - $(foreach t, $(TARGET_DIRS), \
> + $(foreach t, $(TARGET_DIRS) all-user all-linux-user 
> all-softmmu, \
>   printf "  %-30s - Build for %s\\n" $(patsubst %,subdir-%,$(t)) 
> $(t);) \
>   echo '')
>   @echo  'Cleaning targets:'
>

[Qemu-devel] [PULL for-2.10 14/14] s390x/css: fix ilen in IO instruction handlers

2017-07-25 Thread Cornelia Huck

From: Halil Pasic 

When initiating a program check interruption by calling program_interrupt
the instruction length (ilen) of the current instruction is supplied as
the third parameter.

On s390x all the IO instructions are of instruction format S and their
ilen is 4.  The calls to program_interrupt (introduced by commits
7b18aad543 ("s390: Add channel I/O instructions.", 2013-01-24) and
61bf0dcb2e ("s390x/ioinst: Add missing alignment checks for IO
instructions", 2013-06-21)) however use ilen == 2.

This is probably due to a confusion between ilen which specifies the
instruction length in bytes and ILC which does the same but in halfwords.
If kvm_enabled() this does not actually matter, because the ilen
parameter of program_interrupt is effectively unused.

Let's provide the correct ilen to program_interrupt.

Signed-off-by: Halil Pasic 
Fixes:  7b18aad543 ("s390: Add channel I/O instructions.")
Fixes: 61bf0dcb2e ("s390x/ioinst: Add missing alignment checks for IO 
instructions")
Reviewed-by: David Hildenbrand 
Message-Id: <20170724143452.55534-1-pa...@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi 
Signed-off-by: Cornelia Huck 
---
 target/s390x/ioinst.c | 42 +-
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/target/s390x/ioinst.c b/target/s390x/ioinst.c
index d5e6b8066b..51fbea620d 100644
--- a/target/s390x/ioinst.c
+++ b/target/s390x/ioinst.c
@@ -45,7 +45,7 @@ void ioinst_handle_xsch(S390CPU *cpu, uint64_t reg1)
 int cc;
 
 if (ioinst_disassemble_sch_ident(reg1, &m, &cssid, &ssid, &schid)) {
-program_interrupt(&cpu->env, PGM_OPERAND, 2);
+program_interrupt(&cpu->env, PGM_OPERAND, 4);
 return;
 }
 trace_ioinst_sch_id("xsch", cssid, ssid, schid);
@@ -78,7 +78,7 @@ void ioinst_handle_csch(S390CPU *cpu, uint64_t reg1)
 int cc;
 
 if (ioinst_disassemble_sch_ident(reg1, &m, &cssid, &ssid, &schid)) {
-program_interrupt(&cpu->env, PGM_OPERAND, 2);
+program_interrupt(&cpu->env, PGM_OPERAND, 4);
 return;
 }
 trace_ioinst_sch_id("csch", cssid, ssid, schid);
@@ -102,7 +102,7 @@ void ioinst_handle_hsch(S390CPU *cpu, uint64_t reg1)
 int cc;
 
 if (ioinst_disassemble_sch_ident(reg1, &m, &cssid, &ssid, &schid)) {
-program_interrupt(&cpu->env, PGM_OPERAND, 2);
+program_interrupt(&cpu->env, PGM_OPERAND, 4);
 return;
 }
 trace_ioinst_sch_id("hsch", cssid, ssid, schid);
@@ -153,7 +153,7 @@ void ioinst_handle_msch(S390CPU *cpu, uint64_t reg1, 
uint32_t ipb)
 
 addr = decode_basedisp_s(env, ipb, &ar);
 if (addr & 3) {
-program_interrupt(env, PGM_SPECIFICATION, 2);
+program_interrupt(env, PGM_SPECIFICATION, 4);
 return;
 }
 if (s390_cpu_virt_mem_read(cpu, addr, ar, &schib, sizeof(schib))) {
@@ -161,7 +161,7 @@ void ioinst_handle_msch(S390CPU *cpu, uint64_t reg1, 
uint32_t ipb)
 }
 if (ioinst_disassemble_sch_ident(reg1, &m, &cssid, &ssid, &schid) ||
 !ioinst_schib_valid(&schib)) {
-program_interrupt(env, PGM_OPERAND, 2);
+program_interrupt(env, PGM_OPERAND, 4);
 return;
 }
 trace_ioinst_sch_id("msch", cssid, ssid, schid);
@@ -224,7 +224,7 @@ void ioinst_handle_ssch(S390CPU *cpu, uint64_t reg1, 
uint32_t ipb)
 
 addr = decode_basedisp_s(env, ipb, &ar);
 if (addr & 3) {
-program_interrupt(env, PGM_SPECIFICATION, 2);
+program_interrupt(env, PGM_SPECIFICATION, 4);
 return;
 }
 if (s390_cpu_virt_mem_read(cpu, addr, ar, &orig_orb, sizeof(orb))) {
@@ -233,7 +233,7 @@ void ioinst_handle_ssch(S390CPU *cpu, uint64_t reg1, 
uint32_t ipb)
 copy_orb_from_guest(&orb, &orig_orb);
 if (ioinst_disassemble_sch_ident(reg1, &m, &cssid, &ssid, &schid) ||
 !ioinst_orb_valid(&orb)) {
-program_interrupt(env, PGM_OPERAND, 2);
+program_interrupt(env, PGM_OPERAND, 4);
 return;
 }
 trace_ioinst_sch_id("ssch", cssid, ssid, schid);
@@ -277,7 +277,7 @@ void ioinst_handle_stcrw(S390CPU *cpu, uint32_t ipb)
 
 addr = decode_basedisp_s(env, ipb, &ar);
 if (addr & 3) {
-program_interrupt(env, PGM_SPECIFICATION, 2);
+program_interrupt(env, PGM_SPECIFICATION, 4);
 return;
 }
 
@@ -304,7 +304,7 @@ void ioinst_handle_stsch(S390CPU *cpu, uint64_t reg1, 
uint32_t ipb)
 
 addr = decode_basedisp_s(env, ipb, &ar);
 if (addr & 3) {
-program_interrupt(env, PGM_SPECIFICATION, 2);
+program_interrupt(env, PGM_SPECIFICATION, 4);
 return;
 }
 
@@ -315,7 +315,7 @@ void ioinst_handle_stsch(S390CPU *cpu, uint64_t reg1, 
uint32_t ipb)
  * access execption if it is not) first.
  */
 if (!s390_cpu_virt_mem_check_write(cpu, addr, ar, sizeof(schib))) {
-program_interrupt(env, PGM_OPERAND, 2);
+program_interrupt(env, PGM_OPERAND, 4);
 }
 return;
 }
@@ -363,13 +363,13 @@ int ioinst_handle_tsch(S390CPU *

[Qemu-devel] [PULL for-2.10 13/14] target/s390x: Add remaining switches to compile with --disable-tcg

2017-07-25 Thread Cornelia Huck

From: Thomas Huth 

Adding some CONFIG_TCG tests to be finally able to compile QEMU
on s390x also without TCG.

Reviewed-by: Richard Henderson 
Signed-off-by: Thomas Huth 
Message-Id: <1500886370-14572-6-git-send-email-th...@redhat.com>
Signed-off-by: Cornelia Huck 
---
 target/s390x/Makefile.objs | 6 +++---
 target/s390x/cpu.c | 4 
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/target/s390x/Makefile.objs b/target/s390x/Makefile.objs
index cc2b4c9257..f42cd1f9cc 100644
--- a/target/s390x/Makefile.objs
+++ b/target/s390x/Makefile.objs
@@ -1,6 +1,6 @@
-obj-y += translate.o helper.o cpu.o interrupt.o
-obj-y += int_helper.o fpu_helper.o cc_helper.o mem_helper.o misc_helper.o
-obj-y += gdbstub.o cpu_models.o cpu_features.o excp_helper.o
+obj-y += cpu.o cpu_models.o cpu_features.o gdbstub.o interrupt.o helper.o
+obj-$(CONFIG_TCG) += translate.o cc_helper.o excp_helper.o fpu_helper.o
+obj-$(CONFIG_TCG) += int_helper.o mem_helper.o misc_helper.o
 obj-$(CONFIG_SOFTMMU) += machine.o ioinst.o arch_dump.o mmu_helper.o diag.o
 obj-$(CONFIG_KVM) += kvm.o
 
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index accef03234..489bc25334 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -417,7 +417,9 @@ static void s390_cpu_class_init(ObjectClass *oc, void *data)
 cc->reset = s390_cpu_full_reset;
 cc->class_by_name = s390_cpu_class_by_name,
 cc->has_work = s390_cpu_has_work;
+#ifdef CONFIG_TCG
 cc->do_interrupt = s390_cpu_do_interrupt;
+#endif
 cc->dump_state = s390_cpu_dump_state;
 cc->set_pc = s390_cpu_set_pc;
 cc->gdb_read_register = s390_cpu_gdb_read_register;
@@ -428,10 +430,12 @@ static void s390_cpu_class_init(ObjectClass *oc, void 
*data)
 cc->get_phys_page_debug = s390_cpu_get_phys_page_debug;
 cc->vmsd = &vmstate_s390_cpu;
 cc->write_elf64_note = s390_cpu_write_elf64_note;
+#ifdef CONFIG_TCG
 cc->cpu_exec_interrupt = s390_cpu_exec_interrupt;
 cc->debug_excp_handler = s390x_cpu_debug_excp_handler;
 cc->do_unaligned_access = s390x_cpu_do_unaligned_access;
 #endif
+#endif
 cc->disas_set_info = s390_cpu_disas_set_info;
 
 cc->gdb_num_core_regs = S390_NUM_CORE_REGS;
-- 
2.13.3

Re: [Qemu-devel] [RFC PATCH 09/26] ppc/xive: add an overall memory region for the ESBs

2017-07-25 Thread Cédric Le Goater

On 07/25/2017 04:19 AM, David Gibson wrote:
> On Mon, Jul 24, 2017 at 03:25:29PM +0200, Cédric Le Goater wrote:
>> On 07/24/2017 08:09 AM, Benjamin Herrenschmidt wrote:
>>> On Mon, 2017-07-24 at 14:49 +1000, David Gibson wrote:
 On Wed, Jul 05, 2017 at 07:13:22PM +0200, Cédric Le Goater wrote:
> Each source adds its own ESB mempry region to the overall ESB memory
> region of the controller. It will be mapped in the CPU address space
> when XIVE is activated.
>
> The default mapping address for the ESB memory region is the same one
> used on baremetal.
>
> Signed-off-by: Cédric Le Goater 
> ---
>  hw/intc/xive-internal.h |  5 +
>  hw/intc/xive.c  | 44 +++-
>  2 files changed, 48 insertions(+), 1 deletion(-)
>
> diff --git a/hw/intc/xive-internal.h b/hw/intc/xive-internal.h
> index 8e755aa88a14..c06be823aad0 100644
> --- a/hw/intc/xive-internal.h
> +++ b/hw/intc/xive-internal.h
> @@ -98,6 +98,7 @@ struct XIVE {
>  SysBusDevice parent;
>  
>  /* Properties */
> +uint32_t chip_id;

 So there is a XIVE object per chip.  How does this work on PAPR?  One
 logical chip/XIVE, or something more complex?
>>>
>>> One global XIVE for PAPR. 
>>
>> Yes. 
>>
>> The chip-id is useless for sPAPR (0 is the default) but for a PowerNV
>> system, the address used to map the ESB memory region depends on the 
>> chip-id and  I thought we could reuse the same XIVE object. 
> 
> Hmm, maybe.

yes. 

I am thinking of greatly simplifying the allocator to fit only 
the sPAPR needs : a simple range of IRQ numbers with the IPIs 
at the beginning, and the HW interrupts starting at an offset 
(in sync with the XICS allocator). That's what I ended doing 
for CAS negotiation.

So we could just call it sPAPRXive and forget about PowerNV
support for the moment. 
 
C.

> 
>> So, a sPAPR guest would use the address of a single chip baremetal 
>> system. This needs more explanation I agree. Thanks to Ben who is 
>> providing a lot. I will update the changelogs in the next version. 
> 
>> The TIMA is mapped at a fixed address so the chip-id does not come 
>> in play.
>>
>>> For the MMIOs, the way it works is that:
>>>
>>>  - For MMIOs pertaining to a specific interrupt or queue, there's an H-
>>> call that will return the proper "guest physical" address. For qemu
>>> with KVM we'll have to probably create a single chunk of qemu address
>>> space (a single mem region) that contains individual pages mapped with
>>> MAP_FIXED originating from the different HW bits, we still need to sort
>>> out how exactly we'll do that in practice.
>>
>> I haven't looked at all the KVM details. But, regarding the ESBs, I had
>> the above in mind and used a single memory region to contain them all. 
>>  
>>>  - For the TIMA (the presentation MMIOs), those are always at the same
>>> physical address for everybody (so for a guest it's a single memory
>>> region we'll map to that physical address), the HW "knows" which HW
>>> thread is talking to it (and the hypervisor tells the HW which vcpu is
>>> running on a given HW thread at a given point in time). That address is
>>> obtained from the device-tree
>>>
>  uint32_t nr_targets;
>  
>  /* IRQ number allocator */
> @@ -111,6 +112,10 @@ struct XIVE {
>  void *sbe;
>  XiveIVE  *ivt;
>  XiveEQ   *eqdt;
> +
> +/* ESB and TIMA memory location */
> +hwaddr   vc_base;
> +MemoryRegion esb_iomem;
>  };
>  
>  void xive_reset(void *dev);
> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> index 8f8bb8b787bd..a1cb87a07b76 100644
> --- a/hw/intc/xive.c
> +++ b/hw/intc/xive.c
> @@ -312,6 +312,7 @@ static void xive_ics_realize(ICSState *ics, Error 
> **errp)
>  XiveICSState *xs = ICS_XIVE(ics);
>  Object *obj;
>  Error *err = NULL;
> +XIVE *x;

 I don't really like just 'x' for a context variable like this (as
 opposed to a temporary).
>>
>> OK. I will change 'x' in 'xive' then.
>>
>  
>  obj = object_property_get_link(OBJECT(xs), "xive", &err);
>  if (!obj) {
> @@ -319,7 +320,7 @@ static void xive_ics_realize(ICSState *ics, Error 
> **errp)
> __func__, error_get_pretty(err));
>  return;
>  }
> -xs->xive = XIVE(obj);
> +x = xs->xive = XIVE(obj);
>  
>  if (!ics->nr_irqs) {
>  error_setg(errp, "Number of interrupts needs to be greater 0");
> @@ -338,6 +339,11 @@ static void xive_ics_realize(ICSState *ics, Error 
> **errp)
>"xive.esb",
>(1ull << xs->esb_shift) * 
> ICS_BASE(xs)->nr_irqs);
>  
> +/* Install the ESB memory region in the overall one */
> +memory_region_add_subregion(&x->esb_

[Qemu-devel] [Bug 1706296] [NEW] Booting NT 4 disk causes /home/rjones/d/qemu/cpus.c:1580:qemu_mutex_lock_iothread: assertion failed: (!qemu_mutex_iothread_locked())

2017-07-25 Thread Richard Jones

Public bug reported:

Grab the NT 4 disk from
https://archive.org/details/Microsoft_Windows_NT_Server_Version_4.0_227-075
-385_CD-KEY_419-1343253_1996

Try to boot it as follows:

qemu-system-x86_64 -hda disk.img -cdrom 
Microsoft_Windows_NT_Server_Version_4.0_227-075-385_CD-KEY_419-1343253_1996.iso 
-m 2048 -boot d -machine pc,accel=tcg
WARNING: Image format was not specified for 'disk.img' and probing guessed raw.
 Automatically detecting the format is dangerous for raw images, write 
operations on block 0 will be restricted.
 Specify the 'raw' format explicitly to remove the restrictions.
**
ERROR:/home/rjones/d/qemu/cpus.c:1580:qemu_mutex_lock_iothread: assertion 
failed: (!qemu_mutex_iothread_locked())
Aborted (core dumped)

The stack trace in the failing thread is:

Thread 4 (Thread 0x7fffb0418700 (LWP 21979)):
#0  0x7fffdd89b64b in raise () at /lib64/libc.so.6
#1  0x7fffdd89d450 in abort () at /lib64/libc.so.6
#2  0x7fffdff8c75d in g_assertion_message () at /lib64/libglib-2.0.so.0
#3  0x7fffdff8c7ea in g_assertion_message_expr ()
at /lib64/libglib-2.0.so.0
#4  0x557a7d00 in qemu_mutex_lock_iothread ()
at /home/rjones/d/qemu/cpus.c:1580
#5  0x557cb429 in io_writex (env=env@entry=0x56751400, 
iotlbentry=0x5675b678, 
iotlbentry@entry=0x5ae40c918, val=val@entry=8, 
addr=addr@entry=2148532220, retaddr=0, retaddr@entry=93825011136120, 
size=size@entry=4)
at /home/rjones/d/qemu/accel/tcg/cputlb.c:795
#6  0x557ce0f7 in io_writel (retaddr=93825011136120, addr=2148532220, 
val=8, index=255, mmu_idx=21845, env=0x56751400)
at /home/rjones/d/qemu/softmmu_template.h:265
#7  0x557ce0f7 in helper_le_stl_mmu (env=env@entry=0x56751400, 
addr=addr@entry=2148532220, val=val@entry=8, oi=, 
retaddr=93825011136120, retaddr@entry=0) at 
/home/rjones/d/qemu/softmmu_template.h:300
#8  0x5587c0a4 in cpu_stl_kernel_ra (env=0x56751400, 
ptr=2148532220, v=8, retaddr=0) at 
/home/rjones/d/qemu/include/exec/cpu_ldst_template.h:182
#9  0x55882610 in do_interrupt_protected (is_hw=, 
next_eip=, error_code=2, is_int=, 
intno=, env=0x56751400) at 
/home/rjones/d/qemu/target/i386/seg_helper.c:758
#10 0x55882610 in do_interrupt_all (cpu=cpu@entry=0x56749170, 
intno=, is_int=, error_code=2, 
next_eip=, is_hw=is_hw@entry=0) at 
/home/rjones/d/qemu/target/i386/seg_helper.c:1252
#11 0x558839d3 in x86_cpu_do_interrupt (cs=0x56749170)
at /home/rjones/d/qemu/target/i386/seg_helper.c:1298
#12 0x557d2ccb in cpu_handle_exception (ret=, 
cpu=0x566a4590) at /home/rjones/d/qemu/accel/tcg/cpu-exec.c:465
#13 0x557d2ccb in cpu_exec (cpu=cpu@entry=0x56749170)
at /home/rjones/d/qemu/accel/tcg/cpu-exec.c:670
#14 0x557a855a in tcg_cpu_exec (cpu=0x56749170)
at /home/rjones/d/qemu/cpus.c:1270
#15 0x557a855a in qemu_tcg_rr_cpu_thread_fn (arg=)
at /home/rjones/d/qemu/cpus.c:1365
#16 0x7fffddc3d36d in start_thread () at /lib64/libpthread.so.0
#17 0x7fffdd975b9f in clone () at /lib64/libc.so.6

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1706296

Title:
  Booting NT 4 disk causes
  /home/rjones/d/qemu/cpus.c:1580:qemu_mutex_lock_iothread: assertion
  failed: (!qemu_mutex_iothread_locked())

Status in QEMU:
  New

Bug description:
  Grab the NT 4 disk from
  https://archive.org/details/Microsoft_Windows_NT_Server_Version_4.0_227-075
  -385_CD-KEY_419-1343253_1996

  Try to boot it as follows:

  qemu-system-x86_64 -hda disk.img -cdrom 
Microsoft_Windows_NT_Server_Version_4.0_227-075-385_CD-KEY_419-1343253_1996.iso 
-m 2048 -boot d -machine pc,accel=tcg
  WARNING: Image format was not specified for 'disk.img' and probing guessed 
raw.
   Automatically detecting the format is dangerous for raw images, 
write operations on block 0 will be restricted.
   Specify the 'raw' format explicitly to remove the restrictions.
  **
  ERROR:/home/rjones/d/qemu/cpus.c:1580:qemu_mutex_lock_iothread: assertion 
failed: (!qemu_mutex_iothread_locked())
  Aborted (core dumped)

  The stack trace in the failing thread is:

  Thread 4 (Thread 0x7fffb0418700 (LWP 21979)):
  #0  0x7fffdd89b64b in raise () at /lib64/libc.so.6
  #1  0x7fffdd89d450 in abort () at /lib64/libc.so.6
  #2  0x7fffdff8c75d in g_assertion_message () at /lib64/libglib-2.0.so.0
  #3  0x7fffdff8c7ea in g_assertion_message_expr ()
  at /lib64/libglib-2.0.so.0
  #4  0x557a7d00 in qemu_mutex_lock_iothread ()
  at /home/rjones/d/qemu/cpus.c:1580
  #5  0x557cb429 in io_writex (env=env@entry=0x56751400, 
iotlbentry=0x5675b678, 
  iotlbentry@entry=0x5ae40c918, val=val@entry=8, 
addr=addr@entry=2148532220, retaddr=0, retaddr@entry=93825011136120, 
size=size@entry=4)
  at /

[Qemu-devel] [PULL for-2.10 12/14] target/s390x: Move exception-related functions to a new excp_helper.c file

2017-07-25 Thread Cornelia Huck

From: Thomas Huth 

These functions can not be compiled with --disable-tcg. But since we
need the other functions from helper.c in the non-tcg build, we can also
not simply remove helper.c from the non-tcg builds. Thus the problematic
functions have to be moved into a separate new file instead that we
can later omit in the non-tcg builds.

Reviewed-by: Richard Henderson 
Signed-off-by: Thomas Huth 
Message-Id: <1500886370-14572-5-git-send-email-th...@redhat.com>
Signed-off-by: Cornelia Huck 
---
 target/s390x/Makefile.objs |   2 +-
 target/s390x/cpu.h |   8 +
 target/s390x/excp_helper.c | 515 +
 target/s390x/helper.c  | 472 +
 4 files changed, 528 insertions(+), 469 deletions(-)
 create mode 100644 target/s390x/excp_helper.c

diff --git a/target/s390x/Makefile.objs b/target/s390x/Makefile.objs
index 80028521ef..cc2b4c9257 100644
--- a/target/s390x/Makefile.objs
+++ b/target/s390x/Makefile.objs
@@ -1,6 +1,6 @@
 obj-y += translate.o helper.o cpu.o interrupt.o
 obj-y += int_helper.o fpu_helper.o cc_helper.o mem_helper.o misc_helper.o
-obj-y += gdbstub.o cpu_models.o cpu_features.o
+obj-y += gdbstub.o cpu_models.o cpu_features.o excp_helper.o
 obj-$(CONFIG_SOFTMMU) += machine.o ioinst.o arch_dump.o mmu_helper.o diag.o
 obj-$(CONFIG_KVM) += kvm.o
 
diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index 4041bfc4bc..29fdd5d49a 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -801,6 +801,8 @@ static inline void setcc(S390CPU *cpu, uint64_t cc)
 env->cc_op = cc;
 }
 
+#ifndef CONFIG_USER_ONLY
+
 typedef struct LowCore
 {
 /* prefix area: defined by architecture */
@@ -918,6 +920,11 @@ typedef struct LowCore
 uint8_t pad18[0x2000-0x1400];  /* 0x1400 */
 } QEMU_PACKED LowCore;
 
+LowCore *cpu_map_lowcore(CPUS390XState *env);
+void cpu_unmap_lowcore(LowCore *lowcore);
+
+#endif
+
 /* STSI */
 #define STSI_LEVEL_MASK 0xf000ULL
 #define STSI_LEVEL_CURRENT  0xULL
@@ -1094,6 +1101,7 @@ struct sysib_322 {
 #define SIGP_ORDER_MASK 0x00ff
 
 void load_psw(CPUS390XState *env, uint64_t mask, uint64_t addr);
+uint64_t get_psw_mask(CPUS390XState *env);
 target_ulong mmu_real2abs(CPUS390XState *env, target_ulong raddr);
 int mmu_translate(CPUS390XState *env, target_ulong vaddr, int rw, uint64_t asc,
   target_ulong *raddr, int *flags, bool exc);
diff --git a/target/s390x/excp_helper.c b/target/s390x/excp_helper.c
new file mode 100644
index 00..d1833772d5
--- /dev/null
+++ b/target/s390x/excp_helper.c
@@ -0,0 +1,515 @@
+/*
+ * s390x exception / interrupt helpers
+ *
+ *  Copyright (c) 2009 Ulrich Hecht
+ *  Copyright (c) 2011 Alexander Graf
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "cpu.h"
+#include "qemu/timer.h"
+#include "exec/exec-all.h"
+#include "exec/cpu_ldst.h"
+#include "hw/s390x/ioinst.h"
+#ifndef CONFIG_USER_ONLY
+#include "sysemu/sysemu.h"
+#endif
+
+/* #define DEBUG_S390 */
+/* #define DEBUG_S390_STDOUT */
+
+#ifdef DEBUG_S390
+#ifdef DEBUG_S390_STDOUT
+#define DPRINTF(fmt, ...) \
+do { fprintf(stderr, fmt, ## __VA_ARGS__); \
+ if (qemu_log_separate()) { qemu_log(fmt, ##__VA_ARGS__); } } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+do { qemu_log(fmt, ## __VA_ARGS__); } while (0)
+#endif
+#else
+#define DPRINTF(fmt, ...) \
+do { } while (0)
+#endif
+
+#if defined(CONFIG_USER_ONLY)
+
+void s390_cpu_do_interrupt(CPUState *cs)
+{
+cs->exception_index = -1;
+}
+
+int s390_cpu_handle_mmu_fault(CPUState *cs, vaddr address,
+  int rw, int mmu_idx)
+{
+S390CPU *cpu = S390_CPU(cs);
+
+cs->exception_index = EXCP_PGM;
+cpu->env.int_pgm_code = PGM_ADDRESSING;
+/* On real machines this value is dropped into LowMem.  Since this
+   is userland, simply put this someplace that cpu_loop can find it.  */
+cpu->env.__excp_addr = address;
+return 1;
+}
+
+#else /* !CONFIG_USER_ONLY */
+
+int s390_cpu_handle_mmu_fault(CPUState *cs, vaddr orig_vaddr,
+  int rw, int mmu_idx)
+{
+S390CPU *cpu = S390_CPU(cs);
+CPUS390XState *env = &cpu->env;
+uint64_t asc = cpu_mmu_idx_to_asc(mmu_idx);
+target_ulong vaddr, raddr;
+int prot;

Re: [Qemu-devel] [PATCH V3] rtc: fix a infinite loop in windows vmstartup

2017-07-25 Thread Paolo Bonzini

On 25/07/2017 11:29, peng.h...@zte.com.cn wrote:
>>On 25/07/2017 06:14, peng.h...@zte.com.cn wrote:
> 
 On 24/07/2017 20:35, Peng Hao wrote:
>>> 
>>> 
>>> 
>>> 
>>> 
> When a windows vm starts, periodic timer of rtc will stop several times.
> windows kernel will check whether REG_A_UIP is changed. REG_C's interrupt
> flags will not be cleared when periodic timer stops and the update timer
> will switch to alarm timer. So the expiration time of alarm timer is very
> long and REG_A_UIP will not vary.At last windows kernel will repeat to 
> check REG_A_UIP all the time.
>>> 
 This should not happen.  REG_A_UIP is set and cleared in register A
 every second, like this:
case RTC_REG_A:
if (update_in_progress(s)) {
s->cmos_data[s->cmos_index] |= REG_A_UIP
} else {
s->cmos_data[s->cmos_index] &= ~REG_A_UIP
}
ret = s->cmos_data[s->cmos_index]
break
>>> 
>>> 
>>> 
>>> when periodic timer stop, update timer is set to a long expire time (as 
>>> alarm timer).
> 
>>I think I see the bug now:
> 
>>diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
>>index 1b8d3d7d4c..6184b4378e 100644
>>--- a/hw/timer/mc146818rtc.c
>>+++ b/hw/timer/mc146818rtc.c
>>@@ -321,9 +321,11 @@ static void check_update_timer(RTCState *s)
>> s->next_alarm_time = next_update_time +
>>  (next_alarm_sec - 1) * NANOSECONDS_PER_SECOND;
>> 
>>-if (s->cmos_data[RTC_REG_C] & REG_C_UF) {
>>-/* UF is set, but AF is clear.  Program the timer to target
>>- * the alarm time.  */
>>+if ((s->cmos_data[RTC_REG_C] & REG_C_UF) &&
>>+!(s->cmos_data[RTC_REG_A] & REG_A_UIP)) {
>>+/* If UIP was latched, we need to clear it at the next update.
>>+ * Otherwise, if UF is set we only need to program the timer to
>>+ * target the alarm time.  */
>> next_update_time = s->next_alarm_time;
>> }
>> if (next_update_time != timer_expire_time_ns(s->update_timer)) {
>>
>>but I would like to have a testcase for it in tests/rtc-test.c.
>>Can you check if the above works and try writing a testcase (that fails 
>>without the patch and succeeds with it)?
>>Thanks,
> 
> I don't  think it can works.

Have you tested it?

> REG_C_UF is totally cleared by periodic timer in original code.
> after periodic timer stoped, the REG_C_UF  is never cleared.

Your patch cleared REG_C_UF, but you have not said why the actual
hardware would clear REG_C_UF (it wouldn't).

> rtc_update_timer has cleared REG_A_UIP,I think  the patch  does nothing.
> 
> I reproduce the bug when many windows VMs reboot and it is about the
> method of  windows kernel accessing rtc .
> 
> so i don't know how to write the testcase.

First of all you have to describe the exact sequence of register
accesses that causes the problem.  I can help you writing the testcase,
but your changes to the device model must reflect the actual behavior of
the hardware.

Paolo

Re: [Qemu-devel] [PATCH v5 4/4] sockets: Handle race condition between binds to the same port

2017-07-25 Thread Daniel P. Berrange

On Sat, Jul 22, 2017 at 09:49:33AM +0200, Knut Omang wrote:
> If an offset of ports is specified to the inet_listen_saddr function(),
> and two or more processes tries to bind from these ports at the same time,
> occasionally more than one process may be able to bind to the same
> port. The condition is detected by listen() but too late to avoid a failure.
> 
> This function is called by socket_listen() and used
> by all socket listening code in QEMU, so all cases where any form of dynamic
> port selection is used should be subject to this issue.
> 
> Add code to close and re-establish the socket when this
> condition is observed, hiding the race condition from the user.
> 
> Also clean up some issues with error handling to allow more
> accurate reporting of the cause of an error.
> 
> This has been developed and tested by means of the
> test-listen unit test in the previous commit.
> Enable the test for make check now that it passes.
> 
> Signed-off-by: Knut Omang 
> Reviewed-by: Bhavesh Davda 
> Reviewed-by: Yuval Shaia 
> Reviewed-by: Girish Moodalbail 
> Signed-off-by: Knut Omang 
> ---
>  tests/Makefile.include |  2 +-
>  util/qemu-sockets.c| 51 ---
>  2 files changed, 39 insertions(+), 14 deletions(-)
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index b37c0c8..9d2131d 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -128,7 +128,7 @@ check-unit-y += tests/test-bufferiszero$(EXESUF)
>  gcov-files-check-bufferiszero-y = util/bufferiszero.c
>  check-unit-y += tests/test-uuid$(EXESUF)
>  check-unit-y += tests/ptimer-test$(EXESUF)
> -#check-unit-y += tests/test-listen$(EXESUF)
> +check-unit-y += tests/test-listen$(EXESUF)
>  gcov-files-ptimer-test-y = hw/core/ptimer.c
>  check-unit-y += tests/test-qapi-util$(EXESUF)
>  gcov-files-test-qapi-util-y = qapi/qapi-util.c
> diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
> index 39f207c..8ca3ba6 100644
> --- a/util/qemu-sockets.c
> +++ b/util/qemu-sockets.c
> @@ -210,7 +210,9 @@ static int inet_listen_saddr(InetSocketAddress *saddr,
>  char port[33];
>  char uaddr[INET6_ADDRSTRLEN+1];
>  char uport[33];
> -int slisten, rc, port_min, port_max, p;
> +int rc, port_min, port_max, p;
> +int slisten = 0;
> +int saved_errno = 0;
>  Error *err = NULL;
>  
>  memset(&ai,0, sizeof(ai));
> @@ -277,27 +279,50 @@ static int inet_listen_saddr(InetSocketAddress *saddr,
>  port_max = saddr->has_to ? saddr->to + port_offset : port_min;
>  for (p = port_min; p <= port_max; p++) {
>  inet_setport(e, p);
> -if (try_bind(slisten, saddr, e) >= 0) {
> -goto listen;
> -}
> -if (p == port_max) {
> -if (!e->ai_next) {
> +rc = try_bind(slisten, saddr, e);
> +if (rc) {
> +if (errno == EADDRINUSE) {

Add in  '&& e->ai_next' here, so we trigger the error handling
on the else branch if this was the last address we had.

> +continue;
> +} else {
>  error_setg_errno(errp, errno, "Failed to bind socket");
> +goto listen_failed;
> +}
> +}
> +rc = listen(slisten, 1);
> +if (!rc) {

Assigning to rc seems redundant here.

> +goto listen_ok;
> +} else if (errno == EADDRINUSE) {
> +/* Someone else managed to bind to the same port and beat us
> + * to listen on it! Socket semantics does not allow us to
> + * recover from this situation, so we need to recreate the
> + * socket to allow bind attempts for subsequent ports:
> + */
> +closesocket(slisten);
> +slisten = create_fast_reuse_socket(e, errp);

create_fast_reuse_socket may report an error here

> +if (slisten >= 0) {
> +continue;
>  }
>  }
> +error_setg_errno(errp, errno, "Failed to listen on socket");

at which point you report another error on top of it with a stale
errno value here. This bug would be more obvious if error reporting
was pulled out of create_fast_reuse_socket() as described in the
earlier patch.

I think it'd be clearer if we reduced the nesting here. eg

 if (listen() == 0) {
goto listen_ok;
 }
 if (errno != EADDRINUSE) {
 error_setg_error(errp, errno, "");
 goto listen_failed;
 }
 closesocket(listen);
 slisten = create_fast_reuse_socket(e);
 if (slisten < 0 && !e->ai_next) {
 error_setg_error(errp, errno, "");
 goto listen_failed;
 }

> +goto listen_failed;
>  }
> +}
> +if (err) {
> +error_propagate(errp, err);
> +} else {
> +error_setg_errno(errp, errno, "Failed to find an available port");
> +}

Re: [Qemu-devel] [PULL for-2.10 00/14] A set of s390x patches

2017-07-25 Thread no-reply

Hi,

This series failed build test on FreeBSD host. Please find the details below.

Message-id: 20170725094026.5376-1-coh...@redhat.com
Subject: [Qemu-devel] [PULL for-2.10 00/14] A set of s390x patches
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/sh
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e
echo "=== ENV ==="
env
echo "=== PACKAGES ==="
pkg info
echo "=== TEST BEGIN ==="
CC=/usr/local/libexec/ccache/cc
INSTALL=$PWD/install
BUILD=$PWD/build
echo -n "Using CC: "
realpath $CC
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL --target-list=x86_64-softmmu
gmake -j4
# XXX: we need reliable clean up
# make check -j4 V=1
gmake install
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
fatal: unable to access 'https://github.com/patchew-project/qemu/': OpenSSL 
SSL_connect: SSL_ERROR_SYSCALL in connection to github.com:443 
error: Could not fetch 3c8cf5a9c21ff8782164d1def7f44bd888713384
Traceback (most recent call last):
  File "/home/patchew-tester/patchew/patchew-cli", line 412, in test_one
git_clone_repo(clone, r["repo"], r["head"], logf)
  File "/home/patchew-tester/patchew/patchew-cli", line 48, in git_clone_repo
stdout=logf, stderr=logf)
  File "/usr/local/lib/python3.5/subprocess.py", line 271, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['git', 'remote', 'add', '-f', 
'--mirror=fetch', '3c8cf5a9c21ff8782164d1def7f44bd888713384', 
'https://github.com/patchew-project/qemu']' returned non-zero exit status 1



---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] Status and RFC of patchew testings on QEMU

2017-07-25 Thread Peter Maydell

On 17 July 2017 at 11:26, Daniel P. Berrange  wrote:
> On Mon, Jul 17, 2017 at 11:06:12AM +0100, Peter Maydell wrote:
>> The current 'quiet' mode is not quite aimed at the same purpose:
>> it's good for interactive use where you don't want the long detail
>> of command lines but you do want some periodic indication of
>> progress through the compile. For entirely noninteractive
>> setups like travis and patchew there's no need to produce
>> what is in effect a rather verbose progress bar...
>
> For unattended automated systems though, I think running with V=1 is
> preferrable to quiet mode. When you don't have direct access to the
> test system to reproduce systems, every bit of information that you
> can get from the logs is potentially useful, including full compiler
> args. The flipside is that in other cases the verbose output can
> obscure the error messages you are after - depends what your're trying
> to debug - code problems vs build system problems.

The problem in both cases is that it makes the logs nearly
unusable. For travis, their log-pretty-printer takes forever
to try to render our enormously long build logs, to the point
where you're better off grabbing the raw text-only log.
For patchew, I basically ignore patchew compile failure emails
because it is too painful to get to the bottom of the email
where the actual error message is because of the pages and
pages and pages of useless progress output :-(

thanks
-- PMM

[Qemu-devel] [PATCH v2] char: don't exit on hmp 'chardev-add help'

2017-07-25 Thread Anton Nefedov

qemu_chr_new_from_opts() is used from both vl.c and hmp,
and it is quite confusing to see qemu suddenly exit after receiving a help
option in hmp.

Do exit(0) from vl.c instead.

Signed-off-by: Anton Nefedov 
---
 include/chardev/char.h |  4 +++-
 chardev/char.c |  2 +-
 vl.c   | 10 ++
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/include/chardev/char.h b/include/chardev/char.h
index 1604ea9..66dde46 100644
--- a/include/chardev/char.h
+++ b/include/chardev/char.h
@@ -65,7 +65,9 @@ struct Chardev {
  *
  * @opts see qemu-config.c for a list of valid options
  *
- * Returns: a new character backend
+ * Returns: on success: a new character backend
+ *  otherwise:  NULL; @errp specifies the error
+ *or left untouched in case of help option
  */
 Chardev *qemu_chr_new_from_opts(QemuOpts *opts,
 Error **errp);
diff --git a/chardev/char.c b/chardev/char.c
index c34b44a..5d283b9 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -620,7 +620,7 @@ Chardev *qemu_chr_new_from_opts(QemuOpts *opts, Error 
**errp)
 
 error_report("Available chardev backend types: %s", str->str);
 g_string_free(str, true);
-exit(0);
+return NULL;
 }
 
 if (id == NULL) {
diff --git a/vl.c b/vl.c
index fb6b2ef..47bfe51 100644
--- a/vl.c
+++ b/vl.c
@@ -2344,10 +2344,12 @@ static int chardev_init_func(void *opaque, QemuOpts 
*opts, Error **errp)
 {
 Error *local_err = NULL;
 
-qemu_chr_new_from_opts(opts, &local_err);
-if (local_err) {
-error_report_err(local_err);
-return -1;
+if (!qemu_chr_new_from_opts(opts, &local_err)) {
+if (local_err) {
+error_report_err(local_err);
+return -1;
+}
+exit(0);
 }
 return 0;
 }
-- 
2.7.4

Re: [Qemu-devel] [PULL for-2.10 00/14] A set of s390x patches

2017-07-25 Thread Cornelia Huck

On Tue, 25 Jul 2017 02:58:27 -0700 (PDT)
no-re...@patchew.org wrote:

> Hi,
> 
> This series failed build test on FreeBSD host. Please find the details below.
> 
> Message-id: 20170725094026.5376-1-coh...@redhat.com
> Subject: [Qemu-devel] [PULL for-2.10 00/14] A set of s390x patches
> Type: series
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/sh
> # Testing script will be invoked under the git checkout with
> # HEAD pointing to a commit that has the patches applied on top of "base"
> # branch
> set -e
> echo "=== ENV ==="
> env
> echo "=== PACKAGES ==="
> pkg info
> echo "=== TEST BEGIN ==="
> CC=/usr/local/libexec/ccache/cc
> INSTALL=$PWD/install
> BUILD=$PWD/build
> echo -n "Using CC: "
> realpath $CC
> mkdir -p $BUILD $INSTALL
> SRC=$PWD
> cd $BUILD
> $SRC/configure --cc=$CC --prefix=$INSTALL --target-list=x86_64-softmmu
> gmake -j4
> # XXX: we need reliable clean up
> # make check -j4 V=1
> gmake install
> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> fatal: unable to access 'https://github.com/patchew-project/qemu/': OpenSSL 
> SSL_connect: SSL_ERROR_SYSCALL in connection to github.com:443 

Hm... connection issues?

> error: Could not fetch 3c8cf5a9c21ff8782164d1def7f44bd888713384
> Traceback (most recent call last):
>   File "/home/patchew-tester/patchew/patchew-cli", line 412, in test_one
> git_clone_repo(clone, r["repo"], r["head"], logf)
>   File "/home/patchew-tester/patchew/patchew-cli", line 48, in git_clone_repo
> stdout=logf, stderr=logf)
>   File "/usr/local/lib/python3.5/subprocess.py", line 271, in check_call
> raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '['git', 'remote', 'add', '-f', 
> '--mirror=fetch', '3c8cf5a9c21ff8782164d1def7f44bd888713384', 
> 'https://github.com/patchew-project/qemu']' returned non-zero exit status 1
> 
> 
> 
> ---
> Email generated automatically by Patchew [http://patchew.org/].
> Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PULL 0/4] ppc-for-2.10 queue 20170725

2017-07-25 Thread Peter Maydell

On 25 July 2017 at 06:45, David Gibson  wrote:
> The following changes since commit b5a74cd81d76cb467552f38f2b39520d07c65ea2:
>
>   Merge remote-tracking branch 
> 'remotes/pmaydell/tags/pull-target-arm-20170724' into staging (2017-07-24 
> 18:15:45 +0100)
>
> are available in the git repository at:
>
>   git://github.com/dgibson/qemu.git tags/ppc-for-2.10-20170725
>
> for you to fetch changes up to 86844c213348a596ea716a44a6f337555e07ad09:
>
>   pseries: Update SLOF firmware image (2017-07-25 14:35:42 +1000)
>
> 
> ppc patch queue 2017-07-25
>
> Last pull request for the 2.10 hard freeze, and correspondingly small.
> There are a handful of bugfixes here plus an update for the "pseries"
> guest firmware (SLOF).
>
> This is later than ideal for a guest firmware update.  However, this
> does include a number of fixes in that guest firmware, so I think it's
> worth the risk of squeezing this in just before the hard freeze.
>
> 
> Alexey Kardashevskiy (2):
>   spapr_pci: Fix obsolete comment about MSIX encoding in addr/data
>   pseries: Update SLOF firmware image
>
> Bharata B Rao (1):
>   spapr: Fix QEMU abort during memory unplug
>
> Laurent Vivier (1):
>   spapr/htab: fix savevm

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH 4/7] block: convert ThrottleGroup to object with QOM

2017-07-25 Thread Manos Pitsidianakis


On Mon, Jul 24, 2017 at 04:12:47PM +0100, Stefan Hajnoczi wrote:

On Fri, Jul 14, 2017 at 12:45:18PM +0300, Manos Pitsidianakis wrote:

ThrottleGroup is converted to an object. This will allow the future
throttle block filter drive easy creation and configuration of throttle
groups in QMP and cli.

A new QAPI struct, ThrottleLimits, is introduced to provide a shared
struct for all throttle configuration needs in QMP.

ThrottleGroups can be created via CLI as
-object throttling-group,id=foo,x-iops-total=100,x-..


Please make the QOM name and struct name consistent.  Either
ThrottleGroup/throttle-group or ThrottlingGroup/throttling-group but not
ThrottleGroup/throttling-group.


I did this on purpose because current throttling has ThrottleGroup 
internally and throttling.group in the command line. Should we keep this 
only in legacy and make it throttle-group everywhere?



where x-* are individual limit properties. Since we can't add non-scalar
properties in -object this interface must be used instead. However,
setting these properties must be disabled after initialization because
certain combinations of limits are forbidden and thus configuration
changes should be done in one transaction. The individual properties
will go away when support for non-scalar values in CLI is implemented
and thus are marked as experimental.

ThrottleGroup also has a `limits` property that uses the ThrottleLimits
struct.  It can be used to create ThrottleGroups or set the
configuration in existing groups as follows:

{ "execute": "object-add",
  "arguments": {
"qom-type": "throttling-group",
"id": "foo",
"props" : {
  "limits": {
  "iops-total": 100
  }
}
  }
}
{ "execute" : "qom-set",
"arguments" : {
"path" : "foo",
"property" : "limits",
"value" : {
"iops-total" : 99
}
}
}

This also means a group's configuration can be fetched with qom-get.

ThrottleGroups can be anonymous which means they can't get accessed by
other users ie they will always be units instead of group (Because they
have one ThrottleGroupMember).

Signed-off-by: Manos Pitsidianakis 
---

Notes:
Note: I tested Markus Armbruster's patch in 
<87wp7fghi9@dusky.pond.sub.org>
on master and I can use this syntax successfuly:
-object '{ "qom-type" : "throttling-group", "id" : "foo", "props" : { 
"limits" \
: { "iops-total" : 1000 } } }'
If this gets merged using -object will be a little more verbose but at 
least we
won't have seperate properties, which is a good thing, so the x-* should be
dropped.

 block/throttle-groups.c | 507 +---
 include/block/throttle-groups.h |   3 +
 include/qemu/throttle-options.h |  59 +++--
 include/qemu/throttle.h |   3 +
 qapi/block-core.json|  45 
 tests/test-throttle.c   |   1 +
 util/throttle.c | 121 ++
 7 files changed, 689 insertions(+), 50 deletions(-)

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index f711a3dc62..4d1f82ec06 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -25,9 +25,17 @@
 #include "qemu/osdep.h"
 #include "sysemu/block-backend.h"
 #include "block/throttle-groups.h"
+#include "qemu/throttle-options.h"
 #include "qemu/queue.h"
 #include "qemu/thread.h"
 #include "sysemu/qtest.h"
+#include "qapi/error.h"
+#include "qapi-visit.h"
+#include "qom/object.h"
+#include "qom/object_interfaces.h"
+
+static void throttle_group_obj_init(Object *obj);
+static void throttle_group_obj_complete(UserCreatable *obj, Error **errp);

 /* The ThrottleGroup structure (with its ThrottleState) is shared
  * among different ThrottleGroupMembers and it's independent from
@@ -54,6 +62,10 @@
  * that ThrottleGroupMember has throttled requests in the queue.
  */
 typedef struct ThrottleGroup {
+Object parent_obj;
+
+/* refuse individual property change if initialization is complete */
+bool is_initialized;
 char *name; /* This is constant during the lifetime of the group */

 QemuMutex lock; /* This lock protects the following four fields */
@@ -63,8 +75,7 @@ typedef struct ThrottleGroup {
 bool any_timer_armed[2];
 QEMUClockType clock_type;

-/* These two are protected by the global throttle_groups_lock */
-unsigned refcount;
+/* This is protected by the global throttle_groups_lock */
 QTAILQ_ENTRY(ThrottleGroup) list;
 } ThrottleGroup;

@@ -77,7 +88,8 @@ static QTAILQ_HEAD(, ThrottleGroup) throttle_groups =
  * If no ThrottleGroup is found with the given name a new one is
  * created.
  *
- * @name: the name of the ThrottleGroup
+ * @name: the name of the ThrottleGroup, NULL means a new anonymous group will
+ *be created.
  * @ret:  the ThrottleState member of the ThrottleGroup
  */
 ThrottleState *throttle_group_incref(const char *name)
@@ -87,32 +99,30 @@ ThrottleState *throttle_group_incref(const char *name)

 qemu_mutex_lock

Re: [Qemu-devel] [PATCH] migration: optimize the downtime

2017-07-25 Thread Dr. David Alan Gilbert

* Jay Zhou (jianjay.z...@huawei.com) wrote:
> 
> On 2017/7/24 23:35, Dr. David Alan Gilbert wrote:
> > * Jay Zhou (jianjay.z...@huawei.com) wrote:
> > > Hi Dave,
> > > 
> > > On 2017/7/21 17:49, Dr. David Alan Gilbert wrote:
> > > > * Jay Zhou (jianjay.z...@huawei.com) wrote:
> > > > > Qemu_savevm_state_cleanup() takes about 300ms in my ram migration 
> > > > > tests
> > > > > with a 8U24G vm(20G is really occupied), the main cost comes from
> > > > > KVM_SET_USER_MEMORY_REGION ioctl when mem.memory_size = 0 in
> > > > > kvm_set_user_memory_region(). In kmod, the main cost is
> > > > > kvm_zap_obsolete_pages(), which traverses the active_mmu_pages list to
> > > > > zap the unsync sptes.
> > > > 
> > > > Hi Jay,
> > > > Is this actually increasing the real downtime when the guest isn't
> > > > running, or is it just the reported time? I see that the s->downtime
> > > > value is calculated right after where we currently call
> > > > qemu_savevm_state_cleanup.
> > > 
> > > It actually increased the real downtime, I used the "ping" command to
> > > test. Reason is that the source side libvirt sends qmp to qemu to query
> > > the status of migration, which needs the BQL. qemu_savevm_state_cleanup
> > > is done with BQL, qemu can not handle the qmp if qemu_savevm_state_cleanup
> > > has not finished. And the source side libvirt delays about 300ms to notify
> > > the destination side libvirt to send the "cont" command to start the vm.
> > > 
> > > I think the value of s->downtime is not accurate enough, maybe we could
> > > move the calculation of end_time after qemu_savevm_state_cleanup has done.
> > 
> > I'm copying in Paolo, Radim and Andrea- is there anyway we can make the
> > teardown of KVMs dirty tracking not take so long? 300ms is a silly long time
> > on only a small VM.
> > 
> > > > I guess the biggest problem is that 300ms happens before we restart
> > > > the guest on the source if a migration fails.
> > > 
> > > 300ms happens even if a migration succeeds.
> > 
> > Hmm, OK, this needs fixing then - it does explain a result I saw a while
> > ago where the downtime was much bigger with libvirt than it was with
> > qemu on it's own.
> > 
> > > > > I think it can be optimized:
> > > > > (1) source vm will be destroyed if the migration is successfully done,
> > > > >   so the resources will be cleanuped automatically by the system
> > > > > (2) delay the cleanup if the migration failed
> > > > 
> > > > I don't like putting it in qmp_cont; that shouldn't have migration magic
> > > > in it.
> > > 
> > > Yes, it is not a ideal place. :(
> > > 
> > > > I guess we could put it in migrate_fd_cleanup perhaps? It gets called on
> > > > a bh near the end -  or could we just move it closer to the end of
> > > > migration_thread?
> > > 
> > > I have tested putting it in migrate_fd_cleanup, but the downtime is not
> > > optimized. So I think it is the same to move it closer to the end of
> > > migration_thread if it holds the BQL.
> > > Could we put it in migrate_init?
> > 
> > Your explanation above hints as to why migrate_fd_cleanup doesn't help;
> > it's because we're still going to be doing it with the BQL taken.
> 
> Yes, it is.
> 
> > Can you tell me which version of libvirt you're using?
> 
> I'm using 1.3.4
> 
> > I thought the newer ones were supposed to use events so they did't
> > have to poll qemu.
> 
> After checking the codes of the newest libvirt, I think it is the same
> in the qemuMigrationWaitForCompletion function, which is used to poll
> qemu every 50ms.

Checking with Jiri Denemark (added to cc), newer libvirt should use
events when available - but that polling code is there to cope with
older qemu's.  So with a newer qemu, i think it should spot the
COMPLETED event.

Dave

> Thanks,
> Jay
> 
> >   If we move qemu_savevm_state_cleanup is it still safe? Are there
> > some things we're supposed to do at that point which are wrong if
> > we don't.
> > 
> > I wonder about something like;  take a mutex in
> > memory_global_dirty_log_start, release it in
> > memory_global_dirty_log_stop.  Then make ram_save_cleanup start
> > a new thread that does the call to memory_global_dirty_log_stop.
> > 
> > Dave
> > 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v2] char: don't exit on hmp 'chardev-add help'

2017-07-25 Thread Paolo Bonzini

On 25/07/2017 12:04, Anton Nefedov wrote:
> qemu_chr_new_from_opts() is used from both vl.c and hmp,
> and it is quite confusing to see qemu suddenly exit after receiving a help
> option in hmp.
> 
> Do exit(0) from vl.c instead.

Queued, thanks.

Paolo

> Signed-off-by: Anton Nefedov 
> ---
>  include/chardev/char.h |  4 +++-
>  chardev/char.c |  2 +-
>  vl.c   | 10 ++
>  3 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/include/chardev/char.h b/include/chardev/char.h
> index 1604ea9..66dde46 100644
> --- a/include/chardev/char.h
> +++ b/include/chardev/char.h
> @@ -65,7 +65,9 @@ struct Chardev {
>   *
>   * @opts see qemu-config.c for a list of valid options
>   *
> - * Returns: a new character backend
> + * Returns: on success: a new character backend
> + *  otherwise:  NULL; @errp specifies the error
> + *or left untouched in case of help option
>   */
>  Chardev *qemu_chr_new_from_opts(QemuOpts *opts,
>  Error **errp);
> diff --git a/chardev/char.c b/chardev/char.c
> index c34b44a..5d283b9 100644
> --- a/chardev/char.c
> +++ b/chardev/char.c
> @@ -620,7 +620,7 @@ Chardev *qemu_chr_new_from_opts(QemuOpts *opts, Error 
> **errp)
>  
>  error_report("Available chardev backend types: %s", str->str);
>  g_string_free(str, true);
> -exit(0);
> +return NULL;
>  }
>  
>  if (id == NULL) {
> diff --git a/vl.c b/vl.c
> index fb6b2ef..47bfe51 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2344,10 +2344,12 @@ static int chardev_init_func(void *opaque, QemuOpts 
> *opts, Error **errp)
>  {
>  Error *local_err = NULL;
>  
> -qemu_chr_new_from_opts(opts, &local_err);
> -if (local_err) {
> -error_report_err(local_err);
> -return -1;
> +if (!qemu_chr_new_from_opts(opts, &local_err)) {
> +if (local_err) {
> +error_report_err(local_err);
> +return -1;
> +}
> +exit(0);
>  }
>  return 0;
>  }
>

[Qemu-devel] [qemu-web PATCH] Add a blog post about deprecation of old interfaces and features

2017-07-25 Thread Thomas Huth

The list of deprecated interfaces/features in the wiki should be pretty
complete now, so it is now time to draw some more public attention to our
plans of removing certain interfaces/features in future releases.

Signed-off-by: Thomas Huth 
---
 _posts/2017-07-25-deprecation.md | 26 ++
 1 file changed, 26 insertions(+)
 create mode 100644 _posts/2017-07-25-deprecation.md

diff --git a/_posts/2017-07-25-deprecation.md b/_posts/2017-07-25-deprecation.md
new file mode 100644
index 000..b5eaf0b
--- /dev/null
+++ b/_posts/2017-07-25-deprecation.md
@@ -0,0 +1,26 @@
+---
+layout: post
+title:  "Deprecation of old parameters and features"
+date:   2017-07-25 9:00:00 +0200
+author: Thomas Huth
+categories: [features, 'web site']
+---
+QEMU has a lot of interfaces (like command line options or HMP commands) and
+old features (like certain devices) which are considered as deprecated
+since other more generic or better interfaces/features have been established
+instead. While the QEMU developers are generally trying to keep each QEMU
+release compatible with the previous ones, the old legacy sometimes gets into
+the way when developing new code and/or causes quite some burden of maintaining
+it.
+
+Thus we are currently considering to get rid of some of the old interfaces
+and features in a future release and have started to collect a list of such
+old items in our Wiki on a
+[page about removing legacy 
parts](http://wiki.qemu.org/Features/LegacyRemoval).
+If you are running QEMU directly, please have a look at this page to see
+whether you are still using one of these old interfaces or features, so you
+can adapt your setup to use the new interfaces or features instead. Or if
+you rather think that one of the legacy interfaces/features should *not* be
+removed from QEMU at all, please speak up on the
+[qemu-devel mailing list](http://wiki.qemu.org/Contribute/MailingLists)
+to explain why the interface or feature is still required.
-- 
1.8.3.1

Re: [Qemu-devel] [qemu-web PATCH] Add a blog post about deprecation of old interfaces and features

2017-07-25 Thread Paolo Bonzini

On 25/07/2017 13:07, Thomas Huth wrote:
> The list of deprecated interfaces/features in the wiki should be pretty
> complete now, so it is now time to draw some more public attention to our
> plans of removing certain interfaces/features in future releases.
> 
> Signed-off-by: Thomas Huth 
> ---
>  _posts/2017-07-25-deprecation.md | 26 ++
>  1 file changed, 26 insertions(+)
>  create mode 100644 _posts/2017-07-25-deprecation.md
> 
> diff --git a/_posts/2017-07-25-deprecation.md 
> b/_posts/2017-07-25-deprecation.md
> new file mode 100644
> index 000..b5eaf0b
> --- /dev/null
> +++ b/_posts/2017-07-25-deprecation.md
> @@ -0,0 +1,26 @@
> +---
> +layout: post
> +title:  "Deprecation of old parameters and features"
> +date:   2017-07-25 9:00:00 +0200
> +author: Thomas Huth
> +categories: [features, 'web site']

Maybe s/web site/wiki/?

> +---
> +QEMU has a lot of interfaces (like command line options or HMP commands) and
> +old features (like certain devices) which are considered as deprecated
> +since other more generic or better interfaces/features have been established
> +instead. While the QEMU developers are generally trying to keep each QEMU
> +release compatible with the previous ones, the old legacy sometimes gets into
> +the way when developing new code and/or causes quite some burden of 
> maintaining
> +it.
> +
> +Thus we are currently considering to get rid of some of the old interfaces
> +and features in a future release and have started to collect a list of such
> +old items in our Wiki on a
> +[page about removing legacy 
> parts](http://wiki.qemu.org/Features/LegacyRemoval).
> +If you are running QEMU directly, please have a look at this page to see
> +whether you are still using one of these old interfaces or features, so you
> +can adapt your setup to use the new interfaces or features instead. Or if
> +you rather think that one of the legacy interfaces/features should *not* be
> +removed from QEMU at all, please speak up on the
> +[qemu-devel mailing list](http://wiki.qemu.org/Contribute/MailingLists)
> +to explain why the interface or feature is still required.

This text looks good.

However, we should first finalize Daniel's patches and update the wiki
to match the newly-instated policy.  The blog post might also include
the text that is added to the manual.

Thanks,

Paolo

Re: [Qemu-devel] [PATCH] qemu-iotests: Fix reference output for 186

2017-07-25 Thread Markus Armbruster

Kevin Wolf  writes:

> Commits 70f17a1 ('error: Revert unwanted change of warning messages')
> and e1824e5 ('qemu-iotests: Test 'info block'') had a semantic merge
> conflict, which results in failure for qemu-iotests case 186. Fix the
> reference output to consider the changes of 70f17a1.
>
> Signed-off-by: Kevin Wolf 

Reviewed-by: Markus Armbruster

Re: [Qemu-devel] [qemu-web PATCH] Add a blog post about deprecation of old interfaces and features

2017-07-25 Thread Daniel P. Berrange

On Tue, Jul 25, 2017 at 01:10:38PM +0200, Paolo Bonzini wrote:
> On 25/07/2017 13:07, Thomas Huth wrote:
> > The list of deprecated interfaces/features in the wiki should be pretty
> > complete now, so it is now time to draw some more public attention to our
> > plans of removing certain interfaces/features in future releases.
> > 
> > Signed-off-by: Thomas Huth 
> > ---
> >  _posts/2017-07-25-deprecation.md | 26 ++
> >  1 file changed, 26 insertions(+)
> >  create mode 100644 _posts/2017-07-25-deprecation.md
> > 
> > diff --git a/_posts/2017-07-25-deprecation.md 
> > b/_posts/2017-07-25-deprecation.md
> > new file mode 100644
> > index 000..b5eaf0b
> > --- /dev/null
> > +++ b/_posts/2017-07-25-deprecation.md
> > @@ -0,0 +1,26 @@
> > +---
> > +layout: post
> > +title:  "Deprecation of old parameters and features"
> > +date:   2017-07-25 9:00:00 +0200
> > +author: Thomas Huth
> > +categories: [features, 'web site']
> 
> Maybe s/web site/wiki/?
> 
> > +---
> > +QEMU has a lot of interfaces (like command line options or HMP commands) 
> > and
> > +old features (like certain devices) which are considered as deprecated
> > +since other more generic or better interfaces/features have been 
> > established
> > +instead. While the QEMU developers are generally trying to keep each QEMU
> > +release compatible with the previous ones, the old legacy sometimes gets 
> > into
> > +the way when developing new code and/or causes quite some burden of 
> > maintaining
> > +it.
> > +
> > +Thus we are currently considering to get rid of some of the old interfaces
> > +and features in a future release and have started to collect a list of such
> > +old items in our Wiki on a
> > +[page about removing legacy 
> > parts](http://wiki.qemu.org/Features/LegacyRemoval).
> > +If you are running QEMU directly, please have a look at this page to see
> > +whether you are still using one of these old interfaces or features, so you
> > +can adapt your setup to use the new interfaces or features instead. Or if
> > +you rather think that one of the legacy interfaces/features should *not* be
> > +removed from QEMU at all, please speak up on the
> > +[qemu-devel mailing list](http://wiki.qemu.org/Contribute/MailingLists)
> > +to explain why the interface or feature is still required.
> 
> This text looks good.
> 
> However, we should first finalize Daniel's patches and update the wiki
> to match the newly-instated policy.  The blog post might also include
> the text that is added to the manual.

IMHO we shouldn't really point people to the wiki at all. The qemu-doc
content is what any 3rd parties should rely, since that has been formally
reviewed & approved by maintainers.

The remaining content on the wiki page that differs from qemu-doc, is a
braindump of stuff that could potentially be removed, but is not at all
certain that any will be deprecated. IOW the wiki page is fine for QEMU
maintainers to use, but 3rd parties should only use the published docs.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH v5 2/2 (for 2.10)] docs: document deprecated features in appendix

2017-07-25 Thread Daniel P. Berrange

On Tue, Jul 25, 2017 at 11:35:53AM +0200, Thomas Huth wrote:
> On 19.07.2017 12:08, Daniel P. Berrange wrote:
> > The deprecation of features in QEMU is totally adhoc currently,
> > with no way for the user to get a list of what is deprecated
> > in each release. This adds an appendix to the doc that records
> > when each deprecation was made and provides text explaining
> > what to use instead, if anything.
> > 
> > Since there has been no formal policy around removal of deprecated
> > features in the past, any deprecations prior to 2.10.0 are to be
> > treated as if they had been made at the 2.10.0 release. Thus the
> > earliest that existing deprecations will be deleted is the start
> > of the 2.12.0 cycle.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  qemu-doc.texi | 168 
> > ++
> >  1 file changed, 168 insertions(+)
> [...]
> > +@subsection -monitor default=on (since 2.4.0)
> > +
> > +The ``default'' option to the ``-monitor'' argument is
> > +now ignored. When multiple monitors were enabled, it
> > +indicated which monitor would receive log messages
> > +from the various subsystems. This feature is no longer
> > +required as messages are now only sent to the monitor
> > +in response to explicitly monitor commands.
> 
> BTW, seems like it's the "-mon" option, not the "-monitor" option that
> has this "default" parameter?

Yes, you are right.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [qemu-web PATCH] Add a blog post about deprecation of old interfaces and features

2017-07-25 Thread Thomas Huth

On 25.07.2017 13:15, Daniel P. Berrange wrote:
> On Tue, Jul 25, 2017 at 01:10:38PM +0200, Paolo Bonzini wrote:
>> On 25/07/2017 13:07, Thomas Huth wrote:
>>> The list of deprecated interfaces/features in the wiki should be pretty
>>> complete now, so it is now time to draw some more public attention to our
>>> plans of removing certain interfaces/features in future releases.
>>>
>>> Signed-off-by: Thomas Huth 
>>> ---
>>>  _posts/2017-07-25-deprecation.md | 26 ++
>>>  1 file changed, 26 insertions(+)
>>>  create mode 100644 _posts/2017-07-25-deprecation.md
>>>
>>> diff --git a/_posts/2017-07-25-deprecation.md 
>>> b/_posts/2017-07-25-deprecation.md
>>> new file mode 100644
>>> index 000..b5eaf0b
>>> --- /dev/null
>>> +++ b/_posts/2017-07-25-deprecation.md
>>> @@ -0,0 +1,26 @@
>>> +---
>>> +layout: post
>>> +title:  "Deprecation of old parameters and features"
>>> +date:   2017-07-25 9:00:00 +0200
>>> +author: Thomas Huth
>>> +categories: [features, 'web site']
>>
>> Maybe s/web site/wiki/?
>>
>>> +---
>>> +QEMU has a lot of interfaces (like command line options or HMP commands) 
>>> and
>>> +old features (like certain devices) which are considered as deprecated
>>> +since other more generic or better interfaces/features have been 
>>> established
>>> +instead. While the QEMU developers are generally trying to keep each QEMU
>>> +release compatible with the previous ones, the old legacy sometimes gets 
>>> into
>>> +the way when developing new code and/or causes quite some burden of 
>>> maintaining
>>> +it.
>>> +
>>> +Thus we are currently considering to get rid of some of the old interfaces
>>> +and features in a future release and have started to collect a list of such
>>> +old items in our Wiki on a
>>> +[page about removing legacy 
>>> parts](http://wiki.qemu.org/Features/LegacyRemoval).
>>> +If you are running QEMU directly, please have a look at this page to see
>>> +whether you are still using one of these old interfaces or features, so you
>>> +can adapt your setup to use the new interfaces or features instead. Or if
>>> +you rather think that one of the legacy interfaces/features should *not* be
>>> +removed from QEMU at all, please speak up on the
>>> +[qemu-devel mailing list](http://wiki.qemu.org/Contribute/MailingLists)
>>> +to explain why the interface or feature is still required.
>>
>> This text looks good.
>>
>> However, we should first finalize Daniel's patches and update the wiki
>> to match the newly-instated policy.  The blog post might also include
>> the text that is added to the manual.
> 
> IMHO we shouldn't really point people to the wiki at all. The qemu-doc
> content is what any 3rd parties should rely, since that has been formally
> reviewed & approved by maintainers.
> 
> The remaining content on the wiki page that differs from qemu-doc, is a
> braindump of stuff that could potentially be removed, but is not at all
> certain that any will be deprecated. IOW the wiki page is fine for QEMU
> maintainers to use, but 3rd parties should only use the published docs.

Ok, your points about the wiki are certainly true ... but so far we
still haven't decided on a final wording for qemu-doc yet ... or do we
have an agreement on the machine type deprecation already? Maybe it's
really best if you omit the part about machine type deprecation there
for now, and discuss that again in the 2.11 time frame, so that we at
least get the other parts still into the qemu-doc for 2.10 ?

 Thomas

Re: [Qemu-devel] [PATCH v12 6/8] mm: support reporting free page blocks

2017-07-25 Thread Michal Hocko

On Tue 25-07-17 17:32:00, Wei Wang wrote:
> On 07/24/2017 05:00 PM, Michal Hocko wrote:
> >On Wed 19-07-17 20:01:18, Wei Wang wrote:
> >>On 07/19/2017 04:13 PM, Michal Hocko wrote:
> >[...
> >>>All you should need is the check for the page reference count, no?  I
> >>>assume you do some sort of pfn walk and so you should be able to get an
> >>>access to the struct page.
> >>Not necessarily - the guest struct page is not seen by the hypervisor. The
> >>hypervisor only gets those guest pfns which are hinted as unused. From the
> >>hypervisor (host) point of view, a guest physical address corresponds to a
> >>virtual address of a host process. So, once the hypervisor knows a guest
> >>physical page is unsued, it knows that the corresponding virtual memory of
> >>the process doesn't need to be transferred in the 1st round.
> >I am sorry, but I do not understand. Why cannot _guest_ simply check the
> >struct page ref count and send them to the hypervisor?
> 
> Were you suggesting the following?
> 1) get a free page block from the page list using the API;

No. Use a pfn walk, check the reference count and skip those pages which
have 0 ref count. I suspected that you need to do some sort of the pfn
walk anyway because you somehow have to evaluate a memory to migrate,
right?

> 2) if page->ref_count == 0, send it to the hypervisor

yes

> Btw, ref_count may also change at any time.
> 
> >Is there any
> >documentation which describes the workflow or code which would use your
> >new API?
> >
> 
> It's used in the balloon driver (patch 8). We don't have any docs yet, but
> I think the high level workflow is the two steps above.

I will have a look.
-- 
Michal Hocko
SUSE Labs

Re: [Qemu-devel] Can I mount encrypt qcow2?

2017-07-25 Thread Eric Blake

On 07/24/2017 11:26 PM, lampahome wrote:
> I thought 2.9.0 is the latest and check to the wrong commit.

You're still top-posting, even after being told not to:
http://www.caliburn.nl/topposting.html

> Now it supports encryption.
> 
> My cmd is:
> 
>> qemu-nbd --object secret,id=sec0,file=passwd.txt,format=raw \
>>  --image-opts
>> driver=qcow2,file.filename=demo.qcow2,encrypt.format=luks,encrypt.key-secret=sec0

This says to open an existing already-encrypted image...

> 
> But it shows error message:
> 
>> No encryption in image header, but options specified format 'luks'

...and this says your image was not already encrypted.

> 
> Something wrong?
> 
> My procedure is below:
> 1. create a clean demo.qcow2 image(no compression, no encryption)
> 2. use the cmd above to encrypt the demo.qcow2 image

Indeed, if you want an encrypted image, you must create it encrypted up
front (you can't do an in-place encryption after the fact).  qemu-img
convert is probably the easiest way to copy a non-encrypted image to a
newly-created encrypted image.

> 
> 2017-07-24 20:25 GMT+08:00 Eric Blake :
> 
>> On 07/23/2017 08:49 PM, 陳培泓 wrote:
>>> I check to the newest version of qemu.
>>
>> You're still top-posting, which makes it really hard to answer your
>> questions.

Here's where I previously asked you to avoid top-posting.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v6 (for 2.10) 0/1] Document deprecation policy & features

2017-07-25 Thread Daniel P. Berrange

This is a followup to

  v1: https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg02390.html
  v2: https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg01286.html
  v3: https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg00651.html
  v4: https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg04239.html

I would really strongly like to see this documented in time for the
2.10 release, so that we can start the clock ticking on our deprecation
policy and thus actually delete some stuff in the not too distant
future.

The goal is to clarify to users & app developers what they can expect
from QEMU in terms of feature life & any deprecation policy should
it be neccessary to remove features.

The list of features marked as deprecated was determined by looking at
the QEMU source for the word "deprecated'. It was then compared with
the doc Thomas put up at http://wiki.qemu.org/Features/LegacyRemoval

Key differences with the wiki page that Thomas wrote up vs patch 2
in this series

 - Deprecated features are given a fixed lifespan of 2 releases,
   rather than listing deletion at a future "major" v3.0.0 release.
   This ensures that applications like libvirt have a predictable
   fixed amount of time to react to deprecations.

 - Only lists features which are currently officially deprecated,
   no list of possible future candidates. The wiki page is probably
   a good place to maintain a list of future possible deprecations.
   To turn them into actual deprecations, a patch to the QEMU doc
   can then be posted & reviewed in the normal manner.

 - Not listing the '-6' and '-e' args to qemu-img create. Those
   were never deprecations, because the functionality was
   immediately turned into a fatal error. Patches to delete these
   have been merged now

Changed in v6:

 - Remove all discussion of machine types lifecycle since it
   looks like they will be better kept upstream for as long 
   as any downstream wants them (Paolo)

 - Get rid of separate "Support lifecycle" appendix and fold
   its content into the "Deprecated features" appendix (Daniel)

 - Fix s/-monitor/-mon/  (Thomas)

Changed in v5:

 - Removed misleading reference to "major" release (Eduardo)

Changed in v4:

 - Misc typos / wording clarification (Thomas)

Changed in v3:

 - Rename appendix to "Deprecated features" (Markus)
 - List all currently deprecated features
 - Document that deprecated features will be removed after
   2 releases of being deprecated
 - Clarify that clock for removing historically deprecated
   features starts with the forthcoming release.

Changed in v2:

 - Split into 2 patches so we can consider each suggested addition
   independantly.


Daniel P. Berrange (1):
  docs: document deprecation policy & deprecated features in appendix

 qemu-doc.texi | 175 ++
 1 file changed, 175 insertions(+)

-- 
2.13.3

Re: [Qemu-devel] [qemu-web PATCH] Add a blog post about deprecation of old interfaces and features

2017-07-25 Thread Daniel P. Berrange

On Tue, Jul 25, 2017 at 01:23:24PM +0200, Thomas Huth wrote:
> On 25.07.2017 13:15, Daniel P. Berrange wrote:
> > On Tue, Jul 25, 2017 at 01:10:38PM +0200, Paolo Bonzini wrote:
> >> On 25/07/2017 13:07, Thomas Huth wrote:
> >>> The list of deprecated interfaces/features in the wiki should be pretty
> >>> complete now, so it is now time to draw some more public attention to our
> >>> plans of removing certain interfaces/features in future releases.
> >>>
> >>> Signed-off-by: Thomas Huth 
> >>> ---
> >>>  _posts/2017-07-25-deprecation.md | 26 ++
> >>>  1 file changed, 26 insertions(+)
> >>>  create mode 100644 _posts/2017-07-25-deprecation.md
> >>>
> >>> diff --git a/_posts/2017-07-25-deprecation.md 
> >>> b/_posts/2017-07-25-deprecation.md
> >>> new file mode 100644
> >>> index 000..b5eaf0b
> >>> --- /dev/null
> >>> +++ b/_posts/2017-07-25-deprecation.md
> >>> @@ -0,0 +1,26 @@
> >>> +---
> >>> +layout: post
> >>> +title:  "Deprecation of old parameters and features"
> >>> +date:   2017-07-25 9:00:00 +0200
> >>> +author: Thomas Huth
> >>> +categories: [features, 'web site']
> >>
> >> Maybe s/web site/wiki/?
> >>
> >>> +---
> >>> +QEMU has a lot of interfaces (like command line options or HMP commands) 
> >>> and
> >>> +old features (like certain devices) which are considered as deprecated
> >>> +since other more generic or better interfaces/features have been 
> >>> established
> >>> +instead. While the QEMU developers are generally trying to keep each QEMU
> >>> +release compatible with the previous ones, the old legacy sometimes gets 
> >>> into
> >>> +the way when developing new code and/or causes quite some burden of 
> >>> maintaining
> >>> +it.
> >>> +
> >>> +Thus we are currently considering to get rid of some of the old 
> >>> interfaces
> >>> +and features in a future release and have started to collect a list of 
> >>> such
> >>> +old items in our Wiki on a
> >>> +[page about removing legacy 
> >>> parts](http://wiki.qemu.org/Features/LegacyRemoval).
> >>> +If you are running QEMU directly, please have a look at this page to see
> >>> +whether you are still using one of these old interfaces or features, so 
> >>> you
> >>> +can adapt your setup to use the new interfaces or features instead. Or if
> >>> +you rather think that one of the legacy interfaces/features should *not* 
> >>> be
> >>> +removed from QEMU at all, please speak up on the
> >>> +[qemu-devel mailing list](http://wiki.qemu.org/Contribute/MailingLists)
> >>> +to explain why the interface or feature is still required.
> >>
> >> This text looks good.
> >>
> >> However, we should first finalize Daniel's patches and update the wiki
> >> to match the newly-instated policy.  The blog post might also include
> >> the text that is added to the manual.
> > 
> > IMHO we shouldn't really point people to the wiki at all. The qemu-doc
> > content is what any 3rd parties should rely, since that has been formally
> > reviewed & approved by maintainers.
> > 
> > The remaining content on the wiki page that differs from qemu-doc, is a
> > braindump of stuff that could potentially be removed, but is not at all
> > certain that any will be deprecated. IOW the wiki page is fine for QEMU
> > maintainers to use, but 3rd parties should only use the published docs.
> 
> Ok, your points about the wiki are certainly true ... but so far we
> still haven't decided on a final wording for qemu-doc yet ... or do we
> have an agreement on the machine type deprecation already? Maybe it's
> really best if you omit the part about machine type deprecation there
> for now, and discuss that again in the 2.11 time frame, so that we at
> least get the other parts still into the qemu-doc for 2.10 ?

Yes, I have just dropped the machine type content and sent a v6.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH 11/13] instrument: [qapi] Add library loader

2017-07-25 Thread Eric Blake

On 07/25/2017 03:24 AM, Lluís Vilanova wrote:
> Eric Blake writes:
> 
>> On 07/24/2017 12:46 PM, Lluís Vilanova wrote:
>>> Signed-off-by: Lluís Vilanova 
>>> ---
>>> instrument/Makefile.objs |1 +
>>> instrument/qmp.c |   71 
>>> qapi-schema.json |3 ++
>>> qapi/instrument.json |   92 
>>> ++
>>> 4 files changed, 167 insertions(+)
>>> create mode 100644 instrument/qmp.c
>>> create mode 100644 qapi/instrument.json
> 
>> Adding new files; but I don't see a patch to MAINTAINERS to cover
>> instrument/*.
> 
> Who should I put as a maintainer? Or does this go to the general 
> maintainer(s)?

You can be the maintainer if you'd like; or see if Stefan is okay
including it as part of the trace files, since it trace-related.


>>> +##
>>> +{ 'command': 'instr-load',
>>> +  'data':{ 'path': 'str', '*args': ['String'] },
> 
>> Why are you double-nesting things?  It's a lot nicer to use ['str']

> Aha, you mean the definition should be this instead?
> 
>   { 'command': 'instr-load',
> 'data':{ 'path': 'str', '*args': ['str'] },
> 'returns': 'InstrLoadResult' }

Yes.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [Bug 1706296] [NEW] Booting NT 4 disk causes /home/rjones/d/qemu/cpus.c:1580:qemu_mutex_lock_iothread: assertion failed: (!qemu_mutex_iothread_locked())

2017-07-25 Thread Thomas Huth

On 25.07.2017 11:30, Richard Jones wrote:
> ERROR:/home/rjones/d/qemu/cpus.c:1580:qemu_mutex_lock_iothread: assertion 
> failed: (!qemu_mutex_iothread_locked())
> Aborted (core dumped)
> 
> The stack trace in the failing thread is:
> 
> Thread 4 (Thread 0x7fffb0418700 (LWP 21979)):
> #0  0x7fffdd89b64b in raise () at /lib64/libc.so.6
> #1  0x7fffdd89d450 in abort () at /lib64/libc.so.6
> #2  0x7fffdff8c75d in g_assertion_message () at /lib64/libglib-2.0.so.0
> #3  0x7fffdff8c7ea in g_assertion_message_expr ()
> at /lib64/libglib-2.0.so.0
> #4  0x557a7d00 in qemu_mutex_lock_iothread ()
> at /home/rjones/d/qemu/cpus.c:1580
> #5  0x557cb429 in io_writex (env=env@entry=0x56751400, 
> iotlbentry=0x5675b678, 
> iotlbentry@entry=0x5ae40c918, val=val@entry=8, 
> addr=addr@entry=2148532220, retaddr=0, retaddr@entry=93825011136120, 
> size=size@entry=4)
> at /home/rjones/d/qemu/accel/tcg/cputlb.c:795
> #6  0x557ce0f7 in io_writel (retaddr=93825011136120, addr=2148532220, 
> val=8, index=255, mmu_idx=21845, env=0x56751400)
> at /home/rjones/d/qemu/softmmu_template.h:265
> #7  0x557ce0f7 in helper_le_stl_mmu (env=env@entry=0x56751400, 
> addr=addr@entry=2148532220, val=val@entry=8, oi=, 
> retaddr=93825011136120, retaddr@entry=0) at 
> /home/rjones/d/qemu/softmmu_template.h:300
> #8  0x5587c0a4 in cpu_stl_kernel_ra (env=0x56751400, 
> ptr=2148532220, v=8, retaddr=0) at 
> /home/rjones/d/qemu/include/exec/cpu_ldst_template.h:182
> #9  0x55882610 in do_interrupt_protected (is_hw=, 
> next_eip=, error_code=2, is_int=, 
> intno=, env=0x56751400) at 
> /home/rjones/d/qemu/target/i386/seg_helper.c:758
> #10 0x55882610 in do_interrupt_all (cpu=cpu@entry=0x56749170, 
> intno=, is_int=, error_code=2, 
> next_eip=, is_hw=is_hw@entry=0) at 
> /home/rjones/d/qemu/target/i386/seg_helper.c:1252
> #11 0x558839d3 in x86_cpu_do_interrupt (cs=0x56749170)
> at /home/rjones/d/qemu/target/i386/seg_helper.c:1298
> #12 0x557d2ccb in cpu_handle_exception (ret=, 
> cpu=0x566a4590) at /home/rjones/d/qemu/accel/tcg/cpu-exec.c:465
> #13 0x557d2ccb in cpu_exec (cpu=cpu@entry=0x56749170)
> at /home/rjones/d/qemu/accel/tcg/cpu-exec.c:670
> #14 0x557a855a in tcg_cpu_exec (cpu=0x56749170)
> at /home/rjones/d/qemu/cpus.c:1270
> #15 0x557a855a in qemu_tcg_rr_cpu_thread_fn (arg=)
> at /home/rjones/d/qemu/cpus.c:1365
> #16 0x7fffddc3d36d in start_thread () at /lib64/libpthread.so.0
> #17 0x7fffdd975b9f in clone () at /lib64/libc.so.6

Looks like the iothread lock is taken twice here, one time in
accel/tcg/cpu-exec.c around line 465 and one time in
accel/tcg/cputlb.c:795 again.

If I've get that right, the locks have been added by this commit here:

 8d04fb55dec381bc5105cb47f29d918e579e8cbd
 tcg: drop global lock during TCG code execution

so this looks related to the MTTCG reworks that happened recently. I
hope one of the MTTCG gurus has some spare time to look at this...

 Thomas

[Qemu-devel] [PATCH v6 (for 2.10) 1/1] docs: document deprecation policy & deprecated features in appendix

2017-07-25 Thread Daniel P. Berrange

The deprecation of features in QEMU is totally adhoc currently,
with no way for the user to get a list of what is deprecated
in each release. This adds an appendix to the doc that records
when each deprecation was made and provides text explaining
what to use instead, if anything.

Since there has been no formal policy around removal of deprecated
features in the past, any deprecations prior to 2.10.0 are to be
treated as if they had been made at the 2.10.0 release. Thus the
earliest that existing deprecations will be deleted is the start
of the 2.12.0 cycle.

Signed-off-by: Daniel P. Berrange 
---
 qemu-doc.texi | 175 ++
 1 file changed, 175 insertions(+)

diff --git a/qemu-doc.texi b/qemu-doc.texi
index 48af5155c7..aeb7bc52f5 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -38,6 +38,7 @@
 * QEMU Guest Agent::
 * QEMU User space emulator::
 * Implementation notes::
+* Deprecated features::
 * License::
 * Index::
 @end menu
@@ -3128,6 +3129,180 @@ Run the emulation in single step mode.
 
 @include qemu-tech.texi
 
+@node Deprecated features
+@appendix Deprecated features
+
+In general features are intended to be supported indefinitely once
+introduced into QEMU. In the event that a feature needs to be removed,
+it will be listed in this appendix. The feature will remain functional
+for 2 releases prior to actual removal. Deprecated features may also
+generate warnings on the console when QEMU starts up, or if activated
+via a monitor command, however, this is not a mandatory requirement.
+
+Prior to the 2.10.0 release there was no official policy on how
+long features would be deprecated prior to their removal, nor
+any documented list of which features were deprecated. Thus
+any features deprecated prior to 2.10.0 will be treated as if
+they were first deprecated in the 2.10.0 release.
+
+What follows is a list of all features currently marked as
+deprecated.
+
+@section System emulator command line arguments
+
+@subsection -drive boot=on|off (since 1.3.0)
+
+The ``boot=on|off'' option to the ``-drive'' argument is
+ignored. Applications should use the ``bootindex=N'' parameter
+to set an absolute ordering between devices instead.
+
+@subsection -tdf (since 1.3.0)
+
+The ``-tdf'' argument is ignored. The behaviour implemented
+by this argument is now the default when using the KVM PIT,
+but can be requested explicitly using
+``-global kvm-pit.lost_tick_policy=slew''.
+
+@subsection -no-kvm-pit-reinjection (since 1.3.0)
+
+The ``-no-kvm-pit-reinjection'' argument is now a
+synonym for setting ``-global kvm-pit.lost_tick_policy=discard''.
+
+@subsection -no-kvm-irqchip (since 1.3.0)
+
+The ``-no-kvm-irqchip'' argument is now a synonym for
+setting ``-machine kernel_irqchip=off''.
+
+@subsection -no-kvm-pit (since 1.3.0)
+
+The ``-no-kvm-pit'' argument is ignored. It is no longer
+possible to disable the KVM PIT directly.
+
+@subsection -no-kvm (since 1.3.0)
+
+The ``-no-kvm'' argument is now a synonym for setting
+``-machine accel=tcg''.
+
+@subsection -mon default=on (since 2.4.0)
+
+The ``default'' option to the ``-mon'' argument is
+now ignored. When multiple monitors were enabled, it
+indicated which monitor would receive log messages
+from the various subsystems. This feature is no longer
+required as messages are now only sent to the monitor
+in response to explicitly monitor commands.
+
+@subsection -vnc tls (since 2.5.0)
+
+The ``-vnc tls'' argument is now a synonym for setting
+``-object tls-creds-anon,id=tls0'' combined with
+``-vnc tls-creds=tls0'
+
+@subsection -vnc x509 (since 2.5.0)
+
+The ``-vnc x509=/path/to/certs'' argument is now a
+synonym for setting
+``-object tls-creds-x509,dir=/path/to/certs,id=tls0,verify-peer=no''
+combined with ``-vnc tls-creds=tls0'
+
+@subsection -vnc x509verify (since 2.5.0)
+
+The ``-vnc x509verify=/path/to/certs'' argument is now a
+synonym for setting
+``-object tls-creds-x509,dir=/path/to/certs,id=tls0,verify-peer=yes''
+combined with ``-vnc tls-creds=tls0'
+
+@subsection -tftp (since 2.6.0)
+
+The ``-tftp /some/dir'' argument is now a synonym for setting
+the ``-netdev user,tftp=/some/dir' argument. The new syntax
+allows different settings to be provided per NIC.
+
+@subsection -bootp (since 2.6.0)
+
+The ``-bootp /some/file'' argument is now a synonym for setting
+the ``-netdev user,bootp=/some/file' argument. The new syntax
+allows different settings to be provided per NIC.
+
+@subsection -redir (since 2.6.0)
+
+The ``-redir ARGS'' argument is now a synonym for setting
+the ``-netdev user,hostfwd=ARGS'' argument instead. The new
+syntax allows different settings to be provided per NIC.
+
+@subsection -smb (since 2.6.0)
+
+The ``-smb /some/dir'' argument is now a synonym for setting
+the ``-netdev user,smb=/some/dir'' argument instead. The new
+syntax allows different settings to be provided per NIC.
+
+@subsection -net channel (since 2.6.0)
+
+The ``--net channel,ARGS'' argument is now a synonym

Re: [Qemu-devel] [RFC PATCH v2 5/6] hw/pci: add bus_reserve property to pcie-root-port

2017-07-25 Thread Marcel Apfelbaum


On 25/07/2017 0:58, Michael S. Tsirkin wrote:

On Tue, Jul 25, 2017 at 12:41:12AM +0300, Alexander Bezzubikov wrote:

2017-07-24 23:46 GMT+03:00 Michael S. Tsirkin :

On Sun, Jul 23, 2017 at 05:13:11PM +0300, Marcel Apfelbaum wrote:

On 23/07/2017 15:22, Michael S. Tsirkin wrote:

On Sun, Jul 23, 2017 at 01:15:42AM +0300, Aleksandr Bezzubikov wrote:

To enable hotplugging of a newly created pcie-pci-bridge,
we need to tell firmware (SeaBIOS in this case)




Hi Michael,


Presumably, EFI would need to support this too?



Sure, Eduardo added to CC, but he is in PTO now.


to reserve
additional buses for pcie-root-port, that allows us to
hotplug pcie-pci-bridge into this root port.
The number of buses to reserve is provided to the device via a corresponding
property, and to the firmware via new PCI capability (next patch).
The property's default value is 1 as we want to hotplug at least 1 bridge.


If so you should just teach firmware to allocate one bus #
unconditionally.



That would be a problem for the PCIe machines, since each PCIe
devices is plugged in a different bus and we are already
limited to 256 PCIe devices. Allocating an extra-bus always
would really limit the PCIe devices we can use.


But this is exactly what this patch does as the property is added to all
buses and default to 1 (1 extra bus).


But why would that be so? What's wrong with a device
directly in the root port?



First, plugging a legacy PCI device into a PCIe Root Port
looks strange at least, and it can;t be done on real HW anyway.
(incompatible slots)


You can still plug in PCIe devices there.


Second (and more important), if we want 2 or more PCI
devices we would loose both IO ports space and bus numbers.


What I am saying is maybe default should not be 1.




Hi Michael, Alexander


The only sensible variant left is 0.
But as we want pcie-pci-bridge to be used for every legacy PCI device
on q35 machine, every time one hotplugs the bridge into the root port,
he must be sure rp's prop value >0 (for Linux). I'm not sure
that it is a very convenient way to utilize the bridge - always remember
to set property.




Is not for Linux only, is for all guest OSes.
I also think setting the property is OK, libvirt can always
add a single PCIe Root Port port with this property set,
while upper layers can create flavors (if the feature needed
or not for the current setup)


That's what I'm saying then - if in your opinion default is >0 anyway,
tweak firmware to do it by default.



Default should be 0 for sure - because of the hard limitation
on the number of PCIe devices for single PCI domain (the same
as the number of buses, 256).

For a positive value we will should add a property "buses-reserve = x".


Another way - we can set this to 0 by default, and to 1 for pcie-root-port,
and recommend to use it for hotplugging of the pcie-pci-bridge itself.


I wonder about something: imagine hotplugging a hierarchy of bridges
below a root port. It seems that nothing prevents guest from finding a
free range of buses to cover this hierarchy and setting that as
secondary/subordinate bus for this bridge.
 > This does need support on QEMU side to hotplug a hierarchy at once,
and might need some fixes in Linux, on the plus side you can defer
management decision on how many are needed until you are actually
adding something, and you don't need vendor specific patches.



We can teach Linux kernel, that's for sure (OK, almost sure...)
but what we don't want is to be dependent on specific guest Operating
Systems. For example, most configurations are not supported
by Windows guests.

Also is a great opportunity adding PCI IO resources hints
to guest FW, something we wanted to do for some time.

Thanks,
Marcel









Signed-off-by: Aleksandr Bezzubikov 
---
   hw/pci-bridge/pcie_root_port.c | 1 +
   include/hw/pci/pcie_port.h | 3 +++
   2 files changed, 4 insertions(+)

diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index 4d588cb..b0e49e1 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -137,6 +137,7 @@ static void rp_exit(PCIDevice *d)
   static Property rp_props[] = {
   DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present,
   QEMU_PCIE_SLTCAP_PCP_BITNR, true),
+DEFINE_PROP_UINT8("bus_reserve", PCIEPort, bus_reserve, 1),
   DEFINE_PROP_END_OF_LIST()
   };
diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h
index 1333266..1b2dd1f 100644
--- a/include/hw/pci/pcie_port.h
+++ b/include/hw/pci/pcie_port.h
@@ -34,6 +34,9 @@ struct PCIEPort {
   /* pci express switch port */
   uint8_t port;
+
+/* additional buses to reserve on firmware init */
+uint8_t bus_reserve;
   };
   void pcie_port_init_reg(PCIDevice *d);


So here is a property and it does not do anything.
It makes it easier to work on series maybe, but review
is harder since we do not see what it does at all.
Please do not split up patches like this

Re: [Qemu-devel] [PATCH 11/13] instrument: [qapi] Add library loader

2017-07-25 Thread Lluís Vilanova

Eric Blake writes:

> On 07/25/2017 03:24 AM, Lluís Vilanova wrote:
>> Eric Blake writes:
>> 
>>> On 07/24/2017 12:46 PM, Lluís Vilanova wrote:
 Signed-off-by: Lluís Vilanova 
 ---
 instrument/Makefile.objs |1 +
 instrument/qmp.c |   71 
 qapi-schema.json |3 ++
 qapi/instrument.json |   92 
 ++
 4 files changed, 167 insertions(+)
 create mode 100644 instrument/qmp.c
 create mode 100644 qapi/instrument.json
>> 
>>> Adding new files; but I don't see a patch to MAINTAINERS to cover
>>> instrument/*.
>> 
>> Who should I put as a maintainer? Or does this go to the general 
>> maintainer(s)?

> You can be the maintainer if you'd like; or see if Stefan is okay
> including it as part of the trace files, since it trace-related.

I can certainly do that. I was just wondering if the project prefers to have
someone on the "core team" as a maintainer. Stefan?


Thanks,
  Lluis

Re: [Qemu-devel] [RFC PATCH v2 6/6] hw/pci: add hint capabilty for additional bus reservation to pcie-root-port

2017-07-25 Thread Marcel Apfelbaum


On 25/07/2017 0:43, Alexander Bezzubikov wrote:

2017-07-24 23:43 GMT+03:00 Michael S. Tsirkin :

On Sun, Jul 23, 2017 at 01:15:43AM +0300, Aleksandr Bezzubikov wrote:

Signed-off-by: Aleksandr Bezzubikov 
---
  hw/pci-bridge/pcie_root_port.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index b0e49e1..ca92d85 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -106,6 +106,11 @@ static void rp_realize(PCIDevice *d, Error **errp)
  pcie_aer_root_init(d);
  rp_aer_vector_update(d);

+rc = pci_bridge_help_cap_init(d, 0, p->bus_reserve, 0, 0, 0, errp);
+if (rc < 0) {
+goto err;
+}
+
  return;

  err:


It looks like this will add the capability unconditionally to all
pcie root ports. Two issues with it:
1. you can't add vendor properties to devices where vendor is
not qemu as they might have their own concept of what it does.
2. this will break compatibility with old machine types,
need to disable for these



Actually the original idea was to add it for pcie-root-port excusively
(for now at least), looks like I've confused a little with files naming.


Right, for the Generic PCIe Root Port and not for all the root ports.
In the future we may want to add it to the PCI-brigde so we can have
nested bridges, but we are not there yet.


Will add it for v3.



Thanks,
Marcel


--
2.7.4

Re: [Qemu-devel] [PATCH v12 6/8] mm: support reporting free page blocks

2017-07-25 Thread Wei Wang


On 07/25/2017 07:25 PM, Michal Hocko wrote:

On Tue 25-07-17 17:32:00, Wei Wang wrote:

On 07/24/2017 05:00 PM, Michal Hocko wrote:

On Wed 19-07-17 20:01:18, Wei Wang wrote:

On 07/19/2017 04:13 PM, Michal Hocko wrote:

[...

All you should need is the check for the page reference count, no?  I
assume you do some sort of pfn walk and so you should be able to get an
access to the struct page.

Not necessarily - the guest struct page is not seen by the hypervisor. The
hypervisor only gets those guest pfns which are hinted as unused. From the
hypervisor (host) point of view, a guest physical address corresponds to a
virtual address of a host process. So, once the hypervisor knows a guest
physical page is unsued, it knows that the corresponding virtual memory of
the process doesn't need to be transferred in the 1st round.

I am sorry, but I do not understand. Why cannot _guest_ simply check the
struct page ref count and send them to the hypervisor?

Were you suggesting the following?
1) get a free page block from the page list using the API;

No. Use a pfn walk, check the reference count and skip those pages which
have 0 ref count.



"pfn walk" - do you mean start from the first pfn, and scan all the pfns 
that the VM has?




I suspected that you need to do some sort of the pfn
walk anyway because you somehow have to evaluate a memory to migrate,
right?



We don't need to do the pfn walk in the guest kernel. When the API 
reports, for example,
a 2MB free page block, the API caller offers to the hypervisor the base 
address of the page

block, and size=2MB, to the hypervisor.

The hypervisor maintains a bitmap of all the guest physical memory (a 
bit corresponds to
a guest pfn). When migrating memory, only the pfns that are set in the 
bitmap are transferred
to the destination machine. So, when the hypervisor receives a 2MB free 
page block, the

corresponding bits in the bitmap are cleared.

Best,
Wei

Re: [Qemu-devel] [PATCH v6 (for 2.10) 1/1] docs: document deprecation policy & deprecated features in appendix

2017-07-25 Thread Thomas Huth

On 25.07.2017 13:36, Daniel P. Berrange wrote:
> The deprecation of features in QEMU is totally adhoc currently,

s/adhoc/ad hoc/

> with no way for the user to get a list of what is deprecated
> in each release. This adds an appendix to the doc that records
> when each deprecation was made and provides text explaining
> what to use instead, if anything.
> 
> Since there has been no formal policy around removal of deprecated
> features in the past, any deprecations prior to 2.10.0 are to be
> treated as if they had been made at the 2.10.0 release. Thus the
> earliest that existing deprecations will be deleted is the start
> of the 2.12.0 cycle.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  qemu-doc.texi | 175 
> ++
>  1 file changed, 175 insertions(+)
[...]
> +@subsection -net vlan (since 2.9.0)
> +
> +The ``-net van=NN'' argument is partially replaced with the

s/van/vlan/

Apart from the nits, looks fine to me now:

Reviewed-by: Thomas Huth

Re: [Qemu-devel] [PATCH for 2.10] ps2: fix sending of PAUSE/BREAK scancodes

2017-07-25 Thread Gerd Hoffmann

  Hi,

> > You're putting some specific code for spice in ps2 emulation.
> > IMO, the workaround should be moved to spice keyboard handling
> > (ui/spice-input.c),
> > which needs to generate a qcode instead of a scancode.
> 
> This isn't really a spice specific hack. QEMU internal code is *not*
> required
> to use qcodes 

qcodes are prefered in new code though.

> - the KeyValue struct is a union that allows use of either qcodes
> or XT scancodes, and the latter is what all the frontends (SPICE,
> VNC, GTk, SDL)
> use. QCodes are really only input by the monitor (the sendkey
> command).

Well, PAUSE is actually sent as qcode by sdl and gtk.  This avoids
special cases in the input layer (PAUSE is the only three scancodes key
sequence).  IMO spice should do the same.  I want switch UIs to qcodes
anyway.

cheers,
  Gerd

Re: [Qemu-devel] [PULL for-2.10 00/14] A set of s390x patches

2017-07-25 Thread Peter Maydell

On 25 July 2017 at 10:40, Cornelia Huck  wrote:
> The following changes since commit b5a74cd81d76cb467552f38f2b39520d07c65ea2:
>
>   Merge remote-tracking branch 
> 'remotes/pmaydell/tags/pull-target-arm-20170724' into staging (2017-07-24 
> 18:15:45 +0100)
>
> are available in the git repository at:
>
>   git://github.com/cohuck/qemu tags/s390x-20170725
>
> for you to fetch changes up to 7e01376daea75e888c370aab521a7d4aeaf2ffd1:
>
>   s390x/css: fix ilen in IO instruction handlers (2017-07-25 09:17:42 +0200)
>
> 
> Various changes for the s390x code:
> - updates for cpu model handling
> - fix compilation with --disable-tcg
> - fixes in vfio-ccw and I/O instruction handling
>
> 

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH v1 0/6] target/s390x: tcg improvments + MSA functions

2017-07-25 Thread Cornelia Huck

On Fri, 21 Jul 2017 14:56:03 +0200
David Hildenbrand  wrote:

> Based on series:
> "[PATCH v2 0/5] target/s390x: cpu model cleanups + improvements"

I'm trying to decide what to do with this series; probably nothing for
2.10.

> 
> 1. smaller pgm irq instruction length fixes

These, I would have considered. But it seems Richard had an alternative
idea which is 2.11 material. So I'll probably just ignore these for now.

> 2. implement IPM, which is often used in context of MSA instructions
> 3. implement all basic MSA (cpacf/crypto) instructions <= z13. Only provide
>the query subfunction (to query available subfunctions), no actual
>de/encryption yet

This is certainly 2.11 material.

> 4. add a couple of alignment checks, that e.g. make some kvm-unit-tests
>pass now.

That one tells me that I really should try to get an understanding of
tcg...

> 
> I have written kvm-unit-tests for MSA functions and for SPM/IPM. Will
> send them out soon. I use the following cpu model to test with an upstream
> kernel compiled for z10:
> 
> ... -cpu qemu,mvcos=on,stfle=on,ldisp=on,ldisphp=on,\
>   eimm=on,stckf=on,csst=on,csst2=on,ginste=on,\
>   exrl=on,msa-base=on,msa3-base=on,msa4-base=on,msa5-base=on ...
> 
> David Hildenbrand (6):
>   target/s390x: fix pgm irq ilen for stsi
>   target/s390x: fix pgm irq ilen in translate_pages()
>   target/s390x: implement spm (SET PROGRAM MASK)
>   target/s390x: move wrap_address to cpu.h
>   target/s390x: add basic MSA features
>   target/s390x: various alignment check
> 
>  target/s390x/Makefile.objs   |  2 +-
>  target/s390x/cpu.h   | 16 +
>  target/s390x/cpu_models.c|  4 +++
>  target/s390x/crypto_helper.c | 65 
>  target/s390x/helper.h|  1 +
>  target/s390x/insn-data.def   | 29 
>  target/s390x/mem_helper.c| 49 +++
>  target/s390x/misc_helper.c   | 10 --
>  target/s390x/mmu_helper.c|  2 +-
>  target/s390x/translate.c | 79 
> 
>  10 files changed, 232 insertions(+), 25 deletions(-)
>  create mode 100644 target/s390x/crypto_helper.c
>

Re: [Qemu-devel] [RFC v5 1/8] hw/arm/smmu-common: smmu base class

2017-07-25 Thread Tomasz Nowicki


Hi Eric,

I found out what is going on regarding vhost-net outgoing packet's 
payload corruption. My packets were corrupted because of inconsistent 
IOVA to HVA translation in IOTLB. Please see below.


On 09.07.2017 22:51, Eric Auger wrote:

Introduces the base device and class for the ARM smmu.
Implements VMSAv8-64 table lookup and translation. VMSAv8-32
is not implemented.

Signed-off-by: Eric Auger 
Signed-off-by: Prem Mallappa 

---


[...]


+
+/**
+ * smmu_page_walk_level_64 - Walk an IOVA range from a specific level
+ * @baseaddr: table base address corresponding to @level
+ * @level: level
+ * @cfg: translation config
+ * @start: end of the IOVA range
+ * @end: end of the IOVA range
+ * @hook_fn: the hook that to be called for each detected area
+ * @private: private data for the hook function
+ * @read: whether parent level has read permission
+ * @write: whether parent level has write permission
+ * @nofail: indicates whether each iova of the range
+ *  must be translated or whether failure is allowed
+ * @notify_unmap: whether we should notify invalid entries
+ *
+ * Return 0 on success, < 0 on errors not related to translation
+ * process, > 1 on errors related to translation process (only
+ * if nofail is set)
+ */
+static int
+smmu_page_walk_level_64(dma_addr_t baseaddr, int level,
+SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+smmu_page_walk_hook hook_fn, void *private,
+bool read, bool write, bool nofail,
+bool notify_unmap)
+{
+uint64_t subpage_size, subpage_mask, pte, iova = start;
+bool read_cur, write_cur, entry_valid;
+int ret, granule_sz, stage;
+IOMMUTLBEntry entry;
+
+granule_sz = cfg->granule_sz;
+stage = cfg->stage;
+subpage_size = 1ULL << level_shift(level, granule_sz);
+subpage_mask = level_page_mask(level, granule_sz);
+
+trace_smmu_page_walk_level_in(level, baseaddr, granule_sz,
+  start, end, subpage_size);
+
+while (iova < end) {
+dma_addr_t next_table_baseaddr;
+uint64_t iova_next, pte_addr;
+uint32_t offset;
+
+iova_next = (iova & subpage_mask) + subpage_size;
+offset = iova_level_offset(iova, level, granule_sz);
+pte_addr = baseaddr + offset * sizeof(pte);
+pte = get_pte(baseaddr, offset);
+
+trace_smmu_page_walk_level(level, iova, subpage_size,
+   baseaddr, offset, pte);
+
+if (pte == (uint64_t)-1) {
+if (nofail) {
+return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+}
+goto next;
+}
+if (is_invalid_pte(pte) || is_reserved_pte(pte, level)) {
+trace_smmu_page_walk_level_res_invalid_pte(stage, level, baseaddr,
+   pte_addr, offset, pte);
+if (nofail) {
+return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+}
+goto next;
+}



vhost maintains its IOTLB cache and when there is no IOVA->HVA 
translation, it asks QEMU for help. However, IOTLB entries invalidations 
are guest initiative so for any DMA unmap at guest side we trap to SMMU 
emulation code and call:
smmu_notify_all -> smmuv3_replay_single -> smmu_page_walk_64 -> 
smmu_page_walk_level_64 -> smmuv3_replay_hook -> vhost_iommu_unmap_notify


The thing is that smmuv3_replay_hook() is never called because guest 
zeros PTE before we trap to QEMU so that smmu_page_walk_level_64() fails 
on ^^^ is_invalid_pte(pte) check. This way we keep old IOTLB entry in 
vhost and subsequent translations may be broken.



+
+read_cur = read; /* TODO */
+write_cur = write; /* TODO */
+entry_valid = read_cur | write_cur; /* TODO */
+
+if (is_page_pte(pte, level)) {
+uint64_t gpa = get_page_pte_address(pte, granule_sz);
+int perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
+
+trace_smmu_page_walk_level_page_pte(stage, level, entry.iova,
+baseaddr, pte_addr, pte, gpa);
+if (!entry_valid && !notify_unmap) {
+printf("%s entry_valid=%d notify_unmap=%d\n", __func__,
+   entry_valid, notify_unmap);
+goto next;
+}
+ret = call_entry_hook(iova, subpage_mask, gpa, perm,
+  hook_fn, private);
+if (ret) {
+return ret;
+}
+goto next;
+}
+if (is_block_pte(pte, level)) {
+uint64_t block_size;
+hwaddr gpa = get_block_pte_address(pte, level, granule_sz,
+   &block_size);
+int perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
+
+if (gpa == -1) {
+if (nofail) {
+return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+

Re: [Qemu-devel] [RFC PATCH for 2.10 1/3] docker: ensure NOUSER for travis images

2017-07-25 Thread Alex Bennée


Philippe Mathieu-Daudé  writes:

> On 07/20/2017 10:47 AM, Alex Bennée wrote:
>> While adding the current user is a useful default behaviour for
>> creating new images it is not appropriate for Travis which already has
>> a default user.
>>
>> Signed-off-by: Alex Bennée 
>> ---
>>   tests/docker/Makefile.include | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
>> index aaab1a4208..d7dafdbd27 100644
>> --- a/tests/docker/Makefile.include
>> +++ b/tests/docker/Makefile.include
>> @@ -71,6 +71,7 @@ docker-image-debian-ppc64el-cross: docker-image-debian9
>>   docker-image-debian-s390x-cross: docker-image-debian9
>>   docker-image-debian-win32-cross: docker-image-debian8-mxe
>>   docker-image-debian-win64-cross: docker-image-debian8-mxe
>> +docker-image-travis: NOUSER=1
>
> Cool you kept it ordered :)
>
> I'm surprised we need to install the full LaTeX stack to be able to
> compile the device-tree-compiler...

Hmm I think installing build-deps is a little extreme as we are not
re-building the device-tree-compiler but using it.

>
> Reading
> https://docs.travis-ci.com/user/environment-variables#default-environment-variables
> I think it'd be a better match if we also use those default
> environment variables, at least:
>
> DEBIAN_FRONTEND=noninteractive
> LANG=en_US.UTF-8
> LC_ALL=en_US.UTF-8
>
> what do you think?

Sure - it does reduce the noise somewhat.

>
> Reviewed-by: Philippe Mathieu-Daudé 
> Tested-by: Philippe Mathieu-Daudé 
>
>> # Expand all the pre-requistes for each docker image and test
>> combination
>>   $(foreach i,$(DOCKER_IMAGES), \
>>


--
Alex Bennée

Re: [Qemu-devel] [PATCH v1 0/6] target/s390x: tcg improvments + MSA functions

2017-07-25 Thread David Hildenbrand

On 25.07.2017 13:55, Cornelia Huck wrote:
> On Fri, 21 Jul 2017 14:56:03 +0200
> David Hildenbrand  wrote:
> 
>> Based on series:
>> "[PATCH v2 0/5] target/s390x: cpu model cleanups + improvements"
> 
> I'm trying to decide what to do with this series; probably nothing for
> 2.10.
> 
>>
>> 1. smaller pgm irq instruction length fixes
> 
> These, I would have considered. But it seems Richard had an alternative
> idea which is 2.11 material. So I'll probably just ignore these for now.

I'd suggest to simply pick these two up. Simple fixes. Richards series
is a step into the right direction, getting rid of explicit ILEN
specification/ILEN_AUTO completely, avoiding such bugs in the future.

But I'll let Richard + you decide.

> 
>> 2. implement IPM, which is often used in context of MSA instructions
>> 3. implement all basic MSA (cpacf/crypto) instructions <= z13. Only provide
>>the query subfunction (to query available subfunctions), no actual
>>de/encryption yet
> 
> This is certainly 2.11 material.

We could discuss if IPM makes sense. I can resend the updated version.
MSA can certainly wait for 2.11.

Thanks.

-- 

Thanks,

David

Re: [Qemu-devel] [PATCH for 2.10 16/35] usb/dev-mtp: fix use of uninitialized values

2017-07-25 Thread Gerd Hoffmann

 case CMD_GET_OBJECT_INFO:
> -o = usb_mtp_object_lookup(s, c->argv[0]);
> +if (c->argc > 0) {
> +o = usb_mtp_object_lookup(s, c->argv[0]);
> +}

How about zero-initializing c->argv instead?

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v12 6/8] mm: support reporting free page blocks

2017-07-25 Thread Michal Hocko

On Tue 25-07-17 19:56:24, Wei Wang wrote:
> On 07/25/2017 07:25 PM, Michal Hocko wrote:
> >On Tue 25-07-17 17:32:00, Wei Wang wrote:
> >>On 07/24/2017 05:00 PM, Michal Hocko wrote:
> >>>On Wed 19-07-17 20:01:18, Wei Wang wrote:
> On 07/19/2017 04:13 PM, Michal Hocko wrote:
> >>>[...
> >All you should need is the check for the page reference count, no?  I
> >assume you do some sort of pfn walk and so you should be able to get an
> >access to the struct page.
> Not necessarily - the guest struct page is not seen by the hypervisor. The
> hypervisor only gets those guest pfns which are hinted as unused. From the
> hypervisor (host) point of view, a guest physical address corresponds to a
> virtual address of a host process. So, once the hypervisor knows a guest
> physical page is unsued, it knows that the corresponding virtual memory of
> the process doesn't need to be transferred in the 1st round.
> >>>I am sorry, but I do not understand. Why cannot _guest_ simply check the
> >>>struct page ref count and send them to the hypervisor?
> >>Were you suggesting the following?
> >>1) get a free page block from the page list using the API;
> >No. Use a pfn walk, check the reference count and skip those pages which
> >have 0 ref count.
> 
> 
> "pfn walk" - do you mean start from the first pfn, and scan all the pfns
> that the VM has?

yes

> >I suspected that you need to do some sort of the pfn
> >walk anyway because you somehow have to evaluate a memory to migrate,
> >right?
> 
> We don't need to do the pfn walk in the guest kernel. When the API
> reports, for example, a 2MB free page block, the API caller offers to
> the hypervisor the base address of the page block, and size=2MB, to
> the hypervisor.

So you want to skip pfn walks by regularly calling into the page
allocator to update your bitmap. If that is the case then would an API
that would allow you to update your bitmap via a callback be s
sufficient? Something like
void walk_free_mem(int node, int min_order,
void (*visit)(unsigned long pfn, unsigned long 
nr_pages))

The function will call the given callback for each free memory block on
the given node starting from the given min_order. The callback will be
strictly an atomic and very light context. You can update your bitmap
from there.

This would address my main concern that the allocator internals would
get outside of the allocator proper. A nasty callback which would be too
expensive could still stall other allocations and cause latencies but
the locking will be under mm core control at least.

Does that sound useful?
-- 
Michal Hocko
SUSE Labs

Re: [Qemu-devel] [PATCH v1 0/6] target/s390x: tcg improvments + MSA functions

2017-07-25 Thread Cornelia Huck

On Tue, 25 Jul 2017 14:16:54 +0200
David Hildenbrand  wrote:

> On 25.07.2017 13:55, Cornelia Huck wrote:
> > On Fri, 21 Jul 2017 14:56:03 +0200
> > David Hildenbrand  wrote:
> >   
> >> Based on series:
> >> "[PATCH v2 0/5] target/s390x: cpu model cleanups + improvements"  
> > 
> > I'm trying to decide what to do with this series; probably nothing for
> > 2.10.
> >   
> >>
> >> 1. smaller pgm irq instruction length fixes  
> > 
> > These, I would have considered. But it seems Richard had an alternative
> > idea which is 2.11 material. So I'll probably just ignore these for now.  
> 
> I'd suggest to simply pick these two up. Simple fixes. Richards series
> is a step into the right direction, getting rid of explicit ILEN
> specification/ILEN_AUTO completely, avoiding such bugs in the future.

These two are certainly small enough.

> 
> But I'll let Richard + you decide.
> 
> >   
> >> 2. implement IPM, which is often used in context of MSA instructions
> >> 3. implement all basic MSA (cpacf/crypto) instructions <= z13. Only provide
> >>the query subfunction (to query available subfunctions), no actual
> >>de/encryption yet  
> > 
> > This is certainly 2.11 material.  
> 
> We could discuss if IPM makes sense. I can resend the updated version.

As we're suppose to enter (or have entered) hardfreeze now, I'd prefer
to defer to 2.11.

> MSA can certainly wait for 2.11.

Re: [Qemu-devel] [PATCH v6 (for 2.10) 0/1] Document deprecation policy & features

2017-07-25 Thread Stefan Hajnoczi

On Tue, Jul 25, 2017 at 12:26:55PM +0100, Daniel P. Berrange wrote:
> This is a followup to
> 
>   v1: https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg02390.html
>   v2: https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg01286.html
>   v3: https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg00651.html
>   v4: https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg04239.html
> 
> I would really strongly like to see this documented in time for the
> 2.10 release, so that we can start the clock ticking on our deprecation
> policy and thus actually delete some stuff in the not too distant
> future.
> 
> The goal is to clarify to users & app developers what they can expect
> from QEMU in terms of feature life & any deprecation policy should
> it be neccessary to remove features.
> 
> The list of features marked as deprecated was determined by looking at
> the QEMU source for the word "deprecated'. It was then compared with
> the doc Thomas put up at http://wiki.qemu.org/Features/LegacyRemoval
> 
> Key differences with the wiki page that Thomas wrote up vs patch 2
> in this series
> 
>  - Deprecated features are given a fixed lifespan of 2 releases,
>rather than listing deletion at a future "major" v3.0.0 release.
>This ensures that applications like libvirt have a predictable
>fixed amount of time to react to deprecations.
> 
>  - Only lists features which are currently officially deprecated,
>no list of possible future candidates. The wiki page is probably
>a good place to maintain a list of future possible deprecations.
>To turn them into actual deprecations, a patch to the QEMU doc
>can then be posted & reviewed in the normal manner.
> 
>  - Not listing the '-6' and '-e' args to qemu-img create. Those
>were never deprecations, because the functionality was
>immediately turned into a fatal error. Patches to delete these
>have been merged now
> 
> Changed in v6:
> 
>  - Remove all discussion of machine types lifecycle since it
>looks like they will be better kept upstream for as long 
>as any downstream wants them (Paolo)
> 
>  - Get rid of separate "Support lifecycle" appendix and fold
>its content into the "Deprecated features" appendix (Daniel)
> 
>  - Fix s/-monitor/-mon/  (Thomas)
> 
> Changed in v5:
> 
>  - Removed misleading reference to "major" release (Eduardo)
> 
> Changed in v4:
> 
>  - Misc typos / wording clarification (Thomas)
> 
> Changed in v3:
> 
>  - Rename appendix to "Deprecated features" (Markus)
>  - List all currently deprecated features
>  - Document that deprecated features will be removed after
>2 releases of being deprecated
>  - Clarify that clock for removing historically deprecated
>features starts with the forthcoming release.
> 
> Changed in v2:
> 
>  - Split into 2 patches so we can consider each suggested addition
>independantly.
> 
> 
> Daniel P. Berrange (1):
>   docs: document deprecation policy & deprecated features in appendix
> 
>  qemu-doc.texi | 175 
> ++
>  1 file changed, 175 insertions(+)
> 
> -- 
> 2.13.3
> 
> 

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH for-2.10] hw/display/sm501: Don't use vmstate_register_ram_global()

2017-07-25 Thread Peter Maydell

On 24 July 2017 at 19:35, Dr. David Alan Gilbert  wrote:
> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>> Ping for review, please? Would be nice to get this into rc0.
>
> Reviewed-by: Dr. David Alan Gilbert 

Thanks; applied to master.

-- PMM

Re: [Qemu-devel] [PATCH] qemu-iotests: Fix reference output for 186

2017-07-25 Thread Stefan Hajnoczi

On Tue, Jul 25, 2017 at 10:59:36AM +0200, Kevin Wolf wrote:
> Commits 70f17a1 ('error: Revert unwanted change of warning messages')
> and e1824e5 ('qemu-iotests: Test 'info block'') had a semantic merge
> conflict, which results in failure for qemu-iotests case 186. Fix the
> reference output to consider the changes of 70f17a1.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/186.out | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

[Qemu-devel] RFC: QEMU RISC-V modular ISA decoding

2017-07-25 Thread Bastian Koppelmann

Hi QEMU devs, hi risc-v-sw devs,

I'm posting this cross mailing list since I'd like to get feedback from
the both sides.

Right now the RISC-V port for QEMU uses the classic decoding scheme of
one function decoding the first opcode (and prefixes) and then branches
to different functions for decoding the rest (as in target/arm or most
of the other targets). This is all done using switch, case statements.

This is a little bit tricky to extend, especially for third parties. I
don't think it's too bad, but it can definitely be improved and I really
like the way target/s390x does it, but I'm not sure about it's drawbacks.

I see three options to proceed from here:

1) Completely move to a decoding scheme similar to target/s390x in
   QEMU. On the plus side it make it super easy to add new
   instructions and/or new instruction formats, and reduces decoding
   errors. I don't know the major drawbacks to this approach, maybe
   performance. Does anyone know? Other than that it needs a major
   rewrite of the decoder, which will take some time and thus delay
   the development of RISC-V QEMU upstream. (I think RV32/64I can
   be left as is, since everybody has to implement it anyways)

2) The compromise: Leave the core as is, i.e. RV32GC, and provide a
   nice interface for any other extension similar to target/s390.
   The big plus here is that no code needs to be changed and only
   the interface needs to be added. We don't add any performance
   overhead (if s390x-style decoding adds any), but this might
   result in nobody using it, since they don't know about the
   interface and they just hack their stuff in. Then it was a waste
   of our time to implement the interface.

3) The status quo. Just leave it as is.

Any comments?

Cheers,
Bastian

Re: [Qemu-devel] [PATCH 2/4] net: fix -netdev socket, fd= for UDP sockets

2017-07-25 Thread Michael S. Tsirkin

On Fri, Jul 21, 2017 at 11:55:51AM +0200, Jens Freimann wrote:
> This patch fixes -netdev socket,fd= for UDP sockets
> Currently -netdev socket,fd=<...> results in
> 
>   qemu: error: specified mcastaddr "127.0.0.1" (0x7f01) does not
> contain a multicast address
>   qemu-system-x86_64: -netdev
> socket,id=n1,fd=3: Device 'socket' could not be initialized
> 
> To fix these we need to allow specifying multicast and fd arguments
> for the same netdev. With this the user can specify "-netdev
> fd=3,mcast="
> 
> Fixes: 3d830459b1eccdb61b75e2712fd364012ce5a115
> Signed-off-by: Jens Freimann 

Pls Cc Jason.

> ---
>  net/socket.c | 37 ++---
>  1 file changed, 18 insertions(+), 19 deletions(-)
> 
> diff --git a/net/socket.c b/net/socket.c
> index f85ef7d..18af2ab 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -320,11 +320,11 @@ static NetClientInfo net_dgram_socket_info = {
>  static NetSocketState *net_socket_fd_init_dgram(NetClientState *peer,
>  const char *model,
>  const char *name,
> -int fd, int is_connected)
> +int fd, int is_connected,
> +const char *mcast)
>  {
>  struct sockaddr_in saddr;
>  int newfd;
> -socklen_t saddr_len = sizeof(saddr);
>  NetClientState *nc;
>  NetSocketState *s;
>  
> @@ -333,8 +333,13 @@ static NetSocketState 
> *net_socket_fd_init_dgram(NetClientState *peer,
>   * by ONLY ONE process: we must "clone" this dgram socket --jjo
>   */
>  
> -if (is_connected) {
> -if (getsockname(fd, (struct sockaddr *) &saddr, &saddr_len) == 0) {
> +if (is_connected && mcast != NULL) {
> +if (parse_host_port(&saddr, mcast) < 0) {
> +fprintf(stderr,
> +"qemu: error: init_dgram: fd=%d failed 
> parse_host_port()\n",
> +fd);
> +goto err;
> +}
>  /* must be bound */
>  if (saddr.sin_addr.s_addr == 0) {
>  fprintf(stderr, "qemu: error: init_dgram: fd=%d unbound, "
> @@ -351,12 +356,6 @@ static NetSocketState 
> *net_socket_fd_init_dgram(NetClientState *peer,
>  dup2(newfd, fd);
>  close(newfd);
>  
> -} else {
> -fprintf(stderr,
> -"qemu: error: init_dgram: fd=%d failed getsockname(): 
> %s\n",
> -fd, strerror(errno));
> -goto err;
> -}
>  }
>  
>  nc = qemu_new_net_client(&net_dgram_socket_info, peer, model, name);
> @@ -432,7 +431,7 @@ static NetSocketState 
> *net_socket_fd_init_stream(NetClientState *peer,
>  
>  static NetSocketState *net_socket_fd_init(NetClientState *peer,
>const char *model, const char 
> *name,
> -  int fd, int is_connected)
> +  int fd, int is_connected, const 
> char *mc)
>  {
>  int so_type = -1, optlen=sizeof(so_type);
>  
> @@ -445,7 +444,7 @@ static NetSocketState *net_socket_fd_init(NetClientState 
> *peer,
>  }
>  switch(so_type) {
>  case SOCK_DGRAM:
> -return net_socket_fd_init_dgram(peer, model, name, fd, is_connected);
> +return net_socket_fd_init_dgram(peer, model, name, fd, is_connected, 
> mc);
>  case SOCK_STREAM:
>  return net_socket_fd_init_stream(peer, model, name, fd, 
> is_connected);
>  default:
> @@ -567,7 +566,7 @@ static int net_socket_connect_init(NetClientState *peer,
>  break;
>  }
>  }
> -s = net_socket_fd_init(peer, model, name, fd, connected);
> +s = net_socket_fd_init(peer, model, name, fd, connected, NULL);
>  if (!s)
>  return -1;
>  snprintf(s->nc.info_str, sizeof(s->nc.info_str),
> @@ -602,7 +601,7 @@ static int net_socket_mcast_init(NetClientState *peer,
>  if (fd < 0)
>  return -1;
>  
> -s = net_socket_fd_init(peer, model, name, fd, 0);
> +s = net_socket_fd_init(peer, model, name, fd, 0, NULL);
>  if (!s)
>  return -1;
>  
> @@ -652,7 +651,7 @@ static int net_socket_udp_init(NetClientState *peer,
>  }
>  qemu_set_nonblock(fd);
>  
> -s = net_socket_fd_init(peer, model, name, fd, 0);
> +s = net_socket_fd_init(peer, model, name, fd, 0, NULL);
>  if (!s) {
>  return -1;
>  }
> @@ -675,9 +674,9 @@ int net_init_socket(const Netdev *netdev, const char 
> *name,
>  assert(netdev->type == NET_CLIENT_DRIVER_SOCKET);
>  sock = &netdev->u.socket;
>  
> -if (sock->has_fd + sock->has_listen + sock->has_connect + 
> sock->has_mcast +
> -sock->has_udp != 1) {
> -error_report("exactly one of fd=, listen=, connect=, mcast= or udp="
> +if (sock->has_listen + sock->h

Re: [Qemu-devel] [PATCH 2/4] net: fix -netdev socket, fd= for UDP sockets

2017-07-25 Thread Michael S. Tsirkin

On Fri, Jul 21, 2017 at 11:55:51AM +0200, Jens Freimann wrote:
> This patch fixes -netdev socket,fd= for UDP sockets
> Currently -netdev socket,fd=<...> results in
> 
>   qemu: error: specified mcastaddr "127.0.0.1" (0x7f01) does not
> contain a multicast address
>   qemu-system-x86_64: -netdev
> socket,id=n1,fd=3: Device 'socket' could not be initialized
> 
> To fix these we need to allow specifying multicast and fd arguments
> for the same netdev. With this the user can specify "-netdev
> fd=3,mcast="
> 
> Fixes: 3d830459b1eccdb61b75e2712fd364012ce5a115
> Signed-off-by: Jens Freimann 

Reviewed-by: Michael S. Tsirkin 

> ---
>  net/socket.c | 37 ++---
>  1 file changed, 18 insertions(+), 19 deletions(-)
> 
> diff --git a/net/socket.c b/net/socket.c
> index f85ef7d..18af2ab 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -320,11 +320,11 @@ static NetClientInfo net_dgram_socket_info = {
>  static NetSocketState *net_socket_fd_init_dgram(NetClientState *peer,
>  const char *model,
>  const char *name,
> -int fd, int is_connected)
> +int fd, int is_connected,
> +const char *mcast)
>  {
>  struct sockaddr_in saddr;
>  int newfd;
> -socklen_t saddr_len = sizeof(saddr);
>  NetClientState *nc;
>  NetSocketState *s;
>  
> @@ -333,8 +333,13 @@ static NetSocketState 
> *net_socket_fd_init_dgram(NetClientState *peer,
>   * by ONLY ONE process: we must "clone" this dgram socket --jjo
>   */
>  
> -if (is_connected) {
> -if (getsockname(fd, (struct sockaddr *) &saddr, &saddr_len) == 0) {
> +if (is_connected && mcast != NULL) {
> +if (parse_host_port(&saddr, mcast) < 0) {
> +fprintf(stderr,
> +"qemu: error: init_dgram: fd=%d failed 
> parse_host_port()\n",
> +fd);
> +goto err;
> +}
>  /* must be bound */
>  if (saddr.sin_addr.s_addr == 0) {
>  fprintf(stderr, "qemu: error: init_dgram: fd=%d unbound, "
> @@ -351,12 +356,6 @@ static NetSocketState 
> *net_socket_fd_init_dgram(NetClientState *peer,
>  dup2(newfd, fd);
>  close(newfd);
>  
> -} else {
> -fprintf(stderr,
> -"qemu: error: init_dgram: fd=%d failed getsockname(): 
> %s\n",
> -fd, strerror(errno));
> -goto err;
> -}
>  }
>  
>  nc = qemu_new_net_client(&net_dgram_socket_info, peer, model, name);
> @@ -432,7 +431,7 @@ static NetSocketState 
> *net_socket_fd_init_stream(NetClientState *peer,
>  
>  static NetSocketState *net_socket_fd_init(NetClientState *peer,
>const char *model, const char 
> *name,
> -  int fd, int is_connected)
> +  int fd, int is_connected, const 
> char *mc)
>  {
>  int so_type = -1, optlen=sizeof(so_type);
>  
> @@ -445,7 +444,7 @@ static NetSocketState *net_socket_fd_init(NetClientState 
> *peer,
>  }
>  switch(so_type) {
>  case SOCK_DGRAM:
> -return net_socket_fd_init_dgram(peer, model, name, fd, is_connected);
> +return net_socket_fd_init_dgram(peer, model, name, fd, is_connected, 
> mc);
>  case SOCK_STREAM:
>  return net_socket_fd_init_stream(peer, model, name, fd, 
> is_connected);
>  default:
> @@ -567,7 +566,7 @@ static int net_socket_connect_init(NetClientState *peer,
>  break;
>  }
>  }
> -s = net_socket_fd_init(peer, model, name, fd, connected);
> +s = net_socket_fd_init(peer, model, name, fd, connected, NULL);
>  if (!s)
>  return -1;
>  snprintf(s->nc.info_str, sizeof(s->nc.info_str),
> @@ -602,7 +601,7 @@ static int net_socket_mcast_init(NetClientState *peer,
>  if (fd < 0)
>  return -1;
>  
> -s = net_socket_fd_init(peer, model, name, fd, 0);
> +s = net_socket_fd_init(peer, model, name, fd, 0, NULL);
>  if (!s)
>  return -1;
>  
> @@ -652,7 +651,7 @@ static int net_socket_udp_init(NetClientState *peer,
>  }
>  qemu_set_nonblock(fd);
>  
> -s = net_socket_fd_init(peer, model, name, fd, 0);
> +s = net_socket_fd_init(peer, model, name, fd, 0, NULL);
>  if (!s) {
>  return -1;
>  }
> @@ -675,9 +674,9 @@ int net_init_socket(const Netdev *netdev, const char 
> *name,
>  assert(netdev->type == NET_CLIENT_DRIVER_SOCKET);
>  sock = &netdev->u.socket;
>  
> -if (sock->has_fd + sock->has_listen + sock->has_connect + 
> sock->has_mcast +
> -sock->has_udp != 1) {
> -error_report("exactly one of fd=, listen=, connect=, mcast= or udp="
> +if (sock->h

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-25 Thread Stefan Hajnoczi

On Mon, Jul 24, 2017 at 08:02:24PM +0300, Lluís Vilanova wrote:
> This series adds a basic interface to instrument tracing events and control
> their tracing state.
> 
> The instrumentation code is dynamically loaded into QEMU (either when it 
> starts
> or later using its remote control interfaces).
> 
> All events can be instrumented, but the instrumentable events must be 
> explicitly
> specified at configure time.
> 
> Signed-off-by: Lluís Vilanova 

Hi Lluís,
I'm concerned that the shared library interface will be abused to monkey
patch code into QEMU far beyond instrumentation use cases and/or avoid
the responsibilities of the GPL license.

Instead I suggest adding a trace backend generates calls to registered
"callback" functions:

  $ cat >my-instrumentation.c
  #include "trace/control.h"

  static void my_cpu_in(unsigned int addr, char size, unsigned int val)
  {
  printf("my_cpu_in\n");
  }

  static void my_init(void)
  {
  trace_register_event_callback("cpu_in", my_cpu_in);
  trace_enable_events("cpu_in");
  }
  trace_init(my_init);

  $ ./configure --enable-trace-backends=log,callback && make -j4

This is still a clean interface that allows instrumentation code to be
kept separate from the trace event call sites.

The instrumentation code gets compiled into QEMU, but that shouldn't be
a huge burden since QEMU's Makefiles only recompile changed source
files (only the first build is slow).

Does this alternative sound reasonable to you?

Stefan

signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-block] [PATCH for-2.10 0/2] Test for bug fixes from byte-based block status

2017-07-25 Thread Stefan Hajnoczi

On Mon, Jul 24, 2017 at 10:39:50AM -0500, Eric Blake wrote:
> Kevin is correct, and I need tests of my recent bug fixes :)
> 
> https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg07235.html
> 
> Eric Blake (2):
>   iotests: Check dirty bitmap statistics in 124
>   iotests: Add test of recent fix to 'qemu-img measure'
> 
>  tests/qemu-iotests/124 |  7 +-
>  tests/qemu-iotests/190 | 59 
> ++
>  tests/qemu-iotests/190.out | 11 +
>  tests/qemu-iotests/group   |  1 +
>  4 files changed, 77 insertions(+), 1 deletion(-)
>  create mode 100755 tests/qemu-iotests/190
>  create mode 100644 tests/qemu-iotests/190.out
> 
> -- 
> 2.13.3
> 
> 

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

1 2 3 4 5 >

1 - 100 of 418 matches

Mail list logo