[Qemu-block] [PATCH] ratelimit: don't align wait time with slices

2018-02-06 Thread Wolfgang Bumiller
It is possible for rate limited writes to keep overshooting a slice's
quota by a tiny amount, causing the slice-aligned waiting period to
effectively halve the rate.

Signed-off-by: Wolfgang Bumiller 
---
Copied the Ccs from the discussion thread, hope that's fine, as I only
just noticed that the reply containing this snippet was sent in response
to the mail that did not yet contain those Ccs; sorry about that.

 include/qemu/ratelimit.h | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/include/qemu/ratelimit.h b/include/qemu/ratelimit.h
index 8dece483f5..1b38291823 100644
--- a/include/qemu/ratelimit.h
+++ b/include/qemu/ratelimit.h
@@ -36,7 +36,7 @@ typedef struct {
 static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
 {
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
-uint64_t delay_slices;
+double delay_slices;
 
 assert(limit->slice_quota && limit->slice_ns);
 
@@ -55,12 +55,11 @@ static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
 return 0;
 }
 
-/* Quota exceeded. Calculate the next time slice we may start
- * sending data again. */
-delay_slices = (limit->dispatched + limit->slice_quota - 1) /
-limit->slice_quota;
+/* Quota exceeded. Wait based on the excess amount and then start a new
+ * slice. */
+delay_slices = (double)limit->dispatched / limit->slice_quota;
 limit->slice_end_time = limit->slice_start_time +
-delay_slices * limit->slice_ns;
+(uint64_t)(delay_slices * limit->slice_ns);
 return limit->slice_end_time - now;
 }
 
-- 
2.11.0
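
The arithmetic is easier to see with concrete numbers. A minimal sketch
(plain Python with made-up slice parameters, not QEMU code) of the old
versus new delay calculation:

  # Hypothetical slice parameters for illustration only.
  slice_ns = 100 * 1000 * 1000        # one 100 ms slice, in nanoseconds
  slice_quota = 10 * 1024 * 1024      # bytes allowed per slice
  dispatched = slice_quota + 4096     # quota overshot by one 4 KiB request

  # Old behaviour: round the wait up to the next whole slice boundary.
  old_delay_slices = (dispatched + slice_quota - 1) // slice_quota  # == 2
  # New behaviour: wait in proportion to what was actually dispatched.
  new_delay_slices = dispatched / slice_quota                       # ~= 1.0004

  print("old wait: %d ns" % (old_delay_slices * slice_ns))
  print("new wait: %d ns" % int(new_delay_slices * slice_ns))

If every slice overshoots by a tiny amount, the old rounding charges a full
extra slice of waiting each time, which is exactly the rate halving described
above.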





Re: [Qemu-block] [Qemu-devel] [PATCH v2 3/6] qapi: add nbd-server-remove

2018-02-06 Thread Markus Armbruster
"Dr. David Alan Gilbert"  writes:

> * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
>> 26.01.2018 18:05, Dr. David Alan Gilbert wrote:
>> > * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
[...]
>> > > most of commands, ported to hmp are done in same style: they just call
>> > > corresponding qmp command.

HMP commands *should* call the QMP command to do the actual work.  That
way, we *know* all the functionality is available in QMP, and HMP is
consistent with it.

Sometimes, calling helpers shared with QMP is more convenient, and
that's okay, but then you have to think about QMP completeness and
HMP/QMP consistency.

The only exceptions are HMP commands that don't make sense in QMP, such
as @cpu.

>> > > Isn't it better to provide common interface for calling qmp commands 
>> > > through
>> > > HMP monitor, to never
>> > > create hmp versions of new commands? they will be available 
>> > > automatically.
>> > It would be nice to do that, but they're not that consistent in how they
>> > convert parameters and options, but I occasionally wonder if we could
>> > automate more of it.
>> 
>> 
>> What about allowing some new syntax in hmp, directly mapped to qmp?
>> 
>> something like
>> 
>> >>> blockdev-add id disk driver qcow2 cache {writeback true direct true} aio
>> native discard unmap file {driver file filename /tmp/somedisk}
>> 
>> ?
>
> Hmm, I don't particularly find that easy to read either; however the
> actual block device specification for HMP should be the same as what we
> pass on the command line, so we only have to worry about any extra
> things that are part of blockdev_add.
> (I'm sure we can find a way of making the one we pass on the commandline
> more readable as well, there's so much duplication).

Good points.

QMP syntax is different for a good reason: it serves machines rather
than humans.

Both HMP and command line serve the same humans, yet the syntax they
wrap around common functionality is different.  Sad waste of developer
time, sad waste of user brain power.  The former could perhaps be
reduced with better tooling, say having QAPI generate the details.

If you have QAPI generate HMP and command line from the exact same
definitions as QMP, you get what Vladimir wants: different interfaces to
the exact same functionality, without additional coding.

Note that the needs of humans and machines differ in more ways than just
syntax.  For instance, humans appreciate convenience features to save
typing.  In a machine interface, they'd add unnecessary and
inappropriate complexity.  Adding convenience is a good reason for
actually designing the HMP interface, rather than copying the QMP one
blindly.



Re: [Qemu-block] [PATCH v5 0/5] coroutine-lock: polymorphic CoQueue

2018-02-06 Thread Fam Zheng
On Sat, 02/03 10:39, Paolo Bonzini wrote:
> There are cases in which a queued coroutine must be restarted from
> non-coroutine context (with qemu_co_enter_next).  In this cases,
> qemu_co_enter_next also needs to be thread-safe, but it cannot use a
> CoMutex and so cannot qemu_co_queue_wait.  This happens in curl (which
> right now is rolling its own list of Coroutines) and will happen in
> Fam's NVMe driver as well.
> 
> This series extracts the idea of a polymorphic lockable object
> from my "scoped lock guard" proposal, and applies it to CoQueue.
> The implementation of QemuLockable is similar to C11 _Generic, but
> redone using the preprocessor and GCC builtins for compatibility.
> 
> In general, while a bit on the esoteric side, the functionality used
> to emulate _Generic is fairly old in GCC, and the builtins are already
> used by include/qemu/atomic.h; the series was tested with Fedora 27 (boot
> Damn Small Linux via http) and CentOS 6 (compiled only).
> 
> Paolo
> 
> v4->v5: fix checkpatch complaints

Queued, thanks.

Fam



Re: [Qemu-block] [PATCH for-2.9-rc5 v4 2/2] block: Drain BH in bdrv_drained_begin

2018-02-06 Thread Fam Zheng
On Tue, Feb 6, 2018 at 11:32 PM, Kevin Wolf  wrote:
> Am 18.04.2017 um 16:30 hat Fam Zheng geschrieben:
>> During block job completion, nothing is preventing
>> block_job_defer_to_main_loop_bh from being called in a nested
>> aio_poll(), which is a trouble, such as in this code path:
>>
>> qmp_block_commit
>>   commit_active_start
>> bdrv_reopen
>>   bdrv_reopen_multiple
>> bdrv_reopen_prepare
>>   bdrv_flush
>> aio_poll
>>   aio_bh_poll
>> aio_bh_call
>>   block_job_defer_to_main_loop_bh
>> stream_complete
>>   bdrv_reopen
>>
>> block_job_defer_to_main_loop_bh is the last step of the stream job,
>> which should have been "paused" by the bdrv_drained_begin/end in
>> bdrv_reopen_multiple, but it is not done because it's in the form of a
>> main loop BH.
>>
>> Similar to why block jobs should be paused between drained_begin and
>> drained_end, BHs they schedule must be excluded as well.  To achieve
>> this, this patch forces draining the BH in BDRV_POLL_WHILE.
>>
>> As a side effect this fixes a hang in block_job_detach_aio_context
>> during system_reset when a block job is ready:
>>
>> #0  0x55aa79f3 in bdrv_drain_recurse
>> #1  0x55aa825d in bdrv_drained_begin
>> #2  0x55aa8449 in bdrv_drain
>> #3  0x55a9c356 in blk_drain
>> #4  0x55aa3cfd in mirror_drain
>> #5  0x55a66e11 in block_job_detach_aio_context
>> #6  0x55a62f4d in bdrv_detach_aio_context
>> #7  0x55a63116 in bdrv_set_aio_context
>> #8  0x55a9d326 in blk_set_aio_context
>> #9  0x557e38da in virtio_blk_data_plane_stop
>> #10 0x559f9d5f in virtio_bus_stop_ioeventfd
>> #11 0x559fa49b in virtio_bus_stop_ioeventfd
>> #12 0x559f6a18 in virtio_pci_stop_ioeventfd
>> #13 0x559f6a18 in virtio_pci_reset
>> #14 0x559139a9 in qdev_reset_one
>> #15 0x55916738 in qbus_walk_children
>> #16 0x55913318 in qdev_walk_children
>> #17 0x55916738 in qbus_walk_children
>> #18 0x559168ca in qemu_devices_reset
>> #19 0x5581fcbb in pc_machine_reset
>> #20 0x558a4d96 in qemu_system_reset
>> #21 0x5577157a in main_loop_should_exit
>> #22 0x5577157a in main_loop
>> #23 0x5577157a in main
>>
>> The rationale is that the loop in block_job_detach_aio_context cannot
>> make any progress in pausing/completing the job, because bs->in_flight
>> is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop
>> BH. With this patch, it does.
>>
>> Reported-by: Jeff Cody 
>> Signed-off-by: Fam Zheng 
>
> Fam, do you remember whether this was really only about drain? Because
> in that case...

Yes I believe so.

>
>> diff --git a/include/block/block.h b/include/block/block.h
>> index 97d4330..5ddc0cf 100644
>> --- a/include/block/block.h
>> +++ b/include/block/block.h
>> @@ -381,12 +381,13 @@ void bdrv_drain_all(void);
>>
>>  #define BDRV_POLL_WHILE(bs, cond) ({   \
>>  bool waited_ = false;  \
>> +bool busy_ = true; \
>>  BlockDriverState *bs_ = (bs);  \
>>  AioContext *ctx_ = bdrv_get_aio_context(bs_);  \
>>  if (aio_context_in_iothread(ctx_)) {   \
>> -while ((cond)) {   \
>> -aio_poll(ctx_, true);  \
>> -waited_ = true;\
>> +while ((cond) || busy_) {  \
>> +busy_ = aio_poll(ctx_, (cond));\
>> +waited_ |= !!(cond) | busy_;   \
>>  }  \
>>  } else {   \
>>  assert(qemu_get_current_aio_context() ==   \
>> @@ -398,11 +399,16 @@ void bdrv_drain_all(void);
>>   */\
>>  assert(!bs_->wakeup);  \
>>  bs_->wakeup = true;\
>> -while ((cond)) {   \
>> -aio_context_release(ctx_); \
>> -aio_poll(qemu_get_aio_context(), true);\
>> -aio_context_acquire(ctx_); \
>> -waited_ = true;\
>> +while (busy_) {\
>> +if ((cond)) {  \
>> +waited_ = busy_ = true;\
>> +aio_context_release(ctx_);   
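
For readers skimming the thread, the change to the macro's loop structure can
be modelled roughly in a few lines of Python (a behavioural sketch only, not
QEMU code; aio_poll here is a stand-in that returns whether it made progress):

  def poll_while(cond, aio_poll):
      # New behaviour: keep iterating while the condition holds *or* the
      # previous poll made progress, so pending bottom halves are drained
      # even once cond is already false.
      waited = False
      busy = True
      while cond() or busy:
          busy = aio_poll(blocking=cond())
          waited = waited or cond() or busy
      return waited

  # Toy demo: cond is already false but one BH is still pending.  The old
  # loop ("while cond(): aio_poll()") would never run it; this one does.
  pending = [lambda: print("BH ran")]
  def fake_aio_poll(blocking):
      if pending:
          pending.pop()()
          return True
      return False

  poll_while(lambda: False, fake_aio_poll)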

Re: [Qemu-block] [Qemu-devel] [PATCH v8 18/26] block: Add sgfnt_runtime_opts to BlockDriver

2018-02-06 Thread Eric Blake

On 02/05/2018 09:18 AM, Max Reitz wrote:

This new field can be set by block drivers to list the runtime options
they accept that may influence the contents of the respective BDS. As of
a follow-up patch, this list will be used by the common
bdrv_refresh_filename() implementation to decide which options to put
into BDS.full_open_options (and consequently whether a JSON filename has
to be created), thus freeing the drivers of having to implement that
logic themselves.

Additionally, this patch adds the field to all of the block drivers that
need it and sets it accordingly.

Signed-off-by: Max Reitz 
---



+/* Pointer to a NULL-terminated array of names of significant options that
+ * can be specified for bdrv_open(). A significant option is one that
+ * changes the data of a BDS.
+ * If this pointer is NULL, the array is considered empty.
+ * "filename" and "driver" are always considered significant. */
+const char *const *sgfnt_runtime_opts;


Warning: Bikeshedding follows:

s/sgfnt/vital/

might read easier (same number of letters, but has some vowels rthr thn 
bng crptc bbrvtn).


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH] block: Simplify bdrv_can_write_zeroes_with_unmap()

2018-02-06 Thread Eric Blake

On 01/29/2018 05:08 AM, Stefan Hajnoczi wrote:

On Fri, Jan 26, 2018 at 01:34:39PM -0600, Eric Blake wrote:

We don't need the can_write_zeroes_with_unmap field in
BlockDriverInfo, because it is redundant information with
supported_zero_flags & BDRV_REQ_MAY_UNMAP.  Note that
BlockDriverInfo and supported_zero_flags are both per-device
settings, rather than global state about the driver as a
whole, which means one or both of these bits of information
can already be conditional.  Let's audit how they were set:


...

Simplify the code by moving the conditional into
supported_zero_flags for all drivers, then dropping the
now-unused BDI field.  For callers that relied on
bdrv_can_write_zeroes_with_unmap(), we return the same
per-device settings for drivers that had conditions (no
observable change in behavior there); and can now return
true (instead of false) for drivers that support passthrough
(for example, the commit driver) which gives those drivers
the same fix as nbd just got in bca80059e.  For callers that
relied on supported_zero_flags, we now have a few more places
that can avoid a wasted call to pwrite_zeroes() that will
just fail with ENOTSUP.

Suggested-by: Paolo Bonzini 
Signed-off-by: Eric Blake 

---
The commit id mentioned above is dependent on me not having to respin
my latest NBD pull request:

Based-on: <20180126160411.4033-1-ebl...@redhat.com>
([PULL 0/8] NBD patches through 26 Jan)



Reviewed-by: Stefan Hajnoczi 



Thanks.

Since this patch was discovered in relation to NBD code, I'm fine taking 
it through my NBD queue; but it's also more generic to the block layer, 
so I'm also fine if one of the other block maintainers grabs it first 
through their tree.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH 2/2] qapi: add block latency histogram interface

2018-02-06 Thread Eric Blake

On 02/06/2018 12:06 PM, Vladimir Sementsov-Ogievskiy wrote:

06.02.2018 18:50, Eric Blake wrote:

On 02/06/2018 08:07 AM, Vladimir Sementsov-Ogievskiy wrote:

Set (and clear) histogram through new command
block-latency-histogram-set and show new statistics in
query-blockstats results.



The commit message mentioned that you can set and clear histogram 
tracking; how does this interface let you clear things?  By passing a 
0-length latency array?  If you violate the constraint (pass 
non-ascending points, for example), does the previous latency map get 
wiped out or is it still preserved unchanged?


On error nothing is changed.

By "clear" I mean zeroing statistics, not removing the whole histogram. 
And to zero statistics you can call set with the same latency array.
There is no way to remove the histogram at all. We can add 
block-latency-histogram-unset later if needed.


Maybe "set (or restart) histogram collection points" might read better? 
I also don't think we need a new command; if you make 'latency' 
optional, then omitting it can serve to stop collecting statistics 
altogether, without needing a new command (then again, if you do that, 
now the command is used to "set, restart, and clear histogram collection").


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH v2] iotests: 205: support luks format

2018-02-06 Thread Eric Blake

On 02/06/2018 12:26 PM, Daniel P. Berrangé wrote:

On Tue, Feb 06, 2018 at 09:25:07PM +0300, Vladimir Sementsov-Ogievskiy wrote:

Support default luks options in VM.add_drive and in new library
function qemu_img_create. Use it in 205 iotests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---


Reviewed-by: Daniel P. Berrange 


Thanks. I'll take this through my NBD queue.

git git://repo.or.cz/qemu/ericb.git nbd

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH v2 3/6] qapi: add nbd-server-remove

2018-02-06 Thread Dr. David Alan Gilbert
* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> 26.01.2018 18:05, Dr. David Alan Gilbert wrote:
> > * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> > > 17.01.2018 19:03, Eric Blake wrote:
> > > > On 01/17/2018 09:51 AM, Vladimir Sementsov-Ogievskiy wrote:
> > > > 
> > > > > > > I have a script (for managing libvirt guest, but it can be 
> > > > > > > adopted for
> > > > > > > qemu or even used for qemu monitor), which allows
> > > > > > > me run qmp commands on vms as easy as:
> > > > > > > 
> > > > > > > |qmp VMNAME query-block-jobs or qmp VMNAME nbd-server-remove name 
> > > > > > > exp1
> > > > > > > mode hard or even |
> > > > > > > 
> > > > > > > |qmp VMNAME blockdev-add id disk driver qcow2 cache {writeback 
> > > > > > > true
> > > > > > > direct true} aio native discard unmap file {driver file filename
> > > > > > > /tmp/somedisk} |||
> > > > > > Yeah, there are various scripting solutions around QMP that can 
> > > > > > make it
> > > > > > easier; but HMP is often still an easy front-line interface for
> > > > > > experiments.
> > > > > > 
> > > > > isn't it because these solutions are not available directly in 
> > > > > monitor,
> > > > > when HMP is?
> > > > QMP can be directly accessed in a monitor; it just requires more typing.
> > > >If you are developing QMP commands, it may be easier to use
> > > > ./scripts/qmp/qmp-shell (couple it with a readline wrapper, and you can
> > > > even get tab-completion and history across sessions).  There's also
> > > > things like libvirt's 'virsh qmp-monitor-command' for shell-scripting
> > > > access to arbitrary QMP commands, provided your guest is run by libvirt.
> > > > 
> > > > > may be, we need third type of monitor HQMP which is QMP with 
> > > > > simplified
> > > > > syntax? Or
> > > > > allow qmp commands in simplified syntax directly in HMP?
> > > > No, I don't think we need either thing.  Wrappers around existing
> > > > monitors is better than bloating qemu proper with a third flavor of
> > > > monitor.  And HMP is for humans, with no restrictions on back-compat
> > > > changes, so if it doesn't do something you want for quick-and-dirty
> > > > testing, you can either add a new HMP command, or just use QMP (or one
> > > > of its wrappers, like qmp-shell) in the first place.  Ultimately, our
> > > > long-term concern is only about the QMP interface; HMP is supposed to be
> > > > convenient.  So if it starts costing too much time to port a QMP
> > > > interface to HMP, then don't worry about it.
> > > > 
> > > most of commands, ported to hmp are done in same style: they just call
> > > corresponding qmp command.
> > > Isn't it better to provide common interface for calling qmp commands 
> > > through
> > > HMP monitor, to never
> > > create hmp versions of new commands? they will be available automatically.
> > It would be nice to do that, but they're not that consistent in how they
> > convert parameters and options, but I occasionally wonder if we could
> > automate more of it.
> 
> 
> What about allowing some new syntax in hmp, directly mapped to qmp?
> 
> something like
> 
> >>> blockdev-add id disk driver qcow2 cache {writeback true direct true} aio
> native discard unmap file {driver file filename /tmp/somedisk}
> 
> ?

Hmm, I don't particularly find that easy to read either; however the
actual block device specification for HMP should be the same as what we
pass on the command line, so we only have to worry about any extra
things that are part of blockdev_add.
(I'm sure we can find a way of making the one we pass on the commandline
more readable as well, there's so much duplication).

Dave

> Or it may be realized as a separate hmp command "qmp" (looks more safe as a
> first step, however, I think previous variant (direct call) is better):
> 
> >>> qmp blockdev-add id disk driver qcow2 cache {writeback true direct true}
> aio native discard unmap file {driver file filename /tmp/somedisk}
> 
> what do think? This looks simple to implement and should be useful.
> 
> > 
> > Dave
> > 
> > > -- 
> > > Best regards,
> > > Vladimir
> > > 
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> 
> -- 
> Best regards,
> Vladimir
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-block] [PATCH v2] iotests: 205: support luks format

2018-02-06 Thread Vladimir Sementsov-Ogievskiy
Support default luks options in VM.add_drive and in new library
function qemu_img_create. Use it in 205 iotests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

v2: use keysec0  and IMGKEYSECRET

 tests/qemu-iotests/205|  4 ++--
 tests/qemu-iotests/iotests.py | 31 +++
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/205 b/tests/qemu-iotests/205
index 10388920dc..e7b2eae51d 100644
--- a/tests/qemu-iotests/205
+++ b/tests/qemu-iotests/205
@@ -22,7 +22,7 @@ import os
 import sys
 import iotests
 import time
-from iotests import qemu_img, qemu_io, filter_qemu_io, QemuIoInteractive
+from iotests import qemu_img_create, qemu_io, filter_qemu_io, QemuIoInteractive
 
 nbd_sock = 'nbd_sock'
 nbd_uri = 'nbd+unix:///exp?socket=' + nbd_sock
@@ -31,7 +31,7 @@ disk = os.path.join(iotests.test_dir, 'disk')
 
 class TestNbdServerRemove(iotests.QMPTestCase):
 def setUp(self):
-qemu_img('create', '-f', iotests.imgfmt, disk, '1M')
+qemu_img_create('-f', iotests.imgfmt, disk, '1M')
 
 self.vm = iotests.VM().add_drive(disk)
 self.vm.launch()
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 5a10b2d534..1bcc9ca57d 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -58,6 +58,11 @@ qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
 socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
 debug = False
 
+luks_default_secret_object = 'secret,id=keysec0,data=' + \
+ os.environ['IMGKEYSECRET']
+luks_default_key_secret_opt = 'key-secret=keysec0'
+
+
 def qemu_img(*args):
 '''Run qemu-img and return the exit code'''
 devnull = open('/dev/null', 'r+')
@@ -66,6 +71,25 @@ def qemu_img(*args):
 sys.stderr.write('qemu-img received signal %i: %s\n' % (-exitcode, ' '.join(qemu_img_args + list(args))))
 return exitcode
 
+def qemu_img_create(*args):
+args = list(args)
+
+# default luks support
+if '-f' in args and args[args.index('-f') + 1] == 'luks':
+if '-o' in args:
+i = args.index('-o')
+if 'key-secret' not in args[i + 1]:
+args[i + 1].append(luks_default_key_secret_opt)
+args.insert(i + 2, '--object')
+args.insert(i + 3, luks_default_secret_object)
+else:
+args = ['-o', luks_default_key_secret_opt,
+'--object', luks_default_secret_object] + args
+
+args.insert(0, 'create')
+
+return qemu_img(*args)
+
 def qemu_img_verbose(*args):
 '''Run qemu-img without suppressing its output and return the exit code'''
 exitcode = subprocess.call(qemu_img_args + list(args))
@@ -263,6 +287,13 @@ class VM(qtest.QEMUQtestMachine):
 if opts:
 options.append(opts)
 
+if format == 'luks' and 'key-secret' not in opts:
+# default luks support
+if luks_default_secret_object not in self._args:
+self.add_object(luks_default_secret_object)
+
+options.append(luks_default_key_secret_opt)
+
 self._args.append('-drive')
 self._args.append(','.join(options))
 self._num_drives += 1
-- 
2.11.1
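
As a usage sketch (hypothetical test body, assuming this patch is applied and
that ./check has exported IMGKEYSECRET as it does for the shell tests), a
Python iotest needs no LUKS-specific boilerplate at all:

  # Run with IMGFMT=luks.
  import os
  import iotests
  from iotests import qemu_img_create

  disk = os.path.join(iotests.test_dir, 'disk')
  qemu_img_create('-f', iotests.imgfmt, disk, '1M')   # as in test 205's setUp

  # For luks the helper expands this into roughly:
  #   qemu-img create -o key-secret=keysec0 \
  #       --object secret,id=keysec0,data=$IMGKEYSECRET -f luks <disk> 1M
  # and VM().add_drive(disk) likewise appends key-secret=keysec0 to the
  # -drive options, adding the matching --object secret,... only once.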




Re: [Qemu-block] [PATCH] iotests: 205: support luks format

2018-02-06 Thread Daniel P . Berrangé
On Tue, Feb 06, 2018 at 08:57:38PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 06.02.2018 20:29, Daniel P. Berrangé wrote:
> > On Tue, Feb 06, 2018 at 08:16:42PM +0300, Vladimir Sementsov-Ogievskiy 
> > wrote:
> > > Support default luks options in VM.add_drive and in new library
> > > function qemu_img_create. Use it in 205 iotests.
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > > ---
> > > 
> > > instead of
> > >[PATCH] iotests: 205: support only raw format
> > > 
> > > let's just support luks. This patch also makes it simple to support
> > > luks in any other python iotest.
> > > 
> > >   tests/qemu-iotests/205|  4 ++--
> > >   tests/qemu-iotests/iotests.py | 33 +
> > >   2 files changed, 35 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> > > index 5a10b2d534..4b9a4445cd 100644
> > > --- a/tests/qemu-iotests/iotests.py
> > > +++ b/tests/qemu-iotests/iotests.py
> > > @@ -58,6 +58,13 @@ qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
> > >   socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 
> > > 'socket_scm_helper')
> > >   debug = False
> > > +luks_default_secret_id = 'luks_secret_default_iotests_id'
> > Can we just call this "keysec0", so we match the convention used by
> > the shell script based tests.
> 
> Here I'm trying to avoid intersection with some user-defined id.

The "user" here is the person writing individual I/O tests. They already
have to know to avoid keysec0 because that's the standard we've defined
for the shell based scripts. So I don't see any benefit to divering in
the Python - just doubles the stuff they need to remember.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: [Qemu-block] [PATCH] iotests: 205: support luks format

2018-02-06 Thread Vladimir Sementsov-Ogievskiy

06.02.2018 21:04, Eric Blake wrote:

On 02/06/2018 11:57 AM, Vladimir Sementsov-Ogievskiy wrote:


+luks_default_secret_id = 'luks_secret_default_iotests_id'

Can we just call this "keysec0", so we match the convention used by
the shell script based tests.


Here I'm trying to avoid intersection with some user-defined id.



You're overthinking it.  We are only using this in the testsuite, and 
nothing else in the testsuite is using 'keysec0' for anything except 
the id of the secret to pass to encrypted disks.  The longer name 
doesn't add any protection.  It might be different if we were trying 
to provide a reusable library for contexts outside the testsuite, but 
since we are not doing that, we can rely on 'make check' failing as 
evidence if we have any collisions in naming choices that need long 
name munging as a workaround.




Ok

--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH 2/2] qapi: add block latency histogram interface

2018-02-06 Thread Vladimir Sementsov-Ogievskiy

06.02.2018 18:50, Eric Blake wrote:

On 02/06/2018 08:07 AM, Vladimir Sementsov-Ogievskiy wrote:

Set (and clear) histogram through new command
block-latency-histogram-set and show new statistics in
query-blockstats results.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  qapi/block-core.json | 62 +++-
  block/qapi.c | 31 ++
  blockdev.c   | 15 +
  3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8225308904..4706a934d9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -451,6 +451,63 @@
 'status': 'DirtyBitmapStatus'} }
    ##
+# @BlockLatencyHistogramInfo:
+#
+# Block latency histogram.
+#
+# @latency: list of latency points in microseconds. Equals to @latency parameter
+#   of last called block-latency-histogram-set.


Second sentence might read better as:

Matches the @latency parameter from the last call to 
block-latency-histogram-set for the given device.



+#
+# @read: list of read-request counts corresponding to latency region.
+#    len(@read) = len(@latency) + 1
+#    @read[0] corresponds to latency region [0, @latency[0])
+#    for 0 < i < len(@latency): @read[i] corresponds to latency region
+#    [@latency[i-1], @latency[i])
+#    @read.last_element corresponds to latency region
+#    [@latency.last_element, +inf)
+#
+# @write: list of write-request counts (see @read semantics)
+#
+# @flush: list of flush-request counts (see @read semantics)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockLatencyHistogramInfo',
+  'data': {'latency': ['uint64'],
+   'read': ['uint64'],
+   'write': ['uint64'],
+   'flush': ['uint64'] } }


Okay, I can see how this interface works.


+
+##
+# @block-latency-histogram-set:
+#
+# Add latency histogram to block device. If latency histogram alredy exists for


s/latency/a latency/ (twice)
s/alredy/already/

+# the device it will be removed and a new one created. Latency histogram may be


s/Latency/The latency/


+# quered through query-blockstats.


s/quered/queried/


+#
+# @device: device name to set latency histogram for.
+#
+# @latency: list of latency points in microseconds. The sequcence must be


s/sequcence/sequence/

+#   ascending, elements must be greater than zero. Histogram 
latency

+#   regions would be
+#   [0, @latency[0]), ..., [@latency[i], @latency[i+1]), ...,
+#   [@latency.last_element, +inf)
+#
+# Returns: error if device is not found.
+#
+# Since: 2.12
+#
+# Example:
+#
+# -> { "execute": "block-latency-histogram-set",
+#  "arguments": { "device": "drive0",
+# "latency": [50, 100, 200] } }
+# <- { "return": {} }
+##
+{ 'command': 'block-latency-histogram-set',
+  'data': {'device': 'str', 'latency': ['uint64'] } }


The commit message mentioned that you can set and clear histogram 
tracking; how does this interface let you clear things?  By passing a 
0-length latency array?  If you violate the constraint (pass 
non-ascending points, for example), does the previous latency map get 
wiped out or is it still preserved unchanged?


On error nothing is changed.

By "clear" I mean zeroing statistics, not removing the whole histogram. 
And to zero statistics you can call set with the same latency array.
There is no way to remove the histogram at all. We can add 
block-latency-histogram-unset later if needed.





+
+##
  # @BlockInfo:
  #
  # Block device information.  This structure describes a virtual 
device and

@@ -730,6 +787,8 @@
  # @timed_stats: Statistics specific to the set of previously defined
  #   intervals of time (Since 2.5)
  #
+# @latency-histogram: @BlockLatencyHistogramInfo. (Since 2.12)


I'd mention that this only appears if block-latency-histogram-set has 
been called.


yes




+#
  # Since: 0.14.0
  ##
  { 'struct': 'BlockDeviceStats',
@@ -742,7 +801,8 @@
 'failed_flush_operations': 'int', 'invalid_rd_operations': 'int',
 'invalid_wr_operations': 'int', 'invalid_flush_operations': 'int',
 'account_invalid': 'bool', 'account_failed': 'bool',
-   'timed_stats': ['BlockDeviceTimedStats'] } }
+   'timed_stats': ['BlockDeviceTimedStats'],
+   '*latency-histogram': 'BlockLatencyHistogramInfo' } }






--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH] iotests: 205: support luks format

2018-02-06 Thread Eric Blake

On 02/06/2018 11:57 AM, Vladimir Sementsov-Ogievskiy wrote:


+luks_default_secret_id = 'luks_secret_default_iotests_id'

Can we just call this "keysec0", so we match the convention used by
the shell script based tests.


Here I'm trying to avoid intersection with some user-defined id.



You're overthinking it.  We are only using this in the testsuite, and 
nothing else in the testsuite is using 'keysec0' for anything except the 
id of the secret to pass to encrypted disks.  The longer name doesn't 
add any protection.  It might be different if we were trying to provide 
a reusable library for contexts outside the testsuite, but since we are 
not doing that, we can rely on 'make check' failing as evidence if we 
have any collisions in naming choices that need long name munging as a 
workaround.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH] iotests: 205: support luks format

2018-02-06 Thread Vladimir Sementsov-Ogievskiy

06.02.2018 20:29, Daniel P. Berrangé wrote:

On Tue, Feb 06, 2018 at 08:16:42PM +0300, Vladimir Sementsov-Ogievskiy wrote:

Support default luks options in VM.add_drive and in new library
function qemu_img_create. Use it in 205 iotests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

instead of
   [PATCH] iotests: 205: support only raw format

let's just support luks. This patch also makes it simple to support
luks in any other python iotest.

  tests/qemu-iotests/205|  4 ++--
  tests/qemu-iotests/iotests.py | 33 +
  2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 5a10b2d534..4b9a4445cd 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -58,6 +58,13 @@ qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
  socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
  debug = False
  
+luks_default_secret_id = 'luks_secret_default_iotests_id'

Can we just call this "keysec0", so we match the convention used by
the shell script based tests.


Here I'm trying to avoid intersection with some user-defined id.




+luks_default_secret_data = '12345'

The 'check' script exports an environment variable IMGKEYSECRET
that is intended to be used as the default password for LUKS.


agree, missed this.




+luks_default_secret_object = 'secret,id=' + luks_default_secret_id + \
+ ',data=' + luks_default_secret_data
+luks_default_key_secret_opt = 'key-secret=' + luks_default_secret_id


Regards,
Daniel



--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH v2 3/6] qapi: add nbd-server-remove

2018-02-06 Thread Vladimir Sementsov-Ogievskiy

06.02.2018 19:06, Eric Blake wrote:

On 02/06/2018 09:29 AM, Vladimir Sementsov-Ogievskiy wrote:


most of commands, ported to hmp are done in same style: they just call
corresponding qmp command.
Isn't it better to provide common interface for calling qmp 
commands through

HMP monitor, to never
create hmp versions of new commands? they will be available 
automatically.
It would be nice to do that, but they're not that consistent in how 
they

convert parameters and options, but I occasionally wonder if we could
automate more of it.



What about allowing some new syntax in hmp, directly mapped to qmp?

something like

 >>> blockdev-add id disk driver qcow2 cache {writeback true direct 
true} aio native discard unmap file {driver file filename /tmp/somedisk}




Personally, if I'm testing blockdev-add, I'll use QMP directly (or 
even scripts/qmp/qmp-shell or virsh qemu-monitor-command), not an HMP 
wrapper where I have to learn a new syntax of how to write something 
that will convert to QMP.  We already have enough different ways to 
write things that I don't need to learn yet another syntax wrapper.  
Or maybe what I'm saying is that instead of inventing a new syntax, 
that if you DO add an HMP command that forwards to QMP, please reuse 
an existing syntax (whether direct JSON as used by QMP, or the syntax 
used by qmp-shell).


I'm afraid that JSON is too hard to use in the human monitor, and this 
would make the whole feature useless.




If you think writing a new HMP command is worth it, I won't stop you 
from writing it.  But at this point, our current approach of writing a 
manual wrapper per command as we have interest, rather than a generic 
wrap-anything, has worked for the cases that HMP users have cared 
about.  Remember, QMP is the interface that MUST work, while HMP is 
only for convenience, and if it is not trivial to make HMP do 
everything that QMP can do, it is no real loss.




But we create HMP wrappers on demand, and for each case we actually 
invent new syntax. I am just looking for a way to avoid creating new 
HMP wrappers again and again, by introducing new syntax only once.

And there is almost nothing to learn:

command := command-name parameters
parameters = [key value ]...
value = simple-value | array | map
map = '{' parameters '}'
array = '[' [value ]... ']'

Another variant is to use YAML: like JSON, but without having to put all 
keys in quotes.
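
To make the proposal concrete, here is a sketch of how little machinery that
grammar needs (hypothetical code, not part of QEMU; it only illustrates the
syntax above by translating it into a QMP command):

  import json
  import re

  def tokenize(line):
      # Treat braces/brackets as separate tokens, then split on whitespace.
      return re.sub(r'([{}\[\]])', r' \1 ', line).split()

  def parse_value(tokens):
      tok = tokens.pop(0)
      if tok == '{':
          return parse_params(tokens, '}')
      if tok == '[':
          values = []
          while tokens[0] != ']':
              values.append(parse_value(tokens))
          tokens.pop(0)                 # drop ']'
          return values
      if tok in ('true', 'false'):
          return tok == 'true'
      return tok

  def parse_params(tokens, end=None):
      params = {}
      while tokens and tokens[0] != end:
          key = tokens.pop(0)
          params[key] = parse_value(tokens)
      if end is not None:
          tokens.pop(0)                 # drop the closing brace
      return params

  line = ('blockdev-add id disk driver qcow2 cache {writeback true direct true}'
          ' aio native discard unmap file {driver file filename /tmp/somedisk}')
  tokens = tokenize(line)
  cmd = tokens.pop(0)
  print(json.dumps({'execute': cmd, 'arguments': parse_params(tokens)}, indent=2))

Scalars other than true/false stay strings here; a real implementation would
have to consult the QAPI schema to produce properly typed arguments, which is
where most of the real complexity would live.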


On the other hand, implementing a new parser in QEMU is not a trivial 
task (hmm, I don't want to do it =), and it would be simpler to create a 
direct JSON wrapper in the HMP monitor and use some Python wrapper around 
the monitor. But that looks useless, as I can get the same result with a 
wrapper around the QMP monitor. So maybe the most interesting solution 
would be to make some easy-to-use Python-based wrapper, which gives a 
simple way to use both QMP and HMP commands. I'll think about it. 
However, it doesn't solve the initial problem of creating new HMP 
wrappers by hand.



?

Or it may be realized as a separate hmp command "qmp" (looks more 
safe as a first step, however, I think previous variant (direct call) 
is better):


 >>> qmp blockdev-add id disk driver qcow2 cache {writeback true 
direct true} aio native discard unmap file {driver file filename 
/tmp/somedisk}


what do think? This looks simple to implement and should be useful.


Up to you if you want to tackle anything like that, but it would be a 
new thread (a generic way to invoke QMP from HMP is independent of 
nbd-server-remove).





--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH] iotests: 205: support luks format

2018-02-06 Thread Daniel P . Berrangé
On Tue, Feb 06, 2018 at 08:16:42PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Support default luks options in VM.add_drive and in new library
> function qemu_img_create. Use it in 205 iotests.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
> 
> instead of
>   [PATCH] iotests: 205: support only raw format
> 
> let's just support luks. This patch also makes it simple to support
> luks in any other python iotest.
> 
>  tests/qemu-iotests/205|  4 ++--
>  tests/qemu-iotests/iotests.py | 33 +
>  2 files changed, 35 insertions(+), 2 deletions(-)
> 

> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 5a10b2d534..4b9a4445cd 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -58,6 +58,13 @@ qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
>  socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
>  debug = False
>  
> +luks_default_secret_id = 'luks_secret_default_iotests_id'

Can we just call this "keysec0", so we match the convention used by
the shell script based tests.

> +luks_default_secret_data = '12345'

The 'check' script exports an environment variable IMGKEYSECRET
that is intended to be used as the default password for LUKS.

> +luks_default_secret_object = 'secret,id=' + luks_default_secret_id + \
> + ',data=' + luks_default_secret_data
> +luks_default_key_secret_opt = 'key-secret=' + luks_default_secret_id


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



[Qemu-block] [PATCH] iotests: 205: support luks format

2018-02-06 Thread Vladimir Sementsov-Ogievskiy
Support default luks options in VM.add_drive and in new library
function qemu_img_create. Use it in 205 iotests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

instead of
  [PATCH] iotests: 205: support only raw format

let's just support luks. This patch also makes it simple to support
luks in any other python iotest.

 tests/qemu-iotests/205|  4 ++--
 tests/qemu-iotests/iotests.py | 33 +
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/205 b/tests/qemu-iotests/205
index 10388920dc..e7b2eae51d 100644
--- a/tests/qemu-iotests/205
+++ b/tests/qemu-iotests/205
@@ -22,7 +22,7 @@ import os
 import sys
 import iotests
 import time
-from iotests import qemu_img, qemu_io, filter_qemu_io, QemuIoInteractive
+from iotests import qemu_img_create, qemu_io, filter_qemu_io, QemuIoInteractive
 
 nbd_sock = 'nbd_sock'
 nbd_uri = 'nbd+unix:///exp?socket=' + nbd_sock
@@ -31,7 +31,7 @@ disk = os.path.join(iotests.test_dir, 'disk')
 
 class TestNbdServerRemove(iotests.QMPTestCase):
 def setUp(self):
-qemu_img('create', '-f', iotests.imgfmt, disk, '1M')
+qemu_img_create('-f', iotests.imgfmt, disk, '1M')
 
 self.vm = iotests.VM().add_drive(disk)
 self.vm.launch()
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 5a10b2d534..4b9a4445cd 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -58,6 +58,13 @@ qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
 socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
 debug = False
 
+luks_default_secret_id = 'luks_secret_default_iotests_id'
+luks_default_secret_data = '12345'
+luks_default_secret_object = 'secret,id=' + luks_default_secret_id + \
+ ',data=' + luks_default_secret_data
+luks_default_key_secret_opt = 'key-secret=' + luks_default_secret_id
+
+
 def qemu_img(*args):
 '''Run qemu-img and return the exit code'''
 devnull = open('/dev/null', 'r+')
@@ -66,6 +73,25 @@ def qemu_img(*args):
 sys.stderr.write('qemu-img received signal %i: %s\n' % (-exitcode, ' '.join(qemu_img_args + list(args))))
 return exitcode
 
+def qemu_img_create(*args):
+args = list(args)
+
+# default luks support
+if '-f' in args and args[args.index('-f') + 1] == 'luks':
+if '-o' in args:
+i = args.index('-o')
+if 'key-secret' not in args[i + 1]:
+args[i + 1].append(luks_default_key_secret_opt)
+args.insert(i + 2, '--object')
+args.insert(i + 3, luks_default_secret_object)
+else:
+args = ['-o', luks_default_key_secret_opt,
+'--object', luks_default_secret_object] + args
+
+args.insert(0, 'create')
+
+return qemu_img(*args)
+
 def qemu_img_verbose(*args):
 '''Run qemu-img without suppressing its output and return the exit code'''
 exitcode = subprocess.call(qemu_img_args + list(args))
@@ -263,6 +289,13 @@ class VM(qtest.QEMUQtestMachine):
 if opts:
 options.append(opts)
 
+if format == 'luks' and 'key-secret' not in opts:
+# default luks support
+if luks_default_secret_object not in self._args:
+self.add_object(luks_default_secret_object)
+
+options.append(luks_default_key_secret_opt)
+
 self._args.append('-drive')
 self._args.append(','.join(options))
 self._num_drives += 1
-- 
2.11.1




Re: [Qemu-block] [PATCH] block: fix write with zero flag set and iovector provided

2018-02-06 Thread Stefan Hajnoczi
On Thu, Feb 01, 2018 at 05:16:31PM +0300, Anton Nefedov wrote:
> The normal bdrv_co_pwritev() use is either
>   - BDRV_REQ_ZERO_WRITE reset and iovector provided
>   - BDRV_REQ_ZERO_WRITE set and iovector == NULL
> 
> while
>   - the flag reset and iovector == NULL is an assertion failure
> in bdrv_co_do_zero_pwritev()
>   - the flag set and iovector provided is in fact allowed
> (the flag prevails and zeroes are written)
> 
> However the alignment logic does not support the latter case so the padding
> areas get overwritten with zeroes.

Please include a test case.  Berto mentioned that bdrv_pwrite_zeroes()
hits this issue, that might be one way to test it.

> A solution could be to forbid such a case, or to just use bdrv_co_do_zero_pwritev()
> alignment for it, which also makes the code a bit more obvious anyway.
> 
> Signed-off-by: Anton Nefedov 
> ---
>  block/io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 7ea4023..cf63fd0 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1701,7 +1701,7 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
>   */
>  tracked_request_begin(, bs, offset, bytes, BDRV_TRACKED_WRITE);
>  
> -if (!qiov) {
> +if (flags & BDRV_REQ_ZERO_WRITE) {
>  ret = bdrv_co_do_zero_pwritev(child, offset, bytes, flags, );
>  goto out;
>  }

Looks good.




Re: [Qemu-block] [PATCH v2 3/6] qapi: add nbd-server-remove

2018-02-06 Thread Eric Blake

On 02/06/2018 09:29 AM, Vladimir Sementsov-Ogievskiy wrote:


most of commands, ported to hmp are done in same style: they just call
corresponding qmp command.
Isn't it better to provide common interface for calling qmp commands 
through

HMP monitor, to never
create hmp versions of new commands? they will be available 
automatically.

It would be nice to do that, but they're not that consistent in how they
convert parameters and options, but I occasionally wonder if we could
automate more of it.



What about allowing some new syntax in hmp, directly mapped to qmp?

something like

 >>> blockdev-add id disk driver qcow2 cache {writeback true direct 
true} aio native discard unmap file {driver file filename /tmp/somedisk}




Personally, if I'm testing blockdev-add, I'll use QMP directly (or even 
scripts/qmp/qmp-shell or virsh qemu-monitor-command), not an HMP wrapper 
where I have to learn a new syntax of how to write something that will 
convert to QMP.  We already have enough different ways to write things 
that I don't need to learn yet another syntax wrapper.  Or maybe what 
I'm saying is that instead of inventing a new syntax, that if you DO add 
an HMP command that forwards to QMP, please reuse an existing syntax 
(whether direct JSON as used by QMP, or the syntax used by qmp-shell).


If you think writing a new HMP command is worth it, I won't stop you 
from writing it.  But at this point, our current approach of writing a 
manual wrapper per command as we have interest, rather than a generic 
wrap-anything, has worked for the cases that HMP users have cared about. 
 Remember, QMP is the interface that MUST work, while HMP is only for 
convenience, and if it is not trivial to make HMP do everything that QMP 
can do, it is no real loss.



?

Or it may be realized as a separate hmp command "qmp" (looks more safe 
as a first step, however, I think previous variant (direct call) is 
better):


 >>> qmp blockdev-add id disk driver qcow2 cache {writeback true direct 
true} aio native discard unmap file {driver file filename /tmp/somedisk}


what do think? This looks simple to implement and should be useful.


Up to you if you want to tackle anything like that, but it would be a 
new thread (a generic way to invoke QMP from HMP is independent of 
nbd-server-remove).


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH 2/2] qapi: add block latency histogram interface

2018-02-06 Thread Eric Blake

On 02/06/2018 08:07 AM, Vladimir Sementsov-Ogievskiy wrote:

Set (and clear) histogram through new command
block-latency-histogram-set and show new statistics in
query-blockstats results.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  qapi/block-core.json | 62 +++-
  block/qapi.c | 31 ++
  blockdev.c   | 15 +
  3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8225308904..4706a934d9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -451,6 +451,63 @@
 'status': 'DirtyBitmapStatus'} }
  
  ##

+# @BlockLatencyHistogramInfo:
+#
+# Block latency histogram.
+#
+# @latency: list of latency points in microseconds. Equals to @latency parameter
+#   of last called block-latency-histogram-set.


Second sentence might read better as:

Matches the @latency parameter from the last call to 
block-latency-histogram-set for the given device.



+#
+# @read: list of read-request counts corresponding to latency region.
+#len(@read) = len(@latency) + 1
+#@read[0] corresponds to latency region [0, @latency[0])
+#for 0 < i < len(@latency): @read[i] corresponds to latency region
+#[@latency[i-1], @latency[i])
+#@read.last_element corresponds to latency region
+#[@latency.last_element, +inf)
+#
+# @write: list of write-request counts (see @read semantics)
+#
+# @flush: list of flush-request counts (see @read semantics)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockLatencyHistogramInfo',
+  'data': {'latency': ['uint64'],
+   'read': ['uint64'],
+   'write': ['uint64'],
+   'flush': ['uint64'] } }


Okay, I can see how this interface works.
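
A small sketch (plain Python with made-up numbers, not the proposed QEMU
code) of the bucket semantics documented above, i.e. how len(@latency)
points define len(@latency) + 1 regions and which counter a measured
latency falls into:

  import bisect

  latency = [50, 100, 200]            # boundaries in microseconds
  read = [0] * (len(latency) + 1)     # one counter per region

  def account(counters, points, value_us):
      # bisect_right returns how many boundaries are <= value, which is the
      # index of the left-closed region [points[i-1], points[i]) it lands in.
      counters[bisect.bisect_right(points, value_us)] += 1

  for sample_us in (10, 55, 60, 150, 5000):
      account(read, latency, sample_us)

  print(read)   # [1, 2, 1, 1] for regions [0,50) [50,100) [100,200) [200,+inf)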


+
+##
+# @block-latency-histogram-set:
+#
+# Add latency histogram to block device. If latency histogram alredy exists for


s/latency/a latency/ (twice)
s/alredy/already/


+# the device it will be removed and a new one created. Latency histogram may be


s/Latency/The latency/


+# quered through query-blockstats.


s/quered/queried/


+#
+# @device: device name to set latency histogram for.
+#
+# @latency: list of latency points in microseconds. The sequcence must be


s/sequcence/sequence/


+#   ascending, elements must be greater than zero. Histogram latency
+#   regions would be
+#   [0, @latency[0]), ..., [@latency[i], @latency[i+1]), ...,
+#   [@latency.last_element, +inf)
+#
+# Returns: error if device is not found.
+#
+# Since: 2.12
+#
+# Example:
+#
+# -> { "execute": "block-latency-histogram-set",
+#  "arguments": { "device": "drive0",
+# "latency": [50, 100, 200] } }
+# <- { "return": {} }
+##
+{ 'command': 'block-latency-histogram-set',
+  'data': {'device': 'str', 'latency': ['uint64'] } }


The commit message mentioned that you can set and clear histogram 
tracking; how does this interface let you clear things?  By passing a 
0-length latency array?  If you violate the constraint (pass 
non-ascending points, for example), does the previous latency map get 
wiped out or is it still preserved unchanged?



+
+##
  # @BlockInfo:
  #
  # Block device information.  This structure describes a virtual device and
@@ -730,6 +787,8 @@
  # @timed_stats: Statistics specific to the set of previously defined
  #   intervals of time (Since 2.5)
  #
+# @latency-histogram: @BlockLatencyHistogramInfo. (Since 2.12)


I'd mention that this only appears if block-latency-histogram-set has 
been called.



+#
  # Since: 0.14.0
  ##
  { 'struct': 'BlockDeviceStats',
@@ -742,7 +801,8 @@
 'failed_flush_operations': 'int', 'invalid_rd_operations': 'int',
 'invalid_wr_operations': 'int', 'invalid_flush_operations': 'int',
 'account_invalid': 'bool', 'account_failed': 'bool',
-   'timed_stats': ['BlockDeviceTimedStats'] } }
+   'timed_stats': ['BlockDeviceTimedStats'],
+   '*latency-histogram': 'BlockLatencyHistogramInfo' } }
  



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH for-2.9-rc5 v4 2/2] block: Drain BH in bdrv_drained_begin

2018-02-06 Thread Kevin Wolf
Am 18.04.2017 um 16:30 hat Fam Zheng geschrieben:
> During block job completion, nothing is preventing
> block_job_defer_to_main_loop_bh from being called in a nested
> aio_poll(), which is a trouble, such as in this code path:
> 
> qmp_block_commit
>   commit_active_start
> bdrv_reopen
>   bdrv_reopen_multiple
> bdrv_reopen_prepare
>   bdrv_flush
> aio_poll
>   aio_bh_poll
> aio_bh_call
>   block_job_defer_to_main_loop_bh
> stream_complete
>   bdrv_reopen
> 
> block_job_defer_to_main_loop_bh is the last step of the stream job,
> which should have been "paused" by the bdrv_drained_begin/end in
> bdrv_reopen_multiple, but it is not done because it's in the form of a
> main loop BH.
> 
> Similar to why block jobs should be paused between drained_begin and
> drained_end, BHs they schedule must be excluded as well.  To achieve
> this, this patch forces draining the BH in BDRV_POLL_WHILE.
> 
> As a side effect this fixes a hang in block_job_detach_aio_context
> during system_reset when a block job is ready:
> 
> #0  0x55aa79f3 in bdrv_drain_recurse
> #1  0x55aa825d in bdrv_drained_begin
> #2  0x55aa8449 in bdrv_drain
> #3  0x55a9c356 in blk_drain
> #4  0x55aa3cfd in mirror_drain
> #5  0x55a66e11 in block_job_detach_aio_context
> #6  0x55a62f4d in bdrv_detach_aio_context
> #7  0x55a63116 in bdrv_set_aio_context
> #8  0x55a9d326 in blk_set_aio_context
> #9  0x557e38da in virtio_blk_data_plane_stop
> #10 0x559f9d5f in virtio_bus_stop_ioeventfd
> #11 0x559fa49b in virtio_bus_stop_ioeventfd
> #12 0x559f6a18 in virtio_pci_stop_ioeventfd
> #13 0x559f6a18 in virtio_pci_reset
> #14 0x559139a9 in qdev_reset_one
> #15 0x55916738 in qbus_walk_children
> #16 0x55913318 in qdev_walk_children
> #17 0x55916738 in qbus_walk_children
> #18 0x559168ca in qemu_devices_reset
> #19 0x5581fcbb in pc_machine_reset
> #20 0x558a4d96 in qemu_system_reset
> #21 0x5577157a in main_loop_should_exit
> #22 0x5577157a in main_loop
> #23 0x5577157a in main
> 
> The rationale is that the loop in block_job_detach_aio_context cannot
> make any progress in pausing/completing the job, because bs->in_flight
> is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop
> BH. With this patch, it does.
> 
> Reported-by: Jeff Cody 
> Signed-off-by: Fam Zheng 

Fam, do you remember whether this was really only about drain? Because
in that case...

> diff --git a/include/block/block.h b/include/block/block.h
> index 97d4330..5ddc0cf 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -381,12 +381,13 @@ void bdrv_drain_all(void);
>  
>  #define BDRV_POLL_WHILE(bs, cond) ({   \
>  bool waited_ = false;  \
> +bool busy_ = true; \
>  BlockDriverState *bs_ = (bs);  \
>  AioContext *ctx_ = bdrv_get_aio_context(bs_);  \
>  if (aio_context_in_iothread(ctx_)) {   \
> -while ((cond)) {   \
> -aio_poll(ctx_, true);  \
> -waited_ = true;\
> +while ((cond) || busy_) {  \
> +busy_ = aio_poll(ctx_, (cond));\
> +waited_ |= !!(cond) | busy_;   \
>  }  \
>  } else {   \
>  assert(qemu_get_current_aio_context() ==   \
> @@ -398,11 +399,16 @@ void bdrv_drain_all(void);
>   */\
>  assert(!bs_->wakeup);  \
>  bs_->wakeup = true;\
> -while ((cond)) {   \
> -aio_context_release(ctx_); \
> -aio_poll(qemu_get_aio_context(), true);\
> -aio_context_acquire(ctx_); \
> -waited_ = true;\
> +while (busy_) {\
> +if ((cond)) {  \
> +waited_ = busy_ = true;\
> +aio_context_release(ctx_); \
> +aio_poll(qemu_get_aio_context(), true);\
> +aio_context_acquire(ctx_); \
> +} else { 

Re: [Qemu-block] [PATCH v2 3/6] qapi: add nbd-server-remove

2018-02-06 Thread Vladimir Sementsov-Ogievskiy

26.01.2018 18:05, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

17.01.2018 19:03, Eric Blake wrote:

On 01/17/2018 09:51 AM, Vladimir Sementsov-Ogievskiy wrote:


I have a script (for managing libvirt guest, but it can be adopted for
qemu or even used for qemu monitor), which allows
me run qmp commands on vms as easy as:

|qmp VMNAME query-block-jobs or qmp VMNAME nbd-server-remove name exp1
mode hard or even |

|qmp VMNAME blockdev-add id disk driver qcow2 cache {writeback true
direct true} aio native discard unmap file {driver file filename
/tmp/somedisk} |||

Yeah, there are various scripting solutions around QMP that can make it
easier; but HMP is often still an easy front-line interface for
experiments.


isn't it because these solutions are not available directly in monitor,
when HMP is?

QMP can be directly accessed in a monitor; it just requires more typing.
   If you are developing QMP commands, it may be easier to use
./scripts/qmp/qmp-shell (couple it with a readline wrapper, and you can
even get tab-completion and history across sessions).  There's also
things like libvirt's 'virsh qmp-monitor-command' for shell-scripting
access to arbitrary QMP commands, provided your guest is run by libvirt.


may be, we need third type of monitor HQMP which is QMP with simplified
syntax? Or
allow qmp commands in simplified syntax directly in HMP?

No, I don't think we need either thing.  Wrappers around existing
monitors is better than bloating qemu proper with a third flavor of
monitor.  And HMP is for humans, with no restrictions on back-compat
changes, so if it doesn't do something you want for quick-and-dirty
testing, you can either add a new HMP command, or just use QMP (or one
of its wrappers, like qmp-shell) in the first place.  Ultimately, our
long-term concern is only about the QMP interface; HMP is supposed to be
convenient.  So if it starts costing too much time to port a QMP
interface to HMP, then don't worry about it.


most of commands, ported to hmp are done in same style: they just call
corresponding qmp command.
Isn't it better to provide common interface for calling qmp commands through
HMP monitor, to never
create hmp versions of new commands? they will be available automatically.

It would be nice to do that, but they're not that consistent in how they
convert parameters and options, but I occasionally wonder if we could
automate more of it.



What about allowing some new syntax in hmp, directly mapped to qmp?

something like

>>> blockdev-add id disk driver qcow2 cache {writeback true direct 
true} aio native discard unmap file {driver file filename /tmp/somedisk}


?

Or it may be realized as a separate hmp command "qmp" (looks more safe 
as a first step, however, I think previous variant (direct call) is better):


>>> qmp blockdev-add id disk driver qcow2 cache {writeback true direct 
true} aio native discard unmap file {driver file filename /tmp/somedisk}


what do think? This looks simple to implement and should be useful.



Dave


--
Best regards,
Vladimir


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



--
Best regards,
Vladimir




Re: [Qemu-block] [Qemu-devel] [PATCH] block: include original filename when reporting invalid URIs

2018-02-06 Thread Eric Blake

On 02/06/2018 04:52 AM, Daniel P. Berrangé wrote:

Consider passing a JSON based block driver to "qemu-img commit"

$ qemu-img commit 'json:{"driver":"qcow2","file":{"driver":"gluster",\
   "volume":"gv0","path":"sn1.qcow2",
   "server":[{"type":\
  "tcp","host":"10.73.199.197","port":"24007"}]},}'

Currently it will commit the content and then report an incredibly
useless error message when trying to re-open the committed image:

   qemu-img: invalid URI
   Usage: 
file=gluster[+transport]://[host[:port]]volume/path[?socket=...][,file.debug=N][,file.logfile=/path/filename.log]

With this fix we get:

   qemu-img: invalid URI json:{"server.0.host": "10.73.199.197",
   "driver": "gluster", "path": "luks.qcow2", "server.0.type":
   "tcp", "server.0.port": "24007", "volume": "gv0"}

Of course the root cause problem still exists, but now we know
what actually needs fixing.

Signed-off-by: Daniel P. Berrangé 
---
  block/gluster.c  | 2 +-
  block/sheepdog.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [Qemu-devel] [PATCH v9 03/13] block/dirty-bitmap: add _locked version of bdrv_reclaim_dirty_bitmap

2018-02-06 Thread Vladimir Sementsov-Ogievskiy

24.01.2018 13:16, Paolo Bonzini wrote:

On 22/01/2018 13:14, Vladimir Sementsov-Ogievskiy wrote:

So, accessing the bitmap needs the mutex lock?

Then what do you mean by accessing the bitmap? Any access to
BdrvDirtyBitmap fields? In that case "reading the list" will require the
bitmap mutex too.
Or is accessing the bitmap accessing any field except
BdrvDirtyBitmap.list? Then what do you mean in (2)? For example,
query-block will go through the list, but it touches other fields too,
so it should lock the mutex.

The bitmap mutex is internal to block/dirty-bitmap.c.


Yes, and query-block actually calls bdrv_query_dirty_bitmaps(), which locks
the mutex.
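
In other words, the locking stays inside the accessors in
block/dirty-bitmap.c; here is a minimal, self-contained sketch of that
pattern (toy types and a plain pthread mutex, purely illustrative, not code
from the tree):

#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-in for a dirty bitmap: the mutex is a private detail of the
 * accessors, so callers such as a query path never lock it themselves. */
typedef struct ToyBitmap {
    pthread_mutex_t lock;       /* plays the role of the bitmap mutex */
    uint64_t dirty_count;
} ToyBitmap;

static void toy_set_dirty(ToyBitmap *bm, uint64_t n)
{
    pthread_mutex_lock(&bm->lock);      /* taken inside the accessor */
    bm->dirty_count += n;
    pthread_mutex_unlock(&bm->lock);
}

static uint64_t toy_query(ToyBitmap *bm)
{
    uint64_t v;

    pthread_mutex_lock(&bm->lock);      /* query paths use the same mutex */
    v = bm->dirty_count;
    pthread_mutex_unlock(&bm->lock);
    return v;
}

int main(void)
{
    ToyBitmap bm = { PTHREAD_MUTEX_INITIALIZER, 0 };

    toy_set_dirty(&bm, 42);
    printf("dirty: %" PRIu64 "\n", toy_query(&bm));
    return 0;
}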





And one more question:

What about QMP transactions? Should we lock the mutex during the whole
transaction?

Transactions hold the BQL, but you don't need to lock the bitmap mutex.


I don't understand. You said "Accessing a bitmap only requires
dirty_bitmap_mutex". So, if we have several operations on one bitmap,
will each of them lock/unlock the mutex by itself? Then we'll have unlocked
"holes" in our transaction. Or does this not matter?




Paolo



--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH v8 21/26] block: Purify .bdrv_refresh_filename()

2018-02-06 Thread Alberto Garcia
On Mon 05 Feb 2018 04:18:30 PM CET, Max Reitz wrote:
> Currently, BlockDriver.bdrv_refresh_filename() is supposed to both
> refresh the filename (BDS.exact_filename) and set BDS.full_open_options.
> Now that we have generic code in the central bdrv_refresh_filename() for
> creating BDS.full_open_options, we can drop the latter part from all
> BlockDriver.bdrv_refresh_filename() implementations.
>
> This also means that we can drop all of the existing default code for
> this from the global bdrv_refresh_filename() itself.
>
> Furthermore, we now have to call BlockDriver.bdrv_refresh_filename()
> after having set BDS.full_open_options, because the block driver's
> implementation should now be allowed to depend on BDS.full_open_options
> being set correctly.
>
> Finally, with this patch we can drop the @options parameter from
> BlockDriver.bdrv_refresh_filename(); also, add a comment on this
> function's purpose in block/block_int.h while touching its interface.
>
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



[Qemu-block] [PATCH 1/2] block/accounting: introduce latency histogram

2018-02-06 Thread Vladimir Sementsov-Ogievskiy
Introduce latency histogram statistics for block devices.
For each accounted operation type, the latency region [0, +inf) is
divided into subregions by several points. Then, the number of hits
in each subregion is counted.
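
As a rough, self-contained illustration of that bucketing (made-up boundary
points, not part of the patch; the patch itself uses bsearch() over the
points array):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Boundary points in microseconds; hypothetical values for the example. */
static const uint64_t points[] = { 10, 50, 100 };
#define NPOINTS  (sizeof(points) / sizeof(points[0]))
#define NREGIONS (NPOINTS + 1)

static uint64_t histogram[NREGIONS];

/* Map one latency sample onto [0,10) [10,50) [50,100) [100,+inf) and count it. */
static void account(uint64_t latency)
{
    size_t i = 0;

    /* Linear scan for clarity; same result as a binary search over points[]. */
    while (i < NPOINTS && latency >= points[i]) {
        i++;
    }
    histogram[i]++;
}

int main(void)
{
    static const uint64_t samples[] = { 3, 12, 75, 80, 500 };
    size_t i;

    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        account(samples[i]);
    }
    /* Prints 1, 1, 2, 1: one sample below 10us, one in [10,50),
     * two in [50,100), one at or above 100us. */
    for (i = 0; i < NREGIONS; i++) {
        printf("region %zu: %" PRIu64 " hits\n", i, histogram[i]);
    }
    return 0;
}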

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/accounting.h |  8 +
 block/accounting.c | 83 ++
 2 files changed, 91 insertions(+)

diff --git a/include/block/accounting.h b/include/block/accounting.h
index b833d26d6c..7fbfc86c43 100644
--- a/include/block/accounting.h
+++ b/include/block/accounting.h
@@ -45,6 +45,12 @@ struct BlockAcctTimedStats {
 QSLIST_ENTRY(BlockAcctTimedStats) entries;
 };
 
+typedef struct BlockLatencyHistogram {
+int size;
+uint64_t *points; /* @size-1 points here (all points, except 0 and +inf) */
+uint64_t *histogram[BLOCK_MAX_IOTYPE]; /* @size elements for each type */
+} BlockLatencyHistogram;
+
 struct BlockAcctStats {
 QemuMutex lock;
 uint64_t nr_bytes[BLOCK_MAX_IOTYPE];
@@ -57,6 +63,7 @@ struct BlockAcctStats {
 QSLIST_HEAD(, BlockAcctTimedStats) intervals;
 bool account_invalid;
 bool account_failed;
+BlockLatencyHistogram latency_histogram;
 };
 
 typedef struct BlockAcctCookie {
@@ -82,5 +89,6 @@ void block_acct_merge_done(BlockAcctStats *stats, enum 
BlockAcctType type,
 int64_t block_acct_idle_time_ns(BlockAcctStats *stats);
 double block_acct_queue_depth(BlockAcctTimedStats *stats,
   enum BlockAcctType type);
+int block_latency_histogram_set(BlockAcctStats *stats, uint64List *latency);
 
 #endif
diff --git a/block/accounting.c b/block/accounting.c
index 87ef5bbfaa..a34ef09015 100644
--- a/block/accounting.c
+++ b/block/accounting.c
@@ -94,6 +94,86 @@ void block_acct_start(BlockAcctStats *stats, BlockAcctCookie 
*cookie,
 cookie->type = type;
 }
 
+/* block_latency_histogram_compare_func
+ * Compare @key with interval [@el, @el+1), where @el+1 is the next array element
+ * after @el.
+ * Return: -1 if @key < @el
+ *  0 if @key in [@el, @el+1)
+ * +1 if @key >= @el+1
+ */
+static int block_latency_histogram_compare_func(const void *key, const void 
*el)
+{
+uint64_t k = *(uint64_t *)key;
+uint64_t a = *(uint64_t *)el;
+uint64_t b = *((uint64_t *)el + 1);
+
+return k < a ? -1 : (k < b ? 0 : 1);
+}
+
+static void block_latency_histogram_account(BlockLatencyHistogram *hist,
+enum BlockAcctType type,
+int64_t latency_ns)
+{
+uint64_t *data, *pos;
+
+if (hist->points == NULL) {
+/* histogram disabled */
+return;
+}
+
+data = hist->histogram[type];
+
+if (latency_ns < hist->points[0]) {
+data[0]++;
+return;
+}
+
+if (latency_ns >= hist->points[hist->size - 2]) {
+data[hist->size - 1]++;
+return;
+}
+
+pos = bsearch(&latency_ns, hist->points, hist->size - 2,
+  sizeof(hist->points[0]),
+  block_latency_histogram_compare_func);
+assert(pos != NULL);
+
+data[pos - hist->points + 1]++;
+}
+
+int block_latency_histogram_set(BlockAcctStats *stats, uint64List *latency)
+{
+BlockLatencyHistogram *hist = &stats->latency_histogram;
+uint64List *entry;
+uint64_t *ptr;
+int i;
+uint64_t prev = 0;
+
+hist->size = 1;
+
+for (entry = latency; entry; entry = entry->next) {
+if (entry->value <= prev) {
+return -EINVAL;
+}
+hist->size++;
+prev = entry->value;
+}
+
+hist->points = g_renew(uint64_t, hist->points, hist->size - 1);
+for (entry = latency, ptr = hist->points; entry;
+ entry = entry->next, ptr++)
+{
+*ptr = entry->value;
+}
+
+for (i = 0; i < BLOCK_MAX_IOTYPE; i++) {
+hist->histogram[i] = g_renew(uint64_t, hist->histogram[i], hist->size);
+memset(hist->histogram[i], 0, hist->size * sizeof(uint64_t));
+}
+
+return 0;
+}
+
 static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie 
*cookie,
  bool failed)
 {
@@ -116,6 +196,9 @@ static void block_account_one_io(BlockAcctStats *stats, 
BlockAcctCookie *cookie,
 stats->nr_ops[cookie->type]++;
 }
 
+block_latency_histogram_account(&stats->latency_histogram, cookie->type,
+latency_ns);
+
 if (!failed || stats->account_failed) {
 stats->total_time_ns[cookie->type] += latency_ns;
 stats->last_access_time_ns = time_ns;
-- 
2.11.1




Re: [Qemu-block] [PATCH v8 25/26] block/curl: Implement bdrv_refresh_filename()

2018-02-06 Thread Alberto Garcia
On Mon 05 Feb 2018 04:18:34 PM CET, Max Reitz wrote:
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



[Qemu-block] [PATCH 2/2] qapi: add block latency histogram interface

2018-02-06 Thread Vladimir Sementsov-Ogievskiy
Set (and clear) histogram through new command
block-latency-histogram-set and show new statistics in
query-blockstats results.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block-core.json | 62 +++-
 block/qapi.c | 31 ++
 blockdev.c   | 15 +
 3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8225308904..4706a934d9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -451,6 +451,63 @@
'status': 'DirtyBitmapStatus'} }
 
 ##
+# @BlockLatencyHistogramInfo:
+#
+# Block latency histogram.
+#
+# @latency: list of latency points in microseconds. Equals to @latency 
parameter
+#   of last called block-latency-histogram-set.
+#
+# @read: list of read-request counts corresponding to latency region.
+#len(@read) = len(@latency) + 1
+#@read[0] corresponds to latency region [0, @latency[0])
+#for 0 < i < len(@latency): @read[i] corresponds to latency region
+#[@latency[i-1], @latency[i])
+#@read.last_element corresponds to latency region
+#[@latency.last_element, +inf)
+#
+# @write: list of write-request counts (see @read semantics)
+#
+# @flush: list of flush-request counts (see @read semantics)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockLatencyHistogramInfo',
+  'data': {'latency': ['uint64'],
+   'read': ['uint64'],
+   'write': ['uint64'],
+   'flush': ['uint64'] } }
+
+##
+# @block-latency-histogram-set:
+#
+# Add latency histogram to block device. If latency histogram already exists for
+# the device it will be removed and a new one created. Latency histogram may be
+# queried through query-blockstats.
+#
+# @device: device name to set latency histogram for.
+#
+# @latency: list of latency points in microseconds. The sequence must be
+#   ascending, elements must be greater than zero. Histogram latency
+#   regions would be
+#   [0, @latency[0]), ..., [@latency[i], @latency[i+1]), ...,
+#   [@latency.last_element, +inf)
+#
+# Returns: error if device is not found.
+#
+# Since: 2.12
+#
+# Example:
+#
+# -> { "execute": "block-latency-histogram-set",
+#  "arguments": { "device": "drive0",
+# "latency": [50, 100, 200] } }
+# <- { "return": {} }
+##
+{ 'command': 'block-latency-histogram-set',
+  'data': {'device': 'str', 'latency': ['uint64'] } }
+
+##
 # @BlockInfo:
 #
 # Block device information.  This structure describes a virtual device and
@@ -730,6 +787,8 @@
 # @timed_stats: Statistics specific to the set of previously defined
 #   intervals of time (Since 2.5)
 #
+# @latency-histogram: @BlockLatencyHistogramInfo. (Since 2.12)
+#
 # Since: 0.14.0
 ##
 { 'struct': 'BlockDeviceStats',
@@ -742,7 +801,8 @@
'failed_flush_operations': 'int', 'invalid_rd_operations': 'int',
'invalid_wr_operations': 'int', 'invalid_flush_operations': 'int',
'account_invalid': 'bool', 'account_failed': 'bool',
-   'timed_stats': ['BlockDeviceTimedStats'] } }
+   'timed_stats': ['BlockDeviceTimedStats'],
+   '*latency-histogram': 'BlockLatencyHistogramInfo' } }
 
 ##
 # @BlockStats:
diff --git a/block/qapi.c b/block/qapi.c
index fc10f0a565..715ed17a6b 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -389,6 +389,24 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo 
**p_info,
 qapi_free_BlockInfo(info);
 }
 
+static uint64List *uint64_list(uint64_t *list, int size)
+{
+int i;
+uint64List *out_list = NULL;
+uint64List **pout_list = &out_list;
+
+for (i = 0; i < size; i++) {
+uint64List *entry = g_new(uint64List, 1);
+entry->value = list[i];
+*pout_list = entry;
+pout_list = &entry->next;
+}
+
+*pout_list = NULL;
+
+return out_list;
+}
+
 static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
 {
 BlockAcctStats *stats = blk_get_stats(blk);
@@ -454,6 +472,19 @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, 
BlockBackend *blk)
 dev_stats->avg_wr_queue_depth =
 block_acct_queue_depth(ts, BLOCK_ACCT_WRITE);
 }
+
+ds->has_latency_histogram = stats->latency_histogram.points != NULL;
+if (ds->has_latency_histogram) {
+BlockLatencyHistogramInfo *info = g_new0(BlockLatencyHistogramInfo, 1);
+BlockLatencyHistogram *h = &stats->latency_histogram;
+
+ds->latency_histogram = info;
+
+info->latency = uint64_list(h->points, h->size - 1);
+info->read = uint64_list(h->histogram[BLOCK_ACCT_READ], h->size);
+info->write = uint64_list(h->histogram[BLOCK_ACCT_WRITE], h->size);
+info->flush = uint64_list(h->histogram[BLOCK_ACCT_FLUSH], h->size);
+}
 }
 
 static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
diff --git a/blockdev.c 

[Qemu-block] [PATCH 0/2] block latency histogram

2018-02-06 Thread Vladimir Sementsov-Ogievskiy
Vladimir Sementsov-Ogievskiy (2):
  block/accounting: introduce latency histogram
  qapi: add block latency histogram interface

 qapi/block-core.json   | 62 +-
 include/block/accounting.h |  8 +
 block/accounting.c | 83 ++
 block/qapi.c   | 31 +
 blockdev.c | 15 +
 5 files changed, 198 insertions(+), 1 deletion(-)

-- 
2.11.1




Re: [Qemu-block] [PATCH v8 24/26] block/curl: Harmonize option defaults

2018-02-06 Thread Alberto Garcia
On Mon 05 Feb 2018 04:18:33 PM CET, Max Reitz wrote:
> Both of the defaults we currently have in the curl driver are named
> based on a slightly different schema, let's unify that and call both
> CURL_BLOCK_OPT_${NAME}_DEFAULT.
>
> While at it, we can add a macro for the third option for which a default
> exists, namely "sslverify".
>
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v8 20/26] block: Generically refresh runtime options

2018-02-06 Thread Alberto Garcia
On Mon 05 Feb 2018 04:18:29 PM CET, Max Reitz wrote:
> Instead of having every block driver which implements
> bdrv_refresh_filename() copy all of the significant runtime options over
> to bs->full_open_options, implement this process generically in
> bdrv_refresh_filename().
>
> This patch only adds this new generic implementation, it does not remove
> the old functionality. This is done in a follow-up patch.
>
> With this patch, some superfluous information (that should never have
> been there) may be removed from some JSON filenames, as can be seen in
> the change to iotest 110's reference output.  In case of 191, backing
> nodes that have not been overridden are now removed from the filename.
>
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v8 05/26] block: Respect backing bs in bdrv_refresh_filename

2018-02-06 Thread Alberto Garcia
On Mon 05 Feb 2018 04:18:14 PM CET, Max Reitz wrote:
> Basically, bdrv_refresh_filename() should respect all children of a
> BlockDriverState. However, generally those children are driver-specific,
> so this function cannot handle the general case. On the other hand,
> there are only few drivers which use other children than @file and
> @backing (that being vmdk, quorum, and blkverify).
>
> Most block drivers only use @file and/or @backing (if they use any
> children at all). Both can be implemented directly in
> bdrv_refresh_filename.
>
> The user overriding the file's filename is already handled, however, the
> user overriding the backing file is not. If this is done, opening the
> BDS with the plain filename of its file will not be correct, so we may
> not set bs->exact_filename in that case.
>
> iotests 051 and 191 contain test cases for overwriting the backing file,
> and so their output changes with this patch applied (which I consider a
> good thing). Note that in the case of 191, the implicitly opened
> (non-overridden) base file is included in the json:{} filename as well
> -- this will be remedied by a later patch.
>
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v8 04/26] iotests: Drop explicit base blockdev in 191

2018-02-06 Thread Alberto Garcia
On Mon 05 Feb 2018 04:18:13 PM CET, Max Reitz wrote:
> Overriding the backing image should result in a json:{} pseudo-filename.
> Then, you can no longer use the commit block job with filename
> parameters.  Therefore, do not explicitly add the base and override the
> middle image in iotest 191, since we do not need to anyway.  This will
> allow us to continue to use the middle image's filename to identify it.
>
> In the long run, we want block-commit to accept node names for base and
> top (just like block-stream does).
>
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v2 1/8] qapi: group BlockDeviceStats fields

2018-02-06 Thread Alberto Garcia
On Fri 19 Jan 2018 01:50:00 PM CET, Anton Nefedov wrote:
> Make the stat fields definition slightly more readable.
> Cosmetic change only.
>
> Signed-off-by: Anton Nefedov 
> ---
>  qapi/block-core.json | 24 ++--
>  1 file changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index e94a688..2e0665e 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -733,14 +733,26 @@
>  # Since: 0.14.0
>  ##
>  { 'struct': 'BlockDeviceStats',
> -  'data': {'rd_bytes': 'int', 'wr_bytes': 'int', 'rd_operations': 'int',
> -   'wr_operations': 'int', 'flush_operations': 'int',
> +  'data': {'rd_bytes': 'int', 'wr_bytes': 'int',
> +
> +   'rd_operations': 'int', 'wr_operations': 'int',
> +   'flush_operations': 'int',
> +
> 'flush_total_time_ns': 'int', 'wr_total_time_ns': 'int',
> -   'rd_total_time_ns': 'int', 'wr_highest_offset': 'int',
> -   'rd_merged': 'int', 'wr_merged': 'int', '*idle_time_ns': 'int',
> +   'rd_total_time_ns': 'int',

It would be nice to reorder these following the read-write-flush order.
This way these would be rd_total_time_ns, wr_total_time_ns and
flush_total_time_ns.

The order of the fields in the documentation would also need to be
changed.

Berto



Re: [Qemu-block] Block Migration and CPU throttling

2018-02-06 Thread Peter Lieven

On 12.12.2017 at 18:05, Dr. David Alan Gilbert wrote:

* Peter Lieven (p...@kamp.de) wrote:

On 21.09.2017 at 14:36, Dr. David Alan Gilbert wrote:

* Peter Lieven (p...@kamp.de) wrote:

On 19.09.2017 at 16:41, Dr. David Alan Gilbert wrote:

* Peter Lieven (p...@kamp.de) wrote:

On 19.09.2017 at 16:38, Dr. David Alan Gilbert wrote:

* Peter Lieven (p...@kamp.de) wrote:

Hi,

I just noticed that CPU throttling and block migration don't work together
very well. During block migration the throttling heuristic detects that we
make no apparent progress in RAM transfer. But the reason is the running
block migration, not a too-high dirty-page rate.

The result is that any VM is throttled by 99% during block migration.

Hmm that's unfortunate; do you have a bandwidth set lower than your
actual network connection? I'm just wondering if it's actually going
between the block and RAM iterative sections or getting stuck in one.

It happens also if source and dest are on the same machine and speed is set to 
100G.

But does it happen if they're not and the speed is set low?

Yes, it does. I noticed it in our test environment between different nodes
with a 10G link in between. But it's totally clear why it happens. During
block migration we transfer all dirty memory pages in each round (if there
is moderate memory load), but all dirty pages are obviously more than 50% of
the transferred RAM in that round; in fact, it is closer to 100%. And the
current logic triggers on this condition.

I think I will go forward and send a patch which disables auto converge during
block migration bulk stage.

Yes, that's fair;  it probably would also make sense to throttle the RAM
migration during the block migration bulk stage, since the chances are
it's not going to get far.  (I think in the nbd setup, the main
migration process isn't started until the end of bulk).

Catching up with the idea of delaying ram migration until block bulk has 
completed.
What do you think is the easiest way to achieve this?



I think the answer depends whether we think this is a 'special' or we
need a new general purpose mechanism.

If it was really general then we'd probably want to split the iterative
stage in two somehow, and only do RAM in the second half.

But I'm not sure it's worth it; I suspect the easiest way is:

a) Add a counter in migration/ram.c or in the RAM state somewhere
b) Make ram_save_inhibit increment the counter
c) Check the counter at the head of ram_save_iterate and just exit
  if it's non-zero
d) Call ram_save_inhibit from block_save_setup
e) Then release it when you've finished the bulk stage

Make sure you still count the RAM in the pending totals, otherwise
migration might think it's finished a bit early.
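
Roughly, the counter part of (a)-(e) could look like the following; this is
a self-contained toy sketch with hypothetical names, not code from the tree:

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int ram_save_inhibit_count;           /* (a) counter in the RAM state */

static void ram_save_inhibit(void)           /* (b)/(d) from block_save_setup() */
{
    ram_save_inhibit_count++;
}

static void ram_save_release(void)           /* (e) released after the bulk stage */
{
    assert(ram_save_inhibit_count > 0);
    ram_save_inhibit_count--;
}

static bool ram_save_iterate_would_send(void)
{
    /* (c) the head of ram_save_iterate() bails out while inhibited; pending
     * RAM must still be counted so migration doesn't think it finished early. */
    return ram_save_inhibit_count == 0;
}

int main(void)
{
    ram_save_inhibit();
    printf("during block bulk stage: send RAM? %d\n", ram_save_iterate_would_send());
    ram_save_release();
    printf("after block bulk stage:  send RAM? %d\n", ram_save_iterate_would_send());
    return 0;
}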


Is there any culprit I don't see or is it as easy as this?

diff --git a/migration/ram.c b/migration/ram.c
index cb1950f..c67bcf1 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2255,6 +2255,13 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 int64_t t0;
 int done = 0;

+    if (blk_mig_bulk_active()) {
+    /* Avoid transferring RAM during bulk phase of block migration as
+ * the bulk phase will usually take a lot of time and transferring
+ * RAM updates again and again is pointless. */
+    goto out;
+    }
+
 rcu_read_lock();
 if (ram_list.version != rs->last_version) {
 ram_state_reset(rs);
@@ -2301,6 +2308,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
  */
 ram_control_after_iterate(f, RAM_CONTROL_ROUND);

+out:
 qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
 ram_counters.transferred += 8;


Peter




[Qemu-block] [PATCH] block: include original filename when reporting invalid URIs

2018-02-06 Thread Daniel P . Berrangé
Consider passing a JSON based block driver to "qemu-img commit"

$ qemu-img commit 'json:{"driver":"qcow2","file":{"driver":"gluster",\
  "volume":"gv0","path":"sn1.qcow2",
  "server":[{"type":\
  "tcp","host":"10.73.199.197","port":"24007"}]},}'

Currently it will commit the content and then report an incredibly
useless error message when trying to re-open the committed image:

  qemu-img: invalid URI
  Usage: 
file=gluster[+transport]://[host[:port]]volume/path[?socket=...][,file.debug=N][,file.logfile=/path/filename.log]

With this fix we get:

  qemu-img: invalid URI json:{"server.0.host": "10.73.199.197",
  "driver": "gluster", "path": "luks.qcow2", "server.0.type":
  "tcp", "server.0.port": "24007", "volume": "gv0"}

Of course the root cause problem still exists, but now we know
what actually needs fixing.

Signed-off-by: Daniel P. Berrangé 
---
 block/gluster.c  | 2 +-
 block/sheepdog.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 0f4265a3a4..0215e19087 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -660,7 +660,7 @@ static struct glfs 
*qemu_gluster_init(BlockdevOptionsGluster *gconf,
 if (filename) {
 ret = qemu_gluster_parse_uri(gconf, filename);
 if (ret < 0) {
-error_setg(errp, "invalid URI");
+error_setg(errp, "invalid URI %s", filename);
 error_append_hint(errp, "Usage: file=gluster[+transport]://"
 "[host[:port]]volume/path[?socket=...]"
 "[,file.debug=N]"
diff --git a/block/sheepdog.c b/block/sheepdog.c
index f684477328..c847ab6c98 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -1050,7 +1050,7 @@ static void sd_parse_uri(SheepdogConfig *cfg, const char 
*filename,
 
 cfg->uri = uri = uri_parse(filename);
 if (!uri) {
-error_setg(&err, "invalid URI");
+error_setg(&err, "invalid URI '%s'", filename);
 goto out;
 }
 
-- 
2.14.3