date:20150917

Re: [Qemu-devel] [PATCH v8 02/26] qapi: New QAPISchema intermediate reperesentation

2015-09-17 Thread Markus Armbruster

Eric Blake  writes:

> On 09/16/2015 05:06 AM, Markus Armbruster wrote:
>> The QAPI code generators work with a syntax tree (nested dictionaries)
>> plus a few symbol tables (also dictionaries) on the side.
>
>> Nothing uses the new intermediate representation just yet, thus no
>> change to generated files.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>
>
>> +class QAPISchemaEnumType(QAPISchemaType):
>> +def __init__(self, name, info, values, prefix):
>> +QAPISchemaType.__init__(self, name, info)
>> +for v in values:
>> +assert isinstance(v, str)
>> +self.values = values
>> +self.prefix = prefix
>
> Missing:
> assert prefix is None or isinstance(prefix, str)

Yes.

> Should this be self._prefix, since no clients ever directly use the
> member field (they just use the string passed through the visitor)?

Yes.

If nothing else comes up in review, I'll touch this up on commit.

> Reviewed-by: Eric Blake 

Thanks!

Re: [Qemu-devel] [PATCH v8 05/26] tests/qapi-schema: Convert test harness to QAPISchemaVisitor

2015-09-17 Thread Markus Armbruster

Eric Blake  writes:

> On 09/16/2015 05:06 AM, Markus Armbruster wrote:
>> The old code prints the result of parsing (list of expression
>> dictionaries), and partial results of semantic analysis (list of enum
>> dictionaries, list of struct dictionaries).
>> 
>> The new code prints a trace of a schema visit, i.e. what the back-ends
>> are going to use.  Built-in and array types are omitted, because
>> they're boring.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>
>> +class QAPISchemaTestVisitor(QAPISchemaVisitor):
>> +def visit_enum_type(self, name, info, values, prefix):
>> +print 'enum %s %s prefix=%s' % (name, values, prefix)
>> +
>> +def visit_object_type(self, name, info, base, members, variants):
>> +print 'object %s' % name
>> +if base:
>> +print 'base %s' % base.name
>
> For consistency with other visitors, I might have done:
>
> if prefix:
> print 'prefix %s' % prefix
>
> instead of printing prefix=None for most enums.  In particular,
>
>> +enum EnumOne ['value1', 'value2', 'value3'] prefix=None
>> +object EventStructOne
>> + member struct1: UserDefOne optional=False
>> + member string: str optional=False
>> + member enum2: EnumOne optional=True
>> +object NestedEnumsOne
>> + member enum1: EnumOne optional=False
>> + member enum2: EnumOne optional=True
>> + member enum3: EnumOne optional=False
>> + member enum4: EnumOne optional=True
>> +enum QEnumTwo ['value1', 'value2'] prefix=QENUM_TWO
>
> the fact that prefix=None and prefix=QENUM_TWO doesn't offer any visual
> clues on whether the string was user-supplied, means that the following
> would result in ambiguous output:
>  { 'enum':'Foo', 'prefix':'None', 'data':[] }

No big deal.  But avoiding it is no big deal, either.  I'll give it a
try.

> But it's not necessarily a strong enough reason for a respin.
>
> Reviewed-by: Eric Blake 

Thanks!

Re: [Qemu-devel] [PATCH v8 09/26] qapi: De-duplicate enum code generation

2015-09-17 Thread Markus Armbruster

Eric Blake  writes:

> On 09/16/2015 05:06 AM, Markus Armbruster wrote:
>> Duplicated in commit 21cd70d.  Yes, we can't import qapi-types, but
>> that's no excuse.  Move the helpers from qapi-types.py to qapi.py, and
>> replace the duplicates in qapi-event.py.
>> 
>> The generated event enumeration type's lookup table becomes
>> const-correct (see commit 2e4450f), and uses explicit indexes instead
>> of relying on order (see commit 912ae9c).
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  docs/qapi-code-gen.txt |  9 ---
>>  scripts/qapi-event.py  | 67 
>> +++---
>>  scripts/qapi-types.py  | 55 -
>>  scripts/qapi.py| 55 +
>>  4 files changed, 64 insertions(+), 122 deletions(-)
>> 
>
>> +++ b/scripts/qapi.py
>> @@ -1497,6 +1497,61 @@ def guardend(name):
>>  ''',
>>   name=guardname(name))
>>  
>> +def generate_enum_lookup(name, values, prefix=None):
>
> To keep pep8 happier, you could use two blank lines before def here...
>
>> +return ret
>> +
>> +def generate_enum(name, values, prefix=None):
>
> and here.  Then again, 13/26 does more of these sorts of cleanups, and
> v7 had the same use of 1 blank line.  Up to you if it is worth avoiding
> the churn; but it is whitespace either way so it doesn't affect review.

Matches the code around it, which trumps pep8.

pep8 is quite unhappy with qapi.py.  This series makes it slightly
happier by deleting unhappy code.  Finishing the conversion to the new
intermediate representation involves folding semantic analysis into
QAPISchema, good opportunity to tidy up the moved code.  After that, we
should probably tidy up whatever's left.

> Also, do you still need prefix=None, or can we rely on the fact that now
> all callers supply prefix by virtue of the visitor callback, and make
> the parameter non-optional?

There are calls without prefix left in qapi-event.py.

> Reviewed-by: Eric Blake 

Thanks!

Re: [Qemu-devel] [PATCH 2/7] vhost-user: add protocol feature negotiation

2015-09-17 Thread Yuanhan Liu

On Thu, Sep 17, 2015 at 03:12:55PM +0800, Jason Wang wrote:
> 
> 
> On 09/15/2015 03:10 PM, Yuanhan Liu wrote:
> > From: "Michael S. Tsirkin" 
> >
> > Support a separate bitmask for vhost-user protocol features,
> > and messages to get/set protocol features.
> >
> > Invoke them at init.
> >
> > No features are defined yet.
> >
> > v2: leverage vhost_user_call for request handling -- Yuanhan Liu
> >
> > Signed-off-by: Michael S. Tsirkin 
> > Signed-off-by: Yuanhan Liu 
> > ---
> >  docs/specs/vhost-user.txt | 37 +
> >  hw/net/vhost_net.c|  2 ++
> >  hw/virtio/vhost-user.c| 31 +++
> >  include/hw/virtio/vhost.h |  1 +
> >  4 files changed, 71 insertions(+)
> >
> > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> > index 650bb18..70da3b1 100644
> > --- a/docs/specs/vhost-user.txt
> > +++ b/docs/specs/vhost-user.txt
> > @@ -113,6 +113,7 @@ message replies. Most of the requests don't require 
> > replies. Here is a list of
> >  the ones that do:
> >  
> >   * VHOST_GET_FEATURES
> > + * VHOST_GET_PROTOCOL_FEATURES
> >   * VHOST_GET_VRING_BASE
> >  
> >  There are several messages that the master sends with file descriptors 
> > passed
> > @@ -127,6 +128,13 @@ in the ancillary data:
> >  If Master is unable to send the full message or receives a wrong reply it 
> > will
> >  close the connection. An optional reconnection mechanism can be 
> > implemented.
> >  
> > +Any protocol extensions are gated by protocol feature bits,
> > +which allows full backwards compatibility on both master
> > +and slave.
> > +As older slaves don't support negotiating protocol features,
> > +a feature bit was dedicated for this purpose:
> > +#define VHOST_USER_F_PROTOCOL_FEATURES 30
> > +
> >  Message types
> >  -
> >  
> > @@ -138,6 +146,8 @@ Message types
> >Slave payload: u64
> >  
> >Get from the underlying vhost implementation the features bitmask.
> > +  Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for
> > +  VHOST_USER_GET_PROTOCOL_FEATURES and 
> > VHOST_USER_SET_PROTOCOL_FEATURES.
> >  
> >   * VHOST_USER_SET_FEATURES
> >  
> > @@ -146,6 +156,33 @@ Message types
> >Master payload: u64
> >  
> >Enable features in the underlying vhost implementation using a 
> > bitmask.
> > +  Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for
> > +  VHOST_USER_GET_PROTOCOL_FEATURES and 
> > VHOST_USER_SET_PROTOCOL_FEATURES.
> > +
> > + * VHOST_USER_GET_PROTOCOL_FEATURES
> > +
> > +  Id: 15
> > +  Equivalent ioctl: VHOST_GET_FEATURES
> > +  Master payload: N/A
> > +  Slave payload: u64
> > +
> > +  Get the protocol feature bitmask from the underlying vhost 
> > implementation.
> > +  Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present 
> > in
> > +  VHOST_USER_GET_FEATURES.
> > +  Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support
> > +  this message even before VHOST_USER_SET_FEATURES was called.
> > +
> > + * VHOST_USER_SET_PROTOCOL_FEATURES
> > +
> > +  Id: 16
> > +  Ioctl: VHOST_SET_FEATURES
> > +  Master payload: u64
> > +
> > +  Enable protocol features in the underlying vhost implementation.
> > +  Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present 
> > in
> > +  VHOST_USER_GET_FEATURES.
> > +  Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support
> > +  this message even before VHOST_USER_SET_FEATURES was called.
> >  
> >   * VHOST_USER_SET_OWNER
> >  
> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> > index 5c1d11f..c864237 100644
> > --- a/hw/net/vhost_net.c
> > +++ b/hw/net/vhost_net.c
> > @@ -152,8 +152,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions 
> > *options)
> >  net->dev.backend_features = qemu_has_vnet_hdr(options->net_backend)
> >  ? 0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR);
> >  net->backend = r;
> > +net->dev.protocol_features = 0;
> >  } else {
> >  net->dev.backend_features = 0;
> > +net->dev.protocol_features = 0;
> >  net->backend = -1;
> >  }
> >  net->nc = options->net_backend;
> > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > index 13677ac..43169a1 100644
> > --- a/hw/virtio/vhost-user.c
> > +++ b/hw/virtio/vhost-user.c
> > @@ -24,6 +24,8 @@
> >  #include 
> >  
> >  #define VHOST_MEMORY_MAX_NREGIONS8
> > +#define VHOST_USER_F_PROTOCOL_FEATURES 30
> > +#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL
> >  
> >  typedef enum VhostUserRequest {
> >  VHOST_USER_NONE = 0,
> > @@ -41,6 +43,8 @@ typedef enum VhostUserRequest {
> >  VHOST_USER_SET_VRING_KICK = 12,
> >  VHOST_USER_SET_VRING_CALL = 13,
> >  VHOST_USER_SET_VRING_ERR = 14,
> > +VHOST_USER_GET_PROTOCOL_FEATURES = 15,
> > +

Re: [Qemu-devel] help with understanding qcow2 file format

2015-09-17 Thread Vasiliy Tolstov

2015-09-16 16:46 GMT+03:00 Eric Blake :

> qemu-img map file.qcow2
>
>
Offset  Length  Mapped to   File
qemu-img: File contains external, encrypted or compressed clusters.


> is a great way to learn which physical host offsets hold the data at
> which guest offsets.
>
> As for coding interactions with qcow2, see the source under block/qcow2.c.
>
> You may also be interested in the visual representation of qcow2 in my
> KVM Forum slides, part 1:
>
>
> http://events.linuxfoundation.org/sites/events/files/slides/2015-qcow2-expanded.pdf
>


Thanks for slides, they are very useful.


-- 
Vasiliy Tolstov,
e-mail: v.tols...@selfip.ru

Re: [Qemu-devel] [PATCH 2/7] vhost-user: add protocol feature negotiation

2015-09-17 Thread Jason Wang



On 09/15/2015 03:10 PM, Yuanhan Liu wrote:
> From: "Michael S. Tsirkin" 
>
> Support a separate bitmask for vhost-user protocol features,
> and messages to get/set protocol features.
>
> Invoke them at init.
>
> No features are defined yet.
>
> v2: leverage vhost_user_call for request handling -- Yuanhan Liu
>
> Signed-off-by: Michael S. Tsirkin 
> Signed-off-by: Yuanhan Liu 
> ---
>  docs/specs/vhost-user.txt | 37 +
>  hw/net/vhost_net.c|  2 ++
>  hw/virtio/vhost-user.c| 31 +++
>  include/hw/virtio/vhost.h |  1 +
>  4 files changed, 71 insertions(+)
>
> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> index 650bb18..70da3b1 100644
> --- a/docs/specs/vhost-user.txt
> +++ b/docs/specs/vhost-user.txt
> @@ -113,6 +113,7 @@ message replies. Most of the requests don't require 
> replies. Here is a list of
>  the ones that do:
>  
>   * VHOST_GET_FEATURES
> + * VHOST_GET_PROTOCOL_FEATURES
>   * VHOST_GET_VRING_BASE
>  
>  There are several messages that the master sends with file descriptors passed
> @@ -127,6 +128,13 @@ in the ancillary data:
>  If Master is unable to send the full message or receives a wrong reply it 
> will
>  close the connection. An optional reconnection mechanism can be implemented.
>  
> +Any protocol extensions are gated by protocol feature bits,
> +which allows full backwards compatibility on both master
> +and slave.
> +As older slaves don't support negotiating protocol features,
> +a feature bit was dedicated for this purpose:
> +#define VHOST_USER_F_PROTOCOL_FEATURES 30
> +
>  Message types
>  -
>  
> @@ -138,6 +146,8 @@ Message types
>Slave payload: u64
>  
>Get from the underlying vhost implementation the features bitmask.
> +  Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for
> +  VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES.
>  
>   * VHOST_USER_SET_FEATURES
>  
> @@ -146,6 +156,33 @@ Message types
>Master payload: u64
>  
>Enable features in the underlying vhost implementation using a bitmask.
> +  Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for
> +  VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES.
> +
> + * VHOST_USER_GET_PROTOCOL_FEATURES
> +
> +  Id: 15
> +  Equivalent ioctl: VHOST_GET_FEATURES
> +  Master payload: N/A
> +  Slave payload: u64
> +
> +  Get the protocol feature bitmask from the underlying vhost 
> implementation.
> +  Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in
> +  VHOST_USER_GET_FEATURES.
> +  Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support
> +  this message even before VHOST_USER_SET_FEATURES was called.
> +
> + * VHOST_USER_SET_PROTOCOL_FEATURES
> +
> +  Id: 16
> +  Ioctl: VHOST_SET_FEATURES
> +  Master payload: u64
> +
> +  Enable protocol features in the underlying vhost implementation.
> +  Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in
> +  VHOST_USER_GET_FEATURES.
> +  Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support
> +  this message even before VHOST_USER_SET_FEATURES was called.
>  
>   * VHOST_USER_SET_OWNER
>  
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index 5c1d11f..c864237 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -152,8 +152,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions 
> *options)
>  net->dev.backend_features = qemu_has_vnet_hdr(options->net_backend)
>  ? 0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR);
>  net->backend = r;
> +net->dev.protocol_features = 0;
>  } else {
>  net->dev.backend_features = 0;
> +net->dev.protocol_features = 0;
>  net->backend = -1;
>  }
>  net->nc = options->net_backend;
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 13677ac..43169a1 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -24,6 +24,8 @@
>  #include 
>  
>  #define VHOST_MEMORY_MAX_NREGIONS8
> +#define VHOST_USER_F_PROTOCOL_FEATURES 30
> +#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL
>  
>  typedef enum VhostUserRequest {
>  VHOST_USER_NONE = 0,
> @@ -41,6 +43,8 @@ typedef enum VhostUserRequest {
>  VHOST_USER_SET_VRING_KICK = 12,
>  VHOST_USER_SET_VRING_CALL = 13,
>  VHOST_USER_SET_VRING_ERR = 14,
> +VHOST_USER_GET_PROTOCOL_FEATURES = 15,
> +VHOST_USER_SET_PROTOCOL_FEATURES = 16,
>  VHOST_USER_MAX
>  } VhostUserRequest;
>  
> @@ -206,11 +210,13 @@ static int vhost_user_call(struct vhost_dev *dev, 
> unsigned long int request,
>  
>  switch (msg_request) {
>  case VHOST_USER_GET_FEATURES:
> +case VHOST_USER_GET_PROTOCOL_FEATURES:
>  need_reply = 1;
>  break;
>  
>  case

Re: [Qemu-devel] [PATCH v8 04/26] qapi: New QAPISchemaVisitor

2015-09-17 Thread Markus Armbruster

Eric Blake  writes:

> On 09/16/2015 05:06 AM, Markus Armbruster wrote:
>> The visitor will help keeping the code generation code simple and
>> reasonably separated from QAPISchema details.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  scripts/qapi.py | 64 
>> +
>>  1 file changed, 64 insertions(+)
>> 
>
>> @@ -840,6 +876,10 @@ class QAPISchemaEnumType(QAPISchemaType):
>>  def json_type(self):
>>  return 'string'
>>  
>> +def visit(self, visitor):
>> +visitor.visit_enum_type(self.name, self.info,
>> +self.values, self.prefix)
>
> Thinking aloud: Should this visit 'self.prefix or self.name', rather
> than letting callers see None?  If we did that, we could simplify
> c_enum_const() back to 2 parameters, and update all callers to just
> blindly pass the prefix instead of the enum name.  On the other hand, I
> think that's a bit too much churn, and I don't see what it would buy us
> that we don't already have with a 3-arg c_enum_const().

Could be explored on top.  We need to wrap up this series.

> So I'm fine with the version you have here as-is (modulo any obvious
> rebase to pass self._prefix based on my comments on 2).
>
> Reviewed-by: Eric Blake 

Thanks!

Re: [Qemu-devel] Could not boot a guest vm from kvm mode based on APM X-Gene Host and latest qemu

2015-09-17 Thread Alex Bennée


liang yan  writes:

> On 09/16/2015 08:34 AM, Alex Bennée wrote:
>> liang yan  writes:
>>
>>> Hello, All,
>>>
>>> I am trying to enable kvm for a guest vm on an APM X-Gene Host with
>>> latest qemu, but could not make it work.
>>>
>>> The host is APM X-Gene 8-core, Linux kernel is 4.1.0-rc7-1-arm64,
>>>
>>> Guest kernel is linux-3.16rc3
>>>
>>> QEMU is latest version
>>>
>>> Host has these dmesg info
>>> [2.708259] kvm [1]: GICH base=0x780c, GICV base=0x780e, IRQ=25
>>> [2.708327] kvm [1]: timer IRQ30
>>> [2.708335] kvm [1]: Hyp mode initialized successfully
>>>
>>> Host has dev/kvm.
>>>
>>> command-line is
>>> aarch64-softmmu/qemu-system-aarch64 -machine virt,kernel_irqchip=off
>>> -cpu cortex-a57 -machine accel=kvm -nographic -smp 1 -m 2048 -kernel
>>> aarch64-linux-3.16rc3-buildroot.img  --append "console=ttyAMA0"
>> I thought I recognised one of my images ;-)
>>
>> Why are you running with kernel_irqchip=off?
>
>
> Hello, Alex,
>
> Thanks for your reply and your pre-boot image. Very helpful.
>
> The reason I ran with "kernel_irqchip=off" is because it showed the 
> error "Create kernel irqchip failed: No such device"
> are you familiar with it?

No but I suspect it is a kernel configuration issue. Are you running a
defconfig?

>
> I did not try kernel 4.3.x yet, have you released on your linaro website 
> now?

That's running directly from Linus' tree so it's pure mainline.

> Thanks for your log file, it shows something wrong with the vcpu, I am 
> gonna check the kvm code from host kernel now.
> Will let you know the update.

Mark Zyngier pointed out that's a know bug that there are patches
in-flight for.

>
>
> Best,
> Liang
>
>> Without it I can boot the image fine on my APM running 4.3.0-rc1-ajb but
>> with it I helpfully seg the kernel:
> =
>>
>> [16035.990518] Bad mode in Synchronous Abort handler detected, code 
>> 0x8606 -- IABT (current EL)
> =
>> [16035.997970] CPU: 1 PID: 21328 Comm: qemu-system-aar Not tainted 
>> 4.3.0-rc1-ajb #446
>> [16036.004203] Hardware name: APM X-Gene Mustang board (DT)
>> [16036.008191] task: ffc3ecea8000 ti: ffc3d8078000 task.ti: 
>> ffc3d8078000
>> [16036.014338] PC is at 0x0
>> [16036.015564] LR is at kvm_vgic_map_resources+0x30/0x3c
>> [16036.019291] pc : [<>] lr : [] pstate: 
>> 0145
>> [16036.025350] sp : ffc3d807bb20
>> [16036.027348] x29: ffc3d807bb20 x28: ffc3d8078000
>> [16036.031355] x27: ffc000642000 x26: 001d
>> [16036.035361] x25: 011b x24: ffc3d80c1000
>> [16036.039368] x23:  x22: 
>> [16036.043374] x21: ffc0fa24 x20: ffc0fa807800
>> [16036.047380] x19: ffc0fa807800 x18: 007f97af20e0
>> [16036.051387] x17: 007f99c44810 x16: ffc0001fb030
>> [16036.055394] x15: 007f99cc9588 x14: 00922000
>> [16036.059401] x13: 0097eb80 x12: 004de0f0
>> [16036.063406] x11: 0038 x10: 
>> [16036.067413] x9 : 007f97af2480 x8 : 0050
>> [16036.071419] x7 : ffc3ec24c840 x6 : 
>> [16036.075424] x5 : 0003 x4 : ffc3ece72080
>> [16036.079430] x3 : ffc3ece72080 x2 : 
>> [16036.083436] x1 : ffc000a26260 x0 : ffc0fa807800
> 
>> [16036.087628] Internal error: Oops - bad mode: 0 [#1] SMP
> 
>
>> [16036.091528] Modules linked in:
>> [16036.093278] CPU: 1 PID: 21328 Comm: qemu-system-aar Not tainted 
>> 4.3.0-rc1-ajb #446
>> [16036.099510] Hardware name: APM X-Gene Mustang board (DT)
>> [16036.103497] task: ffc3ecea8000 ti: ffc3d8078000 task.ti: 
>> ffc3d8078000
>> [16036.109642] PC is at 0x0
>> [16036.110864] LR is at kvm_vgic_map_resources+0x30/0x3c
>> [16036.114590] pc : [<>] lr : [] pstate: 
>> 0145
>> [16036.120649] sp : ffc3d807bb20
>> [16036.122648] x29: ffc3d807bb20 x28: ffc3d8078000
>> [16036.126654] x27: ffc000642000 x26: 001d
>> [16036.130659] x25: 011b x24: ffc3d80c1000
>> [16036.134666] x23:  x22: 
>> [16036.138671] x21: ffc0fa24 x20: ffc0fa807800
>> [16036.142678] x19: ffc0fa807800 x18: 007f97af20e0
>> [16036.146685] x17: 007f99c44810 x16: ffc0001fb030
>> [16036.150690] x15: 007f99cc9588 x14: 00922000
>> [16036.154696] x13: 0097eb80 x12: 004de0f0
>> [16036.158701] x11: 0038 x10: 
>> [16036.162706] x9 : 007f97af2480 x8 : 0050
>> [16036.166712] x7 : ffc3ec24c840 x6 : 
>> [16036.170719] x5 : 0003 x4 : ffc3ece72080
>> [16036.174725] x3 : ffc3ece72080 x2 : 
>> [16036.178731] x1 : ffc000a26260 x0 : ffc0fa807800
>>
>>>
>>> when

Re: [Qemu-devel] [PATCH 0/4] ahci: clean up signature generation

2015-09-17 Thread Stefan Hajnoczi

On Tue, Sep 01, 2015 at 04:50:37PM -0400, John Snow wrote:
> Ultimately, clean up the signature generation and as a result, tidy up
> the port_reset and init_d2h functions.
> 
> 
> 
> For convenience, this branch is available at:
> https://github.com/jnsnow/qemu.git branch ahci-sigfix
> https://github.com/jnsnow/qemu/tree/ahci-sigfix
> 
> This version is tagged ahci-sigfix-v1:
> https://github.com/jnsnow/qemu/releases/tag/ahci-sigfix-v1
> 
> John Snow (4):
>   ahci: remove dead reset code
>   ahci: fix signature generation
>   ahci: remove cmd_fis argument from write_fis_d2h
>   ahci: clean up initial d2h semantics
> 
>  hw/ide/ahci.c | 53 ++---
>  1 file changed, 30 insertions(+), 23 deletions(-)
> 
> -- 
> 2.4.3
> 

Reviewed-by: Stefan Hajnoczi

Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area

2015-09-17 Thread Xiao Guangrong




On 09/16/2015 12:10 AM, Paolo Bonzini wrote:



On 01/09/2015 11:14, Stefan Hajnoczi wrote:


When I was digging into live migration code, i noticed that the same MR name may
cause the name "idstr", please refer to qemu_ram_set_idstr().

Since nvdimm devices do not have parent-bus, it will trigger the abort() in that
function.

I see.  The other devices that use a constant name are on a bus so the
abort doesn't trigger.


However, the MR name must be the same across the two machines.  Indices
are not friendly to hotplug.  Even though hotplug isn't supported now,
we should prepare and try not to change migration format when we support
hotplug in the future.



Thanks for your reminder.


Is there any other fixed value that we can use, for example the base
address of the NVDIMM?


How about use object_get_canonical_path(OBJECT(dev)) (the @dev is NVDIMM
device) ?

Re: [Qemu-devel] [PATCH] vmdk: Create streamOptimized as version 3

2015-09-17 Thread Kevin Wolf

Am 17.09.2015 um 07:04 hat Fam Zheng geschrieben:
> VMware products accept only version 3 for streamOptimized, let's bump
> the version.
> 
> Reported-by: Radoslav Gerganov 
> Signed-off-by: Fam Zheng 

Radoslav, can I have your Reviewed-by and/or Tested-by for this patch?

Kevin

>  block/vmdk.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/block/vmdk.c b/block/vmdk.c
> index be0d640..37326c3 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -1651,7 +1651,13 @@ static int vmdk_create_extent(const char *filename, 
> int64_t filesize,
>  }
>  magic = cpu_to_be32(VMDK4_MAGIC);
>  memset(, 0, sizeof(header));
> -header.version = zeroed_grain ? 2 : 1;
> +if (compress) {
> +header.version = 3;
> +} else if (zeroed_grain) {
> +header.version = 2;
> +} else {
> +header.version = 1;
> +}
>  header.flags = VMDK4_FLAG_RGD | VMDK4_FLAG_NL_DETECT
> | (compress ? VMDK4_FLAG_COMPRESS | VMDK4_FLAG_MARKER : 0)
> | (zeroed_grain ? VMDK4_FLAG_ZERO_GRAIN : 0);
> -- 
> 2.4.3
>

Re: [Qemu-devel] [PATCH v3 37/46] contrib: remove unnecessary strdup()

2015-09-17 Thread Vladimir Sementsov-Ogievskiy


On 15.09.2015 19:07, marcandre.lur...@redhat.com wrote:

From: Marc-André Lureau 

getopt() optarg points to argv memory, no need to dup those values,
fixes small leaks detected by clang-analyzer.

Signed-off-by: Marc-André Lureau 
---
  contrib/ivshmem-client/main.c | 2 +-
  contrib/ivshmem-server/main.c | 6 +++---
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/contrib/ivshmem-client/main.c b/contrib/ivshmem-client/main.c
index 5d85ae7..bd7cbfc 100644
--- a/contrib/ivshmem-client/main.c
+++ b/contrib/ivshmem-client/main.c
@@ -53,7 +53,7 @@ ivshmem_client_parse_args(IvshmemClientArgs *args, int argc, 
char *argv[])
  break;
  
  case 'S': /* unix_sock_path */

-args->unix_sock_path = strdup(optarg);
+args->unix_sock_path = optarg;
  break;
  
  default:

diff --git a/contrib/ivshmem-server/main.c b/contrib/ivshmem-server/main.c
index cd8d9ed..71e87ea 100644
--- a/contrib/ivshmem-server/main.c
+++ b/contrib/ivshmem-server/main.c
@@ -92,15 +92,15 @@ ivshmem_server_parse_args(IvshmemServerArgs *args, int 
argc, char *argv[])
  break;
  
  case 'p': /* pid_file */

-args->pid_file = strdup(optarg);
+args->pid_file = optarg;
  break;
  
  case 'S': /* unix_socket_path */

-args->unix_socket_path = strdup(optarg);
+args->unix_socket_path = optarg;
  break;
  
  case 'm': /* shm_path */

-args->shm_path = strdup(optarg);
+args->shm_path = optarg;
  break;
  
  case 'l': /* shm_size */


Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

[Qemu-devel] [PATCH] virtio: add some migration doc

2015-09-17 Thread Cornelia Huck

Try to cover the basics of virtio migration.

Signed-off-by: Cornelia Huck 
---
It might help if we add some documentation; at the very least, it will
prevent myself getting a headache everytime I look at that code :)

Feedback welcome.
---
 docs/virtio-migration.txt | 101 ++
 1 file changed, 101 insertions(+)
 create mode 100644 docs/virtio-migration.txt

diff --git a/docs/virtio-migration.txt b/docs/virtio-migration.txt
new file mode 100644
index 000..9c575e6
--- /dev/null
+++ b/docs/virtio-migration.txt
@@ -0,0 +1,101 @@
+Virtio devices and migration
+
+
+Saving and restoring the state of virtio devices is a bit of a twisty maze,
+for several reasons:
+- state is distributed between several parts:
+  - virtio core, for common fields like features, number of queues, ...
+  - virtio transport (pci, ccw, ...), for the different proxy devices and
+transport specific state (msix vectors, indicators, ...)
+  - virtio device (net, blk, ...), for the different device types and their
+state (mac address, request queue, ...)
+- most fields are saved via the stream interface; subsequently, subsections
+  have been added to make cross-version migration possible
+
+This file attempts to document the current procedure and point out some
+caveats.
+
+
+Save state procedure
+
+
+virtio core   virtio transport  virtio device
+---     -
+
+save() function registered
+via register_savevm()
+virtio_save()   <--
+ -->  save_config()
+  - save proxy device
+  - save transport-specific
+device fields
+- save common device
+  fields
+- save common virtqueue
+  fields
+ -->  save_queue()
+  - save transport-specific
+virtqueue fields
+ -->   save_device()
+   - save device-specific
+ fields
+- save subsections
+  - device endianness,
+if changed from
+default endianness
+  - 64 bit features, if
+any high feature bit
+is set
+  - virtio-1 virtqueue
+fields, if VERSION_1
+is set
+
+
+Load state procedure
+
+
+virtio core   virtio transport  virtio device
+---     -
+
+load() function registered
+via register_savevm()
+virtio_load()   <--
+ -->  load_config()
+  - load proxy device
+  - load transport-specific
+device fields
+- load common device
+  fields
+- load common virtqueue
+  fields
+ -->  load_queue()
+  - load transport-specific
+virtqueue fields
+- notify guest
+ -->   load_device()
+   - load device-specific
+ fields
+- load subsections
+  - device endianness
+  - 64 bit features
+  - virtio-1 virtqueue
+fields
+- sanitize endianness
+- sanitize features
+- virtqueue index sanity
+  check
+   - feature-dependent setup
+
+
+Implications of this setup
+==
+
+Devices need to be careful in their state processing during load: The
+load_device() procedure is invoked by the core before subsections have
+been loaded. Any code that depends on information transmitted in subsections
+therefore has to be invoked in the device's load() function _after_
+virtio_load() returned (like e.g. code depending on features).
+
+Any extension of the state being migrated should be done in subsections
+added to the core for compatibility reasons. If transport or device specific
+state is added, core needs to invoke a callback from the new subsection.
-- 
2.3.8

Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area

2015-09-17 Thread Xiao Guangrong




On 09/17/2015 05:04 PM, Igor Mammedov wrote:

On Thu, 17 Sep 2015 16:39:12 +0800
Xiao Guangrong  wrote:




On 09/16/2015 12:10 AM, Paolo Bonzini wrote:



On 01/09/2015 11:14, Stefan Hajnoczi wrote:


When I was digging into live migration code, i noticed that the same MR name may
cause the name "idstr", please refer to qemu_ram_set_idstr().

Since nvdimm devices do not have parent-bus, it will trigger the abort() in that
function.

I see.  The other devices that use a constant name are on a bus so the
abort doesn't trigger.


However, the MR name must be the same across the two machines.  Indices
are not friendly to hotplug.  Even though hotplug isn't supported now,
we should prepare and try not to change migration format when we support
hotplug in the future.



Thanks for your reminder.


Is there any other fixed value that we can use, for example the base
address of the NVDIMM?


How about use object_get_canonical_path(OBJECT(dev)) (the @dev is NVDIMM
device) ?

if you use split backend/frotnend idea then existing backends
already have a stable name derived from backend's ID and you won't need to care
about it.



Yes, i am using this idea and addressing your suggestion that use
memory_region_init_alias() to partly map hostmem to guest's address
space.

The code is like this:

/* get the memory region from backend memory. */
mr = host_memory_backend_get_memory(dimm->hostmem, errp);

/* nvdimm_nr will map to guest address space. */
memory_region_init_alias(>nvdimm_mr, OBJECT(dev),
 object_get_canonical_path(OBJECT(dev)), mr, 0,
 size - nvdimm->label_size);

/* the label size at the end of the file used as label_data of NVDIMM. */
..

So there are two memory regions, one is the backend-mem and another one
is nvdimm_mr in the example above. The name i am worried about is the name
of nvdimm_mr.

Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area

2015-09-17 Thread Paolo Bonzini



On 17/09/2015 11:14, Xiao Guangrong wrote:
> 
> 
> /* get the memory region from backend memory. */
> mr = host_memory_backend_get_memory(dimm->hostmem, errp);
> 
> /* nvdimm_nr will map to guest address space. */
> memory_region_init_alias(>nvdimm_mr, OBJECT(dev),
>  object_get_canonical_path(OBJECT(dev)), mr, 0,
>  size - nvdimm->label_size);

You can just use "memory" here for the name here.  The name only needs
to be unique for RAM memory regions, and dimm->hostmem will take care of it.

Paolo

Re: [Qemu-devel] [PATCH RFC v4 27/29] qapi: Change Netdev into a flat union

2015-09-17 Thread Wen Congyang

On 09/10/2015 12:06 PM, Eric Blake wrote:
> From: Kővágó, Zoltán 
> 
> Except qapi-schema.json, this patch was generated by:
> 
> find . -name .git -prune -o -type f \! -name '*~' -print0 | \
>   xargs -0 sed -i \
> -e 's/NetClientOptionsKind/NetClientDriver/g' \
> -e 's/NET_CLIENT_OPTIONS_KIND_/NET_CLIENT_DRIVER_/g' \
> -e 's/netdev->opts/netdev/g'
> 
> Signed-off-by: Kővágó, Zoltán 
> Message-Id: 
> <01a527fbf1a5de880091f98cf011616a78ad.1441627176.git.dirty.ice...@gmail.com>
> 
> Additional changes:
> Rebase the patch on top of an earlier change from netdev->kind to
> netdev->type, so that tweak no longer needed here.
> 
> Rework so that NetdevLegacy doesn't pollute QMP command but is instead
> copied piecewise into the new Netdev, which means that NetClientOptions
> must still remain in qapi. Since legacy previously always rejected
> 'hubport', we can now make that explicit by having the two unions be
> slightly different; but that means we must manually map between the
> two structures.

I apply Markus's v8 patch and this series, make check will fail:
TEST: tests/virtio-net-test... (pid=23648)
  /x86_64/virtio/net/pci/basic:
qemu-system-x86_64: -netdev socket,fd=6,id=hs0: Invalid parameter 'fd'
Broken pipe
FAIL
GTester: last random seed: R02Safdc7600ad90fa8d4fb252f86f953784
(pid=23652)
  /x86_64/virtio/net/pci/rx_stop_cont: 
qemu-system-x86_64: -netdev socket,fd=5,id=hs0: Invalid parameter 'fd'
Broken pipe
FAIL
GTester: last random seed: R02S3cde2cc144ab662ff4daf105a799ee50
(pid=23655)
  /x86_64/virtio/net/pci/hotplug:  OK
FAIL: tests/virtio-net-test

Do I make a mistake when I do rebase?

Thanks
Wen Congyang

> Signed-off-by: Eric Blake 
> ---
>  hw/arm/musicpal.c|   2 +-
>  hw/core/qdev-properties-system.c |   2 +-
>  hw/net/allwinner_emac.c  |   2 +-
>  hw/net/cadence_gem.c |   2 +-
>  hw/net/dp8393x.c |   2 +-
>  hw/net/e1000.c   |   2 +-
>  hw/net/eepro100.c|   2 +-
>  hw/net/etraxfs_eth.c |   2 +-
>  hw/net/fsl_etsec/etsec.c |   2 +-
>  hw/net/imx_fec.c |   2 +-
>  hw/net/lan9118.c |   2 +-
>  hw/net/lance.c   |   2 +-
>  hw/net/mcf_fec.c |   2 +-
>  hw/net/milkymist-minimac2.c  |   2 +-
>  hw/net/mipsnet.c |   2 +-
>  hw/net/ne2000-isa.c  |   2 +-
>  hw/net/ne2000.c  |   2 +-
>  hw/net/opencores_eth.c   |   2 +-
>  hw/net/pcnet-pci.c   |   2 +-
>  hw/net/rocker/rocker_fp.c|   2 +-
>  hw/net/rtl8139.c |   2 +-
>  hw/net/smc91c111.c   |   2 +-
>  hw/net/spapr_llan.c  |   2 +-
>  hw/net/stellaris_enet.c  |   2 +-
>  hw/net/vhost_net.c   |  18 +++---
>  hw/net/virtio-net.c  |   6 +-
>  hw/net/vmxnet3.c |   2 +-
>  hw/net/xen_nic.c |   2 +-
>  hw/net/xgmac.c   |   2 +-
>  hw/net/xilinx_axienet.c  |   2 +-
>  hw/net/xilinx_ethlite.c  |   2 +-
>  hw/usb/dev-network.c |   2 +-
>  include/net/net.h|   4 +-
>  monitor.c|  14 ++---
>  net/dump.c   |   6 +-
>  net/hub.c|  22 +++
>  net/l2tpv3.c |   6 +-
>  net/net.c| 133 
> +--
>  net/netmap.c |   4 +-
>  net/slirp.c  |   6 +-
>  net/socket.c |   8 +--
>  net/tap-win32.c  |   6 +-
>  net/tap.c|  24 +++
>  net/vde.c|   6 +-
>  net/vhost-user.c |  12 ++--
>  qapi-schema.json |  57 +
>  46 files changed, 230 insertions(+), 162 deletions(-)
> 
> diff --git a/hw/arm/musicpal.c b/hw/arm/musicpal.c
> index 42f66b3..94bdf5f 100644
> --- a/hw/arm/musicpal.c
> +++ b/hw/arm/musicpal.c
> @@ -374,7 +374,7 @@ static void eth_cleanup(NetClientState *nc)
>  }
> 
>  static NetClientInfo net_mv88w8618_info = {
> -.type = NET_CLIENT_OPTIONS_KIND_NIC,
> +.type = NET_CLIENT_DRIVER_NIC,
>  .size = sizeof(NICState),
>  .receive = eth_receive,
>  .cleanup = eth_cleanup,
> diff --git a/hw/core/qdev-properties-system.c 
> b/hw/core/qdev-properties-system.c
> index 921e799..249976e 100644
> --- a/hw/core/qdev-properties-system.c
> +++ b/hw/core/qdev-properties-system.c
> @@ -229,7 +229,7 @@ static void set_netdev(Object *obj, Visitor *v, void 
> *opaque,
>  }
> 
>  queues = qemu_find_net_clients_except(str, peers,
> -  NET_CLIENT_OPTIONS_KIND_NIC,
> +

Re: [Qemu-devel] [wiki] New wiki page - vhost-user setup with ovs/dpdk backend

2015-09-17 Thread Marcel Apfelbaum


On 09/16/2015 08:44 PM, Flavio Leitner wrote:

On Thu, Sep 10, 2015 at 10:51:19PM +0300, Marcel Apfelbaum wrote:

Hi,

The page describes how to setup an environment that allows testing/developing
vhost-user using ovs (with dpdk) as backend.

A regular pc machine can be used, no need for several hosts, a 'dpdk enabled' 
NIC or 1G huge-pages.

The goal is to connect guests' virtio-net devices having vhost-user backend to 
OVS dpdkvhostuser ports
and be able to run any kind of network traffic between them.

The page can be found at:
http://wiki.qemu.org/Features/vhost-user-ovs-dpdk

I want to keep it as simple as possible.
If you see steps that can be skipped or unneeded configuration please let me 
know
or feel free to update the page.


I gave a quick look and found couple issues.  It seems to be missing
the installing steps for qemu.  Also the eventfd_link module is only
needed for vhost-cuse, so you don't need to build/install/load at all.

Hi Flavio,
Thank you for reviewing the document!

For some reason I thought we still need eventfd_link to notify the guests
when new packets arrive. Indeed we don't need this anymore, I updated the wiki 
page.

Regarding QEMU, can you please point me to the missing setup steps?
(I think I already took care of them, anyway I want to be sure)


Thanks,
Marcel



Thanks for doing that Marcel!
fbl

Re: [Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-17 Thread Peter Maydell

On 17 September 2015 at 12:03, Daniel P. Berrange  wrote:
> The README file is usually the first thing consulted when a user
> or developer obtains a copy of the QEMU source. The current QEMU
> README is lacking immediately useful information and so not very
> friendly for first time encounters. It either redirects users to
> qemu-doc.html (which does not exist until they've actually
> compiled QEMU), or the website (which assumes the user has
> convenient internet access at time of reading).
>
> This fills out the README file as simple quick-start guide on
> the topics of building source, submitting patches, licensing
> and how to contact the QEMU community. It does not intend to be
> comprehensive, instead referring people to an appropriate web
> page to obtain more detailed information. The intent is to give
> users quick guidance to get them going in the right direction.
>
> Signed-off-by: Daniel P. Berrange 
> ---
>  README | 108 
> +++--
>  1 file changed, 106 insertions(+), 2 deletions(-)
>
> diff --git a/README b/README
> index c7c990d..71142c3 100644
> --- a/README
> +++ b/README
> @@ -1,3 +1,107 @@
> -Read the documentation in qemu-doc.html or on http://wiki.qemu-project.org
> + QEMU README
> +===
>
> -- QEMU team
> +QEMU is a generic and open source machine emulator and virtualizer. When used
> +as a machine emulator, QEMU can run OSes and programs made for one machine
> +(e.g. an ARM board) on a different machine (e.g. your own PC). By using 
> dynamic
> +translation, it achieves very good performance. When used as a virtualizer,
> +QEMU achieves near native performances by executing the guest code directly 
> on
> +the host CPU. QEMU supports virtualization when executing under the Xen
> +hypervisor or using the KVM kernel module in Linux.

This kind of forgets the linux-user use case (which isn't machine emulation).

> +
> +
> +Building
> +
> +
> +QEMU is multi-platform software intended to be buildable on all modern Linux
> +platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety of other
> +UNIX targets. The simple process to build QEMU is

This whole section seems to be duplicating our existing build
documentation (which is in qemu-doc.texi in the 'compilation'
section). We should document how to build QEMU in exactly one
place, not two (though I can see the rationale for that one
place not being in a .texi file.)

> +
> +  ./configure
> +  make
> +  sudo make install

I would prefer it if we recommended people to build in a separate
build directory, ie:

mkdir build
cd build
../configure
make

Also I'm not sure our 'make install' target is a great thing to recommend.

> +
> +The configure script supports a number of arguments to turn on/off various
> +optional features. These can be seen with "configure --help".
> +
> +For additional information on building QEMU for Linux and Windows consult:
> +
> +  http://qemu-project.org/Hosts/Linux
> +  http://qemu-project.org/Hosts/W32

We've just significantly improved our documentation for building
on OSX, and we should mention it here.

> +
> +
> +Submitting patches
> +==
> +
> +The QEMU source code is maintained under the GIT version control system.
> +
> +   git clone git://git.qemu-project.org/qemu.git
> +
> +When submitting patches, the preferred approach is to use 'git format-patch'
> +and/or 'git send-email' to format & send the mail to the 
> qemu-devel@nongnu.org
> +mailing list. All patches submitted must contain a 'Signed-off-by' line from
> +the author.
> +
> +For additional information on submitting patches consult:
> +
> +  http://qemu-project.org/Contribute/SubmitAPatch
> +  http://qemu-project.org/Contribute/TrivialPatches
> +
> +
> +Bug reporting
> +=
> +
> +The QEMU project uses Launchpad as its primary upstream bug tracker. Bugs
> +found when running code built from QEMU git or upstream released sources
> +should be reported via:
> +
> +  https://bugs.launchpad.net/qemu/
> +
> +If using QEMU via an operating system vendor pre-built binary package, it
> +is preferrable to report bugs to the vendor's own bug tracker first. If the

"preferable"

> +bug is also known to affect latest upstream code, it can also be reported
> +via launchpad.
> +
> +For additional information on bug reporting consult:
> +
> +  http://qemu-project.org/Contribute/ReportABug
> +
> +
> +Licensing
> +=

This section seems to be duplicating the LICENSE file.

> +
> +  - QEMU as a whole is released under the GNU General Public License, 
> version 2.
> +
> +  - Parts of QEMU have specific licenses which are compatible with the GNU
> +General Public License, version 2. Hence each source file contains its 
> own
> +licensing information. Source files with no licensing information are
> +released under the GNU General Public License, version 2 or (at your
> +

Re: [Qemu-devel] [RFT PATCH v1 0/3] net: smc91c111 can_receive fixes

2015-09-17 Thread Stefan Hajnoczi

On Thu, Sep 10, 2015 at 09:23:27PM -0700, Peter Crosthwaite wrote:
> Hi Richard,
> 
> This should hopefully fix your bug, while addressing the extra concern
> I raised.
> 
> There was also inconsistent behaviour with corking packets through a
> soft reset which I notice and fixed.
> 
> Please let me know if this works for you.
> 
> Regards,
> Peter
> 
> 
> Peter Crosthwaite (3):
>   net: smc91c111: guard flush_queued_packets() on can_rx()
>   net: smc91c111: gate can_receive() on rx FIFO having a slot
>   net: smc91c111: flush packets on RCR register changes
> 
>  hw/net/smc91c111.c | 33 +
>  1 file changed, 25 insertions(+), 8 deletions(-)
> 
> -- 
> 1.9.1
> 
> 

Thanks, applied to my net tree:
https://github.com/stefanha/qemu/commits/net

Stefan

[Qemu-devel] [PULL 1/4] MAINTAINERS: Stefan will not maintain net subsystem

2015-09-17 Thread Stefan Hajnoczi

From: Jason Wang 

Talked with Stefan, he will not maintain net subsystem.

Cc: Stefan Hajnoczi 
Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
Message-id: 1442372730-11360-1-git-send-email-jasow...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 MAINTAINERS | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 688979b..2f4e8cf 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -955,11 +955,10 @@ F: hmp-commands.hx
 T: git git://repo.or.cz/qemu/qmp-unstable.git queue/qmp
 
 Network device layer
-M: Stefan Hajnoczi 
 M: Jason Wang 
 S: Maintained
 F: net/
-T: git git://github.com/stefanha/qemu.git net
+T: git git://github.com/jasowang/qemu.git net
 
 Netmap network backend
 M: Luigi Rizzo 
-- 
2.4.3

[Qemu-devel] [PULL 3/4] net: smc91c111: gate can_receive() on rx FIFO having a slot

2015-09-17 Thread Stefan Hajnoczi

From: Peter Crosthwaite 

Return false from can_receive() when the FIFO doesn't have a free RX
slot. This fixes a bug in the current code where the allocated buffer
is freed before the fifo pop, triggering a premature flush of queued RX
packets. It also will handle a corner case, where the guest manually
frees the allocated buffer before popping the rx FIFO (hence it is not
enough to just delay the flush_queued_packets()).

Reported-by: Richard Purdie 
Signed-off-by: Peter Crosthwaite 
Reviewed-by: Fam Zheng 
Tested-by: Richard Purdie 
Message-id: 
97bfdfc5cbce0bd5e0cbbbff35ce7a1bf6f8603d.1441873621.git.crosthwaite.pe...@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 hw/net/smc91c111.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/net/smc91c111.c b/hw/net/smc91c111.c
index 5774eff..8fc3deb 100644
--- a/hw/net/smc91c111.c
+++ b/hw/net/smc91c111.c
@@ -129,7 +129,8 @@ static int smc91c111_can_receive(smc91c111_state *s)
 if ((s->rcr & RCR_RXEN) == 0 || (s->rcr & RCR_SOFT_RST)) {
 return 1;
 }
-if (s->allocated == (1 << NUM_PACKETS) - 1) {
+if (s->allocated == (1 << NUM_PACKETS) - 1 ||
+s->rx_fifo_len == NUM_PACKETS) {
 return 0;
 }
 return 1;
@@ -182,6 +183,7 @@ static void smc91c111_pop_rx_fifo(smc91c111_state *s)
 } else {
 s->int_level &= ~INT_RCV;
 }
+smc91c111_flush_queued_packets(s);
 smc91c111_update(s);
 }
 
-- 
2.4.3

[Qemu-devel] [PULL 2/4] net: smc91c111: guard flush_queued_packets() on can_rx()

2015-09-17 Thread Stefan Hajnoczi

From: Peter Crosthwaite 

Check that the core can once again receive packets before asking the
net layer to do a flush. This will make it more convenient to flush
packets when adding new conditions to can_receive.

Add missing if braces while moving the can_receive() core code.

Signed-off-by: Peter Crosthwaite 
Reviewed-by: Fam Zheng 
Tested-by: Richard Purdie 
Message-id: 
92e15e12a6964274f4bc0eb71b61a7d94326f6c6.1441873621.git.crosthwaite.pe...@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 hw/net/smc91c111.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/hw/net/smc91c111.c b/hw/net/smc91c111.c
index 74e06e6..5774eff 100644
--- a/hw/net/smc91c111.c
+++ b/hw/net/smc91c111.c
@@ -124,6 +124,24 @@ static void smc91c111_update(smc91c111_state *s)
 qemu_set_irq(s->irq, level);
 }
 
+static int smc91c111_can_receive(smc91c111_state *s)
+{
+if ((s->rcr & RCR_RXEN) == 0 || (s->rcr & RCR_SOFT_RST)) {
+return 1;
+}
+if (s->allocated == (1 << NUM_PACKETS) - 1) {
+return 0;
+}
+return 1;
+}
+
+static inline void smc91c111_flush_queued_packets(smc91c111_state *s)
+{
+if (smc91c111_can_receive(s)) {
+qemu_flush_queued_packets(qemu_get_queue(s->nic));
+}
+}
+
 /* Try to allocate a packet.  Returns 0x80 on failure.  */
 static int smc91c111_allocate_packet(smc91c111_state *s)
 {
@@ -185,7 +203,7 @@ static void smc91c111_release_packet(smc91c111_state *s, 
int packet)
 s->allocated &= ~(1 << packet);
 if (s->tx_alloc == 0x80)
 smc91c111_tx_alloc(s);
-qemu_flush_queued_packets(qemu_get_queue(s->nic));
+smc91c111_flush_queued_packets(s);
 }
 
 /* Flush the TX FIFO.  */
@@ -636,15 +654,11 @@ static uint32_t smc91c111_readl(void *opaque, hwaddr 
offset)
 return val;
 }
 
-static int smc91c111_can_receive(NetClientState *nc)
+static int smc91c111_can_receive_nc(NetClientState *nc)
 {
 smc91c111_state *s = qemu_get_nic_opaque(nc);
 
-if ((s->rcr & RCR_RXEN) == 0 || (s->rcr & RCR_SOFT_RST))
-return 1;
-if (s->allocated == (1 << NUM_PACKETS) - 1)
-return 0;
-return 1;
+return smc91c111_can_receive(s);
 }
 
 static ssize_t smc91c111_receive(NetClientState *nc, const uint8_t *buf, 
size_t size)
@@ -739,7 +753,7 @@ static const MemoryRegionOps smc91c111_mem_ops = {
 static NetClientInfo net_smc91c111_info = {
 .type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
-.can_receive = smc91c111_can_receive,
+.can_receive = smc91c111_can_receive_nc,
 .receive = smc91c111_receive,
 };
 
-- 
2.4.3

[Qemu-devel] [PULL 4/4] net: smc91c111: flush packets on RCR register changes

2015-09-17 Thread Stefan Hajnoczi

From: Peter Crosthwaite 

The SOFT_RST or RXEN in the control register can be used as a condition
to unblock the net layer via can_receive(). So check for possible
flushes on RCR changes. This will drop all pending packets on soft
reset or disable which is the functional intent of the can_receive()
logic.

Signed-off-by: Peter Crosthwaite 
Reviewed-by: Fam Zheng 
Tested-by: Richard Purdie 
Message-id: 
b114d4c96f4afbdaa15f1361d9c07e3021755915.1441873621.git.crosthwaite.pe...@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 hw/net/smc91c111.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/net/smc91c111.c b/hw/net/smc91c111.c
index 8fc3deb..c19cdd1 100644
--- a/hw/net/smc91c111.c
+++ b/hw/net/smc91c111.c
@@ -331,6 +331,7 @@ static void smc91c111_writeb(void *opaque, hwaddr offset,
 if (s->rcr & RCR_SOFT_RST) {
 smc91c111_reset(DEVICE(s));
 }
+smc91c111_flush_queued_packets(s);
 return;
 case 10: case 11: /* RPCR */
 /* Ignored */
-- 
2.4.3

Re: [Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-17 Thread Paolo Bonzini



On 17/09/2015 13:03, Daniel P. Berrange wrote:
> +
> +Licensing
> +=
> +
> +  - QEMU as a whole is released under the GNU General Public License, 
> version 2.
> +
> +  - Parts of QEMU have specific licenses which are compatible with the GNU
> +General Public License, version 2. Hence each source file contains its 
> own
> +licensing information. Source files with no licensing information are
> +released under the GNU General Public License, version 2 or (at your
> +option) any later version. As of July 2013, contributions under version
> +2 of the GNU General Public License (and no later version) are only
> +accepted for the following files or directories: bsd-user/, linux-user/,
> +hw/misc/vfio.c, hw/xen/xen_pt*.
> +
> +  - The Tiny Code Generator (TCG) is released under the BSD license (see
> +license headers in files).
> +
> +  - QEMU is a trademark of Fabrice Bellard.
> +
> +For additional information on QEMU licensing consult:
> +
> +  http://qemu-project.org/License

As mentioned by Peter, this text and this URL are a duplicate of the
LICENSE file.  I would just put a tl;dr in the README file, like this:

  QEMU as a whole is released under the GNU General Public License,
  version 2.  Parts of QEMU have specific licenses which are compatible
  with the GNU General Public License, version 2.

  For additional information on QEMU licensing see the LICENSE file.

Paolo

[Qemu-devel] [PULL 0/4] Net patches

2015-09-17 Thread Stefan Hajnoczi

The following changes since commit 619622424dba749feef752d76d79ef2569f7f250:

  Merge remote-tracking branch 
'remotes/berrange/tags/vnc-crypto-v9-for-upstream' into staging (2015-09-15 
15:42:58 +0100)

are available in the git repository at:

  git://github.com/stefanha/qemu.git tags/net-pull-request

for you to fetch changes up to 271a234a2359975101396916f37f3c7d347c61b8:

  net: smc91c111: flush packets on RCR register changes (2015-09-17 12:36:03 
+0100)





Jason Wang (1):
  MAINTAINERS: Stefan will not maintain net subsystem

Peter Crosthwaite (3):
  net: smc91c111: guard flush_queued_packets() on can_rx()
  net: smc91c111: gate can_receive() on rx FIFO having a slot
  net: smc91c111: flush packets on RCR register changes

 MAINTAINERS|  3 +--
 hw/net/smc91c111.c | 33 +
 2 files changed, 26 insertions(+), 10 deletions(-)

-- 
2.4.3

Re: [Qemu-devel] [PULL 0/19] xen-2015-09-08-tag

2015-09-17 Thread Stefano Stabellini

On Wed, 16 Sep 2015, Chen, Tiejun wrote:
> On 9/15/2015 7:00 PM, Paolo Bonzini wrote:
> > 
> > 
> > On 15/09/2015 11:55, Stefano Stabellini wrote:
> > > On Mon, 14 Sep 2015, Paolo Bonzini wrote:
> > > > > On 10/09/2015 12:29, Stefano Stabellini wrote:
> > > > > > > +if (lseek(config_fd, pos, SEEK_SET) != pos) {
> > > > > > > +return -errno;
> > > > > > > +}
> > > > > > >  do {
> > > > > > > -rc = pread(config_fd, (uint8_t *), len, pos);
> > > > > > > +rc = read(config_fd, (uint8_t *), len);
> > > > > > >  } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
> > > > >
> > > > > This leaks config_fd.
> > > I don't follow, it leaks config_fd where?
> > 
> > Where lseek returns -errno (and IIRC in other places in the same function).
> 
> Do you mean we need this change?

Yes, please send out a separate patch. Add my Acked-by.


> diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
> index 1fb71c8..7d44228 100644
> --- a/hw/pci-host/piix.c
> +++ b/hw/pci-host/piix.c
> @@ -775,15 +775,18 @@ static int host_pci_config_read(int pos, int len,
> uint32_t val)
>  }
> 
>  if (lseek(config_fd, pos, SEEK_SET) != pos) {
> +close(config_fd);
>  return -errno;
>  }
>  do {
>  rc = read(config_fd, (uint8_t *), len);
>  } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
>  if (rc != len) {
> +close(config_fd);
>  return -errno;
>  }
> 
> +close(config_fd);
>  return 0;
>  }
> 
> 
> Thanks
> Tiejun
>

Re: [Qemu-devel] [PATCH v4 3/3] pcie: Add support for Single Root I/O Virtualization (SR/IOV)

2015-09-17 Thread Marcel Apfelbaum


On 09/12/2015 03:36 PM, Knut Omang wrote:

This patch provides the building blocks for creating an SR/IOV
PCIe Extended Capability header and register/unregister
SR/IOV Virtual Functions.

Signed-off-by: Knut Omang 
---
  hw/pci/Makefile.objs|   2 +-
  hw/pci/pci.c|  99 
  hw/pci/pcie.c   |   9 +-
  hw/pci/pcie_sriov.c | 271 
  include/hw/pci/pci.h|  11 +-
  include/hw/pci/pcie.h   |   6 +
  include/hw/pci/pcie_sriov.h |  55 +
  include/qemu/typedefs.h |   2 +
  8 files changed, 426 insertions(+), 29 deletions(-)
  create mode 100644 hw/pci/pcie_sriov.c
  create mode 100644 include/hw/pci/pcie_sriov.h

diff --git a/hw/pci/Makefile.objs b/hw/pci/Makefile.objs
index 9f905e6..2226980 100644
--- a/hw/pci/Makefile.objs
+++ b/hw/pci/Makefile.objs
@@ -3,7 +3,7 @@ common-obj-$(CONFIG_PCI) += msix.o msi.o
  common-obj-$(CONFIG_PCI) += shpc.o
  common-obj-$(CONFIG_PCI) += slotid_cap.o
  common-obj-$(CONFIG_PCI) += pci_host.o pcie_host.o
-common-obj-$(CONFIG_PCI) += pcie.o pcie_aer.o pcie_port.o
+common-obj-$(CONFIG_PCI) += pcie.o pcie_aer.o pcie_port.o pcie_sriov.o

  common-obj-$(call lnot,$(CONFIG_PCI)) += pci-stub.o
  common-obj-$(CONFIG_ALL) += pci-stub.o
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index a5cc015..9c0eba1 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -153,6 +153,9 @@ int pci_bar(PCIDevice *d, int reg)
  {
  uint8_t type;

+/* PCIe virtual functions do not have their own BARs */
+assert(!pci_is_vf(d));
+
  if (reg != PCI_ROM_SLOT)
  return PCI_BASE_ADDRESS_0 + reg * 4;

@@ -211,22 +214,13 @@ void pci_device_deassert_intx(PCIDevice *dev)
  }
  }

-static void pci_do_device_reset(PCIDevice *dev)
+static void pci_reset_regions(PCIDevice *dev)
  {
  int r;
+if (pci_is_vf(dev)) {
+return;
+}

-pci_device_deassert_intx(dev);
-assert(dev->irq_state == 0);
-
-/* Clear all writable bits */
-pci_word_test_and_clear_mask(dev->config + PCI_COMMAND,
- pci_get_word(dev->wmask + PCI_COMMAND) |
- pci_get_word(dev->w1cmask + PCI_COMMAND));
-pci_word_test_and_clear_mask(dev->config + PCI_STATUS,
- pci_get_word(dev->wmask + PCI_STATUS) |
- pci_get_word(dev->w1cmask + PCI_STATUS));
-dev->config[PCI_CACHE_LINE_SIZE] = 0x0;
-dev->config[PCI_INTERRUPT_LINE] = 0x0;
  for (r = 0; r < PCI_NUM_REGIONS; ++r) {
  PCIIORegion *region = >io_regions[r];
  if (!region->size) {
@@ -240,6 +234,27 @@ static void pci_do_device_reset(PCIDevice *dev)
  pci_set_long(dev->config + pci_bar(dev, r), region->type);
  }
  }
+}
+
+static void pci_do_device_reset(PCIDevice *dev)
+{
+qdev_reset_all(>qdev);


Hi,
Thank you for resubmitting this series!

This is only a quick look, I hope I'll have more time next week to go over it 
again.



+
+dev->irq_state = 0;


Are you sure we need the assignment above? It seems that the irq_state
should be modified only by the intx wrappers as  pci_device_deassert_intx and 
so.



+pci_update_irq_status(dev);


Why do we have to update the irq status on reset?



+pci_device_deassert_intx(dev);
+assert(dev->irq_state == 0);
+
+/* Clear all writable bits */
+pci_word_test_and_clear_mask(dev->config + PCI_COMMAND,
+ pci_get_word(dev->wmask + PCI_COMMAND) |
+ pci_get_word(dev->w1cmask + PCI_COMMAND));
+pci_word_test_and_clear_mask(dev->config + PCI_STATUS,
+ pci_get_word(dev->wmask + PCI_STATUS) |
+ pci_get_word(dev->w1cmask + PCI_STATUS));
+dev->config[PCI_CACHE_LINE_SIZE] = 0x0;
+dev->config[PCI_INTERRUPT_LINE] = 0x0;
+pci_reset_regions(dev);
  pci_update_mappings(dev);

  msi_reset(dev);
@@ -771,6 +786,15 @@ static void pci_init_multifunction(PCIBus *bus, PCIDevice 
*dev, Error **errp)
  dev->config[PCI_HEADER_TYPE] |= PCI_HEADER_TYPE_MULTI_FUNCTION;
  }

+/* With SR/IOV and ARI, a device at function 0 need not be a multifunction
+ * device, as it may just be a VF that ended up with function 0 in
+ * the legacy PCI interpretation. Avoid failing in such cases:
+ */
+if (pci_is_vf(dev) &&
+dev->exp.sriov_vf.pf->cap_present & QEMU_PCI_CAP_MULTIFUNCTION) {
+return;
+}
+
  /*
   * multifunction bit is interpreted in two ways as follows.
   *   - all functions must set the bit to 1.
@@ -962,6 +986,7 @@ void pci_register_bar(PCIDevice *pci_dev, int region_num,
  uint64_t wmask;
  pcibus_t size = memory_region_size(memory);

+assert(!pci_is_vf(pci_dev)); /* VFs must use pcie_sriov_vf_register_bar */
  assert(region_num >= 0);
  assert(region_num <

Re: [Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-17 Thread Daniel P. Berrange

On Thu, Sep 17, 2015 at 12:32:56PM +0100, Peter Maydell wrote:
> On 17 September 2015 at 12:03, Daniel P. Berrange  wrote:
> > The README file is usually the first thing consulted when a user
> > or developer obtains a copy of the QEMU source. The current QEMU
> > README is lacking immediately useful information and so not very
> > friendly for first time encounters. It either redirects users to
> > qemu-doc.html (which does not exist until they've actually
> > compiled QEMU), or the website (which assumes the user has
> > convenient internet access at time of reading).
> >
> > This fills out the README file as simple quick-start guide on
> > the topics of building source, submitting patches, licensing
> > and how to contact the QEMU community. It does not intend to be
> > comprehensive, instead referring people to an appropriate web
> > page to obtain more detailed information. The intent is to give
> > users quick guidance to get them going in the right direction.
> >
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  README | 108 
> > +++--
> >  1 file changed, 106 insertions(+), 2 deletions(-)
> >
> > diff --git a/README b/README
> > index c7c990d..71142c3 100644
> > --- a/README
> > +++ b/README
> > @@ -1,3 +1,107 @@
> > -Read the documentation in qemu-doc.html or on http://wiki.qemu-project.org
> > + QEMU README
> > +===
> >
> > -- QEMU team
> > +QEMU is a generic and open source machine emulator and virtualizer. When 
> > used
> > +as a machine emulator, QEMU can run OSes and programs made for one machine
> > +(e.g. an ARM board) on a different machine (e.g. your own PC). By using 
> > dynamic
> > +translation, it achieves very good performance. When used as a virtualizer,
> > +QEMU achieves near native performances by executing the guest code 
> > directly on
> > +the host CPU. QEMU supports virtualization when executing under the Xen
> > +hypervisor or using the KVM kernel module in Linux.
> 
> This kind of forgets the linux-user use case (which isn't machine emulation).

FYI, I just copied this text from the qemu-project.org front page :-) If
anyone has suggestions for improvements I'm all ears - we should update
the website too.

> > +Building
> > +
> > +
> > +QEMU is multi-platform software intended to be buildable on all modern 
> > Linux
> > +platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety of other
> > +UNIX targets. The simple process to build QEMU is
> 
> This whole section seems to be duplicating our existing build
> documentation (which is in qemu-doc.texi in the 'compilation'
> section). We should document how to build QEMU in exactly one
> place, not two (though I can see the rationale for that one
> place not being in a .texi file.)

The problem I have with refering people to qemu-doc.html,
is that in order to order to view the docs on how to build
QEMU, you first have to build QEMU, or enjoy reading the
.texi source code :-)  Though that doc does get exposed
via the website too, it is nice to not rely on people having
internet access all the time.

The qemu-doc.html chapter 6 is a bit more detailed in what
it descibes. I tend to view the instructions we put in the
README file as the minimal quick-start, and then point to
the comprehensive docs as a detailed reference on the matter.

> > +  ./configure
> > +  make
> > +  sudo make install
> 
> I would prefer it if we recommended people to build in a separate
> build directory, ie:
> 
> mkdir build
> cd build
> ../configure
> make

Sure, that does make sense.

> 
> Also I'm not sure our 'make install' target is a great thing to recommend.

Any particular reason why 'make install' is bad ? I've not personally
had any trouble with it, though to be fair I always build with
'--prefix=$HOME/usr/qemu-git' so I'm not splattering stuff into /usr

> 
> > +
> > +The configure script supports a number of arguments to turn on/off various
> > +optional features. These can be seen with "configure --help".
> > +
> > +For additional information on building QEMU for Linux and Windows consult:
> > +
> > +  http://qemu-project.org/Hosts/Linux
> > +  http://qemu-project.org/Hosts/W32
> 
> We've just significantly improved our documentation for building
> on OSX, and we should mention it here.

Are those docs mentioned anywhere on the wiki, or just in the qemu-doc.html
file ?  I took these two links from this page:

  http://qemu-project.org/Documentation/GettingStartedDevelopers

which doesn't mention OS-X. So we should probably address that at the
same time.

> > +Licensing
> > +=
> 
> This section seems to be duplicating the LICENSE file.
> 
> > +
> > +  - QEMU as a whole is released under the GNU General Public License, 
> > version 2.
> > +
> > +  - Parts of QEMU have specific licenses which are compatible with the GNU
> > +General Public License, version 2. Hence each

Re: [Qemu-devel] [PATCH v4 1/3] pci: Make use of the devfn property when registering new devices

2015-09-17 Thread Knut Omang

On Thu, 2015-09-17 at 14:11 +0300, Marcel Apfelbaum wrote:
> On 09/12/2015 03:36 PM, Knut Omang wrote:
> > Without this, the devfn argument to pci_create_*()
> > does not affect the assigned devfn.
> > 
> > Needed to support (VF_STRIDE,VF_OFFSET) values other than (1,1)
> > for SR/IOV.
> > 
> > Signed-off-by: Knut Omang 
> > ---
> >   hw/pci/pci.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index ccea628..a5cc015 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -1840,7 +1840,7 @@ static void pci_qdev_realize(DeviceState
> > *qdev, Error **errp)
> >   bus = PCI_BUS(qdev_get_parent_bus(qdev));
> >   pci_dev = do_pci_register_device(pci_dev, bus,
> >   
> >  object_get_typename(OBJECT(qdev)),
> > - pci_dev->devfn, errp);
> > +
> >  object_property_get_int(OBJECT(qdev), "addr", NULL), errp);
> Hi,
> 
> I don't get this, using object_property_get_int on "addr" should
> return the value of pci_dev->devfn,
> can you please tell what exactly is not working?

The problem is that at that point pci_dev->devfn has not been set yet -
have commented on this before somewhere..

Knut

> Thanks,
> Marcel
> 
> 
> 
> >   if (pci_dev == NULL)
> >   return;
> > 
> > 
>

Re: [Qemu-devel] [PULL v2 00/24] Misc patches for 2015-09-16

2015-09-17 Thread Peter Maydell

On 16 September 2015 at 16:40, Paolo Bonzini  wrote:
> The following changes since commit 007e620a7576e4ce2ea6955541e87d8ae8ed32ae:
>
>   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging 
> (2015-09-14 18:51:09 +0100)
>
> are available in the git repository at:
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to d6268348493f32ecc096caa637620757472a1196:
>
>   nbd: release exp->blk after all clients are closed (2015-09-16 17:33:33 
> +0200)
>
> I moved the revert first in the pull request since I was at it.
>
> 
> * Linux header update and cleanup
> * Support for HyperV crash report
> * Cleanup of target-specific HMP commands
> * Multiarch batch
> * Checkpatch fix for Perl 5.22
> * NBD fix
> * Revert incorrect commit 5243722376
>

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH] virtio: add some migration doc

2015-09-17 Thread Jason Wang



On 09/17/2015 06:39 PM, Greg Kurz wrote:
> On Thu, 17 Sep 2015 11:06:01 +0200
> Cornelia Huck  wrote:
>
>> Try to cover the basics of virtio migration.
>>
>> Signed-off-by: Cornelia Huck 
>> ---
>> It might help if we add some documentation; at the very least, it will
>> prevent myself getting a headache everytime I look at that code :)
>>
>> Feedback welcome.
> Excellent ! This is a very good to start with.
>
> Reviewed-by: Greg Kurz 
>
> Just a thought below...
>
>> ---
>>  docs/virtio-migration.txt | 101 
>> ++
>>  1 file changed, 101 insertions(+)
>>  create mode 100644 docs/virtio-migration.txt
>>
>> diff --git a/docs/virtio-migration.txt b/docs/virtio-migration.txt
>> new file mode 100644
>> index 000..9c575e6
>> --- /dev/null
>> +++ b/docs/virtio-migration.txt
>> @@ -0,0 +1,101 @@
>> +Virtio devices and migration
>> +
>> +
>> +Saving and restoring the state of virtio devices is a bit of a twisty maze,
>> +for several reasons:
>> +- state is distributed between several parts:
>> +  - virtio core, for common fields like features, number of queues, ...
>> +  - virtio transport (pci, ccw, ...), for the different proxy devices and
>> +transport specific state (msix vectors, indicators, ...)
>> +  - virtio device (net, blk, ...), for the different device types and their
>> +state (mac address, request queue, ...)
>> +- most fields are saved via the stream interface; subsequently, subsections
>> +  have been added to make cross-version migration possible
>> +
>> +This file attempts to document the current procedure and point out some
>> +caveats.
>> +
>> +
>> +Save state procedure
>> +
>> +
>> +virtio core   virtio transport  virtio device
>> +---     -
>> +
>> +save() function 
>> registered
>> +via register_savevm()
>> +virtio_save()   <--
>> + -->  save_config()
>> +  - save proxy device
>> +  - save transport-specific
>> +device fields
>> +- save common device
>> +  fields
>> +- save common virtqueue
>> +  fields
>> + -->  save_queue()
>> +  - save transport-specific
>> +virtqueue fields
>> + -->   save_device()
>> +   - save device-specific
>> + fields
>> +- save subsections
>> +  - device endianness,
>> +if changed from
>> +default endianness
>> +  - 64 bit features, if
>> +any high feature bit
>> +is set
>> +  - virtio-1 virtqueue
>> +fields, if VERSION_1
>> +is set
>> +
>> +
>> +Load state procedure
>> +
>> +
>> +virtio core   virtio transport  virtio device
>> +---     -
>> +
>> +load() function 
>> registered
>> +via register_savevm()
>> +virtio_load()   <--
>> + -->  load_config()
>> +  - load proxy device
>> +  - load transport-specific
>> +device fields
>> +- load common device
>> +  fields
>> +- load common virtqueue
>> +  fields
>> + -->  load_queue()
>> +  - load transport-specific
>> +virtqueue fields
>> +- notify guest
>> + -->   load_device()
>> +   - load device-specific
>> + fields
>> +- load subsections
>> +  - device endianness
>> +  - 64 bit features
>> +  - virtio-1 virtqueue
>> +fields
>> +- sanitize endianness
>> +- sanitize features
>> +- virtqueue index sanity
>> +  check
>> +   - feature-dependent setup
>> +
>> +
>> +Implications of this setup
>> +==
>> +
>> +Devices need to be careful in their state processing during load: The
>> +load_device() procedure is invoked by the core before subsections have
>> +been loaded. Any code that depends on information transmitted in subsections
>> +therefore has to be invoked in the device's load() function _after_
>> +virtio_load() returned (like e.g. code depending on features).
>> +
> From a VMState standpoint, such code would land in a post-load callback most 
> of
> the time. Would

[Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-17 Thread Daniel P. Berrange

The README file is usually the first thing consulted when a user
or developer obtains a copy of the QEMU source. The current QEMU
README is lacking immediately useful information and so not very
friendly for first time encounters. It either redirects users to
qemu-doc.html (which does not exist until they've actually
compiled QEMU), or the website (which assumes the user has
convenient internet access at time of reading).

This fills out the README file as simple quick-start guide on
the topics of building source, submitting patches, licensing
and how to contact the QEMU community. It does not intend to be
comprehensive, instead referring people to an appropriate web
page to obtain more detailed information. The intent is to give
users quick guidance to get them going in the right direction.

Signed-off-by: Daniel P. Berrange 
---
 README | 108 +++--
 1 file changed, 106 insertions(+), 2 deletions(-)

diff --git a/README b/README
index c7c990d..71142c3 100644
--- a/README
+++ b/README
@@ -1,3 +1,107 @@
-Read the documentation in qemu-doc.html or on http://wiki.qemu-project.org
+ QEMU README
+===
 
-- QEMU team
+QEMU is a generic and open source machine emulator and virtualizer. When used
+as a machine emulator, QEMU can run OSes and programs made for one machine
+(e.g. an ARM board) on a different machine (e.g. your own PC). By using dynamic
+translation, it achieves very good performance. When used as a virtualizer,
+QEMU achieves near native performances by executing the guest code directly on
+the host CPU. QEMU supports virtualization when executing under the Xen
+hypervisor or using the KVM kernel module in Linux.
+
+
+Building
+
+
+QEMU is multi-platform software intended to be buildable on all modern Linux
+platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety of other
+UNIX targets. The simple process to build QEMU is
+
+  ./configure
+  make
+  sudo make install
+
+The configure script supports a number of arguments to turn on/off various
+optional features. These can be seen with "configure --help".
+
+For additional information on building QEMU for Linux and Windows consult:
+
+  http://qemu-project.org/Hosts/Linux
+  http://qemu-project.org/Hosts/W32
+
+
+Submitting patches
+==
+
+The QEMU source code is maintained under the GIT version control system.
+
+   git clone git://git.qemu-project.org/qemu.git
+
+When submitting patches, the preferred approach is to use 'git format-patch'
+and/or 'git send-email' to format & send the mail to the qemu-devel@nongnu.org
+mailing list. All patches submitted must contain a 'Signed-off-by' line from
+the author.
+
+For additional information on submitting patches consult:
+
+  http://qemu-project.org/Contribute/SubmitAPatch
+  http://qemu-project.org/Contribute/TrivialPatches
+
+
+Bug reporting
+=
+
+The QEMU project uses Launchpad as its primary upstream bug tracker. Bugs
+found when running code built from QEMU git or upstream released sources
+should be reported via:
+
+  https://bugs.launchpad.net/qemu/
+
+If using QEMU via an operating system vendor pre-built binary package, it
+is preferrable to report bugs to the vendor's own bug tracker first. If the
+bug is also known to affect latest upstream code, it can also be reported
+via launchpad.
+
+For additional information on bug reporting consult:
+
+  http://qemu-project.org/Contribute/ReportABug
+
+
+Licensing
+=
+
+  - QEMU as a whole is released under the GNU General Public License, version 
2.
+
+  - Parts of QEMU have specific licenses which are compatible with the GNU
+General Public License, version 2. Hence each source file contains its own
+licensing information. Source files with no licensing information are
+released under the GNU General Public License, version 2 or (at your
+option) any later version. As of July 2013, contributions under version
+2 of the GNU General Public License (and no later version) are only
+accepted for the following files or directories: bsd-user/, linux-user/,
+hw/misc/vfio.c, hw/xen/xen_pt*.
+
+  - The Tiny Code Generator (TCG) is released under the BSD license (see
+license headers in files).
+
+  - QEMU is a trademark of Fabrice Bellard.
+
+For additional information on QEMU licensing consult:
+
+  http://qemu-project.org/License
+
+
+Contact
+===
+
+The QEMU community can be contacted in a number of ways, with the two main
+methods being E-Mail and IRC
+
+ - qemu-devel@nongnu.org (http://lists.nongnu.org/mailman/listinfo/qemu-devel)
+ - #qemu on irc.oftc.net
+
+For additional information on contacted the community consult:
+
+  http://qemu-project.org/Contribute/StartHere
+
+-- End
-- 
2.4.3

Re: [Qemu-devel] [PATCH v4 1/3] pci: Make use of the devfn property when registering new devices

2015-09-17 Thread Marcel Apfelbaum


On 09/12/2015 03:36 PM, Knut Omang wrote:

Without this, the devfn argument to pci_create_*()
does not affect the assigned devfn.

Needed to support (VF_STRIDE,VF_OFFSET) values other than (1,1)
for SR/IOV.

Signed-off-by: Knut Omang 
---
  hw/pci/pci.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index ccea628..a5cc015 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1840,7 +1840,7 @@ static void pci_qdev_realize(DeviceState *qdev, Error 
**errp)
  bus = PCI_BUS(qdev_get_parent_bus(qdev));
  pci_dev = do_pci_register_device(pci_dev, bus,
   object_get_typename(OBJECT(qdev)),
- pci_dev->devfn, errp);
+ object_property_get_int(OBJECT(qdev), 
"addr", NULL), errp);

Hi,

I don't get this, using object_property_get_int on "addr" should return the value 
of pci_dev->devfn,
can you please tell what exactly is not working?

Thanks,
Marcel




  if (pci_dev == NULL)
  return;

Re: [Qemu-devel] [PATCH 2/3] monitor: throttle QAPI_EVENT_VSERPORT_CHANGE by "id"

2015-09-17 Thread Marc-André Lureau

Hi

On Wed, Sep 16, 2015 at 6:40 PM, Markus Armbruster  wrote:
>
> I got a few more questions.
>
> Do we expect more events in need of separate throttling by some ID-like
> member of their data object?
> Do we expect *different* callbacks?  Callbacks that throttle by some
> member of their data object won't count as different.
>
> Do we really need the callback and the type punning?  As long as all we
> have is the current two throttles, I'd try to do without, as follows.
>

That's not a question you can answer easily, but I can easily imagine
that we may throttle events based on different conditions. So I prefer
a modular code if it's easy to do and hopefully easy to understand.

> monitor_qapi_event_throttle() takes the name of a data member as
> additional argument.  If null, throttle like monitor_qapi_event_delay().
> Else, throttle like monitor_qapi_event_id_delay().  Except do it in a
> single function, with a single MonitorQAPIEventState.  The natural
> representation of "either a single MonitorQAPIEventPending or a hash
> table of them" is a union.

You need a dispatch in a single function then, and it's very much not
modular code that way. I'd prefer we keep the modularity this patch
brings: different functions and structure for different throttle
implementations.

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH] virtio: add some migration doc

2015-09-17 Thread Greg Kurz

On Thu, 17 Sep 2015 11:06:01 +0200
Cornelia Huck  wrote:

> Try to cover the basics of virtio migration.
> 
> Signed-off-by: Cornelia Huck 
> ---
> It might help if we add some documentation; at the very least, it will
> prevent myself getting a headache everytime I look at that code :)
> 
> Feedback welcome.

Excellent ! This is a very good to start with.

Reviewed-by: Greg Kurz 

Just a thought below...

> ---
>  docs/virtio-migration.txt | 101 
> ++
>  1 file changed, 101 insertions(+)
>  create mode 100644 docs/virtio-migration.txt
> 
> diff --git a/docs/virtio-migration.txt b/docs/virtio-migration.txt
> new file mode 100644
> index 000..9c575e6
> --- /dev/null
> +++ b/docs/virtio-migration.txt
> @@ -0,0 +1,101 @@
> +Virtio devices and migration
> +
> +
> +Saving and restoring the state of virtio devices is a bit of a twisty maze,
> +for several reasons:
> +- state is distributed between several parts:
> +  - virtio core, for common fields like features, number of queues, ...
> +  - virtio transport (pci, ccw, ...), for the different proxy devices and
> +transport specific state (msix vectors, indicators, ...)
> +  - virtio device (net, blk, ...), for the different device types and their
> +state (mac address, request queue, ...)
> +- most fields are saved via the stream interface; subsequently, subsections
> +  have been added to make cross-version migration possible
> +
> +This file attempts to document the current procedure and point out some
> +caveats.
> +
> +
> +Save state procedure
> +
> +
> +virtio core   virtio transport  virtio device
> +---     -
> +
> +save() function 
> registered
> +via register_savevm()
> +virtio_save()   <--
> + -->  save_config()
> +  - save proxy device
> +  - save transport-specific
> +device fields
> +- save common device
> +  fields
> +- save common virtqueue
> +  fields
> + -->  save_queue()
> +  - save transport-specific
> +virtqueue fields
> + -->   save_device()
> +   - save device-specific
> + fields
> +- save subsections
> +  - device endianness,
> +if changed from
> +default endianness
> +  - 64 bit features, if
> +any high feature bit
> +is set
> +  - virtio-1 virtqueue
> +fields, if VERSION_1
> +is set
> +
> +
> +Load state procedure
> +
> +
> +virtio core   virtio transport  virtio device
> +---     -
> +
> +load() function 
> registered
> +via register_savevm()
> +virtio_load()   <--
> + -->  load_config()
> +  - load proxy device
> +  - load transport-specific
> +device fields
> +- load common device
> +  fields
> +- load common virtqueue
> +  fields
> + -->  load_queue()
> +  - load transport-specific
> +virtqueue fields
> +- notify guest
> + -->   load_device()
> +   - load device-specific
> + fields
> +- load subsections
> +  - device endianness
> +  - 64 bit features
> +  - virtio-1 virtqueue
> +fields
> +- sanitize endianness
> +- sanitize features
> +- virtqueue index sanity
> +  check
> +   - feature-dependent setup
> +
> +
> +Implications of this setup
> +==
> +
> +Devices need to be careful in their state processing during load: The
> +load_device() procedure is invoked by the core before subsections have
> +been loaded. Any code that depends on information transmitted in subsections
> +therefore has to be invoked in the device's load() function _after_
> +virtio_load() returned (like e.g. code depending on features).
> +

From a VMState standpoint, such code would land in a post-load callback most of
the time. Would that help comprehension if we introduce a virtio_post_load()
function ?

> +Any extension of the state being migrated should be done in subsections
> +added to the core for

Re: [Qemu-devel] [PATCH] virtio: add some migration doc

2015-09-17 Thread Jason Wang



On 09/17/2015 05:06 PM, Cornelia Huck wrote:
> Try to cover the basics of virtio migration.
>
> Signed-off-by: Cornelia Huck 
> ---
> It might help if we add some documentation; at the very least, it will
> prevent myself getting a headache everytime I look at that code :)
>
> Feedback welcome.
> ---
>  docs/virtio-migration.txt | 101 
> ++
>  1 file changed, 101 insertions(+)
>  create mode 100644 docs/virtio-migration.txt

Thanks for the patch. Looks good to me. Keeping this doc to be synced
with code is important in the future. E.g I add a new subsection in core
with my pci virtio 1.0 fixes.

>
> diff --git a/docs/virtio-migration.txt b/docs/virtio-migration.txt
> new file mode 100644
> index 000..9c575e6
> --- /dev/null
> +++ b/docs/virtio-migration.txt
> @@ -0,0 +1,101 @@
> +Virtio devices and migration
> +
> +
> +Saving and restoring the state of virtio devices is a bit of a twisty maze,
> +for several reasons:
> +- state is distributed between several parts:
> +  - virtio core, for common fields like features, number of queues, ...
> +  - virtio transport (pci, ccw, ...), for the different proxy devices and
> +transport specific state (msix vectors, indicators, ...)
> +  - virtio device (net, blk, ...), for the different device types and their
> +state (mac address, request queue, ...)
> +- most fields are saved via the stream interface; subsequently, subsections
> +  have been added to make cross-version migration possible
> +
> +This file attempts to document the current procedure and point out some
> +caveats.
> +
> +
> +Save state procedure
> +
> +
> +virtio core   virtio transport  virtio device
> +---     -
> +
> +save() function 
> registered
> +via register_savevm()
> +virtio_save()   <--
> + -->  save_config()
> +  - save proxy device
> +  - save transport-specific
> +device fields
> +- save common device
> +  fields
> +- save common virtqueue
> +  fields
> + -->  save_queue()
> +  - save transport-specific
> +virtqueue fields
> + -->   save_device()
> +   - save device-specific
> + fields
> +- save subsections
> +  - device endianness,
> +if changed from
> +default endianness
> +  - 64 bit features, if
> +any high feature bit
> +is set
> +  - virtio-1 virtqueue
> +fields, if VERSION_1
> +is set
> +
> +
> +Load state procedure
> +
> +
> +virtio core   virtio transport  virtio device
> +---     -
> +
> +load() function 
> registered
> +via register_savevm()
> +virtio_load()   <--
> + -->  load_config()
> +  - load proxy device
> +  - load transport-specific
> +device fields
> +- load common device
> +  fields
> +- load common virtqueue
> +  fields
> + -->  load_queue()
> +  - load transport-specific
> +virtqueue fields
> +- notify guest
> + -->   load_device()
> +   - load device-specific
> + fields
> +- load subsections
> +  - device endianness
> +  - 64 bit features
> +  - virtio-1 virtqueue
> +fields
> +- sanitize endianness
> +- sanitize features
> +- virtqueue index sanity
> +  check
> +   - feature-dependent setup
> +
> +
> +Implications of this setup
> +==
> +
> +Devices need to be careful in their state processing during load: The
> +load_device() procedure is invoked by the core before subsections have
> +been loaded. Any code that depends on information transmitted in subsections
> +therefore has to be invoked in the device's load() function _after_
> +virtio_load() returned (like e.g. code depending on features).
> +
> +Any extension of the state being migrated should be done in subsections
> +added to the core for compatibility reasons. If transport or device specific
> +state is added, core needs to invoke a callback from the new subsection.

Re: [Qemu-devel] [PATCH] virtio: add some migration doc

2015-09-17 Thread Cornelia Huck

On Thu, 17 Sep 2015 12:39:31 +0200
Greg Kurz  wrote:
> On Thu, 17 Sep 2015 11:06:01 +0200
> Cornelia Huck  wrote:

> > +Devices need to be careful in their state processing during load: The
> > +load_device() procedure is invoked by the core before subsections have
> > +been loaded. Any code that depends on information transmitted in 
> > subsections
> > +therefore has to be invoked in the device's load() function _after_
> > +virtio_load() returned (like e.g. code depending on features).
> > +
> 
> From a VMState standpoint, such code would land in a post-load callback most 
> of
> the time. Would that help comprehension if we introduce a virtio_post_load()
> function ?

It might. I think the main problem is that we need to do some stuff
post-load, and it's not really obvious what it is.

Re: [Qemu-devel] [PATCH] virtio: add some migration doc

2015-09-17 Thread Cornelia Huck

On Thu, 17 Sep 2015 18:44:00 +0800
Jason Wang  wrote:
> 
> 
> On 09/17/2015 05:06 PM, Cornelia Huck wrote:
> > Try to cover the basics of virtio migration.
> >
> > Signed-off-by: Cornelia Huck 
> > ---
> > It might help if we add some documentation; at the very least, it will
> > prevent myself getting a headache everytime I look at that code :)
> >
> > Feedback welcome.
> > ---
> >  docs/virtio-migration.txt | 101 
> > ++
> >  1 file changed, 101 insertions(+)
> >  create mode 100644 docs/virtio-migration.txt
> 
> Thanks for the patch. Looks good to me. Keeping this doc to be synced
> with code is important in the future. E.g I add a new subsection in core
> with my pci virtio 1.0 fixes.

Yes, that's a drawback of the "document it" approach...

Re: [Qemu-devel] [PATCH 2/4] ahci: fix signature generation

2015-09-17 Thread Stefan Hajnoczi

On Tue, Sep 01, 2015 at 04:50:39PM -0400, John Snow wrote:
> The initial register device-to-host FIS no longer needs to specially
> set certain fields, as these can be handled generically by setting those
> fields explicitly with the signatures we want at port reset time.
> 
> (1) Signatures are decomposed into their four component registers and
> set upon (AHCI) port reset.
> (2) the signature cache register is no longer set manually per-each
> device type, but instead just once during ahci_init_d2h.
> 
> Signed-off-by: John Snow 
> ---
>  hw/ide/ahci.c | 32 
>  1 file changed, 20 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index d04a161..3c50ccb 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -542,20 +542,31 @@ static void ahci_init_d2h(AHCIDevice *ad)
>  {
>  uint8_t init_fis[20];
>  IDEState *ide_state = >port.ifs[0];
> +AHCIPortRegs *pr = >port_regs;
>  
>  memset(init_fis, 0, sizeof(init_fis));
>  
> -init_fis[4] = 1;
> -init_fis[12] = 1;
> -
> -if (ide_state->drive_kind == IDE_CD) {
> -init_fis[5] = ide_state->lcyl;
> -init_fis[6] = ide_state->hcyl;
> -}
> +/* We're emulating receiving the first Reg H2D Fis from the device;
> + * Update the SIG register, but otherwise procede as normal. */

s/procede/proceed/

Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area

2015-09-17 Thread Xiao Guangrong




On 09/16/2015 12:07 AM, Paolo Bonzini wrote:



On 26/08/2015 12:40, Xiao Guangrong wrote:


+
+size = get_file_size(fd);
+buf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);


I guess the user will want to choose between MAP_SHARED and MAP_PRIVATE.
This can be added in the future.


Good idea, it will allow guest to write data but discards its content
after it exits. Will implement O_RDONLY + MAP_PRIVATE in the near future.


FWIW, if Igor's backend/frontend idea is implemented, the choice between
MAP_SHARED and MAP_PRIVATE should belong in the backend.


Yes. I can not agree with you more! :)

Re: [Qemu-devel] [PATCH] cpu-exec: Add "nochain" debug flag

2015-09-17 Thread Peter Maydell

On 16 September 2015 at 23:34, Richard Henderson  wrote:
> Respect it to avoid linking TBs together.
>
> Signed-off-by: Richard Henderson 

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [Qemu-devel] [PATCH v3 34/46] ivshmem-server: fix hugetlbfs support

2015-09-17 Thread Vladimir Sementsov-Ogievskiy


On 16.09.2015 19:14, Marc-André Lureau wrote:


- Original Message -

On 15.09.2015 19:07, marcandre.lur...@redhat.com wrote:

From: Marc-André Lureau 

As pointed out on the ML by Andrew Jones, glibc no longer permits
creating POSIX shm on hugetlbfs directly. When given a hugetlbfs path,
create a shareable file there.

Signed-off-by: Marc-André Lureau 
---
   contrib/ivshmem-server/ivshmem-server.c | 40
   -
   contrib/ivshmem-server/ivshmem-server.h |  3 +--
   contrib/ivshmem-server/main.c   |  5 ++---
   3 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/contrib/ivshmem-server/ivshmem-server.c
b/contrib/ivshmem-server/ivshmem-server.c
index 972fda2..51264b4 100644
--- a/contrib/ivshmem-server/ivshmem-server.c
+++ b/contrib/ivshmem-server/ivshmem-server.c
@@ -11,6 +11,7 @@
   #include 
   #include 
   #include 
+#include 
   
   #include "qemu-common.h"

   #include "qemu/queue.h"
@@ -271,15 +272,52 @@ ivshmem_server_init(IvshmemServer *server, const char
*unix_sock_path,
   return 0;
   }
   
+#define HUGETLBFS_MAGIC   0x958458f6

+
+static long gethugepagesize(const char *path)
+{
+struct statfs fs;
+int ret;
+
+do {
+ret = statfs(path, );
+} while (ret != 0 && errno == EINTR);
+
+if (ret != 0) {
+if (errno != ENOENT) {
+fprintf(stderr, "cannot stat shm file %s: %s\n", path,
+strerror(errno));
+}
+return -1;
+}
+
+if (fs.f_type != HUGETLBFS_MAGIC) {

should it be silent?

Well, the same given path may be shm name or a hugefs. I guess we could use a 
different argument instead and make sure that it is hugefs. At the minimum, I 
understand we may want a warning here.


+return -1;
+}
+
+return fs.f_bsize;
+}
+
   /* open shm, create and bind to the unix socket */
   int
   ivshmem_server_start(IvshmemServer *server)
   {
   struct sockaddr_un sun;
   int shm_fd, sock_fd, ret;
+long hpagesize;
+gchar *filename;
   
   /* open shm file */

-shm_fd = shm_open(server->shm_path, O_CREAT|O_RDWR, S_IRWXU);
+hpagesize = gethugepagesize(server->shm_path);
+if (hpagesize > 0) {

filename is used only in this block, it may be defined here.

ok


+filename = g_strdup_printf("%s/ivshmem.XX", server->shm_path);
+shm_fd = mkstemp(filename);
+unlink(filename);
+g_free(filename);
+} else {
+shm_fd = shm_open(server->shm_path, O_CREAT|O_RDWR, S_IRWXU);
+}
+
   if (shm_fd < 0) {
   fprintf(stderr, "cannot open shm file %s: %s\n",
   server->shm_path,
   strerror(errno));
diff --git a/contrib/ivshmem-server/ivshmem-server.h
b/contrib/ivshmem-server/ivshmem-server.h
index 2176d5e..e9b0e7a 100644
--- a/contrib/ivshmem-server/ivshmem-server.h
+++ b/contrib/ivshmem-server/ivshmem-server.h
@@ -81,8 +81,7 @@ typedef struct IvshmemServer {
* @server: A pointer to an uninitialized IvshmemServer structure
* @unix_sock_path: The pointer to the unix socket file name
* @shm_path:   Path to the shared memory. The path corresponds to a
POSIX
- *  shm name. To use a real file, for instance in a
hugetlbfs,
- *  it is possible to use /../../abspath/to/file.
+ *  shm name or a hugetlbfs mount point.
* @shm_size:   Size of shared memory
* @n_vectors:  Number of interrupt vectors per client
* @verbose:True to enable verbose mode
diff --git a/contrib/ivshmem-server/main.c b/contrib/ivshmem-server/main.c
index 84ffc4d..cd8d9ed 100644
--- a/contrib/ivshmem-server/main.c
+++ b/contrib/ivshmem-server/main.c
@@ -47,9 +47,8 @@ ivshmem_server_usage(const char *name, int code)
   " to listen to.\n"
   " Default=%s\n",
   IVSHMEM_SERVER_DEFAULT_UNIX_SOCK_PATH);
   fprintf(stderr, "  -m : path to the shared memory.\n"
-" The path corresponds to a POSIX shm name. To use
a\n"
-" real file, for instance in a hugetlbfs, use\n"
-" /../../abspath/to/file.\n"
+" The path corresponds to a POSIX shm name or a\n"
+" hugetlbfs mount point.\n"
   " default=%s\n",
   IVSHMEM_SERVER_DEFAULT_SHM_PATH);
   fprintf(stderr, "  -l : size of shared memory in bytes. The
   suffix\n"
   " K, M and G can be used (ex: 1K means 1024).\n"


--
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this
inconvenience.



with these tiny fixes:

Reviewed-by: Vladimir Sementsov-Ogievskiy 



--
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

Re: [Qemu-devel] [PATCH] hw/arm/virt-acpi-build: Fix wrong size of flash

2015-09-17 Thread Andrew Jones

On Thu, Sep 17, 2015 at 09:57:21AM +0800, shannon.z...@linaro.org wrote:
> From: Shannon Zhao 
> 
> While virt machine creates two flash devices with total size 0x0800,
> it wrongly uses this total size for each one. So it will overlap other
> MMIO spaces.
> 
> Signed-off-by: Shannon Zhao 
> ---
>  hw/arm/virt-acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 2073573..bc858c8 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -114,7 +114,7 @@ static void acpi_dsdt_add_flash(Aml *scope, const 
> MemMapEntry *flash_memmap)
>  {
>  Aml *dev, *crs;
>  hwaddr base = flash_memmap->base;
> -hwaddr size = flash_memmap->size;
> +hwaddr size = flash_memmap->size / 2;
>  
>  dev = aml_device("FLS0");
>  aml_append(dev, aml_name_decl("_HID", aml_string("LNRO0015")));
> -- 
> 2.1.0
> 
>

A sentence in the commit message saying that the DT generation also
gives each device one half the total size would be nice

Reviewed-by: Andrew Jones

Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area

2015-09-17 Thread Xiao Guangrong




On 09/16/2015 12:06 AM, Paolo Bonzini wrote:



On 25/08/2015 18:03, Stefan Hajnoczi wrote:


+static uint64_t get_file_size(int fd)
+{
+struct stat stat_buf;
+uint64_t size;
+
+if (fstat(fd, _buf) < 0) {
+return 0;
+}
+
+if (S_ISREG(stat_buf.st_mode)) {
+return stat_buf.st_size;
+}
+
+if (S_ISBLK(stat_buf.st_mode) && !ioctl(fd, BLKGETSIZE64, )) {
+return size;
+}

#ifdef __linux__ for ioctl(fd, BLKGETSIZE64, )?

There is nothing Linux-specific about emulating NVDIMMs so this code
should compile on all platforms.


The code from block/raw-posix.c and block/raw-win32.c's raw_getlength
should probably be extracted to a new function in utils/, and reused here.



The function you pointed out is really complex - it mixed 9 platforms and each
platform has its own specific implementation. It is hard for us to verify the
change.

I'd prefer to make it for Linux specific first then share it to other platforms
if it's needed in the future.

[Qemu-devel] [PATCH v4] ppc/spapr: Implement H_RANDOM hypercall in QEMU

2015-09-17 Thread Thomas Huth

The PAPR interface defines a hypercall to pass high-quality
hardware generated random numbers to guests. Recent kernels can
already provide this hypercall to the guest if the right hardware
random number generator is available. But in case the user wants
to use another source like EGD, or QEMU is running with an older
kernel, we should also have this call in QEMU, so that guests that
do not support virtio-rng yet can get good random numbers, too.

This patch now adds a new pseudo-device to QEMU that either
directly provides this hypercall to the guest or is able to
enable the in-kernel hypercall if available. The in-kernel
hypercall can be enabled with the use-kvm property, e.g.:

 qemu-system-ppc64 -device spapr-rng,use-kvm=true

For handling the hypercall in QEMU instead, a "RngBackend" is
required since the hypercall should provide "good" random data
instead of pseudo-random (like from a "simple" library function
like rand() or g_random_int()). Since there are multiple RngBackends
available, the user must select an appropriate back-end via the
"rng" property of the device, e.g.:

 qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=gid0 \
   -device spapr-rng,rng=gid0 ...

See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
other example of specifying RngBackends.

Signed-off-by: Thomas Huth 
---
 hw/ppc/Makefile.objs   |   2 +-
 hw/ppc/spapr.c |   8 +++
 hw/ppc/spapr_rng.c | 186 +
 include/hw/ppc/spapr.h |   4 ++
 target-ppc/kvm.c   |   9 +++
 target-ppc/kvm_ppc.h   |   5 ++
 6 files changed, 213 insertions(+), 1 deletion(-)
 create mode 100644 hw/ppc/spapr_rng.c

diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index c8ab06e..c1ffc77 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -3,7 +3,7 @@ obj-y += ppc.o ppc_booke.o
 # IBM pSeries (sPAPR)
 obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o
 obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
-obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o
+obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
 ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
 obj-y += spapr_pci_vfio.o
 endif
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bf0c64f..34e7d24 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -768,6 +768,14 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr,
 exit(1);
 }
 
+if (object_resolve_path_type("", TYPE_SPAPR_RNG, NULL)) {
+ret = spapr_rng_populate_dt(fdt);
+if (ret < 0) {
+fprintf(stderr, "could not set up rng device in the fdt\n");
+exit(1);
+}
+}
+
 QLIST_FOREACH(phb, >phbs, list) {
 ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
 }
diff --git a/hw/ppc/spapr_rng.c b/hw/ppc/spapr_rng.c
new file mode 100644
index 000..ed43d5e
--- /dev/null
+++ b/hw/ppc/spapr_rng.c
@@ -0,0 +1,186 @@
+/*
+ * QEMU sPAPR random number generator "device" for H_RANDOM hypercall
+ *
+ * Copyright 2015 Thomas Huth, Red Hat Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License,
+ * or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ */
+
+#include "qemu/error-report.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/rng.h"
+#include "hw/ppc/spapr.h"
+#include "kvm_ppc.h"
+
+#define SPAPR_RNG(obj) \
+OBJECT_CHECK(sPAPRRngState, (obj), TYPE_SPAPR_RNG)
+
+struct sPAPRRngState {
+/*< private >*/
+DeviceState ds;
+RngBackend *backend;
+bool use_kvm;
+};
+typedef struct sPAPRRngState sPAPRRngState;
+
+struct HRandomData {
+QemuSemaphore sem;
+union {
+uint64_t v64;
+uint8_t v8[8];
+} val;
+int received;
+};
+typedef struct HRandomData HRandomData;
+
+/* Callback function for the RngBackend */
+static void random_recv(void *dest, const void *src, size_t size)
+{
+HRandomData *hrdp = dest;
+
+if (src && size > 0) {
+assert(size + hrdp->received <= sizeof(hrdp->val.v8));
+memcpy(>val.v8[hrdp->received], src, size);
+hrdp->received += size;
+}
+
+qemu_sem_post(>sem);
+}
+
+/* Handler for the H_RANDOM hypercall */
+static target_ulong h_random(PowerPCCPU *cpu, sPAPRMachineState *spapr,
+ target_ulong opcode, target_ulong *args)
+{
+sPAPRRngState *rngstate;
+HRandomData

Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area

2015-09-17 Thread Igor Mammedov

On Thu, 17 Sep 2015 16:39:12 +0800
Xiao Guangrong  wrote:

> 
> 
> On 09/16/2015 12:10 AM, Paolo Bonzini wrote:
> >
> >
> > On 01/09/2015 11:14, Stefan Hajnoczi wrote:
> 
>  When I was digging into live migration code, i noticed that the same MR 
>  name may
>  cause the name "idstr", please refer to qemu_ram_set_idstr().
> 
>  Since nvdimm devices do not have parent-bus, it will trigger the abort() 
>  in that
>  function.
> >> I see.  The other devices that use a constant name are on a bus so the
> >> abort doesn't trigger.
> >
> > However, the MR name must be the same across the two machines.  Indices
> > are not friendly to hotplug.  Even though hotplug isn't supported now,
> > we should prepare and try not to change migration format when we support
> > hotplug in the future.
> >
> 
> Thanks for your reminder.
> 
> > Is there any other fixed value that we can use, for example the base
> > address of the NVDIMM?
> 
> How about use object_get_canonical_path(OBJECT(dev)) (the @dev is NVDIMM
> device) ?
if you use split backend/frotnend idea then existing backends
already have a stable name derived from backend's ID and you won't need to care
about it.

Re: [Qemu-devel] [PATCH] hw/arm/virt-acpi-build: Fix wrong size of flash

2015-09-17 Thread G Gregory

This is an urgent fix as it completely breaks booting with ACPI.
Success is only a matter of luck with device probing order.

Tested-by: Graeme Gregory 

Graeme

On 17 September 2015 at 02:57,   wrote:
> From: Shannon Zhao 
>
> While virt machine creates two flash devices with total size 0x0800,
> it wrongly uses this total size for each one. So it will overlap other
> MMIO spaces.
>
> Signed-off-by: Shannon Zhao 
> ---
>  hw/arm/virt-acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 2073573..bc858c8 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -114,7 +114,7 @@ static void acpi_dsdt_add_flash(Aml *scope, const 
> MemMapEntry *flash_memmap)
>  {
>  Aml *dev, *crs;
>  hwaddr base = flash_memmap->base;
> -hwaddr size = flash_memmap->size;
> +hwaddr size = flash_memmap->size / 2;
>
>  dev = aml_device("FLS0");
>  aml_append(dev, aml_name_decl("_HID", aml_string("LNRO0015")));
> --
> 2.1.0
>

Re: [Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-17 Thread Peter Maydell

On 17 September 2015 at 13:05, Daniel P. Berrange  wrote:
> On Thu, Sep 17, 2015 at 12:32:56PM +0100, Peter Maydell wrote:
>> On 17 September 2015 at 12:03, Daniel P. Berrange  
>> wrote:
>
>> > +Building
>> > +
>> > +
>> > +QEMU is multi-platform software intended to be buildable on all modern 
>> > Linux
>> > +platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety of other
>> > +UNIX targets. The simple process to build QEMU is
>>
>> This whole section seems to be duplicating our existing build
>> documentation (which is in qemu-doc.texi in the 'compilation'
>> section). We should document how to build QEMU in exactly one
>> place, not two (though I can see the rationale for that one
>> place not being in a .texi file.)
>
> The problem I have with refering people to qemu-doc.html,
> is that in order to order to view the docs on how to build
> QEMU, you first have to build QEMU, or enjoy reading the
> .texi source code :-)  Though that doc does get exposed
> via the website too, it is nice to not rely on people having
> internet access all the time.
>
> The qemu-doc.html chapter 6 is a bit more detailed in what
> it descibes. I tend to view the instructions we put in the
> README file as the minimal quick-start, and then point to
> the comprehensive docs as a detailed reference on the matter.

I don't think we should have two places at all. If a "quick
start" is useful it should be at the start of the one document
we have on building QEMU.

>> Also I'm not sure our 'make install' target is a great thing to recommend.
>
> Any particular reason why 'make install' is bad ? I've not personally
> had any trouble with it, though to be fair I always build with
> '--prefix=$HOME/usr/qemu-git' so I'm not splattering stuff into /usr

Pretty much the "splats over /usr", with a side order of "I'm not
sure how much testing it gets".

>> > +
>> > +The configure script supports a number of arguments to turn on/off various
>> > +optional features. These can be seen with "configure --help".
>> > +
>> > +For additional information on building QEMU for Linux and Windows consult:
>> > +
>> > +  http://qemu-project.org/Hosts/Linux
>> > +  http://qemu-project.org/Hosts/W32
>>
>> We've just significantly improved our documentation for building
>> on OSX, and we should mention it here.
>
> Are those docs mentioned anywhere on the wiki, or just in the qemu-doc.html
> file ?  I took these two links from this page:
>
>   http://qemu-project.org/Documentation/GettingStartedDevelopers
>
> which doesn't mention OS-X. So we should probably address that at the
> same time.

The OSX documentation is in qemu-doc.texi.

Again, we should have all our 'how to build' docs in one place.

>> This part is no longer entirely true, incidentally (eg the AArch64 TCG 
>> backend
>> is GPL2+).
>
> I can just do what Paolo suggests and simply point people to the LICENSE
> file instead.

Yes, that would be better.

>> > +Contact
>> > +===
>> > +
>> > +The QEMU community can be contacted in a number of ways, with the two main
>> > +methods being E-Mail and IRC
>> > +
>> > + - qemu-devel@nongnu.org 
>> > (http://lists.nongnu.org/mailman/listinfo/qemu-devel)
>> > + - #qemu on irc.oftc.net
>> > +
>> > +For additional information on contacted the community consult:
>> > +
>> > +  http://qemu-project.org/Contribute/StartHere
>> > +
>> > +-- End
>>
>> Some of the lines in this file seem to be a bit over-long.
>
> Any preference for max length to stick too ? I merely ran
> it through checkpatch.pl, which enforces 80 char limit IIRC.
> I'm happy to reformat to shorter a shorter length if desired.

Wrap-at-between-70-and-75-ish is the usual recommendation, I think.

thanks
-- PMM

Re: [Qemu-devel] [PATCH 0/7 v9] vhost-user multiple queue support

2015-09-17 Thread Michael S. Tsirkin

On Tue, Sep 15, 2015 at 03:10:27PM +0800, Yuanhan Liu wrote:
> Hi,
> 
> Here is the updated patch set for enabling vhost-user multiple queue.
> This patch set introduces 2 more vhost user messages: 
> VHOST_USER_GET_QUEUE_NUM,
> for querying how many queues the backend supports, and 
> VHOST_USER_SET_VRING_ENABLE,
> for enabling/disabling a specific virt queue.
> 
> Both of the two new messages are treated as vhost protocol extension,
> and that's why Michaels's patch "vhost-user: add protocol feature
> negotiation" is also included here.
> 
> Patch 1-5 are all prepare works for actually enabling multiple queue.
> 
> Patch 6 is the major patch for enabling multiple queue, which also tries
> to address two major concerns from Michael: no feedback from backend if
> it can't support # of requested queues, and all messages are sent N time.
> It also fixes a hidden bug.
> 
> Patch 7 introduces the VHOST_USER_SET_VRING_ENABLE message, to enable
> or disable a specific vring.
> 
> Note that I haven't done any formal test yet, it just passes build
> test and basic functional test, such as it does exit when backend
> doesn't support # of requested queues. Here I sent it out just for
> more comments, and for avoiding spending too much effort on the wrong
> track.
> 

This looks reasonable to me.
Pls rebase, test and re-post with fixed numbering, and I can merge.
Please post the dpdk patches too (preferably Cc me) so
it's testable by others.

> 
> v9: - Per suggested by Jason Wang, patch 5 introduces a new vhost
>   backend method: vhost_backend_get_vq_index().
> 
> - Use qemu_find_net_clients_except() at net/vhost-user.c for
>   gathering all related ncs so that we could register chr dev
>   event handler once. Which is also suggested by Jason Wang.
> 
> 
> Thanks.
> 
> --yliu
> 
> ---
> Changchun Ouyang (2):
>   vhost-user: add multiple queue support
>   vhost-user: add a new message to disable/enable a specific virt queue.
> 
> Michael S. Tsirkin (1):
>   vhost-user: add protocol feature negotiation
> 
> Yuanhan Liu (4):
>   vhost-user: use VHOST_USER_XXX macro for switch statement
>   vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE
>   vhost-user: add VHOST_USER_GET_QUEUE_NUM message
>   vhost: introduce vhost_backend_get_vq_index method
> 
>  docs/specs/vhost-user.txt |  75 +++-
>  hw/net/vhost_net.c|  39 --
>  hw/net/virtio-net.c   |   8 +++
>  hw/virtio/vhost-backend.c |  10 ++-
>  hw/virtio/vhost-user.c| 143 +++--
>  hw/virtio/vhost.c |  20 +++---
>  include/hw/virtio/vhost-backend.h |   4 ++
>  include/hw/virtio/vhost.h |   2 +
>  include/net/vhost_net.h   |   3 +
>  linux-headers/linux/vhost.h   |   2 +-
>  net/vhost-user.c  | 146 
> +++---
>  qapi-schema.json  |   6 +-
>  qemu-options.hx   |   5 +-
>  tests/vhost-user-test.c   |   2 +-
>  14 files changed, 380 insertions(+), 85 deletions(-)
> 
> -- 
> 1.9.0

Re: [Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-17 Thread Daniel P. Berrange

On Thu, Sep 17, 2015 at 01:20:43PM +0100, Peter Maydell wrote:
> On 17 September 2015 at 13:05, Daniel P. Berrange  wrote:
> > On Thu, Sep 17, 2015 at 12:32:56PM +0100, Peter Maydell wrote:
> >> On 17 September 2015 at 12:03, Daniel P. Berrange  
> >> wrote:
> >
> >> > +Building
> >> > +
> >> > +
> >> > +QEMU is multi-platform software intended to be buildable on all modern 
> >> > Linux
> >> > +platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety of 
> >> > other
> >> > +UNIX targets. The simple process to build QEMU is
> >>
> >> This whole section seems to be duplicating our existing build
> >> documentation (which is in qemu-doc.texi in the 'compilation'
> >> section). We should document how to build QEMU in exactly one
> >> place, not two (though I can see the rationale for that one
> >> place not being in a .texi file.)
> >
> > The problem I have with refering people to qemu-doc.html,
> > is that in order to order to view the docs on how to build
> > QEMU, you first have to build QEMU, or enjoy reading the
> > .texi source code :-)  Though that doc does get exposed
> > via the website too, it is nice to not rely on people having
> > internet access all the time.
> >
> > The qemu-doc.html chapter 6 is a bit more detailed in what
> > it descibes. I tend to view the instructions we put in the
> > README file as the minimal quick-start, and then point to
> > the comprehensive docs as a detailed reference on the matter.
> 
> I don't think we should have two places at all. If a "quick
> start" is useful it should be at the start of the one document
> we have on building QEMU.

How about splitting "Chapter 6 Compilation" out of the qemu-doc.texi
file into a standalone file, in a format that is friendly to read
without needing generating first.  Perhaps using something like
Markdown[1] would be a suitable thing, as that is essentially plain
text with a little extra punctuation, so it is easily readable as
source, as well as allowing reasonably pleasant HTML generation
allowing us to publish it to the website too ?

> >> Also I'm not sure our 'make install' target is a great thing to recommend.
> >
> > Any particular reason why 'make install' is bad ? I've not personally
> > had any trouble with it, though to be fair I always build with
> > '--prefix=$HOME/usr/qemu-git' so I'm not splattering stuff into /usr
> 
> Pretty much the "splats over /usr", with a side order of "I'm not
> sure how much testing it gets".

Heh, ok

Regards,
Daniel

[1] eg https://help.github.com/articles/markdown-basics/
   http://daringfireball.net/projects/markdown/syntax
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [PATCH v2] target-cris: update CPU state save/load to use VMStateDescription

2015-09-17 Thread Peter Maydell

On 12 August 2015 at 15:15, Edgar E. Iglesias  wrote:
> On Fri, Aug 07, 2015 at 05:02:14PM +0100, Peter Maydell wrote:
>> From: Juan Quintela 
>>
>> Update the CRIS CPU state save/load to use a VMStateDescription struct
>> rather than cpu_save/cpu_load functions.
>>
>> Have to define TLBSet struct.
>> Multidimensional arrays in C are a mess, just unroll them.
>>
>> Signed-off-by: Juan Quintela 
>
>
> Acked-by: Edgar E. Iglesias 

So what's the process for target-cris changes to get into the tree?

thanks
-- PMM

Re: [Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-17 Thread Peter Maydell

On 17 September 2015 at 13:36, Daniel P. Berrange  wrote:
> How about splitting "Chapter 6 Compilation" out of the qemu-doc.texi
> file into a standalone file, in a format that is friendly to read
> without needing generating first.  Perhaps using something like
> Markdown[1] would be a suitable thing, as that is essentially plain
> text with a little extra punctuation, so it is easily readable as
> source, as well as allowing reasonably pleasant HTML generation
> allowing us to publish it to the website too ?

Yes, that sounds like a good idea.

-- PMM

Re: [Qemu-devel] [PATCH v4 1/3] pci: Make use of the devfn property when registering new devices

2015-09-17 Thread Marcel Apfelbaum


On 09/17/2015 03:12 PM, Knut Omang wrote:

On Thu, 2015-09-17 at 14:11 +0300, Marcel Apfelbaum wrote:

On 09/12/2015 03:36 PM, Knut Omang wrote:

Without this, the devfn argument to pci_create_*()
does not affect the assigned devfn.

Needed to support (VF_STRIDE,VF_OFFSET) values other than (1,1)
for SR/IOV.

Signed-off-by: Knut Omang 
---
   hw/pci/pci.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index ccea628..a5cc015 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1840,7 +1840,7 @@ static void pci_qdev_realize(DeviceState
*qdev, Error **errp)
   bus = PCI_BUS(qdev_get_parent_bus(qdev));
   pci_dev = do_pci_register_device(pci_dev, bus,

  object_get_typename(OBJECT(qdev)),
- pci_dev->devfn, errp);
+
  object_property_get_int(OBJECT(qdev), "addr", NULL), errp);

Hi,

I don't get this, using object_property_get_int on "addr" should
return the value of pci_dev->devfn,
can you please tell what exactly is not working?


The problem is that at that point pci_dev->devfn has not been set yet -
have commented on this before somewhere..



But "addr" property has the right value? Is indeed strange because it should
get the value from pci_dev->devfn.

Don't get me wrong, this patch is OK.
I just want to understand if we have a hidden bug somewhere.

Anyway,
Reviewed-by: Marcel Apfelbaum 


Knut


Thanks,
Marcel




   if (pci_dev == NULL)
   return;

Re: [Qemu-devel] [wiki] New wiki page - vhost-user setup with ovs/dpdk backend

2015-09-17 Thread Flavio Leitner

On Thu, Sep 17, 2015 at 01:08:36PM +0300, Marcel Apfelbaum wrote:
> On 09/16/2015 08:44 PM, Flavio Leitner wrote:
> >On Thu, Sep 10, 2015 at 10:51:19PM +0300, Marcel Apfelbaum wrote:
> >>Hi,
> >>
> >>The page describes how to setup an environment that allows 
> >>testing/developing
> >>vhost-user using ovs (with dpdk) as backend.
> >>
> >>A regular pc machine can be used, no need for several hosts, a 'dpdk 
> >>enabled' NIC or 1G huge-pages.
> >>
> >>The goal is to connect guests' virtio-net devices having vhost-user backend 
> >>to OVS dpdkvhostuser ports
> >>and be able to run any kind of network traffic between them.
> >>
> >>The page can be found at:
> >>http://wiki.qemu.org/Features/vhost-user-ovs-dpdk
> >>
> >>I want to keep it as simple as possible.
> >>If you see steps that can be skipped or unneeded configuration please let 
> >>me know
> >>or feel free to update the page.
> >
> >I gave a quick look and found couple issues.  It seems to be missing
> >the installing steps for qemu.  Also the eventfd_link module is only
> >needed for vhost-cuse, so you don't need to build/install/load at all.
> Hi Flavio,
> Thank you for reviewing the document!
> 
> For some reason I thought we still need eventfd_link to notify the guests
> when new packets arrive. Indeed we don't need this anymore, I updated the 
> wiki page.

We use the one provided by kernel.  The eventfd_link on DPDK is special for
vhost-cuse: 
https://github.com/openvswitch/ovs/blob/master/INSTALL.DPDK.md#dpdk-vhost-cuse-prerequisites


> Regarding QEMU, can you please point me to the missing setup steps?
> (I think I already took care of them, anyway I want to be sure)

Yup, I missed that you are running qemu from the builddir, not from
the system, so you don't need to install it.

Thanks again,
fbl

Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area

2015-09-17 Thread Xiao Guangrong




On 09/17/2015 05:34 PM, Paolo Bonzini wrote:



On 17/09/2015 11:14, Xiao Guangrong wrote:



/* get the memory region from backend memory. */
mr = host_memory_backend_get_memory(dimm->hostmem, errp);

/* nvdimm_nr will map to guest address space. */
memory_region_init_alias(>nvdimm_mr, OBJECT(dev),
  object_get_canonical_path(OBJECT(dev)), mr, 0,
  size - nvdimm->label_size);


You can just use "memory" here for the name here.  The name only needs
to be unique for RAM memory regions, and dimm->hostmem will take care of it.



Okay. I will try it, thank you, Paolo.

Re: [Qemu-devel] [PATCH] virtio: add some migration doc

2015-09-17 Thread Eric Blake

On 09/17/2015 03:06 AM, Cornelia Huck wrote:
> Try to cover the basics of virtio migration.
> 
> Signed-off-by: Cornelia Huck 
> ---
> It might help if we add some documentation; at the very least, it will
> prevent myself getting a headache everytime I look at that code :)
> 
> Feedback welcome.
> ---
>  docs/virtio-migration.txt | 101 
> ++
>  1 file changed, 101 insertions(+)
>  create mode 100644 docs/virtio-migration.txt
> 
> diff --git a/docs/virtio-migration.txt b/docs/virtio-migration.txt
> new file mode 100644
> index 000..9c575e6
> --- /dev/null
> +++ b/docs/virtio-migration.txt
> @@ -0,0 +1,101 @@
> +Virtio devices and migration
> +

Since you didn't supply an explicit Copyright/License, this document
inherits the project default of GPLv2+.  Not the end of the world, but a
lot of recently-added documentation has been explicitly listing a license.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v18 09/21] replay: interrupts and exceptions

2015-09-17 Thread Pavel Dovgalyuk

This patch includes modifications of common cpu files. All interrupts and
exceptions occured during recording are written into the replay log.
These events allow correct replaying the execution by kicking cpu thread
when one of these events is found in the log.

Signed-off-by: Pavel Dovgalyuk 
---
 cpu-exec.c   |   48 ++---
 replay/replay-internal.h |4 +++
 replay/replay-user.c |   20 ++
 replay/replay.c  |   67 ++
 replay/replay.h  |   17 
 5 files changed, 146 insertions(+), 10 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index d88ea4f..877f6f2 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -31,6 +31,7 @@
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
 #include "hw/i386/apic.h"
 #endif
+#include "replay/replay.h"
 
 /* -icount align implementation. */
 
@@ -405,21 +406,25 @@ int cpu_exec(CPUState *cpu)
 uintptr_t next_tb;
 SyncClocks sc;
 
+/* replay_interrupt may need current_cpu */
+current_cpu = cpu;
+
 if (cpu->halted) {
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
-if (cpu->interrupt_request & CPU_INTERRUPT_POLL) {
+if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
+&& replay_interrupt()) {
 apic_poll_irq(x86_cpu->apic_state);
 cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
 }
 #endif
 if (!cpu_has_work(cpu)) {
+current_cpu = NULL;
 return EXCP_HALTED;
 }
 
 cpu->halted = 0;
 }
 
-current_cpu = cpu;
 atomic_mb_set(_current_cpu, cpu);
 rcu_read_lock();
 
@@ -461,10 +466,22 @@ int cpu_exec(CPUState *cpu)
 cpu->exception_index = -1;
 break;
 #else
-cc->do_interrupt(cpu);
-cpu->exception_index = -1;
+if (replay_exception()) {
+cc->do_interrupt(cpu);
+cpu->exception_index = -1;
+} else if (!replay_has_interrupt()) {
+/* give a chance to iothread in replay mode */
+ret = EXCP_INTERRUPT;
+break;
+}
 #endif
 }
+} else if (replay_has_exception()
+   && cpu->icount_decr.u16.low + cpu->icount_extra == 0) {
+/* try to cause an exception pending in the log */
+cpu_exec_nocache(cpu, 1, tb_find_fast(cpu), true);
+ret = -1;
+break;
 }
 
 next_tb = 0; /* force lookup of first TB */
@@ -480,30 +497,40 @@ int cpu_exec(CPUState *cpu)
 cpu->exception_index = EXCP_DEBUG;
 cpu_loop_exit(cpu);
 }
-if (interrupt_request & CPU_INTERRUPT_HALT) {
+if (replay_mode == REPLAY_MODE_PLAY
+&& !replay_has_interrupt()) {
+/* Do nothing */
+} else if (interrupt_request & CPU_INTERRUPT_HALT) {
+replay_interrupt();
 cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
 cpu->halted = 1;
 cpu->exception_index = EXCP_HLT;
 cpu_loop_exit(cpu);
 }
 #if defined(TARGET_I386)
-if (interrupt_request & CPU_INTERRUPT_INIT) {
+else if (interrupt_request & CPU_INTERRUPT_INIT) {
+replay_interrupt();
 cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0);
 do_cpu_init(x86_cpu);
 cpu->exception_index = EXCP_HALTED;
 cpu_loop_exit(cpu);
 }
 #else
-if (interrupt_request & CPU_INTERRUPT_RESET) {
+else if (interrupt_request & CPU_INTERRUPT_RESET) {
+replay_interrupt();
 cpu_reset(cpu);
+cpu_loop_exit(cpu);
 }
 #endif
 /* The target hook has 3 exit conditions:
False when the interrupt isn't processed,
True when it is, and we should restart on a new TB,
and via longjmp via cpu_loop_exit.  */
-if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
-next_tb = 0;
+else {
+replay_interrupt();
+if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
+next_tb = 0;
+}
 }
 /* Don't use the cached interrupt_request value,
do_interrupt may

[Qemu-devel] [PATCH v18 10/21] replay: asynchronous events infrastructure

2015-09-17 Thread Pavel Dovgalyuk

This patch adds module for saving and replaying asynchronous events.
These events include network packets, keyboard and mouse input,
USB packets, thread pool and bottom halves callbacks.
All events are stored in the queue to be processed at synchronization points
such as beginning of TB execution, or checkpoint in the iothread.

Signed-off-by: Pavel Dovgalyuk 
---
 replay/Makefile.objs |1 
 replay/replay-events.c   |  225 ++
 replay/replay-internal.h |   29 ++
 replay/replay.h  |6 +
 4 files changed, 261 insertions(+), 0 deletions(-)
 create mode 100755 replay/replay-events.c

diff --git a/replay/Makefile.objs b/replay/Makefile.objs
index 9711e14..1ce61ec 100755
--- a/replay/Makefile.objs
+++ b/replay/Makefile.objs
@@ -1,3 +1,4 @@
 obj-$(CONFIG_SOFTMMU) += replay.o
 obj-$(CONFIG_SOFTMMU) += replay-internal.o
+obj-$(CONFIG_SOFTMMU) += replay-events.o
 obj-$(CONFIG_USER_ONLY) += replay-user.o
diff --git a/replay/replay-events.c b/replay/replay-events.c
new file mode 100755
index 000..f973cb0
--- /dev/null
+++ b/replay/replay-events.c
@@ -0,0 +1,225 @@
+/*
+ * replay-events.c
+ *
+ * Copyright (c) 2010-2015 Institute for System Programming
+ * of the Russian Academy of Sciences.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "qemu/error-report.h"
+#include "replay.h"
+#include "replay-internal.h"
+
+typedef struct Event {
+ReplayAsyncEventKind event_kind;
+void *opaque;
+void *opaque2;
+uint64_t id;
+
+QTAILQ_ENTRY(Event) events;
+} Event;
+
+static QTAILQ_HEAD(, Event) events_list = QTAILQ_HEAD_INITIALIZER(events_list);
+static unsigned int read_event_kind = -1;
+static uint64_t read_id = -1;
+static int read_checkpoint = -1;
+
+static bool events_enabled;
+
+/* Functions */
+
+static void replay_run_event(Event *event)
+{
+switch (event->event_kind) {
+default:
+error_report("Replay: invalid async event ID (%d) in the queue",
+event->event_kind);
+exit(1);
+break;
+}
+}
+
+void replay_enable_events(void)
+{
+events_enabled = true;
+}
+
+bool replay_has_events(void)
+{
+return !QTAILQ_EMPTY(_list);
+}
+
+void replay_flush_events(void)
+{
+replay_mutex_lock();
+while (!QTAILQ_EMPTY(_list)) {
+Event *event = QTAILQ_FIRST(_list);
+replay_mutex_unlock();
+replay_run_event(event);
+replay_mutex_lock();
+QTAILQ_REMOVE(_list, event, events);
+g_free(event);
+}
+replay_mutex_unlock();
+}
+
+void replay_disable_events(void)
+{
+if (replay_mode != REPLAY_MODE_NONE) {
+events_enabled = false;
+/* Flush events queue before waiting of completion */
+replay_flush_events();
+}
+}
+
+void replay_clear_events(void)
+{
+replay_mutex_lock();
+while (!QTAILQ_EMPTY(_list)) {
+Event *event = QTAILQ_FIRST(_list);
+QTAILQ_REMOVE(_list, event, events);
+
+g_free(event);
+}
+replay_mutex_unlock();
+}
+
+/*! Adds specified async event to the queue */
+static void replay_add_event(ReplayAsyncEventKind event_kind,
+ void *opaque,
+ void *opaque2, uint64_t id)
+{
+assert(event_kind < REPLAY_ASYNC_COUNT);
+
+if (!replay_file || replay_mode == REPLAY_MODE_NONE
+|| !events_enabled) {
+Event e;
+e.event_kind = event_kind;
+e.opaque = opaque;
+e.opaque2 = opaque2;
+e.id = id;
+replay_run_event();
+return;
+}
+
+Event *event = g_malloc0(sizeof(Event));
+event->event_kind = event_kind;
+event->opaque = opaque;
+event->opaque2 = opaque2;
+event->id = id;
+
+replay_mutex_lock();
+QTAILQ_INSERT_TAIL(_list, event, events);
+replay_mutex_unlock();
+}
+
+static void replay_save_event(Event *event, int checkpoint)
+{
+if (replay_mode != REPLAY_MODE_PLAY) {
+/* put the event into the file */
+replay_put_event(EVENT_ASYNC);
+replay_put_byte(checkpoint);
+replay_put_byte(event->event_kind);
+
+/* save event-specific data */
+switch (event->event_kind) {
+default:
+error_report("Unknown ID %d of replay event", read_event_kind);
+exit(1);
+break;
+}
+}
+}
+
+/* Called with replay mutex locked */
+void replay_save_events(int checkpoint)
+{
+while (!QTAILQ_EMPTY(_list)) {
+Event *event = QTAILQ_FIRST(_list);
+replay_save_event(event, checkpoint);
+
+replay_mutex_unlock();
+replay_run_event(event);
+replay_mutex_lock();
+QTAILQ_REMOVE(_list, event, events);
+g_free(event);
+}
+}
+
+static Event *replay_read_event(int checkpoint)
+{
+Event *event;
+if

[Qemu-devel] [PATCH v18 06/21] cpu-exec: allow temporary disabling icount

2015-09-17 Thread Pavel Dovgalyuk

This patch is required for deterministic replay to generate an exception
by trying executing an instruction without changing icount.
It adds new flag to TB for disabling icount while translating it.

Signed-off-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
---
 cpu-exec.c  |7 ---
 include/exec/exec-all.h |1 +
 translate-all.c |2 +-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index b247a23..d88ea4f 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -240,7 +240,7 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, 
uint8_t *tb_ptr)
 /* Execute the code without caching the generated code. An interpreter
could be used if available. */
 static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
- TranslationBlock *orig_tb)
+ TranslationBlock *orig_tb, bool ignore_icount)
 {
 TranslationBlock *tb;
 
@@ -250,7 +250,8 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
 max_cycles = CF_COUNT_MASK;
 
 tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags,
- max_cycles | CF_NOCACHE);
+ max_cycles | CF_NOCACHE
+ | (ignore_icount ? CF_IGNORE_ICOUNT : 0));
 tb->orig_tb = tcg_ctx.tb_ctx.tb_invalidated_flag ? NULL : orig_tb;
 cpu->current_tb = tb;
 /* execute the generated code */
@@ -577,7 +578,7 @@ int cpu_exec(CPUState *cpu)
 if (insns_left > 0) {
 /* Execute remaining instructions.  */
 tb = (TranslationBlock *)(next_tb & 
~TB_EXIT_MASK);
-cpu_exec_nocache(cpu, insns_left, tb);
+cpu_exec_nocache(cpu, insns_left, tb, false);
 align_clocks(, cpu);
 }
 cpu->exception_index = EXCP_INTERRUPT;
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 72d4012..21f4206 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -199,6 +199,7 @@ struct TranslationBlock {
 #define CF_LAST_IO 0x8000 /* Last insn may be an IO access.  */
 #define CF_NOCACHE 0x1 /* To be freed after execution */
 #define CF_USE_ICOUNT  0x2
+#define CF_IGNORE_ICOUNT 0x4 /* Do not generate icount code */
 
 void *tc_ptr;/* pointer to the translated code */
 /* next matching tb for physical address. */
diff --git a/translate-all.c b/translate-all.c
index 0b8b34b..5b3cf10 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1039,7 +1039,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 int code_gen_size;
 
 phys_pc = get_page_addr_code(env, pc);
-if (use_icount) {
+if (use_icount && !(cflags & CF_IGNORE_ICOUNT)) {
 cflags |= CF_USE_ICOUNT;
 }
 tb = tb_alloc(pc);

[Qemu-devel] [RFC PATCH 03/10] vfio: Check guest IOVA ranges against host IOMMU capabilities

2015-09-17 Thread David Gibson

The current vfio core code assumes that the host IOMMU is capable of
mapping any IOVA the guest wants to use to where we need.  However, real
IOMMUs generally only support translating a certain range of IOVAs (the
"DMA window") not a full 64-bit address space.

The common x86 IOMMUs support a wide enough range that guests are very
unlikely to go beyond it in practice, however the IOMMU used on IBM Power
machines - in the default configuration - supports only a much more limited
IOVA range, usually 0..2GiB.

If the guest attempts to set up an IOVA range that the host IOMMU can't
map, qemu won't report an error until it actually attempts to map a bad
IOVA.  If guest RAM is being mapped directly into the IOMMU (i.e. no guest
visible IOMMU) then this will show up very quickly.  If there is a guest
visible IOMMU, however, the problem might not show up until much later when
the guest actually attempt to DMA with an IOVA the host can't handle.

This patch adds a test so that we will detect earlier if the guest is
attempting to use IOVA ranges that the host IOMMU won't be able to deal
with.

For now, we assume that "Type1" (x86) IOMMUs can support any IOVA, this is
incorrect, but no worse than what we have already.  We can't do better for
now because the Type1 kernel interface doesn't tell us what IOVA range the
IOMMU actually supports.

For the Power "sPAPR TCE" IOMMU, however, we can retrieve the supported
IOVA range and validate guest IOVA ranges against it, and this patch does
so.

Signed-off-by: David Gibson 
---
 hw/vfio/common.c  | 42 +++---
 include/hw/vfio/vfio-common.h |  6 ++
 2 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 9953b9c..c37f1a1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -344,14 +344,23 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
 if (int128_ge(int128_make64(iova), llend)) {
 return;
 }
+end = int128_get64(llend);
+
+if ((iova < container->iommu_data.min_iova)
+|| ((end - 1) > container->iommu_data.max_iova)) {
+error_report("vfio: IOMMU container %p can't map guest IOVA region"
+ " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
+ container, iova, end - 1);
+ret = -EFAULT; /* FIXME: better choice here? */
+goto fail;
+}
 
 memory_region_ref(section->mr);
 
 if (memory_region_is_iommu(section->mr)) {
 VFIOGuestIOMMU *giommu;
 
-trace_vfio_listener_region_add_iommu(iova,
-int128_get64(int128_sub(llend, int128_one(;
+trace_vfio_listener_region_add_iommu(iova, end - 1);
 /*
  * FIXME: We should do some checking to see if the
  * capabilities of the host VFIO IOMMU are adequate to model
@@ -388,7 +397,6 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
 
 /* Here we assume that memory_region_is_ram(section->mr)==true */
 
-end = int128_get64(llend);
 vaddr = memory_region_get_ram_ptr(section->mr) +
 section->offset_within_region +
 (iova - section->offset_within_address_space);
@@ -687,7 +695,19 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
 ret = -errno;
 goto free_container_exit;
 }
+
+/*
+ * FIXME: This assumes that a Type1 IOMMU can map any 64-bit
+ * IOVA whatsoever.  That's not actually true, but the current
+ * kernel interface doesn't tell us what it can map, and the
+ * existing Type1 IOMMUs generally support any IOVA we're
+ * going to actually try in practice.
+ */
+container->iommu_data.min_iova = 0;
+container->iommu_data.max_iova = (hwaddr)-1;
 } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
+struct vfio_iommu_spapr_tce_info info;
+
 ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, );
 if (ret) {
 error_report("vfio: failed to set group container: %m");
@@ -712,6 +732,22 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
 ret = -errno;
 goto free_container_exit;
 }
+
+/*
+ * FIXME: This only considers the host IOMMU' 32-bit window.
+ * At some point we need to add support for the optional
+ * 64-bit window and dynamic windows
+ */
+info.argsz = sizeof(info);
+ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, );
+if (ret) {
+error_report("vfio: VFIO_IOMMU_SPAPR_TCE_GET_INFO failed: %m");
+ret = -errno;
+goto free_container_exit;
+}
+container->iommu_data.min_iova = info.dma32_window_start;
+container->iommu_data.max_iova = container->iommu_data.min_iova
++ info.dma32_window_size - 1;
 } else {

[Qemu-devel] [RFC PATCH 09/10] spapr_iommu: Provide a function to switch a TCE table to allowing VFIO

2015-09-17 Thread David Gibson

Because of the way non-VFIO guest IOMMU operations are KVM accelerated, not
all TCE tables (guest IOMMU contexts) can support VFIO devices.  Currently,
this is decided at creation time.

To support hotplug of VFIO devices, we need to allow a TCE table which
previously didn't allow VFIO devices to be switched so that it can.  This
patch adds an spapr_tce_need_vfio() function to do this, by reallocating
the table in userspace if necessary.

Currently this doesn't allow the KVM acceleration to be re-enabled if all
the VFIO devices are removed.  That's an optimization for another time.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr_iommu.c   | 19 +++
 include/hw/ppc/spapr.h |  2 ++
 2 files changed, 21 insertions(+)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 5166cde..109e10d 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -168,6 +168,25 @@ static int spapr_tce_table_realize(DeviceState *dev)
 return 0;
 }
 
+void spapr_tce_need_vfio(sPAPRTCETable *tcet)
+{
+size_t table_size = tcet->nb_table * sizeof(uint64_t);
+void *newtable;
+
+if (tcet->fd < 0) {
+/* Table is already in userspace, nothing to be done */
+return;
+}
+
+newtable = g_malloc0(table_size);
+memcpy(newtable, tcet->table, table_size);
+
+kvmppc_remove_spapr_tce(tcet->table, tcet->fd, tcet->nb_table);
+
+tcet->fd = -1;
+tcet->table = newtable;
+}
+
 sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
uint64_t bus_offset,
uint32_t page_shift,
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 03a9804..38a29fb 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -588,6 +588,8 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, 
uint32_t liobn,
uint32_t page_shift,
uint32_t nb_table,
bool need_vfio);
+void spapr_tce_need_vfio(sPAPRTCETable *tcet);
+
 MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet);
 int spapr_dma_dt(void *fdt, int node_off, const char *propname,
  uint32_t liobn, uint64_t window, uint32_t size);
-- 
2.4.3

[Qemu-devel] [RFC PATCH 01/10] vfio: Remove unneeded union from VFIOContainer

2015-09-17 Thread David Gibson

Currently the VFIOContainer iommu_data field contains a union with
different information for different host iommu types.  However:
   * It only actually contains information for the x86-like "Type1" iommu
   * Because we have a common listener the Type1 fields are actually used
on all IOMMU types, including the SPAPR TCE type as well
   * There's no tag in the VFIOContainer to tell you which union member is
valid anyway.

In fact we now have a general structure for the listener which is unlikely
to ever need per-iommu-type information, so this patch removes the union.

In a similar way we can unify the setup of the vfio memory listener in
vfio_connect_container() that is currently split across a switch on iommu
type, but is effectively the same in both cases.

The iommu_data.release pointer was only needed as a cleanup function
which would handle potentially different data in the union.  With the
union gone, it too can be removed.

Signed-off-by: David Gibson 
---
 hw/vfio/common.c  | 51 +--
 include/hw/vfio/vfio-common.h | 14 +++-
 2 files changed, 23 insertions(+), 42 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6d21311..e3152f6 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -316,7 +316,7 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
  MemoryRegionSection *section)
 {
 VFIOContainer *container = container_of(listener, VFIOContainer,
-iommu_data.type1.listener);
+iommu_data.listener);
 hwaddr iova, end;
 Int128 llend;
 void *vaddr;
@@ -406,9 +406,9 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
  * can gracefully fail.  Runtime, there's not much we can do other
  * than throw a hardware error.
  */
-if (!container->iommu_data.type1.initialized) {
-if (!container->iommu_data.type1.error) {
-container->iommu_data.type1.error = ret;
+if (!container->iommu_data.initialized) {
+if (!container->iommu_data.error) {
+container->iommu_data.error = ret;
 }
 } else {
 hw_error("vfio: DMA mapping failed, unable to continue");
@@ -420,7 +420,7 @@ static void vfio_listener_region_del(MemoryListener 
*listener,
  MemoryRegionSection *section)
 {
 VFIOContainer *container = container_of(listener, VFIOContainer,
-iommu_data.type1.listener);
+iommu_data.listener);
 hwaddr iova, end;
 int ret;
 
@@ -485,7 +485,7 @@ static const MemoryListener vfio_memory_listener = {
 
 static void vfio_listener_release(VFIOContainer *container)
 {
-memory_listener_unregister(>iommu_data.type1.listener);
+memory_listener_unregister(>iommu_data.listener);
 }
 
 int vfio_mmap_region(Object *obj, VFIORegion *region,
@@ -683,21 +683,6 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
 ret = -errno;
 goto free_container_exit;
 }
-
-container->iommu_data.type1.listener = vfio_memory_listener;
-container->iommu_data.release = vfio_listener_release;
-
-memory_listener_register(>iommu_data.type1.listener,
- container->space->as);
-
-if (container->iommu_data.type1.error) {
-ret = container->iommu_data.type1.error;
-error_report("vfio: memory listener initialization failed for 
container");
-goto listener_release_exit;
-}
-
-container->iommu_data.type1.initialized = true;
-
 } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
 ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, );
 if (ret) {
@@ -723,19 +708,25 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as)
 ret = -errno;
 goto free_container_exit;
 }
-
-container->iommu_data.type1.listener = vfio_memory_listener;
-container->iommu_data.release = vfio_listener_release;
-
-memory_listener_register(>iommu_data.type1.listener,
- container->space->as);
-
 } else {
 error_report("vfio: No available IOMMU models");
 ret = -EINVAL;
 goto free_container_exit;
 }
 
+container->iommu_data.listener = vfio_memory_listener;
+
+memory_listener_register(>iommu_data.listener,
+ container->space->as);
+
+if (container->iommu_data.error) {
+ret = container->iommu_data.error;
+error_report("vfio: memory listener initialization failed for 
container");
+goto listener_release_exit;
+}
+
+container->iommu_data.initialized = true;
+

[Qemu-devel] [Bug 1494350] Re: QEMU: causes vCPU steal time overflow on live migration

2015-09-17 Thread Alexandre Derumier

Hi,

I confirm this bug,

I have seen this a lot of time with debian jessie (kernel 3.16) and
ubuntu (kernel 4.X) with qemu 2.2 and qemu 2.3

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1494350

Title:
  QEMU: causes vCPU steal time overflow on live migration

Status in QEMU:
  New

Bug description:
  I'm pasting in text from Debian Bug 785557
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=785557
  b/c I couldn't find this issue reported.

  It is present in QEMU 2.3, but I haven't tested later versions.
  Perhaps someone else will find this bug and confirm for later
  versions.  (Or I will when I have time!)

  


  Hi,

  I'm trying to debug an issue we're having with some debian.org machines 
  running in QEMU 2.1.2 instances (see [1] for more background). In short, 
  after a live migration guests running Debian Jessie (linux 3.16) stop 
  accounting CPU time properly. /proc/stat in the guest shows no increase 
  in user and system time anymore (regardless of workload) and what stands 
  out are extremely large values for steal time:

   % cat /proc/stat
   cpu  2400 0 1842 650879168 2579640 0 25 136562317270 0 0
   cpu0 1366 0 1028 161392988 1238598 0 11 383803090749 0 0
   cpu1 294 0 240 162582008 639105 0 8 39686436048 0 0
   cpu2 406 0 338 163331066 383867 0 4 333994238765 0 0
   cpu3 332 0 235 163573105 318069 0 1 1223752959076 0 0
   intr 355773871 33 10 0 0 0 0 3 0 1 0 0 36 144 0 0 1638612 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 5001741 41 0 8516993 0 3669582 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   ctxt 837862829
   btime 1431642967
   processes 8529939
   procs_running 1
   procs_blocked 0
   softirq 225193331 2 77532878 172 7250024 819289 0 54 33739135 176552 
105675225
   
  Reading the memory pointed to by the steal time MSRs pre- and 
  post-migration, I can see that post-migration the high bytes are set to 
  0xff:

  (qemu) xp /8b 0x1fc0cfc0
  1fc0cfc0: 0x94 0x57 0x77 0xf5 0xff 0xff 0xff 0xff

  The "jump" in steal time happens when the guest is resumed on the 
  receiving side.

  I've also been able to consistently reproduce this on a Ganeti cluster 
  at work, using QEMU 2.1.3 and kernels 3.16 and 4.0 in the guests. The 
  issue goes away if I disable the steal time MSR using `-cpu 
  qemu64,-kvm_steal_time`.

  So, it looks to me as if the steal time MSR is not set/copied properly 
  during live migration, although AFAICT this should be the case after 
  917367aa968fd4fef29d340e0c7ec8c608dffaab.

  After investigating a bit more, it looks like the issue comes from an overflow
  in the kernel's accumulate_steal_time() (arch/x86/kvm/x86.c:2023):

static void accumulate_steal_time(struct kvm_vcpu *vcpu)
{
u64 delta;

if (!(vcpu->arch.st.msr_val & KVM_MSR_ENABLED))
return;

delta = current->sched_info.run_delay - vcpu->arch.st.last_steal;

  Using systemtap with the attached script to trace KVM execution on the 
  receiving host kernel, we can see that shortly before marking the vCPUs 
  as runnable on a migrated KVM instance with 2 vCPUs, the following 
  happens (** marks lines of interest):

   **  0 qemu-system-x86(18446): kvm_arch_vcpu_load: run_delay=7856949 ns 
steal=7856949 ns
   0 qemu-system-x86(18446): -> kvm_arch_vcpu_load
   0 vhost-18446(18447): -> kvm_arch_vcpu_should_kick
   5 vhost-18446(18447): <- kvm_arch_vcpu_should_kick
  23

Re: [Qemu-devel] [PATCH v11 02/12] init/cleanup of netfilter object

2015-09-17 Thread Eric Blake

On 09/16/2015 07:23 PM, Yang Hongyang wrote:

>>> +{ 'enum': 'NetFilterChain',
>>> +  'data': [ 'all', 'in', 'out' ] }
>>
>> I don't see any other QMP usage of this enum anywhere in the series. Are
>> you planning on supporting QMP?  If so, let's get that design discussion
>> started.  If not, why not?
> 
> This series is based on QOM, so the QMP command for object_add
> will use this enum, for example:
>   1 { "execute": "qmp_capabilities" }
>   2 { "execute": "object-add",
>   3  "arguments": { "qom-type": "filter-buffer",
>   4 "id": "f0",
>   5 "props": { "netdev": "bn0",
>   6"chain": "in",
>   7"interval": 2000 } } }
> 
> for hmp:
> object_add filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000
> 
> command options:
> -object filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000

Do these examples appear in the documentation anywhere?  If not, it is
worth considering (maybe under 'object-add' in qmp-commands.hx, for
example).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v2 3/5] monitor: rename QDict *data->qdict

2015-09-17 Thread marcandre . lureau

From: Marc-André Lureau 

As suggested by Markus Armbruster, this is a bit more specific for the
event qdict than 'data'.

Signed-off-by: Marc-André Lureau 
---
 monitor.c | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/monitor.c b/monitor.c
index e78ecc2..2f8af5b 100644
--- a/monitor.c
+++ b/monitor.c
@@ -178,13 +178,13 @@ typedef struct {
 QAPIEvent event;/* Event being tracked */
 int64_t last;   /* QEMU_CLOCK_REALTIME value at last emission */
 QEMUTimer *timer;   /* Timer for handling delayed events */
-QObject *data;  /* Event pending delayed dispatch */
+QObject *qdict; /* Event pending delayed dispatch */
 } MonitorQAPIEventPending;
 
 typedef struct MonitorQAPIEventState MonitorQAPIEventState;
 
 typedef bool (*MonitorQAPIEventDelay)(MonitorQAPIEventState *evstate,
-  QDict *data);
+  QDict *qdict);
 /*
  * To prevent flooding clients, events can be throttled. The
  * throttling is calculated globally, rather than per-Monitor
@@ -472,14 +472,14 @@ static void monitor_qapi_event_emit(QAPIEvent event, 
QObject *data)
  * Return 'false' if the event is not delayed and can be emitted now.
  */
 static bool
-monitor_qapi_event_delay(MonitorQAPIEventState *evstate, QDict *data)
+monitor_qapi_event_delay(MonitorQAPIEventState *evstate, QDict *qdict)
 {
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 MonitorQAPIEventPending *p = evstate->delay_data;
 int64_t delta = now - p->last;
 
 trace_monitor_protocol_event_delay(p->event,
-   data,
+   qdict,
evstate->rate,
p->last,
now);
@@ -490,19 +490,19 @@ monitor_qapi_event_delay(MonitorQAPIEventState *evstate, 
QDict *data)
 return false;
 }
 
-if (p->data || delta < evstate->rate) {
+if (p->qdict || delta < evstate->rate) {
 /* If there's an existing event pending, replace
  * it with the new event, otherwise schedule a
  * timer for delayed emission
  */
-if (p->data) {
-qobject_decref(p->data);
+if (p->qdict) {
+qobject_decref(p->qdict);
 } else {
 int64_t then = p->last + evstate->rate;
 timer_mod_ns(p->timer, then);
 }
-p->data = QOBJECT(data);
-qobject_incref(p->data);
+p->qdict = QOBJECT(qdict);
+qobject_incref(p->qdict);
 return true;
 }
 
@@ -515,19 +515,19 @@ monitor_qapi_event_delay(MonitorQAPIEventState *evstate, 
QDict *data)
  * applying any rate limiting if required.
  */
 static void
-monitor_qapi_event_queue(QAPIEvent event, QDict *data, Error **errp)
+monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
 {
 MonitorQAPIEventState *evstate;
 assert(event < QAPI_EVENT_MAX);
 
 evstate = _qapi_event_state[event];
-trace_monitor_protocol_event_queue(event, data);
+trace_monitor_protocol_event_queue(event, qdict);
 
 qemu_mutex_lock(_lock);
 
 if (!evstate->delay ||
-!evstate->delay(evstate, data)) {
-monitor_qapi_event_emit(event, QOBJECT(data));
+!evstate->delay(evstate, qdict)) {
+monitor_qapi_event_emit(event, QOBJECT(qdict));
 }
 
 qemu_mutex_unlock(_lock);
@@ -543,14 +543,14 @@ static void monitor_qapi_event_handler(void *opaque)
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 
 trace_monitor_protocol_event_handler(p->event,
- p->data,
+ p->qdict,
  p->last,
  now);
 qemu_mutex_lock(_lock);
-if (p->data) {
-monitor_qapi_event_emit(p->event, p->data);
-qobject_decref(p->data);
-p->data = NULL;
+if (p->qdict) {
+monitor_qapi_event_emit(p->event, p->qdict);
+qobject_decref(p->qdict);
+p->qdict = NULL;
 }
 p->last = now;
 qemu_mutex_unlock(_lock);
-- 
2.4.3

[Qemu-devel] [PATCH v18 18/21] replay: initialization and deinitialization

2015-09-17 Thread Pavel Dovgalyuk

This patch introduces the functions for enabling the record/replay and for
freeing the resources when simulator closes.

Reviewed-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
---
 exec.c   |2 +
 replay/replay-internal.h |2 +
 replay/replay-user.c |4 +
 replay/replay.c  |  129 ++
 replay/replay.h  |   10 
 stubs/replay.c   |5 ++
 vl.c |4 +
 7 files changed, 156 insertions(+), 0 deletions(-)

diff --git a/exec.c b/exec.c
index b61126c..d31a9bb 100644
--- a/exec.c
+++ b/exec.c
@@ -51,6 +51,7 @@
 #include "qemu/main-loop.h"
 #include "exec/cputlb.h"
 #include "translate-all.h"
+#include "replay/replay.h"
 
 #include "exec/memory-internal.h"
 #include "exec/ram_addr.h"
@@ -841,6 +842,7 @@ void cpu_abort(CPUState *cpu, const char *fmt, ...)
 }
 va_end(ap2);
 va_end(ap);
+replay_finish();
 #if defined(CONFIG_USER_ONLY)
 {
 struct sigaction act;
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index 89c1dc7..820134f 100755
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -33,6 +33,8 @@ enum ReplayEvents {
 /* some of greater codes are reserved for checkpoints */
 EVENT_CHECKPOINT,
 EVENT_CHECKPOINT_LAST = EVENT_CHECKPOINT + CHECKPOINT_COUNT - 1,
+/* end of log event */
+EVENT_END,
 EVENT_COUNT
 };
 
diff --git a/replay/replay-user.c b/replay/replay-user.c
index 62765e2..6a49eda 100755
--- a/replay/replay-user.c
+++ b/replay/replay-user.c
@@ -30,3 +30,7 @@ bool replay_has_interrupt(void)
 {
 return true;
 }
+
+void replay_finish(void)
+{
+}
diff --git a/replay/replay.c b/replay/replay.c
index ca9996e..e22e595 100755
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -16,8 +16,16 @@
 #include "qemu/main-loop.h"
 #include "sysemu/sysemu.h"
 
+/* Current version of the replay mechanism.
+   Increase it when file format changes. */
+#define REPLAY_VERSION  0xe02002
+/* Size of replay log header */
+#define HEADER_SIZE (sizeof(uint32_t) + sizeof(uint64_t))
+
 ReplayMode replay_mode = REPLAY_MODE_NONE;
 
+/* Name of replay file  */
+static char *replay_filename;
 ReplayState replay_state;
 
 bool replay_next_event_is(int event)
@@ -194,3 +202,124 @@ out:
 replay_mutex_unlock();
 return res;
 }
+
+static void replay_enable(const char *fname, int mode)
+{
+const char *fmode = NULL;
+assert(!replay_file);
+
+switch (mode) {
+case REPLAY_MODE_RECORD:
+fmode = "wb";
+break;
+case REPLAY_MODE_PLAY:
+fmode = "rb";
+break;
+default:
+fprintf(stderr, "Replay: internal error: invalid replay mode\n");
+exit(1);
+}
+
+atexit(replay_finish);
+
+replay_mutex_init();
+
+replay_file = fopen(fname, fmode);
+if (replay_file == NULL) {
+fprintf(stderr, "Replay: open %s: %s\n", fname, strerror(errno));
+exit(1);
+}
+
+replay_filename = g_strdup(fname);
+
+replay_mode = mode;
+replay_data_kind = -1;
+replay_state.instructions_count = 0;
+replay_state.current_step = 0;
+
+/* skip file header for RECORD and check it for PLAY */
+if (replay_mode == REPLAY_MODE_RECORD) {
+fseek(replay_file, HEADER_SIZE, SEEK_SET);
+} else if (replay_mode == REPLAY_MODE_PLAY) {
+unsigned int version = replay_get_dword();
+if (version != REPLAY_VERSION) {
+fprintf(stderr, "Replay: invalid input log file version\n");
+exit(1);
+}
+/* go to the beginning */
+fseek(replay_file, HEADER_SIZE, SEEK_SET);
+replay_fetch_data_kind();
+}
+
+replay_init_events();
+}
+
+void replay_configure(QemuOpts *opts)
+{
+const char *fname;
+const char *rr;
+ReplayMode mode = REPLAY_MODE_NONE;
+
+rr = qemu_opt_get(opts, "rr");
+if (!rr) {
+/* Just enabling icount */
+return;
+} else if (!strcmp(rr, "record")) {
+mode = REPLAY_MODE_RECORD;
+} else if (!strcmp(rr, "replay")) {
+mode = REPLAY_MODE_PLAY;
+} else {
+error_report("Invalid icount rr option: %s", rr);
+exit(1);
+}
+
+fname = qemu_opt_get(opts, "rrfile");
+if (!fname) {
+error_report("File name not specified for replay");
+exit(1);
+}
+
+replay_enable(fname, mode);
+}
+
+void replay_start(void)
+{
+if (replay_mode == REPLAY_MODE_NONE) {
+return;
+}
+
+/* Timer for snapshotting will be set up here. */
+
+replay_enable_events();
+}
+
+void replay_finish(void)
+{
+if (replay_mode == REPLAY_MODE_NONE) {
+return;
+}
+
+replay_save_instructions();
+
+/* finalize the file */
+if (replay_file) {
+if (replay_mode == REPLAY_MODE_RECORD) {
+/* write end event */
+replay_put_event(EVENT_END);
+
+/* write

[Qemu-devel] [PATCH v18 14/21] replay: checkpoints

2015-09-17 Thread Pavel Dovgalyuk

This patch introduces checkpoints that synchronize cpu thread and iothread.
When checkpoint is met in the code all asynchronous events from the queue
are executed.

Signed-off-by: Pavel Dovgalyuk 
---
 cpus.c   |5 +
 qemu-timer.c |   40 
 replay/replay-internal.h |4 
 replay/replay.c  |   34 ++
 replay/replay.h  |   20 
 stubs/replay.c   |   11 +++
 vl.c |   23 +++
 7 files changed, 129 insertions(+), 8 deletions(-)

diff --git a/cpus.c b/cpus.c
index 46d6717..ad33fc1 100644
--- a/cpus.c
+++ b/cpus.c
@@ -402,6 +402,11 @@ void qemu_clock_warp(QEMUClockType type)
 return;
 }
 
+/* warp clock deterministically in record/replay mode */
+if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP)) {
+return;
+}
+
 if (icount_sleep) {
 /*
  * If the CPUs have been sleeping, advance QEMU_CLOCK_VIRTUAL timer 
now.
diff --git a/qemu-timer.c b/qemu-timer.c
index c7bd643..e7a5c96 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -25,6 +25,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/timer.h"
 #include "replay/replay.h"
+#include "sysemu/sysemu.h"
 
 #ifdef CONFIG_POSIX
 #include 
@@ -478,10 +479,34 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 void *opaque;
 
 qemu_event_reset(_list->timers_done_ev);
-if (!timer_list->clock->enabled) {
+if (!timer_list->clock->enabled || !timer_list->active_timers) {
 goto out;
 }
 
+switch (timer_list->clock->type) {
+case QEMU_CLOCK_REALTIME:
+break;
+default:
+case QEMU_CLOCK_VIRTUAL:
+if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
+|| !replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
+goto out;
+}
+break;
+case QEMU_CLOCK_HOST:
+if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
+|| !replay_checkpoint(CHECKPOINT_CLOCK_HOST)) {
+goto out;
+}
+break;
+case QEMU_CLOCK_VIRTUAL_RT:
+if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
+|| !replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) {
+goto out;
+}
+break;
+}
+
 current_time = qemu_clock_get_ns(timer_list->clock->type);
 for(;;) {
 qemu_mutex_lock(_list->active_timers_lock);
@@ -545,11 +570,18 @@ int64_t timerlistgroup_deadline_ns(QEMUTimerListGroup 
*tlg)
 {
 int64_t deadline = -1;
 QEMUClockType type;
+bool play = replay_mode == REPLAY_MODE_PLAY;
 for (type = 0; type < QEMU_CLOCK_MAX; type++) {
 if (qemu_clock_use_for_deadline(tlg->tl[type]->clock->type)) {
-deadline = qemu_soonest_timeout(deadline,
-timerlist_deadline_ns(
-tlg->tl[type]));
+if (!play || tlg->tl[type]->clock->type == QEMU_CLOCK_REALTIME) {
+deadline = qemu_soonest_timeout(deadline,
+timerlist_deadline_ns(
+tlg->tl[type]));
+} else {
+/* Read clock from the replay file and
+   do not calculate the deadline, based on virtual clock. */
+qemu_clock_get_ns(tlg->tl[type]->clock->type);
+}
 }
 }
 return deadline;
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index 4414695..bf64be5 100755
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -29,6 +29,10 @@ enum ReplayEvents {
 /* some of greater codes are reserved for clocks */
 EVENT_CLOCK,
 EVENT_CLOCK_LAST = EVENT_CLOCK + REPLAY_CLOCK_COUNT - 1,
+/* for checkpoint event */
+/* some of greater codes are reserved for checkpoints */
+EVENT_CHECKPOINT,
+EVENT_CHECKPOINT_LAST = EVENT_CHECKPOINT + CHECKPOINT_COUNT - 1,
 EVENT_COUNT
 };
 
diff --git a/replay/replay.c b/replay/replay.c
index 83b0f8c..ca9996e 100755
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -160,3 +160,37 @@ void replay_shutdown_request(void)
 replay_mutex_unlock();
 }
 }
+
+bool replay_checkpoint(ReplayCheckpoint checkpoint)
+{
+bool res = false;
+assert(EVENT_CHECKPOINT + checkpoint <= EVENT_CHECKPOINT_LAST);
+replay_save_instructions();
+
+if (!replay_file) {
+return true;
+}
+
+replay_mutex_lock();
+
+if (replay_mode == REPLAY_MODE_PLAY) {
+if (replay_next_event_is(EVENT_CHECKPOINT + checkpoint)) {
+replay_finish_event();
+} else if (replay_data_kind != EVENT_ASYNC) {
+res = false;
+goto out;
+}
+replay_read_events(checkpoint);
+/* replay_read_events may leave some unread events.
+

Re: [Qemu-devel] [PATCH 3/4] checkpatch: adapt some tests to QEMU

2015-09-17 Thread Eric Blake

On 09/17/2015 10:32 AM, Paolo Bonzini wrote:

 Can we revert this one, please? Checkpatch now warns about constructs
 like
   typedef struct MyDevice {
   DeviceState parent;

   int reg0, reg1, reg2;
   } MyDevice;
>>>
>>> It's interesting that qom/object.h documents this and start like:
>>>
>>> typedef struct ObjectClass ObjectClass;

> I think it varies depending on the maintainer.  PPC, USB, SCSI, ACPI all
> use a separate typedef.  I'll prepare a revert.

And qapi is about to switch from inline to separate:

https://lists.gnu.org/archive/html/qemu-devel/2015-09/msg04291.html

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [RFCv2 1/2] spapr: Remove unnecessary owner field from sPAPRDRConnector

2015-09-17 Thread Michael Roth

Quoting Paolo Bonzini (2015-09-14 09:24:23)
> 
> 
> On 14/09/2015 16:06, Alexey Kardashevskiy wrote:
> >
> > === * There is no way for a child to determine what its parent
> > is.  It is not * a bidirectional relationship.  This is by
> > design. ===
> >
> > This part always confused me as there is "Object *parent" in
> > the "struct Object". So there is way to determine but it must
> > not be used? Is it debug only?
> >
> > Anyway, all members of the Object class are under /*< private
> >> */ so they should not be accesses in sPAPR code, I believe.
> >>> Ah, good point, I missed that.  I guess we have to keep the owner
> >>> field, redundant though it seems.  Blech.
> >>
> >> I think the comment is wrong or at least inaccurate; it only applies
> >> to the external QOM interface.
> > 
> > Is this case external?
> 
> I meant external as in qom-get, qom-set, qom-list.  There isn't a ".."
> property.
> 
> > Originally I was looking for a object_get_parent() but it is not there
> > so I decided that the comment is correct or I just fail to understand it :)
> 
> Yes, we can add such an API.
> 
> Let's look also at what ->owner is used for.
> 
> > object_property_add_alias(root_container, link_name,
> >   drc->owner, child_name, );
> 
> This can be rewritten as
> 
>  object_property_add_const_link(root_container, link_name,
> drc, );
> 
> > QTAILQ_FOREACH(prop, _container->properties, node) {
> > Object *obj;
> > sPAPRDRConnector *drc;
> > sPAPRDRConnectorClass *drck;
> > uint32_t drc_index, drc_power_domain;
> > 
> > if (!strstart(prop->type, "link<", NULL)) {
> > continue;
> > }
> > 
> > obj = object_property_get_link(root_container, prop->name, NULL);
> > drc = SPAPR_DR_CONNECTOR(obj);
> > drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > 
> > if (owner && (drc->owner != owner)) {
> 
> Could the PCI host bridge instead store the DR connectors when it
> creates them with spapr_dr_connector_new?  Then you can just call
> spapr_drc_populate_dt directly with the right objects, and avoid another
> O(n^2) loop.

It could be done I think, but in some cases the DRC lists can include a mix
of DRC types/indexes from multiple parents, so we'd need to add some logic
to modify/append to existing paths in FDT.

We end up needing a global registry of some sort anyway, since RTAS
calls from the guest address the DRCs purely by the DRC index, so I
think if we can make use of that same registry to simply the above case
there's no reason not to.

> 
> Paolo
>

[Qemu-devel] [PATCH 0/2] tcg/mips: small cleanups

2015-09-17 Thread Aurelien Jarno

I have these patches for quite some time in one of my local branch in
the hope I would have time to do further changes. Given that I am going
to send a pull request for the 64-bit qemu_ld issue, I think it's a good
opportunity to also include them.

Aurelien Jarno (2):
  tcg/mips: move tcg_out_addsub2
  tcg/mips: pass oi to tcg_out_tlb_load

 tcg/mips/tcg-target.c | 118 +++---
 1 file changed, 54 insertions(+), 64 deletions(-)

-- 
2.1.4

[Qemu-devel] [PATCH v2 0/5] monitor: throttle VSERPORT_CHANGED by "id"

2015-09-17 Thread marcandre . lureau

From: Marc-André Lureau 

QAPI_EVENT_VSERPORT_CHANGE reports changes of a virtio serial port
state. However, the events may be for different ports, but the throttle
mechanism may replace the event for a different port, since it only
checks the event type.

The following series implements throttling of events based on the "id"
field. Hopefully this hash table approach can be later extended if
other fields or combination of fields have to be used.

v1->v2:
- split first patch in 2 to ease review
- remove some extra space
- add some comments above delay handler function, and struct fields
- rename the delay handler data "delay_data"
- add a trace in monitor_protocol_event_delay()
- improve some commit messages
- simplify monitor_qapi_event_delay()
- add some comment assert code in monitor_qapi_event_id_delay() to
  ensure the given pending struct is valid
- fixed hashtable key leak
- rename qdict "data" argument to "qdict"
- removed superfluous parenthesis
- use a single timer handler for doing "id" filtering cleanup

Marc-André Lureau (5):
  monitor: split MonitorQAPIEventState
  monitor: introduce MonitorQAPIEventDelay callback
  monitor: rename QDict *data->qdict
  monitor: throttle QAPI_EVENT_VSERPORT_CHANGE by "id"
  monitor: remove old entries from event hash table

 monitor.c| 256 ++-
 trace-events |   3 +-
 2 files changed, 203 insertions(+), 56 deletions(-)

-- 
2.4.3

[Qemu-devel] [PATCH v2 1/5] monitor: split MonitorQAPIEventState

2015-09-17 Thread marcandre . lureau

From: Marc-André Lureau 

Create a separate structure MonitorQAPIEventPending for holding the data
related to pending event.

The next commits are going to reuse this structure for different
throttling implementations.

Signed-off-by: Marc-André Lureau 
---
 monitor.c | 85 +--
 1 file changed, 50 insertions(+), 35 deletions(-)

diff --git a/monitor.c b/monitor.c
index 1f43263..2d2e030 100644
--- a/monitor.c
+++ b/monitor.c
@@ -174,18 +174,24 @@ typedef struct {
 bool in_command_mode;   /* are we in command mode? */
 } MonitorQMP;
 
+typedef struct {
+QAPIEvent event;/* Event being tracked */
+int64_t last;   /* QEMU_CLOCK_REALTIME value at last emission */
+QEMUTimer *timer;   /* Timer for handling delayed events */
+QObject *data;  /* Event pending delayed dispatch */
+} MonitorQAPIEventPending;
+
+typedef struct MonitorQAPIEventState MonitorQAPIEventState;
+
 /*
  * To prevent flooding clients, events can be throttled. The
  * throttling is calculated globally, rather than per-Monitor
  * instance.
  */
-typedef struct MonitorQAPIEventState {
-QAPIEvent event;/* Event being tracked */
+struct MonitorQAPIEventState {
 int64_t rate;   /* Minimum time (in ns) between two events */
-int64_t last;   /* QEMU_CLOCK_REALTIME value at last emission */
-QEMUTimer *timer;   /* Timer for handling delayed events */
-QObject *data;  /* Event pending delayed dispatch */
-} MonitorQAPIEventState;
+void *delay_data;   /* data for the throttle handler */
+};
 
 struct Monitor {
 CharDriverState *chr;
@@ -464,40 +470,41 @@ static void
 monitor_qapi_event_queue(QAPIEvent event, QDict *data, Error **errp)
 {
 MonitorQAPIEventState *evstate;
+MonitorQAPIEventPending *p;
 assert(event < QAPI_EVENT_MAX);
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 
-evstate = &(monitor_qapi_event_state[event]);
+evstate = _qapi_event_state[event];
+p = evstate->delay_data;
 trace_monitor_protocol_event_queue(event,
data,
evstate->rate,
-   evstate->last,
+   p->last,
now);
 
 /* Rate limit of 0 indicates no throttling */
 qemu_mutex_lock(_lock);
 if (!evstate->rate) {
 monitor_qapi_event_emit(event, QOBJECT(data));
-evstate->last = now;
 } else {
-int64_t delta = now - evstate->last;
-if (evstate->data ||
+int64_t delta = now - p->last;
+if (evstate->delay_data ||
 delta < evstate->rate) {
 /* If there's an existing event pending, replace
  * it with the new event, otherwise schedule a
  * timer for delayed emission
  */
-if (evstate->data) {
-qobject_decref(evstate->data);
+if (evstate->delay_data) {
+qobject_decref(evstate->delay_data);
 } else {
-int64_t then = evstate->last + evstate->rate;
-timer_mod_ns(evstate->timer, then);
+int64_t then = p->last + evstate->rate;
+timer_mod_ns(p->timer, then);
 }
-evstate->data = QOBJECT(data);
-qobject_incref(evstate->data);
+evstate->delay_data = QOBJECT(data);
+qobject_incref(evstate->delay_data);
 } else {
 monitor_qapi_event_emit(event, QOBJECT(data));
-evstate->last = now;
+p->last = now;
 }
 }
 qemu_mutex_unlock(_lock);
@@ -509,23 +516,37 @@ monitor_qapi_event_queue(QAPIEvent event, QDict *data, 
Error **errp)
  */
 static void monitor_qapi_event_handler(void *opaque)
 {
-MonitorQAPIEventState *evstate = opaque;
+MonitorQAPIEventPending *p = opaque;
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 
-trace_monitor_protocol_event_handler(evstate->event,
- evstate->data,
- evstate->last,
+trace_monitor_protocol_event_handler(p->event,
+ p->data,
+ p->last,
  now);
 qemu_mutex_lock(_lock);
-if (evstate->data) {
-monitor_qapi_event_emit(evstate->event, evstate->data);
-qobject_decref(evstate->data);
-evstate->data = NULL;
+if (p->data) {
+monitor_qapi_event_emit(p->event, p->data);
+qobject_decref(p->data);
+p->data = NULL;
 }
-evstate->last = now;
+p->last = now;
 qemu_mutex_unlock(_lock);
 }
 
+static MonitorQAPIEventPending *
+monitor_qapi_event_pending_new(QAPIEvent event)
+{
+MonitorQAPIEventPending

Re: [Qemu-devel] [PATCH 3/4] checkpatch: adapt some tests to QEMU

2015-09-17 Thread Peter Maydell

On 17 September 2015 at 17:00, Paolo Bonzini  wrote:
>
>
> On 17/09/2015 16:24, Peter Maydell wrote:
>> Can we revert this one, please? Checkpatch now warns about constructs
>> like
>>   typedef struct MyDevice {
>>   DeviceState parent;
>>
>>   int reg0, reg1, reg2;
>>   } MyDevice;
>
> It's interesting that qom/object.h documents this and start like:
>
> typedef struct ObjectClass ObjectClass;
> typedef struct Object Object;
>
> typedef struct TypeInfo TypeInfo;
>
> typedef struct InterfaceClass InterfaceClass;
> typedef struct InterfaceInfo InterfaceInfo;
>
> I have a patch to flag widely-disrespected rules that we still want to
> encourage in patches.  Would you agree with filing these typedefs under
> this category?

No, I think that having a separate typedef is worse. The
only exceptions are (a) when you need it to be separate because
you need to use the type within the struct itself (or some
similar dependency loop) (b) when you want to put the typedef
in include/qemu/typedefs.h.

I really don't see any need to suddenly outlaw something
that's been accepted as standard good QEMU style for a
long time.

(You could make checkpatch warn about
  typedef struct Foo
  {
  stuff;
  } Foo;

if you like, the newline before the '{' is out of line with our
usual approach.)

thanks
-- PMM

[Qemu-devel] [PATCH v18 04/21] replay: introduce mutex to protect the replay log

2015-09-17 Thread Pavel Dovgalyuk

This mutex will protect read/write operations for replay log.
Using mutex is necessary because most of the events consist of
several fields stored in the log. The mutex will help to avoid races.

Reviewed-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
---
 replay/replay-internal.c |   27 +++
 replay/replay-internal.h |7 +++
 2 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/replay/replay-internal.c b/replay/replay-internal.c
index 04efeee..efae672 100755
--- a/replay/replay-internal.c
+++ b/replay/replay-internal.c
@@ -17,6 +17,13 @@
 unsigned int replay_data_kind = -1;
 static unsigned int replay_has_unread_data;
 
+/* Mutex to protect reading and writing events to the log.
+   replay_data_kind and replay_has_unread_data are also protected
+   by this mutex.
+   It also protects replay events queue which stores events to be
+   written or read to the log. */
+static QemuMutex lock;
+
 /* File for replay writing */
 FILE *replay_file;
 
@@ -153,3 +160,23 @@ void replay_finish_event(void)
 replay_has_unread_data = 0;
 replay_fetch_data_kind();
 }
+
+void replay_mutex_init(void)
+{
+qemu_mutex_init();
+}
+
+void replay_mutex_destroy(void)
+{
+qemu_mutex_destroy();
+}
+
+void replay_mutex_lock(void)
+{
+qemu_mutex_lock();
+}
+
+void replay_mutex_unlock(void)
+{
+qemu_mutex_unlock();
+}
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index 17600de..8a0de0d 100755
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -33,6 +33,13 @@ int64_t replay_get_qword(void);
 void replay_get_array(uint8_t *buf, size_t *size);
 void replay_get_array_alloc(uint8_t **buf, size_t *size);
 
+/* Mutex functions for protecting replay log file */
+
+void replay_mutex_init(void);
+void replay_mutex_destroy(void);
+void replay_mutex_lock(void);
+void replay_mutex_unlock(void);
+
 /*! Checks error status of the file. */
 void replay_check_error(void);

[Qemu-devel] [PATCH v18 03/21] replay: internal functions for replay log

2015-09-17 Thread Pavel Dovgalyuk

This patch adds functions to perform read and write operations
with replay log.

Reviewed-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
---
 replay/Makefile.objs |1 
 replay/replay-internal.c |  155 ++
 replay/replay-internal.h |   46 ++
 3 files changed, 202 insertions(+), 0 deletions(-)
 create mode 100755 replay/replay-internal.c
 create mode 100755 replay/replay-internal.h

diff --git a/replay/Makefile.objs b/replay/Makefile.objs
index 0b9cb99..9711e14 100755
--- a/replay/Makefile.objs
+++ b/replay/Makefile.objs
@@ -1,2 +1,3 @@
 obj-$(CONFIG_SOFTMMU) += replay.o
+obj-$(CONFIG_SOFTMMU) += replay-internal.o
 obj-$(CONFIG_USER_ONLY) += replay-user.o
diff --git a/replay/replay-internal.c b/replay/replay-internal.c
new file mode 100755
index 000..04efeee
--- /dev/null
+++ b/replay/replay-internal.c
@@ -0,0 +1,155 @@
+/*
+ * replay-internal.c
+ *
+ * Copyright (c) 2010-2015 Institute for System Programming
+ * of the Russian Academy of Sciences.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "replay-internal.h"
+#include "qemu/error-report.h"
+#include "sysemu/sysemu.h"
+
+unsigned int replay_data_kind = -1;
+static unsigned int replay_has_unread_data;
+
+/* File for replay writing */
+FILE *replay_file;
+
+void replay_put_byte(uint8_t byte)
+{
+if (replay_file) {
+putc(byte, replay_file);
+}
+}
+
+void replay_put_event(uint8_t event)
+{
+replay_put_byte(event);
+}
+
+
+void replay_put_word(uint16_t word)
+{
+replay_put_byte(word >> 8);
+replay_put_byte(word);
+}
+
+void replay_put_dword(uint32_t dword)
+{
+replay_put_word(dword >> 16);
+replay_put_word(dword);
+}
+
+void replay_put_qword(int64_t qword)
+{
+replay_put_dword(qword >> 32);
+replay_put_dword(qword);
+}
+
+void replay_put_array(const uint8_t *buf, size_t size)
+{
+if (replay_file) {
+replay_put_dword(size);
+fwrite(buf, 1, size, replay_file);
+}
+}
+
+uint8_t replay_get_byte(void)
+{
+uint8_t byte = 0;
+if (replay_file) {
+byte = getc(replay_file);
+}
+return byte;
+}
+
+uint16_t replay_get_word(void)
+{
+uint16_t word = 0;
+if (replay_file) {
+word = replay_get_byte();
+word = (word << 8) + replay_get_byte();
+}
+
+return word;
+}
+
+uint32_t replay_get_dword(void)
+{
+uint32_t dword = 0;
+if (replay_file) {
+dword = replay_get_word();
+dword = (dword << 16) + replay_get_word();
+}
+
+return dword;
+}
+
+int64_t replay_get_qword(void)
+{
+int64_t qword = 0;
+if (replay_file) {
+qword = replay_get_dword();
+qword = (qword << 32) + replay_get_dword();
+}
+
+return qword;
+}
+
+void replay_get_array(uint8_t *buf, size_t *size)
+{
+if (replay_file) {
+*size = replay_get_dword();
+if (fread(buf, 1, *size, replay_file) != *size) {
+error_report("replay read error");
+}
+}
+}
+
+void replay_get_array_alloc(uint8_t **buf, size_t *size)
+{
+if (replay_file) {
+*size = replay_get_dword();
+*buf = g_malloc(*size);
+if (fread(*buf, 1, *size, replay_file) != *size) {
+error_report("replay read error");
+}
+}
+}
+
+void replay_check_error(void)
+{
+if (replay_file) {
+if (feof(replay_file)) {
+error_report("replay file is over");
+qemu_system_vmstop_request_prepare();
+qemu_system_vmstop_request(RUN_STATE_PAUSED);
+} else if (ferror(replay_file)) {
+error_report("replay file is over or something goes wrong");
+qemu_system_vmstop_request_prepare();
+qemu_system_vmstop_request(RUN_STATE_INTERNAL_ERROR);
+}
+}
+}
+
+void replay_fetch_data_kind(void)
+{
+if (replay_file) {
+if (!replay_has_unread_data) {
+replay_data_kind = replay_get_byte();
+replay_check_error();
+replay_has_unread_data = 1;
+}
+}
+}
+
+void replay_finish_event(void)
+{
+replay_has_unread_data = 0;
+replay_fetch_data_kind();
+}
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
new file mode 100755
index 000..17600de
--- /dev/null
+++ b/replay/replay-internal.h
@@ -0,0 +1,46 @@
+#ifndef REPLAY_INTERNAL_H
+#define REPLAY_INTERNAL_H
+
+/*
+ * replay-internal.h
+ *
+ * Copyright (c) 2010-2015 Institute for System Programming
+ * of the Russian Academy of Sciences.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include 
+
+extern unsigned int replay_data_kind;
+
+/* File for replay writing */
+extern FILE *replay_file;
+
+void

[Qemu-devel] [PATCH v18 20/21] replay: command line options

2015-09-17 Thread Pavel Dovgalyuk

This patch introduces command line options for enabling recording or replaying
virtual machine behavior. These options are added to icount command line
parameter. They include 'rr' which switches between record and replay
and 'rrfile' for specifying the filename for replay log.

Signed-off-by: Pavel Dovgalyuk 
---
 qemu-options.hx |8 ++--
 replay/replay.c |4 
 vl.c|   15 ---
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 7e147b8..cb6def8 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3150,12 +3150,12 @@ re-inject them.
 ETEXI
 
 DEF("icount", HAS_ARG, QEMU_OPTION_icount, \
-"-icount [shift=N|auto][,align=on|off][,sleep=no]\n" \
+"-icount 
[shift=N|auto][,align=on|off][,sleep=no,rr=record|replay,rrfile=]\n" \
 "enable virtual instruction counter with 2^N clock ticks 
per\n" \
 "instruction, enable aligning the host and virtual 
clocks\n" \
 "or disable real time cpu sleeping\n", QEMU_ARCH_ALL)
 STEXI
-@item -icount [shift=@var{N}|auto]
+@item -icount [shift=@var{N}|auto][,rr=record|replay,rrfile=@var{filename}]
 @findex -icount
 Enable virtual instruction counter.  The virtual cpu will execute one
 instruction every 2^@var{N} ns of virtual time.  If @code{auto} is specified
@@ -3184,6 +3184,10 @@ Currently this option does not work when @option{shift} 
is @code{auto}.
 Note: The sync algorithm will work for those shift values for which
 the guest clock runs ahead of the host clock. Typically this happens
 when the shift value is high (how high depends on the host machine).
+
+When @option{rr} option is specified deterministic record/replay is enabled.
+Replay log is written into @var{filename} file in record mode and
+read from this file in replay mode.
 ETEXI
 
 DEF("watchdog", HAS_ARG, QEMU_OPTION_watchdog, \
diff --git a/replay/replay.c b/replay/replay.c
index 671b7b3..dfa3d6f 100755
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -295,6 +295,10 @@ void replay_start(void)
  error_get_pretty(replay_blockers->data));
 exit(1);
 }
+if (!use_icount) {
+error_report("Please enable icount to use record/replay");
+exit(1);
+}
 
 /* Timer for snapshotting will be set up here. */
 
diff --git a/vl.c b/vl.c
index 1f24c94..1bcfcc3 100644
--- a/vl.c
+++ b/vl.c
@@ -476,6 +476,12 @@ static QemuOptsList qemu_icount_opts = {
 }, {
 .name = "sleep",
 .type = QEMU_OPT_BOOL,
+}, {
+.name = "rr",
+.type = QEMU_OPT_STRING,
+}, {
+.name = "rrfile",
+.type = QEMU_OPT_STRING,
 },
 { /* end of list */ }
 },
@@ -4038,6 +4044,8 @@ int main(int argc, char **argv, char **envp)
 }
 }
 
+replay_configure(icount_opts);
+
 opts = qemu_get_machine_opts();
 optarg = qemu_opt_get(opts, "type");
 if (optarg) {
@@ -4471,9 +4479,10 @@ int main(int argc, char **argv, char **envp)
 }
 
 /* open the virtual block devices */
-if (snapshot)
-qemu_opts_foreach(qemu_find_opts("drive"),
-  drive_enable_snapshot, NULL, NULL);
+if (snapshot || replay_mode != REPLAY_MODE_NONE) {
+qemu_opts_foreach(qemu_find_opts("drive"), drive_enable_snapshot,
+  NULL, NULL);
+}
 if (qemu_opts_foreach(qemu_find_opts("drive"), drive_init_func,
   _class->block_default_type, NULL)) {
 exit(1);

Re: [Qemu-devel] [RFCv2 1/2] spapr: Remove unnecessary owner field from sPAPRDRConnector

2015-09-17 Thread Michael Roth

Quoting David Gibson (2015-09-15 22:16:35)
> On Mon, Sep 14, 2015 at 04:24:23PM +0200, Paolo Bonzini wrote:
> > 
> > 
> > On 14/09/2015 16:06, Alexey Kardashevskiy wrote:
> > >
> > > === * There is no way for a child to determine what its parent
> > > is.  It is not * a bidirectional relationship.  This is by
> > > design. ===
> > >
> > > This part always confused me as there is "Object *parent" in
> > > the "struct Object". So there is way to determine but it must
> > > not be used? Is it debug only?
> > >
> > > Anyway, all members of the Object class are under /*< private
> > >> */ so they should not be accesses in sPAPR code, I believe.
> > >>> Ah, good point, I missed that.  I guess we have to keep the owner
> > >>> field, redundant though it seems.  Blech.
> > >>
> > >> I think the comment is wrong or at least inaccurate; it only applies
> > >> to the external QOM interface.
> > > 
> > > Is this case external?
> > 
> > I meant external as in qom-get, qom-set, qom-list.  There isn't a ".."
> > property.
> > 
> > > Originally I was looking for a object_get_parent() but it is not there
> > > so I decided that the comment is correct or I just fail to understand it 
> > > :)
> > 
> > Yes, we can add such an API.
> > 
> > Let's look also at what ->owner is used for.
> > 
> > > object_property_add_alias(root_container, link_name,
> > >   drc->owner, child_name, );
> > 
> > This can be rewritten as
> > 
> >  object_property_add_const_link(root_container, link_name,
> > drc, );
> > 
> > > QTAILQ_FOREACH(prop, _container->properties, node) {
> > > Object *obj;
> > > sPAPRDRConnector *drc;
> > > sPAPRDRConnectorClass *drck;
> > > uint32_t drc_index, drc_power_domain;
> > > 
> > > if (!strstart(prop->type, "link<", NULL)) {
> > > continue;
> > > }
> > > 
> > > obj = object_property_get_link(root_container, prop->name, NULL);
> > > drc = SPAPR_DR_CONNECTOR(obj);
> > > drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > > 
> > > if (owner && (drc->owner != owner)) {
> > 
> > Could the PCI host bridge instead store the DR connectors when it
> > creates them with spapr_dr_connector_new?  Then you can just call
> > spapr_drc_populate_dt directly with the right objects, and avoid another
> > O(n^2) loop.
> 
> So, yes, iterating over the connectors under the owner rather than
> going via the global links is another cleanup I've considered but
> haven't gotten around to.

We still need globals for RTAS lookups. I think QOM is our most
mature/well-tested interface for managing inter-device
relationships/lookups, but I can understand if using
root_container/link seems off. To me it seem like a nice
"freebie" we get from using QOM, and gave us nice guarantees like
globally unique paths to correspond to globally unique DRC indexes.

But it's a simple enough task that it can live elsewhere. It will
still be global though (RTAS + device_add + FDT generation all
needs to reference it), and I don't think adding more global
interfaces to do things already provided by QOM is worth it.

> 
> Except, I've been thinking further, and I'm not sure it makes sense to
> keep these DR connectors around as full QOM objects anyway.  Having
> them here at least potentially exposes the DR connector stuff to
> users, when it's really an internal of how the hypervisor communicates
> with the guest about hotplug.

I think QOM should manage a machine's composition tree. The fact that
there's a side-effect of exposing things to users shouldn't prevent
us from making use of it. We expose all sorts of state about gpio
pins, memory regions, etc. currently. If there's a need to hide things
from external users, or better manage things like read/write permissions,
I think that logic should be added to QOM rather than used as an argument
against it.

Whether the current composition tree makes sense is tricky to answer.
pseries is guest-only, so everything is sort of a hypervisor internal,
but if a theoretical baremetal pseries existed, I think you could easily
end up with, in the case of PCI anyway, an SHPC-like device that's wired
up to a bus of some sort, addressed by DRC index, with registers to set
indicator/isolation states and internal logic to control state
transitions like SHPC has.

SHPCs are exposed to guests via PCI bus, but pseries calls for RTAS,
and SHPCs are modeled as an internal of the bus controller, whereas
things like memory/cpu don't really have a bus we can piggy-back on,
so we have something a little ad-hoc where DRCs are modeled as
self-contained devices.

FWIW, the initial implementation of hotplug did not use QOM, and was
rewritten to use it at the suggestion of Alex. I think the pieces
fall into place much more logically taking the latter approach and
modeling DRCs as devices. The alternative is to make DRC an internal
detail of PCI

Re: [Qemu-devel] [PATCH 3/4] checkpatch: adapt some tests to QEMU

2015-09-17 Thread Paolo Bonzini

On 17/09/2015 16:24, Peter Maydell wrote:
> Can we revert this one, please? Checkpatch now warns about constructs
> like
>   typedef struct MyDevice {
>   DeviceState parent;
> 
>   int reg0, reg1, reg2;
>   } MyDevice;

It's interesting that qom/object.h documents this and start like:

typedef struct ObjectClass ObjectClass;
typedef struct Object Object;

typedef struct TypeInfo TypeInfo;

typedef struct InterfaceClass InterfaceClass;
typedef struct InterfaceInfo InterfaceInfo;

I have a patch to flag widely-disrespected rules that we still want to
encourage in patches.  Would you agree with filing these typedefs under
this category?

Paolo

[Qemu-devel] [PATCH v2 2/5] monitor: introduce MonitorQAPIEventDelay callback

2015-09-17 Thread marcandre . lureau

From: Marc-André Lureau 

Move the delay handling in a seperate function that can be changed for
different throttling implementations, as will be done in following commits.

Signed-off-by: Marc-André Lureau 
---
 monitor.c| 85 +++-
 trace-events |  3 ++-
 2 files changed, 57 insertions(+), 31 deletions(-)

diff --git a/monitor.c b/monitor.c
index 2d2e030..e78ecc2 100644
--- a/monitor.c
+++ b/monitor.c
@@ -183,6 +183,8 @@ typedef struct {
 
 typedef struct MonitorQAPIEventState MonitorQAPIEventState;
 
+typedef bool (*MonitorQAPIEventDelay)(MonitorQAPIEventState *evstate,
+  QDict *data);
 /*
  * To prevent flooding clients, events can be throttled. The
  * throttling is calculated globally, rather than per-Monitor
@@ -190,6 +192,7 @@ typedef struct MonitorQAPIEventState MonitorQAPIEventState;
  */
 struct MonitorQAPIEventState {
 int64_t rate;   /* Minimum time (in ns) between two events */
+MonitorQAPIEventDelay delay;
 void *delay_data;   /* data for the throttle handler */
 };
 
@@ -463,50 +466,70 @@ static void monitor_qapi_event_emit(QAPIEvent event, 
QObject *data)
 }
 
 /*
- * Queue a new event for emission to Monitor instances,
- * applying any rate limiting if required.
+ * A simple delay handler, that will delay all events according to the
+ * evstate->rate limit.
+ *
+ * Return 'false' if the event is not delayed and can be emitted now.
  */
-static void
-monitor_qapi_event_queue(QAPIEvent event, QDict *data, Error **errp)
+static bool
+monitor_qapi_event_delay(MonitorQAPIEventState *evstate, QDict *data)
 {
-MonitorQAPIEventState *evstate;
-MonitorQAPIEventPending *p;
-assert(event < QAPI_EVENT_MAX);
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+MonitorQAPIEventPending *p = evstate->delay_data;
+int64_t delta = now - p->last;
 
-evstate = _qapi_event_state[event];
-p = evstate->delay_data;
-trace_monitor_protocol_event_queue(event,
+trace_monitor_protocol_event_delay(p->event,
data,
evstate->rate,
p->last,
now);
 
 /* Rate limit of 0 indicates no throttling */
-qemu_mutex_lock(_lock);
 if (!evstate->rate) {
-monitor_qapi_event_emit(event, QOBJECT(data));
-} else {
-int64_t delta = now - p->last;
-if (evstate->delay_data ||
-delta < evstate->rate) {
-/* If there's an existing event pending, replace
- * it with the new event, otherwise schedule a
- * timer for delayed emission
- */
-if (evstate->delay_data) {
-qobject_decref(evstate->delay_data);
-} else {
-int64_t then = p->last + evstate->rate;
-timer_mod_ns(p->timer, then);
-}
-evstate->delay_data = QOBJECT(data);
-qobject_incref(evstate->delay_data);
+p->last = now;
+return false;
+}
+
+if (p->data || delta < evstate->rate) {
+/* If there's an existing event pending, replace
+ * it with the new event, otherwise schedule a
+ * timer for delayed emission
+ */
+if (p->data) {
+qobject_decref(p->data);
 } else {
-monitor_qapi_event_emit(event, QOBJECT(data));
-p->last = now;
+int64_t then = p->last + evstate->rate;
+timer_mod_ns(p->timer, then);
 }
+p->data = QOBJECT(data);
+qobject_incref(p->data);
+return true;
 }
+
+p->last = now;
+return false;
+}
+
+/*
+ * Queue a new event for emission to Monitor instances,
+ * applying any rate limiting if required.
+ */
+static void
+monitor_qapi_event_queue(QAPIEvent event, QDict *data, Error **errp)
+{
+MonitorQAPIEventState *evstate;
+assert(event < QAPI_EVENT_MAX);
+
+evstate = _qapi_event_state[event];
+trace_monitor_protocol_event_queue(event, data);
+
+qemu_mutex_lock(_lock);
+
+if (!evstate->delay ||
+!evstate->delay(evstate, data)) {
+monitor_qapi_event_emit(event, QOBJECT(data));
+}
+
 qemu_mutex_unlock(_lock);
 }
 
@@ -566,6 +589,8 @@ monitor_qapi_event_throttle(QAPIEvent event, int64_t rate)
 trace_monitor_protocol_event_throttle(event, rate);
 assert(rate * SCALE_MS <= INT64_MAX);
 evstate->rate = rate * SCALE_MS;
+
+evstate->delay = monitor_qapi_event_delay;
 evstate->delay_data = monitor_qapi_event_pending_new(event);
 }
 
diff --git a/trace-events b/trace-events
index 88a2f14..20eb8bf 100644
--- a/trace-events
+++ b/trace-events
@@ -1035,7 +1035,8 @@ handle_qmp_command(void *mon, const char *cmd_name) "mon 
%p cmd_name \"%s\""
 monitor_protocol_emitter(void *mon) "mon %p"

[Qemu-devel] [PATCH v18 07/21] cpu: replay instructions sequence

2015-09-17 Thread Pavel Dovgalyuk

This patch adds calls to replay functions into the icount setup block.
In record mode number of executed instructions is written to the log.
In replay mode number of istructions to execute is taken from the replay log.
When replayed instructions counter is expired qemu_notify_event()
function is called to wake up the iothread.

Reviewed-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
---
 cpus.c  |   38 +-
 replay/replay.c |   34 ++
 replay/replay.h |4 
 3 files changed, 63 insertions(+), 13 deletions(-)

diff --git a/cpus.c b/cpus.c
index 056..8884000 100644
--- a/cpus.c
+++ b/cpus.c
@@ -42,6 +42,7 @@
 #include "qemu/seqlock.h"
 #include "qapi-event.h"
 #include "hw/nmi.h"
+#include "replay/replay.h"
 
 #ifndef _WIN32
 #include "qemu/compatfd.h"
@@ -1333,6 +1334,28 @@ int vm_stop_force_state(RunState state)
 }
 }
 
+static int64_t tcg_get_icount_limit(void)
+{
+int64_t deadline;
+
+if (replay_mode != REPLAY_MODE_PLAY) {
+deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+
+/* Maintain prior (possibly buggy) behaviour where if no deadline
+ * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than
+ * INT32_MAX nanoseconds ahead, we still use INT32_MAX
+ * nanoseconds.
+ */
+if ((deadline < 0) || (deadline > INT32_MAX)) {
+deadline = INT32_MAX;
+}
+
+return qemu_icount_round(deadline);
+} else {
+return replay_get_instructions();
+}
+}
+
 static int tcg_cpu_exec(CPUState *cpu)
 {
 int ret;
@@ -1345,24 +1368,12 @@ static int tcg_cpu_exec(CPUState *cpu)
 #endif
 if (use_icount) {
 int64_t count;
-int64_t deadline;
 int decr;
 timers_state.qemu_icount -= (cpu->icount_decr.u16.low
 + cpu->icount_extra);
 cpu->icount_decr.u16.low = 0;
 cpu->icount_extra = 0;
-deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
-
-/* Maintain prior (possibly buggy) behaviour where if no deadline
- * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than
- * INT32_MAX nanoseconds ahead, we still use INT32_MAX
- * nanoseconds.
- */
-if ((deadline < 0) || (deadline > INT32_MAX)) {
-deadline = INT32_MAX;
-}
-
-count = qemu_icount_round(deadline);
+count = tcg_get_icount_limit();
 timers_state.qemu_icount += count;
 decr = (count > 0x) ? 0x : count;
 count -= decr;
@@ -1380,6 +1391,7 @@ static int tcg_cpu_exec(CPUState *cpu)
 + cpu->icount_extra);
 cpu->icount_decr.u32 = 0;
 cpu->icount_extra = 0;
+replay_account_executed_instructions();
 }
 return ret;
 }
diff --git a/replay/replay.c b/replay/replay.c
index 7c5af4c..731eec3 100755
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -13,6 +13,7 @@
 #include "replay.h"
 #include "replay-internal.h"
 #include "qemu/timer.h"
+#include "qemu/main-loop.h"
 
 ReplayMode replay_mode = REPLAY_MODE_NONE;
 
@@ -45,3 +46,36 @@ uint64_t replay_get_current_step(void)
 {
 return cpu_get_icount_raw();
 }
+
+int replay_get_instructions(void)
+{
+int res = 0;
+replay_mutex_lock();
+if (replay_next_event_is(EVENT_INSTRUCTION)) {
+res = replay_state.instructions_count;
+}
+replay_mutex_unlock();
+return res;
+}
+
+void replay_account_executed_instructions(void)
+{
+if (replay_mode == REPLAY_MODE_PLAY) {
+replay_mutex_lock();
+if (replay_state.instructions_count > 0) {
+int count = (int)(replay_get_current_step()
+  - replay_state.current_step);
+replay_state.instructions_count -= count;
+replay_state.current_step += count;
+if (replay_state.instructions_count == 0) {
+assert(replay_data_kind == EVENT_INSTRUCTION);
+replay_finish_event();
+/* Wake up iothread. This is required because
+   timers will not expire until clock counters
+   will be read from the log. */
+qemu_notify_event();
+}
+}
+replay_mutex_unlock();
+}
+}
diff --git a/replay/replay.h b/replay/replay.h
index a03c748..d19715f 100755
--- a/replay/replay.h
+++ b/replay/replay.h
@@ -22,5 +22,9 @@ extern ReplayMode replay_mode;
 
 /*! Returns number of executed instructions. */
 uint64_t replay_get_current_step(void);
+/*! Returns number of instructions to execute in replay mode. */
+int replay_get_instructions(void);
+/*! Updates instructions counter in replay mode. */
+void replay_account_executed_instructions(void);
 
 #endif

[Qemu-devel] [PATCH v18 02/21] replay: global variables and function stubs

2015-09-17 Thread Pavel Dovgalyuk

This patch adds global variables, defines, function declarations,
and function stubs for deterministic VM replay used by external modules.

Reviewed-by: Paolo Bonzini 
Reviewed-by: Eric Blake 

Signed-off-by: Pavel Dovgalyuk 
---
 Makefile.target  |1 
 docs/replay.txt  |  168 ++
 qapi-schema.json |   18 +
 replay/Makefile.objs |2 +
 replay/replay-user.c |   12 
 replay/replay.c  |   14 
 replay/replay.h  |   19 ++
 stubs/Makefile.objs  |1 
 stubs/replay.c   |3 +
 9 files changed, 238 insertions(+), 0 deletions(-)
 create mode 100755 docs/replay.txt
 create mode 100755 replay/Makefile.objs
 create mode 100755 replay/replay-user.c
 create mode 100755 replay/replay.c
 create mode 100755 replay/replay.h
 create mode 100755 stubs/replay.c

diff --git a/Makefile.target b/Makefile.target
index edd136f..cc52b67 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -86,6 +86,7 @@ all: $(PROGS) stap
 # cpu emulator library
 obj-y = exec.o translate-all.o cpu-exec.o
 obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
+obj-y += replay/
 obj-$(CONFIG_TCG_INTERPRETER) += tci.o
 obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
 obj-y += fpu/softfloat.o
diff --git a/docs/replay.txt b/docs/replay.txt
new file mode 100755
index 000..645d462
--- /dev/null
+++ b/docs/replay.txt
@@ -0,0 +1,168 @@
+Copyright (c) 2010-2015 Institute for System Programming
+of the Russian Academy of Sciences.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+Record/replay
+-
+
+Record/replay functions are used for the reverse execution and deterministic
+replay of qemu execution. This implementation of deterministic replay can
+be used for deterministic debugging of guest code through a gdb remote
+interface.
+
+Execution recording writes a non-deterministic events log, which can be later
+used for replaying the execution anywhere and for unlimited number of times.
+It also supports checkpointing for faster rewinding during reverse debugging.
+Execution replaying reads the log and replays all non-deterministic events
+including external input, hardware clocks, and interrupts.
+
+Deterministic replay has the following features:
+ * Deterministically replays whole system execution and all contents of
+   the memory, state of the hardware devices, clocks, and screen of the VM.
+ * Writes execution log into the file for later replaying for multiple times
+   on different machines.
+ * Supports i386, x86_64, and ARM hardware platforms.
+ * Performs deterministic replay of all operations with keyboard and mouse
+   input devices.
+
+Usage of the record/replay:
+ * First, record the execution, by adding the following arguments to the 
command line:
+   '-icount shift=7,rr=record,rrfile=replay.bin -net none'. 
+   Block devices' images are not actually changed in the recording mode,
+   because all of the changes are written to the temporary overlay file.
+ * Then you can replay it by using another command
+   line option: '-icount shift=7,rr=replay,rrfile=replay.bin -net none'
+ * '-net none' option should also be specified if network replay patches
+   are not applied.
+
+Papers with description of deterministic replay implementation:
+http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html
+http://dl.acm.org/citation.cfm?id=2786805.2803179
+
+Modifications of qemu include:
+ * wrappers for clock and time functions to save their return values in the log
+ * saving different asynchronous events (e.g. system shutdown) into the log
+ * synchronization of the bottom halves execution
+ * synchronization of the threads from thread pool
+ * recording/replaying user input (mouse and keyboard)
+ * adding internal checkpoints for cpu and io synchronization
+
+Non-deterministic events
+
+
+Our record/replay system is based on saving and replaying non-deterministic
+events (e.g. keyboard input) and simulating deterministic ones (e.g. reading
+from HDD or memory of the VM). Saving only non-deterministic events makes
+log file smaller, simulation faster, and allows using reverse debugging even
+for realtime applications.
+
+The following non-deterministic data from peripheral devices is saved into
+the log: mouse and keyboard input, network packets, audio controller input,
+USB packets, serial port input, and hardware clocks (they are non-deterministic
+too, because their values are taken from the host machine). Inputs from
+simulated hardware, memory of VM, software interrupts, and execution of
+instructions are not saved into the log, because they are deterministic and
+can be replayed by simulating the behavior of virtual machine starting from
+initial state.
+
+We had to solve three tasks to implement deterministic replay: recording

[Qemu-devel] [PATCH v18 05/21] replay: introduce icount event

2015-09-17 Thread Pavel Dovgalyuk

This patch adds icount event to the replay subsystem. This event corresponds
to execution of several instructions and used to synchronize input events
in the replay phase.

Reviewed-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
---
 replay/replay-internal.c |   24 
 replay/replay-internal.h |   21 +
 replay/replay.c  |   33 +
 replay/replay.h  |7 +++
 4 files changed, 85 insertions(+), 0 deletions(-)

diff --git a/replay/replay-internal.c b/replay/replay-internal.c
index efae672..69fe49f 100755
--- a/replay/replay-internal.c
+++ b/replay/replay-internal.c
@@ -10,6 +10,7 @@
  */
 
 #include "qemu-common.h"
+#include "replay.h"
 #include "replay-internal.h"
 #include "qemu/error-report.h"
 #include "sysemu/sysemu.h"
@@ -36,6 +37,7 @@ void replay_put_byte(uint8_t byte)
 
 void replay_put_event(uint8_t event)
 {
+assert(event < EVENT_COUNT);
 replay_put_byte(event);
 }
 
@@ -149,8 +151,15 @@ void replay_fetch_data_kind(void)
 if (replay_file) {
 if (!replay_has_unread_data) {
 replay_data_kind = replay_get_byte();
+if (replay_data_kind == EVENT_INSTRUCTION) {
+replay_state.instructions_count = replay_get_dword();
+}
 replay_check_error();
 replay_has_unread_data = 1;
+if (replay_data_kind >= EVENT_COUNT) {
+error_report("Replay: unknown event kind %d", 
replay_data_kind);
+exit(1);
+}
 }
 }
 }
@@ -180,3 +189,18 @@ void replay_mutex_unlock(void)
 {
 qemu_mutex_unlock();
 }
+
+/*! Saves cached instructions. */
+void replay_save_instructions(void)
+{
+if (replay_file && replay_mode == REPLAY_MODE_RECORD) {
+replay_mutex_lock();
+int diff = (int)(replay_get_current_step() - 
replay_state.current_step);
+if (first_cpu != NULL && diff > 0) {
+replay_put_event(EVENT_INSTRUCTION);
+replay_put_dword(diff);
+replay_state.current_step += diff;
+}
+replay_mutex_unlock();
+}
+}
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index 8a0de0d..ff4fabc 100755
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -14,6 +14,20 @@
 
 #include 
 
+enum ReplayEvents {
+/* for instruction event */
+EVENT_INSTRUCTION,
+EVENT_COUNT
+};
+
+typedef struct ReplayState {
+/*! Current step - number of processed instructions and timer events. */
+uint64_t current_step;
+/*! Number of instructions to be executed before other events happen. */
+int instructions_count;
+} ReplayState;
+extern ReplayState replay_state;
+
 extern unsigned int replay_data_kind;
 
 /* File for replay writing */
@@ -50,4 +64,11 @@ void replay_finish_event(void);
 replay_data_kind variable. */
 void replay_fetch_data_kind(void);
 
+/*! Saves queued events (like instructions and sound). */
+void replay_save_instructions(void);
+
+/*! Skips async events until some sync event will be found.
+\return true, if event was found */
+bool replay_next_event_is(int event);
+
 #endif
diff --git a/replay/replay.c b/replay/replay.c
index 5ce066f..7c5af4c 100755
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -9,6 +9,39 @@
  *
  */
 
+#include "qemu-common.h"
 #include "replay.h"
+#include "replay-internal.h"
+#include "qemu/timer.h"
 
 ReplayMode replay_mode = REPLAY_MODE_NONE;
+
+ReplayState replay_state;
+
+bool replay_next_event_is(int event)
+{
+bool res = false;
+
+/* nothing to skip - not all instructions used */
+if (replay_state.instructions_count != 0) {
+assert(replay_data_kind == EVENT_INSTRUCTION);
+return event == EVENT_INSTRUCTION;
+}
+
+while (true) {
+if (event == replay_data_kind) {
+res = true;
+}
+switch (replay_data_kind) {
+default:
+/* clock, time_t, checkpoint and other events */
+return res;
+}
+}
+return res;
+}
+
+uint64_t replay_get_current_step(void)
+{
+return cpu_get_icount_raw();
+}
diff --git a/replay/replay.h b/replay/replay.h
index d6b73c3..a03c748 100755
--- a/replay/replay.h
+++ b/replay/replay.h
@@ -12,8 +12,15 @@
  *
  */
 
+#include 
+#include 
 #include "qapi-types.h"
 
 extern ReplayMode replay_mode;
 
+/* Processing the instructions */
+
+/*! Returns number of executed instructions. */
+uint64_t replay_get_current_step(void);
+
 #endif

[Qemu-devel] [PATCH v18 21/21] replay: recording of the user input

2015-09-17 Thread Pavel Dovgalyuk

This records user input (keyboard and mouse events) in record mode and replays
these input events in replay mode.

Signed-off-by: Pavel Dovgalyuk 
---
 include/ui/input.h   |2 +
 replay/Makefile.objs |1 
 replay/replay-events.c   |   33 +
 replay/replay-input.c|  160 ++
 replay/replay-internal.h |   13 
 replay/replay.h  |4 +
 ui/input.c   |   27 +---
 7 files changed, 232 insertions(+), 8 deletions(-)
 create mode 100755 replay/replay-input.c

diff --git a/include/ui/input.h b/include/ui/input.h
index 5d5ac00..d06a12d 100644
--- a/include/ui/input.h
+++ b/include/ui/input.h
@@ -33,7 +33,9 @@ void qemu_input_handler_bind(QemuInputHandlerState *s,
  const char *device_id, int head,
  Error **errp);
 void qemu_input_event_send(QemuConsole *src, InputEvent *evt);
+void qemu_input_event_send_impl(QemuConsole *src, InputEvent *evt);
 void qemu_input_event_sync(void);
+void qemu_input_event_sync_impl(void);
 
 InputEvent *qemu_input_event_new_key(KeyValue *key, bool down);
 void qemu_input_event_send_key(QemuConsole *src, KeyValue *key, bool down);
diff --git a/replay/Makefile.objs b/replay/Makefile.objs
index 4aa527e..689b1e9 100755
--- a/replay/Makefile.objs
+++ b/replay/Makefile.objs
@@ -2,4 +2,5 @@ obj-$(CONFIG_SOFTMMU) += replay.o
 obj-$(CONFIG_SOFTMMU) += replay-internal.o
 obj-$(CONFIG_SOFTMMU) += replay-events.o
 obj-$(CONFIG_SOFTMMU) += replay-time.o
+obj-$(CONFIG_SOFTMMU) += replay-input.o
 obj-$(CONFIG_USER_ONLY) += replay-user.o
diff --git a/replay/replay-events.c b/replay/replay-events.c
index f356493..23f3b12 100755
--- a/replay/replay-events.c
+++ b/replay/replay-events.c
@@ -14,6 +14,7 @@
 #include "replay.h"
 #include "replay-internal.h"
 #include "block/aio.h"
+#include "ui/input.h"
 
 typedef struct Event {
 ReplayAsyncEventKind event_kind;
@@ -39,6 +40,13 @@ static void replay_run_event(Event *event)
 case REPLAY_ASYNC_EVENT_PTIMER:
 aio_bh_call(event->opaque);
 break;
+case REPLAY_ASYNC_EVENT_INPUT:
+qemu_input_event_send_impl(NULL, (InputEvent *)event->opaque);
+qapi_free_InputEvent((InputEvent *)event->opaque);
+break;
+case REPLAY_ASYNC_EVENT_INPUT_SYNC:
+qemu_input_event_sync_impl();
+break;
 default:
 error_report("Replay: invalid async event ID (%d) in the queue",
 event->event_kind);
@@ -126,6 +134,16 @@ void replay_add_ptimer_event(void *bh, uint64_t id)
 replay_add_event(REPLAY_ASYNC_EVENT_PTIMER, bh, NULL, id);
 }
 
+void replay_add_input_event(struct InputEvent *event)
+{
+replay_add_event(REPLAY_ASYNC_EVENT_INPUT, event, NULL, 0);
+}
+
+void replay_add_input_sync_event(void)
+{
+replay_add_event(REPLAY_ASYNC_EVENT_INPUT_SYNC, NULL, NULL, 0);
+}
+
 static void replay_save_event(Event *event, int checkpoint)
 {
 if (replay_mode != REPLAY_MODE_PLAY) {
@@ -139,6 +157,11 @@ static void replay_save_event(Event *event, int checkpoint)
 case REPLAY_ASYNC_EVENT_PTIMER:
 replay_put_qword(event->id);
 break;
+case REPLAY_ASYNC_EVENT_INPUT:
+replay_save_input_event(event->opaque);
+break;
+case REPLAY_ASYNC_EVENT_INPUT_SYNC:
+break;
 default:
 error_report("Unknown ID %d of replay event", read_event_kind);
 exit(1);
@@ -182,6 +205,16 @@ static Event *replay_read_event(int checkpoint)
 read_id = replay_get_qword();
 }
 break;
+case REPLAY_ASYNC_EVENT_INPUT:
+event = g_malloc0(sizeof(Event));
+event->event_kind = read_event_kind;
+event->opaque = replay_read_input_event();
+return event;
+case REPLAY_ASYNC_EVENT_INPUT_SYNC:
+event = g_malloc0(sizeof(Event));
+event->event_kind = read_event_kind;
+event->opaque = 0;
+return event;
 default:
 error_report("Unknown ID %d of replay event", read_event_kind);
 exit(1);
diff --git a/replay/replay-input.c b/replay/replay-input.c
new file mode 100755
index 000..bdb4db1
--- /dev/null
+++ b/replay/replay-input.c
@@ -0,0 +1,160 @@
+/*
+ * replay-input.c
+ *
+ * Copyright (c) 2010-2015 Institute for System Programming
+ * of the Russian Academy of Sciences.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "replay.h"
+#include "replay-internal.h"
+#include "qemu/notify.h"
+#include "ui/input.h"
+#include "qapi/qmp-output-visitor.h"
+#include "qapi/qmp-input-visitor.h"
+#include "qapi-visit.h"
+
+static InputEvent *qapi_clone_InputEvent(InputEvent *src)
+{
+QmpOutputVisitor *qov;
+QmpInputVisitor *qiv;
+Visitor *ov, *iv;
+QObject *obj;
+InputEvent

[Qemu-devel] [PATCH v18 17/21] typedef: add typedef for QemuOpts

2015-09-17 Thread Pavel Dovgalyuk

This patch moves typedefs for QemuOpts and related types
to qemu/typedefs.h file.

Reviewed-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
---
 include/qemu/option.h   |5 +
 include/qemu/typedefs.h |3 +++
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/qemu/option.h b/include/qemu/option.h
index 57e51c9..71f5f27 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -30,6 +30,7 @@
 #include "qemu/queue.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
+#include "qemu/typedefs.h"
 
 const char *get_opt_name(char *buf, int buf_size, const char *p, char delim);
 const char *get_opt_value(char *buf, int buf_size, const char *p);
@@ -44,10 +45,6 @@ void parse_option_size(const char *name, const char *value,
 bool has_help_option(const char *param);
 bool is_valid_option_list(const char *param);
 
-typedef struct QemuOpt QemuOpt;
-typedef struct QemuOpts QemuOpts;
-typedef struct QemuOptsList QemuOptsList;
-
 enum QemuOptType {
 QEMU_OPT_STRING = 0,  /* no parsing (use string as-is) 
   */
 QEMU_OPT_BOOL,/* on/off
   */
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index f8a9dd6..d154116 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -68,6 +68,9 @@ typedef struct QEMUBH QEMUBH;
 typedef struct QemuConsole QemuConsole;
 typedef struct QEMUFile QEMUFile;
 typedef struct QEMUMachine QEMUMachine;
+typedef struct QemuOpt QemuOpt;
+typedef struct QemuOpts QemuOpts;
+typedef struct QemuOptsList QemuOptsList;
 typedef struct QEMUSGList QEMUSGList;
 typedef struct QEMUSizedBuffer QEMUSizedBuffer;
 typedef struct QEMUTimerListGroup QEMUTimerListGroup;

Re: [Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-17 Thread Markus Armbruster

"Daniel P. Berrange"  writes:

> On Thu, Sep 17, 2015 at 01:20:43PM +0100, Peter Maydell wrote:
>> On 17 September 2015 at 13:05, Daniel P. Berrange  
>> wrote:
>> > On Thu, Sep 17, 2015 at 12:32:56PM +0100, Peter Maydell wrote:
>> >> On 17 September 2015 at 12:03, Daniel P. Berrange
>> >>  wrote:
>> >
>> >> > +Building
>> >> > +
>> >> > +
>> >> > +QEMU is multi-platform software intended to be buildable on
>> >> > all modern Linux
>> >> > +platforms, OS-X, Win32 (via the Mingw64 toolchain) and a
>> >> > variety of other
>> >> > +UNIX targets. The simple process to build QEMU is
>> >>
>> >> This whole section seems to be duplicating our existing build
>> >> documentation (which is in qemu-doc.texi in the 'compilation'
>> >> section). We should document how to build QEMU in exactly one
>> >> place, not two (though I can see the rationale for that one
>> >> place not being in a .texi file.)
>> >
>> > The problem I have with refering people to qemu-doc.html,
>> > is that in order to order to view the docs on how to build
>> > QEMU, you first have to build QEMU, or enjoy reading the
>> > .texi source code :-)  Though that doc does get exposed
>> > via the website too, it is nice to not rely on people having
>> > internet access all the time.
>> >
>> > The qemu-doc.html chapter 6 is a bit more detailed in what
>> > it descibes. I tend to view the instructions we put in the
>> > README file as the minimal quick-start, and then point to
>> > the comprehensive docs as a detailed reference on the matter.
>> 
>> I don't think we should have two places at all. If a "quick
>> start" is useful it should be at the start of the one document
>> we have on building QEMU.
>
> How about splitting "Chapter 6 Compilation" out of the qemu-doc.texi
> file into a standalone file, in a format that is friendly to read
> without needing generating first.

Do we want a "Compilation" chapter in qemu-doc, or do we want it to be a
separate document?  "Both" is a legitimate answer.  Multiple documents
can be generated from a single source.

Must the separate document be available right after unpacking a tarball?
"Yes" doesn't mean we can't generate it --- projects include generated
files in tarballs all the time, e.g. Autoconf-generated configure.

Must the separate document be available right after git-clone?  "Yes"
doesn't mean we can't generate it, only that we have to commit a
generated file, which is a bit awkward.

Personally, I wouldn't bother.  People capable of git-clone are capable
of finding and running a bootstrap script.  Common with projects using
Autoconf that don't commit the generated configure.

>Perhaps using something like
> Markdown[1] would be a suitable thing, as that is essentially plain
> text with a little extra punctuation, so it is easily readable as
> source, as well as allowing reasonably pleasant HTML generation
> allowing us to publish it to the website too ?

Texinfo isn't exactly the greatest markup system, but it works, and it
can generate reasonably pleasant HTML.  Ditto PDF and plain text.

Why not extract the existing chapter into a separate .texi, include it
from the user manual, include it from a trivial .texi master file,
format the latter to plain text with something like

$ texi2any --plaintext -I . how-to-build.texi >HOW-TO-BUILD

No need for redoing the markup.

[...]

[Qemu-devel] [PATCH v2] virtio: add some migration doc

2015-09-17 Thread Cornelia Huck

Try to cover the basics of virtio migration.

Signed-off-by: Cornelia Huck 
Reviewed-by: Greg Kurz 
---
v1->v2: make copyright explicit
---
 docs/virtio-migration.txt | 106 ++
 1 file changed, 106 insertions(+)
 create mode 100644 docs/virtio-migration.txt

diff --git a/docs/virtio-migration.txt b/docs/virtio-migration.txt
new file mode 100644
index 000..cf66458
--- /dev/null
+++ b/docs/virtio-migration.txt
@@ -0,0 +1,106 @@
+Virtio devices and migration
+
+
+Copyright 2015 IBM Corp.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.  See
+the COPYING file in the top-level directory.
+
+Saving and restoring the state of virtio devices is a bit of a twisty maze,
+for several reasons:
+- state is distributed between several parts:
+  - virtio core, for common fields like features, number of queues, ...
+  - virtio transport (pci, ccw, ...), for the different proxy devices and
+transport specific state (msix vectors, indicators, ...)
+  - virtio device (net, blk, ...), for the different device types and their
+state (mac address, request queue, ...)
+- most fields are saved via the stream interface; subsequently, subsections
+  have been added to make cross-version migration possible
+
+This file attempts to document the current procedure and point out some
+caveats.
+
+
+Save state procedure
+
+
+virtio core   virtio transport  virtio device
+---     -
+
+save() function registered
+via register_savevm()
+virtio_save()   <--
+ -->  save_config()
+  - save proxy device
+  - save transport-specific
+device fields
+- save common device
+  fields
+- save common virtqueue
+  fields
+ -->  save_queue()
+  - save transport-specific
+virtqueue fields
+ -->   save_device()
+   - save device-specific
+ fields
+- save subsections
+  - device endianness,
+if changed from
+default endianness
+  - 64 bit features, if
+any high feature bit
+is set
+  - virtio-1 virtqueue
+fields, if VERSION_1
+is set
+
+
+Load state procedure
+
+
+virtio core   virtio transport  virtio device
+---     -
+
+load() function registered
+via register_savevm()
+virtio_load()   <--
+ -->  load_config()
+  - load proxy device
+  - load transport-specific
+device fields
+- load common device
+  fields
+- load common virtqueue
+  fields
+ -->  load_queue()
+  - load transport-specific
+virtqueue fields
+- notify guest
+ -->   load_device()
+   - load device-specific
+ fields
+- load subsections
+  - device endianness
+  - 64 bit features
+  - virtio-1 virtqueue
+fields
+- sanitize endianness
+- sanitize features
+- virtqueue index sanity
+  check
+   - feature-dependent setup
+
+
+Implications of this setup
+==
+
+Devices need to be careful in their state processing during load: The
+load_device() procedure is invoked by the core before subsections have
+been loaded. Any code that depends on information transmitted in subsections
+therefore has to be invoked in the device's load() function _after_
+virtio_load() returned (like e.g. code depending on features).
+
+Any extension of the state being migrated should be done in subsections
+added to the core for compatibility reasons. If transport or device specific
+state is added, core needs to invoke a callback from the new subsection.
-- 
2.3.8

Re: [Qemu-devel] [RFC 21/38] target-i386: emulate atomic instructions + barriers using AIE

2015-09-17 Thread Alex Bennée


Emilio G. Cota  writes:

> Signed-off-by: Emilio G. Cota 
> ---
>  aie-helper.c  |   3 +-
>  linux-user/main.c |   4 +-
>  target-i386/cpu.h |   3 -
>  target-i386/excp_helper.c |   7 ++
>  target-i386/helper.h  |   6 +-
>  target-i386/mem_helper.c  |  39 +++--
>  target-i386/translate.c   | 217 
> --
>  7 files changed, 162 insertions(+), 117 deletions(-)
>
> diff --git a/aie-helper.c b/aie-helper.c
> index 7521150..a3faf04 100644
> --- a/aie-helper.c
> +++ b/aie-helper.c
> @@ -82,7 +82,8 @@ void HELPER(aie_unlock__done)(CPUArchState *env)
>  
>  void HELPER(aie_ld_pre)(CPUArchState *env, target_ulong vaddr)
>  {
> -if (likely(!env->aie_lock_enabled) || env->aie_locked) {
> +assert(env->aie_lock_enabled);
> +if (env->aie_locked) {
>  return;
>  }
>  aie_ld_lock_ret(env, vaddr, GETRA());
> diff --git a/linux-user/main.c b/linux-user/main.c
> index fd06ce9..98ebe19 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -279,9 +279,9 @@ void cpu_loop(CPUX86State *env)
>  target_siginfo_t info;
>  
>  for(;;) {
> -cpu_exec_start(cs);
> +cs->running = true;
>  trapnr = cpu_x86_exec(cs);
> -cpu_exec_end(cs);
> +cs->running = false;

Hmm I could do with more in the commit message about why it is OK to
replace the cpu_exec_start/end() wrappers with a simple flag setting. 

ION the duplication in linux-user/main.c is terrifying but that's not
your fault ;-) 

>  switch(trapnr) {
>  case 0x80:
>  /* linux syscall from int $0x80 */
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index 3655ff3..ead2832 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -1318,9 +1318,6 @@ static inline MemTxAttrs cpu_get_mem_attrs(CPUX86State 
> *env)
>  void cpu_set_mxcsr(CPUX86State *env, uint32_t val);
>  void cpu_set_fpuc(CPUX86State *env, uint16_t val);
>  
> -/* mem_helper.c */
> -void helper_lock_init(void);
> -
>  /* svm_helper.c */
>  void cpu_svm_check_intercept_param(CPUX86State *env1, uint32_t type,
> uint64_t param);
> diff --git a/target-i386/excp_helper.c b/target-i386/excp_helper.c
> index 99fca84..141cab4 100644
> --- a/target-i386/excp_helper.c
> +++ b/target-i386/excp_helper.c
> @@ -96,6 +96,13 @@ static void QEMU_NORETURN raise_interrupt2(CPUX86State 
> *env, int intno,
>  {
>  CPUState *cs = CPU(x86_env_get_cpu(env));
>  
> +if (unlikely(env->aie_locked)) {
> +helper_aie_unlock__done(env);
> +}
> +if (unlikely(env->aie_lock_enabled)) {
> +env->aie_lock_enabled = false;
> +}
> +
>  if (!is_int) {
>  cpu_svm_check_intercept_param(env, SVM_EXIT_EXCP_BASE + intno,
>error_code);
> diff --git a/target-i386/helper.h b/target-i386/helper.h
> index 74308f4..7d92140 100644
> --- a/target-i386/helper.h
> +++ b/target-i386/helper.h
> @@ -1,8 +1,8 @@
>  DEF_HELPER_FLAGS_4(cc_compute_all, TCG_CALL_NO_RWG_SE, tl, tl, tl, tl, int)
>  DEF_HELPER_FLAGS_4(cc_compute_c, TCG_CALL_NO_RWG_SE, tl, tl, tl, tl, int)
>  
> -DEF_HELPER_0(lock, void)
> -DEF_HELPER_0(unlock, void)
> +DEF_HELPER_1(lock_enable, void, env)
> +DEF_HELPER_1(lock_disable, void, env)
>  DEF_HELPER_3(write_eflags, void, env, tl, i32)
>  DEF_HELPER_1(read_eflags, tl, env)
>  DEF_HELPER_2(divb_AL, void, env, tl)
> @@ -217,3 +217,5 @@ DEF_HELPER_3(rcrl, tl, env, tl, tl)
>  DEF_HELPER_3(rclq, tl, env, tl, tl)
>  DEF_HELPER_3(rcrq, tl, env, tl, tl)
>  #endif
> +
> +#include "qemu/aie-helper.h"
> diff --git a/target-i386/mem_helper.c b/target-i386/mem_helper.c
> index 8bf0da2..60abc8a 100644
> --- a/target-i386/mem_helper.c
> +++ b/target-i386/mem_helper.c
> @@ -21,38 +21,21 @@
>  #include "exec/helper-proto.h"
>  #include "exec/cpu_ldst.h"
>  
> -/* broken thread support */
> +#include "aie-helper.c"
>  
> -#if defined(CONFIG_USER_ONLY)
> -QemuMutex global_cpu_lock;
> -
> -void helper_lock(void)
> -{
> -qemu_mutex_lock(_cpu_lock);
> -}
> -
> -void helper_unlock(void)
> -{
> -qemu_mutex_unlock(_cpu_lock);
> -}
> -
> -void helper_lock_init(void)
> -{
> -qemu_mutex_init(_cpu_lock);
> -}
> -#else
> -void helper_lock(void)
> +void helper_lock_enable(CPUX86State *env)
>  {
> +env->aie_lock_enabled = true;
>  }
>  
> -void helper_unlock(void)
> -{
> -}
> -
> -void helper_lock_init(void)
> +void helper_lock_disable(CPUX86State *env)
>  {
> +assert(env->aie_lock_enabled);
> +if (env->aie_locked) {
> +h_aie_unlock__done(env);
> +}
> +env->aie_lock_enabled = false;
>  }
> -#endif

I'm skipping over the rest of this as the locking of x86 is outside of
my area of expertise. However I'm currently erring in favour of Alvise'
TCG primitive approach over the heavy use of helpers. It will be
interesting to see a head to head in performance on some RISC backends. 

>  
>  void

Re: [Qemu-devel] [PATCH] hw/arm/virt-acpi-build: Fix wrong size of flash

2015-09-17 Thread Wei Huang

On 09/16/2015 08:57 PM, shannon.z...@linaro.org wrote:
> From: Shannon Zhao 
> 
> While virt machine creates two flash devices with total size 0x0800,
> it wrongly uses this total size for each one. So it will overlap other
> MMIO spaces.
> 
> Signed-off-by: Shannon Zhao 
> ---
>  hw/arm/virt-acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 2073573..bc858c8 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -114,7 +114,7 @@ static void acpi_dsdt_add_flash(Aml *scope, const 
> MemMapEntry *flash_memmap)
>  {
>  Aml *dev, *crs;
>  hwaddr base = flash_memmap->base;
> -hwaddr size = flash_memmap->size;
> +hwaddr size = flash_memmap->size / 2;
>  
>  dev = aml_device("FLS0");
>  aml_append(dev, aml_name_decl("_HID", aml_string("LNRO0015")));
> 

In current code, it looks like both FLS0 and FLS1 are using up the whole
space. That would invade into the VIRT_CPUPERIPHS address space, which
apparently is wrong.

Reviewed-by: Wei Huang

Re: [Qemu-devel] [PATCH v8 02/26] qapi: New QAPISchema intermediate reperesentation

2015-09-17 Thread Markus Armbruster

Markus Armbruster  writes:

> Eric Blake  writes:
>
>> On 09/16/2015 05:06 AM, Markus Armbruster wrote:
>>> The QAPI code generators work with a syntax tree (nested dictionaries)
>>> plus a few symbol tables (also dictionaries) on the side.
>>
>>> Nothing uses the new intermediate representation just yet, thus no
>>> change to generated files.
>>> 
>>> Signed-off-by: Markus Armbruster 
>>> ---
>>
>>
>>> +class QAPISchemaEnumType(QAPISchemaType):
>>> +def __init__(self, name, info, values, prefix):
>>> +QAPISchemaType.__init__(self, name, info)
>>> +for v in values:
>>> +assert isinstance(v, str)
>>> +self.values = values
>>> +self.prefix = prefix
>>
>> Missing:
>> assert prefix is None or isinstance(prefix, str)
>
> Yes.
>
>> Should this be self._prefix, since no clients ever directly use the
>> member field (they just use the string passed through the visitor)?
>
> Yes.

Second thoughts: I'll keep self.prefix for consistency.  So far, we use
the _-prefix only for instance variables holding values flowing from
__init__() to check().

> If nothing else comes up in review, I'll touch this up on commit.
>
>> Reviewed-by: Eric Blake 
>
> Thanks!

Re: [Qemu-devel] [RFCv2 1/2] spapr: Remove unnecessary owner field from sPAPRDRConnector

2015-09-17 Thread Paolo Bonzini



On 17/09/2015 17:50, Michael Roth wrote:
> We still need globals for RTAS lookups. I think QOM is our most
> mature/well-tested interface for managing inter-device
> relationships/lookups, but I can understand if using
> root_container/link seems off. To me it seem like a nice
> "freebie" we get from using QOM, and gave us nice guarantees like
> globally unique paths to correspond to globally unique DRC indexes.

I think it's okay; but I don't really like looking at ->properties
directly, without an API (which doesn't exist indeed).

Paolo

[Qemu-devel] [PATCH v2 5/5] monitor: remove old entries from event hash table

2015-09-17 Thread marcandre . lureau

From: Marc-André Lureau 

Do not let the hash table grow without limit, schedule a cleanup for
outdated event.

Signed-off-by: Marc-André Lureau 
---
 monitor.c | 58 +-
 1 file changed, 53 insertions(+), 5 deletions(-)

diff --git a/monitor.c b/monitor.c
index 90f06ce..a81341f 100644
--- a/monitor.c
+++ b/monitor.c
@@ -541,7 +541,7 @@ static void monitor_qapi_event_handler(void *opaque)
 }
 
 static MonitorQAPIEventPending *
-monitor_qapi_event_pending_new(QAPIEvent event)
+monitor_qapi_event_pending_new(QAPIEvent event, QEMUTimerCB *handler)
 {
 MonitorQAPIEventPending *p;
 
@@ -549,11 +549,52 @@ monitor_qapi_event_pending_new(QAPIEvent event)
 p->event = event;
 p->timer = timer_new(QEMU_CLOCK_REALTIME,
  SCALE_MS,
- monitor_qapi_event_handler,
+ handler,
  p);
 return p;
 }
 
+static void monitor_qapi_event_id_remove(MonitorQAPIEventPending *p)
+{
+MonitorQAPIEventState *s = _qapi_event_state[p->event];
+GHashTable *ht = s->delay_data;
+GHashTableIter iter;
+gpointer value;
+
+g_hash_table_iter_init(, ht);
+while (g_hash_table_iter_next(, NULL, )) {
+if (value == p) {
+g_hash_table_iter_remove();
+return;
+}
+}
+}
+
+/*
+ * do not let the hash table grow, if no later pending event
+ * scheduled, remove the old entry after rate timeout.
+ */
+static void monitor_qapi_event_id_schedule_remove(MonitorQAPIEventPending *p)
+{
+MonitorQAPIEventState *s = _qapi_event_state[p->event];
+int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+int64_t then = now + s->rate;
+
+timer_mod_ns(p->timer, then);
+}
+
+static void monitor_qapi_event_id_handler(void *opaque)
+{
+MonitorQAPIEventPending *p = opaque;
+
+if (p->qdict) {
+monitor_qapi_event_handler(p);
+monitor_qapi_event_id_schedule_remove(p);
+} else {
+monitor_qapi_event_id_remove(p);
+}
+}
+
 /*
  * A delay handler that will filter events by the "id" event field.
  * evstate must be an element of the monitor_qapi_event_state array.
@@ -571,11 +612,17 @@ monitor_qapi_event_id_delay(MonitorQAPIEventState 
*evstate, QDict *qdict)
 assert(event >= 0 || event < QAPI_EVENT_MAX);
 
 if (!p) {
-p = monitor_qapi_event_pending_new(event);
+p = monitor_qapi_event_pending_new(event,
+   monitor_qapi_event_id_handler);
 g_hash_table_insert(ht, g_strdup(id), p);
 }
 
-return monitor_qapi_event_pending_update(evstate, p, qdict);
+if (monitor_qapi_event_pending_update(evstate, p, qdict)) {
+return true;
+} else {
+monitor_qapi_event_id_schedule_remove(p);
+return false;
+}
 }
 
 /*
@@ -631,7 +678,8 @@ monitor_qapi_event_throttle(QAPIEvent event, int64_t rate)
 evstate->rate = rate * SCALE_MS;
 
 evstate->delay = monitor_qapi_event_delay;
-evstate->delay_data = monitor_qapi_event_pending_new(event);
+evstate->delay_data =
+monitor_qapi_event_pending_new(event, monitor_qapi_event_handler);
 }
 
 static void
-- 
2.4.3

[Qemu-devel] [PATCH v18 00/21] Deterministic replay core

2015-09-17 Thread Pavel Dovgalyuk

This set of patches is related to the reverse execution and deterministic 
replay of qemu execution. This implementation of deterministic replay can 
be used for deterministic debugging of guest code through gdb remote
interface.

Core set of patches does not include support for reverse debugging commands
of gdb, block devices' operations, USB replay support.

These patches include only core function of the replay,
excluding the support for replaying serial, audio, network, and USB devices'
operations. Reverse debugging and monitor commands were also excluded to
be submitted later as separate patches.

Execution recording writes non-deterministic events log, which can be later 
used for replaying the execution anywhere and for unlimited number of times. 
It also supports checkpointing for faster rewinding during reverse debugging. 
Execution replaying reads the log and replays all non-deterministic events 
including external input, hardware clocks, and interrupts.

Full version of deterministic replay has the following features:
 * Deterministically replays whole system execution and all contents of the 
memory,
   state of the hadrware devices, clocks, and screen of the VM.
 * Writes execution log into the file for latter replaying for multiple times 
   on different machines.
 * Supports i386, x86_64, ARM, PowerPC, and MIPS hardware platforms.
 * Performs deterministic replay of all operations with keyboard and mouse
   input devices.
 * Supports auto-checkpointing for convenient reverse debugging.

Usage of the record/replay core:
 * First, record the execution, by adding the following string to the command 
line:
   '-icount shift=7,rr=record,rrfile=replay.bin -net none'. 
   Block devices' images are not actually changed in the recording mode, 
   because all of the changes are written to the temporary overlay file.
 * Then you can replay it for the multiple times by using another command
   line option: '-icount shift=7,rr=replay,rrfile=replay.bin -net none'
 * '-net none' option should also be specified if network replay patches
   are not applied.
 * Do not add any disk images to VM, because they are not supported by
   the core patches.

Papers with description of deterministic replay implementation:
http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html
http://dl.acm.org/citation.cfm?id=2786805.2803179

Public repository with current version of the patches:
https://github.com/Dovgalyuk/qemu/tree/rr-17

Modifications of qemu include:
 * wrappers for clock and time functions to save their return values in the log
 * saving different asynchronous events (e.g. system shutdown) into the log
 * synchronization of the threads from thread pool
 * recording/replaying user input (mouse and keyboard)
 * adding internal events for cpu and io synchronization

v18 changes:
 * Patches were updated to match upstream version
 * Added missed replay-user.c file

v17 changes:
 * Removed useless stub functions (as suggested by Paolo Bonzini)
 * Refined checkpoint-related code (as suggested by Paolo Bonzini)
 * Improved icount processing (as suggested by Paolo Bonzini)
 * Added checkpoint for suspend event (as suggested by Paolo Bonzini)
 * Fixed linux-user configurations build
 * Minor fixes

v16 changes:
 * Several warnings were fixed

v15 changes:
 * Tested record/replay with MIPS and PowerPC guests
 * Published the patches on github
 * Fixed replay mutex operation in icount mode
 * Fixed timers processing in record/replay mode

v14 changes:
 * Minor fixes

v13 changes:
 * Introduced "ptimer trigger" event (as suggested by Paolo Bonzini)

v12 changes:
 * Removed block patches from the core patches set.

v11 changes:
 * Fixed instructions event processing.
 * Added some mutex protection calls for replay.
 * Fixed replaying read operations for qcow2.
 * Fixed rtc reads on initializations stage.
 * Eliminated some warnings in replay module.
 * Fixed misprints in documentation for replay (as suggested by Eric Blake)

v10 changes:
 * Fixed queue processing for bottom halves (as suggested by Paolo Bonzini)
 * Rewritten several replay functions (as suggested by Paolo Bonzini)
 * Some minor fixes.

v9 changes:
 * Replaced fwrite/fread with putc/getc (as suggested by Paolo Bonzini)
 * Stopping virtual machine in case of replay file end (as suggested by Paolo 
Bonzini)
 * Removed one of the replay mutexes (as suggested by Paolo Bonzini)
 * Fixed RCU queue for bottom halves (as suggested by Paolo Bonzini)
 * Updated command line options' names (as suggested by Paolo Bonzini)
 * Added design document for record/replay (as suggested by Paolo Bonzini)
 * Simplified checkpoints for the timers
 * Added cloning InputEvent objects for replay (as suggested by Paolo Bonzini)
 * Added replay blockers instead of checking the command line (as suggested by 
Paolo Bonzini)
 * Some functions renaming and extracting.

v8 changes:
 * Simplified processing of the shutdown event (as suggested by Paolo Bonzini)
 * Replaced

[Qemu-devel] [PATCH v18 01/21] i386: partial revert of interrupt poll fix

2015-09-17 Thread Pavel Dovgalyuk

Processing CPU_INTERRUPT_POLL requests in cpu_has_work functions
break the determinism of cpu_exec. This patch is required to make
interrupts processing deterministic.

Signed-off-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
---
 cpu-exec.c|9 +
 target-i386/cpu.c |   10 ++
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 8945533..b247a23 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -28,6 +28,9 @@
 #include "exec/memory-internal.h"
 #include "qemu/rcu.h"
 #include "exec/tb-hash.h"
+#if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
+#include "hw/i386/apic.h"
+#endif
 
 /* -icount align implementation. */
 
@@ -402,6 +405,12 @@ int cpu_exec(CPUState *cpu)
 SyncClocks sc;
 
 if (cpu->halted) {
+#if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
+if (cpu->interrupt_request & CPU_INTERRUPT_POLL) {
+apic_poll_irq(x86_cpu->apic_state);
+cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
+}
+#endif
 if (!cpu_has_work(cpu)) {
 return EXCP_HALTED;
 }
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index cfb8aa7..3d2e2f0 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -3098,14 +3098,8 @@ static bool x86_cpu_has_work(CPUState *cs)
 X86CPU *cpu = X86_CPU(cs);
 CPUX86State *env = >env;
 
-#if !defined(CONFIG_USER_ONLY)
-if (cs->interrupt_request & CPU_INTERRUPT_POLL) {
-apic_poll_irq(cpu->apic_state);
-cpu_reset_interrupt(cs, CPU_INTERRUPT_POLL);
-}
-#endif
-
-return ((cs->interrupt_request & CPU_INTERRUPT_HARD) &&
+return ((cs->interrupt_request & (CPU_INTERRUPT_HARD |
+  CPU_INTERRUPT_POLL)) &&
 (env->eflags & IF_MASK)) ||
(cs->interrupt_request & (CPU_INTERRUPT_NMI |
  CPU_INTERRUPT_INIT |

Re: [Qemu-devel] [PATCH 3/4] checkpatch: adapt some tests to QEMU

2015-09-17 Thread Paolo Bonzini



On 17/09/2015 18:16, Peter Maydell wrote:
> On 17 September 2015 at 17:00, Paolo Bonzini  wrote:
>>
>>
>> On 17/09/2015 16:24, Peter Maydell wrote:
>>> Can we revert this one, please? Checkpatch now warns about constructs
>>> like
>>>   typedef struct MyDevice {
>>>   DeviceState parent;
>>>
>>>   int reg0, reg1, reg2;
>>>   } MyDevice;
>>
>> It's interesting that qom/object.h documents this and start like:
>>
>> typedef struct ObjectClass ObjectClass;
>> typedef struct Object Object;
>>
>> typedef struct TypeInfo TypeInfo;
>>
>> typedef struct InterfaceClass InterfaceClass;
>> typedef struct InterfaceInfo InterfaceInfo;
>>
>> I have a patch to flag widely-disrespected rules that we still want to
>> encourage in patches.  Would you agree with filing these typedefs under
>> this category?
> 
> No, I think that having a separate typedef is worse. The
> only exceptions are (a) when you need it to be separate because
> you need to use the type within the struct itself (or some
> similar dependency loop) (b) when you want to put the typedef
> in include/qemu/typedefs.h.
> 
> I really don't see any need to suddenly outlaw something
> that's been accepted as standard good QEMU style for a
> long time.

I think it varies depending on the maintainer.  PPC, USB, SCSI, ACPI all
use a separate typedef.  I'll prepare a revert.

Paolo

Re: [Qemu-devel] [PATCH] tcg/mips: Fix clobbering of qemu_ld inputs

2015-09-17 Thread Aurelien Jarno

On 2015-09-14 11:34, James Hogan wrote:
> The MIPS TCG backend implements qemu_ld with 64-bit targets using the v0
> register (base) as a temporary to load the upper half of the QEMU TLB
> comparator (see line 5 below), however this happens before the input
> address is used (line 8 to mask off the low bits for the TLB
> comparison, and line 12 to add the host-guest offset). If the input
> address (addrl) also happens to have been placed in v0 (as in the second
> column below), it gets clobbered before it is used.
> 
>  addrl in t2  addrl in v0
> 
>  1 srl a0,t2,0x7srl a0,v0,0x7
>  2 andia0,a0,0x1fe0 andia0,a0,0x1fe0
>  3 addua0,a0,s0 addua0,a0,s0
>  4 lw  at,9136(a0)  lw  at,9136(a0)  set TCG_TMP0 (at)
>  5 lw  v0,9140(a0)  lw  v0,9140(a0)  set base (v0)
>  6 li  t9,-4093 li  t9,-4093
>  7 lw  a0,9160(a0)  lw  a0,9160(a0)  set addend (a0)
>  8 and t9,t9,t2 and t9,t9,v0 use addrl
>  9 bne at,t9,0x836d8c8  bne at,t9,0x836d838  use TCG_TMP0
> 10  nop  nop
> 11 bne v0,t8,0x836d8c8  bne v0,a1,0x836d838  use base
> 12  addu   v0,a0,t2  addu   v0,a0,v0 use addrl, addend
> 13 lw  t0,0(v0) lw  t0,0(v0)
> 
> Fix by using TCG_TMP0 (at) as the temporary instead of v0 (base),
> pushing the load on line 5 forward into the delay slot of the low
> comparison (line 10). The early load of the addend on line 7 also needs
> pushing even further for 64-bit targets, or it will clobber a0 before
> we're done with it. The output for 32-bit targets is unaffected.
> 
>  srl a0,v0,0x7
>  andia0,a0,0x1fe0
>  addua0,a0,s0
>  lw  at,9136(a0)
> -lw  v0,9140(a0)  load high comparator
>  li  t9,-4093
> -lw  a0,9160(a0)  load addend
>  and t9,t9,v0
>  bne at,t9,0x836d838
> - nop
> + lw at,9140(a0)  load high comparator
> +lw  a0,9160(a0)  load addend
> -bne v0,a1,0x836d838
> +bne at,a1,0x836d838
>   addu   v0,a0,v0
>  lw  t0,0(v0)
> 
> Suggested-by: Richard Henderson 
> Signed-off-by: James Hogan 
> Cc: Aurelien Jarno 
> ---
> Tested with mips64el target (as in output diff above) and mipsel
> target (confirmed that the case was hit at least once during guest
> kernel boot and continued to work fine).
> ---
>  tcg/mips/tcg-target.c | 26 +++---
>  1 file changed, 15 insertions(+), 11 deletions(-)
> 
> diff --git a/tcg/mips/tcg-target.c b/tcg/mips/tcg-target.c
> index c0ce520228a7..38c968220b9a 100644
> --- a/tcg/mips/tcg-target.c
> +++ b/tcg/mips/tcg-target.c
> @@ -962,30 +962,34 @@ static void tcg_out_tlb_load(TCGContext *s, TCGReg 
> base, TCGReg addrl,
>  add_off -= 0x7ff0;
>  }
>  
> -/* Load the tlb comparator.  */
> -if (TARGET_LONG_BITS == 64) {
> -tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, TCG_REG_A0, cmp_off + LO_OFF);
> -tcg_out_opc_imm(s, OPC_LW, base, TCG_REG_A0, cmp_off + HI_OFF);
> -} else {
> -tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, TCG_REG_A0, cmp_off);
> -}
> +/* Load the (low half) tlb comparator.  */
> +tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, TCG_REG_A0,
> +cmp_off + (TARGET_LONG_BITS == 64 ? LO_OFF : 0));
>  
>  /* Mask the page bits, keeping the alignment bits to compare against.
> -   In between, load the tlb addend for the fast path.  */
> +   In between on 32-bit targets, load the tlb addend for the fast path.  
> */
>  tcg_out_movi(s, TCG_TYPE_I32, TCG_TMP1,
>   TARGET_PAGE_MASK | ((1 << s_bits) - 1));
> -tcg_out_opc_imm(s, OPC_LW, TCG_REG_A0, TCG_REG_A0, add_off);
> +if (TARGET_LONG_BITS == 32) {
> +tcg_out_opc_imm(s, OPC_LW, TCG_REG_A0, TCG_REG_A0, add_off);
> +}
>  tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrl);
>  
>  label_ptr[0] = s->code_ptr;
>  tcg_out_opc_br(s, OPC_BNE, TCG_TMP1, TCG_TMP0);
>  
> +/* Load and test the high half tlb comparator.  */
>  if (TARGET_LONG_BITS == 64) {
>  /* delay slot */
> -tcg_out_nop(s);
> +tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, TCG_REG_A0, cmp_off + HI_OFF);
> +
> +/* Load the tlb addend for the fast path. We can't do it earlier with
> +   64-bit targets or we'll clobber a0 before reading the high half 
> tlb
> +   comparator.  */
> +tcg_out_opc_imm(s, OPC_LW, TCG_REG_A0, TCG_REG_A0, add_off);
>  
>  label_ptr[1] = s->code_ptr;
> -tcg_out_opc_br(s, OPC_BNE, addrh, base);
> +tcg_out_opc_br(s, OPC_BNE, addrh, TCG_TMP0);
>  }
>  
>  /* delay slot */

Thanks for fixing this. I guess we also want this patch in the stable
releases, so I'll add a Cc to stable in the pull request.

Reviewed-by: Aurelien Jarno 

-- 
Aurelien Jarno  GPG:

[Qemu-devel] [PATCH 1/2] tcg/mips: move tcg_out_addsub2

2015-09-17 Thread Aurelien Jarno

Somehow the tcg_out_addsub2 function ended-up in the middle of the
qemu_ld/st related functions. Move it with other arithmetics related
functions.

Signed-off-by: Aurelien Jarno 
---
 tcg/mips/tcg-target.c | 98 +--
 1 file changed, 49 insertions(+), 49 deletions(-)

diff --git a/tcg/mips/tcg-target.c b/tcg/mips/tcg-target.c
index 38c9682..4f1e002 100644
--- a/tcg/mips/tcg-target.c
+++ b/tcg/mips/tcg-target.c
@@ -567,6 +567,55 @@ static inline void tcg_out_addi(TCGContext *s, TCGReg reg, 
TCGArg val)
 }
 }
 
+static void tcg_out_addsub2(TCGContext *s, TCGReg rl, TCGReg rh, TCGReg al,
+TCGReg ah, TCGArg bl, TCGArg bh, bool cbl,
+bool cbh, bool is_sub)
+{
+TCGReg th = TCG_TMP1;
+
+/* If we have a negative constant such that negating it would
+   make the high part zero, we can (usually) eliminate one insn.  */
+if (cbl && cbh && bh == -1 && bl != 0) {
+bl = -bl;
+bh = 0;
+is_sub = !is_sub;
+}
+
+/* By operating on the high part first, we get to use the final
+   carry operation to move back from the temporary.  */
+if (!cbh) {
+tcg_out_opc_reg(s, (is_sub ? OPC_SUBU : OPC_ADDU), th, ah, bh);
+} else if (bh != 0 || ah == rl) {
+tcg_out_opc_imm(s, OPC_ADDIU, th, ah, (is_sub ? -bh : bh));
+} else {
+th = ah;
+}
+
+/* Note that tcg optimization should eliminate the bl == 0 case.  */
+if (is_sub) {
+if (cbl) {
+tcg_out_opc_imm(s, OPC_SLTIU, TCG_TMP0, al, bl);
+tcg_out_opc_imm(s, OPC_ADDIU, rl, al, -bl);
+} else {
+tcg_out_opc_reg(s, OPC_SLTU, TCG_TMP0, al, bl);
+tcg_out_opc_reg(s, OPC_SUBU, rl, al, bl);
+}
+tcg_out_opc_reg(s, OPC_SUBU, rh, th, TCG_TMP0);
+} else {
+if (cbl) {
+tcg_out_opc_imm(s, OPC_ADDIU, rl, al, bl);
+tcg_out_opc_imm(s, OPC_SLTIU, TCG_TMP0, rl, bl);
+} else if (rl == al && rl == bl) {
+tcg_out_opc_sa(s, OPC_SRL, TCG_TMP0, al, 31);
+tcg_out_opc_reg(s, OPC_ADDU, rl, al, bl);
+} else {
+tcg_out_opc_reg(s, OPC_ADDU, rl, al, bl);
+tcg_out_opc_reg(s, OPC_SLTU, TCG_TMP0, rl, (rl == bl ? al : bl));
+}
+tcg_out_opc_reg(s, OPC_ADDU, rh, th, TCG_TMP0);
+}
+}
+
 /* Bit 0 set if inversion required; bit 1 set if swapping required.  */
 #define MIPS_CMP_INV  1
 #define MIPS_CMP_SWAP 2
@@ -1237,55 +1286,6 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg 
datalo, TCGReg datahi,
 }
 }
 
-static void tcg_out_addsub2(TCGContext *s, TCGReg rl, TCGReg rh, TCGReg al,
-TCGReg ah, TCGArg bl, TCGArg bh, bool cbl,
-bool cbh, bool is_sub)
-{
-TCGReg th = TCG_TMP1;
-
-/* If we have a negative constant such that negating it would
-   make the high part zero, we can (usually) eliminate one insn.  */
-if (cbl && cbh && bh == -1 && bl != 0) {
-bl = -bl;
-bh = 0;
-is_sub = !is_sub;
-}
-
-/* By operating on the high part first, we get to use the final
-   carry operation to move back from the temporary.  */
-if (!cbh) {
-tcg_out_opc_reg(s, (is_sub ? OPC_SUBU : OPC_ADDU), th, ah, bh);
-} else if (bh != 0 || ah == rl) {
-tcg_out_opc_imm(s, OPC_ADDIU, th, ah, (is_sub ? -bh : bh));
-} else {
-th = ah;
-}
-
-/* Note that tcg optimization should eliminate the bl == 0 case.  */
-if (is_sub) {
-if (cbl) {
-tcg_out_opc_imm(s, OPC_SLTIU, TCG_TMP0, al, bl);
-tcg_out_opc_imm(s, OPC_ADDIU, rl, al, -bl);
-} else {
-tcg_out_opc_reg(s, OPC_SLTU, TCG_TMP0, al, bl);
-tcg_out_opc_reg(s, OPC_SUBU, rl, al, bl);
-}
-tcg_out_opc_reg(s, OPC_SUBU, rh, th, TCG_TMP0);
-} else {
-if (cbl) {
-tcg_out_opc_imm(s, OPC_ADDIU, rl, al, bl);
-tcg_out_opc_imm(s, OPC_SLTIU, TCG_TMP0, rl, bl);
-} else if (rl == al && rl == bl) {
-tcg_out_opc_sa(s, OPC_SRL, TCG_TMP0, al, 31);
-tcg_out_opc_reg(s, OPC_ADDU, rl, al, bl);
-} else {
-tcg_out_opc_reg(s, OPC_ADDU, rl, al, bl);
-tcg_out_opc_reg(s, OPC_SLTU, TCG_TMP0, rl, (rl == bl ? al : bl));
-}
-tcg_out_opc_reg(s, OPC_ADDU, rh, th, TCG_TMP0);
-}
-}
-
 static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is_64)
 {
 TCGReg addr_regl, addr_regh __attribute__((unused));
-- 
2.1.4

[Qemu-devel] [PATCH v18 16/21] replay: ptimer

2015-09-17 Thread Pavel Dovgalyuk

This patch adds deterministic replay for hardware periodic countdown timers.
ptimer uses bottom halves layer to execute such an asynchronous callback.
We put this callback into the replay queue instead of bottom halves one.
When checkpoint is met by main loop thread, the replay queue is processed
and callback is executed. Binding callback moment to one of the checkpoints
makes it deterministic.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/core/ptimer.c |7 ++-
 replay/replay-events.c   |   18 +-
 replay/replay-internal.h |1 +
 replay/replay.h  |2 ++
 4 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 8437bd6..c56078d 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -9,6 +9,7 @@
 #include "qemu/timer.h"
 #include "hw/ptimer.h"
 #include "qemu/host-utils.h"
+#include "replay/replay.h"
 
 struct ptimer_state
 {
@@ -27,7 +28,11 @@ struct ptimer_state
 static void ptimer_trigger(ptimer_state *s)
 {
 if (s->bh) {
-qemu_bh_schedule(s->bh);
+if (replay_mode != REPLAY_MODE_NONE) {
+replay_add_ptimer_event(s->bh, replay_get_current_step());
+} else {
+qemu_bh_schedule(s->bh);
+}
 }
 }
 
diff --git a/replay/replay-events.c b/replay/replay-events.c
index f973cb0..f356493 100755
--- a/replay/replay-events.c
+++ b/replay/replay-events.c
@@ -13,6 +13,7 @@
 #include "qemu/error-report.h"
 #include "replay.h"
 #include "replay-internal.h"
+#include "block/aio.h"
 
 typedef struct Event {
 ReplayAsyncEventKind event_kind;
@@ -35,6 +36,9 @@ static bool events_enabled;
 static void replay_run_event(Event *event)
 {
 switch (event->event_kind) {
+case REPLAY_ASYNC_EVENT_PTIMER:
+aio_bh_call(event->opaque);
+break;
 default:
 error_report("Replay: invalid async event ID (%d) in the queue",
 event->event_kind);
@@ -117,6 +121,11 @@ static void replay_add_event(ReplayAsyncEventKind 
event_kind,
 replay_mutex_unlock();
 }
 
+void replay_add_ptimer_event(void *bh, uint64_t id)
+{
+replay_add_event(REPLAY_ASYNC_EVENT_PTIMER, bh, NULL, id);
+}
+
 static void replay_save_event(Event *event, int checkpoint)
 {
 if (replay_mode != REPLAY_MODE_PLAY) {
@@ -127,10 +136,12 @@ static void replay_save_event(Event *event, int 
checkpoint)
 
 /* save event-specific data */
 switch (event->event_kind) {
+case REPLAY_ASYNC_EVENT_PTIMER:
+replay_put_qword(event->id);
+break;
 default:
 error_report("Unknown ID %d of replay event", read_event_kind);
 exit(1);
-break;
 }
 }
 }
@@ -166,6 +177,11 @@ static Event *replay_read_event(int checkpoint)
 
 /* Events that has not to be in the queue */
 switch (read_event_kind) {
+case REPLAY_ASYNC_EVENT_PTIMER:
+if (read_id == -1) {
+read_id = replay_get_qword();
+}
+break;
 default:
 error_report("Unknown ID %d of replay event", read_event_kind);
 exit(1);
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index bf64be5..89c1dc7 100755
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -39,6 +39,7 @@ enum ReplayEvents {
 /* Asynchronous events IDs */
 
 enum ReplayAsyncEventKind {
+REPLAY_ASYNC_EVENT_PTIMER,
 REPLAY_ASYNC_COUNT
 };
 
diff --git a/replay/replay.h b/replay/replay.h
index e2696fe..c7de964 100755
--- a/replay/replay.h
+++ b/replay/replay.h
@@ -97,5 +97,7 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint);
 void replay_disable_events(void);
 /*! Returns true when saving events is enabled */
 bool replay_events_enabled(void);
+/*! Adds ptimer event to the queue */
+void replay_add_ptimer_event(void *bh, uint64_t id);
 
 #endif

[Qemu-devel] [PATCH v18 11/21] replay: recording and replaying clock ticks

2015-09-17 Thread Pavel Dovgalyuk

Clock ticks are considered as the sources of non-deterministic data for
virtual machine. This patch implements saving the clock values when they
are acquired (virtual, host clock).
When replaying the execution corresponding values are read from log and
transfered to the module, which wants to read the values.
Such a design required the clock polling to be synchronized. Sometimes
it is not true - e.g. when timeouts for timer lists are checked. In this case
we use a cached value of the clock, passing it to the client code.

Signed-off-by: Pavel Dovgalyuk 
---
 cpus.c   |3 +-
 qemu-timer.c |8 --
 replay/Makefile.objs |1 +
 replay/replay-internal.h |   11 
 replay/replay-time.c |   64 ++
 replay/replay.h  |   23 +
 stubs/replay.c   |   13 +
 7 files changed, 119 insertions(+), 4 deletions(-)
 create mode 100755 replay/replay-time.c

diff --git a/cpus.c b/cpus.c
index 8884000..95b8469 100644
--- a/cpus.c
+++ b/cpus.c
@@ -338,7 +338,8 @@ static void icount_warp_rt(void *opaque)
 
 seqlock_write_lock(_state.vm_clock_seqlock);
 if (runstate_is_running()) {
-int64_t clock = cpu_get_clock_locked();
+int64_t clock = REPLAY_CLOCK(REPLAY_CLOCK_VIRTUAL_RT,
+ cpu_get_clock_locked());
 int64_t warp_delta;
 
 warp_delta = clock - vm_clock_warp_start;
diff --git a/qemu-timer.c b/qemu-timer.c
index 2463fe6..c7bd643 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -24,6 +24,7 @@
 
 #include "qemu/main-loop.h"
 #include "qemu/timer.h"
+#include "replay/replay.h"
 
 #ifdef CONFIG_POSIX
 #include 
@@ -570,15 +571,16 @@ int64_t qemu_clock_get_ns(QEMUClockType type)
 return cpu_get_clock();
 }
 case QEMU_CLOCK_HOST:
-now = get_clock_realtime();
+now = REPLAY_CLOCK(REPLAY_CLOCK_HOST, get_clock_realtime());
 last = clock->last;
 clock->last = now;
-if (now < last || now > (last + get_max_clock_jump())) {
+if ((now < last || now > (last + get_max_clock_jump()))
+&& replay_mode == REPLAY_MODE_NONE) {
 notifier_list_notify(>reset_notifiers, );
 }
 return now;
 case QEMU_CLOCK_VIRTUAL_RT:
-return cpu_get_clock();
+return REPLAY_CLOCK(REPLAY_CLOCK_VIRTUAL_RT, cpu_get_clock());
 }
 }
 
diff --git a/replay/Makefile.objs b/replay/Makefile.objs
index 1ce61ec..4aa527e 100755
--- a/replay/Makefile.objs
+++ b/replay/Makefile.objs
@@ -1,4 +1,5 @@
 obj-$(CONFIG_SOFTMMU) += replay.o
 obj-$(CONFIG_SOFTMMU) += replay-internal.o
 obj-$(CONFIG_SOFTMMU) += replay-events.o
+obj-$(CONFIG_SOFTMMU) += replay-time.o
 obj-$(CONFIG_USER_ONLY) += replay-user.o
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index 23807ca..f042c46 100755
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -23,6 +23,10 @@ enum ReplayEvents {
 EVENT_EXCEPTION,
 /* for async events */
 EVENT_ASYNC,
+/* for clock read/writes */
+/* some of greater codes are reserved for clocks */
+EVENT_CLOCK,
+EVENT_CLOCK_LAST = EVENT_CLOCK + REPLAY_CLOCK_COUNT - 1,
 EVENT_COUNT
 };
 
@@ -35,6 +39,8 @@ enum ReplayAsyncEventKind {
 typedef enum ReplayAsyncEventKind ReplayAsyncEventKind;
 
 typedef struct ReplayState {
+/*! Cached clock values. */
+int64_t cached_clock[REPLAY_CLOCK_COUNT];
 /*! Current step - number of processed instructions and timer events. */
 uint64_t current_step;
 /*! Number of instructions to be executed before other events happen. */
@@ -85,6 +91,11 @@ void replay_save_instructions(void);
 \return true, if event was found */
 bool replay_next_event_is(int event);
 
+/*! Reads next clock value from the file.
+If clock kind read from the file is different from the parameter,
+the value is not used. */
+void replay_read_next_clock(unsigned int kind);
+
 /* Asynchronous events queue */
 
 /*! Initializes events' processing internals */
diff --git a/replay/replay-time.c b/replay/replay-time.c
new file mode 100755
index 000..a05bfd9
--- /dev/null
+++ b/replay/replay-time.c
@@ -0,0 +1,64 @@
+/*
+ * replay-time.c
+ *
+ * Copyright (c) 2010-2015 Institute for System Programming
+ * of the Russian Academy of Sciences.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "replay.h"
+#include "replay-internal.h"
+#include "qemu/error-report.h"
+
+int64_t replay_save_clock(ReplayClockKind kind, int64_t clock)
+{
+replay_save_instructions();
+
+if (replay_file) {
+replay_mutex_lock();
+replay_put_event(EVENT_CLOCK + kind);
+replay_put_qword(clock);
+replay_mutex_unlock();
+}
+
+return clock;
+}
+
+void

[Qemu-devel] [PATCH v18 13/21] icount: improve counting for record/replay

2015-09-17 Thread Pavel Dovgalyuk

icount_warp_rt function is called by qemu_clock_warp and as
callback of icount_warp timer. This patch adds call to qemu_clock_warp
into main_loop_wait function, because icount warp may be missed
in record/replay mode, when CPU is sleeping.
This patch also disables of calling this function by timer, because
it is not needed after making modifications of main_loop_wait.

Signed-off-by: Pavel Dovgalyuk 
---
 cpus.c  |   11 ---
 main-loop.c |3 +++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/cpus.c b/cpus.c
index 95b8469..46d6717 100644
--- a/cpus.c
+++ b/cpus.c
@@ -327,7 +327,7 @@ static int64_t qemu_icount_round(int64_t count)
 return (count + (1 << icount_time_shift) - 1) >> icount_time_shift;
 }
 
-static void icount_warp_rt(void *opaque)
+static void icount_warp_rt(void)
 {
 /* The icount_warp_timer is rescheduled soon after vm_clock_warp_start
  * changes from -1 to another value, so the race here is okay.
@@ -362,6 +362,11 @@ static void icount_warp_rt(void *opaque)
 }
 }
 
+static void icount_dummy_timer(void *opaque)
+{
+(void)opaque;
+}
+
 void qtest_clock_warp(int64_t dest)
 {
 int64_t clock = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
@@ -406,7 +411,7 @@ void qemu_clock_warp(QEMUClockType type)
  * the CPU starts running, in case the CPU is woken by an event other
  * than the earliest QEMU_CLOCK_VIRTUAL timer.
  */
-icount_warp_rt(NULL);
+icount_warp_rt();
 timer_del(icount_warp_timer);
 }
 if (!all_cpu_threads_idle()) {
@@ -529,7 +534,7 @@ void configure_icount(QemuOpts *opts, Error **errp)
 icount_sleep = qemu_opt_get_bool(opts, "sleep", true);
 if (icount_sleep) {
 icount_warp_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL_RT,
- icount_warp_rt, NULL);
+ icount_dummy_timer, NULL);
 }
 
 icount_align_option = qemu_opt_get_bool(opts, "align", false);
diff --git a/main-loop.c b/main-loop.c
index db600a3..df28670 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -506,6 +506,9 @@ int main_loop_wait(int nonblocking)
 slirp_pollfds_poll(gpollfds, (ret < 0));
 #endif
 
+/* CPU thread can infinitely wait for event after
+   missing the warp */
+qemu_clock_warp(QEMU_CLOCK_VIRTUAL);
 qemu_clock_run_all_timers();
 
 return ret;

[Qemu-devel] [PATCH v18 12/21] replay: shutdown event

2015-09-17 Thread Pavel Dovgalyuk

This patch records and replays simulator shutdown event.

Reviewed-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
---
 replay/replay-internal.h |2 ++
 replay/replay.c  |   14 ++
 replay/replay.h  |5 +
 vl.c |1 +
 4 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index f042c46..4414695 100755
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -23,6 +23,8 @@ enum ReplayEvents {
 EVENT_EXCEPTION,
 /* for async events */
 EVENT_ASYNC,
+/* for shutdown request */
+EVENT_SHUTDOWN,
 /* for clock read/writes */
 /* some of greater codes are reserved for clocks */
 EVENT_CLOCK,
diff --git a/replay/replay.c b/replay/replay.c
index d072ef7..83b0f8c 100755
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -14,6 +14,7 @@
 #include "replay-internal.h"
 #include "qemu/timer.h"
 #include "qemu/main-loop.h"
+#include "sysemu/sysemu.h"
 
 ReplayMode replay_mode = REPLAY_MODE_NONE;
 
@@ -34,6 +35,10 @@ bool replay_next_event_is(int event)
 res = true;
 }
 switch (replay_data_kind) {
+case EVENT_SHUTDOWN:
+replay_finish_event();
+qemu_system_shutdown_request();
+break;
 default:
 /* clock, time_t, checkpoint and other events */
 return res;
@@ -146,3 +151,12 @@ bool replay_has_interrupt(void)
 }
 return res;
 }
+
+void replay_shutdown_request(void)
+{
+if (replay_mode == REPLAY_MODE_RECORD) {
+replay_mutex_lock();
+replay_put_event(EVENT_SHUTDOWN);
+replay_mutex_unlock();
+}
+}
diff --git a/replay/replay.h b/replay/replay.h
index 2398509..fcc93d1 100755
--- a/replay/replay.h
+++ b/replay/replay.h
@@ -66,6 +66,11 @@ int64_t replay_read_clock(ReplayClockKind kind);
 ? replay_save_clock((clock), (value))   \
 : (value))
 
+/* Events */
+
+/*! Called when qemu shutdown is requested. */
+void replay_shutdown_request(void);
+
 /* Asynchronous events queue */
 
 /*! Disables storing events in the queue */
diff --git a/vl.c b/vl.c
index 066a080..a563c9e 100644
--- a/vl.c
+++ b/vl.c
@@ -1825,6 +1825,7 @@ void qemu_system_killed(int signal, pid_t pid)
 void qemu_system_shutdown_request(void)
 {
 trace_qemu_system_shutdown_request();
+replay_shutdown_request();
 shutdown_requested = 1;
 qemu_notify_event();
 }

1 2 3 >

1 - 100 of 215 matches

Mail list logo