date:20111021

Re: [Qemu-devel] [PATCH v2 4/6] tcg: Add interpreter for bytecode

2011-10-21 Thread Stefan Weil


Am 20.10.2011 23:36, schrieb malc:

On Thu, 20 Oct 2011, Stefan Weil wrote:

[..snip..]


+/* Trace message to see program flow. */
+#if defined(CONFIG_DEBUG_TCG_INTERPRETER)
+#define TRACE() \
+loglevel \
+? fprintf(stderr, "TCG %s:%u: %s()\n", __FILE__, __LINE__, __func__) \
+: (void)0

This is wrong, fprintf's return value is int and not void. Similar issue
was present today on comp.std.c
http://groups.google.com/group/comp.std.c/browse_thread/thread/4e01fece59a80572#

[..snip..]


You are right, but it does not matter. TRACE() is never used in
assignments, and gcc accepts the statement as it is.

Nevertheless, I'll fix it in the next release.

Thanks,
Stefan
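
For reference, the type mismatch malc points out can be avoided either by
casting the fprintf() call to void or by using a do/while wrapper. The
following is only a minimal sketch, not the fix that went into the patch:
the loglevel variable is a stand-in for QEMU's own, and TRACE_EXPR,
TRACE_STMT and checked_div are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for QEMU's global log level; an assumption for this sketch. */
static int loglevel = 1;

/* Expression form: both arms of ?: now have type void. */
#define TRACE_EXPR() \
    (loglevel \
     ? (void)fprintf(stderr, "TCG %s:%d: %s()\n", __FILE__, __LINE__, __func__) \
     : (void)0)

/* Statement form: behaves like a single statement, even after an if. */
#define TRACE_STMT() \
    do { \
        if (loglevel) { \
            fprintf(stderr, "TCG %s:%d: %s()\n", __FILE__, __LINE__, __func__); \
        } \
    } while (0)

/* Hypothetical caller, used only to exercise both macros. */
static int checked_div(int a, int b)
{
    assert(b != 0);
    TRACE_EXPR();
    TRACE_STMT();
    return a / b;
}
```

The expression form can still be used inside a comma expression, while the
statement form is the more common idiom for debug macros.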

[Qemu-devel] QEMU via GDB

2011-10-21 Thread Davide


Dear all,
I am trying to debug QEMU via GDB.


I configured and compiled QEMU with debugging flags, i.e.,
# CFLAGS="-g3 -O0" ./configure --disable-gfx-check


and run gdb:


# gdb ./i386-linux-user/qemu-i386


(gdb) break main
(gdb) run

Starting program: /home/test/femu/i386-linux-user/qemu-i386
Failed to read a valid object file image from memory.
Warning:
Cannot insert breakpoint 1.
Error accessing memory address 0x2f7df: Input/output error.


Is there any extra flag to be specified with the GDB for QEMU debugging? 
I am wondering if the QEMU virtual machine creates any problem to the 
GDB virtual machine.



Thanks.


G.

Re: [Qemu-devel] [Qemu-trivial] [PATCH] qed: don't pass NULL to memcpy

2011-10-21 Thread Paolo Bonzini


On 10/20/2011 07:23 PM, Stefan Hajnoczi wrote:

On Tue, Oct 18, 2011 at 09:17:35PM +0400, Pavel Borzenkov wrote:

Spotted by Clang Analyzer

Signed-off-by: Pavel Borzenkov <pavel.borzen...@gmail.com>
---
  block/qed.c |    6 ++++--
  1 files changed, 4 insertions(+), 2 deletions(-)


Thanks, applied to the trivial patches tree:
http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/trivial-patches


I think there are other places in the tree where we assume that 
memcpy(dest, NULL, 0); works.


Paolo
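
As background: the C standard requires valid pointers for memcpy() even when
the length is zero, which is why the analyzer flags such calls. One possible
guard is sketched below. memcpy_nullsafe is a hypothetical helper, not an
existing QEMU function, and the qed patch itself may be structured
differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: skip the call when there is nothing to copy, so a
 * NULL source paired with len == 0 never reaches memcpy().
 */
static void *memcpy_nullsafe(void *dest, const void *src, size_t len)
{
    if (len == 0) {
        return dest;
    }
    /* With a non-zero length both pointers must be valid. */
    assert(dest != NULL && src != NULL);
    return memcpy(dest, src, len);
}
```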

Re: [Qemu-devel] [Qemu-ppc] [PATCH] pseries: Correct vmx/dfp handling in both KVM and TCG cases

2011-10-21 Thread Alexander Graf


On 20.10.2011, at 22:06, David Gibson wrote:

 On Thu, Oct 20, 2011 at 07:40:00PM -0700, Alexander Graf wrote:
 On 20.10.2011, at 17:41, David Gibson da...@gibson.dropbear.id.au wrote:
 On Thu, Oct 20, 2011 at 10:12:51AM -0700, Alexander Graf wrote:
 On 17.10.2011, at 21:15, David Gibson wrote:
 [snip]
 So, I really don't follow what the logic you want is.  It sounds more
 like what I have already, so I'm not sure how -cpu host comes into
 this.
 
 Well, I want something very simple, layered:
 
 -cpu host only searches for pvr matches and selects a different CPU
 -type based on this
 
 Hrm, ok, well I can do this if you like, but note that this is quite
 different from how -cpu host behaves on x86.  There it builds the CPU
 spec from scratch based on querying the host cpuid, rather than
 selecting from an existing list of cpus.  I selected from the existing
 table based on host PVR because that was the easiest source for some
 of the info in the cpu_spec, but my intention was that anything we
 _can_ query directly from the host would override the table.
 
 It seems to me your approach is giving up on the possibility of
 allowing -cpu host to work (and give you full access to the host
 features) when qemu doesn't recognize the precise PVR of the host cpu.

I disagree :). This is what x86 does:

  * -cpu host fetches CPUID info from host, puts it into vcpu
  * vcpu CPUID info gets ANDed with KVM capability CPUIDs

I want basically the same thing. I want to have 2 different layers for 2 
different semantics. One for what the host CPU would be able to do and one for 
what we can emulate, and two different steps to ensure control over them.

What I apparently haven't gotten across yet is that I'm more than 
happy to get rid of the PVR searching step for -cpu host and instead use a full 
host capability inquiry mechanism. But that inquiry should indicate what the 
host CPU can do. It has nothing to do with KVM yet. The masking with KVM 
capabilities should be the next, separate step.

My goal is really to separate different layers into actual different layers :).

 This gets further complicated in the case of the w-i-p patch I have to
 properly advertise page sizes, where it's not just presence or absence
 of a feature, but the specific SLB and HPTE encodings must be
 advertised to the guest.

Yup, so we'd read out the host dt to find the host possible encodings (probably 
a bad idea, but that's a different story) and then ask KVM what encodings it 
supports and expose the ANDed product of them to the guest.

 
 We have 2 masks of available flags: TCG emulatable flags and KVM
 virtualizable flags. The KVM flags need to be generated dynamically,
 from the host dt for now. TCG flags are constant.
 
 Then we always AND the inst feature bits with the mask. This tells
 every other layer what features are available. That way even running
 -cpu G5 on a p7 works properly by not exposing DFP for example.
 
 That case was already fine.
 
 Are you suggesting doing the AND in the per-machine code (so, as we
 build the guest dt in the spapr case) or when we build the env->insn_flags
 from the spec->insn_flags?

I suggest doing that in translate_init.c where we actually build the 
env->insn_flags from the spec.


Alex
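
The two-layer scheme discussed above (what the CPU model can do, ANDed with
what the accelerator can provide) can be sketched as follows. The FEAT_*
bits and both function names are made up for illustration and do not
correspond to QEMU's real PPC instruction flags or to translate_init.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits; QEMU's real PPC insn flags are different. */
#define FEAT_ALTIVEC (UINT64_C(1) << 0)
#define FEAT_VSX     (UINT64_C(1) << 1)
#define FEAT_DFP     (UINT64_C(1) << 2)

/*
 * Layer 1: spec_flags says what the selected CPU model can do.
 * Layer 2: accel_mask says what TCG can emulate or KVM can virtualize.
 * Only features present in both layers are exposed to the guest.
 */
static uint64_t effective_insn_flags(uint64_t spec_flags, uint64_t accel_mask)
{
    return spec_flags & accel_mask;
}

/* Usage: -cpu G5 on a POWER7 host must not expose DFP to the guest. */
static void check_g5_on_p7(void)
{
    uint64_t g5_spec = FEAT_ALTIVEC;
    uint64_t p7_kvm_mask = FEAT_ALTIVEC | FEAT_VSX | FEAT_DFP;

    assert((effective_insn_flags(g5_spec, p7_kvm_mask) & FEAT_DFP) == 0);
}
```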

Re: [Qemu-devel] Multi heterogenous CPU archs for SoC sim?

2011-10-21 Thread Peter Maydell

On 20 October 2011 23:51, Andreas Färber andreas.faer...@web.de wrote:
 I have now come across such a heterogeneous SoC myself: Renesas announced
 the R-Car H1 this week, a SoC with one SH4A core and four ARM Cortex-A9
 cores.

Does it expose the SH4 to apps/OSes, or is it mostly used for
power management or similar ignorable duties? (For several
of the ARM boards we currently just ignore the fact that the real
h/w has a Cortex-M3 doing power management type stuff.)

 That would make them all 32-bit, and I am hoping to get confirmation
 that this is consistently Little Endian.

I think the endianness is a red herring for heterogeneous systems
anyway -- what QEMU defines as the target endianness is really something
more like the system bus endianness, as far as I can tell.
An extra core with a different idea of endianness shouldn't be
any harder to handle than cores which can switch endianness
at runtime. You just either insert swizzling or not.
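
As an illustration of "insert swizzling or not": a per-core accessor could
compare the core's byte order with the bus byte order and swap only on a
mismatch. This is a sketch under that assumption. swap32 and core_view32 are
hypothetical names, not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Hypothetical accessor: bus_value is in bus byte order; it is swapped
 * only when the accessing core's endianness differs from the bus.
 */
static uint32_t core_view32(uint32_t bus_value, bool core_big_endian,
                            bool bus_big_endian)
{
    return core_big_endian != bus_big_endian ? swap32(bus_value) : bus_value;
}

/* Usage: a big-endian core on a little-endian bus sees swapped bytes. */
static void swizzle_example(void)
{
    assert(core_view32(0x11223344u, true, false) == 0x44332211u);
}
```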

 The only realistic way to get started with such setups I see is to create a
 new target-xxx for the specific mix, define TARGET_LONG_BITS etc.
 appropriately in a new cpu.h, compile the needed target-xyz/*.c to unique
 xxx-softmmu/xyz-*.o and dispatch from a cpu_init() to the two cpu_*_init().

Yuck. Longer term if we want to support this kind of heterogeneity
we should be removing all the compile-time assumptions and generally
making the target-specifics suitably contained rather than leaking
into the rest of the code.

 I'm guessing we may need to distinguish the TBs at runtime? Reserving
 log2(#architectures) bits in the TBFLAGS might do, but feels ugly.
 Probably a lot of other issues I'm not seeing yet.

We may want the tb cache to be per-core anyway (and one thread per core),
which would avoid the problem of trying to wedge everything into one set
of tb_flags.

(Has anybody had a look at http://sourceforge.net/p/coremu/home/Home/ ?)

-- PMM

Re: [Qemu-devel] passing secrets to block devices

2011-10-21 Thread Daniel P. Berrange

On Fri, Oct 21, 2011 at 09:37:11AM +0800, shu ming wrote:
 On 2011-10-21 5:48, Josh Durgin wrote:
 On 10/20/2011 12:24 PM, Daniel P. Berrange wrote:
 On Thu, Oct 20, 2011 at 11:30:42AM -0700, Josh Durgin wrote:
 We're working on libvirt support for block device
 authentication [1]. To
 authenticate, rbd needs a username and a secret. Normally, to
 avoid putting the secret on the command line, you can store the secret
 in a file and pass the file to qemu, but when this is automated,
 there's no good way to know when the file can be removed. There are
 a few ways to pass the secret to qemu that avoid this problem:
 
 This is the same problem the iSCSI block driver currently faces,
 and also if the Curl/HTTP block driver wanted to do authentication
 we'd hit this. So it isn't unique to Ceph/RBD.
 
 1) pass an fd to an unlinked file containing the secret
 
 This is the simplest method, but it sounds like qemu developers don't
 like fd passing from libvirt. [2]
 
 That would be workable, but it means people trying to run the libvirt
 QEMU command line themselves, would have to remove some args.
 
 Isn't this already the case for chardevs? I can understand not
 wanting more things like that though.
 
 2) start guests paused, without disks requiring authentication, then
 use the drive_add monitor command to attach them
 
 This would make disks with authentication somewhat of a special case
 in libvirt, but would be simple to implement, and require no
 qemu changes.
 
 This makes it very hard for people to take the libvirt QEMU command line
 and run themselves, since now an entire chunk of it is just missing.
 So I really don't want to go down this route.
 
 3) start guests paused, then send the secret via a new QMP/HMP
 command (block_set_confkey value?)
 
 This is a larger change, but it would be more generally useful for
 changing configuration at runtime.
 
 I don't think you need to try to solve the problem of a general
 purpose 'set configuration' command here, not least because that
 will likely get you drawn into a huge discussion about qemu device
 configuration in general which will likely never end.
 
 We already have a 'block_passwd' command for setting qcow2 decryption
 keys. These aren't decryption passwords, rather they are authentication
 passwords, so they're a little different, but I think this command could
 still likely be leveraged for Ceph/iSCSI/etc auth passwords.
 
 Ideally, we want to cope with having both a decryption & auth password
 for the same block device. eg, an encrypted qcow2 image accessed over
 HTTP would require both. In this case there are 2 block drivers involved,
 the 'qcow2' driver and the 'http' driver. So perhaps an extra parameter
 for the 'block_password' command to identify which driver the password
 is intended for is the right approach. If omitted, we'd default to 'qcow2'
 for back compat.
 
 So eg, for an encrypted qcow2 disk accessed over http
 
 -drive  file=http://fred@host/my.iso,format=qcow2,id=mydrive
 
 the app would invoke
 
 { "execute": "block_password", "argument": { "device": "mydrive",
                                              "driver": "qcow2",
                                              "password": "12345" } }
 { "execute": "block_password", "argument": { "device": "mydrive",
                                              "driver": "curl",
                                              "password": "7890" } }
 
 For Ceph/RBD with a plain file, you'd just do
 
 
 { "execute": "block_password", "argument": { "device": "mydrive",
                                              "driver": "rbd",
                                              "password": "7890" } }
 
 
 This sounds good to me, although the same driver might use
 authentication and encryption. Adding another argument to specify
 'auth' or 'encryption' would fix this, i.e.:
 
   { "execute": "block_password", "argument": { "device": "mydrive",
                                                "driver": "qcow2",
                                                "use": "encryption",
                                                "password": "12345" } }
 
 I'll prepare a patch if there are no objections to this approach.
 Is the authentication ultimately performed by QEMU?  If so, how
 will the secrets be transported from libvirt to QEMU if they
 are on different hosts?
 IMO, they should be encrypted to prevent others on the network from snooping.

libvirt + QEMU are always run on the same host, communicating via a
UNIX domain socket. The application talking to libvirt might be on a
remote host, but the libvirt sockets all have strong encryption. Now
in theory you could have a mgmt app connecting to the monitor over
TCP, but I don't think that is something anyone will seriously do in
practice. It doesn't offer any kind of authentication, so exposing
it to the network would be giving away effective remote root access.
So we should just consider the monitor socket to be a secure channel
for this discussion.

Regards,
Daniel

Re: [Qemu-devel] passing secrets to block devices

2011-10-21 Thread Daniel P. Berrange

On Thu, Oct 20, 2011 at 02:48:15PM -0700, Josh Durgin wrote:
 On 10/20/2011 12:24 PM, Daniel P. Berrange wrote:
 On Thu, Oct 20, 2011 at 11:30:42AM -0700, Josh Durgin wrote:
 We're working on libvirt support for block device authentication [1]. To
 authenticate, rbd needs a username and a secret. Normally, to
 avoid putting the secret on the command line, you can store the secret
 in a file and pass the file to qemu, but when this is automated,
 there's no good way to know when the file can be removed. There are
 a few ways to pass the secret to qemu that avoid this problem:
 
 This is the same problem the iSCSI block driver currently faces,
 and also if the Curl/HTTP block driver wanted to do authentication
 we'd hit this. So it isn't unique to Ceph/RBD.
 
 1) pass an fd to an unlinked file containing the secret
 
 This is the simplest method, but it sounds like qemu developers don't
 like fd passing from libvirt. [2]
 
 That would be workable, but it means people trying to run the libvirt
 QEMU command line themselves, would have to remove some args.
 
 Isn't this already the case for chardevs? I can understand not
 wanting more things like that though.
 
 2) start guests paused, without disks requiring authentication, then
 use the drive_add monitor command to attach them
 
 This would make disks with authentication somewhat of a special case
 in libvirt, but would be simple to implement, and require no qemu changes.
 
 This makes it very hard for people to take the libvirt QEMU command line
 and run themselves, since now an entire chunk of it is just missing.
 So I really don't want to go down this route.
 
 3) start guests paused, then send the secret via a new QMP/HMP
 command (block_set_confkey  value?)
 
 This is a larger change, but it would be more generally useful for
 changing configuration at runtime.
 
 I don't think you need to try to solve the problem of a general
 purpose 'set configuration' command here, not least because that
 will likely get you drawn into a huge discussion about qemu device
 configuration in general which will likely never end.
 
 We already have a 'block_passwd' command for setting qcow2 decryption
 keys. These aren't decryption passwords, rather they are authentication
 passwords, so they're a little different, but I think this command could
 still likely be leveraged for Ceph/iSCSI/etc auth passwords.
 
 Ideally, we want to cope with having both a decryption & auth password
 for the same block device. eg, an encrypted qcow2 image accessed over
 HTTP would require both. In this case there are 2 block drivers involved,
 the 'qcow2' driver and the 'http' driver. So perhaps an extra parameter
 for the 'block_password' command to identify which driver the password
 is intended for is the right approach. If omitted, we'd default to 'qcow2'
 for back compat.
 
 So eg, for an encrypted qcow2 disk accessed over http
 
 -drive  file=http://fred@host/my.iso,format=qcow2,id=mydrive
 
 the app would invoke
 
 { "execute": "block_password", "argument": { "device": "mydrive",
                                              "driver": "qcow2",
                                              "password": "12345" } }
 { "execute": "block_password", "argument": { "device": "mydrive",
                                              "driver": "curl",
                                              "password": "7890" } }
 
 For Ceph/RBD with a plain file, you'd just do
 
 
 { "execute": "block_password", "argument": { "device": "mydrive",
                                              "driver": "rbd",
                                              "password": "7890" } }
 
 
 This sounds good to me, although the same driver might use
 authentication and encryption. Adding another argument to specify
 'auth' or 'encryption' would fix this, i.e.:
 
   { "execute": "block_password", "argument": { "device": "mydrive",
                                                "driver": "qcow2",
                                                "use": "encryption",
                                                "password": "12345" } }
 
 I'll prepare a patch if there are no objections to this approach.

In absence of other suggestions, it sounds workable to me.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
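
For comparison, option 1 from the quoted proposal (an fd to an unlinked file
containing the secret) could look roughly like the POSIX sketch below. It
only shows the producer side under stated assumptions: the function name and
the /tmp template are hypothetical, and how the fd would actually be handed
to qemu is not covered.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes the secret to a temporary file that is
 * unlinked right away, so no path to it remains on disk. Returns an fd
 * positioned at offset 0, or -1 on error.
 */
static int secret_to_unlinked_fd(const char *secret)
{
    char path[] = "/tmp/secret-XXXXXX";
    size_t len;
    int fd;

    assert(secret != NULL);
    fd = mkstemp(path);          /* created with mode 0600 */
    if (fd < 0) {
        return -1;
    }
    unlink(path);                /* the name is gone, the open fd stays valid */

    len = strlen(secret);
    /* A short write is treated as a failure in this sketch. */
    if (write(fd, secret, len) != (ssize_t)len ||
        lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The consumer reads the secret from the inherited fd and closes it. No file
needs to be removed afterwards, which is the cleanup problem the thread
starts from.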

Re: [Qemu-devel] [RFC][PATCH 28/45] qemu-kvm: msix: Drop tracking of used vectors

2011-10-21 Thread Jan Kiszka

On 2011-10-21 00:02, Michael S. Tsirkin wrote:
 Yes. But this still makes an API for acquiring per-vector resources a 
 requirement.

 Yes, but a different one than current use/unuse.
 
 What's wrong with use/unuse as an API? It's already in place
 and virtio calls it.

Not for that purpose. It remains a useless API in the absence of KVM's
requirements.

 
 And it will be an
 optional one, only for those devices that need to establish irq/eventfd
 channels.

 Jan
 
 Not sure this should be up to the device.

The device provides the fd. At least it acquires and associates it.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [Question] dump memory when host pci device is used by guest

2011-10-21 Thread Jan Kiszka

On 2011-10-20 12:03, Wen Congyang wrote:
 At 10/20/2011 05:41 PM, Jan Kiszka Write:
 On 2011-10-20 03:22, Wen Congyang wrote:
 I didn't read the full story, but 'crash' has been used for years to
 investigate kernel cores generated by kdump. Considering the support
 service people, virsh dump should support a format that crash can read,
 because they cannot investigate a vmcore well with gdb.

 crash has several functions that are useful for them, such as 'show kernel log',
 'focus on a cpu', 'for-each-task', 'for-each-vma', 'extract ftrace log', etc.

 Anyway, if someone who is not a developer of qemu/kvm has to learn 2 tools
 for investigating kernel dumps, that sounds harmful.

 Right, that's why everything (live debugging & crash analysis) should be
 consolidated in the long run over gdb. crash is architecturally obsolete
 today - not saying it is useless!

 I do not know why crash is obsolete today. Is there a new, better tool to
 replace crash?

 I'm not aware of equally powerful (python) scripts for gdb as
 replacement, but I think it's worth starting a porting effort at some point.


 At least, I always use crash for live debugging & crash analysis.

 Then you may answer some questions to me:
  - Can you attach to a remote target (kgdb, qemu, etc.) and how?
 
 No. crash's live debugging only works while the kernel is alive. I can use it to
 get the value of some variables, or some other information from the kernel. If
 the kernel panics, we can use gdb to attach to a remote target as you said. But
 on an end user's machine we cannot do that; we should dump the memory into a
 file and analyze it on another machine, so that the end user's guest can be
 restarted.
 
  - Can you use it with latest gdb versions or is the gdb functionality
hard-wired due to an embedded gdb core in crash (that's how I
understood Christoph's reply to this topic)
 
 If I use crash, I cannot use the latest gdb versions. Do we always need to use
 the latest gdb version? Currently, gdb-7.0 is embedded into crash, and it
 is enough for me. If the gdb embedded into crash cannot analyze the vmcore, I
 think we can update it and rebuild crash.

crash is simply designed the wrong way around (from today's
perspective): it should augment upstream gdb instead of forking it.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [Qemu-ppc] [PATCH] PPC: Fail configure when libfdt is not available

2011-10-21 Thread Gerd Hoffmann

  Hi,

 dtc-lexer.lex.o: In function `pop_input_file':
 /home/buildbot/git/dtc/dtc-lexer.l:201: undefined reference to
 `yypop_buffer_state'
 collect2: ld returned 1 exit status
 make: *** [dtc] Error 1
 
 ...this is harder.  I do rely fairly heavily on the lex multiple input
 buffer support for processing includes.  I'm not sure when that went
 in, but obviously after flex 2.5.4.

 I could rewrite to not rely on the flex stuff and do it myself, but it
 would be non-trivial, so I'm afraid that fix won't happen particularly
 soon.

For the record: Updating flex to 2.5.35 made dtc build fine on RHEL-5.

cheers,
  Gerd

Re: [Qemu-devel] QEMU via GDB

2011-10-21 Thread Robert Wang

What are your configure parameters?
I think you could try something like this:
./configure --enable-debug --extra-cflags="-g3 -O0"

And I did not find options like disable-gfx-check.

2011/10/21 Davide outshel...@gmail.com:
 Dear all,
 I am trying to debug QEMU via GDB.


 I configured and compiled QEMU with debugging flags, i.e.,
 # CFLAGS="-g3 -O0" ./configure --disable-gfx-check


 and run gdb:


 # gdb ./i386-linux-user/qemu-i386


 (gdb) break main
 (gdb) run

 Starting program: /home/test/femu/i386-linux-user/qemu-i386
 Failed to read a valid object file image from memory.
 Warning:
 Cannot insert breakpoint 1.
 Error accessing memory address 0x2f7df: Input/output error.


 Is there any extra flag to be specified with the GDB for QEMU debugging? I
 am wondering if the QEMU virtual machine creates any problem to the GDB
 virtual machine.


 Thanks.


 G.

Re: [Qemu-devel] Multi heterogenous CPU archs for SoC sim?

2011-10-21 Thread 陳韋任

 We may want the tb cache to be per-core anyway (and one thread per core),
 which would avoid the problem of trying to wedge everything into one set
 of tb_flags.
 
 (Has anybody had a look at http://sourceforge.net/p/coremu/home/Home/ ?)

  COREMU treats QEMU as an entity and launches multiple QEMUs at the same
time. The QEMUs communicate with each other through an underlying thin layer
provided by COREMU. I think this approach is much cleaner than trying to
parallelize QEMU itself.

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667

Re: [Qemu-devel] [Question] dump memory when host pci device is used by guest

2011-10-21 Thread Wen Congyang

At 10/21/2011 03:11 PM, Jan Kiszka Write:
 On 2011-10-20 12:03, Wen Congyang wrote:
 At 10/20/2011 05:41 PM, Jan Kiszka Write:
 On 2011-10-20 03:22, Wen Congyang wrote:
 I didn't read the full story, but 'crash' has been used for years to
 investigate kernel cores generated by kdump. Considering the support
 service people, virsh dump should support a format that crash can read,
 because they cannot investigate a vmcore well with gdb.

 crash has several functions that are useful for them, such as 'show kernel log',
 'focus on a cpu', 'for-each-task', 'for-each-vma', 'extract ftrace log', etc.

 Anyway, if someone who is not a developer of qemu/kvm has to learn 2 tools
 for investigating kernel dumps, that sounds harmful.

 Right, that's why everything (live debugging & crash analysis) should be
 consolidated in the long run over gdb. crash is architecturally obsolete
 today - not saying it is useless!

 I do not know why crash is obsolete today. Is there a new, better tool to
 replace crash?

 I'm not aware of equally powerful (python) scripts for gdb as
 replacement, but I think it's worth starting a porting effort at some point.


 At least, I always use crash for live debugging & crash analysis.

 Then you may answer some questions to me:
  - Can you attach to a remote target (kgdb, qemu, etc.) and how?

 No. crash's live debugging only works while the kernel is alive. I can use it to
 get the value of some variables, or some other information from the kernel. If
 the kernel panics, we can use gdb to attach to a remote target as you said. But
 on an end user's machine we cannot do that; we should dump the memory into a
 file and analyze it on another machine, so that the end user's guest can be
 restarted.

  - Can you use it with latest gdb versions or is the gdb functionality
hard-wired due to an embedded gdb core in crash (that's how I
understood Christoph's reply to this topic)

 If I use crash, I cannot use the latest gdb versions. Do we always need to use
 the latest gdb version? Currently, gdb-7.0 is embedded into crash, and it
 is enough for me. If the gdb embedded into crash cannot analyze the vmcore, I
 think we can update it and rebuild crash.
 
 crash is simply designed the wrong way around (from today's
 perspective): it should augment upstream gdb instead of forking it.

Cc Dave Anderson. He knows how crash uses gdb.

I think that crash does not fork a task to execute gdb, and gdb is a part of 
crash.

Thanks
Wen Congyang

 
 Jan

Re: [Qemu-devel] [RFC][PATCH 28/45] qemu-kvm: msix: Drop tracking of used vectors

2011-10-21 Thread Michael S. Tsirkin

On Fri, Oct 21, 2011 at 09:09:10AM +0200, Jan Kiszka wrote:
 On 2011-10-21 00:02, Michael S. Tsirkin wrote:
  Yes. But this still makes an API for acquiring per-vector resources a 
  requirement.
 
  Yes, but a different one than current use/unuse.
  
  What's wrong with use/unuse as an API? It's already in place
  and virtio calls it.
 
 Not for that purpose.
 It remains a useless API in the absence of KVM's
 requirements.
 

Sorry, I don't understand. This can acquire whatever resources
necessary. It does not seem to make sense to rip it out
only to add a different one back in.

  
  And it will be an
  optional one, only for those devices that need to establish irq/eventfd
  channels.
 
  Jan
  
  Not sure this should be up to the device.
 
 The device provides the fd. At least it acquires and associates it.
 
 Jan

It would surely be beneficial to be able to have a uniform
API so that devices don't need to be recoded to be moved
in this way.

 -- 
 Siemens AG, Corporate Technology, CT T DE IT 1
 Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [Qemu-ppc] [PATCH] PPC: Fail configure when libfdt is not available

2011-10-21 Thread Paolo Bonzini

On 10/20/2011 08:35 PM, Gerd Hoffmann wrote:
Hi,
 
 If there are build problems with libfdt on any platform let me know
 about them.  I would like it to build clean as widely as possible, but
 I don't have that great a diversity of build environments, so I have
 to rely on bug reports.
 
 Fails to build on RHEL-5:
 
   CC convert-dtsv0-lexer.lex.o
 cc1: warnings being treated as errors
 convert-dtsv0-lexer.lex.c:693: warning: no previous prototype for 'yylex'
 make: *** [convert-dtsv0-lexer.lex.o] Error 1
 
 Removing -Werror from the Makefile gets me a bit further:
 
   CC dtc-lexer.lex.o
 dtc-lexer.lex.c:683: warning: no previous prototype for 'yylex'
 dtc-lexer.l: In function 'push_input_file':
 dtc-lexer.l:192: warning: implicit declaration of function 
 'yypush_buffer_state'
 dtc-lexer.l:192: warning: nested extern declaration of 'yypush_buffer_state'
 dtc-lexer.l: In function 'pop_input_file':
 dtc-lexer.l:201: warning: implicit declaration of function 
 'yypop_buffer_state'
 dtc-lexer.l:201: warning: nested extern declaration of 'yypop_buffer_state'
   CC dtc-parser.tab.o
   LD dtc
 dtc-lexer.lex.o: In function `push_input_file':
 /home/buildbot/git/dtc/dtc-lexer.l:192: undefined reference to
 `yypush_buffer_state'
 dtc-lexer.lex.o: In function `pop_input_file':
 /home/buildbot/git/dtc/dtc-lexer.l:201: undefined reference to
 `yypop_buffer_state'
 collect2: ld returned 1 exit status
 make: *** [dtc] Error 1
 
 I guess the flex version shipped with RHEL-5 is too old.
 
 $ rpm -qf $(which lex)
 flex-2.5.4a-41.fc6

flex is only used by dtc, not libfdt, so you can probably patch it out.
However, the usual convention is that lex- and yacc-generated files
are shipped in the tarball, with a make dist that wraps tar and/or
git-archive.  See the following patch.

Paolo

-- >8 --

From f91c3f5f165df8c8331c0c33374f55f5cf157ba6 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini pbonz...@redhat.com
Date: Fri, 21 Oct 2011 08:59:43 +0200
Subject: [PATCH] build: add make dist

The usual convention is that lex- and yacc-generated files are shipped in
the tarball.  Another usual convention, originating in Automake, is that
make dist wraps tar and/or git-archive and generates a self-contained
archive.  dtc does not use Automake, so add this target.

Cc: Gerd Hoffmann kra...@redhat.com
Cc: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 Makefile     |   23 +++++++++++++++++++++++
 Makefile.dtc |    4 ++--
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index b32409b..edfdb9c 100644
--- a/Makefile
+++ b/Makefile
@@ -246,4 +246,27 @@ $(LIBFDT_lib):
@$(VECHO) BISON $@
$(BISON) -d $<
 
+.PHONY: distdir dist-gz dist-xz dist
+
+distdir = dtc-$(dtc_version)/
+distdir: $(DTC_GEN_SRCS) $(CONVERT_GEN_SRCS)
+   mkdir $(distdir)
+   @$(VECHO) DISTDIR $@
+   git archive --format=tar HEAD --prefix=$(distdir) | tar -xf -
+   @for i in $^; do \
+   $(if $(V),echo cp $$i $(distdir),:); \
+   cp $$i $(distdir); \
+   done
+   chmod -R ug+w $(distdir)
+
+dist-gz: distdir
+   @$(VECHO) TAR dtc-$(dtc_version).tar.gz
+   tar -chozf dtc-$(dtc_version).tar.gz $(distdir)
+dist-xz: distdir
+   @$(VECHO) TAR dtc-$(dtc_version).tar.xz
+   tar -Ixz -chof dtc-$(dtc_version).tar.xz $(distdir)
+
+dist: dist-gz dist-xz
+   rm -rf $(distdir)
+
 FORCE:
diff --git a/Makefile.dtc b/Makefile.dtc
index bece49b..0b2c869 100644
--- a/Makefile.dtc
+++ b/Makefile.dtc
@@ -14,5 +14,5 @@ DTC_SRCS = \
treesource.c \
util.c
 
-DTC_GEN_SRCS = dtc-lexer.lex.c dtc-parser.tab.c
-DTC_OBJS = $(DTC_SRCS:%.c=%.o) $(DTC_GEN_SRCS:%.c=%.o)
+DTC_GEN_SRCS = dtc-lexer.lex.c dtc-parser.tab.c dtc-parser.tab.h
+DTC_OBJS = $(patsubst %.c,%.o,$(DTC_SRCS) $(filter %.c, $(DTC_GEN_SRCS)))
-- 
1.7.6

[Qemu-devel] [PATCH v2 0/6] MIPS64 user mode emulation in QEMU with Cavium specific instruction support

2011-10-21 Thread khansa

From: Khansa Butt kha...@kics.edu.pk

This is the team work of Ehsan-ul-Haq, Abdul Qadeer, Abdul Waheed, Khansa Butt
from HPCN Lab KICS UET Lahore.

v1 contains:
* SEQI related changes specified by Richard Henderson
* Fix issues related to coding style, typos and misleading comments
* Cavium specific change in set_thread_area syscall has been removed,
  as it corresponds to modified libc and kernel.

This patch series adds support for MIPS64 user mode emulation in QEMU.
Along with it, we implemented Cavium specific instructions, which we will use
in SME (system mode emulation of the Octeon processor).

If you have any objections regarding the implementation of the
Cavium instructions, please read the following notes.

Notes
*

The details of some instructions are as follows:
1) seq rd,rs,rt
seq --> rd = 1 if rs = rt
is equivalent to
xor rd,rs,rt
sltiu rd,rd,1
2) exts rt,rs,p,lenm1
rt = sign-extend(rs<p+lenm1:p>, lenm1)
From the reference manual of Cavium Networks:
Bit locations p + lenm1 to p are extracted from rs and the result is written
into the lowest bits of destination register rt. The remaining bits in rt are a
sign-extension of the most-significant bit of the bit field (i.e. rt<63:lenm1>
are all duplicates of the source-register bit rs<p+lenm1>). So we can't use any
of the 8, 16 or 32 bit sign extension tcg functions. To sign extend according to
the msb of the bit field we have our own implementation.
3) dmul rd,rs,rt
This instruction is included in gen_arith() because it is a three operand
double word multiply instruction.
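
To make the semantics described in the notes concrete, here is a plain C
model of seq and exts. It is only a sketch of the described behaviour, not
the TCG code from this series. The function names seq_model and exts_model
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* seq rd,rs,rt: rd = 1 if rs == rt, modelled as xor followed by sltiu 1. */
static uint64_t seq_model(uint64_t rs, uint64_t rt)
{
    return (rs ^ rt) < 1;
}

/*
 * exts rt,rs,p,lenm1: extract bits p+lenm1..p of rs into the low bits of
 * the result and sign-extend from bit lenm1 of the extracted field.
 */
static int64_t exts_model(uint64_t rs, unsigned p, unsigned lenm1)
{
    assert(p + lenm1 < 64);

    unsigned len = lenm1 + 1;
    uint64_t field = rs >> p;
    if (len < 64) {
        field &= (UINT64_C(1) << len) - 1;   /* keep only len bits */
    }

    /* Flip the field's sign bit, then subtract it again to sign-extend. */
    uint64_t sign = UINT64_C(1) << lenm1;
    return (int64_t)((field ^ sign) - sign);
}
```

For example, extracting the 4-bit field at bit 4 of 0xF0 gives -1, while the
same field of 0x70 gives 7.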

 configure |1 +
 default-configs/mips64-linux-user.mak |1 +
 linux-user/main.c |   21 ++-
 linux-user/mips64/syscall.h   |2 +
 linux-user/signal.c   |  438 -
 mips-dis.c|   53 
 target-mips/cpu.h |7 +
 target-mips/helper.h  |5 +
 target-mips/machine.c |   12 +
 target-mips/mips-defs.h   |2 +
 target-mips/op_helper.c   |   73 ++
 target-mips/translate.c   |  431 -
 target-mips/translate_init.c  |   24 ++
 13 files changed, 1050 insertions(+), 20 deletions(-)
 create mode 100644 default-configs/mips64-linux-user.mak

-- 
1.7.3.4

[Qemu-devel] [PATCH v2 3/6] linux-user:Signal handling for MIPS64

2011-10-21 Thread khansa

From: Khansa Butt kha...@kics.edu.pk


Signed-off-by: Khansa Butt kha...@kics.edu.pk
---
 linux-user/signal.c |  438 +--
 1 files changed, 426 insertions(+), 12 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 59c3c88..f5f8bba 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -30,6 +30,8 @@
 #include "qemu-common.h"
 #include "target_signal.h"
 
+int sigrt;
+
 //#define DEBUG_SIGNAL
 
 static struct target_sigaltstack target_sigaltstack_used = {
@@ -596,7 +598,11 @@ int do_sigaction(int sig, const struct target_sigaction *act,
     if (act) {
         /* FIXME: This is not threadsafe.  */
         k->_sa_handler = tswapl(act->_sa_handler);
+#if defined(TARGET_MIPS64)
+        k->sa_flags = bswap32(act->sa_flags);
+#else
         k->sa_flags = tswapl(act->sa_flags);
+#endif
 #if !defined(TARGET_MIPS)
         k->sa_restorer = tswapl(act->sa_restorer);
 #endif
@@ -2415,29 +2421,435 @@ void sparc64_get_context(CPUSPARCState *env)
 #endif
 #elif defined(TARGET_ABI_MIPSN64)
 
+struct target_sigcontext {
+uint32_t   sc_regmask; /* Unused */
+uint32_t   sc_status;
+uint64_t   sc_pc;
+uint64_t   sc_regs[32];
+uint64_t   sc_fpregs[32];
+uint32_t   sc_ownedfp; /* Unused */
+uint32_t   sc_fpc_csr;
+uint32_t   sc_fpc_eir; /* Unused */
+uint32_t   sc_used_math;
+uint32_t   sc_dsp; /* dsp status, was sc_ssflags */
+uint32_t   pad0;
+uint64_t   sc_mdhi;
+uint64_t   sc_mdlo;
+target_ulong   sc_hi1; /* Was sc_cause */
+target_ulong   sc_lo1; /* Was sc_badvaddr */
+target_ulong   sc_hi2; /* Was sc_sigset[4] */
+target_ulong   sc_lo2;
+target_ulong   sc_hi3;
+target_ulong   sc_lo3;
+};
+
+struct sigframe {
+uint32_t sf_ass[4]; /* argument save space for o32 */
+uint32_t sf_code[2];/* signal trampoline */
+struct target_sigcontext sf_sc;
+target_sigset_t sf_mask;
+};
+
+struct target_ucontext {
+target_ulong tuc_flags;
+target_ulong tuc_link;
+target_stack_t tuc_stack;
+target_ulong pad0;
+struct target_sigcontext tuc_mcontext;
+target_sigset_t tuc_sigmask;
+};
+
+struct target_rt_sigframe {
+uint32_t rs_ass[4];   /* argument save space for o32 */
+uint32_t rs_code[2];  /* signal trampoline */
+struct target_siginfo rs_info;
+struct target_ucontext rs_uc;
+};
+
+/* Install trampoline to jump back from signal handler */
+static inline int install_sigtramp(unsigned int *tramp,   unsigned int syscall)
+{
+int err;
+
+/*
+ * Set up the return code ...
+ *
+ * li  v0, __NR__foo_sigreturn
+ * syscall
+ */
+
+    err = __put_user(0x24020000 + syscall, tramp + 0);
+    err |= __put_user(0x0000000c          , tramp + 1);
+/* flush_cache_sigtramp((unsigned long) tramp); */
+return err;
+}
+
+static inline int
+setup_sigcontext(CPUState *regs, struct target_sigcontext *sc)
+{
+int err = 0;
+
+    err |= __put_user(regs->active_tc.PC, &sc->sc_pc);
+
+#define save_gp_reg(i) do { \
+    err |= __put_user(regs->active_tc.gpr[i], &sc->sc_regs[i]); \
+} while (0)
+    __put_user(0, &sc->sc_regs[0]); save_gp_reg(1); save_gp_reg(2);
+save_gp_reg(3); save_gp_reg(4); save_gp_reg(5); save_gp_reg(6);
+save_gp_reg(7); save_gp_reg(8); save_gp_reg(9); save_gp_reg(10);
+save_gp_reg(11); save_gp_reg(12); save_gp_reg(13); save_gp_reg(14);
+save_gp_reg(15); save_gp_reg(16); save_gp_reg(17); save_gp_reg(18);
+save_gp_reg(19); save_gp_reg(20); save_gp_reg(21); save_gp_reg(22);
+save_gp_reg(23); save_gp_reg(24); save_gp_reg(25); save_gp_reg(26);
+save_gp_reg(27); save_gp_reg(28); save_gp_reg(29); save_gp_reg(30);
+save_gp_reg(31);
+#undef save_gp_reg
+
+    err |= __put_user(regs->active_tc.HI[0], &sc->sc_mdhi);
+    err |= __put_user(regs->active_tc.LO[0], &sc->sc_mdlo);
+
+    /* Not used yet, but might be useful if we ever have DSP support */
+#if 0
+    if (cpu_has_dsp) {
+        err |= __put_user(mfhi1(), &sc->sc_hi1);
+        err |= __put_user(mflo1(), &sc->sc_lo1);
+        err |= __put_user(mfhi2(), &sc->sc_hi2);
+        err |= __put_user(mflo2(), &sc->sc_lo2);
+        err |= __put_user(mfhi3(), &sc->sc_hi3);
+        err |= __put_user(mflo3(), &sc->sc_lo3);
+        err |= __put_user(rddsp(DSP_MASK), &sc->sc_dsp);
+    }
+    /* same with 64 bit */
+#ifdef CONFIG_64BIT
+    err |= __put_user(regs->hi, &sc->sc_hi[0]);
+    err |= __put_user(regs->lo, &sc->sc_lo[0]);
+    if (cpu_has_dsp) {
+        err |= __put_user(mfhi1(), &sc->sc_hi[1]);
+        err |= __put_user(mflo1(), &sc->sc_lo[1]);
+        err |= __put_user(mfhi2(), &sc->sc_hi[2]);
+        err |= __put_user(mflo2(), &sc->sc_lo[2]);
+        err |= __put_user(mfhi3(), &sc->sc_hi[3]);
+        err |= __put_user(mflo3(), &sc->sc_lo[3]);
+        err |= __put_user(rddsp(DSP_MASK), &sc->sc_dsp);
+    }
+#endif
+#endif
+
+#if 0

[Qemu-devel] [PATCH v2 6/6] Addition of Cavium instructions in disassembler

2011-10-21 Thread khansa

From: Khansa Butt kha...@kics.edu.pk


Signed-off-by: Khansa Butt kha...@kics.edu.pk
---
 mips-dis.c |   53 +
 1 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/mips-dis.c b/mips-dis.c
index e3a6e0b..96ab1e8 100644
--- a/mips-dis.c
+++ b/mips-dis.c
@@ -300,6 +300,7 @@ struct mips_opcode
Also used for immediate operands in vr5400 vector insns.
o 16 bit signed offset (OP_*_DELTA)
p 16 bit PC relative branch target address (OP_*_DELTA)
+   +p 5 bit unsigned constant describing bit position, for Octeon (OP_*_RT)
q 10 bit extra breakpoint code (OP_*_CODE2)
r 5 bit same register used as both source and target (OP_*_RS)
s 5 bit source register specifier (OP_*_RS)
@@ -491,6 +492,13 @@ struct mips_opcode
 #define INSN_MULT   0x4000
 /* Instruction synchronize shared memory.  */
 #define INSN_SYNC  0x8000
+/* Load Cavium specific multiplier registers. */
+#define INSN_WRITE_MPL0 0x1
+#define INSN_WRITE_MPL1 0x2
+#define INSN_WRITE_MPL2 0x4
+#define INSN_WRITE_P0   0x8
+#define INSN_WRITE_P1   0x10
+#define INSN_WRITE_P2   0x20
 
 /* These are the bits which may be set in the pinfo2 field of an
instruction. */
@@ -569,6 +577,8 @@ struct mips_opcode
 #define INSN_LOONGSON_2E  0x4000
 /* ST Microelectronics Loongson 2F.  */
 #define INSN_LOONGSON_2F  0x8000
+/* Cavium Network's Octeon processor */
+#define INSN_CVM_OCTEON   0x1
 
 /* MIPS ISA defines, use instead of hardcoding ISA level.  */
 
@@ -1099,6 +1109,13 @@ extern const int bfd_mips16_num_opcodes;
 #define RD_HI  INSN_READ_HI
 #define MOD_HI  WR_HI|RD_HI
 
+#define WR_MPL0 INSN_WRITE_MPL0
+#define WR_MPL1 INSN_WRITE_MPL1
+#define WR_MPL2 INSN_WRITE_MPL2
+#define WR_P0 INSN_WRITE_P0
+#define WR_P1 INSN_WRITE_P1
+#define WR_P2 INSN_WRITE_P2
+
 #define WR_LO  INSN_WRITE_LO
 #define RD_LO  INSN_READ_LO
 #define MOD_LO  WR_LO|RD_LO
@@ -1137,6 +1154,8 @@ extern const int bfd_mips16_num_opcodes;
 #define IL2E   (INSN_LOONGSON_2E)
 #define IL2F   (INSN_LOONGSON_2F)
 
+#define ICVM    (INSN_CVM_OCTEON)
+
 #define P3 INSN_4650
 #define L1 INSN_4010
 #define V1 (INSN_4100 | INSN_4111 | INSN_4120)
@@ -2435,6 +2454,34 @@ const struct mips_opcode mips_builtin_opcodes[] =
 {cop1, C,  0,(int) M_COP1, INSN_MACRO, 0,  I1  },
 {cop2, C,  0,(int) M_COP2, INSN_MACRO, 0,  I1  },
 {cop3, C,  0,(int) M_COP3, INSN_MACRO, 0,  I1  },
+/* Cavium specific instructions */
+{baddu,   d,s,t,0x7028, 0xfc0007ff, RD_s|RD_t|WR_d, 0,  ICVM},
+{dmul,d,s,t,0x7003, 0xfc0007ff, RD_s|RD_t|WR_d, 0,  ICVM},
+{v3mulu,  d,s,t,0x7011, 0xfc0007ff, RD_s|RD_t|WR_d, 0,  ICVM},
+{vmm0,d,s,t,0x7010, 0xfc0007ff, RD_s|RD_t|WR_d, 0,  ICVM},
+{vmulu,   d,s,t,0x700f, 0xfc0007ff, RD_s|RD_t|WR_d, 0,  ICVM},
+{seq, d,s,t,0x702a, 0xfc0007ff, RD_s|RD_t|WR_d, 0,  ICVM},
+{seqi,   t,r,j, 0x702e, 0xfc3f,  WR_t|RD_s, 0,  ICVM},
+{sne, d,s,t,0x702b, 0xfc0007ff, RD_s|RD_t|WR_d, 0,  ICVM},
+{snei,t,r,j,0x702f, 0xfc3f, WR_t|RD_s,  0,  ICVM},
+{bbit0,s,+p,p,   0xc800, 0xfc00, CBD|RD_s,  0,  ICVM},
+{bbit032,s,+p,p,   0xd800, 0xfc00, CBD|RD_s, 0, ICVM},
+{bbit1,s,+p,p,   0xe800, 0xfc00, CBD|RD_s,   0, ICVM},
+{bbit132,s,+p,p,   0xf800, 0xfc00, CBD|RD_s, 0, ICVM},
+{saa,t,(b), 0x7018, 0xfc00, SM|RD_t|RD_b,0, ICVM},
+{saad,   t,(b), 0x7019, 0xfc00, SM|RD_t|RD_b,0, ICVM},
+{exts,   t,r,+A,+C, 0x703a, 0xfc3f, WR_t|RD_s,   0, ICVM},
+{exts32, t,r,+A,+C, 0x7c3b, 0xfc3f, WR_t|RD_s,   0, ICVM},
+{cins,   t,r,+A,+B, 0x7032, 0xfc3f, WR_t|RD_s,   0, ICVM},
+{cins32, t,r,+A,+B, 0x7033, 0xfc3f, WR_t|RD_s,   0, ICVM},
+{mtm0,s,0x7008, 0xfc1f, RD_s|WR_MPL0,   0,  ICVM},
+{mtm1,s,0x700c, 0xfc1f, RD_s|WR_MPL1,   0,  ICVM},
+{mtm2,s,0x700d, 0xfc1f, RD_s|WR_MPL2,   0,  ICVM},
+{mtp0,s,0x7009, 0xfc1f, RD_s|WR_P0, 0,  ICVM},
+{mtp1,s,0x700a, 0xfc1f, RD_s|WR_P1, 0,  ICVM},
+{mtp2,s,0x700b, 0xfc1f, RD_s|WR_P2, 0,  ICVM},
+{dpop,d,s,  0x702d, 0xfc1f07ff, RD_s|WR_d,  0,  ICVM},
+{pop, d,s,  0x702c, 0xfc1f07ff, RD_s|WR_d,  0,  ICVM},
   /* Conflicts with the 4650's mul instruction.  Nobody's using the
  4010 any more, so move this insn out of the way.  If the object
  format gave us more info,

[Qemu-devel] [PATCH v2 1/6] linux-user:Support for MIPS64 user mode emulation in QEMU

2011-10-21 Thread khansa

From: Khansa Butt kha...@kics.edu.pk


Signed-off-by: Khansa Butt kha...@kics.edu.pk
---
 configure |1 +
 default-configs/mips64-linux-user.mak |1 +
 linux-user/main.c |   21 +++--
 linux-user/mips64/syscall.h   |2 ++
 linux-user/signal.c   |2 --
 5 files changed, 23 insertions(+), 4 deletions(-)
 create mode 100644 default-configs/mips64-linux-user.mak

diff --git a/configure b/configure
index 9ab3ab4..5e45a43 100755
--- a/configure
+++ b/configure
@@ -891,6 +891,7 @@ m68k-linux-user \
 microblaze-linux-user \
 microblazeel-linux-user \
 mips-linux-user \
+mips64-linux-user \
 mipsel-linux-user \
 ppc-linux-user \
 ppc64-linux-user \
diff --git a/default-configs/mips64-linux-user.mak b/default-configs/mips64-linux-user.mak
new file mode 100644
index 0000000..1598bfc
--- /dev/null
+++ b/default-configs/mips64-linux-user.mak
@@ -0,0 +1 @@
+# Default configuration for mips64-linux-user
diff --git a/linux-user/main.c b/linux-user/main.c
index 89a51d7..1cc564d 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2068,7 +2068,8 @@ static int do_store_exclusive(CPUMIPSState *env)
 void cpu_loop(CPUMIPSState *env)
 {
     target_siginfo_t info;
-    int trapnr, ret;
+    int trapnr;
+    abi_long ret;
     unsigned int syscall_num;
 
     for(;;) {
@@ -2077,8 +2078,23 @@ void cpu_loop(CPUMIPSState *env)
         cpu_exec_end(env);
         switch(trapnr) {
         case EXCP_SYSCALL:
-            syscall_num = env->active_tc.gpr[2] - 4000;
             env->active_tc.PC += 4;
+#if defined(TARGET_MIPS64)
+            syscall_num = env->active_tc.gpr[2] - 5000;
+            /* MIPS64 has eight argument registers so there is
+             * no need to get arguments from stack
+             */
+            ret = do_syscall(env, env->active_tc.gpr[2],
+                             env->active_tc.gpr[4],
+                             env->active_tc.gpr[5],
+                             env->active_tc.gpr[6],
+                             env->active_tc.gpr[7],
+                             env->active_tc.gpr[8],
+                             env->active_tc.gpr[9],
+                             env->active_tc.gpr[10],
+                             env->active_tc.gpr[11]);
+#else
+            syscall_num = env->active_tc.gpr[2] - 4000;
             if (syscall_num >= sizeof(mips_syscall_args)) {
                 ret = -TARGET_ENOSYS;
             } else {
@@ -2105,6 +2121,7 @@ void cpu_loop(CPUMIPSState *env)
                                  env->active_tc.gpr[7],
                                  arg5, arg6, arg7, arg8);
             }
+#endif
             if (ret == -TARGET_QEMU_ESIGRETURN) {
                 /* Returning from a successful sigreturn syscall.
                    Avoid clobbering register state.  */
diff --git a/linux-user/mips64/syscall.h b/linux-user/mips64/syscall.h
index 668a2b9..96f03da 100644
--- a/linux-user/mips64/syscall.h
+++ b/linux-user/mips64/syscall.h
@@ -218,4 +218,6 @@ struct target_pt_regs {
 
 
 
+#define TARGET_QEMU_ESIGRETURN 255
+
 #define UNAME_MACHINE mips64
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 89276eb..59c3c88 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -2415,8 +2415,6 @@ void sparc64_get_context(CPUSPARCState *env)
 #endif
 #elif defined(TARGET_ABI_MIPSN64)
 
-# warning signal handling not implemented
-
 static void setup_frame(int sig, struct target_sigaction *ka,
target_sigset_t *set, CPUState *env)
 {
-- 
1.7.3.4
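
As a rough illustration of the MIPS64 branch in the cpu_loop() hunk above, the
following standalone sketch collects the syscall number from v0 ($2) and the
eight arguments from a0..a7 ($4..$11), so nothing has to be read from the
guest stack. The struct and function names are hypothetical stand-ins for
this example and are not QEMU types; the base of 5000 is the value used in
the patch.

```c
#include <assert.h>
#include <stdint.h>

#define N64_SYSCALL_BASE 5000   /* base used by the patch for MIPS64 */
#define N64_NUM_ARGS     8      /* a0..a7 live in $4..$11            */

/* Hypothetical stand-in for the general purpose register file. */
struct gpr_file_sketch {
    uint64_t gpr[32];
};

/* Hypothetical decoded request: index relative to the base plus args. */
struct syscall_req_sketch {
    unsigned num;
    uint64_t arg[N64_NUM_ARGS];
};

static struct syscall_req_sketch
n64_decode_syscall(const struct gpr_file_sketch *tc)
{
    struct syscall_req_sketch req;
    int i;

    /* v0 ($2) carries the absolute syscall number. */
    req.num = (unsigned)(tc->gpr[2] - N64_SYSCALL_BASE);
    /* All eight arguments come straight from registers. */
    for (i = 0; i < N64_NUM_ARGS; i++) {
        req.arg[i] = tc->gpr[4 + i];
    }
    return req;
}
```

In the non-MIPS64 branch of the patch only the first four arguments come from
registers and arg5..arg8 are fetched separately, which is why the two
branches differ.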

Re: [Qemu-devel] passing secrets to block devices

2011-10-21 Thread Kevin Wolf

On 20.10.2011 23:48, Josh Durgin wrote:
 On 10/20/2011 12:24 PM, Daniel P. Berrange wrote:
 On Thu, Oct 20, 2011 at 11:30:42AM -0700, Josh Durgin wrote:
 We're working on libvirt support for block device authentication [1]. To
 authenticate, rbd needs a username and a secret. Normally, to
 avoid putting the secret on the command line, you can store the secret
 in a file and pass the file to qemu, but when this is automated,
 there's no good way to know when the file can be removed. There are
 a few ways to pass the secret to qemu that avoid this problem:

 This is the same problem the iSCSI block driver currently faces,
 and also if the Curl/HTTP block driver wanted to do authentication
 we'd hit this. So it isn't unique to Ceph/RBD.

 1) pass an fd to an unlinked file containing the secret

 This is the simplest method, but it sounds like qemu developers don't
 like fd passing from libvirt. [2]

 That would be workable, but it means people trying to run the libvirt
 QEMU command line themselves, would have to remove some args.
 
 Isn't this already the case for chardevs? I can understand not wanting 
 more things like that though.
 
 2) start guests paused, without disks requiring authentication, then
 use the drive_add monitor command to attach them

 This would make disks with authentication somewhat of a special case
 in libvirt, but would be simple to implement, and require no qemu changes.

 This makes it very hard for people to take the libvirt QEMU command line
 and run themselves, since now an entire chunk of it is just missing.
 So I really don't want to go down this route.

 3) start guests paused, then send the secret via a new QMP/HMP
 command (block_set_conf <key> <value>?)

 This is a larger change, but it would be more generally useful for
 changing configuration at runtime.

 I don't think you need to try to solve the problem of a general
 purpose 'set configuration' command here, not least because that
 will likely get you drawn into a huge discussion about qemu device
 configuration in general which will likely never end.

 We already have a 'block_passwd' command for setting qcow2 decryption
 keys. These aren't decryption passwords, rather they are authentication
 passwords, so they're a little different, but I think this command could
 still likely be leveraged for Ceph/iSCSI/etc auth passwords.

 Ideally, we want to cope with having both a decryption & auth password
 for the same block device. E.g., an encrypted qcow2 image accessed over
 HTTP would require both. In this case there are 2 block drivers involved,
 the 'qcow2' driver and the 'http' driver. So perhaps an extra parameter
 for the 'block_password' command to identify which driver the password
 is intended for is the right approach. If omitted, we'd default to 'qcow2'
 for back compat.

 So e.g., for an encrypted qcow2 disk accessed over http

 -drive  file=http://fred@host/my.iso,format=qcow2,id=mydrive

 the app would invoke

{ "execute": "block_password", "argument": { "device": "mydrive",
 "driver": "qcow2",
 "password": "12345" } }
{ "execute": "block_password", "argument": { "device": "mydrive",
 "driver": "curl",
 "password": "7890" } }

 For Ceph/RBD with a plain file, you'd just do


{ "execute": "block_password", "argument": { "device": "mydrive",
 "driver": "rbd",
 "password": "7890" } }

 
 This sounds good to me, although the same driver might use 
 authentication and encryption. Adding another argument to specify 'auth' 
 or 'encryption' would fix this, i.e.:
 
{ "execute": "block_password", "argument": { "device": "mydrive",
 "driver": "qcow2",
 "use": "encryption",
 "password": "12345" } }
 
 I'll prepare a patch if there are no objections to this approach.

This proposed interface solves a problem that is currently purely
theoretical. With blockdev-add and friends, we'll get all of this for
free, so I'm not excited about adding something preliminary now even
though there's no practical need.

For the rbd driver, please use the existing interface that qcow2 uses
for encrypted images.

Kevin

Re: [Qemu-devel] [Qemu-trivial] [PATCH] qed: don't pass NULL to memcpy

2011-10-21 Thread Markus Armbruster

Paolo Bonzini pbonz...@redhat.com writes:

 On 10/20/2011 07:23 PM, Stefan Hajnoczi wrote:
 On Tue, Oct 18, 2011 at 09:17:35PM +0400, Pavel Borzenkov wrote:
 Spotted by Clang Analyzer

 Signed-off-by: Pavel Borzenkovpavel.borzen...@gmail.com
 ---
   block/qed.c |6 --
   1 files changed, 4 insertions(+), 2 deletions(-)

 Thanks, applied to the trivial patches tree:
 http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/trivial-patches

 I think there are other places in the tree where we assume that
 memcpy(dest, NULL, 0); works.

Looks like a fair assumption to me.
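
For readers who prefer not to rely on that assumption: C99 7.21.1p2 says that
pointer arguments must still be valid when the length is zero, so
memcpy(dest, NULL, 0) is formally undefined even though common
implementations tolerate it. A minimal sketch of a guard, using a
hypothetical helper name (memcpy_nullsafe is not a QEMU function), could
look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: skip the library call entirely when there is
 * nothing to copy, so a NULL source never reaches memcpy(). */
static void *memcpy_nullsafe(void *dest, const void *src, size_t n)
{
    if (n == 0) {
        return dest;            /* nothing to do, src may be NULL */
    }
    return memcpy(dest, src, n);
}
```

Whether such a wrapper is worth it, or whether the callers should simply
check the length as the qed patch does, is a matter of taste.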

[Qemu-devel] [PATCH v2 5/6] target-mips: Support for Cavium specific instructions

2011-10-21 Thread khansa

From: Khansa Butt kha...@kics.edu.pk


Signed-off-by: Khansa Butt kha...@kics.edu.pk
Signed-off-by: Ehsan Ul Haq ehsan.ul...@kics.edu.pk
Signed-off-by: Abdul Qadeer qad...@kics.edu.pk
Signed-off-by: Abdul Waheed awah...@kics.edu.pk
---
 target-mips/cpu.h   |7 +
 target-mips/helper.h|5 +
 target-mips/machine.c   |   12 ++
 target-mips/op_helper.c |   73 
 target-mips/translate.c |  429 ++-
 5 files changed, 521 insertions(+), 5 deletions(-)

diff --git a/target-mips/cpu.h b/target-mips/cpu.h
index 79e2558..9180ee9 100644
--- a/target-mips/cpu.h
+++ b/target-mips/cpu.h
@@ -173,6 +173,13 @@ struct TCState {
 target_ulong CP0_TCSchedule;
 target_ulong CP0_TCScheFBack;
 int32_t CP0_Debug_tcstatus;
+/* Multiplier registers for Octeon */
+target_ulong MPL0;
+target_ulong MPL1;
+target_ulong MPL2;
+target_ulong P0;
+target_ulong P1;
+target_ulong P2;
 };
 
 typedef struct CPUMIPSState CPUMIPSState;
diff --git a/target-mips/helper.h b/target-mips/helper.h
index 442f684..7ba5d9f 100644
--- a/target-mips/helper.h
+++ b/target-mips/helper.h
@@ -8,7 +8,12 @@ DEF_HELPER_3(ldl, tl, tl, tl, int)
 DEF_HELPER_3(ldr, tl, tl, tl, int)
 DEF_HELPER_3(sdl, void, tl, tl, int)
 DEF_HELPER_3(sdr, void, tl, tl, int)
+DEF_HELPER_2(v3mulu, tl, tl, tl)
+DEF_HELPER_2(vmulu, tl, tl, tl)
+DEF_HELPER_1(dpop, tl, tl)
 #endif
+DEF_HELPER_1(pop, tl, tl)
+
 DEF_HELPER_3(lwl, tl, tl, tl, int)
 DEF_HELPER_3(lwr, tl, tl, tl, int)
 DEF_HELPER_3(swl, void, tl, tl, int)
diff --git a/target-mips/machine.c b/target-mips/machine.c
index be72b36..a274ce2 100644
--- a/target-mips/machine.c
+++ b/target-mips/machine.c
@@ -25,6 +25,12 @@ static void save_tc(QEMUFile *f, TCState *tc)
     qemu_put_betls(f, &tc->CP0_TCSchedule);
     qemu_put_betls(f, &tc->CP0_TCScheFBack);
     qemu_put_sbe32s(f, &tc->CP0_Debug_tcstatus);
+    qemu_put_betls(f, &tc->MPL0);
+    qemu_put_betls(f, &tc->MPL1);
+    qemu_put_betls(f, &tc->MPL2);
+    qemu_put_betls(f, &tc->P0);
+    qemu_put_betls(f, &tc->P1);
+    qemu_put_betls(f, &tc->P2);
 }
 
 static void save_fpu(QEMUFile *f, CPUMIPSFPUContext *fpu)
@@ -173,6 +179,12 @@ static void load_tc(QEMUFile *f, TCState *tc)
     qemu_get_betls(f, &tc->CP0_TCSchedule);
     qemu_get_betls(f, &tc->CP0_TCScheFBack);
     qemu_get_sbe32s(f, &tc->CP0_Debug_tcstatus);
+    qemu_get_betls(f, &tc->MPL0);
+    qemu_get_betls(f, &tc->MPL1);
+    qemu_get_betls(f, &tc->MPL2);
+    qemu_get_betls(f, &tc->P0);
+    qemu_get_betls(f, &tc->P1);
+    qemu_get_betls(f, &tc->P2);
 }
 
 static void load_fpu(QEMUFile *f, CPUMIPSFPUContext *fpu)
diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 96e40c6..4565d17 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -320,8 +320,81 @@ void helper_dmultu (target_ulong arg1, target_ulong arg2)
 {
     mulu64(&(env->active_tc.LO[0]), &(env->active_tc.HI[0]), arg1, arg2);
 }
+
+static void addc(uint64_t res[], uint64_t a, int i)
+{
+    uint64_t c = res[i];
+    for (; i < 4; i++) {
+        res[i] = c + a;
+        if (res[i] < a) {
+            c = 1;
+            a = res[i+1];
+        } else {
+            break;
+        }
+    }
+}
+
+target_ulong helper_v3mulu(target_ulong arg1, target_ulong arg2)
+{
+    uint64_t hi, lo, res[4];
+    int i;
+    for (i = 0; i < 4; i++) {
+        res[i] = 0;
+    }
+    mulu64(&res[0], &res[1], env->active_tc.MPL0, arg1);
+    mulu64(&lo, &hi, env->active_tc.MPL1, arg1);
+    res[1] = res[1] + lo;
+    if (res[1] < lo) {
+        res[2]++;
+    }
+    res[2] = res[2] + hi;
+    if (res[2] < hi) {
+        res[3]++;
+    }
+    mulu64(&lo, &hi, env->active_tc.MPL2, arg1);
+    res[2] = res[2] + lo;
+    if (res[2] < lo) {
+        res[3]++;
+    }
+    res[3] = res[3] + hi;
+    addc(res, arg2, 0);
+    addc(res, env->active_tc.P0, 0);
+    addc(res, env->active_tc.P1, 1);
+    addc(res, env->active_tc.P2, 2);
+    env->active_tc.P0 = res[1];
+    env->active_tc.P1 = res[2];
+    env->active_tc.P2 = res[3];
+    return res[0];
+}
+
+target_ulong helper_vmulu(target_ulong arg1, target_ulong arg2)
+{
+    uint64_t hi, lo;
+    mulu64(&lo, &hi, env->active_tc.MPL0, arg1);
+    lo = lo + arg2;
+    if (lo < arg2) {
+        hi++;
+    }
+    lo = lo + env->active_tc.P0;
+    if (lo < env->active_tc.P0) {
+        hi++;
+    }
+    env->active_tc.P0 = hi;
+    return lo;
+}
+
+target_ulong helper_dpop(target_ulong arg)
+{
+return ctpop64(arg);
+}
 #endif
 
+target_ulong helper_pop(target_ulong arg)
+{
+return ctpop32((uint32_t)arg);
+}
+
 #ifndef CONFIG_USER_ONLY
 
 static inline target_phys_addr_t do_translate_address(target_ulong address, int rw)
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 0550333..86776a8 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -78,6 +78,11 @@ enum {
     OPC_BGTZL    = (0x17 << 26),
     OPC_JALX     = (0x1D << 26),  /* MIPS 16 only */
     OPC_JALXS    = OPC_JALX | 0x5,
+    /* Cavium Specific branches */
+    OPC_BBIT1    = (0x3a << 26),  /* jump on

[Qemu-devel] [PATCH v2 4/6] target-mips:Octeon cpu definition

2011-10-21 Thread khansa

From: Khansa Butt kha...@kics.edu.pk


Signed-off-by: Khansa Butt kha...@kics.edu.pk
---
 target-mips/mips-defs.h  |2 ++
 target-mips/translate_init.c |   24 
 2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/target-mips/mips-defs.h b/target-mips/mips-defs.h
index bf094a3..e1ec2b2 100644
--- a/target-mips/mips-defs.h
+++ b/target-mips/mips-defs.h
@@ -41,6 +41,7 @@
 #define ASE_MICROMIPS   0x0008
 
 /* Chip specific instructions. */
+#define INSN_OCTEON  0x1000
 #define INSN_LOONGSON2E  0x2000
 #define INSN_LOONGSON2F  0x4000
 #define INSN_VR54XX 0x8000
@@ -53,6 +54,7 @@
 #define CPU_VR54XX  (CPU_MIPS4 | INSN_VR54XX)
 #define CPU_LOONGSON2E  (CPU_MIPS3 | INSN_LOONGSON2E)
 #define CPU_LOONGSON2F  (CPU_MIPS3 | INSN_LOONGSON2F)
+#define CPU_OCTEON  (CPU_MIPS64R2 | INSN_OCTEON)
 
 #define CPU_MIPS5   (CPU_MIPS4 | ISA_MIPS5)
 
diff --git a/target-mips/translate_init.c b/target-mips/translate_init.c
index c39138f..09d2605 100644
--- a/target-mips/translate_init.c
+++ b/target-mips/translate_init.c
@@ -451,6 +451,30 @@ static const mips_def_t mips_defs[] =
 .mmu_type = MMU_TYPE_R4000,
 },
 {
+        .name = "octeon",
+        .CP0_PRid = 0x0d30,
+        .CP0_Config0 = MIPS_CONFIG0 | (0x1 << CP0C0_AR) | (0x2 << CP0C0_AT) |
+                       (MMU_TYPE_R4000 << CP0C0_MT),
+        .CP0_Config1 = MIPS_CONFIG1 | (63 << CP0C1_MMU) |
+                       (2 << CP0C1_IS) | (4 << CP0C1_IL) | (3 << CP0C1_IA) |
+                       (2 << CP0C1_DS) | (4 << CP0C1_DL) | (3 << CP0C1_DA) |
+                       (1 << CP0C1_PC) | (1 << CP0C1_WR) | (1 << CP0C1_EP),
+        .CP0_Config2 = MIPS_CONFIG2,
+        .CP0_Config3 = MIPS_CONFIG3 | (1 << CP0C3_LPA),
+        .CP0_LLAddr_rw_bitmask = 0,
+        .CP0_LLAddr_shift = 0,
+        .SYNCI_Step = 32,
+        .CCRes = 2,
+        .CP0_Status_rw_bitmask = 0x36FB,
+        .CP1_fcr0 = (1 << FCR0_F64) | (1 << FCR0_3D) | (1 << FCR0_PS) |
+                    (1 << FCR0_L) | (1 << FCR0_W) | (1 << FCR0_D) |
+                    (1 << FCR0_S) | (0x00 << FCR0_PRID) | (0x0 << FCR0_REV),
+        .SEGBITS = 49,
+        .PABITS = 49,
+        .insn_flags = CPU_OCTEON | ASE_MIPS3D,
+        .mmu_type = MMU_TYPE_R4000,
+    },
+    {
         .name = "Loongson-2E",
 .CP0_PRid = 0x6302,
 /*64KB I-cache and d-cache. 4 way with 32 bit cache line size*/
-- 
1.7.3.4

[Qemu-devel] [PATCH v2 2/6] target-mips:enabling of 64 bit user mode and floating point operations

2011-10-21 Thread khansa

From: Khansa Butt kha...@kics.edu.pk


Signed-off-by: Khansa Butt kha...@kics.edu.pk
---
 target-mips/translate.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index d5b1c76..0550333 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -12779,6 +12779,8 @@ void cpu_reset (CPUMIPSState *env)
         env->hflags |= MIPS_HFLAG_FPU;
     }
 #ifdef TARGET_MIPS64
+    env->hflags |=  MIPS_HFLAG_UX;
+    env->active_fpu.fcr0 = env->cpu_model->CP1_fcr0;
     if (env->active_fpu.fcr0 & (1 << FCR0_F64)) {
         env->hflags |= MIPS_HFLAG_F64;
     }
-- 
1.7.3.4

Re: [Qemu-devel] [RFC][PATCH 28/45] qemu-kvm: msix: Drop tracking of used vectors

2011-10-21 Thread Jan Kiszka

On 2011-10-21 09:54, Michael S. Tsirkin wrote:
 On Fri, Oct 21, 2011 at 09:09:10AM +0200, Jan Kiszka wrote:
 On 2011-10-21 00:02, Michael S. Tsirkin wrote:
 Yes. But this still makes an API for acquiring per-vector resources a 
 requirement.

 Yes, but a different one than current use/unuse.

 What's wrong with use/unuse as an API? It's already in place
 and virtio calls it.

 Not for that purpose.
 It remains a useless API in the absence of KVM's
 requirements.

 
 Sorry, I don't understand. This can acquire whatever resources
 necessary. It does not seem to make sense to rip it out
 only to add a different one back in.
 

 And it will be an
 optional one, only for those devices that need to establish irq/eventfd
 channels.

 Jan

 Not sure this should be up to the device.

 The device provides the fd. At least it acquires and associates it.

 Jan
 
 It would surely be beneficial to be able to have a uniform
 API so that devices don't need to be recoded to be moved
 in this way.

The point is that the current API is useless for devices that do not
have to declare any vector to the core. By forcing them to call into
that API, we solve no current problem automatically. We rather need
associate_vector_with_x (and the reverse). And that only for devices that
have different backends than user space models.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH v2 0/7] finish coroutinization of drivers

2011-10-21 Thread Kevin Wolf

On 20.10.2011 13:16, Paolo Bonzini wrote:
 Drivers that only implement the bdrv_read and bdrv_write callbacks
 were unwillingly converted to be reentrant when bdrv_read and
 bdrv_write were changed to always create coroutines.  So,
 we need locks around read and write operations.
 
 This series does this (patches 4-6) and removes the flush/discard
 callbacks that, as it turns out, are really duplicates of co_flush
 and co_discard (patches 7-8).
 
 Patches 1-2 are cleanups that I discovered while testing.
 
 v1->v2: rwlock->mutex, convert read-only drivers too, drop vpc change
 
 Paolo Bonzini (7):
   vmdk: fix return values of vmdk_parent_open
   vmdk: clean up open
   block: add a CoMutex to synchronous read drivers
   block: take lock around bdrv_read implementations
   block: take lock around bdrv_write implementations
   block: change flush to co_flush
   block: change discard to co_discard

Thanks, applied all to the block branch.

Kevin

[Qemu-devel] [Qemu-trivial] [PATCH] exec.c: Remove useless comment

2011-10-21 Thread 陳韋任

  phys_ram_size has been removed since QEMU 0.12, so remove the now
useless comment.

Signed-off-by: Chen Wen-Ren che...@iis.sinica.edu.tw
---
 exec.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/exec.c b/exec.c
index d0cbf15..fb21e76 100644
--- a/exec.c
+++ b/exec.c
@@ -472,7 +472,6 @@ static void code_gen_alloc(unsigned long tb_size)
 code_gen_buffer_size = tb_size;
 if (code_gen_buffer_size == 0) {
 #if defined(CONFIG_USER_ONLY)
-/* in user mode, phys_ram_size is not meaningful */
 code_gen_buffer_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
 #else
 /* XXX: needs adjustments */
-- 
1.7.3.4

[Qemu-devel] [PATCH] xen_disk: Always set feature-barrier = 1

2011-10-21 Thread Kevin Wolf

The synchronous .bdrv_flush callback doesn't exist any more and a device really
shouldn't poke into the block layer internals anyway. All drivers are supposed
to have a correctly working bdrv_flush, so let's just hard-code this.

Signed-off-by: Kevin Wolf kw...@redhat.com
---

I'm not sure what feature-barrier really means, but this is the closest thing
to what we used to do. Should this really be dependent on whether or not we are
using a writeback cache mode?

Also, someone should really get rid of that #include "block_int.h" in xen_disk.
Things defined there are not a device's business.

 hw/xen_disk.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index 8a9fac4..286bbac 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -620,7 +620,7 @@ static void blk_alloc(struct XenDevice *xendev)
 static int blk_init(struct XenDevice *xendev)
 {
 struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
-int index, qflags, have_barriers, info = 0;
+int index, qflags, info = 0;
 
 /* read xenstore entries */
     if (blkdev->params == NULL) {
@@ -706,7 +706,6 @@ static int blk_init(struct XenDevice *xendev)
                   blkdev->bs->drv ? blkdev->bs->drv->format_name : "-");
         blkdev->file_size = 0;
     }
-    have_barriers = blkdev->bs->drv && blkdev->bs->drv->bdrv_flush ? 1 : 0;
 
     xen_be_printf(xendev, 1, "type \"%s\", fileproto \"%s\", filename \"%s\","
                   " size %" PRId64 " (%" PRId64 " MB)\n",
@@ -714,7 +713,7 @@ static int blk_init(struct XenDevice *xendev)
                   blkdev->file_size, blkdev->file_size >> 20);
 
     /* fill info */
-    xenstore_write_be_int(&blkdev->xendev, "feature-barrier", have_barriers);
+    xenstore_write_be_int(&blkdev->xendev, "feature-barrier", 1);
     xenstore_write_be_int(&blkdev->xendev, "info",            info);
     xenstore_write_be_int(&blkdev->xendev, "sector-size",     blkdev->file_blk);
     xenstore_write_be_int(&blkdev->xendev, "sectors",
-- 
1.7.6.4

[Qemu-devel] [PATCH] block: Add !qemu_in_coroutine() assertions to synchronous functions

2011-10-21 Thread Kevin Wolf

When adding the locking, we came to the conclusion that converting
read/write/flush/discard to coroutines should be enough because everything else
isn't called in coroutine context. Add assertions to spell this assumption out
and ensure that it won't be broken accidentally.

And even if we have missed converting a valid case, aborting qemu is better
than corrupting images.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.c |   35 +++
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 70aab63..11c7f91 100644
--- a/block.c
+++ b/block.c
@@ -849,6 +849,8 @@ int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res)
 return -ENOTSUP;
 }
 
+assert(!qemu_in_coroutine());
+
 memset(res, 0, sizeof(*res));
 return bs->drv->bdrv_check(bs, res);
 }
@@ -867,6 +869,8 @@ int bdrv_commit(BlockDriverState *bs)
 char filename[1024];
 BlockDriverState *bs_rw, *bs_ro;
 
+assert(!qemu_in_coroutine());
+
 if (!drv)
 return -ENOMEDIUM;
 
@@ -926,6 +930,7 @@ int bdrv_commit(BlockDriverState *bs)
 }
 }
 
+assert(!qemu_in_coroutine());
 if (drv->bdrv_make_empty) {
 ret = drv->bdrv_make_empty(bs);
 bdrv_flush(bs);
@@ -983,6 +988,8 @@ int bdrv_change_backing_file(BlockDriverState *bs,
 {
 BlockDriver *drv = bs->drv;
 
+assert(!qemu_in_coroutine());
+
 if (drv->bdrv_change_backing_file != NULL) {
 return drv->bdrv_change_backing_file(bs, backing_file, backing_fmt);
 } else {
@@ -1323,6 +1330,8 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
 return -EACCES;
 if (bdrv_in_use(bs))
 return -EBUSY;
+
+assert(!qemu_in_coroutine());
 ret = drv->bdrv_truncate(bs, offset);
 if (ret == 0) {
 ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
@@ -1792,6 +1801,8 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
 *pnum = (n < nb_sectors) ? (n) : (nb_sectors);
 return 1;
 }
+
+assert(!qemu_in_coroutine());
 return bs->drv->bdrv_is_allocated(bs, sector_num, nb_sectors, pnum);
 }
 
@@ -2050,12 +2061,16 @@ int bdrv_write_compressed(BlockDriverState *bs, int64_t sector_num,
 set_dirty_bitmap(bs, sector_num, nb_sectors, 1);
 }
 
+assert(!qemu_in_coroutine());
 return drv->bdrv_write_compressed(bs, sector_num, buf, nb_sectors);
 }
 
 int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
 {
 BlockDriver *drv = bs->drv;
+
+assert(!qemu_in_coroutine());
+
 if (!drv)
 return -ENOMEDIUM;
 if (!drv->bdrv_get_info)
@@ -2068,6 +2083,9 @@ int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
   int64_t pos, int size)
 {
 BlockDriver *drv = bs->drv;
+
+assert(!qemu_in_coroutine());
+
 if (!drv)
 return -ENOMEDIUM;
 if (drv->bdrv_save_vmstate)
@@ -2081,6 +2099,9 @@ int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
   int64_t pos, int size)
 {
 BlockDriver *drv = bs->drv;
+
+assert(!qemu_in_coroutine());
+
 if (!drv)
 return -ENOMEDIUM;
 if (drv->bdrv_load_vmstate)
@@ -2149,6 +2170,9 @@ int bdrv_snapshot_create(BlockDriverState *bs,
  QEMUSnapshotInfo *sn_info)
 {
 BlockDriver *drv = bs->drv;
+
+assert(!qemu_in_coroutine());
+
 if (!drv)
 return -ENOMEDIUM;
 if (drv->bdrv_snapshot_create)
@@ -2164,6 +2188,8 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
 BlockDriver *drv = bs->drv;
 int ret, open_ret;
 
+assert(!qemu_in_coroutine());
+
 if (!drv)
 return -ENOMEDIUM;
 if (drv->bdrv_snapshot_goto)
@@ -2187,6 +2213,9 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
 int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
 {
 BlockDriver *drv = bs->drv;
+
+assert(!qemu_in_coroutine());
+
 if (!drv)
 return -ENOMEDIUM;
 if (drv->bdrv_snapshot_delete)
@@ -2200,6 +2229,9 @@ int bdrv_snapshot_list(BlockDriverState *bs,
 QEMUSnapshotInfo **psn_info)
 {
 BlockDriver *drv = bs->drv;
+
+assert(!qemu_in_coroutine());
+
 if (!drv)
 return -ENOMEDIUM;
 if (drv->bdrv_snapshot_list)
@@ -2213,6 +2245,9 @@ int bdrv_snapshot_load_tmp(BlockDriverState *bs,
 const char *snapshot_name)
 {
 BlockDriver *drv = bs->drv;
+
+assert(!qemu_in_coroutine());
+
 if (!drv) {
 return -ENOMEDIUM;
 }
-- 
1.7.6.4

[Qemu-devel] [PATCH] fw_cfg: Use g_file_get_contents instead of multiple fread() calls

2011-10-21 Thread Pavel Borzenkov

Signed-off-by: Pavel Borzenkov pavel.borzen...@gmail.com
---
 hw/fw_cfg.c |  100 ++-
 1 files changed, 37 insertions(+), 63 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 8df265c..d2400f5 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -60,71 +60,55 @@ struct FWCfgState {
 #define JPG_FILE 0
 #define BMP_FILE 1
 
-static FILE *probe_splashfile(char *filename, int *file_sizep, int *file_typep)
+static char *read_splashfile(char *filename, int *file_sizep, int *file_typep)
 {
-FILE *fp = NULL;
-int fop_ret;
-int file_size;
+GError *err = NULL;
+gboolean res;
+gchar *content;
 int file_type = -1;
-unsigned char buf[2] = {0, 0};
-unsigned int filehead_value = 0;
+unsigned int filehead = 0;
 int bmp_bpp;
 
-fp = fopen(filename, "rb");
-if (fp == NULL) {
-error_report("failed to open file '%s'.", filename);
-return fp;
+res = g_file_get_contents(filename, &content, (gsize *)file_sizep, &err);
+if (res == FALSE) {
+error_report("failed to read file '%s'", filename);
+g_error_free(err);
+return NULL;
 }
+
 /* check file size */
-fseek(fp, 0L, SEEK_END);
-file_size = ftell(fp);
-if (file_size < 2) {
-error_report("file size is less than 2 bytes '%s'.", filename);
-fclose(fp);
-fp = NULL;
-return fp;
+if (*file_sizep < 30) {
+error_report("file size is less than 30 bytes '%s'", filename);
+g_free(content);
+return NULL;
 }
+
 /* check magic ID */
-fseek(fp, 0L, SEEK_SET);
-fop_ret = fread(buf, 1, 2, fp);
-if (fop_ret != 2) {
-error_report("Could not read header from '%s': %s",
- filename, strerror(errno));
-fclose(fp);
-fp = NULL;
-return fp;
-}
-filehead_value = (buf[0] + (buf[1] << 8)) & 0xffff;
-if (filehead_value == 0xd8ff) {
+filehead = ((content[0] & 0xff) + (content[1] << 8)) & 0xffff;
+if (filehead == 0xd8ff) {
 file_type = JPG_FILE;
+} else if (filehead == 0x4d42) {
+file_type = BMP_FILE;
 } else {
-if (filehead_value == 0x4d42) {
-file_type = BMP_FILE;
-}
-}
-if (file_type < 0) {
-error_report("'%s' not jpg/bmp file,head:0x%x.",
- filename, filehead_value);
-fclose(fp);
-fp = NULL;
-return fp;
+error_report("'%s' not jpg/bmp file, head:0x%x.", filename, filehead);
+g_free(content);
+return NULL;
 }
+
 /* check BMP bpp */
 if (file_type == BMP_FILE) {
-fseek(fp, 28, SEEK_SET);
-fop_ret = fread(buf, 1, 2, fp);
-bmp_bpp = (buf[0] + (buf[1] << 8)) & 0xffff;
+bmp_bpp = (content[28] + (content[29] << 8)) & 0xffff;
 if (bmp_bpp != 24) {
 error_report("only 24bpp bmp file is supported.");
-fclose(fp);
-fp = NULL;
-return fp;
+g_free(content);
+return NULL;
 }
 }
+
 /* return values */
-*file_sizep = file_size;
 *file_typep = file_type;
-return fp;
+
+return content;
 }
 
 static void fw_cfg_bootsplash(FWCfgState *s)
@@ -132,9 +116,7 @@ static void fw_cfg_bootsplash(FWCfgState *s)
 int boot_splash_time = -1;
 const char *boot_splash_filename = NULL;
 char *p;
-char *filename;
-FILE *fp;
-int fop_ret;
+char *filename, *file_data;
 int file_size;
 int file_type = -1;
 const char *temp;
@@ -174,27 +156,19 @@ static void fw_cfg_bootsplash(FWCfgState *s)
 error_report("failed to find file '%s'.", boot_splash_filename);
 return;
 }
-/* probing the file */
-fp = probe_splashfile(filename, &file_size, &file_type);
-if (fp == NULL) {
+
+/* loading file data */
+file_data = read_splashfile(filename, &file_size, &file_type);
+if (file_data == NULL) {
 g_free(filename);
 return;
 }
-/* loading file data */
 if (boot_splash_filedata != NULL) {
 g_free(boot_splash_filedata);
 }
-boot_splash_filedata = g_malloc(file_size);
+boot_splash_filedata = (uint8_t *)file_data;
 boot_splash_filedata_size = file_size;
-fseek(fp, 0L, SEEK_SET);
-fop_ret = fread(boot_splash_filedata, 1, file_size, fp);
-if (fop_ret != file_size) {
-error_report("failed to read data from '%s'.",
- boot_splash_filename);
-fclose(fp);
-return;
-}
-fclose(fp);
+
 /* insert data */
 if (file_type == JPG_FILE) {
 fw_cfg_add_file(s, "bootsplash.jpg",
-- 
1.7.0.4
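
The header checks in the patch above read two-byte little-endian fields straight out of a char buffer. A minimal sketch of that parsing step follows, assuming nothing about the final fw_cfg code: the names le16_at, classify_splash and the SPLASH_* constants are hypothetical and exist only for illustration. The cast to unsigned char is there because plain char may be signed, in which case bytes of 0x80 and above would otherwise sign-extend before the shift.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not part of hw/fw_cfg.c. */
enum splash_type { SPLASH_UNKNOWN = -1, SPLASH_JPG = 0, SPLASH_BMP = 1 };

/* Read a little-endian 16-bit value at byte offset off.
 * Going through unsigned char avoids sign extension when char is signed. */
static uint16_t le16_at(const char *buf, size_t off)
{
    const unsigned char *p = (const unsigned char *)buf + off;
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Classify a buffer by its magic ID; BMP is accepted only at 24 bpp.
 * The 30-byte minimum covers the bpp field at offsets 28..29. */
static enum splash_type classify_splash(const char *buf, size_t len)
{
    if (len < 30) {
        return SPLASH_UNKNOWN;
    }
    switch (le16_at(buf, 0)) {
    case 0xd8ff:                      /* JPEG SOI marker, bytes ff d8 */
        return SPLASH_JPG;
    case 0x4d42:                      /* "BM" */
        return le16_at(buf, 28) == 24 ? SPLASH_BMP : SPLASH_UNKNOWN;
    default:
        return SPLASH_UNKNOWN;
    }
}
```

A caller would pass the buffer and length obtained from the file read and branch on the returned type.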

Re: [Qemu-devel] [PATCH 11/35] scsi-disk: support READ DVD STRUCTURE

2011-10-21 Thread Kevin Wolf

On 13.10.2011 13:03, Paolo Bonzini wrote:
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/scsi-disk.c |  101 +++-
  1 files changed, 100 insertions(+), 1 deletions(-)
 
 diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
 index 1786c37..14db6a0 100644
 --- a/hw/scsi-disk.c
 +++ b/hw/scsi-disk.c
 @@ -576,10 +576,109 @@ static inline bool media_is_dvd(SCSIDiskState *s)
  return nb_sectors > CD_MAX_SECTORS;
  }
  
 +static inline bool media_is_cd(SCSIDiskState *s)
 +{
 +uint64_t nb_sectors;
 +if (s->qdev.type != TYPE_ROM) {
 +return false;
 +}
 +if (!bdrv_is_inserted(s->bs)) {
 +return false;
 +}
 +bdrv_get_geometry(s->bs, &nb_sectors);
 +return nb_sectors <= CD_MAX_SECTORS;
 +}
 +
  static int scsi_read_dvd_structure(SCSIDiskState *s, SCSIDiskReq *r,
 uint8_t *outbuf)
  {
 -scsi_check_condition(r, SENSE_CODE(INVALID_OPCODE));
 +static const int rds_caps_size[5] = {
 +[0] = 2048 + 4,
 +[1] = 4 + 4,
 +[3] = 188 + 4,
 +[4] = 2048 + 4,
 +};
 +
 +uint8_t media = r->req.cmd.buf[1];
 +uint8_t layer = r->req.cmd.buf[6];
 +uint8_t format = r->req.cmd.buf[7];
 +int size = -1;
 +
 +if (s->qdev.type != TYPE_ROM || !bdrv_is_inserted(s->bs)) {
 +return -1;
 +}
 +if (s->tray_open || !bdrv_is_inserted(s->bs)) {
 +scsi_check_condition(r, SENSE_CODE(NO_MEDIUM));
 +return -1;
 +}

You are checking twice for bdrv_is_inserted, which one do you really mean?

Also, format = 0xff should work even without a medium.

 +if (media_is_cd(s)) {
 +scsi_check_condition(r, SENSE_CODE(INCOMPATIBLE_FORMAT));
 +return -1;
 +}
 +if (media != 0) {
 +scsi_check_condition(r, SENSE_CODE(INCOMPATIBLE_FORMAT));
 +return -1;
 +}
 +
 +if (format != 0xff) {
 +if (format >= sizeof(rds_caps_size) / sizeof(rds_caps_size[0])) {

osdep.h has an ARRAY_SIZE() macro.

 +return -1;
 +}
 +size = rds_caps_size[format];
 +memset(outbuf, 0, size);
 +}
 +
 +switch (format) {
 +case 0x00: {
 +/* Physical format information */
 +uint64_t nb_sectors;
 +if (layer != 0)
 +goto fail;

Braces

 +bdrv_get_geometry(s->bs, &nb_sectors);
 +
 +outbuf[4] = 1;   /* DVD-ROM, part version 1 */
 +outbuf[5] = 0xf; /* 120mm disc, minimum rate unspecified */
 +outbuf[6] = 1;   /* one layer, read-only (per MMC-2 spec) */
 +outbuf[7] = 0;   /* default densities */
 +
 +stl_be_p(&outbuf[12], (nb_sectors >> 2) - 1); /* end sector */
 +stl_be_p(&outbuf[16], (nb_sectors >> 2) - 1); /* l0 end sector */
 +break;
 +}
 +
 +case 0x01: /* DVD copyright information, all zeros */
 +break;
 +
 +case 0x03: /* BCA information - invalid field for no BCA info */
 +return -1;
 +
 +case 0x04: /* DVD disc manufacturing information, all zeros */
 +break;
 +
 +case 0xff: { /* List capabilities */
 +int i;
 +size = 4;
 +for (i = 0; i < sizeof(rds_caps_size) / sizeof(rds_caps_size[0]); i++) {

ARRAY_SIZE() again

 +if (!rds_caps_size[i]) {
 +continue;
 +}
 +outbuf[size] = i;
 +outbuf[size + 1] = 0x40; /* Not writable, readable */
 +stw_be_p(&outbuf[size + 2], rds_caps_size[i]);
 +size += 4;
 +}
 +break;
 + }
 +
 +default:
 +return -1;
 +}
 +
 +/* Size of buffer, not including 2 byte size field */
 +stw_be_p(outbuf, size - 2);
 +return size;
 +
 +fail:
  return -1;
  }

There is only one 'goto fail', all other places have a direct return -1.
It would be good to be consistent.

Also, as this is mostly a refactored copy from the ATAPI code, I wonder
what our long-term plan is. At which point will we be able to unify what
we're duplicating right now? Can we share some parts even now?

Kevin
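
To illustrate the ARRAY_SIZE() remarks above, here is a rough standalone sketch of the "list capabilities" loop over the sparse rds_caps_size table. It assumes a local ARRAY_SIZE definition as a stand-in for the one in osdep.h, writes the big-endian fields by hand instead of using stw_be_p, and uses the hypothetical function name list_capabilities; it is not the code that will end up in scsi-disk.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the ARRAY_SIZE() macro mentioned in the review. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Sparse table: index 2 is left at zero, meaning "format not supported". */
static const int rds_caps_size[5] = {
    [0] = 2048 + 4,
    [1] = 4 + 4,
    [3] = 188 + 4,
    [4] = 2048 + 4,
};

/* Hypothetical helper: emit a 4-byte header plus one 4-byte descriptor per
 * supported format, and return the number of bytes used in outbuf. */
static int list_capabilities(uint8_t *outbuf)
{
    int size = 4;
    size_t i;

    for (i = 0; i < ARRAY_SIZE(rds_caps_size); i++) {
        if (!rds_caps_size[i]) {
            continue;                           /* skip unsupported formats */
        }
        outbuf[size] = (uint8_t)i;              /* format code */
        outbuf[size + 1] = 0x40;                /* not writable, readable */
        outbuf[size + 2] = (uint8_t)(rds_caps_size[i] >> 8);
        outbuf[size + 3] = (uint8_t)(rds_caps_size[i] & 0xff);
        size += 4;
    }

    /* Data length, not including the 2-byte length field itself. */
    outbuf[0] = (uint8_t)((size - 2) >> 8);
    outbuf[1] = (uint8_t)((size - 2) & 0xff);
    outbuf[2] = 0;
    outbuf[3] = 0;
    return size;
}
```

With the macro, the bound in the loop and the range check on format stay in sync with the table if entries are added later.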

Re: [Qemu-devel] [PATCH] [v3] hw/arm_gic.c: Fix save/load of irq_target array

2011-10-21 Thread Andreas Färber

On 20.10.2011 12:48, Dmitry Koshelev wrote:
 irq_target array saving/loading is in the wrong loop.
 Version bump.
 
 Signed-off-by: Dmitry Koshelev karaghio...@gmail.com

Acked-by: Andreas Färber andreas.faer...@web.de

Applies cleanly now.

Is there a particular use case that was broken before and works now, or
did this turn up during code review only?

Andreas

 ---
  hw/arm_gic.c |   16 
  1 files changed, 8 insertions(+), 8 deletions(-)
 
 diff --git a/hw/arm_gic.c b/hw/arm_gic.c
 index 83213dd..8dd8742 100644
 --- a/hw/arm_gic.c
 +++ b/hw/arm_gic.c
 @@ -658,9 +658,6 @@ static void gic_save(QEMUFile *f, void *opaque)
  qemu_put_be32(f, s->enabled);
  for (i = 0; i < NUM_CPU(s); i++) {
  qemu_put_be32(f, s->cpu_enabled[i]);
 -#ifndef NVIC
 -qemu_put_be32(f, s->irq_target[i]);
 -#endif
  for (j = 0; j < 32; j++)
  qemu_put_be32(f, s->priority1[j][i]);
  for (j = 0; j < GIC_NIRQ; j++)
 @@ -674,6 +671,9 @@ static void gic_save(QEMUFile *f, void *opaque)
  qemu_put_be32(f, s->priority2[i]);
  }
  for (i = 0; i < GIC_NIRQ; i++) {
 +#ifndef NVIC
 +qemu_put_be32(f, s->irq_target[i]);
 +#endif
  qemu_put_byte(f, s->irq_state[i].enabled);
  qemu_put_byte(f, s->irq_state[i].pending);
  qemu_put_byte(f, s->irq_state[i].active);
 @@ -689,15 +689,12 @@ static int gic_load(QEMUFile *f, void *opaque, int version_id)
  int i;
  int j;
  
 -if (version_id != 1)
 +if (version_id != 2)
  return -EINVAL;
  
  s->enabled = qemu_get_be32(f);
  for (i = 0; i < NUM_CPU(s); i++) {
  s->cpu_enabled[i] = qemu_get_be32(f);
 -#ifndef NVIC
 -s->irq_target[i] = qemu_get_be32(f);
 -#endif
  for (j = 0; j < 32; j++)
  s->priority1[j][i] = qemu_get_be32(f);
  for (j = 0; j < GIC_NIRQ; j++)
 @@ -711,6 +708,9 @@ static int gic_load(QEMUFile *f, void *opaque, int version_id)
  s->priority2[i] = qemu_get_be32(f);
  }
  for (i = 0; i < GIC_NIRQ; i++) {
 +#ifndef NVIC
 +s->irq_target[i] = qemu_get_be32(f);
 +#endif
  s->irq_state[i].enabled = qemu_get_byte(f);
  s->irq_state[i].pending = qemu_get_byte(f);
  s->irq_state[i].active = qemu_get_byte(f);
 @@ -739,5 +739,5 @@ static void gic_init(gic_state *s)
  }
  memory_region_init_io(&s->iomem, &gic_dist_ops, s, "gic_dist", 0x1000);
  gic_reset(s);
 -register_savevm(NULL, "arm_gic", -1, 1, gic_save, gic_load, s);
 +register_savevm(NULL, "arm_gic", -1, 2, gic_save, gic_load, s);
  }


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746, AG Nürnberg

Re: [Qemu-devel] [PATCH 12/35] scsi-disk: report media changed via GET EVENT STATUS NOTIFICATION

2011-10-21 Thread Kevin Wolf

On 13.10.2011 13:03, Paolo Bonzini wrote:
 This adds support for media change notification via the GET EVENT STATUS
 NOTIFICATION command, used by Linux versions 2.6.38 and newer.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com

Looks good, but the ATAPI version of the code is somewhat nicer to read.

Kevin

Re: [Qemu-devel] [RFC][PATCH 28/45] qemu-kvm: msix: Drop tracking of used vectors

2011-10-21 Thread Michael S. Tsirkin

On Fri, Oct 21, 2011 at 11:27:48AM +0200, Jan Kiszka wrote:
 On 2011-10-21 09:54, Michael S. Tsirkin wrote:
  On Fri, Oct 21, 2011 at 09:09:10AM +0200, Jan Kiszka wrote:
  On 2011-10-21 00:02, Michael S. Tsirkin wrote:
  Yes. But this still makes an API for acquiring per-vector resources a 
  requirement.
 
  Yes, but a different one than current use/unuse.
 
  What's wrong with use/unuse as an API? It's already in place
  and virtio calls it.
 
  Not for that purpose.
  It remains a useless API in the absence of KVM's
  requirements.
 
  
  Sorry, I don't understand. This can acquire whatever resources
  necessary. It does not seem to make sense to rip it out
  only to add a different one back in.
  
 
  And it will be an
  optional one, only for those devices that need to establish irq/eventfd
  channels.
 
  Jan
 
  Not sure this should be up to the device.
 
  The device provides the fd. At least it acquires and associates it.
 
  Jan
  
  It would surely be beneficial to be able to have a uniform
  API so that devices don't need to be recoded to be moved
  in this way.
 
 The point is that the current API is useless for devices that do not
 have to declare any vector to the core.

Don't assigned devices want this as well?
They handle 0-address vectors specially, and
this hack absolutely doesn't belong in pci core ...

 By forcing them to call into
 that API, we solve no current problem automatically. We rather need
 associate_vector_with_x (and the reverse). And that only for device that
 have different backends than user space models.
 
 Jan

I'll need to think about this, would prefer this series not
to get blocked on this issue. We more or less agreed
to add _use_all/unuse_all for now?

 -- 
 Siemens AG, Corporate Technology, CT T DE IT 1
 Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 13/35] scsi: move tcq/ndev to SCSIBusOps (now SCSIBusInfo)

2011-10-21 Thread Kevin Wolf

On 13.10.2011 13:03, Paolo Bonzini wrote:
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/esp.c |7 +--
  hw/lsi53c895a.c  |9 ++---
  hw/scsi-bus.c|   27 ---
  hw/scsi-disk.c   |2 +-
  hw/scsi.h|   11 +--
  hw/spapr_vscsi.c |8 +---
  hw/usb-msd.c |7 +--
  7 files changed, 39 insertions(+), 32 deletions(-)
 
 diff --git a/hw/esp.c b/hw/esp.c
 index 697c2c5..d3fb1c6 100644
 --- a/hw/esp.c
 +++ b/hw/esp.c
 @@ -720,7 +720,10 @@ void esp_init(target_phys_addr_t espaddr, int it_shift,
  *dma_enable = qdev_get_gpio_in(dev, 1);
  }
  
 -static const struct SCSIBusOps esp_scsi_ops = {
 +static const struct SCSIBusInfo esp_scsi_info = {
 +.tcq = false,
 +.ndev = ESP_MAX_DEVS,
 +
  .transfer_data = esp_transfer_data,
  .complete = esp_command_complete,
  .cancel = esp_request_cancelled
 @@ -740,7 +743,7 @@ static int esp_init1(SysBusDevice *dev)
  
  qdev_init_gpio_in(&dev->qdev, esp_gpio_demux, 2);
  
 -scsi_bus_new(&s->bus, &dev->qdev, 0, ESP_MAX_DEVS, &esp_scsi_ops);
 +scsi_bus_new(&s->bus, &dev->qdev, &esp_scsi_info);
  return scsi_bus_legacy_handle_cmdline(&s->bus);
  }
  
 diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
 index e077ec0..4eeb496 100644
 --- a/hw/lsi53c895a.c
 +++ b/hw/lsi53c895a.c
 @@ -1686,7 +1686,7 @@ static void lsi_reg_writeb(LSIState *s, int offset, uint8_t val)
  DeviceState *dev;
  int id;
  
 -for (id = 0; id < s->bus.ndev; id++) {
 +for (id = 0; id < LSI_MAX_DEVS; id++) {
  if (s->bus.devs[id]) {
  dev = &s->bus.devs[id]->qdev;
  dev->info->reset(dev);
 @@ -2091,7 +2091,10 @@ static int lsi_scsi_uninit(PCIDevice *d)
  return 0;
  }
  
 -static const struct SCSIBusOps lsi_scsi_ops = {
 +static const struct SCSIBusInfo lsi_scsi_info = {
 +.tcq = true,
 +.ndev = LSI_MAX_DEVS,
 +
  .transfer_data = lsi_transfer_data,
  .complete = lsi_command_complete,
  .cancel = lsi_request_cancelled
 @@ -2118,7 +2121,7 @@ static int lsi_scsi_init(PCIDevice *dev)
  pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->ram_io);
  QTAILQ_INIT(&s->queue);
  
 -scsi_bus_new(&s->bus, &dev->qdev, 1, LSI_MAX_DEVS, &lsi_scsi_ops);
 +scsi_bus_new(&s->bus, &dev->qdev, &lsi_scsi_info);
  if (!dev->qdev.hotplugged) {
  return scsi_bus_legacy_handle_cmdline(&s->bus);
  }
 diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
 index 867b1a8..d9d4e18 100644
 --- a/hw/scsi-bus.c
 +++ b/hw/scsi-bus.c
 @@ -24,14 +24,11 @@ static struct BusInfo scsi_bus_info = {
  static int next_scsi_bus;
  
  /* Create a scsi bus, and attach devices to it.  */
 -void scsi_bus_new(SCSIBus *bus, DeviceState *host, int tcq, int ndev,
 -  const SCSIBusOps *ops)
 +void scsi_bus_new(SCSIBus *bus, DeviceState *host, const SCSIBusInfo *info)
  {
  qbus_create_inplace(&bus->qbus, &scsi_bus_info, host, NULL);
  bus->busnr = next_scsi_bus++;
 -bus->tcq = tcq;
 -bus->ndev = ndev;
 -bus->ops = ops;
 +bus->info = info;
  bus->qbus.allow_hotplug = 1;
  }
  
 @@ -43,12 +40,12 @@ static int scsi_qdev_init(DeviceState *qdev, DeviceInfo *base)
  int rc = -1;
  
  if (dev->id == -1) {
 -for (dev->id = 0; dev->id < bus->ndev; dev->id++) {
 +for (dev->id = 0; dev->id < bus->info->ndev; dev->id++) {
  if (bus->devs[dev->id] == NULL)
  break;
  }
  }
 -if (dev->id >= bus->ndev) {
 +if (dev->id >= bus->info->ndev) {
  error_report("bad scsi device id: %d", dev->id);
  goto err;
  }
 @@ -120,7 +117,7 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
  int res = 0, unit;
  
  loc_push_none(&loc);
 -for (unit = 0; unit < bus->ndev; unit++) {
 +for (unit = 0; unit < bus->info->ndev; unit++) {
  dinfo = drive_get(IF_SCSI, bus->busnr, unit);
  if (dinfo == NULL) {
  continue;
 @@ -265,7 +262,7 @@ static bool scsi_target_emulate_inquiry(SCSITargetReq *r)
  r->buf[2] = 5; /* Version */
  r->buf[3] = 2 | 0x10; /* HiSup, response data format */
  r->buf[4] = r->len - 5; /* Additional Length = (Len - 1) - 4 */
 -r->buf[7] = 0x10 | (r->req.bus->tcq ? 0x02 : 0); /* Sync, TCQ.  */
 +r->buf[7] = 0x10 | (r->req.bus->info->tcq ? 0x02 : 0); /* Sync, TCQ.  */
  memcpy(&r->buf[8], "QEMU    ", 8);
  memcpy(&r->buf[16], "QEMU TARGET     ", 16);
  strncpy((char *) &r->buf[32], QEMU_VERSION, 4);
 @@ -1062,7 +1059,7 @@ void scsi_req_continue(SCSIRequest *req)
  void scsi_req_data(SCSIRequest *req, int len)
  {
  trace_scsi_req_data(req->dev->id, req->lun, req->tag, len);
 -req->bus->ops->transfer_data(req, len);
 +req->bus->info->transfer_data(req, len);
  }
  
  void scsi_req_print(SCSIRequest *req)
 @@ -1121,7 +1118,7 @@ void scsi_req_complete(SCSIRequest *req, int status)
  
  scsi_req_ref(req);

Re: [Qemu-devel] [PATCH 13/35] scsi: move tcq/ndev to SCSIBusOps (now SCSIBusInfo)

2011-10-21 Thread Paolo Bonzini


On 10/21/2011 02:01 PM, Kevin Wolf wrote:


  -static const struct SCSIBusOps vscsi_scsi_ops = {
  +static const struct SCSIBusInfo vscsi_scsi_info = {
  +.tcq = true,
  +.ndev = 63, /* logical unit addressing format */

This is a change from VSCSI_REQ_LIMIT = 24. This may be a bugfix or not
- I don't know the hardware - but it's a matter for a separate patch.


Ok, will leave 24.

Paolo

Re: [Qemu-devel] [PATCH 15/35] scsi: remove devs array from SCSIBus

2011-10-21 Thread Kevin Wolf

On 13.10.2011 13:03, Paolo Bonzini wrote:
 Change the devs array into a linked list, and add a scsi_device_find
 function to navigate the children list instead.  This lets the SCSI
 bus use more complex addressing.
 
 scsi_device_find may return another LUN on the same target if none is
 found that matches exactly.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/esp.c |5 +++--
  hw/lsi53c895a.c  |   22 +++---
  hw/qdev.h|2 +-
  hw/scsi-bus.c|   53 ++---
  hw/scsi.h|3 +--
  hw/spapr_vscsi.c |   14 ++
  6 files changed, 48 insertions(+), 51 deletions(-)
 
 diff --git a/hw/esp.c b/hw/esp.c
 index d3fb1c6..8e17005 100644
 --- a/hw/esp.c
 +++ b/hw/esp.c
 @@ -217,7 +217,8 @@ static uint32_t get_cmd(ESPState *s, uint8_t *buf)
  s->async_len = 0;
  }
  
 -if (target >= ESP_MAX_DEVS || !s->bus.devs[target]) {
 +s->current_dev = scsi_device_find(&s->bus, target, 0);
 +if (!s->current_dev) {
  // No such drive
  s->rregs[ESP_RSTAT] = 0;
  s->rregs[ESP_RINTR] = INTR_DC;
 @@ -225,7 +226,6 @@ static uint32_t get_cmd(ESPState *s, uint8_t *buf)
  esp_raise_irq(s);
  return 0;
  }
 -s->current_dev = s->bus.devs[target];
  return dmalen;
  }
  
 @@ -236,6 +236,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t busid)
  
  trace_esp_do_busid_cmd(busid);
  lun = busid & 7;
 +s->current_dev = scsi_device_find(&s->bus, s->current_dev->id, lun);

This is new, and I can't see an explanation in the commit log.

Kevin

Re: [Qemu-devel] [PATCH 11/35] scsi-disk: support READ DVD STRUCTURE

2011-10-21 Thread Paolo Bonzini


On 10/21/2011 01:42 PM, Kevin Wolf wrote:

  +if (s->qdev.type != TYPE_ROM || !bdrv_is_inserted(s->bs)) {
  +return -1;
  +}
  +if (s->tray_open || !bdrv_is_inserted(s->bs)) {
  +scsi_check_condition(r, SENSE_CODE(NO_MEDIUM));
  +return -1;
  +}

You are checking twice for bdrv_is_inserted, which one do you really mean?


The first is bogus.


Also, format = 0xff should work even without a medium.


Will move the tray_open/bdrv_is_inserted/media_is_cd tests within the 
if (format != 0xff).



  +if (media_is_cd(s)) {
  +scsi_check_condition(r, SENSE_CODE(INCOMPATIBLE_FORMAT));
  +return -1;
  +}
  +if (media != 0) {
  +scsi_check_condition(r, SENSE_CODE(INCOMPATIBLE_FORMAT));
  +return -1;
  +}


media != 0 should return INVALID_FIELD too.

Paolo

Re: [Qemu-devel] [PATCH 11/35] scsi-disk: support READ DVD STRUCTURE

2011-10-21 Thread Paolo Bonzini


On 10/21/2011 01:42 PM, Kevin Wolf wrote:

There is only one 'goto fail', all other places have a direct return -1.
It would be good to be consistent.

Also, as this is mostly a refactored copy from the ATAPI code, I wonder
what our long-term plan is. At which point will we be able to unify what
we're duplicating right now? Can we share some parts even now?


That's a tricky question.  I think there are three choices:

1) use SCSI as the sole interface exposed by the block layer (with an 
adaptor).  There would be a common implementation of SCSI for 
SCSI-oblivious devices, and other devices (hdev, sg, iscsi) could just 
reason in terms of SCSI.  You could stack the common implementations 
(hard drive and CD-ROM) on top of hdev/iscsi or use passthrough.  This 
however is wrong IMHO because some bits of SCSI code really do deal with 
guest state, for example the tray.


2) let ide-cd create its own SCSI bus and act as an adaptor, similar to 
USB devices.  There would still be duplication for commands that do DMA 
in multiple steps; I think READ CD is the only one.


3) create a separate API just for the purpose of sharing code between 
ATAPI and SCSI (your "can we share some parts even now", basically).



I think I'm leaning towards (3), but I don't think it makes sense to do 
it now unless someone is interested in implementing for example CD 
burning support.  However, I'm leaning towards that also because I 
honestly have no idea how hard (2) would be.


Paolo

Re: [Qemu-devel] [PATCH 15/35] scsi: remove devs array from SCSIBus

2011-10-21 Thread Paolo Bonzini


On 10/21/2011 02:31 PM, Kevin Wolf wrote:

  diff --git a/hw/esp.c b/hw/esp.c
  index d3fb1c6..8e17005 100644
  --- a/hw/esp.c
  +++ b/hw/esp.c
  @@ -217,7 +217,8 @@ static uint32_t get_cmd(ESPState *s, uint8_t *buf)
s->async_len = 0;
}

  -if (target >= ESP_MAX_DEVS || !s->bus.devs[target]) {
  +s->current_dev = scsi_device_find(&s->bus, target, 0);
  +if (!s->current_dev) {
// No such drive
s->rregs[ESP_RSTAT] = 0;
s->rregs[ESP_RINTR] = INTR_DC;
  @@ -225,7 +226,6 @@ static uint32_t get_cmd(ESPState *s, uint8_t *buf)
esp_raise_irq(s);
return 0;
}
  -s->current_dev = s->bus.devs[target];
return dmalen;
}

  @@ -236,6 +236,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t busid)

trace_esp_do_busid_cmd(busid);
lun = busid & 7;
  +s->current_dev = scsi_device_find(&s->bus, s->current_dev->id, lun);

This is new, and I can't see an explanation in the commit log.


It isn't really new; up until now the lun was hard-coded to zero and so 
s->current_dev could be set in get_cmd.  Now we have to delay it until 
do_busid_cmd because we actually have a place to pass the LUN.


That said, s->current_dev is never really used outside do_busid_cmd, so 
I can instead do something like:


-   s->current_req = scsi_req_new(s->current_dev, 0, lun, buf, NULL);
+   SCSIDevice *current_lun;
+
+   current_lun = scsi_device_find(&s->bus, 0, s->current_dev->id, lun);
+   s->current_req = scsi_req_new(current_lun, 0, lun, buf, NULL);

That would be the same as the code I have above.

Paolo
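
For readers following the devs-array-to-list change, the lookup described in the commit message ("may return another LUN on the same target if none is found that matches exactly") could look roughly like the sketch below. This is an assumption-laden illustration, not the scsi-bus.c implementation: struct sketch_dev and sketch_device_find are hypothetical names, a plain singly linked list replaces the qdev children list, and returning the first device seen on the target as the fallback is a guess at the policy.

```c
#include <stddef.h>

/* Hypothetical stand-in for a SCSI device on the bus's children list. */
struct sketch_dev {
    int id;                     /* target id */
    int lun;                    /* logical unit number */
    struct sketch_dev *next;
};

/* Walk the list: an exact (id, lun) match wins; otherwise fall back to the
 * first device seen on the same target; NULL if the target has no devices. */
static struct sketch_dev *sketch_device_find(struct sketch_dev *head,
                                             int id, int lun)
{
    struct sketch_dev *fallback = NULL;
    struct sketch_dev *d;

    for (d = head; d != NULL; d = d->next) {
        if (d->id != id) {
            continue;
        }
        if (d->lun == lun) {
            return d;
        }
        if (fallback == NULL) {
            fallback = d;
        }
    }
    return fallback;
}
```

A NULL result then plays the role of the old "target >= ESP_MAX_DEVS || !devs[target]" check, while a non-exact match leaves it to the device to report the missing LUN.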

Re: [Qemu-devel] [Question] dump memory when host pci device is used by guest

2011-10-21 Thread Jan Kiszka

On 2011-10-21 15:02, Dave Anderson wrote:
 
 
 - Original Message -
 At 10/21/2011 03:11 PM, Jan Kiszka Write:
 On 2011-10-20 12:03, Wen Congyang wrote:
 At 10/20/2011 05:41 PM, Jan Kiszka Write:
 On 2011-10-20 03:22, Wen Congyang wrote:
 I didn't read the full story, but 'crash' has been used for investigating
 kernel cores generated by kdump for several years. Considering the support
 service guys, virsh dump should support a format for crash because they
 can't work well at investigating a vmcore with gdb.

 crash has several features useful for them, such as 'show kernel log',
 'focus on a cpu', 'for-each-task', 'for-each-vma', 'extract ftrace log', etc.

 Anyway, if someone who is not a developer of qemu/kvm has to learn 2 tools
 for investigating kernel dumps, it sounds harmful.

 Right, that's why everything (live debugging & crash analysis) should be
 consolidated in the long run over gdb. crash is architecturally obsolete
 today - not saying it is useless!

 I do not know why crash is obsolete today. Is there a new, better tool to
 replace crash?

 I'm not aware of equally powerful (python) scripts for gdb as
 replacement, but I think it's worth starting a porting effort at
 some point.


 At least, I always use crash for live debugging & crash analysis.

 Then you may answer some questions to me:
  - Can you attach to a remote target (kgdb, qemu, etc.) and how?

 No. crash's live debugging only works while the kernel is live. I can use
 it to get some variable's value, or some other information from the kernel.
 If the kernel panics, we can use gdb to attach to a remote target as you
 said. But on an end user's machine we cannot do that; we should dump the
 memory into a file and analyze it on another machine while the end user's
 guest can be restarted.

  - Can you use it with latest gdb versions or is the gdb functionality
hard-wired due to an embedded gdb core in crash (that's how I
understood Christoph's reply to this topic)

 If I use crash, I cannot use the latest gdb versions. Do we always need to
 use the latest gdb versions? Currently, gdb-7.0 is embedded into crash, and
 it is enough for me. If the gdb embedded into crash cannot analyze the
 vmcore, I think we can update it and rebuild crash.

 crash is simply designed the wrong way around (from today's
 perspective): it should augment upstream gdb instead of forking it.

 Cc Dave Anderson. He knows how crash uses gdb.

 I think that crash does not fork a task to execute gdb, and gdb is a
 part of crash.
 
 I'm not sure what the question is, but you can consider crash as a huge
 wrapper around its embedded gdb, which it invokes as "gdb vmlinux", and
 then takes over the user interface.  It doesn't have a clue as to what
 the memory source is, i.e., whether it's one of the almost 20 different
 dumpfile formats that it supports (including virsh dump), or if it's
 running against a live system.  It has its own command set, although
 you can enter some gdb commands, write gdb scripts, etc.  But the main
 purpose of the embedded gdb is for the crash-level sources to be able
 to gather data structure information, disassemble text, add-symbol-file
 kernel modules, and so on.  There is no kgdb remote linkage. 
 
 It's currently embedding gdb-7.0, although as we speak I'm updating it
 to gdb-7.3.1 because the compiler guys have decided that dwarf4 should be
 used by default.
 
 It would be kind of cool if there was a /dev/mem-like interface
 to a KVM guest's physical memory, so that you could sit on a KVM host
 and enter "crash vmlinux-of-guest /dev/mem-of-guest" in order to
 run live analysis of a guest.
 
 Anyway, sorry if it doesn't meet your needs...

Yes, I'd prefer to have the added value of crash available with standard
gdb, specifically to reuse it for remote target debugging which is
fairly common in embedded scenarios and for guest debugging (via
qemu/kvm). I do not yet see that there is anything preventing this
except that it needs to be done - or collected from previous work:
there should be, e.g., dozens of gdb scripts circling around that
automate kernel module symbol loading with gdb.

We do have a proper remote debugging interface already, no need to
invent a new one for crash-on-qemu. We do lack complete x86 system-level
debugging support, but that's an absolutely generic gdb problem,
independent of Linux as a debugging target. And, AFAIK, we do not have
such issues with common non-x86 targets.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 11/35] scsi-disk: support READ DVD STRUCTURE

2011-10-21 Thread Kevin Wolf

On 21.10.2011 15:12, Paolo Bonzini wrote:
 On 10/21/2011 01:42 PM, Kevin Wolf wrote:
 There is only one 'goto fail', all other places have a direct return -1.
 It would be good to be consistent.

 Also, as this is mostly a refactored copy from the ATAPI code, I wonder
 what our long-term plan is. At which point will we be able to unify what
 we're duplicating right now? Can we share some parts even now?
 
 That's a tricky question.  I think there are three choices:
 
 1) use SCSI as the sole interface exposed by the block layer (with an 
 adaptor).  There would be a common implementation of SCSI for 
 SCSI-oblivious devices, and other devices (hdev, sg, iscsi) could just 
 reason in terms of SCSI.  You could stack the common implementations 
 (hard drive and CD-ROM) on top of hdev/iscsi or use passthrough.  This 
 however is wrong IMHO because some bits of SCSI code really do deal with 
 guest state, for example the tray.
 
 2) let ide-cd create its own SCSI bus and act as an adaptor, similar to 
 USB devices.  There would still be duplication for commands that do DMA 
 in multiple steps; I think READ CD is the only one.
 
 3) create a separate API just for the purpose of sharing code between 
 ATAPI and SCSI (your "can we share some parts even now", basically).
 
 
 I think I'm leaning towards (3), but I don't think it makes sense to do 
 it now unless someone is interested in implementing for example CD 
 burning support.  However, I'm leaning towards that also because I 
 honestly have no idea how hard (2) would be.

Which gives me the impression that your feeling is (as well as mine)
that (2) would give us the nicer result and is probably the Right Thing
to do long-term.

Though at the same time I agree that I don't have an idea of how hard
this would be and if it would be worth the effort. And with the current
qdev that doesn't allow device composition it might even get really ugly.

It's a hard question, but ignoring it is probably not a solution.

Kevin

[Qemu-devel] [PATCH] Mark future contributions to GPLv2-only files as GPLv2+

2011-10-21 Thread Paolo Bonzini

Even for files that are licensed GPLv2-only, let's not play catch with
ourselves, and explicitly declare that future contributions to those
files will also be available as any later version.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 aio.c  |2 ++
 block-migration.c  |2 ++
 block/raw-posix-aio.h  |2 ++
 block/rbd.c|2 ++
 block/sheepdog.c   |3 +++
 buffered_file.c|2 ++
 compatfd.c |2 ++
 hmp.c  |2 ++
 hw/ac97.c  |3 +++
 hw/acpi.c  |3 +++
 hw/acpi_piix4.c|3 +++
 hw/ads7846.c   |3 +++
 hw/apm.c   |3 +++
 hw/bitbang_i2c.c   |3 +++
 hw/bonito.c|3 +++
 hw/collie.c|3 +++
 hw/ds1338.c|3 +++
 hw/ecc.c   |3 +++
 hw/event_notifier.c|3 +++
 hw/framebuffer.c   |3 +++
 hw/gumstix.c   |3 +++
 hw/ivshmem.c   |3 +++
 hw/kvmclock.c  |2 ++
 hw/lan9118.c   |3 +++
 hw/mainstone.c |3 +++
 hw/marvell_88w8618_audio.c |3 +++
 hw/max111x.c   |3 +++
 hw/mips_fulong2e.c |3 +++
 hw/msix.c  |3 +++
 hw/mst_fpga.c  |3 +++
 hw/musicpal.c  |3 +++
 hw/nand.c  |3 +++
 hw/pl031.c |2 ++
 hw/pxa2xx_keypad.c |3 +++
 hw/pxa2xx_lcd.c|3 +++
 hw/pxa2xx_mmci.c   |3 +++
 hw/pxa2xx_pcmcia.c |3 +++
 hw/smbios.c|2 ++
 hw/spitz.c |3 +++
 hw/ssi-sd.c|3 +++
 hw/ssi.c   |3 +++
 hw/strongarm.c |3 +++
 hw/tc6393xb.c  |3 +++
 hw/tosa.c  |3 +++
 hw/vexpress.c  |3 +++
 hw/vhost.c |3 +++
 hw/vhost_net.c |3 +++
 hw/virtio-pci.c|2 ++
 hw/virtio-serial-bus.c |3 +++
 hw/vt82c686.c  |3 +++
 hw/xen_backend.c   |3 +++
 hw/xen_disk.c  |3 +++
 hw/xen_nic.c   |3 +++
 hw/z2.c|3 +++
 iov.c  |3 +++
 memory.c   |2 ++
 migration-exec.c   |2 ++
 migration-fd.c |2 ++
 migration-tcp.c|2 ++
 migration-unix.c   |2 ++
 migration.c|2 ++
 module.c   |2 ++
 net/checksum.c |3 +++
 notify.c   |2 ++
 pflib.c|2 ++
 posix-aio-compat.c |2 ++
 qemu-tool.c|2 ++
 qmp.c  |2 ++
 roms/SLOF  |2 +-
 xen-all.c  |2 ++
 xen-mapcache.c |2 ++
 xen-stub.c |2 ++
 72 files changed, 188 insertions(+), 1 deletions(-)

diff --git a/aio.c b/aio.c
index 1239ca7..c6f3cb1 100644
--- a/aio.c
+++ b/aio.c
@@ -9,6 +9,8 @@
  * This work is licensed under the terms of the GNU GPL, version 2.  See
  * the COPYING file in the top-level directory.
  *
+ * Contributions after 2011-10-25 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
  */
 
 #include "qemu-common.h"
diff --git a/block-migration.c b/block-migration.c
index 0bff075..32c2eea 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -9,6 +9,8 @@
  * This work is licensed under the terms of the GNU GPL, version 2.  See
  * the COPYING file in the top-level directory.
  *
+ * Contributions after 2011-10-25 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
  */
 
 #include "qemu-common.h"
diff --git a/block/raw-posix-aio.h b/block/raw-posix-aio.h
index dfc63b8..d6d7275 100644
--- a/block/raw-posix-aio.h
+++ b/block/raw-posix-aio.h
@@ -9,6 +9,8 @@
  * This work is licensed under the terms of the GNU GPL, version 2.  See
  * the COPYING file in the top-level directory.
  *
+ * Contributions after 2011-10-25 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
  */
 #ifndef QEMU_RAW_POSIX_AIO_H
 #define QEMU_RAW_POSIX_AIO_H
diff --git a/block/rbd.c b/block/rbd.c
index 3068c82..b726c80 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -7,6 +7,8 @@
  * This work is licensed under the terms of the GNU GPL, version 2.  See
  * the COPYING file in the top-level directory.
  *
+ * Contributions after 2011-10-25 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
  */
 
 #include <inttypes.h>
diff --git a/block/sheepdog.c b/block/sheepdog.c
index ae857e2..d69795a 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -7,6 +7,9 @@
  *
  * You should have received a copy of the GNU General

Re: [Qemu-devel] [PATCH] Mark future contributions to GPLv2-only files as GPLv2+

2011-10-21 Thread Anthony Liguori


On 10/21/2011 09:03 AM, Paolo Bonzini wrote:

Even for files that are licensed GPLv2-only, let's not play catch with
ourselves, and explicitly declare that future contributions to those
files will also be available as any later version.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
diff --git a/roms/SLOF b/roms/SLOF
index d1d6b53..b94bde0 160000
--- a/roms/SLOF
+++ b/roms/SLOF
@@ -1 +1 @@
-Subproject commit d1d6b53b713a2b7c2c25685268fa932d28a4b4c0
+Subproject commit b94bde008b0d49ec4bfe933e110d0952d032ac28


I think you made a mistake here.

Otherwise I'm a bit concerned about ambiguity here.  Let's say we have to 
backport a fix to stable, we need to pull in this new copyright statement.


But then what if we later discovered we need to pull in a fix from before 10/25?
That will appear in the stable tree as a post-10/25 commit but it carries a 
GPLv2 only license.


I think a per-file flag day is really the only sane approach to this.

Regards,

Anthony Liguori


diff --git a/xen-all.c b/xen-all.c
index b5e28ab..4d6bf1a 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -4,6 +4,8 @@
   * This work is licensed under the terms of the GNU GPL, version 2.  See
   * the COPYING file in the top-level directory.
   *
+ * Contributions after 2011-10-25 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
   */

  #include <sys/mman.h>
diff --git a/xen-mapcache.c b/xen-mapcache.c
index 7bcb86e..f66bc60 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -4,6 +4,8 @@
   * This work is licensed under the terms of the GNU GPL, version 2.  See
   * the COPYING file in the top-level directory.
   *
+ * Contributions after 2011-10-25 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
   */

  #include "config.h"
diff --git a/xen-stub.c b/xen-stub.c
index efe2ab5..d713750 100644
--- a/xen-stub.c
+++ b/xen-stub.c
@@ -4,6 +4,8 @@
   * This work is licensed under the terms of the GNU GPL, version 2.  See
   * the COPYING file in the top-level directory.
   *
+ * Contributions after 2011-10-25 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
   */

  #include "qemu-common.h"

Re: [Qemu-devel] [Question] dump memory when host pci device is used by guest

2011-10-21 Thread Richard W.M. Jones

On Fri, Oct 21, 2011 at 09:02:37AM -0400, Dave Anderson wrote:
 It would be kind of cool if there was a /dev/mem-like interface
 to a KVM guest's physical memory, so that you could sit on a KVM host
 and enter "crash vmlinux-of-guest /dev/mem-of-guest" in order to
 run live analysis of a guest.

OT for this thread, but this sort of thing does exist, kind of.

You can send monitor commands to qemu to read physical and virtual
memory (pmemsave and memsave respectively).  At least one, and
possibly now both of these are bound through libvirt APIs:

http://libvirt.org/html/libvirt-libvirt.html#virDomainMemoryPeek

Here was your previous response about 4 years ago:

https://www.redhat.com/archives/crash-utility/2008-August/msg00032.html

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

Re: [Qemu-devel] [PATCH] [v3] hw/arm_gic.c: Fix save/load of irq_target array

2011-10-21 Thread Andreas Färber

On 21.10.2011 15:58, Dmitry Koshelev wrote:
 On Fri, Oct 21, 2011 at 3:42 PM, Andreas Färber afaer...@suse.de wrote:
 On 20.10.2011 12:48, Dmitry Koshelev wrote:
 irq_target array saving/loading is in the wrong loop.
 Version bump.

 Signed-off-by: Dmitry Koshelev karaghio...@gmail.com

 Acked-by: Andreas Färber andreas.faer...@web.de

Ah sorry, habits, should've been:

Acked-by: Andreas Färber afaer...@suse.de

 Is there a particular use case that was broken before and works now, or
 did this turn up during code review only?
 
 There is a use case but it's complicated and involves proprietary software.

I see. ;) Was just wondering which -M were affected.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746, AG Nürnberg

Re: [Qemu-devel] [PATCH v2 5/6] target-mips: Support for Cavium specific instructions

2011-10-21 Thread Richard Henderson

On 10/21/2011 01:17 AM, kha...@kics.edu.pk wrote:
 +switch (opc) {
 +case OPC_SEQ:
 +tcg_gen_setcondi_tl(TCG_COND_LTU, cpu_gpr[rd], t0, 1);
 +opn = "seq";
 +break;
 +case OPC_SNE:
 +tcg_gen_setcondi_tl(TCG_COND_GTU, cpu_gpr[rd], t0, 0);
 +opn = "sne";

If you keep posting the same un-fixed patch set, pretty soon
no one's going to pay any attention to you whatsoever.


r~

Re: [Qemu-devel] [PATCH] Mark future contributions to GPLv2-only files as GPLv2+

2011-10-21 Thread Paolo Bonzini


On 10/21/2011 04:11 PM, Anthony Liguori wrote:


Otherwise I'm a bit concerned about ambiguity here.  Let's say we have
to backport a fix to stable, we need to pull in this new copyright
statement.

But then what if we later discovered we need to pull in a fix from
before 10/25.  That will appear in the stable tree as a post-10/25
commit but it carries a GPLv2 only license.


You will never need to include this patch on 0.15 and earlier stable 
branches.


It is legal to take GPLv2+ contributions and restrict them to 
GPLv2-only.  Backporting is distributing, and a distributor can choose 
under which license he does so.  So there should be no problem with 
stable backports, whoever does the backports is implicitly restricting 
the licensing to GPLv2-only.


In fact, the text is just there to inform new contributors of the 
license.  Perhaps just changing the wording satisfies you, like "By 
signing off changes to this file after 10/25 you agree that the file 
may be relicensed under GPLv2+ in the future"?



I think a per-file flag day is really the only sane approach to this.


We need to make it clear right now that, from now on, GPLv3-incompatible 
changes will not be accepted.


Paolo

Re: [Qemu-devel] [PATCH] Mark future contributions to GPLv2-only files as GPLv2+

2011-10-21 Thread Kevin Wolf

On 21.10.2011 16:11, Anthony Liguori wrote:
 On 10/21/2011 09:03 AM, Paolo Bonzini wrote:
 Even for files that are licensed GPLv2-only, let's not play catch with
 ourselves, and explicitly declare that future contributions to those
 files will also be available as any later version.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 diff --git a/roms/SLOF b/roms/SLOF
 index d1d6b53..b94bde0 160000
 --- a/roms/SLOF
 +++ b/roms/SLOF
 @@ -1 +1 @@
 -Subproject commit d1d6b53b713a2b7c2c25685268fa932d28a4b4c0
 +Subproject commit b94bde008b0d49ec4bfe933e110d0952d032ac28
 
 I think you made a mistake here.
 
 Otherwise I'm a bit concerned about ambiguity here.  Let's say we have to 
 backport a fix to stable, we need to pull in this new copyright statement.
 
 But then what if we later discovered we need to pull in a fix from before 
 10/25. 
   That will appear in the stable tree as a post-10/25 commit but it carries a 
 GPLv2 only license.
 
 I think a per-file flag day is really the only sane approach to this.

I don't think any part of this patch should be pulled into stable. When
backporting a new fix (which is basically dual GPLv2 and GPLv3), we can
choose which of the offered licenses to use. IANAL, but nothing should
stop us from only taking the GPLv2 option.

Am I misunderstanding something here?

Kevin

[Qemu-devel] [PATCH v2 2/4] Add access control support to qemu bridge helper

2011-10-21 Thread Corey Bryant

We go to great lengths to restrict ourselves to just cap_net_admin as an OS
enforced security mechanism.  However, we further restrict what we allow users
to do to simply adding a tap device to a bridge interface by virtue of the fact
that this is the only functionality we expose.

This is not good enough though.  An administrator is likely to want to restrict
the bridges that an unprivileged user can access, in particular, to restrict
an unprivileged user from putting a guest on what should be isolated networks.

This patch implements an ACL mechanism that is enforced by qemu-bridge-helper.
The ACLs are fairly simple whitelist/blacklist mechanisms with a wildcard of
'all'.  All users are blacklisted by default, and deny takes precedence over
allow.

An interesting feature of this ACL mechanism is that you can include external
ACL files.  The main reason to support this is so that you can set different
file system permissions on those external ACL files.  This allows an
administrator to implement rather sophisticated ACL policies based on user/group
policies via the file system.

As an example:

/etc/qemu/bridge.conf root:qemu 0640

 allow br0
 include /etc/qemu/alice.conf
 include /etc/qemu/bob.conf
 include /etc/qemu/charlie.conf

/etc/qemu/alice.conf root:alice 0640
 allow br1

/etc/qemu/bob.conf root:bob 0640
 allow br2

/etc/qemu/charlie.conf root:charlie 0640
 deny all

This ACL pattern allows any user in the qemu group to get a tap device
connected to br0 (which is bridged to the physical network).

Users in the alice group can additionally get a tap device connected to br1.
This allows br1 to act as a private bridge for the alice group.

Users in the bob group can additionally get a tap device connected to br2.
This allows br2 to act as a private bridge for the bob group.

Users in the charlie group cannot get a tap device connected to any bridge.

Under no circumstance can the bob group get access to br1 or can the alice
group get access to br2.  And under no circumstance can the charlie group
get access to any bridge.
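
The precedence rules described above can be sketched as a small decision
function.  This is only an illustration, not the code from the patch: the
names acl_kind, acl_rule and acl_permits are hypothetical, and the sketch
assumes that all rules readable by the calling user have already been
collected into one array.

```c
#include <string.h>

/* Hypothetical rule kinds for illustration, modelled on the enum in the patch. */
enum acl_kind { ACL_K_ALLOW, ACL_K_ALLOW_ALL, ACL_K_DENY, ACL_K_DENY_ALL };

struct acl_rule {
    enum acl_kind kind;
    char iface[16];     /* bridge name; ignored for the *_ALL kinds */
};

/*
 * Returns 1 if `bridge` may be used and 0 otherwise.
 * Nothing is allowed by default, and any matching deny rule
 * overrides every allow rule, regardless of rule order.
 */
int acl_permits(const struct acl_rule *rules, int count, const char *bridge)
{
    int allowed = 0;
    int denied = 0;
    int i;

    for (i = 0; i < count; i++) {
        switch (rules[i].kind) {
        case ACL_K_ALLOW_ALL:
            allowed = 1;
            break;
        case ACL_K_ALLOW:
            if (strcmp(rules[i].iface, bridge) == 0) {
                allowed = 1;
            }
            break;
        case ACL_K_DENY_ALL:
            denied = 1;
            break;
        case ACL_K_DENY:
            if (strcmp(rules[i].iface, bridge) == 0) {
                denied = 1;
            }
            break;
        }
    }

    return allowed && !denied;
}
```

With the example files above, a user who is in both the qemu and charlie
groups ends up with the rules "allow br0" and "deny all", so the sketch
rejects br0 for that user, which matches the behaviour described for the
charlie group.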

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Richa Marwaha rmar...@linux.vnet.ibm.com
Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com
---
 qemu-bridge-helper.c |  141 ++
 1 files changed, 141 insertions(+), 0 deletions(-)

diff --git a/qemu-bridge-helper.c b/qemu-bridge-helper.c
index 2ce82fb..db257d5 100644
--- a/qemu-bridge-helper.c
+++ b/qemu-bridge-helper.c
@@ -33,6 +33,105 @@
 
 #include "net/tap-linux.h"
 
+#define MAX_ACLS (128)
+#define DEFAULT_ACL_FILE CONFIG_QEMU_CONFDIR "/bridge.conf"
+
+enum {
+    ACL_ALLOW = 0,
+    ACL_ALLOW_ALL,
+    ACL_DENY,
+    ACL_DENY_ALL,
+};
+
+typedef struct ACLRule {
+    int type;
+    char iface[IFNAMSIZ];
+} ACLRule;
+
+static int parse_acl_file(const char *filename, ACLRule *acls, int *pacl_count)
+{
+    int acl_count = *pacl_count;
+    FILE *f;
+    char line[4096];
+
+    f = fopen(filename, "r");
+    if (f == NULL) {
+        return -1;
+    }
+
+    while (acl_count != MAX_ACLS &&
+            fgets(line, sizeof(line), f) != NULL) {
+        char *ptr = line;
+        char *cmd, *arg, *argend;
+
+        while (isspace(*ptr)) {
+            ptr++;
+        }
+
+        /* skip comments and empty lines */
+        if (*ptr == '#' || *ptr == 0) {
+            continue;
+        }
+
+        cmd = ptr;
+        arg = strchr(cmd, ' ');
+        if (arg == NULL) {
+            arg = strchr(cmd, '\t');
+        }
+
+        if (arg == NULL) {
+            fprintf(stderr, "Invalid config line:\n  %s\n", line);
+            fclose(f);
+            errno = EINVAL;
+            return -1;
+        }
+
+        *arg = 0;
+        arg++;
+        while (isspace(*arg)) {
+            arg++;
+        }
+
+        argend = arg + strlen(arg);
+        while (arg != argend && isspace(*(argend - 1))) {
+            argend--;
+        }
+        *argend = 0;
+
+        if (strcmp(cmd, "deny") == 0) {
+            if (strcmp(arg, "all") == 0) {
+                acls[acl_count].type = ACL_DENY_ALL;
+            } else {
+                acls[acl_count].type = ACL_DENY;
+                snprintf(acls[acl_count].iface, IFNAMSIZ, "%s", arg);
+            }
+            acl_count++;
+        } else if (strcmp(cmd, "allow") == 0) {
+            if (strcmp(arg, "all") == 0) {
+                acls[acl_count].type = ACL_ALLOW_ALL;
+            } else {
+                acls[acl_count].type = ACL_ALLOW;
+                snprintf(acls[acl_count].iface, IFNAMSIZ, "%s", arg);
+            }
+            acl_count++;
+        } else if (strcmp(cmd, "include") == 0) {
+            /* ignore errors */
+            parse_acl_file(arg, acls, &acl_count);
+        } else {
+            fprintf(stderr, "Unknown command `%s'\n", cmd);
+            fclose(f);
+            errno = EINVAL;
+            return -1;
+        }
+    }
+
+    *pacl_count = acl_count;
+
+    fclose(f);
+
+    return 0;
+}
+
 static int

Re: [Qemu-devel] [PATCH] [v3] hw/arm_gic.c: Fix save/load of irq_target array

2011-10-21 Thread andrzej zaborowski

On 20 October 2011 12:48, Dmitry Koshelev karaghio...@gmail.com wrote:
 irq_target array saving/loading is in the wrong loop.
 Version bump.

 Signed-off-by: Dmitry Koshelev karaghio...@gmail.com

Thanks, pushed this patch.

Cheers

[Qemu-devel] [PATCH v2 3/4] Add cap reduction support to enable use as SUID

2011-10-21 Thread Corey Bryant

The ideal way to use qemu-bridge-helper is to give it an fscap using:

 setcap cap_net_admin=ep qemu-bridge-helper

Unfortunately, most distros still do not have a mechanism to package files
with fscaps applied.  This means they'll have to SUID the qemu-bridge-helper
binary.

To improve security, use libcap to reduce our capability set to just
cap_net_admin, then reduce privileges down to the calling user.  This is
hopefully close to equivalent to fscap support from a security perspective.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Richa Marwaha rmar...@linux.vnet.ibm.com
Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com
---
 configure|   34 ++
 qemu-bridge-helper.c |   39 +++
 2 files changed, 73 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index 6c8b659..fed66b0 100755
--- a/configure
+++ b/configure
@@ -128,6 +128,7 @@ vnc_thread=no
 xen=
 xen_ctrl_version=
 linux_aio=
+cap=
 attr=
 xfs=
 
@@ -653,6 +654,10 @@ for opt do
   ;;
   --enable-kvm) kvm=yes
   ;;
+  --disable-cap)  cap=no
+  ;;
+  --enable-cap) cap=yes
+  ;;
   --disable-spice) spice=no
   ;;
   --enable-spice) spice=yes
@@ -1032,6 +1037,8 @@ echo   --disable-vdedisable support for vde 
network
 echo   --enable-vde enable support for vde network
 echo   --disable-linux-aio  disable Linux AIO support
 echo   --enable-linux-aio   enable Linux AIO support
+echo   --disable-capdisable libcap-ng support
+echo   --enable-cap enable libcap-ng support
 echo   --disable-attr   disables attr and xattr support
 echo   --enable-attrenable attr and xattr support
 echo   --disable-blobs  disable installing provided firmware blobs
@@ -1638,6 +1645,29 @@ EOF
 fi
 
 ##
+# libcap-ng library probe
+if test "$cap" != "no" ; then
+  cap_libs="-lcap-ng"
+  cat > $TMPC << EOF
+#include <cap-ng.h>
+int main(void)
+{
+    capng_capability_to_name(CAPNG_EFFECTIVE);
+    return 0;
+}
+EOF
+  if compile_prog "" "$cap_libs" ; then
+    cap=yes
+    libs_tools="$cap_libs $libs_tools"
+  else
+    if test "$cap" = "yes" ; then
+      feature_not_found "cap"
+    fi
+    cap=no
+  fi
+fi
+
+##
 # Sound support libraries probe
 
 audio_drv_probe()
@@ -2735,6 +2765,7 @@ echo fdatasync $fdatasync
 echo madvise   $madvise
 echo posix_madvise $posix_madvise
 echo uuid support  $uuid
+echo libcap-ng support $cap
 echo vhost-net support $vhost_net
 echo Trace backend $trace_backend
 echo Trace output file $trace_file-pid
@@ -2846,6 +2877,9 @@ fi
 if test $vde = yes ; then
   echo CONFIG_VDE=y  $config_host_mak
 fi
+if test $cap = yes ; then
+  echo CONFIG_LIBCAP=y  $config_host_mak
+fi
 for card in $audio_card_list; do
 def=CONFIG_`echo $card | tr '[:lower:]' '[:upper:]'`
 echo $def=y  $config_host_mak
diff --git a/qemu-bridge-helper.c b/qemu-bridge-helper.c
index db257d5..b1562eb 100644
--- a/qemu-bridge-helper.c
+++ b/qemu-bridge-helper.c
@@ -33,6 +33,10 @@
 
 #include "net/tap-linux.h"
 
+#ifdef CONFIG_LIBCAP
+#include <cap-ng.h>
+#endif
+
 #define MAX_ACLS (128)
 #define DEFAULT_ACL_FILE CONFIG_QEMU_CONFDIR "/bridge.conf"
 
@@ -185,6 +189,27 @@ static int send_fd(int c, int fd)
     return sendmsg(c, &msg, 0);
 }
 
+#ifdef CONFIG_LIBCAP
+static int drop_privileges(void)
+{
+    /* clear all capabilities */
+    capng_clear(CAPNG_SELECT_BOTH);
+
+    if (capng_update(CAPNG_ADD, CAPNG_EFFECTIVE | CAPNG_PERMITTED,
+                     CAP_NET_ADMIN) < 0) {
+        return -1;
+    }
+
+    /* change to calling user's real uid and gid, retaining supplemental
+     * groups and CAP_NET_ADMIN */
+    if (capng_change_id(getuid(), getgid(), CAPNG_CLEAR_BOUNDING)) {
+        return -1;
+    }
+
+    return 0;
+}
+#endif
+
 int main(int argc, char **argv)
 {
 struct ifreq ifr;
@@ -198,6 +223,20 @@ int main(int argc, char **argv)
     int acl_count = 0;
     int i, access_allowed, access_denied;
 
+    /* if we're run from an suid binary, immediately drop privileges preserving
+     * cap_net_admin -- exit immediately if libcap not configured */
+    if (geteuid() == 0 && getuid() != geteuid()) {
+#ifdef CONFIG_LIBCAP
+        if (drop_privileges() == -1) {
+            fprintf(stderr, "failed to drop privileges\n");
+            return 1;
+        }
+#else
+        fprintf(stderr, "failed to drop privileges\n");
+        return 1;
+#endif
+    }
+
     /* parse arguments */
     if (argc < 3 || argc > 4) {
         fprintf(stderr, "Usage: %s [--use-vnet] BRIDGE FD\n", argv[0]);
-- 
1.7.3.4

Re: [Qemu-devel] [PATCH] hw/omap2: Wire up the IRQ for the 2430's fifth GPIO module

2011-10-21 Thread andrzej zaborowski

On 18 October 2011 17:12, Peter Maydell peter.mayd...@linaro.org wrote:
 The OMAP2430 version of the omap-gpio device has five GPIO modules,
 not four like the other OMAP2 versions; wire up the fifth module's
 IRQ line correctly.

Thanks, pushed this patch.

Cheers

[Qemu-devel] [PATCH v2 1/4] Add basic version of bridge helper

2011-10-21 Thread Corey Bryant

This patch adds a helper that can be used to create a tap device attached to
a bridge device.  Since this helper is minimal in what it does, it can be
given CAP_NET_ADMIN which allows qemu to avoid running as root while still
satisfying the majority of what users tend to want to do with tap devices.

The way this all works is that qemu launches this helper passing a bridge
name and the name of an inherited file descriptor.  The descriptor is one
end of a socketpair() of domain sockets.  This domain socket is used to
transmit a file descriptor of the opened tap device from the helper to qemu.

The helper can then exit and let qemu use the tap device.
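
The descriptor hand-off described above relies on SCM_RIGHTS ancillary
data on a Unix domain socket.  The following is a minimal sketch of that
mechanism, not the helper's actual code: the function names
send_fd_sketch and recv_fd_sketch are made up for illustration, and the
one-byte dummy payload is an assumption of this sketch.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send descriptor `fd` over the connected Unix socket `sock`.
 * Returns 0 on success, -1 on failure. */
int send_fd_sketch(int sock, int fd)
{
    char dummy = 0;   /* at least one byte of ordinary data is sent along */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;   /* forces correct alignment of buf */
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `sock`.
 * Returns the new descriptor, or -1 on failure. */
int recv_fd_sketch(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0) {
        return -1;
    }

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In the scheme described above, the helper would play the sending side
with the opened tap descriptor, and qemu would play the receiving side on
its end of the socketpair().  The received descriptor is a new number in
the receiver's table that refers to the same open file.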

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Richa Marwaha rmar...@linux.vnet.ibm.com
Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com
---
 Makefile |   12 +++-
 configure|1 +
 qemu-bridge-helper.c |  205 ++
 3 files changed, 216 insertions(+), 2 deletions(-)
 create mode 100644 qemu-bridge-helper.c

diff --git a/Makefile b/Makefile
index f63fc02..d9b447e 100644
--- a/Makefile
+++ b/Makefile
@@ -35,6 +35,8 @@ $(call set-vpath, $(SRC_PATH):$(SRC_PATH)/hw)
 
 LIBS+=-lz $(LIBS_TOOLS)
 
+HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
+
 ifdef BUILD_DOCS
 DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 
QMP/qmp-commands.txt
 else
@@ -75,7 +77,7 @@ defconfig:
 
 -include config-all-devices.mak
 
-build-all: $(DOCS) $(TOOLS) recurse-all
+build-all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all
 
 config-host.h: config-host.h-timestamp
 config-host.h-timestamp: config-host.mak
@@ -153,6 +155,8 @@ qemu-img$(EXESUF): qemu-img.o $(tools-obj-y)
 qemu-nbd$(EXESUF): qemu-nbd.o $(tools-obj-y)
 qemu-io$(EXESUF): qemu-io.o cmd.o $(tools-obj-y)
 
+qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
+
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h  $  $@,  GEN  
 $@)
 
@@ -221,7 +225,7 @@ clean:
 # avoid old build problems by removing potentially incorrect old files
rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h 
gen-op-arm.h
rm -f qemu-options.def
-   rm -f *.o *.d *.a *.lo $(TOOLS) qemu-ga TAGS cscope.* *.pod *~ */*~
+   rm -f *.o *.d *.a *.lo $(TOOLS) $(HELPERS-y) qemu-ga TAGS cscope.* 
*.pod *~ */*~
rm -Rf .libs
rm -f slirp/*.o slirp/*.d audio/*.o audio/*.d block/*.o block/*.d 
net/*.o net/*.d fsdev/*.o fsdev/*.d ui/*.o ui/*.d qapi/*.o qapi/*.d qga/*.o 
qga/*.d
rm -f qemu-img-cmds.h
@@ -289,6 +293,10 @@ install: all $(if $(BUILD_DOCS),install-doc) 
install-sysconfig
 ifneq ($(TOOLS),)
$(INSTALL_PROG) $(STRIP_OPT) $(TOOLS) $(DESTDIR)$(bindir)
 endif
+ifneq ($(HELPERS-y),)
+   $(INSTALL_DIR) $(DESTDIR)$(libexecdir)
+   $(INSTALL_PROG) $(STRIP_OPT) $(HELPERS-y) $(DESTDIR)$(libexecdir)
+endif
 ifneq ($(BLOBS),)
$(INSTALL_DIR) $(DESTDIR)$(datadir)
set -e; for x in $(BLOBS); do \
diff --git a/configure b/configure
index 4f87e0a..6c8b659 100755
--- a/configure
+++ b/configure
@@ -2768,6 +2768,7 @@ echo datadir=$datadir  $config_host_mak
 echo sysconfdir=$sysconfdir  $config_host_mak
 echo docdir=$docdir  $config_host_mak
 echo confdir=$confdir  $config_host_mak
+echo libexecdir=\${prefix}/libexec  $config_host_mak
 
 case $cpu in
   
i386|x86_64|alpha|cris|hppa|ia64|lm32|m68k|microblaze|mips|mips64|ppc|ppc64|s390|s390x|sparc|sparc64|unicore32)
diff --git a/qemu-bridge-helper.c b/qemu-bridge-helper.c
new file mode 100644
index 000..2ce82fb
--- /dev/null
+++ b/qemu-bridge-helper.c
@@ -0,0 +1,205 @@
+/*
+ * QEMU Bridge Helper
+ *
+ * Copyright IBM, Corp. 2011
+ *
+ * Authors:
+ * Anthony Liguori   aligu...@us.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "config-host.h"
+
+#include <stdio.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include <ctype.h>
+
+#include <sys/types.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/prctl.h>
+
+#include <net/if.h>
+
+#include <linux/sockios.h>
+
+#include "net/tap-linux.h"
+
+static int has_vnet_hdr(int fd)
+{
+    unsigned int features = 0;
+    struct ifreq ifreq;
+
+    if (ioctl(fd, TUNGETFEATURES, &features) == -1) {
+        return -errno;
+    }
+
+    if (!(features & IFF_VNET_HDR)) {
+        return -ENOTSUP;
+    }
+
+    if (ioctl(fd, TUNGETIFF, &ifreq) != -1 || errno != EBADFD) {
+        return -ENOTSUP;
+    }
+
+    return 1;
+}
+
+static void prep_ifreq(struct ifreq *ifr, const char *ifname)
+{
+    memset(ifr, 0, sizeof(*ifr));
+    snprintf(ifr->ifr_name, IFNAMSIZ, "%s", ifname);
+}
+
+static int send_fd(int c, int fd)
+{
+    char msgbuf[CMSG_SPACE(sizeof(fd))];
+    struct msghdr msg = {
+        .msg_control = msgbuf,
+        .msg_controllen = sizeof(msgbuf),
+    };
+

[Qemu-devel] [PATCH v2 0/4] -net bridge: rootless bridge support for qemu

2011-10-21 Thread Corey Bryant

With qemu it is possible to run a guest from an unprivileged user but if
we wanted to communicate with the outside world we had to switch
to root.

We address this problem by introducing a new network backend and a new
network option for -net tap.  This is less flexible when compared to
existing -net tap options because it relies on a helper with elevated
privileges to do the heavy lifting of allocating and attaching a tap
device to a bridge.  We use a special purpose helper because we don't
want to elevate the privileges of more generic tools like brctl.

Qemu can be run with the default network helper as follows (in these cases
attaching the tap device to the default br0 bridge):

 qemu -hda linux.img -net bridge -net nic
or:
 qemu -hda linux.img -net tap,helper=/usr/local/libexec/qemu-bridge-helper 
-net nic

The default helper uses its own ACL mechanism for access control, but
future network helpers could be developed, for example, to support PolicyKit
for access control.

More details are included in individual patches.  The helper is broken into
a series of patches to improve reviewability.

v2:
 - Updated signed-off-by's
 - Updated author's email
 - Set default bridge to br0
 - Added -net bridge
 - Updated ACL example
 - Moved from libcap to libcap-ng
 - Fail helper when libcap-ng not configured

Corey Bryant (4):
  Add basic version of bridge helper
  Add access control support to qemu bridge helper
  Add cap reduction support to enable use as SUID
  Add support for net bridge

 Makefile |   12 ++-
 configure|   37 +
 net.c|   29 -
 net.h|3 +
 net/tap.c|  190 -
 net/tap.h|2 +
 qemu-bridge-helper.c |  380 ++
 qemu-options.hx  |   73 --
 8 files changed, 703 insertions(+), 23 deletions(-)
 create mode 100644 qemu-bridge-helper.c

-- 
1.7.3.4

[Qemu-devel] [PATCH v2 4/4] Add support for net bridge

2011-10-21 Thread Corey Bryant

The most common use of -net tap is to connect a tap device to a bridge.  This
requires the use of a script and running qemu as root in order to allocate a
tap device to pass to the script.

This model is great for portability and flexibility but it's incredibly
difficult to eliminate the need to run qemu as root.  The only really viable
mechanism is to use tunctl to create a tap device, attach it to a bridge as
root, and then hand that tap device to qemu.  The problem with this mechanism
is that it requires administrator intervention whenever a user wants to create
a guest.

By essentially writing a helper that implements the most common qemu-ifup
script that can be safely given cap_net_admin, we can dramatically simplify
things for non-privileged users.  We still support existing -net tap options
as a mechanism for advanced users and backwards compatibility.

Currently, this is very Linux centric but there's really no reason why it
couldn't be extended for other Unixes.

A typical invocation would be:

  qemu linux.img -net bridge -net nic,model=virtio

or:

  qemu linux.img -net tap,helper=/usr/local/libexec/qemu-bridge-helper
 -net nic,model=virtio

The default bridge that we attach to is br0.  The thinking is that a distro
could preconfigure such an interface to allow out-of-the-box bridged networking.

Alternatively, if a user wants to use a different bridge, they can say:

  qemu linux.img -net bridge,br=qemubr0 -net nic,model=virtio

or:

  qemu linux.img -net 
tap,helper=/usr/local/libexec/qemu-bridge-helper,br=qemubr0
 -net nic,model=virtio

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Richa Marwaha rmar...@linux.vnet.ibm.com
Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com
---
 configure   |2 +
 net.c   |   29 -
 net.h   |3 +
 net/tap.c   |  190 +--
 net/tap.h   |3 +
 qemu-options.hx |   73 +
 6 files changed, 279 insertions(+), 21 deletions(-)

diff --git a/configure b/configure
index fed66b0..9493d60 100755
--- a/configure
+++ b/configure
@@ -2800,6 +2800,8 @@ echo sysconfdir=$sysconfdir  $config_host_mak
 echo docdir=$docdir  $config_host_mak
 echo confdir=$confdir  $config_host_mak
 echo libexecdir=\${prefix}/libexec  $config_host_mak
+echo CONFIG_QEMU_SHAREDIR=\$prefix$datasuffix\  $config_host_mak
+echo CONFIG_QEMU_HELPERDIR=\$prefix/libexec\  $config_host_mak
 
 case $cpu in
   
i386|x86_64|alpha|cris|hppa|ia64|lm32|m68k|microblaze|mips|mips64|ppc|ppc64|s390|s390x|sparc|sparc64|unicore32)
diff --git a/net.c b/net.c
index d05930c..2dcb2d4 100644
--- a/net.c
+++ b/net.c
@@ -956,6 +956,14 @@ static const struct {
 .type = QEMU_OPT_STRING,
 .help = script to shut down the interface,
 }, {
+.name = br,
+.type = QEMU_OPT_STRING,
+.help = bridge name,
+}, {
+.name = helper,
+.type = QEMU_OPT_STRING,
+.help = command to execute to configure bridge,
+}, {
 .name = sndbuf,
 .type = QEMU_OPT_SIZE,
 .help = send buffer limit
@@ -1053,6 +1061,23 @@ static const struct {
 { /* end of list */ }
 },
 },
+[NET_CLIENT_TYPE_BRIDGE] = {
+.type = "bridge",
+.init = net_init_bridge,
+.desc = {
+NET_COMMON_PARAMS_DESC,
+{
+.name = "br",
+.type = QEMU_OPT_STRING,
+.help = "bridge name",
+}, {
+.name = "helper",
+.type = QEMU_OPT_STRING,
+.help = "command to execute to configure bridge",
+},
+{ /* end of list */ }
+},
+},
 };
 
 int net_client_init(Monitor *mon, QemuOpts *opts, int is_netdev)
@@ -1075,7 +1100,8 @@ int net_client_init(Monitor *mon, QemuOpts *opts, int is_netdev)
 #ifdef CONFIG_VDE
 strcmp(type, "vde") != 0 &&
 #endif
-strcmp(type, "socket") != 0) {
+strcmp(type, "socket") != 0 &&
+strcmp(type, "bridge") != 0) {
 qerror_report(QERR_INVALID_PARAMETER_VALUE, "type",
   "a netdev backend type");
 return -1;
@@ -1145,6 +1171,7 @@ static int net_host_check_device(const char *device)
 #ifdef CONFIG_VDE
,"vde"
 #endif
+   , "bridge"
 };
 for (i = 0; i < sizeof(valid_param_list) / sizeof(char *); i++) {
 if (!strncmp(valid_param_list[i], device,
diff --git a/net.h b/net.h
index 9f633f8..d1340ad 100644
--- a/net.h
+++ b/net.h
@@ -36,6 +36,7 @@ typedef enum {
 NET_CLIENT_TYPE_SOCKET,
 NET_CLIENT_TYPE_VDE,
 NET_CLIENT_TYPE_DUMP,
+NET_CLIENT_TYPE_BRIDGE,
 
 NET_CLIENT_TYPE_MAX
 } net_client_type;
@@ -174,6 +175,8 @@ int

Re: [Qemu-devel] build with trace enabled is broken by the commit c572f23a3e7180dbeab5e86583e43ea2afed6271 hw/9pfs: Introduce tracing for 9p pdu handlers

2011-10-21 Thread Aneesh Kumar K.V

On Thu, 20 Oct 2011 23:20:54 +0400, Max Filippov <jcmvb...@gmail.com> wrote:
> Hi.
>
> Current git head build with trace enabled is broken by the commit
> c572f23a3e7180dbeab5e86583e43ea2afed6271 hw/9pfs: Introduce tracing for 9p
> pdu handlers.
> Error messages:
>
> In file included from trace.c:2:0:
> trace.h: In function ‘trace_v9fs_attach’:
> trace.h:2850:9: error: too many arguments for format [-Werror=format-extra-args]
> trace.h: In function ‘trace_v9fs_wstat’:
> trace.h:3039:9: error: too many arguments for format [-Werror=format-extra-args]
> trace.h: In function ‘trace_v9fs_mkdir’:
> trace.h:3088:9: error: too many arguments for format [-Werror=format-extra-args]
> trace.h: In function ‘trace_v9fs_mkdir_return’:
> trace.h:3095:9: error: too many arguments for format [-Werror=format-extra-args]
> cc1: all warnings being treated as errors
>
> Prototypes in the trace-events do not match format strings, e.g.
>
> v9fs_attach(uint16_t tag, uint8_t id, int32_t fid, int32_t afid, char* uname, char* aname) "tag %d id %d fid %d afid %d aname %s"
>
> The following patch fixes it, but I'm not sure the format lines are
> appropriate.

Can you send the patch with a Signed-off-by line? I will add it to the next
pull request.

-aneesh
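
The mismatch reported above (six prototype arguments against five conversions in the format string) can be checked mechanically. What follows is only a minimal sketch, not QEMU's tracetool: count_conversions() is a hypothetical helper, and it deliberately ignores '*' width and precision arguments.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the printf-style conversions in fmt.  "%%" is a literal percent
 * sign and consumes no argument.  Simplified on purpose: a '*' width or
 * precision, which would consume an extra argument, is not handled.
 */
size_t count_conversions(const char *fmt)
{
    size_t n = 0;

    assert(fmt != NULL);
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%') {
            continue;
        }
        p++;                /* look at the character after '%' */
        if (*p == '\0') {
            break;          /* stray '%' at the end of the string */
        }
        if (*p != '%') {
            n++;            /* a real conversion such as %d or %s */
        }
    }
    return n;
}
```

For the v9fs_attach line quoted above, the helper counts five conversions while the prototype lists six arguments, which is the kind of difference gcc reports as "too many arguments for format".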

[Qemu-devel] [PATCH] Add SPICE support to add_client monitor command

2011-10-21 Thread Daniel P. Berrange

From: Daniel P. Berrange berra...@redhat.com

With the proposal of some new APIs[1] to libspice-server.so it
is possible to add support for SPICE to the 'add_client'
monitor command, bringing parity with VNC. Since SPICE can
use TLS or plain connections, the command also gains a new
'tls' parameter to specify whether TLS should be attempted
on the injected client sockets.

NB1, since there is no SPICE release with these APIs, I
have guessed the next SPICE version number in the #ifdef
to be 0x000a00 (ie 0.10.0 in hex).

NB2, obviously this should only be merged once the SPICE
developers have accepted my proposed patches to libspice-server.so

* qmp-commands.hx: Add 'tls' parameter & missing doc for
  'skipauth' parameter
* monitor.c: Wire up SPICE for 'add_client' command
* ui/qemu-spice.h, ui/spice-core.c: Add qemu_spice_display_add_client
  API to wire up from monitor

[1] http://lists.freedesktop.org/archives/spice-devel/2011-October/005834.html

Signed-off-by: Daniel P. Berrange berra...@redhat.com
---
 monitor.c   |9 +++--
 qmp-commands.hx |6 --
 ui/qemu-spice.h |1 +
 ui/spice-core.c |   13 +
 4 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/monitor.c b/monitor.c
index 31b212a..cae3b12 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1120,13 +1120,18 @@ static int add_graphics_client(Monitor *mon, const 
QDict *qdict, QObject **ret_d
 CharDriverState *s;
 
 if (strcmp(protocol, "spice") == 0) {
+int fd = monitor_get_fd(mon, fdname);
+int skipauth = qdict_get_try_bool(qdict, "skipauth", 0);
+int tls = qdict_get_try_bool(qdict, "tls", 0);
 if (!using_spice) {
 /* correct one? spice isn't a device ,,, */
 qerror_report(QERR_DEVICE_NOT_ACTIVE, "spice");
 return -1;
 }
-   qerror_report(QERR_ADD_CLIENT_FAILED);
-   return -1;
+if (qemu_spice_display_add_client(fd, skipauth, tls) < 0) {
+close(fd);
+}
+return 0;
 #ifdef CONFIG_VNC
 } else if (strcmp(protocol, "vnc") == 0) {
int fd = monitor_get_fd(mon, fdname);
diff --git a/qmp-commands.hx b/qmp-commands.hx
index ea96191..de1ada3 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -911,8 +911,8 @@ EQMP
 
 {
 .name   = "add_client",
-.args_type  = "protocol:s,fdname:s,skipauth:b?",
-.params = "protocol fdname skipauth",
+.args_type  = "protocol:s,fdname:s,skipauth:b?,tls:b?",
+.params = "protocol fdname skipauth tls",
 .help   = "add a graphics client",
 .user_print = monitor_user_noop,
 .mhandler.cmd_new = add_graphics_client,
@@ -928,6 +928,8 @@ Arguments:
 
 - protocol: protocol name (json-string)
 - fdname: file descriptor name (json-string)
+- skipauth: whether to skip authentication (json-bool)
+- tls: whether to perform TLS (json-bool)
 
 Example:
 
diff --git a/ui/qemu-spice.h b/ui/qemu-spice.h
index f34be69..a0e213c 100644
--- a/ui/qemu-spice.h
+++ b/ui/qemu-spice.h
@@ -32,6 +32,7 @@ void qemu_spice_init(void);
 void qemu_spice_input_init(void);
 void qemu_spice_audio_init(void);
 void qemu_spice_display_init(DisplayState *ds);
+int qemu_spice_display_add_client(int csock, int skipauth, int tls);
 int qemu_spice_add_interface(SpiceBaseInstance *sin);
 int qemu_spice_set_passwd(const char *passwd,
   bool fail_if_connected, bool 
disconnect_if_connected);
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 3cbc721..854670e 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -712,6 +712,19 @@ int qemu_spice_set_pw_expire(time_t expires)
 return qemu_spice_set_ticket(false, false);
 }
 
+int qemu_spice_display_add_client(int csock, int skipauth, int tls)
+{
+#if SPICE_SERVER_VERSION >= 0x000a00
+if (tls) {
+return spice_server_add_ssl_client(spice_server, csock, skipauth);
+} else {
+return spice_server_add_client(spice_server, csock, skipauth);
+}
+#else
+return -1;
+#endif
+}
+
 static void spice_register_config(void)
 {
 qemu_add_opts(qemu_spice_opts);
-- 
1.7.6.4
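
The version gate and the TLS/plain dispatch described in the commit message can be sketched in isolation as below. This is an illustration under stated assumptions, not the libspice-server API: DEMO_SERVER_VERSION stands in for SPICE_SERVER_VERSION, and the two demo_server_* functions are hypothetical stubs that only record which path was taken.

```c
#include <assert.h>

/* Hypothetical stand-in for SPICE_SERVER_VERSION (0.10.0 encoded in hex). */
#define DEMO_SERVER_VERSION 0x000a00

enum demo_path { DEMO_NONE, DEMO_PLAIN, DEMO_TLS };

/* Records which stub ran last, so the dispatch can be observed. */
static enum demo_path demo_last_path = DEMO_NONE;

/* Stub standing in for the proposed plain-connection call; not the real API. */
static int demo_server_add_client(int fd, int skipauth)
{
    (void)fd;
    (void)skipauth;
    demo_last_path = DEMO_PLAIN;
    return 0;
}

/* Stub standing in for the proposed TLS call; not the real API. */
static int demo_server_add_ssl_client(int fd, int skipauth)
{
    (void)fd;
    (void)skipauth;
    demo_last_path = DEMO_TLS;
    return 0;
}

/* Pick the TLS or plain entry point; fail cleanly when the gate is closed. */
int demo_display_add_client(int fd, int skipauth, int tls)
{
    assert(fd >= 0);
#if DEMO_SERVER_VERSION >= 0x000a00
    if (tls) {
        return demo_server_add_ssl_client(fd, skipauth);
    }
    return demo_server_add_client(fd, skipauth);
#else
    (void)skipauth;
    (void)tls;
    return -1;
#endif
}
```

Keeping the #if inside one wrapper means callers such as a monitor handler need no conditional compilation of their own; they only see a negative return value when the library is too old.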

Re: [Qemu-devel] [PATCH] linux-user: Fix broken -version option

2011-10-21 Thread andrzej zaborowski

On 29 September 2011 16:48, Peter Maydell peter.mayd...@linaro.org wrote:
 Fix the -version option, which was accidentally broken in commit
 fc9c541:
  * exit after printing version information rather than proceeding
   blithely onward (and likely printing the full usage message)
  * correct the cut-n-paste error in the usage message for it
  * don't insist on the presence of a following argument for
   options which don't take an argument (this was preventing
   'qemu-arm -version' from working)
  * remove a spurious argc check from the beginning of main() which
   meant 'QEMU_VERSION=1 qemu-arm' didn't work.

Thanks, I pushed this patch.

Cheers

Re: [Qemu-devel] [PATCH] compatfd.c: Don't pass NULL pointer to SYS_signalfd

2011-10-21 Thread andrzej zaborowski

On 13 October 2011 19:45, Peter Maydell peter.mayd...@linaro.org wrote:
 Don't pass a NULL pointer in to SYS_signalfd in qemu_signalfd_available():
 this isn't valid and Valgrind complains about it.

Also pushed this patch.

Cheers

[Qemu-devel] [PATCH] [v2] target-arm/machine.c: Fix load of floating point registers

2011-10-21 Thread Dmitry Koshelev

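Fix load of floating point registers: when ARM_FEATURE_VFP3 is set, the
loop in cpu_load() must restore registers 16 to 31 rather than 0 to 15.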

Signed-off-by: Dmitry Koshelev karaghio...@gmail.com
---
 target-arm/machine.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/target-arm/machine.c b/target-arm/machine.c
index 7d4fc54..aaee9b9 100644
--- a/target-arm/machine.c
+++ b/target-arm/machine.c
@@ -189,7 +189,7 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
 env->vfp.vec_stride = qemu_get_be32(f);
 
 if (arm_feature(env, ARM_FEATURE_VFP3)) {
-for (i = 0;  i < 16; i++) {
+for (i = 16;  i < 32; i++) {
 CPU_DoubleU u;
 u.l.upper = qemu_get_be32(f);
 u.l.lower = qemu_get_be32(f);
-- 
1.7.1
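
The effect of the one-line change can be shown with a small stand-alone loader. This is a sketch only: the demo_* names are hypothetical, the byte stream replaces QEMUFile, and it assumes (the hunk above does not show it) that a first loop has already restored registers 0 to 15 before the VFP3 branch runs.

```c
#include <assert.h>
#include <stdint.h>

/* Read one big-endian 32-bit word and advance the cursor. */
static uint32_t demo_get_be32(const uint8_t **p)
{
    const uint8_t *b = *p;

    *p += 4;
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Restore regs[first] .. regs[last - 1], upper word first, as in the hunk. */
static void demo_load_regs(uint64_t *regs, int first, int last,
                           const uint8_t **p)
{
    assert(first >= 0 && first <= last);
    for (int i = first; i < last; i++) {
        uint64_t upper = demo_get_be32(p);
        uint64_t lower = demo_get_be32(p);
        regs[i] = (upper << 32) | lower;
    }
}

/* Hypothetical loader: 16 registers always, 16 more when has_vfp3 is set. */
void demo_load_vfp(uint64_t regs[32], const uint8_t *stream, int has_vfp3)
{
    const uint8_t *p = stream;

    demo_load_regs(regs, 0, 16, &p);
    if (has_vfp3) {
        /* The second block belongs in 16..31; reusing 0..15 would
         * overwrite the registers that were just restored. */
        demo_load_regs(regs, 16, 32, &p);
    }
}
```

With the old bounds the second block of the stream would land on registers 0 to 15 again and registers 16 to 31 would keep their previous contents.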

Re: [Qemu-devel] [PATCH V5] Add AACI audio playback support to the ARM Versatile/PB platform

2011-10-21 Thread andrzej zaborowski

Hi Mathieu,

On 18 October 2011 23:45, Mathieu Sonet cont...@elasticsheep.com wrote:
 This driver emulates the ARM AACI interface (PL041) connected to a LM4549
 codec.
 It enables audio playback for the Versatile/PB platform.

 Limitations:
 - Supports only a playback on one channel (Versatile/Vexpress)
 - Supports only one TX FIFO in compact-mode or non-compact mode.
 - Supports playback of 12, 16, 18 and 20 bits samples.
 - Record is not supported.
 - The PL041 is hardwired to a LM4549 codec.

 Versatile/PB test build:
 linux-2.6.38.5
 buildroot-2010.11
 alsa-lib-1.0.22
 alsa-utils-1.0.22
 mpg123-0.66

 Qemu host: Ubuntu 10.04 in Vmware/OS X

 Playback tested successfully with speaker-test/aplay/mpg123.

 Signed-off-by: Mathieu Sonet cont...@elasticsheep.com
 ---
 v4-v5

 * Move the lm4549 post_load hook in lm4549.c
 * Fix naked debug printf in lm4549.c
 * Clarify the size of the lm4549 audio buffer

  Makefile.target  |    1 +
  hw/lm4549.c      |  336 
  hw/lm4549.h      |   43 
  hw/pl041.c       |  636
 ++
  hw/pl041.h       |  135 
  hw/pl041.hx      |   81 +++
  hw/versatilepb.c |    8 +
  7 files changed, 1240 insertions(+), 0 deletions(-)
  create mode 100644 hw/lm4549.c
  create mode 100644 hw/lm4549.h
  create mode 100644 hw/pl041.c
  create mode 100644 hw/pl041.h
  create mode 100644 hw/pl041.hx

 diff --git a/Makefile.target b/Makefile.target
 index 417f23e..25b9fc1 100644
 --- a/Makefile.target
 +++ b/Makefile.target
 @@ -355,6 +355,7 @@ obj-arm-y += syborg_virtio.o
  obj-arm-y += vexpress.o
  obj-arm-y += strongarm.o
  obj-arm-y += collie.o
 +obj-arm-y += pl041.o lm4549.o

  obj-sh4-y = shix.o r2d.o sh7750.o sh7750_regnames.o tc58128.o
  obj-sh4-y += sh_timer.o sh_serial.o sh_intc.o sh_pci.o sm501.o
 diff --git a/hw/lm4549.c b/hw/lm4549.c
 new file mode 100644
 index 000..4d5b831
 --- /dev/null
 +++ b/hw/lm4549.c
 @@ -0,0 +1,336 @@
 +/*
 + * LM4549 Audio Codec Interface
 + *
 + * Copyright (c) 2011
 + * Written by Mathieu Sonet - www.elasticsheep.com
 + *
 + * This code is licenced under the GPL.
 + *
 + * *
 + *
 + * This driver emulates the LM4549 codec.
 + *
 + * It supports only one playback voice and no record voice.
 + */
 +
 +#include "hw.h"
 +#include "audio/audio.h"
 +#include "lm4549.h"
 +
 +#if 0
 +#define LM4549_DEBUG  1
 +#endif
 +
 +#if 0
 +#define LM4549_DUMP_DAC_INPUT 1
 +#endif
 +
 +#ifdef LM4549_DEBUG
 +#define DPRINTF(fmt, ...) \
 +do { printf("lm4549: " fmt , ## __VA_ARGS__); } while (0)
 +#else
 +#define DPRINTF(fmt, ...) do {} while (0)
 +#endif
 +
 +#if defined(LM4549_DUMP_DAC_INPUT)
 +#include <stdio.h>
 +static FILE *fp_dac_input;
 +#endif
 +
 +/* LM4549 register list */
 +enum {
 +    LM4549_Reset                    = 0x00,
 +    LM4549_Master_Volume            = 0x02,
 +    LM4549_Line_Out_Volume          = 0x04,
 +    LM4549_Master_Volume_Mono       = 0x06,
 +    LM4549_PC_Beep_Volume           = 0x0A,
 +    LM4549_Phone_Volume             = 0x0C,
 +    LM4549_Mic_Volume               = 0x0E,
 +    LM4549_Line_In_Volume           = 0x10,
 +    LM4549_CD_Volume                = 0x12,
 +    LM4549_Video_Volume             = 0x14,
 +    LM4549_Aux_Volume               = 0x16,
 +    LM4549_PCM_Out_Volume           = 0x18,
 +    LM4549_Record_Select            = 0x1A,
 +    LM4549_Record_Gain              = 0x1C,
 +    LM4549_General_Purpose          = 0x20,
 +    LM4549_3D_Control               = 0x22,
 +    LM4549_Powerdown_Ctrl_Stat      = 0x26,
 +    LM4549_Ext_Audio_ID             = 0x28,
 +    LM4549_Ext_Audio_Stat_Ctrl      = 0x2A,
 +    LM4549_PCM_Front_DAC_Rate       = 0x2C,
 +    LM4549_PCM_ADC_Rate             = 0x32,
 +    LM4549_Vendor_ID1               = 0x7C,
 +    LM4549_Vendor_ID2               = 0x7E
 +};
 +
 +static void lm4549_reset(lm4549_state *s)
 +{
 +    uint16_t *regfile = s->regfile;
 +
 +    regfile[LM4549_Reset]               = 0x0d50;
 +    regfile[LM4549_Master_Volume]       = 0x8008;
 +    regfile[LM4549_Line_Out_Volume]     = 0x8000;
 +    regfile[LM4549_Master_Volume_Mono]  = 0x8000;
 +    regfile[LM4549_PC_Beep_Volume]      = 0x0000;
 +    regfile[LM4549_Phone_Volume]        = 0x8008;
 +    regfile[LM4549_Mic_Volume]          = 0x8008;
 +    regfile[LM4549_Line_In_Volume]      = 0x8808;
 +    regfile[LM4549_CD_Volume]           = 0x8808;
 +    regfile[LM4549_Video_Volume]        = 0x8808;
 +    regfile[LM4549_Aux_Volume]          = 0x8808;
 +    regfile[LM4549_PCM_Out_Volume]      = 0x8808;
 +    regfile[LM4549_Record_Select]       = 0x0000;
 +    regfile[LM4549_Record_Gain]         = 0x8000;
 +    regfile[LM4549_General_Purpose]     = 0x0000;
 +    regfile[LM4549_3D_Control]          = 0x0101;
 +    regfile[LM4549_Powerdown_Ctrl_Stat] = 0x000f;
 +    regfile[LM4549_Ext_Audio_ID]        = 0x0001;
 +    regfile[LM4549_Ext_Audio_Stat_Ctrl] = 0x0000;
 +

[Qemu-devel] [PATCH v3 01/13] remove unused function

2011-10-21 Thread Paolo Bonzini

Reviewed-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/mac_dbdma.c |5 -
 hw/mac_dbdma.h |1 -
 2 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/hw/mac_dbdma.c b/hw/mac_dbdma.c
index 5affdd1..1791ec1 100644
--- a/hw/mac_dbdma.c
+++ b/hw/mac_dbdma.c
@@ -661,11 +661,6 @@ void DBDMA_register_channel(void *dbdma, int nchan, 
qemu_irq irq,
 ch-io.channel = ch;
 }
 
-void DBDMA_schedule(void)
-{
-qemu_notify_event();
-}
-
 static void
 dbdma_control_write(DBDMA_channel *ch)
 {
diff --git a/hw/mac_dbdma.h b/hw/mac_dbdma.h
index 933e17c..6d1abe6 100644
--- a/hw/mac_dbdma.h
+++ b/hw/mac_dbdma.h
@@ -41,5 +41,4 @@ struct DBDMA_io {
 void DBDMA_register_channel(void *dbdma, int nchan, qemu_irq irq,
 DBDMA_rw rw, DBDMA_flush flush,
 void *opaque);
-void DBDMA_schedule(void);
 void* DBDMA_init (MemoryRegion **dbdma_mem);
-- 
1.7.6

[Qemu-devel] [PATCH v3 12/13] Revert to a hand-made select loop

2011-10-21 Thread Paolo Bonzini

This reverts commits c82dc29a9112f34e0a51cad9a412cf6d9d05dfb2
and 4d88a2ac8643265108ef1fb47ceee5d7b28e19f2.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 iohandler.c |   54 +-
 1 files changed, 1 insertions(+), 53 deletions(-)

diff --git a/iohandler.c b/iohandler.c
index 687dc56..5640d49 100644
--- a/iohandler.c
+++ b/iohandler.c
@@ -81,64 +81,12 @@ int qemu_set_fd_handler2(int fd,
 return 0;
 }
 
-typedef struct IOTrampoline
-{
-GIOChannel *chan;
-IOHandler *fd_read;
-IOHandler *fd_write;
-void *opaque;
-guint tag;
-} IOTrampoline;
-
-static gboolean fd_trampoline(GIOChannel *chan, GIOCondition cond, gpointer opaque)
-{
-IOTrampoline *tramp = opaque;
-
-if ((cond & G_IO_IN) && tramp->fd_read) {
-tramp->fd_read(tramp->opaque);
-}
-
-if ((cond & G_IO_OUT) && tramp->fd_write) {
-tramp->fd_write(tramp->opaque);
-}
-
-return TRUE;
-}
-
 int qemu_set_fd_handler(int fd,
 IOHandler *fd_read,
 IOHandler *fd_write,
 void *opaque)
 {
-static IOTrampoline fd_trampolines[FD_SETSIZE];
-IOTrampoline *tramp = &fd_trampolines[fd];
-
-if (tramp->tag != 0) {
-g_io_channel_unref(tramp->chan);
-g_source_remove(tramp->tag);
-tramp->tag = 0;
-}
-
-if (fd_read || fd_write || opaque) {
-GIOCondition cond = 0;
-
-tramp->fd_read = fd_read;
-tramp->fd_write = fd_write;
-tramp->opaque = opaque;
-
-if (fd_read) {
-cond |= G_IO_IN | G_IO_ERR;
-}
-
-if (fd_write) {
-cond |= G_IO_OUT | G_IO_ERR;
-}
-
-tramp->chan = g_io_channel_unix_new(fd);
-tramp->tag = g_io_add_watch(tramp->chan, cond, fd_trampoline, tramp);
-}
-
-return 0;
+return qemu_set_fd_handler2(fd, NULL, fd_read, fd_write, opaque);
 }
 
 void qemu_iohandler_fill(int *pnfds, fd_set *readfds, fd_set *writefds, fd_set 
*xfds)
-- 
1.7.6
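
For readers unfamiliar with what a hand-made select loop looks like, here is a minimal stand-alone sketch of the pattern: handlers are kept in a table, one iteration fills the fd sets, calls select(), and dispatches the callbacks that are ready. All demo_* names are hypothetical and the fixed-size table is a simplification; this is not the code in iohandler.c.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

typedef void DemoIOHandler(void *opaque);

typedef struct DemoIORecord {
    int fd;
    DemoIOHandler *fd_read;
    DemoIOHandler *fd_write;
    void *opaque;
} DemoIORecord;

#define DEMO_MAX_HANDLERS 16
static DemoIORecord demo_handlers[DEMO_MAX_HANDLERS];
static int demo_nhandlers;

/* Register handlers for fd, or update them if fd is already known.
 * A record whose callbacks are both NULL is simply skipped when polling. */
int demo_set_fd_handler(int fd, DemoIOHandler *fd_read,
                        DemoIOHandler *fd_write, void *opaque)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    for (int i = 0; i < demo_nhandlers; i++) {
        if (demo_handlers[i].fd == fd) {
            demo_handlers[i].fd_read = fd_read;
            demo_handlers[i].fd_write = fd_write;
            demo_handlers[i].opaque = opaque;
            return 0;
        }
    }
    if (demo_nhandlers == DEMO_MAX_HANDLERS) {
        return -1;
    }
    demo_handlers[demo_nhandlers++] =
        (DemoIORecord){ fd, fd_read, fd_write, opaque };
    return 0;
}

/* One loop iteration: fill the sets, wait, then dispatch ready handlers. */
int demo_iohandler_poll(struct timeval *timeout)
{
    fd_set rfds, wfds;
    int nfds = -1;

    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    for (int i = 0; i < demo_nhandlers; i++) {
        DemoIORecord *r = &demo_handlers[i];
        if (r->fd_read) {
            FD_SET(r->fd, &rfds);
        }
        if (r->fd_write) {
            FD_SET(r->fd, &wfds);
        }
        if ((r->fd_read || r->fd_write) && r->fd > nfds) {
            nfds = r->fd;
        }
    }

    int ret = select(nfds + 1, &rfds, &wfds, NULL, timeout);
    if (ret <= 0) {
        return ret;         /* timeout or error: nothing to dispatch */
    }
    for (int i = 0; i < demo_nhandlers; i++) {
        DemoIORecord *r = &demo_handlers[i];
        if (r->fd_read && FD_ISSET(r->fd, &rfds)) {
            r->fd_read(r->opaque);
        }
        if (r->fd_write && FD_ISSET(r->fd, &wfds)) {
            r->fd_write(r->opaque);
        }
    }
    return ret;
}

/* Example callback: counts how often it was invoked. */
void demo_count_handler(void *opaque)
{
    ++*(int *)opaque;
}
```

A quick way to try it is to create a pipe, register demo_count_handler on the read end, write one byte to the write end and call demo_iohandler_poll() once; the counter passed as opaque should then have been incremented.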

[Qemu-devel] [PATCH v3 04/13] qemu-timer: more clock functions

2011-10-21 Thread Paolo Bonzini

These will be used when moving icount accounting to cpus.c.

Reviewed-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-timer.c |   25 +
 qemu-timer.h |3 +++
 2 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index e2551f3..ebb5089 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -495,6 +495,31 @@ void qemu_clock_warp(QEMUClock *clock)
 }
 }
 
+int64_t qemu_clock_has_timers(QEMUClock *clock)
+{
+return !!clock->active_timers;
+}
+
+int64_t qemu_clock_expired(QEMUClock *clock)
+{
+return (clock->active_timers &&
+clock->active_timers->expire_time < qemu_get_clock_ns(clock));
+}
+
+int64_t qemu_clock_deadline(QEMUClock *clock)
+{
+/* To avoid problems with overflow limit this to 2^32.  */
+int64_t delta = INT32_MAX;
+
+if (clock->active_timers) {
+delta = clock->active_timers->expire_time - qemu_get_clock_ns(clock);
+}
+if (delta < 0) {
+delta = 0;
+}
+return delta;
+}
+
 QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
   QEMUTimerCB *cb, void *opaque)
 {
diff --git a/qemu-timer.h b/qemu-timer.h
index 0a43469..4578075 100644
--- a/qemu-timer.h
+++ b/qemu-timer.h
@@ -38,6 +38,9 @@ extern QEMUClock *vm_clock;
 extern QEMUClock *host_clock;
 
 int64_t qemu_get_clock_ns(QEMUClock *clock);
+int64_t qemu_clock_has_timers(QEMUClock *clock);
+int64_t qemu_clock_expired(QEMUClock *clock);
+int64_t qemu_clock_deadline(QEMUClock *clock);
 void qemu_clock_enable(QEMUClock *clock, int enabled);
 void qemu_clock_warp(QEMUClock *clock);
 
-- 
1.7.6
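
The behaviour of the three helpers is easiest to see with a stand-alone model. The sketch below mirrors their logic under simplifying assumptions: DemoClock and DemoTimer are hypothetical types, and the current time is a plain field instead of a call to qemu_get_clock_ns().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct DemoTimer {
    int64_t expire_time;
    struct DemoTimer *next;
} DemoTimer;

typedef struct DemoClock {
    DemoTimer *active_timers;   /* sorted list, earliest deadline first */
    int64_t now_ns;             /* stands in for qemu_get_clock_ns() */
} DemoClock;

/* Nonzero if at least one timer is queued on the clock. */
int demo_clock_has_timers(const DemoClock *c)
{
    assert(c != NULL);
    return c->active_timers != NULL;
}

/* Nonzero if the earliest timer's deadline is already in the past. */
int demo_clock_expired(const DemoClock *c)
{
    assert(c != NULL);
    return c->active_timers != NULL &&
           c->active_timers->expire_time < c->now_ns;
}

/* Time until the earliest deadline: INT32_MAX with no timers, never negative. */
int64_t demo_clock_deadline(const DemoClock *c)
{
    int64_t delta = INT32_MAX;

    assert(c != NULL);
    if (c->active_timers) {
        delta = c->active_timers->expire_time - c->now_ns;
    }
    if (delta < 0) {
        delta = 0;
    }
    return delta;
}
```

As in the patch, the INT32_MAX default applies only when no timer is queued; a queued timer that is further away than that is returned unclamped.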

[Qemu-devel] [PATCH v3 02/13] qemu-timer: remove active_timers array

2011-10-21 Thread Paolo Bonzini

Embed the list in the QEMUClock instead.

Reviewed-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-timer.c |   59 +++--
 1 files changed, 28 insertions(+), 31 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index ad1fc8b..acf7a15 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -134,6 +134,7 @@ struct QEMUClock {
 int enabled;
 
 QEMUTimer *warp_timer;
+QEMUTimer *active_timers;
 
 NotifierList reset_notifiers;
 int64_t last;
@@ -352,14 +353,10 @@ next:
 }
 }
 
-#define QEMU_NUM_CLOCKS 3
-
 QEMUClock *rt_clock;
 QEMUClock *vm_clock;
 QEMUClock *host_clock;
 
-static QEMUTimer *active_timers[QEMU_NUM_CLOCKS];
-
 static QEMUClock *qemu_new_clock(int type)
 {
 QEMUClock *clock;
@@ -403,7 +400,7 @@ static void icount_warp_rt(void *opaque)
 int64_t delta = cur_time - cur_icount;
 qemu_icount_bias += MIN(warp_delta, delta);
 }
-if (qemu_timer_expired(active_timers[QEMU_CLOCK_VIRTUAL],
+if (qemu_timer_expired(vm_clock->active_timers,
qemu_get_clock_ns(vm_clock))) {
 qemu_notify_event();
 }
@@ -434,7 +431,7 @@ void qemu_clock_warp(QEMUClock *clock)
  * the earliest vm_clock timer.
  */
 icount_warp_rt(NULL);
-if (!all_cpu_threads_idle() || !active_timers[clock->type]) {
+if (!all_cpu_threads_idle() || !clock->active_timers) {
 qemu_del_timer(clock->warp_timer);
 return;
 }
@@ -489,7 +486,7 @@ void qemu_del_timer(QEMUTimer *ts)
 
 /* NOTE: this code must be signal safe because
qemu_timer_expired() can be called from a signal. */
-pt = &active_timers[ts->clock->type];
+pt = &ts->clock->active_timers;
 for(;;) {
 t = *pt;
 if (!t)
@@ -513,7 +510,7 @@ static void qemu_mod_timer_ns(QEMUTimer *ts, int64_t 
expire_time)
 /* add the timer in the sorted list */
 /* NOTE: this code must be signal safe because
qemu_timer_expired() can be called from a signal. */
-pt = &active_timers[ts->clock->type];
+pt = &ts->clock->active_timers;
 for(;;) {
 t = *pt;
 if (!qemu_timer_expired_ns(t, expire_time)) {
@@ -526,7 +523,7 @@ static void qemu_mod_timer_ns(QEMUTimer *ts, int64_t 
expire_time)
 *pt = ts;
 
 /* Rearm if necessary  */
-if (pt == &active_timers[ts->clock->type]) {
+if (pt == &ts->clock->active_timers) {
 if (!alarm_timer->pending) {
 qemu_rearm_alarm_timer(alarm_timer);
 }
@@ -548,7 +545,7 @@ void qemu_mod_timer(QEMUTimer *ts, int64_t expire_time)
 int qemu_timer_pending(QEMUTimer *ts)
 {
 QEMUTimer *t;
-for(t = active_timers[ts->clock->type]; t != NULL; t = t->next) {
+for (t = ts->clock->active_timers; t != NULL; t = t->next) {
 if (t == ts)
 return 1;
 }
@@ -569,7 +566,7 @@ static void qemu_run_timers(QEMUClock *clock)
 return;
 
 current_time = qemu_get_clock_ns(clock);
-ptimer_head = &active_timers[clock->type];
+ptimer_head = &clock->active_timers;
 for(;;) {
 ts = *ptimer_head;
 if (!qemu_timer_expired_ns(ts, current_time)) {
@@ -773,8 +770,8 @@ int64_t qemu_next_icount_deadline(void)
 int64_t delta = INT32_MAX;
 
 assert(use_icount);
-if (active_timers[QEMU_CLOCK_VIRTUAL]) {
-delta = active_timers[QEMU_CLOCK_VIRTUAL]->expire_time -
+if (vm_clock->active_timers) {
+delta = vm_clock->active_timers->expire_time -
  qemu_get_clock_ns(vm_clock);
 }
 
@@ -789,20 +786,20 @@ static int64_t qemu_next_alarm_deadline(void)
 int64_t delta;
 int64_t rtdelta;
 
-if (!use_icount && active_timers[QEMU_CLOCK_VIRTUAL]) {
-delta = active_timers[QEMU_CLOCK_VIRTUAL]->expire_time -
+if (!use_icount && vm_clock->active_timers) {
+delta = vm_clock->active_timers->expire_time -
  qemu_get_clock_ns(vm_clock);
 } else {
 delta = INT32_MAX;
 }
-if (active_timers[QEMU_CLOCK_HOST]) {
-int64_t hdelta = active_timers[QEMU_CLOCK_HOST]->expire_time -
+if (host_clock->active_timers) {
+int64_t hdelta = host_clock->active_timers->expire_time -
  qemu_get_clock_ns(host_clock);
 if (hdelta < delta)
 delta = hdelta;
 }
-if (active_timers[QEMU_CLOCK_REALTIME]) {
-rtdelta = (active_timers[QEMU_CLOCK_REALTIME]->expire_time -
+if (rt_clock->active_timers) {
+rtdelta = (rt_clock->active_timers->expire_time -
  qemu_get_clock_ns(rt_clock));
 if (rtdelta < delta)
 delta = rtdelta;
@@ -871,9 +868,9 @@ static void dynticks_rearm_timer(struct qemu_alarm_timer *t)
 int64_t current_ns;
 
 assert(alarm_has_dynticks(t));
-if (!active_timers[QEMU_CLOCK_REALTIME] &&
-!active_timers[QEMU_CLOCK_VIRTUAL] &&
-!active_timers[QEMU_CLOCK_HOST])
+if (!rt_clock->active_timers &&
+

[Qemu-devel] [PATCH 16/19] block: take lock around bdrv_read implementations

2011-10-21 Thread Kevin Wolf

From: Paolo Bonzini pbonz...@redhat.com

This does the first part of the conversion to coroutines, by
wrapping bdrv_read implementations to take the mutex.

Drivers that implement bdrv_read rather than bdrv_co_readv can
then benefit from asynchronous operation (at least if the underlying
protocol supports it, which is not the case for raw-win32), even
though they still operate with a bounce buffer.

raw-win32 does not need the lock, because it cannot yield.
nbd probably doesn't need it either, but better to be safe.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/bochs.c |   13 -
 block/cloop.c |   13 -
 block/cow.c   |   13 -
 block/dmg.c   |   13 -
 block/nbd.c   |   13 -
 block/parallels.c |   13 -
 block/vmdk.c  |   13 -
 block/vpc.c   |   13 -
 block/vvfat.c |   13 -
 9 files changed, 108 insertions(+), 9 deletions(-)

diff --git a/block/bochs.c b/block/bochs.c
index b0f8072..ab7944d 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -209,6 +209,17 @@ static int bochs_read(BlockDriverState *bs, int64_t 
sector_num,
 return 0;
 }
 
+static coroutine_fn int bochs_co_read(BlockDriverState *bs, int64_t sector_num,
+  uint8_t *buf, int nb_sectors)
+{
+int ret;
+BDRVBochsState *s = bs->opaque;
+qemu_co_mutex_lock(&s->lock);
+ret = bochs_read(bs, sector_num, buf, nb_sectors);
+qemu_co_mutex_unlock(&s->lock);
+return ret;
+}
+
 static void bochs_close(BlockDriverState *bs)
 {
 BDRVBochsState *s = bs->opaque;
@@ -220,7 +231,7 @@ static BlockDriver bdrv_bochs = {
 .instance_size = sizeof(BDRVBochsState),
 .bdrv_probe= bochs_probe,
 .bdrv_open = bochs_open,
-.bdrv_read = bochs_read,
+.bdrv_read  = bochs_co_read,
 .bdrv_close= bochs_close,
 };
 
diff --git a/block/cloop.c b/block/cloop.c
index a91f372..775f8a9 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -146,6 +146,17 @@ static int cloop_read(BlockDriverState *bs, int64_t 
sector_num,
 return 0;
 }
 
+static coroutine_fn int cloop_co_read(BlockDriverState *bs, int64_t sector_num,
+  uint8_t *buf, int nb_sectors)
+{
+int ret;
+BDRVCloopState *s = bs->opaque;
+qemu_co_mutex_lock(&s->lock);
+ret = cloop_read(bs, sector_num, buf, nb_sectors);
+qemu_co_mutex_unlock(&s->lock);
+return ret;
+}
+
 static void cloop_close(BlockDriverState *bs)
 {
 BDRVCloopState *s = bs->opaque;
@@ -161,7 +172,7 @@ static BlockDriver bdrv_cloop = {
 .instance_size = sizeof(BDRVCloopState),
 .bdrv_probe= cloop_probe,
 .bdrv_open = cloop_open,
-.bdrv_read = cloop_read,
+.bdrv_read  = cloop_co_read,
 .bdrv_close= cloop_close,
 };
 
diff --git a/block/cow.c b/block/cow.c
index 2f426e7..a5fcd20 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -201,6 +201,17 @@ static int cow_read(BlockDriverState *bs, int64_t 
sector_num,
 return 0;
 }
 
+static coroutine_fn int cow_co_read(BlockDriverState *bs, int64_t sector_num,
+uint8_t *buf, int nb_sectors)
+{
+int ret;
+BDRVCowState *s = bs->opaque;
+qemu_co_mutex_lock(&s->lock);
+ret = cow_read(bs, sector_num, buf, nb_sectors);
+qemu_co_mutex_unlock(&s->lock);
+return ret;
+}
+
 static int cow_write(BlockDriverState *bs, int64_t sector_num,
  const uint8_t *buf, int nb_sectors)
 {
@@ -308,7 +319,7 @@ static BlockDriver bdrv_cow = {
 .instance_size = sizeof(BDRVCowState),
 .bdrv_probe= cow_probe,
 .bdrv_open = cow_open,
-.bdrv_read = cow_read,
+.bdrv_read  = cow_co_read,
 .bdrv_write= cow_write,
 .bdrv_close= cow_close,
 .bdrv_create   = cow_create,
diff --git a/block/dmg.c b/block/dmg.c
index 111aeae..37902a4 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -282,6 +282,17 @@ static int dmg_read(BlockDriverState *bs, int64_t 
sector_num,
 return 0;
 }
 
+static coroutine_fn int dmg_co_read(BlockDriverState *bs, int64_t sector_num,
+uint8_t *buf, int nb_sectors)
+{
+int ret;
+BDRVDMGState *s = bs->opaque;
+qemu_co_mutex_lock(&s->lock);
+ret = dmg_read(bs, sector_num, buf, nb_sectors);
+qemu_co_mutex_unlock(&s->lock);
+return ret;
+}
+
 static void dmg_close(BlockDriverState *bs)
 {
 BDRVDMGState *s = bs->opaque;
@@ -302,7 +313,7 @@ static BlockDriver bdrv_dmg = {
 .instance_size = sizeof(BDRVDMGState),
 .bdrv_probe= dmg_probe,
 .bdrv_open = dmg_open,
-.bdrv_read = dmg_read,
+.bdrv_read  = dmg_co_read,
 .bdrv_close= dmg_close,
 };
 
diff --git a/block/nbd.c

[Qemu-devel] [PATCH v3 10/13] main-loop: create main-loop.h

2011-10-21 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 async.c   |1 +
 cpus.c|7 +-
 cpus.h|1 -
 iohandler.c   |1 +
 main-loop.h   |  327 +
 qemu-char.h   |   12 +--
 qemu-common.h |   30 -
 qemu-coroutine-lock.c |1 +
 qemu-os-win32.h   |   15 +--
 qemu-timer.h  |1 +
 sysemu.h  |3 +-
 vl.c  |1 +
 12 files changed, 336 insertions(+), 64 deletions(-)
 create mode 100644 main-loop.h

diff --git a/async.c b/async.c
index ca13962..332d511 100644
--- a/async.c
+++ b/async.c
@@ -24,6 +24,7 @@
 
 #include "qemu-common.h"
 #include "qemu-aio.h"
+#include "main-loop.h"
 
 /* Anchor of the list of Bottom Halves belonging to the context */
 static struct QEMUBH *first_bh;
diff --git a/cpus.c b/cpus.c
index 1328baa..64237b4 100644
--- a/cpus.c
+++ b/cpus.c
@@ -33,17 +33,12 @@
 
 #include "qemu-thread.h"
 #include "cpus.h"
+#include "main-loop.h"
 
 #ifndef _WIN32
 #include "compatfd.h"
 #endif
 
-#ifdef SIGRTMIN
-#define SIG_IPI (SIGRTMIN+4)
-#else
-#define SIG_IPI SIGUSR1
-#endif
-
 #ifdef CONFIG_LINUX
 
 #include <sys/prctl.h>
diff --git a/cpus.h b/cpus.h
index 5885885..4ccf986 100644
--- a/cpus.h
+++ b/cpus.h
@@ -2,7 +2,6 @@
 #define QEMU_CPUS_H
 
 /* cpus.c */
-int qemu_init_main_loop(void);
 void qemu_main_loop_start(void);
 void resume_all_vcpus(void);
 void pause_all_vcpus(void);
diff --git a/iohandler.c b/iohandler.c
index 4cc1c5a..687dc56 100644
--- a/iohandler.c
+++ b/iohandler.c
@@ -26,6 +26,7 @@
 #include "qemu-common.h"
 #include "qemu-char.h"
 #include "qemu-queue.h"
+#include "main-loop.h"
 
 #ifndef _WIN32
 #include <sys/wait.h>
diff --git a/main-loop.h b/main-loop.h
new file mode 100644
index 000..a73b9c0
--- /dev/null
+++ b/main-loop.h
@@ -0,0 +1,327 @@
+/*
+ * QEMU System Emulator
+ *
+ * Copyright (c) 2003-2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef QEMU_MAIN_LOOP_H
+#define QEMU_MAIN_LOOP_H 1
+
+#ifdef SIGRTMIN
+#define SIG_IPI (SIGRTMIN+4)
+#else
+#define SIG_IPI SIGUSR1
+#endif
+
+/**
+ * qemu_init_main_loop: Set up the process so that it can run the main loop.
+ *
+ * This includes setting up signal handlers.  It should be called before
+ * any other threads are created.  In addition, threads other than the
+ * main one should block signals that are trapped by the main loop.
+ * For simplicity, you can consider these signals to be safe: SIGUSR1,
+ * SIGUSR2, thread signals (SIGFPE, SIGILL, SIGSEGV, SIGBUS) and real-time
+ * signals if available.  Remember that Windows in practice does not have
+ * signals, though.
+ */
+int qemu_init_main_loop(void);
+
+/**
+ * main_loop_wait: Run one iteration of the main loop.
+ *
+ * If @nonblocking is true, poll for events, otherwise suspend until
+ * one actually occurs.  The main loop usually consists of a loop that
+ * repeatedly calls main_loop_wait(false).
+ *
+ * Main loop services include file descriptor callbacks, bottom halves
+ * and timers (defined in qemu-timer.h).  Bottom halves are similar to timers
+ * that execute immediately, but have a lower overhead and scheduling them
+ * is wait-free, thread-safe and signal-safe.
+ *
+ * It is sometimes useful to put a whole program in a coroutine.  In this
+ * case, the coroutine actually should be started from within the main loop,
+ * so that the main loop can run whenever the coroutine yields.  To do this,
+ * you can use a bottom half to enter the coroutine as soon as the main loop
+ * starts:
+ *
+ * void enter_co_bh(void *opaque) {
+ * QEMUCoroutine *co = opaque;
+ * qemu_coroutine_enter(co, NULL);
+ * }
+ *
+ * ...
+ * QEMUCoroutine *co = qemu_coroutine_create(coroutine_entry);
+ * QEMUBH *start_bh = qemu_bh_new(enter_co_bh, co);
+ * qemu_bh_schedule(start_bh);
+ * while (...) {
+ * main_loop_wait(false);
+ *

[Qemu-devel] [PATCH v3 03/13] qemu-timer: move common code to qemu_rearm_alarm_timer

2011-10-21 Thread Paolo Bonzini

Reviewed-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-timer.c |  129 --
 1 files changed, 53 insertions(+), 76 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index acf7a15..e2551f3 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -153,7 +153,7 @@ struct qemu_alarm_timer {
 char const *name;
 int (*start)(struct qemu_alarm_timer *t);
 void (*stop)(struct qemu_alarm_timer *t);
-void (*rearm)(struct qemu_alarm_timer *t);
+void (*rearm)(struct qemu_alarm_timer *t, int64_t nearest_delta_ns);
 #if defined(__linux__)
 int fd;
 timer_t timer;
@@ -181,12 +181,46 @@ static inline int alarm_has_dynticks(struct qemu_alarm_timer *t)
     return !!t->rearm;
 }
 
+static int64_t qemu_next_alarm_deadline(void)
+{
+    int64_t delta;
+    int64_t rtdelta;
+
+    if (!use_icount && vm_clock->active_timers) {
+        delta = vm_clock->active_timers->expire_time -
+                     qemu_get_clock_ns(vm_clock);
+    } else {
+        delta = INT32_MAX;
+    }
+    if (host_clock->active_timers) {
+        int64_t hdelta = host_clock->active_timers->expire_time -
+                 qemu_get_clock_ns(host_clock);
+        if (hdelta < delta) {
+            delta = hdelta;
+        }
+    }
+    if (rt_clock->active_timers) {
+        rtdelta = (rt_clock->active_timers->expire_time -
+                 qemu_get_clock_ns(rt_clock));
+        if (rtdelta < delta) {
+            delta = rtdelta;
+        }
+    }
+
+    return delta;
+}
+
 static void qemu_rearm_alarm_timer(struct qemu_alarm_timer *t)
 {
-    if (!alarm_has_dynticks(t))
+    int64_t nearest_delta_ns;
+    assert(alarm_has_dynticks(t));
+    if (!rt_clock->active_timers &&
+        !vm_clock->active_timers &&
+        !host_clock->active_timers) {
         return;
-
-    t->rearm(t);
+    }
+    nearest_delta_ns = qemu_next_alarm_deadline();
+    t->rearm(t, nearest_delta_ns);
 }
 
 /* TODO: MIN_TIMER_REARM_NS should be optimized */
@@ -196,23 +230,23 @@ static void qemu_rearm_alarm_timer(struct 
qemu_alarm_timer *t)
 
 static int mm_start_timer(struct qemu_alarm_timer *t);
 static void mm_stop_timer(struct qemu_alarm_timer *t);
-static void mm_rearm_timer(struct qemu_alarm_timer *t);
+static void mm_rearm_timer(struct qemu_alarm_timer *t, int64_t delta);
 
 static int win32_start_timer(struct qemu_alarm_timer *t);
 static void win32_stop_timer(struct qemu_alarm_timer *t);
-static void win32_rearm_timer(struct qemu_alarm_timer *t);
+static void win32_rearm_timer(struct qemu_alarm_timer *t, int64_t delta);
 
 #else
 
 static int unix_start_timer(struct qemu_alarm_timer *t);
 static void unix_stop_timer(struct qemu_alarm_timer *t);
-static void unix_rearm_timer(struct qemu_alarm_timer *t);
+static void unix_rearm_timer(struct qemu_alarm_timer *t, int64_t delta);
 
 #ifdef __linux__
 
 static int dynticks_start_timer(struct qemu_alarm_timer *t);
 static void dynticks_stop_timer(struct qemu_alarm_timer *t);
-static void dynticks_rearm_timer(struct qemu_alarm_timer *t);
+static void dynticks_rearm_timer(struct qemu_alarm_timer *t, int64_t delta);
 
 #endif /* __linux__ */
 
@@ -715,8 +749,6 @@ void qemu_run_all_timers(void)
 qemu_run_timers(host_clock);
 }
 
-static int64_t qemu_next_alarm_deadline(void);
-
 #ifdef _WIN32
 static void CALLBACK host_alarm_handler(PVOID lpParam, BOOLEAN unused)
 #else
@@ -781,33 +813,6 @@ int64_t qemu_next_icount_deadline(void)
 return delta;
 }
 
-static int64_t qemu_next_alarm_deadline(void)
-{
-    int64_t delta;
-    int64_t rtdelta;
-
-    if (!use_icount && vm_clock->active_timers) {
-        delta = vm_clock->active_timers->expire_time -
-                     qemu_get_clock_ns(vm_clock);
-    } else {
-        delta = INT32_MAX;
-    }
-    if (host_clock->active_timers) {
-        int64_t hdelta = host_clock->active_timers->expire_time -
-                 qemu_get_clock_ns(host_clock);
-        if (hdelta < delta)
-            delta = hdelta;
-    }
-    if (rt_clock->active_timers) {
-        rtdelta = (rt_clock->active_timers->expire_time -
-                 qemu_get_clock_ns(rt_clock));
-        if (rtdelta < delta)
-            delta = rtdelta;
-    }
-
-    return delta;
-}
-
 #if defined(__linux__)
 
 #include "compatfd.h"
@@ -860,20 +865,13 @@ static void dynticks_stop_timer(struct qemu_alarm_timer *t)
     timer_delete(host_timer);
 }
 
-static void dynticks_rearm_timer(struct qemu_alarm_timer *t)
+static void dynticks_rearm_timer(struct qemu_alarm_timer *t,
+                                 int64_t nearest_delta_ns)
 {
     timer_t host_timer = t->timer;
     struct itimerspec timeout;
-    int64_t nearest_delta_ns = INT64_MAX;
     int64_t current_ns;
 
-    assert(alarm_has_dynticks(t));
-    if (!rt_clock->active_timers &&
-        !vm_clock->active_timers &&
-        !host_clock->active_timers)
-        return;
-
-    nearest_delta_ns = qemu_next_alarm_deadline();
 if (nearest_delta_ns

[Qemu-devel] [PATCH v3 08/13] qemu-timer: move more stuff out of qemu-timer.c

2011-10-21 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-timer.c |   35 ---
 qemu-timer.h |2 ++
 savevm.c |   25 +
 vl.c |1 +
 4 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 7fa81e1..58926dd 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -266,11 +266,8 @@ static QEMUClock *qemu_new_clock(int type)
     clock = g_malloc0(sizeof(QEMUClock));
     clock->type = type;
     clock->enabled = 1;
+    clock->last = INT64_MIN;
     notifier_list_init(&clock->reset_notifiers);
-    /* required to detect & report backward jumps */
-    if (type == QEMU_CLOCK_HOST) {
-        clock->last = get_clock_realtime();
-    }
     return clock;
 }
 
@@ -344,7 +341,7 @@ void qemu_del_timer(QEMUTimer *ts)
 
 /* modify the current timer so that it will be fired when current_time
    >= expire_time. The corresponding callback will be called. */
-static void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time)
+void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time)
 {
     QEMUTimer **pt, *t;
 
@@ -378,8 +375,6 @@ static void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time)
     }
 }
 
-/* modify the current timer so that it will be fired when current_time
-   >= expire_time. The corresponding callback will be called. */
 void qemu_mod_timer(QEMUTimer *ts, int64_t expire_time)
 {
     qemu_mod_timer_ns(ts, expire_time * ts->scale);
@@ -464,33 +459,11 @@ void init_clocks(void)
 rt_clock = qemu_new_clock(QEMU_CLOCK_REALTIME);
 vm_clock = qemu_new_clock(QEMU_CLOCK_VIRTUAL);
 host_clock = qemu_new_clock(QEMU_CLOCK_HOST);
-
-rtc_clock = host_clock;
 }
 
-/* save a timer */
-void qemu_put_timer(QEMUFile *f, QEMUTimer *ts)
+uint64_t qemu_timer_expire_time_ns(QEMUTimer *ts)
 {
-    uint64_t expire_time;
-
-    if (qemu_timer_pending(ts)) {
-        expire_time = ts->expire_time;
-    } else {
-        expire_time = -1;
-    }
-    qemu_put_be64(f, expire_time);
-}
-
-void qemu_get_timer(QEMUFile *f, QEMUTimer *ts)
-{
-    uint64_t expire_time;
-
-    expire_time = qemu_get_be64(f);
-    if (expire_time != -1) {
-        qemu_mod_timer_ns(ts, expire_time);
-    } else {
-        qemu_del_timer(ts);
-    }
+    return qemu_timer_pending(ts) ? ts->expire_time : -1;
 }
 
 void qemu_run_all_timers(void)
diff --git a/qemu-timer.h b/qemu-timer.h
index b4ea201..9f4ffed 100644
--- a/qemu-timer.h
+++ b/qemu-timer.h
@@ -52,9 +52,11 @@ QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
   QEMUTimerCB *cb, void *opaque);
 void qemu_free_timer(QEMUTimer *ts);
 void qemu_del_timer(QEMUTimer *ts);
+void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time);
 void qemu_mod_timer(QEMUTimer *ts, int64_t expire_time);
 int qemu_timer_pending(QEMUTimer *ts);
 int qemu_timer_expired(QEMUTimer *timer_head, int64_t current_time);
+uint64_t qemu_timer_expire_time_ns(QEMUTimer *ts);
 
 void qemu_run_all_timers(void);
 int qemu_alarm_pending(void);
diff --git a/savevm.c b/savevm.c
index cf79a56..f01838f 100644
--- a/savevm.c
+++ b/savevm.c
@@ -81,6 +81,7 @@
 #include "migration.h"
 #include "qemu_socket.h"
 #include "qemu-queue.h"
+#include "qemu-timer.h"
 #include "cpus.h"
 
 #define SELF_ANNOUNCE_ROUNDS 5
@@ -712,6 +713,30 @@ uint64_t qemu_get_be64(QEMUFile *f)
 return v;
 }
 
+
+/* timer */
+
+void qemu_put_timer(QEMUFile *f, QEMUTimer *ts)
+{
+uint64_t expire_time;
+
+expire_time = qemu_timer_expire_time_ns(ts);
+qemu_put_be64(f, expire_time);
+}
+
+void qemu_get_timer(QEMUFile *f, QEMUTimer *ts)
+{
+uint64_t expire_time;
+
+expire_time = qemu_get_be64(f);
+if (expire_time != -1) {
+qemu_mod_timer_ns(ts, expire_time);
+} else {
+qemu_del_timer(ts);
+}
+}
+
+
 /* bool */
 
 static int get_bool(QEMUFile *f, void *pv, size_t size)
diff --git a/vl.c b/vl.c
index 6bd7e71..cf25d65 100644
--- a/vl.c
+++ b/vl.c
@@ -2311,6 +2311,7 @@ int main(int argc, char **argv, char **envp)
 runstate_init();
 
 init_clocks();
+rtc_clock = host_clock;
 
 qemu_cache_utils_init(envp);
 
-- 
1.7.6

[Qemu-devel] [PATCH v3 07/13] qemu-timer: use atexit for quit_timers

2011-10-21 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-timer.c |   15 ---
 qemu-timer.h |1 -
 vl.c |1 -
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index d8507e3..7fa81e1 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -840,6 +840,13 @@ static void alarm_timer_on_change_state_rearm(void *opaque, int running,
         qemu_rearm_alarm_timer((struct qemu_alarm_timer *) opaque);
 }
 
+static void quit_timers(void)
+{
+    struct qemu_alarm_timer *t = alarm_timer;
+    alarm_timer = NULL;
+    t->stop(t);
+}
+
 int init_timer_alarm(void)
 {
     struct qemu_alarm_timer *t = NULL;
@@ -859,6 +866,7 @@ int init_timer_alarm(void)
     }
 
     /* first event is at time 0 */
+    atexit(quit_timers);
     t->pending = 1;
     alarm_timer = t;
     qemu_add_vm_change_state_handler(alarm_timer_on_change_state_rearm, t);
@@ -869,13 +877,6 @@ fail:
     return err;
 }
 
-void quit_timers(void)
-{
-    struct qemu_alarm_timer *t = alarm_timer;
-    alarm_timer = NULL;
-    t->stop(t);
-}
-
 int qemu_calculate_timeout(void)
 {
 return 1000;
diff --git a/qemu-timer.h b/qemu-timer.h
index ce576b9..b4ea201 100644
--- a/qemu-timer.h
+++ b/qemu-timer.h
@@ -62,7 +62,6 @@ void configure_alarms(char const *opt);
 int qemu_calculate_timeout(void);
 void init_clocks(void);
 int init_timer_alarm(void);
-void quit_timers(void);
 
 int64_t cpu_get_ticks(void);
 void cpu_enable_ticks(void);
diff --git a/vl.c b/vl.c
index 66f70fb..6bd7e71 100644
--- a/vl.c
+++ b/vl.c
@@ -3565,7 +3565,6 @@ int main(int argc, char **argv, char **envp)
 os_setup_post();
 
 main_loop();
-quit_timers();
 net_cleanup();
 res_free();
 
-- 
1.7.6

[Qemu-devel] [PATCH 11/19] qcow2: Fix bdrv_write_compressed error handling

2011-10-21 Thread Kevin Wolf

If during allocation of compressed clusters the cluster was already allocated
uncompressed, fail and properly release the l2_table (the latter avoids a
failed assertion).

While at it, make it return some real error numbers instead of -1.
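
For readers less used to the idiom, the error handling this patch moves to
is the usual single-exit cleanup: every failure sets a negative errno value
and jumps to one label that frees the buffer. A minimal standalone sketch
follows. It is not code from block/qcow2.c; write_compressed_sketch() and
its two failure flags are made up for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a compress-and-write routine.  The two flags
 * simulate a failing compression step and a failing write step. */
static int write_compressed_sketch(const char *buf, size_t len,
                                   int compress_fails, int write_fails)
{
    int ret;
    char *out_buf = malloc(len ? len : 1);

    if (!out_buf) {
        return -ENOMEM;
    }

    if (compress_fails) {
        ret = -EINVAL;      /* a real error number instead of -1 */
        goto fail;
    }
    memcpy(out_buf, buf, len);

    if (write_fails) {
        ret = -EIO;
        goto fail;
    }

    ret = 0;
fail:
    /* Single exit: the buffer is released on success and on every error. */
    free(out_buf);
    return ret;
}
```

The point of the shape is that adding a new failure case cannot forget the
free, which is the kind of leak the old early returns invited.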

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Dong Xu Wang wdon...@linux.vnet.ibm.com
---
 block/qcow2-cluster.c |6 --
 block/qcow2.c |   29 ++---
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 2f76311..f4e049f 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -568,8 +568,10 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
     }
 
     cluster_offset = be64_to_cpu(l2_table[l2_index]);
-    if (cluster_offset & QCOW_OFLAG_COPIED)
-        return cluster_offset & ~QCOW_OFLAG_COPIED;
+    if (cluster_offset & QCOW_OFLAG_COPIED) {
+        qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+        return 0;
+    }
 
     if (cluster_offset)
         qcow2_free_any_clusters(bs, cluster_offset, 1);
diff --git a/block/qcow2.c b/block/qcow2.c
index 4dc980c..91f4f04 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1053,8 +1053,8 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
                        Z_DEFLATED, -12,
                        9, Z_DEFAULT_STRATEGY);
     if (ret != 0) {
-        g_free(out_buf);
-        return -1;
+        ret = -EINVAL;
+        goto fail;
     }
 
     strm.avail_in = s->cluster_size;
@@ -1064,9 +1064,9 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
 
     ret = deflate(&strm, Z_FINISH);
     if (ret != Z_STREAM_END && ret != Z_OK) {
-        g_free(out_buf);
         deflateEnd(&strm);
-        return -1;
+        ret = -EINVAL;
+        goto fail;
     }
     out_len = strm.next_out - out_buf;
 
@@ -1074,22 +1074,29 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
 
     if (ret != Z_STREAM_END || out_len >= s->cluster_size) {
         /* could not compress: write normal cluster */
-        bdrv_write(bs, sector_num, buf, s->cluster_sectors);
+        ret = bdrv_write(bs, sector_num, buf, s->cluster_sectors);
+        if (ret < 0) {
+            goto fail;
+        }
     } else {
         cluster_offset = qcow2_alloc_compressed_cluster_offset(bs,
             sector_num << 9, out_len);
-        if (!cluster_offset)
-            return -1;
+        if (!cluster_offset) {
+            ret = -EIO;
+            goto fail;
+        }
         cluster_offset &= s->cluster_offset_mask;
         BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED);
-        if (bdrv_pwrite(bs->file, cluster_offset, out_buf, out_len) != out_len) {
-            g_free(out_buf);
-            return -1;
+        ret = bdrv_pwrite(bs->file, cluster_offset, out_buf, out_len);
+        if (ret < 0) {
+            goto fail;
         }
     }
 
+    ret = 0;
+fail:
     g_free(out_buf);
-    return 0;
+    return ret;
 }
 
 static BlockDriverAIOCB *qcow2_aio_flush(BlockDriverState *bs,
-- 
1.7.6.4

[Qemu-devel] [PATCH 05/19] xen_disk: Always set feature-barrier = 1

2011-10-21 Thread Kevin Wolf

The synchronous .bdrv_flush callback doesn't exist any more and a device really
shouldn't poke into the block layer internals anyway. All drivers are supposed
to have a correctly working bdrv_flush, so let's just hard-code this.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/xen_disk.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index 8a9fac4..286bbac 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -620,7 +620,7 @@ static void blk_alloc(struct XenDevice *xendev)
 static int blk_init(struct XenDevice *xendev)
 {
     struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
-    int index, qflags, have_barriers, info = 0;
+    int index, qflags, info = 0;
 
     /* read xenstore entries */
     if (blkdev->params == NULL) {
@@ -706,7 +706,6 @@ static int blk_init(struct XenDevice *xendev)
                       blkdev->bs->drv ? blkdev->bs->drv->format_name : "-");
         blkdev->file_size = 0;
     }
-    have_barriers = blkdev->bs->drv && blkdev->bs->drv->bdrv_flush ? 1 : 0;
 
     xen_be_printf(xendev, 1, "type \"%s\", fileproto \"%s\", filename \"%s\","
                   " size %" PRId64 " (%" PRId64 " MB)\n",
@@ -714,7 +713,7 @@ static int blk_init(struct XenDevice *xendev)
                   blkdev->file_size, blkdev->file_size >> 20);
 
     /* fill info */
-    xenstore_write_be_int(&blkdev->xendev, "feature-barrier", have_barriers);
+    xenstore_write_be_int(&blkdev->xendev, "feature-barrier", 1);
     xenstore_write_be_int(&blkdev->xendev, "info", info);
     xenstore_write_be_int(&blkdev->xendev, "sector-size", blkdev->file_blk);
     xenstore_write_be_int(&blkdev->xendev, "sectors",
-- 
1.7.6.4

[Qemu-devel] [PATCH 10/19] qemu-img: Don't allow preallocation and compression at the same time

2011-10-21 Thread Kevin Wolf

Only qcow and qcow2 can do compression at all, and they require unallocated
clusters when writing the compressed data.
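
The check itself is small: compression is refused whenever a preallocation
mode other than "off" was requested. A standalone sketch of that rule,
assuming a hypothetical helper check_compress_prealloc() that takes the
option value as a plain string (NULL when the option is unset) rather than
a QEMUOptionParameter:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of qemu-img.c.
 * Returns 0 if the combination is acceptable, -1 otherwise.
 * prealloc is the requested preallocation mode, or NULL if not given. */
static int check_compress_prealloc(int compress, const char *prealloc)
{
    if (compress && prealloc && strcmp(prealloc, "off") != 0) {
        /* Compressed writes need unallocated clusters. */
        return -1;
    }
    return 0;
}
```

An unset option and an explicit "off" are treated the same way, so existing
command lines that never mention preallocation keep working.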

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 qemu-img.c |   11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 6a39731..86127f0 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -824,6 +824,8 @@ static int img_convert(int argc, char **argv)
     if (compress) {
         QEMUOptionParameter *encryption =
             get_option_parameter(param, BLOCK_OPT_ENCRYPT);
+        QEMUOptionParameter *preallocation =
+            get_option_parameter(param, BLOCK_OPT_PREALLOC);
 
         if (!drv->bdrv_write_compressed) {
             error_report("Compression not supported for this file format");
@@ -837,6 +839,15 @@ static int img_convert(int argc, char **argv)
             ret = -1;
             goto out;
         }
+
+        if (preallocation && preallocation->value.s
+            && strcmp(preallocation->value.s, "off"))
+        {
+            error_report("Compression and preallocation not supported at "
+                         "the same time");
+            ret = -1;
+            goto out;
+        }
     }
 
 /* Create the new image */
-- 
1.7.6.4

[Qemu-devel] [PATCH 02/19] add socket_set_block

2011-10-21 Thread Kevin Wolf

From: Paolo Bonzini pbonz...@redhat.com

Cc: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 oslib-posix.c |7 +++
 oslib-win32.c |6 ++
 qemu_socket.h |1 +
 3 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/oslib-posix.c b/oslib-posix.c
index a304fb0..dbc8ee8 100644
--- a/oslib-posix.c
+++ b/oslib-posix.c
@@ -103,6 +103,13 @@ void qemu_vfree(void *ptr)
 free(ptr);
 }
 
+void socket_set_block(int fd)
+{
+    int f;
+    f = fcntl(fd, F_GETFL);
+    fcntl(fd, F_SETFL, f & ~O_NONBLOCK);
+}
+
 void socket_set_nonblock(int fd)
 {
 int f;
diff --git a/oslib-win32.c b/oslib-win32.c
index 5f0759f..5e3de7d 100644
--- a/oslib-win32.c
+++ b/oslib-win32.c
@@ -73,6 +73,12 @@ void qemu_vfree(void *ptr)
 VirtualFree(ptr, 0, MEM_RELEASE);
 }
 
+void socket_set_block(int fd)
+{
+    unsigned long opt = 0;
+    ioctlsocket(fd, FIONBIO, &opt);
+}
+
 void socket_set_nonblock(int fd)
 {
 unsigned long opt = 1;
diff --git a/qemu_socket.h b/qemu_socket.h
index 180e4db..9e32fac 100644
--- a/qemu_socket.h
+++ b/qemu_socket.h
@@ -35,6 +35,7 @@ int inet_aton(const char *cp, struct in_addr *ia);
 /* misc helpers */
 int qemu_socket(int domain, int type, int protocol);
 int qemu_accept(int s, struct sockaddr *addr, socklen_t *addrlen);
+void socket_set_block(int fd);
 void socket_set_nonblock(int fd);
 int send_all(int fd, const void *buf, int len1);
 
-- 
1.7.6.4

[Qemu-devel] [PATCH 08/19] block: add bdrv_co_discard and bdrv_aio_discard support

2011-10-21 Thread Kevin Wolf

From: Paolo Bonzini pbonz...@redhat.com

This similarly adds support for coroutine and asynchronous discard.
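
In the patch below, bdrv_co_discard() picks a driver callback in a fixed
order: the coroutine callback first, then the AIO callback, then the old
synchronous one, and finally it reports success if the driver has none. A
much simplified model of just that precedence, with made-up names (struct
driver_caps, pick_discard_path) and flags standing in for the real function
pointers:

```c
#include <assert.h>

/* Which implementation a discard request would be routed to. */
enum discard_path {
    DISCARD_CO,     /* native coroutine callback */
    DISCARD_AIO,    /* AIO callback; the calling coroutine yields meanwhile */
    DISCARD_SYNC,   /* legacy synchronous callback */
    DISCARD_NOOP    /* no callback at all: succeed without doing anything */
};

/* Hypothetical, flattened view of a block driver: one flag per callback. */
struct driver_caps {
    int has_co_discard;
    int has_aio_discard;
    int has_sync_discard;
};

static enum discard_path pick_discard_path(const struct driver_caps *drv)
{
    if (drv->has_co_discard) {
        return DISCARD_CO;
    } else if (drv->has_aio_discard) {
        return DISCARD_AIO;
    } else if (drv->has_sync_discard) {
        return DISCARD_SYNC;
    }
    return DISCARD_NOOP;
}
```

The sketch leaves out the checks that come before the dispatch in the real
function (missing medium, invalid request, read-only device).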

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.c  |  102 +++--
 block.h  |4 ++
 block/raw.c  |   10 +++--
 block_int.h  |9 -
 trace-events |1 +
 5 files changed, 109 insertions(+), 17 deletions(-)

diff --git a/block.c b/block.c
index 7b8b14d..28508f2 100644
--- a/block.c
+++ b/block.c
@@ -1768,17 +1768,6 @@ int bdrv_has_zero_init(BlockDriverState *bs)
 return 1;
 }
 
-int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
-{
-    if (!bs->drv) {
-        return -ENOMEDIUM;
-    }
-    if (!bs->drv->bdrv_discard) {
-        return 0;
-    }
-    return bs->drv->bdrv_discard(bs, sector_num, nb_sectors);
-}
-
 /*
  * Returns true iff the specified sector is present in the disk image. Drivers
  * not implementing the functionality are assumed to not support backing files,
@@ -2754,6 +2743,34 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
     return &acb->common;
 }
 
+static void coroutine_fn bdrv_aio_discard_co_entry(void *opaque)
+{
+    BlockDriverAIOCBCoroutine *acb = opaque;
+    BlockDriverState *bs = acb->common.bs;
+
+    acb->req.error = bdrv_co_discard(bs, acb->req.sector, acb->req.nb_sectors);
+    acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
+    qemu_bh_schedule(acb->bh);
+}
+
+BlockDriverAIOCB *bdrv_aio_discard(BlockDriverState *bs,
+        int64_t sector_num, int nb_sectors,
+        BlockDriverCompletionFunc *cb, void *opaque)
+{
+    Coroutine *co;
+    BlockDriverAIOCBCoroutine *acb;
+
+    trace_bdrv_aio_discard(bs, sector_num, nb_sectors, opaque);
+
+    acb = qemu_aio_get(&bdrv_em_co_aio_pool, bs, cb, opaque);
+    acb->req.sector = sector_num;
+    acb->req.nb_sectors = nb_sectors;
+    co = qemu_coroutine_create(bdrv_aio_discard_co_entry);
+    qemu_coroutine_enter(co, acb);
+
+    return &acb->common;
+}
+
 void bdrv_init(void)
 {
 module_call_init(MODULE_INIT_BLOCK);
@@ -2915,6 +2932,69 @@ int bdrv_flush(BlockDriverState *bs)
 return rwco.ret;
 }
 
+static void coroutine_fn bdrv_discard_co_entry(void *opaque)
+{
+    RwCo *rwco = opaque;
+
+    rwco->ret = bdrv_co_discard(rwco->bs, rwco->sector_num, rwco->nb_sectors);
+}
+
+int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
+                                 int nb_sectors)
+{
+    if (!bs->drv) {
+        return -ENOMEDIUM;
+    } else if (bdrv_check_request(bs, sector_num, nb_sectors)) {
+        return -EIO;
+    } else if (bs->read_only) {
+        return -EROFS;
+    } else if (bs->drv->bdrv_co_discard) {
+        return bs->drv->bdrv_co_discard(bs, sector_num, nb_sectors);
+    } else if (bs->drv->bdrv_aio_discard) {
+        BlockDriverAIOCB *acb;
+        CoroutineIOCompletion co = {
+            .coroutine = qemu_coroutine_self(),
+        };
+
+        acb = bs->drv->bdrv_aio_discard(bs, sector_num, nb_sectors,
+                                        bdrv_co_io_em_complete, &co);
+        if (acb == NULL) {
+            return -EIO;
+        } else {
+            qemu_coroutine_yield();
+            return co.ret;
+        }
+    } else if (bs->drv->bdrv_discard) {
+        return bs->drv->bdrv_discard(bs, sector_num, nb_sectors);
+    } else {
+        return 0;
+    }
+}
+
+int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
+{
+    Coroutine *co;
+    RwCo rwco = {
+        .bs = bs,
+        .sector_num = sector_num,
+        .nb_sectors = nb_sectors,
+        .ret = NOT_DONE,
+    };
+
+    if (qemu_in_coroutine()) {
+        /* Fast-path if already in coroutine context */
+        bdrv_discard_co_entry(&rwco);
+    } else {
+        co = qemu_coroutine_create(bdrv_discard_co_entry);
+        qemu_coroutine_enter(co, &rwco);
+        while (rwco.ret == NOT_DONE) {
+            qemu_aio_wait();
+        }
+    }
+
+    return rwco.ret;
+}
+
 /**/
 /* removable device support */
 
diff --git a/block.h b/block.h
index 65c5166..5a042c9 100644
--- a/block.h
+++ b/block.h
@@ -166,6 +166,9 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
                                   BlockDriverCompletionFunc *cb, void *opaque);
 BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
                                  BlockDriverCompletionFunc *cb, void *opaque);
+BlockDriverAIOCB *bdrv_aio_discard(BlockDriverState *bs,
+                                   int64_t sector_num, int nb_sectors,
+                                   BlockDriverCompletionFunc *cb, void *opaque);
 void bdrv_aio_cancel(BlockDriverAIOCB *acb);
 
 typedef struct BlockRequest {
@@ -196,6 +199,7 @@ void bdrv_flush_all(void);
 void bdrv_close_all(void);
 
 int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
+int bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,

[Qemu-devel] [PATCH v3 11/13] main-loop: create main-loop.c

2011-10-21 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 Makefile.objs|2 +-
 cpus.c   |  189 +-
 cpus.h   |1 +
 main-loop.c  |  495 ++
 main-loop.h  |   24 +++
 os-win32.c   |  123 --
 qemu-common.h|3 -
 qemu-os-posix.h  |4 -
 qemu-os-win32.h  |2 -
 slirp/libslirp.h |   11 --
 vl.c |  123 +-
 11 files changed, 523 insertions(+), 454 deletions(-)
 create mode 100644 main-loop.c

diff --git a/Makefile.objs b/Makefile.objs
index 9e20778..01587c8 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -81,7 +81,7 @@ common-obj-y += $(oslib-obj-y)
 common-obj-$(CONFIG_WIN32) += os-win32.o
 common-obj-$(CONFIG_POSIX) += os-posix.o
 
-common-obj-y += tcg-runtime.o host-utils.o
+common-obj-y += tcg-runtime.o host-utils.o main-loop.o
 common-obj-y += irq.o input.o
 common-obj-$(CONFIG_PTIMER) += ptimer.o
 common-obj-$(CONFIG_MAX7310) += max7310.o
diff --git a/cpus.c b/cpus.c
index 64237b4..b9f1573 100644
--- a/cpus.c
+++ b/cpus.c
@@ -542,143 +542,10 @@ static void qemu_kvm_eat_signals(CPUState *env)
 #endif /* !CONFIG_LINUX */
 
 #ifndef _WIN32
-static int io_thread_fd = -1;
-
-static void qemu_event_increment(void)
-{
-    /* Write 8 bytes to be compatible with eventfd.  */
-    static const uint64_t val = 1;
-    ssize_t ret;
-
-    if (io_thread_fd == -1) {
-        return;
-    }
-    do {
-        ret = write(io_thread_fd, &val, sizeof(val));
-    } while (ret < 0 && errno == EINTR);
-
-    /* EAGAIN is fine, a read must be pending.  */
-    if (ret < 0 && errno != EAGAIN) {
-        fprintf(stderr, "qemu_event_increment: write() failed: %s\n",
-                strerror(errno));
-        exit (1);
-    }
-}
-
-static void qemu_event_read(void *opaque)
-{
-    int fd = (intptr_t)opaque;
-    ssize_t len;
-    char buffer[512];
-
-    /* Drain the notify pipe.  For eventfd, only 8 bytes will be read.  */
-    do {
-        len = read(fd, buffer, sizeof(buffer));
-    } while ((len == -1 && errno == EINTR) || len == sizeof(buffer));
-}
-
-static int qemu_event_init(void)
-{
-    int err;
-    int fds[2];
-
-    err = qemu_eventfd(fds);
-    if (err == -1) {
-        return -errno;
-    }
-    err = fcntl_setfl(fds[0], O_NONBLOCK);
-    if (err < 0) {
-        goto fail;
-    }
-    err = fcntl_setfl(fds[1], O_NONBLOCK);
-    if (err < 0) {
-        goto fail;
-    }
-    qemu_set_fd_handler2(fds[0], NULL, qemu_event_read, NULL,
-                         (void *)(intptr_t)fds[0]);
-
-    io_thread_fd = fds[1];
-    return 0;
-
-fail:
-    close(fds[0]);
-    close(fds[1]);
-    return err;
-}
-
 static void dummy_signal(int sig)
 {
 }
 
-/* If we have signalfd, we mask out the signals we want to handle and then
- * use signalfd to listen for them.  We rely on whatever the current signal
- * handler is to dispatch the signals when we receive them.
- */
-static void sigfd_handler(void *opaque)
-{
-    int fd = (intptr_t)opaque;
-    struct qemu_signalfd_siginfo info;
-    struct sigaction action;
-    ssize_t len;
-
-    while (1) {
-        do {
-            len = read(fd, &info, sizeof(info));
-        } while (len == -1 && errno == EINTR);
-
-        if (len == -1 && errno == EAGAIN) {
-            break;
-        }
-
-        if (len != sizeof(info)) {
-            printf("read from sigfd returned %zd: %m\n", len);
-            return;
-        }
-
-        sigaction(info.ssi_signo, NULL, &action);
-        if ((action.sa_flags & SA_SIGINFO) && action.sa_sigaction) {
-            action.sa_sigaction(info.ssi_signo,
-                                (siginfo_t *)&info, NULL);
-        } else if (action.sa_handler) {
-            action.sa_handler(info.ssi_signo);
-        }
-    }
-}
-
-static int qemu_signal_init(void)
-{
-    int sigfd;
-    sigset_t set;
-
-    /*
-     * SIG_IPI must be blocked in the main thread and must not be caught
-     * by sigwait() in the signal thread. Otherwise, the cpu thread will
-     * not catch it reliably.
-     */
-    sigemptyset(&set);
-    sigaddset(&set, SIG_IPI);
-    pthread_sigmask(SIG_BLOCK, &set, NULL);
-
-    sigemptyset(&set);
-    sigaddset(&set, SIGIO);
-    sigaddset(&set, SIGALRM);
-    sigaddset(&set, SIGBUS);
-    pthread_sigmask(SIG_BLOCK, &set, NULL);
-
-    sigfd = qemu_signalfd(&set);
-    if (sigfd == -1) {
-        fprintf(stderr, "failed to create signalfd\n");
-        return -errno;
-    }
-
-    fcntl_setfl(sigfd, O_NONBLOCK);
-
-    qemu_set_fd_handler2(sigfd, NULL, sigfd_handler, NULL,
-                         (void *)(intptr_t)sigfd);
-
-    return 0;
-}
-
 static void qemu_kvm_init_cpu_signals(CPUState *env)
 {
 int r;
@@ -722,38 +589,6 @@ static void qemu_tcg_init_cpu_signals(void)
 }
 
 #else /* _WIN32 */
-
-HANDLE qemu_event_handle;
-
-static void dummy_event_handler(void *opaque)
-{
-}
-
-static int qemu_event_init(void)
-{
-qemu_event_handle = CreateEvent(NULL, FALSE, FALSE, NULL);
-if (!qemu_event_handle) {
-

[Qemu-devel] [PATCH v3 05/13] qemu-timer: move icount to cpus.c

2011-10-21 Thread Paolo Bonzini

None of this is needed by tools, and most of it can even be made static
inside cpus.c.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c|  295 +
 exec-all.h|   14 +++
 exec.c|3 -
 qemu-common.h |4 +
 qemu-timer.c  |  279 -
 qemu-timer.h  |   24 +-
 6 files changed, 296 insertions(+), 323 deletions(-)

diff --git a/cpus.c b/cpus.c
index 5f5b763..21538e6 100644
--- a/cpus.c
+++ b/cpus.c
@@ -65,6 +65,281 @@
 static CPUState *next_cpu;
 
 /***/
+/* guest cycle counter */
+
+/* Conversion factor from emulated instructions to virtual clock ticks.  */
+static int icount_time_shift;
+/* Arbitrarily pick 1MIPS as the minimum allowable speed.  */
+#define MAX_ICOUNT_SHIFT 10
+/* Compensate for varying guest execution speed.  */
+static int64_t qemu_icount_bias;
+static QEMUTimer *icount_rt_timer;
+static QEMUTimer *icount_vm_timer;
+static QEMUTimer *icount_warp_timer;
+static int64_t vm_clock_warp_start;
+static int64_t qemu_icount;
+
+typedef struct TimersState {
+int64_t cpu_ticks_prev;
+int64_t cpu_ticks_offset;
+int64_t cpu_clock_offset;
+int32_t cpu_ticks_enabled;
+int64_t dummy;
+} TimersState;
+
+TimersState timers_state;
+
+/* Return the virtual CPU time, based on the instruction counter.  */
+int64_t cpu_get_icount(void)
+{
+    int64_t icount;
+    CPUState *env = cpu_single_env;;
+
+    icount = qemu_icount;
+    if (env) {
+        if (!can_do_io(env)) {
+            fprintf(stderr, "Bad clock read\n");
+        }
+        icount -= (env->icount_decr.u16.low + env->icount_extra);
+    }
+    return qemu_icount_bias + (icount << icount_time_shift);
+}
+
+/* return the host CPU cycle counter and handle stop/restart */
+int64_t cpu_get_ticks(void)
+{
+    if (use_icount) {
+        return cpu_get_icount();
+    }
+    if (!timers_state.cpu_ticks_enabled) {
+        return timers_state.cpu_ticks_offset;
+    } else {
+        int64_t ticks;
+        ticks = cpu_get_real_ticks();
+        if (timers_state.cpu_ticks_prev > ticks) {
+            /* Note: non increasing ticks may happen if the host uses
+               software suspend */
+            timers_state.cpu_ticks_offset += timers_state.cpu_ticks_prev - ticks;
+        }
+        timers_state.cpu_ticks_prev = ticks;
+        return ticks + timers_state.cpu_ticks_offset;
+    }
+}
+
+/* return the host CPU monotonic timer and handle stop/restart */
+int64_t cpu_get_clock(void)
+{
+int64_t ti;
+if (!timers_state.cpu_ticks_enabled) {
+return timers_state.cpu_clock_offset;
+} else {
+ti = get_clock();
+return ti + timers_state.cpu_clock_offset;
+}
+}
+
+/* enable cpu_get_ticks() */
+void cpu_enable_ticks(void)
+{
+if (!timers_state.cpu_ticks_enabled) {
+timers_state.cpu_ticks_offset -= cpu_get_real_ticks();
+timers_state.cpu_clock_offset -= get_clock();
+timers_state.cpu_ticks_enabled = 1;
+}
+}
+
+/* disable cpu_get_ticks() : the clock is stopped. You must not call
+   cpu_get_ticks() after that.  */
+void cpu_disable_ticks(void)
+{
+if (timers_state.cpu_ticks_enabled) {
+timers_state.cpu_ticks_offset = cpu_get_ticks();
+timers_state.cpu_clock_offset = cpu_get_clock();
+timers_state.cpu_ticks_enabled = 0;
+}
+}
+
+/* Correlation between real and virtual time is always going to be
+   fairly approximate, so ignore small variation.
+   When the guest is idle real and virtual time will be aligned in
+   the IO wait loop.  */
+#define ICOUNT_WOBBLE (get_ticks_per_sec() / 10)
+
+static void icount_adjust(void)
+{
+    int64_t cur_time;
+    int64_t cur_icount;
+    int64_t delta;
+    static int64_t last_delta;
+    /* If the VM is not running, then do nothing.  */
+    if (!runstate_is_running()) {
+        return;
+    }
+    cur_time = cpu_get_clock();
+    cur_icount = qemu_get_clock_ns(vm_clock);
+    delta = cur_icount - cur_time;
+    /* FIXME: This is a very crude algorithm, somewhat prone to oscillation.  */
+    if (delta > 0
+        && last_delta + ICOUNT_WOBBLE < delta * 2
+        && icount_time_shift > 0) {
+        /* The guest is getting too far ahead.  Slow time down.  */
+        icount_time_shift--;
+    }
+    if (delta < 0
+        && last_delta - ICOUNT_WOBBLE > delta * 2
+        && icount_time_shift < MAX_ICOUNT_SHIFT) {
+        /* The guest is getting too far behind.  Speed time up.  */
+        icount_time_shift++;
+    }
+    last_delta = delta;
+    qemu_icount_bias = cur_icount - (qemu_icount << icount_time_shift);
+}
+
+static void icount_adjust_rt(void *opaque)
+{
+qemu_mod_timer(icount_rt_timer,
+   qemu_get_clock_ms(rt_clock) + 1000);
+icount_adjust();
+}
+
+static void icount_adjust_vm(void *opaque)
+{
+qemu_mod_timer(icount_vm_timer,
+   qemu_get_clock_ns(vm_clock) +

[Qemu-devel] [PULL v3 00/13] allow tools to use the QEMU main loop

2011-10-21 Thread Paolo Bonzini

The following changes since commit c76eaf13975130768070ecd2d4f3107eb69ab757:

  hw/9pfs: Fix broken compilation caused by wrong trace events (2011-10-20 
15:30:59 -0500)

are available in the git repository at:
  git://github.com/bonzini/qemu.git split-main-loop-for-anthony

This patch series makes the QEMU main loop usable outside the QEMU
executable, especially in tools and possibly in unit tests.  This is
cleaner because it avoids introducing partial transitions to GIOChannel.
Interfacing with the glib main loop is still possible.

The main loop code is currently split between cpus.c and vl.c.  Moving it
to a new file is easy; the problem is that the main loop depends on the
timer infrastructure in qemu-timer.c, and that file currently contains
the implementation of icount and the vm_clock.  This is bad from the
perspective of linking qemu-timer.c into the tools.  Luckily, it is
relatively easy to untie them and move them out of the way.  This is
what the largest part of the series does (patches 1-9).

Patches 10-13 complete the refactoring and clean up some surrounding
code.

v2->v3
Rebased, added documentation

v1->v2
Rebased

Paolo Bonzini (13):
  remove unused function
  qemu-timer: remove active_timers array
  qemu-timer: move common code to qemu_rearm_alarm_timer
  qemu-timer: more clock functions
  qemu-timer: move icount to cpus.c
  qemu-timer: do not refer to runstate_is_running()
  qemu-timer: use atexit for quit_timers
  qemu-timer: move more stuff out of qemu-timer.c
  qemu-timer: do not use RunState change handlers
  main-loop: create main-loop.h
  main-loop: create main-loop.c
  Revert to a hand-made select loop
  simplify main loop functions

 Makefile.objs |2 +-
 async.c   |1 +
 cpus.c|  497 -
 cpus.h|3 +-
 exec-all.h|   14 ++
 exec.c|3 -
 hw/mac_dbdma.c|5 -
 hw/mac_dbdma.h|1 -
 iohandler.c   |   55 +--
 main-loop.c   |  495 
 main-loop.h   |  351 ++
 os-win32.c|  123 
 qemu-char.h   |   12 +-
 qemu-common.h |   37 +
 qemu-coroutine-lock.c |1 +
 qemu-os-posix.h   |4 -
 qemu-os-win32.h   |   17 +--
 qemu-timer.c  |  489 +---
 qemu-timer.h  |   31 +---
 savevm.c  |   25 +++
 slirp/libslirp.h  |   11 -
 sysemu.h  |3 +-
 vl.c  |  189 ---
 23 files changed, 1309 insertions(+), 1060 deletions(-)
 create mode 100644 main-loop.c
 create mode 100644 main-loop.h

-- 
1.7.6

[Qemu-devel] [PATCH v3 09/13] qemu-timer: do not use RunState change handlers

2011-10-21 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-timer.c |   12 
 1 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 58926dd..f11a28d 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -273,7 +273,11 @@ static QEMUClock *qemu_new_clock(int type)
 
 void qemu_clock_enable(QEMUClock *clock, int enabled)
 {
+bool old = clock->enabled;
 clock->enabled = enabled;
+if (enabled && !old) {
+qemu_rearm_alarm_timer(alarm_timer);
+}
 }
 
 int64_t qemu_clock_has_timers(QEMUClock *clock)
@@ -806,13 +810,6 @@ static void win32_rearm_timer(struct qemu_alarm_timer *t,
 
 #endif /* _WIN32 */
 
-static void alarm_timer_on_change_state_rearm(void *opaque, int running,
-  RunState state)
-{
-if (running)
-qemu_rearm_alarm_timer((struct qemu_alarm_timer *) opaque);
-}
-
 static void quit_timers(void)
 {
 struct qemu_alarm_timer *t = alarm_timer;
@@ -842,7 +839,6 @@ int init_timer_alarm(void)
 atexit(quit_timers);
 t->pending = 1;
 alarm_timer = t;
-qemu_add_vm_change_state_handler(alarm_timer_on_change_state_rearm, t);
 
 return 0;
 
-- 
1.7.6

[Qemu-devel] [PATCH 09/19] fdc: Fix floppy port I/O

2011-10-21 Thread Kevin Wolf

The floppy device was broken by commit 212ec7ba (fdc: Convert to
isa_register_portio_list). While the old interface provided the port number
relative to the floppy drive's io_base, the new one provides the real port
number, so we need to apply a bitmask now to get the register number.
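
To illustrate the masking, here is a minimal sketch. The helper name is
hypothetical and does not exist in hw/fdc.c; it also assumes the
controller's base port is aligned to 8, so that the low three bits of the
absolute port are the register number.

```c
#include <stdint.h>

/* Hypothetical helper, not part of hw/fdc.c: map an absolute I/O port
 * to a register index in the range 0..7 by keeping the low three bits.
 * Assumes the controller's base port is a multiple of 8. */
static uint32_t fdc_port_to_reg(uint32_t port)
{
    return port & 7;
}
```

With such a mapping, an absolute port and the old io_base-relative offset
select the same register, which is what the switch statements in
fdctrl_read and fdctrl_write rely on.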

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/fdc.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/hw/fdc.c b/hw/fdc.c
index 4b06e04..f8af2de 100644
--- a/hw/fdc.c
+++ b/hw/fdc.c
@@ -434,6 +434,7 @@ static uint32_t fdctrl_read (void *opaque, uint32_t reg)
 FDCtrl *fdctrl = opaque;
 uint32_t retval;
 
+reg &= 7;
 switch (reg) {
 case FD_REG_SRA:
 retval = fdctrl_read_statusA(fdctrl);
@@ -471,6 +472,7 @@ static void fdctrl_write (void *opaque, uint32_t reg, uint32_t value)
 
 FLOPPY_DPRINTF("write reg%d: 0x%02x\n", reg & 7, value);
 
+reg &= 7;
 switch (reg) {
 case FD_REG_DOR:
 fdctrl_write_dor(fdctrl, value);
-- 
1.7.6.4

[Qemu-devel] [PATCH 03/19] block: rename bdrv_co_rw_bh

2011-10-21 Thread Kevin Wolf

From: Paolo Bonzini pbonz...@redhat.com

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 9873b57..7184a0f 100644
--- a/block.c
+++ b/block.c
@@ -2735,7 +2735,7 @@ static AIOPool bdrv_em_co_aio_pool = {
 .cancel = bdrv_aio_co_cancel_em,
 };
 
-static void bdrv_co_rw_bh(void *opaque)
+static void bdrv_co_em_bh(void *opaque)
 {
 BlockDriverAIOCBCoroutine *acb = opaque;
 
@@ -2758,7 +2758,7 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque)
 acb->req.nb_sectors, acb->req.qiov);
 }
 
-acb->bh = qemu_bh_new(bdrv_co_rw_bh, acb);
+acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
 qemu_bh_schedule(acb->bh);
 }
 
-- 
1.7.6.4

[Qemu-devel] [PATCH 06/19] block: unify flush implementations

2011-10-21 Thread Kevin Wolf

From: Paolo Bonzini pbonz...@redhat.com

Add coroutine support for flush and apply the same emulation that
we already do for read/write.  bdrv_aio_flush is simplified to always
go through a coroutine.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.c |  164 ++
 block.h |1 +
 block_int.h |1 +
 3 files changed, 76 insertions(+), 90 deletions(-)

diff --git a/block.c b/block.c
index 7184a0f..7b8b14d 100644
--- a/block.c
+++ b/block.c
@@ -53,17 +53,12 @@ static BlockDriverAIOCB *bdrv_aio_readv_em(BlockDriverState 
*bs,
 static BlockDriverAIOCB *bdrv_aio_writev_em(BlockDriverState *bs,
 int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
 BlockDriverCompletionFunc *cb, void *opaque);
-static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs,
-BlockDriverCompletionFunc *cb, void *opaque);
-static BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs,
-BlockDriverCompletionFunc *cb, void *opaque);
 static int coroutine_fn bdrv_co_readv_em(BlockDriverState *bs,
  int64_t sector_num, int nb_sectors,
  QEMUIOVector *iov);
 static int coroutine_fn bdrv_co_writev_em(BlockDriverState *bs,
  int64_t sector_num, int nb_sectors,
  QEMUIOVector *iov);
-static int coroutine_fn bdrv_co_flush_em(BlockDriverState *bs);
 static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
 static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
@@ -203,9 +198,6 @@ void bdrv_register(BlockDriver *bdrv)
 }
 }
 
-if (!bdrv->bdrv_aio_flush)
-bdrv->bdrv_aio_flush = bdrv_aio_flush_em;
-
 QLIST_INSERT_HEAD(&bdrv_drivers, bdrv, list);
 }
 
@@ -1027,11 +1019,6 @@ static int bdrv_check_request(BlockDriverState *bs, 
int64_t sector_num,
nb_sectors * BDRV_SECTOR_SIZE);
 }
 
-static inline bool bdrv_has_async_flush(BlockDriver *drv)
-{
-return drv->bdrv_aio_flush != bdrv_aio_flush_em;
-}
-
 typedef struct RwCo {
 BlockDriverState *bs;
 int64_t sector_num;
@@ -1759,33 +1746,6 @@ const char *bdrv_get_device_name(BlockDriverState *bs)
 return bs->device_name;
 }
 
-int bdrv_flush(BlockDriverState *bs)
-{
-if (bs->open_flags & BDRV_O_NO_FLUSH) {
-return 0;
-}
-
-if (bs->drv && bdrv_has_async_flush(bs->drv) && qemu_in_coroutine()) {
-return bdrv_co_flush_em(bs);
-}
-
-if (bs->drv && bs->drv->bdrv_flush) {
-return bs->drv->bdrv_flush(bs);
-}
-
-/*
- * Some block drivers always operate in either writethrough or unsafe mode
- * and don't support bdrv_flush therefore. Usually qemu doesn't know how
- * the server works (because the behaviour is hardcoded or depends on
- * server-side configuration), so we can't ensure that everything is safe
- * on disk. Returning an error doesn't work because that would break guests
- * even if the server operates in writethrough mode.
- *
- * Let's hope the user knows what he's doing.
- */
-return 0;
-}
-
 void bdrv_flush_all(void)
 {
 BlockDriverState *bs;
@@ -2610,22 +2570,6 @@ fail:
 return -1;
 }
 
-BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
-BlockDriverCompletionFunc *cb, void *opaque)
-{
-BlockDriver *drv = bs->drv;
-
-trace_bdrv_aio_flush(bs, opaque);
-
-if (bs->open_flags & BDRV_O_NO_FLUSH) {
-return bdrv_aio_noop_em(bs, cb, opaque);
-}
-
-if (!drv)
-return NULL;
-return drv->bdrv_aio_flush(bs, cb, opaque);
-}
-
 void bdrv_aio_cancel(BlockDriverAIOCB *acb)
 {
 acb->pool->cancel(acb);
@@ -2785,41 +2729,28 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
 return &acb->common;
 }
 
-static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs,
-BlockDriverCompletionFunc *cb, void *opaque)
+static void coroutine_fn bdrv_aio_flush_co_entry(void *opaque)
 {
-BlockDriverAIOCBSync *acb;
-
-acb = qemu_aio_get(&bdrv_em_aio_pool, bs, cb, opaque);
-acb->is_write = 1; /* don't bounce in the completion hadler */
-acb->qiov = NULL;
-acb->bounce = NULL;
-acb->ret = 0;
-
-if (!acb->bh)
-acb->bh = qemu_bh_new(bdrv_aio_bh_cb, acb);
+BlockDriverAIOCBCoroutine *acb = opaque;
+BlockDriverState *bs = acb->common.bs;
 
-bdrv_flush(bs);
+acb->req.error = bdrv_co_flush(bs);
+acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
 qemu_bh_schedule(acb->bh);
-return &acb->common;
 }
 
-static BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs,
+BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
 BlockDriverCompletionFunc *cb, void *opaque)
 {
-BlockDriverAIOCBSync *acb;
+trace_bdrv_aio_flush(bs, opaque);
 
-acb =

Re: [Qemu-devel] [PATCH 5/5] Convert remaining calls to g_malloc(sizeof(type)) to g_new()

2011-10-21 Thread Stuart Brady

On Fri, Oct 21, 2011 at 09:37:02AM +0200, Paolo Bonzini wrote:
> On 10/21/2011 02:26 AM, Stuart Brady wrote:
> > > They all look okay, perhaps the include path you passed to
> > > Coccinelle is incomplete?
> > Ah, good point!  I'm not sure what include dirs are needed, though...
> > anyone have any advice?
> >
> > Blue Swirl, I gather you're one of the few other people to have used
> > Coccinelle with Qemu's source...
>
> I played a bit yesterday and it turns out that Coccinelle is a bit
> limited WRT handling headers, because they are very expensive.  I
> used -I . -I +build -I hw but it didn't help much.
>
> Stuart/Blue, do you have a macro file?  Mine was simply #define
> coroutine_fn.

I didn't even have that, but Coccinelle didn't seem to mind...

It did occur to me that since a lot of Qemu's source is recompiled with
different macro definitions for different targets, we need to be really
careful about what we do regarding includes.  Hopefully the names of
types that are used won't vary between targets, though.

Submitting what Coccinelle could process successfully and fixing up the
rest manually seemed reasonable, but I'd like to be as confident as
possible of these changes.

BTW, I'd thought that no one would ever do E = (T *)g_malloc(sizeof(*E)),
but from looking at hw/blizzard.c, hw/cbus.c and hw/nseries.c, it seems
that this isn't quite the case after all!  I'll be sure to include this
in my second attempt, once QEMU 1.0 has been released.

One thing that did not occur to me is use of E = malloc(sizeof(*E1)) or
E = malloc(sizeof(T1)) where E is of type void *, but E1 or T1 is not
what was intended.

I'm also somewhat astonished to find that sizeof(void) and sizeof(*E)
where E is of type void * both compile!  It would probably make sense to
check for these.

Any remaining calls to g_malloc() would then be reviewed to make sure
that they're all correct.

We could also perhaps search for places where free() is called on memory
that is allocated with g_malloc(), as g_free() should be used instead.

---

Some background on my thinking before sending the patch series:

(T *)g_malloc(sizeof(T)) can obviously be safely replaced with
g_new(T, 1) since that's what g_new(T, 1) expands to.

Replacing E = g_malloc(sizeof(*E)) with E = g_new(T, 1) adds a cast, but
the cast does not provide any extra safety, since sizeof(*E) is pretty
much certain to be the correct size (unless E is of type void *).  There
seems to be some agreement that this is more readable, though.

Replacing E = g_malloc(sizeof(T)) without a cast with E = g_new(T, 1)
effectively just adds a cast to T *, which might result in additional
compilation warnings (which are turned into errors) but should have no
other effect, so this should be perfectly safe.

Other cases where g_malloc(sizeof(*E)) or g_malloc(sizeof(T)) is used
will either be due to Coccinelle not understanding the types, or due to
a bug in Qemu, and both of these cases need special consideration.
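
The effect of the added cast can be sketched as follows.  This is only an
illustration: XNEW is a hypothetical stand-in for g_new() built on plain
malloc() so that it compiles without GLib, and Point is an invented type.

```c
#include <stdlib.h>

/* Hypothetical stand-in for GLib's g_new(), built on plain malloc() so
 * the sketch compiles without GLib.  XNEW is not a real GLib macro. */
#define XNEW(T, n) ((T *)malloc(sizeof(T) * (size_t)(n)))

typedef struct Point {
    int x;
    int y;
} Point;

/* Because XNEW(Point, 1) has type Point *, assigning it to a pointer of
 * an unrelated type draws a compiler diagnostic.  A bare
 * malloc(sizeof(Point)) returns void *, so the same mismatch would
 * compile silently. */
static Point *point_new(int x, int y)
{
    Point *p = XNEW(Point, 1);
    if (p != NULL) {
        p->x = x;
        p->y = y;
    }
    return p;
}
```

Note that unlike this stand-in, g_malloc() aborts on allocation failure
rather than returning NULL, so the NULL check is only needed here.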

Cheers,
-- 
Stuart

[Qemu-devel] [PATCH 17/19] block: take lock around bdrv_write implementations

2011-10-21 Thread Kevin Wolf

From: Paolo Bonzini pbonz...@redhat.com

This does the first part of the conversion to coroutines, by
wrapping bdrv_write implementations to take the mutex.

Drivers that implement bdrv_write rather than bdrv_co_writev can
then benefit from asynchronous operation (at least if the underlying
protocol supports it, which is not the case for raw-win32), even
though they still operate with a bounce buffer.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/cow.c   |   13 -
 block/nbd.c   |   13 -
 block/vmdk.c  |   13 -
 block/vpc.c   |   13 -
 block/vvfat.c |   13 -
 5 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/block/cow.c b/block/cow.c
index a5fcd20..29fa844 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -226,6 +226,17 @@ static int cow_write(BlockDriverState *bs, int64_t 
sector_num,
 return cow_update_bitmap(bs, sector_num, nb_sectors);
 }
 
+static coroutine_fn int cow_co_write(BlockDriverState *bs, int64_t sector_num,
+ const uint8_t *buf, int nb_sectors)
+{
+int ret;
+BDRVCowState *s = bs->opaque;
+qemu_co_mutex_lock(&s->lock);
+ret = cow_write(bs, sector_num, buf, nb_sectors);
+qemu_co_mutex_unlock(&s->lock);
+return ret;
+}
+
 static void cow_close(BlockDriverState *bs)
 {
 }
@@ -320,7 +331,7 @@ static BlockDriver bdrv_cow = {
 .bdrv_probe= cow_probe,
 .bdrv_open = cow_open,
 .bdrv_read  = cow_co_read,
-.bdrv_write= cow_write,
+.bdrv_write = cow_co_write,
 .bdrv_close= cow_close,
 .bdrv_create   = cow_create,
 .bdrv_flush= cow_flush,
diff --git a/block/nbd.c b/block/nbd.c
index 6b22ae1..882b2dc 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -251,6 +251,17 @@ static coroutine_fn int nbd_co_read(BlockDriverState *bs, 
int64_t sector_num,
 return ret;
 }
 
+static coroutine_fn int nbd_co_write(BlockDriverState *bs, int64_t sector_num,
+ const uint8_t *buf, int nb_sectors)
+{
+int ret;
+BDRVNBDState *s = bs->opaque;
+qemu_co_mutex_lock(&s->lock);
+ret = nbd_write(bs, sector_num, buf, nb_sectors);
+qemu_co_mutex_unlock(&s->lock);
+return ret;
+}
+
 static void nbd_close(BlockDriverState *bs)
 {
 BDRVNBDState *s = bs->opaque;
@@ -272,7 +283,7 @@ static BlockDriver bdrv_nbd = {
 .instance_size = sizeof(BDRVNBDState),
 .bdrv_file_open= nbd_open,
 .bdrv_read  = nbd_co_read,
-.bdrv_write= nbd_write,
+.bdrv_write = nbd_co_write,
 .bdrv_close= nbd_close,
 .bdrv_getlength= nbd_getlength,
 .protocol_name = "nbd",
diff --git a/block/vmdk.c b/block/vmdk.c
index 0e791f2..3b376ed 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1116,6 +1116,17 @@ static int vmdk_write(BlockDriverState *bs, int64_t 
sector_num,
 return 0;
 }
 
+static coroutine_fn int vmdk_co_write(BlockDriverState *bs, int64_t sector_num,
+  const uint8_t *buf, int nb_sectors)
+{
+int ret;
+BDRVVmdkState *s = bs->opaque;
+qemu_co_mutex_lock(&s->lock);
+ret = vmdk_write(bs, sector_num, buf, nb_sectors);
+qemu_co_mutex_unlock(&s->lock);
+return ret;
+}
+
 
 static int vmdk_create_extent(const char *filename, int64_t filesize,
   bool flat, bool compress)
@@ -1554,7 +1565,7 @@ static BlockDriver bdrv_vmdk = {
 .bdrv_probe = vmdk_probe,
 .bdrv_open  = vmdk_open,
 .bdrv_read  = vmdk_co_read,
-.bdrv_write = vmdk_write,
+.bdrv_write = vmdk_co_write,
 .bdrv_close = vmdk_close,
 .bdrv_create= vmdk_create,
 .bdrv_flush = vmdk_flush,
diff --git a/block/vpc.c b/block/vpc.c
index 0941533..74ca642 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -456,6 +456,17 @@ static int vpc_write(BlockDriverState *bs, int64_t 
sector_num,
 return 0;
 }
 
+static coroutine_fn int vpc_co_write(BlockDriverState *bs, int64_t sector_num,
+ const uint8_t *buf, int nb_sectors)
+{
+int ret;
+BDRVVPCState *s = bs->opaque;
+qemu_co_mutex_lock(&s->lock);
+ret = vpc_write(bs, sector_num, buf, nb_sectors);
+qemu_co_mutex_unlock(&s->lock);
+return ret;
+}
+
 static int vpc_flush(BlockDriverState *bs)
 {
 return bdrv_flush(bs->file);
@@ -653,7 +664,7 @@ static BlockDriver bdrv_vpc = {
 .bdrv_probe = vpc_probe,
 .bdrv_open  = vpc_open,
 .bdrv_read  = vpc_co_read,
-.bdrv_write = vpc_write,
+.bdrv_write = vpc_co_write,
 .bdrv_flush = vpc_flush,
 .bdrv_close = vpc_close,
 .bdrv_create= vpc_create,
diff --git a/block/vvfat.c b/block/vvfat.c
index 970cccf..e1fcdbc 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -2727,6 +2727,17 @@ DLOG(checkpoint());
 return 0;
 }

[Qemu-devel] [PATCH v3 06/13] qemu-timer: do not refer to runstate_is_running()

2011-10-21 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c   |1 +
 qemu-timer.c |5 +
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/cpus.c b/cpus.c
index 21538e6..1328baa 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1059,6 +1059,7 @@ void pause_all_vcpus(void)
 {
 CPUState *penv = first_cpu;
 
+qemu_clock_enable(vm_clock, false);
 while (penv) {
 penv->stop = 1;
 qemu_cpu_kick(penv);
diff --git a/qemu-timer.c b/qemu-timer.c
index 8129af6..d8507e3 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -504,10 +504,7 @@ void qemu_run_all_timers(void)
 }
 
 /* vm time timers */
-if (runstate_is_running()) {
-qemu_run_timers(vm_clock);
-}
-
+qemu_run_timers(vm_clock);
 qemu_run_timers(rt_clock);
 qemu_run_timers(host_clock);
 }
-- 
1.7.6

[Qemu-devel] [PULL 00/19] Block patches

2011-10-21 Thread Kevin Wolf

The following changes since commit c2e2343e1faae7bbc77574c12a25881b1b696808:

  hw/arm_gic.c: Fix save/load of irq_target array (2011-10-21 17:19:56 +0200)

are available in the git repository at:
  git://repo.or.cz/qemu/kevin.git for-anthony

Alex Jia (1):
  fix memory leak in aio_write_f

Kevin Wolf (5):
  xen_disk: Always set feature-barrier = 1
  fdc: Fix floppy port I/O
  qemu-img: Don't allow preallocation and compression at the same time
  qcow2: Fix bdrv_write_compressed error handling
  pc: Fix floppy drives with if=none

Paolo Bonzini (12):
  sheepdog: add coroutine_fn markers
  add socket_set_block
  block: rename bdrv_co_rw_bh
  block: unify flush implementations
  block: add bdrv_co_discard and bdrv_aio_discard support
  vmdk: fix return values of vmdk_parent_open
  vmdk: clean up open
  block: add a CoMutex to synchronous read drivers
  block: take lock around bdrv_read implementations
  block: take lock around bdrv_write implementations
  block: change flush to co_flush
  block: change discard to co_discard

Stefan Hajnoczi (1):
  block: drop redundant bdrv_flush implementation

 block.c   |  258 ++---
 block.h   |5 +
 block/blkdebug.c  |6 -
 block/blkverify.c |9 --
 block/bochs.c |   15 +++-
 block/cloop.c |   15 +++-
 block/cow.c   |   34 ++-
 block/dmg.c   |   15 +++-
 block/nbd.c   |   28 +-
 block/parallels.c |   15 +++-
 block/qcow.c  |   17 +---
 block/qcow2-cluster.c |6 +-
 block/qcow2.c |   72 ++
 block/qed.c   |6 -
 block/raw-posix.c |   23 +
 block/raw-win32.c |4 +-
 block/raw.c   |   23 ++---
 block/rbd.c   |4 +-
 block/sheepdog.c  |   14 ++--
 block/vdi.c   |6 +-
 block/vmdk.c  |   82 ++--
 block/vpc.c   |   34 ++-
 block/vvfat.c |   28 +-
 block_int.h   |9 +-
 hw/fdc.c  |   14 +++
 hw/fdc.h  |9 ++-
 hw/pc.c   |   25 +++--
 hw/pc.h   |3 +-
 hw/pc_piix.c  |5 +-
 hw/xen_disk.c |5 +-
 oslib-posix.c |7 ++
 oslib-win32.c |6 +
 qemu-img.c|   11 ++
 qemu-io.c |1 +
 qemu_socket.h |1 +
 trace-events  |1 +
 36 files changed, 524 insertions(+), 292 deletions(-)

[Qemu-devel] [PATCH 15/19] block: add a CoMutex to synchronous read drivers

2011-10-21 Thread Kevin Wolf

From: Paolo Bonzini pbonz...@redhat.com

The big conversion of bdrv_read/write to coroutines caused the two
homonymous callbacks in BlockDriver to become reentrant.  It goes
like this:

1) bdrv_read is now called in a coroutine, and calls bdrv_read or
bdrv_pread.

2) the nested bdrv_read goes through the fast path in bdrv_rw_co_entry;

3) in the common case when the protocol is file, bdrv_co_do_readv calls
bdrv_co_readv_em (and from here goes to bdrv_co_io_em), which yields
until the AIO operation is complete;

4) if bdrv_read had been called from a bottom half, the main loop
is free to iterate again: a device model or another bottom half
can then come and call bdrv_read again.

This applies to all four of read/write/flush/discard.  It would also
apply to is_allocated, but it is not used from within coroutines:
besides qemu-img.c and qemu-io.c, which operate synchronously, the
only user is the monitor.  Copy-on-read will introduce a use in the
block layer, and will require converting it.

The solution is simply to convert all drivers to coroutines!  We
just need to add a CoMutex that is taken around affected operations.
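
The lock/call/unlock shape this leads to can be sketched as below.  All
names are hypothetical, and a pthread mutex stands in for CoMutex purely
so that the sketch is self-contained; a real CoMutex yields the waiting
coroutine instead of blocking a thread.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical toy driver state; a pthread mutex stands in for CoMutex. */
typedef struct ToyDriverState {
    pthread_mutex_t lock;   /* taken around every read */
    int reads_done;         /* state that must not be touched reentrantly */
} ToyDriverState;

/* Counterpart of the qemu_co_mutex_init() call added to each open(). */
static void toy_open(ToyDriverState *s)
{
    pthread_mutex_init(&s->lock, NULL);
    s->reads_done = 0;
}

/* The original implementation, written without reentrancy in mind. */
static int toy_read(ToyDriverState *s, uint8_t *buf, int nb_sectors)
{
    int i;

    for (i = 0; i < nb_sectors * 512; i++) {
        buf[i] = 0;
    }
    s->reads_done++;
    return 0;
}

/* Wrapper that serializes callers around the unchanged implementation. */
static int toy_locked_read(ToyDriverState *s, uint8_t *buf, int nb_sectors)
{
    int ret;

    pthread_mutex_lock(&s->lock);
    ret = toy_read(s, buf, nb_sectors);
    pthread_mutex_unlock(&s->lock);
    return ret;
}
```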

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/bochs.c |2 ++
 block/cloop.c |2 ++
 block/cow.c   |2 ++
 block/dmg.c   |2 ++
 block/nbd.c   |2 ++
 block/parallels.c |2 ++
 block/vmdk.c  |2 ++
 block/vpc.c   |2 ++
 block/vvfat.c |2 ++
 9 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/block/bochs.c b/block/bochs.c
index 3c2f8d1..b0f8072 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -80,6 +80,7 @@ struct bochs_header {
 };
 
 typedef struct BDRVBochsState {
+CoMutex lock;
 uint32_t *catalog_bitmap;
 int catalog_size;
 
@@ -150,6 +151,7 @@ static int bochs_open(BlockDriverState *bs, int flags)
 
 s->extent_size = le32_to_cpu(bochs.extra.redolog.extent);
 
+qemu_co_mutex_init(&s->lock);
 return 0;
  fail:
 return -1;
diff --git a/block/cloop.c b/block/cloop.c
index 8cff9f2..a91f372 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -27,6 +27,7 @@
 #include <zlib.h>
 
 typedef struct BDRVCloopState {
+CoMutex lock;
 uint32_t block_size;
 uint32_t n_blocks;
 uint64_t* offsets;
@@ -93,6 +94,7 @@ static int cloop_open(BlockDriverState *bs, int flags)
 
 s->sectors_per_block = s->block_size/512;
 bs->total_sectors = s->n_blocks*s->sectors_per_block;
+qemu_co_mutex_init(&s->lock);
 return 0;
 
 cloop_close:
diff --git a/block/cow.c b/block/cow.c
index 4cf543c..2f426e7 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -42,6 +42,7 @@ struct cow_header_v2 {
 };
 
 typedef struct BDRVCowState {
+CoMutex lock;
 int64_t cow_sectors_offset;
 } BDRVCowState;
 
@@ -84,6 +85,7 @@ static int cow_open(BlockDriverState *bs, int flags)
 
 bitmap_size = ((bs->total_sectors + 7) >> 3) + sizeof(cow_header);
 s->cow_sectors_offset = (bitmap_size + 511) & ~511;
+qemu_co_mutex_init(&s->lock);
 return 0;
  fail:
 return -1;
diff --git a/block/dmg.c b/block/dmg.c
index 64c3cce..111aeae 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -28,6 +28,7 @@
 #include <zlib.h>
 
 typedef struct BDRVDMGState {
+CoMutex lock;
 /* each chunk contains a certain number of sectors,
  * offsets[i] is the offset in the .dmg file,
  * lengths[i] is the length of the compressed chunk,
@@ -177,6 +178,7 @@ static int dmg_open(BlockDriverState *bs, int flags)
 
 s->current_chunk = s->n_chunks;
 
+qemu_co_mutex_init(&s->lock);
 return 0;
 fail:
 return -1;
diff --git a/block/nbd.c b/block/nbd.c
index 76f04d8..14ab225 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -47,6 +47,7 @@
 #endif
 
 typedef struct BDRVNBDState {
+CoMutex lock;
 int sock;
 uint32_t nbdflags;
 off_t size;
@@ -175,6 +176,7 @@ static int nbd_open(BlockDriverState *bs, const char* filename, int flags)
  */
 result = nbd_establish_connection(bs);
 
+qemu_co_mutex_init(&s->lock);
 return result;
 }
 
diff --git a/block/parallels.c b/block/parallels.c
index c64103d..b86e87e 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -46,6 +46,7 @@ struct parallels_header {
 } QEMU_PACKED;
 
 typedef struct BDRVParallelsState {
+CoMutex lock;
 
 uint32_t *catalog_bitmap;
 int catalog_size;
@@ -95,6 +96,7 @@ static int parallels_open(BlockDriverState *bs, int flags)
 for (i = 0; i < s->catalog_size; i++)
 le32_to_cpus(&s->catalog_bitmap[i]);
 
+qemu_co_mutex_init(&s->lock);
 return 0;
 fail:
 if (s->catalog_bitmap)
diff --git a/block/vmdk.c b/block/vmdk.c
index ace2977..1ce220d 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -90,6 +90,7 @@ typedef struct VmdkExtent {
 } VmdkExtent;
 
 typedef struct BDRVVmdkState {
+CoMutex lock;
 int desc_offset;
 bool cid_updated;
 uint32_t parent_cid;
@@ -646,6 +647,7 @@ static int vmdk_open(BlockDriverState *bs, int flags)
 goto fail;
 }

[Qemu-devel] [PATCH 01/19] sheepdog: add coroutine_fn markers

2011-10-21 Thread Kevin Wolf

From: Paolo Bonzini pbonz...@redhat.com

This makes the following patch easier to review.

Cc: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/sheepdog.c |   14 +++---
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index ae857e2..9f80609 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -396,7 +396,7 @@ static inline int free_aio_req(BDRVSheepdogState *s, AIOReq 
*aio_req)
 return !QLIST_EMPTY(&acb->aioreq_head);
 }
 
-static void sd_finish_aiocb(SheepdogAIOCB *acb)
+static void coroutine_fn sd_finish_aiocb(SheepdogAIOCB *acb)
 {
 if (!acb->canceled) {
 qemu_coroutine_enter(acb->coroutine, NULL);
@@ -735,7 +735,7 @@ out:
 return ret;
 }
 
-static int add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
+static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
struct iovec *iov, int niov, int create,
enum AIOCBState aiocb_type);
 
@@ -743,7 +743,7 @@ static int add_aio_request(BDRVSheepdogState *s, AIOReq 
*aio_req,
  * This function searchs pending requests to the object `oid', and
  * sends them.
  */
-static void send_pending_req(BDRVSheepdogState *s, uint64_t oid, uint32_t id)
+static void coroutine_fn send_pending_req(BDRVSheepdogState *s, uint64_t oid, 
uint32_t id)
 {
 AIOReq *aio_req, *next;
 SheepdogAIOCB *acb;
@@ -777,7 +777,7 @@ static void send_pending_req(BDRVSheepdogState *s, uint64_t 
oid, uint32_t id)
  * This function is registered as a fd handler, and called from the
  * main loop when s->fd is ready for reading responses.
  */
-static void aio_read_response(void *opaque)
+static void coroutine_fn aio_read_response(void *opaque)
 {
 SheepdogObjRsp rsp;
 BDRVSheepdogState *s = opaque;
@@ -1064,7 +1064,7 @@ out:
 return ret;
 }
 
-static int add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
+static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
struct iovec *iov, int niov, int create,
enum AIOCBState aiocb_type)
 {
@@ -1517,7 +1517,7 @@ static int sd_truncate(BlockDriverState *bs, int64_t 
offset)
  * update metadata, this sends a write request to the vdi object.
  * Otherwise, this switches back to sd_co_readv/writev.
  */
-static void sd_write_done(SheepdogAIOCB *acb)
+static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
 {
 int ret;
 BDRVSheepdogState *s = acb->common.bs->opaque;
@@ -1615,7 +1615,7 @@ out:
  * Returns 1 when we need to wait a response, 0 when there is no sent
  * request and -errno in error cases.
  */
-static int sd_co_rw_vector(void *p)
+static int coroutine_fn sd_co_rw_vector(void *p)
 {
 SheepdogAIOCB *acb = p;
 int ret = 0;
-- 
1.7.6.4

[Qemu-devel] [PATCH 2/2] block: Handle cache=unsafe only in raw-posix/win32

2011-10-21 Thread Kevin Wolf

The expected meaning of cache=unsafe with qcow2 is that on a flush the metadata
caches are written out, but no fsync is performed.
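
As a toy model of that layering (every name below is hypothetical and
none of it is QEMU code): the format layer always writes out its
metadata caches, and only the protocol layer at the bottom decides
whether to skip the sync.

```c
/* Hypothetical toy model; none of these names exist in QEMU. */
#define TOY_O_NO_FLUSH 0x1

typedef struct ToyImage {
    int open_flags;
    int metadata_writes;  /* times the format layer wrote out its caches */
    int fsyncs;           /* times the protocol layer synced to disk */
} ToyImage;

/* Protocol layer (the role raw-posix/raw-win32 play in the patch):
 * this is the only place that honours the no-flush flag. */
static int toy_raw_flush(ToyImage *img)
{
    if (img->open_flags & TOY_O_NO_FLUSH) {
        return 0;
    }
    img->fsyncs++;
    return 0;
}

/* Format layer (the role qcow2 plays): always writes out metadata,
 * then delegates to the protocol layer. */
static int toy_format_flush(ToyImage *img)
{
    img->metadata_writes++;
    return toy_raw_flush(img);
}
```

If the flag were checked in the generic entry point instead, the format
layer would never run and the metadata caches would not be written out.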

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.c   |4 +---
 block/raw-posix.c |4 
 block/raw-win32.c |4 
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index 11c7f91..0b7bc06 100644
--- a/block.c
+++ b/block.c
@@ -2908,9 +2908,7 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
 
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 {
-if (bs->open_flags & BDRV_O_NO_FLUSH) {
-return 0;
-} else if (!bs->drv) {
+if (!bs->drv) {
 return 0;
 } else if (bs->drv->bdrv_co_flush) {
 return bs->drv->bdrv_co_flush(bs);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index dcae88a..9a3d3af 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -380,6 +380,10 @@ static int raw_co_flush(BlockDriverState *bs)
 .coroutine = qemu_coroutine_self(),
 };
 
+if (bs->open_flags & BDRV_O_NO_FLUSH) {
+return 0;
+}
+
 ret = fd_open(bs);
 if (ret < 0) {
 return ret;
diff --git a/block/raw-win32.c b/block/raw-win32.c
index f5f73bc..37a8bdb 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -156,6 +156,10 @@ static int raw_flush(BlockDriverState *bs)
 BDRVRawState *s = bs->opaque;
 int ret;
 
+if (bs->open_flags & BDRV_O_NO_FLUSH) {
+return 0;
+}
+
 ret = FlushFileBuffers(s->hfile);
 if (ret == 0) {
 return -EIO;
-- 
1.7.6.4

[Qemu-devel] [PATCH v3 13/13] simplify main loop functions

2011-10-21 Thread Paolo Bonzini

Provide a clean example of how to use the main loop in the tools.
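
The resulting shape, a do/while loop around one wait step with a separate
exit predicate, can be sketched as follows.  The ToyTool type and the
toy_* functions are hypothetical and only model the control flow; they
are not the functions declared in main-loop.h.

```c
#include <stdbool.h>

/* Hypothetical tool state: how much work is left, and how many times
 * the loop body has run. */
typedef struct ToyTool {
    int pending;
    int iterations;
} ToyTool;

/* Stand-in for one iteration of waiting for and dispatching events. */
static void toy_main_loop_wait(ToyTool *t)
{
    t->pending--;
    t->iterations++;
}

/* Exit predicate kept separate from the loop, like main_loop_should_exit. */
static bool toy_should_exit(const ToyTool *t)
{
    return t->pending <= 0;
}

/* The loop always runs the wait step at least once before checking. */
static int toy_main_loop(ToyTool *t)
{
    do {
        toy_main_loop_wait(t);
    } while (!toy_should_exit(t));
    return t->iterations;
}
```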

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c |5 
 cpus.h |1 -
 vl.c   |   79 +--
 3 files changed, 41 insertions(+), 44 deletions(-)

diff --git a/cpus.c b/cpus.c
index b9f1573..79a7656 100644
--- a/cpus.c
+++ b/cpus.c
@@ -626,11 +626,6 @@ void qemu_init_cpu_loop(void)
 qemu_thread_get_self(&io_thread);
 }
 
-void qemu_main_loop_start(void)
-{
-resume_all_vcpus();
-}
-
 void run_on_cpu(CPUState *env, void (*func)(void *data), void *data)
 {
 struct qemu_work_item wi;
diff --git a/cpus.h b/cpus.h
index 7422584..3525375 100644
--- a/cpus.h
+++ b/cpus.h
@@ -3,7 +3,6 @@
 
 /* cpus.c */
 void qemu_init_cpu_loop(void);
-void qemu_main_loop_start(void);
 void resume_all_vcpus(void);
 void pause_all_vcpus(void);
 void cpu_stop_current(void);
diff --git a/vl.c b/vl.c
index 7914df7..1ddb17b 100644
--- a/vl.c
+++ b/vl.c
@@ -1428,18 +1428,49 @@ void qemu_system_vmstop_request(RunState state)
 
 qemu_irq qemu_system_powerdown;
 
+static bool main_loop_should_exit(void)
+{
+RunState r;
+if (qemu_debug_requested()) {
+vm_stop(RUN_STATE_DEBUG);
+}
+if (qemu_shutdown_requested()) {
+qemu_kill_report();
+monitor_protocol_event(QEVENT_SHUTDOWN, NULL);
+if (no_shutdown) {
+vm_stop(RUN_STATE_SHUTDOWN);
+} else {
+return true;
+}
+}
+if (qemu_reset_requested()) {
+pause_all_vcpus();
+cpu_synchronize_all_states();
+qemu_system_reset(VMRESET_REPORT);
+resume_all_vcpus();
+if (runstate_check(RUN_STATE_INTERNAL_ERROR) ||
+runstate_check(RUN_STATE_SHUTDOWN)) {
+runstate_set(RUN_STATE_PAUSED);
+}
+}
+if (qemu_powerdown_requested()) {
+monitor_protocol_event(QEVENT_POWERDOWN, NULL);
+qemu_irq_raise(qemu_system_powerdown);
+}
+if (qemu_vmstop_requested(&r)) {
+vm_stop(r);
+}
+return false;
+}
+
 static void main_loop(void)
 {
 bool nonblocking;
-int last_io __attribute__ ((unused)) = 0;
+int last_io = 0;
 #ifdef CONFIG_PROFILER
 int64_t ti;
 #endif
-RunState r;
-
-qemu_main_loop_start();
-
-for (;;) {
+do {
 nonblocking = !kvm_enabled() && last_io > 0;
 #ifdef CONFIG_PROFILER
 ti = profile_getclock();
@@ -1448,38 +1479,7 @@ static void main_loop(void)
 #ifdef CONFIG_PROFILER
 dev_time += profile_getclock() - ti;
 #endif
-
-if (qemu_debug_requested()) {
-vm_stop(RUN_STATE_DEBUG);
-}
-if (qemu_shutdown_requested()) {
-qemu_kill_report();
-monitor_protocol_event(QEVENT_SHUTDOWN, NULL);
-if (no_shutdown) {
-vm_stop(RUN_STATE_SHUTDOWN);
-} else
-break;
-}
-if (qemu_reset_requested()) {
-pause_all_vcpus();
-cpu_synchronize_all_states();
-qemu_system_reset(VMRESET_REPORT);
-resume_all_vcpus();
-if (runstate_check(RUN_STATE_INTERNAL_ERROR) ||
-runstate_check(RUN_STATE_SHUTDOWN)) {
-runstate_set(RUN_STATE_PAUSED);
-}
-}
-if (qemu_powerdown_requested()) {
-monitor_protocol_event(QEVENT_POWERDOWN, NULL);
-qemu_irq_raise(qemu_system_powerdown);
-}
-if (qemu_vmstop_requested(&r)) {
-vm_stop(r);
-}
-}
-bdrv_close_all();
-pause_all_vcpus();
+} while (!main_loop_should_exit());
 }
 
 static void version(void)
@@ -3445,7 +3445,10 @@ int main(int argc, char **argv, char **envp)
 
 os_setup_post();
 
+resume_all_vcpus();
 main_loop();
+bdrv_close_all();
+pause_all_vcpus();
 net_cleanup();
 res_free();
 
-- 
1.7.6

[Qemu-devel] [PATCH 14/19] vmdk: clean up open

2011-10-21 Thread Kevin Wolf

From: Paolo Bonzini pbonz...@redhat.com

Move vmdk_parent_open to vmdk_open.  There's another path by which
vmdk_parent_open can be reached:

  vmdk_parse_extents() -> vmdk_open_sparse() -> vmdk_open_vmdk4() ->
  vmdk_open_desc_file().

If that can happen, however, the code is bogus.  vmdk_parent_open
reads from bs->file:

if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {

but it is always called with s->desc_offset == 0 and with the same
bs->file.  So the data that vmdk_parent_open reads always comes from the
same place, and anyway there is only one place where it can write it,
namely bs->backing_file.

So, if it cannot happen, the patched code is okay.

It is also possible that the recursive call can happen, but only once.  In
that case there would still be a bug in vmdk_open_desc_file setting
s->desc_offset = 0, but the patched code is okay.

Finally, in the case where multiple recursive calls can happen the code
would need to be rewritten anyway.  It is likely that this would anyway
involve adding several parameters to vmdk_parent_open, and calling it from
vmdk_open_vmdk4.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |   37 +++--
 1 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index ea00938..ace2977 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -624,20 +624,7 @@ static int vmdk_open_desc_file(BlockDriverState *bs, int flags,
 return -ENOTSUP;
 }
 s->desc_offset = 0;
-ret = vmdk_parse_extents(buf, bs, bs->file->filename);
-if (ret) {
-vmdk_free_extents(bs);
-return ret;
-}
-
-/* try to open parent images, if exist */
-ret = vmdk_parent_open(bs);
-if (ret) {
-vmdk_free_extents(bs);
-return ret;
-}
-s->parent_cid = vmdk_read_cid(bs, 1);
-return 0;
+return vmdk_parse_extents(buf, bs, bs->file->filename);
 }
 
 static int vmdk_open(BlockDriverState *bs, int flags)
@@ -647,17 +634,23 @@ static int vmdk_open(BlockDriverState *bs, int flags)
 
 if (vmdk_open_sparse(bs, bs->file, flags) == 0) {
 s->desc_offset = 0x200;
-/* try to open parent images, if exist */
-ret = vmdk_parent_open(bs);
+} else {
+ret = vmdk_open_desc_file(bs, flags, 0);
 if (ret) {
-vmdk_free_extents(bs);
-return ret;
+goto fail;
 }
-s->parent_cid = vmdk_read_cid(bs, 1);
-return 0;
-} else {
-return vmdk_open_desc_file(bs, flags, 0);
 }
+/* try to open parent images, if exist */
+ret = vmdk_parent_open(bs);
+if (ret) {
+goto fail;
+}
+s->parent_cid = vmdk_read_cid(bs, 1);
+return ret;
+
+fail:
+vmdk_free_extents(bs);
+return ret;
 }
 
 static int get_whole_cluster(BlockDriverState *bs,
-- 
1.7.6.4
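
For readers skimming the archive, the control flow this patch ends up with (two alternative open paths, one shared tail, one cleanup label) can be sketched in isolation. This is only an illustration under stated assumptions: Image, open_sparse, open_desc_file, parent_open, free_extents and image_open are hypothetical stand-ins, not the functions in block/vmdk.c, and the error codes are chosen only to make the three outcomes distinguishable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver state; not the real VMDK state. */
typedef struct Image {
    int *extents;       /* allocated by whichever open path succeeds */
    int parent_opened;  /* set once the parent step has run */
} Image;

/* First open path: fails with -EINVAL unless the image is "sparse". */
static int open_sparse(Image *img, int is_sparse)
{
    if (!is_sparse) {
        return -EINVAL;
    }
    img->extents = calloc(4, sizeof(*img->extents));
    return img->extents ? 0 : -ENOMEM;
}

/* Fallback open path: fails with -ENOTSUP on an invalid descriptor. */
static int open_desc_file(Image *img, int desc_valid)
{
    if (!desc_valid) {
        return -ENOTSUP;
    }
    img->extents = calloc(4, sizeof(*img->extents));
    return img->extents ? 0 : -ENOMEM;
}

/* Parent step shared by both paths: fails with -EIO when asked to. */
static int parent_open(Image *img, int parent_valid)
{
    if (!parent_valid) {
        return -EIO;
    }
    img->parent_opened = 1;
    return 0;
}

/* Single cleanup routine used by the fail label. */
static void free_extents(Image *img)
{
    free(img->extents);
    img->extents = NULL;
}

int image_open(Image *img, int is_sparse, int desc_valid, int parent_valid)
{
    int ret;

    assert(img != NULL);
    if (open_sparse(img, is_sparse) != 0) {
        ret = open_desc_file(img, desc_valid);
        if (ret) {
            goto fail;
        }
    }
    /* Common tail: runs exactly once, whichever path opened the image. */
    ret = parent_open(img, parent_valid);
    if (ret) {
        goto fail;
    }
    return 0;

fail:
    free_extents(img);
    return ret;
}
```

The point of the shape is that the parent step and the cleanup each appear once, so neither open path can reach the parent step a second time through recursion.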

[Qemu-devel] [PATCH 1/2] raw-posix: Convert to bdrv_co_flush

2011-10-21 Thread Kevin Wolf

The next patch will introduce an early return. Using a bottom half to invoke
the AIO callback wouldn't be much less code, so let's go with the native
block layer interface.

Signed-off-by: Kevin Wolf <kw...@redhat.com>
---
 block/raw-posix.c |   53 -
 1 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index a3de373..dcae88a 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -357,15 +357,42 @@ static BlockDriverAIOCB *raw_aio_writev(BlockDriverState *bs,
   cb, opaque, QEMU_AIO_WRITE);
 }
 
-static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
-BlockDriverCompletionFunc *cb, void *opaque)
+typedef struct CoroutineIOCompletion {
+Coroutine *coroutine;
+int ret;
+} CoroutineIOCompletion;
+
+static void raw_aio_flush_cb(void *opaque, int ret)
+{
+CoroutineIOCompletion *co = opaque;
+
+co->ret = ret;
+qemu_coroutine_enter(co->coroutine, NULL);
+}
+
+static int raw_co_flush(BlockDriverState *bs)
 {
 BDRVRawState *s = bs->opaque;
+BlockDriverAIOCB *acb;
+int ret;
 
-if (fd_open(bs) < 0)
-return NULL;
+CoroutineIOCompletion co = {
+.coroutine = qemu_coroutine_self(),
+};
 
-return paio_submit(bs, s->fd, 0, NULL, 0, cb, opaque, QEMU_AIO_FLUSH);
+ret = fd_open(bs);
+if (ret < 0) {
+return ret;
+}
+
+acb = paio_submit(bs, s->fd, 0, NULL, 0, raw_aio_flush_cb, &co, QEMU_AIO_FLUSH);
+if (acb == NULL) {
+return -EIO;
+}
+
+qemu_coroutine_yield();
+
+return co.ret;
 }
 
 static void raw_close(BlockDriverState *bs)
@@ -635,9 +662,9 @@ static BlockDriver bdrv_file = {
 .bdrv_create = raw_create,
 .bdrv_co_discard = raw_co_discard,
 
-.bdrv_aio_readv = raw_aio_readv,
+.bdrv_aio_readv  = raw_aio_readv,
 .bdrv_aio_writev = raw_aio_writev,
-.bdrv_aio_flush = raw_aio_flush,
+.bdrv_co_flush   = raw_co_flush,
 
 .bdrv_truncate = raw_truncate,
 .bdrv_getlength = raw_getlength,
@@ -903,9 +930,9 @@ static BlockDriver bdrv_host_device = {
 .create_options = raw_create_options,
 .bdrv_has_zero_init = hdev_has_zero_init,
 
-.bdrv_aio_readv= raw_aio_readv,
-.bdrv_aio_writev   = raw_aio_writev,
-.bdrv_aio_flush= raw_aio_flush,
+.bdrv_aio_readv = raw_aio_readv,
+.bdrv_aio_writev= raw_aio_writev,
+.bdrv_co_flush  = raw_co_flush,
 
 .bdrv_truncate  = raw_truncate,
 .bdrv_getlength= raw_getlength,
@@ -1024,7 +1051,7 @@ static BlockDriver bdrv_host_floppy = {
 
 .bdrv_aio_readv = raw_aio_readv,
 .bdrv_aio_writev= raw_aio_writev,
-.bdrv_aio_flush= raw_aio_flush,
+.bdrv_co_flush  = raw_co_flush,
 
 .bdrv_truncate  = raw_truncate,
 .bdrv_getlength= raw_getlength,
@@ -1123,7 +1150,7 @@ static BlockDriver bdrv_host_cdrom = {
 
 .bdrv_aio_readv = raw_aio_readv,
 .bdrv_aio_writev= raw_aio_writev,
-.bdrv_aio_flush= raw_aio_flush,
+.bdrv_co_flush  = raw_co_flush,
 
 .bdrv_truncate  = raw_truncate,
 .bdrv_getlength = raw_getlength,
@@ -1242,7 +1269,7 @@ static BlockDriver bdrv_host_cdrom = {
 
 .bdrv_aio_readv = raw_aio_readv,
 .bdrv_aio_writev= raw_aio_writev,
-.bdrv_aio_flush= raw_aio_flush,
+.bdrv_co_flush  = raw_co_flush,
 
 .bdrv_truncate  = raw_truncate,
 .bdrv_getlength = raw_getlength,
-- 
1.7.6.4
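
The callback-to-coroutine conversion in this patch (submit the request, yield, let the completion callback store the result and re-enter the coroutine) can be tried outside QEMU with a small sketch. It assumes a glibc-style `<ucontext.h>` in place of QEMU's coroutine API and a one-slot fake backend; submit_flush, flush_cb, co_flush and run_flush are hypothetical names, not QEMU functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <ucontext.h>

typedef void CompletionFunc(void *opaque, int ret);

/* Fake backend with a single pending request (assumption for brevity). */
static CompletionFunc *pending_cb;
static void *pending_opaque;

static ucontext_t main_ctx, co_ctx;
static char co_stack[64 * 1024];
static int co_result;

typedef struct Completion {
    int ret;
} Completion;

/* Completion callback: store the result, then re-enter the coroutine. */
static void flush_cb(void *opaque, int ret)
{
    Completion *c = opaque;

    c->ret = ret;
    swapcontext(&main_ctx, &co_ctx);
}

/* Queue the request; the "event loop" completes it later. */
static int submit_flush(CompletionFunc *cb, void *opaque)
{
    pending_cb = cb;
    pending_opaque = opaque;
    return 0;
}

/* Looks synchronous to its caller, but yields while the request is pending. */
static int co_flush(void)
{
    Completion c = { .ret = -1 };

    if (submit_flush(flush_cb, &c) < 0) {
        return -EIO;
    }
    swapcontext(&co_ctx, &main_ctx);    /* yield until flush_cb runs */
    return c.ret;
}

static void co_entry(void)
{
    co_result = co_flush();
}

/* Drive one flush to completion; backend_result is what the backend reports. */
int run_flush(int backend_result)
{
    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof(co_stack);
    co_ctx.uc_link = &main_ctx;         /* where to go when co_entry returns */
    makecontext(&co_ctx, co_entry, 0);

    swapcontext(&main_ctx, &co_ctx);    /* enter; comes back at the yield */
    pending_cb(pending_opaque, backend_result);  /* event loop completes it */
    return co_result;
}
```

Note that the Completion struct lives on the coroutine's stack, which is safe here only because the coroutine stays suspended until the callback has filled it in.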

Re: [Qemu-devel] [PATCH 0/2] block: Write out internal caches even with cache=unsafe

2011-10-21 Thread Paolo Bonzini


On 10/21/2011 07:08 PM, Kevin Wolf wrote:

Avi complained that not even writing out qcow2's cache on bdrv_flush() made
cache=unsafe too unsafe to be useful. He's got a point.


Why? cache=unsafe explicitly allows s/data/manure/ on a crash.

If you do this for raw-posix, you need to do it for all protocols.


Kevin Wolf (2):
   raw-posix: Convert to bdrv_co_flush
   block: Handle cache=unsafe only in raw-posix/win32


Paolo

Re: [Qemu-devel] Multi heterogenous CPU archs for SoC sim?

2011-10-21 Thread Andreas

On 21.10.2011 09:26, 陳韋任 wrote:
>   COREMU treats QEMU as an entity and launches multiple QEMUs at the same
> time. The QEMUs communicate with each other through an underlying thin layer
> provided by COREMU.
>
> I think this approach is much cleaner than trying to
> parallelize QEMU itself.

In this case I disagree. Given shared global memory and peripherals in
the SoC case, any IPC or shared-memory setup is destined to create
performance or management overhead.

When there are independent nodes connected via CAN/LIN/FlexRay, then I
agree that multiple processes communicating via UNIX sockets make a lot
of sense.

My use case here is testing and debugging, so I think we could live with
the blocks being executed in an interleaved fashion until someone has
the ultimate parallelization solution for upstream.

Regards,
Andreas

[Qemu-devel] [Bug 739785] Re: qemu-i386 user mode on ARMv5 host fails (bash: fork: Invalid argument)

2011-10-21 Thread Steve

@Peter - thanks, fair enough... I don't know enough about qemu's source
to understand what you mean, but clearly it's a complex issue. For now
the patch seems to work for me-- I haven't had the issue that @Ricardo
discusses (again, I'm on armv7l, though).

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/739785

Title:
  qemu-i386 user mode on ARMv5 host fails (bash: fork: Invalid argument)

Status in QEMU:
  New

Bug description:
  Good time of day everybody,

  I have been trying to make usermode qemu on ARM with plugapps
  (archlinux) with archlinux i386 chroot to work.

  1. I installed Arch Linux in VirtualBox and created a chroot for it with mkarchroot. Transferred it to my pogo plug into /i386/
  2. I compiled qemu-i386 static and put it into /i386/usr/bin/
  ./configure --static --disable-blobs --disable-system --target-list=i386-linux-user
  make

  3. I also compiled linux kernel 2.6.38 with CONFIG_BINFMT_MISC=y and installed it.
  uname -a
  Linux Plugbox 2.6.38 #4 PREEMPT Fri Mar 18 22:19:10 CDT 2011 armv5tel Feroceon 88FR131 rev 1 (v5l) Marvell SheevaPlug Reference Board GNU/Linux

  4. Added the following options into /etc/rc.local
  /sbin/modprobe binfmt_misc
  /bin/mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
  echo ':qemu-i386:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03\x00:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff\xff:/usr/bin/qemu-i386:' > /proc/sys/fs/binfmt_misc/register

  5. Also copied ld-linux.so.3 (actually ld-2.13.so because
  ld-linux.so.3 is a link to that file) from /lib/ to /i386/lib/

  6. Now I chroot into /i386 and I get this:
  [root@Plugbox i386]# chroot .
  [II aI hnve ao n@P /]# pacman -Suy
  bash: fork: Invalid argument

  7.I also downloaded linux-user-test-0.3 from qemu website and ran the test:
  [root@Plugbox linux-user-test-0.3]# make
  ./qemu-linux-user.sh
  [qemu-i386]
  ../qemu-0.14.0/i386-linux-user/qemu-i386 -L ./gnemul/qemu-i386 i386/ls -l dummyfile
  BUG IN DYNAMIC LINKER ld.so: dl-version.c: 210: _dl_check_map_versions: Assertion `needed != ((void *)0)' failed!
  make: *** [test] Error 127
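
  For anyone checking the register string in step 4: the sketch below applies the magic and mask bytes from that echo line to the first 20 bytes of a file header. It assumes the usual binfmt_misc rule that a file matches when (byte XOR magic) AND mask is zero at every offset; binfmt_matches is a hypothetical helper name, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

#define BINFMT_LEN 20

/* Magic bytes copied from the register string in step 4. */
static const unsigned char magic[BINFMT_LEN] = {
    0x7f, 'E',  'L',  'F',  0x01, 0x01, 0x01, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x02, 0x00, 0x03, 0x00
};

/* Mask bytes copied from the same register string. */
static const unsigned char mask[BINFMT_LEN] = {
    0xff, 0xff, 0xff, 0xff, 0xff, 0xfe, 0xfe, 0xff,
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
    0xfb, 0xff, 0xff, 0xff
};

/* Returns 1 if the header would be handed to the registered interpreter. */
int binfmt_matches(const unsigned char *header, size_t len)
{
    size_t i;

    if (len < BINFMT_LEN) {
        return 0;
    }
    for (i = 0; i < BINFMT_LEN; i++) {
        /* Only bits selected by the mask have to agree with the magic. */
        if ((header[i] ^ magic[i]) & mask[i]) {
            return 0;
        }
    }
    return 1;
}
```

  With these bytes, a 32-bit little-endian ELF executable whose e_machine field is 3 (i386) matches, while an ARM binary (e_machine 0x28) does not.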

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/739785/+subscriptions
