[Qemu-devel] [Bug 1098729] Re: qemu-user-static for armhf: segfault in threaded code
I also experimented the bug. It may SIGSEGV or hang. Or it may work, very rarely. But I cannot reproduce it at all if change my app to stay on a single CPU: int main(int argc, char * argv[] ) { #ifdef QEMU cpu_set_t cpuSet; CPU_ZERO(cpuSet); CPU_SET(0,cpuSet); if (sched_setaffinity(getpid(), sizeof(cpu_set_t), cpuSet) !=0) cerr sched_setaffinity failed endl; #endif /* QEMU */ -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1098729 Title: qemu-user-static for armhf: segfault in threaded code Status in QEMU: New Bug description: Currently running QEMU from git (fedf2de31023) and running the armhf version of qemu-user-static which I have renamed qemu-armhf-static to follow the naming convention used in Debian. The host systems is a Debian testing x86_64-linux and I have an Debian testing armhf chroot which I invoke using schroot. Majority of program in the armhf chroot run fine, but I'm getting qemu segfaults in multi-threaded programs. As an example, I've grabbed the threads demo program here: https://computing.llnl.gov/tutorials/pthreads/samples/dotprod_mutex.c and changed NUMTHRDS from 4 to 10. I compile it as (same compile command on both x86_64 host and armhf guest): gcc -Wall -lpthread dotprod_mutex.c -o dotprod_mutex When compiled for x86_64 host it runs perfectly and even under Valgrind displays no errors whatsoever. However, when I compile the program in my armhs chroot and run it it usually (but not always) segaults or hangs or crashes. Example output: (armhf) $ ./dotprod_mutex Thread 1 did 10 to 20: mysum=10.00 global sum=10.00 Thread 0 did 0 to 10: mysum=10.00 global sum=20.00 TCG temporary leak before f6731ca0 qemu-arm-static: /home/erikd/Git/qemu-posix-timer-hacking/Upstream/tcg/tcg-op.h:2371: tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask (1 idx)) == 0' failed. (armhf) $ ./dotprod_mutex qemu: uncaught target signal 11 (Segmentation fault) - core dumped Segmentation fault (armhf) $ ./dotprod_mutex qemu-arm-static: /home/erikd/Git/qemu-posix-timer-hacking/Upstream/tcg/tcg.c:519: tcg_temp_free_internal: Assertion `idx = s-nb_globals idx s-nb_temps' failed. (armhf) $ ./dotprod_mutex Thread 1 did 10 to 20: mysum=10.00 global sum=10.00 qemu: uncaught target signal 11 (Segmentation fault) - core dumped Segmentation fault To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1098729/+subscriptions
Re: [Qemu-devel] [PATCH v2 2/6] net: introduce lock to protect NetClientState's peer's access
On Tue, Jun 18, 2013 at 8:25 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, Jun 13, 2013 at 05:03:02PM +0800, Liu Ping Fan wrote: @@ -67,6 +67,10 @@ struct NetClientState { NetClientInfo *info; int link_down; QTAILQ_ENTRY(NetClientState) next; +/* protect the race access of peer only between reader and writer. + * to resolve the writer's race condition, resort on biglock. + */ Indentation Will fix. @@ -301,6 +303,38 @@ static void qemu_free_net_client(NetClientState *nc) } } +/* elimate the reference and sync with exit of rx/tx action. s/elimate/Eliminate/ Will fix + * And flush out peer's queue. + */ +static void qemu_net_client_detach_flush(NetClientState *nc) +{ +NetClientState *peer; + +/* reader of self's peer field , fixme? the deleters are not concurrent, + * so this pair lock can save. + */ Indentation, also please resolve the fixme. So, here can I take the assumption that the deleters are serialized by biglock, and remove the lock following this comment? @@ -394,6 +433,28 @@ int qemu_can_send_packet(NetClientState *sender) return 1; } +int qemu_can_send_packet(NetClientState *sender) +{ +int ret = 1; + +qemu_mutex_lock(sender-peer_lock); +if (!sender-peer) { +goto unlock; +} + +if (sender-peer-receive_disabled) { +ret = 0; +goto unlock; +} else if (sender-peer-info-can_receive + !sender-peer-info-can_receive(sender-peer)) { +ret = 0; +goto unlock; +} Just call qemu_can_send_packet_nolock() instead of duplicating code? Yes. Thx regards, pingfan
Re: [Qemu-devel] [PATCH v2 4/6] net: force NetQue opaque to be NetClientState
On Tue, Jun 18, 2013 at 8:47 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, Jun 13, 2013 at 05:03:04PM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com qemu_net_client_setup() is the only user of qemu_new_net_queue(), which will pass in NetClientState. By forcing it be a NetClientState, we can ref/unref NetQueue's owner Please s/opaque/nc/ in net/queue.[hc] since it's no longer opaque :). Ok, I will. Also, qemu_deliver_packet()/qemu_deliver_packet_iov() can take an NetClientState *nc instead of void *opaque. Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com pingfank here and pingfanl in the From: header. Are both okay and which do you prefer to use? Change my disk, totally re-install my system, and mix up with internal mail address. Will fix it. Thx regards, Pingfan
Re: [Qemu-devel] [PATCH v2 5/6] net: defer nested call to BH
On Tue, Jun 18, 2013 at 8:57 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, Jun 13, 2013 at 05:03:05PM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com Nested call caused by -receive() will raise issue like deadlock, so postphone it to BH. Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com --- net/queue.c | 40 ++-- 1 file changed, 38 insertions(+), 2 deletions(-) Does this patch belong before the netqueue lock patch? The commit history should be bisectable without temporary failures/deadlocks. Ok. diff --git a/net/queue.c b/net/queue.c index 58222b0..9c343ab 100644 --- a/net/queue.c +++ b/net/queue.c @@ -24,6 +24,8 @@ #include net/queue.h #include qemu/queue.h #include net/net.h +#include block/aio.h +#include qemu/main-loop.h /* The delivery handler may only return zero if it will call * qemu_net_queue_flush() when it determines that it is once again able @@ -183,6 +185,22 @@ static ssize_t qemu_net_queue_deliver_iov(NetQueue *queue, return ret; } +typedef struct NetQueBH { This file uses Queue consistently, please don't add Que here. @@ -192,8 +210,17 @@ ssize_t qemu_net_queue_send(NetQueue *queue, { ssize_t ret; -if (queue-delivering || !qemu_can_send_packet_nolock(sender)) { +if (queue-delivering || !qemu_can_send_packet_nolock(sender) +|| sender-send_queue-delivering) { Not sure this is safe, we're only holding one NetClientState-peer_lock and one NetQueue-lock. How can we access both queue-delivering and sender-send_queue-delivering safely? Yes, you are right, it is not safely. The queue-delivering is protected by peer_lock and we do not take the verse direction lock . So finally the above code can not tell out the nested calling A--B--A from A--B, B--A (where A, B stands for a NetClientState). What about using TLS to trace the nested calling? With it, we can avoid AB-BA lock problem. Thx regards, Pingfan
[Qemu-devel] [PATCH v2 0/2] libqtest leak fix cleanup
v2: qtest_start() function comment Markus Armbruster (2): libqtest: Plug fd and memory leaks in qtest_quit() libqtest: New qtest_end() to go with qtest_start() tests/fdc-test.c| 2 +- tests/hd-geo-test.c | 8 tests/ide-test.c| 2 +- tests/libqtest.c| 4 tests/libqtest.h| 12 5 files changed, 22 insertions(+), 6 deletions(-) -- 1.7.11.7
[Qemu-devel] [PATCH v2 1/2] libqtest: Plug fd and memory leaks in qtest_quit()
Reviewed-by: Anthony Liguori aligu...@us.ibm.com Signed-off-by: Markus Armbruster arm...@redhat.com --- tests/libqtest.c | 4 1 file changed, 4 insertions(+) diff --git a/tests/libqtest.c b/tests/libqtest.c index 879ffe9..bb82069 100644 --- a/tests/libqtest.c +++ b/tests/libqtest.c @@ -171,12 +171,16 @@ void qtest_quit(QTestState *s) waitpid(pid, status, 0); } +close(s-fd); +close(s-qmp_fd); +g_string_free(s-rx, true); unlink(s-pid_file); unlink(s-socket_path); unlink(s-qmp_socket_path); g_free(s-pid_file); g_free(s-socket_path); g_free(s-qmp_socket_path); +g_free(s); } static void socket_sendf(int fd, const char *fmt, va_list ap) -- 1.7.11.7
[Qemu-devel] [PATCH v2 2/2] libqtest: New qtest_end() to go with qtest_start()
Signed-off-by: Markus Armbruster arm...@redhat.com --- tests/fdc-test.c| 2 +- tests/hd-geo-test.c | 8 tests/ide-test.c| 2 +- tests/libqtest.h| 12 4 files changed, 18 insertions(+), 6 deletions(-) diff --git a/tests/fdc-test.c b/tests/fdc-test.c index 4b0301d..fd198dc 100644 --- a/tests/fdc-test.c +++ b/tests/fdc-test.c @@ -556,7 +556,7 @@ int main(int argc, char **argv) ret = g_test_run(); /* Cleanup */ -qtest_quit(global_qtest); +qtest_end(); unlink(test_image); return ret; diff --git a/tests/hd-geo-test.c b/tests/hd-geo-test.c index 9a31e85..b72042e 100644 --- a/tests/hd-geo-test.c +++ b/tests/hd-geo-test.c @@ -244,7 +244,7 @@ static void test_ide_none(void) setup_common(argv, ARRAY_SIZE(argv)); qtest_start(g_strjoinv( , argv)); test_cmos(); -qtest_quit(global_qtest); +qtest_end(); } static void test_ide_mbr(bool use_device, MBRcontents mbr) @@ -262,7 +262,7 @@ static void test_ide_mbr(bool use_device, MBRcontents mbr) } qtest_start(g_strjoinv( , argv)); test_cmos(); -qtest_quit(global_qtest); +qtest_end(); } /* @@ -334,7 +334,7 @@ static void test_ide_drive_user(const char *dev, bool trans) g_free(opts); qtest_start(g_strjoinv( , argv)); test_cmos(); -qtest_quit(global_qtest); +qtest_end(); } /* @@ -387,7 +387,7 @@ static void test_ide_drive_cd_0(void) } qtest_start(g_strjoinv( , argv)); test_cmos(); -qtest_quit(global_qtest); +qtest_end(); } int main(int argc, char **argv) diff --git a/tests/ide-test.c b/tests/ide-test.c index 7e2eb94..7307f1d 100644 --- a/tests/ide-test.c +++ b/tests/ide-test.c @@ -122,7 +122,7 @@ static void ide_test_start(const char *cmdline_fmt, ...) static void ide_test_quit(void) { -qtest_quit(global_qtest); +qtest_end(); } static QPCIDevice *get_pci_device(uint16_t *bmdma_base) diff --git a/tests/libqtest.h b/tests/libqtest.h index 437bda3..0f6aade 100644 --- a/tests/libqtest.h +++ b/tests/libqtest.h @@ -17,6 +17,7 @@ #ifndef LIBQTEST_H #define LIBQTEST_H +#include stddef.h #include stdint.h #include stdbool.h #include stdarg.h @@ -319,6 +320,17 @@ static inline QTestState *qtest_start(const char *args) } /** + * qtest_end: + * + * Shut down the QEMU process started by qtest_start(). + */ +static inline void qtest_end(void) +{ +qtest_quit(global_qtest); +global_qtest = NULL; +} + +/** * qmp: * @fmt...: QMP message to send to qemu * -- 1.7.11.7
Re: [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers
On Fri, 06/14 11:48, Stefan Hajnoczi wrote: From: Paolo Bonzini pbonz...@redhat.com Fast TLS is not available on some platforms, but it is always nice to use it. This wrapper implementation falls back to pthread_get/setspecific on POSIX systems that lack __thread, but uses the dynamic linker's TLS support on Linux and Windows. The user shall call alloc_foo() in every thread that needs to access the variable---exactly once and before any access. foo is the name of the variable as passed to DECLARE_TLS and DEFINE_TLS. Then, get_foo() will return the address of the variable. It is guaranteed to remain the same across the lifetime of a thread, so you can cache it. Would tls_alloc_foo() and tls_get_foo() be easier to read and less possible for name conflict? Fam
Re: [Qemu-devel] [Xen-devel] [PATCH] Remove hardcoded xen-platform device initialization
-Original Message- From: Stefano Stabellini [mailto:stefano.stabell...@eu.citrix.com] Sent: 19 June 2013 17:28 To: Paul Durrant Cc: Stefano Stabellini; Ian Campbell; Paolo Bonzini; xen-de...@lists.xen.org; qemu-devel@nongnu.org Subject: RE: [Qemu-devel] [Xen-devel] [PATCH] Remove hardcoded xen- platform device initialization On Wed, 19 Jun 2013, Paul Durrant wrote: -Original Message- From: qemu-devel-bounces+paul.durrant=citrix@nongnu.org [mailto:qemu-devel-bounces+paul.durrant=citrix@nongnu.org] On Behalf Of Stefano Stabellini Sent: 19 June 2013 14:53 To: Ian Campbell Cc: Paolo Bonzini; Paul Durrant; xen-de...@lists.xen.org; qemu- de...@nongnu.org; Stefano Stabellini Subject: Re: [Qemu-devel] [Xen-devel] [PATCH] Remove hardcoded xen- platform device initialization On Wed, 19 Jun 2013, Ian Campbell wrote: On Tue, 2013-06-18 at 19:56 +0100, Stefano Stabellini wrote: On Fri, 14 Jun 2013, Paul Durrant wrote: -Original Message- From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini Sent: 14 June 2013 15:58 To: Paul Durrant Cc: Ian Campbell; Stefano Stabellini; qemu-devel@nongnu.org; xen- de...@lists.xen.org Subject: Re: [Xen-devel] [PATCH] Remove hardcoded xen- platform device initialization Il 14/06/2013 10:11, Paul Durrant ha scritto: I think we're still going to need -M xenpv, I think; it's quite distinct from pc. Of course! Even more: -M xenpv should be reused on ARM. I guess we could use -M pc for HVM and gate the accel code as you suggest but, if that's the way we're going, it would seem more logical just to ditch the accel code for xenpv completely (assuming we can do all we need from the machine init) and then use -M pc -accel=xen for HVM guests going forward. There is common code between pv and fv, and that one definitely belongs in xen_init. Most fv-only code probably should be in pc_init. The rest should move to xen_init though, because it would apply just as well for example to Q35. It's a bit ugly to have fv-only code there, but it's better than having a Xen-specific machine type. Xen/KVM/TCG should be as similar as possible at the QEMU level, any difference should be handled in the toolstack. But that does rather screw up my autodiscovery plans because I would not know, for a given qemu binary, which machine type to use. There's no need for that. 4.4 can just use -M pc unconditionally, =4.3 will just use -M xenfv unconditionally. If I create a new xenfv-2.0 machine type though I *can* do auto discovery... in which case do we need the -accel=xen option at all? Yes. Please try not do things differently from other accelerators. Ok. I guess we can have the ability to override the machine type in the VM config, so you could still kick off an older qemu with a newer libxl - but it sounds like the auto-discovery idea is a no-go then. xenfv-2.0 is a bad idea, like Paolo wrote, it should be possible to just use -M pc for HVM guests and retain -M xenpv for pv guests. However it seems to me that we also need a way in libxl to find out whether QEMU is new enough for us to be able to use -M pc. We can't just assume that users will be able to figure out the magic rune they need to write in the VM config file to solve their VM crash at boot problem. What crash at boot problem? If you start QEMU as device model on Xen with the wrong machine option (for example -M pc on an old QEMU), QEMU would probably just abort during initialization. We could spawn an instance of QEMU just to figure out the QEMU version but we certainly cannot do that every time we start a new VM. Once we figure out the QEMU version the first time we could write it to xenstore so that the next time we don't have to go through the same process again. Due to the device_model_override we might need to make this per- path. You'd also likely need to store mtime or something in case qemu gets upgraded, although perhaps that is getting unnecessarily picky... I think of device_model_override as an option for developers. People that use device_model_override can also override the QEMUMachine version. Are you suggesting we allow a freeform -machine option in libxl, or are you suggesting they point device_model_override at a script which drops the -M argument and inserts their new choice before invoking qemu? I am suggesting that we could have a qemu_machine_override option in QEMU, or maybe a
Re: [Qemu-devel] [PATCH v3 2/2] QEMUBH: make AioContext's bh re-entrant
On Thu, Jun 20, 2013 at 04:59:29AM +0800, Liu Ping Fan wrote: BH will be used outside big lock, so introduce lock to protect between the writers, ie, bh's adders and deleter. The lock only affects the writers and bh's callback does not take this extra lock. Note that for the same AioContext, aio_bh_poll() can not run in parallel yet. Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com --- async.c | 22 ++ include/block/aio.h | 5 + 2 files changed, 27 insertions(+) qemu_bh_cancel() and qemu_bh_delete() are not modified by this patch. It seems that calling them from a thread is a little risky because there is no guarantee that the BH is no longer invoked after a thread calls these functions. I think that's worth a comment or do you want them to take the lock so they become safe? The other thing I'm unclear on is the -idle assignment followed immediately by a -scheduled assignment. Without memory barriers aio_bh_poll() isn't guaranteed to get an ordered view of these updates: it may see an idle BH as a regular scheduled BH because -idle is still 0. Stefan
Re: [Qemu-devel] [PATCH v2 2/6] net: introduce lock to protect NetClientState's peer's access
On Thu, Jun 20, 2013 at 02:30:30PM +0800, liu ping fan wrote: On Tue, Jun 18, 2013 at 8:25 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, Jun 13, 2013 at 05:03:02PM +0800, Liu Ping Fan wrote: + * And flush out peer's queue. + */ +static void qemu_net_client_detach_flush(NetClientState *nc) +{ +NetClientState *peer; + +/* reader of self's peer field , fixme? the deleters are not concurrent, + * so this pair lock can save. + */ Indentation, also please resolve the fixme. So, here can I take the assumption that the deleters are serialized by biglock, and remove the lock following this comment? Ah, I understand the comment now. Is there any advantage to dropping the lock? IMO it's clearer to take the lock consistently instead of optimizing cases we think only get called from the main loop.
Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
-Original Message- From: Tim Deegan [mailto:t...@xen.org] Sent: 19 June 2013 21:15 To: Matt Wilson Cc: Alex Bligh; Paul Durrant; xen-de...@lists.xen.org; Ian Campbell; qemu- de...@nongnu.org Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device version 2. At 11:21 -0700 on 19 Jun (1371640904), Matt Wilson wrote: On Wed, Jun 19, 2013 at 11:42:06AM +0100, Alex Bligh wrote: --On 19 June 2013 10:13:17 + Paul Durrant paul.durr...@citrix.com wrote: We obviously can't say to users Are you running Windows and are you running PV drivers = X.Y, if so set lever A to position B, otherwise if you are running some other OS or an earlier version of the Windows PV driver set it to position A. Why not? The device can be chosen on a per-VM basis. Not everyone knows what guest some random user will be running (consider cloud platforms). I agree. If this is really the only solution, we would need to have both versions presented to the guest so that old drivers continue to work without any intervention. I suspect that if we expose both, both sets of drivers try to run the same PV connections, and hilarity ensues. Actually I think I can make that work, and it is the conclusion I came to after Alex's comment. I'll create a new patch which introduces a new device, let's call it citrix-pv-bus or somesuch, which will have the necessary device id and revision and will be a dedicate device purely for the Citrix PV drivers. Then, if someone wants to create a VM which will be able use Citrix PV drivers they add this device to their config but leave all other aspects of the config unchanged, thus not precluding using that VM with any drivers that bind to the xen platform device. If someone has a VM that has the old Citrix drivers installed, or GPLPV, I think I should be able to spot this and make sure that the new bus driver quiesces itself to prevent strangeness ensuing. If and when said previous drivers are un-installed then the new bus driver can wake up and enumerate the device nodes for the other pv drivers and Windows Update can carry on doing its stuff. Paul
Re: [Qemu-devel] [PATCH v2 5/6] net: defer nested call to BH
On Thu, Jun 20, 2013 at 02:30:56PM +0800, liu ping fan wrote: On Tue, Jun 18, 2013 at 8:57 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, Jun 13, 2013 at 05:03:05PM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com Nested call caused by -receive() will raise issue like deadlock, so postphone it to BH. Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com --- net/queue.c | 40 ++-- 1 file changed, 38 insertions(+), 2 deletions(-) Does this patch belong before the netqueue lock patch? The commit history should be bisectable without temporary failures/deadlocks. Ok. diff --git a/net/queue.c b/net/queue.c index 58222b0..9c343ab 100644 --- a/net/queue.c +++ b/net/queue.c @@ -24,6 +24,8 @@ #include net/queue.h #include qemu/queue.h #include net/net.h +#include block/aio.h +#include qemu/main-loop.h /* The delivery handler may only return zero if it will call * qemu_net_queue_flush() when it determines that it is once again able @@ -183,6 +185,22 @@ static ssize_t qemu_net_queue_deliver_iov(NetQueue *queue, return ret; } +typedef struct NetQueBH { This file uses Queue consistently, please don't add Que here. @@ -192,8 +210,17 @@ ssize_t qemu_net_queue_send(NetQueue *queue, { ssize_t ret; -if (queue-delivering || !qemu_can_send_packet_nolock(sender)) { +if (queue-delivering || !qemu_can_send_packet_nolock(sender) +|| sender-send_queue-delivering) { Not sure this is safe, we're only holding one NetClientState-peer_lock and one NetQueue-lock. How can we access both queue-delivering and sender-send_queue-delivering safely? Yes, you are right, it is not safely. The queue-delivering is protected by peer_lock and we do not take the verse direction lock . So finally the above code can not tell out the nested calling A--B--A from A--B, B--A (where A, B stands for a NetClientState). What about using TLS to trace the nested calling? With it, we can avoid AB-BA lock problem. I would take a step back and see if there's a way to avoid reaching into inspect sender-send_queue-delivering here. Stefan
Re: [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters
On Wed, Jun 19, 2013 at 01:44:16PM +0200, Kevin Wolf wrote: This series adds options to make qcow2 discard freed clusters, in several categories. By default, only freed clusters related to snapshots (i.e. mainly snapshot deletion) are discarded. v2: - Removed leftover debug code - Don't discard after COW (overwriting compressed clusters) - Changed some commas into semicolons Kevin Wolf (5): Revert block: Disable driver-specific options for 1.5 qcow2: Add refcount update reason to all callers qcow2: Options to enable discard for freed clusters qcow2: Batch discards block: Always enable discard on the protocol level block.c | 2 +- block/qcow2-cluster.c| 41 ++ block/qcow2-refcount.c | 136 +++ block/qcow2-snapshot.c | 6 ++- block/qcow2.c| 30 ++- block/qcow2.h| 32 +-- blockdev.c | 118 ++-- tests/qemu-iotests/group | 2 +- 8 files changed, 214 insertions(+), 153 deletions(-) -- 1.8.1.4 Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan
Re: [Qemu-devel] [PATCH 0/3] qapi: Top-level type reference for command definitions
On Wed, Jun 19, 2013 at 06:28:04PM +0200, Kevin Wolf wrote: Kevin Wolf (3): qapi.py: Move common code to evaluate() qapi.py: Allow top-level type reference for command definitions qapi-schema: Use BlockdevSnapshot type for blockdev-snapshot-sync qapi-schema.json | 3 +-- scripts/qapi.py | 43 +-- 2 files changed, 30 insertions(+), 16 deletions(-) Nice, I'll use this for drive-backup. Thanks, Stefan
Re: [Qemu-devel] Java volatile vs. C11 seq_cst (was Re: [PATCH v2 1/2] add a header file for atomic operations)
Il 19/06/2013 22:25, Torvald Riegel ha scritto: On Wed, 2013-06-19 at 17:14 +0200, Paolo Bonzini wrote: (1) I don't care about relaxed RMW ops (loads/stores occur in hot paths, but RMW shouldn't be that bad. I don't care if reference counting is a little slower than it could be, for example); I doubt relaxed RMW ops are sufficient even for reference counting. They are enough on the increment side, or so says boost... http://www.chaoticmind.net/~hcb/projects/boost.atomic/doc/atomic/usage_examples.html#boost_atomic.usage_examples.example_reference_counters [An aside: Java guarantees that volatile stores are not reordered with volatile loads. This is not guaranteed by just using release stores and acquire stores, and is why IIUC acq_rel Java seq_cst]. Or maybe Java volatile is acq for loads and seq_cst for stores... Perhaps (but I'm not 100% sure). As long as you only have a producer and a consumer, C11 is fine, because all you need is load-acquire/store-release. In fact, if it weren't for the experience factor, C11 is easier than manually placing acquire and release barriers. But as soon as two or more threads are reading _and_ writing the shared memory, it gets complicated and I want to provide something simple that people can use. This is the reason for (2) above. I can't quite follow you here. There is a total order for all modifications to a single variable, and if you use acq/rel combined with loads and stores on this variable, then you basically can make use of the total order. (All loads that read-from a certain store get a synchronized-with (and thus happens-before edge) with the store, and the stores are in a total order.) This is independent of the number of readers and writers. The difference starts once you want to sync with more than one variable, and need to establish an order between those accesses. You're right of course. More specifically when there is a thread where some variables are stored while others are loaded. There will still be a few cases that need to be optimized, and here are where the difficult requirements come: (R1) the primitives *should* not be alien to people who know Linux. (R2) those optimizations *must* be easy to do and review; at least as easy as these things go. The two are obviously related. Ease of review is why it is important to make things familiar to people who know Linux. In C11, relaxing SC loads and stores is complicated, and more specifically hard to explain! I can't see why that would be harder than reasoning about equally weaker Java semantics. But you obviously know your community, and I don't :) Because Java semantics are almost SC, and as Paul mentioned the difference doesn't matter in practice (IRIW/RWC is where it matters, WRC works even on Power; see http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/ppc051.html#toc5, row WRC+lwsyncs). It hasn't ever mattered for Linux, at least. By contrast, Java volatile semantics are easily converted to a sequence of relaxed loads, relaxed stores, and acq/rel/sc fences. The same holds for C11/C++11. If you look at either the standard or the Batty model, you'll see that for every pair like store(rel)--load(acq), there is also store(rel)--fence(acq)+load(relaxed), store(relaxed)+fence(rel)--fence(acq)+load(relaxed), etc. defined, giving the same semantics. Likewise for SC. Do you have a pointer to that? It would help. You can also build Dekker with SC stores and acq loads, if I'm not mistaken. Typically one would probably use SC fences and relaxed stores/loads. Yes. I guess so. But you also have to consider the legacy that you create. I do think the C11/C++11 model will used widely, and more and more people will used to it. I don't think many people will learn how to use the various non-seqcst modes... At least so far I punted. :) But you already use similarly weaker orderings that the other abstractions provide (e.g., Java), so you're half-way there :) True. On the other hand you can treat Java like kinda SC but don't worry, you won't see the difference. It is both worrisome and appealing... Paolo
[Qemu-devel] [PATCH v2 2/8] exec: Clean up fall back when -mem-path allocation fails
With -mem-path, qemu_ram_alloc_from_ptr() first tries to allocate accordingly, but when it fails, it falls back to normal allocation. The fall back allocation code used to be effectively identical to the -mem-path not given code, until it started to diverge in commit 432d268. I believe the code still works, but clean it up anyway: drop the special fall back allocation code, and fall back to the ordinary -mem-path not given code instead. Reviewed-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Markus Armbruster arm...@redhat.com --- exec.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/exec.c b/exec.c index b424e12..56c31a9 100644 --- a/exec.c +++ b/exec.c @@ -1091,15 +1091,12 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host, if (mem_path) { #if defined (__linux__) !defined(TARGET_S390X) new_block-host = file_ram_alloc(new_block, size, mem_path); -if (!new_block-host) { -new_block-host = qemu_anon_ram_alloc(size); -memory_try_enable_merging(new_block-host, size); -} #else fprintf(stderr, -mem-path option unsupported\n); exit(1); #endif -} else { +} +if (!new_block-host) { if (kvm_enabled()) { /* some s390/kvm configurations have special constraints */ new_block-host = kvm_ram_alloc(size); -- 1.7.11.7
[Qemu-devel] [PATCH v2 1/8] exec: Fix Xen RAM allocation with unusual options
Issues: * We try to obey -mem-path even though it can't work with Xen. * To implement -machine mem-merge, we call memory_try_enable_merging(new_block-host, size). But with Xen, new_block-host remains null. Oops. Fix by separating Xen allocation from normal allocation. Acked-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Signed-off-by: Markus Armbruster arm...@redhat.com --- exec.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/exec.c b/exec.c index 5b8b40d..b424e12 100644 --- a/exec.c +++ b/exec.c @@ -1081,6 +1081,12 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host, if (host) { new_block-host = host; new_block-flags |= RAM_PREALLOC_MASK; +} else if (xen_enabled()) { +if (mem_path) { +fprintf(stderr, -mem-path not supported with Xen\n); +exit(1); +} +xen_ram_alloc(new_block-offset, size, mr); } else { if (mem_path) { #if defined (__linux__) !defined(TARGET_S390X) @@ -1094,9 +1100,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host, exit(1); #endif } else { -if (xen_enabled()) { -xen_ram_alloc(new_block-offset, size, mr); -} else if (kvm_enabled()) { +if (kvm_enabled()) { /* some s390/kvm configurations have special constraints */ new_block-host = kvm_ram_alloc(size); } else { @@ -1174,6 +1178,8 @@ void qemu_ram_free(ram_addr_t addr) ram_list.version++; if (block-flags RAM_PREALLOC_MASK) { ; +} else if (xen_enabled()) { +xen_invalidate_map_cache_entry(block-host); } else if (mem_path) { #if defined (__linux__) !defined(TARGET_S390X) if (block-fd) { @@ -1186,11 +1192,7 @@ void qemu_ram_free(ram_addr_t addr) abort(); #endif } else { -if (xen_enabled()) { -xen_invalidate_map_cache_entry(block-host); -} else { -qemu_anon_ram_free(block-host, block-length); -} +qemu_anon_ram_free(block-host, block-length); } g_free(block); break; @@ -1214,6 +1216,8 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length) vaddr = block-host + offset; if (block-flags RAM_PREALLOC_MASK) { ; +} else if (xen_enabled()) { +abort(); } else { flags = MAP_FIXED; munmap(vaddr, length); -- 1.7.11.7
[Qemu-devel] [PATCH v2 8/8] pc_sysfw: Fix ISA BIOS init for ridiculously big flash
pc_isa_bios_init() suffers integer overflow for flash larger than INT_MAX. Signed-off-by: Markus Armbruster arm...@redhat.com --- hw/block/pc_sysfw.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/hw/block/pc_sysfw.c b/hw/block/pc_sysfw.c index 412d1b0..aebefc9 100644 --- a/hw/block/pc_sysfw.c +++ b/hw/block/pc_sysfw.c @@ -54,10 +54,7 @@ static void pc_isa_bios_init(MemoryRegion *rom_memory, flash_size = memory_region_size(flash_mem); /* map the last 128KB of the BIOS in ISA space */ -isa_bios_size = flash_size; -if (isa_bios_size (128 * 1024)) { -isa_bios_size = 128 * 1024; -} +isa_bios_size = MIN(flash_size, 128 * 1024); isa_bios = g_malloc(sizeof(*isa_bios)); memory_region_init_ram(isa_bios, isa-bios, isa_bios_size); vmstate_register_ram_global(isa_bios); -- 1.7.11.7
[Qemu-devel] [PATCH v2 3/8] exec: Reduce ifdeffery around -mem-path
Instead of spreading its ifdeffery everywhere, confine it to qemu_ram_alloc_from_ptr(). Everywhere else, simply test block-fd, which is non-negative exactly when block uses -mem-path. Signed-off-by: Markus Armbruster arm...@redhat.com --- exec.c | 37 ++--- include/exec/cpu-all.h | 2 -- 2 files changed, 10 insertions(+), 29 deletions(-) diff --git a/exec.c b/exec.c index 56c31a9..4dbb0f1 100644 --- a/exec.c +++ b/exec.c @@ -1073,6 +1073,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host, size = TARGET_PAGE_ALIGN(size); new_block = g_malloc0(sizeof(*new_block)); +new_block-fd = -1; /* This assumes the iothread lock is taken here too. */ qemu_mutex_lock_ramlist(); @@ -1177,17 +1178,9 @@ void qemu_ram_free(ram_addr_t addr) ; } else if (xen_enabled()) { xen_invalidate_map_cache_entry(block-host); -} else if (mem_path) { -#if defined (__linux__) !defined(TARGET_S390X) -if (block-fd) { -munmap(block-host, block-length); -close(block-fd); -} else { -qemu_anon_ram_free(block-host, block-length); -} -#else -abort(); -#endif +} else if (block-fd = 0) { +munmap(block-host, block-length); +close(block-fd); } else { qemu_anon_ram_free(block-host, block-length); } @@ -1218,25 +1211,15 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length) } else { flags = MAP_FIXED; munmap(vaddr, length); -if (mem_path) { -#if defined(__linux__) !defined(TARGET_S390X) -if (block-fd) { +if (block-fd = 0) { #ifdef MAP_POPULATE -flags |= mem_prealloc ? MAP_POPULATE | MAP_SHARED : -MAP_PRIVATE; +flags |= mem_prealloc ? MAP_POPULATE | MAP_SHARED : +MAP_PRIVATE; #else -flags |= MAP_PRIVATE; -#endif -area = mmap(vaddr, length, PROT_READ | PROT_WRITE, -flags, block-fd, offset); -} else { -flags |= MAP_PRIVATE | MAP_ANONYMOUS; -area = mmap(vaddr, length, PROT_READ | PROT_WRITE, -flags, -1, 0); -} -#else -abort(); +flags |= MAP_PRIVATE; #endif +area = mmap(vaddr, length, PROT_READ | PROT_WRITE, +flags, block-fd, offset); } else { #if defined(TARGET_S390X) defined(CONFIG_KVM) flags |= MAP_SHARED | MAP_ANONYMOUS; diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h index e9c3717..c369b25 100644 --- a/include/exec/cpu-all.h +++ b/include/exec/cpu-all.h @@ -476,9 +476,7 @@ typedef struct RAMBlock { * Writes must take both locks. */ QTAILQ_ENTRY(RAMBlock) next; -#if defined(__linux__) !defined(TARGET_S390X) int fd; -#endif } RAMBlock; typedef struct RAMList { -- 1.7.11.7
[Qemu-devel] [PATCH v2 4/8] exec: Simplify the guest physical memory allocation hook
Make it a generic hook rather than a KVM hook. Less code and ifdeffery. Since the only user of the hook is old S390 KVM, there's hope we can get rid of it some day. Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Markus Armbruster arm...@redhat.com --- exec.c | 20 ++-- include/exec/exec-all.h | 2 ++ include/sysemu/kvm.h| 5 - kvm-all.c | 13 - target-s390x/kvm.c | 17 ++--- 5 files changed, 22 insertions(+), 35 deletions(-) diff --git a/exec.c b/exec.c index 4dbb0f1..c45eb33 100644 --- a/exec.c +++ b/exec.c @@ -685,6 +685,19 @@ typedef struct subpage_t { static int subpage_register (subpage_t *mmio, uint32_t start, uint32_t end, uint16_t section); static subpage_t *subpage_init(hwaddr base); + +static void *(*phys_mem_alloc)(ram_addr_t size) = qemu_anon_ram_alloc; + +/* + * Set a custom physical guest memory alloator. + * Accelerators with unusual needs may need this. Hopefully, we can + * get rid of it eventually. + */ +void phys_mem_set_alloc(void *(*alloc)(ram_addr_t)) +{ +phys_mem_alloc = alloc; +} + static void destroy_page_desc(uint16_t section_index) { MemoryRegionSection *section = phys_sections[section_index]; @@ -1098,12 +,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host, #endif } if (!new_block-host) { -if (kvm_enabled()) { -/* some s390/kvm configurations have special constraints */ -new_block-host = kvm_ram_alloc(size); -} else { -new_block-host = qemu_anon_ram_alloc(size); -} +new_block-host = phys_mem_alloc(size); memory_try_enable_merging(new_block-host, size); } } diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index b2162a4..4921696 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -369,6 +369,8 @@ bool is_tcg_gen_code(uintptr_t pc_ptr); #if !defined(CONFIG_USER_ONLY) +void phys_mem_set_alloc(void *(*alloc)(ram_addr_t)); + struct MemoryRegion *iotlb_to_region(hwaddr index); bool io_mem_read(struct MemoryRegion *mr, hwaddr addr, uint64_t *pvalue, unsigned size); diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index 8b19322..e722027 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -151,11 +151,6 @@ int kvm_init_vcpu(CPUState *cpu); #ifdef NEED_CPU_H int kvm_cpu_exec(CPUArchState *env); -#if !defined(CONFIG_USER_ONLY) -void *kvm_ram_alloc(ram_addr_t size); -void *kvm_arch_ram_alloc(ram_addr_t size); -#endif - void kvm_setup_guest_memory(void *start, size_t size); void kvm_flush_coalesced_mmio_buffer(void); diff --git a/kvm-all.c b/kvm-all.c index 405480e..f88c4ec 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1816,19 +1816,6 @@ int kvm_has_intx_set_mask(void) return kvm_state-intx_set_mask; } -void *kvm_ram_alloc(ram_addr_t size) -{ -#ifdef TARGET_S390X -void *mem; - -mem = kvm_arch_ram_alloc(size); -if (mem) { -return mem; -} -#endif -return qemu_anon_ram_alloc(size); -} - void kvm_setup_guest_memory(void *start, size_t size) { #ifdef CONFIG_VALGRIND_H diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c index 4d9ac4a..e7863d7 100644 --- a/target-s390x/kvm.c +++ b/target-s390x/kvm.c @@ -92,9 +92,15 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = { static int cap_sync_regs; +static void *legacy_s390_alloc(ram_addr_t size); + int kvm_arch_init(KVMState *s) { cap_sync_regs = kvm_check_extension(s, KVM_CAP_SYNC_REGS); +if (!kvm_check_extension(s, KVM_CAP_S390_GMAP) +|| !kvm_check_extension(s, KVM_CAP_S390_COW)) { +phys_mem_set_alloc(legacy_s390_alloc); +} return 0; } @@ -332,17 +338,6 @@ static void *legacy_s390_alloc(ram_addr_t size) return mem; } -void *kvm_arch_ram_alloc(ram_addr_t size) -{ -/* Can we use the standard allocation ? */ -if (kvm_check_extension(kvm_state, KVM_CAP_S390_GMAP) -kvm_check_extension(kvm_state, KVM_CAP_S390_COW)) { -return NULL; -} else { -return legacy_s390_alloc(size); -} -} - int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) { S390CPU *cpu = S390_CPU(cs); -- 1.7.11.7
[Qemu-devel] [PATCH v2 0/8] Guest memory allocation fixes cleanup
All I wanted to do is exit(1) instead of abort() on guest memory allocation failure [07/08]. But that lead me into a minor #ifdef bog, and here's what I brought back. Enjoy! Testing: * Christian Borntraeger reports v1 works fine under LPAR (new S390 KVM, i.e. generic allocation) and as second guest under z/VM (old S390 KVM, i.e. legacy S390 allocation). Thanks for testing, and for catching a stupid mistake. v2 differs from v1 only in code that isn't reachable on S390. Changes since v1: * 5/8: Fix assertion in qemu_ram_remap() (Paolo) * All other patches unchanged except for Acked-by in commit messages Changes since RFC: * 1-3+8/8 unchanged except for commit message tweaks * 4+6/8 rewritten to address Paolo's review * 5/8 rewritten: don't fix dead code, just assert it's dead * 7/8 fix mistakes caught by Richard Henderson and Peter Maydell Markus Armbruster (8): exec: Fix Xen RAM allocation with unusual options exec: Clean up fall back when -mem-path allocation fails exec: Reduce ifdeffery around -mem-path exec: Simplify the guest physical memory allocation hook exec: Drop incorrect dead S390 code in qemu_ram_remap() exec: Clean up unnecessary S390 ifdeffery exec: Don't abort when we can't allocate guest memory pc_sysfw: Fix ISA BIOS init for ridiculously big flash exec.c | 121 ++-- hw/block/pc_sysfw.c | 5 +- include/exec/cpu-all.h | 2 - include/exec/exec-all.h | 2 + include/sysemu/kvm.h| 5 -- kvm-all.c | 13 -- target-s390x/kvm.c | 23 +++-- util/oslib-posix.c | 4 +- util/oslib-win32.c | 5 +- 9 files changed, 78 insertions(+), 102 deletions(-) -- 1.7.11.7
[Qemu-devel] [PATCH v2 5/8] exec: Drop incorrect dead S390 code in qemu_ram_remap()
Old S390 KVM wants guest RAM mapped in a peculiar way. Commit 6b02494 implemented that. When qemu_ram_remap() got added in commit cd19cfa, its code carefully mimicked the allocation code: peculiar way if defined(TARGET_S390X) defined(CONFIG_KVM), else normal way. For new S390 KVM, we actually want the normal way. Commit fdec991 changed qemu_ram_alloc_from_ptr() accordingly, but forgot to update qemu_ram_remap(). If qemu_ram_alloc_from_ptr() maps RAM the normal way, but qemu_ram_remap() remaps it the peculiar way, remapping changes protection and flags, which it shouldn't. Fortunately, this can't happen, as we never remap on S390. Replace the incorrect code with an assertion. Thanks to Christian Borntraeger for help with assessing the bug's (non-)impact. Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Markus Armbruster arm...@redhat.com --- exec.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/exec.c b/exec.c index c45eb33..366ac6a 100644 --- a/exec.c +++ b/exec.c @@ -1229,15 +1229,16 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length) area = mmap(vaddr, length, PROT_READ | PROT_WRITE, flags, block-fd, offset); } else { -#if defined(TARGET_S390X) defined(CONFIG_KVM) -flags |= MAP_SHARED | MAP_ANONYMOUS; -area = mmap(vaddr, length, PROT_EXEC|PROT_READ|PROT_WRITE, -flags, -1, 0); -#else +/* + * Remap needs to match alloc. Accelerators that + * set phys_mem_alloc never remap. If they did, + * we'd need a remap hook here. + */ +assert(phys_mem_alloc == qemu_anon_ram_alloc); + flags |= MAP_PRIVATE | MAP_ANONYMOUS; area = mmap(vaddr, length, PROT_READ | PROT_WRITE, flags, -1, 0); -#endif } if (area != vaddr) { fprintf(stderr, Could not remap addr: -- 1.7.11.7
[Qemu-devel] [PATCH v2 6/8] exec: Clean up unnecessary S390 ifdeffery
Another issue missed in commit fdec991 is -mem-path: it needs to be rejected only for old S390 KVM, not for any S390. Not that I personally care, but the ifdeffery in qemu_ram_alloc_from_ptr() annoys me. Note that this doesn't actually make -mem-path work, as the kernel doesn't (yet?) support large pages in the host for KVM guests. Clean it up anyway. Thanks to Christian Borntraeger for pointing out the S390 kernel limitations. Signed-off-by: Markus Armbruster arm...@redhat.com --- exec.c | 25 +++-- 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/exec.c b/exec.c index 366ac6a..bf2a7d6 100644 --- a/exec.c +++ b/exec.c @@ -862,7 +862,7 @@ void qemu_mutex_unlock_ramlist(void) qemu_mutex_unlock(ram_list.mutex); } -#if defined(__linux__) !defined(TARGET_S390X) +#ifdef __linux__ #include sys/vfs.h @@ -965,6 +965,14 @@ static void *file_ram_alloc(RAMBlock *block, block-fd = fd; return area; } +#else +static void *file_ram_alloc(RAMBlock *block, +ram_addr_t memory, +const char *path) +{ +fprintf(stderr, -mem-path not supported on this host\n); +exit(1); +} #endif static ram_addr_t find_ram_offset(ram_addr_t size) @@ -1103,12 +,17 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host, xen_ram_alloc(new_block-offset, size, mr); } else { if (mem_path) { -#if defined (__linux__) !defined(TARGET_S390X) +if (phys_mem_alloc != qemu_anon_ram_alloc) { +/* + * file_ram_alloc() needs to allocate just like + * phys_mem_alloc, but we haven't bothered to provide + * a hook there. + */ +fprintf(stderr, +-mem-path not supported with this accelerator\n); +exit(1); +} new_block-host = file_ram_alloc(new_block, size, mem_path); -#else -fprintf(stderr, -mem-path option unsupported\n); -exit(1); -#endif } if (!new_block-host) { new_block-host = phys_mem_alloc(size); -- 1.7.11.7
[Qemu-devel] [PATCH v2 7/8] exec: Don't abort when we can't allocate guest memory
We abort() on memory allocation failure. abort() is appropriate for programming errors. Maybe most memory allocation failures are programming errors, maybe not. But guest memory allocation failure isn't, and aborting when the user asks for more memory than we can provide is not nice. exit(1) instead, and do it in just one place, so the error message is consistent. Tested-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Markus Armbruster arm...@redhat.com --- exec.c | 5 + target-s390x/kvm.c | 6 +- util/oslib-posix.c | 4 +--- util/oslib-win32.c | 5 + 4 files changed, 8 insertions(+), 12 deletions(-) diff --git a/exec.c b/exec.c index bf2a7d6..3f7fe29 100644 --- a/exec.c +++ b/exec.c @@ -1125,6 +1125,11 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host, } if (!new_block-host) { new_block-host = phys_mem_alloc(size); +if (!new_block-host) { +fprintf(stderr, Cannot set up guest memory '%s': %s\n, +new_block-mr-name, strerror(errno)); +exit(1); +} memory_try_enable_merging(new_block-host, size); } } diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c index e7863d7..b1ffcea 100644 --- a/target-s390x/kvm.c +++ b/target-s390x/kvm.c @@ -331,11 +331,7 @@ static void *legacy_s390_alloc(ram_addr_t size) mem = mmap((void *) 0x8ULL, size, PROT_EXEC|PROT_READ|PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0); -if (mem == MAP_FAILED) { -fprintf(stderr, Allocating RAM failed\n); -abort(); -} -return mem; +return mem == MAP_FAILED ? NULL : mem; } int kvm_arch_insert_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp) diff --git a/util/oslib-posix.c b/util/oslib-posix.c index 3dc8b1b..253bc3d 100644 --- a/util/oslib-posix.c +++ b/util/oslib-posix.c @@ -112,9 +112,7 @@ void *qemu_anon_ram_alloc(size_t size) size_t offset = QEMU_ALIGN_UP((uintptr_t)ptr, align) - (uintptr_t)ptr; if (ptr == MAP_FAILED) { -fprintf(stderr, Failed to allocate %zu B: %s\n, -size, strerror(errno)); -abort(); +return NULL; } ptr += offset; diff --git a/util/oslib-win32.c b/util/oslib-win32.c index 961fbf5..983b7a2 100644 --- a/util/oslib-win32.c +++ b/util/oslib-win32.c @@ -65,10 +65,7 @@ void *qemu_anon_ram_alloc(size_t size) /* FIXME: this is not exactly optimal solution since VirtualAlloc has 64Kb granularity, but at least it guarantees us that the memory is page aligned. */ -if (!size) { -abort(); -} -ptr = qemu_oom_check(VirtualAlloc(NULL, size, MEM_COMMIT, PAGE_READWRITE)); +ptr = VirtualAlloc(NULL, size, MEM_COMMIT, PAGE_READWRITE); trace_qemu_anon_ram_alloc(size, ptr); return ptr; } -- 1.7.11.7
Re: [Qemu-devel] [PATCH v3 1/2] pvpanic: initialization cleanup
On 06/19/13 17:02, Michael S. Tsirkin wrote: Avoid use of static variables: PC systems initialize pvpanic device through pvpanic_init, so we can simply create the fw_cfg file at that point. This also makes it possible to skip device creation completely if fw_cfg is not there, e.g. for xen - so the ports it reserves are not discoverable by guests. Also, make pvpanic_init void since callers ignore return status anyway. Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Laszlo Ersek ler...@redhat.com Cc: Paul Durrant paul.durr...@citrix.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- Chanes from v2: skip device creation completely if !fw_cfg make pvpanic_init void Changes from v1: don't assert if !fw_cfg hw/misc/pvpanic.c| 30 -- include/hw/i386/pc.h | 2 +- 2 files changed, 17 insertions(+), 15 deletions(-) diff --git a/hw/misc/pvpanic.c b/hw/misc/pvpanic.c index 060099b..83ed226 100644 --- a/hw/misc/pvpanic.c +++ b/hw/misc/pvpanic.c @@ -97,26 +97,28 @@ static void pvpanic_isa_realizefn(DeviceState *dev, Error **errp) { ISADevice *d = ISA_DEVICE(dev); PVPanicState *s = ISA_PVPANIC_DEVICE(dev); -static bool port_configured; -FWCfgState *fw_cfg; isa_register_ioport(d, s-io, s-ioport); +} -if (!port_configured) { -fw_cfg = fw_cfg_find(); -if (fw_cfg) { -fw_cfg_add_file(fw_cfg, etc/pvpanic-port, -g_memdup(s-ioport, sizeof(s-ioport)), -sizeof(s-ioport)); -port_configured = true; -} -} +static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg) +{ +PVPanicState *s = ISA_PVPANIC_DEVICE(dev); + +fw_cfg_add_file(fw_cfg, etc/pvpanic-port, +g_memdup(s-ioport, sizeof(s-ioport)), +sizeof(s-ioport)); } -int pvpanic_init(ISABus *bus) +void pvpanic_init(ISABus *bus) { -isa_create_simple(bus, TYPE_ISA_PVPANIC_DEVICE); -return 0; +ISADevice *dev; +FWCfgState *fw_cfg = fw_cfg_find(); +if (!fw_cfg) { +return; +} +dev = isa_create_simple (bus, TYPE_ISA_PVPANIC_DEVICE); +pvpanic_fw_cfg(dev, fw_cfg); } static Property pvpanic_isa_properties[] = { diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index ba9ba1a..458eded 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -196,7 +196,7 @@ static inline bool isa_ne2000_init(ISABus *bus, int base, int irq, NICInfo *nd) void pc_system_firmware_init(MemoryRegion *rom_memory); /* pvpanic.c */ -int pvpanic_init(ISABus *bus); +void pvpanic_init(ISABus *bus); /* e820 types */ #define E820_RAM1 series Reviewed-by: Laszlo Ersek ler...@redhat.com
Re: [Qemu-devel] [PATCH v3 1/2] add a header file for atomic operations
Il 19/06/2013 22:44, Richard Henderson ha scritto: +/* Data must be read atomically. We don't really need barrier semantics + * but it's easier to use atomic_* than roll our own. */ +log = atomic_xchg(from, 0); If you really don't need any ordering guarantees / barriers here, then using a relaxed load should be fine. But my gut feeling tells me you probably do need some barriers; either you are re-using another barrier (and then the comment should probably point out which), or it must be a case where it's either fine to read any value someone (else) wrote or there's no concurrent store after all. There is a store here, before and after. Read the value, store zero. I suppose what the comment is saying is that the atomic operation doesn't need to be ordered with respect to the rest of the surrounding code, as the object being synchronized is just that one integer. Exactly. The items of the array can be read independently. Paolo
Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
--On 20 June 2013 07:47:12 + Paul Durrant paul.durr...@citrix.com wrote: If someone has a VM that has the old Citrix drivers installed, or GPLPV, I think I should be able to spot this and make sure that the new bus driver quiesces itself to prevent strangeness ensuing. If and when said previous drivers are un-installed then the new bus driver can wake up and enumerate the device nodes for the other pv drivers and Windows Update can carry on doing its stuff. I have no clue about Windows device drivers, so this may be a silly suggestion. If your suggestion above already requires a Xen code change, one possibility might be copy the idea behind the PCI unplug logic. Either if the new PCI device is used, it could unplug the old one, or vice versa. Drivers magically unplugging themselves may not be ideal, but it beats having 2 drivers fighting over the same device. -- Alex Bligh
Re: [Qemu-devel] [PATCH v3 2/2] QEMUBH: make AioContext's bh re-entrant
Il 20/06/2013 09:39, Stefan Hajnoczi ha scritto: qemu_bh_cancel() and qemu_bh_delete() are not modified by this patch. It seems that calling them from a thread is a little risky because there is no guarantee that the BH is no longer invoked after a thread calls these functions. I think that's worth a comment or do you want them to take the lock so they become safe? Taking the lock wouldn't help. The invoking loop of aio_bh_poll runs lockless. I think a comment is better. qemu_bh_cancel is inherently not thread-safe, there's not much you can do about it. qemu_bh_delete is safe as long as you wait for the bottom half to stop before deleting the containing object. Once we have RCU, deletion of QOM objects will be RCU-protected. Hence, a simple way could be to put the first part of aio_bh_poll() within rcu_read_lock/unlock. The other thing I'm unclear on is the -idle assignment followed immediately by a -scheduled assignment. Without memory barriers aio_bh_poll() isn't guaranteed to get an ordered view of these updates: it may see an idle BH as a regular scheduled BH because -idle is still 0. Right. You need to order -idle writes before -scheduled writes, and add memory barriers, or alternatively use two bits in -scheduled so that you can assign both atomically. Paolo
Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
-Original Message- From: Alex Bligh [mailto:a...@alex.org.uk] Sent: 20 June 2013 09:09 To: Paul Durrant; Tim (Xen.org); Matt Wilson Cc: xen-de...@lists.xen.org; Ian Campbell; qemu-devel@nongnu.org; Alex Bligh Subject: RE: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device version 2. --On 20 June 2013 07:47:12 + Paul Durrant paul.durr...@citrix.com wrote: If someone has a VM that has the old Citrix drivers installed, or GPLPV, I think I should be able to spot this and make sure that the new bus driver quiesces itself to prevent strangeness ensuing. If and when said previous drivers are un-installed then the new bus driver can wake up and enumerate the device nodes for the other pv drivers and Windows Update can carry on doing its stuff. I have no clue about Windows device drivers, so this may be a silly suggestion. If your suggestion above already requires a Xen code change, one possibility might be copy the idea behind the PCI unplug logic. Either if the new PCI device is used, it could unplug the old one, or vice versa. Drivers magically unplugging themselves may not be ideal, but it beats having 2 drivers fighting over the same device. Unfortunately, whilst it sounds good on the face of it, it's not as straightforward as that. The old Citrix PV drivers did not just bind to the Xen platform device, and make that device go away automagically would actually cause the system disk to disappear without any clean fallback to emulation. As long as nothing actually breaks if and when Windows fetches the new PV bus driver from Windows Update then we can document the need to manually uninstall any other PV drivers. Paul
Re: [Qemu-devel] [RESEND PATCH] virtio-scsi: forward scsibus for virtio-scsi-pci.
On 18/06/2013 17:21, Michael S. Tsirkin wrote: On Fri, Jun 14, 2013 at 08:13:29AM +0200, Frederic Konrad wrote: On 13/06/2013 09:59, Michael S. Tsirkin wrote: On Thu, Jun 13, 2013 at 09:34:30AM +0200, Frederic Konrad wrote: On 13/06/2013 09:23, Michael S. Tsirkin wrote: On Thu, Jun 13, 2013 at 04:46:09PM +1000, Alexey Kardashevskiy wrote: On 06/13/2013 04:28 PM, Frederic Konrad wrote: On 12/06/2013 13:21, Alexey Kardashevskiy wrote: On 06/12/2013 07:16 PM, Michael S. Tsirkin wrote: On Wed, Jun 12, 2013 at 07:04:48PM +1000, Alexey Kardashevskiy wrote: On 06/12/2013 07:03 PM, Michael S. Tsirkin wrote: On Wed, Jun 12, 2013 at 08:15:17AM +0200, fred.kon...@greensocs.com wrote: From: KONRAD Frederic fred.kon...@greensocs.com This fix a bug with scsi hotplug on virtio-scsi-pci: As virtio-scsi-pci doesn't have any scsi bus, we need to forward scsi-hot-add to the virtio-scsi-device plugged on the virtio-bus. Cc: qemu-sta...@nongnu.org Reported-by: Alexey Kardashevskiy a...@ozlabs.ru Reviewed-by: Andreas Färber afaer...@suse.de Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com Acked-by: Michael S. Tsirkin m...@redhat.com Note: we don't seem to have any decent way to add disks to devices: no QMP interface, pci address is required instead of using an id ... Anyone can be bothered to fix this? Actually PCI address is not always required, this field (we are talking about drive_add?) is ignored when if=none. Then documentation in hmp-commands.hx is wrong, isn't it? Add that to the list. if=none can't be actually used to hot-add a disk to a device, can it? It creates a disc and assumes you will use it by a device created later. Yep. I run QEMU with -device virtio-scsi-pci,id=device0 and then do in console: drive_add auto file=virtimg/fc18guest,if=none,id=bar1 device_add scsi-disk,bus=device0.0,drive=bar1 Pretty hot plug :) I thought you use drive_add 0 if=scsi? That's the other option, I posted a bug but I did not actually try the fix till now :) It works now if I run QEMU with -device virtio-scsi-pci and do this in qemu console: drive_add 0 file=virtimg/fc18guest No extra parameters or anything, cool, thanks, and :) Tested-by: Alexey Kardashevskiy a...@ozlabs.ru The only problem with it that it still wants PCI SCSI adapter while spapr-vscsi is VIO device so if the guest kernel does not have virtio-scsi support, I have to do what I described in the quote but this is a different story. Okay. How about: - document that pci_addr is optional in hmp - if no pci_addr assume if=none - add drive_add to qmp without the pci_addr and if options We are left with the bus=device0.0 syntax for device_add which is also gross - user asked for device0, the .0 part is qemu internals exposed to users. How about teaching qdev that if there's a single bus under a device, naming the device itself should be identical? Yes why not seems a good idea, but you'll pass it through bus= option? This will solve the problem neatly without virtio specific hacks, won't it? The issue here is command line back-compatibility for pci_addr, which won't be solved with the single bus idea? Why not? This code: scsibus = (SCSIBus *) object_dynamic_cast(OBJECT(QLIST_FIRST(adapter-child_bus)), TYPE_SCSI_BUS); should be replaced with code from qdev that we'll write that goes down the chain as long as there's 1 device on each bus, looking for a device of the appropriate type. Ok, understood what you mean :). Why not if everybody is happy with it. Fred Ok so - want to try implementing this? Ok, will try to look at it next week. What about the stable release? Wouldn't be safe to take this patch for the stable? Fred --- hw/pci/pci-hotplug.c | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/hw/pci/pci-hotplug.c b/hw/pci/pci-hotplug.c index 12287d1..c708752 100644 --- a/hw/pci/pci-hotplug.c +++ b/hw/pci/pci-hotplug.c @@ -30,6 +30,8 @@ #include monitor/monitor.h #include hw/scsi/scsi.h #include hw/virtio/virtio-blk.h +#include hw/virtio/virtio-scsi.h +#include hw/virtio/virtio-pci.h #include qemu/config-file.h #include sysemu/blockdev.h #include qapi/error.h @@ -79,13 +81,26 @@ static int scsi_hot_add(Monitor *mon, DeviceState *adapter, { SCSIBus *scsibus; SCSIDevice *scsidev; +VirtIOPCIProxy *virtio_proxy; scsibus = (SCSIBus *) object_dynamic_cast(OBJECT(QLIST_FIRST(adapter-child_bus)), TYPE_SCSI_BUS); if (!scsibus) { -error_report(Device is not a SCSI adapter); -return -1; +/* + * Check if the adapter is a virtio-scsi-pci, and forward scsi_hot_add + * to the virtio-scsi-device. + */ +if (!object_dynamic_cast(OBJECT(adapter), TYPE_VIRTIO_SCSI_PCI)) { +error_report(Device is not a SCSI adapter); +return -1; +} +virtio_proxy = VIRTIO_PCI(adapter); +
Re: [Qemu-devel] [RESEND PATCH] virtio-scsi: forward scsibus for virtio-scsi-pci.
On Thu, Jun 20, 2013 at 10:26:18AM +0200, Frederic Konrad wrote: On 18/06/2013 17:21, Michael S. Tsirkin wrote: On Fri, Jun 14, 2013 at 08:13:29AM +0200, Frederic Konrad wrote: On 13/06/2013 09:59, Michael S. Tsirkin wrote: On Thu, Jun 13, 2013 at 09:34:30AM +0200, Frederic Konrad wrote: On 13/06/2013 09:23, Michael S. Tsirkin wrote: On Thu, Jun 13, 2013 at 04:46:09PM +1000, Alexey Kardashevskiy wrote: On 06/13/2013 04:28 PM, Frederic Konrad wrote: On 12/06/2013 13:21, Alexey Kardashevskiy wrote: On 06/12/2013 07:16 PM, Michael S. Tsirkin wrote: On Wed, Jun 12, 2013 at 07:04:48PM +1000, Alexey Kardashevskiy wrote: On 06/12/2013 07:03 PM, Michael S. Tsirkin wrote: On Wed, Jun 12, 2013 at 08:15:17AM +0200, fred.kon...@greensocs.com wrote: From: KONRAD Frederic fred.kon...@greensocs.com This fix a bug with scsi hotplug on virtio-scsi-pci: As virtio-scsi-pci doesn't have any scsi bus, we need to forward scsi-hot-add to the virtio-scsi-device plugged on the virtio-bus. Cc: qemu-sta...@nongnu.org Reported-by: Alexey Kardashevskiy a...@ozlabs.ru Reviewed-by: Andreas Färber afaer...@suse.de Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com Acked-by: Michael S. Tsirkin m...@redhat.com Note: we don't seem to have any decent way to add disks to devices: no QMP interface, pci address is required instead of using an id ... Anyone can be bothered to fix this? Actually PCI address is not always required, this field (we are talking about drive_add?) is ignored when if=none. Then documentation in hmp-commands.hx is wrong, isn't it? Add that to the list. if=none can't be actually used to hot-add a disk to a device, can it? It creates a disc and assumes you will use it by a device created later. Yep. I run QEMU with -device virtio-scsi-pci,id=device0 and then do in console: drive_add auto file=virtimg/fc18guest,if=none,id=bar1 device_add scsi-disk,bus=device0.0,drive=bar1 Pretty hot plug :) I thought you use drive_add 0 if=scsi? That's the other option, I posted a bug but I did not actually try the fix till now :) It works now if I run QEMU with -device virtio-scsi-pci and do this in qemu console: drive_add 0 file=virtimg/fc18guest No extra parameters or anything, cool, thanks, and :) Tested-by: Alexey Kardashevskiy a...@ozlabs.ru The only problem with it that it still wants PCI SCSI adapter while spapr-vscsi is VIO device so if the guest kernel does not have virtio-scsi support, I have to do what I described in the quote but this is a different story. Okay. How about: - document that pci_addr is optional in hmp - if no pci_addr assume if=none - add drive_add to qmp without the pci_addr and if options We are left with the bus=device0.0 syntax for device_add which is also gross - user asked for device0, the .0 part is qemu internals exposed to users. How about teaching qdev that if there's a single bus under a device, naming the device itself should be identical? Yes why not seems a good idea, but you'll pass it through bus= option? This will solve the problem neatly without virtio specific hacks, won't it? The issue here is command line back-compatibility for pci_addr, which won't be solved with the single bus idea? Why not? This code: scsibus = (SCSIBus *) object_dynamic_cast(OBJECT(QLIST_FIRST(adapter-child_bus)), TYPE_SCSI_BUS); should be replaced with code from qdev that we'll write that goes down the chain as long as there's 1 device on each bus, looking for a device of the appropriate type. Ok, understood what you mean :). Why not if everybody is happy with it. Fred Ok so - want to try implementing this? Ok, will try to look at it next week. What about the stable release? Wouldn't be safe to take this patch for the stable? Fred Yes. My ACK is for stable. --- hw/pci/pci-hotplug.c | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/hw/pci/pci-hotplug.c b/hw/pci/pci-hotplug.c index 12287d1..c708752 100644 --- a/hw/pci/pci-hotplug.c +++ b/hw/pci/pci-hotplug.c @@ -30,6 +30,8 @@ #include monitor/monitor.h #include hw/scsi/scsi.h #include hw/virtio/virtio-blk.h +#include hw/virtio/virtio-scsi.h +#include hw/virtio/virtio-pci.h #include qemu/config-file.h #include sysemu/blockdev.h #include qapi/error.h @@ -79,13 +81,26 @@ static int scsi_hot_add(Monitor *mon, DeviceState *adapter, { SCSIBus *scsibus; SCSIDevice *scsidev; +VirtIOPCIProxy *virtio_proxy; scsibus = (SCSIBus *) object_dynamic_cast(OBJECT(QLIST_FIRST(adapter-child_bus)), TYPE_SCSI_BUS); if (!scsibus) { -error_report(Device is not a SCSI adapter); -return -1; +/* + * Check if the adapter is a virtio-scsi-pci, and forward scsi_hot_add + * to the virtio-scsi-device. + */
Re: [Qemu-devel] [PATCH] pseries: Fix compiler warning (conversion of pointer to integral value)
Am 20.06.2013 um 07:10 schrieb Michael Tokarev m...@tls.msk.ru: 20.06.2013 01:40, Alexander Graf wrote: On 19.06.2013, at 23:08, Stefan Weil wrote: This kind of type cast must use uintptr_t or target_ulong to be portable for hosts with sizeof(void *) != sizeof(long). Here the value is assigned to a variable of type target_ulong. Signed-off-by: Stefan Weil s...@weilnetz.de Acked-by: Alexander Graf ag...@suse.de I suppose this one goes through the trivial tree? Anything which goes to -trivial can be applied directly or into some other subsystem tree first. When I send a pull request I rebase trivial tree ontop of current master and filter out anything which has been already applied, so that's not an issue. The only possible issue is when you applied it to some other tree, and -trivial pull request were handled before that other tree - how that will be handled by git? Will it complain (so that the situation should be resolved manually), will it apply nothing or will it appy an empty patch? (the patch signature will be different, with different S-o-b.) I think I've seen empty commits before in qemu tree, with the same subject/author as some previous commit. It depends on how you handle the tree. I rebase ppc-next too, so the commit would simply vanish. I'll apply it to ppc-next as well then. Alex
Re: [Qemu-devel] [PATCH 6/9] vhost-scsi: new device supporting the tcm_vhost Linux kernel module
Hi Asias, Thanks for your config. According to you config,I test booting from vhost device with upstream kernel and qemu,but failed. 1 installing guest from cdrom,ok. 2 booting vhost-scsi,guest fs error occurs. 3 using fileio backstores,the error is same.. 4 rebooting guest,a log printed: (qemu) hw/scsi/virtio-scsi.c:533:virtio_scsi_handle_event: Object 0x7fccae7f2c88 is not an instance of type virtio-scsi-device 5 using upstream seabios,core dumped. Could you give me some advise to debug this problem ? I can provide more information if need. The qemu cmd: [root@fedora121 x86_64-softmmu]# ./qemu-system-x86_64 -enable-kvm -name fedora -M pc -m 1024 -smp 2 -drive file=/home/fedora18.iso,if=ide,media=cdrom -device vhost-scsi-pci,wwpn=naa.50014057133e25dc -monitor stdio -vga qxl -vnc :1 The vnc output: Dracut-initqueue[189]:/dev/mapper/fedora-root:UNEXPECTED INCONSISTENCY;RUN FSCK MANUALLY. Dracut-initqueue[189]: Warning: e2fsck returned with 4 Dracut-initqueue[189]: Warning: ***An error occurred during the file system check. The guest kernel log: Kernel: virtio-pci :00:04.0: irq 40 for MSI/MSI-X Kernel: virtio-pci :00:04.0: irq 41 for MSI/MSI-X Kernel: virtio-pci :00:04.0: irq 42 for MSI/MSI-X Kernel: virtio-pci :00:04.0: irq 43 for MSI/MSI-X Kernel: scsi2 : Virtio SCSI HBA Kernel: scsi 2:0:1:0: Direct-Access LIO-ORG r0 Kernel: sd 2:0:1:0: Attached scsi generic sg1 type 0 Kernel: sd 2:0:1:0: [sda]1258912 512-byte logical . Kernel: sd 2:0:1:0: [sda]write protect is off Kernel: sd 2:0:1:0: [sda]Mode sense :43 00 00 08 Kernel: sd 2:0:1:0: [sda]write cache: disabled, read . Kernel: sda sda1 sda2 Kernel: sd 2:0:1:0: [sda] Attached SCSI disk Dracut-initqueue[189]: Scanning devices sda2 for LVM Dracut-initqueue[189]: inactive '/dev/fedora/swap'... Dracut-initqueue[189]: inactive '/dev/fedora/root'... The info of host: [root@fedora121 x86_64-softmmu]# uname -a Linux fedora121 3.10.0-rc6 #1 SMP Wed Jun 19 19:34:24 CST 2013 x86_64 x86_64 x86_64 GNU/Linux [root@fedora121 x86_64-softmmu]# lsmod |grep vhost_scsi vhost_scsi 49456 5 target_core_mod 282163 14 target_core_iblock,target_core_pscsi,iscsi_target_mod,target_core_file,vhost_scsi [root@fedora121 x86_64-softmmu]# targetcli targetcli shell version v2.1.fb26 Copyright 2011 by RisingTide Systems LLC and others. For help on commands, type 'help'. / ls o- / . [...] o- backstores .. [...] | o- block .. [Storage Objects: 0] | o- fileio . [Storage Objects: 0] | o- pscsi .. [Storage Objects: 0] | o- ramdisk [Storage Objects: 1] | o- r0 ... [(6.0GiB) activated] o- iscsi [Targets: 0] o- loopback . [Targets: 0] o- vhost [Targets: 1] o- naa.50014057133e25dc .. [TPGs: 1] o- tpg1 ... [naa.5001405a70ac3421] o- acls .. [ACLs: 0] o- luns .. [LUNs: 1] o- lun0 . [ramdisk/r0] Regards, baiqing -Original Message- From: Asias He [mailto:as...@redhat.com] Sent: Thursday, June 20, 2013 9:34 AM To: Libaiqing Cc: Paolo Bonzini; Wenchao Xia; qemu-devel@nongnu.org; n...@linux-iscsi.org; Michael S. Tsirkin; Haofeng Subject: Re: [Qemu-devel] [PATCH 6/9] vhost-scsi: new device supporting the tcm_vhost Linux kernel module On Wed, Jun 19, 2013 at 12:55:10PM +, Libaiqing wrote: Hi paolo, The vhost-scsi device can be used as boot device? I tested with your config + 3.10 rc6 +
Re: [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers
Il 20/06/2013 09:26, Fam Zheng ha scritto: On Fri, 06/14 11:48, Stefan Hajnoczi wrote: From: Paolo Bonzini pbonz...@redhat.com Fast TLS is not available on some platforms, but it is always nice to use it. This wrapper implementation falls back to pthread_get/setspecific on POSIX systems that lack __thread, but uses the dynamic linker's TLS support on Linux and Windows. The user shall call alloc_foo() in every thread that needs to access the variable---exactly once and before any access. foo is the name of the variable as passed to DECLARE_TLS and DEFINE_TLS. Then, get_foo() will return the address of the variable. It is guaranteed to remain the same across the lifetime of a thread, so you can cache it. Would tls_alloc_foo() and tls_get_foo() be easier to read and less possible for name conflict? Fine by me. Paolo
Re: [Qemu-devel] [PATCH 06/12] spapr-vty: add copyright and license
Il 19/06/2013 22:40, Anthony Liguori ha scritto: If you are on CC, then please Ack this patch as you touched this file at some point in time. Cc: Alexey Kardashevskiy a...@ozlabs.ru Cc: Andreas Färber afaer...@suse.de Cc: David Gibson da...@gibson.dropbear.id.au Cc: Michael Ellerman mich...@ellerman.id.au Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Anthony Liguori aligu...@us.ibm.com --- hw/char/spapr_vty.c | 13 + 1 file changed, 13 insertions(+) diff --git a/hw/char/spapr_vty.c b/hw/char/spapr_vty.c index 2993848..ecc2bb5 100644 --- a/hw/char/spapr_vty.c +++ b/hw/char/spapr_vty.c @@ -1,3 +1,16 @@ +/* + * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator + * + * PAPR Inter-VM Logical Lan, aka ibmveth + * + * Copyright IBM, Corp. 2010-2013 + * + * Authors: + * David Gibson da...@gibson.dropbear.id.au + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ #include hw/qdev.h #include sysemu/char.h #include hw/ppc/spapr.h ACK Paolo
Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
At 07:47 + on 20 Jun (1371714432), Paul Durrant wrote: I agree. If this is really the only solution, we would need to have both versions presented to the guest so that old drivers continue to work without any intervention. I suspect that if we expose both, both sets of drivers try to run the same PV connections, and hilarity ensues. Actually I think I can make that work, and it is the conclusion I came to after Alex's comment. Ah, nice! In that case, I'm a lot less worried -- we can just expose both versions/devices by default and there's no need for a visible control knob tied to driver version (except maybe for debugging). It means an 'unsupported' device appearing on other/older OSes, which is unfortunate, but ISTR only Windows really complains visibly about that. Tim.
Re: [Qemu-devel] [PATCH v4 0/9] Make 'dump-guest-memory' dump in kdump-compressed format
On Thu, Jun 20, 2013 at 10:18:35AM +0800, Qiao Nuohan wrote: On 06/19/2013 09:49 PM, Stefan Hajnoczi wrote: Where does that code live that writes DISKDUMP files? I can see the diskdump.[ch] code. Sorry, I cannot catch what do you mean here. Please link to the code that writes DISKDUMP kdump files on a physical machine. I only see the crash utility code to read the DISKDUMP code but I haven't found anything in the Linux kernel, the crash utility, or the kexec-utils code to actually write a DISKDUMP file. The file format is pretty bad: we need 4 temporary files and a lot of data copying to write it out. Why not just compress an ELF file and teach the crash utility how to decompress while reading the ELF? Also, did you look into simply outputting the ELF file without zero pages? What I want is a dump file with smaller size. And compressed format and with zero pages excluded can make it. I choose kdump-compressed format because it is a standard format and it can realize what I want. Why 4 temporary files are need? dump-guest-memory may be called with a fd which is supposed to send data of dump to. If fd is opened on a pipe or etc which is unable to seek, then I need to cache the data. I understand why you need temporary files, but my questions stand: Have you looked at using ELF more efficiently instead of duplicating kdump code into QEMU? kdump is not a great format for the problem you're trying to solve - you're not filling in the Linux-specific metadata and it's a pain to write due to its layout. Why can't you simply omit zero pages from the ELF? Why can't you compress the entire ELF file and add straightforward decompression to the crash utility? Stefan
[Qemu-devel] [RFC] qemu-img: add option -d in convert
Hi, This is a draft design which aimed for internal snapshot convert, hope to get your comments: Internal snapshot is not as easy as external snapshot, to query and convert. This patch will improve convertion side, which helps internal / external snapshot mixed case. With it user can treat internal snapshot as lineraity relationship, use it like external ones with tool qemu-img. An detailed example, If there is a chain as following: imageA(sn0)-imageB(sn0,sn1)-imageC(sn0) The real relationship in it could be: --imageA.qcow2imageB.qcow2-imageC.qcow2 |-imageA(sn0) |-imageB(sn0)|-imageC(sn0) |-imageB(sn1) To export it, two steps: 1. duplicate them to get an exactly same tree by: qemu-img convert imageA.qcow2 -O export/imageA.qcow2 -f qcow2 qemu-img convert imageA.qcow2 -s sn0 -O export/imageA_sn0.qcow2 qemu-img convert imageB.qcow2 -O export/imageB.qcow2 -f qcow2 -o backing_file=export/imageA.qcow2 qemu-img convert imageB.qcow2 -s sn0 -O export/imageB.qcow2 -f qcow2 -o backing_file=export/imageB.qcow2 ... result at ./export: --imageA.qcow2imageB.qcow2-imageC.qcow2 |-imageA_sn0.qcow2 |-imageB_sn0.qcow2 |-imageC_sn0.qcow2 |-imageB_sn1.qcow2 2. change the relationship to linearity to save space(or by 3rd party diff tool): qemu-img create imageA_l.qcow2 -f qcow2 -p backing_file=imageA_qcow2 qemu-img rebase imageA_l.qcow2 -b imageA_sn0.qcow2 qemu-img rebase -u imageB.qcow2 -b imageA_l.qcow2 discard imageA.qcow2 result at ./export: imageA_sn0.qcow2--imageA_l.qcow2--imageB_sn0.qcow2--imageB_sn1_l.qcow2- -imageB_l.qcow2--imageC_sn0.qcow2--imageC_l.qcow2 This is a bit complexity, they can be merged into one step, to save disk I/O and make procedure simple, add a parameter: [-d [base_image=IMAGE,]snapshot=SNAPSHOT] qemu-img convert imageA.qcow2 -s sn0 -O export/imageA_sn0.qcow2 -f qcow2 qemu-img convert imageA.qcow2 -d snapshot=sn0 -O export/imageA.qcow2 -f qcow2 -o backing_file=export/imageA_sn0.qcow2 ... result at ./export: imageA_sn0.qcow2--imageA.qcow2--imageB_sn0.qcow2--imageB_sn1.qcow2- -imageB.qcow2--imageC_sn0.qcow2--imageC.qcow2 parameter base_image allow diff operation taken across image in the backing chain. Note: 1 snapshot query can be added in qemu-nbd easily later. 2 This is actually a work around by qemu-img and qemu-nbd. A better way is to provide user snapshot_read() and snapshot_allocated() interface, typically a library. But that need some adjust in block level, especially thread, coroutine, and emulator cut off, so delay that. -- Best Regards Wenchao Xia
[Qemu-devel] 1192847 : NMI watchdog fails to increment the NMI counter in /proc/interrupts
Hi All, I have filed the following bug for watchdog: NMI watchdog fails to increment the NMI counter in /proc/interrupts Kernel Version: 3.10.0-rc5+ Libvirt Version: 1.0.6 Qemu Version: 1.5.50 Steps to reproduce the issue: 1. Booted the VM with : qemu-system-x86_64 VM1.qcow2 -enable-kvm -watchdog i6300esb -watchdog-action reset -smp 2 -m 2000 2. Edit the /boot/grub/grub/grub.conf with nmi_watchdog = 1 before the initrd image. 3. Restart the guests, the NMI counter in /proc/interrupts was 0 4. Installed the watchdog rpm and ran chkconfig watchdog on 5. Restart the guest, even then the NMI counter did not increment 6. Changed the /boot/grub/grub/grub.conf with nmi_watchdog = 1 to /boot/grub/grub/grub.conf with nmi_watchdog = 2 and restarted the guest. Even then NMI conuter did not increment (The NMI counter was showing 0 all the time for all the above steps). Please let me know if I am missing some steps to test the NMI. Thanks, Shastri
[Qemu-devel] Failed booting into OS after introduct the KVM_MEM_READONLY flag for regions
HI, Jordan: By using the latest master of qemu, after booting the vnc view will continue to be just black, and the os cannot be start. After bisect, I found the problem was introduced by the commit: 235e8982ad393e5611cb892df54881c872eea9e1 (kvm: support using KVM_MEM_READONLY flag for regions). There are plenty of medications in this commit, so I report this problem to you. The running environment: Qemu: 4eda32f588086b6cd0ec2be6a7a6c131f8c2b427 Host: Fedora 18, kernel: 3.8.1-201.fc18.x86_64 Boot Cmd: ./x86_64-softmmu/qemu-system-x86_64 -name ovirt -M pc-0.15 -m 1024 -enable-kvm -boot d \ -drive file=/pkgs/imgs/win7.qcow2,format=qcow2,id=drive0,if=none \ -device ide-hd,drive=drive0,id=disk0 \ -vnc 186.100.8.144:0 -monitor stdio \ -net tap,ifname=tap0,downscript=no -net nic I can provide more information if need. -- Leiqzhang Best Regards
Re: [Qemu-devel] [PATCH v2 2/2] QEMUBH: make AioContext's bh re-entrant
On Wed, Jun 19, 2013 at 11:27:25AM +0200, Paolo Bonzini wrote: Il 19/06/2013 00:26, mdroth ha scritto: On Tue, Jun 18, 2013 at 09:20:26PM +0200, Paolo Bonzini wrote: Il 18/06/2013 17:14, mdroth ha scritto: Could we possibly simplify this by introducing a recursive mutex that we could use to protect the whole list loop and hold even during the cb? If it is possible, we should avoid recursive locks. It makes impossible to establish a lock hierarchy. For example: I assume we can't hold the lock during the cb currently since we might try to reschedule, but if it's a recursive mutex would that simplify things? If you have two callbacks in two different AioContexts, both of which do bdrv_drain_all(), you get an AB-BA deadlock I think I see what you mean. That problem exists regardless of whether we introduce a recursive mutex though right? Without a recursive mutex, you only hold one lock at a time in each thread. I guess the main issue is the fact that we'd be encouraging sloppy locking practices without addressing the root problem? Yeah. We're basically standing where the Linux kernel stood 10 years ago (let's say 2.2 timeframe). If Linux got this far without recursive mutexes, we can at least try. :) FWIW I was also looking into recursive mutexes for the block layer. What scared me a little is that they make it tempting to stop thinking about locks since you know you'll be able to reacquire locks you already hold. Especially when converting existing code, I think we need to be rigorous about exploring every function and thinking about the locks it needs and which child functions it calls. Otherwise we'll have code paths hidden away somewhere that were never truly thought through. Stefan
Re: [Qemu-devel] [PATCH v3 2/2] QEMUBH: make AioContext's bh re-entrant
On Thu, Jun 20, 2013 at 4:16 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 20/06/2013 09:39, Stefan Hajnoczi ha scritto: qemu_bh_cancel() and qemu_bh_delete() are not modified by this patch. It seems that calling them from a thread is a little risky because there is no guarantee that the BH is no longer invoked after a thread calls these functions. I think that's worth a comment or do you want them to take the lock so they become safe? Taking the lock wouldn't help. The invoking loop of aio_bh_poll runs lockless. I think a comment is better. Yes, will document it. qemu_bh_cancel is inherently not thread-safe, there's not much you can do about it. qemu_bh_delete is safe as long as you wait for the bottom half to stop before deleting the containing object. Once we have RCU, deletion of QOM objects will be RCU-protected. Hence, a simple way could be to put the first part of aio_bh_poll() within rcu_read_lock/unlock. The other thing I'm unclear on is the -idle assignment followed immediately by a -scheduled assignment. Without memory barriers aio_bh_poll() isn't guaranteed to get an ordered view of these updates: it may see an idle BH as a regular scheduled BH because -idle is still 0. Right. You need to order -idle writes before -scheduled writes, and add memory barriers, or alternatively use two bits in -scheduled so that you can assign both atomically. I think just shift the position of smp_rmb/wmb in _schedule and _poll, we can acheive this (callbacks will not refer to -idle) Regards, Pingfan Paolo
Re: [Qemu-devel] [PATCH v2 3/6] net: make netclient re-entrant with refcnt
On Tue, Jun 18, 2013 at 8:41 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, Jun 13, 2013 at 05:03:03PM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com With refcnt, NetClientState's user can run agaist deleter. Please split this into two patches: 1. net_clients lock 2. NetClientState refcount Ok. Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com --- hw/core/qdev-properties-system.c | 14 include/net/net.h| 3 +++ net/hub.c| 3 +++ net/net.c| 47 +--- net/slirp.c | 3 ++- 5 files changed, 66 insertions(+), 4 deletions(-) diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c index 0eada32..41cc7e6 100644 --- a/hw/core/qdev-properties-system.c +++ b/hw/core/qdev-properties-system.c @@ -302,6 +302,7 @@ static void set_vlan(Object *obj, Visitor *v, void *opaque, return; } +/* inc ref, released when unset property */ hubport = net_hub_port_find(id); if (!hubport) { error_set(errp, QERR_INVALID_PARAMETER_VALUE, @@ -311,11 +312,24 @@ static void set_vlan(Object *obj, Visitor *v, void *opaque, *ptr = hubport; } +static void release_vlan(Object *obj, const char *name, void *opaque) +{ +DeviceState *dev = DEVICE(obj); +Property *prop = opaque; +NICPeers *peers_ptr = qdev_get_prop_ptr(dev, prop); +NetClientState **ptr = peers_ptr-ncs[0]; + +if (*ptr) { +netclient_unref(*ptr); +} +} + PropertyInfo qdev_prop_vlan = { .name = vlan, .print = print_vlan, .get = get_vlan, .set = set_vlan, +.release = release_vlan, }; int qdev_prop_set_drive(DeviceState *dev, const char *name, What about the netdev property? I don't see any refcount code there. Yes, the release of netdev and vlan property should all free its backend. Will add the code. @@ -1109,6 +1146,7 @@ void net_cleanup(void) qemu_del_net_client(nc); } } +qemu_mutex_destroy(net_clients_lock); Why is it okay to iterate over net_clients here without the lock? atexit(net_cleanup); So no other racers exist. Thx Regards, Pingfan
Re: [Qemu-devel] [PATCH v2 2/6] net: introduce lock to protect NetClientState's peer's access
On Thu, Jun 20, 2013 at 3:46 PM, Stefan Hajnoczi stefa...@redhat.com wrote: On Thu, Jun 20, 2013 at 02:30:30PM +0800, liu ping fan wrote: On Tue, Jun 18, 2013 at 8:25 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Thu, Jun 13, 2013 at 05:03:02PM +0800, Liu Ping Fan wrote: + * And flush out peer's queue. + */ +static void qemu_net_client_detach_flush(NetClientState *nc) +{ +NetClientState *peer; + +/* reader of self's peer field , fixme? the deleters are not concurrent, + * so this pair lock can save. + */ Indentation, also please resolve the fixme. So, here can I take the assumption that the deleters are serialized by biglock, and remove the lock following this comment? Ah, I understand the comment now. Is there any advantage to dropping :), only two atomic instruction in rare path. the lock? IMO it's clearer to take the lock consistently instead of optimizing cases we think only get called from the main loop. Reasonable, will keep them.
Re: [Qemu-devel] [PATCH v3 2/2] QEMUBH: make AioContext's bh re-entrant
Il 20/06/2013 11:12, liu ping fan ha scritto: Right. You need to order -idle writes before -scheduled writes, and add memory barriers, or alternatively use two bits in -scheduled so that you can assign both atomically. I think just shift the position of smp_rmb/wmb in _schedule and _poll, we can acheive this (callbacks will not refer to -idle) Yes, but you also need to swap -idle and -scheduled assignments (aio_bh_poll reads scheduled before idle; qemu_bh_schedule* must write idle before scheduled). Paolo
Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
-Original Message- From: Tim Deegan [mailto:t...@xen.org] Sent: 20 June 2013 09:56 To: Paul Durrant Cc: Matt Wilson; Alex Bligh; xen-de...@lists.xen.org; Ian Campbell; qemu- de...@nongnu.org Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device version 2. At 07:47 + on 20 Jun (1371714432), Paul Durrant wrote: I agree. If this is really the only solution, we would need to have both versions presented to the guest so that old drivers continue to work without any intervention. I suspect that if we expose both, both sets of drivers try to run the same PV connections, and hilarity ensues. Actually I think I can make that work, and it is the conclusion I came to after Alex's comment. Ah, nice! In that case, I'm a lot less worried -- we can just expose both versions/devices by default and there's no need for a visible control knob tied to driver version (except maybe for debugging). It means an 'unsupported' device appearing on other/older OSes, which is unfortunate, but ISTR only Windows really complains visibly about that. Yes, I think only Windows complains and we should be able to post an article somewhere saying 'don't worry about it' :-) Paul
[Qemu-devel] [Bug 994378] Re: Nested-virt)L1 (kvm on kvm)guest panic with parameter “-cpu host” in qemu command line.
Short: I can't reproduce here with L1 guest having has host-passthrough for CPU. Long: = Version Info: - On Physical host: ~ $ uname -r; rpm -q libvirt-daemon-kvm qemu 3.10.0-0.rc2.git1.2.fc20.x86_64 qemu-1.4.2-3.fc19.x86_64 libvirt-daemon-kvm-1.0.5.2-1.fc19.x86_64 libguestfs-1.22.3-1.fc19.x86_64 On L1: ~~ $ uname -r; rpm -q libvirt-daemon-kvm qemu 3.10.0-0.rc3.git0.2.fc20.x86_64 libvirt-daemon-kvm-1.0.5.1-1.fc19.x86_64 qemu-1.4.2-2.fc19.x86_64 [root@dhcp47-209 ~]# L1 guest CLI: - [root@bare-metal ~]# ps -ef | grep qemu qemu 7281 1 67 04:57 ?00:00:10 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name regular-guest -S -machine pc-i440fx-1.4,accel=kvm,usb=off -cpu host -m 10240 -smp 4,sockets=4,cores=1,threads=1 -uuid 4ed9ac0b-7f72-dfcf-68b3-e6fe2ac588b2 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/regular-guest.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/home/test/vmimages/regular-guest.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:80:c1:34,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 L2 guest CLI: - [root@regular-guest ~]# ps -ef | grep -i qemu qemu 1138 1 88 05:18 ?00:00:07 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name nguest-01 -S -machine pc-i440fx-1.4,accel=kvm,usb=off -m 2048 -smp 2,sockets=2,cores=1,threads=1 -uuid b47c5cbb-b320-ce9d-c595-4e083b0e465d -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/nguest-01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/home/test/vmimages/nguest-01.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:be:d5:8e,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 A search for string 'error' in logs doesn't turn up anything: [root@nguest-01 ~]# grep -i error /var/log/boot.log [root@nguest-01 ~]# grep -i error /var/log/messages [root@nguest-01 ~]# Yongjie, can you please re-try? -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/994378 Title: Nested-virt)L1 (kvm on kvm)guest panic with parameter “-cpu host” in qemu command line. Status in QEMU: New Bug description: Environment: Host OS (ia32/ia32e/IA64):ia32e Guest OS (ia32/ia32e/IA64):ia32e Guest OS Type (Linux/Windows):Linux kvm.git Commit:19853301ef3289bda2d5264c1093e74efddaeab9 qemu-kvm Commit:69abebf20280152da8fa7c418a819ae51e862231 Host Kernel Version:3.4.0-rc3 Hardware:WSM-EP, Romley-EP Bug detailed description: -- (KVM on KVM) L1 guest panic when starting the L1 guest with “-cpu host” parameter in qemu command line. Note: 1. when creating guest with “-cpu qemu64,+vmx”, L1 guest and L2 guest can boot up. 2. This should be a qemu-kvm bug. using '-cpu host' parameter, the following is the result. Kvm+ qemu-kvm =result 19853301 + 69abebf2 = bad 19853301 + 44755ea3 = good 3. when booting up the guest with the good commit of 19853301 + 44755ea3, you can see some error info, but nested virt works fine. (L1 and L2 guest can boot up.) “error: feature i64 not available in set error: bad option value [extfeature_edx = i64 xd syscall]” some logs [root@vt-snb9 x86_64-softmmu]# ./qemu-system-x86_64 -m 2048 -net nic,model=rtl8139 -net tap,script=/etc/kvm/qemu-ifup -hda /root/nested-kvm.qcow -cpu host error: feature i64 not available in set error: bad option value [extfeature_edx = i64 xd syscall] error: feature i64 not available in set error: bad option value [extfeature_edx = i64 xd syscall] error: feature i64 not available in set error: bad option value [extfeature_edx = i64 syscall xd] error: feature i64 not available in set error: bad option value [extfeature_edx = i64 syscall xd] VNC server running on `::1:5900' Reproduce steps: 1.start up a host with kvm (commit: 19853301) 2.rmmod
Re: [Qemu-devel] [PATCH 6/9] vhost-scsi: new device supporting the tcm_vhost Linux kernel module
On Thu, Jun 20, 2013 at 08:49:50AM +, Libaiqing wrote: Hi Asias, Thanks for your config. According to you config,I test booting from vhost device with upstream kernel and qemu,but failed. 1 installing guest from cdrom,ok. 2 booting vhost-scsi,guest fs error occurs. 3 using fileio backstores,the error is same.. 4 rebooting guest,a log printed: (qemu) hw/scsi/virtio-scsi.c:533:virtio_scsi_handle_event: Object 0x7fccae7f2c88 is not an instance of type virtio-scsi-device Paolo, I remember you fixed a similar issue? 5 using upstream seabios,core dumped. Could you give me some advise to debug this problem ? I can provide more information if need. Can you show me qemu commit id you used? Can you verity that if using the host kernel for guest helps? Does booting directly (without the install and reboot process) work? The qemu cmd: [root@fedora121 x86_64-softmmu]# ./qemu-system-x86_64 -enable-kvm -name fedora -M pc -m 1024 -smp 2 -drive file=/home/fedora18.iso,if=ide,media=cdrom -device vhost-scsi-pci,wwpn=naa.50014057133e25dc -monitor stdio -vga qxl -vnc :1 The vnc output: Dracut-initqueue[189]:/dev/mapper/fedora-root:UNEXPECTED INCONSISTENCY;RUN FSCK MANUALLY. Dracut-initqueue[189]: Warning: e2fsck returned with 4 Dracut-initqueue[189]: Warning: ***An error occurred during the file system check. The guest kernel log: Kernel: virtio-pci :00:04.0: irq 40 for MSI/MSI-X Kernel: virtio-pci :00:04.0: irq 41 for MSI/MSI-X Kernel: virtio-pci :00:04.0: irq 42 for MSI/MSI-X Kernel: virtio-pci :00:04.0: irq 43 for MSI/MSI-X Kernel: scsi2 : Virtio SCSI HBA Kernel: scsi 2:0:1:0: Direct-Access LIO-ORG r0 Kernel: sd 2:0:1:0: Attached scsi generic sg1 type 0 Kernel: sd 2:0:1:0: [sda]1258912 512-byte logical . Kernel: sd 2:0:1:0: [sda]write protect is off Kernel: sd 2:0:1:0: [sda]Mode sense :43 00 00 08 Kernel: sd 2:0:1:0: [sda]write cache: disabled, read . Kernel: sda sda1 sda2 Kernel: sd 2:0:1:0: [sda] Attached SCSI disk Dracut-initqueue[189]: Scanning devices sda2 for LVM Dracut-initqueue[189]: inactive '/dev/fedora/swap'... Dracut-initqueue[189]: inactive '/dev/fedora/root'... The info of host: [root@fedora121 x86_64-softmmu]# uname -a Linux fedora121 3.10.0-rc6 #1 SMP Wed Jun 19 19:34:24 CST 2013 x86_64 x86_64 x86_64 GNU/Linux [root@fedora121 x86_64-softmmu]# lsmod |grep vhost_scsi vhost_scsi 49456 5 target_core_mod 282163 14 target_core_iblock,target_core_pscsi,iscsi_target_mod,target_core_file,vhost_scsi [root@fedora121 x86_64-softmmu]# targetcli targetcli shell version v2.1.fb26 Copyright 2011 by RisingTide Systems LLC and others. For help on commands, type 'help'. / ls o- / . [...] o- backstores .. [...] | o- block .. [Storage Objects: 0] | o- fileio . [Storage Objects: 0] | o- pscsi .. [Storage Objects: 0] | o- ramdisk [Storage Objects: 1] | o- r0 ... [(6.0GiB) activated] o- iscsi [Targets: 0] o- loopback . [Targets: 0] o- vhost [Targets: 1] o- naa.50014057133e25dc .. [TPGs: 1] o- tpg1 ... [naa.5001405a70ac3421] o- acls .. [ACLs: 0] o- luns .. [LUNs: 1] o- lun0 . [ramdisk/r0] Regards, baiqing -Original Message- From: Asias He [mailto:as...@redhat.com] Sent: Thursday, June 20, 2013 9:34 AM To: Libaiqing
Re: [Qemu-devel] [PATCH v3 2/2] QEMUBH: make AioContext's bh re-entrant
On Thu, Jun 20, 2013 at 4:16 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 20/06/2013 09:39, Stefan Hajnoczi ha scritto: qemu_bh_cancel() and qemu_bh_delete() are not modified by this patch. It seems that calling them from a thread is a little risky because there is no guarantee that the BH is no longer invoked after a thread calls these functions. I think that's worth a comment or do you want them to take the lock so they become safe? Taking the lock wouldn't help. The invoking loop of aio_bh_poll runs lockless. I think a comment is better. qemu_bh_cancel is inherently not thread-safe, there's not much you can do about it. qemu_bh_delete is safe as long as you wait for the bottom half to stop before deleting the containing object. Once we have RCU, deletion of QOM objects will be RCU-protected. Hence, a simple way could be to put the first part of aio_bh_poll() within rcu_read_lock/unlock. In fact, I have some idea about this, introduce another member - Object for QEMUBH which will be refereed in cb, then we leave anything to refcnt mechanism. For qemu_bh_cancel(), I do not figure out whether it is important or not to sync with caller. diff --git a/async.c b/async.c index 4b17eb7..60c35a1 100644 --- a/async.c +++ b/async.c @@ -61,6 +61,7 @@ int aio_bh_poll(AioContext *ctx) { QEMUBH *bh, **bhp, *next; int ret; +int sched; { QEMUBH *bh, **bhp, *next; int ret; +int sched; ctx-walking_bh++; @@ -69,8 +70,10 @@ int aio_bh_poll(AioContext *ctx) /* Make sure fetching bh before accessing its members */ smp_read_barrier_depends(); next = bh-next; -if (!bh-deleted bh-scheduled) { -bh-scheduled = 0; +sched = 0; +atomic_xchg(bh-scheduled, sched); +if (!bh-deleted sched) { +//bh-scheduled = 0; if (!bh-idle) ret = 1; bh-idle = 0; @@ -79,6 +82,9 @@ int aio_bh_poll(AioContext *ctx) */ smp_rmb(); bh-cb(bh-opaque); +if (bh-obj) { +object_unref(bh-obj); +} } } @@ -105,8 +111,12 @@ int aio_bh_poll(AioContext *ctx) void qemu_bh_schedule_idle(QEMUBH *bh) { -if (bh-scheduled) +int sched = 1; + +atomic_xchg( bh-scheduled, sched); +if (sched) { return; +} /* Make sure any writes that are needed by the callback are done * before the locations are read in the aio_bh_poll. */ @@ -117,25 +127,46 @@ void qemu_bh_schedule_idle(QEMUBH *bh) void qemu_bh_schedule(QEMUBH *bh) { -if (bh-scheduled) +int sched = 1; + +atomic_xchg( bh-scheduled, sched); +if (sched) { return; +} /* Make sure any writes that are needed by the callback are done * before the locations are read in the aio_bh_poll. */ smp_wmb(); bh-scheduled = 1; +if (bh-obj) { +object_ref(bh-obj); +} bh-idle = 0; aio_notify(bh-ctx); } void qemu_bh_cancel(QEMUBH *bh) { -bh-scheduled = 0; +int sched = 0; + +atomic_xchg( bh-scheduled, sched); +if (sched) { +if (bh-obj) { +object_ref(bh-obj); +} +} } void qemu_bh_delete(QEMUBH *bh) { -bh-scheduled = 0; +int sched = 0; + +atomic_xchg( bh-scheduled, sched); +if (sched) { +if (bh-obj) { +object_ref(bh-obj); +} +} bh-deleted = 1; } Regards, Pingfan The other thing I'm unclear on is the -idle assignment followed immediately by a -scheduled assignment. Without memory barriers aio_bh_poll() isn't guaranteed to get an ordered view of these updates: it may see an idle BH as a regular scheduled BH because -idle is still 0. Right. You need to order -idle writes before -scheduled writes, and add memory barriers, or alternatively use two bits in -scheduled so that you can assign both atomically. Paolo
Re: [Qemu-devel] [PATCH v3 2/2] QEMUBH: make AioContext's bh re-entrant
Il 20/06/2013 11:41, liu ping fan ha scritto: On Thu, Jun 20, 2013 at 4:16 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 20/06/2013 09:39, Stefan Hajnoczi ha scritto: qemu_bh_cancel() and qemu_bh_delete() are not modified by this patch. It seems that calling them from a thread is a little risky because there is no guarantee that the BH is no longer invoked after a thread calls these functions. I think that's worth a comment or do you want them to take the lock so they become safe? Taking the lock wouldn't help. The invoking loop of aio_bh_poll runs lockless. I think a comment is better. qemu_bh_cancel is inherently not thread-safe, there's not much you can do about it. qemu_bh_delete is safe as long as you wait for the bottom half to stop before deleting the containing object. Once we have RCU, deletion of QOM objects will be RCU-protected. Hence, a simple way could be to put the first part of aio_bh_poll() within rcu_read_lock/unlock. In fact, I have some idea about this, introduce another member - Object for QEMUBH which will be refereed in cb, then we leave anything to refcnt mechanism. For qemu_bh_cancel(), I do not figure out whether it is important or not to sync with caller. This is a separate patch anyway... and a long discussion to have before too. :) Let's concentrate on one thing at a time. Paolo diff --git a/async.c b/async.c index 4b17eb7..60c35a1 100644 --- a/async.c +++ b/async.c @@ -61,6 +61,7 @@ int aio_bh_poll(AioContext *ctx) { QEMUBH *bh, **bhp, *next; int ret; +int sched; { QEMUBH *bh, **bhp, *next; int ret; +int sched; ctx-walking_bh++; @@ -69,8 +70,10 @@ int aio_bh_poll(AioContext *ctx) /* Make sure fetching bh before accessing its members */ smp_read_barrier_depends(); next = bh-next; -if (!bh-deleted bh-scheduled) { -bh-scheduled = 0; +sched = 0; +atomic_xchg(bh-scheduled, sched); This is expensive. +if (!bh-deleted sched) { +//bh-scheduled = 0; if (!bh-idle) ret = 1; bh-idle = 0; @@ -79,6 +82,9 @@ int aio_bh_poll(AioContext *ctx) */ smp_rmb(); bh-cb(bh-opaque); +if (bh-obj) { +object_unref(bh-obj); +} } } @@ -105,8 +111,12 @@ int aio_bh_poll(AioContext *ctx) void qemu_bh_schedule_idle(QEMUBH *bh) { -if (bh-scheduled) +int sched = 1; + +atomic_xchg( bh-scheduled, sched); +if (sched) { return; +} /* Make sure any writes that are needed by the callback are done * before the locations are read in the aio_bh_poll. */ @@ -117,25 +127,46 @@ void qemu_bh_schedule_idle(QEMUBH *bh) void qemu_bh_schedule(QEMUBH *bh) { -if (bh-scheduled) +int sched = 1; + +atomic_xchg( bh-scheduled, sched); +if (sched) { return; +} /* Make sure any writes that are needed by the callback are done * before the locations are read in the aio_bh_poll. */ smp_wmb(); bh-scheduled = 1; +if (bh-obj) { +object_ref(bh-obj); +} bh-idle = 0; aio_notify(bh-ctx); } void qemu_bh_cancel(QEMUBH *bh) { -bh-scheduled = 0; +int sched = 0; + +atomic_xchg( bh-scheduled, sched); +if (sched) { +if (bh-obj) { +object_ref(bh-obj); +} +} } void qemu_bh_delete(QEMUBH *bh) { -bh-scheduled = 0; +int sched = 0; + +atomic_xchg( bh-scheduled, sched); +if (sched) { +if (bh-obj) { +object_ref(bh-obj); +} +} bh-deleted = 1; } Regards, Pingfan The other thing I'm unclear on is the -idle assignment followed immediately by a -scheduled assignment. Without memory barriers aio_bh_poll() isn't guaranteed to get an ordered view of these updates: it may see an idle BH as a regular scheduled BH because -idle is still 0. Right. You need to order -idle writes before -scheduled writes, and add memory barriers, or alternatively use two bits in -scheduled so that you can assign both atomically. Paolo
Re: [Qemu-devel] Adding a persistent writeback cache to qemu
On Wed, Jun 19, 2013 at 10:28:53PM +0100, Alex Bligh wrote: --On 11 April 2013 11:25:48 +0200 Stefan Hajnoczi stefa...@gmail.com wrote: I'd like to experiment with adding persistent writeback cache to qemu. The use case here is where non-local storage is used (e.g. rbd, ceph) using the qemu drivers, together with a local cache as a file on a much faster locally mounted device, for instance an SSD (possibly replicated). This would I think give a similar performance boost to using an rbd block device plus flashcache/dm-cache/bcache, but without introducing all the context switches and limitations of having to use real block devices. I appreciate it would need to be live migration aware (worst case solution: flush and turn off caching during live migrate), and ideally be capable of replaying a dirty writeback cache in the event the host crashes. Is there any support for this already? Has anyone worked on this before? If not, would there be any interest in it? I'm concerned about the complexity this would introduce in QEMU. Therefore I'm a fan of using existing solutions like the Linux block layer instead of reimplementing this stuff in Linux. What concrete issues are there with using rbd plus flashcache/dm-cache/bcache? I'm not sure I understand the context switch problem since implementing it in user space will still require system calls to do all the actual cache I/O. I failed to see your reply and got distracted from this. Apologies. So several months later ... Happens to me sometimes too ;-). The concrete problem here is that flashcache/dm-cache/bcache don't work with the rbd (librbd) driver, as flashcache/dm-cache/bcache cache access to block devices (in the host layer), and with rbd (for instance) there is no access to a block device at all. block/rbd.c simply calls librbd which calls librados etc. So the context switches etc. I am avoiding are the ones that would be introduced by using kernel rbd devices rather than librbd. I understand the limitations with kernel block devices - their setup/teardown is an extra step outside QEMU and privileges need to be managed. That basically means you need to use a management tool like libvirt to make it usable. But I don't understand the performance angle here. Do you have profiles that show kernel rbd is a bottleneck due to context switching? We use the kernel page cache for -drive file=test.img,cache=writeback and no one has suggested reimplementing the page cache inside QEMU for better performance. Also, how do you want to manage QEMU page cache with multiple guests running? They are independent and know nothing about each other. Their process memory consumption will be bloated and the kernel memory management will end up having to sort out who gets to stay in physical memory. You can see I'm skeptical of this and think it's premature optimization, but if there's really a case for it with performance profiles then I guess it would be necessary. But we should definitely get feedback from the Ceph folks too. I'd like to hear from Ceph folks what their position on kernel rbd vs librados is. Why one do they recommend for QEMU guests and what are the pros/cons? CCed Sage and Josh Stefan
Re: [Qemu-devel] [PATCH v3] vl.c: Support multiple CPU ranges on -numa option
Il 20/06/2013 11:30, Igor Mammedov ha scritto: So, basically the format seemed easier to work with if we are thinking of using QemuOpts for -numa. Using -cpu rather than cpus probably makes it less ambiguous as well IMO. However, it's probably not a good idea if the current syntax is well established ? libvirt uses the cpus option already, so we have to keep it working. Sure, we can leave it as it's now for some time while a new interface is introduced/adopted. And than later deprecate cpus. So, you used a new name because the new behavior of -numa node,cpus=1-2,cpus=3-4 would be incompatible with the old. Personally I don't think that's a problem, but I remember a long discussion in the past. Igor/Eduardo, do you remember the conclusions? Paolo
Re: [Qemu-devel] [RFC PATCH 0/4] per-object libraries
On 20 June 2013 10:49, Paolo Bonzini pbonz...@redhat.com wrote: This only leaves Darwin. I have no idea about that, and I don't have anymore a machine to test it. Andreas or Peter, can you shed light? If you have something concrete you'd like me to test I can test it. But still, libtool wouldn't be a particularly problematic dependency. We're already using it for libcacard. ...and we're already disabling the probe for libtool in configure on MacOSX because MacOS libtool is something completely different... thanks -- PMM
Re: [Qemu-devel] [RFC PATCH 0/4] per-object libraries
Il 19/06/2013 22:00, Michael Tokarev ha scritto: 19.06.2013 22:52, Paolo Bonzini wrote: Il 19/06/2013 20:18, Michael Tokarev ha scritto: Currently I expand it like this: $(foreach m, $(filter %.o,$1), $($(m:%.o=%.libs))) Probably I can change that to $(foreach m, $(filter %.o,$1), $($(m:%.o=./%.libs))) (here and in other similar cases), and it will work without changing anything around $(obj). But maybe we can argee here that this is not really OBJect, it is a path or dir, and name it $(d) or $(p) instead of $(obj) ? To include the slash when needed. just like I did for $(obj). I chose $(obj) because that's what Kbuild uses. kbuild almost never requires this variable in user makefiles (not counting kbuild internals), -- because of recursive way of its working there's no need to prepend directory names like qemu makefiles currently do, user makefiles refer to files using bare (no path) names. Contrary to that, qemu does not recurse, it tries to do everything in one instance, so it can expand just a few magic variables, and for everything else we have to explicitly use that prefix. So that it becomes inconsistent (some vars require prefix, some don't), and too verbose. I think something like this would still be messy: common-obj-y += $(p)core.o $(p)smbus.o $(p)smbus_eeprom.o common-obj-$(CONFIG_VERSATILE_I2C) += $(p)versatile_i2c.o common-obj-$(CONFIG_ACPI) += $(p)smbus_ich9.o common-obj-$(CONFIG_APM) += $(p)pm_smbus.o common-obj-$(CONFIG_BITBANG_I2C) += $(p)bitbang_i2c.o common-obj-$(CONFIG_EXYNOS4) += $(p)exynos4210_i2c.o obj-$(CONFIG_OMAP) += $(p)omap_i2c.o and this is why I preferred per-target variable values to magic variable names. I also disliked the duplication, but in fact you do not even have the duplication if you use libtool and make a .la convenience library (similar to ld -r but preserving library dependencies) for each module, even if it is builtin. Then you can link the .la either into a .so module, or add it to the executable. If you do this, the LIBS only goes into a $(obj)/foo.la target-specific variable. Also, for the inevitable bikeshedding, I would prefer cflags-$(obj)/curl.o-y libs-$(obj)/curl.o-y What are all these -y suffixes for? In existing variables and in this new your invention? It's already a bit too verbose. It is so that you can do foo-$(CONFIG_XYZ) += blah instead of ifeq ($(CONFIG_XYZ),y) FOO += blah endif Yes, that sure I undertand -- obj-y, block-m etc. I'm asking about those new vars yuo propose -- why you want -y sufffix *there*, like cflags-foo.o-y ? Do you want to be able to specify different flags for -y and -m builds? Isn't it a bit too much? I was thinking of a module that optionally uses a library. For example raw block I/O may optionally use the linux AIO library. But you would do module-raw-posix-$(CONFIG_LINUX_AIO) += linux-aio.o libs-linux-aio.o = -laio or something like that, not module-raw-posix-$(CONFIG_LINUX_AIO) += linux-aio.o libs-raw-posix-$(CONFIG_LINUX_AIO) += -laio So it looks like there's no need for the -y here, indeed. Paolo
[Qemu-devel] [PATCH RFC 01/15] i440fx: remove unused parameter i440fx_state of i440fx_init.
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/i386/pc_piix.c| 4 +--- hw/pci-host/piix.c | 10 -- include/hw/i386/pc.h | 2 +- 3 files changed, 6 insertions(+), 10 deletions(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 97362f2..1e83c1c 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -74,7 +74,6 @@ static void pc_init1(MemoryRegion *system_memory, ram_addr_t below_4g_mem_size, above_4g_mem_size; PCIBus *pci_bus; ISABus *isa_bus; -PCII440FXState *i440fx_state; int piix3_devfn = -1; qemu_irq *cpu_irq; qemu_irq *gsi; @@ -137,7 +136,7 @@ static void pc_init1(MemoryRegion *system_memory, } if (pci_enabled) { -pci_bus = i440fx_init(i440fx_state, piix3_devfn, isa_bus, gsi, +pci_bus = i440fx_init(piix3_devfn, isa_bus, gsi, system_memory, system_io, ram_size, below_4g_mem_size, 0x1ULL - below_4g_mem_size, @@ -148,7 +147,6 @@ static void pc_init1(MemoryRegion *system_memory, pci_memory, ram_memory); } else { pci_bus = NULL; -i440fx_state = NULL; isa_bus = isa_bus_new(NULL, system_io); no_hpet = 1; } diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index f9e68c3..0176ae9 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -225,7 +225,6 @@ static int i440fx_initfn(PCIDevice *dev) } static PCIBus *i440fx_common_init(const char *device_name, - PCII440FXState **pi440fx_state, int *piix3_devfn, ISABus **isa_bus, qemu_irq *pic, MemoryRegion *address_space_mem, @@ -255,8 +254,7 @@ static PCIBus *i440fx_common_init(const char *device_name, qdev_init_nofail(dev); d = pci_create_simple(b, 0, device_name); -*pi440fx_state = I440FX_PCI_DEVICE(d); -f = *pi440fx_state; +f = I440FX_PCI_DEVICE(d); f-system_memory = address_space_mem; f-pci_address_space = pci_address_space; f-ram_memory = ram_memory; @@ -307,14 +305,14 @@ static PCIBus *i440fx_common_init(const char *device_name, ram_size = ram_size / 8 / 1024 / 1024; if (ram_size 255) ram_size = 255; -(*pi440fx_state)-dev.config[0x57]=ram_size; +f-dev.config[0x57] = ram_size; i440fx_update_memory_mappings(f); return b; } -PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, +PCIBus *i440fx_init(int *piix3_devfn, ISABus **isa_bus, qemu_irq *pic, MemoryRegion *address_space_mem, MemoryRegion *address_space_io, @@ -328,7 +326,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, { PCIBus *b; -b = i440fx_common_init(TYPE_I440FX_PCI_DEVICE, pi440fx_state, +b = i440fx_common_init(TYPE_I440FX_PCI_DEVICE, piix3_devfn, isa_bus, pic, address_space_mem, address_space_io, ram_size, pci_hole_start, pci_hole_size, diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index 7f04967..7c95189 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -126,7 +126,7 @@ extern int no_hpet; struct PCII440FXState; typedef struct PCII440FXState PCII440FXState; -PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, +PCIBus *i440fx_init(int *piix_devfn, ISABus **isa_bus, qemu_irq *pic, MemoryRegion *address_space_mem, MemoryRegion *address_space_io, -- 1.8.3.1
[Qemu-devel] [PATCH RFC 02/15] i440fx: rename i440FX to i440FX-PMC
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/pci-host/piix.c | 60 ++-- include/hw/i386/pc.h | 6 +++--- 2 files changed, 33 insertions(+), 33 deletions(-) diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 0176ae9..fc955bd 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -82,11 +82,11 @@ typedef struct PIIX3State { MemoryRegion rcr_mem; } PIIX3State; -#define TYPE_I440FX_PCI_DEVICE i440FX -#define I440FX_PCI_DEVICE(obj) \ -OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE) +#define TYPE_I440FX_PMC_DEVICE i440FX-PMC +#define I440FX_PMC_DEVICE(obj) \ +OBJECT_CHECK(I440FXPMCState, (obj), TYPE_I440FX_PMC_DEVICE) -struct PCII440FXState { +struct I440FXPMCState { PCIDevice dev; MemoryRegion *system_memory; MemoryRegion *pci_address_space; @@ -118,7 +118,7 @@ static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx) return (pci_intx + slot_addend) 3; } -static void i440fx_update_memory_mappings(PCII440FXState *d) +static void i440fx_pmc_update_memory_mappings(I440FXPMCState *d) { int i; @@ -133,7 +133,7 @@ static void i440fx_update_memory_mappings(PCII440FXState *d) static void i440fx_set_smm(int val, void *arg) { -PCII440FXState *d = arg; +I440FXPMCState *d = arg; memory_region_transaction_begin(); smram_set_smm(d-smm_enabled, val, d-dev.config[I440FX_SMRAM], @@ -145,25 +145,25 @@ static void i440fx_set_smm(int val, void *arg) static void i440fx_write_config(PCIDevice *dev, uint32_t address, uint32_t val, int len) { -PCII440FXState *d = I440FX_PCI_DEVICE(dev); +I440FXPMCState *d = I440FX_PMC_DEVICE(dev); /* XXX: implement SMRAM.D_LOCK */ pci_default_write_config(dev, address, val, len); if (ranges_overlap(address, len, I440FX_PAM, I440FX_PAM_SIZE) || range_covers_byte(address, len, I440FX_SMRAM)) { -i440fx_update_memory_mappings(d); +i440fx_pmc_update_memory_mappings(d); } } static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id) { -PCII440FXState *d = opaque; +I440FXPMCState *d = opaque; int ret, i; ret = pci_device_load(d-dev, f); if (ret 0) return ret; -i440fx_update_memory_mappings(d); +i440fx_pmc_update_memory_mappings(d); qemu_get_8s(f, d-smm_enabled); if (version_id == 2) { @@ -177,22 +177,22 @@ static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id) static int i440fx_post_load(void *opaque, int version_id) { -PCII440FXState *d = opaque; +I440FXPMCState *d = opaque; -i440fx_update_memory_mappings(d); +i440fx_pmc_update_memory_mappings(d); return 0; } -static const VMStateDescription vmstate_i440fx = { -.name = I440FX, +static const VMStateDescription vmstate_i440fx_pmc = { +.name = TYPE_I440FX_PMC_DEVICE, .version_id = 3, .minimum_version_id = 3, .minimum_version_id_old = 1, .load_state_old = i440fx_load_old, .post_load = i440fx_post_load, .fields = (VMStateField []) { -VMSTATE_PCI_DEVICE(dev, PCII440FXState), -VMSTATE_UINT8(smm_enabled, PCII440FXState), +VMSTATE_PCI_DEVICE(dev, I440FXPMCState), +VMSTATE_UINT8(smm_enabled, I440FXPMCState), VMSTATE_END_OF_LIST() } }; @@ -214,9 +214,9 @@ static int i440fx_pcihost_initfn(SysBusDevice *dev) return 0; } -static int i440fx_initfn(PCIDevice *dev) +static int i440fx_pmc_initfn(PCIDevice *dev) { -PCII440FXState *d = I440FX_PCI_DEVICE(dev); +I440FXPMCState *d = I440FX_PMC_DEVICE(dev); d-dev.config[I440FX_SMRAM] = 0x02; @@ -242,7 +242,7 @@ static PCIBus *i440fx_common_init(const char *device_name, PCIDevice *d; PCIHostState *s; PIIX3State *piix3; -PCII440FXState *f; +I440FXPMCState *f; unsigned i; dev = qdev_create(NULL, i440FX-pcihost); @@ -254,7 +254,7 @@ static PCIBus *i440fx_common_init(const char *device_name, qdev_init_nofail(dev); d = pci_create_simple(b, 0, device_name); -f = I440FX_PCI_DEVICE(d); +f = I440FX_PMC_DEVICE(d); f-system_memory = address_space_mem; f-pci_address_space = pci_address_space; f-ram_memory = ram_memory; @@ -307,7 +307,7 @@ static PCIBus *i440fx_common_init(const char *device_name, ram_size = 255; f-dev.config[0x57] = ram_size; -i440fx_update_memory_mappings(f); +i440fx_pmc_update_memory_mappings(f); return b; } @@ -326,7 +326,7 @@ PCIBus *i440fx_init(int *piix3_devfn, { PCIBus *b; -b = i440fx_common_init(TYPE_I440FX_PCI_DEVICE, +b = i440fx_common_init(TYPE_I440FX_PMC_DEVICE, piix3_devfn, isa_bus, pic, address_space_mem, address_space_io, ram_size, pci_hole_start, pci_hole_size, @@ -603,13 +603,13 @@ static const TypeInfo piix3_xen_info = {
[Qemu-devel] [PATCH RFC 03/15] i440fx: rename i440FX-pcihost to i440FX
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/pci-host/piix.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index fc955bd..9c482ec 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -38,6 +38,10 @@ * http://download.intel.com/design/chipsets/datashts/29054901.pdf */ +#define TYPE_I440FX_DEVICE i440FX +#define I440FX_DEVICE(obj) \ +OBJECT_CHECK(I440FXState, (obj), TYPE_I440FX_DEVICE) + typedef struct I440FXState { PCIHostState parent_obj; } I440FXState; @@ -197,7 +201,7 @@ static const VMStateDescription vmstate_i440fx_pmc = { } }; -static int i440fx_pcihost_initfn(SysBusDevice *dev) +static int i440fx_initfn(SysBusDevice *dev) { PCIHostState *s = PCI_HOST_BRIDGE(dev); @@ -245,7 +249,7 @@ static PCIBus *i440fx_common_init(const char *device_name, I440FXPMCState *f; unsigned i; -dev = qdev_create(NULL, i440FX-pcihost); +dev = qdev_create(NULL, TYPE_I440FX_DEVICE); s = PCI_HOST_BRIDGE(dev); b = pci_bus_new(dev, NULL, pci_address_space, address_space_io, 0, TYPE_PCI_BUS); @@ -627,21 +631,21 @@ static const TypeInfo i440fx_pmc_info = { .class_init= i440fx_pmc_class_init, }; -static void i440fx_pcihost_class_init(ObjectClass *klass, void *data) +static void i440fx_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass); -k-init = i440fx_pcihost_initfn; +k-init = i440fx_initfn; dc-fw_name = pci; dc-no_user = 1; } -static const TypeInfo i440fx_pcihost_info = { -.name = i440FX-pcihost, +static const TypeInfo i440fx_info = { +.name = TYPE_I440FX_DEVICE, .parent= TYPE_PCI_HOST_BRIDGE, .instance_size = sizeof(I440FXState), -.class_init= i440fx_pcihost_class_init, +.class_init= i440fx_class_init, }; static void i440fx_register_types(void) @@ -649,7 +653,7 @@ static void i440fx_register_types(void) type_register_static(i440fx_pmc_info); type_register_static(piix3_info); type_register_static(piix3_xen_info); -type_register_static(i440fx_pcihost_info); +type_register_static(i440fx_info); } type_init(i440fx_register_types) -- 1.8.3.1
[Qemu-devel] [PATCH RFC 05/15] i440fx pmc: create pmc through comosition
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/pci-host/piix.c | 123 + 1 file changed, 76 insertions(+), 47 deletions(-) diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 1c5c761..218aca2 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -38,16 +38,6 @@ * http://download.intel.com/design/chipsets/datashts/29054901.pdf */ -#define TYPE_I440FX_DEVICE i440FX -#define I440FX_DEVICE(obj) \ -OBJECT_CHECK(I440FXState, (obj), TYPE_I440FX_DEVICE) - -typedef struct I440FXState { -PCIHostState parent_obj; -MemoryRegion *address_space_io; -MemoryRegion *pci_address_space; -} I440FXState; - #define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */ #define PIIX_NUM_PIRQS 4ULL/* PIRQ[A-D] */ #define XEN_PIIX_NUM_PIRQS 128ULL @@ -102,8 +92,25 @@ struct I440FXPMCState { PAMMemoryRegion pam_regions[13]; MemoryRegion smram_region; uint8_t smm_enabled; +ram_addr_t ram_size; +hwaddr pci_hole_start; +hwaddr pci_hole_size; +hwaddr pci_hole64_start; +hwaddr pci_hole64_size; }; +#define TYPE_I440FX_DEVICE i440FX +#define I440FX_DEVICE(obj) \ +OBJECT_CHECK(I440FXState, (obj), TYPE_I440FX_DEVICE) + +typedef struct I440FXState { +PCIHostState parent_obj; +MemoryRegion *address_space_io; +MemoryRegion *pci_address_space; + +PIIX3State piix3; +I440FXPMCState pmc; +} I440FXState; #define I440FX_PAM 0x59 #define I440FX_PAM_SIZE 7 @@ -221,16 +228,63 @@ static int i440fx_realize(SysBusDevice *dev) sysbus_add_io(dev, 0xcfc, s-data_mem); sysbus_init_ioports(s-busdev, 0xcfc, 4); +qdev_set_parent_bus(DEVICE(f-pmc), BUS(s-bus)); +qdev_init_nofail(DEVICE(f-pmc)); + return 0; } static void i440fx_initfn(Object *obj) { +I440FXState *f = I440FX_DEVICE(obj); + +object_initialize(f-pmc, TYPE_I440FX_PMC_DEVICE); +object_property_add_child(obj, pmc, OBJECT(f-pmc), NULL); +qdev_prop_set_uint32(DEVICE(f-pmc), addr, PCI_DEVFN(0, 0)); } static int i440fx_pmc_initfn(PCIDevice *dev) { I440FXPMCState *d = I440FX_PMC_DEVICE(dev); +ram_addr_t ram_size; +int i; + +g_assert(d-system_memory != NULL); +g_assert(d-pci_address_space != NULL); +g_assert(d-ram_memory != NULL); + +memory_region_init_alias(d-pci_hole, pci-hole, d-pci_address_space, + d-pci_hole_start, d-pci_hole_size); +memory_region_add_subregion(d-system_memory, d-pci_hole_start, +d-pci_hole); +memory_region_init_alias(d-pci_hole_64bit, pci-hole64, + d-pci_address_space, + d-pci_hole64_start, d-pci_hole64_size); +if (d-pci_hole64_size) { +memory_region_add_subregion(d-system_memory, d-pci_hole64_start, +d-pci_hole_64bit); +} +memory_region_init_alias(d-smram_region, smram-region, + d-pci_address_space, 0xa, 0x2); +memory_region_add_subregion_overlap(d-system_memory, 0xa, +d-smram_region, 1); +memory_region_set_enabled(d-smram_region, false); + +init_pam(d-ram_memory, d-system_memory, d-pci_address_space, + d-pam_regions[0], PAM_BIOS_BASE, PAM_BIOS_SIZE); +for (i = 0; i 12; ++i) { +init_pam(d-ram_memory, d-system_memory, d-pci_address_space, + d-pam_regions[i+1], PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, + PAM_EXPAN_SIZE); +} + +ram_size = d-ram_size / 8 / 1024 / 1024; +if (ram_size 255) { +ram_size = 255; +} +d-dev.config[0x57] = ram_size; + +i440fx_pmc_update_memory_mappings(d); d-dev.config[I440FX_SMRAM] = 0x02; @@ -251,12 +305,10 @@ static PCIBus *i440fx_common_init(const char *device_name, MemoryRegion *pci_address_space, MemoryRegion *ram_memory) { -PCIDevice *d; PCIHostState *s; PIIX3State *piix3; I440FXPMCState *f; I440FXState *i440fx; -unsigned i; i440fx = I440FX_DEVICE(object_new(TYPE_I440FX_DEVICE)); s = PCI_HOST_BRIDGE(i440fx); @@ -264,38 +316,22 @@ static PCIBus *i440fx_common_init(const char *device_name, i440fx-address_space_io = address_space_io; i440fx-pci_address_space = pci_address_space; -object_property_add_child(qdev_get_machine(), i440fx, - OBJECT(i440fx), NULL); -qdev_set_parent_bus(DEVICE(i440fx), sysbus_get_default()); -qdev_init_nofail(DEVICE(i440fx)); +/* FIXME these should be derived */ +i440fx-pmc.pci_hole_start = pci_hole_start; +i440fx-pmc.pci_hole_size = pci_hole_size; +i440fx-pmc.pci_hole64_start = pci_hole64_start; +i440fx-pmc.pci_hole64_size = pci_hole64_size; -d = pci_create_simple(s-bus, 0, device_name); -f = I440FX_PMC_DEVICE(d); +f = i440fx-pmc;
[Qemu-devel] [PATCH RFC 00/15] pc refactor about memory controller
This series introduces MemoryController, and refactor pc code about i440fx pmc and q35 mch, and introduces ISAPc to resolve breakage of isapc. The memory hotplug patchset will base on this series. Comments are welcome! Hu Tao (15): i440fx: remove unused parameter i440fx_state of i440fx_init. i440fx: rename i440FX to i440FX-PMC i440fx: rename i440FX-pcihost to i440FX i440fx: prepare for composition i440fx pmc: create pmc through comosition i440fx-pmc: calculate PCI memory hole directly i440fx-pmc: create pci address space q35-mch: create pci address space i440fx-pmc: move ram initialization into i440fx-pmc q35-mch: move ram initialization into q35-mch introduce ISAPc i440fx pmc: inherit from MemoryController q35 mch: inherit from MemoryController move bios loading to MemoryController and ISAPc hw/i386/pc.c | 214 +++- hw/i386/pc_piix.c | 52 -- hw/i386/pc_q35.c | 31 ++ hw/isa/Makefile.objs | 2 +- hw/isa/isa_pc.c | 53 ++ hw/pci-host/piix.c| 244 +++--- hw/pci-host/q35.c | 132 +++-- include/hw/i386/pc.h | 66 ++--- include/hw/isa/isa_pc.h | 27 + include/hw/pci-host/q35.h | 16 +-- 10 files changed, 465 insertions(+), 372 deletions(-) create mode 100644 hw/isa/isa_pc.c create mode 100644 include/hw/isa/isa_pc.h -- 1.8.3.1
[Qemu-devel] [PATCH RFC 09/15] i440fx-pmc: move ram initialization into i440fx-pmc
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/i386/pc.c | 24 +--- hw/i386/pc_piix.c| 5 ++--- hw/pci-host/piix.c | 39 +++ include/hw/i386/pc.h | 3 +-- 4 files changed, 35 insertions(+), 36 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 5e8f143..909307e 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1033,33 +1033,11 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory, MemoryRegion **ram_memory) { int linux_boot, i; -MemoryRegion *ram, *option_rom_mr; -MemoryRegion *ram_below_4g, *ram_above_4g; +MemoryRegion *option_rom_mr; FWCfgState *fw_cfg; linux_boot = (kernel_filename != NULL); -/* Allocate RAM. We allocate it as a single memory region and use - * aliases to address portions of it, mostly for backwards compatibility - * with older qemus that used qemu_ram_alloc(). - */ -ram = g_malloc(sizeof(*ram)); -memory_region_init_ram(ram, pc.ram, - below_4g_mem_size + above_4g_mem_size); -vmstate_register_ram_global(ram); -*ram_memory = ram; -ram_below_4g = g_malloc(sizeof(*ram_below_4g)); -memory_region_init_alias(ram_below_4g, ram-below-4g, ram, - 0, below_4g_mem_size); -memory_region_add_subregion(system_memory, 0, ram_below_4g); -if (above_4g_mem_size 0) { -ram_above_4g = g_malloc(sizeof(*ram_above_4g)); -memory_region_init_alias(ram_above_4g, ram-above-4g, ram, - below_4g_mem_size, above_4g_mem_size); -memory_region_add_subregion(system_memory, 0x1ULL, -ram_above_4g); -} - /* Initialize PC system firmware */ pc_system_firmware_init(rom_memory); diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 1ca705c..4614c07 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -84,7 +84,6 @@ static void pc_init1(MemoryRegion *system_memory, BusState *idebus[MAX_IDE_BUS]; ISADevice *rtc_state; ISADevice *floppy; -MemoryRegion *ram_memory; MemoryRegion *pci_memory; MemoryRegion *rom_memory; DeviceState *icc_bridge; @@ -121,7 +120,7 @@ static void pc_init1(MemoryRegion *system_memory, if (pci_enabled) { pci_bus = i440fx_init(piix3_devfn, isa_bus, gsi, system_memory, system_io, ram_size, - pci_memory, ram_memory); + pci_memory); } else { pci_bus = NULL; isa_bus = isa_bus_new(NULL, system_io); @@ -140,7 +139,7 @@ static void pc_init1(MemoryRegion *system_memory, fw_cfg = pc_memory_init(system_memory, kernel_filename, kernel_cmdline, initrd_filename, below_4g_mem_size, above_4g_mem_size, -rom_memory, ram_memory); +rom_memory, NULL); } if (kvm_irqchip_in_kernel()) { diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 604a66e..5318ddc 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -96,6 +96,10 @@ struct I440FXPMCState { MemoryRegion smram_region; uint8_t smm_enabled; ram_addr_t ram_size; +MemoryRegion ram; +MemoryRegion ram_below_4g; +MemoryRegion ram_above_4g; + }; #define TYPE_I440FX_DEVICE i440FX @@ -250,18 +254,39 @@ static int i440fx_pmc_initfn(PCIDevice *dev) { I440FXPMCState *d = I440FX_PMC_DEVICE(dev); ram_addr_t ram_size; +hwaddr below_4g_mem_size, above_4g_mem_size; hwaddr pci_hole_start, pci_hole_size; hwaddr pci_hole64_start, pci_hole64_size; int i; g_assert(d-system_memory != NULL); -g_assert(d-ram_memory != NULL); if(d-ram_size I440FX_PMC_PCI_HOLE) { -pci_hole_start = I440FX_PMC_PCI_HOLE; +below_4g_mem_size = I440FX_PMC_PCI_HOLE; +above_4g_mem_size = d-ram_size - I440FX_PMC_PCI_HOLE; } else { -pci_hole_start = d-ram_size; +below_4g_mem_size = d-ram_size; +above_4g_mem_size = 0; } + +/* Allocate RAM. We allocate it as a single memory region and use + * aliases to address portions of it, mostly for backwards compatibility + * with older qemus that used qemu_ram_alloc(). + */ +memory_region_init_ram(d-ram, pc.ram, + below_4g_mem_size + above_4g_mem_size); +vmstate_register_ram_global(d-ram); +memory_region_init_alias(d-ram_below_4g, ram-below-4g, d-ram, + 0, below_4g_mem_size); +memory_region_add_subregion(d-system_memory, 0, d-ram_below_4g); +if (above_4g_mem_size 0) { +memory_region_init_alias(d-ram_above_4g, ram-above-4g, d-ram, + below_4g_mem_size, above_4g_mem_size); +memory_region_add_subregion(d-system_memory,
[Qemu-devel] [PATCH RFC 06/15] i440fx-pmc: calculate PCI memory hole directly
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/i386/pc_piix.c| 6 -- hw/pci-host/piix.c | 49 - include/hw/i386/pc.h | 4 3 files changed, 24 insertions(+), 35 deletions(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 1e83c1c..3934754 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -138,12 +138,6 @@ static void pc_init1(MemoryRegion *system_memory, if (pci_enabled) { pci_bus = i440fx_init(piix3_devfn, isa_bus, gsi, system_memory, system_io, ram_size, - below_4g_mem_size, - 0x1ULL - below_4g_mem_size, - 0x1ULL + above_4g_mem_size, - (sizeof(hwaddr) == 4 - ? 0 - : ((uint64_t)1 62)), pci_memory, ram_memory); } else { pci_bus = NULL; diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 218aca2..7b58d56 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -43,6 +43,9 @@ #define XEN_PIIX_NUM_PIRQS 128ULL #define PIIX_PIRQC 0x60 +#define I440FX_PMC_PCI_HOLE 0xE000ULL +#define I440FX_PMC_PCI_HOLE_END 0x1ULL + /* * Reset Control Register: PCI-accessible ISA-Compatible Register at address * 0xcf9, provided by the PCI/ISA bridge (PIIX3 PCI function 0, 8086:7000). @@ -93,10 +96,6 @@ struct I440FXPMCState { MemoryRegion smram_region; uint8_t smm_enabled; ram_addr_t ram_size; -hwaddr pci_hole_start; -hwaddr pci_hole_size; -hwaddr pci_hole64_start; -hwaddr pci_hole64_size; }; #define TYPE_I440FX_DEVICE i440FX @@ -247,21 +246,37 @@ static int i440fx_pmc_initfn(PCIDevice *dev) { I440FXPMCState *d = I440FX_PMC_DEVICE(dev); ram_addr_t ram_size; +hwaddr pci_hole_start, pci_hole_size; +hwaddr pci_hole64_start, pci_hole64_size; int i; g_assert(d-system_memory != NULL); g_assert(d-pci_address_space != NULL); g_assert(d-ram_memory != NULL); +if(d-ram_size I440FX_PMC_PCI_HOLE) { +pci_hole_start = I440FX_PMC_PCI_HOLE; +} else { +pci_hole_start = d-ram_size; +} +pci_hole_size = I440FX_PMC_PCI_HOLE_END - pci_hole_start; + +pci_hole64_start = I440FX_PMC_PCI_HOLE_END + d-ram_size - pci_hole_start; +if (sizeof(hwaddr) == 4) { +pci_hole64_size = 0; +} else { +pci_hole64_size = (1ULL 62); +} + memory_region_init_alias(d-pci_hole, pci-hole, d-pci_address_space, - d-pci_hole_start, d-pci_hole_size); -memory_region_add_subregion(d-system_memory, d-pci_hole_start, + pci_hole_start, pci_hole_size); +memory_region_add_subregion(d-system_memory, pci_hole_start, d-pci_hole); memory_region_init_alias(d-pci_hole_64bit, pci-hole64, d-pci_address_space, - d-pci_hole64_start, d-pci_hole64_size); -if (d-pci_hole64_size) { -memory_region_add_subregion(d-system_memory, d-pci_hole64_start, + pci_hole64_start, pci_hole64_size); +if (pci_hole64_size) { +memory_region_add_subregion(d-system_memory, pci_hole64_start, d-pci_hole_64bit); } memory_region_init_alias(d-smram_region, smram-region, @@ -298,10 +313,6 @@ static PCIBus *i440fx_common_init(const char *device_name, MemoryRegion *address_space_mem, MemoryRegion *address_space_io, ram_addr_t ram_size, - hwaddr pci_hole_start, - hwaddr pci_hole_size, - hwaddr pci_hole64_start, - hwaddr pci_hole64_size, MemoryRegion *pci_address_space, MemoryRegion *ram_memory) { @@ -316,12 +327,6 @@ static PCIBus *i440fx_common_init(const char *device_name, i440fx-address_space_io = address_space_io; i440fx-pci_address_space = pci_address_space; -/* FIXME these should be derived */ -i440fx-pmc.pci_hole_start = pci_hole_start; -i440fx-pmc.pci_hole_size = pci_hole_size; -i440fx-pmc.pci_hole64_start = pci_hole64_start; -i440fx-pmc.pci_hole64_size = pci_hole64_size; - f = i440fx-pmc; f-ram_size = ram_size; f-system_memory = address_space_mem; @@ -362,10 +367,6 @@ PCIBus *i440fx_init(int *piix3_devfn, MemoryRegion *address_space_mem, MemoryRegion *address_space_io, ram_addr_t ram_size, -hwaddr pci_hole_start, -hwaddr pci_hole_size, -
[Qemu-devel] [PATCH RFC 08/15] q35-mch: create pci address space
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/i386/pc_q35.c | 13 - hw/pci-host/q35.c | 12 +++- include/hw/pci-host/q35.h | 1 + 3 files changed, 12 insertions(+), 14 deletions(-) diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index bb0ce6a..109bb64 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -64,7 +64,6 @@ static void pc_q35_init(QEMUMachineInitArgs *args) BusState *idebus[MAX_SATA_PORTS]; ISADevice *rtc_state; ISADevice *floppy; -MemoryRegion *pci_memory; MemoryRegion *rom_memory; MemoryRegion *ram_memory; GSIState *gsi_state; @@ -87,6 +86,9 @@ static void pc_q35_init(QEMUMachineInitArgs *args) kvmclock_create(); +/* create pci host bus */ +q35_host = Q35_HOST_DEVICE(qdev_create(NULL, TYPE_Q35_HOST_DEVICE)); + if (ram_size = 0xb000) { above_4g_mem_size = ram_size - 0xb000; below_4g_mem_size = 0xb000; @@ -97,11 +99,8 @@ static void pc_q35_init(QEMUMachineInitArgs *args) /* pci enabled */ if (pci_enabled) { -pci_memory = g_new(MemoryRegion, 1); -memory_region_init(pci_memory, pci, INT64_MAX); -rom_memory = pci_memory; +rom_memory = q35_host-pci_address_space; } else { -pci_memory = NULL; rom_memory = get_system_memory(); } @@ -122,11 +121,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args) gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS); } -/* create pci host bus */ -q35_host = Q35_HOST_DEVICE(qdev_create(NULL, TYPE_Q35_HOST_DEVICE)); - q35_host-mch.ram_memory = ram_memory; -q35_host-mch.pci_address_space = pci_memory; q35_host-mch.system_memory = get_system_memory(); q35_host-mch.address_space_io = get_system_io(); q35_host-mch.below_4g_mem_size = below_4g_mem_size; diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c index 24df6b5..3ab5ed8 100644 --- a/hw/pci-host/q35.c +++ b/hw/pci-host/q35.c @@ -36,8 +36,7 @@ static int q35_host_init(SysBusDevice *dev) { -PCIBus *b; -PCIHostState *pci = FROM_SYSBUS(PCIHostState, dev); +PCIHostState *pci = PCI_HOST_BRIDGE(dev); Q35PCIHost *s = Q35_HOST_DEVICE(dev-qdev); memory_region_init_io(pci-conf_mem, pci_host_conf_le_ops, pci, @@ -53,11 +52,12 @@ static int q35_host_init(SysBusDevice *dev) if (pcie_host_init(s-host) 0) { return -1; } -b = pci_bus_new(s-host.pci.busdev.qdev, pcie.0, + +s-mch.pci_address_space = s-pci_address_space; +pci-bus = pci_bus_new(DEVICE(s), pcie.0, s-mch.pci_address_space, s-mch.address_space_io, 0, TYPE_PCIE_BUS); -s-host.pci.bus = b; -qdev_set_parent_bus(DEVICE(s-mch), BUS(b)); +qdev_set_parent_bus(DEVICE(s-mch), BUS(pci-bus)); qdev_init_nofail(DEVICE(s-mch)); return 0; @@ -87,6 +87,8 @@ static void q35_host_initfn(Object *obj) object_property_add_child(OBJECT(s), mch, OBJECT(s-mch), NULL); qdev_prop_set_uint32(DEVICE(s-mch), addr, PCI_DEVFN(0, 0)); qdev_prop_set_bit(DEVICE(s-mch), multifunction, false); + +memory_region_init(s-pci_address_space, pci, INT64_MAX); } static const TypeInfo q35_host_info = { diff --git a/include/hw/pci-host/q35.h b/include/hw/pci-host/q35.h index e182c82..1c02420 100644 --- a/include/hw/pci-host/q35.h +++ b/include/hw/pci-host/q35.h @@ -60,6 +60,7 @@ typedef struct MCHPCIState { typedef struct Q35PCIHost { PCIExpressHost host; MCHPCIState mch; +MemoryRegion pci_address_space; } Q35PCIHost; #define Q35_MASK(bit, ms_bit, ls_bit) \ -- 1.8.3.1
[Qemu-devel] [PATCH RFC 04/15] i440fx: prepare for composition
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/pci-host/piix.c | 49 +++-- 1 file changed, 31 insertions(+), 18 deletions(-) diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 9c482ec..1c5c761 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -44,6 +44,8 @@ typedef struct I440FXState { PCIHostState parent_obj; +MemoryRegion *address_space_io; +MemoryRegion *pci_address_space; } I440FXState; #define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */ @@ -201,9 +203,13 @@ static const VMStateDescription vmstate_i440fx_pmc = { } }; -static int i440fx_initfn(SysBusDevice *dev) +static int i440fx_realize(SysBusDevice *dev) { PCIHostState *s = PCI_HOST_BRIDGE(dev); +I440FXState *f = I440FX_DEVICE(dev); + +s-bus = pci_bus_new(DEVICE(f), NULL, f-pci_address_space, + f-address_space_io, 0, TYPE_PCI_BUS); memory_region_init_io(s-conf_mem, pci_host_conf_le_ops, s, pci-conf-idx, 4); @@ -218,6 +224,10 @@ static int i440fx_initfn(SysBusDevice *dev) return 0; } +static void i440fx_initfn(Object *obj) +{ +} + static int i440fx_pmc_initfn(PCIDevice *dev) { I440FXPMCState *d = I440FX_PMC_DEVICE(dev); @@ -241,23 +251,25 @@ static PCIBus *i440fx_common_init(const char *device_name, MemoryRegion *pci_address_space, MemoryRegion *ram_memory) { -DeviceState *dev; -PCIBus *b; PCIDevice *d; PCIHostState *s; PIIX3State *piix3; I440FXPMCState *f; +I440FXState *i440fx; unsigned i; -dev = qdev_create(NULL, TYPE_I440FX_DEVICE); -s = PCI_HOST_BRIDGE(dev); -b = pci_bus_new(dev, NULL, pci_address_space, -address_space_io, 0, TYPE_PCI_BUS); -s-bus = b; -object_property_add_child(qdev_get_machine(), i440fx, OBJECT(dev), NULL); -qdev_init_nofail(dev); +i440fx = I440FX_DEVICE(object_new(TYPE_I440FX_DEVICE)); +s = PCI_HOST_BRIDGE(i440fx); + +i440fx-address_space_io = address_space_io; +i440fx-pci_address_space = pci_address_space; -d = pci_create_simple(b, 0, device_name); +object_property_add_child(qdev_get_machine(), i440fx, + OBJECT(i440fx), NULL); +qdev_set_parent_bus(DEVICE(i440fx), sysbus_get_default()); +qdev_init_nofail(DEVICE(i440fx)); + +d = pci_create_simple(s-bus, 0, device_name); f = I440FX_PMC_DEVICE(d); f-system_memory = address_space_mem; f-pci_address_space = pci_address_space; @@ -291,15 +303,15 @@ static PCIBus *i440fx_common_init(const char *device_name, * These additional routes can be discovered through ACPI. */ if (xen_enabled()) { piix3 = DO_UPCAST(PIIX3State, dev, -pci_create_simple_multifunction(b, -1, true, PIIX3-xen)); -pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq, +pci_create_simple_multifunction(s-bus, -1, true, PIIX3-xen)); +pci_bus_irqs(s-bus, xen_piix3_set_irq, xen_pci_slot_get_pirq, piix3, XEN_PIIX_NUM_PIRQS); } else { piix3 = DO_UPCAST(PIIX3State, dev, -pci_create_simple_multifunction(b, -1, true, PIIX3)); -pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3, +pci_create_simple_multifunction(s-bus, -1, true, PIIX3)); +pci_bus_irqs(s-bus, piix3_set_irq, pci_slot_get_pirq, piix3, PIIX_NUM_PIRQS); -pci_bus_set_route_irq_fn(b, piix3_route_intx_pin_to_irq); +pci_bus_set_route_irq_fn(s-bus, piix3_route_intx_pin_to_irq); } piix3-pic = pic; *isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), isa.0)); @@ -313,7 +325,7 @@ static PCIBus *i440fx_common_init(const char *device_name, i440fx_pmc_update_memory_mappings(f); -return b; +return s-bus; } PCIBus *i440fx_init(int *piix3_devfn, @@ -636,7 +648,7 @@ static void i440fx_class_init(ObjectClass *klass, void *data) DeviceClass *dc = DEVICE_CLASS(klass); SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass); -k-init = i440fx_initfn; +k-init = i440fx_realize; dc-fw_name = pci; dc-no_user = 1; } @@ -645,6 +657,7 @@ static const TypeInfo i440fx_info = { .name = TYPE_I440FX_DEVICE, .parent= TYPE_PCI_HOST_BRIDGE, .instance_size = sizeof(I440FXState), +.instance_init = i440fx_initfn, .class_init= i440fx_class_init, }; -- 1.8.3.1
[Qemu-devel] [PATCH RFC 12/15] introduce memory controller
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/i386/pc.c | 156 +++ hw/pci-host/piix.c | 1 - include/hw/i386/pc.h | 43 ++ 3 files changed, 199 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 398b201..c28baa2 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1256,3 +1256,159 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name) gsi_state-ioapic_irq[i] = qdev_get_gpio_in(dev, i); } } + +void mc_update_pam(MemoryController *d) +{ +PCIDevice *pd = PCI_DEVICE(d); +MemoryControllerClass *c = MEMORY_CONTROLLER_GET_CLASS(d); +int i; + +memory_region_transaction_begin(); +for (i = 0; i 13; i++) { +pam_update(d-pam_regions[i], i, + pd-config[c-pam0 + ((i + 1) / 2)]); +} +memory_region_transaction_commit(); +} + +void mc_update_smram(MemoryController *d) +{ +PCIDevice *pd = PCI_DEVICE(d); +MemoryControllerClass *c = MEMORY_CONTROLLER_GET_CLASS(d); + +memory_region_transaction_begin(); +smram_update(d-smram_region, pd-config[c-smram], + d-smm_enabled); +memory_region_transaction_commit(); +} + +void mc_update(MemoryController *d) +{ +mc_update_pam(d); +mc_update_smram(d); +} + +void mc_set_smm(int val, void *arg) +{ +MemoryController *d = arg; +PCIDevice *pd = PCI_DEVICE(d); +MemoryControllerClass *c = MEMORY_CONTROLLER_GET_CLASS(d); + +memory_region_transaction_begin(); +smram_set_smm(d-smm_enabled, val, pd-config[c-smram], + d-smram_region); +memory_region_transaction_commit(); +} + +static int memory_controller_init(PCIDevice *dev) +{ +MemoryController *m = MEMORY_CONTROLLER(dev); +MemoryControllerClass *c = MEMORY_CONTROLLER_GET_CLASS(dev); +ram_addr_t ram_size; +hwaddr below_4g_mem_size, above_4g_mem_size; +hwaddr pci_hole_start, pci_hole_size; +hwaddr pci_hole64_start, pci_hole64_size; +int i; + +g_assert(m-system_memory != NULL); + +if(m-ram_size c-pci_hole_start) { +below_4g_mem_size = c-pci_hole_start; +above_4g_mem_size = m-ram_size - c-pci_hole_start; +} else { +below_4g_mem_size = m-ram_size; +above_4g_mem_size = 0; +} + +/* Allocate RAM. We allocate it as a single memory region and use + * aliases to address portions of it, mostly for backwards compatibility + * with older qemus that used qemu_ram_alloc(). + */ +memory_region_init_ram(m-ram, pc.ram, m-ram_size); +vmstate_register_ram_global(m-ram); +memory_region_init_alias(m-ram_below_4g, ram-below-4g, m-ram, + 0, below_4g_mem_size); +memory_region_add_subregion(m-system_memory, 0, m-ram_below_4g); +if (above_4g_mem_size 0) { +memory_region_init_alias(m-ram_above_4g, ram-above-4g, m-ram, + below_4g_mem_size, above_4g_mem_size); +memory_region_add_subregion(m-system_memory, c-pci_hole_end, +m-ram_above_4g); +} + +pci_hole_start = below_4g_mem_size; +pci_hole_size = c-pci_hole_end - pci_hole_start; + +pci_hole64_start = c-pci_hole_end + m-ram_size - pci_hole_start; +if (sizeof(hwaddr) == 4) { +pci_hole64_size = 0; +} else { +pci_hole64_size = (1ULL 62); +} + +memory_region_init_alias(m-pci_hole, pci-hole, m-pci_address_space, + pci_hole_start, pci_hole_size); +memory_region_add_subregion(m-system_memory, pci_hole_start, +m-pci_hole); +memory_region_init_alias(m-pci_hole_64bit, pci-hole64, + m-pci_address_space, + pci_hole64_start, pci_hole64_size); +if (pci_hole64_size) { +memory_region_add_subregion(m-system_memory, pci_hole64_start, +m-pci_hole_64bit); +} +memory_region_init_alias(m-smram_region, smram-region, + m-pci_address_space, 0xa, 0x2); +memory_region_add_subregion_overlap(m-system_memory, 0xa, +m-smram_region, 1); +memory_region_set_enabled(m-smram_region, false); + +init_pam(m-ram_memory, m-system_memory, m-pci_address_space, + m-pam_regions[0], PAM_BIOS_BASE, PAM_BIOS_SIZE); +for (i = 0; i 12; ++i) { +init_pam(m-ram_memory, m-system_memory, m-pci_address_space, + m-pam_regions[i+1], PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, + PAM_EXPAN_SIZE); +} + +ram_size = m-ram_size / 8 / 1024 / 1024; +if (ram_size 255) { +ram_size = 255; +} +dev-config[0x57] = ram_size; + +dev-config[0x72] = 0x02; + +cpu_smm_register(c-set_smm, m); +return 0; +} + +static void memory_controller_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc =
[Qemu-devel] [PATCH RFC 10/15] q35-mch: move ram initialization into q35-mch
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/i386/pc.c | 3 +-- hw/i386/pc_piix.c | 2 +- hw/i386/pc_q35.c | 7 ++- hw/pci-host/q35.c | 36 +++- include/hw/i386/pc.h | 3 +-- include/hw/pci-host/q35.h | 9 +++-- 6 files changed, 43 insertions(+), 17 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 909307e..398b201 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1029,8 +1029,7 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory, const char *initrd_filename, ram_addr_t below_4g_mem_size, ram_addr_t above_4g_mem_size, - MemoryRegion *rom_memory, - MemoryRegion **ram_memory) + MemoryRegion *rom_memory) { int linux_boot, i; MemoryRegion *option_rom_mr; diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 4614c07..77d7ef0 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -139,7 +139,7 @@ static void pc_init1(MemoryRegion *system_memory, fw_cfg = pc_memory_init(system_memory, kernel_filename, kernel_cmdline, initrd_filename, below_4g_mem_size, above_4g_mem_size, -rom_memory, NULL); +rom_memory); } if (kvm_irqchip_in_kernel()) { diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 109bb64..d7d9703 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -65,7 +65,6 @@ static void pc_q35_init(QEMUMachineInitArgs *args) ISADevice *rtc_state; ISADevice *floppy; MemoryRegion *rom_memory; -MemoryRegion *ram_memory; GSIState *gsi_state; ISABus *isa_bus; int pci_enabled = 1; @@ -108,7 +107,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args) if (!xen_enabled()) { pc_memory_init(get_system_memory(), kernel_filename, kernel_cmdline, initrd_filename, below_4g_mem_size, above_4g_mem_size, - rom_memory, ram_memory); + rom_memory); } /* irq lines */ @@ -121,11 +120,9 @@ static void pc_q35_init(QEMUMachineInitArgs *args) gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS); } -q35_host-mch.ram_memory = ram_memory; +q35_host-mch.ram_size = ram_size; q35_host-mch.system_memory = get_system_memory(); q35_host-mch.address_space_io = get_system_io(); -q35_host-mch.below_4g_mem_size = below_4g_mem_size; -q35_host-mch.above_4g_mem_size = above_4g_mem_size; /* pci */ qdev_init_nofail(DEVICE(q35_host)); host_bus = q35_host-host.pci.bus; diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c index 3ab5ed8..5f99431 100644 --- a/hw/pci-host/q35.c +++ b/hw/pci-host/q35.c @@ -244,24 +244,50 @@ static int mch_init(PCIDevice *d) { int i; hwaddr pci_hole64_size; +hwaddr below_4g_mem_size, above_4g_mem_size; MCHPCIState *mch = MCH_PCI_DEVICE(d); +if(mch-ram_size MCH_PCI_HOLE) { +below_4g_mem_size = MCH_PCI_HOLE; +above_4g_mem_size = mch-ram_size - MCH_PCI_HOLE; +} else { +below_4g_mem_size = mch-ram_size; +above_4g_mem_size = 0; +} + +/* Allocate RAM. We allocate it as a single memory region and use + * aliases to address portions of it, mostly for backwards compatibility + * with older qemus that used qemu_ram_alloc(). + */ +memory_region_init_ram(mch-ram, pc.ram, + below_4g_mem_size + above_4g_mem_size); +vmstate_register_ram_global(mch-ram); +memory_region_init_alias(mch-ram_below_4g, ram-below-4g, mch-ram, + 0, below_4g_mem_size); +memory_region_add_subregion(mch-system_memory, 0, mch-ram_below_4g); +if (above_4g_mem_size 0) { +memory_region_init_alias(mch-ram_above_4g, ram-above-4g, mch-ram, + below_4g_mem_size, above_4g_mem_size); +memory_region_add_subregion(mch-system_memory, MCH_PCI_HOLE_END, +mch-ram_above_4g); +} + /* setup pci memory regions */ memory_region_init_alias(mch-pci_hole, pci-hole, mch-pci_address_space, - mch-below_4g_mem_size, - 0x1ULL - mch-below_4g_mem_size); -memory_region_add_subregion(mch-system_memory, mch-below_4g_mem_size, + below_4g_mem_size, + 0x1ULL - below_4g_mem_size); +memory_region_add_subregion(mch-system_memory, below_4g_mem_size, mch-pci_hole); pci_hole64_size = (sizeof(hwaddr) == 4 ? 0 : ((uint64_t)1 62)); memory_region_init_alias(mch-pci_hole_64bit, pci-hole64,
[Qemu-devel] [PATCH RFC 15/15] move bios loading to MemoryController and ISAPc
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/i386/pc.c| 35 ++- hw/i386/pc_piix.c | 13 ++--- hw/i386/pc_q35.c| 13 ++--- hw/isa/isa_pc.c | 11 +++ include/hw/i386/pc.h| 7 +++ include/hw/isa/isa_pc.h | 1 + 6 files changed, 33 insertions(+), 47 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index c28baa2..5bb4989 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1023,36 +1023,19 @@ void pc_acpi_init(const char *default_dsdt) } } -FWCfgState *pc_memory_init(MemoryRegion *system_memory, - const char *kernel_filename, +FWCfgState *pc_memory_init(const char *kernel_filename, const char *kernel_cmdline, const char *initrd_filename, ram_addr_t below_4g_mem_size, - ram_addr_t above_4g_mem_size, - MemoryRegion *rom_memory) + ram_addr_t above_4g_mem_size) { -int linux_boot, i; -MemoryRegion *option_rom_mr; +int i; FWCfgState *fw_cfg; -linux_boot = (kernel_filename != NULL); - - -/* Initialize PC system firmware */ -pc_system_firmware_init(rom_memory); - -option_rom_mr = g_malloc(sizeof(*option_rom_mr)); -memory_region_init_ram(option_rom_mr, pc.rom, PC_ROM_SIZE); -vmstate_register_ram_global(option_rom_mr); -memory_region_add_subregion_overlap(rom_memory, -PC_ROM_MIN_VGA, -option_rom_mr, -1); - fw_cfg = bochs_bios_init(); rom_set_fw(fw_cfg); -if (linux_boot) { +if (kernel_filename) { load_linux(fw_cfg, kernel_filename, initrd_filename, kernel_cmdline, below_4g_mem_size); } @@ -1380,6 +1363,16 @@ static int memory_controller_init(PCIDevice *dev) dev-config[0x72] = 0x02; cpu_smm_register(c-set_smm, m); + +pc_system_firmware_init(m-pci_address_space); + +memory_region_init_ram(m-option_roms, pc.rom, PC_ROM_SIZE); +vmstate_register_ram_global(m-option_roms); +memory_region_add_subregion_overlap(m-pci_address_space, +PC_ROM_MIN_VGA, +m-option_roms, +1); + return 0; } diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 2e84cc2..fb056df 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -86,7 +86,6 @@ static void pc_init1(MemoryRegion *system_memory, ISADevice *rtc_state; ISADevice *floppy; MemoryRegion *pci_memory; -MemoryRegion *rom_memory; DeviceState *icc_bridge; FWCfgState *fw_cfg = NULL; ISAPc *isapc = NULL; @@ -135,18 +134,10 @@ static void pc_init1(MemoryRegion *system_memory, } isa_bus_irqs(isa_bus, gsi); -if (pci_enabled) { -rom_memory = pci_memory; -} else { -rom_memory = system_memory; -} - /* allocate ram and load rom/bios */ if (!xen_enabled()) { -fw_cfg = pc_memory_init(system_memory, -kernel_filename, kernel_cmdline, initrd_filename, -below_4g_mem_size, above_4g_mem_size, -rom_memory); +fw_cfg = pc_memory_init(kernel_filename, kernel_cmdline, initrd_filename, +below_4g_mem_size, above_4g_mem_size); } if (kvm_irqchip_in_kernel()) { diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index fde83b4..5fe14bb 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -64,7 +64,6 @@ static void pc_q35_init(QEMUMachineInitArgs *args) BusState *idebus[MAX_SATA_PORTS]; ISADevice *rtc_state; ISADevice *floppy; -MemoryRegion *rom_memory; GSIState *gsi_state; ISABus *isa_bus; int pci_enabled = 1; @@ -96,18 +95,10 @@ static void pc_q35_init(QEMUMachineInitArgs *args) below_4g_mem_size = ram_size; } -/* pci enabled */ -if (pci_enabled) { -rom_memory = q35_host-pci_address_space; -} else { -rom_memory = get_system_memory(); -} - /* allocate ram and load rom/bios */ if (!xen_enabled()) { -pc_memory_init(get_system_memory(), kernel_filename, kernel_cmdline, - initrd_filename, below_4g_mem_size, above_4g_mem_size, - rom_memory); +pc_memory_init(kernel_filename, kernel_cmdline, + initrd_filename, below_4g_mem_size, above_4g_mem_size); } /* irq lines */ diff --git a/hw/isa/isa_pc.c b/hw/isa/isa_pc.c index f73cddb..ae9f2c8 100644 --- a/hw/isa/isa_pc.c +++ b/hw/isa/isa_pc.c @@ -1,4 +1,6 @@ +#include hw/i386/pc.h #include hw/isa/isa_pc.h +#include hw/loader.h static void isa_pc_realize(DeviceState *dev, Error **errp) { @@ -17,6 +19,15 @@
[Qemu-devel] [PATCH RFC 11/15] introduce ISAPc
Previous refactor breaks isapc. This patch introduces ISAPc and move isapc initializing code to ISAPc. Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/i386/pc_piix.c | 15 +++ hw/isa/Makefile.objs| 2 +- hw/isa/isa_pc.c | 42 ++ include/hw/isa/isa_pc.h | 26 ++ 4 files changed, 80 insertions(+), 5 deletions(-) create mode 100644 hw/isa/isa_pc.c create mode 100644 include/hw/isa/isa_pc.h diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 77d7ef0..2e84cc2 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -49,6 +49,7 @@ #ifdef CONFIG_XEN # include xen/hvm/hvm_info_table.h #endif +#include hw/isa/isa_pc.h #define MAX_IDE_BUS 2 @@ -72,7 +73,7 @@ static void pc_init1(MemoryRegion *system_memory, { int i; ram_addr_t below_4g_mem_size, above_4g_mem_size; -PCIBus *pci_bus; +PCIBus *pci_bus = NULL; ISABus *isa_bus; int piix3_devfn = -1; qemu_irq *cpu_irq; @@ -88,6 +89,7 @@ static void pc_init1(MemoryRegion *system_memory, MemoryRegion *rom_memory; DeviceState *icc_bridge; FWCfgState *fw_cfg = NULL; +ISAPc *isapc = NULL; icc_bridge = qdev_create(NULL, TYPE_ICC_BRIDGE); object_property_add_child(qdev_get_machine(), icc-bridge, @@ -122,8 +124,13 @@ static void pc_init1(MemoryRegion *system_memory, system_memory, system_io, ram_size, pci_memory); } else { -pci_bus = NULL; -isa_bus = isa_bus_new(NULL, system_io); +isapc = ISA_PC(object_new(TYPE_ISA_PC)); +isapc-ram_size = ram_size; +isapc-address_space_mem = system_memory; +isapc-address_space_io = system_io; +qdev_init_nofail(DEVICE(isapc)); + +isa_bus = isapc-bus; no_hpet = 1; } isa_bus_irqs(isa_bus, gsi); @@ -161,7 +168,7 @@ static void pc_init1(MemoryRegion *system_memory, pc_register_ferr_irq(gsi[13]); -pc_vga_init(isa_bus, pci_enabled ? pci_bus : NULL); +pc_vga_init(isa_bus, pci_bus); if (xen_enabled()) { pci_create_simple(pci_bus, -1, xen-platform); } diff --git a/hw/isa/Makefile.objs b/hw/isa/Makefile.objs index 193746a..99e6a4f 100644 --- a/hw/isa/Makefile.objs +++ b/hw/isa/Makefile.objs @@ -1,4 +1,4 @@ -common-obj-y += isa-bus.o +common-obj-y += isa-bus.o isa_pc.o common-obj-$(CONFIG_APM) += apm.o common-obj-$(CONFIG_I82378) += i82378.o common-obj-$(CONFIG_ISA_MMIO) += isa_mmio.o diff --git a/hw/isa/isa_pc.c b/hw/isa/isa_pc.c new file mode 100644 index 000..f73cddb --- /dev/null +++ b/hw/isa/isa_pc.c @@ -0,0 +1,42 @@ +#include hw/isa/isa_pc.h + +static void isa_pc_realize(DeviceState *dev, Error **errp) +{ +ISAPc *isapc = ISA_PC(dev); + +g_assert(isapc-address_space_mem != NULL); +g_assert(isapc-address_space_io != NULL); +g_assert(isapc-ram_size 0); + +isapc-bus = isa_bus_new(NULL, isapc-address_space_io); + +/* Allocate RAM. We allocate it as a single memory region and use + * aliases to address portions of it, mostly for backwards compatibility + * with older qemus that used qemu_ram_alloc(). + */ +memory_region_init_ram(isapc-ram, pc.ram, isapc-ram_size); +vmstate_register_ram_global(isapc-ram); +memory_region_add_subregion(isapc-address_space_mem, 0, isapc-ram); +} + +static void isa_pc_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); + +dc-realize = isa_pc_realize; +dc-no_user = 1; +} + +static const TypeInfo isa_pc_type_info = { +.name = TYPE_ISA_PC, +.parent = TYPE_DEVICE, +.instance_size = sizeof(ISAPc), +.class_init = isa_pc_class_init, +}; + +static void isa_pc_register_types(void) +{ +type_register_static(isa_pc_type_info); +} + +type_init(isa_pc_register_types) diff --git a/include/hw/isa/isa_pc.h b/include/hw/isa/isa_pc.h new file mode 100644 index 000..91a0701 --- /dev/null +++ b/include/hw/isa/isa_pc.h @@ -0,0 +1,26 @@ +#ifndef ISA_PC_H +#define ISA_PC_H + +#include qom/object.h +#include exec/memory.h +#include hw/isa/isa.h + +#define TYPE_ISA_PC isa-pc +#define ISA_PC(obj) OBJECT_CHECK(ISAPc, (obj), TYPE_ISA_PC) + +typedef struct ISAPc ISAPc; + +struct ISAPc { +/* private */ +Object parent_obj; +/* public */ + +ISABus *bus; + +MemoryRegion *address_space_mem; +MemoryRegion *address_space_io; +MemoryRegion ram; +ram_addr_t ram_size; +}; + +#endif -- 1.8.3.1
[Qemu-devel] [PATCH RFC 14/15] q35 mch: inherit from MemoryController
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/i386/pc_q35.c | 4 +- hw/pci-host/q35.c | 150 +- include/hw/pci-host/q35.h | 14 + 3 files changed, 32 insertions(+), 136 deletions(-) diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index d7d9703..fde83b4 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -120,8 +120,8 @@ static void pc_q35_init(QEMUMachineInitArgs *args) gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS); } -q35_host-mch.ram_size = ram_size; -q35_host-mch.system_memory = get_system_memory(); +q35_host-mch.dev.ram_size = ram_size; +q35_host-mch.dev.system_memory = get_system_memory(); q35_host-mch.address_space_io = get_system_io(); /* pci */ qdev_init_nofail(DEVICE(q35_host)); diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c index 5f99431..0296550 100644 --- a/hw/pci-host/q35.c +++ b/hw/pci-host/q35.c @@ -53,10 +53,10 @@ static int q35_host_init(SysBusDevice *dev) return -1; } -s-mch.pci_address_space = s-pci_address_space; -pci-bus = pci_bus_new(DEVICE(s), pcie.0, -s-mch.pci_address_space, s-mch.address_space_io, -0, TYPE_PCIE_BUS); +s-mch.dev.pci_address_space = s-pci_address_space; +pci-bus = pci_bus_new(DEVICE(dev), pcie.0, + s-mch.dev.pci_address_space, s-mch.address_space_io, + 0, TYPE_PCIE_BUS); qdev_set_parent_bus(DEVICE(s-mch), BUS(pci-bus)); qdev_init_nofail(DEVICE(s-mch)); @@ -106,8 +106,8 @@ static const TypeInfo q35_host_info = { /* PCIe MMCFG */ static void mch_update_pciexbar(MCHPCIState *mch) { -PCIDevice *pci_dev = mch-d; -BusState *bus = qdev_get_parent_bus(pci_dev-qdev); +PCIDevice *pci_dev = PCI_DEVICE(mch); +BusState *bus = qdev_get_parent_bus(DEVICE(pci_dev)); DeviceState *qdev = bus-parent; Q35PCIHost *s = Q35_HOST_DEVICE(qdev); @@ -144,49 +144,18 @@ static void mch_update_pciexbar(MCHPCIState *mch) pcie_host_mmcfg_update(s-host, enable, addr, length); } -/* PAM */ -static void mch_update_pam(MCHPCIState *mch) -{ -int i; - -memory_region_transaction_begin(); -for (i = 0; i 13; i++) { -pam_update(mch-pam_regions[i], i, - mch-d.config[MCH_HOST_BRIDGE_PAM0 + ((i + 1) / 2)]); -} -memory_region_transaction_commit(); -} - -/* SMRAM */ -static void mch_update_smram(MCHPCIState *mch) -{ -memory_region_transaction_begin(); -smram_update(mch-smram_region, mch-d.config[MCH_HOST_BRDIGE_SMRAM], -mch-smm_enabled); -memory_region_transaction_commit(); -} - -static void mch_set_smm(int smm, void *arg) -{ -MCHPCIState *mch = arg; - -memory_region_transaction_begin(); -smram_set_smm(mch-smm_enabled, smm, mch-d.config[MCH_HOST_BRDIGE_SMRAM], -mch-smram_region); -memory_region_transaction_commit(); -} - static void mch_write_config(PCIDevice *d, - uint32_t address, uint32_t val, int len) + uint32_t address, uint32_t val, int len) { MCHPCIState *mch = MCH_PCI_DEVICE(d); +MemoryController *m = MEMORY_CONTROLLER(d); /* XXX: implement SMRAM.D_LOCK */ pci_default_write_config(d, address, val, len); if (ranges_overlap(address, len, MCH_HOST_BRIDGE_PAM0, MCH_HOST_BRIDGE_PAM_SIZE)) { -mch_update_pam(mch); +mc_update_pam(m); } if (ranges_overlap(address, len, MCH_HOST_BRIDGE_PCIEXBAR, @@ -196,21 +165,23 @@ static void mch_write_config(PCIDevice *d, if (ranges_overlap(address, len, MCH_HOST_BRDIGE_SMRAM, MCH_HOST_BRDIGE_SMRAM_SIZE)) { -mch_update_smram(mch); +mc_update_smram(m); } } -static void mch_update(MCHPCIState *mch) +static void mch_update(MemoryController *d) { +MCHPCIState *mch = MCH_PCI_DEVICE(d); + mch_update_pciexbar(mch); -mch_update_pam(mch); -mch_update_smram(mch); +mc_update_pam(d); +mc_update_smram(d); } static int mch_post_load(void *opaque, int version_id) { -MCHPCIState *mch = opaque; -mch_update(mch); +MemoryController *m = opaque; +mch_update(m); return 0; } @@ -221,8 +192,8 @@ static const VMStateDescription vmstate_mch = { .minimum_version_id_old = 1, .post_load = mch_post_load, .fields = (VMStateField []) { -VMSTATE_PCI_DEVICE(d, MCHPCIState), -VMSTATE_UINT8(smm_enabled, MCHPCIState), +VMSTATE_PCI_DEVICE(dev.dev, MCHPCIState), +VMSTATE_UINT8(dev.smm_enabled, MCHPCIState), VMSTATE_END_OF_LIST() } }; @@ -230,102 +201,39 @@ static const VMStateDescription vmstate_mch = { static void mch_reset(DeviceState *qdev) { PCIDevice *d = PCI_DEVICE(qdev); -MCHPCIState *mch = MCH_PCI_DEVICE(d); +MemoryController *m = MEMORY_CONTROLLER(d);
[Qemu-devel] [PATCH RFC 13/15] i440fx pmc: inherit from MemoryController
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/pci-host/piix.c | 159 - 1 file changed, 22 insertions(+), 137 deletions(-) diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 44ebe0f..71a9b8b 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -86,19 +86,7 @@ typedef struct PIIX3State { OBJECT_CHECK(I440FXPMCState, (obj), TYPE_I440FX_PMC_DEVICE) struct I440FXPMCState { -PCIDevice dev; -MemoryRegion *system_memory; -MemoryRegion *pci_address_space; -MemoryRegion *ram_memory; -MemoryRegion pci_hole; -MemoryRegion pci_hole_64bit; -PAMMemoryRegion pam_regions[13]; -MemoryRegion smram_region; -uint8_t smm_enabled; -ram_addr_t ram_size; -MemoryRegion ram; -MemoryRegion ram_below_4g; -MemoryRegion ram_above_4g; +MemoryController dev; }; #define TYPE_I440FX_DEVICE i440FX @@ -133,52 +121,32 @@ static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx) return (pci_intx + slot_addend) 3; } -static void i440fx_pmc_update_memory_mappings(I440FXPMCState *d) -{ -int i; - -memory_region_transaction_begin(); -for (i = 0; i 13; i++) { -pam_update(d-pam_regions[i], i, - d-dev.config[I440FX_PAM + ((i + 1) / 2)]); -} -smram_update(d-smram_region, d-dev.config[I440FX_SMRAM], d-smm_enabled); -memory_region_transaction_commit(); -} - -static void i440fx_set_smm(int val, void *arg) +static void i440fx_pmc_reset(DeviceState *d) { -I440FXPMCState *d = arg; - -memory_region_transaction_begin(); -smram_set_smm(d-smm_enabled, val, d-dev.config[I440FX_SMRAM], - d-smram_region); -memory_region_transaction_commit(); +mc_update(MEMORY_CONTROLLER(d)); } - static void i440fx_write_config(PCIDevice *dev, uint32_t address, uint32_t val, int len) { -I440FXPMCState *d = I440FX_PMC_DEVICE(dev); - /* XXX: implement SMRAM.D_LOCK */ pci_default_write_config(dev, address, val, len); if (ranges_overlap(address, len, I440FX_PAM, I440FX_PAM_SIZE) || range_covers_byte(address, len, I440FX_SMRAM)) { -i440fx_pmc_update_memory_mappings(d); +mc_update(MEMORY_CONTROLLER(dev)); } } static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id) { -I440FXPMCState *d = opaque; +MemoryController *d = opaque; int ret, i; ret = pci_device_load(d-dev, f); if (ret 0) return ret; -i440fx_pmc_update_memory_mappings(d); + +mc_update(MEMORY_CONTROLLER(d)); qemu_get_8s(f, d-smm_enabled); if (version_id == 2) { @@ -192,9 +160,9 @@ static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id) static int i440fx_post_load(void *opaque, int version_id) { -I440FXPMCState *d = opaque; +MemoryController *d = opaque; -i440fx_pmc_update_memory_mappings(d); +mc_update(d); return 0; } @@ -206,8 +174,8 @@ static const VMStateDescription vmstate_i440fx_pmc = { .load_state_old = i440fx_load_old, .post_load = i440fx_post_load, .fields = (VMStateField []) { -VMSTATE_PCI_DEVICE(dev, I440FXPMCState), -VMSTATE_UINT8(smm_enabled, I440FXPMCState), +VMSTATE_PCI_DEVICE(dev.dev, I440FXPMCState), +VMSTATE_UINT8(dev.smm_enabled, I440FXPMCState), VMSTATE_END_OF_LIST() } }; @@ -230,7 +198,7 @@ static int i440fx_realize(SysBusDevice *dev) sysbus_add_io(dev, 0xcfc, s-data_mem); sysbus_init_ioports(s-busdev, 0xcfc, 4); -f-pmc.pci_address_space = f-pci_address_space; +f-pmc.dev.pci_address_space = f-pci_address_space; qdev_set_parent_bus(DEVICE(f-pmc), BUS(s-bus)); qdev_init_nofail(DEVICE(f-pmc)); @@ -249,91 +217,6 @@ static void i440fx_initfn(Object *obj) memory_region_init(f-pci_address_space, pci, INT64_MAX); } -static int i440fx_pmc_initfn(PCIDevice *dev) -{ -I440FXPMCState *d = I440FX_PMC_DEVICE(dev); -ram_addr_t ram_size; -hwaddr below_4g_mem_size, above_4g_mem_size; -hwaddr pci_hole_start, pci_hole_size; -hwaddr pci_hole64_start, pci_hole64_size; -int i; - -g_assert(d-system_memory != NULL); - -if(d-ram_size I440FX_PMC_PCI_HOLE) { -below_4g_mem_size = I440FX_PMC_PCI_HOLE; -above_4g_mem_size = d-ram_size - I440FX_PMC_PCI_HOLE; -} else { -below_4g_mem_size = d-ram_size; -above_4g_mem_size = 0; -} - -/* Allocate RAM. We allocate it as a single memory region and use - * aliases to address portions of it, mostly for backwards compatibility - * with older qemus that used qemu_ram_alloc(). - */ -memory_region_init_ram(d-ram, pc.ram, - below_4g_mem_size + above_4g_mem_size); -vmstate_register_ram_global(d-ram); -memory_region_init_alias(d-ram_below_4g, ram-below-4g, d-ram, - 0, below_4g_mem_size); -
Re: [Qemu-devel] [PATCH v3] vl.c: Support multiple CPU ranges on -numa option
On Wed, 19 Jun 2013 10:26:42 -0300 Eduardo Habkost ehabk...@redhat.com wrote: On Wed, Jun 19, 2013 at 01:42:52PM +0200, Igor Mammedov wrote: On Tue, 18 Jun 2013 16:09:49 -0400 Bandan Das b...@redhat.com wrote: This allows us to use the cpu property multiple times to specify multiple cpu (ranges) to the -numa option : -numa node,cpu=1,cpu=2,cpu=4 or -numa node,cpu=1-3,cpu=5 Signed-off-by: Bandan Das b...@redhat.com --- v3: Convert to using QemuOpts Use -cpu rather than -cpus which probably probably makes it more meaningful for non-range arguments Sorry for reviving this up :) This is a follow up to earlier proposals sent by Eduardo. References : 1. http://lists.gnu.org/archive/html/qemu-devel/2013-02/msg03832.html 2. https://lists.gnu.org/archive/html/qemu-devel/2013-02/msg03857.html So, basically the format seemed easier to work with if we are thinking of using QemuOpts for -numa. Using -cpu rather than cpus probably makes it less ambiguous as well IMO. However, it's probably not a good idea if the current syntax is well established ? libvirt uses the cpus option already, so we have to keep it working. Sure, we can leave it as it's now for some time while a new interface is introduced/adopted. And than later deprecate cpus. In context of x86, allowing to specify CPU threads using cpu_index isn't correct, since node calculated from APIC ID and node it gets from ACPI table could differ. It could be better for CLI interface to accept socket number and build always correct NUMA mapping internally using APIC IDs from CPUs, as it's done in real hardware. In addition it would allow to deprecate use of cpu_index on CLI interface, and simplify following re-factoring to use socket/[core/]thread as means to address/ specify specific CPUs there and later in monitor/qmp interface as well. What about simply accepting a QOM object path? Today we could only accept CPU thread objects (because there are no socket/core objects yet), but the day we introduce CPU socket objects, we can change the code to accept them without changing the syntax. It doesn't matter if it's socket=N or QOM path, the idea is not to allow individual CPU threads there to avoid misconfiguration, but use socket entities in some form in interface part. Sockets could be dummy containers for initial implementation so not to delay sanitizing NUMA code.
[Qemu-devel] [PATCH RFC 07/15] i440fx-pmc: create pci address space
Signed-off-by: Hu Tao hu...@cn.fujitsu.com --- hw/i386/pc_piix.c| 33 +++-- hw/pci-host/piix.c | 16 +--- include/hw/i386/pc.h | 2 +- 3 files changed, 25 insertions(+), 26 deletions(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 3934754..1ca705c 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -109,23 +109,6 @@ static void pc_init1(MemoryRegion *system_memory, below_4g_mem_size = ram_size; } -if (pci_enabled) { -pci_memory = g_new(MemoryRegion, 1); -memory_region_init(pci_memory, pci, INT64_MAX); -rom_memory = pci_memory; -} else { -pci_memory = NULL; -rom_memory = system_memory; -} - -/* allocate ram and load rom/bios */ -if (!xen_enabled()) { -fw_cfg = pc_memory_init(system_memory, - kernel_filename, kernel_cmdline, initrd_filename, - below_4g_mem_size, above_4g_mem_size, - rom_memory, ram_memory); -} - gsi_state = g_malloc0(sizeof(*gsi_state)); if (kvm_irqchip_in_kernel()) { kvm_pc_setup_irq_routing(pci_enabled); @@ -138,7 +121,7 @@ static void pc_init1(MemoryRegion *system_memory, if (pci_enabled) { pci_bus = i440fx_init(piix3_devfn, isa_bus, gsi, system_memory, system_io, ram_size, - pci_memory, ram_memory); + pci_memory, ram_memory); } else { pci_bus = NULL; isa_bus = isa_bus_new(NULL, system_io); @@ -146,6 +129,20 @@ static void pc_init1(MemoryRegion *system_memory, } isa_bus_irqs(isa_bus, gsi); +if (pci_enabled) { +rom_memory = pci_memory; +} else { +rom_memory = system_memory; +} + +/* allocate ram and load rom/bios */ +if (!xen_enabled()) { +fw_cfg = pc_memory_init(system_memory, +kernel_filename, kernel_cmdline, initrd_filename, +below_4g_mem_size, above_4g_mem_size, +rom_memory, ram_memory); +} + if (kvm_irqchip_in_kernel()) { i8259 = kvm_i8259_init(isa_bus); } else if (xen_enabled()) { diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 7b58d56..604a66e 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -105,7 +105,7 @@ struct I440FXPMCState { typedef struct I440FXState { PCIHostState parent_obj; MemoryRegion *address_space_io; -MemoryRegion *pci_address_space; +MemoryRegion pci_address_space; PIIX3State piix3; I440FXPMCState pmc; @@ -214,7 +214,7 @@ static int i440fx_realize(SysBusDevice *dev) PCIHostState *s = PCI_HOST_BRIDGE(dev); I440FXState *f = I440FX_DEVICE(dev); -s-bus = pci_bus_new(DEVICE(f), NULL, f-pci_address_space, +s-bus = pci_bus_new(DEVICE(f), NULL, f-pci_address_space, f-address_space_io, 0, TYPE_PCI_BUS); memory_region_init_io(s-conf_mem, pci_host_conf_le_ops, s, @@ -227,6 +227,8 @@ static int i440fx_realize(SysBusDevice *dev) sysbus_add_io(dev, 0xcfc, s-data_mem); sysbus_init_ioports(s-busdev, 0xcfc, 4); +f-pmc.pci_address_space = f-pci_address_space; + qdev_set_parent_bus(DEVICE(f-pmc), BUS(s-bus)); qdev_init_nofail(DEVICE(f-pmc)); @@ -240,6 +242,8 @@ static void i440fx_initfn(Object *obj) object_initialize(f-pmc, TYPE_I440FX_PMC_DEVICE); object_property_add_child(obj, pmc, OBJECT(f-pmc), NULL); qdev_prop_set_uint32(DEVICE(f-pmc), addr, PCI_DEVFN(0, 0)); + +memory_region_init(f-pci_address_space, pci, INT64_MAX); } static int i440fx_pmc_initfn(PCIDevice *dev) @@ -251,7 +255,6 @@ static int i440fx_pmc_initfn(PCIDevice *dev) int i; g_assert(d-system_memory != NULL); -g_assert(d-pci_address_space != NULL); g_assert(d-ram_memory != NULL); if(d-ram_size I440FX_PMC_PCI_HOLE) { @@ -313,7 +316,7 @@ static PCIBus *i440fx_common_init(const char *device_name, MemoryRegion *address_space_mem, MemoryRegion *address_space_io, ram_addr_t ram_size, - MemoryRegion *pci_address_space, + MemoryRegion **pci_address_space, MemoryRegion *ram_memory) { PCIHostState *s; @@ -325,12 +328,10 @@ static PCIBus *i440fx_common_init(const char *device_name, s = PCI_HOST_BRIDGE(i440fx); i440fx-address_space_io = address_space_io; -i440fx-pci_address_space = pci_address_space; f = i440fx-pmc; f-ram_size = ram_size; f-system_memory = address_space_mem; -f-pci_address_space = pci_address_space; f-ram_memory = ram_memory; object_property_add_child(qdev_get_machine(), i440fx, @@ -358,6 +359,7 @@ static PCIBus *i440fx_common_init(const char
[Qemu-devel] [PATCH 0/1] ARM aarch64 TCG tlb fast lookup
This patch implements the TCG tlb fast lookup in tcg_out_qemu_ld/st for the aarch64 TCG target. Supports also CONFIG_QEMU_LDST_OPTIMIZATION. Tested running on a x86-64 physical machine running Foundation v8, running a linux 3.2.0 minimal host system based on linaro v8 image build 0.8.4423 for user space. Tested guests: arm v5, PPC64, sparc, i386 linux test images. Also tested on x86-64/linux built with buildroot. Jani Kokkonen (1): tcg/aarch64: Implement tlb lookup fast path configure|2 +- include/exec/exec-all.h | 16 +++- tcg/aarch64/tcg-target.c | 197 ++ 3 files changed, 162 insertions(+), 53 deletions(-) -- 1.7.9.5
[Qemu-devel] [PATCH 1/1] tcg/aarch64: Implement tlb lookup fast path
Supports CONFIG_QEMU_LDST_OPTIMIZATION Signed-off-by: Jani Kokkonen jani.kokko...@huawei.com --- configure|2 +- include/exec/exec-all.h | 16 +++- tcg/aarch64/tcg-target.c | 197 ++ 3 files changed, 162 insertions(+), 53 deletions(-) diff --git a/configure b/configure index bed0104..6a36682 100755 --- a/configure +++ b/configure @@ -3599,7 +3599,7 @@ echo libs_softmmu=$libs_softmmu $config_host_mak echo ARCH=$ARCH $config_host_mak case $cpu in - arm|i386|x86_64|ppc) + arm|i386|x86_64|ppc|aarch64) # The TCG interpreter currently does not support ld/st optimization. if test $tcg_interpreter = no ; then echo CONFIG_QEMU_LDST_OPTIMIZATION=y $config_host_mak diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index 5c31863..39c28d1 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -19,9 +19,7 @@ #ifndef _EXEC_ALL_H_ #define _EXEC_ALL_H_ - #include qemu-common.h - /* allow to see translation results - the slowdown should be negligible, so we leave it */ #define DEBUG_DISAS @@ -358,6 +356,20 @@ static inline uintptr_t tcg_getpc_ldst(uintptr_t ra) not the start of the next opcode */ return ra; } +#elif defined(__aarch64__) +# define GETRA() ((uintptr_t)__builtin_return_address(0)) +# define GETPC_LDST() tcg_getpc_ldst(GETRA()) +static inline uintptr_t tcg_getpc_ldst(uintptr_t ra) +{ +int32_t b; +ra += 4;/* skip one instruction */ +b = *(int32_t *)ra; /* load the branch insn */ +b = (b 6) (6 - 2);/* extract the displacement */ +ra += b;/* apply the displacement */ +ra -= 4;/* return a pointer into the current opcode, + not the start of the next opcode */ +return ra; +} # else # error CONFIG_QEMU_LDST_OPTIMIZATION needs GETPC_LDST() implementation! # endif diff --git a/tcg/aarch64/tcg-target.c b/tcg/aarch64/tcg-target.c index 8bb195e..03da082 100644 --- a/tcg/aarch64/tcg-target.c +++ b/tcg/aarch64/tcg-target.c @@ -706,6 +706,41 @@ static inline void tcg_out_uxt(TCGContext *s, int s_bits, tcg_out_ubfm(s, 0, rd, rn, 0, bits); } +static inline void tcg_out_addi(TCGContext *s, int ext, +TCGReg rd, TCGReg rn, unsigned int aimm) +{ +/* add immediate aimm unsigned 12bit value (with LSL 0 or 12) */ +/* using ADD 0x1100 | (ext) | (aimm 10) | (rn 5) | rd */ +unsigned int base = ext ? 0x9100 : 0x1100; + +if (aimm = 0xfff) { +aimm = 10; +} else { +/* we can only shift left by 12, on assert we cannot represent */ +assert(!(aimm 0xfff)); +assert(aimm = 0xfff000); +base |= 1 22; /* apply LSL 12 */ +aimm = 2; +} + +tcg_out32(s, base | aimm | (rn 5) | rd); +} + +static inline void tcg_out_subi(TCGContext *s, int ext, +TCGReg rd, TCGReg rn, unsigned int aimm) +{ +/* sub immediate aimm unsigned 12bit value (we use LSL 0 - no shift) */ +/* using SUB 0x5100 | (ext) | (aimm 10) | (rn 5) | rd */ +unsigned int base = ext ? 0xd100 : 0x5100; +assert(aimm = 0xfff); +tcg_out32(s, base | (aimm 10) | (rn 5) | rd); +} + +static inline void tcg_out_nop(TCGContext *s) +{ +tcg_out32(s, 0xd503201f); +} + #ifdef CONFIG_SOFTMMU #include exec/softmmu_defs.h @@ -727,7 +762,106 @@ static const void * const qemu_st_helpers[4] = { helper_stq_mmu, }; -#else /* !CONFIG_SOFTMMU */ +static void tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) +{ +reloc_pc19(lb-label_ptr[0], (tcg_target_long)s-code_ptr); +tcg_out_movr(s, 1, TCG_REG_X0, TCG_AREG0); +tcg_out_movr(s, (TARGET_LONG_BITS == 64), TCG_REG_X1, lb-addrlo_reg); +tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_X2, lb-mem_index); +tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP, + (tcg_target_long)qemu_ld_helpers[lb-opc 3]); +tcg_out_callr(s, TCG_REG_TMP); +if (lb-opc 0x04) { +tcg_out_sxt(s, 1, lb-opc 3, lb-datalo_reg, TCG_REG_X0); +} else { +tcg_out_movr(s, 1, lb-datalo_reg, TCG_REG_X0); +} + +tcg_out_goto(s, (tcg_target_long)lb-raddr); +} + +static void tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) +{ +reloc_pc19(lb-label_ptr[0], (tcg_target_long)s-code_ptr); + +tcg_out_movr(s, 1, TCG_REG_X0, TCG_AREG0); +tcg_out_movr(s, (TARGET_LONG_BITS == 64), TCG_REG_X1, lb-addrlo_reg); +tcg_out_movr(s, 1, TCG_REG_X2, lb-datalo_reg); +tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_X3, lb-mem_index); +tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP, + (tcg_target_long)qemu_st_helpers[lb-opc 3]); +tcg_out_callr(s, TCG_REG_TMP); + +tcg_out_nop(s); +tcg_out_goto(s, (tcg_target_long)lb-raddr); +} + +void tcg_out_tb_finalize(TCGContext *s) +{ +int i; +
Re: [Qemu-devel] [PATCH v3 1/2] pvpanic: initialization cleanup
On Wed, 19 Jun 2013, Michael S. Tsirkin wrote: -int pvpanic_init(ISABus *bus) +void pvpanic_init(ISABus *bus) { -isa_create_simple(bus, TYPE_ISA_PVPANIC_DEVICE); -return 0; +ISADevice *dev; +FWCfgState *fw_cfg = fw_cfg_find(); +if (!fw_cfg) { +return; +} +dev = isa_create_simple (bus, TYPE_ISA_PVPANIC_DEVICE); +pvpanic_fw_cfg(dev, fw_cfg); } looks OK to me
Re: [Qemu-devel] [PATCH 1/2] PPC: Add dump_mmu() for 6xx
On 18.06.2013, at 18:04, Fabien Chouteau wrote: On 06/18/2013 05:31 PM, Alexander Graf wrote: +for (type = 0; type 2; type++) You need braces on these. Please run your patch through checkpatch.pl :). I did ;) - ./scripts/checkpatch.pl 0001-PPC-Add-dump_mmu-for-6xx.patch total: 0 errors, 0 warnings, 51 lines checked 0001-PPC-Add-dump_mmu-for-6xx.patch has no obvious style problems and is ready for submission. Meh - broken script :). According to the CODING_STYLE convention all of the above need to be cluttered with braces ;). +for (way = 0; way env-nb_ways; way++) +for (entry = env-nb_tlb * type + env-tlb_per_way * way; + entry (env-nb_tlb * type + env-tlb_per_way * (way + 1)); + entry++) { + +tlb = env-tlb.tlb6[entry]; +cpu_fprintf(f, TLB %02d/%02d %s way:%d %s [ +TARGET_FMT_lx TARGET_FMT_lx ]\n, +entry % env-nb_tlb, env-nb_tlb, +type ? code : data, way, +pte_is_valid(tlb-pte0) ? valid : inval, +tlb-EPN, tlb-EPN + TARGET_PAGE_SIZE); +} I thought 6xx and 74xx also support HTAB and SRs? Shouldn't we dump those as well? I don't know what that is, can you send me an example of what the printf line should be? SRs are similar to the SLB that book3s_64 print out. Just that there are a fixed smaller number of them (16). Basically you'd dump the env-sr array, similar to how the debug functions in get_segment_6xx_tlb() dump it. For the HTAB I think SDR1 should be enough, so you don't need to do too much here. If you like, you can just dump the decoded fields env-htab_base and env-htab_mask. Dumping the whole HTAB would just explode the output. However, you also should definitely dump all (valid) BATs. Check out get_bat_6xx_tlb() for debug code that dumps BATs. Alex
Re: [Qemu-devel] [PATCH v3] vl.c: Support multiple CPU ranges on -numa option
20.06.2013 13:52, Paolo Bonzini wrote: Il 20/06/2013 11:30, Igor Mammedov ha scritto: libvirt uses the cpus option already, so we have to keep it working. Sure, we can leave it as it's now for some time while a new interface is introduced/adopted. And than later deprecate cpus. So, you used a new name because the new behavior of -numa node,cpus=1-2,cpus=3-4 would be incompatible with the old. BTW, as I tried to touch exactly the same place yesterday (trying to convert it to QemuOpts) -- what does this node mean? For example, with -device [type=]devicetype,foo=bar,xzy=abc this creates a new device for each invocation of option. But what does this `-numa node' mean? Can there be anything else besides node? Why it is needed/used for? This -numa option is the last one which uses the old option parsing mechanism (there's also some smbios-related thing but it's simple to convert, I almost got it ready yesterday), but it is rather non-standard. Thanks, /mjt
Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
On Wed, Jun 19, 2013 at 01:14:17PM +0200, Paolo Bonzini wrote: Il 19/06/2013 12:50, Kevin Wolf ha scritto: +/* Publish progress */ +job-sectors_read += n; +job-common.offset += n * BDRV_SECTOR_SIZE; This is interesting, because the function is not only called by the background job, but also by write notifiers. So 'offset' in a literal sense doesn't make too much sense because we're not operating purely sequential. The QAPI documentation describes 'offset' like this: # @offset: the current progress value If we take it as just that, I think we could actually consider this code correct, because it's a useful measure for the progress (each sector is copied only once, either by the job or by a notifier), even though it really has nothing to do with an offset into the image. Yes, this is similar to what we do for mirroring. I think it is a feature. Yes, we explicitly defined offset to be an opaque progress value for this reason. I will still add a comment since the name offset does make the drive-backup code suggest the wrong thing. Stefan
Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
On Wed, Jun 19, 2013 at 01:19:34PM +0200, Paolo Bonzini wrote: Il 19/06/2013 12:50, Kevin Wolf ha scritto: + +DPRINTF(%s enter %s C% PRId64 % PRId64 %d\n, +__func__, bdrv_get_device_name(bs), start, sector_num, nb_sectors); Maybe put the first %s and __func__ directly into the DPRINTF macro? Or just use tracepoints. backup_do_cow could definitely be one, and it would subsume another DPRINTF (backup_run loop). hbitmap_get and block_job_completed are two other useful tracepoint that is not present. All that's left then are the DPRINTF for failed readv and writev, which could also be useful in generic code (bdrv_co_*_done). Can be done as a follow-up, of course. I need to respin anyway. The only reason for DPRINTF() is that I originally wanted to change as little as possible from the orginal patch. But we've gone so far we might as well do this too :). Stefan
[Qemu-devel] [PATCH] Graphics: Switch to 800x600x32 as default mode
We have stayed at 800x600x15 as default graphics mode for the last 9 years. If there ever was a reason to be there, surely nobody remembers it. However, recently non-Linux PPC guests started to show bad effects on 15 bit color mode. They do work just fine with 32 bits however. So let's switch to 32 bit color as the default graphic mode. Reported-by: Mark Cave-Ayland mark.cave-ayl...@ilande.co.uk Signed-off-by: Alexander Graf ag...@suse.de --- arch_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch_init.c b/arch_init.c index a8b91ee..8ba3d07 100644 --- a/arch_init.c +++ b/arch_init.c @@ -65,7 +65,7 @@ int graphic_depth = 8; #else int graphic_width = 800; int graphic_height = 600; -int graphic_depth = 15; +int graphic_depth = 32; #endif -- 1.8.1.4
Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
On Wed, Jun 19, 2013 at 12:50:16PM +0200, Kevin Wolf wrote: Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben: +DPRINTF(%s enter %s C% PRId64 % PRId64 %d\n, +__func__, bdrv_get_device_name(bs), start, sector_num, nb_sectors); Maybe put the first %s and __func__ directly into the DPRINTF macro? Will use trace events instead. +wait_for_overlapping_requests(job, start, end); +cow_request_begin(cow_request, job, start, end); + +total_sectors = bdrv_getlength(bs); +if (total_sectors 0) { +if (error_is_read) { +*error_is_read = true; +} +ret = total_sectors; +goto out; +} +total_sectors /= BDRV_SECTOR_SIZE; Why is this needed? There are two callers of the function: One is the write notifier, which has already a sector number that is checked against the image size. The other one is background job that gets the size once when it starts. I don't think there's a reason to call the block driver each time and potentially do an expensive request. At first I thought it has something to do with resizing images, but it's forbidden while a block job is running (otherwise the job's main loop exit condition would be wrong, too), so that doesn't make it necessary. Thanks, I will eliminate the call. +/* Publish progress */ +job-sectors_read += n; +job-common.offset += n * BDRV_SECTOR_SIZE; This is interesting, because the function is not only called by the background job, but also by write notifiers. So 'offset' in a literal sense doesn't make too much sense because we're not operating purely sequential. The QAPI documentation describes 'offset' like this: # @offset: the current progress value If we take it as just that, I think we could actually consider this code correct, because it's a useful measure for the progress (each sector is copied only once, either by the job or by a notifier), even though it really has nothing to do with an offset into the image. Maybe a comment would be appropriate. Will add the comment. +static BlockJobType backup_job_type = { +.instance_size = sizeof(BackupBlockJob), +.job_type = backup, +.set_speed = backup_set_speed, +.iostatus_reset = backup_iostatus_reset, +}; Align = on the same column? Should probably be const, too. Thanks for pointing the const out. I'll throw in alignment as a bonus, there is a team of ASCII artists here who do amazing whitespace work ;). +static void coroutine_fn backup_run(void *opaque) +{ +BackupBlockJob *job = opaque; +BlockDriverState *bs = job-common.bs; +BlockDriverState *target = job-target; +BlockdevOnError on_target_error = job-on_target_error; +NotifierWithReturn before_write = { +.notify = backup_before_write_notify, +}; +int64_t start, end; +int ret = 0; + +QLIST_INIT(job-inflight_reqs); +qemu_co_rwlock_init(job-flush_rwlock); + +start = 0; +end = DIV_ROUND_UP(bdrv_getlength(bs) / BDRV_SECTOR_SIZE, bdrv_getlength() can fail. Will fix. + BACKUP_SECTORS_PER_CLUSTER); + +job-bitmap = hbitmap_alloc(end, 0); + +bdrv_set_on_error(target, on_target_error, on_target_error); +bdrv_iostatus_enable(target); Mirroring also has: bdrv_set_enable_write_cache(s-target, true); Wouldn't it make sense for backup as well? I guess so. +/* we need to yield so that qemu_aio_flush() returns. + * (without, VM does not reboot) + */ +if (job-common.speed) { +uint64_t delay_ns = ratelimit_calculate_delay( +job-limit, job-sectors_read); +job-sectors_read = 0; +block_job_sleep_ns(job-common, rt_clock, delay_ns); Here's the other implication of counting the progress of copies initiated by write notifiers: The more copies they trigger, the less additional copies are made by the background job. I'm willing to count this as a feature rather than a bug. Yes, the guest does not get throttled by the block job. +} else { +block_job_sleep_ns(job-common, rt_clock, 0); +} + +if (block_job_is_cancelled(job-common)) { +break; +} + +DPRINTF(backup_run loop C% PRId64 \n, start); + +ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER, 1, +error_is_read); You're taking advantage of the fact that backup_do_cow() rounds this one sector up to 64k, which is a much more reasonable size. But why not specify BACKUP_SECTORS_PER_CLUSTER as nb_sectors when this is what you really assume? Sounds good though I need to double-check if we run into issues when hitting the end of the disk (if not aligned to BACKUP_SECTORS_PER_CLUSTER).
Re: [Qemu-devel] [PATCH] Graphics: Switch to 800x600x32 as default mode
On 20 June 2013 13:09, Alexander Graf ag...@suse.de wrote: We have stayed at 800x600x15 as default graphics mode for the last 9 years. If there ever was a reason to be there, surely nobody remembers it. However, recently non-Linux PPC guests started to show bad effects on 15 bit color mode. They do work just fine with 32 bits however. Have we identified what the actual problem with these guests is? 32 bit might be a more sensible default but it masks a bug doesn't seem like a very solid justification for the change... thanks -- PMM
Re: [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command
On Wed, Jun 19, 2013 at 01:07:42PM +0200, Kevin Wolf wrote: Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben: @drive-backup Start a point-in-time copy of a block device to a new destination. The status of ongoing drive-backup operations can be checked with query-block-jobs where the BlockJobInfo.type field has the value 'backup'. The operation can be stopped before it has completed using the block-job-cancel command. @device: the name of the device which should be copied. @target: the target of the new image. If the file exists, or if it is a device, the existing file/device will be used as the new destination. If it does not exist, a new file will be created. @format: #optional the format of the new destination, default is to probe if @mode is 'existing', else the format of the source @mode: #optional whether and how QEMU should create a new image, default is 'absolute-paths'. @speed: #optional the maximum speed, in bytes per second @on-source-error: #optional the action to take on an error on the source, default 'report'. 'stop' and 'enospc' can only be used if the block device supports io-status (see BlockInfo). @on-target-error: #optional the action to take on an error on the target, default 'report' (no limitations, since this applies to a different block device than @device). Note that @on-source-error and @on-target-error only affect background I/O. If an error occurs during a guest write request, the device's rerror/werror actions will be used. Returns: nothing on success If @device is not a valid block device, DeviceNotFound Since 1.6 Signed-off-by: Stefan Hajnoczi stefa...@redhat.com Reviewed-by: Kevin Wolf kw...@redhat.com But what about HMP, the series doesn't seem to have a separate patch to add a command there? I can send it as a follow-up if you want. The test cases use QMP and so do management tools - I didn't feel the need for HMP. Stefan
Re: [Qemu-devel] [PATCH v2] s390: Implement dump-guest-memory support for target s390x
On 12.06.2013, at 10:09, Jens Freimann wrote: From: Ekaterina Tumanova tuman...@linux.vnet.ibm.com With this patch dump-guest-memory on s390 produces an ELF formatted, crash-readable dump. In order to implement this, the arch-specific part of dump-guest-memory was added: target-s390x/arch_dump.c contains the whole set of function for writing Elf note sections of all types for s390x. Signed-off-by: Ekaterina Tumanova tuman...@linux.vnet.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com [...] diff --git a/target-s390x/cpu-qom.h b/target-s390x/cpu-qom.h index 34d45c2..4dfc8f1 100644 --- a/target-s390x/cpu-qom.h +++ b/target-s390x/cpu-qom.h @@ -72,5 +72,14 @@ static inline S390CPU *s390_env_get_cpu(CPUS390XState *env) #define ENV_OFFSET offsetof(S390CPU, env) void s390_cpu_do_interrupt(CPUState *cpu); +int s390_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs, + int cpuid, void *opaque); + +static inline int s390_cpu_write_elf64_qemunote(WriteCoreDumpFunction f, +CPUState *env, void *opaque) +{ +return 0; +} This one looks odd to me. cpu-qom.h shouldn't really have a static inline to not write notes :). Either put the function into arch_dump.c as well and only export the header here or drop the whole thing and make sure that write_elf64_qemunote == NULL still works. The rest looks good, without any guarantees that the format is actually correctly implemented. Alex + #endif diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c index 23fe51f..9376a6c 100644 --- a/target-s390x/cpu.c +++ b/target-s390x/cpu.c @@ -171,6 +171,10 @@ static void s390_cpu_class_init(ObjectClass *oc, void *data) cc-do_interrupt = s390_cpu_do_interrupt; dc-vmsd = vmstate_s390_cpu; +#ifndef CONFIG_USER_ONLY +cc-write_elf64_note = s390_cpu_write_elf64_note; +cc-write_elf64_qemunote = s390_cpu_write_elf64_qemunote; +#endif } static const TypeInfo s390_cpu_type_info = { -- 1.8.1.6
Re: [Qemu-devel] [PATCH] Graphics: Switch to 800x600x32 as default mode
On 20.06.2013, at 14:19, Peter Maydell wrote: On 20 June 2013 13:09, Alexander Graf ag...@suse.de wrote: We have stayed at 800x600x15 as default graphics mode for the last 9 years. If there ever was a reason to be there, surely nobody remembers it. However, recently non-Linux PPC guests started to show bad effects on 15 bit color mode. They do work just fine with 32 bits however. Have we identified what the actual problem with these guests is? 32 bit might be a more sensible default but it masks a bug doesn't seem like a very solid justification for the change... Since Linux works just fine, I'd say it falls under the category of QEMU diverges from typical test systems. Mark, have you explored the breakages any deeper? Alex
Re: [Qemu-devel] [PATCH] Add Xen platform PCI device version 2.
Paul Durrant paul.durr...@citrix.com writes: The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version of the Xen platform device (since the newer driver set cannot co-exist with previous drivers which bind to the existing xen-platform type of device). This patch introduces a new xen-platform-2 device type with the appropriate device_id and revision. Signed-off-by: Paul Durrant paul.durr...@citrix.com --- hw/xen/xen_platform.c| 75 ++ include/hw/pci/pci_ids.h |1 + 2 files changed, 63 insertions(+), 13 deletions(-) diff --git a/hw/xen/xen_platform.c b/hw/xen/xen_platform.c index b6c6793..6edb850 100644 --- a/hw/xen/xen_platform.c +++ b/hw/xen/xen_platform.c @@ -48,6 +48,20 @@ #define PFFLAG_ROM_LOCK 1 /* Sets whether ROM memory area is RW or RO */ +typedef struct { +const char *name; +const char *desc; +uint16_t device_id; +uint8_t revision; +uint16_t subsystem_vendor_id; +uint16_t subsystem_id; +} PCIXenPlatformDeviceInfo; + +typedef struct PCIXenPlatformDeviceClass { +PCIDeviceClass parent_class; +PCIXenPlatformDeviceInfoinfo; +} PCIXenPlatformDeviceClass; + typedef struct PCIXenPlatformState { PCIDevice pci_dev; MemoryRegion fixed_io; @@ -372,8 +386,13 @@ static const VMStateDescription vmstate_xen_platform = { static int xen_platform_initfn(PCIDevice *dev) { PCIXenPlatformState *d = DO_UPCAST(PCIXenPlatformState, pci_dev, dev); +PCIDeviceClass *k = PCI_DEVICE_GET_CLASS(dev); +__attribute__((unused)) PCIXenPlatformDeviceClass *u; uint8_t *pci_conf; +u = container_of(k, PCIXenPlatformDeviceClass, parent_class); +DPRINTF(initializing %s\n, u-info.name); + pci_conf = d-pci_dev.config; pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_IO | PCI_COMMAND_MEMORY); @@ -402,33 +421,63 @@ static void platform_reset(DeviceState *dev) platform_fixed_ioport_reset(s); } +static PCIXenPlatformDeviceInfo platform_devices[] = { +{ +.name = xen-platform, +.desc = XEN platform pci device (version 1), +.device_id = PCI_DEVICE_ID_XEN_PLATFORM, +.revision = 1, +.subsystem_vendor_id = PCI_VENDOR_ID_XEN, +.subsystem_id = PCI_DEVICE_ID_XEN_PLATFORM, +}, { +.name = xen-platform-2, +.desc = XEN platform pci device (version 2), +.device_id = PCI_DEVICE_ID_XEN_PLATFORM_V2, +.revision = 2, +.subsystem_vendor_id = PCI_VENDOR_ID_XEN, +.subsystem_id = PCI_DEVICE_ID_XEN_PLATFORM_V2, +} +}; + static void xen_platform_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); +PCIXenPlatformDeviceInfo *info = data; +PCIXenPlatformDeviceClass *u; + +u = container_of(k, PCIXenPlatformDeviceClass, parent_class); k-init = xen_platform_initfn; k-vendor_id = PCI_VENDOR_ID_XEN; -k-device_id = PCI_DEVICE_ID_XEN_PLATFORM; +k-device_id = info-device_id; k-class_id = PCI_CLASS_OTHERS 8 | 0x80; -k-subsystem_vendor_id = PCI_VENDOR_ID_XEN; -k-subsystem_id = PCI_DEVICE_ID_XEN_PLATFORM; -k-revision = 1; -dc-desc = XEN platform pci device; +k-subsystem_vendor_id = info-subsystem_vendor_id; +k-subsystem_id = info-subsystem_id; +k-revision = info-revision; +dc-desc = info-desc; dc-reset = platform_reset; dc-vmsd = vmstate_xen_platform; +u-info = *info; } -static const TypeInfo xen_platform_info = { -.name = xen-platform, -.parent= TYPE_PCI_DEVICE, -.instance_size = sizeof(PCIXenPlatformState), -.class_init= xen_platform_class_init, -}; - static void xen_platform_register_types(void) { -type_register_static(xen_platform_info); +TypeInfo type_info = { +.parent= TYPE_PCI_DEVICE, +.instance_size = sizeof(PCIXenPlatformState), +.class_size= sizeof(PCIXenPlatformDeviceClass), +.class_init= xen_platform_class_init, +}; +int i; +for (i = 0; i ARRAY_SIZE(platform_devices); i++) { +PCIXenPlatformDeviceInfo *info = platform_devices[i]; + +type_info.name = info-name; +type_info.class_data = info; + +type_register(type_info); +} I can't tell if this is an RFC or meant a complete patch. But the approach you're taking is overly complex. v2 of the device can just derive from v1 and in the class_init function change the PCI information. Also, if you are going to be adding logic for v2, you should use QOM cast macros, not container_of. I don't understand why two devices are required here and the thread doesn't really answer that either. Is there a spec for the Xen platform devices? Take a look at docs/specs for some examples in the tree.
Re: [Qemu-devel] [RFC PATCH 0/4] per-object libraries
Il 20/06/2013 12:06, Peter Maydell ha scritto: This only leaves Darwin. I have no idea about that, and I don't have anymore a machine to test it. Andreas or Peter, can you shed light? If you have something concrete you'd like me to test I can test it. You can modify the configure probe to detect GNU libtool (Mac OS X calls it glibtool, IIRC), and attach the log of make libcacard.la V=1. Paolo
Re: [Qemu-devel] [PATCH] Add Xen platform PCI device version 2.
-Original Message- I don't understand why two devices are required here and the thread doesn't really answer that either. Is there a spec for the Xen platform devices? Take a look at docs/specs for some examples in the tree. It certainly helps to have one for discussions like this. Anthony, I'm going to take a different approach so please disregard this patch now. Paul
[Qemu-devel] KVM call agenda for 2013-06-25
Please, send any topic that you are interested in covering. Thanks, MST -- MST
Re: [Qemu-devel] [RFC PATCH 0/4] per-object libraries
On 20 June 2013 13:39, Paolo Bonzini pbonz...@redhat.com wrote: Il 20/06/2013 12:06, Peter Maydell ha scritto: This only leaves Darwin. I have no idea about that, and I don't have anymore a machine to test it. Andreas or Peter, can you shed light? If you have something concrete you'd like me to test I can test it. You can modify the configure probe to detect GNU libtool (Mac OS X calls it glibtool, IIRC), and attach the log of make libcacard.la V=1. Unfortunately MacOSX doesn't have the 'nss' prereq so CONFIG_SMARTCARD_NSS isn't set and make just says make: *** No rule to make target `libcacard.la'. Stop. -- PMM
Re: [Qemu-devel] [RFC PATCH v6 3/3] Force auto-convegence of live migration
Il 14/06/2013 15:58, Chegu Vinod ha scritto: If a user chooses to turn on the auto-converge migration capability these changes detect the lack of convergence and throttle down the guest. i.e. force the VCPUs out of the guest for some duration and let the migration thread catchup and help converge. Hi Vinod, pretty much the same comments I sent you yesterday on the obsolete version of the patch still apply. Verified the convergence using the following : - Java Warehouse workload running on a 20VCPU/256G guest(~80% busy) - OLTP like workload running on a 80VCPU/512G guest (~80% busy) Sample results with Java warehouse workload : (migrate speed set to 20Gb and migrate downtime set to 4seconds). (qemu) info migrate capabilities: xbzrle: off auto-converge: off Migration status: active total time: 1487503 milliseconds expected downtime: 519 milliseconds transferred ram: 383749347 kbytes remaining ram: 2753372 kbytes total ram: 268444224 kbytes duplicate: 65461532 pages skipped: 64901568 pages normal: 95750218 pages normal bytes: 383000872 kbytes dirty pages rate: 67551 pages --- (qemu) info migrate capabilities: xbzrle: off auto-converge: on Migration status: completed total time: 241161 milliseconds downtime: 6373 milliseconds transferred ram: 28235307 kbytes remaining ram: 0 kbytes total ram: 268444224 kbytes duplicate: 64946416 pages skipped: 64903523 pages normal: 7044971 pages normal bytes: 28179884 kbytes Signed-off-by: Chegu Vinod chegu_vi...@hp.com --- arch_init.c | 85 +++ 1 files changed, 85 insertions(+), 0 deletions(-) diff --git a/arch_init.c b/arch_init.c index 5d32ecf..69c6c8c 100644 --- a/arch_init.c +++ b/arch_init.c @@ -104,6 +104,8 @@ int graphic_depth = 15; #endif const uint32_t arch_type = QEMU_ARCH; +static bool mig_throttle_on; +static void throttle_down_guest_to_converge(void); /***/ /* ram save/restore */ @@ -378,8 +380,15 @@ static void migration_bitmap_sync(void) uint64_t num_dirty_pages_init = migration_dirty_pages; MigrationState *s = migrate_get_current(); static int64_t start_time; +static int64_t bytes_xfer_prev; static int64_t num_dirty_pages_period; int64_t end_time; +int64_t bytes_xfer_now; +static int dirty_rate_high_cnt; + +if (!bytes_xfer_prev) { +bytes_xfer_prev = ram_bytes_transferred(); +} if (!start_time) { start_time = qemu_get_clock_ms(rt_clock); @@ -404,6 +413,23 @@ static void migration_bitmap_sync(void) /* more than 1 second = 1000 millisecons */ if (end_time start_time + 1000) { +if (migrate_auto_converge()) { +/* The following detection logic can be refined later. For now: + Check to see if the dirtied bytes is 50% more than the approx. + amount of bytes that just got transferred since the last time we + were in this routine. If that happens N times (for now N==4) + we turn on the throttle down logic */ +bytes_xfer_now = ram_bytes_transferred(); +if (s-dirty_pages_rate +((num_dirty_pages_period*TARGET_PAGE_SIZE) +((bytes_xfer_now - bytes_xfer_prev)/2))) { +if (dirty_rate_high_cnt++ 4) { Too many parentheses, and please remove the nested if. +DPRINTF(Unable to converge. Throtting down guest\n); Please use tracepoint instead. +mig_throttle_on = true; Need to reset dirty_rate_high_cnt here, and both dirty_rate_high_cnt/mig_throttle_on if you see !migrate_auto_converge(). This ensures that throttling does not kick in automatically if you disable and re-enable the feature. It also lets you remove a bunch of migrate_auto_converge() checks. You also need to reset dirty_rate_high_cnt/mig_throttle_on in the setup phase of migration. +} + } + bytes_xfer_prev = bytes_xfer_now; +} s-dirty_pages_rate = num_dirty_pages_period * 1000 / (end_time - start_time); s-dirty_bytes_rate = s-dirty_pages_rate * TARGET_PAGE_SIZE; @@ -628,6 +654,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) } total_sent += bytes_sent; acct_info.iterations++; +throttle_down_guest_to_converge(); You can use a shorter name, like check_cpu_throttling(). /* we want to check in the 1st loop, just in case it was the 1st time and we had to sync the dirty bitmap. qemu_get_clock_ns() is a bit expensive, so we only check each some @@ -1098,3 +1125,61 @@ TargetInfo *qmp_query_target(Error **errp) return info; } + +static bool throttling_needed(void) +{ +if (!migrate_auto_converge()) {
Re: [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers
On Thu, Jun 20, 2013 at 10:50:32AM +0200, Paolo Bonzini wrote: Il 20/06/2013 09:26, Fam Zheng ha scritto: On Fri, 06/14 11:48, Stefan Hajnoczi wrote: From: Paolo Bonzini pbonz...@redhat.com Fast TLS is not available on some platforms, but it is always nice to use it. This wrapper implementation falls back to pthread_get/setspecific on POSIX systems that lack __thread, but uses the dynamic linker's TLS support on Linux and Windows. The user shall call alloc_foo() in every thread that needs to access the variable---exactly once and before any access. foo is the name of the variable as passed to DECLARE_TLS and DEFINE_TLS. Then, get_foo() will return the address of the variable. It is guaranteed to remain the same across the lifetime of a thread, so you can cache it. Would tls_alloc_foo() and tls_get_foo() be easier to read and less possible for name conflict? Fine by me. Nice, idea. Will fix in the next version. Stefan
Re: [Qemu-devel] [PATCH v3] vl.c: Support multiple CPU ranges on -numa option
Il 20/06/2013 13:34, Michael Tokarev ha scritto: 20.06.2013 13:52, Paolo Bonzini wrote: Il 20/06/2013 11:30, Igor Mammedov ha scritto: libvirt uses the cpus option already, so we have to keep it working. Sure, we can leave it as it's now for some time while a new interface is introduced/adopted. And than later deprecate cpus. So, you used a new name because the new behavior of -numa node,cpus=1-2,cpus=3-4 would be incompatible with the old. BTW, as I tried to touch exactly the same place yesterday (trying to convert it to QemuOpts) -- what does this node mean? For example, with -device [type=]devicetype,foo=bar,xzy=abc this creates a new device for each invocation of option. But what does this `-numa node' mean? Can there be anything else besides node? Why it is needed/used for? Nothing, I think it's just that somebody took inspiration from -device. :) This -numa option is the last one which uses the old option parsing mechanism (there's also some smbios-related thing but it's simple to convert, I almost got it ready yesterday), That would be awesome. Paolo
Re: [Qemu-devel] [PATCH 1/2] PPC: Add dump_mmu() for 6xx
On 06/20/2013 01:16 PM, Alexander Graf wrote: On 18.06.2013, at 18:04, Fabien Chouteau wrote: On 06/18/2013 05:31 PM, Alexander Graf wrote: +for (type = 0; type 2; type++) You need braces on these. Please run your patch through checkpatch.pl :). I did ;) - ./scripts/checkpatch.pl 0001-PPC-Add-dump_mmu-for-6xx.patch total: 0 errors, 0 warnings, 51 lines checked 0001-PPC-Add-dump_mmu-for-6xx.patch has no obvious style problems and is ready for submission. Meh - broken script :). According to the CODING_STYLE convention all of the above need to be cluttered with braces ;). Will do. +for (way = 0; way env-nb_ways; way++) +for (entry = env-nb_tlb * type + env-tlb_per_way * way; + entry (env-nb_tlb * type + env-tlb_per_way * (way + 1)); + entry++) { + +tlb = env-tlb.tlb6[entry]; +cpu_fprintf(f, TLB %02d/%02d %s way:%d %s [ +TARGET_FMT_lx TARGET_FMT_lx ]\n, +entry % env-nb_tlb, env-nb_tlb, +type ? code : data, way, +pte_is_valid(tlb-pte0) ? valid : inval, +tlb-EPN, tlb-EPN + TARGET_PAGE_SIZE); +} I thought 6xx and 74xx also support HTAB and SRs? Shouldn't we dump those as well? I don't know what that is, can you send me an example of what the printf line should be? SRs are similar to the SLB that book3s_64 print out. Just that there are a fixed smaller number of them (16). Basically you'd dump the env-sr array, similar to how the debug functions in get_segment_6xx_tlb() dump it. For the HTAB I think SDR1 should be enough, so you don't need to do too much here. If you like, you can just dump the decoded fields env-htab_base and env-htab_mask. Dumping the whole HTAB would just explode the output. However, you also should definitely dump all (valid) BATs. Check out get_bat_6xx_tlb() for debug code that dumps BATs. Ok I'll have a look at that, and BATs should definitely be printed out. -- Fabien Chouteau
[Qemu-devel] [PULL 00/21] pci,net,misc enhancements
From: Michael S. Tsirkin m...@redhat.com The following changes since commit 90a2541b763b31d2b551b07e24aae3de5266d31b: target-i386: fix over 80 chars warnings (2013-06-15 17:50:38 +) are available in the git repository at: git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_anthony for you to fetch changes up to f96c30047009f8a9c3cecf68104d8d99f989f54d: pci: Fold host_buses list into PCIHostState functionality (2013-06-19 18:35:05 +0300) pci,net,misc enhancements This includes some pci and net-related enhancements: Better support for systems with multiple PCI root buses A new management interface for access to rx filter in NICs KVM Speedup for MSI updates on kvm FW cfg interface for more robust pci programming in BIOS Minor fixes/cleanups for fw cfg and cross-version migration - because of dependencies with other patches Signed-off-by: Michael S. Tsirkin m...@redhat.com Amos Kong (1): net: add support of mac-programming over macvtap in QEMU side Andrew Jones (1): e1000: cleanup process_tx_desc David Gibson (10): pci: Cleanup configuration for pci-hotplug.c pci: Move pci_read_devaddr to pci-hotplug-old.c pci: Abolish pci_find_root_bus() pci: Use helper to find device's root bus in pci_find_domain() pci: Replace pci_find_domain() with more general pci_root_bus_path() pci: Add root bus argument to pci_get_bus_devfn() pci: Add root bus parameter to pci_nic_init() pci: Simpler implementation of primary PCI bus pci: Remove domain from PCIHostBus pci: Fold host_buses list into PCIHostState functionality Michael S. Tsirkin (9): range: add Range structure pci: store PCI hole ranges in guestinfo structure pc: pass PCI hole ranges to Guests pc_piix: cleanup init compat handling kvm: zero-initialize KVM_SET_GSI_ROUTING input kvm: skip system call when msi route is unchanged MAINTAINERS: s/Marcelo/Paolo/ pvpanic: initialization cleanup pvpanic: fix fwcfg for big endian hosts MAINTAINERS | 2 +- QMP/qmp-events.txt | 17 default-configs/i386-softmmu.mak| 3 +- default-configs/ppc64-softmmu.mak | 2 - default-configs/x86_64-softmmu.mak | 3 +- hmp-commands.hx | 4 +- hw/alpha/dp264.c| 2 +- hw/arm/realview.c | 6 +- hw/arm/versatilepb.c| 2 +- hw/i386/pc.c| 74 ++- hw/i386/pc_piix.c | 40 +--- hw/i386/pc_q35.c| 18 +++- hw/mips/mips_fulong2e.c | 6 +- hw/mips/mips_malta.c| 6 +- hw/misc/pvpanic.c | 31 --- hw/net/e1000.c | 18 ++-- hw/net/virtio-net.c | 111 ++ hw/pci-host/piix.c | 9 ++ hw/pci-host/q35.c | 17 hw/pci/Makefile.objs| 2 +- hw/pci/{pci-hotplug.c = pci-hotplug-old.c} | 75 --- hw/pci/pci.c| 137 ++-- hw/pci/pci_host.c | 1 + hw/pci/pcie_aer.c | 9 +- hw/ppc/e500.c | 2 +- hw/ppc/mac_newworld.c | 2 +- hw/ppc/mac_oldworld.c | 2 +- hw/ppc/ppc440_bamboo.c | 2 +- hw/ppc/prep.c | 2 +- hw/ppc/spapr.c | 2 +- hw/ppc/spapr_pci.c | 10 ++ hw/sh4/r2d.c| 5 +- hw/sparc64/sun4u.c | 2 +- include/hw/i386/pc.h| 22 - include/hw/pci-host/q35.h | 2 + include/hw/pci/pci.h| 17 ++-- include/hw/pci/pci_host.h | 12 +++ include/monitor/monitor.h | 1 + include/net/net.h | 3 + include/qemu/range.h| 16 include/qemu/typedefs.h | 1 + kvm-all.c | 23 +++-- monitor.c | 1 + net/net.c | 47 ++ qapi-schema.json| 75 +++ qmp-commands.hx | 63 + 46 files changed, 733 insertions(+), 174 deletions(-) rename hw/pci/{pci-hotplug.c = pci-hotplug-old.c} (78%)
[Qemu-devel] [PULL 01/21] range: add Range structure
Sometimes we need to pass ranges around, add a handy structure for this purpose. Note: memory.c defines its own concept of AddrRange structure for working with 128 addresses. It's necessary there for doing range math. This is not needed for most users: struct Range is much simpler, and is only used for passing the range around. Cc: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Michael S. Tsirkin m...@redhat.com --- include/qemu/range.h | 16 1 file changed, 16 insertions(+) diff --git a/include/qemu/range.h b/include/qemu/range.h index 3502372..b76cc0d 100644 --- a/include/qemu/range.h +++ b/include/qemu/range.h @@ -1,6 +1,22 @@ #ifndef QEMU_RANGE_H #define QEMU_RANGE_H +#include inttypes.h + +/* + * Operations on 64 bit address ranges. + * Notes: + * - ranges must not wrap around 0, but can include the last byte ~0x0LL. + * - this can not represent a full 0 to ~0x0LL range. + */ + +/* A structure representing a range of addresses. */ +struct Range { +uint64_t begin; /* First byte of the range, or 0 if empty. */ +uint64_t end; /* 1 + the last byte. 0 if range empty or ends at ~0x0LL. */ +}; +typedef struct Range Range; + /* Get last byte of a range from offset + length. * Undefined for ranges that wrap around 0. */ static inline uint64_t range_get_last(uint64_t offset, uint64_t len) -- MST
[Qemu-devel] [PULL 03/21] pc: pass PCI hole ranges to Guests
Guest currently has to jump through lots of hoops to guess the PCI hole ranges. It's fragile, and makes us change BIOS each time we add a new chipset. Let's report the window in a ROM file, to make BIOS do exactly what QEMU intends. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/i386/pc.c | 26 ++ hw/i386/pc_piix.c| 16 +++- hw/i386/pc_q35.c | 12 ++-- include/hw/i386/pc.h | 1 + 4 files changed, 52 insertions(+), 3 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index c685ee8..e1ed760 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -990,6 +990,31 @@ void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge) } } +/* pci-info ROM file. Little endian format */ +typedef struct PcRomPciInfo { +uint64_t w32_min; +uint64_t w32_max; +uint64_t w64_min; +uint64_t w64_max; +} PcRomPciInfo; + +static void pc_fw_cfg_guest_info(PcGuestInfo *guest_info) +{ +PcRomPciInfo *info; +if (!guest_info-has_pci_info) { +return; +} + +info = g_malloc(sizeof *info); +info-w32_min = cpu_to_le64(guest_info-pci_info.w32.begin); +info-w32_max = cpu_to_le64(guest_info-pci_info.w32.end); +info-w64_min = cpu_to_le64(guest_info-pci_info.w64.begin); +info-w64_max = cpu_to_le64(guest_info-pci_info.w64.end); +/* Pass PCI hole info to guest via a side channel. + * Required so guest PCI enumeration does the right thing. */ +fw_cfg_add_file(guest_info-fw_cfg, etc/pci-info, info, sizeof *info); +} + typedef struct PcGuestInfoState { PcGuestInfo info; Notifier machine_done; @@ -1001,6 +1026,7 @@ void pc_guest_info_machine_done(Notifier *notifier, void *data) PcGuestInfoState *guest_info_state = container_of(notifier, PcGuestInfoState, machine_done); +pc_fw_cfg_guest_info(guest_info_state-info); } PcGuestInfo *pc_guest_info_init(ram_addr_t below_4g_mem_size, diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index a70a0d9..d87da95 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -57,6 +57,7 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 }; static const int ide_irq[MAX_IDE_BUS] = { 14, 15 }; static bool has_pvpanic = true; +static bool has_pci_info = true; /* PC hardware initialisation */ static void pc_init1(MemoryRegion *system_memory, @@ -121,6 +122,7 @@ static void pc_init1(MemoryRegion *system_memory, } guest_info = pc_guest_info_init(below_4g_mem_size, above_4g_mem_size); +guest_info-has_pci_info = has_pci_info; /* Set PCI window size the way seabios has always done it. */ /* Power of 2 so bios can cover it with a single MTRR */ @@ -258,8 +260,15 @@ static void pc_init_pci(QEMUMachineInitArgs *args) initrd_filename, cpu_model, 1, 1); } +static void pc_init_pci_1_5(QEMUMachineInitArgs *args) +{ +has_pci_info = false; +pc_init_pci(args); +} + static void pc_init_pci_1_4(QEMUMachineInitArgs *args) { +has_pci_info = false; has_pvpanic = false; x86_cpu_compat_set_features(n270, FEAT_1_ECX, 0, CPUID_EXT_MOVBE); pc_init_pci(args); @@ -267,6 +276,7 @@ static void pc_init_pci_1_4(QEMUMachineInitArgs *args) static void pc_init_pci_1_3(QEMUMachineInitArgs *args) { +has_pci_info = false; enable_compat_apic_id_mode(); has_pvpanic = false; pc_init_pci(args); @@ -275,6 +285,7 @@ static void pc_init_pci_1_3(QEMUMachineInitArgs *args) /* PC machine init function for pc-1.1 to pc-1.2 */ static void pc_init_pci_1_2(QEMUMachineInitArgs *args) { +has_pci_info = false; disable_kvm_pv_eoi(); enable_compat_apic_id_mode(); has_pvpanic = false; @@ -284,6 +295,7 @@ static void pc_init_pci_1_2(QEMUMachineInitArgs *args) /* PC machine init function for pc-0.14 to pc-1.0 */ static void pc_init_pci_1_0(QEMUMachineInitArgs *args) { +has_pci_info = false; disable_kvm_pv_eoi(); enable_compat_apic_id_mode(); has_pvpanic = false; @@ -300,6 +312,7 @@ static void pc_init_pci_no_kvmclock(QEMUMachineInitArgs *args) const char *initrd_filename = args-initrd_filename; const char *boot_device = args-boot_device; has_pvpanic = false; +has_pci_info = false; disable_kvm_pv_eoi(); enable_compat_apic_id_mode(); pc_init1(get_system_memory(), @@ -318,6 +331,7 @@ static void pc_init_isa(QEMUMachineInitArgs *args) const char *initrd_filename = args-initrd_filename; const char *boot_device = args-boot_device; has_pvpanic = false; +has_pci_info = false; if (cpu_model == NULL) cpu_model = 486; disable_kvm_pv_eoi(); @@ -353,7 +367,7 @@ static QEMUMachine pc_i440fx_machine_v1_6 = { static QEMUMachine pc_i440fx_machine_v1_5 = { .name = pc-i440fx-1.5, .desc = Standard PC (i440FX + PIIX, 1996), -.init = pc_init_pci, +.init = pc_init_pci_1_5,
[Qemu-devel] [PULL 05/21] e1000: cleanup process_tx_desc
From: Andrew Jones drjo...@redhat.com Coverity complains about two overruns in process_tx_desc(). The complaints are false positives, but we might as well eliminate them. The problem is that hdr is defined as an unsigned int, but then used to offset an array of size 65536, and another of size 256 bytes. hdr will actually never be greater than 255 though, as it's assigned only once and to the value of tp-hdr_len, which is an uint8_t. This patch simply gets rid of hdr, replacing it with tp-hdr_len, which makes it consistent with all other tp member use in the function. v2: - also cleanup coding style issues in the touched lines Signed-off-by: Andrew Jones drjo...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/net/e1000.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/hw/net/e1000.c b/hw/net/e1000.c index e6f46f0..620f947 100644 --- a/hw/net/e1000.c +++ b/hw/net/e1000.c @@ -556,7 +556,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp) uint32_t txd_lower = le32_to_cpu(dp-lower.data); uint32_t dtype = txd_lower (E1000_TXD_CMD_DEXT | E1000_TXD_DTYP_D); unsigned int split_size = txd_lower 0x, bytes, sz, op; -unsigned int msh = 0xf, hdr = 0; +unsigned int msh = 0xf; uint64_t addr; struct e1000_context_desc *xp = (struct e1000_context_desc *)dp; struct e1000_tx *tp = s-tx; @@ -603,8 +603,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp) addr = le64_to_cpu(dp-buffer_addr); if (tp-tse tp-cptse) { -hdr = tp-hdr_len; -msh = hdr + tp-mss; +msh = tp-hdr_len + tp-mss; do { bytes = split_size; if (tp-size + bytes msh) @@ -612,14 +611,16 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp) bytes = MIN(sizeof(tp-data) - tp-size, bytes); pci_dma_read(s-dev, addr, tp-data + tp-size, bytes); -if ((sz = tp-size + bytes) = hdr tp-size hdr) -memmove(tp-header, tp-data, hdr); +sz = tp-size + bytes; +if (sz = tp-hdr_len tp-size tp-hdr_len) { +memmove(tp-header, tp-data, tp-hdr_len); +} tp-size = sz; addr += bytes; if (sz == msh) { xmit_seg(s); -memmove(tp-data, tp-header, hdr); -tp-size = hdr; +memmove(tp-data, tp-header, tp-hdr_len); +tp-size = tp-hdr_len; } } while (split_size -= bytes); } else if (!tp-tse tp-cptse) { @@ -633,8 +634,9 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp) if (!(txd_lower E1000_TXD_CMD_EOP)) return; -if (!(tp-tse tp-cptse tp-size hdr)) +if (!(tp-tse tp-cptse tp-size tp-hdr_len)) { xmit_seg(s); +} tp-tso_frames = 0; tp-sum_needed = 0; tp-vlan_needed = 0; -- MST
[Qemu-devel] [PULL 06/21] kvm: zero-initialize KVM_SET_GSI_ROUTING input
kvm_add_routing_entry makes an attempt to zero-initialize any new routing entry. However, it fails to initialize padding within the u field of the structure kvm_irq_routing_entry. Other functions like kvm_irqchip_update_msi_route also fail to initialize the padding field in kvm_irq_routing_entry. While mostly harmless, this would prevent us from reusing these fields for something useful in the future. It's better to just make sure all input is initialized. Once it is, we can also drop complex field by field assignment and just do the simple *a = *b to update a route entry. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- kvm-all.c | 19 +++ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index 405480e..f119ce1 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1006,11 +1006,8 @@ static void kvm_add_routing_entry(KVMState *s, } n = s-irq_routes-nr++; new = s-irq_routes-entries[n]; -memset(new, 0, sizeof(*new)); -new-gsi = entry-gsi; -new-type = entry-type; -new-flags = entry-flags; -new-u = entry-u; + +*new = *entry; set_gsi(s, entry-gsi); @@ -1029,9 +1026,7 @@ static int kvm_update_routing_entry(KVMState *s, continue; } -entry-type = new_entry-type; -entry-flags = new_entry-flags; -entry-u = new_entry-u; +*entry = *new_entry; kvm_irqchip_commit_routes(s); @@ -1043,7 +1038,7 @@ static int kvm_update_routing_entry(KVMState *s, void kvm_irqchip_add_irq_route(KVMState *s, int irq, int irqchip, int pin) { -struct kvm_irq_routing_entry e; +struct kvm_irq_routing_entry e = {}; assert(pin s-gsi_count); @@ -1156,7 +1151,7 @@ int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg) return virq; } -route = g_malloc(sizeof(KVMMSIRoute)); +route = g_malloc0(sizeof(KVMMSIRoute)); route-kroute.gsi = virq; route-kroute.type = KVM_IRQ_ROUTING_MSI; route-kroute.flags = 0; @@ -1177,7 +1172,7 @@ int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg) int kvm_irqchip_add_msi_route(KVMState *s, MSIMessage msg) { -struct kvm_irq_routing_entry kroute; +struct kvm_irq_routing_entry kroute = {}; int virq; if (!kvm_gsi_routing_enabled()) { @@ -1203,7 +1198,7 @@ int kvm_irqchip_add_msi_route(KVMState *s, MSIMessage msg) int kvm_irqchip_update_msi_route(KVMState *s, int virq, MSIMessage msg) { -struct kvm_irq_routing_entry kroute; +struct kvm_irq_routing_entry kroute = {}; if (!kvm_irqchip_in_kernel()) { return -ENOSYS; -- MST
[Qemu-devel] [PULL 04/21] pc_piix: cleanup init compat handling
Make sure 1.4 calls 1.5, 1.3 calls 1.4 etc. This way it's enough to add enough new compat hook in a single place in piix. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/i386/pc_piix.c | 18 -- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index d87da95..234424a 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -268,38 +268,28 @@ static void pc_init_pci_1_5(QEMUMachineInitArgs *args) static void pc_init_pci_1_4(QEMUMachineInitArgs *args) { -has_pci_info = false; has_pvpanic = false; x86_cpu_compat_set_features(n270, FEAT_1_ECX, 0, CPUID_EXT_MOVBE); -pc_init_pci(args); +pc_init_pci_1_5(args); } static void pc_init_pci_1_3(QEMUMachineInitArgs *args) { -has_pci_info = false; enable_compat_apic_id_mode(); -has_pvpanic = false; -pc_init_pci(args); +pc_init_pci_1_4(args); } /* PC machine init function for pc-1.1 to pc-1.2 */ static void pc_init_pci_1_2(QEMUMachineInitArgs *args) { -has_pci_info = false; disable_kvm_pv_eoi(); -enable_compat_apic_id_mode(); -has_pvpanic = false; -pc_init_pci(args); +pc_init_pci_1_3(args); } /* PC machine init function for pc-0.14 to pc-1.0 */ static void pc_init_pci_1_0(QEMUMachineInitArgs *args) { -has_pci_info = false; -disable_kvm_pv_eoi(); -enable_compat_apic_id_mode(); -has_pvpanic = false; -pc_init_pci(args); +pc_init_pci_1_2(args); } /* PC init function for pc-0.10 to pc-0.13, and reused by xenfv */ -- MST