date:20101019

Re: [Qemu-devel] [PATCH 0/7] ATAPI CDROM passthrough v5

2010-10-19 Thread Alexander Graf


On 19.10.2010 at 02:10, Anthony Liguori anth...@codemonkey.ws wrote:

> On 10/18/2010 06:29 PM, Alexander Graf wrote:
>>> A user will get a really nasty surprise if they think they can use a flag
>>> or rely on QEMU to prevent a VM from doing something nasty with a device.
>>> If they have this feeling of security, they're likely to chmod the device
>>> to allow unprivileged users to access it.
>>>
>>> But how a device handles ATAPI commands is totally up to the device.  If
>>> you issue the wrong sequence, I'm sure there are devices out there that
>>> totally hose themselves.  Are you absolutely confident that every ATAPI
>>> device out there is completely safe against hostile code provided that you
>>> simply prevent the FW update commands?  I'm certainly not.
>>
>> Ping?
>
>
> Who are you pinging?

Mostly Ian. I haven't seen any follow-up on this discussion and would like to
know why, and whether there are still plans to upstream this code :).

Alex

[Qemu-devel] [Tracing][v4 PATCH 0/2] QMP Query interfaces for tracing

2010-10-19 Thread Prerna Saxena

This patch set introduces three QMP query interfaces for tracing :

* query-trace: to list current contents of trace-buffer
* query-trace-events : to list all available trace-events with their 
   state.
* query-trace-file   : to list currently set trace-file with its status.
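
As a rough illustration of the record shape that query-trace is described as
returning (timestamp, event, arg1 .. arg6), the sketch below formats one
record as a JSON object. TraceRecordSketch and trace_record_to_json() are
hypothetical names used only here; the patch itself builds QObjects with
qobject_from_jsonf() rather than formatting strings by hand.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the trace buffer. */
typedef struct {
    uint64_t timestamp_ns;
    uint64_t event;
    uint64_t args[6];
} TraceRecordSketch;

/*
 * Format one record as a JSON object into buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small.
 */
static int trace_record_to_json(const TraceRecordSketch *rec,
                                char *buf, size_t len)
{
    int n;

    assert(rec != NULL && buf != NULL);
    n = snprintf(buf, len,
                 "{\"timestamp\": %" PRIu64 ", \"event\": %" PRIu64
                 ", \"arg1\": %" PRIu64 ", \"arg2\": %" PRIu64
                 ", \"arg3\": %" PRIu64 ", \"arg4\": %" PRIu64
                 ", \"arg5\": %" PRIu64 ", \"arg6\": %" PRIu64 "}",
                 rec->timestamp_ns, rec->event,
                 rec->args[0], rec->args[1], rec->args[2],
                 rec->args[3], rec->args[4], rec->args[5]);
    if (n < 0 || (size_t)n >= len) {
        return -1;
    }
    return n;
}
```

A QMP client would see a list of such objects under "return"; the field
widths and signedness here are assumptions of the sketch.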

Changelog :
---
Changes v3 -> v4 :
- Add 'query-trace-file' interface to query currently active trace-file.
- Cleanup.

Changes v2 -> v3 :
- Change declarations of st_print_trace_to_qlist() and
  st_print_trace_events_to_qlist() to return QList*

Changes v1 -> v2 :
- Add 'timestamp' field for query-trace output.
- Misc cleanups.


-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India

[Qemu-devel] [Tracing][v4 PATCH 1/2] Introduce QMP interfaces

2010-10-19 Thread Prerna Saxena

[PATCH 1/2] Introduce QMP interfaces :
 - query-trace
 - query-trace-events
 - query-trace-file


Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 monitor.c |   53 ---
 simpletrace.c |   69 +
 simpletrace.h |5 
 3 files changed, 123 insertions(+), 4 deletions(-)

diff --git a/monitor.c b/monitor.c
index 260cc02..c7e1f53 100644
--- a/monitor.c
+++ b/monitor.c
@@ -578,6 +578,11 @@ static void do_trace_file(Monitor *mon, const QDict *qdict)
         help_cmd(mon, "trace-file");
     }
 }
+
+static void do_info_trace_file_to_qmp(Monitor *mon, QObject **ret_data)
+{
+    *ret_data = st_print_file_to_qobject();
+}
 #endif
 
 static void user_monitor_complete(void *opaque, QObject *ret_data)
@@ -945,15 +950,27 @@ static void do_info_cpu_stats(Monitor *mon)
 #endif
 
 #if defined(CONFIG_SIMPLE_TRACE)
-static void do_info_trace(Monitor *mon)
+static void do_info_trace_print(Monitor *mon, const QObject *data)
 {
 st_print_trace((FILE *)mon, monitor_fprintf);
 }
 
-static void do_info_trace_events(Monitor *mon)
+static void do_info_trace(Monitor *mon, QObject **ret_data)
+{
+QList *trace_event_list = st_print_trace_to_qlist();
+*ret_data = QOBJECT(trace_event_list);
+}
+
+static void do_info_trace_events_print(Monitor *mon, const QObject *data)
 {
 st_print_trace_events((FILE *)mon, monitor_fprintf);
 }
+
+static void do_info_trace_events(Monitor *mon, QObject **ret_data)
+{
+QList *trace_event_list = st_print_trace_events_to_qlist();
+*ret_data = QOBJECT(trace_event_list);
+}
 #endif
 
 /**
@@ -2610,14 +2627,16 @@ static const mon_cmd_t info_cmds[] = {
         .args_type  = "",
         .params     = "",
         .help       = "show current contents of trace buffer",
-        .mhandler.info = do_info_trace,
+        .user_print = do_info_trace_print,
+        .mhandler.info_new = do_info_trace,
     },
     {
         .name       = "trace-events",
         .args_type  = "",
         .params     = "",
         .help       = "show available trace-events & their state",
-        .mhandler.info = do_info_trace_events,
+        .user_print = do_info_trace_events_print,
+        .mhandler.info_new = do_info_trace_events,
     },
 #endif
 {
@@ -2752,6 +2771,32 @@ static const mon_cmd_t qmp_query_cmds[] = {
         .mhandler.info_async = do_info_balloon,
         .flags      = MONITOR_CMD_ASYNC,
     },
+#if defined(CONFIG_SIMPLE_TRACE)
+    {
+        .name       = "trace",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show current contents of trace buffer",
+        .user_print = do_info_trace_print,
+        .mhandler.info_new = do_info_trace,
+    },
+    {
+        .name       = "trace-events",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show available trace-events & their state",
+        .user_print = do_info_trace_events_print,
+        .mhandler.info_new = do_info_trace_events,
+    },
+    {
+        .name       = "trace-file",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show currently active trace output file and its status",
+        .user_print = monitor_user_noop,
+        .mhandler.info_new = do_info_trace_file_to_qmp,
+    },
+#endif
     { /* NULL */ },
 };
 
diff --git a/simpletrace.c b/simpletrace.c
index deb1e07..d24d6b0 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -220,6 +220,43 @@ void st_print_trace(FILE *stream, int (*stream_printf)(FILE *stream, const char
     }
 }
 
+/**
+ * Add the current contents of trace-buffer as a QList.
+ *
+ */
+QList* st_print_trace_to_qlist(void)
+{
+    QObject *data;
+    QList *tlist;
+    unsigned int i;
+
+    tlist = qlist_new();
+
+    for (i = 0; i < trace_idx; i++) {
+      data = qobject_from_jsonf("{"
+                                " 'timestamp': %" PRId64 ","
+                                " 'event': %" PRId64 ","
+                                " 'arg1': %" PRId64 ","
+                                " 'arg2': %" PRId64 ","
+                                " 'arg3': %" PRId64 ","
+                                " 'arg4': %" PRId64 ","
+                                " 'arg5': %" PRId64 ","
+                                " 'arg6': %" PRId64
+                                "}",
+                                trace_buf[i].timestamp_ns,
+                                trace_buf[i].event,
+                                trace_buf[i].x1,
+                                trace_buf[i].x2,
+                                trace_buf[i].x3,
+                                trace_buf[i].x4,
+                                trace_buf[i].x5,
+                                trace_buf[i].x6);
+      qlist_append_obj(tlist, data);
+    }
+
+    return tlist;
+}
+
 void st_print_trace_events(FILE *stream, int (*stream_printf)(FILE *stream, const char *fmt, ...))
 {
     unsigned int i;
@@ -230,6 +267,38 @@

[Qemu-devel] [Tracing][v4 PATCH 2/2] Add documentation for QMP interfaces

2010-10-19 Thread Prerna Saxena

[PATCH 2/2] Add documentation for QMP commands:
 - query-trace
 - query-trace-events
 - query-trace-file.


Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 qmp-commands.hx |   94 +++
 1 files changed, 94 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 793cf1c..bc79b55 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1539,3 +1539,97 @@ Example:
 
 EQMP
 
+SQMP
+query-trace
+-----------
+
+Show contents of trace buffer.
+
+Returns a set of json-objects containing the following data:
+
+- "event": Event ID for the trace-event (json-int)
+- "timestamp": trace timestamp (json-int)
+- "arg1" .. "arg6": Arguments logged by the trace-event (json-int)
+
+Example:
+
+-> { "execute": "query-trace" }
+<- {
+      "return":[ {
+         "event": 22,
+         "timestamp": 129456235912365,
+         "arg1": 886,
+         "arg2": 80,
+         "arg3": 0,
+         "arg4": 0,
+         "arg5": 0,
+         "arg6": 0
+       },
+       {
+         "event": 22,
+         "timestamp": 129456235973407,
+         "arg1": 886,
+         "arg2": 80,
+         "arg3": 0,
+         "arg4": 0,
+         "arg5": 0,
+         "arg6": 0
+       },
+       ...
+       ] }
+
+EQMP
+
+SQMP
+query-trace-events
+------------------
+
+Show all available trace-events & their state.
+
+Returns a set of json-objects containing the following data:
+
+- "name": Name of Trace-event (json-string)
+- "event-id": Event ID of Trace-event (json-int)
+- "state": State of trace-event [ '0': inactive; '1':active  ] (json-int)
+
+Example:
+
+-> { "execute": "query-trace-events" }
+<- {
+      "return":[ {
+         "name": "qemu_malloc",
+         "event-id": 0,
+         "state": 0
+      },
+      {
+         "name": "qemu_realloc",
+         "event-id": 1,
+         "state": 0
+      },
+      ...
+      ] }
+
+EQMP
+
+SQMP
+query-trace-file
+----------------
+
+Display currently set trace file name and its status.
+
+Returns a set of json-objects containing the following data:
+
+- "trace-file": Name of Trace-file (json-string)
+- "status": State of trace-file [ '0': disabled; '1':enabled  ] (json-int)
+
+Example:
+
+-> { "execute": "query-trace-file" }
+<- {
+      "return":{
+         "trace-file": "trace-26609",
+         "status": 1
+      }
+   }
+
+EQMP
-- 
1.7.2.2



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India

Re: [Qemu-devel] Re: [PATCH 1/2] pci: Automatically patch PCI vendor id and device id in PCI ROM

2010-10-19 Thread Gerd Hoffmann


On 10/18/10 21:36, Stefan Weil wrote:

There is already some kind of error feedback: the rom will not work.
For etherboot roms, booting from network won't work.


VGA works, after hacking the vgabios to not have the PCI ID hardcoded 
elsewhere.


Nevertheless /me gets the feeling that we had better not take that
route.  vgabios needs special patching to work.  etherboot does not work
as-is.  Even if we make it work now, it will always be fragile.  The next
rom update might break it again.  Automatic adaptation of the ID doesn't
happen on real hardware ...


cheers,
  Gerd

Re: [Qemu-devel] [PATCH 1/2] pci: Automatically patch PCI vendor id and device id in PCI ROM

2010-10-19 Thread Michael S. Tsirkin

On Mon, Oct 18, 2010 at 09:11:55PM +0200, Stefan Weil wrote:
> QEMU must only make sure that patching of the supported roms
> with supported devices work.

I think that's what Anthony was saying too - make this depend
on a qdev property and set it only in eepro100 for now.

-- 
MST

[Qemu-devel] [PATCH v5 04/14] pci/bridge: fix pci_bridge_reset()

2010-10-19 Thread Isaku Yamahata

The default values of the base/limit registers aren't specified in the spec,
so pci_bridge_reset() shouldn't touch them.
Instead, introduce two functions that reset those registers the way typical
implementations do: zero the base/limit registers, or disable forwarding.
They will be used later.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
Changes v4 -> v5:
- drop the lines in pci_bridge_reset()
- introduced two functions to reset base/limit registers.
---
 hw/pci_bridge.c |   57 +++---
 hw/pci_bridge.h |2 +
 2 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/hw/pci_bridge.c b/hw/pci_bridge.c
index 638e3b3..de75e6a 100644
--- a/hw/pci_bridge.c
+++ b/hw/pci_bridge.c
@@ -151,6 +151,46 @@ void pci_bridge_write_config(PCIDevice *d,
 }
 }
 
+void pci_bridge_reset_zero_base_limit(PCIDevice *dev)
+{
+    uint8_t *conf = dev->config;
+
+    pci_byte_test_and_clear_mask(conf + PCI_IO_BASE,
+                                 PCI_IO_RANGE_MASK & 0xff);
+    pci_byte_test_and_clear_mask(conf + PCI_IO_LIMIT,
+                                 PCI_IO_RANGE_MASK & 0xff);
+    pci_word_test_and_clear_mask(conf + PCI_MEMORY_BASE,
+                                 PCI_MEMORY_RANGE_MASK & 0xffff);
+    pci_word_test_and_clear_mask(conf + PCI_MEMORY_LIMIT,
+                                 PCI_MEMORY_RANGE_MASK & 0xffff);
+    pci_word_test_and_clear_mask(conf + PCI_PREF_MEMORY_BASE,
+                                 PCI_PREF_RANGE_MASK & 0xffff);
+    pci_word_test_and_clear_mask(conf + PCI_PREF_MEMORY_LIMIT,
+                                 PCI_PREF_RANGE_MASK & 0xffff);
+    pci_set_word(conf + PCI_PREF_BASE_UPPER32, 0);
+    pci_set_word(conf + PCI_PREF_LIMIT_UPPER32, 0);
+}
+
+void pci_bridge_reset_disable_base_limit(PCIDevice *dev)
+{
+    uint8_t *conf = dev->config;
+
+    pci_byte_test_and_set_mask(conf + PCI_IO_BASE,
+                               PCI_IO_RANGE_MASK & 0xff);
+    pci_byte_test_and_clear_mask(conf + PCI_IO_LIMIT,
+                                 PCI_IO_RANGE_MASK & 0xff);
+    pci_word_test_and_set_mask(conf + PCI_MEMORY_BASE,
+                               PCI_MEMORY_RANGE_MASK & 0xffff);
+    pci_word_test_and_clear_mask(conf + PCI_MEMORY_LIMIT,
+                                 PCI_MEMORY_RANGE_MASK & 0xffff);
+    pci_word_test_and_set_mask(conf + PCI_PREF_MEMORY_BASE,
+                               PCI_PREF_RANGE_MASK & 0xffff);
+    pci_word_test_and_clear_mask(conf + PCI_PREF_MEMORY_LIMIT,
+                                 PCI_PREF_RANGE_MASK & 0xffff);
+    pci_set_word(conf + PCI_PREF_BASE_UPPER32, 0);
+    pci_set_word(conf + PCI_PREF_LIMIT_UPPER32, 0);
+}
+
 /* reset bridge specific configuration registers */
 void pci_bridge_reset_reg(PCIDevice *dev)
 {
@@ -161,14 +201,15 @@ void pci_bridge_reset_reg(PCIDevice *dev)
 conf[PCI_SUBORDINATE_BUS] = 0;
 conf[PCI_SEC_LATENCY_TIMER] = 0;
 
-conf[PCI_IO_BASE] = 0;
-conf[PCI_IO_LIMIT] = 0;
-pci_set_word(conf + PCI_MEMORY_BASE, 0);
-pci_set_word(conf + PCI_MEMORY_LIMIT, 0);
-pci_set_word(conf + PCI_PREF_MEMORY_BASE, 0);
-pci_set_word(conf + PCI_PREF_MEMORY_LIMIT, 0);
-pci_set_word(conf + PCI_PREF_BASE_UPPER32, 0);
-pci_set_word(conf + PCI_PREF_LIMIT_UPPER32, 0);
+    /*
+     * the default values for base/limit registers aren't specified
+     * in the PCI-to-PCI-bridge spec. So we don't touch them here.
+     * Each implementation can override it.
+     * typical implementation does
+     * - zero registers: pci_bridge_reset_zero_base_limit()
+     * or
+     * - disable forwarding: pci_bridge_reset_disable_base_limit()
+     */
 
 pci_set_word(conf + PCI_BRIDGE_CONTROL, 0);
 }
diff --git a/hw/pci_bridge.h b/hw/pci_bridge.h
index f6fade0..2359684 100644
--- a/hw/pci_bridge.h
+++ b/hw/pci_bridge.h
@@ -39,6 +39,8 @@ pcibus_t pci_bridge_get_limit(const PCIDevice *bridge, uint8_t type);
 
 void pci_bridge_write_config(PCIDevice *d,
  uint32_t address, uint32_t val, int len);
+void pci_bridge_reset_zero_base_limit(PCIDevice *dev);
+void pci_bridge_reset_disable_base_limit(PCIDevice *dev);
 void pci_bridge_reset_reg(PCIDevice *dev);
 void pci_bridge_reset(DeviceState *qdev);
 
-- 
1.7.1.1

[Qemu-devel] [PATCH v5 13/14] pcie/hotplug: introduce pushing attention button command

2010-10-19 Thread Isaku Yamahata

glue pcie_push_attention_button command.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
 hw/pcie_port.c  |   82 +++
 qemu-monitor.hx |   14 +
 sysemu.h|4 +++
 3 files changed, 100 insertions(+), 0 deletions(-)

diff --git a/hw/pcie_port.c b/hw/pcie_port.c
index 117de61..f43a1c7 100644
--- a/hw/pcie_port.c
+++ b/hw/pcie_port.c
@@ -18,6 +18,10 @@
  * with this program; if not, see http://www.gnu.org/licenses/.
  */
 
+#include "qemu-objects.h"
+#include "sysemu.h"
+#include "monitor.h"
+#include "pcie.h"
 #include "pcie_port.h"
 
 void pcie_port_init_reg(PCIDevice *d)
@@ -114,3 +118,81 @@ void pcie_chassis_del_slot(PCIESlot *s)
 {
 QLIST_REMOVE(s, next);
 }
+
+/**
+ * glue for qemu monitor
+ */
+
+/* Parse [chassis.]slot, return -1 on error */
+static int pcie_parse_slot_addr(const char* slot_addr,
+                                uint8_t *chassisp, uint16_t *slotp)
+{
+    const char *p;
+    char *e;
+    unsigned long val;
+    unsigned long chassis = 0;
+    unsigned long slot;
+
+    p = slot_addr;
+    val = strtoul(p, &e, 0);
+    if (e == p) {
+        return -1;
+    }
+    if (*e == '.') {
+        chassis = val;
+        p = e + 1;
+        val = strtoul(p, &e, 0);
+        if (e == p) {
+            return -1;
+        }
+    }
+    slot = val;
+
+    if (*e) {
+        return -1;
+    }
+
+    if (chassis > 0xff || slot > 0xffff) {
+        return -1;
+    }
+
+    *chassisp = chassis;
+    *slotp = slot;
+    return 0;
+}
+
+void pcie_attention_button_push_print(Monitor *mon, const QObject *data)
+{
+    QDict *qdict;
+
+    assert(qobject_type(data) == QTYPE_QDICT);
+    qdict = qobject_to_qdict(data);
+
+    monitor_printf(mon, "OK chassis %d, slot %d\n",
+                   (int) qdict_get_int(qdict, "chassis"),
+                   (int) qdict_get_int(qdict, "slot"));
+}
+
+int pcie_attention_button_push(Monitor *mon, const QDict *qdict,
+                               QObject **ret_data)
+{
+    const char* pcie_slot = qdict_get_str(qdict, "pcie_slot");
+    uint8_t chassis;
+    uint16_t slot;
+    PCIESlot *s;
+
+    if (pcie_parse_slot_addr(pcie_slot, &chassis, &slot) < 0) {
+        monitor_printf(mon, "invalid pcie slot address %s\n", pcie_slot);
+        return -1;
+    }
+    s = pcie_chassis_find_slot(chassis, slot);
+    if (!s) {
+        monitor_printf(mon, "slot is not found. %s\n", pcie_slot);
+        return -1;
+    }
+    pcie_cap_slot_push_attention_button(&s->port.br.dev);
+    *ret_data = qobject_from_jsonf("{ 'chassis': %d, 'slot': %d}",
+                                   chassis, slot);
+    assert(*ret_data);
+    return 0;
+}
diff --git a/qemu-monitor.hx b/qemu-monitor.hx
index 2af3de6..965c754 100644
--- a/qemu-monitor.hx
+++ b/qemu-monitor.hx
@@ -1154,6 +1154,20 @@ Hot remove PCI device.
 ETEXI
 
     {
+        .name       = "pcie_push_attention_button",
+        .args_type  = "pcie_slot:s",
+        .params     = "[chassis.]slot",
+        .help       = "push pci express attention button",
+        .user_print  = pcie_attention_button_push_print,
+        .mhandler.cmd_new = pcie_attention_button_push,
+    },
+
+STEXI
+@item pcie_abp
+Push PCI express attention button
+ETEXI
+
+    {
         .name       = "host_net_add",
         .args_type  = "device:s,opts:s?",
         .params     = "tap|user|socket|vde|dump [options]",
diff --git a/sysemu.h b/sysemu.h
index 9c988bb..cca411d 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -150,6 +150,10 @@ extern unsigned int nb_prom_envs;
 void pci_device_hot_add(Monitor *mon, const QDict *qdict);
 void drive_hot_add(Monitor *mon, const QDict *qdict);
 void do_pci_device_hot_remove(Monitor *mon, const QDict *qdict);
+/* pcie hotplug */
+void pcie_attention_button_push_print(Monitor *mon, const QObject *data);
+int pcie_attention_button_push(Monitor *mon, const QDict *qdict,
+   QObject **ret_data);
 
 /* serial ports */
 
-- 
1.7.1.1
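
The "[chassis.]slot" syntax accepted by the monitor command above can be
tried out in isolation. The following is a standalone sketch of the same
strtoul()-based parsing; sk_parse_slot_addr() is a hypothetical name, and the
0xff / 0xffff limits are taken from the uint8_t / uint16_t output types in
the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse "[chassis.]slot". Numbers may be decimal, octal or hex
 * (strtoul() with base 0). Returns 0 on success and -1 on error;
 * the outputs are only written on success.
 */
static int sk_parse_slot_addr(const char *slot_addr,
                              uint8_t *chassisp, uint16_t *slotp)
{
    const char *p = slot_addr;
    char *e;
    unsigned long val;
    unsigned long chassis = 0;

    assert(slot_addr != NULL && chassisp != NULL && slotp != NULL);

    val = strtoul(p, &e, 0);
    if (e == p) {
        return -1;              /* no digits at all */
    }
    if (*e == '.') {
        chassis = val;          /* first number was the chassis */
        p = e + 1;
        val = strtoul(p, &e, 0);
        if (e == p) {
            return -1;          /* nothing after the dot */
        }
    }
    if (*e != '\0') {
        return -1;              /* trailing garbage */
    }
    if (chassis > 0xff || val > 0xffff) {
        return -1;              /* out of range for the register widths */
    }

    *chassisp = (uint8_t)chassis;
    *slotp = (uint16_t)val;
    return 0;
}
```

With this reading, "3" means slot 3 on chassis 0 and "1.0x10" means slot 16
on chassis 1, while "1." and "1.2x" are rejected.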

[Qemu-devel] [PATCH v5 02/14] pci: introduce helper function to handle msi-x and msi.

2010-10-19 Thread Isaku Yamahata

this patch implements helper functions to handle msi-x and msi
uniformly.
They will be used later.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
 hw/pci.c |   19 +++
 hw/pci.h |3 +++
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index e3462a9..300079f 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -25,6 +25,8 @@
 #include "pci.h"
 #include "pci_bridge.h"
 #include "pci_internals.h"
+#include "msix.h"
+#include "msi.h"
 #include "monitor.h"
 #include "net.h"
 #include "sysemu.h"
@@ -1034,6 +1036,23 @@ static void pci_set_irq(void *opaque, int irq_num, int level)
     pci_change_irq_level(pci_dev, irq_num, change);
 }
 
+bool pci_msi_enabled(PCIDevice *dev)
+{
+    return msix_enabled(dev) || msi_enabled(dev);
+}
+
+void pci_msi_notify(PCIDevice *dev, unsigned int vector)
+{
+    if (msix_enabled(dev)) {
+        msix_notify(dev, vector);
+    } else if (msi_enabled(dev)) {
+        msi_notify(dev, vector);
+    } else {
+        /* MSI/MSI-X must be enabled */
+        abort();
+    }
+}
+
 /***/
 /* monitor info on PCI */
 
diff --git a/hw/pci.h b/hw/pci.h
index 752e652..3072a5f 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -239,6 +239,9 @@ void do_pci_info_print(Monitor *mon, const QObject *data);
 void do_pci_info(Monitor *mon, QObject **ret_data);
 void pci_bridge_update_mappings(PCIBus *b);
 
+bool pci_msi_enabled(PCIDevice *dev);
+void pci_msi_notify(PCIDevice *dev, unsigned int vector);
+
 static inline void
 pci_set_byte(uint8_t *config, uint8_t val)
 {
-- 
1.7.1.1

[Qemu-devel] [PATCH v5 14/14] pcie/aer: glue aer error injection into qemu monitor

2010-10-19 Thread Isaku Yamahata

introduce pcie_aer_inject_error command.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
Changes v3 -> v4:
- s/PCIE_AER/PCIEAER/g for structure names.
- compilation adjustment.

Changes v2 -> v3:
- compilation adjustment.
---
 hw/pcie_aer.c   |   84 +++
 qemu-monitor.hx |   22 ++
 sysemu.h|5 +++
 3 files changed, 111 insertions(+), 0 deletions(-)

diff --git a/hw/pcie_aer.c b/hw/pcie_aer.c
index 1b023b0..97d3e2e 100644
--- a/hw/pcie_aer.c
+++ b/hw/pcie_aer.c
@@ -19,6 +19,8 @@
  */
 
 #include "sysemu.h"
+#include "qemu-objects.h"
+#include "monitor.h"
 #include "pci_bridge.h"
 #include "pcie.h"
 #include "msix.h"
@@ -783,3 +785,85 @@ const VMStateDescription vmstate_pcie_aer_log = {
     }
 };
 
+void pcie_aer_inject_error_print(Monitor *mon, const QObject *data)
+{
+    QDict *qdict;
+    int devfn;
+    assert(qobject_type(data) == QTYPE_QDICT);
+    qdict = qobject_to_qdict(data);
+
+    devfn = (int)qdict_get_int(qdict, "devfn");
+    monitor_printf(mon, "OK domain: %x, bus: %x devfn: %x.%x\n",
+                   (int) qdict_get_int(qdict, "domain"),
+                   (int) qdict_get_int(qdict, "bus"),
+                   PCI_SLOT(devfn), PCI_FUNC(devfn));
+}
+
+int do_pcie_aer_inejct_error(Monitor *mon,
+                             const QDict *qdict, QObject **ret_data)
+{
+    const char *pci_addr = qdict_get_str(qdict, "pci_addr");
+    int dom;
+    int bus;
+    unsigned int slot;
+    unsigned int func;
+    PCIDevice *dev;
+    PCIEAERErr err;
+
+    /* Ideally qdev device path should be used.
+     * However at the moment there is no reliable way to determine
+     * whether a given qdev is pci device or not.
+     * so pci_addr is used.
+     */
+    if (pci_parse_devaddr(pci_addr, &dom, &bus, &slot, &func)) {
+        monitor_printf(mon, "invalid pci address %s\n", pci_addr);
+        return -1;
+    }
+    dev = pci_find_device(pci_find_root_bus(dom), bus, slot, func);
+    if (!dev) {
+        monitor_printf(mon, "device is not found. 0x%x:0x%x.0x%x\n",
+                       bus, slot, func);
+        return -1;
+    }
+    if (!pci_is_express(dev)) {
+        monitor_printf(mon, "the device doesn't support pci express. "
+                       "0x%x:0x%x.0x%x\n",
+                       bus, slot, func);
+        return -1;
+    }
+
+    err.status = qdict_get_int(qdict, "error_status");
+    err.source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn;
+
+    err.flags = 0;
+    if (qdict_get_int(qdict, "is_correctable")) {
+        err.flags |= PCIE_AER_ERR_IS_CORRECTABLE;
+    }
+    if (qdict_get_int(qdict, "advisory_non_fatal")) {
+        err.flags |= PCIE_AER_ERR_MAYBE_ADVISORY;
+    }
+    if (qdict_haskey(qdict, "tlph0")) {
+        err.flags |= PCIE_AER_ERR_HEADER_VALID;
+    }
+    if (qdict_haskey(qdict, "hpfx0")) {
+        err.flags |= PCIE_AER_ERR_TLP_PRESENT;
+    }
+
+    err.header[0] = qdict_get_try_int(qdict, "tlph0", 0);
+    err.header[1] = qdict_get_try_int(qdict, "tlph1", 0);
+    err.header[2] = qdict_get_try_int(qdict, "tlph2", 0);
+    err.header[3] = qdict_get_try_int(qdict, "tlph3", 0);
+
+    err.prefix[0] = qdict_get_try_int(qdict, "hpfx0", 0);
+    err.prefix[1] = qdict_get_try_int(qdict, "hpfx1", 0);
+    err.prefix[2] = qdict_get_try_int(qdict, "hpfx2", 0);
+    err.prefix[3] = qdict_get_try_int(qdict, "hpfx3", 0);
+
+    pcie_aer_inject_error(dev, &err);
+    *ret_data = qobject_from_jsonf("{ 'domain': %d, 'bus': %d, 'devfn': %d }",
+                                   pci_find_domain(dev->bus),
+                                   pci_bus_num(dev->bus), dev->devfn);
+    assert(*ret_data);
+
+    return 0;
+}
diff --git a/qemu-monitor.hx b/qemu-monitor.hx
index 965c754..ccb3d0e 100644
--- a/qemu-monitor.hx
+++ b/qemu-monitor.hx
@@ -1168,6 +1168,28 @@ Push PCI express attention button
 ETEXI
 
     {
+        .name       = "pcie_aer_inject_error",
+        .args_type  = "advisory_non_fatal:-a,is_correctable:-c,"
+                      "pci_addr:s,error_status:i,"
+                      "tlph0:i?,tlph1:i?,tlph2:i?,tlph3:i?,"
+                      "hpfx0:i?,hpfx1:i?,hpfx2:i?,hpfx3:i?",
+        .params     = "[-a] [-c] [[domain:]bus:]slot.func "
+                      "error status:32bit "
+                      "[tlp header:(32bit x 4)] "
+                      "[tlp header prefix:(32bit x 4)]",
+        .help       = "inject pcie aer error "
+                      "(use -a for advisory non fatal error) "
+                      "(use -c for correctable error)",
+        .user_print  = pcie_aer_inject_error_print,
+        .mhandler.cmd_new = do_pcie_aer_inejct_error,
+    },
+
+STEXI
+@item pcie_aer_inject_error
+Inject PCI express AER error
+ETEXI
+
+    {
         .name       = "host_net_add",
         .args_type  = "device:s,opts:s?",
         .params     = "tap|user|socket|vde|dump [options]",
diff --git a/sysemu.h b/sysemu.h
index cca411d..2f7157c 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -155,6 +155,11 @@ void pcie_attention_button_push_print(Monitor *mon, const 
QObject

[Qemu-devel] [PATCH v5 01/14] pci: introduce helper functions to test-and-{clear, set} mask in configuration space

2010-10-19 Thread Isaku Yamahata

This patch introduces helper functions to test-and-{clear, set} mask in
configuration space: pci_{byte, word, long, quad}_test_and_{clear, set}_mask().
They will be used later.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
 hw/pci.h |   70 ++
 1 files changed, 70 insertions(+), 0 deletions(-)

diff --git a/hw/pci.h b/hw/pci.h
index d8b399f..752e652 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -323,6 +323,76 @@ pci_config_set_interrupt_pin(uint8_t *pci_config, uint8_t val)
     pci_set_byte(&pci_config[PCI_INTERRUPT_PIN], val);
 }
 
+/*
+ * helper functions to do bit mask operation on configuration space.
+ * Just to set bit, use test-and-set and discard returned value.
+ * Just to clear bit, use test-and-clear and discard returned value.
+ * NOTE: They aren't atomic.
+ */
+static inline uint8_t
+pci_byte_test_and_clear_mask(uint8_t *config, uint8_t mask)
+{
+    uint8_t val = pci_get_byte(config);
+    pci_set_byte(config, val & ~mask);
+    return val & mask;
+}
+
+static inline uint8_t
+pci_byte_test_and_set_mask(uint8_t *config, uint8_t mask)
+{
+    uint8_t val = pci_get_byte(config);
+    pci_set_byte(config, val | mask);
+    return val & mask;
+}
+
+static inline uint16_t
+pci_word_test_and_clear_mask(uint8_t *config, uint16_t mask)
+{
+    uint16_t val = pci_get_word(config);
+    pci_set_word(config, val & ~mask);
+    return val & mask;
+}
+
+static inline uint16_t
+pci_word_test_and_set_mask(uint8_t *config, uint16_t mask)
+{
+    uint16_t val = pci_get_word(config);
+    pci_set_word(config, val | mask);
+    return val & mask;
+}
+
+static inline uint32_t
+pci_long_test_and_clear_mask(uint8_t *config, uint32_t mask)
+{
+    uint32_t val = pci_get_long(config);
+    pci_set_long(config, val & ~mask);
+    return val & mask;
+}
+
+static inline uint32_t
+pci_long_test_and_set_mask(uint8_t *config, uint32_t mask)
+{
+    uint32_t val = pci_get_long(config);
+    pci_set_long(config, val | mask);
+    return val & mask;
+}
+
+static inline uint64_t
+pci_quad_test_and_clear_mask(uint8_t *config, uint64_t mask)
+{
+    uint64_t val = pci_get_quad(config);
+    pci_set_quad(config, val & ~mask);
+    return val & mask;
+}
+
+static inline uint64_t
+pci_quad_test_and_set_mask(uint8_t *config, uint64_t mask)
+{
+    uint64_t val = pci_get_quad(config);
+    pci_set_quad(config, val | mask);
+    return val & mask;
+}
+
 typedef int (*pci_qdev_initfn)(PCIDevice *dev);
 typedef struct {
     DeviceInfo qdev;
-- 
1.7.1.1

[Qemu-devel] [PATCH v5 03/14] pci: use pci_word_test_and_clear_mask() in pci_device_reset()

2010-10-19 Thread Isaku Yamahata

use pci_word_test_and_clear_mask() in pci_device_reset() where appropriate.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
Changes v4 -> v5
- use pci_word_test_and_clear_mask()
---
 hw/pci.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 300079f..409e2c0 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -139,9 +139,8 @@ static void pci_device_reset(PCIDevice *dev)
     dev->irq_state = 0;
     pci_update_irq_status(dev);
     /* Clear all writeable bits */
-    pci_set_word(dev->config + PCI_COMMAND,
-                 pci_get_word(dev->config + PCI_COMMAND) &
-                 ~pci_get_word(dev->wmask + PCI_COMMAND));
+    pci_word_test_and_clear_mask(dev->config + PCI_COMMAND,
+                                 pci_get_word(dev->wmask + PCI_COMMAND));
     dev->config[PCI_CACHE_LINE_SIZE] = 0x0;
     dev->config[PCI_INTERRUPT_LINE] = 0x0;
     for (r = 0; r < PCI_NUM_REGIONS; ++r) {
-- 
1.7.1.1

[Qemu-devel] [PATCH v5 00/14] pcie port switch emulators

2010-10-19 Thread Isaku Yamahata

Here is v5 of the pcie patch series.
I hope I have addressed the blockers.
One note on the uncorrectable error status register in pcie_aer_write_config():
the register is RW1CS, so making it writable and using test-and-clear doesn't
work.
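
A small sketch of why write-1-to-clear needs special handling, under the
assumption that "writable" means the generic config write path simply stores
the guest value. The sk_ functions are hypothetical and only model the
register arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Write-1-to-clear: every bit the guest writes as 1 is cleared,
 * bits written as 0 keep their old value.
 */
static uint32_t sk_w1c_write(uint32_t old_status, uint32_t written)
{
    uint32_t result = old_status & ~written;

    assert((result & written) == 0);
    return result;
}

/*
 * Plain writable register, for contrast: the written value simply
 * replaces the old one in the bits selected by wmask.
 */
static uint32_t sk_plain_write(uint32_t old_status, uint32_t written,
                               uint32_t wmask)
{
    return (old_status & ~wmask) | (written & wmask);
}
```

With an old status of 0x00011010 and a guest write of 0x00001000, the W1C
result is 0x00010010 (only that one bit is acknowledged), while a plain
writable register would end up holding 0x00001000. Once the plain store has
happened the old value is gone, which appears to be why the x3130 write_config
handler in this series samples the status before calling the generic write
path.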

new patches: 1, 2
updated patches (beyond trivial changes): 4, 7, 8

BTW, as 0.13 is released, any chance to sync pci branch with
the upstream by requesting pull?

Patch description:
This patch series implements pcie port switch emulators,
which are a basic part of pcie/q35 support.
This is for mst/pci tree.

changes v4 -> v5:
- introduced pci_xxx_test_and_clear/set_mask
- eliminated xxx_notify(msi_trigger, int_level)
- eliminated FLR bits.
  FLR will be addressed at the next phase.

changes v3 -> v4:
- introduced new pci config helper functions. (clear/set bit)
- various clean up and some bug fixes.
- dropped pci_shift_xxx().
- dropped function pointer in pcie_aer.h
- dropped pci_exp_cap(), pcie_aer_cap().
- file rename (pcie_{root, upstream, downstream} => ioh3420, x3130).

changes v2 -> v3:
- msi: improved comment and simplified shift/ffs dance
- pci w1c config register framework
- split pcie.[ch] into pcie_regs.h, pcie.[ch] and pcie_aer.[ch]
- pcie, aer: many changes by following reviews.

changes v1 -> v2:
- update msi
- dropped already pushed out patches.
- added msix patches.

Isaku Yamahata (14):
  pci: introduce helper functions to test-and-{clear, set} mask in
configuration space
  pci: introduce helper function to handle msi-x and msi.
  pci: use pci_word_test_and_clear_mask() in pci_device_reset()
  pci/bridge: fix pci_bridge_reset()
  msi: implements msi
  pcie: add pcie constants to pcie_regs.h
  pcie: helper functions for pcie capability and extended capability
  pcie/aer: helper functions for pcie aer capability
  pcie port: define struct PCIEPort/PCIESlot and helper functions
  ioh3420: pcie root port in X58 ioh
  x3130: pcie upstream port
  x3130: pcie downstream port
  pcie/hotplug: introduce pushing attention button command
  pcie/aer: glue aer error injection into qemu monitor

 Makefile.objs   |4 +-
 hw/ioh3420.c|  229 +
 hw/ioh3420.h|   10 +
 hw/msi.c|  352 +++
 hw/msi.h|   41 +++
 hw/pci.c|   24 ++-
 hw/pci.h|   88 +-
 hw/pci_bridge.c |   57 +++-
 hw/pci_bridge.h |2 +
 hw/pcie.c   |  540 +
 hw/pcie.h   |  113 ++
 hw/pcie_aer.c   |  869 +++
 hw/pcie_aer.h   |  105 ++
 hw/pcie_port.c  |  198 +++
 hw/pcie_port.h  |   51 +++
 hw/pcie_regs.h  |  154 +
 hw/xio3130_downstream.c |  197 +++
 hw/xio3130_downstream.h |   11 +
 hw/xio3130_upstream.c   |  181 ++
 hw/xio3130_upstream.h   |   10 +
 qemu-common.h   |6 +
 qemu-monitor.hx |   36 ++
 sysemu.h|9 +
 23 files changed, 3272 insertions(+), 15 deletions(-)
 create mode 100644 hw/ioh3420.c
 create mode 100644 hw/ioh3420.h
 create mode 100644 hw/msi.c
 create mode 100644 hw/msi.h
 create mode 100644 hw/pcie.c
 create mode 100644 hw/pcie.h
 create mode 100644 hw/pcie_aer.c
 create mode 100644 hw/pcie_aer.h
 create mode 100644 hw/pcie_port.c
 create mode 100644 hw/pcie_port.h
 create mode 100644 hw/pcie_regs.h
 create mode 100644 hw/xio3130_downstream.c
 create mode 100644 hw/xio3130_downstream.h
 create mode 100644 hw/xio3130_upstream.c
 create mode 100644 hw/xio3130_upstream.h

[Qemu-devel] Re: [Tracing][v4 PATCH 1/2] Introduce QMP interfaces

2010-10-19 Thread Stefan Hajnoczi

On Tue, Oct 19, 2010 at 11:55:50AM +0530, Prerna Saxena wrote:
> [PATCH 1/2] Introduce QMP interfaces :
>  - query-trace
>  - query-trace-events
>  - query-trace-file
>
>
> Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
> ---
>  monitor.c |   53 ---
>  simpletrace.c |   69 +
>  simpletrace.h |5 
>  3 files changed, 123 insertions(+), 4 deletions(-)

Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

[Qemu-devel] [PATCH v5 11/14] x3130: pcie upstream port

2010-10-19 Thread Isaku Yamahata

Implement TI x3130 pcie upstream port switch.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
Changes v4 -> v5:
- remove flr related stuff.
  This will be addressed at the next phase.
- use pci_xxx_test_and_xxx_mask().

Changes v3 -> v4:
- rename pcie_upstream -> x3130_upstream.
- compilation adjustment.

Changes v2 -> v3:
- compilation adjustment.
---
 Makefile.objs |2 +-
 hw/xio3130_upstream.c |  181 +
 hw/xio3130_upstream.h |   10 +++
 3 files changed, 192 insertions(+), 1 deletions(-)
 create mode 100644 hw/xio3130_upstream.c
 create mode 100644 hw/xio3130_upstream.h

diff --git a/Makefile.objs b/Makefile.objs
index cf7d2e9..d61e88a 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -140,7 +140,7 @@ hw-obj-y =
 hw-obj-y += vl.o loader.o
 hw-obj-y += virtio.o virtio-console.o
 hw-obj-y += fw_cfg.o pci.o pci_host.o pcie_host.o pci_bridge.o
-hw-obj-y += ioh3420.o
+hw-obj-y += ioh3420.o xio3130_upstream.o
 hw-obj-y += watchdog.o
 hw-obj-$(CONFIG_ISA_MMIO) += isa_mmio.o
 hw-obj-$(CONFIG_ECC) += ecc.o
diff --git a/hw/xio3130_upstream.c b/hw/xio3130_upstream.c
new file mode 100644
index 000..cba2b09
--- /dev/null
+++ b/hw/xio3130_upstream.c
@@ -0,0 +1,181 @@
+/*
+ * xio3130_upstream.c
+ * TI X3130 pci express upstream port switch
+ *
+ * Copyright (c) 2010 Isaku Yamahata yamahata at valinux co jp
+ *VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include pci_ids.h
+#include msi.h
+#include pcie.h
+#include xio3130_upstream.h
+
+#define PCI_DEVICE_ID_TI_XIO3130U   0x8232  /* upstream port */
+#define XIO3130_REVISION0x2
+#define XIO3130_MSI_OFFSET  0x70
+#define XIO3130_MSI_SUPPORTED_FLAGS PCI_MSI_FLAGS_64BIT
+#define XIO3130_MSI_NR_VECTOR   1
+#define XIO3130_SSVID_OFFSET0x80
+#define XIO3130_SSVID_SVID  0
+#define XIO3130_SSVID_SSID  0
+#define XIO3130_EXP_OFFSET  0x90
+#define XIO3130_AER_OFFSET  0x100
+
+static void xio3130_upstream_write_config(PCIDevice *d, uint32_t address,
+                                          uint32_t val, int len)
+{
+    uint32_t uncorsta =
+        pci_get_long(d->config + d->exp.aer_cap + PCI_ERR_UNCOR_STATUS);
+
+    pci_bridge_write_config(d, address, val, len);
+    pcie_cap_flr_write_config(d, address, val, len);
+    msi_write_config(d, address, val, len);
+    pcie_aer_write_config(d, address, val, len, uncorsta);
+    pci_clear_written_write_config(d, address, val, len);
+}
+
+static void xio3130_upstream_reset(DeviceState *qdev)
+{
+    PCIDevice *d = DO_UPCAST(PCIDevice, qdev, qdev);
+    msi_reset(d);
+    pci_bridge_reset_zero_base_limit(d);
+    pci_bridge_reset(qdev);
+    pcie_cap_deverr_reset(d);
+}
+
+static int xio3130_upstream_initfn(PCIDevice *d)
+{
+    PCIBridge* br = DO_UPCAST(PCIBridge, dev, d);
+    PCIEPort *p = DO_UPCAST(PCIEPort, br, br);
+    int rc;
+
+    rc = pci_bridge_initfn(d);
+    if (rc < 0) {
+        return rc;
+    }
+
+    pcie_port_init_reg(d);
+    pci_config_set_vendor_id(d->config, PCI_VENDOR_ID_TI);
+    pci_config_set_device_id(d->config, PCI_DEVICE_ID_TI_XIO3130U);
+    d->config[PCI_REVISION_ID] = XIO3130_REVISION;
+
+    rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR,
+                  XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
+                  XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT);
+    if (rc < 0) {
+        return rc;
+    }
+    rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
+                               XIO3130_SSVID_SVID, XIO3130_SSVID_SSID);
+    if (rc < 0) {
+        return rc;
+    }
+    rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_UPSTREAM,
+                       p->port);
+    if (rc < 0) {
+        return rc;
+    }
+
+    /* TODO: implement FLR */
+    pcie_cap_flr_init(d);
+
+    pcie_cap_deverr_init(d);
+    pcie_aer_init(d, XIO3130_AER_OFFSET);
+
+    return 0;
+}
+
+static int xio3130_upstream_exitfn(PCIDevice *d)
+{
+    pcie_aer_exit(d);
+    msi_uninit(d);
+    pcie_cap_exit(d);
+    return pci_bridge_exitfn(d);
+}
+
+PCIEPort *xio3130_upstream_init(PCIBus *bus, int devfn, bool multifunction,
+                                const char *bus_name, pci_map_irq_fn map_irq,
+                                uint8_t port)
+{
+    PCIDevice *d;
+

[Qemu-devel] [PATCH v5 08/14] pcie/aer: helper functions for pcie aer capability

2010-10-19 Thread Isaku Yamahata

This patch implements helper functions for pcie aer capability
which will be used later.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
Changes v4 -> v5:
- use pci_xxx_test_and_xxx_mask()
- rewrote PCIDevice::written bits.
- eliminated pcie_aer_notify()
- introduced PCIExpressDevice::aer_intx

Changes v3 -> v4:
- various naming fixes.
- use pci bit operation helper function
- eliminate errmsg function pointer
- replace pci_shift_xxx() with PCIDevice::written
- uncorrect error status register.
- dropped pcie_aer_cap()

Changes v2 -> v3:
- split out from pcie.[ch] to pcie_aer.[ch] to make the files shorter.
- embedded PCIExpressDevice into PCIDevice.
- CodingStyle fix
---
 Makefile.objs |2 +-
 hw/pcie.h |6 +
 hw/pcie_aer.c |  785 +
 hw/pcie_aer.h |  105 
 qemu-common.h |3 +
 5 files changed, 900 insertions(+), 1 deletions(-)
 create mode 100644 hw/pcie_aer.c
 create mode 100644 hw/pcie_aer.h

diff --git a/Makefile.objs b/Makefile.objs
index eeb5134..68bcc48 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -186,7 +186,7 @@ hw-obj-$(CONFIG_PIIX4) += piix4.o
 # PCI watchdog devices
 hw-obj-y += wdt_i6300esb.o
 
-hw-obj-y += pcie.o
+hw-obj-y += pcie.o pcie_aer.o
 hw-obj-y += msix.o msi.o
 
 # PCI network cards
diff --git a/hw/pcie.h b/hw/pcie.h
index 68327d8..1b10753 100644
--- a/hw/pcie.h
+++ b/hw/pcie.h
@@ -24,6 +24,7 @@
 #include "hw.h"
 #include "pci_regs.h"
 #include "pcie_regs.h"
+#include "pcie_aer.h"
 
 typedef enum {
 /* for attention and power indicator */
@@ -66,6 +67,11 @@ struct PCIExpressDevice {
 
     /* SLOT */
     unsigned int hpev_intx;     /* INTx for hot plug event */
+
+    /* AER */
+    uint16_t aer_cap;
+    PCIEAERLog aer_log;
+    unsigned int aer_intx;      /* INTx for error reporting */
 };
 
 /* PCI express capability helper functions */
diff --git a/hw/pcie_aer.c b/hw/pcie_aer.c
new file mode 100644
index 000..1b023b0
--- /dev/null
+++ b/hw/pcie_aer.c
@@ -0,0 +1,785 @@
+/*
+ * pcie_aer.c
+ *
+ * Copyright (c) 2010 Isaku Yamahata yamahata at valinux co jp
+ *VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include "sysemu.h"
+#include "pci_bridge.h"
+#include "pcie.h"
+#include "msix.h"
+#include "msi.h"
+#include "pci_internals.h"
+#include "pcie_regs.h"
+
+//#define DEBUG_PCIE
+#ifdef DEBUG_PCIE
+# define PCIE_DPRINTF(fmt, ...)                                         \
+    fprintf(stderr, "%s:%d " fmt, __func__, __LINE__, ## __VA_ARGS__)
+#else
+# define PCIE_DPRINTF(fmt, ...) do {} while (0)
+#endif
+#define PCIE_DEV_PRINTF(dev, fmt, ...)                                  \
+    PCIE_DPRINTF("%s:%x "fmt, (dev)->name, (dev)->devfn, ## __VA_ARGS__)
+
+static void pcie_aer_clear_error(PCIDevice *dev);
+static uint8_t pcie_aer_root_get_vector(PCIDevice *dev);
+static AERMsgResult
+pcie_aer_msg_alldev(PCIDevice *dev, const PCIEAERMsg *msg);
+static AERMsgResult
+pcie_aer_msg_vbridge(PCIDevice *dev, const PCIEAERMsg *msg);
+static AERMsgResult
+pcie_aer_msg_root_port(PCIDevice *dev, const PCIEAERMsg *msg);
+
+/* From 6.2.7 Error Listing and Rules. Table 6-2, 6-3 and 6-4 */
+static PCIEAERSeverity pcie_aer_uncor_default_severity(uint32_t status)
+{
+    switch (status) {
+    case PCI_ERR_UNC_INTN:
+    case PCI_ERR_UNC_DLP:
+    case PCI_ERR_UNC_SDN:
+    case PCI_ERR_UNC_RX_OVER:
+    case PCI_ERR_UNC_FCP:
+    case PCI_ERR_UNC_MALF_TLP:
+        return AER_ERR_FATAL;
+    case PCI_ERR_UNC_POISON_TLP:
+    case PCI_ERR_UNC_ECRC:
+    case PCI_ERR_UNC_UNSUP:
+    case PCI_ERR_UNC_COMP_TIME:
+    case PCI_ERR_UNC_COMP_ABORT:
+    case PCI_ERR_UNC_UNX_COMP:
+    case PCI_ERR_UNC_ACSV:
+    case PCI_ERR_UNC_MCBTLP:
+    case PCI_ERR_UNC_ATOP_EBLOCKED:
+    case PCI_ERR_UNC_TLP_PRF_BLOCKED:
+        return AER_ERR_NONFATAL;
+    default:
+        break;
+    }
+    abort();
+    return AER_ERR_FATAL;
+}
+
+static uint32_t aer_log_next(uint32_t i, uint32_t max)
+{
+    return (i + 1) % max;
+}
+
+static bool aer_log_empty_index(uint32_t producer, uint32_t consumer)
+{
+    return producer == consumer;
+}
+
+static bool aer_log_empty(PCIEAERLog *aer_log)
+{
+    return aer_log_empty_index(aer_log->producer, aer_log->consumer);
+}
+
+static bool aer_log_full(PCIEAERLog *aer_log)
+{
+    return aer_log_next(aer_log->producer, aer_log->log_max) ==
+        aer_log->consumer;
+}
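
The `aer_log_next()`, `aer_log_empty()` and `aer_log_full()` helpers at the end of the truncated hunk above read like the usual ring-buffer convention in which one slot is kept unused, so that "empty" and "full" can be told apart. The sketch below only illustrates that convention. `AerLogSketch`, `log_push()` and `log_pop()` are hypothetical names, not part of the patch, and the real `PCIEAERLog` also carries the error records themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PCIEAERLog: only the index fields are modelled. */
typedef struct {
    uint32_t producer;  /* index of the next slot to fill */
    uint32_t consumer;  /* index of the next slot to drain */
    uint32_t log_max;   /* number of slots in the ring */
} AerLogSketch;

static uint32_t log_next(uint32_t i, uint32_t max)
{
    return (i + 1) % max;
}

static bool log_empty(const AerLogSketch *log)
{
    return log->producer == log->consumer;
}

/* Full means: advancing the producer would make it collide with the consumer. */
static bool log_full(const AerLogSketch *log)
{
    return log_next(log->producer, log->log_max) == log->consumer;
}

/* Hypothetical helper: claims one slot, returns false if the ring is full. */
static bool log_push(AerLogSketch *log)
{
    assert(log->log_max > 1);
    if (log_full(log)) {
        return false;
    }
    log->producer = log_next(log->producer, log->log_max);
    return true;
}

/* Hypothetical helper: releases one slot, returns false if the ring is empty. */
static bool log_pop(AerLogSketch *log)
{
    if (log_empty(log)) {
        return false;
    }
    log->consumer = log_next(log->consumer, log->log_max);
    return true;
}
```

With `log_max` set to 4, three pushes fill the ring and a fourth is refused, so a ring of N slots holds at most N - 1 entries under this convention.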

[Qemu-devel] [PATCH v5 06/14] pcie: add pcie constants to pcie_regs.h

2010-10-19 Thread Isaku Yamahata

add pcie constants to pcie_regs.h.
Those constants should go to Linux pci_regs.h and then the file should
go away eventually.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
Changes v3 -> v4:
- removed copyright notice as requested.

Changes v2 -> v3:
- moved out pcie constants from pcie.c to pcie_regs.h.
- removed unused macros
---
 hw/pcie_regs.h |  154 
 1 files changed, 154 insertions(+), 0 deletions(-)
 create mode 100644 hw/pcie_regs.h

diff --git a/hw/pcie_regs.h b/hw/pcie_regs.h
new file mode 100644
index 000..3461a1b
--- /dev/null
+++ b/hw/pcie_regs.h
@@ -0,0 +1,154 @@
+/*
+ * constants for pcie configurations space from pci express spec.
+ *
+ * TODO:
+ * Those constants and macros should go to Linux pci_regs.h
+ * Once they're merged, they will go away.
+ */
+#ifndef QEMU_PCIE_REGS_H
+#define QEMU_PCIE_REGS_H
+
+
+/* express capability */
+
+#define PCI_EXP_VER2_SIZEOF             0x3c /* express capability of ver. 2 */
+#define PCI_EXT_CAP_VER_SHIFT           16
+#define PCI_EXT_CAP_NEXT_SHIFT          20
+#define PCI_EXT_CAP_NEXT_MASK           (0xffc << PCI_EXT_CAP_NEXT_SHIFT)
+
+#define PCI_EXT_CAP(id, ver, next)                                      \
+    ((id) |                                                             \
+     ((ver) << PCI_EXT_CAP_VER_SHIFT) |                                 \
+     ((next) << PCI_EXT_CAP_NEXT_SHIFT))
+
+#define PCI_EXT_CAP_ALIGN               4
+#define PCI_EXT_CAP_ALIGNUP(x)                                  \
+    (((x) + PCI_EXT_CAP_ALIGN - 1) & ~(PCI_EXT_CAP_ALIGN - 1))
+
+/* PCI_EXP_FLAGS */
+#define PCI_EXP_FLAGS_VER2              2 /* for now, supports only ver. 2 */
+#define PCI_EXP_FLAGS_IRQ_SHIFT         (ffs(PCI_EXP_FLAGS_IRQ) - 1)
+#define PCI_EXP_FLAGS_TYPE_SHIFT        (ffs(PCI_EXP_FLAGS_TYPE) - 1)
+
+
+/* PCI_EXP_LINK{CAP, STA} */
+/* link speed */
+#define PCI_EXP_LNK_LS_25               1
+
+#define PCI_EXP_LNK_MLW_SHIFT           (ffs(PCI_EXP_LNKCAP_MLW) - 1)
+#define PCI_EXP_LNK_MLW_1               (1 << PCI_EXP_LNK_MLW_SHIFT)
+
+/* PCI_EXP_LINKCAP */
+#define PCI_EXP_LNKCAP_ASPMS_SHIFT      (ffs(PCI_EXP_LNKCAP_ASPMS) - 1)
+#define PCI_EXP_LNKCAP_ASPMS_0S         (1 << PCI_EXP_LNKCAP_ASPMS_SHIFT)
+
+#define PCI_EXP_LNKCAP_PN_SHIFT         (ffs(PCI_EXP_LNKCAP_PN) - 1)
+
+#define PCI_EXP_SLTCAP_PSN_SHIFT        (ffs(PCI_EXP_SLTCAP_PSN) - 1)
+
+#define PCI_EXP_SLTCTL_IND_RESERVED     0x0
+#define PCI_EXP_SLTCTL_IND_ON           0x1
+#define PCI_EXP_SLTCTL_IND_BLINK        0x2
+#define PCI_EXP_SLTCTL_IND_OFF          0x3
+#define PCI_EXP_SLTCTL_AIC_SHIFT        (ffs(PCI_EXP_SLTCTL_AIC) - 1)
+#define PCI_EXP_SLTCTL_AIC_OFF                          \
+    (PCI_EXP_SLTCTL_IND_OFF << PCI_EXP_SLTCTL_AIC_SHIFT)
+
+#define PCI_EXP_SLTCTL_PIC_SHIFT        (ffs(PCI_EXP_SLTCTL_PIC) - 1)
+#define PCI_EXP_SLTCTL_PIC_OFF                          \
+    (PCI_EXP_SLTCTL_IND_OFF << PCI_EXP_SLTCTL_PIC_SHIFT)
+
+#define PCI_EXP_SLTCTL_SUPPORTED        \
+    (PCI_EXP_SLTCTL_ABPE |              \
+     PCI_EXP_SLTCTL_PDCE |              \
+     PCI_EXP_SLTCTL_CCIE |              \
+     PCI_EXP_SLTCTL_HPIE |              \
+     PCI_EXP_SLTCTL_AIC |               \
+     PCI_EXP_SLTCTL_PCC |               \
+     PCI_EXP_SLTCTL_EIC)
+
+#define PCI_EXP_DEVCAP2_EFF             0x10
+#define PCI_EXP_DEVCAP2_EETLPP          0x20
+
+#define PCI_EXP_DEVCTL2_EETLPPB         0x80
+
+/* ARI */
+#define PCI_ARI_VER 1
+#define PCI_ARI_SIZEOF  8
+
+/* AER */
+#define PCI_ERR_VER 2
+#define PCI_ERR_SIZEOF  0x48
+
+#define PCI_ERR_UNC_SDN                 0x00000020      /* surprise down */
+#define PCI_ERR_UNC_ACSV                0x00200000      /* ACS Violation */
+#define PCI_ERR_UNC_INTN                0x00400000      /* Internal Error */
+#define PCI_ERR_UNC_MCBTLP              0x00800000      /* MC Blocked TLP */
+#define PCI_ERR_UNC_ATOP_EBLOCKED       0x01000000      /* atomic op egress blocked */
+#define PCI_ERR_UNC_TLP_PRF_BLOCKED     0x02000000      /* TLP Prefix Blocked */
+#define PCI_ERR_COR_ADV_NONFATAL        0x00002000      /* Advisory Non-Fatal */
+#define PCI_ERR_COR_INTERNAL            0x00004000      /* Corrected Internal */
+#define PCI_ERR_COR_HL_OVERFLOW         0x00008000      /* Header Log Overflow */
+#define PCI_ERR_CAP_FEP_MASK            0x0000001f
+#define PCI_ERR_CAP_MHRC                0x00000200
+#define PCI_ERR_CAP_MHRE                0x00000400
+#define PCI_ERR_CAP_TLP                 0x00000800
+
+#define PCI_ERR_TLP_PREFIX_LOG  0x38
+
+#define PCI_SEC_STATUS_RCV_SYSTEM_ERROR 0x4000
+
+/* aer root error command/status */
+#define PCI_ERR_ROOT_CMD_EN_MASK(PCI_ERR_ROOT_CMD_COR_EN |  \
+ PCI_ERR_ROOT_CMD_NONFATAL_EN | \
+ PCI_ERR_ROOT_CMD_FATAL_EN)
+
+#define PCI_ERR_ROOT_IRQ_MAX
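
The `PCI_EXT_CAP()` and `PCI_EXT_CAP_ALIGNUP()` macros in the hunk above pack an extended capability header into one 32-bit word: ID in bits 15:0, version starting at bit 16, next-capability offset starting at bit 20. The following is a minimal sketch of that packing and its inverse, assuming the shifts and the 0xffc mask shown in the patch. The function names are hypothetical, and the ID 0x0001 used in the usage note is assumed to be the AER extended capability ID.

```c
#include <assert.h>
#include <stdint.h>

/* Shift and mask values copied from the pcie_regs.h hunk. */
#define EXT_CAP_VER_SHIFT   16
#define EXT_CAP_NEXT_SHIFT  20
#define EXT_CAP_NEXT_MASK   (0xffcU << EXT_CAP_NEXT_SHIFT)
#define EXT_CAP_ALIGN       4U

/* Hypothetical function form of PCI_EXT_CAP(id, ver, next). */
static uint32_t ext_cap_header(uint16_t id, uint8_t ver, uint16_t next)
{
    assert(ver <= 0xf);              /* the version field is 4 bits wide */
    assert((next & ~0xffcU) == 0);   /* offsets are dword aligned, below 4 KiB */
    return (uint32_t)id |
           ((uint32_t)ver << EXT_CAP_VER_SHIFT) |
           ((uint32_t)next << EXT_CAP_NEXT_SHIFT);
}

static uint16_t ext_cap_id(uint32_t header)
{
    return header & 0xffff;
}

static uint8_t ext_cap_ver(uint32_t header)
{
    return (header >> EXT_CAP_VER_SHIFT) & 0xf;
}

static uint16_t ext_cap_next(uint32_t header)
{
    return (header & EXT_CAP_NEXT_MASK) >> EXT_CAP_NEXT_SHIFT;
}

/* Hypothetical function form of PCI_EXT_CAP_ALIGNUP(x). */
static uint32_t ext_cap_alignup(uint32_t x)
{
    return (x + EXT_CAP_ALIGN - 1) & ~(EXT_CAP_ALIGN - 1);
}
```

For example, `ext_cap_header(0x0001, 2, 0x148)` yields 0x14820001, and the three accessors recover 0x0001, 2 and 0x148 from it.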

[Qemu-devel] Re: [Tracing][v4 PATCH 2/2] Add documentation for QMP interfaces

2010-10-19 Thread Stefan Hajnoczi

On Tue, Oct 19, 2010 at 11:57:50AM +0530, Prerna Saxena wrote:
> [PATCH 2/2] Add documentation for QMP commands:
>  - query-trace
>  - query-trace-events
>  - query-trace-file.
>
>
> Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
> ---
>  qmp-commands.hx |   94 +++
>  1 files changed, 94 insertions(+), 0 deletions(-)

Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

[Qemu-devel] [PATCH v5 07/14] pcie: helper functions for pcie capability and extended capability

2010-10-19 Thread Isaku Yamahata

This patch implements helper functions for pci express capability
and pci express extended capability allocation.
NOTE: presence detection depends on pci_qdev_init() change.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
Changes v4 -> v5:
- dropped FLR related members. This will be addressed at the next phase.
- use pci_xxx_test_and_xxx_mask().
- drop PCIDevice::written bits. and made related registers writable.
- eliminated pcie_cap_slot_notify()
- introduced PCIExpressDevice::hpev_intx

Changes v3 -> v4:
- various clean up
- dropped pcie_notify(), pcie_del_capability()
- use pci_{clear_set, clear}_bit_xxx() helper functions.
- dropped pci_exp_cap()
---
 Makefile.objs |1 +
 hw/pci.h  |5 +
 hw/pcie.c |  540 +
 hw/pcie.h |  107 
 qemu-common.h |1 +
 5 files changed, 654 insertions(+), 0 deletions(-)
 create mode 100644 hw/pcie.c
 create mode 100644 hw/pcie.h

diff --git a/Makefile.objs b/Makefile.objs
index 5f5a4c5..eeb5134 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -186,6 +186,7 @@ hw-obj-$(CONFIG_PIIX4) += piix4.o
 # PCI watchdog devices
 hw-obj-y += wdt_i6300esb.o
 
+hw-obj-y += pcie.o
 hw-obj-y += msix.o msi.o
 
 # PCI network cards
diff --git a/hw/pci.h b/hw/pci.h
index 9e2f27d..d6c522b 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -9,6 +9,8 @@
 /* PCI includes legacy ISA access.  */
 #include "isa.h"
 
+#include "pcie.h"
+
 /* PCI bus */
 
 #define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
@@ -175,6 +177,9 @@ struct PCIDevice {
     /* Offset of MSI capability in config space */
     uint8_t msi_cap;
 
+    /* PCI Express */
+    PCIExpressDevice exp;
+
     /* Location of option rom */
     char *romfile;
     ram_addr_t rom_offset;
diff --git a/hw/pcie.c b/hw/pcie.c
new file mode 100644
index 000..53d1fce
--- /dev/null
+++ b/hw/pcie.c
@@ -0,0 +1,540 @@
+/*
+ * pcie.c
+ *
+ * Copyright (c) 2010 Isaku Yamahata yamahata at valinux co jp
+ *VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include "sysemu.h"
+#include "pci_bridge.h"
+#include "pcie.h"
+#include "msix.h"
+#include "msi.h"
+#include "pci_internals.h"
+#include "pcie_regs.h"
+
+//#define DEBUG_PCIE
+#ifdef DEBUG_PCIE
+# define PCIE_DPRINTF(fmt, ...)                                         \
+    fprintf(stderr, "%s:%d " fmt, __func__, __LINE__, ## __VA_ARGS__)
+#else
+# define PCIE_DPRINTF(fmt, ...) do {} while (0)
+#endif
+#define PCIE_DEV_PRINTF(dev, fmt, ...)                                  \
+    PCIE_DPRINTF("%s:%x "fmt, (dev)->name, (dev)->devfn, ## __VA_ARGS__)
+
+
+/***
+ * pci express capability helper functions
+ */
+int pcie_cap_init(PCIDevice *dev, uint8_t offset, uint8_t type, uint8_t port)
+{
+    int pos;
+    uint8_t *exp_cap;
+
+    assert(pci_is_express(dev));
+
+    pos = pci_add_capability(dev, PCI_CAP_ID_EXP, offset,
+                             PCI_EXP_VER2_SIZEOF);
+    if (pos < 0) {
+        return pos;
+    }
+    dev->exp.exp_cap = pos;
+    exp_cap = dev->config + pos;
+
+    /* capability register
+       interrupt message number defaults to 0 */
+    pci_set_word(exp_cap + PCI_EXP_FLAGS,
+                 ((type << PCI_EXP_FLAGS_TYPE_SHIFT) & PCI_EXP_FLAGS_TYPE) |
+                 PCI_EXP_FLAGS_VER2);
+
+    /* device capability register
+     * table 7-12:
+     * role-based error reporting bit must be set by all
+     * Functions conforming to the ECN, PCI Express Base
+     * Specification, Revision 1.1., or subsequent PCI Express Base
+     * Specification revisions.
+     */
+    pci_set_long(exp_cap + PCI_EXP_DEVCAP, PCI_EXP_DEVCAP_RBER);
+
+    pci_set_long(exp_cap + PCI_EXP_LNKCAP,
+                 (port << PCI_EXP_LNKCAP_PN_SHIFT) |
+                 PCI_EXP_LNKCAP_ASPMS_0S |
+                 PCI_EXP_LNK_MLW_1 |
+                 PCI_EXP_LNK_LS_25);
+
+    pci_set_word(exp_cap + PCI_EXP_LNKSTA,
+                 PCI_EXP_LNK_MLW_1 | PCI_EXP_LNK_LS_25);
+
+    pci_set_long(exp_cap + PCI_EXP_DEVCAP2,
+                 PCI_EXP_DEVCAP2_EFF | PCI_EXP_DEVCAP2_EETLPP);
+
+    pci_set_word(dev->wmask + pos, PCI_EXP_DEVCTL2_EETLPPB);
+    return pos;
+}
+
+void pcie_cap_exit(PCIDevice *dev)
+{
+    pci_del_capability(dev, PCI_CAP_ID_EXP, PCI_EXP_VER2_SIZEOF);
+}
+
+uint8_t
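
In `pcie_cap_init()` above, the capability register word is built by shifting the port type into the field selected by `PCI_EXP_FLAGS_TYPE` and OR-ing in the capability version, with the shift derived from the mask through `ffs()`. A minimal sketch of that computation follows. The mask and port-type values are assumed to match Linux `pci_regs.h` (type field in bits 7:4, upstream port type 0x5), and `exp_flags()` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>   /* ffs() */

/* Assumed to match Linux pci_regs.h; not defined in this patch. */
#define EXP_FLAGS_TYPE        0x00f0  /* device/port type field */
#define EXP_TYPE_ROOT_PORT    0x4
#define EXP_TYPE_UPSTREAM     0x5
#define EXP_TYPE_DOWNSTREAM   0x6

/* As in the pcie_regs.h hunk: version 2, shift derived from the mask. */
#define EXP_FLAGS_VER2        2
#define EXP_FLAGS_TYPE_SHIFT  (ffs(EXP_FLAGS_TYPE) - 1)

/* Hypothetical helper mirroring the PCI_EXP_FLAGS computation. */
static uint16_t exp_flags(uint8_t type)
{
    assert(EXP_FLAGS_TYPE_SHIFT == 4);  /* lowest set bit of 0x00f0 */
    return (uint16_t)(((type << EXP_FLAGS_TYPE_SHIFT) & EXP_FLAGS_TYPE) |
                      EXP_FLAGS_VER2);
}
```

Under these assumptions an upstream port gets the flags word 0x52, a root port 0x42 and a downstream port 0x62. Deriving the shift from the mask keeps the two from drifting apart if the mask is ever changed.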

[Qemu-devel] [PATCH v5 05/14] msi: implements msi

2010-10-19 Thread Isaku Yamahata

implements msi related functions.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp

---
Changes v4 -> v5:
- use pci_xxx_test_and_clear/set_mask().

Changes v3 -> v4:
- use pci_set_bit_xxx helper function.
- make nr_vectors, vector unsigned int.
- introduce PCI_MSI_VECTORS_MAX.
- fix undefined bit operations.
- eliminate msi_set_pending().

Changes v2 -> v3:
- improved comment wording.
- simplified shift/ffs dance.

Changes v1 -> v2:
- opencode some oneline helper function/macros for readability
- use ffs where appropriate
- rename some functions/variables as suggested.
- added assert()
- 1 -> 1U
- clear INTx# when MSI is enabled
- clear pending bits for freed vectors.
- check the requested number of vectors.

msi: update for helper functions.

update for helper functions.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
 Makefile.objs |2 +-
 hw/msi.c  |  352 +
 hw/msi.h  |   41 +++
 hw/pci.h  |   10 +-
 4 files changed, 401 insertions(+), 4 deletions(-)
 create mode 100644 hw/msi.c
 create mode 100644 hw/msi.h

diff --git a/Makefile.objs b/Makefile.objs
index 594894b..5f5a4c5 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -186,7 +186,7 @@ hw-obj-$(CONFIG_PIIX4) += piix4.o
 # PCI watchdog devices
 hw-obj-y += wdt_i6300esb.o
 
-hw-obj-y += msix.o
+hw-obj-y += msix.o msi.o
 
 # PCI network cards
 hw-obj-y += ne2000.o
diff --git a/hw/msi.c b/hw/msi.c
new file mode 100644
index 000..a949d82
--- /dev/null
+++ b/hw/msi.c
@@ -0,0 +1,352 @@
+/*
+ * msi.c
+ *
+ * Copyright (c) 2010 Isaku Yamahata yamahata at valinux co jp
+ *VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include "msi.h"
+
+/* Eventually those constants should go to Linux pci_regs.h */
+#define PCI_MSI_PENDING_32  0x10
+#define PCI_MSI_PENDING_64  0x14
+
+/* PCI_MSI_ADDRESS_LO */
+#define PCI_MSI_ADDRESS_LO_MASK (~0x3)
+
+/* If we get rid of cap allocator, we won't need those. */
+#define PCI_MSI_32_SIZEOF   0x0a
+#define PCI_MSI_64_SIZEOF   0x0e
+#define PCI_MSI_32M_SIZEOF  0x14
+#define PCI_MSI_64M_SIZEOF  0x18
+
+#define PCI_MSI_VECTORS_MAX 32
+
+/* If we get rid of cap allocator, we won't need this. */
+static inline uint8_t msi_cap_sizeof(uint16_t flags)
+{
+    switch (flags & (PCI_MSI_FLAGS_MASKBIT | PCI_MSI_FLAGS_64BIT)) {
+    case PCI_MSI_FLAGS_MASKBIT | PCI_MSI_FLAGS_64BIT:
+        return PCI_MSI_64M_SIZEOF;
+    case PCI_MSI_FLAGS_64BIT:
+        return PCI_MSI_64_SIZEOF;
+    case PCI_MSI_FLAGS_MASKBIT:
+        return PCI_MSI_32M_SIZEOF;
+    case 0:
+        return PCI_MSI_32_SIZEOF;
+    default:
+        abort();
+        break;
+    }
+    return 0;
+}
+
+//#define MSI_DEBUG
+
+#ifdef MSI_DEBUG
+# define MSI_DPRINTF(fmt, ...)                                          \
+    fprintf(stderr, "%s:%d " fmt, __func__, __LINE__, ## __VA_ARGS__)
+#else
+# define MSI_DPRINTF(fmt, ...)  do { } while (0)
+#endif
+#define MSI_DEV_PRINTF(dev, fmt, ...)                                   \
+    MSI_DPRINTF("%s:%x " fmt, (dev)->name, (dev)->devfn, ## __VA_ARGS__)
+
+static inline unsigned int msi_nr_vectors(uint16_t flags)
+{
+    return 1U <<
+        ((flags & PCI_MSI_FLAGS_QSIZE) >> (ffs(PCI_MSI_FLAGS_QSIZE) - 1));
+}
+
+static inline uint8_t msi_flags_off(const PCIDevice* dev)
+{
+    return dev->msi_cap + PCI_MSI_FLAGS;
+}
+
+static inline uint8_t msi_address_lo_off(const PCIDevice* dev)
+{
+    return dev->msi_cap + PCI_MSI_ADDRESS_LO;
+}
+
+static inline uint8_t msi_address_hi_off(const PCIDevice* dev)
+{
+    return dev->msi_cap + PCI_MSI_ADDRESS_HI;
+}
+
+static inline uint8_t msi_data_off(const PCIDevice* dev, bool msi64bit)
+{
+    return dev->msi_cap + (msi64bit ? PCI_MSI_DATA_64 : PCI_MSI_DATA_32);
+}
+
+static inline uint8_t msi_mask_off(const PCIDevice* dev, bool msi64bit)
+{
+    return dev->msi_cap + (msi64bit ? PCI_MSI_MASK_64 : PCI_MSI_MASK_32);
+}
+
+static inline uint8_t msi_pending_off(const PCIDevice* dev, bool msi64bit)
+{
+    return dev->msi_cap + (msi64bit ? PCI_MSI_PENDING_64 : PCI_MSI_PENDING_32);
+}
+
+bool msi_enabled(const PCIDevice *dev)
+{
+    return msi_present(dev) &&
+        (pci_get_word(dev->config + msi_flags_off(dev)) &
+         PCI_MSI_FLAGS_ENABLE);
+}
+
+int msi_init(struct PCIDevice *dev, uint8_t offset,
+ unsigned int nr_vectors,
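
The two small helpers in the hunk above, `msi_nr_vectors()` and `msi_cap_sizeof()`, both decode the MSI Message Control word: the first turns the log2-encoded "multiple message enable" field into a vector count, the second picks the capability size from the 64-bit and per-vector-masking bits. A minimal sketch of both is below. The flag values are assumed to match Linux `pci_regs.h`, the sizes are the ones defined in the patch, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match Linux pci_regs.h; not defined in this patch. */
#define MSI_FLAGS_QSIZE    0x0070  /* log2 of the number of enabled vectors */
#define MSI_FLAGS_64BIT    0x0080  /* 64-bit message address supported */
#define MSI_FLAGS_MASKBIT  0x0100  /* per-vector masking supported */
#define MSI_QSIZE_SHIFT    4       /* lowest set bit of MSI_FLAGS_QSIZE */

/* Capability sizes as defined in the hw/msi.c hunk. */
#define MSI_32_SIZEOF   0x0a
#define MSI_64_SIZEOF   0x0e
#define MSI_32M_SIZEOF  0x14
#define MSI_64M_SIZEOF  0x18

/* Hypothetical counterpart of msi_nr_vectors(). */
static unsigned int nr_vectors(uint16_t flags)
{
    return 1U << ((flags & MSI_FLAGS_QSIZE) >> MSI_QSIZE_SHIFT);
}

/* Hypothetical counterpart of msi_cap_sizeof(). */
static uint8_t cap_sizeof(uint16_t flags)
{
    switch (flags & (MSI_FLAGS_MASKBIT | MSI_FLAGS_64BIT)) {
    case MSI_FLAGS_MASKBIT | MSI_FLAGS_64BIT:
        return MSI_64M_SIZEOF;
    case MSI_FLAGS_64BIT:
        return MSI_64_SIZEOF;
    case MSI_FLAGS_MASKBIT:
        return MSI_32M_SIZEOF;
    default:
        return MSI_32_SIZEOF;
    }
}
```

A field value of 5 (flags 0x0050) decodes to 32 vectors, which is the `PCI_MSI_VECTORS_MAX` limit defined in the patch.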

[Qemu-devel] [PATCH v5 12/14] x3130: pcie downstream port

2010-10-19 Thread Isaku Yamahata

Implement TI x3130 pcie downstream port switch.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
Changes v4 -> v5:
- use pci_xxx_test_and_xxx_mask().
- removed flr related stuff.

Changes v3 -> v4:
- rename: pcie_downstream -> x3130_downstream
- compilation adjustment.

Changes v2 -> v3:
- compilation adjustment.
---
 Makefile.objs   |2 +-
 hw/xio3130_downstream.c |  197 +++
 hw/xio3130_downstream.h |   11 +++
 3 files changed, 209 insertions(+), 1 deletions(-)
 create mode 100644 hw/xio3130_downstream.c
 create mode 100644 hw/xio3130_downstream.h

diff --git a/Makefile.objs b/Makefile.objs
index d61e88a..48f98f3 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -140,7 +140,7 @@ hw-obj-y =
 hw-obj-y += vl.o loader.o
 hw-obj-y += virtio.o virtio-console.o
 hw-obj-y += fw_cfg.o pci.o pci_host.o pcie_host.o pci_bridge.o
-hw-obj-y += ioh3420.o xio3130_upstream.o
+hw-obj-y += ioh3420.o xio3130_upstream.o xio3130_downstream.o
 hw-obj-y += watchdog.o
 hw-obj-$(CONFIG_ISA_MMIO) += isa_mmio.o
 hw-obj-$(CONFIG_ECC) += ecc.o
diff --git a/hw/xio3130_downstream.c b/hw/xio3130_downstream.c
new file mode 100644
index 000..9801723
--- /dev/null
+++ b/hw/xio3130_downstream.c
@@ -0,0 +1,197 @@
+/*
+ * x3130_downstream.c
+ * TI X3130 pci express downstream port switch
+ *
+ * Copyright (c) 2010 Isaku Yamahata yamahata at valinux co jp
+ *VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include "pci_ids.h"
+#include "msi.h"
+#include "pcie.h"
+#include "xio3130_downstream.h"
+
+#define PCI_DEVICE_ID_TI_XIO3130D       0x8233  /* downstream port */
+#define XIO3130_REVISION                0x1
+#define XIO3130_MSI_OFFSET              0x70
+#define XIO3130_MSI_SUPPORTED_FLAGS     PCI_MSI_FLAGS_64BIT
+#define XIO3130_MSI_NR_VECTOR           1
+#define XIO3130_SSVID_OFFSET            0x80
+#define XIO3130_SSVID_SVID              0
+#define XIO3130_SSVID_SSID              0
+#define XIO3130_EXP_OFFSET              0x90
+#define XIO3130_AER_OFFSET              0x100
+
+static void xio3130_downstream_write_config(PCIDevice *d, uint32_t address,
+                                            uint32_t val, int len)
+{
+    uint16_t sltctl =
+        pci_get_word(d->config + d->exp.exp_cap + PCI_EXP_SLTCTL);
+    uint32_t uncorsta =
+        pci_get_long(d->config + d->exp.aer_cap + PCI_ERR_UNCOR_STATUS);
+
+    pci_bridge_write_config(d, address, val, len);
+    pcie_cap_flr_write_config(d, address, val, len);
+    pcie_cap_slot_write_config(d, address, val, len, sltctl);
+    msi_write_config(d, address, val, len);
+    pcie_aer_write_config(d, address, val, len, uncorsta);
+    pci_clear_written_write_config(d, address, val, len);
+}
+
+static void xio3130_downstream_reset(DeviceState *qdev)
+{
+    PCIDevice *d = DO_UPCAST(PCIDevice, qdev, qdev);
+    msi_reset(d);
+    pcie_cap_deverr_reset(d);
+    pcie_cap_slot_reset(d);
+    pcie_cap_ari_reset(d);
+    pci_bridge_reset_zero_base_limit(d);
+    pci_bridge_reset(qdev);
+}
+
+static int xio3130_downstream_initfn(PCIDevice *d)
+{
+    PCIBridge* br = DO_UPCAST(PCIBridge, dev, d);
+    PCIEPort *p = DO_UPCAST(PCIEPort, br, br);
+    PCIESlot *s = DO_UPCAST(PCIESlot, port, p);
+    int rc;
+
+    rc = pci_bridge_initfn(d);
+    if (rc < 0) {
+        return rc;
+    }
+
+    pcie_port_init_reg(d);
+    pci_config_set_vendor_id(d->config, PCI_VENDOR_ID_TI);
+    pci_config_set_device_id(d->config, PCI_DEVICE_ID_TI_XIO3130D);
+    d->config[PCI_REVISION_ID] = XIO3130_REVISION;
+
+    rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR,
+                  XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
+                  XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT);
+    if (rc < 0) {
+        return rc;
+    }
+    rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
+                               XIO3130_SSVID_SVID, XIO3130_SSVID_SSID);
+    if (rc < 0) {
+        return rc;
+    }
+    rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_DOWNSTREAM,
+                       p->port);
+    if (rc < 0) {
+        return rc;
+    }
+    pcie_cap_flr_init(d);   /* TODO: implement FLR */
+    pcie_cap_deverr_init(d);
+    pcie_cap_slot_init(d, s->slot);
+    pcie_chassis_create(s->chassis);
+    rc = pcie_chassis_add_slot(s);
+    if (rc < 0) {
+        return rc;
+    }
+

[Qemu-devel] [PATCH v5 09/14] pcie port: define struct PCIEPort/PCIESlot and helper functions

2010-10-19 Thread Isaku Yamahata

Define struct PCIEPort, which represents the part common to every
pci express port (root, upstream and downstream).
Add a helper function for pcie ports that the root, upstream and
downstream ports can share.
Define struct PCIESlot, which represents the part common to every
pcie slot (root and downstream), and helper functions for it.
Add helper functions for the (chassis, slot) -> PCIESlot conversion.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
Changes v4 -> v5:
- use pci_xxx_test_and_xxx_mask()

Changes v3 -> v4:
- Initialize prefetchable memory base/limit registers correctly.
  They must support 64bit.
- compilation adjustment.

Changes v2 -> v3:
- static'fy chassis.
- compilation adjustment.
---
 Makefile.objs  |2 +-
 hw/pcie_port.c |  116 
 hw/pcie_port.h |   51 
 qemu-common.h  |2 +
 4 files changed, 170 insertions(+), 1 deletions(-)
 create mode 100644 hw/pcie_port.c
 create mode 100644 hw/pcie_port.h

diff --git a/Makefile.objs b/Makefile.objs
index 68bcc48..6c3b84a 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -186,7 +186,7 @@ hw-obj-$(CONFIG_PIIX4) += piix4.o
 # PCI watchdog devices
 hw-obj-y += wdt_i6300esb.o
 
-hw-obj-y += pcie.o pcie_aer.o
+hw-obj-y += pcie.o pcie_aer.o pcie_port.o
 hw-obj-y += msix.o msi.o
 
 # PCI network cards
diff --git a/hw/pcie_port.c b/hw/pcie_port.c
new file mode 100644
index 000..117de61
--- /dev/null
+++ b/hw/pcie_port.c
@@ -0,0 +1,116 @@
+/*
+ * pcie_port.c
+ *
+ * Copyright (c) 2010 Isaku Yamahata yamahata at valinux co jp
+ *VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include "pcie_port.h"
+
+void pcie_port_init_reg(PCIDevice *d)
+{
+    /* Unlike pci bridge,
+       66MHz and fast back to back don't apply to pci express port. */
+    pci_set_word(d->config + PCI_STATUS, 0);
+    pci_set_word(d->config + PCI_SEC_STATUS, 0);
+
+    /* 7.5.3.5 Prefetchable Memory Base Limit
+     * The Prefetchable Memory Base and Prefetchable Memory Limit registers
+     * must indicate that 64-bit addresses are supported, as defined in
+     * PCI-to-PCI Bridge Architecture Specification, Revision 1.2.
+     */
+    pci_word_test_and_set_mask(d->config + PCI_PREF_MEMORY_BASE,
+                               PCI_PREF_RANGE_TYPE_64);
+    pci_word_test_and_set_mask(d->config + PCI_PREF_MEMORY_LIMIT,
+                               PCI_PREF_RANGE_TYPE_64);
+}
+
+/**
+ * (chassis number, pcie physical slot number) -> pcie slot conversion
+ */
+struct PCIEChassis {
+    uint8_t number;
+
+    QLIST_HEAD(, PCIESlot) slots;
+    QLIST_ENTRY(PCIEChassis) next;
+};
+
+static QLIST_HEAD(, PCIEChassis) chassis = QLIST_HEAD_INITIALIZER(chassis);
+
+static struct PCIEChassis *pcie_chassis_find(uint8_t chassis_number)
+{
+    struct PCIEChassis *c;
+    QLIST_FOREACH(c, &chassis, next) {
+        if (c->number == chassis_number) {
+            break;
+        }
+    }
+    return c;
+}
+
+void pcie_chassis_create(uint8_t chassis_number)
+{
+    struct PCIEChassis *c;
+    c = pcie_chassis_find(chassis_number);
+    if (c) {
+        return;
+    }
+    c = qemu_mallocz(sizeof(*c));
+    c->number = chassis_number;
+    QLIST_INIT(&c->slots);
+    QLIST_INSERT_HEAD(&chassis, c, next);
+}
+
+static PCIESlot *pcie_chassis_find_slot_with_chassis(struct PCIEChassis *c,
+                                                     uint8_t slot)
+{
+    PCIESlot *s;
+    QLIST_FOREACH(s, &c->slots, next) {
+        if (s->slot == slot) {
+            break;
+        }
+    }
+    return s;
+}
+
+PCIESlot *pcie_chassis_find_slot(uint8_t chassis_number, uint16_t slot)
+{
+    struct PCIEChassis *c;
+    c = pcie_chassis_find(chassis_number);
+    if (!c) {
+        return NULL;
+    }
+    return pcie_chassis_find_slot_with_chassis(c, slot);
+}
+
+int pcie_chassis_add_slot(struct PCIESlot *slot)
+{
+    struct PCIEChassis *c;
+    c = pcie_chassis_find(slot->chassis);
+    if (!c) {
+        return -ENODEV;
+    }
+    if (pcie_chassis_find_slot_with_chassis(c, slot->slot)) {
+        return -EBUSY;
+    }
+    QLIST_INSERT_HEAD(&c->slots, slot, next);
+    return 0;
+}
+
+void pcie_chassis_del_slot(PCIESlot *s)
+{
+    QLIST_REMOVE(s, next);
+}
diff --git a/hw/pcie_port.h b/hw/pcie_port.h
new file mode 100644
index
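
The chassis helpers in hw/pcie_port.c above keep a list of chassis, each holding a list of slots. A slot can only be added to a chassis that already exists (`-ENODEV` otherwise), and a (chassis, slot) pair can only be registered once (`-EBUSY` otherwise). The sketch below reproduces that behaviour with hand-rolled singly linked lists instead of QEMU's `QLIST_*` macros. Every type and function name in it is hypothetical, and `calloc()` stands in for `qemu_mallocz()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for PCIESlot and struct PCIEChassis. */
typedef struct SlotSketch {
    uint8_t chassis;            /* chassis number this slot belongs to */
    uint16_t slot;              /* physical slot number */
    struct SlotSketch *next;
} SlotSketch;

typedef struct ChassisSketch {
    uint8_t number;
    SlotSketch *slots;
    struct ChassisSketch *next;
} ChassisSketch;

static ChassisSketch *chassis_list;

/* Returns the chassis with this number, or NULL if none was created. */
static ChassisSketch *chassis_find(uint8_t number)
{
    ChassisSketch *c;
    for (c = chassis_list; c; c = c->next) {
        if (c->number == number) {
            break;
        }
    }
    return c;
}

/* Idempotent: creating an existing chassis is a no-op. */
static void chassis_create(uint8_t number)
{
    ChassisSketch *c = chassis_find(number);
    if (c) {
        return;
    }
    c = calloc(1, sizeof(*c));
    assert(c);
    c->number = number;
    c->next = chassis_list;
    chassis_list = c;
}

static SlotSketch *chassis_find_slot(uint8_t number, uint16_t slot)
{
    ChassisSketch *c = chassis_find(number);
    SlotSketch *s;
    if (!c) {
        return NULL;
    }
    for (s = c->slots; s; s = s->next) {
        if (s->slot == slot) {
            break;
        }
    }
    return s;
}

/* 0 on success, -ENODEV for an unknown chassis, -EBUSY for a taken slot. */
static int chassis_add_slot(SlotSketch *slot)
{
    ChassisSketch *c = chassis_find(slot->chassis);
    if (!c) {
        return -ENODEV;
    }
    if (chassis_find_slot(slot->chassis, slot->slot)) {
        return -EBUSY;
    }
    slot->next = c->slots;
    c->slots = slot;
    return 0;
}
```

This ordering requirement is why the downstream port init function calls `pcie_chassis_create()` before `pcie_chassis_add_slot()`.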

[Qemu-devel] [PATCH v5 10/14] ioh3420: pcie root port in X58 ioh

2010-10-19 Thread Isaku Yamahata

Implements pcie root port switch in intel X58 ioh
whose device id is 0x3420.

Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
---
Changes v4 -> v5:
- use pci_xxx_test_and_xxx_mask()

Changes v3 -> v4:
- rename pcie_root -> ioh3420
- compilation adjustment.

Changes v2 -> v3:
- compilation adjustment.
---
 Makefile.objs |1 +
 hw/ioh3420.c  |  229 +
 hw/ioh3420.h  |   10 +++
 3 files changed, 240 insertions(+), 0 deletions(-)
 create mode 100644 hw/ioh3420.c
 create mode 100644 hw/ioh3420.h

diff --git a/Makefile.objs b/Makefile.objs
index 6c3b84a..cf7d2e9 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -140,6 +140,7 @@ hw-obj-y =
 hw-obj-y += vl.o loader.o
 hw-obj-y += virtio.o virtio-console.o
 hw-obj-y += fw_cfg.o pci.o pci_host.o pcie_host.o pci_bridge.o
+hw-obj-y += ioh3420.o
 hw-obj-y += watchdog.o
 hw-obj-$(CONFIG_ISA_MMIO) += isa_mmio.o
 hw-obj-$(CONFIG_ECC) += ecc.o
diff --git a/hw/ioh3420.c b/hw/ioh3420.c
new file mode 100644
index 000..4317ac3
--- /dev/null
+++ b/hw/ioh3420.c
@@ -0,0 +1,229 @@
+/*
+ * ioh3420.c
+ * Intel X58 north bridge IOH
+ * PCI Express root port device id 3420
+ *
+ * Copyright (c) 2010 Isaku Yamahata yamahata at valinux co jp
+ *VA Linux Systems Japan K.K.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include "pci_ids.h"
+#include "msi.h"
+#include "pcie.h"
+#include "ioh3420.h"
+
+#define PCI_DEVICE_ID_IOH_EPORT         0x3420  /* D0:F0 express mode */
+#define PCI_DEVICE_ID_IOH_REV           0x2
+#define IOH_EP_SSVID_OFFSET             0x40
+#define IOH_EP_SSVID_SVID               PCI_VENDOR_ID_INTEL
+#define IOH_EP_SSVID_SSID               0
+#define IOH_EP_MSI_OFFSET               0x60
+#define IOH_EP_MSI_SUPPORTED_FLAGS      PCI_MSI_FLAGS_MASKBIT
+#define IOH_EP_MSI_NR_VECTOR            2
+#define IOH_EP_EXP_OFFSET               0x90
+#define IOH_EP_AER_OFFSET               0x100
+
+/*
+ * If two MSI vector are allocated, Advanced Error Interrupt Message Number
+ * is 1. otherwise 0.
+ * 17.12.5.10 RPERRSTS,  32:27 bit Advanced Error Interrupt Message Number.
+ */
+static uint8_t ioh3420_aer_vector(const PCIDevice *d)
+{
+switch (msi_nr_vectors_allocated(d)) {
+case 1:
+return 0;
+case 2:
+return 1;
+case 4:
+case 8:
+case 16:
+case 32:
+default:
+break;
+}
+abort();
+return 0;
+}
+
+static void ioh3420_aer_vector_update(PCIDevice *d)
+{
+pcie_aer_root_set_vector(d, ioh3420_aer_vector(d));
+}
+
+static void ioh3420_write_config(PCIDevice *d,
+   uint32_t address, uint32_t val, int len)
+{
+uint16_t sltctl =
+pci_get_word(d->config + d->exp.exp_cap + PCI_EXP_SLTCTL);
+uint32_t uncorsta =
+pci_get_long(d->config + d->exp.aer_cap + PCI_ERR_UNCOR_STATUS);
+uint32_t root_cmd =
+pci_get_long(d->config + d->exp.aer_cap + PCI_ERR_ROOT_COMMAND);
+
+pci_bridge_write_config(d, address, val, len);
+msi_write_config(d, address, val, len);
+ioh3420_aer_vector_update(d);
+pcie_cap_slot_write_config(d, address, val, len, sltctl);
+pcie_aer_write_config(d, address, val, len, uncorsta);
+pcie_aer_root_write_config(d, address, val, len, root_cmd);
+pci_clear_written_write_config(d, address, val, len);
+}
+
+static void ioh3420_reset(DeviceState *qdev)
+{
+PCIDevice *d = DO_UPCAST(PCIDevice, qdev, qdev);
+msi_reset(d);
+ioh3420_aer_vector_update(d);
+pcie_cap_root_reset(d);
+pcie_cap_deverr_reset(d);
+pcie_cap_slot_reset(d);
+pcie_aer_root_reset(d);
+pci_bridge_reset_disable_base_limit(d);
+pci_bridge_reset(qdev);
+}
+
+static int ioh3420_initfn(PCIDevice *d)
+{
+PCIBridge* br = DO_UPCAST(PCIBridge, dev, d);
+PCIEPort *p = DO_UPCAST(PCIEPort, br, br);
+PCIESlot *s = DO_UPCAST(PCIESlot, port, p);
+int rc;
+
+rc = pci_bridge_initfn(d);
+if (rc < 0) {
+return rc;
+}
+
+d->config[PCI_REVISION_ID] = PCI_DEVICE_ID_IOH_REV;
+pcie_port_init_reg(d);
+
+pci_config_set_vendor_id(d->config, PCI_VENDOR_ID_INTEL);
+pci_config_set_device_id(d->config, PCI_DEVICE_ID_IOH_EPORT);
+
+rc = pci_bridge_ssvid_init(d, IOH_EP_SSVID_OFFSET,
+   IOH_EP_SSVID_SVID, IOH_EP_SSVID_SSID);
+if (rc < 0) {
+return rc;
+}
+rc = msi_init(d,

Re: Testing of russian keymap (was Re: [Qemu-devel] [PATCH] fix '/' and '|' on russian keymap)

2010-10-19 Thread Daniel P. Berrange

On Mon, Oct 18, 2010 at 01:59:15PM -0500, Anthony Liguori wrote:
 On 10/18/2010 12:30 PM, Oleg Sadov wrote:
 I don't understand reasons for such locale-default keyboard settings for
 qemu too, but may be it's useful for someone...

 
 -k only exists to deal with crappy VNC clients.
 
 If you use a good VNC client (like vinagre or virt-viewer) then you 
 don't have to use -k.

Indeed you must *NOT* use -k then, because that disables the extension
that vinagre/virt-viewer rely on for sane keyboard handling.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

[Qemu-devel] qemu aborts if i add a already registered device from qemu monitor ..

2010-10-19 Thread pradeep

Hi

I tried to add a device to a guest from the upstream qemu monitor using
device_add.
When I unknowingly tried to add an already registered device from the
qemu monitor, qemu aborted. I don't see a reason to kill the whole
process for this. I think abort() is a bit rough, and we need a better
way to handle it. If a user tries to add an already registered device,
qemu should tell the user that the device is already registered. An
error message would be better than aborting qemu.


QLIST_FOREACH(block, &ram_list.blocks, next) {
    if (!strcmp(block->idstr, new_block->idstr)) {
        fprintf(stderr, "RAMBlock \"%s\" already registered, abort!\n",
                new_block->idstr);
        abort();
    }
}


If I return some other value in the above code instead of calling
abort(), I would need to change the code for every device, which I don't
want to do. Is there a way to check whether a device is already
registered at the very beginning of the device_add call?
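
One way to picture the request above is a registration routine that reports a duplicate to its caller rather than calling abort(). The sketch below is a standalone illustration, not QEMU code: `Block`, `block_find` and `block_register` are hypothetical names, and a plain singly linked list stands in for `ram_list.blocks`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a RAMBlock entry; not the QEMU structure. */
typedef struct Block {
    char idstr[64];
    struct Block *next;
} Block;

/* Return the registered block with this id, or NULL if there is none. */
static Block *block_find(Block *head, const char *idstr)
{
    for (Block *b = head; b != NULL; b = b->next) {
        if (strcmp(b->idstr, idstr) == 0) {
            return b;
        }
    }
    return NULL;
}

/*
 * Link new_block into the list. A duplicate id yields -EEXIST so the
 * caller (for example a monitor command) can print an error and carry on.
 */
static int block_register(Block **head, Block *new_block)
{
    assert(head != NULL && new_block != NULL);
    if (block_find(*head, new_block->idstr) != NULL) {
        return -EEXIST;
    }
    new_block->next = *head;
    *head = new_block;
    return 0;
}
```

With this shape, a caller could run `block_find()` up front, before any device state is created, which is roughly the early check asked about above.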



Thanks
Pradeep

Re: [Qemu-devel] [PATCH] Add a DTrace tracing backend targetted for SystemTAP compatability

2010-10-19 Thread Stefan Hajnoczi

On Mon, Oct 18, 2010 at 3:04 PM, Daniel P. Berrange berra...@redhat.com wrote:
 This introduces a new tracing backend that targets the SystemTAP
 implementation of DTrace userspace tracing. The core functionality
 should be applicable and standard across any DTrace implementation
 on Solaris, OS-X, *BSD, but the Makefile rules will likely need
 some small additional changes to cope with OS specific build
 requirements.

Cool, I will try this patch out shortly.  Here are a few comments:

DTrace detection in ./configure would help users trying out
--trace-backend=dtrace on systems without SystemTAP installed.
Perhaps running dtrace(1) is a sufficient test?  If SystemTAP is not
installed then an error message from ./configure will save users time.


 This backend builds a little differently from the other tracing
 backends. Specifically there is no 'trace.c' file, because the
 'dtrace' command line tool generates a '.o' file directly from
 the dtrace probe definition file. The probe definition is usually
 named with a '.d' extension but QEMU uses '.d' files for its
 external makefile dependancy tracking, so this uses '.dtrace' as
 the extension for the probe definition file.

 The 'tracetool' program gains the ability to generate a trace.h
 file for DTrace, and also to generate the trace.d file containing
 the dtrace probe definition, and finally a qemu.stp file which is
 a wrapper around the probe definition providing more convenient
 access from SystemTAP scripts.

 eg, instead of

  probe process("qemu").mark("qemu_malloc") {
    printf("Malloc %d %p\n", $arg1, $arg2);
  }

 The addition of qemu.stp to /usr/share/systemtap/tapset/
 lets users write

  probe qemu.qemu_malloc {
    printf("Malloc %d %p\n", size, ptr);
  }

 * .gitignore: Ignore trace-dtrace.*
 * Makefile: Extra rules for generating DTrace files
 * Makefile.obj: Don't build trace.o for DTrace, use
  trace-dtrace.o generated by 'dtrace' instead
 * tracetool: Support for generating DTrace/SystemTAP
  data files

 Signed-off-by: Daniel P. Berrange berra...@redhat.com
 ---
  .gitignore    |    3 +
  Makefile      |   31 ++
  Makefile.objs |    4 +
  tracetool     |  175 
 -
  4 files changed, 211 insertions(+), 2 deletions(-)

 diff --git a/.gitignore b/.gitignore
 index a43e4d1..0d27afd 100644
 --- a/.gitignore
 +++ b/.gitignore
 @@ -4,6 +4,9 @@ config-host.*
  config-target.*
  trace.h
  trace.c
 +trace-dtrace.h
 +trace-dtrace.dtrace
 +qemu.stp
  *-timestamp
  *-softmmu
  *-darwin-user
 diff --git a/Makefile b/Makefile
 index 252c817..812b0d3 100644
 --- a/Makefile
 +++ b/Makefile
 @@ -1,6 +1,9 @@
  # Makefile for QEMU.

  GENERATED_HEADERS = config-host.h trace.h
 +ifeq ($(TRACE_BACKEND),dtrace)
 +GENERATED_HEADERS += trace-dtrace.h
 +endif

  ifneq ($(wildcard config-host.mak),)
  # Put the all: rule here so that config-host.mak can contain dependencies.
 @@ -106,7 +109,11 @@ ui/vnc.o: QEMU_CFLAGS += $(VNC_TLS_CFLAGS)

  bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)

 +ifeq ($(TRACE_BACKEND),dtrace)
 +trace.h: trace.h-timestamp trace-dtrace.h
 +else
  trace.h: trace.h-timestamp
 +endif
  trace.h-timestamp: $(SRC_PATH)/trace-events config-host.mak
        $(call quiet-command,sh $(SRC_PATH)/tracetool --$(TRACE_BACKEND) -h < $< > $@,"  GEN   trace.h")
        @cmp -s $@ trace.h || cp $@ trace.h
 @@ -118,6 +125,23 @@ trace.c-timestamp: $(SRC_PATH)/trace-events 
 config-host.mak

  trace.o: trace.c $(GENERATED_HEADERS)

 +trace-dtrace.h: trace-dtrace.dtrace
 +       $(call quiet-command,dtrace -o $@ -h -s $<, "  GEN   trace-dtrace.h")
 +
 +# Normal practice is to name DTrace probe file with a '.d' extension
 +# but that gets picked up by QEMU's Makefile as an external dependancy
 +# rule file. So we use '.dtrace' instead
 +trace-dtrace.dtrace: trace-dtrace.dtrace-timestamp
 +trace-dtrace.dtrace-timestamp: $(SRC_PATH)/trace-events config-host.mak
 +       $(call quiet-command,sh $(SRC_PATH)/tracetool --$(TRACE_BACKEND) -d < $< > $@,"  GEN   trace-dtrace.dtrace")
 +       @cmp -s $@ trace-dtrace.dtrace || cp $@ trace-dtrace.dtrace
 +ifdef CONFIG_LINUX
 +       $(call quiet-command,sh $(SRC_PATH)/tracetool --$(TRACE_BACKEND) -s < $< > qemu.stp,"  GEN   qemu.stp")
 +endif
 +
 +trace-dtrace.o: trace-dtrace.dtrace $(GENERATED_HEADERS)
 +       $(call quiet-command,dtrace -o $@ -G -s $<, "  GEN trace-dtrace.o")
 +
  simpletrace.o: simpletrace.c $(GENERATED_HEADERS)

  version.o: $(SRC_PATH)/version.rc config-host.mak
 @@ -154,6 +178,7 @@ clean:
        rm -f slirp/*.o slirp/*.d audio/*.o audio/*.d block/*.o block/*.d 
 net/*.o net/*.d fsdev/*.o fsdev/*.d ui/*.o ui/*.d
        rm -f qemu-img-cmds.h
        rm -f trace.c trace.h trace.c-timestamp trace.h-timestamp
 +       rm -f trace-dtrace.dtrace trace-dtrace.h trace-dtrace.h-timestamp 
 qemu.stp
        $(MAKE) -C tests clean
        for d in $(ALL_SUBDIRS) libhw32 libhw64 libuser libdis libdis-user; do 
 \
        if test -d $$d; then $(MAKE) -C $$d $@ || exit 1;

Re: [Qemu-devel] [PATCH 1/2] Add drive_get_by_id

2010-10-19 Thread Stefan Hajnoczi

On Mon, Oct 18, 2010 at 11:17 PM, Ryan Harper ry...@us.ibm.com wrote:
 Add a function to find a drive by id string.

 Signed-off-by: Ryan Harper ry...@us.ibm.com
 ---
  blockdev.c |   12 
  blockdev.h |    1 +
  2 files changed, 13 insertions(+), 0 deletions(-)

 diff --git a/blockdev.c b/blockdev.c
 index ff7602b..a00b3fa 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -75,6 +75,18 @@ DriveInfo *drive_get(BlockInterfaceType type, int bus, int 
 unit)
     return NULL;
  }

 +DriveInfo *drive_get_by_id(const char *id)
 +{
 +    DriveInfo *dinfo;
 +
 +    QTAILQ_FOREACH(dinfo, &drives, next) {
 +        if (strcmp(id, dinfo->id))
 +            continue;

QEMU coding style uses curly braces even for 1-line if statements:
if (strcmp(id, dinfo->id)) {
    continue;
}

Stefan

[Qemu-devel] [PATCH 1/3] qdev: make qdev_find_recursive public

2010-10-19 Thread Alon Levy

---
 hw/qdev.c |2 +-
 hw/qdev.h |1 +
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/hw/qdev.c b/hw/qdev.c
index 35858cb..d669a9d 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -477,7 +477,7 @@ static BusState *qbus_find_recursive(BusState *bus, const 
char *name,
 return NULL;
 }
 
-static DeviceState *qdev_find_recursive(BusState *bus, const char *id)
+DeviceState *qdev_find_recursive(BusState *bus, const char *id)
 {
 DeviceState *dev, *ret;
 BusState *child;
diff --git a/hw/qdev.h b/hw/qdev.h
index 579328a..214066e 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -177,6 +177,7 @@ void qbus_create_inplace(BusState *bus, BusInfo *info,
  DeviceState *parent, const char *name);
 BusState *qbus_create(BusInfo *info, DeviceState *parent, const char *name);
 void qbus_free(BusState *bus);
+DeviceState *qdev_find_recursive(BusState *bus, const char *id);
 
 #define FROM_QBUS(type, dev) DO_UPCAST(type, qbus, dev)
 
-- 
1.7.3.1

[Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v2)

2010-10-19 Thread Alon Levy

This patchset uses id like device_del for attaching/detaching usb
devices. The first two patches ready the way:
 1. makes qdev_find_recursive non static and in qdev.h
 2. adds a usb_device_by_id which goes over the usb buses calling
  qdev_find_recursive
 3. adds the commands that use usb_device_by_id
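
The lookup in steps 1 and 2 amounts to a depth-first search below each top-level bus. A minimal standalone sketch of that idea follows; `Dev`, `Bus`, `find_recursive` and `device_by_id` are hypothetical stand-ins for illustration and do not mirror the qdev structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct Bus;

/* Hypothetical device node: an optional id plus the buses it provides. */
typedef struct Dev {
    const char *id;         /* may be NULL for unnamed devices */
    struct Bus *child_bus;  /* first bus hanging off this device */
    struct Dev *sibling;    /* next device on the same bus */
} Dev;

typedef struct Bus {
    Dev *children;          /* first device on this bus */
    struct Bus *sibling;    /* next bus provided by the same parent device */
} Bus;

/* Depth-first search of one bus and everything below it. */
static Dev *find_recursive(Bus *bus, const char *id)
{
    for (Dev *dev = bus->children; dev != NULL; dev = dev->sibling) {
        if (dev->id != NULL && strcmp(dev->id, id) == 0) {
            return dev;
        }
        for (Bus *child = dev->child_bus; child != NULL; child = child->sibling) {
            Dev *ret = find_recursive(child, id);
            if (ret != NULL) {
                return ret;
            }
        }
    }
    return NULL;
}

/* Try each top-level bus in turn and return the first match. */
static Dev *device_by_id(Bus **buses, size_t nbuses, const char *id)
{
    assert(id != NULL);
    for (size_t i = 0; i < nbuses; i++) {
        Dev *dev = find_recursive(buses[i], id);
        if (dev != NULL) {
            return dev;
        }
    }
    return NULL;
}
```

Returning NULL for an unknown id leaves it to the monitor command to report "no such device" to the user.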

Alon Levy (3):
  qdev: make qdev_find_recursive public
  usb: add public usb_device_by_id
  monitor: add usb_attach and usb_detach

 hmp-commands.hx |   34 ++
 hw/qdev.c   |2 +-
 hw/qdev.h   |1 +
 hw/usb-bus.c|   16 
 hw/usb.h|1 +
 sysemu.h|2 ++
 vl.c|   31 +++
 7 files changed, 86 insertions(+), 1 deletions(-)

-- 
1.7.3.1

[Qemu-devel] [PATCH 2/3] usb: add public usb_device_by_id

2010-10-19 Thread Alon Levy

---
 hw/usb-bus.c |   16 
 hw/usb.h |1 +
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/hw/usb-bus.c b/hw/usb-bus.c
index b692503..d732bd3 100644
--- a/hw/usb-bus.c
+++ b/hw/usb-bus.c
@@ -189,6 +189,22 @@ int usb_device_detach(USBDevice *dev)
 return 0;
 }
 
+USBDevice *usb_device_by_id(const char* id)
+{
+USBBus *bus;
+DeviceState *qdev;
+USBDevice *dev;
+
+QTAILQ_FOREACH(bus, &busses, next) {
+qdev = qdev_find_recursive(&bus->qbus, id);
+if (qdev != NULL) {
+dev = DO_UPCAST(USBDevice, qdev, qdev);
+return dev;
+}
+}
+return NULL;
+}
+
 int usb_device_delete_addr(int busnr, int addr)
 {
 USBBus *bus;
diff --git a/hw/usb.h b/hw/usb.h
index 00d2802..e70fccd 100644
--- a/hw/usb.h
+++ b/hw/usb.h
@@ -317,6 +317,7 @@ void usb_unregister_port(USBBus *bus, USBPort *port);
 int usb_device_attach(USBDevice *dev);
 int usb_device_detach(USBDevice *dev);
 int usb_device_delete_addr(int busnr, int addr);
+USBDevice *usb_device_by_id(const char* id);
 
 static inline USBBus *usb_bus_from_device(USBDevice *d)
 {
-- 
1.7.3.1

[Qemu-devel] [PATCH 3/3] monitor: add usb_attach and usb_detach

2010-10-19 Thread Alon Levy

---
 hmp-commands.hx |   34 ++
 sysemu.h|2 ++
 vl.c|   31 +++
 3 files changed, 67 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 81999aa..660205c 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -517,6 +517,40 @@ command @code{info usb} to see the devices you can remove.
 ETEXI
 
 {
+.name   = "usb_attach",
+.args_type  = "id:s",
+.params = "device",
+.help   = "attach USB device 'bus.addr'",
+.mhandler.cmd = do_usb_attach,
+},
+
+STEXI
+@item usb_attach @var{devname}
+@findex usb_attach
+
+Attach the USB device @var{devname} to the QEMU virtual USB
+hub. @var{devname} has the syntax @code{bus.addr}. Use the monitor
+command @code{info usb} to see the devices you can attach.
+ETEXI
+
+{
+.name   = "usb_detach",
+.args_type  = "id:s",
+.params = "device",
+.help   = "remove USB device 'bus.addr'",
+.mhandler.cmd = do_usb_detach,
+},
+
+STEXI
+@item usb_detach @var{devname}
+@findex usb_detach
+
+Detach the USB device @var{devname} from the QEMU virtual USB
+hub. @var{devname} has the syntax @code{bus.addr}. Use the monitor
+command @code{info usb} to see the devices you can detach.
+ETEXI
+
+{
 .name   = "device_add",
 .args_type  = "device:O",
 .params = "driver[,prop=value][,...]",
diff --git a/sysemu.h b/sysemu.h
index b81a70e..1dc0e58 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -182,6 +182,8 @@ extern struct soundhw soundhw[];
 
 void do_usb_add(Monitor *mon, const QDict *qdict);
 void do_usb_del(Monitor *mon, const QDict *qdict);
+void do_usb_attach(Monitor *mon, const QDict *qdict);
+void do_usb_detach(Monitor *mon, const QDict *qdict);
 void usb_info(Monitor *mon);
 
 void rtc_change_mon_event(struct tm *tm);
diff --git a/vl.c b/vl.c
index df414ef..35db6c8 100644
--- a/vl.c
+++ b/vl.c
@@ -894,6 +894,37 @@ void do_usb_del(Monitor *mon, const QDict *qdict)
 }
 }
 
+void do_usb_attach(Monitor *mon, const QDict *qdict)
+{
+const char *id = qdict_get_str(qdict, "id");
+USBDevice *dev;
+
+dev = usb_device_by_id(id);
+
+if (dev == NULL) {
+error_report("no such USB device '%s'", id);
+return;
+}
+if (usb_device_attach(dev) < 0) {
+error_report("could not attach USB device '%s'", id);
+}
+}
+
+void do_usb_detach(Monitor *mon, const QDict *qdict)
+{
+const char *id = qdict_get_str(qdict, "id");
+USBDevice *dev;
+
+dev = usb_device_by_id(id);
+if (dev == NULL) {
+error_report("no such USB device '%s'", id);
+return;
+}
+if (usb_device_detach(dev) < 0) {
+error_report("could not detach USB device '%s'", id);
+}
+}
+
 /***/
 /* PCMCIA/Cardbus */
 
-- 
1.7.3.1

[Qemu-devel] [PATCH 10/10] Add savevm/loadvm support for MCE

2010-10-19 Thread Marcelo Tosatti

Port qemu-kvm's

commit 1bab5d11545d8de5facf46c28630085a2f9651ae
Author: Huang Ying ying.hu...@intel.com
Date:   Wed Mar 3 16:52:46 2010 +0800

Add savevm/loadvm support for MCE

MCE registers are saved/load into/from CPUState in
kvm_arch_save/load_regs. To simulate the MCG_STATUS clearing upon
reset, MSR_MCG_STATUS is set to 0 for KVM_PUT_RESET_STATE.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
---
 target-i386/kvm.c |   39 ++-
 1 files changed, 38 insertions(+), 1 deletions(-)
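
The diff that follows treats the MCE state as MCG_STATUS, MCG_CTL and four consecutive MSRs per bank starting at MSR_MC0_CTL, with the bank count taken from the low byte of MCG_CAP. A minimal standalone sketch of that enumeration, assuming the architectural MSR numbers 0x17a, 0x17b and 0x400; `mce_msr_indices` is a hypothetical helper, not a QEMU or KVM function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_MCG_STATUS 0x17a
#define MSR_MCG_CTL    0x17b
#define MSR_MC0_CTL    0x400

/*
 * Fill out[] with the MSR indices that carry MCE state for a CPU whose
 * MCG_CAP value is mcg_cap. Returns the number of entries written, or 0
 * if out[] (capacity cap) is too small to hold them all.
 */
static size_t mce_msr_indices(uint64_t mcg_cap, uint32_t *out, size_t cap)
{
    size_t banks = (size_t)(mcg_cap & 0xff);  /* low byte: number of banks */
    size_t need = 2 + banks * 4;              /* CTL, STATUS, ADDR, MISC per bank */
    size_t n = 0;

    assert(out != NULL);
    if (need > cap) {
        return 0;
    }
    out[n++] = MSR_MCG_STATUS;
    out[n++] = MSR_MCG_CTL;
    for (size_t i = 0; i < banks * 4; i++) {
        out[n++] = MSR_MC0_CTL + (uint32_t)i;
    }
    return n;
}
```

For a CPU reporting 10 banks this yields 42 indices, which gives a feel for how much of the 100-entry `entries[]` array in `kvm_put_msrs` the MCE registers can occupy.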

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 84bd400..d940175 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -777,7 +777,7 @@ static int kvm_put_msrs(CPUState *env, int level)
 struct kvm_msr_entry entries[100];
 } msr_data;
 struct kvm_msr_entry *msrs = msr_data.entries;
-int n = 0;
+int i, n = 0;
 
 kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_CS, env->sysenter_cs);
 kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_ESP, env->sysenter_esp);
@@ -797,6 +797,18 @@ static int kvm_put_msrs(CPUState *env, int level)
   env->system_time_msr);
 kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
 }
+#ifdef KVM_CAP_MCE
+if (env->mcg_cap) {
+if (level == KVM_PUT_RESET_STATE)
+kvm_msr_entry_set(&msrs[n++], MSR_MCG_STATUS, env->mcg_status);
+else if (level == KVM_PUT_FULL_STATE) {
+kvm_msr_entry_set(&msrs[n++], MSR_MCG_STATUS, env->mcg_status);
+kvm_msr_entry_set(&msrs[n++], MSR_MCG_CTL, env->mcg_ctl);
+for (i = 0; i < (env->mcg_cap & 0xff) * 4; i++)
+kvm_msr_entry_set(&msrs[n++], MSR_MC0_CTL + i, env->mce_banks[i]);
+}
+}
+#endif
 
 msr_data.info.nmsrs = n;
 
@@ -1004,6 +1016,15 @@ static int kvm_get_msrs(CPUState *env)
 msrs[n++].index = MSR_KVM_SYSTEM_TIME;
 msrs[n++].index = MSR_KVM_WALL_CLOCK;
 
+#ifdef KVM_CAP_MCE
+if (env->mcg_cap) {
+msrs[n++].index = MSR_MCG_STATUS;
+msrs[n++].index = MSR_MCG_CTL;
+for (i = 0; i < (env->mcg_cap & 0xff) * 4; i++)
+msrs[n++].index = MSR_MC0_CTL + i;
+}
+#endif
+
 msr_data.info.nmsrs = n;
 ret = kvm_vcpu_ioctl(env, KVM_GET_MSRS, &msr_data);
 if (ret < 0)
@@ -1046,6 +1067,22 @@ static int kvm_get_msrs(CPUState *env)
 case MSR_KVM_WALL_CLOCK:
 env->wall_clock_msr = msrs[i].data;
 break;
+#ifdef KVM_CAP_MCE
+case MSR_MCG_STATUS:
+env->mcg_status = msrs[i].data;
+break;
+case MSR_MCG_CTL:
+env->mcg_ctl = msrs[i].data;
+break;
+#endif
+default:
+#ifdef KVM_CAP_MCE
+if (msrs[i].index >= MSR_MC0_CTL &&
+msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {
+env->mce_banks[msrs[i].index - MSR_MC0_CTL] = msrs[i].data;
+break;
+}
+#endif
 }
 }
 
-- 
1.7.2.1

[Qemu-devel] [PATCH 09/10] MCE: Relay UCR MCE to guest

2010-10-19 Thread Marcelo Tosatti

Port qemu-kvm's

commit 4b62fff1101a7ad77553147717a8bd3bf79df7ef
Author: Huang Ying ying.hu...@intel.com
Date:   Mon Sep 21 10:43:25 2009 +0800

MCE: Relay UCR MCE to guest

UCR (uncorrected recovery) MCE is supported in recent Intel CPUs,
where some hardware error such as some memory error can be reported
without PCC (processor context corrupted). To recover from such MCE,
the corresponding memory will be unmapped, and all processes accessing
the memory will be killed via SIGBUS.

For KVM, if QEMU/KVM is killed, all guest processes will be killed
too. So we relay SIGBUS from host OS to guest system via a UCR MCE
injection. Then guest OS can isolate corresponding memory and kill
necessary guest processes only. SIGBUS sent to main thread (not VCPU
threads) will be broadcast to all VCPU threads as UCR MCE.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
---
 cpus.c|   82 --
 kvm-stub.c|5 ++
 kvm.h |3 +
 target-i386/cpu.h |   20 +-
 target-i386/helper.c  |2 +-
 target-i386/kvm.c |  178 -
 target-i386/kvm_x86.h |3 +-
 7 files changed, 279 insertions(+), 14 deletions(-)
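
On the VCPU-thread side, the diff that follows picks up SIGBUS by adding it to the blocked set that `sigtimedwait()` waits on. A minimal standalone sketch of that pattern, assuming Linux/glibc and using SIGUSR1 in place of SIGBUS so it can be run safely; `block_signal` and `wait_blocked_signal` are illustrative names, not QEMU functions:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

/* Block signo in the calling thread so that it stays pending when sent. */
static int block_signal(int signo)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/*
 * Wait up to timeout_ms for a blocked signal to become pending.
 * Returns the signal number, 0 on timeout, or -1 on any other error.
 */
static int wait_blocked_signal(int signo, int timeout_ms)
{
    sigset_t waitset;
    siginfo_t info;
    struct timespec ts;
    int r;

    assert(timeout_ms >= 0);
    sigemptyset(&waitset);
    sigaddset(&waitset, signo);

    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;  /* ms -> ns */

    do {
        r = sigtimedwait(&waitset, &info, &ts);
    } while (r == -1 && errno == EINTR);

    if (r == -1) {
        return errno == EAGAIN ? 0 : -1;          /* EAGAIN means timeout */
    }
    return info.si_signo;
}
```

The `siginfo_t` filled in by `sigtimedwait()` also carries `si_code` and `si_addr`, which are the two fields the patch forwards to `kvm_on_sigbus_vcpu()`.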

diff --git a/cpus.c b/cpus.c
index 429993a..62de0bc 100644
--- a/cpus.c
+++ b/cpus.c
@@ -34,6 +34,10 @@
 
 #include "cpus.h"
 #include "compatfd.h"
+#ifdef CONFIG_LINUX
+#include <sys/prctl.h>
+#include <sys/signalfd.h>
+#endif
 
 #ifdef SIGRTMIN
 #define SIG_IPI (SIGRTMIN+4)
@@ -41,6 +45,10 @@
 #define SIG_IPI SIGUSR1
 #endif
 
+#ifndef PR_MCE_KILL
+#define PR_MCE_KILL 33
+#endif
+
 static CPUState *next_cpu;
 
 /***/
@@ -498,28 +506,77 @@ static void qemu_tcg_wait_io_event(void)
 }
 }
 
+static void sigbus_reraise(void)
+{
+sigset_t set;
+struct sigaction action;
+
+memset(&action, 0, sizeof(action));
+action.sa_handler = SIG_DFL;
+if (!sigaction(SIGBUS, &action, NULL)) {
+raise(SIGBUS);
+sigemptyset(&set);
+sigaddset(&set, SIGBUS);
+sigprocmask(SIG_UNBLOCK, &set, NULL);
+}
+perror("Failed to re-raise SIGBUS!\n");
+abort();
+}
+
+static void sigbus_handler(int n, struct qemu_signalfd_siginfo *siginfo,
+   void *ctx)
+{
+#if defined(TARGET_I386)
+if (kvm_on_sigbus(siginfo->ssi_code, (void *)(intptr_t)siginfo->ssi_addr))
+#endif
+sigbus_reraise();
+}
+
 static void qemu_kvm_eat_signal(CPUState *env, int timeout)
 {
 struct timespec ts;
 int r, e;
 siginfo_t siginfo;
 sigset_t waitset;
+sigset_t chkset;
 
 ts.tv_sec = timeout / 1000;
 ts.tv_nsec = (timeout % 1000) * 1000000;
 
 sigemptyset(&waitset);
 sigaddset(&waitset, SIG_IPI);
+sigaddset(&waitset, SIGBUS);
 
-qemu_mutex_unlock(&qemu_global_mutex);
-r = sigtimedwait(&waitset, &siginfo, &ts);
-e = errno;
-qemu_mutex_lock(&qemu_global_mutex);
+do {
+qemu_mutex_unlock(&qemu_global_mutex);
 
-if (r == -1 && !(e == EAGAIN || e == EINTR)) {
-fprintf(stderr, "sigtimedwait: %s\n", strerror(e));
-exit(1);
-}
+r = sigtimedwait(&waitset, &siginfo, &ts);
+e = errno;
+
+qemu_mutex_lock(&qemu_global_mutex);
+
+if (r == -1 && !(e == EAGAIN || e == EINTR)) {
+fprintf(stderr, "sigtimedwait: %s\n", strerror(e));
+exit(1);
+}
+
+switch (r) {
+case SIGBUS:
+#ifdef TARGET_I386
+if (kvm_on_sigbus_vcpu(env, siginfo.si_code, siginfo.si_addr))
+#endif
+sigbus_reraise();
+break;
+default:
+break;
+}
+
+r = sigpending(&chkset);
+if (r == -1) {
+fprintf(stderr, "sigpending: %s\n", strerror(e));
+exit(1);
+}
+} while (sigismember(&chkset, SIG_IPI) || sigismember(&chkset, SIGBUS));
 }
 
 static void qemu_kvm_wait_io_event(CPUState *env)
@@ -645,6 +702,7 @@ static void kvm_init_ipi(CPUState *env)
 
 pthread_sigmask(SIG_BLOCK, NULL, &set);
 sigdelset(&set, SIG_IPI);
+sigdelset(&set, SIGBUS);
 r = kvm_set_signal_mask(env, &set);
 if (r) {
 fprintf(stderr, "kvm_set_signal_mask: %s\n", strerror(r));
@@ -655,6 +713,7 @@ static void kvm_init_ipi(CPUState *env)
 static sigset_t block_io_signals(void)
 {
 sigset_t set;
+struct sigaction action;
 
 /* SIGUSR2 used by posix-aio-compat.c */
 sigemptyset(&set);
@@ -665,8 +724,15 @@ static sigset_t block_io_signals(void)
 sigaddset(&set, SIGIO);
 sigaddset(&set, SIGALRM);
 sigaddset(&set, SIG_IPI);
+sigaddset(&set, SIGBUS);
 pthread_sigmask(SIG_BLOCK, &set, NULL);
 
+memset(&action, 0, sizeof(action));
+action.sa_flags = SA_SIGINFO;
+action.sa_sigaction = (void (*)(int, siginfo_t*, void*))sigbus_handler;
+sigaction(SIGBUS, &action, NULL);
+prctl(PR_MCE_KILL, 1, 1, 0, 0);
+
 return set;

[Qemu-devel] Re: [PATCH] Implement a virtio GPU transport

2010-10-19 Thread Ian Molton


On 10/10/10 16:11, Avi Kivity wrote:

On 10/06/2010 05:59 PM, Ian Molton wrote:

This patch implements a virtio-based transport for use by a
virtualised OpenGL passthrough implementation.

The libGL and qemu-gl code to support this patch are available here:

http://gitorious.org/vm-gl-accel/qemu-gl
http://gitorious.org/vm-gl-accel/qemu-libgl


Comments please!


1. copy qemu-devel


Ok, will do.


and virtualization@, many virtio developers live there.


you mean virtualizat...@lists.osdl.org ?


2. should start with a patch to the virtio-pci spec to document what
you're doing


Where can I find that spec?


+ /* Transfer data */
+ if (virtqueue_add_buf(vq, sg_list, o_page, i_page, (void *)1) >= 0) {
+ virtqueue_kick(vq);
+ /* Chill out until it's done with the buffer. */
+ while (!virtqueue_get_buf(vq, &count))
+ cpu_relax();
+ }
+


This is pretty gross, and will burn lots of cpu if the hypervisor
processes the queue asynchronously.


It doesn't, at present... It could be changed fairly easily without
breaking anything if that happens though.


-Ian

Re: [Qemu-devel] [PATCH][block] qcow2: Support exact L1 table growth

2010-10-19 Thread Kevin Wolf

Am 18.10.2010 17:53, schrieb Stefan Hajnoczi:
 The L1 table grow operation includes a size calculation that bumps up
 the new L1 table size in order to anticipate the size needs of vmstate
 data.  This helps reduce the number of times that the L1 table has to be
 grown when vmstate data is appended.
 
 This size overhead is not necessary during image creation,
 bdrv_truncate(), or snapshot goto operations.  In fact, existing
 qemu-iotests that exercise table growth are no longer able to trigger it
 because image creation preallocates an L1 table that is too large after
 changes to qcow_create2().
 
 This patch keeps the size calculation but also adds exact growth for
 callers that do not want to inflate the L1 table size unnecessarily.
 
 Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
 ---
  block/qcow2-cluster.c  |   25 -
  block/qcow2-snapshot.c |2 +-
  block/qcow2.c  |2 +-
  block/qcow2.h  |2 +-
  4 files changed, 19 insertions(+), 12 deletions(-)
 
 Hi Kevin,
 This patch fixes the qcow_create2() issue seen in qemu-iotests 026 with your
 kevin.git/block branch.  The issue was that the L1 table size of new images is
 inflated by qcow2_grow_l1_table().  This caused the differences in the test,
 e.g. L1 table grow tests no longer worked because they couldn't force the
 table to grow (it was already more than large enough).
 
 If we use exact L1 growth in bdrv_truncate() then less image space is wasted
 and the test passes again without changes to 026.out.
 
 I think this patch is the way to go, not just to satisfy the test, but also
 because we don't need to overallocate L1 tables to start with.
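
The two growth modes described above can be sketched as a single sizing function. This is an illustration under stated assumptions rather than the qcow2 code: `l1_new_size` is a hypothetical name, sizes are counted in L1 entries, and the roughly 1.5x growth step used for the padded case is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Pick a new L1 table size (in entries) that holds at least min_size.
 * exact == true : allocate exactly what was asked for.
 * exact == false: grow geometrically to leave headroom for later appends.
 */
static uint64_t l1_new_size(uint64_t cur_size, uint64_t min_size, bool exact)
{
    uint64_t new_size;

    if (min_size <= cur_size) {
        return cur_size;            /* already large enough, nothing to do */
    }
    if (exact) {
        return min_size;
    }
    new_size = cur_size ? cur_size : 1;
    while (new_size < min_size) {
        new_size = (new_size * 3 + 1) / 2;   /* assumed ~1.5x step */
    }
    assert(new_size >= min_size);
    return new_size;
}
```

With a current size of 4 entries and a request for 5, the exact mode returns 5 while the padded mode returns 6, which is the kind of overshoot that kept the L1 grow tests from triggering.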

Good that you took a look at it, I hadn't even thought of changing the
qcow2 code. I agree that this makes sense even independent of qemu-iotests.

The patch looks good to me, too.

Kevin

[Qemu-devel] [PATCH 05/10] Expose thread_id in info cpus

2010-10-19 Thread Marcelo Tosatti

commit ce6325ff1af34dbaee91c8d28e792277e43f1227
Author: Glauber Costa gco...@redhat.com
Date:   Wed Mar 5 17:01:10 2008 -0300

Augment info cpus

This patch exposes the thread id associated with each
cpu through the already well known 'info cpus' interface.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
---
 cpu-defs.h |1 +
 cpus.c |5 +
 exec.c |1 +
 monitor.c  |4 
 osdep.c|   15 +++
 osdep.h|1 +
 6 files changed, 27 insertions(+), 0 deletions(-)
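
For readers who want to try the thread id lookup outside QEMU, here is a minimal standalone sketch of the same idea, assuming Linux where `syscall(SYS_gettid)` is available and falling back to `getpid()` elsewhere; `current_thread_id` is an illustrative name, not the function added by this patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>
#ifdef __linux__
#include <sys/syscall.h>
#endif

/* Return a kernel-visible id for the calling thread. */
static long current_thread_id(void)
{
#ifdef __linux__
    long tid = syscall(SYS_gettid);   /* per-thread id, as shown by top -H */
    assert(tid > 0);
    return tid;
#else
    return (long)getpid();            /* coarse fallback: process id */
#endif
}
```

In the main thread of a Linux process the value equals `getpid()`; other threads get distinct ids, which is what makes the number useful for pinning or inspecting individual VCPU threads.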

diff --git a/cpu-defs.h b/cpu-defs.h
index 8d4bf86..eaed43e 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -197,6 +197,7 @@ typedef struct CPUWatchpoint {
 int nr_cores;  /* number of cores within this CPU package */\
 int nr_threads;/* number of threads within this CPU */  \
 int running; /* Nonzero if cpu is currently running(usermode).  */  \
+int thread_id;  \
 /* user data */ \
 void *opaque;   \
 \
diff --git a/cpus.c b/cpus.c
index 3875657..429993a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -539,6 +539,7 @@ static void *kvm_cpu_thread_fn(void *arg)
 
 qemu_mutex_lock(&qemu_global_mutex);
 qemu_thread_self(env->thread);
+env->thread_id = get_thread_id();
 if (kvm_enabled())
 kvm_init_vcpu(env);
 
@@ -578,6 +579,10 @@ static void *tcg_cpu_thread_fn(void *arg)
 while (!qemu_system_ready)
 qemu_cond_timedwait(&qemu_system_cond, &qemu_global_mutex, 100);
 
+for (env = first_cpu; env != NULL; env = env->next_cpu) {
+env->thread_id = get_thread_id();
+}
+
 while (1) {
 cpu_exec_all();
 qemu_tcg_wait_io_event();
diff --git a/exec.c b/exec.c
index 1fbe91c..c09051d 100644
--- a/exec.c
+++ b/exec.c
@@ -637,6 +637,7 @@ void cpu_exec_init(CPUState *env)
 env->numa_node = 0;
 QTAILQ_INIT(&env->breakpoints);
 QTAILQ_INIT(&env->watchpoints);
+env->thread_id = get_thread_id();
 *penv = env;
 #if defined(CONFIG_USER_ONLY)
 cpu_list_unlock();
diff --git a/monitor.c b/monitor.c
index 260cc02..709d0fd 100644
--- a/monitor.c
+++ b/monitor.c
@@ -849,6 +849,9 @@ static void print_cpu_iter(QObject *obj, void *opaque)
 monitor_printf(mon, " (halted)");
 }
 
+monitor_printf(mon, " thread_id=%" PRId64 " ",
+   qdict_get_int(cpu, "thread_id"));
+
 monitor_printf(mon, "\n");
 }
 
@@ -893,6 +896,7 @@ static void do_info_cpus(Monitor *mon, QObject **ret_data)
 #elif defined(TARGET_MIPS)
 qdict_put(cpu, "PC", qint_from_int(env->active_tc.PC));
 #endif
+qdict_put(cpu, "thread_id", qint_from_int(env->thread_id));
 
 qlist_append(cpu_list, cpu);
 }
diff --git a/osdep.c b/osdep.c
index 2e05b21..dda0f43 100644
--- a/osdep.c
+++ b/osdep.c
@@ -44,6 +44,10 @@
 extern int madvise(caddr_t, size_t, int);
 #endif
 
+#ifdef CONFIG_LINUX
+#include <sys/syscall.h>
+#endif
+
 #ifdef CONFIG_EVENTFD
 #include <sys/eventfd.h>
 #endif
@@ -200,6 +204,17 @@ int qemu_create_pidfile(const char *filename)
 return 0;
 }
 
+int get_thread_id(void)
+{
+#if defined (_WIN32)
+return GetCurrentThreadId();
+#elif defined (__linux__)
+return syscall(SYS_gettid);
+#else
+return getpid();
+#endif
+}
+
 #ifdef _WIN32
 
 /* mingw32 needs ffs for compilations without optimization. */
diff --git a/osdep.h b/osdep.h
index 6716281..9b3bc2e 100644
--- a/osdep.h
+++ b/osdep.h
@@ -126,6 +126,7 @@ void qemu_vfree(void *ptr);
 int qemu_madvise(void *addr, size_t len, int advice);
 
 int qemu_create_pidfile(const char *filename);
+int get_thread_id(void);
 
 #ifdef _WIN32
 int ffs(int i);
-- 
1.7.2.1

[Qemu-devel] [PATCH 04/10] iothread: use signalfd

2010-10-19 Thread Marcelo Tosatti

Block SIGALRM, SIGIO and consume them via signalfd.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
---
 cpus.c |   74 +++
 1 files changed, 69 insertions(+), 5 deletions(-)

diff --git a/cpus.c b/cpus.c
index b09f5e3..3875657 100644
--- a/cpus.c
+++ b/cpus.c
@@ -33,6 +33,7 @@
 #include "exec-all.h"
 
 #include "cpus.h"
+#include "compatfd.h"
 
 #ifdef SIGRTMIN
 #define SIG_IPI (SIGRTMIN+4)
@@ -329,14 +330,75 @@ static QemuCond qemu_work_cond;
 
 static void tcg_init_ipi(void);
 static void kvm_init_ipi(CPUState *env);
-static void unblock_io_signals(void);
+static sigset_t block_io_signals(void);
+
+/* If we have signalfd, we mask out the signals we want to handle and then
+ * use signalfd to listen for them.  We rely on whatever the current signal
+ * handler is to dispatch the signals when we receive them.
+ */
+static void sigfd_handler(void *opaque)
+{
+int fd = (unsigned long) opaque;
+struct qemu_signalfd_siginfo info;
+struct sigaction action;
+ssize_t len;
+
+while (1) {
+do {
+len = read(fd, &info, sizeof(info));
+} while (len == -1 && errno == EINTR);
+
+if (len == -1 && errno == EAGAIN) {
+break;
+}
+
+if (len != sizeof(info)) {
+printf("read from sigfd returned %zd: %m\n", len);
+return;
+}
+
+sigaction(info.ssi_signo, NULL, &action);
+if ((action.sa_flags & SA_SIGINFO) && action.sa_sigaction) {
+action.sa_sigaction(info.ssi_signo,
+(siginfo_t *)&info, NULL);
+} else if (action.sa_handler) {
+action.sa_handler(info.ssi_signo);
+}
+}
+}
+
+static int qemu_signalfd_init(sigset_t mask)
+{
+int sigfd;
+
+sigfd = qemu_signalfd(&mask);
+if (sigfd == -1) {
+fprintf(stderr, "failed to create signalfd\n");
+return -errno;
+}
+
+fcntl_setfl(sigfd, O_NONBLOCK);
+
+qemu_set_fd_handler2(sigfd, NULL, sigfd_handler, NULL,
+ (void *)(unsigned long) sigfd);
+
+return 0;
+}
 
 int qemu_init_main_loop(void)
 {
 int ret;
+sigset_t blocked_signals;
 
 cpu_set_debug_excp_handler(cpu_debug_handler);
 
+blocked_signals = block_io_signals();
+
+ret = qemu_signalfd_init(blocked_signals);
+if (ret)
+return ret;
+
+/* Note eventfd must be drained before signalfd handlers run */
 ret = qemu_event_init();
 if (ret)
 return ret;
@@ -347,7 +409,6 @@ int qemu_init_main_loop(void)
 qemu_mutex_init(&qemu_global_mutex);
 qemu_mutex_lock(&qemu_global_mutex);
 
-unblock_io_signals();
 qemu_thread_self(&io_thread);
 
 return 0;
@@ -586,19 +647,22 @@ static void kvm_init_ipi(CPUState *env)
 }
 }
 
-static void unblock_io_signals(void)
+static sigset_t block_io_signals(void)
 {
 sigset_t set;
 
+/* SIGUSR2 used by posix-aio-compat.c */
 sigemptyset(&set);
 sigaddset(&set, SIGUSR2);
-sigaddset(&set, SIGIO);
-sigaddset(&set, SIGALRM);
 pthread_sigmask(SIG_UNBLOCK, &set, NULL);
 
 sigemptyset(&set);
+sigaddset(&set, SIGIO);
+sigaddset(&set, SIGALRM);
 sigaddset(&set, SIG_IPI);
 pthread_sigmask(SIG_BLOCK, &set, NULL);
+
+return set;
 }
 
 void qemu_mutex_lock_iothread(void)
-- 
1.7.2.1

[Qemu-devel] [PATCH 00/10] [PULL] qemu-kvm.git uq/master queue

2010-10-19 Thread Marcelo Tosatti

The following changes since commit 38cc9b607f85017b095793cab6c129bc9844f441:

  issue snd_pcm_start() when capturing audio (2010-10-18 00:39:06 +0400)

are available in the git repository at:
  git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master

Huang Ying (1):
  Add RAM -> physical addr mapping in MCE simulation

Joerg Roedel (2):
  Set cpuid definition to 0 before initializing it
  Add svm cpuid features

Marcelo Tosatti (7):
  signalfd compatibility
  iothread: use signalfd
  Expose thread_id in info cpus
  kvm: x86: add mce support
  Export qemu_ram_addr_from_host
  MCE: Relay UCR MCE to guest
  Add savevm/loadvm support for MCE

 Makefile.objs |1 +
 compatfd.c|  117 +++
 compatfd.h|   43 +++
 configure |   18 +++
 cpu-common.h  |3 +-
 cpu-defs.h|1 +
 cpus.c|  161 --
 exec-all.h|2 +-
 exec.c|   27 +++--
 kvm-all.c |   18 +++
 kvm-stub.c|5 +
 kvm.h |6 +
 monitor.c |4 +
 osdep.c   |   15 +++
 osdep.h   |1 +
 target-i386/cpu.h |   32 +-
 target-i386/cpuid.c   |   79 ++---
 target-i386/helper.c  |6 +
 target-i386/kvm.c |  300 -
 target-i386/kvm_x86.h |   22 
 20 files changed, 817 insertions(+), 44 deletions(-)
 create mode 100644 compatfd.c
 create mode 100644 compatfd.h
 create mode 100644 target-i386/kvm_x86.h

[Qemu-devel] Re: [PATCH] Implement a virtio GPU transport

2010-10-19 Thread Avi Kivity


 On 10/19/2010 12:31 PM, Ian Molton wrote:



an virtualization@, many virtio developers live there.


you mean virtualizat...@lists.osdl.org ?


Yes.




2. should start with a patch to the virtio-pci spec to document what
you're doing


Where can I find that spec?


http://ozlabs.org/~rusty/virtio-spec/




+ /* Transfer data */
+ if (virtqueue_add_buf(vq, sg_list, o_page, i_page, (void *)1) >= 0) {
+ virtqueue_kick(vq);
+ /* Chill out until it's done with the buffer. */
+ while (!virtqueue_get_buf(vq, &count))
+ cpu_relax();
+ }
+


This is pretty gross, and will burn lots of cpu if the hypervisor
processes the queue asynchronously.


It doesn't, at present... It could be changed fairly easily without 
breaking anything if that happens though.


The hypervisor and the guest can be changed independently.  The driver 
should be coded so that it doesn't depend on hypervisor implementation 
details.
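
One way to avoid depending on the hypervisor completing synchronously is to spin only for a bounded number of polls and then fall back to a blocking wait. The sketch below is a toy model in plain C, not virtio code: poll_fn, spin_then_sleep() and the fake device are hypothetical names, and a real driver would sleep on the virtqueue callback rather than return a status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical poll callback: returns true once the buffer has come back. */
typedef bool (*poll_fn)(void *ctx);

enum wait_result { WAIT_DONE, WAIT_SHOULD_SLEEP };

/* Spin for at most `spin_budget` polls, then tell the caller to block
 * (e.g. on an interrupt-driven wait queue) instead of burning CPU. */
static enum wait_result spin_then_sleep(poll_fn poll, void *ctx,
                                        unsigned spin_budget)
{
    assert(poll != NULL);
    for (unsigned i = 0; i < spin_budget; i++) {
        if (poll(ctx)) {
            return WAIT_DONE;
        }
    }
    return WAIT_SHOULD_SLEEP;
}

/* Toy stand-in for a device that completes after a fixed number of polls. */
struct fake_device {
    unsigned polls_until_done;
};

static bool fake_poll(void *ctx)
{
    struct fake_device *dev = ctx;

    if (dev->polls_until_done == 0) {
        return true;
    }
    dev->polls_until_done--;
    return false;
}
```

With this shape the driver behaves the same whether the host processes the queue synchronously (the spin succeeds at once) or asynchronously (the caller sleeps).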


--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH 06/10] kvm: x86: add mce support

2010-10-19 Thread Marcelo Tosatti

Port qemu-kvm's MCE support

commit c68b2374c9048812f488e00ffb95db66c0bc07a7
Author: Huang Ying ying.hu...@intel.com
Date:   Mon Jul 20 10:00:53 2009 +0800

Add MCE simulation support to qemu/kvm

KVM ioctls are used to initialize MCE simulation and inject MCE. The
real MCE simulation is implemented in Linux kernel. The Kernel part
has been merged.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
---
 target-i386/helper.c  |6 +++
 target-i386/kvm.c |   84 +
 target-i386/kvm_x86.h |   21 
 3 files changed, 111 insertions(+), 0 deletions(-)
 create mode 100644 target-i386/kvm_x86.h

diff --git a/target-i386/helper.c b/target-i386/helper.c
index e134340..4b430dd 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -27,6 +27,7 @@
 #include "exec-all.h"
 #include "qemu-common.h"
 #include "kvm.h"
+#include "kvm_x86.h"
 
 //#define DEBUG_MMU
 
@@ -1030,6 +1031,11 @@ void cpu_inject_x86_mce(CPUState *cenv, int bank, 
uint64_t status,
     if (bank >= bank_num || !(status & MCI_STATUS_VAL))
         return;
 
+    if (kvm_enabled()) {
+        kvm_inject_x86_mce(cenv, bank, status, mcg_status, addr, misc);
+        return;
+    }
+
 /*
  * if MSR_MCG_CTL is not all 1s, the uncorrected error
  * reporting is disabled
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 74e7b4f..343fb02 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -27,6 +27,7 @@
 #include "hw/pc.h"
 #include "hw/apic.h"
 #include "ioport.h"
+#include "kvm_x86.h"
 
 #ifdef CONFIG_KVM_PARA
 #include <linux/kvm_para.h>
@@ -167,6 +168,67 @@ static int get_para_features(CPUState *env)
 }
 #endif
 
+#ifdef KVM_CAP_MCE
+static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
+                                     int *max_banks)
+{
+    int r;
+
+    r = kvm_ioctl(s, KVM_CHECK_EXTENSION, KVM_CAP_MCE);
+    if (r > 0) {
+        *max_banks = r;
+        return kvm_ioctl(s, KVM_X86_GET_MCE_CAP_SUPPORTED, mce_cap);
+    }
+    return -ENOSYS;
+}
+
+static int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap)
+{
+    return kvm_vcpu_ioctl(env, KVM_X86_SETUP_MCE, mcg_cap);
+}
+
+static int kvm_set_mce(CPUState *env, struct kvm_x86_mce *m)
+{
+    return kvm_vcpu_ioctl(env, KVM_X86_SET_MCE, m);
+}
+
+struct kvm_x86_mce_data
+{
+    CPUState *env;
+    struct kvm_x86_mce *mce;
+};
+
+static void kvm_do_inject_x86_mce(void *_data)
+{
+    struct kvm_x86_mce_data *data = _data;
+    int r;
+
+    r = kvm_set_mce(data->env, data->mce);
+    if (r < 0)
+        perror("kvm_set_mce FAILED");
+}
+#endif
+
+void kvm_inject_x86_mce(CPUState *cenv, int bank, uint64_t status,
+                        uint64_t mcg_status, uint64_t addr, uint64_t misc)
+{
+#ifdef KVM_CAP_MCE
+    struct kvm_x86_mce mce = {
+        .bank = bank,
+        .status = status,
+        .mcg_status = mcg_status,
+        .addr = addr,
+        .misc = misc,
+    };
+    struct kvm_x86_mce_data data = {
+            .env = cenv,
+            .mce = &mce,
+    };
+
+    run_on_cpu(cenv, kvm_do_inject_x86_mce, &data);
+#endif
+}
+
 int kvm_arch_init_vcpu(CPUState *env)
 {
 struct {
@@ -277,6 +339,28 @@ int kvm_arch_init_vcpu(CPUState *env)
 
 cpuid_data.cpuid.nent = cpuid_i;
 
+#ifdef KVM_CAP_MCE
+    if (((env->cpuid_version >> 8)&0xF) >= 6
+        && (env->cpuid_features&(CPUID_MCE|CPUID_MCA)) == (CPUID_MCE|CPUID_MCA)
+        && kvm_check_extension(env->kvm_state, KVM_CAP_MCE) > 0) {
+        uint64_t mcg_cap;
+        int banks;
+
+        if (kvm_get_mce_cap_supported(env->kvm_state, &mcg_cap, &banks))
+            perror("kvm_get_mce_cap_supported FAILED");
+        else {
+            if (banks > MCE_BANKS_DEF)
+                banks = MCE_BANKS_DEF;
+            mcg_cap &= MCE_CAP_DEF;
+            mcg_cap |= banks;
+            if (kvm_setup_mce(env, &mcg_cap))
+                perror("kvm_setup_mce FAILED");
+            else
+                env->mcg_cap = mcg_cap;
+        }
+    }
+#endif
+
     return kvm_vcpu_ioctl(env, KVM_SET_CPUID2, &cpuid_data);
 }
 
diff --git a/target-i386/kvm_x86.h b/target-i386/kvm_x86.h
new file mode 100644
index 000..c1ebd24
--- /dev/null
+++ b/target-i386/kvm_x86.h
@@ -0,0 +1,21 @@
+/*
+ * QEMU KVM support
+ *
+ * Copyright (C) 2009 Red Hat Inc.
+ * Copyright IBM, Corp. 2008
+ *
+ * Authors:
+ *  Anthony Liguori   aligu...@us.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef __KVM_X86_H__
+#define __KVM_X86_H__
+
+void kvm_inject_x86_mce(CPUState *cenv, int bank, uint64_t status,
+uint64_t mcg_status, uint64_t addr, uint64_t misc);
+
+#endif
-- 
1.7.2.1

Re: [Qemu-devel] [PATCH 2/2] Fix Block Hotplug race with drive_unplug()

2010-10-19 Thread Stefan Hajnoczi

On Mon, Oct 18, 2010 at 11:17 PM, Ryan Harper ry...@us.ibm.com wrote:
 Block hot unplug is racy since the guest is required to acknowledge the ACPI
 unplug event; this may not happen synchronously with the device removal
 command.

 This series aims to close a gap whereby mgmt applications that assume the
 block resource has been removed, without confirming that the guest has
 acknowledged the removal, may re-assign the underlying device to a second
 guest, leading to data leakage.

 This series introduces a new monitor command to decouple asynchronous device
 removal from restricting guest access to a block device.  We do this by
 creating a new monitor command drive_unplug which maps to a bdrv_unplug()
 command which does a bdrv_flush() and bdrv_close().  Once complete,
 subsequent IO is rejected from the device and the guest will get IO errors
 but continue to function.

 A subsequent device removal command can be issued to remove the device, to
 which the guest may or may not respond, but as long as the unplugged bit is
 set, no IO will be submitted.

 Signed-off-by: Ryan Harper ry...@us.ibm.com
 ---
  block.c         |    6 ++
  block.h         |    1 +
  blockdev.c      |   26 ++
  blockdev.h      |    1 +
  hmp-commands.hx |   15 +++
  5 files changed, 49 insertions(+), 0 deletions(-)

 diff --git a/block.c b/block.c
 index a19374d..9fedb27 100644
 --- a/block.c
 +++ b/block.c
 @@ -1328,6 +1328,12 @@ void bdrv_set_removable(BlockDriverState *bs, int 
 removable)
     }
  }

 +void bdrv_unplug(BlockDriverState *bs)
 +{
 +    bdrv_flush(bs);
 +    bdrv_close(bs);

bdrv_flush() does not wait for pending aio requests to complete.
bdrv_close() does not wait either.

A VM with a qcow2 image file and pending aio requests could
bdrv_unplug() and free the qcow2 state before aio completions occur.
If a completion is handled after bdrv_close(), the qcow2 in-memory
state has been freed and we get memory corruption or a crash.

I think the solution is to use qemu_aio_flush() before bdrv_flush().
It waits until all pending aio requests have been completed.
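
The ordering concern can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions: struct toy_drive and the toy_* functions are invented for illustration and do not correspond to QEMU's block layer; "late completions" stands in for completions that would touch freed qcow2 state.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a drive with in-flight asynchronous requests. */
struct toy_drive {
    int  pending_aio;       /* requests submitted but not yet completed */
    bool state_freed;       /* set once the format driver state is gone */
    int  late_completions;  /* completions that arrived after the free  */
};

/* Complete one request; after the free this would be a use-after-free. */
static void toy_complete_one(struct toy_drive *d)
{
    assert(d->pending_aio > 0);
    if (d->state_freed) {
        d->late_completions++;
    }
    d->pending_aio--;
}

/* Wait for every in-flight request (the role played by an aio drain). */
static void toy_drain(struct toy_drive *d)
{
    while (d->pending_aio > 0) {
        toy_complete_one(d);
    }
}

/* Release the driver state (the role played by close). */
static void toy_close(struct toy_drive *d)
{
    d->state_freed = true;
}

/* Safe ordering: drain first, only then release driver state. */
static void toy_unplug(struct toy_drive *d)
{
    toy_drain(d);
    toy_close(d);
}
```

Closing before draining in this model leaves every pending request to complete against freed state, which is the failure described above.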

Stefan

[Qemu-devel] [PATCH 07/10] Export qemu_ram_addr_from_host

2010-10-19 Thread Marcelo Tosatti

To be used by next patches.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
---
 cpu-common.h |3 ++-
 exec-all.h   |2 +-
 exec.c   |   26 +-
 3 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index 0426bc8..a543b5d 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -47,7 +47,8 @@ void qemu_ram_free(ram_addr_t addr);
 /* This should only be used for ram local to a device.  */
 void *qemu_get_ram_ptr(ram_addr_t addr);
 /* This should not be used by devices.  */
-ram_addr_t qemu_ram_addr_from_host(void *ptr);
+int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
+ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr);
 
 int cpu_register_io_memory(CPUReadMemoryFunc * const *mem_read,
CPUWriteMemoryFunc * const *mem_write,
diff --git a/exec-all.h b/exec-all.h
index 3a53fe6..c457058 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -334,7 +334,7 @@ static inline tb_page_addr_t get_page_addr_code(CPUState 
*env1, target_ulong add
 }
 p = (void *)(unsigned long)addr
         + env1->tlb_table[mmu_idx][page_index].addend;
-return qemu_ram_addr_from_host(p);
+return qemu_ram_addr_from_host_nofail(p);
 }
 #endif
 
diff --git a/exec.c b/exec.c
index c09051d..9991203 100644
--- a/exec.c
+++ b/exec.c
@@ -2086,7 +2086,7 @@ static inline void tlb_update_dirty(CPUTLBEntry 
*tlb_entry)
     if ((tlb_entry->addr_write & ~TARGET_PAGE_MASK) == IO_MEM_RAM) {
         p = (void *)(unsigned long)((tlb_entry->addr_write & TARGET_PAGE_MASK)
             + tlb_entry->addend);
-        ram_addr = qemu_ram_addr_from_host(p);
+        ram_addr = qemu_ram_addr_from_host_nofail(p);
         if (!cpu_physical_memory_is_dirty(ram_addr)) {
             tlb_entry->addr_write |= TLB_NOTDIRTY;
 }
@@ -2939,23 +2939,31 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 return NULL;
 }
 
-/* Some of the softmmu routines need to translate from a host pointer
-   (typically a TLB entry) back to a ram offset.  */
-ram_addr_t qemu_ram_addr_from_host(void *ptr)
+int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
 {
     RAMBlock *block;
     uint8_t *host = ptr;
 
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         if (host - block->host < block->length) {
-            return block->offset + (host - block->host);
+            *ram_addr = block->offset + (host - block->host);
+            return 0;
         }
     }
+    return -1;
+}
 
-    fprintf(stderr, "Bad ram pointer %p\n", ptr);
-    abort();
+/* Some of the softmmu routines need to translate from a host pointer
+   (typically a TLB entry) back to a ram offset.  */
+ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
+{
+    ram_addr_t ram_addr;
 
-    return 0;
+    if (qemu_ram_addr_from_host(ptr, &ram_addr)) {
+        fprintf(stderr, "Bad ram pointer %p\n", ptr);
+        abort();
+    }
+    return ram_addr;
 }
 
 static uint32_t unassigned_mem_readb(void *opaque, target_phys_addr_t addr)
@@ -3704,7 +3712,7 @@ void cpu_physical_memory_unmap(void *buffer, 
target_phys_addr_t len,
 {
 if (buffer != bounce.buffer) {
 if (is_write) {
-ram_addr_t addr1 = qemu_ram_addr_from_host(buffer);
+ram_addr_t addr1 = qemu_ram_addr_from_host_nofail(buffer);
 while (access_len) {
 unsigned l;
 l = TARGET_PAGE_SIZE;
-- 
1.7.2.1
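
The split in the patch above (a lookup that reports failure, plus a "nofail" wrapper that aborts) can be sketched outside QEMU as follows. This is an illustration only: struct toy_block and the toy_* functions are hypothetical and take the block table as a parameter instead of walking ram_list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy RAM block: a host buffer mapped at `offset` in a flat RAM space. */
struct toy_block {
    uint8_t *host;
    size_t   length;
    uint64_t offset;
};

/* Fallible lookup: returns 0 and fills *out on success, -1 if `ptr`
 * does not fall inside any block. */
static int toy_addr_from_host(const struct toy_block *blocks, size_t n,
                              const uint8_t *ptr, uint64_t *out)
{
    uintptr_t p = (uintptr_t)ptr;

    for (size_t i = 0; i < n; i++) {
        uintptr_t base = (uintptr_t)blocks[i].host;

        /* Unsigned wrap-around makes p < base fail the range check too. */
        if (p - base < blocks[i].length) {
            *out = blocks[i].offset + (uint64_t)(p - base);
            return 0;
        }
    }
    return -1;
}

/* Wrapper for callers that treat a miss as a programming error. */
static uint64_t toy_addr_from_host_nofail(const struct toy_block *blocks,
                                          size_t n, const uint8_t *ptr)
{
    uint64_t addr;

    if (toy_addr_from_host(blocks, n, ptr, &addr)) {
        fprintf(stderr, "Bad ram pointer %p\n", (const void *)ptr);
        abort();
    }
    return addr;
}
```

Callers that can recover, such as error-injection paths fed with arbitrary host addresses, would use the fallible variant; internal callers keep the old abort-on-miss behaviour through the wrapper.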

[Qemu-devel] [PATCH 08/10] Add RAM -> physical addr mapping in MCE simulation

2010-10-19 Thread Marcelo Tosatti

From: Huang Ying ying.hu...@intel.com

In QEMU-KVM, physical address != RAM address, while MCE simulation
needs the physical address rather than the RAM address. So
kvm_physical_memory_addr_from_ram() is implemented to do the
conversion, and it is invoked before the address is filled into the
IA32_MCi_ADDR MSR.

Reported-by: Dean Nelson dnel...@redhat.com
Signed-off-by: Huang Ying ying.hu...@intel.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
---
 kvm-all.c |   18 ++
 kvm.h |3 +++
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 1cc696f..37b99c7 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -137,6 +137,24 @@ static KVMSlot *kvm_lookup_overlapping_slot(KVMState *s,
 return found;
 }
 
+int kvm_physical_memory_addr_from_ram(KVMState *s, ram_addr_t ram_addr,
+                                      target_phys_addr_t *phys_addr)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(s->slots); i++) {
+        KVMSlot *mem = &s->slots[i];
+
+        if (ram_addr >= mem->phys_offset &&
+            ram_addr < mem->phys_offset + mem->memory_size) {
+            *phys_addr = mem->start_addr + (ram_addr - mem->phys_offset);
+            return 1;
+        }
+    }
+
+    return 0;
+}
+
 static int kvm_set_user_memory_region(KVMState *s, KVMSlot *slot)
 {
 struct kvm_userspace_memory_region mem;
diff --git a/kvm.h b/kvm.h
index 50b6c01..8f5a754 100644
--- a/kvm.h
+++ b/kvm.h
@@ -174,6 +174,9 @@ static inline void cpu_synchronize_post_init(CPUState *env)
 }
 }
 
+int kvm_physical_memory_addr_from_ram(KVMState *s, ram_addr_t ram_addr,
+  target_phys_addr_t *phys_addr);
+
 #endif
 int kvm_set_ioeventfd_mmio_long(int fd, uint32_t adr, uint32_t val, bool 
assign);
 
-- 
1.7.2.1
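
As a worked illustration of the translation in the patch above, the sketch below does the same slot walk on a standalone table. struct toy_slot and toy_phys_from_ram() are hypothetical stand-ins for KVMSlot and the real function, and the slot values in the usage note are made-up numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy memory slot: guest-physical range [start_addr, start_addr + size)
 * backed by RAM offsets [phys_offset, phys_offset + size). */
struct toy_slot {
    uint64_t start_addr;
    uint64_t memory_size;
    uint64_t phys_offset;
};

/* Returns 1 and fills *phys_addr if `ram_addr` lies in some slot, else 0. */
static int toy_phys_from_ram(const struct toy_slot *slots, size_t n,
                             uint64_t ram_addr, uint64_t *phys_addr)
{
    assert(phys_addr != NULL);
    for (size_t i = 0; i < n; i++) {
        const struct toy_slot *mem = &slots[i];

        if (ram_addr >= mem->phys_offset &&
            ram_addr < mem->phys_offset + mem->memory_size) {
            *phys_addr = mem->start_addr + (ram_addr - mem->phys_offset);
            return 1;
        }
    }
    return 0;
}
```

For example, with a slot mapping RAM offset 0xa0000 to guest-physical 0x100000, RAM address 0xa0010 translates to 0x100010; the two address spaces differ whenever a hole sits between slots.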

[Qemu-devel] [PATCH 01/10] Set cpuid definition to 0 before initializing it

2010-10-19 Thread Marcelo Tosatti

From: Joerg Roedel joerg.roe...@amd.com

This patch cleans the (stack-allocated) cpuid definition to
0 before actually initializing it.

Signed-off-by: Joerg Roedel joerg.roe...@amd.com
Signed-off-by: Avi Kivity a...@redhat.com
---
 target-i386/cpuid.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 04ba8d5..3fcf78f 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -788,6 +788,8 @@ int cpu_x86_register (CPUX86State *env, const char 
*cpu_model)
 {
     x86_def_t def1, *def = &def1;
 
+    memset(def, 0, sizeof(*def));
+
     if (cpu_x86_find_by_name(def, cpu_model) < 0)
         return -1;
     if (def->vendor1) {
-- 
1.7.2.1

[Qemu-devel] [PATCH 03/10] signalfd compatibility

2010-10-19 Thread Marcelo Tosatti

Port qemu-kvm's signalfd compat code.

commit 5a7fdd0abd7cd24dac205317a4195446ab8748b5
Author: Anthony Liguori aligu...@us.ibm.com
Date:   Wed May 7 11:55:47 2008 -0500

Use signalfd() in io-thread

This patch reworks the IO thread to use signalfd() instead of sigtimedwait()
This will eliminate the need to use SIGIO everywhere.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
---
 Makefile.objs |1 +
 compatfd.c|  117 +
 compatfd.h|   43 +
 configure |   18 +
 4 files changed, 179 insertions(+), 0 deletions(-)
 create mode 100644 compatfd.c
 create mode 100644 compatfd.h

diff --git a/Makefile.objs b/Makefile.objs
index 816194a..d73002d 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -125,6 +125,7 @@ common-obj-y += $(addprefix ui/, $(ui-obj-y))
 
 common-obj-y += iov.o acl.o
 common-obj-$(CONFIG_THREAD) += qemu-thread.o
+common-obj-$(CONFIG_IOTHREAD) += compatfd.o
 common-obj-y += notify.o event_notifier.o
 common-obj-y += qemu-timer.o
 
diff --git a/compatfd.c b/compatfd.c
new file mode 100644
index 000..a7cebc4
--- /dev/null
+++ b/compatfd.c
@@ -0,0 +1,117 @@
+/*
+ * signalfd/eventfd compatibility
+ *
+ * Copyright IBM, Corp. 2008
+ *
+ * Authors:
+ *  Anthony Liguori   aligu...@us.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "compatfd.h"
+
+#include <sys/syscall.h>
+#include <pthread.h>
+
+struct sigfd_compat_info
+{
+sigset_t mask;
+int fd;
+};
+
+static void *sigwait_compat(void *opaque)
+{
+    struct sigfd_compat_info *info = opaque;
+    int err;
+    sigset_t all;
+
+    sigfillset(&all);
+    sigprocmask(SIG_BLOCK, &all, NULL);
+
+    do {
+        siginfo_t siginfo;
+
+        err = sigwaitinfo(&info->mask, &siginfo);
+        if (err == -1 && errno == EINTR) {
+            err = 0;
+            continue;
+        }
+
+        if (err > 0) {
+            char buffer[128];
+            size_t offset = 0;
+
+            memcpy(buffer, &err, sizeof(err));
+            while (offset < sizeof(buffer)) {
+                ssize_t len;
+
+                len = write(info->fd, buffer + offset,
+                            sizeof(buffer) - offset);
+                if (len == -1 && errno == EINTR)
+                    continue;
+
+                if (len <= 0) {
+                    err = -1;
+                    break;
+                }
+
+                offset += len;
+            }
+        }
+    } while (err >= 0);
+
+    return NULL;
+}
+
+static int qemu_signalfd_compat(const sigset_t *mask)
+{
+    pthread_attr_t attr;
+    pthread_t tid;
+    struct sigfd_compat_info *info;
+    int fds[2];
+
+    info = malloc(sizeof(*info));
+    if (info == NULL) {
+        errno = ENOMEM;
+        return -1;
+    }
+
+    if (pipe(fds) == -1) {
+        free(info);
+        return -1;
+    }
+
+    qemu_set_cloexec(fds[0]);
+    qemu_set_cloexec(fds[1]);
+
+    memcpy(&info->mask, mask, sizeof(*mask));
+    info->fd = fds[1];
+
+    pthread_attr_init(&attr);
+    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
+
+    pthread_create(&tid, &attr, sigwait_compat, info);
+
+    pthread_attr_destroy(&attr);
+
+    return fds[0];
+}
+
+int qemu_signalfd(const sigset_t *mask)
+{
+#if defined(CONFIG_SIGNALFD)
+int ret;
+
+ret = syscall(SYS_signalfd, -1, mask, _NSIG / 8);
+if (ret != -1) {
+qemu_set_cloexec(ret);
+return ret;
+}
+#endif
+
+return qemu_signalfd_compat(mask);
+}
diff --git a/compatfd.h b/compatfd.h
new file mode 100644
index 000..fc37915
--- /dev/null
+++ b/compatfd.h
@@ -0,0 +1,43 @@
+/*
+ * signalfd/eventfd compatibility
+ *
+ * Copyright IBM, Corp. 2008
+ *
+ * Authors:
+ *  Anthony Liguori   aligu...@us.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef QEMU_COMPATFD_H
+#define QEMU_COMPATFD_H
+
+#include <signal.h>
+
+struct qemu_signalfd_siginfo {
+uint32_t ssi_signo;   /* Signal number */
+int32_t  ssi_errno;   /* Error number (unused) */
+int32_t  ssi_code;/* Signal code */
+uint32_t ssi_pid; /* PID of sender */
+uint32_t ssi_uid; /* Real UID of sender */
+int32_t  ssi_fd;  /* File descriptor (SIGIO) */
+uint32_t ssi_tid; /* Kernel timer ID (POSIX timers) */
+uint32_t ssi_band;/* Band event (SIGIO) */
+uint32_t ssi_overrun; /* POSIX timer overrun count */
+uint32_t ssi_trapno;  /* Trap number that caused signal */
+int32_t  ssi_status;  /* Exit status or signal (SIGCHLD) */
+int32_t  ssi_int; /* Integer sent by sigqueue(2) */
+uint64_t ssi_ptr; /* Pointer sent by sigqueue(2) */
+uint64_t ssi_utime;   /* User CPU time consumed (SIGCHLD) */
+uint64_t ssi_stime;   /* System CPU

[Qemu-devel] Re: [PATCH v5 00/14] pcie port switch emulators

2010-10-19 Thread Michael S. Tsirkin

On Tue, Oct 19, 2010 at 06:06:27PM +0900, Isaku Yamahata wrote:
 On uncorrectable error status register in pcie_aer_write_config().
 The register is RW1CS, so making it writable and test-and-clear doesn't
 work.

Sure. But isn't this what w1cmask implements?
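
For reference, write-1-to-clear semantics for one config byte can be sketched like this. It is an illustration of the general idea, assuming one mask of plainly writable bits and a disjoint mask of W1C bits; toy_config_write() is a hypothetical helper, not the actual pci config write path.

```c
#include <assert.h>
#include <stdint.h>

/* Apply a guest write of `val` to a config byte currently holding `cur`.
 * Bits in `wmask` take the written value; bits in `w1cmask` are cleared
 * when the guest writes 1 and left alone when it writes 0. */
static uint8_t toy_config_write(uint8_t cur, uint8_t val,
                                uint8_t wmask, uint8_t w1cmask)
{
    /* A bit cannot be both plainly writable and write-1-to-clear. */
    assert((wmask & w1cmask) == 0);

    cur = (uint8_t)((cur & ~wmask) | (val & wmask));
    cur = (uint8_t)(cur & ~(val & w1cmask));
    return cur;
}
```

Under this model the guest can never set a status bit by writing it; only the device side sets the bits, and the guest acknowledges them by writing 1.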

Also - mail to ad...@khaleel.us seems to bounce. I stripped it
from the Cc list for now.

-- 
MST

[Qemu-devel] Re: [PATCH v5 07/14] pcie: helper functions for pcie capability and extended capability

2010-10-19 Thread Michael S. Tsirkin

On Tue, Oct 19, 2010 at 06:06:34PM +0900, Isaku Yamahata wrote:
 This patch implements helper functions for pci express capability
 and pci express extended capability allocation.
 NOTE: presence detection depends on pci_qdev_init() change.
 
 Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
 ---
 Changes v4 - v5:
 - dropped FLR related members. This will be addressed at the next phase.
 - use pci_xxx_test_and_xxx_mask().
 - drop PCIDevice::written bits. and made related registers writable.
 - eliminated pcie_cap_slot_notify()
 - introduced PCIExpressDevice::hpev_intx

Please also add a comment saying that hpev_intx field defaults to 0,
and needs to be changed by devices that want to use another
interrupt for hotplug events.

 Changes v3 - v4:
 - various clean up
 - dropped pcie_notify(), pcie_del_capability()
 - use pci_{clear_set, clear}_bit_xxx() helper functions.
 - dropped pci_exp_cap()
 ---
  Makefile.objs |1 +
  hw/pci.h  |5 +
  hw/pcie.c |  540 
 +
  hw/pcie.h |  107 
  qemu-common.h |1 +
  5 files changed, 654 insertions(+), 0 deletions(-)
  create mode 100644 hw/pcie.c
  create mode 100644 hw/pcie.h
 
 diff --git a/Makefile.objs b/Makefile.objs
 index 5f5a4c5..eeb5134 100644
 --- a/Makefile.objs
 +++ b/Makefile.objs
 @@ -186,6 +186,7 @@ hw-obj-$(CONFIG_PIIX4) += piix4.o
  # PCI watchdog devices
  hw-obj-y += wdt_i6300esb.o
  
 +hw-obj-y += pcie.o
  hw-obj-y += msix.o msi.o
  
  # PCI network cards
 diff --git a/hw/pci.h b/hw/pci.h
 index 9e2f27d..d6c522b 100644
 --- a/hw/pci.h
 +++ b/hw/pci.h
 @@ -9,6 +9,8 @@
  /* PCI includes legacy ISA access.  */
  #include "isa.h"
  
 +#include "pcie.h"
 +
  /* PCI bus */
  
  #define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
 @@ -175,6 +177,9 @@ struct PCIDevice {
  /* Offset of MSI capability in config space */
  uint8_t msi_cap;
  
 +/* PCI Express */
 +PCIExpressDevice exp;
 +
  /* Location of option rom */
  char *romfile;
  ram_addr_t rom_offset;
 diff --git a/hw/pcie.c b/hw/pcie.c
 new file mode 100644
 index 000..53d1fce
 --- /dev/null
 +++ b/hw/pcie.c
 @@ -0,0 +1,540 @@
 +/*
 + * pcie.c
 + *
 + * Copyright (c) 2010 Isaku Yamahata yamahata at valinux co jp
 + *VA Linux Systems Japan K.K.
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation; either version 2 of the License, or
 + * (at your option) any later version.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 + *
 + * You should have received a copy of the GNU General Public License along
 + * with this program; if not, see http://www.gnu.org/licenses/.
 + */
 +
 +#include "sysemu.h"
 +#include "pci_bridge.h"
 +#include "pcie.h"
 +#include "msix.h"
 +#include "msi.h"
 +#include "pci_internals.h"
 +#include "pcie_regs.h"
 +
 +//#define DEBUG_PCIE
 +#ifdef DEBUG_PCIE
 +# define PCIE_DPRINTF(fmt, ...) \
 +    fprintf(stderr, "%s:%d " fmt, __func__, __LINE__, ## __VA_ARGS__)
 +#else
 +# define PCIE_DPRINTF(fmt, ...) do {} while (0)
 +#endif
 +#define PCIE_DEV_PRINTF(dev, fmt, ...)  \
 +    PCIE_DPRINTF("%s:%x "fmt, (dev)->name, (dev)->devfn, ## __VA_ARGS__)
 +
 +
 +/***
 + * pci express capability helper functions
 + */
 +int pcie_cap_init(PCIDevice *dev, uint8_t offset, uint8_t type, uint8_t port)
 +{
 +    int pos;
 +    uint8_t *exp_cap;
 +
 +    assert(pci_is_express(dev));
 +
 +    pos = pci_add_capability(dev, PCI_CAP_ID_EXP, offset,
 +                             PCI_EXP_VER2_SIZEOF);
 +    if (pos < 0) {
 +        return pos;
 +    }
 +    dev->exp.exp_cap = pos;
 +    exp_cap = dev->config + pos;
 +
 +    /* capability register
 +       interrupt message number defaults to 0 */
 +    pci_set_word(exp_cap + PCI_EXP_FLAGS,
 +                 ((type << PCI_EXP_FLAGS_TYPE_SHIFT) & PCI_EXP_FLAGS_TYPE) |
 +                 PCI_EXP_FLAGS_VER2);
 +
 +    /* device capability register
 +     * table 7-12:
 +     * roll based error reporting bit must be set by all
 +     * Functions conforming to the ECN, PCI Express Base
 +     * Specification, Revision 1.1., or subsequent PCI Express Base
 +     * Specification revisions.
 +     */
 +    pci_set_long(exp_cap + PCI_EXP_DEVCAP, PCI_EXP_DEVCAP_RBER);
 +
 +    pci_set_long(exp_cap + PCI_EXP_LNKCAP,
 +                 (port << PCI_EXP_LNKCAP_PN_SHIFT) |
 +                 PCI_EXP_LNKCAP_ASPMS_0S |
 +                 PCI_EXP_LNK_MLW_1 |
 +                 PCI_EXP_LNK_LS_25);
 +
 +pci_set_word(exp_cap + PCI_EXP_LNKSTA,
 +

[Qemu-devel] Re: [PATCH v5 00/14] pcie port switch emulators

2010-10-19 Thread Michael S. Tsirkin

On Tue, Oct 19, 2010 at 06:06:27PM +0900, Isaku Yamahata wrote:
 Here is v5 of the pcie patch series.
 I hope I addressed the blockers.
 On uncorrectable error status register in pcie_aer_write_config().
 The register is RW1CS, so making it writable and test-and-clear doesn't
 work.
 
 new patches: 1, 2, 
 updated patches except trivial change: 4, 7, 8
 
 BTW, as 0.13 is released, any chance to sync pci branch with
 the upstream by requesting pull?
 
 Patch description:
 This patch series implements pcie port switch emulators
 which is basic part for pcie/q35 support.
 This is for mst/pci tree.
 
 change v4 -> v5:
 - introduced pci_xxx_test_and_clear/set_mask
 - eliminated xxx_notify(msi_trigger, int_level)
 - eliminated FLR bits.
   FLR will be addressed at the next phase.
 
 changes v3 -> v4:
 - introduced new pci config helper functions.(clear set bit)
 - various clean up and some bug fixes.
 - dropped pci_shift_xxx().
 - dropped function pointer in pcie_aer.h
 - dropped pci_exp_cap(), pcie_aer_cap().
 - file rename (pcie_{root, upstream, downstream} => ioh3420, x3130).
 
 changes v2 -> v3:
 - msi: improved comment and simplified shift/ffs dance
 - pci w1c config register framework
 - split pcie.[ch] into pcie_regs.h, pcie.[ch] and pcie_aer.[ch]
 - pcie, aer: many changes by following reviews.
 
 changes v1 - v2:
 - update msi
 - dropped already pushed out patches.
 - added msix patches.
 
 Isaku Yamahata (14):
   pci: introduce helper functions to test-and-{clear, set} mask in
 configuration space
   pci: introduce helper function to handle msi-x and msi.
   pci: use pci_word_test_and_clear_mask() in pci_device_reset()
   pci/bridge: fix pci_bridge_reset()
   msi: implements msi
   pcie: add pcie constants to pcie_regs.h
   pcie: helper functions for pcie capability and extended capability

I'll apply these.

   pcie/aer: helper functions for pcie aer capability

Maybe move this to the end of the series?

   pcie port: define struct PCIEPort/PCIESlot and helper functions
   ioh3420: pcie root port in X58 ioh
   x3130: pcie upstream port
   x3130: pcie downstream port
   pcie/hotplug: introduce pushing attention button command

I think the above can be applied - just remove the dependency
on aer for now.

   pcie/aer: glue aer error injection into qemu monitor
 
  Makefile.objs   |4 +-
  hw/ioh3420.c|  229 +
  hw/ioh3420.h|   10 +
  hw/msi.c|  352 +++
  hw/msi.h|   41 +++
  hw/pci.c|   24 ++-
  hw/pci.h|   88 +-
  hw/pci_bridge.c |   57 +++-
  hw/pci_bridge.h |2 +
  hw/pcie.c   |  540 +
  hw/pcie.h   |  113 ++
  hw/pcie_aer.c   |  869 
 +++
  hw/pcie_aer.h   |  105 ++
  hw/pcie_port.c  |  198 +++
  hw/pcie_port.h  |   51 +++
  hw/pcie_regs.h  |  154 +
  hw/xio3130_downstream.c |  197 +++
  hw/xio3130_downstream.h |   11 +
  hw/xio3130_upstream.c   |  181 ++
  hw/xio3130_upstream.h   |   10 +
  qemu-common.h   |6 +
  qemu-monitor.hx |   36 ++
  sysemu.h|9 +
  23 files changed, 3272 insertions(+), 15 deletions(-)
  create mode 100644 hw/ioh3420.c
  create mode 100644 hw/ioh3420.h
  create mode 100644 hw/msi.c
  create mode 100644 hw/msi.h
  create mode 100644 hw/pcie.c
  create mode 100644 hw/pcie.h
  create mode 100644 hw/pcie_aer.c
  create mode 100644 hw/pcie_aer.h
  create mode 100644 hw/pcie_port.c
  create mode 100644 hw/pcie_port.h
  create mode 100644 hw/pcie_regs.h
  create mode 100644 hw/xio3130_downstream.c
  create mode 100644 hw/xio3130_downstream.h
  create mode 100644 hw/xio3130_upstream.c
  create mode 100644 hw/xio3130_upstream.h

Re: [Qemu-devel] qemu aborts if i add a already registered device from qemu monitor ..

2010-10-19 Thread Luiz Capitulino

On Tue, 19 Oct 2010 15:27:37 +0530
pradeep psuri...@linux.vnet.ibm.com wrote:

 Hi
 
 I tried to add a device to guest from upstream qemu monitor using
 device_add.

Are you developing a new device or does it happen with existing ones?

If it's the latter, can you describe steps to reproduce it?

 Unknowingly I tried to add an already registered device from the qemu
 monitor, and qemu aborted. I don't see a reason to kill it. I think
 abort() is a bit rough; we need a better way to handle this. If a user
 tries to add an already registered device, qemu should tell the user
 that the device is already registered. An error message would be
 better than aborting qemu.
 
 
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         if (!strcmp(block->idstr, new_block->idstr)) {
             fprintf(stderr, "RAMBlock \"%s\" already registered, abort!\n",
                     new_block->idstr);
             abort();
         }
     }
 
 
 If I return some other value in the above code instead of calling abort(),
 I would need to change the code for every device, which I don't want to do.
 Is there a way to check whether a device is already registered at the very
 beginning of the device_add call?
 
 
 
 Thanks
 Pradeep
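
The behaviour asked for above, reporting a duplicate to the caller instead of aborting, could look roughly like the following toy registry. It is a sketch only: struct toy_registry and toy_register() are hypothetical, use a fixed-size table, and say nothing about how device_add would actually propagate the error to the monitor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_BLOCKS 8

/* Toy registry of block identifiers (stores the caller's pointers). */
struct toy_registry {
    const char *ids[TOY_MAX_BLOCKS];
    size_t      count;
};

/* Returns 0 on success, -EEXIST for a duplicate id, -ENOSPC when full. */
static int toy_register(struct toy_registry *r, const char *id)
{
    assert(r != NULL && id != NULL);

    for (size_t i = 0; i < r->count; i++) {
        if (strcmp(r->ids[i], id) == 0) {
            return -EEXIST;   /* let the caller report it; do not abort */
        }
    }
    if (r->count == TOY_MAX_BLOCKS) {
        return -ENOSPC;
    }
    r->ids[r->count++] = id;
    return 0;
}
```

The caller can then turn -EEXIST into a monitor error message and leave the running guest untouched.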

[Qemu-devel] Re: [PATCH v5 04/14] pci/bridge: fix pci_bridge_reset()

2010-10-19 Thread Michael S. Tsirkin

On Tue, Oct 19, 2010 at 06:06:31PM +0900, Isaku Yamahata wrote:
 The default value of base/limit registers aren't specified in the spec.
 So pci_bridge_reset() shouldn't touch them.
 Instead, introduced two functions to reset those registers in a way
 of typical implementation. zero base/limit registers or disable forwarding.
 They will be used later.
 
 Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
 ---

I have some second thoughts:

1. pci_bridge_reset is used in several devices and I have no idea what
the reset change will do there, how do real devices behave or whether
guests depend on a specific behaviour.  It seems harmless to leave the
current implementation in place, and simply add
pci_bridge_disable_base_limit which devices can call after
pci_bridge_reset.

2.  _zero and _disable describe what function does already,
so we should drop _reset from function name.
Thus we only get 1 new function, pci_bridge_disable_base_limit.

 Changes v4 - v5:
 - drop the lines in pci_bridge_reset()
 - introduced two functions to reset base/limit registers.
 ---
  hw/pci_bridge.c |   57 +++---
  hw/pci_bridge.h |2 +
  2 files changed, 51 insertions(+), 8 deletions(-)
 
 diff --git a/hw/pci_bridge.c b/hw/pci_bridge.c
 index 638e3b3..de75e6a 100644
 --- a/hw/pci_bridge.c
 +++ b/hw/pci_bridge.c
 @@ -151,6 +151,46 @@ void pci_bridge_write_config(PCIDevice *d,
  }
  }
  
 +void pci_bridge_reset_zero_base_limit(PCIDevice *dev)
 +{
 +    uint8_t *conf = dev->config;
 +
 +    pci_byte_test_and_clear_mask(conf + PCI_IO_BASE,
 +                                 PCI_IO_RANGE_MASK & 0xff);
 +    pci_byte_test_and_clear_mask(conf + PCI_IO_LIMIT,
 +                                 PCI_IO_RANGE_MASK & 0xff);
 +    pci_word_test_and_clear_mask(conf + PCI_MEMORY_BASE,
 +                                 PCI_MEMORY_RANGE_MASK & 0xffff);
 +    pci_word_test_and_clear_mask(conf + PCI_MEMORY_LIMIT,
 +                                 PCI_MEMORY_RANGE_MASK & 0xffff);
 +    pci_word_test_and_clear_mask(conf + PCI_PREF_MEMORY_BASE,
 +                                 PCI_PREF_RANGE_MASK & 0xffff);
 +    pci_word_test_and_clear_mask(conf + PCI_PREF_MEMORY_LIMIT,
 +                                 PCI_PREF_RANGE_MASK & 0xffff);
 +    pci_set_word(conf + PCI_PREF_BASE_UPPER32, 0);
 +    pci_set_word(conf + PCI_PREF_LIMIT_UPPER32, 0);
 +}
 +
 +void pci_bridge_reset_disable_base_limit(PCIDevice *dev)
 +{
 +    uint8_t *conf = dev->config;
 +
 +    pci_byte_test_and_set_mask(conf + PCI_IO_BASE,
 +                               PCI_IO_RANGE_MASK & 0xff);
 +    pci_byte_test_and_clear_mask(conf + PCI_IO_LIMIT,
 +                                 PCI_IO_RANGE_MASK & 0xff);
 +    pci_word_test_and_set_mask(conf + PCI_MEMORY_BASE,
 +                               PCI_MEMORY_RANGE_MASK & 0xffff);
 +    pci_word_test_and_clear_mask(conf + PCI_MEMORY_LIMIT,
 +                                 PCI_MEMORY_RANGE_MASK & 0xffff);
 +    pci_word_test_and_set_mask(conf + PCI_PREF_MEMORY_BASE,
 +                               PCI_PREF_RANGE_MASK & 0xffff);
 +    pci_word_test_and_clear_mask(conf + PCI_PREF_MEMORY_LIMIT,
 +                                 PCI_PREF_RANGE_MASK & 0xffff);
 +    pci_set_word(conf + PCI_PREF_BASE_UPPER32, 0);
 +    pci_set_word(conf + PCI_PREF_LIMIT_UPPER32, 0);
 +}
 +
  /* reset bridge specific configuration registers */
  void pci_bridge_reset_reg(PCIDevice *dev)
  {
 @@ -161,14 +201,15 @@ void pci_bridge_reset_reg(PCIDevice *dev)
  conf[PCI_SUBORDINATE_BUS] = 0;
  conf[PCI_SEC_LATENCY_TIMER] = 0;
  
 -conf[PCI_IO_BASE] = 0;
 -conf[PCI_IO_LIMIT] = 0;
 -pci_set_word(conf + PCI_MEMORY_BASE, 0);
 -pci_set_word(conf + PCI_MEMORY_LIMIT, 0);
 -pci_set_word(conf + PCI_PREF_MEMORY_BASE, 0);
 -pci_set_word(conf + PCI_PREF_MEMORY_LIMIT, 0);
 -pci_set_word(conf + PCI_PREF_BASE_UPPER32, 0);
 -pci_set_word(conf + PCI_PREF_LIMIT_UPPER32, 0);
 +/*
 + * The default values for the base/limit registers aren't specified
 + * in the PCI-to-PCI bridge spec, so we don't touch them here.
 + * Each implementation can override them.
 + * A typical implementation does one of:
 + * - zero registers: pci_bridge_reset_zero_base_limit()
 + * or
 + * - disable forwarding: pci_bridge_reset_disable_base_limit()
 + */
  
  pci_set_word(conf + PCI_BRIDGE_CONTROL, 0);
  }
 diff --git a/hw/pci_bridge.h b/hw/pci_bridge.h
 index f6fade0..2359684 100644
 --- a/hw/pci_bridge.h
 +++ b/hw/pci_bridge.h
 @@ -39,6 +39,8 @@ pcibus_t pci_bridge_get_limit(const PCIDevice *bridge, 
 uint8_t type);
  
  void pci_bridge_write_config(PCIDevice *d,
   uint32_t address, uint32_t val, int len);
 +void pci_bridge_reset_zero_base_limit(PCIDevice *dev);
 +void pci_bridge_reset_disable_base_limit(PCIDevice *dev);
  void pci_bridge_reset_reg(PCIDevice *dev);
  void pci_bridge_reset(DeviceState *qdev);
  
 -- 
 1.7.1.1
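
The difference between the two reset variants in the patch above can be illustrated with a small standalone sketch. It is not the QEMU code: the two mask helpers are simplified local stand-ins for the pci_byte_test_and_{set,clear}_mask() helpers named in the series, and io_base(), io_limit(), zero_io_window(), disable_io_window() and io_window_enabled() are hypothetical names introduced here. Only the I/O window is modelled, using the register offsets and mask from the standard PCI bridge header.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and mask from the standard PCI type 1 (bridge) header. */
#define PCI_IO_BASE        0x1c
#define PCI_IO_LIMIT       0x1d
#define PCI_IO_RANGE_MASK  (~0x0fUL)

/* Simplified stand-ins for the helpers used in the patch (assumption). */
static uint8_t byte_test_and_set_mask(uint8_t *p, uint8_t mask)
{
    uint8_t old = *p & mask;
    *p |= mask;
    return old;
}

static uint8_t byte_test_and_clear_mask(uint8_t *p, uint8_t mask)
{
    uint8_t old = *p & mask;
    *p &= (uint8_t)~mask;
    return old;
}

/* Decode the I/O window: the upper nibble holds address bits 15:12. */
static uint32_t io_base(const uint8_t *conf)
{
    return (uint32_t)(conf[PCI_IO_BASE] & (PCI_IO_RANGE_MASK & 0xff)) << 8;
}

static uint32_t io_limit(const uint8_t *conf)
{
    /* The low 12 bits of the limit are implied to be all ones. */
    return ((uint32_t)(conf[PCI_IO_LIMIT] & (PCI_IO_RANGE_MASK & 0xff)) << 8)
           | 0xfff;
}

/* "zero" variant: clear the address bits of both base and limit. */
static void zero_io_window(uint8_t *conf)
{
    assert(conf);
    byte_test_and_clear_mask(conf + PCI_IO_BASE, PCI_IO_RANGE_MASK & 0xff);
    byte_test_and_clear_mask(conf + PCI_IO_LIMIT, PCI_IO_RANGE_MASK & 0xff);
}

/* "disable" variant: set the base bits, clear the limit bits. */
static void disable_io_window(uint8_t *conf)
{
    assert(conf);
    byte_test_and_set_mask(conf + PCI_IO_BASE, PCI_IO_RANGE_MASK & 0xff);
    byte_test_and_clear_mask(conf + PCI_IO_LIMIT, PCI_IO_RANGE_MASK & 0xff);
}

/* A window forwards transactions only while base <= limit. */
static int io_window_enabled(const uint8_t *conf)
{
    return io_base(conf) <= io_limit(conf);
}
```

Under these assumptions the "zero" variant still decodes a 4K window at 0x0000..0x0fff, whereas the "disable" variant leaves base (0xf000) above limit (0x0fff), so nothing is forwarded.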

Re: [Qemu-devel] [Tracing][v4 PATCH 2/2] Add documentation for QMP interfaces

2010-10-19 Thread Prerna Saxena


On 10/19/2010 11:57 AM, Prerna Saxena wrote:

[PATCH 2/2] Add documentation for QMP commands:
  - query-trace
  - query-trace-events
  - query-trace-file.




I've been looking for ways to avoid building this documentation for other 
trace backends (since these commands are only available with the 
'simple' backend). However, it looks like hxtool blindly copies the text 
between SQMP and EQMP.
The only option I can think of is making hxtool a wee bit more intelligent, 
so that it can parse CONFIG_* options and build documentation accordingly. 
Is there a workaround I'm missing?


--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Dor Laor


On 10/19/2010 04:11 AM, Chris Wright wrote:

* Juan Quintela (quint...@redhat.com) wrote:


Please send in any agenda items you are interested in covering.


- 0.13.X -stable handoff
- 0.14 planning
- threadlet work
- virtfs proposals



- Live snapshots
  - We were asked to add this feature for external qcow2
images. Would a simple approach of fsync + tracking each requested
backing file (it can be per vDisk) and re-opening the new image
be accepted?
  - Integration with FS freeze for consistent guest app snapshot
Many apps do not sync their RAM state to disk correctly or frequently
enough. Physical world backup software calls fs freeze on xfs and
VSS for Windows to make the backup consistent.
In order to integrate this with live snapshots we need a guest
agent to trigger the guest fs freeze.
We can either have qemu communicate with the agent directly through
virtio-serial or have a mgmt daemon use virtio-serial to
communicate with the guest in addition to QMP messages about the
live snapshot state.
Preferences? The first solution complicates qemu while the second
complicates mgmt.

[Qemu-devel] [PATCH 0/1] ccid emulated card (v2, for usb-ccid v3)

2010-10-19 Thread Alon Levy

v2 changes:
 fixed a bug that made certificates emulation not work, and some cleanup.

v1 message:

Meant to be applied after the usb-ccid v3 patch on the list.
Causes --enable-smartcard to depend on libcac_card, library for emulating
CAC compliant smart cards at http://cgit.freedesktop.org/~alon/cac_card/

hw/ccid-card-emulated.c: new device
Makefile.objs: add ccid-card-emulated.o if --enable-smartcard
configure: dependency on libcac_card if --enable-smartcard
hw/usb-ccid.c: added a TODO note
hw/ccid-card-passthru.c: removed does-nothing print method.

Alon Levy (1):
  add ccid-card-emulated device (v2)

 Makefile.objs   |2 +-
 configure   |   20 ++
 hw/ccid-card-emulated.c |  495 +++
 hw/ccid-card-passthru.c |6 -
 hw/usb-ccid.c   |2 +
 5 files changed, 518 insertions(+), 7 deletions(-)
 create mode 100644 hw/ccid-card-emulated.c

-- 
1.7.3.1

[Qemu-devel] [PATCH 1/1] add ccid-card-emulated device (v2)

2010-10-19 Thread Alon Levy

changes from v1:
remove stale comments, use only c-style comments
bugfix, forgot to set recv_len
change reader name to 'Virtual Reader'

Signed-off-by: Alon Levy al...@redhat.com
---
 Makefile.objs   |2 +-
 configure   |   20 ++
 hw/ccid-card-emulated.c |  495 +++
 hw/ccid-card-passthru.c |6 -
 hw/usb-ccid.c   |2 +
 5 files changed, 518 insertions(+), 7 deletions(-)
 create mode 100644 hw/ccid-card-emulated.c

diff --git a/Makefile.objs b/Makefile.objs
index 3c4a880..ae12546 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -173,7 +173,7 @@ hw-obj-$(CONFIG_FDC) += fdc.o
 hw-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o
 hw-obj-$(CONFIG_APM) += pm_smbus.o apm.o
 hw-obj-$(CONFIG_DMA) += dma.o
-hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o
+hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o 
ccid-card-emulated.o
 
 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/configure b/configure
index e1922a3..59b8436 100755
--- a/configure
+++ b/configure
@@ -2112,6 +2112,26 @@ EOF
   fi
 fi
 
+# check for libcaccard for smartcard support
+if test "$smartcard" != "no" ; then
+  cat > $TMPC << EOF
+#include <vscard_common.h>
+int main() { return 0; }
+EOF
+  smartcard_cflags=$($pkgconfig --cflags cac_card cac_card 2>/dev/null)
+  smartcard_libs=$($pkgconfig --libs cac_card cac_card 2>/dev/null)
+  if $pkgconfig --atleast-version=0.0.1 cac_card && \
+     compile_prog "$smartcard_cflags" "$smartcard_libs" ; then
+    smartcard=yes
+    QEMU_CFLAGS="$QEMU_CFLAGS $smartcard_cflags"
+  else
+    if test "$smartcard" = "yes" ; then
+      feature_not_found smartcard
+    fi
+    smartcard=no
+  fi
+fi
+
 ##
 
 ##
diff --git a/hw/ccid-card-emulated.c b/hw/ccid-card-emulated.c
new file mode 100644
index 000..9eee6b7
--- /dev/null
+++ b/hw/ccid-card-emulated.c
@@ -0,0 +1,495 @@
+/*
+ * CCID Card Device. Emulated card.
+ *
+ * It can be used to provide access to the local hardware in a non exclusive
+ * way, or it can use certificates. It requires the usb-ccid bus.
+ *
+ * Usage 1: standard, mirror hardware reader+card:
+ * qemu .. -usb -device usb-ccid -device ccid-card-emulated
+ *
+ * Usage 2: use certificates, no hardware required
+ * one time: create the certificates:
+ *  for i in 1 2 3; do certutil -d /etc/pki/nssdb -x -t CT,CT,CT -S -s CN=user$i -n user$i; done
+ * qemu .. -usb -device usb-ccid -device ccid-card-emulated,cert1=user1,cert2=user2,cert3=user3
+ *
+ * If you use a non default db for the certificates you can specify it using
+ * the db parameter.
+ *
+ *
+ * Copyright (c) 2010 Red Hat.
+ * Written by Alon Levy.
+ *
+ * This code is licenced under the LGPL.
+ */
+
+#include <pthread.h>
+#include <eventt.h>
+#include <vevent.h>
+#include <vreader.h>
+#include <vcard_emul.h>
+#include "qemu-char.h"
+#include "monitor.h"
+#include "hw/ccid.h"
+
+#define DPRINTF(lvl, fmt, ...) \
+do { if (lvl <= debug) { printf("ccid-card-emul: %s: " fmt , __func__, ## __VA_ARGS__); } } while (0)
+
+static int debug = 0;
+
+#define EMULATED_DEV_NAME "ccid-card-emulated"
+
+#define BACKEND_NSS_EMULATED "nss-emulated" /* the default */
+#define BACKEND_CERTIFICATES "certificates"
+
+typedef struct EmulatedState EmulatedState;
+
+enum {
+EMUL_READER_INSERT = 0,
+EMUL_READER_REMOVE,
+EMUL_CARD_INSERT,
+EMUL_CARD_REMOVE,
+EMUL_GUEST_APDU,
+EMUL_RESPONSE_APDU,
+EMUL_ERROR,
+};
+
+static const char* emul_event_to_string(uint32_t emul_event)
+{
+switch (emul_event) {
+    case EMUL_READER_INSERT: return "EMUL_READER_INSERT";
+    case EMUL_READER_REMOVE: return "EMUL_READER_REMOVE";
+    case EMUL_CARD_INSERT: return "EMUL_CARD_INSERT";
+    case EMUL_CARD_REMOVE: return "EMUL_CARD_REMOVE";
+    case EMUL_GUEST_APDU: return "EMUL_GUEST_APDU";
+    case EMUL_RESPONSE_APDU: return "EMUL_RESPONSE_APDU";
+    case EMUL_ERROR: return "EMUL_ERROR";
+    default:
+        break;
+    }
+    return "UNKNOWN";
+}
+
+typedef struct EmulEvent {
+QSIMPLEQ_ENTRY(EmulEvent) entry;
+union {
+struct {
+uint32_t type;
+} gen;
+struct {
+uint32_t type;
+uint64_t code;
+} error;
+struct {
+uint32_t type;
+uint32_t len;
+uint8_t data[];
+} data;
+} p;
+} EmulEvent;
+
+#define MAX_ATR_SIZE 40
+struct EmulatedState {
+    CCIDCardState base;
+    uint8_t  debug;
+    char    *backend;
+    char    *cert1;
+    char    *cert2;
+    char    *cert3;
+    char    *db;
+    uint8_t  atr[MAX_ATR_SIZE];
+    uint8_t  atr_length;
+    QSIMPLEQ_HEAD(event_list, EmulEvent) event_list;
+    pthread_mutex_t event_list_mutex;
+    VReader *reader;
+    QSIMPLEQ_HEAD(guest_apdu_list, EmulEvent) guest_apdu_list;
+    pthread_mutex_t vreader_mutex; /* and guest_apdu_list mutex */
+    pthread_mutex_t handle_apdu_mutex;
+

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Avi Kivity


 On 10/19/2010 02:48 PM, Dor Laor wrote:

On 10/19/2010 04:11 AM, Chris Wright wrote:

* Juan Quintela (quint...@redhat.com) wrote:


Please send in any agenda items you are interested in covering.


- 0.13.X -stable handoff
- 0.14 planning
- threadlet work
- virtfs proposals



- Live snapshots
  - We were asked to add this feature for external qcow2
images. Would a simple approach of fsync + tracking each requested
backing file (it can be per vDisk) and re-opening the new image
be accepted?
  - Integration with FS freeze for consistent guest app snapshot
Many apps do not sync their RAM state to disk correctly or frequently
enough. Physical world backup software calls fs freeze on xfs and
VSS for Windows to make the backup consistent.
In order to integrate this with live snapshots we need a guest
agent to trigger the guest fs freeze.
We can either have qemu communicate with the agent directly through
virtio-serial or have a mgmt daemon use virtio-serial to
communicate with the guest in addition to QMP messages about the
live snapshot state.
Preferences? The first solution complicates qemu while the second
complicates mgmt.


Third option, make the freeze path management -> qemu -> virtio-blk -> 
guest kernel -> file systems.  The advantage is that it's easy to 
associate file systems with a block device this way.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Dor Laor


On 10/19/2010 02:55 PM, Avi Kivity wrote:

On 10/19/2010 02:48 PM, Dor Laor wrote:

On 10/19/2010 04:11 AM, Chris Wright wrote:

* Juan Quintela (quint...@redhat.com) wrote:


Please send in any agenda items you are interested in covering.


- 0.13.X -stable handoff
- 0.14 planning
- threadlet work
- virtfs proposals



- Live snapshots
- We were asked to add this feature for external qcow2
images. Would a simple approach of fsync + tracking each requested
backing file (it can be per vDisk) and re-opening the new image
be accepted?
- Integration with FS freeze for consistent guest app snapshot
Many apps do not sync their RAM state to disk correctly or frequently
enough. Physical world backup software calls fs freeze on xfs and
VSS for Windows to make the backup consistent.
In order to integrate this with live snapshots we need a guest
agent to trigger the guest fs freeze.
We can either have qemu communicate with the agent directly through
virtio-serial or have a mgmt daemon use virtio-serial to
communicate with the guest in addition to QMP messages about the
live snapshot state.
Preferences? The first solution complicates qemu while the second
complicates mgmt.


Third option, make the freeze path management -> qemu -> virtio-blk ->
guest kernel -> file systems. The advantage is that it's easy to
associate file systems with a block device this way.


OTOH the userspace freeze path already exists, and now you create another 
path. What about file systems that span LVM with multiple drives? IDE/SCSI?

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Anthony Liguori


On 10/19/2010 08:03 AM, Avi Kivity wrote:

 On 10/19/2010 02:58 PM, Dor Laor wrote:

On 10/19/2010 02:55 PM, Avi Kivity wrote:

On 10/19/2010 02:48 PM, Dor Laor wrote:

On 10/19/2010 04:11 AM, Chris Wright wrote:

* Juan Quintela (quint...@redhat.com) wrote:


Please send in any agenda items you are interested in covering.


- 0.13.X -stable handoff
- 0.14 planning
- threadlet work
- virtfs proposals



- Live snapshots
- We were asked to add this feature for external qcow2
images. Would a simple approach of fsync + tracking each requested
backing file (it can be per vDisk) and re-opening the new image
be accepted?
- Integration with FS freeze for consistent guest app snapshot
Many apps do not sync their RAM state to disk correctly or frequently
enough. Physical world backup software calls fs freeze on xfs and
VSS for Windows to make the backup consistent.
In order to integrate this with live snapshots we need a guest
agent to trigger the guest fs freeze.
We can either have qemu communicate with the agent directly through
virtio-serial or have a mgmt daemon use virtio-serial to
communicate with the guest in addition to QMP messages about the
live snapshot state.
Preferences? The first solution complicates qemu while the second
complicates mgmt.


Third option, make the freeze path management -> qemu -> virtio-blk ->
guest kernel -> file systems. The advantage is that it's easy to
associate file systems with a block device this way.


OTOH the userspace freeze path already exists, and now you create 
another path. 


I guess we would still have a userspace daemon; instead of talking to 
virtio-serial it talks to virtio-blk.  So:


  management -> qemu -> virtio-blk -> guest driver -> kernel fs 
resolver -> daemon -> apps


Yuck.


Yeah, in Windows, I'm pretty sure the freeze API is a userspace 
concept.  Various apps can hook into it to serialize their state.


At the risk of stealing Mike's thunder, we've actually been working on a 
simple guest agent exactly for this type of task.  Mike's planning an 
RFC for later this week but for those that are interested the repo is at 
http://repo.or.cz/w/qemu/mdroth.git


Regards,

Anthony Liguori



What about FS that span over LVM with multiple drives? IDE/SCSI?


Good points.

Re: [Qemu-devel] [PATCH 2/3] usb: add public usb_device_by_id

2010-10-19 Thread Alon Levy


- Gerd Hoffmann kra...@redhat.com wrote:

  +USBDevice *usb_device_by_id(const char* id)
  +{
  +USBBus *bus;
  +DeviceState *qdev;
  +USBDevice *dev;
  +
  +QTAILQ_FOREACH(bus, &busses, next) {
  +qdev = qdev_find_recursive(&bus->qbus, id);
  +if (qdev != NULL) {
  +dev = DO_UPCAST(USBDevice, qdev, qdev);
  +return dev;
  +}
  +}
 
 You don't need qdev_find_recursive here.  Have a look at the
 usb_info() 
 code to see how to loop over all usb devices.  Then compare id with 
 USBDevice->qdev.id.
 
 cheers,
Gerd

Looping over all usb devices is no problem. But first of all I don't want 
to loop over used, since then I would miss any detached devices, so I actually 
do want the same behavior as qdev_find_recursive, and since it's already 
available, why rewrite it in a different compilation unit?

Alon

[Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify

2010-10-19 Thread Anthony Liguori


On 10/19/2010 08:07 AM, Stefan Hajnoczi wrote:

Is there anything stopping this patch from being merged?
   


Michael, any objections?  If not, I'll merge it.

Regards,

Anthony Liguori


Thanks,
Stefan

Re: [Qemu-devel] [PATCH 2/3] usb: add public usb_device_by_id

2010-10-19 Thread Gerd Hoffmann


+USBDevice *usb_device_by_id(const char* id)
+{
+USBBus *bus;
+DeviceState *qdev;
+USBDevice *dev;
+
+QTAILQ_FOREACH(bus, &busses, next) {
+qdev = qdev_find_recursive(&bus->qbus, id);
+if (qdev != NULL) {
+dev = DO_UPCAST(USBDevice, qdev, qdev);
+return dev;
+}
+}


You don't need qdev_find_recursive here.  Have a look at the usb_info() 
code to see how to loop over all usb devices.  Then compare id with 
USBDevice->qdev.id.


cheers,
  Gerd

[Qemu-devel] Static tracepoint control via trace-event

2010-10-19 Thread Jan Kiszka

Hi Stefan,

just had a closer look at qemu's new tracing framework. Looks cool,
though it leaves a bit of room for improvement. ;)

One quirk I stumbled over quickly was the "disable" tag in trace-events.
It confused me at first, as qemu starts without any tracepoint enabled by
default and I thought I had to hack the file. Then I read the doc and
wondered which existing or future backend would come without sufficiently
fast dynamic tracepoint control. Do you have any in mind?

Instead of making it a compile-time switch (except for simpletrace), I
would vote for declaring the simpletrace usage as the only one: disable
sets the default state of the dynamic tracepoint. That way we could use
trace-events to define a useful set of standard, moderate-impact
tracepoints that shall be on. Others will still be available once a
backend is configured, but remain off until enabled during runtime.
Anything else looks like overkill to me.
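
The proposal above (the "disable" tag only sets the default state of a dynamic tracepoint) can be sketched roughly as follows. This is an illustration, not the QEMU tracing code: TraceEvent, trace_event_from_line() and trace_event_set_state() are hypothetical names, and the sample event lines used with it are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* One dynamic tracepoint: a name plus a runtime on/off state. */
typedef struct {
    const char *name;   /* points into the trace-events line */
    bool enabled;       /* current state, changeable at runtime */
} TraceEvent;

/*
 * Build a tracepoint from a trace-events style line. A leading
 * "disable " tag only selects the default state; the tracepoint
 * itself is always compiled in.
 */
static TraceEvent trace_event_from_line(const char *line)
{
    static const char tag[] = "disable ";
    TraceEvent ev;

    assert(line);
    if (strncmp(line, tag, strlen(tag)) == 0) {
        ev.name = line + strlen(tag);
        ev.enabled = false;     /* present, but off until enabled */
    } else {
        ev.name = line;
        ev.enabled = true;      /* part of the default-on set */
    }
    return ev;
}

/* Runtime control, e.g. driven from a monitor command. */
static void trace_event_set_state(TraceEvent *ev, bool on)
{
    assert(ev);
    ev->enabled = on;
}
```

With this reading, a line such as "disable example_io_read(int fd)" yields a tracepoint that exists in every build with a backend but stays off until it is enabled at runtime.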

There are a few more things I have in mind (ftrace backend, enhanced
-trace switch, wildcards for enabling tracepoints, and more
tracepoints). Will hopefully come up with patches to address them, but
this may take a while.

Jan

PS: Do you maintain a tracing git tree?

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Avi Kivity


 On 10/19/2010 03:22 PM, Anthony Liguori wrote:


I had assumed that this would involve:

qemu -hda windows.img

(qemu) snapshot ide0-disk0 snap0.img

1) create snap0.img internally by doing the equivalent of `qemu-img 
create -f qcow2 -b windows.img snap0.img'

2) bdrv_flush('ide0-disk0')
3) bdrv_open(snap0.img)
4) bdrv_close(windows.img)
5) rename('windows.img', 'windows.img.tmp')
6) rename('snap0.img', 'windows.img')
7) rename('windows.img.tmp', 'snap0.img')
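
Steps 5) to 7) above amount to swapping the names of two files through a temporary name. A minimal sketch of just that part, using plain rename(2); swap_file_names(), write_byte() and read_byte() are hypothetical helpers introduced for illustration, and the block-layer steps 1) to 4) are not modelled:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Swap the names of two existing files with three rename() calls.
 * Each rename is atomic on its own, but the sequence as a whole is
 * not: a crash between steps leaves the temporary name behind.
 */
static int swap_file_names(const char *a, const char *b, const char *tmp)
{
    assert(a && b && tmp);
    if (rename(a, tmp) != 0) {      /* a -> tmp */
        return -1;
    }
    if (rename(b, a) != 0) {        /* b -> a */
        rename(tmp, a);             /* best-effort rollback */
        return -1;
    }
    if (rename(tmp, b) != 0) {      /* tmp -> b */
        return -1;
    }
    return 0;
}

/* Tiny helpers so the effect of the swap can be observed. */
static int write_byte(const char *path, char c)
{
    FILE *f = fopen(path, "wb");
    int ok;

    if (!f) {
        return -1;
    }
    ok = fputc(c, f) != EOF;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}

static int read_byte(const char *path)
{
    FILE *f = fopen(path, "rb");
    int c;

    if (!f) {
        return -1;
    }
    c = fgetc(f);
    fclose(f);
    return c;
}
```

After the swap, the name the guest was started with refers to the newly created image, and the snapshot name refers to the old contents.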



Looks reasonable.

Would be interesting to look at this as a use case for the threading 
work.  We should eventually be able to create a snapshot without 
stalling vcpus (stalling I/O of course allowed).


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Anthony Liguori


On 10/19/2010 07:48 AM, Dor Laor wrote:

On 10/19/2010 04:11 AM, Chris Wright wrote:

* Juan Quintela (quint...@redhat.com) wrote:


Please send in any agenda items you are interested in covering.


- 0.13.X -stable handoff
- 0.14 planning
- threadlet work
- virtfs proposals



- Live snapshots
  - We were asked to add this feature for external qcow2
images. Would a simple approach of fsync + tracking each requested
backing file (it can be per vDisk) and re-opening the new image
be accepted?
  - Integration with FS freeze for consistent guest app snapshot
Many apps do not sync their RAM state to disk correctly or frequently
enough. Physical world backup software calls fs freeze on xfs and
VSS for Windows to make the backup consistent.
In order to integrate this with live snapshots we need a guest
agent to trigger the guest fs freeze.
We can either have qemu communicate with the agent directly through
virtio-serial or have a mgmt daemon use virtio-serial to
communicate with the guest in addition to QMP messages about the
live snapshot state.
Preferences? The first solution complicates qemu while the second
complicates mgmt.


- usb-ccid (aka external device modules)

We probably won't get to it for today's call, but we should try to queue 
this topic up for discussion.  We have a similar situation with vtpm 
(existing device model that wants to integrate with QEMU).  My position 
so far has been that we should avoid external device models because of 
difficulty integrating QEMU features with external device models.


However, I'd like to hear opinions from a wider audience.

Regards,

Anthony Liguori


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify

2010-10-19 Thread Stefan Hajnoczi

On Thu, Sep 30, 2010 at 03:01:52PM +0100, Stefan Hajnoczi wrote:
 Virtqueue notify is currently handled synchronously in userspace virtio.
 This prevents the vcpu from executing guest code while hardware
 emulation code handles the notify.
 
 On systems that support KVM, the ioeventfd mechanism can be used to make
 virtqueue notify a lightweight exit by deferring hardware emulation to
 the iothread and allowing the VM to continue execution.  This model is
 similar to how vhost receives virtqueue notifies.
 
 The result of this change is improved performance for userspace virtio
 devices.  Virtio-blk throughput increases especially for multithreaded
 scenarios and virtio-net transmit throughput increases substantially.
 Full numbers are below.
 
 This patch employs ioeventfd virtqueue notify for all virtio devices.
 Linux kernels pre-2.6.34 only allow for 6 ioeventfds per VM and care
 must be taken so that vhost-net, the other ioeventfd user in QEMU, is
 able to function.  On such kernels ioeventfd virtqueue notify will not
 be used.
 
 Khoa Huynh k...@us.ibm.com collected the following data for
 virtio-blk with cache=none,aio=native:
 
 FFSB Test          Threads  Unmodified  Patched
                             (MB/s)      (MB/s)
 Large file create  1        21.7        21.8
                    8        101.0       118.0
                    16       119.0       157.0
 
 Sequential reads   1        21.9        23.2
                    8        114.0       139.0
                    16       143.0       178.0
 
 Random reads       1        3.3         3.6
                    8        23.0        25.4
                    16       43.3        47.8
 
 Random writes      1        22.2        23.0
                    8        93.1        111.6
                    16       110.5       132.0
 
 Sridhar Samudrala s...@us.ibm.com collected the following data for
 virtio-net with 2.6.36-rc1 on the host and 2.6.34 on the guest.
 
 Guest to Host TCP_STREAM throughput (Mb/sec)
 --------------------------------------------
 Msg Size  vhost-net  virtio-net  virtio-net/ioeventfd
 65536     12755      6430        7590
 16384     8499       3084        5764
  4096     4723       1578        3659
  1024     1827       981         2060
 
 Host to Guest TCP_STREAM throughput (Mb/sec)
 --------------------------------------------
 Msg Size  vhost-net  virtio-net  virtio-net/ioeventfd
 65536     11156      5790        5853
 16384     10787      5575        5691
  4096     10452      5556        4277
  1024     4437       3671        5277
 
 Guest to Host TCP_RR latency (transactions/sec)
 -----------------------------------------------
 
 Msg Size  vhost-net  virtio-net  virtio-net/ioeventfd
 1         9903       3459        3425
  4096     7185       1931        1899
 16384     6108       2102        1923
 65536     3161       1610        1744
 
 Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
 ---
 Small changes are required for qemu-kvm.git.  I will send them once qemu.git
 has virtio-ioeventfd support.
 
  hw/vhost.c  |6 ++--
  hw/virtio.c |  106 
 +++
  hw/virtio.h |9 +
  kvm-all.c   |   39 +
  kvm-stub.c  |5 +++
  kvm.h   |1 +
  6 files changed, 156 insertions(+), 10 deletions(-)

Is there anything stopping this patch from being merged?

Thanks,
Stefan
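
The deferral described in the commit message above relies on an eventfd acting as a cheap notification counter. The following Linux-only sketch shows that mechanism in isolation, with a userspace write standing in for the guest's virtqueue kick (with KVM's ioeventfd the kernel signals the eventfd itself); notify() and drain() are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* "Kick": cheap, only bumps the eventfd counter and returns. */
static int notify(int efd)
{
    uint64_t one = 1;

    assert(efd >= 0);
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * "iothread" side: a single read collects all kicks that arrived
 * since the last read and resets the counter. Returns the number of
 * coalesced kicks, or 0 if nothing was pending (non-blocking fd).
 */
static uint64_t drain(int efd)
{
    uint64_t count = 0;

    assert(efd >= 0);
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0;
    }
    return count;
}
```

Several kicks issued before the consumer runs are coalesced into one wakeup, which is one reason the notifying side can return to guest code quickly.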

[Qemu-devel] Re: [PATCH v5 00/14] pcie port switch emulators

2010-10-19 Thread Michael S. Tsirkin

On Tue, Oct 19, 2010 at 06:06:27PM +0900, Isaku Yamahata wrote:
 Here is v5 of the pcie patch series.
 I hope I addressed the blockers.
 On uncorrectable error status register in pcie_aer_write_config().
 The register is RW1CS, so making it writable and test-and-clear doesn't
 work.
 
 new patches: 1, 2, 
 updated patches (excluding trivial changes): 4, 7, 8

Ok, I applied patches 1,2,3 and 5.

 BTW, as 0.13 is released, any chance to sync pci branch with
 the upstream by requesting pull?
 
 Patch description:
 This patch series implements pcie port switch emulators
 which is basic part for pcie/q35 support.
 This is for mst/pci tree.
 
 change v4 -> v5:
 - introduced pci_xxx_test_and_clear/set_mask
 - eliminated xxx_notify(msi_trigger, int_level)
 - eliminated FLR bits.
   FLR will be addressed at the next phase.
 
 changes v3 -> v4:
 - introduced new pci config helper functions.(clear set bit)
 - various clean up and some bug fixes.
 - dropped pci_shift_xxx().
 - dropped function pointer in pcie_aer.h
 - dropped pci_exp_cap(), pcie_aer_cap().
 - file rename (pcie_{root, upstream, downstream} => ioh3420, x3130).
 
 changes v2 -> v3:
 - msi: improved comment and simplified shift/ffs dance
 - pci w1c config register framework
 - split pcie.[ch] into pcie_regs.h, pcie.[ch] and pcie_aer.[ch]
 - pcie, aer: many changes by following reviews.
 
 changes v1 -> v2:
 - update msi
 - dropped already pushed out patches.
 - added msix patches.
 
 Isaku Yamahata (14):
   pci: introduce helper functions to test-and-{clear, set} mask in
 configuration space
   pci: introduce helper function to handle msi-x and msi.
   pci: use pci_word_test_and_clear_mask() in pci_device_reset()
   pci/bridge: fix pci_bridge_reset()
   msi: implements msi
   pcie: add pcie constants to pcie_regs.h
   pcie: helper functions for pcie capability and extended capability
   pcie/aer: helper functions for pcie aer capability
   pcie port: define struct PCIEPort/PCIESlot and helper functions
   ioh3420: pcie root port in X58 ioh
   x3130: pcie upstream port
   x3130: pcie downstream port
   pcie/hotplug: introduce pushing attention button command
   pcie/aer: glue aer error injection into qemu monitor
 
  Makefile.objs   |4 +-
  hw/ioh3420.c|  229 +
  hw/ioh3420.h|   10 +
  hw/msi.c|  352 +++
  hw/msi.h|   41 +++
  hw/pci.c|   24 ++-
  hw/pci.h|   88 +-
  hw/pci_bridge.c |   57 +++-
  hw/pci_bridge.h |2 +
  hw/pcie.c   |  540 +
  hw/pcie.h   |  113 ++
  hw/pcie_aer.c   |  869 
 +++
  hw/pcie_aer.h   |  105 ++
  hw/pcie_port.c  |  198 +++
  hw/pcie_port.h  |   51 +++
  hw/pcie_regs.h  |  154 +
  hw/xio3130_downstream.c |  197 +++
  hw/xio3130_downstream.h |   11 +
  hw/xio3130_upstream.c   |  181 ++
  hw/xio3130_upstream.h   |   10 +
  qemu-common.h   |6 +
  qemu-monitor.hx |   36 ++
  sysemu.h|9 +
  23 files changed, 3272 insertions(+), 15 deletions(-)
  create mode 100644 hw/ioh3420.c
  create mode 100644 hw/ioh3420.h
  create mode 100644 hw/msi.c
  create mode 100644 hw/msi.h
  create mode 100644 hw/pcie.c
  create mode 100644 hw/pcie.h
  create mode 100644 hw/pcie_aer.c
  create mode 100644 hw/pcie_aer.h
  create mode 100644 hw/pcie_port.c
  create mode 100644 hw/pcie_port.h
  create mode 100644 hw/pcie_regs.h
  create mode 100644 hw/xio3130_downstream.c
  create mode 100644 hw/xio3130_downstream.h
  create mode 100644 hw/xio3130_upstream.c
  create mode 100644 hw/xio3130_upstream.h

Re: [Qemu-devel] Static tracepoint control via trace-event

2010-10-19 Thread Daniel P. Berrange

On Tue, Oct 19, 2010 at 03:08:08PM +0200, Jan Kiszka wrote:
 Hi Stefan,
 
 just had a closer look at qemu's new tracing framework. Looks cool,
 though it leaves a bit of room for improvement. ;)
 
 One quirk I stumbled over quickly was the "disable" tag in trace-events.
 It confused me at first, as qemu starts without any tracepoint enabled by
 default and I thought I had to hack the file. Then I read the doc and
 wondered which existing or future backend would come without sufficiently
 fast dynamic tracepoint control. Do you have any in mind?
 
 Instead of making it a compile-time switch (except for simpletrace), I
 would vote for declaring the simpletrace usage as the only one: disable
 sets the default state of the dynamic tracepoint. That way we could use
 trace-events to define a useful set of standard, moderate-impact
 tracepoints that shall be on. Others will still be available once a
 backend is configured, but remain off until enabled during runtime.
 Anything else looks like overkill to me.

FYI with the DTrace/SystemTAP backend I posted yesterday, the 'disable'
keyword is effectively completely ignored. All tracepoints are disabled
when QEMU is running normally. Only when an end user runs a dtrace script
that references a QEMU tracepoint is that specific tracepoint enabled.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Stefan Hajnoczi

On Tue, Oct 19, 2010 at 2:33 PM, Anthony Liguori anth...@codemonkey.ws wrote:
 On 10/19/2010 08:27 AM, Avi Kivity wrote:

  On 10/19/2010 03:22 PM, Anthony Liguori wrote:

 I had assumed that this would involve:

 qemu -hda windows.img

 (qemu) snapshot ide0-disk0 snap0.img

 1) create snap0.img internally by doing the equivalent of `qemu-img
 create -f qcow2 -b windows.img snap0.img'
 2) bdrv_flush('ide0-disk0')
 3) bdrv_open(snap0.img)
 4) bdrv_close(windows.img)
 5) rename('windows.img', 'windows.img.tmp')
 6) rename('snap0.img', 'windows.img')
 7) rename('windows.img.tmp', 'snap0.img')


 Looks reasonable.

 Would be interesting to look at this as a use case for the threading work.
  We should eventually be able to create a snapshot without stalling vcpus
 (stalling I/O of course allowed).

 If we had another block-level command, like bdrv_aio_freeze(), that queued
 all pending requests until the given callback completed, it would be very
 easy to do this entirely asynchronously.  For instance:

 bdrv_aio_freeze(create_snapshot)

 create_snapshot():
  bdrv_aio_flush(done_flush)

 done_flush():
  bdrv_open(...)
  bdrv_close(...)
  ...

 Of course, closing a device while it's being frozen is probably a recipe for
 disaster but you get the idea :-)
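
 A toy, fully synchronous model of the freeze idea sketched above: while the device is frozen, new requests are queued instead of executed, and they are replayed once the callback has finished. None of this is QEMU API; BlockDev, submit(), freeze() and create_snapshot() are hypothetical names, requests are plain integers, and create_snapshot() simulates guest requests arriving during the freeze by submitting two itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16
#define LOG_SIZE    32

typedef struct BlockDev {
    bool   frozen;
    int    pending[MAX_PENDING];  /* requests held back while frozen */
    size_t npending;
    int    log[LOG_SIZE];         /* order in which requests completed */
    size_t nlog;
} BlockDev;

/* "Execute" a request: here it is only recorded in the log. */
static void complete(BlockDev *bs, int req)
{
    if (bs->nlog < LOG_SIZE) {
        bs->log[bs->nlog++] = req;
    }
}

/* Submit a request; it is queued instead of executed while frozen. */
static int submit(BlockDev *bs, int req)
{
    assert(bs);
    if (bs->frozen) {
        if (bs->npending == MAX_PENDING) {
            return -1;            /* queue full */
        }
        bs->pending[bs->npending++] = req;
        return 0;
    }
    complete(bs, req);
    return 0;
}

/* Freeze, run the callback, thaw, then replay what was queued. */
static void freeze(BlockDev *bs, void (*cb)(BlockDev *))
{
    size_t i;

    assert(bs && cb);
    bs->frozen = true;
    cb(bs);
    bs->frozen = false;
    for (i = 0; i < bs->npending; i++) {
        complete(bs, bs->pending[i]);
    }
    bs->npending = 0;
}

/* Stand-in snapshot callback; the two submits simulate guest I/O. */
static void create_snapshot(BlockDev *bs)
{
    submit(bs, 100);
    submit(bs, 101);
}
```

 In the real asynchronous case the replay would happen from the completion callback rather than inline, but the ordering guarantee is the same: nothing submitted during the freeze runs before the callback completes.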

bdrv_aio_freeze() or any mechanism to deal with pending requests in
the generic block code would be a good step for future live support
of other operations like truncate.

Stefan

Re: [Qemu-devel] [PATCH 00/10] [PULL] qemu-kvm.git uq/master queue

2010-10-19 Thread Anthony Liguori


On 10/19/2010 05:40 AM, Marcelo Tosatti wrote:

The following changes since commit 38cc9b607f85017b095793cab6c129bc9844f441:

   issue snd_pcm_start() when capturing audio (2010-10-18 00:39:06 +0400)

are available in the git repository at:
   git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master
   


This breaks the build.

cc1: warnings being treated as errors
/home/anthony/git/qemu/target-i386/kvm.c: In function ‘kvm_on_sigbus_vcpu’:
/home/anthony/git/qemu/target-i386/kvm.c:1671: error: passing argument 3 
of ‘kvm_physical_memory_addr_from_ram’ from incompatible pointer type
/home/anthony/git/qemu/kvm.h:180: note: expected ‘target_phys_addr_t *’ 
but argument is of type ‘long unsigned int *’

/home/anthony/git/qemu/target-i386/kvm.c: In function ‘kvm_on_sigbus’:
/home/anthony/git/qemu/target-i386/kvm.c:1714: error: passing argument 3 
of ‘kvm_physical_memory_addr_from_ram’ from incompatible pointer type
/home/anthony/git/qemu/kvm.h:180: note: expected ‘target_phys_addr_t *’ 
but argument is of type ‘long unsigned int *’
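
The errors above are the usual result of passing a pointer to one integer type where a pointer to a different integer type is expected. A minimal sketch of the pattern and one common way to resolve it, under stated assumptions: phys_addr_t is a stand-in for target_phys_addr_t (its real width depends on the build), and addr_from_ram() and lookup() are hypothetical functions, not the ones in target-i386/kvm.c.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for target_phys_addr_t; the real width is an assumption. */
typedef uint32_t phys_addr_t;

/* Callee with an out-parameter of the target-specific type. */
static int addr_from_ram(unsigned long ram_addr, phys_addr_t *out)
{
    assert(out);
    *out = (phys_addr_t)ram_addr + 0x1000u;
    return 1;
}

/*
 * Passing "&some_unsigned_long" as the second argument would trigger
 * "passing argument ... from incompatible pointer type", which is
 * fatal when warnings are treated as errors. Declaring the local with
 * the exact type the callee expects avoids the mismatch; any widening
 * then happens in an ordinary assignment.
 */
static int lookup(unsigned long ram_addr, unsigned long long *result)
{
    phys_addr_t paddr;

    assert(result);
    if (!addr_from_ram(ram_addr, &paddr)) {
        return 0;
    }
    *result = paddr;
    return 1;
}
```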

make[1]: *** [kvm.o] Error 1
make: *** [subdir-i386-softmmu] Error 2

I've pushed my tree to http://repo.or.cz/w/qemu/aliguori.git 
qemu-kvm-20101019 but the merge is a fast-forward so you should have no 
trouble reproducing.


anth...@titi:~/build/qemu$ uname -a
Linux titi 2.6.32.11+drm33.2-x201 #1 SMP Sat May 22 09:58:34 PDT 2010 
x86_64 GNU/Linux


anth...@titi:~/build/qemu$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 
4.4.3-4ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs 
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr 
--enable-shared --enable-multiarch --enable-linker-build-id 
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext 
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 
--program-suffix=-4.4 --enable-nls --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-plugin --enable-objc-gc 
--disable-werror --with-arch-32=i486 --with-tune=generic 
--enable-checking=release --build=x86_64-linux-gnu 
--host=x86_64-linux-gnu --target=x86_64-linux-gnu

Thread model: posix
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)

Regards,

Anthony Liguori


Huang Ying (1):
   Add RAM -  physical addr mapping in MCE simulation

Joerg Roedel (2):
   Set cpuid definition to 0 before initializing it
   Add svm cpuid features

Marcelo Tosatti (7):
   signalfd compatibility
   iothread: use signalfd
   Expose thread_id in info cpus
   kvm: x86: add mce support
   Export qemu_ram_addr_from_host
   MCE: Relay UCR MCE to guest
   Add savevm/loadvm support for MCE

  Makefile.objs |1 +
  compatfd.c|  117 +++
  compatfd.h|   43 +++
  configure |   18 +++
  cpu-common.h  |3 +-
  cpu-defs.h|1 +
  cpus.c|  161 --
  exec-all.h|2 +-
  exec.c|   27 +++--
  kvm-all.c |   18 +++
  kvm-stub.c|5 +
  kvm.h |6 +
  monitor.c |4 +
  osdep.c   |   15 +++
  osdep.h   |1 +
  target-i386/cpu.h |   32 +-
  target-i386/cpuid.c   |   79 ++---
  target-i386/helper.c  |6 +
  target-i386/kvm.c |  300 -
  target-i386/kvm_x86.h |   22 
  20 files changed, 817 insertions(+), 44 deletions(-)
  create mode 100644 compatfd.c
  create mode 100644 compatfd.h
  create mode 100644 target-i386/kvm_x86.h

[Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify

2010-10-19 Thread Michael S. Tsirkin

On Tue, Oct 19, 2010 at 08:12:42AM -0500, Anthony Liguori wrote:
 On 10/19/2010 08:07 AM, Stefan Hajnoczi wrote:
 Is there anything stopping this patch from being merged?
 
 Michael, any objections?  If not, I'll merge it.

I don't really understand what's going on there.  The extra state in
notifiers especially scares me. If you do and are comfortable with the
code, go ahead :)

-- 
MST

[Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify

2010-10-19 Thread Stefan Hajnoczi

On Tue, Oct 19, 2010 at 2:35 PM, Michael S. Tsirkin m...@redhat.com wrote:
 On Tue, Oct 19, 2010 at 08:12:42AM -0500, Anthony Liguori wrote:
 On 10/19/2010 08:07 AM, Stefan Hajnoczi wrote:
 Is there anything stopping this patch from being merged?

 Michael, any objections?  If not, I'll merge it.

 I don't really understand what's going on there.  The extra state in
 notifiers especially scares me. If you do and are comfortable with the
 code, go ahead :)

I'm happy to address your comments.  The state machine was a bit icky
but I don't see a way around it.  Will follow up to your review email.

Stefan

[Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify

2010-10-19 Thread Michael S. Tsirkin

As a general comment, could you please try to split this patch
up, to make it easier to review? I did a pass over it but I am
still not understanding it completely.

My main concern is with the fact that we add more state
in notifiers that can easily get out of sync with users.
If we absolutely need this state, let's try to at least
document the state machine, and make the API
for state transitions more transparent.

On Thu, Sep 30, 2010 at 03:01:52PM +0100, Stefan Hajnoczi wrote:
 diff --git a/hw/vhost.c b/hw/vhost.c
 index 1b8624d..f127a07 100644
 --- a/hw/vhost.c
 +++ b/hw/vhost.c
 @@ -517,7 +517,7 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
  goto fail_guest_notifier;
  }
  
 -r = vdev->binding->set_host_notifier(vdev->binding_opaque, idx, true);
 +r = virtio_set_host_notifier(vdev, idx, true);
  if (r < 0) {
  fprintf(stderr, "Error binding host notifier: %d\n", -r);
  goto fail_host_notifier;
 @@ -539,7 +539,7 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
  
  fail_call:
  fail_kick:
 -vdev->binding->set_host_notifier(vdev->binding_opaque, idx, false);
 +virtio_set_host_notifier(vdev, idx, false);
  fail_host_notifier:
  vdev->binding->set_guest_notifier(vdev->binding_opaque, idx, false);
  fail_guest_notifier:
 @@ -575,7 +575,7 @@ static void vhost_virtqueue_cleanup(struct vhost_dev *dev,
  }
  assert (r >= 0);
  
 -r = vdev->binding->set_host_notifier(vdev->binding_opaque, idx, false);
 +r = virtio_set_host_notifier(vdev, idx, false);
  if (r < 0) {
  fprintf(stderr, "vhost VQ %d host cleanup failed: %d\n", idx, r);
  fflush(stderr);
 diff --git a/hw/virtio.c b/hw/virtio.c
 index fbef788..f075b3a 100644
 --- a/hw/virtio.c
 +++ b/hw/virtio.c
 @@ -16,6 +16,7 @@
  #include "trace.h"
  #include "virtio.h"
  #include "sysemu.h"
 +#include "kvm.h"
  
  /* The alignment to use between consumer and producer parts of vring.
   * x86 pagesize again. */
 @@ -77,6 +78,11 @@ struct VirtQueue
  VirtIODevice *vdev;
  EventNotifier guest_notifier;
  EventNotifier host_notifier;
 +enum {
 +HOST_NOTIFIER_DEASSIGNED,   /* inactive */
 +HOST_NOTIFIER_ASSIGNED, /* active */
 +HOST_NOTIFIER_OFFLIMITS,/* active but outside our control */
 +} host_notifier_state;

This state machine confuses me. Please note that users already
track notifier state and call set with assign/deassign correctly.
The comment does not help: what does 'outside our control' mean?
Whose control?

  };
  
  /* virt queue functions */
 @@ -453,6 +459,93 @@ void virtio_update_irq(VirtIODevice *vdev)
  virtio_notify_vector(vdev, VIRTIO_NO_VECTOR);
  }
  
 +/* Service virtqueue notify from a host notifier */
 +static void virtio_read_host_notifier(void *opaque)
 +{
 +VirtQueue *vq = opaque;
 +EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
 +if (event_notifier_test_and_clear(notifier)) {
 +if (vq->vring.desc) {
 +vq->handle_output(vq->vdev, vq);
 +}
 +}
 +}
 +
 +/* Transition between host notifier states */
 +static int virtio_set_host_notifier_state(VirtIODevice *vdev, int n, int 
 state)

really unfortunate naming for functions: we seem to have
about 4 of them starting with virtio_set_host_notifier*

 +{
 +VirtQueue *vq = &vdev->vq[n];
 +EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
 +int rc;
 +
 +if (!kvm_enabled()) {
 +return -ENOSYS;
 +}

If this means that there's no need to do anything for non kvm,
return 0 here.

 +
 +/* If the number of ioeventfds is limited, use them for vhost only */
 +if (state == HOST_NOTIFIER_ASSIGNED && !kvm_has_many_iobus_devs()) {
 +state = HOST_NOTIFIER_DEASSIGNED;
 +}
 +
 +/* Ignore if no state change */
 +if (vq->host_notifier_state == state) {
 +return 0;
 +}
 +
 +/* Disable read handler if transitioning away from assigned */
 +if (vq->host_notifier_state == HOST_NOTIFIER_ASSIGNED) {
 +qemu_set_fd_handler(event_notifier_get_fd(notifier), NULL, NULL, 
 NULL);
 +}
 +
 +/* Toggle host notifier if transitioning to or from deassigned */
 +if (state == HOST_NOTIFIER_DEASSIGNED ||
 +vq->host_notifier_state == HOST_NOTIFIER_DEASSIGNED) {
 +rc = vdev->binding->set_host_notifier(vdev->binding_opaque, n,
 +state != HOST_NOTIFIER_DEASSIGNED);
 +if (rc < 0) {
 +return rc;
 +}
 +}
 +
 +/* Enable read handler if transitioning to assigned */
 +if (state == HOST_NOTIFIER_ASSIGNED) {
 +qemu_set_fd_handler(event_notifier_get_fd(notifier),
 +virtio_read_host_notifier, NULL, vq);
 +}
 +
 +vq->host_notifier_state = state;
 +return 0;
 +}
 +
 +/* Try to assign/deassign host notifiers for all virtqueues */
 +static void virtio_set_host_notifiers(VirtIODevice *vdev, bool assigned)

void? don't we care whether this fails?

 +{

[Qemu-devel] CFP: 1st International QEMU Users Forum

2010-10-19 Thread Wolfgang Mueller

*
Call for Presentations
   1st International QEMU Users Forum

   March 18th, 2011, Grenoble, France
*

Deadlines:
Extended abstract   Nov 28th, 2010
Notification of acceptance  Nov 30th, 2010

More information is available at: http://adt.cs.upb.de/quf

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Anthony Liguori


On 10/19/2010 07:48 AM, Dor Laor wrote:

On 10/19/2010 04:11 AM, Chris Wright wrote:

* Juan Quintela (quint...@redhat.com) wrote:


Please send in any agenda items you are interested in covering.


- 0.13.X -stable handoff
- 0.14 planning
- threadlet work
- virtfs proposals



- Live snapshots
  - We were asked to add this feature for external qcow2
images. Would a simple approach of fsync + tracking each requested
backing file (it can be per vDisk) and re-opening the new image
be accepted?


I had assumed that this would involve:

qemu -hda windows.img

(qemu) snapshot ide0-disk0 snap0.img

1) create snap0.img internally by doing the equivalent of `qemu-img 
create -f qcow2 -b windows.img snap0.img'

2) bdrv_flush('ide0-disk0')
3) bdrv_open(snap0.img)
4) bdrv_close(windows.img)
5) rename('windows.img', 'windows.img.tmp')
6) rename('snap0.img', 'windows.img')
7) rename('windows.img.tmp', 'snap0.img')
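
Steps 5-7 above amount to swapping two file names through a temporary one. A minimal sketch of just that part, using plain POSIX rename(); the function name and the rollback on failure are assumptions for illustration, and none of the bdrv_* calls are modelled here:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Swap the names of two files via a temporary name (steps 5-7).
 * Returns 0 on success, -1 if any rename() fails.
 * The swap as a whole is not atomic: a crash between two renames
 * leaves one of the names missing.
 */
static int toy_swap_image_names(const char *active, const char *snap,
                                const char *tmp)
{
    if (rename(active, tmp) != 0) {
        return -1;
    }
    if (rename(snap, active) != 0) {
        rename(tmp, active);    /* best-effort rollback of the first step */
        return -1;
    }
    if (rename(tmp, snap) != 0) {
        return -1;
    }
    return 0;
}
```

After a successful swap, the original file name refers to the new image and the snapshot name refers to the old data. A process that already holds the files open is unaffected by the renames.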

Regards,

Anthony Liguori


  - Integration with FS freeze for consistent guest app snapshot
Many apps do not sync their ram state to disk correctly or frequently
enough. Physical world backup software calls fs freeze on xfs and
VSS for windows to make the backup consistent.
In order to integrate this with live snapshots we need a guest
agent to trigger the guest fs freeze.
We can either have qemu communicate with the agent directly through
virtio-serial or have a mgmt daemon use virtio-serial to
communicate with the guest in addition to QMP messages about the
live snapshot state.
Preferences? The first solution complicates qemu while the second
complicates mgmt.

Re: [Qemu-devel] [PATCH 3/3] monitor: add usb_attach and usb_detach

2010-10-19 Thread Gerd Hoffmann


  Hi,


+.help   = attach USB device 'bus.addr',



+...@item usb_attach @var{devname}


/me sees a mismatch here.

There is still the use case question.  Also note that this might have 
unwanted side effects when drivers automagically attach/detach devices 
like usb-host.


Having this purely for debugging/troubleshooting purposes would be fine 
with me, but the documentation should clearly say so.


cheers,
  Gerd

Re: [Qemu-devel] [PATCH 2/3] usb: add public usb_device_by_id

2010-10-19 Thread Gerd Hoffmann


  Hi,


There is no problem to loop over all usb devices. But first of all I
don't want to loop on used, since then I miss any detached devices,
so I actually do want the same behavior of qdev_find_recursive, and
since it's already available, why rewrite it in a different
compilation unit?


Point.  ACK then.

cheers,
  Gerd

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Avi Kivity


 On 10/19/2010 02:58 PM, Dor Laor wrote:

On 10/19/2010 02:55 PM, Avi Kivity wrote:

On 10/19/2010 02:48 PM, Dor Laor wrote:

On 10/19/2010 04:11 AM, Chris Wright wrote:

* Juan Quintela (quint...@redhat.com) wrote:


Please send in any agenda items you are interested in covering.


- 0.13.X -stable handoff
- 0.14 planning
- threadlet work
- virtfs proposals



- Live snapshots
- We were asked to add this feature for external qcow2
images. Would a simple approach of fsync + tracking each requested
backing file (it can be per vDisk) and re-opening the new image
be accepted?
- Integration with FS freeze for consistent guest app snapshot
Many apps do not sync their ram state to disk correctly or frequently
enough. Physical world backup software calls fs freeze on xfs and
VSS for windows to make the backup consistent.
In order to integrate this with live snapshots we need a guest
agent to trigger the guest fs freeze.
We can either have qemu communicate with the agent directly through
virtio-serial or have a mgmt daemon use virtio-serial to
communicate with the guest in addition to QMP messages about the
live snapshot state.
Preferences? The first solution complicates qemu while the second
complicates mgmt.


Third option, make the freeze path management -> qemu -> virtio-blk ->
guest kernel -> file systems. The advantage is that it's easy to
associate file systems with a block device this way.


OTOH the userspace freeze path already exists and now you create another 
path. 


I guess we would still have a userspace daemon; instead of talking to 
virtio-serial it talks to virtio-blk.  So:


  management -> qemu -> virtio-blk -> guest driver -> kernel fs 
resolver -> daemon -> apps


Yuck.


What about file systems that span LVM over multiple drives? IDE/SCSI?


Good points.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Anthony Liguori


On 10/19/2010 08:27 AM, Avi Kivity wrote:

 On 10/19/2010 03:22 PM, Anthony Liguori wrote:


I had assumed that this would involve:

qemu -hda windows.img

(qemu) snapshot ide0-disk0 snap0.img

1) create snap0.img internally by doing the equivalent of `qemu-img 
create -f qcow2 -b windows.img snap0.img'

2) bdrv_flush('ide0-disk0')
3) bdrv_open(snap0.img)
4) bdrv_close(windows.img)
5) rename('windows.img', 'windows.img.tmp')
6) rename('snap0.img', 'windows.img')
7) rename('windows.img.tmp', 'snap0.img')



Looks reasonable.

Would be interesting to look at this as a use case for the threading 
work.  We should eventually be able to create a snapshot without 
stalling vcpus (stalling I/O of course allowed).


If we had another block-level command, like bdrv_aio_freeze(), that 
queued all pending requests until the given callback completed, it would 
be very easy to do this entirely asynchronously.  For instance:


bdrv_aio_freeze(create_snapshot)

create_snapshot():
  bdrv_aio_flush(done_flush)

done_flush():
  bdrv_open(...)
  bdrv_close(...)
  ...

Of course, closing a device while it's being frozen is probably a recipe 
for disaster but you get the idea :-)
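
To make the idea concrete, here is a toy model of the proposed behaviour. bdrv_aio_freeze() does not exist; every name below is hypothetical, and the "device" is just a pair of arrays. While frozen, submitted requests are queued; once the callback returns, the device thaws and the queued requests are replayed in order. The callback runs synchronously here, whereas the real thing would complete asynchronously.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_REQS 16

typedef struct ToyDev {
    int frozen;
    int pending[TOY_MAX_REQS];    /* requests held back while frozen */
    size_t npending;
    int completed[TOY_MAX_REQS];  /* requests that reached the "disk" */
    size_t ncompleted;
} ToyDev;

static void toy_complete(ToyDev *d, int req)
{
    if (d->ncompleted < TOY_MAX_REQS) {
        d->completed[d->ncompleted++] = req;
    }
}

/* Submit a request: queue it if the device is frozen, else run it now. */
static int toy_submit(ToyDev *d, int req)
{
    if (d->frozen) {
        if (d->npending == TOY_MAX_REQS) {
            return -1;
        }
        d->pending[d->npending++] = req;
        return 0;
    }
    toy_complete(d, req);
    return 0;
}

/* Freeze, run the callback, then thaw and replay queued requests. */
static void toy_freeze(ToyDev *d, void (*cb)(ToyDev *))
{
    size_t i;

    d->frozen = 1;
    cb(d);
    d->frozen = 0;
    for (i = 0; i < d->npending; i++) {
        toy_complete(d, d->pending[i]);
    }
    d->npending = 0;
}

/* Example callback: a guest write (42) arrives while the snapshot runs. */
static void toy_snapshot_cb(ToyDev *d)
{
    toy_submit(d, 42);
}
```

With this shape the snapshot code never sees a request slip in between the flush and the reopen, which is the property the freeze is meant to give.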


Regards,

Anthony Liguori

[Qemu-devel] Re: Static tracepoint control via trace-event

2010-10-19 Thread Stefan Hajnoczi

On Tue, Oct 19, 2010 at 03:08:08PM +0200, Jan Kiszka wrote:
 One quirk I stumbled over quickly was the disable tag in trace-events.
 It confused me first as qemu starts without any tracepoint enabled by
 default and I thought I had to hack the file. Then I read the doc and
 wondered which existing or future backend would come without sufficiently
 fast dynamic tracepoint control. Do you have any in mind?
 
 Instead of making it a compile-time switch (except for simpletrace), I
 would vote for declaring the simpletrace usage as the only one: disable
 sets the default state of the dynamic tracepoint. That way we could use
 trace-events to define a useful set of standard, moderate-impact
 tracepoints that shall be on. Others will still be available once a
 backend is configured, but remain off until enabled during runtime.
 Anything else looks like overkill to me.

The motivation for disable producing a nop trace event is that it
allows QEMU builds without certain trace events.  A trace event cannot
simply be removed by deleting its trace-events declaration since there
are calls to its trace_*() function in the source tree.  So this
provided a way to disable trace events before simpletrace supported
enabling/disabling trace events at runtime :).

Today that's no longer an issue for simpletrace and other tracing
backends like LTTng UST and SystemTAP handle disabled trace events well.

I agree that keeping just one meaning for the disable keyword is
better.  Perhaps we should keep a separate nop keyword to build out
specific trace events.

When would nop be handy?  I think an ftrace backend is a good example.
Since an ftrace marker cannot be enabled/disabled at runtime, the only
way to silence unwanted trace events is to nop them at compile-time.
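
As a rough illustration of what "nop them at compile-time" means (not tracetool's real output; all names here are made up): the generated header keeps a trace_*() function for every event so call sites still compile, but a nop'ed event gets an empty body that the compiler can drop entirely.

```c
#include <assert.h>

static int toy_events_recorded;   /* stand-in for a real trace backend */

static void toy_backend_record(int value)
{
    (void)value;
    toy_events_recorded++;
}

/* Event kept in the build: forwards to the backend. */
static inline void trace_toy_kept(int value)
{
    toy_backend_record(value);
}

/* Event marked nop: same signature, empty body, no runtime cost. */
static inline void trace_toy_nop(int value)
{
    (void)value;
}
```

Calling both functions once records exactly one event, and the nop'ed one cannot be switched back on without rebuilding.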

 There are a few more things I have in mind (ftrace backend, enhanced
 -trace switch, wildcards for enabling tracepoints, and more
 tracepoints). Will hopefully come up with patches to address them, but
 this may take a while.

Sounds great.

 PS: Do you maintain a tracing git tree?

No, I'm reviewing patches as they are posted for qemu-devel.  If the
backlog between mailing list discussion and merge reaches the point
where your patches are suffering conflicts please let me know and I can
maintain one.

For the initial QEMU tracing effort I kept a tree but I stopped after
the patches were accepted into mainline.  The patches I write go
straight to qemu-devel now.

Stefan

[Qemu-devel] Re: Static tracepoint control via trace-event

2010-10-19 Thread Jan Kiszka

On 19.10.2010 15:30, Stefan Hajnoczi wrote:
 On Tue, Oct 19, 2010 at 03:08:08PM +0200, Jan Kiszka wrote:
 One quirk I stumbled over quickly was the disable tag in trace-events.
 It confused me first as qemu starts without any tracepoint enabled by
 default and I thought I had to hack the file. Then I read the doc and
 wondered which existing or future backend would come without sufficiently
 fast dynamic tracepoint control. Do you have any in mind?

 Instead of making it a compile-time switch (except for simpletrace), I
 would vote for declaring the simpletrace usage as the only one: disable
 sets the default state of the dynamic tracepoint. That way we could use
 trace-events to define a useful set of standard, moderate-impact
 tracepoints that shall be on. Others will still be available once a
 backend is configured, but remain off until enabled during runtime.
 Anything else looks like overkill to me.
 
 The motivation for disable producing a nop trace event is that it
 allows QEMU builds without certain trace events.  A trace event cannot
 simply be removed by deleting its trace-events declaration since there
 are calls to its trace_*() function in the source tree.  So this
 provided a way to disable trace events before simpletrace supported
 enabling/disabling trace events at runtime :).
 
 Today that's no longer an issue for simpletrace and other tracing
 backends like LTTng UST and SystemTAP handle disabled trace events well.
 
 I agree that keeping just one meaning for the disable keyword is
 better.  Perhaps we should keep a separate nop keyword to build out
 specific trace events.
 
 When would nop be handy?  I think an ftrace backend is a good example.
 Since an ftrace marker cannot be enabled/disabled at runtime, the only
 way to silence unwanted trace events is to nop them at compile-time.

Another to-do item is to remove the strange dependency of tracing
management features on CONFIG_SIMPLE_TRACE. That way the monitor
commands and a to-be-added command line option to control individual
tracepoints could of course also be used by an ftrace backend. I bet the
DTrace backend will like to see this as well.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify

2010-10-19 Thread Michael S. Tsirkin

On Tue, Oct 19, 2010 at 02:44:35PM +0100, Stefan Hajnoczi wrote:
 On Tue, Oct 19, 2010 at 2:35 PM, Michael S. Tsirkin m...@redhat.com wrote:
  On Tue, Oct 19, 2010 at 08:12:42AM -0500, Anthony Liguori wrote:
  On 10/19/2010 08:07 AM, Stefan Hajnoczi wrote:
  Is there anything stopping this patch from being merged?
 
  Michael, any objections?  If not, I'll merge it.
 
  I don't really understand what's going on there.  The extra state in
  notifiers especially scares me. If you do and are comfortable with the
  code, go ahead :)
 
 I'm happy to address your comments.  The state machine was a bit icky
 but I don't see a way around it.

I think the situation is similar to irqfd in qemu-kvm - take a look
there, specifically msix mask notifiers.

-- 
MST

Re: [Qemu-devel] Re: Static tracepoint control via trace-event

2010-10-19 Thread Stefan Hajnoczi

On Tue, Oct 19, 2010 at 2:46 PM, Jan Kiszka jan.kis...@siemens.com wrote:
 On 19.10.2010 15:30, Stefan Hajnoczi wrote:
 On Tue, Oct 19, 2010 at 03:08:08PM +0200, Jan Kiszka wrote:
 One quirk I stumbled over quickly was the disable tag in trace-events.
 It confused me first as qemu starts without any tracepoint enabled by
 default and I thought I had to hack the file. Then I read the doc and
 wondered which existing or future backend would come without sufficiently
 fast dynamic tracepoint control. Do you have any in mind?

 Instead of making it a compile-time switch (except for simpletrace), I
 would vote for declaring the simpletrace usage as the only one: disable
 sets the default state of the dynamic tracepoint. That way we could use
 trace-events to define a useful set of standard, moderate-impact
 tracepoints that shall be on. Others will still be available once a
 backend is configured, but remain off until enabled during runtime.
 Anything else looks like overkill to me.

 The motivation for disable producing a nop trace event is that it
 allows QEMU builds without certain trace events.  A trace event cannot
 simply be removed by deleting its trace-events declaration since there
 are calls to its trace_*() function in the source tree.  So this
 provided a way to disable trace events before simpletrace supported
 enabling/disabling trace events at runtime :).

 Today that's no longer an issue for simpletrace and other tracing
 backends like LTTng UST and SystemTAP handle disabled trace events well.

 I agree that keeping just one meaning for the disable keyword is
 better.  Perhaps we should keep a separate nop keyword to build out
 specific trace events.

 When would nop be handy?  I think an ftrace backend is a good example.
 Since an ftrace marker cannot be enabled/disabled at runtime, the only
 way to silence unwanted trace events is to nop them at compile-time.

 Another to-do item is to remove the strange dependency of tracing
 management features on CONFIG_SIMPLE_TRACE. That way the monitor
 commands and a to-be-added command line option to control individual
 tracepoints could of course also be used by an ftrace backend. I bet the
 DTrace backend will like to see this as well.

If there is a programmatic way of inspecting and toggling trace events
from inside an instrumented process, then yes.  If this is possible
with SystemTAP we should think about it now before QMP tracing
commands become available in a release.

Stefan

[Qemu-devel] [PATCH] Fix test suite build with tracing enabled

2010-10-19 Thread Jan Kiszka

qemu_malloc instrumentations require linking against the trace objects.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 Makefile |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Makefile b/Makefile
index 252c817..106a401 100644
--- a/Makefile
+++ b/Makefile
@@ -140,12 +140,12 @@ qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
 
 check-qint.o check-qstring.o check-qdict.o check-qlist.o check-qfloat.o 
check-qjson.o: $(GENERATED_HEADERS)
 
-check-qint: check-qint.o qint.o qemu-malloc.o
-check-qstring: check-qstring.o qstring.o qemu-malloc.o
-check-qdict: check-qdict.o qdict.o qfloat.o qint.o qstring.o qbool.o 
qemu-malloc.o qlist.o
-check-qlist: check-qlist.o qlist.o qint.o qemu-malloc.o
-check-qfloat: check-qfloat.o qfloat.o qemu-malloc.o
-check-qjson: check-qjson.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o 
qjson.o json-streamer.o json-lexer.o json-parser.o qemu-malloc.o
+check-qint: check-qint.o qint.o qemu-malloc.o $(trace-obj-y)
+check-qstring: check-qstring.o qstring.o qemu-malloc.o $(trace-obj-y)
+check-qdict: check-qdict.o qdict.o qfloat.o qint.o qstring.o qbool.o 
qemu-malloc.o qlist.o $(trace-obj-y)
+check-qlist: check-qlist.o qlist.o qint.o qemu-malloc.o $(trace-obj-y)
+check-qfloat: check-qfloat.o qfloat.o qemu-malloc.o $(trace-obj-y)
+check-qjson: check-qjson.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o 
qjson.o json-streamer.o json-lexer.o json-parser.o qemu-malloc.o $(trace-obj-y)
 
 clean:
 # avoid old build problems by removing potentially incorrect old files
-- 
1.7.1

Re: [Qemu-devel] Static tracepoint control via trace-event

2010-10-19 Thread Stefan Hajnoczi

On Tue, Oct 19, 2010 at 2:36 PM, Daniel P. Berrange berra...@redhat.com wrote:
 On Tue, Oct 19, 2010 at 03:08:08PM +0200, Jan Kiszka wrote:
 Hi Stefan,

 just had a closer look at qemu's new tracing framework. Looks cool,
 though it leaves a bit of room for improvement. ;)

 One quirk I stumbled over quickly was the disable tag in trace-events.
 It confused me first as qemu starts without any tracepoint enabled by
 default and I thought I had to hack the file. Then I read the doc and
 wondered which existing or future backend would come without sufficiently
 fast dynamic tracepoint control. Do you have any in mind?

 Instead of making it a compile-time switch (except for simpletrace), I
 would vote for declaring the simpletrace usage as the only one: disable
 sets the default state of the dynamic tracepoint. That way we could use
 trace-events to define a useful set of standard, moderate-impact
 tracepoints that shall be on. Others will still be available once a
 backend is configured, but remain off until enabled during runtime.
 Anything else looks like overkill to me.

 FYI with the DTrace/SystemTAP backend I posted yesterday, the 'disable'
 keyword is effectively completely ignored. All tracepoints are disabled
 when QEMU is running normally. Only when an end user runs a dtrace script
 that references a QEMU tracepoint, is that specific tracepoint enabled.

I think that makes sense for external trace backends.  DTrace can
launch a process for you with the probes you want enabled from the
start.  The simpletrace backend can't really do this so probes can be
enabled/disabled at compile-time (e.g. early startup tracing).

Stefan

[Qemu-devel] Re: [PATCH v5 00/14] pcie port switch emulators

2010-10-19 Thread Isaku Yamahata

  Isaku Yamahata (14):
pci: introduce helper functions to test-and-{clear, set} mask in
  configuration space
pci: introduce helper function to handle msi-x and msi.
pci: use pci_word_test_and_clear_mask() in pci_device_reset()
pci/bridge: fix pci_bridge_reset()
msi: implements msi
pcie: add pcie constants to pcie_regs.h
pcie: helper functions for pcie capability and extended capability
 
 I'll apply these.
 
pcie/aer: helper functions for pcie aer capability
 
 Maybe move this to the end of the series?
 
pcie port: define struct PCIEPort/PCIESlot and helper functions
ioh3420: pcie root port in X58 ioh
x3130: pcie upstream port
x3130: pcie downstream port
pcie/hotplug: introduce pushing attention button command
 
 I think the above can be applied - just remove the dependency
 on aer for now.

Okay. I'll update the patch series and send it tomorrow.

-- 
yamahata

Re: [Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-19 Thread Avi Kivity


 On 10/19/2010 03:38 PM, Stefan Hajnoczi wrote:

bdrv_aio_freeze() or any mechanism to deal with pending requests in
the generic block code would be a good step for future live support
of other operations like truncate.


+ logical disk grow, etc.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] Re: Static tracepoint control via trace-event

2010-10-19 Thread Daniel P. Berrange

On Tue, Oct 19, 2010 at 03:46:35PM +0200, Jan Kiszka wrote:
 On 19.10.2010 15:30, Stefan Hajnoczi wrote:
  On Tue, Oct 19, 2010 at 03:08:08PM +0200, Jan Kiszka wrote:
  One quirk I stumbled over quickly was the disable tag in trace-events.
  It confused me first as qemu starts without any tracepoint enabled by
  default and I thought I had to hack the file. Then I read the doc and
  wondered which existing or future backend would come without sufficiently
  fast dynamic tracepoint control. Do you have any in mind?
 
  Instead of making it a compile-time switch (except for simpletrace), I
  would vote for declaring the simpletrace usage as the only one: disable
  sets the default state of the dynamic tracepoint. That way we could use
  trace-events to define a useful set of standard, moderate-impact
  tracepoints that shall be on. Others will still be available once a
  backend is configured, but remain off until enabled during runtime.
  Anything else looks like overkill to me.
  
  The motivation for disable producing a nop trace event is that it
  allows QEMU builds without certain trace events.  A trace event cannot
  simply be removed by deleting its trace-events declaration since there
  are calls to its trace_*() function in the source tree.  So this
  provided a way to disable trace events before simpletrace supported
  enabling/disabling trace events at runtime :).
  
  Today that's no longer an issue for simpletrace and other tracing
  backends like LTTng UST and SystemTAP handle disabled trace events well.
  
  I agree that keeping just one meaning for the disable keyword is
  better.  Perhaps we should keep a separate nop keyword to build out
  specific trace events.
  
  When would nop be handy?  I think an ftrace backend is a good example.
  Since an ftrace marker cannot be enabled/disabled at runtime, the only
  way to silence unwanted trace events is to nop them at compile-time.
 
 Another to-do item is to remove the strange dependency of tracing
 management features on CONFIG_SIMPLE_TRACE. That way the monitor
 commands and a to-be-added command line option to control individual
 tracepoints could of course also be used by an ftrace backend. I bet the
 DTrace backend will like to see this as well.

I don't see a need for any monitor commands or command line options
for the DTrace backend, since everything is completely dynamically
controlled based on the tracing scripts the user is running. 

Regards,
Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Re: [Qemu-devel] Static tracepoint control via trace-event

2010-10-19 Thread Jan Kiszka

On 19.10.2010 15:52, Stefan Hajnoczi wrote:
 On Tue, Oct 19, 2010 at 2:36 PM, Daniel P. Berrange berra...@redhat.com 
 wrote:
 On Tue, Oct 19, 2010 at 03:08:08PM +0200, Jan Kiszka wrote:
 Hi Stefan,

 just had a closer look at qemu's new tracing framework. Looks cool,
 though it leaves a bit of room for improvement. ;)

 One quirk I stumbled over quickly was the disable tag in trace-events.
 It confused me first as qemu starts without any tracepoint enabled by
 default and I thought I had to hack the file. Then I read the doc and
 wondered which existing or future backend would come without sufficiently
 fast dynamic tracepoint control. Do you have any in mind?

 Instead of making it a compile-time switch (except for simpletrace), I
 would vote for declaring the simpletrace usage as the only one: disable
 sets the default state of the dynamic tracepoint. That way we could use
 trace-events to define a useful set of standard, moderate-impact
 tracepoints that shall be on. Others will still be available once a
 backend is configured, but remain off until enabled during runtime.
 Anything else looks like overkill to me.

 FYI with the DTrace/SystemTAP backend I posted yesterday, the 'disable'
 keyword is effectively completely ignored. All tracepoints are disabled
 when QEMU is running normally. Only when an end user runs a dtrace script
 that references a QEMU tracepoint, is that specific tracepoint enabled.
 
 I think that makes sense for external trace backends.  DTrace can
 launch a process for you with the probes you want enabled from the
 start.  The simpletrace backend can't really do this so probes can be
 enabled/disabled at compile-time (e.g. early startup tracing).

Once we have -trace events=..., defining the list of active
tracepoints before starting qemu will be trivial (e.g. via a config
file). Of course, this requires that all tracepoints are built-in...

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH] simpletrace: Inline runtime state check

2010-10-19 Thread Jan Kiszka

Instead of preparing all traced args, jumping into the common trace
function, even collecting a timestamp, do the check if a particular
tracepoint is enabled inline. Also, mark the enabled case unlikely to
motivate the compiler to push the trace code out of the fastpath.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 simpletrace.c |4 
 tracetool |7 +--
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/simpletrace.c b/simpletrace.c
index deb1e07..224e4ab 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -148,10 +148,6 @@ static void trace(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3,
      */
     clock_gettime(CLOCK_MONOTONIC, &ts);
 
-    if (!trace_list[event].state) {
-        return;
-    }
-
     rec->event = event;
     rec->timestamp_ns = ts.tv_sec * 1000000000LL + ts.tv_nsec;
     rec->x1 = x1;
diff --git a/tracetool b/tracetool
index 7010858..9532409 100755
--- a/tracetool
+++ b/tracetool
@@ -146,6 +146,8 @@ linetoh_begin_simple()
 {
     cat <<EOF
 #include "simpletrace.h"
+
+extern TraceEvent trace_list[];
 EOF
 
     simple_event_num=0
@@ -179,7 +181,9 @@ linetoh_simple()
     cat <<EOF
 static inline void trace_$name($args)
 {
-    trace$argc($trace_args);
+    if (unlikely(trace_list[$simple_event_num].state)) {
+        trace$argc($trace_args);
+    }
 }
 EOF
 
@@ -190,7 +194,6 @@ linetoh_end_simple()
 {
     cat <<EOF
 #define NR_TRACE_EVENTS $simple_event_num
-extern TraceEvent trace_list[NR_TRACE_EVENTS];
 EOF
 }
 
-- 
1.7.1
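
The effect of moving the state check into the generated wrapper can be shown with a toy Python model. This is only a sketch and not QEMU code: `TraceEvent`, `trace_list`, `records`, and the `timestamps_taken` counter are stand-ins invented for the illustration. The point is that a disabled tracepoint returns before the common trace function runs, so it never pays for the timestamp.

```python
import time

class TraceEvent:
    """Stand-in for one entry of the tracepoint table."""
    def __init__(self, name, state=False):
        self.name = name
        self.state = state  # True when the tracepoint is enabled

trace_list = [TraceEvent("bdrv_aio_readv"), TraceEvent("bdrv_aio_writev")]
records = []
timestamps_taken = 0

def trace(event_id, *args):
    """Common slow path: take a timestamp and store a record."""
    global timestamps_taken
    timestamps_taken += 1
    records.append((event_id, time.monotonic_ns(), args))

def trace_bdrv_aio_writev(bs, sector_num, nb_sectors):
    """Generated wrapper: check the runtime state before doing any work."""
    if trace_list[1].state:
        trace(1, bs, sector_num, nb_sectors)

trace_bdrv_aio_writev(0x1000, 0, 8)   # disabled: no timestamp, no record
trace_list[1].state = True
trace_bdrv_aio_writev(0x1000, 8, 8)   # enabled: exactly one record
```

In the C version the same check is additionally wrapped in unlikely(), which has no Python equivalent and is therefore left out of the model.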

[Qemu-devel] Re: [PATCH] Fix test suite build with tracing enabled

2010-10-19 Thread Stefan Hajnoczi

On Tue, Oct 19, 2010 at 04:03:15PM +0200, Jan Kiszka wrote:
 qemu_malloc instrumentations require linking against the trace objects.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  Makefile |   12 ++--
  1 files changed, 6 insertions(+), 6 deletions(-)

Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

Re: [Qemu-devel] [PATCH 0/7] ATAPI CDROM passthrough v5

2010-10-19 Thread Michal Suchanek

On 19 October 2010 08:17, Alexander Graf ag...@suse.de wrote:

 Am 19.10.2010 um 02:10 schrieb Anthony Liguori anth...@codemonkey.ws:

 On 10/18/2010 06:29 PM, Alexander Graf wrote:
 A user will get a really nasty surprise if they think they can use a flag 
 or rely on QEMU to prevent a VM from doing something nasty with a device.  
 If they have this feeling of security, they're likely to chmod the device 
 to allow unprivileged users to access it.

 But how a device handles ATAPI commands is totally up to the device.  If 
 you issue the wrong sequence, I'm sure there are devices out there that 
 totally hose themselves.  Are you absolutely confident that every ATAPI 
 device out there is completely safe against hostile code provided that you 
 simply prevent the FW update commands?  I'm certainly not.

 Ping?


 Who are you pinging?

 Mostly Ian. I haven't seen any follow-up on this discussion and would like to 
 know why and if there's still plans to upstream this code :).


Why is allowing ATAPI passthrough such a problem?

Sure, if your boot drive is on the same IDE cable as the device you may
have issues, but other than that the device may just stop working if it
is not designed to handle incorrect commands gracefully (i.e. it is
broken).

I am sure there are devices that also break when issued correct
commands or commands that look vaguely sane. E.g. there are CD-ROMs
that would lock up the whole system when you boot a certain vintage of
Linux (not tested with current Linux due to lack of old hardware) on a
machine with the Intel BX chipset and one of these CD-ROMs attached
over an IDE cable.

However, if you assume random hardware breakage, you cannot allow anything at all.

Perhaps the ATAPI passthrough should be designed to allow any commands
and some command profiles could be selected to allow for some
sane/conservative subset, burning, LightScribe, LabelFlash, disc
t...@tto, FW upgrade, ..

It would be nice if these subsets were defined in a configuration file
so that people can create their own 'default' combination or just
install a new set when a new fancy feature comes out.
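
A minimal sketch of what such command profiles might look like, assuming the filter only inspects the opcode (the first byte of the command packet). The profile names, the grouping, and the `allowed_opcodes`/`permit` helpers are hypothetical; only the opcode values are the standard SCSI/MMC ones. Nothing like this exists in the patch series under discussion.

```python
# Hypothetical profiles; each maps to a set of SCSI/MMC opcodes.
PROFILES = {
    "conservative": {
        0x00,  # TEST UNIT READY
        0x03,  # REQUEST SENSE
        0x12,  # INQUIRY
        0x25,  # READ CAPACITY (10)
        0x28,  # READ (10)
        0x43,  # READ TOC/PMA/ATIP
    },
    "burning": {
        0x2A,  # WRITE (10)
        0x35,  # SYNCHRONIZE CACHE (10)
        0x5B,  # CLOSE TRACK/SESSION
    },
    "fw-upgrade": {
        0x3B,  # WRITE BUFFER (used for firmware download)
    },
}

def allowed_opcodes(selected):
    """Union of the opcode sets of all selected profiles."""
    allowed = set()
    for name in selected:
        allowed |= PROFILES[name]
    return allowed

def permit(cdb, allowed):
    """Pass a command packet through only if its opcode is allowed."""
    return len(cdb) > 0 and cdb[0] in allowed

allowed = allowed_opcodes(["conservative", "burning"])
print(permit(bytes([0x28, 0, 0, 0, 0, 0, 0, 0, 1, 0]), allowed))  # READ (10)
print(permit(bytes([0x3B, 0, 0, 0, 0, 0, 0, 0, 0, 0]), allowed))  # WRITE BUFFER
```

A table like `PROFILES` could be loaded from a configuration file, so a new profile would not require a rebuild.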

Thanks

Michal

Tracing block devices (was: Re: [Qemu-devel] Static tracepoint control via trace-event)

2010-10-19 Thread Richard W.M. Jones

On Tue, Oct 19, 2010 at 03:59:36PM +0200, Jan Kiszka wrote:
 Once we have -trace events=..., defining the list of active
 tracepoints before starting qemu will be trivial (e.g. via a config
 file). Of course, this requires that all tracepoints are built-in...

Sorry that I've not been following this very closely, but does this
sort of thing allow tracing reads and writes to block devices?  Am I
right in thinking that if a tracepoint existed in the right place, one
could get a log file from that which could be post-processed in
another tool?

cf:
http://rwmj.wordpress.com/2010/10/05/visualizing-reads-writes-and-alignment/#content

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/

Re: [Qemu-devel] Re: Static tracepoint control via trace-event

2010-10-19 Thread Jan Kiszka

Am 19.10.2010 16:12, Daniel P. Berrange wrote:
 On Tue, Oct 19, 2010 at 03:46:35PM +0200, Jan Kiszka wrote:
 Am 19.10.2010 15:30, Stefan Hajnoczi wrote:
 On Tue, Oct 19, 2010 at 03:08:08PM +0200, Jan Kiszka wrote:
 One quirk I stumbled over quickly was the disable tag in trace-events.
 It confused me first as qemu starts without any tracepoint enabled by
 default and I thought I had to hack the file. Then I read the doc and
 wondered which existing or future backend would come without sufficiently
 fast dynamic tracepoint control. Do you have any in mind?

 Instead of making it a compile-time switch (except for simpletrace), I
 would vote for declaring the simpletrace usage as the only one: disable
 sets the default state of the dynamic tracepoint. That way we could use
 trace-events to define a useful set of standard, moderate-impact
 tracepoints that shall be on. Others will still be available once a
 backend is configured, but remain off until enabled during runtime.
 Anything else looks like overkill to me.

 The motivation for disable producing a nop trace event is that it
 allows QEMU builds without certain trace events.  A trace event cannot
 simply be removed by deleting its trace-events declaration since there
 are calls to its trace_*() function in the source tree.  So this
 provided a way to disable trace events before simpletrace supported
 enabling/disabling trace events at runtime :).

 Today that's no longer an issue for simpletrace and other tracing
 backends like LTTng UST and SystemTAP handle disabled trace events well.

 I agree that keeping just one meaning for the disable keyword is
 better.  Perhaps we should keep a separate nop keyword to build out
 specific trace events.

 When would nop be handy?  I think an ftrace backend is a good example.
 Since an ftrace marker cannot be enabled/disabled at runtime, the only
 way to silence unwanted trace events is to nop them at compile-time.

 Another to-do item is to remove the strange dependency of tracing
 management features on CONFIG_SIMPLE_TRACE. That way the monitor
 commands and a to-be-added command line option to control individual
 tracepoints could of course also be used by an ftrace backend. I bet the
 DTrace backend will like to see this as well.
 
 I don't see a need for any monitor commands or command line options
 for the DTrace backend, since everything is completely dynamically
 controlled based on the tracing scripts the user is running. 

Ah, it's all dynamic probing, you just need the marks. OK, was a bad
example. :)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: Tracing block devices

2010-10-19 Thread Jan Kiszka

Am 19.10.2010 16:29, Richard W.M. Jones wrote:
 On Tue, Oct 19, 2010 at 03:59:36PM +0200, Jan Kiszka wrote:
 Once we have -trace events=..., defining the list of active
 tracepoints before starting qemu will be trivial (e.g. via a config
 file). Of course, this requires that all tracepoints are built-in...
 
 Sorry that I've not been following this very closely, but does this
 sort of thing allow tracing reads and writes to block devices?  Am I
 right in thinking that if a tracepoint existed in the right place, one
 could get a log file from that which could be post-processed in
 another tool?
 
 cf:
 http://rwmj.wordpress.com/2010/10/05/visualizing-reads-writes-and-alignment/#content
 
 Rich.
 

Yes. The block layer is instrumented, not sure if already sufficiently,
but you may simply want to try the simpletrace backend and inspect the
result via its postprocessor (simpletrace.py).
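
As a rough idea of what further post-processing could look like, here is a small Python sketch. It assumes the pretty-printed trace is one event per line in the form "event-name, one more field, then key=value pairs"; that layout and the sample lines are assumptions made up for the illustration, so check them against real simpletrace.py output before relying on this.

```python
from collections import defaultdict

def parse_line(line):
    """Split 'event field key=value ...' into (event, {key: value})."""
    fields = line.split()
    if len(fields) < 2:
        return None
    attrs = dict(f.split("=", 1) for f in fields[2:] if "=" in f)
    return fields[0], attrs

def sectors_by_event(lines):
    """Sum nb_sectors per event name."""
    totals = defaultdict(int)
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        event, attrs = parsed
        if "nb_sectors" in attrs:
            totals[event] += int(attrs["nb_sectors"], 0)
    return dict(totals)

# Made-up sample lines, not captured from a real run.
sample = [
    "bdrv_aio_readv 1.5 bs=0x1000 sector_num=0x0 nb_sectors=0x8",
    "bdrv_aio_writev 2.0 bs=0x1000 sector_num=0x8 nb_sectors=0x10",
    "bdrv_aio_writev 0.7 bs=0x1000 sector_num=0x18 nb_sectors=0x8",
]
print(sectors_by_event(sample))
```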

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH 09/10] MCE: Relay UCR MCE to guest (v2)

2010-10-19 Thread Marcelo Tosatti


Port qemu-kvm's

commit 4b62fff1101a7ad77553147717a8bd3bf79df7ef
Author: Huang Ying ying.hu...@intel.com
Date:   Mon Sep 21 10:43:25 2009 +0800

MCE: Relay UCR MCE to guest

UCR (uncorrected recovery) MCE is supported in recent Intel CPUs,
where some hardware error such as some memory error can be reported
without PCC (processor context corrupted). To recover from such MCE,
the corresponding memory will be unmapped, and all processes accessing
the memory will be killed via SIGBUS.

For KVM, if QEMU/KVM is killed, all guest processes will be killed
too. So we relay SIGBUS from host OS to guest system via a UCR MCE
injection. Then guest OS can isolate corresponding memory and kill
necessary guest processes only. SIGBUS sent to main thread (not VCPU
threads) will be broadcast to all VCPU threads as UCR MCE.

v2: use target_phys_addr_t type for paddr.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
 ---
  cpus.c|   82 --
  kvm-stub.c|5 ++
  kvm.h |3 +
  target-i386/cpu.h |   20 +-
  target-i386/helper.c  |2 +-
  target-i386/kvm.c |  178 -
  target-i386/kvm_x86.h |3 +-
  7 files changed, 279 insertions(+), 14 deletions(-)
 
diff --git a/cpus.c b/cpus.c
index 429993a..62de0bc 100644
--- a/cpus.c
+++ b/cpus.c
@@ -34,6 +34,10 @@
 
 #include "cpus.h"
 #include "compatfd.h"
+#ifdef CONFIG_LINUX
+#include <sys/prctl.h>
+#include <sys/signalfd.h>
+#endif
 
 #ifdef SIGRTMIN
 #define SIG_IPI (SIGRTMIN+4)
@@ -41,6 +45,10 @@
 #define SIG_IPI SIGUSR1
 #endif
 
+#ifndef PR_MCE_KILL
+#define PR_MCE_KILL 33
+#endif
+
 static CPUState *next_cpu;
 
 /***/
@@ -498,28 +506,77 @@ static void qemu_tcg_wait_io_event(void)
     }
 }
 
+static void sigbus_reraise(void)
+{
+    sigset_t set;
+    struct sigaction action;
+
+    memset(&action, 0, sizeof(action));
+    action.sa_handler = SIG_DFL;
+    if (!sigaction(SIGBUS, &action, NULL)) {
+        raise(SIGBUS);
+        sigemptyset(&set);
+        sigaddset(&set, SIGBUS);
+        sigprocmask(SIG_UNBLOCK, &set, NULL);
+    }
+    perror("Failed to re-raise SIGBUS!\n");
+    abort();
+}
+
+static void sigbus_handler(int n, struct qemu_signalfd_siginfo *siginfo,
+                           void *ctx)
+{
+#if defined(TARGET_I386)
+    if (kvm_on_sigbus(siginfo->ssi_code, (void *)(intptr_t)siginfo->ssi_addr))
+#endif
+        sigbus_reraise();
+}
+
 static void qemu_kvm_eat_signal(CPUState *env, int timeout)
 {
     struct timespec ts;
     int r, e;
     siginfo_t siginfo;
     sigset_t waitset;
+    sigset_t chkset;
 
     ts.tv_sec = timeout / 1000;
     ts.tv_nsec = (timeout % 1000) * 1000000;
 
     sigemptyset(&waitset);
     sigaddset(&waitset, SIG_IPI);
+    sigaddset(&waitset, SIGBUS);
 
-    qemu_mutex_unlock(&qemu_global_mutex);
-    r = sigtimedwait(&waitset, &siginfo, &ts);
-    e = errno;
-    qemu_mutex_lock(&qemu_global_mutex);
+    do {
+        qemu_mutex_unlock(&qemu_global_mutex);
 
-    if (r == -1 && !(e == EAGAIN || e == EINTR)) {
-        fprintf(stderr, "sigtimedwait: %s\n", strerror(e));
-        exit(1);
-    }
+        r = sigtimedwait(&waitset, &siginfo, &ts);
+        e = errno;
+
+        qemu_mutex_lock(&qemu_global_mutex);
+
+        if (r == -1 && !(e == EAGAIN || e == EINTR)) {
+            fprintf(stderr, "sigtimedwait: %s\n", strerror(e));
+            exit(1);
+        }
+
+        switch (r) {
+        case SIGBUS:
+#ifdef TARGET_I386
+            if (kvm_on_sigbus_vcpu(env, siginfo.si_code, siginfo.si_addr))
+#endif
+                sigbus_reraise();
+            break;
+        default:
+            break;
+        }
+
+        r = sigpending(&chkset);
+        if (r == -1) {
+            fprintf(stderr, "sigpending: %s\n", strerror(e));
+            exit(1);
+        }
+    } while (sigismember(&chkset, SIG_IPI) || sigismember(&chkset, SIGBUS));
 }
 
 static void qemu_kvm_wait_io_event(CPUState *env)
@@ -645,6 +702,7 @@ static void kvm_init_ipi(CPUState *env)
 
     pthread_sigmask(SIG_BLOCK, NULL, &set);
     sigdelset(&set, SIG_IPI);
+    sigdelset(&set, SIGBUS);
     r = kvm_set_signal_mask(env, &set);
     if (r) {
         fprintf(stderr, "kvm_set_signal_mask: %s\n", strerror(r));
@@ -655,6 +713,7 @@ static void kvm_init_ipi(CPUState *env)
 static sigset_t block_io_signals(void)
 {
     sigset_t set;
+    struct sigaction action;
 
     /* SIGUSR2 used by posix-aio-compat.c */
     sigemptyset(&set);
@@ -665,8 +724,15 @@ static sigset_t block_io_signals(void)
     sigaddset(&set, SIGIO);
     sigaddset(&set, SIGALRM);
     sigaddset(&set, SIG_IPI);
+    sigaddset(&set, SIGBUS);
     pthread_sigmask(SIG_BLOCK, &set, NULL);
 
+    memset(&action, 0, sizeof(action));
+    action.sa_flags = SA_SIGINFO;
+    action.sa_sigaction = (void (*)(int, siginfo_t*, void*))sigbus_handler;
+    sigaction(SIGBUS, &action, NULL);
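
The signal-draining loop that this patch adds to qemu_kvm_eat_signal() can be tried outside QEMU. The following is a hedged Python sketch of the same pattern (block the signals, wait for one with a timeout, keep looping while another is pending). SIGUSR1 and SIGUSR2 stand in for SIG_IPI and SIGBUS, `drain_signals` is a name made up for this example, and it needs a Unix system with Python 3.

```python
import signal
import threading

# SIGUSR1/SIGUSR2 stand in for SIG_IPI and SIGBUS in this sketch.
WAITSET = {signal.SIGUSR1, signal.SIGUSR2}

def drain_signals(timeout=0.1):
    """Wait for one signal, then loop for as long as another one is pending."""
    seen = []
    while True:
        info = signal.sigtimedwait(WAITSET, timeout)
        if info is not None:
            seen.append(info.si_signo)
        if not (signal.sigpending() & WAITSET):
            return seen

# Block the signals so they stay pending instead of being delivered.
signal.pthread_sigmask(signal.SIG_BLOCK, WAITSET)
me = threading.get_ident()
signal.pthread_kill(me, signal.SIGUSR1)
signal.pthread_kill(me, signal.SIGUSR2)
handled = drain_signals()
print(handled)
```

Without the pending check, the second signal would stay queued until the next call, which is the situation the do/while loop in the patch avoids.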

[Qemu-devel] [PATCH 0/2] v2 Decouple block device removal from device removal

2010-10-19 Thread Ryan Harper

This patch series decouples the detachment of a block device from the removal
of the backing pci-device.  Removal of a hotplugged pci device requires the
guest to respond before qemu tears down the block device. In some cases, the
guest may not respond, leaving the guest with continued access to the block
device.

The new monitor command, drive_unplug, will revoke a guest's access to the
block device independently of the removal of the pci device.

The first patch adds a new drive find method, the second patch implements the
monitor command and block layer changes.

Changes since v1:
- CodingStyle fixes
- Added qemu_aio_flush() to bdrv_unplug()
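
The intended ordering (drain in-flight I/O first, then close, after which new requests fail) can be modelled in a few lines of Python. This is a toy model under stated assumptions, not the block layer: `ToyBlockDevice` and its methods are invented for the illustration, with `aio_flush()` standing in for qemu_aio_flush() and `unplug()` for bdrv_unplug().

```python
class ToyBlockDevice:
    """Toy stand-in for a block device with asynchronous writes."""

    def __init__(self):
        self.pending = []   # writes submitted but not yet completed
        self.disk = {}      # sector -> data that reached the image
        self.is_open = True

    def submit_write(self, sector, data):
        if not self.is_open:
            raise OSError("device unplugged")
        self.pending.append((sector, data))

    def aio_flush(self):
        """Complete every in-flight request."""
        for sector, data in self.pending:
            self.disk[sector] = data
        self.pending.clear()

    def unplug(self):
        """Drain in-flight I/O first, then refuse any further requests."""
        self.aio_flush()
        self.is_open = False

dev = ToyBlockDevice()
dev.submit_write(0, b"in flight")
dev.unplug()
try:
    dev.submit_write(1, b"too late")
    rejected = False
except OSError:
    rejected = True
print(dev.disk, rejected)
```

The write submitted before the unplug still reaches the image, while the one submitted afterwards gets an I/O error, which is what the guest would observe.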

Signed-off-by: Ryan Harper ry...@us.ibm.com

Re: Tracing block devices (was: Re: [Qemu-devel] Static tracepoint control via trace-event)

2010-10-19 Thread Stefan Hajnoczi

On Tue, Oct 19, 2010 at 03:29:51PM +0100, Richard W.M. Jones wrote:
 On Tue, Oct 19, 2010 at 03:59:36PM +0200, Jan Kiszka wrote:
  Once we have -trace events=..., defining the list of active
  tracepoints before starting qemu will be trivial (e.g. via a config
  file). Of course, this requires that all tracepoints are built-in...
 
 Sorry that I've not been following this very closely, but does this
 sort of thing allow tracing reads and writes to block devices?  Am I
 right in thinking that if a tracepoint existed in the right place, one
 could get a log file from that which could be post-processed in
 another tool?
 
 cf:
 http://rwmj.wordpress.com/2010/10/05/visualizing-reads-writes-and-alignment/#content

Definitely, here is the commit that added bdrv_aio_writev/bdrv_aio_readv
tracing.  bdrv_aio_multiwrite has been traced for a while.

http://patchwork.ozlabs.org/patch/66843/

As an example, I use the following script to find all write requests
that touch a given region.  This is very useful for debugging image
corruptions given a trace file:

The usage is:

find_overlapping_io.py <bs> <sector_num> <nb_sectors>

where bs is the block driver state pointer, sector_num is the starting
sector address, and nb_sectors is the number of sectors.

#!/usr/bin/env python
import sys

def trace_filter(fobj, event, keys):
    """Yield the attributes of each `event` line whose fields match `keys`."""
    for line in fobj:
        fields = line.strip().split()
        if not fields or fields[0] != event:
            continue

        # Skip the event name and the following field; the rest are key=value.
        attrs = dict(x.split('=', 1) for x in fields[2:])
        match = True
        for k, v in keys.items():
            if k not in attrs:
                match = False
                break
            if attrs[k] != v:
                match = False
                break

        if match:
            yield attrs

def intersection(a_sector_num, a_nb_sectors, b_sector_num, b_nb_sectors):
    """Return True if the two sector ranges overlap."""
    return not (a_sector_num + a_nb_sectors <= b_sector_num or \
                b_sector_num + b_nb_sectors <= a_sector_num)

bs, sector_num, nb_sectors = sys.argv[1:]
sector_num = int(sector_num, 0)
nb_sectors = int(nb_sectors, 0)

for req in trace_filter(sys.stdin, 'bdrv_aio_writev', {'bs': bs}):
    if intersection(sector_num, nb_sectors, int(req['sector_num'], 0),
                    int(req['nb_sectors'], 0)):
        print(req)
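
The overlap test treats each request as a half-open sector range [start, start + count), so two ranges that merely touch do not count as overlapping. A small worked example, with numbers chosen only for illustration and `overlaps` as a stand-alone copy of the check:

```python
def overlaps(a_start, a_len, b_start, b_len):
    """True if [a_start, a_start+a_len) and [b_start, b_start+b_len) share a sector."""
    return not (a_start + a_len <= b_start or b_start + b_len <= a_start)

# Region of interest: 8 sectors starting at 100, i.e. sectors 100..107.
print(overlaps(100, 8, 104, 8))  # request covers 104..111
print(overlaps(100, 8, 108, 8))  # request starts right after the region
print(overlaps(100, 8, 92, 8))   # request ends right before the region
```

Only the first request touches the region; the other two are adjacent to it and are correctly left out.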

Stefan

Re: Testing of russian keymap (was Re: [Qemu-devel] [PATCH] fix '/' and '|' on russian keymap)

2010-10-19 Thread Oleg Sadov

19/10/2010 10:32 +0100, Daniel P. Berrange wrote:
 On Mon, Oct 18, 2010 at 01:59:15PM -0500, Anthony Liguori wrote:
  On 10/18/2010 12:30 PM, Oleg Sadov wrote:
  I don't understand reasons for such locale-default keyboard settings for
  qemu too, but may be it's useful for someone...
 
  
  -k only exists to deal with crappy VNC clients.
  
  If you use a good VNC client (like vinagre or virt-viewer) then you 
  don't have to use -k.
 
 Indeed you must *NOT* use -k then, because that disables the extension
 that vinagre/virt-viewer rely on for sane keyboard handling.

I don't use the '-k' option directly -- on my RHEL-based system libvirt
appends it to the qemu-kvm command line automatically. The KVM XML
description created by the standard virt-manager GUI (package
virt-manager-0.6.1-12.el5.x86_64) has a 'keymap' attribute on the
'graphics' tag, even though the configurator has no control for the
'keymap' setting.

As I understand it, the 'default_keymap' function from util.py (package
python-virtinst-0.400.3-9.el5.noarch) reads /etc/sysconfig/keyboard,
looks the keymap up in the 'keytable' dictionary from keytable.py, and
automatically places it in the 'keymap' attribute of the 'graphics' tag
in the virtual machine's XML description.

Our system has Russian keyboard settings => we get an XML description
like this:

 <graphics type='vnc' port='-1' autoport='yes' keymap='ru'/>

and, as a consequence, qemu-kvm runs with the '-k ru' option.

 Regards,
 Daniel

Sincerely,
--Oleg

[Qemu-devel] [PATCH 2/2] v2 Fix Block Hotplug race with drive_unplug()

2010-10-19 Thread Ryan Harper

Block hot unplug is racy since the guest is required to acknowledge the ACPI
unplug event; this may not happen synchronously with the device removal command.

This series aims to close a gap whereby mgmt applications that assume the
block resource has been removed without confirming that the guest has
acknowledged the removal may re-assign the underlying device to a second guest,
leading to data leakage.

This series introduces a new monitor command to decouple asynchronous device
removal from restricting guest access to a block device.  We do this by creating
a new monitor command drive_unplug which maps to a bdrv_unplug() command which
does a qemu_aio_flush(), bdrv_flush() and bdrv_close().  Once complete, subsequent
IO is rejected from the device and the guest will get IO errors but continue to
function.

A subsequent device removal command can be issued to remove the device, to which
the guest may or may not respond, but as long as the unplugged bit is set, no IO
will be submitted.

Changes since v1:
- Added qemu_aio_flush() before bdrv_flush() to wait on pending io

Signed-off-by: Ryan Harper ry...@us.ibm.com
---
 block.c         |    7 +++++++
 block.h         |    1 +
 blockdev.c      |   26 ++++++++++++++++++++++++++
 blockdev.h      |    1 +
 hmp-commands.hx |   15 +++++++++++++++
 5 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index a19374d..be47655 100644
--- a/block.c
+++ b/block.c
@@ -1328,6 +1328,13 @@ void bdrv_set_removable(BlockDriverState *bs, int removable)
     }
 }
 
+void bdrv_unplug(BlockDriverState *bs)
+{
+    qemu_aio_flush();
+    bdrv_flush(bs);
+    bdrv_close(bs);
+}
+
 int bdrv_is_removable(BlockDriverState *bs)
 {
     return bs->removable;
diff --git a/block.h b/block.h
index 5f64380..732f63e 100644
--- a/block.h
+++ b/block.h
@@ -171,6 +171,7 @@ void bdrv_set_on_error(BlockDriverState *bs, BlockErrorAction on_read_error,
                        BlockErrorAction on_write_error);
 BlockErrorAction bdrv_get_on_error(BlockDriverState *bs, int is_read);
 void bdrv_set_removable(BlockDriverState *bs, int removable);
+void bdrv_unplug(BlockDriverState *bs);
 int bdrv_is_removable(BlockDriverState *bs);
 int bdrv_is_read_only(BlockDriverState *bs);
 int bdrv_is_sg(BlockDriverState *bs);
diff --git a/blockdev.c b/blockdev.c
index 5fc3b9b..68eb329 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -610,3 +610,29 @@ int do_change_block(Monitor *mon, const char *device,
     }
     return monitor_read_bdrv_key_start(mon, bs, NULL, NULL);
 }
+
+int do_drive_unplug(Monitor *mon, const QDict *qdict, QObject **ret_data)
+{
+    DriveInfo *dinfo;
+    BlockDriverState *bs;
+    const char *id;
+
+    if (!qdict_haskey(qdict, "id")) {
+        qerror_report(QERR_MISSING_PARAMETER, "id");
+        return -1;
+    }
+
+    id = qdict_get_str(qdict, "id");
+    dinfo = drive_get_by_id(id);
+    if (!dinfo) {
+        qerror_report(QERR_DEVICE_NOT_FOUND, id);
+        return -1;
+    }
+
+    /* mark block device unplugged */
+    bs = dinfo->bdrv;
+    bdrv_unplug(bs);
+
+    return 0;
+}
+ 
diff --git a/blockdev.h b/blockdev.h
index 19c6915..ecb9ac8 100644
--- a/blockdev.h
+++ b/blockdev.h
@@ -52,5 +52,6 @@ int do_eject(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_block_set_passwd(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_change_block(Monitor *mon, const char *device,
                     const char *filename, const char *fmt);
+int do_drive_unplug(Monitor *mon, const QDict *qdict, QObject **ret_data);
 
 #endif
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 81999aa..7a32a2e 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -68,6 +68,21 @@ Eject a removable medium (use -f to force it).
 ETEXI
 
     {
+        .name       = "drive_unplug",
+        .args_type  = "id:s",
+        .params     = "device",
+        .help       = "unplug block device",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_drive_unplug,
+    },
+
+STEXI
+@item unplug @var{device}
+@findex unplug
+Unplug block device.
+ETEXI
+
+    {
         .name       = "change",
         .args_type  = "device:B,target:F,arg:s?",
         .params     = "device filename [format]",
-- 
1.6.3.3

[Qemu-devel] [PATCH][RESEND] char: Flush read buffer in mux_chr_can_read

2010-10-19 Thread Jan Kiszka

Move the buffer flush from mux_chr_read to mux_chr_can_read. While the
latter is called periodically, the former will only be invoked when new
characters arrive at the back-end. This caused problems to front-end
drivers whenever they were unable to read data immediately, e.g.
virtio-console attached to stdio.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 qemu-char.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index 6d2dce7..f4c3876 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -398,6 +398,8 @@ static int mux_chr_can_read(void *opaque)
     MuxDriver *d = chr->opaque;
     int m = d->focus;
 
+    mux_chr_accept_input(opaque);
+
     if ((d->prod[m] - d->cons[m]) < MUX_BUFFER_SIZE)
         return 1;
     if (d->chr_can_read[m])
@@ -412,8 +414,6 @@ static void mux_chr_read(void *opaque, const uint8_t *buf, int size)
     int m = d->focus;
     int i;
 
-    mux_chr_accept_input (opaque);
-
     for(i = 0; i < size; i++)
         if (mux_proc_byte(chr, d, buf[i])) {
             if (d->prod[m] == d->cons[m] &&
-- 
1.7.1

[Qemu-devel] [PATCH][RESEND] pcnet: Do not receive external frames in loopback mode

2010-10-19 Thread Jan Kiszka

While not explicitly stated in the spec, it was observed on real systems
that enabling loopback testing on the pcnet controller disables
reception of external frames. And some legacy software relies on it, so
provide this behavior.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/pcnet.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/pcnet.c b/hw/pcnet.c
index b52935a..f970bda 100644
--- a/hw/pcnet.c
+++ b/hw/pcnet.c
@@ -1048,9 +1048,10 @@ ssize_t pcnet_receive(VLANClientState *nc, const uint8_t *buf, size_t size_)
     int crc_err = 0;
     int size = size_;
 
-    if (CSR_DRX(s) || CSR_STOP(s) || CSR_SPND(s) || !size)
+    if (CSR_DRX(s) || CSR_STOP(s) || CSR_SPND(s) || !size ||
+        (CSR_LOOP(s) && !s->looptest)) {
         return -1;
-
+    }
 #ifdef PCNET_DEBUG
     printf("pcnet_receive size=%d\n", size);
 #endif
-- 
1.7.1
