Re: [Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub

2010-05-10 Thread Stefan Hajnoczi
bdrv_aio_noop_em() could be useful for benchmarking and optimizing the
aio code.  It serves as a cheap operation that lets us see the cost of
the aio roundtrip.

Stefan



Re: [Qemu-devel] [RFC][MIPS][PATCH 3/6] Initial support of VIA IDE controller used by fulong mini pc

2010-05-10 Thread Markus Armbruster
Blue Swirl  writes:

> On 5/9/10, chen huacai  wrote:
>> This patch adds initial support of VIA IDE controller used by fulong mini pc
>>
>>  Signed-off-by: Huacai Chen 
>>  -
[...]
>>  diff --git a/hw/ide/via.c b/hw/ide/via.c
>>  new file mode 100644
>>  index 000..9adc5b5
>>  --- /dev/null
>>  +++ b/hw/ide/via.c
>>  @@ -0,0 +1,189 @@
[...]
>>  +static void bmdma_writeb(void *opaque, uint32_t addr, uint32_t val)
>>  +{
>>  +BMDMAState *bm = opaque;
>>  +#ifdef DEBUG_IDE
>>  +printf("bmdma: writeb 0x%02x : 0x%02x\n", addr, val);
>>  +#endif
>>  +switch(addr & 3) {
>>  +case 2:
>>  +bm->status = (val & 0x60) | (bm->status & 1) | (bm->status &
>>  ~val & 0x06);
>>  +break;
>
> Some gccs complain if there is no default case.

Are you sure?  Which version?

It warns when the switch expression's type is an enumeration, and the
switch doesn't handle all enumeration constants, and has no default
case.

It can be made to warn whenever there's no default, but we don't do that.

A quick grep finds roughly 600 switches without default case in our
code.

>>  +}
>>  +}
[...]
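
To make the distinction above concrete, here is a minimal sketch of the two kinds of switch. The enumeration and function names are hypothetical and are not taken from hw/ide/via.c. The assumption is stock gcc behaviour: -Wswitch (enabled by -Wall) only looks at switches on an enumeration type, while -Wswitch-default has to be requested separately.

```c
#include <assert.h>

/* Hypothetical enumeration, used only for this illustration. */
enum sketch_bm_reg { SKETCH_BM_CMD, SKETCH_BM_RESERVED, SKETCH_BM_STATUS };

/*
 * Switch on an enumeration type without a default case.  Every constant is
 * handled here, so -Wswitch stays quiet.  Deleting the SKETCH_BM_RESERVED
 * case would make gcc -Wall report the unhandled enumeration value.
 */
static int sketch_enum_switch(enum sketch_bm_reg r)
{
    int handled = 0;

    switch (r) {
    case SKETCH_BM_CMD:
    case SKETCH_BM_STATUS:
        handled = 1;
        break;
    case SKETCH_BM_RESERVED:
        break;
    }
    return handled;
}

/*
 * Switch on an integer expression without a default case, the same shape as
 * bmdma_writeb().  gcc only comments on this when -Wswitch-default is given.
 */
static int sketch_int_switch(unsigned addr)
{
    int handled = 0;

    switch (addr & 3) {
    case 2:
        handled = 1;
        break;
    }
    return handled;
}
```

Values that match no case simply fall out of the switch, which is the behaviour the patch relies on.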



[Qemu-devel] Re: vmstate: Useless post_save?

2010-05-10 Thread andrzej zaborowski
Hi,

On 8 May 2010 00:39, Jan Kiszka  wrote:
> I wondered why we have the post_save callback in vmstate. Conceptually,
> it made no sense to me. So I grep'ed for its users - and found exactly
> one: tmp105. As suspected, only "strange" code was found:
>
> static void tmp105_post_save(void *opaque)
> {
>    TMP105State *s = opaque;
>    s->faults = tmp105_faultq[(s->config >> 3) & 3];            /* F */
> }
>
> First, s->config cannot be changed by saving the state. And, second,
> s->faults is only written by this driver, never read.

I'm not sure why the post_save is there; it looks like it should be a
post_load instead.

The faults counter is an actual register somewhere in the hardware,
just not exposed through the I2C bus.  However, the counter is never
decremented because the temperature measurements are fake, so there's
no measurement error and no point in delaying the interrupt until the
counter reaches zero.

Cheers
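
A rough sketch of what moving that line into a post_load-style hook could look like. The struct is a cut-down stand-in for TMP105State, the hook signature is assumed to be the usual vmstate one (opaque pointer plus version id), and the fault-queue table values are an assumption rather than a quote from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down stand-in for TMP105State; only the two fields discussed. */
typedef struct {
    uint8_t config;   /* restored from the saved state */
    int faults;       /* derived from config, not saved itself */
} SketchTMP105;

/* Assumed fault-queue lengths, indexed by the F bits (config bits 4:3). */
static const int sketch_faultq[4] = { 1, 2, 4, 6 };

/*
 * post_load-style hook: runs after config has been restored, so the
 * derived field can be recomputed from it.
 */
static int sketch_post_load(void *opaque, int version_id)
{
    SketchTMP105 *s = opaque;

    (void)version_id;
    s->faults = sketch_faultq[(s->config >> 3) & 3];
    return 0;
}
```

Recomputing on load rather than on save matches the observation that saving the state cannot change s->config.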



Re: [Qemu-devel] [RFC][MIPS][PATCH 1/6] Initial support of bonito north bridge used by fulong mini pc

2010-05-10 Thread chen huacai
>>  +    s->pci = qemu_mallocz(sizeof(*s->pci));
>>  +    assert(s->pci != NULL);
>>  +    bonito_state = s;
>>  +
>>  +    /* get the north bridge pci bus */
>>  +    s->pci->bus = pci_register_bus(NULL, "pci", pci_bonito_set_irq,
>>  +                                   pci_bonito_map_irq, pic, 0x28, 32);
>>  +
>>  +    /* set the north bridge register mapping */
>>  +    s->bonito_reg_handle = cpu_register_io_memory(bonito_read,
>>  bonito_write, s);
>>  +    s->bonito_reg_start = BONITO_INTERNAL_REG_BASE;
>
> Usually the devices don't specify their addresses, but these are
> passed from the board level.

I'm a bit confused here: the Bonito internal registers are mapped to a
fixed physical address according to the specification.


>>  +    /*add PCI io space */
>>  +    /*PCI IO Space  0x1fd0  - 0x1fd1  */
>>  +    if (s->bonito_pciio_length)
>>  +    {
>>  +        cpu_register_physical_memory(s->bonito_pciio_start,
>>  +                                     s->bonito_pciio_length,
>>  IO_MEM_UNASSIGNED);
>
> Why would this be needed?

This is borrowed from gt64xxx.c


>>  +    d = pci_register_device(s->pci->bus, "Bonito PCI Bus", 
>> sizeof(PCIDevice),
>>  +                            0, bonito_read_config, bonito_write_config);
>>  +
>>  +    pci_config_set_vendor_id(d->config, 0xdf53); //Bonito North Bridge
>>  +    pci_config_set_device_id(d->config, 0x00d5);
>
> Please put the above constants to hw/pci_ids.h.

The Bonito north bridge is currently built on an FPGA, and the
VENDOR_ID/DEVICE_ID are temporary values, so I didn't put them in pci_ids.h.

I'll improve my code according to your other comments, thanks.

Best regards,

Huacai Chen



[Qemu-devel] Re: [PATCHv3] Support for booting from virtio disks

2010-05-10 Thread Kevin O'Connor
On Mon, May 10, 2010 at 11:36:37AM +0300, Gleb Natapov wrote:
> This patch adds native support for booting from virtio disks to Seabios.
> 
> Signed-off-by: Gleb Natapov 

Thanks - commit 89acfa3f.  The patch had some compile errors on gcc3.4
and gcc4.5 - I went ahead and committed an update to fix the errors
(commit 7d09d0e3).

-Kevin



[Qemu-devel] Re: [PATCH 0/6] [AHCI]resend all patches to add ahci support into qemu

2010-05-10 Thread Alexander Graf

On 11.05.2010, at 01:19, QiaoChong wrote:

> When ahci init ,driver will send ATA_SRST command,ahci device report device 
> type through port's sig register.
> Ahci disk lookup change from IF_SD to IF_SCSI now,because IF_SD does not 
> support cdrom media.
> I just copy ide_atapi_cmd from hw/ide/core.c into hw/ahci.c,change a 
> little,then the cdrom can be identified,and read by os.
> If qemu can change dma_buf_prepare,dma_buf_rw,dma_buf_commit to a function 
> pointer in BMDMAState,then I can rewrite three functions to support ahci's 
> prtd,because it is different from ide's.
> 
> test a sata disk like this:
> ./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
> if=scsi,file=/tmp/disk
> test a sata cd like this:
> ./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
> if=scsi,media=cdrom,file=KNOPPIX_V6.0.1CD-2009-02-08-EN.iso

Alright, I put the set into the git repo at:

git://repo.or.cz/qemu/ahci.git
branch "ahci"

Please keep in mind that this is a pure working branch. For upstream submission 
we will need to clean it up completely. But this way we can stick to working 
code and improve it there.


Alex




[Qemu-devel] [PATCH 5/6] ahci pci ids into pci_ids.h, add warning messages.

2010-05-10 Thread QiaoChong
Move the AHCI PCI device ID definition into pci_ids.h and add warning
messages for unsupported features.

Signed-off-by: QiaoChong 
---
 hw/ahci.c|   10 +-
 hw/pci_ids.h |1 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/ahci.c b/hw/ahci.c
index cb4a851..e1aed4a 100644
--- a/hw/ahci.c
+++ b/hw/ahci.c
@@ -662,7 +662,9 @@ static void handle_cmd(AHCIState *s,int port,int slot)
 
if(fis[1]==0)
{
-
+#ifdef DEBUG_AHCI
+   printf("now,just ignore command fis[0]=%02x fis[1]=%02x fis[2]=%02x\n",fis[0],fis[1],fis[2]);
+#endif
}
 
if(fis[1]==(1<<7))
@@ -755,15 +757,13 @@ static void ahci_pci_map(PCIDevice *pci_dev, int 
region_num,
cpu_register_physical_memory(addr, size, s->mem);
 }
 
-#define PCI_VENDOR_MYDEVICE  0x8086
-#define PCI_PRODUCT_MYDEVICE 0x2652
 
 static int pci_ahci_init(PCIDevice *dev)
 {
struct ahci_pci_state *d;
d = DO_UPCAST(struct ahci_pci_state, card, dev);
-   pci_config_set_vendor_id(d->card.config,PCI_VENDOR_MYDEVICE);
-   pci_config_set_device_id(d->card.config,PCI_PRODUCT_MYDEVICE);
+   pci_config_set_vendor_id(d->card.config,PCI_VENDOR_ID_INTEL);
+   pci_config_set_device_id(d->card.config,PCI_DEVICE_ID_INTEL_ICH6R_AHCI);
d->card.config[PCI_COMMAND] = PCI_COMMAND_IO | 
PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER;
d->card.config[PCI_CLASS_DEVICE]= 0;
d->card.config[0x0b]= 1;//storage
diff --git a/hw/pci_ids.h b/hw/pci_ids.h
index fe7a121..4d4de93 100644
--- a/hw/pci_ids.h
+++ b/hw/pci_ids.h
@@ -97,3 +97,4 @@
 #define PCI_DEVICE_ID_INTEL_82371AB  0x7111
 #define PCI_DEVICE_ID_INTEL_82371AB_2 0x7112
 #define PCI_DEVICE_ID_INTEL_82371AB_3 0x7113
+#define PCI_DEVICE_ID_INTEL_ICH6R_AHCI 0x2652
-- 
1.7.0.3.254.g4503b.dirty




[Qemu-devel] [PATCH 1/6] add ahci support into qemu, only support sata disk.

2010-05-10 Thread QiaoChong
Use -drive if=sd,file=diskname to add an AHCI disk to qemu.

Signed-off-by: QiaoChong 
---
 Makefile.target |4 +
 hw/ahci.c   |  805 +++
 2 files changed, 809 insertions(+), 0 deletions(-)
 create mode 100644 hw/ahci.c

diff --git a/Makefile.target b/Makefile.target
index c092900..d338af8 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -187,6 +187,10 @@ obj-$(CONFIG_USB_OHCI) += usb-ohci.o
 obj-y += rtl8139.o
 obj-y += e1000.o
 
+# ahci
+#
+obj-$(CONFIG_AHCI) += ahci.o
+
 # Hardware support
 obj-i386-y = pckbd.o dma.o
 obj-i386-y += vga.o
diff --git a/hw/ahci.c b/hw/ahci.c
new file mode 100644
index 000..a332a45
--- /dev/null
+++ b/hw/ahci.c
@@ -0,0 +1,805 @@
+/*
+ * QEMU AHCI Emulation
+ * Copyright (c) 2010 qiaoch...@loongson.cn
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ *
+ * TODO:
+ *  o ahci cd support
+ */
+#include "hw.h"
+#include "qemu-timer.h"
+#include "monitor.h"
+#include "sysbus.h"
+#include "pci.h"
+#include "dma.h"
+#include "cpu-common.h"
+#include 
+#define DPRINTF(...)
+
+
+enum {
+   AHCI_PCI_BAR= 5,
+   AHCI_MAX_PORTS  = 32,
+   AHCI_MAX_SG = 168, /* hardware max is 64K */
+   AHCI_DMA_BOUNDARY   = 0x,
+   AHCI_USE_CLUSTERING = 0,
+   AHCI_MAX_CMDS   = 32,
+   AHCI_CMD_SZ = 32,
+   AHCI_CMD_SLOT_SZ= AHCI_MAX_CMDS * AHCI_CMD_SZ,
+   AHCI_RX_FIS_SZ  = 256,
+   AHCI_CMD_TBL_CDB= 0x40,
+   AHCI_CMD_TBL_HDR_SZ = 0x80,
+   AHCI_CMD_TBL_SZ = AHCI_CMD_TBL_HDR_SZ + (AHCI_MAX_SG * 16),
+   AHCI_CMD_TBL_AR_SZ  = AHCI_CMD_TBL_SZ * AHCI_MAX_CMDS,
+   AHCI_PORT_PRIV_DMA_SZ   = AHCI_CMD_SLOT_SZ + AHCI_CMD_TBL_AR_SZ +
+   AHCI_RX_FIS_SZ,
+   AHCI_IRQ_ON_SG  = (1 << 31),
+   AHCI_CMD_ATAPI  = (1 << 5),
+   AHCI_CMD_WRITE  = (1 << 6),
+   AHCI_CMD_PREFETCH   = (1 << 7),
+   AHCI_CMD_RESET  = (1 << 8),
+   AHCI_CMD_CLR_BUSY   = (1 << 10),
+
+   RX_FIS_D2H_REG  = 0x40, /* offset of D2H Register FIS data */
+   RX_FIS_SDB  = 0x58, /* offset of SDB FIS data */
+   RX_FIS_UNK  = 0x60, /* offset of Unknown FIS data */
+
+   board_ahci  = 0,
+   board_ahci_pi   = 1,
+   board_ahci_vt8251   = 2,
+   board_ahci_ign_iferr= 3,
+   board_ahci_sb600= 4,
+
+   /* global controller registers */
+   HOST_CAP= 0x00, /* host capabilities */
+   HOST_CTL= 0x04, /* global host control */
+   HOST_IRQ_STAT   = 0x08, /* interrupt status */
+   HOST_PORTS_IMPL = 0x0c, /* bitmap of implemented ports */
+   HOST_VERSION= 0x10, /* AHCI spec. version compliancy */
+
+   /* HOST_CTL bits */
+   HOST_RESET  = (1 << 0),  /* reset controller; self-clear */
+   HOST_IRQ_EN = (1 << 1),  /* global IRQ enable */
+   HOST_AHCI_EN= (1 << 31), /* AHCI enabled */
+
+   /* HOST_CAP bits */
+   HOST_CAP_SSC= (1 << 14), /* Slumber capable */
+   HOST_CAP_CLO= (1 << 24), /* Command List Override support */
+   HOST_CAP_SSS= (1 << 27), /* Staggered Spin-up */
+   HOST_CAP_NCQ= (1 << 30), /* Native Command Queueing */
+   HOST_CAP_64 = (1 << 31), /* PCI DAC (64-bit DMA) support */
+
+   /* registers for each SATA port */
+   PORT_LST_ADDR   = 0x00, /* command list DMA addr */
+   PORT_LST_ADDR_HI= 0x04, /* command list DMA addr hi */
+   PORT_FIS_ADDR   = 0x08, /* FIS rx buf addr */
+   PORT_FIS_ADDR_HI= 0x0c, /* FIS rx buf addr hi */
+   PORT_IRQ_STAT   = 0x10, /* interrupt status */
+   PORT_IRQ_MASK   = 0x14, /* interrupt enable/disable mask */
+   PORT_CMD= 0x18, /* port command */
+   PORT_TFDATA = 0x20, /* taskfile data */
+   PORT_SIG= 0x24, /* device TF signature */
+   PORT_CMD_ISSUE  = 0x38, /* command issue */
+   PORT_SCR= 0x28, /* SATA phy register block */
+   PORT_SCR_STAT   = 0x28, /* SATA phy register: SStatus 

[Qemu-devel] [PATCH 6/6] add cdrom support for ahci.

2010-05-10 Thread QiaoChong
AHCI disks are now looked up via IF_SCSI.
test a sata disk:
./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
if=scsi,file=/tmp/disk
test a sata cd:
./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
if=scsi,media=cdrom,file=KNOPPIX_V6.0.1CD-2009-02-08-EN.iso

Signed-off-by: QiaoChong 
---
 hw/ahci.c |  427 -
 1 files changed, 424 insertions(+), 3 deletions(-)

diff --git a/hw/ahci.c b/hw/ahci.c
index e1aed4a..2763075 100644
--- a/hw/ahci.c
+++ b/hw/ahci.c
@@ -16,7 +16,9 @@
  * License along with this library; if not, see .
  *
  * TODO:
- *  o ahci cd support
+ *  o ahci cd support should use ide,but now ide 's bmdma prdt is different 
from ahci's prdt.
+ *  make ahci use multi-devices at the same time
+ *  make code to reuse ide's code
  */
 #include "hw.h"
 #include "qemu-timer.h"
@@ -25,13 +27,15 @@
 #include "pci.h"
 #include "dma.h"
 #include "cpu-common.h"
+#include "scsi-defs.h"
+#include "scsi.h"
 #include 
 
 #define DEBUG_AHCI
 
 #ifdef DEBUG_AHCI
 #define DPRINTF(fmt, ...) \
-do { printf("ahci: " fmt , ## __VA_ARGS__); } while (0)
+do { fprintf(stderr,"ahci: " fmt , ## __VA_ARGS__); } while (0)
 #else
 #define DPRINTF(fmt, ...) do {} while(0)
 #endif
@@ -160,6 +164,15 @@ enum {
AHCI_FLAG_32BIT_ONLY= (1 << 28), /* force 32bit */
 };
 
+enum {
+   ATA_SRST= (1 << 2), /* software reset */
+};
+
+enum {
+STATE_RUN=0,
+STATE_RESET
+};
+
 /*
  * ATA Commands (only mandatory commands listed here)
  */
@@ -250,6 +263,7 @@ typedef struct ahci_sg {
 typedef struct AHCIState{
ahci_control_regs control_regs;
ahci_port_regs port_regs[SATA_PORTS];
+   uint32_t port_state[SATA_PORTS];
int mem;
QEMUTimer *timer;
IDEBus *ide;
@@ -472,10 +486,13 @@ static CPUWriteMemoryFunc *ahci_writefn[3]={
 
 static void ahci_reg_init(AHCIState *s)
 {
+   int i;
s->control_regs.cap = 3 | (0x1f << 8) | (1 << 20) ; /* 4 ports, 32 
command slots, 1.5 Gb/s */
s->control_regs.ghc = 1 << 31; /* AHCI Enable */
s->control_regs.impl = 1; /* Port 0 implemented */
s->control_regs.version = 0x1;
+   for(i=0;i<SATA_PORTS;i++)
+       s->port_state[i]=STATE_RUN;
 }
 
 static void padstr(char *str, const char *src, int len)
@@ -490,12 +507,91 @@ static void padstr(char *str, const char *src, int len)
}
 }
 
+static void padstr8(uint8_t *buf, int buf_size, const char *src)
+{
+int i;
+for(i = 0; i < buf_size; i++) {
+if (*src)
+buf[i] = *src++;
+else
+buf[i] = ' ';
+}
+}
 
 static void put_le16(uint16_t *p, unsigned int v)
 {
*p = cpu_to_le16(v);
 }
 
+static inline void cpu_to_ube16(uint8_t *buf, int val)
+{
+buf[0] = val >> 8;
+buf[1] = val & 0xff;
+}
+
+static inline int ube16_to_cpu(const uint8_t *buf)
+{
+return (buf[0] << 8) | buf[1];
+}
+
+static inline int ube32_to_cpu(const uint8_t *buf)
+{
+return (buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | buf[3];
+}
+static inline void cpu_to_ube32(uint8_t *buf, unsigned int val)
+{
+buf[0] = val >> 24;
+buf[1] = val >> 16;
+buf[2] = val >> 8;
+buf[3] = val & 0xff;
+}
+static void ide_atapi_identify(IDEState *s)
+{
+uint16_t *p;
+
+if (s->identify_set) {
+   memcpy(s->io_buffer, s->identify_data, sizeof(s->identify_data));
+   return;
+}
+
+memset(s->io_buffer, 0, 512);
+p = (uint16_t *)s->io_buffer;
+/* Removable CDROM, 50us response, 12 byte packets */
+put_le16(p + 0, (2 << 14) | (5 << 8) | (1 << 7) | (2 << 5) | (0 << 0));
+padstr((char *)(p + 10), s->drive_serial_str, 20); /* serial number */
+put_le16(p + 20, 3); /* buffer type */
+put_le16(p + 21, 512); /* cache size in sectors */
+put_le16(p + 22, 4); /* ecc bytes */
+padstr((char *)(p + 23), s->version, 8); /* firmware version */
+padstr((char *)(p + 27), "QEMU DVD-ROM", 40); /* model */
+put_le16(p + 48, 1); /* dword I/O (XXX: should not be set on CDROM) */
+#ifdef USE_DMA_CDROM
+put_le16(p + 49, 1 << 9 | 1 << 8); /* DMA and LBA supported */
+put_le16(p + 53, 7); /* words 64-70, 54-58, 88 valid */
+put_le16(p + 62, 7);  /* single word dma0-2 supported */
+put_le16(p + 63, 7);  /* mdma0-2 supported */
+put_le16(p + 64, 0x3f); /* PIO modes supported */
+#else
+put_le16(p + 49, 1 << 9); /* LBA supported, no DMA */
+put_le16(p + 53, 3); /* words 64-70, 54-58 valid */
+put_le16(p + 63, 0x103); /* DMA modes XXX: may be incorrect */
+put_le16(p + 64, 1); /* PIO modes */
+#endif
+put_le16(p + 65, 0xb4); /* minimum DMA multiword tx cycle time */
+put_le16(p + 66, 0xb4); /* recommended DMA multiword tx cycle time */
+put_le16(p + 67, 0x12c); /* minimum PIO cycle time without flow control */
+put_le16(p + 68, 0xb4); /* minimum PIO cycle time with IORDY flow control 
*/
+
+put_le16(p + 71, 30); /* in ns *
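
The unaligned big-endian helpers added in this patch can be exercised in isolation. The sketch below is not the patch's code: the function names are hypothetical, and it uses fixed-width unsigned types so that shifting a byte with the top bit set into bits 31:24 stays well defined, which the int-based ube32_to_cpu() above does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 16-bit value most significant byte first. */
static inline void sketch_put_be16(uint8_t *buf, uint16_t val)
{
    buf[0] = (uint8_t)(val >> 8);
    buf[1] = (uint8_t)(val & 0xff);
}

/* Load a 16-bit big-endian value from a possibly unaligned buffer. */
static inline uint16_t sketch_get_be16(const uint8_t *buf)
{
    return (uint16_t)((buf[0] << 8) | buf[1]);
}

/* Store a 32-bit value most significant byte first. */
static inline void sketch_put_be32(uint8_t *buf, uint32_t val)
{
    buf[0] = (uint8_t)(val >> 24);
    buf[1] = (uint8_t)(val >> 16);
    buf[2] = (uint8_t)(val >> 8);
    buf[3] = (uint8_t)(val & 0xff);
}

/* Load a 32-bit big-endian value; the casts keep every shift unsigned. */
static inline uint32_t sketch_get_be32(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

ATAPI packet fields such as the LBA and transfer length are big-endian, which is why the patch needs these helpers next to the little-endian put_le16() used for the IDENTIFY data.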

[Qemu-devel] [PATCH 0/6] [AHCI]resend all patches to add ahci support into qemu

2010-05-10 Thread QiaoChong
When the AHCI driver initializes, it sends an ATA_SRST command, and the AHCI
device reports its device type through the port's sig register.
AHCI disk lookup has changed from IF_SD to IF_SCSI, because IF_SD does not
support cdrom media.
I just copied ide_atapi_cmd from hw/ide/core.c into hw/ahci.c and changed it a
little; the cdrom can then be identified and read by the OS.
If qemu can turn dma_buf_prepare, dma_buf_rw and dma_buf_commit into function
pointers in BMDMAState, then I can rewrite those three functions to support
AHCI's PRDT, because it is different from IDE's.

test a sata disk like this:
./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
if=scsi,file=/tmp/disk
test a sata cd like this:
./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
if=scsi,media=cdrom,file=KNOPPIX_V6.0.1CD-2009-02-08-EN.iso

QiaoChong (5):
  add ahci support into qemu,only support sata disk.
  add ahci device into i386 pc just for test.
  add  WIN_STANDBYNOW1 process into ahci.
  ahci pci ids  into pci_ids.h,add warning messages.
  add cdrom support for ahci.

Sebastian Herbszt (1):
  fix port count,cap and version etc to ahci.

 Makefile.target  |4 +
 default-configs/i386-softmmu.mak |2 +
 hw/ahci.c| 1241 ++
 hw/pc.c  |1 +
 hw/pci_ids.h |1 +
 5 files changed, 1249 insertions(+), 0 deletions(-)
 create mode 100644 hw/ahci.c




[Qemu-devel] [PATCH 3/6] fix port count, cap and version etc to ahci.

2010-05-10 Thread QiaoChong
From: Sebastian Herbszt 

- debug output with DEBUG_AHCI
- set port count to 4
- change return value of PxSSTS to include SPD and IPM
- change cap and version default values according to Intel #301473-002

Signed-off-by: QiaoChong 
---
 hw/ahci.c |   68 
 1 files changed, 41 insertions(+), 27 deletions(-)

diff --git a/hw/ahci.c b/hw/ahci.c
index a332a45..b6a81af 100644
--- a/hw/ahci.c
+++ b/hw/ahci.c
@@ -26,8 +26,15 @@
 #include "dma.h"
 #include "cpu-common.h"
 #include 
-#define DPRINTF(...)
 
+#define DEBUG_AHCI
+
+#ifdef DEBUG_AHCI
+#define DPRINTF(fmt, ...) \
+do { printf("ahci: " fmt , ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while(0)
+#endif
 
 enum {
AHCI_PCI_BAR= 5,
@@ -203,6 +210,8 @@ typedef struct ahci_control_regs {
uint32_t  version;
 } ahci_control_regs;
 
+#define SATA_PORTS 4
+
 typedef struct ahci_port_regs {
uint32_t lst_addr;
uint32_t lst_addr_hi;
@@ -240,7 +249,7 @@ typedef struct ahci_sg {
 
 typedef struct AHCIState{
ahci_control_regs control_regs;
-   ahci_port_regs port_regs[2];
+   ahci_port_regs port_regs[SATA_PORTS];
int mem;
QEMUTimer *timer;
IDEBus *ide;
@@ -268,8 +277,10 @@ static uint32_t  ahci_port_read(AHCIState *s,int port,int 
offset)
switch(offset)
{
case PORT_SCR:
-   if(s->ide && port==0) val=3;
-   else val=0;
+   if(s->ide && port==0)
+   val= 3 /* DET */ | (1 << 4) /* SPD */ | (1 << 
8) /* IPM */;
+   else
+   val=0;
break;
case PORT_IRQ_STAT:
val=pr->irq_stat;
@@ -291,6 +302,7 @@ static uint32_t  ahci_port_read(AHCIState *s,int port,int 
offset)
val= p[offset>>2];
break;
}
+   DPRINTF("ahci_port_read: port: 0x%x offset: 0x%x val: 0x%x\n", port, 
offset, val);
return val;
 
 }
@@ -299,7 +311,7 @@ static void ahci_check_irq(AHCIState *s)
 {
ahci_port_regs *pr;
int i;
-   for(i=0;i<2;i++)
+   for(i=0;i<SATA_PORTS;i++)
    {
        pr=&s->port_regs[i];
 
@@ -319,6 +331,7 @@ static void  ahci_port_write(AHCIState *s,int port,int 
offset,uint32_t val)
ahci_port_regs *pr=&s->port_regs[port];
uint32_t *p;
 
+   DPRINTF("ahci_port_write: port: 0x%x offset: 0x%x val: 0x%x\n", port, 
offset, val);
switch(offset)
{
case PORT_LST_ADDR:
@@ -396,7 +409,7 @@ static uint32_t ahci_mem_readl(void *ptr, 
target_phys_addr_t addr)
val=p[addr>>2];
}
}
-   else if(addr>=0x100 && addr<0x200)
+   else if(addr>=0x100 && addr<0x300)
{
val=ahci_port_read(s,(addr-0x100)>>7,addr&0x7f);
}
@@ -436,7 +449,7 @@ static void ahci_mem_writel(void *ptr, target_phys_addr_t 
addr, uint32_t val)
p=(uint32_t *)&s->control_regs;
}
}
-   else if(addr>=0x100 && addr<0x200)
+   else if(addr>=0x100 && addr<0x300)
{
ahci_port_write(s,(addr-0x100)>>7,addr&0x7f,val);
}
@@ -459,10 +472,10 @@ static CPUWriteMemoryFunc *ahci_writefn[3]={
 
 static void ahci_reg_init(AHCIState *s)
 {
-   s->control_regs.cap=2|(0x1f<<8); /*2 ports,32 cmd slot*/
-   s->control_regs.ghc=1<<31;
-   s->control_regs.impl=1;/*2 ports*/
-   s->control_regs.version=0x10100;
+   s->control_regs.cap = 3 | (0x1f << 8) | (1 << 20) ; /* 4 ports, 32 
command slots, 1.5 Gb/s */
+   s->control_regs.ghc = 1 << 31; /* AHCI Enable */
+   s->control_regs.impl = 1; /* Port 0 implemented */
+   s->control_regs.version = 0x1;
 }
 
 static void padstr(char *str, const char *src, int len)
@@ -619,19 +632,22 @@ static void handle_cmd(AHCIState *s,int port,int slot)
prdt_num=cmd_hdr.opts>>16;
if(prdt_num) cpu_physical_memory_read(cmd_hdr.tbl_addr+0x80,(uint8_t 
*)s->prdt_buf,prdt_num*32);
 
-
+#ifdef DEBUG_AHCI
+DPRINTF("fis:");
for(i=0;iirq_stat |= (1<<2);
break;
default:
-   hw_error("unkonow command fis[0]=%02x fis[1]=%02x fis[2]=%02x\n",fis[0],fis[1],fis[2]);break;
+   hw_error("unknown command fis[0]=%02x fis[1]=%02x fis[2]=%02x\n",fis[0],fis[1],fis[2]);break;
}
 
}
@@ -698,7 +714,7 @@ static void ahci_timer_function(void *opaque)
AHCIState *s = opaque;
ahci_port_regs *pr;
int i,j;
-   for(i=0;i<2;i++)
+   for(i=0;i<SATA_PORTS;i++)
    {
        pr=&s->port_regs[i];
for(j=0;j<32 && pr->cmd_issue;j++)
@@ -741,23 +757,21 @@ static void ahci_pci_map(PCIDevice *pci_dev, int 
region_num,
 #define PCI_VENDOR_MYDEVICE  0x8086
 #define PCI_PRODUCT_MYDEVICE 0x2652
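
The register values this patch introduces are easier to check when written out as small functions. The following is only a sketch: the sketch_ names are hypothetical, the bit positions are the ones given in the patch comments (DET, SPD, IPM for PxSSTS; port count, command slots and interface speed for CAP), and the 0x100..0x300 window assumes four ports of 0x80 bytes each, as in the changed range check.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PORTS      4u
#define SKETCH_PORT_BASE  0x100u
#define SKETCH_PORT_SIZE  0x80u

/* PxSSTS as returned by the patch: DET = 3, SPD = 1, IPM = 1. */
static uint32_t sketch_sstatus(int device_present)
{
    if (!device_present) {
        return 0;
    }
    return 3u | (1u << 4) | (1u << 8);
}

/* CAP as set by the patch: ports - 1, 32 command slots - 1, 1.5 Gb/s. */
static uint32_t sketch_host_cap(void)
{
    return (SKETCH_PORTS - 1u) | (0x1fu << 8) | (1u << 20);
}

/* Port index for an MMIO address, or -1 outside the port register window. */
static int sketch_port_index(uint32_t addr)
{
    if (addr < SKETCH_PORT_BASE ||
        addr >= SKETCH_PORT_BASE + SKETCH_PORTS * SKETCH_PORT_SIZE) {
        return -1;
    }
    return (int)((addr - SKETCH_PORT_BASE) >> 7);
}

/* Register offset within the selected port's 0x80-byte block. */
static uint32_t sketch_port_offset(uint32_t addr)
{
    return addr & 0x7fu;
}
```

With four ports the window ends at 0x100 + 4 * 0x80 = 0x300, which is where the new upper bound in ahci_mem_readl() and ahci_mem_writel() comes from.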

[Qemu-devel] [PATCH 2/6] add ahci device into i386 pc just for test.

2010-05-10 Thread QiaoChong
test like this:

dd if=/dev/zero of=/tmp/disk bs=1M count=100
./i386-softmmu/qemu  -cdrom 
/mnt/hdb1/knoppix-dvd/KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -boot d -drive 
if=sd,file=/tmp/disk

Signed-off-by: QiaoChong 
---
 default-configs/i386-softmmu.mak |2 ++
 hw/pc.c  |1 +
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 4c1495f..bd72f39 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -20,3 +20,5 @@ CONFIG_NE2000_ISA=y
 CONFIG_PIIX_PCI=y
 CONFIG_SOUND=y
 CONFIG_VIRTIO_PCI=y
+CONFIG_AHCI=y
+
diff --git a/hw/pc.c b/hw/pc.c
index db2b9a2..26a1eda 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1005,6 +1005,7 @@ static void pc_init1(ram_addr_t ram_size,
 
 if (pci_enabled) {
 pci_piix3_ide_init(pci_bus, hd, piix3_devfn + 1);
+   pci_create_simple(pci_bus,-1,"ahci");
 } else {
 for(i = 0; i < MAX_IDE_BUS; i++) {
 isa_ide_init(ide_iobase[i], ide_iobase2[i], ide_irq[i],
-- 
1.7.0.3.254.g4503b.dirty




[Qemu-devel] [PATCH 4/6] add WIN_STANDBYNOW1 process into ahci.

2010-05-10 Thread QiaoChong
Signed-off-by: QiaoChong 
---
 hw/ahci.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/ahci.c b/hw/ahci.c
index b6a81af..cb4a851 100644
--- a/hw/ahci.c
+++ b/hw/ahci.c
@@ -676,6 +676,7 @@ static void handle_cmd(AHCIState *s,int port,int slot)
write_to_sglist(ide_state->identify_data, 
sizeof(ide_state->identify_data),s->prdt_buf,prdt_num);
pr->irq_stat |= (1<<2);
break;
+   case WIN_STANDBYNOW1:
case WIN_SETFEATURES:
pr->irq_stat |= (1<<2);
break;
-- 
1.7.0.3.254.g4503b.dirty




[Qemu-devel] Re: [PATCH 0/1] [RFC][AHCI] add cdrom support for ahci.

2010-05-10 Thread Alexander Graf

On 11.05.2010, at 00:13, Sebastian Herbszt wrote:

> Alexander Graf wrote:
>> Hi Chong,
>> 
>> On 10.05.2010, at 13:55, QiaoChong wrote:
>> 
>> > When ahci init ,driver will send ATA_SRST command,ahci device report 
>> > device type through port's sig register.
>> > Ahci disk lookup change from IF_SD to IF_SCSI now,because IF_SD does not 
>> > support cdrom media.
>> > I just copy ide_atapi_cmd from hw/ide/core.c into hw/ahci.c,change a 
>> > little,then the cdrom can be identified,and read by os.
>> > If qemu can change dma_buf_prepare,dma_buf_rw,dma_buf_commit to a function 
>> > pointer in BMDMAState,then I can rewrite three functions to support 
>> > ahci's prtd,because it is different from ide's.
>> >
>> > test a sata disk like this:
>> > ./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
>> > if=scsi,file=/tmp/disk
>> > test a sata cd like this:
>> > ./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
>> > if=scsi,media=cdrom,file=KNOPPIX_V6.0.1CD-2009-02-08-EN.iso
>> 
>> Thanks for improving the patch, but I have some nitpicks concerning how 
>> to proceed here.
>> 
>> For starters, this patch is incremental to the previous one. Since the 
>> previous patch did not get applied to qemu, it doesn't make sense to send an 
>> incremental patch. Please send the full patchset but bump up the version 
>> in that case. You will find many examples for that on the mailing list. In 
>> most cases it also makes sense to rethink the splitting between patches.
> 
> The problem of incremental patches will be a non issue as soon as the git 
> tree is available.

I set up a mailing list and a git tree for development. The mailing list is 
here:

https://lists.sourceforge.net/lists/listinfo/qemu-ahci-devel

The git tree is here (pure mirror of qemu at this moment):

git://repo.or.cz/qemu/ahci.git

Could you guys please send me all the patches you have so far so I can put them 
into an ahci branch? Please also register with repo.or.cz so I can give you 
commit rights. If things become too messy, I'll try to put me or Roland as 
gatekeeper for patches into the repo.


Alex




[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Cam Macdonell
On Mon, May 10, 2010 at 5:59 AM, Avi Kivity  wrote:
> On 04/21/2010 08:53 PM, Cam Macdonell wrote:
>>
>> Support an inter-vm shared memory device that maps a shared-memory object
>> as a
>> PCI device in the guest.  This patch also supports interrupts between
>> guest by
>> communicating over a unix domain socket.  This patch applies to the
>> qemu-kvm
>> repository.
>>
>>     -device ivshmem,size=[,shm=]
>>
>> Interrupts are supported between multiple VMs by using a shared memory
>> server
>> by using a chardev socket.
>>
>>     -device ivshmem,size=[,shm=]
>>                     [,chardev=][,msi=on][,irqfd=on][,vectors=n]
>>     -chardev socket,path=,id=
>>
>> (shared memory server is qemu.git/contrib/ivshmem-server)
>>
>> Sample programs and init scripts are in a git repo here:
>>
>>
>> +typedef struct EventfdEntry {
>> +    PCIDevice *pdev;
>> +    int vector;
>> +} EventfdEntry;
>> +
>> +typedef struct IVShmemState {
>> +    PCIDevice dev;
>> +    uint32_t intrmask;
>> +    uint32_t intrstatus;
>> +    uint32_t doorbell;
>> +
>> +    CharDriverState * chr;
>> +    CharDriverState ** eventfd_chr;
>> +    int ivshmem_mmio_io_addr;
>> +
>> +    pcibus_t mmio_addr;
>> +    unsigned long ivshmem_offset;
>> +    uint64_t ivshmem_size; /* size of shared memory region */
>> +    int shm_fd; /* shared memory file descriptor */
>> +
>> +    int nr_allocated_vms;
>> +    /* array of eventfds for each guest */
>> +    int ** eventfds;
>> +    /* keep track of # of eventfds for each guest*/
>> +    int * eventfds_posn_count;
>>
>
> More readable:
>
>  typedef struct Peer {
>      int nb_eventfds;
>      int *eventfds;
>  } Peer;
>  int nb_peers;
>  Peer *peers;
>
> Does eventfd_chr need to be there as well?

No, it does not. eventfd_chr stores character devices for receiving
interrupts when irqfd is not available, so we only need them for this
guest, not for our peers.

I've switched over to the more readable naming you suggested.

>
>> +
>> +    int nr_alloc_guests;
>> +    int vm_id;
>> +    int num_eventfds;
>> +    uint32_t vectors;
>> +    uint32_t features;
>> +    EventfdEntry *eventfd_table;
>> +
>> +    char * shmobj;
>> +    char * sizearg;
>>
>
> Does this need to be part of the state?

They are, because they're passed in as qdev properties from the
command line, so I thought they needed to be in the state struct to be
assigned via DEFINE_PROP_...

>
>> +} IVShmemState;
>> +
>> +/* registers for the Inter-VM shared memory device */
>> +enum ivshmem_registers {
>> +    IntrMask = 0,
>> +    IntrStatus = 4,
>> +    IVPosition = 8,
>> +    Doorbell = 12,
>> +};
>> +
>> +static inline uint32_t ivshmem_has_feature(IVShmemState *ivs, int
>> feature) {
>> +    return (ivs->features&  (1<<  feature));
>> +}
>> +
>> +static inline int is_power_of_two(int x) {
>> +    return (x&  (x-1)) == 0;
>> +}
>>
>
> argument needs to be uint64_t to avoid overflow with large BARs.  Return
> type can be bool.
>
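
A possible shape for the helper with the suggested types, as a sketch only (the name is hypothetical). Note one deliberate difference from the quoted code: this version rejects zero, whereas (x & (x - 1)) == 0 alone accepts it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when exactly one bit of x is set.  Taking uint64_t means BAR sizes
 * of 4 GB and above are not truncated the way an int argument would be.
 */
static inline bool sketch_is_power_of_two(uint64_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}
```
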
>> +static void ivshmem_io_writel(void *opaque, uint8_t addr, uint32_t val)
>> +{
>> +    IVShmemState *s = opaque;
>> +
>> +    u_int64_t write_one = 1;
>> +    u_int16_t dest = val>>  16;
>> +    u_int16_t vector = val&  0xff;
>> +
>> +    addr&= 0xfe;
>>
>
> Why 0xfe?  Can understand 0xfc or 0xff.

I forgot to change it to 0xfc when the registers went from 16 to 32 bits.

>
>> +
>> +    switch (addr)
>> +    {
>> +        case IntrMask:
>> +            ivshmem_IntrMask_write(s, val);
>> +            break;
>> +
>> +        case IntrStatus:
>> +            ivshmem_IntrStatus_write(s, val);
>> +            break;
>> +
>> +        case Doorbell:
>> +            /* check doorbell range */
>> +            if ((vector>= 0)&&  (vector<  s->eventfds_posn_count[dest]))
>> {
>>
>
> What if dest is too big?  We overflow s->eventfds_posn_count.

Added a check for that.

Thanks,
Cam
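
Putting the review points on the doorbell path together, a bounds-checked lookup using the suggested Peer layout might look roughly like this. It is a sketch under assumptions: the Sketch-prefixed types and functions are hypothetical, the value layout (destination in the upper 16 bits, vector in the low 8) follows the quoted ivshmem_io_writel(), and the real device would signal the eventfd rather than return it.

```c
#include <assert.h>
#include <stdint.h>

/* Per-peer eventfd bookkeeping, following the layout suggested in review. */
typedef struct SketchPeer {
    int nb_eventfds;
    int *eventfds;
} SketchPeer;

typedef struct SketchState {
    int nb_peers;
    SketchPeer *peers;
} SketchState;

/*
 * Decode a doorbell write.  Returns the eventfd to signal, or -1 when the
 * value names a peer or a vector that does not exist.
 */
static int sketch_doorbell_target(const SketchState *s, uint32_t val)
{
    uint16_t dest = (uint16_t)(val >> 16);
    uint16_t vector = (uint16_t)(val & 0xff);

    if (dest >= s->nb_peers) {
        return -1;              /* the check that was missing for dest */
    }
    if (vector >= s->peers[dest].nb_eventfds) {
        return -1;
    }
    return s->peers[dest].eventfds[vector];
}

/* Register offsets are 32 bits wide, so the low two address bits are dropped. */
static uint8_t sketch_reg_offset(uint8_t addr)
{
    return addr & 0xfc;
}
```

Since vector is unsigned, the original (vector >= 0) test is always true and is left out here.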



[Qemu-devel] [PATCH] pci: cleanly backout of pci_qdev_init()

2010-05-10 Thread Alex Williamson
If the init function of a device fails, as might happen with device
assignment, we never undo the work done by do_pci_register_device().
This not only causes a bit of a memory leak, but also leaves a bogus
pointer in the bus devices array that can cause a segfault or
garbage data from 'info pci'.

Signed-off-by: Alex Williamson 
---

 hw/pci.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index f167436..3d3560e 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -625,6 +625,14 @@ static PCIDevice *do_pci_register_device(PCIDevice 
*pci_dev, PCIBus *bus,
 return pci_dev;
 }
 
+static void do_pci_unregister_device(PCIDevice *pci_dev)
+{
+qemu_free_irqs(pci_dev->irq);
+pci_dev->bus->devices[pci_dev->devfn] = NULL;
+pci_config_free(pci_dev);
+return;
+}
+
 PCIDevice *pci_register_device(PCIBus *bus, const char *name,
int instance_size, int devfn,
PCIConfigReadFunc *config_read,
@@ -680,10 +688,7 @@ static int pci_unregister_device(DeviceState *dev)
 return ret;
 
 pci_unregister_io_regions(pci_dev);
-
-qemu_free_irqs(pci_dev->irq);
-pci_dev->bus->devices[pci_dev->devfn] = NULL;
-pci_config_free(pci_dev);
+do_pci_unregister_device(pci_dev);
 return 0;
 }
 
@@ -1652,8 +1657,10 @@ static int pci_qdev_init(DeviceState *qdev, DeviceInfo 
*base)
 if (pci_dev == NULL)
 return -1;
 rc = info->init(pci_dev);
-if (rc != 0)
+if (rc != 0) {
+do_pci_unregister_device(pci_dev);
 return rc;
+}
 
 /* rom loading */
 if (pci_dev->romfile == NULL && info->romfile != NULL)
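
The failure mode described in the commit message can be reproduced with a toy model. Everything below is hypothetical scaffolding (a bus that is just an array of slots, devices that carry their own init callback) and does not use the real PCI code. It only shows the pattern of the fix: whatever registration did must be undone when init fails.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SLOTS 4

typedef struct SketchDev {
    int devfn;
    int (*init)(struct SketchDev *dev);
} SketchDev;

typedef struct SketchBus {
    SketchDev *devices[SKETCH_SLOTS];
} SketchBus;

/* Counterpart of do_pci_register_device(): publish the device on the bus. */
static void sketch_register(SketchBus *bus, SketchDev *dev)
{
    bus->devices[dev->devfn] = dev;
}

/* Counterpart of do_pci_unregister_device(): take it off the bus again. */
static void sketch_unregister(SketchBus *bus, SketchDev *dev)
{
    bus->devices[dev->devfn] = NULL;
}

/* Register, run the device's init, and back out cleanly if init fails. */
static int sketch_qdev_init(SketchBus *bus, SketchDev *dev)
{
    int rc;

    sketch_register(bus, dev);
    rc = dev->init(dev);
    if (rc != 0) {
        sketch_unregister(bus, dev);   /* no stale pointer left in the slot */
        return rc;
    }
    return 0;
}

static int sketch_init_ok(SketchDev *dev)   { (void)dev; return 0; }
static int sketch_init_fail(SketchDev *dev) { (void)dev; return -1; }
```

Without the unregister call, the slot of the failed device would still point at it, which is the stale entry that 'info pci' could trip over.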




[Qemu-devel] [PATCH] target-arm: Handle 'smc' as an undefined instruction

2010-05-10 Thread Adam Lackorzynski


Signed-off-by: Adam Lackorzynski 
---
 target-arm/translate.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 0eccca5..afd6716 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -6344,7 +6344,11 @@ static void disas_arm_insn(CPUState * env, DisasContext 
*s)
 dead_tmp(tmp2);
 store_reg(s, rd, tmp);
 break;
-case 7: /* bkpt */
+case 7:
+/* SMC? */
+if ((insn & 0xfffffff0) == 0xe1600070)
+  goto illegal_op;
+/* bkpt */
 gen_set_condexec(s);
 gen_set_pc_im(s->pc - 4);
 gen_exception(EXCP_BKPT);
-- 
1.7.1


Adam
-- 
Adam a...@os.inf.tu-dresden.de
  Lackorzynski http://os.inf.tu-dresden.de/~adam/
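
The check added to "case 7" is a mask-and-compare decode. The sketch below shows the idea in isolation, assuming the intended mask is 0xfffffff0 (every bit fixed except the low 4-bit immediate) and the unconditional encoding 0xe1600070 from the patch. The helper names and the Decoded enum are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed mask: all bits fixed except the low imm4 field. */
#define SMC_MASK  0xfffffff0u
/* Unconditional (cond = AL) SMC, as compared against in the patch. */
#define SMC_VALUE 0xe1600070u

/* Hypothetical helper: true for an unconditional SMC with any imm4. */
static bool is_smc_al(uint32_t insn)
{
    return (insn & SMC_MASK) == SMC_VALUE;
}

typedef enum { DECODE_BKPT, DECODE_ILLEGAL } Decoded;

/* Mirrors the shape of "case 7": reject SMC first, otherwise BKPT. */
static Decoded decode_case7(uint32_t insn)
{
    /* Only words with bits [7:4] == 0111 are expected to get here. */
    assert(((insn >> 4) & 0xf) == 0x7);
    if (is_smc_al(insn)) {
        return DECODE_ILLEGAL;
    }
    return DECODE_BKPT;
}
```

For example, 0xe1200070 (BKPT #0) still decodes as a breakpoint, while 0xe1600070 and 0xe160007f are both routed to the illegal-instruction path.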



[Qemu-devel] Re: [PATCH 0/1] [RFC][AHCI] add cdrom support for ahci.

2010-05-10 Thread Alexander Graf

On 11.05.2010, at 00:13, Sebastian Herbszt wrote:

> Alexander Graf wrote:
>> Hi Chong,
>> 
>> On 10.05.2010, at 13:55, QiaoChong wrote:
>> 
>> > When ahci initializes, the driver sends the ATA_SRST command, and the ahci
>> > device reports its device type through the port's sig register.
>> > Ahci disk lookup now changes from IF_SD to IF_SCSI, because IF_SD does not
>> > support cdrom media.
>> > I just copied ide_atapi_cmd from hw/ide/core.c into hw/ahci.c and changed it
>> > a little; now the cdrom can be identified and read by the OS.
>> > If qemu can change dma_buf_prepare, dma_buf_rw and dma_buf_commit into
>> > function pointers in BMDMAState, then I can rewrite the three functions to
>> > support ahci's prtd, because it is different from ide's.
>> >
>> > test a sata disk like this:
>> > ./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
>> > if=scsi,file=/tmp/disk
>> > test a sata cd like this:
>> > ./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
>> > if=scsi,media=cdrom,file=KNOPPIX_V6.0.1CD-2009-02-08-EN.iso
>> 
>> Thanks for improving the patch, but I have some nitpicks concerning how 
>> to proceed here.
>> 
>> For starters, this patch is incremental to the previous one. Since the 
>> previous patch did not get applied to qemu, it doesn't make sense to send an 
>> incremental patch. Please send the full patchset but bump up the version 
>> in that case. You will find many examples for that on the mailing list. In 
>> most cases it also makes sense to rethink the splitting between patches.
> 
> The problem of incremental patches will be a non issue as soon as the git 
> tree is available.

Yes, I'll take care of doing that. I hope I'll get to it tomorrow. Until then 
all the small incremental patches are completely useless to me and impossible 
to review. They don't make sense without the broader scope.

Alex




[Qemu-devel] [PATCH 3/3] target-sparc: Inline some generation of carry for ADDX/SUBX.

2010-05-10 Thread Richard Henderson
Computing carry is trivial for some inputs.  By avoiding an
external function call, we generate near-optimal code for
the common cases of add+addx (double-word arithmetic) and
cmp+addx (a setcc pattern).

Signed-off-by: Richard Henderson 
---
 target-sparc/helper.h|2 +-
 target-sparc/op_helper.c |2 +-
 target-sparc/translate.c |  268 +-
 3 files changed, 196 insertions(+), 76 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 04c1306..6f103e7 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -158,6 +158,6 @@ VIS_CMPHELPER(cmpne);
 #undef VIS_HELPER
 #undef VIS_CMPHELPER
 DEF_HELPER_0(compute_psr, void);
-DEF_HELPER_0(compute_C_icc, tl);
+DEF_HELPER_0(compute_C_icc, i32);
 
 #include "def-helper.h"
diff --git a/target-sparc/op_helper.c b/target-sparc/op_helper.c
index c36bc54..3d6177b 100644
--- a/target-sparc/op_helper.c
+++ b/target-sparc/op_helper.c
@@ -1314,7 +1314,7 @@ void helper_compute_psr(void)
 CC_OP = CC_OP_FLAGS;
 }
 
-target_ulong helper_compute_C_icc(void)
+uint32_t helper_compute_C_icc(void)
 {
 uint32_t ret;
 
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index ea7c71b..06f0f34 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -332,24 +332,130 @@ static inline void gen_op_add_cc(TCGv dst, TCGv src1, 
TCGv src2)
 tcg_gen_mov_tl(dst, cpu_cc_dst);
 }
 
-static inline void gen_op_addxi_cc(TCGv dst, TCGv src1, target_long src2)
+static TCGv_i32 gen_add32_carry32(void)
 {
-gen_helper_compute_C_icc(cpu_tmp0);
-tcg_gen_mov_tl(cpu_cc_src, src1);
-tcg_gen_movi_tl(cpu_cc_src2, src2);
-tcg_gen_add_tl(cpu_cc_dst, cpu_cc_src, cpu_tmp0);
-tcg_gen_addi_tl(cpu_cc_dst, cpu_cc_dst, src2);
-tcg_gen_mov_tl(dst, cpu_cc_dst);
+TCGv_i32 carry_32, cc_src1_32, cc_src2_32;
+
+/* Carry is computed from a previous add: (dst < src)  */
+#if TARGET_LONG_BITS == 64
+cc_src1_32 = tcg_temp_new_i32();
+cc_src2_32 = tcg_temp_new_i32();
+tcg_gen_trunc_i64_i32(cc_src1_32, cpu_cc_dst);
+tcg_gen_trunc_i64_i32(cc_src2_32, cpu_cc_src);
+#else
+cc_src1_32 = cpu_cc_dst;
+cc_src2_32 = cpu_cc_src;
+#endif
+
+carry_32 = tcg_temp_new_i32();
+tcg_gen_setcond_i32(TCG_COND_LTU, carry_32, cc_src1_32, cc_src2_32);
+
+#if TARGET_LONG_BITS == 64
+tcg_temp_free_i32(cc_src1_32);
+tcg_temp_free_i32(cc_src2_32);
+#endif
+
+return carry_32;
 }
 
-static inline void gen_op_addx_cc(TCGv dst, TCGv src1, TCGv src2)
+static TCGv_i32 gen_sub32_carry32(void)
 {
-gen_helper_compute_C_icc(cpu_tmp0);
-tcg_gen_mov_tl(cpu_cc_src, src1);
-tcg_gen_mov_tl(cpu_cc_src2, src2);
-tcg_gen_add_tl(cpu_cc_dst, cpu_cc_src, cpu_tmp0);
-tcg_gen_add_tl(cpu_cc_dst, cpu_cc_dst, cpu_cc_src2);
-tcg_gen_mov_tl(dst, cpu_cc_dst);
+TCGv_i32 carry_32, cc_src1_32, cc_src2_32;
+
+/* Carry is computed from a previous borrow: (src1 < src2)  */
+#if TARGET_LONG_BITS == 64
+cc_src1_32 = tcg_temp_new_i32();
+cc_src2_32 = tcg_temp_new_i32();
+tcg_gen_trunc_i64_i32(cc_src1_32, cpu_cc_src);
+tcg_gen_trunc_i64_i32(cc_src2_32, cpu_cc_src2);
+#else
+cc_src1_32 = cpu_cc_src;
+cc_src2_32 = cpu_cc_src2;
+#endif
+
+carry_32 = tcg_temp_new_i32();
+tcg_gen_setcond_i32(TCG_COND_LTU, carry_32, cc_src1_32, cc_src2_32);
+
+#if TARGET_LONG_BITS == 64
+tcg_temp_free_i32(cc_src1_32);
+tcg_temp_free_i32(cc_src2_32);
+#endif
+
+return carry_32;
+}
+
+static void gen_op_addx_int(DisasContext *dc, TCGv dst, TCGv src1,
+TCGv src2, int update_cc)
+{
+TCGv_i32 carry_32;
+TCGv carry;
+
+switch (dc->cc_op) {
+case CC_OP_DIV:
+case CC_OP_LOGIC:
+/* Carry is known to be zero.  Fall back to plain ADD.  */
+if (update_cc) {
+gen_op_add_cc(dst, src1, src2);
+} else {
+tcg_gen_add_tl(dst, src1, src2);
+}
+return;
+
+case CC_OP_ADD:
+case CC_OP_TADD:
+case CC_OP_TADDTV:
+#if TCG_TARGET_REG_BITS == 32 && TARGET_LONG_BITS == 32
+{
+/* For 32-bit hosts, we can re-use the host's hardware carry
+   generation by using an ADD2 opcode.  We discard the low
+   part of the output.  Ideally we'd combine this operation
+   with the add that generated the carry in the first place.  */
+TCGv dst_low = tcg_temp_new();
+tcg_gen_op6_i32(INDEX_op_add2_i32, dst_low, dst, 
+cpu_cc_src, src1, cpu_cc_src2, src2);
+tcg_temp_free(dst_low);
+goto add_done;
+}
+#endif
+carry_32 = gen_add32_carry32();
+break;
+
+case CC_OP_SUB:
+case CC_OP_TSUB:
+case CC_OP_TSUBTV:
+carry_32 = gen_sub32_carry32();
+break;
+
+default:
+/* We need external help to produce the carry.  */
+carry_32 = tcg_temp_new_i32();
+gen_helper_compute_C_ic

[Qemu-devel] [PATCH 0/3] Fix ADDX compilation plus improvements.

2010-05-10 Thread Richard Henderson
The first patch is required in order to fix TCGv_i32/_i64 type errors.

The second patch fixes some mistakes I noticed with ADDX carry generation.

The third patch improves code generation for some common cases.  With
Aurelien's tcg-optimization patches we get nearly optimal code, and
it isn't half bad with the TCG optimizer as-is.
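
To make the carry handling in this series concrete, here is a small sketch in plain C (not TCG, and not the code from the patches). It shows the (dst < src) test for the carry out of a 32-bit add, an add-with-carry that widens to 64 bits so a wrap caused by the carry-in is also caught, and the add+addx double-word pattern. All function names are hypothetical, and the widening approach is a substitute chosen for clarity, not the method the patches use.

```c
#include <assert.h>
#include <stdint.h>

/* ADD: result plus carry-out.  An unsigned sum wrapped around iff it
   is smaller than one of its operands. */
static uint32_t add32(uint32_t a, uint32_t b, uint32_t *carry_out)
{
    uint32_t dst = a + b;
    *carry_out = dst < a;
    return dst;
}

/* ADDX: add with carry-in.  Widening to 64 bits leaves the carry-out
   in bit 32, including the case where the carry-in causes the wrap. */
static uint32_t addx32(uint32_t a, uint32_t b, uint32_t carry_in,
                       uint32_t *carry_out)
{
    assert(carry_in <= 1);
    uint64_t wide = (uint64_t)a + b + carry_in;
    *carry_out = (uint32_t)(wide >> 32);
    return (uint32_t)wide;
}

/* Double-word add built from add + addx, as a 32-bit guest would do it. */
static uint64_t add64_via_halves(uint64_t x, uint64_t y)
{
    uint32_t carry, unused;
    uint32_t lo = add32((uint32_t)x, (uint32_t)y, &carry);
    uint32_t hi = addx32((uint32_t)(x >> 32), (uint32_t)(y >> 32),
                         carry, &unused);
    return ((uint64_t)hi << 32) | lo;
}
```

The case addx32(5, 0xffffffff, 1) is the interesting one: the result equals the first operand, so a plain (dst < src) comparison alone would not report the carry, which is why ADDX cannot simply reuse the ADD carry test.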



r~



Richard Henderson (3):
  target-sparc: Fix compilation with --enable-debug.
  target-sparc: Simplify ICC generation; fix ADDX carry generation.
  target-sparc: Inline some generation of carry for ADDX/SUBX.

 target-sparc/op_helper.c |  106 ---
 target-sparc/translate.c |  268 +-
 2 files changed, 263 insertions(+), 111 deletions(-)




[Qemu-devel] [PATCH 1/3] target-sparc: Fix compilation with --enable-debug.

2010-05-10 Thread Richard Henderson
Return a target_ulong from compute_C_icc to match the width of the users.

Signed-off-by: Richard Henderson 
---
 target-sparc/helper.h|2 +-
 target-sparc/op_helper.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 6f103e7..04c1306 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -158,6 +158,6 @@ VIS_CMPHELPER(cmpne);
 #undef VIS_HELPER
 #undef VIS_CMPHELPER
 DEF_HELPER_0(compute_psr, void);
-DEF_HELPER_0(compute_C_icc, i32);
+DEF_HELPER_0(compute_C_icc, tl);
 
 #include "def-helper.h"
diff --git a/target-sparc/op_helper.c b/target-sparc/op_helper.c
index fcfd3f3..09449c5 100644
--- a/target-sparc/op_helper.c
+++ b/target-sparc/op_helper.c
@@ -1282,7 +1282,7 @@ void helper_compute_psr(void)
 CC_OP = CC_OP_FLAGS;
 }
 
-uint32_t helper_compute_C_icc(void)
+target_ulong helper_compute_C_icc(void)
 {
 uint32_t ret;
 
-- 
1.7.0.1




[Qemu-devel] [PATCH 2/3] target-sparc: Simplify ICC generation; fix ADDX carry generation.

2010-05-10 Thread Richard Henderson
Use int32 types instead of target_ulong when computing ICC.  This
simplifies the generated code for 32-bit host and 64-bit guest.
Use the same simplified expressions for ICC as were already used
for XCC in carry flag generation.

ADDX ICC carry generation was using the same routines as ADD ICC,
which is incorrect if the input carry bit produces the output carry.
Use the algorithms already in place for ADDX XCC carry generation.
Similarly for SUBX.

Signed-off-by: Richard Henderson 
---
 target-sparc/op_helper.c |  106 ++
 1 files changed, 69 insertions(+), 37 deletions(-)

diff --git a/target-sparc/op_helper.c b/target-sparc/op_helper.c
index 09449c5..c36bc54 100644
--- a/target-sparc/op_helper.c
+++ b/target-sparc/op_helper.c
@@ -896,13 +896,13 @@ static uint32_t compute_C_flags(void)
 return env->psr & PSR_CARRY;
 }
 
-static inline uint32_t get_NZ_icc(target_ulong dst)
+static inline uint32_t get_NZ_icc(int32_t dst)
 {
 uint32_t ret = 0;
 
-if (!(dst & 0xffffffffULL))
+if (dst == 0)
 ret |= PSR_ZERO;
-if ((int32_t) (dst & 0xffffffffULL) < 0)
+if (dst < 0)
 ret |= PSR_NEG;
 return ret;
 }
@@ -918,13 +918,13 @@ static uint32_t compute_C_flags_xcc(void)
 return env->xcc & PSR_CARRY;
 }
 
-static inline uint32_t get_NZ_xcc(target_ulong dst)
+static inline uint32_t get_NZ_xcc(target_long dst)
 {
 uint32_t ret = 0;
 
 if (!dst)
 ret |= PSR_ZERO;
-if ((int64_t)dst < 0)
+if (dst < 0)
 ret |= PSR_NEG;
 return ret;
 }
@@ -953,25 +953,21 @@ static uint32_t compute_C_div(void)
 return 0;
 }
 
-/* carry = (src1[31] & src2[31]) | ( ~dst[31] & (src1[31] | src2[31])) */
-static inline uint32_t get_C_add_icc(target_ulong dst, target_ulong src1,
- target_ulong src2)
+static inline uint32_t get_C_add_icc(uint32_t dst, uint32_t src1)
 {
 uint32_t ret = 0;
 
-if (((src1 & (1ULL << 31)) & (src2 & (1ULL << 31)))
-| ((~(dst & (1ULL << 31)))
-   & ((src1 & (1ULL << 31)) | (src2 & (1ULL << 31)))))
+if (dst < src1)
 ret |= PSR_CARRY;
 return ret;
 }
 
-static inline uint32_t get_V_add_icc(target_ulong dst, target_ulong src1,
- target_ulong src2)
+static inline uint32_t get_V_add_icc(uint32_t dst, uint32_t src1,
+ uint32_t src2)
 {
 uint32_t ret = 0;
 
-if (((src1 ^ src2 ^ -1) & (src1 ^ dst)) & (1ULL << 31))
+if (((src1 ^ src2 ^ -1) & (src1 ^ dst)) & (1U << 31))
 ret |= PSR_OVF;
 return ret;
 }
@@ -1017,14 +1013,14 @@ static uint32_t compute_all_add(void)
 uint32_t ret;
 
 ret = get_NZ_icc(CC_DST);
-ret |= get_C_add_icc(CC_DST, CC_SRC, CC_SRC2);
+ret |= get_C_add_icc(CC_DST, CC_SRC);
 ret |= get_V_add_icc(CC_DST, CC_SRC, CC_SRC2);
 return ret;
 }
 
 static uint32_t compute_C_add(void)
 {
-return get_C_add_icc(CC_DST, CC_SRC, CC_SRC2);
+return get_C_add_icc(CC_DST, CC_SRC);
 }
 
 #ifdef TARGET_SPARC64
@@ -1049,6 +1045,26 @@ static uint32_t compute_C_addx_xcc(void)
 }
 #endif
 
+static uint32_t compute_all_addx(void)
+{
+uint32_t ret;
+
+ret = get_NZ_icc(CC_DST);
+ret |= get_C_add_icc(CC_DST - CC_SRC2, CC_SRC);
+ret |= get_C_add_icc(CC_DST, CC_SRC);
+ret |= get_V_add_icc(CC_DST, CC_SRC, CC_SRC2);
+return ret;
+}
+
+static uint32_t compute_C_addx(void)
+{
+uint32_t ret;
+
+ret = get_C_add_icc(CC_DST - CC_SRC2, CC_SRC);
+ret |= get_C_add_icc(CC_DST, CC_SRC);
+return ret;
+}
+
 static inline uint32_t get_V_tag_icc(target_ulong src1, target_ulong src2)
 {
 uint32_t ret = 0;
@@ -1063,7 +1079,7 @@ static uint32_t compute_all_tadd(void)
 uint32_t ret;
 
 ret = get_NZ_icc(CC_DST);
-ret |= get_C_add_icc(CC_DST, CC_SRC, CC_SRC2);
+ret |= get_C_add_icc(CC_DST, CC_SRC);
 ret |= get_V_add_icc(CC_DST, CC_SRC, CC_SRC2);
 ret |= get_V_tag_icc(CC_SRC, CC_SRC2);
 return ret;
@@ -1071,7 +1087,7 @@ static uint32_t compute_all_tadd(void)
 
 static uint32_t compute_C_tadd(void)
 {
-return get_C_add_icc(CC_DST, CC_SRC, CC_SRC2);
+return get_C_add_icc(CC_DST, CC_SRC);
 }
 
 static uint32_t compute_all_taddtv(void)
@@ -1079,34 +1095,30 @@ static uint32_t compute_all_taddtv(void)
 uint32_t ret;
 
 ret = get_NZ_icc(CC_DST);
-ret |= get_C_add_icc(CC_DST, CC_SRC, CC_SRC2);
+ret |= get_C_add_icc(CC_DST, CC_SRC);
 return ret;
 }
 
 static uint32_t compute_C_taddtv(void)
 {
-return get_C_add_icc(CC_DST, CC_SRC, CC_SRC2);
+return get_C_add_icc(CC_DST, CC_SRC);
 }
 
-/* carry = (~src1[31] & src2[31]) | ( dst[31]  & (~src1[31] | src2[31])) */
-static inline uint32_t get_C_sub_icc(target_ulong dst, target_ulong src1,
- target_ulong src2)
+static inline uint32_t get_C_sub_icc(uint32_t src1, uint32_t src2)
 {
 uint32_t ret = 0;
 
-if (((~(src1 & (1ULL << 31))) & (src2 & (1ULL << 31)))
- 

[Qemu-devel] Re: [PATCH 0/1] [RFC][AHCI] add cdrom support for ahci.

2010-05-10 Thread Sebastian Herbszt

Alexander Graf wrote:
> Hi Chong,
>
> On 10.05.2010, at 13:55, QiaoChong wrote:
>
> > When ahci initializes, the driver sends the ATA_SRST command, and the ahci
> > device reports its device type through the port's sig register.
> > Ahci disk lookup now changes from IF_SD to IF_SCSI, because IF_SD does not
> > support cdrom media.
> > I just copied ide_atapi_cmd from hw/ide/core.c into hw/ahci.c and changed it
> > a little; now the cdrom can be identified and read by the OS.
> > If qemu can change dma_buf_prepare, dma_buf_rw and dma_buf_commit into
> > function pointers in BMDMAState, then I can rewrite the three functions to
> > support ahci's prtd, because it is different from ide's.
> >
> > test a sata disk like this:
> > ./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
> > if=scsi,file=/tmp/disk
> > test a sata cd like this:
> > ./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive 
> > if=scsi,media=cdrom,file=KNOPPIX_V6.0.1CD-2009-02-08-EN.iso
>
> Thanks for improving the patch, but I have some nitpicks concerning how to
> proceed here.
>
> For starters, this patch is incremental to the previous one. Since the
> previous patch did not get applied to qemu, it doesn't make sense to send an
> incremental patch. Please send the full patchset but bump up the version in
> that case. You will find many examples for that on the mailing list. In most
> cases it also makes sense to rethink the splitting between patches.

The problem of incremental patches will be a non issue as soon as the git tree 
is available.

Sebastian




Re: [Qemu-devel] [PATCH 0/2] Enable qemu block layer to not flush

2010-05-10 Thread Anthony Liguori

On 05/10/2010 05:03 PM, Alexander Graf wrote:
> On 10.05.2010, at 23:59, Anthony Liguori wrote:
>
>> On 05/10/2010 04:51 PM, Alexander Graf wrote:
>>> Thanks to recent improvements, qemu flushes guest data to disk when the guest
>>> tells us to do so.
>>>
>>> This is great if we care about data consistency on host disk failures. In cases
>>> where we don't it just creates additional overhead for no net win. One such use
>>> case is the building of appliances in SUSE Studio. We write the resulting images
>>> out of the build VM, but compress it directly afterwards. So if possible we'd
>>> love to keep it in RAM.
>>>
>>> This patchset introduces a new block parameter to -drive called "flush" which
>>> allows a user to disable flushing in odd scenarios like the above. To show the
>>> difference in performance this makes, I have put together a small test case.
>>> Inside the initrd, I call the following piece of code on a 500MB preallocated
>>> vmdk image:
>>
>> This seems like it's asking for trouble to me.  I'm not sure it's worth the
>> minor performance gain.
>
> The gain is small on my netbook, where I ran the test. This is part of the
> performance regressions from 0.10 to 0.12, where we're talking about build times
> going from 2 minutes to 30. While writeback accounted for most of it, flushing
> still at least doubled the build times, which is unacceptable for us.

There's got to be a better place to fix this.  Disable barriers in your 
guests?

Regards,

Anthony Liguori

> I also fail to see where it's asking for trouble. If we don't flush volatile
> data, things are good, no?
>
> Alex





Re: [Qemu-devel] [PATCH 0/2] Enable qemu block layer to not flush

2010-05-10 Thread Alexander Graf

On 10.05.2010, at 23:59, Anthony Liguori wrote:

> On 05/10/2010 04:51 PM, Alexander Graf wrote:
>> Thanks to recent improvements, qemu flushes guest data to disk when the guest
>> tells us to do so.
>> 
>> This is great if we care about data consistency on host disk failures. In 
>> cases
>> where we don't it just creates additional overhead for no net win. One such 
>> use
>> case is the building of appliances in SUSE Studio. We write the resulting 
>> images
>> out of the build VM, but compress it directly afterwards. So if possible we'd
>> love to keep it in RAM.
>> 
>> This patchset introduces a new block parameter to -drive called "flush" which
>> allows a user to disable flushing in odd scenarios like the above. To show 
>> the
>> difference in performance this makes, I have put together a small test case.
>> Inside the initrd, I call the following piece of code on a 500MB preallocated
>> vmdk image:
>>   
> 
> This seems like it's asking for trouble to me.  I'm not sure it's worth the 
> minor performance gain.

The gain is small on my netbook, where I ran the test. This is part of the 
performance regressions from 0.10 to 0.12, where we're talking about build times 
going from 2 minutes to 30. While writeback accounted for most of it, flushing 
still at least doubled the build times, which is unacceptable for us.

I also fail to see where it's asking for trouble. If we don't flush volatile 
data, things are good, no?


Alex




Re: [Qemu-devel] [PATCH 0/2] Enable qemu block layer to not flush

2010-05-10 Thread Anthony Liguori

On 05/10/2010 04:51 PM, Alexander Graf wrote:
> Thanks to recent improvements, qemu flushes guest data to disk when the guest
> tells us to do so.
>
> This is great if we care about data consistency on host disk failures. In cases
> where we don't it just creates additional overhead for no net win. One such use
> case is the building of appliances in SUSE Studio. We write the resulting images
> out of the build VM, but compress it directly afterwards. So if possible we'd
> love to keep it in RAM.
>
> This patchset introduces a new block parameter to -drive called "flush" which
> allows a user to disable flushing in odd scenarios like the above. To show the
> difference in performance this makes, I have put together a small test case.
> Inside the initrd, I call the following piece of code on a 500MB preallocated
> vmdk image:

This seems like it's asking for trouble to me.  I'm not sure it's worth 
the minor performance gain.


Regards,

Anthony Liguori




[Qemu-devel] [PATCH] add interface type IF_SATA

2010-05-10 Thread Sebastian Herbszt

Add interface type IF_SATA.

Signed-off-by: Sebastian Herbszt 

diff --git a/sysemu.h b/sysemu.h
index fa921df..b88bae9 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -151,7 +151,7 @@ extern unsigned int nb_prom_envs;
typedef enum {
IF_NONE,
IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, IF_VIRTIO, IF_XEN,
-IF_COUNT
+IF_SATA, IF_COUNT
} BlockInterfaceType;

typedef enum {
@@ -177,6 +177,7 @@ typedef struct DriveInfo {

#define MAX_IDE_DEVS 2
#define MAX_SCSI_DEVS 7
+#define MAX_SATA_DEVS 1
#define MAX_DRIVES 32

extern QTAILQ_HEAD(drivelist, DriveInfo) drives;
diff --git a/vl.c b/vl.c
index 85bcc84..cd3a343 100644
--- a/vl.c
+++ b/vl.c
@@ -849,6 +849,9 @@ DriveInfo *drive_init(QemuOpts *opts, void *opaque,
 } else if (!strcmp(buf, "xen")) {
 type = IF_XEN;
max_devs = 0;
+ } else if (!strcmp(buf, "sata")) {
+ type = IF_SATA;
+max_devs = MAX_SATA_DEVS;
 } else if (!strcmp(buf, "none")) {
 type = IF_NONE;
max_devs = 0;
@@ -1039,7 +1042,7 @@ DriveInfo *drive_init(QemuOpts *opts, void *opaque,
} else {
/* no id supplied -> create one */
dinfo->id = qemu_mallocz(32);
-if (type == IF_IDE || type == IF_SCSI)
+if (type == IF_IDE || type == IF_SCSI || type == IF_SATA)
mediastr = (media == MEDIA_CDROM) ? "-cd" : "-hd";
if (max_devs)
snprintf(dinfo->id, 32, "%s%i%s%i",
@@ -1064,6 +1067,7 @@ DriveInfo *drive_init(QemuOpts *opts, void *opaque,
case IF_IDE:
case IF_SCSI:
case IF_XEN:
+case IF_SATA:
case IF_NONE:
switch(media) {
 case MEDIA_DISK:
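
As an illustration of what the if=... handling amounts to, the mapping from an interface name to its type and per-bus device limit can be written as a table lookup. This is only a sketch: the abbreviated enum, the IfDesc struct and lookup_if() are hypothetical, the enum order does not match BlockInterfaceType, and the limits simply repeat MAX_IDE_DEVS, MAX_SCSI_DEVS and MAX_SATA_DEVS from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Abbreviated, hypothetical enum; not the real BlockInterfaceType. */
typedef enum {
    IF_NONE, IF_IDE, IF_SCSI, IF_SATA, IF_COUNT
} IfType;

typedef struct {
    const char *name;   /* value accepted after if= */
    IfType type;
    int max_devs;       /* limit copied from the MAX_*_DEVS defines */
} IfDesc;

static const IfDesc if_table[] = {
    { "none", IF_NONE, 0 },
    { "ide",  IF_IDE,  2 },
    { "scsi", IF_SCSI, 7 },
    { "sata", IF_SATA, 1 },
};

/* Return the table entry for a name, or NULL if it is not known. */
static const IfDesc *lookup_if(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(if_table) / sizeof(if_table[0]); i++) {
        if (strcmp(name, if_table[i].name) == 0) {
            return &if_table[i];
        }
    }
    return NULL;
}
```

A table like this keeps the name, the enum value and the limit in one place, whereas the strcmp() chain in drive_init() needs a new branch for every interface type.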




Re: [Qemu-devel] [PATCH] fix migration with large mem

2010-05-10 Thread Anthony Liguori

On 05/10/2010 04:45 PM, Izik Eidus wrote:

> On Mon, 10 May 2010 15:24:20 -0500
> Anthony Liguori  wrote:
>
>> On 04/13/2010 04:33 AM, Izik Eidus wrote:
>>> From f881b371e08760a67bf1f5b992a586c3de600f7a Mon Sep 17 00:00:00
>>> 2001 From: Izik Eidus
>>> Date: Tue, 13 Apr 2010 12:24:57 +0300
>>> Subject: [PATCH] fix migration with large mem
>>>
>>> In cases of guests with large mem whose pages
>>> each consist of identical bytes, we will
>>> spend a lot of time reading the memory from the guest
>>> (is_dup_page())
>>>
>>> This happens because the ram_save_live() function has
>>> a limit on how much we can send to the dest but not on how
>>> much we read from it, and in cases where we have many is_dup_page()
>>> hits, we might read a huge amount of data without updating important
>>> stuff like the timers...
>>>
>>> The guest loses all its responsiveness and has many soft lockups
>>> inside itself.
>>>
>>> This patch adds a limit on the size we can read from the guest each
>>> iteration.
>>>
>>>   Thanks.
>>>
>>> Signed-off-by: Izik Eidus
>>> ---
>>>   arch_init.c |    6 +++++-
>>>   1 files changed, 5 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch_init.c b/arch_init.c
>>> index cfc03ea..e27b1a0 100644
>>> --- a/arch_init.c
>>> +++ b/arch_init.c
>>> @@ -88,6 +88,8 @@ const uint32_t arch_type = QEMU_ARCH;
>>>   #define RAM_SAVE_FLAG_PAGE     0x08
>>>   #define RAM_SAVE_FLAG_EOS      0x10
>>>
>>> +#define MAX_SAVE_BLOCK_READ 10 * 1024 * 1024
>>> +
>>>   static int is_dup_page(uint8_t *page, uint8_t ch)
>>>   {
>>>   uint32_t val = ch << 24 | ch << 16 | ch << 8 | ch;
>>> @@ -175,6 +177,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
>>>   uint64_t bytes_transferred_last;
>>>   double bwidth = 0;
>>>   uint64_t expected_time = 0;
>>> +int data_read = 0;
>>>
>>>   if (stage < 0) {
>>>   cpu_physical_memory_set_dirty_tracking(0);
>>> @@ -205,10 +208,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
>>>   bytes_transferred_last = bytes_transferred;
>>>   bwidth = qemu_get_clock_ns(rt_clock);
>>>
>>> -while (!qemu_file_rate_limit(f)) {
>>> +while (!qemu_file_rate_limit(f) && data_read < MAX_SAVE_BLOCK_READ) {
>>
>> The effect of this patch is that we'll never send more than 10mb/s
>> during live migration?  If so, it's totally wrong as a fix to the
>> problem.
>
> It is 100mb/s... (if I remember correctly, the migration code will run
> this thing 10 times for each iteration)

No, it only runs it once.

> My feeling is that limiting it with the network 32mb/s limit is too low;
> reading memory at 100mb/s is not such a problem as long as we don't
> read gigabytes out of memory every second...

You've limited bandwidth to 10 mb/sec.  Even if it was 100 mb/sec a 
fixed limit is wrong.  On a 10gbit (or 40gbit) link, 100 mb/sec is not 
enough.

> (Still we want to optimize the billion of zeros cases of windows guests)
>
> Anyway if the above does not make sense to you, I will just change it
> into what you suggested
>
> So ?

That would work for me.

Regards,

Anthony Liguori
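
The behaviour under discussion can be modelled with a toy loop that tracks bytes sent and bytes read separately. This is a simplified sketch, not the arch_init.c code: SaveStats, save_iteration() and the budgets are hypothetical, and a uniform page is counted as a single byte on the wire, which ignores the real stream headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096

/* Returns 1 if every byte of the page equals ch. */
static int is_dup_page(const uint8_t *page, uint8_t ch)
{
    for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++) {
        if (page[i] != ch) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical bookkeeping for one save iteration. */
typedef struct {
    size_t pages_done;
    size_t bytes_sent;
    size_t bytes_read;
} SaveStats;

/* Walk pages until either the send budget or the read budget is used up. */
static SaveStats save_iteration(const uint8_t *ram, size_t num_pages,
                                size_t send_budget, size_t read_budget)
{
    SaveStats st = { 0, 0, 0 };
    assert(ram != NULL || num_pages == 0);
    while (st.pages_done < num_pages &&
           st.bytes_sent < send_budget &&
           st.bytes_read < read_budget) {
        const uint8_t *page = ram + st.pages_done * SKETCH_PAGE_SIZE;
        st.bytes_read += SKETCH_PAGE_SIZE;
        /* A uniform page goes out as one fill byte, not a whole page. */
        st.bytes_sent += is_dup_page(page, page[0]) ? 1 : SKETCH_PAGE_SIZE;
        st.pages_done++;
    }
    return st;
}
```

With all-zero memory the send budget is barely touched, so without a read budget the loop walks every page in one go; that is the stall the patch tries to avoid. The sketch also shows why a fixed read budget doubles as a throughput cap for non-uniform pages, which is the objection raised in the reply.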




[Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub

2010-05-10 Thread Alexander Graf
We need to be able to do nothing in AIO fashion. Since I suspect this
could be useful for more cases than the non flushing, I figured I'd
create a new function that does everything AIO-like, but doesn't do
anything.

Signed-off-by: Alexander Graf 
---
 block.c |   18 ++++++++++++++++++
 block.h |    5 +++++
 2 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 48305b7..1cd39d7 100644
--- a/block.c
+++ b/block.c
@@ -2196,6 +2196,24 @@ static BlockDriverAIOCB 
*bdrv_aio_flush_em(BlockDriverState *bs,
 return &acb->common;
 }
 
+BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs,
+BlockDriverCompletionFunc *cb, void *opaque)
+{
+BlockDriverAIOCBSync *acb;
+
+acb = qemu_aio_get(&bdrv_em_aio_pool, bs, cb, opaque);
+acb->is_write = 1; /* don't bounce in the completion handler */
+acb->qiov = NULL;
+acb->bounce = NULL;
+acb->ret = 0;
+
+if (!acb->bh)
+acb->bh = qemu_bh_new(bdrv_aio_bh_cb, acb);
+
+qemu_bh_schedule(acb->bh);
+return &acb->common;
+}
+
 /**/
 /* sync block device emulation */
 
diff --git a/block.h b/block.h
index f87d24e..bef6358 100644
--- a/block.h
+++ b/block.h
@@ -33,6 +33,7 @@ typedef struct QEMUSnapshotInfo {
 #define BDRV_O_CACHE_WB0x0040 /* use write-back caching */
 #define BDRV_O_NATIVE_AIO  0x0080 /* use native AIO instead of the thread pool 
*/
 #define BDRV_O_NO_BACKING  0x0100 /* don't open the backing file */
+#define BDRV_O_NOFLUSH 0x0200 /* don't flush the image ever */
 
 #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_CACHE_WB)
 
@@ -97,6 +98,10 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
 BlockDriverCompletionFunc *cb, void *opaque);
 void bdrv_aio_cancel(BlockDriverAIOCB *acb);
 
+/* Emulate a no-op */
+BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs,
+BlockDriverCompletionFunc *cb, void *opaque);
+
 typedef struct BlockRequest {
 /* Fields to be filled by multiwrite caller */
 int64_t sector;
-- 
1.6.0.2
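
The idea of "doing nothing in AIO fashion" is that the caller still gets its completion callback, but only later from the main loop, never from inside the submit call. The sketch below models that with a tiny pending queue. It is an illustration under assumptions: Pending, aio_noop() and run_pending() are hypothetical names, and the queue stands in for QEMU's bottom-half mechanism rather than reproducing it.

```c
#include <assert.h>
#include <stddef.h>

typedef void CompletionFunc(void *opaque, int ret);

/* Hypothetical record of one completion waiting to be delivered. */
typedef struct {
    CompletionFunc *cb;
    void *opaque;
    int ret;
} Pending;

#define MAX_PENDING 16
static Pending pending[MAX_PENDING];
static size_t num_pending;

/* "Do nothing" request: queue a successful completion, but do not
   call the callback from inside the submit path. */
static int aio_noop(CompletionFunc *cb, void *opaque)
{
    assert(cb != NULL);
    if (num_pending == MAX_PENDING) {
        return -1;
    }
    pending[num_pending++] = (Pending){ cb, opaque, 0 };
    return 0;
}

/* Stand-in for the main loop running the scheduled work. */
static void run_pending(void)
{
    for (size_t i = 0; i < num_pending; i++) {
        pending[i].cb(pending[i].opaque, pending[i].ret);
    }
    num_pending = 0;
}

/* Example callback: count successful completions. */
static void count_completion(void *opaque, int ret)
{
    if (ret == 0) {
        (*(int *)opaque)++;
    }
}
```

Deferring the callback keeps the no-op indistinguishable from a real request as far as the caller's control flow is concerned, which is what makes it usable as a drop-in replacement for a flush.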




[Qemu-devel] [PATCH 2/2] Add flush=off parameter to -drive

2010-05-10 Thread Alexander Graf
Usually the guest can tell the host to flush data to disk. In some cases we
don't want to flush though, but try to keep everything in cache.

So let's add a new parameter to -drive that allows us to set the flushing
behavior to "on" or "off", defaulting to enabling the guest to flush.

Signed-off-by: Alexander Graf 
---
 block/raw-posix.c |   13 +++++++++++++
 qemu-config.c     |    3 +++
 qemu-options.hx   |    3 +++
 vl.c              |    3 +++
 4 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 7541ed2..2510b1b 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -106,6 +106,7 @@ typedef struct BDRVRawState {
 int fd;
 int type;
 int open_flags;
+int bdrv_flags;
 #if defined(__linux__)
 /* linux floppy specific */
 int64_t fd_open_time;
@@ -133,6 +134,7 @@ static int raw_open_common(BlockDriverState *bs, const char 
*filename,
 BDRVRawState *s = bs->opaque;
 int fd, ret;
 
+s->bdrv_flags = bdrv_flags;
 s->open_flags = open_flags | O_BINARY;
 s->open_flags &= ~O_ACCMODE;
 if (bdrv_flags & BDRV_O_RDWR) {
@@ -555,6 +557,11 @@ static BlockDriverAIOCB *raw_aio_flush(BlockDriverState 
*bs,
 if (fd_open(bs) < 0)
 return NULL;
 
+/* Don't flush? */
+if (s->bdrv_flags & BDRV_O_NOFLUSH) {
+return bdrv_aio_noop_em(bs, cb, opaque);
+}
+
 return paio_submit(bs, s->fd, 0, NULL, 0, cb, opaque, QEMU_AIO_FLUSH);
 }
 
@@ -726,6 +733,12 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options)
 static void raw_flush(BlockDriverState *bs)
 {
 BDRVRawState *s = bs->opaque;
+
+/* No flush means no flush */
+if (s->bdrv_flags & BDRV_O_NOFLUSH) {
+return;
+}
+
 qemu_fdatasync(s->fd);
 }
 
diff --git a/qemu-config.c b/qemu-config.c
index d500885..c358add 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -79,6 +79,9 @@ QemuOptsList qemu_drive_opts = {
 },{
 .name = "readonly",
 .type = QEMU_OPT_BOOL,
+},{
+.name = "flush",
+.type = QEMU_OPT_BOOL,
 },
 { /* end if list */ }
 },
diff --git a/qemu-options.hx b/qemu-options.hx
index 12f6b51..69ae8de 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -120,6 +120,7 @@ DEF("drive", HAS_ARG, QEMU_OPTION_drive,
 "   [,cyls=c,heads=h,secs=s[,trans=t]][,snapshot=on|off]\n"
 "   [,cache=writethrough|writeback|none][,format=f][,serial=s]\n"
 "   [,addr=A][,id=name][,aio=threads|native][,readonly=on|off]\n"
+"   [,flush=on|off]\n"
 "use 'file' as a drive image\n", QEMU_ARCH_ALL)
 STEXI
 @item -drive @var{option}[,@var{option}[,@var{option}[,...]]]
@@ -151,6 +152,8 @@ These options have the same definition as they have in 
@option{-hdachs}.
 @var{cache} is "none", "writeback", or "writethrough" and controls how the 
host cache is used to access block data.
 @item aio=@var{aio}
 @var{aio} is "threads", or "native" and selects between pthread based disk I/O 
and native Linux AIO.
+@item flush=@var{flush}
+@var{flush} is "on" (default), or "off" and selects whether the guest can 
trigger a host flush
 @item format=@var{format}
 Specify which disk @var{format} will be used rather than detecting
 the format.  Can be used to specifiy format=raw to avoid interpreting
diff --git a/vl.c b/vl.c
index 85bcc84..a7ca2c3 100644
--- a/vl.c
+++ b/vl.c
@@ -787,6 +787,7 @@ DriveInfo *drive_init(QemuOpts *opts, void *opaque,
 int max_devs;
 int index;
 int ro = 0;
+int flush = 1;
 int bdrv_flags = 0;
 int on_read_error, on_write_error;
 const char *devaddr;
@@ -819,6 +820,7 @@ DriveInfo *drive_init(QemuOpts *opts, void *opaque,
 
 snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
 ro = qemu_opt_get_bool(opts, "readonly", 0);
+flush = qemu_opt_get_bool(opts, "flush", 1);
 
 file = qemu_opt_get(opts, "file");
 serial = qemu_opt_get(opts, "serial");
@@ -1118,6 +1120,7 @@ DriveInfo *drive_init(QemuOpts *opts, void *opaque,
 }
 
 bdrv_flags |= ro ? 0 : BDRV_O_RDWR;
+bdrv_flags |= flush ? 0 : BDRV_O_NOFLUSH;
 
 if (bdrv_open(dinfo->bdrv, file, bdrv_flags, drv) < 0) {
 fprintf(stderr, "qemu: could not open disk image %s: %s\n",
-- 
1.6.0.2
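
The flag check added to raw_flush() boils down to the following pattern. This is a minimal sketch, not the raw-posix.c code: RawState and raw_flush_sketch() are hypothetical, SKETCH_O_NOFLUSH merely reuses the 0x0200 bit value of BDRV_O_NOFLUSH, and plain POSIX fsync() is used here in place of qemu_fdatasync().

```c
#include <assert.h>
#include <unistd.h>

/* Same bit value as BDRV_O_NOFLUSH in patch 1/2; the name is made up. */
#define SKETCH_O_NOFLUSH 0x0200

/* Hypothetical per-image state: descriptor plus the open flags. */
typedef struct {
    int fd;
    int flags;
} RawState;

/* Returns 0 on success or when flushing is disabled, -1 if fsync() fails. */
static int raw_flush_sketch(const RawState *s)
{
    assert(s != NULL);
    if (s->flags & SKETCH_O_NOFLUSH) {
        return 0;                 /* flush requested, silently dropped */
    }
    return fsync(s->fd);
}
```

Returning success for a dropped flush is the whole point of the option and also its risk: the guest believes its data is stable while it may still sit in the host page cache.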




[Qemu-devel] [PATCH 0/2] Enable qemu block layer to not flush

2010-05-10 Thread Alexander Graf
Thanks to recent improvements, qemu flushes guest data to disk when the guest
tells us to do so.

This is great if we care about data consistency on host disk failures. In cases
where we don't it just creates additional overhead for no net win. One such use
case is the building of appliances in SUSE Studio. We write the resulting images
out of the build VM, but compress it directly afterwards. So if possible we'd
love to keep it in RAM.

This patchset introduces a new block parameter to -drive called "flush" which
allows a user to disable flushing in odd scenarios like the above. To show the
difference in performance this makes, I have put together a small test case.
Inside the initrd, I call the following piece of code on a 500MB preallocated
vmdk image:

  mkfs.ext3 /dev/vda
  mkdir -p /mnt
  mount /dev/vda /mnt
  dd if=/dev/zero of=/mnt/test bs=1M
  umount /mnt
  sync
  halt -fp

With flush=on (default)

real    0m33.597s
user    0m16.453s
sys     0m6.192s

With flush=off

real    0m27.150s
user    0m16.533s
sys     0m5.348s


Alexander Graf (2):
  Add no-op aio emulation stub
  Add flush=off parameter to -drive

 block.c           |   18 ++++++++++++++++++
 block.h           |    5 +++++
 block/raw-posix.c |   13 +++++++++++++
 qemu-config.c     |    3 +++
 qemu-options.hx   |    3 +++
 vl.c              |    3 +++
 6 files changed, 45 insertions(+), 0 deletions(-)




Re: [Qemu-devel] [PATCH] fix migration with large mem

2010-05-10 Thread Izik Eidus
On Mon, 10 May 2010 15:24:20 -0500
Anthony Liguori  wrote:

> On 04/13/2010 04:33 AM, Izik Eidus wrote:
> >  From f881b371e08760a67bf1f5b992a586c3de600f7a Mon Sep 17 00:00:00
> > 2001 From: Izik Eidus
> > Date: Tue, 13 Apr 2010 12:24:57 +0300
> > Subject: [PATCH] fix migration with large mem
> >
> > In cases of guests with large mem whose pages
> > each consist of identical bytes, we will
> > spend a lot of time reading the memory from the guest
> > (is_dup_page())
> >
> > This happens because the ram_save_live() function has
> > a limit on how much we can send to the dest but not on how
> > much we read from it, and in cases where we have many is_dup_page()
> > hits, we might read a huge amount of data without updating important
> > stuff like the timers...
> >
> > The guest loses all its responsiveness and has many soft lockups
> > inside itself.
> >
> > This patch adds a limit on the size we can read from the guest each
> > iteration.
> >
> >  Thanks.
> >
> > Signed-off-by: Izik Eidus
> > ---
> >   arch_init.c |    6 +++++-
> >   1 files changed, 5 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch_init.c b/arch_init.c
> > index cfc03ea..e27b1a0 100644
> > --- a/arch_init.c
> > +++ b/arch_init.c
> > @@ -88,6 +88,8 @@ const uint32_t arch_type = QEMU_ARCH;
> >   #define RAM_SAVE_FLAG_PAGE     0x08
> >   #define RAM_SAVE_FLAG_EOS      0x10
> >
> > +#define MAX_SAVE_BLOCK_READ 10 * 1024 * 1024
> > +
> >   static int is_dup_page(uint8_t *page, uint8_t ch)
> >   {
> >   uint32_t val = ch << 24 | ch << 16 | ch << 8 | ch;
> > @@ -175,6 +177,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
> >   uint64_t bytes_transferred_last;
> >   double bwidth = 0;
> >   uint64_t expected_time = 0;
> > +int data_read = 0;
> >
> >   if (stage < 0) {
> >   cpu_physical_memory_set_dirty_tracking(0);
> > @@ -205,10 +208,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
> >   bytes_transferred_last = bytes_transferred;
> >   bwidth = qemu_get_clock_ns(rt_clock);
> >
> > -while (!qemu_file_rate_limit(f)) {
> > +while (!qemu_file_rate_limit(f) && data_read < MAX_SAVE_BLOCK_READ) {
> 
> The effect of this patch is that we'll never send more than 10mb/s 
> during live migration?  If so, it's totally wrong as a fix to the
> problem.

It is 100mb/s... (if I remember correctly, the migration code will run
this thing 10 times for each iteration)

My feeling is that limiting it to the network's 32mb/s limit is too low;
reading memory at 100mb/s is not such a problem as long as we don't
read gigabytes out of memory every second...

(We still want to optimize the billions-of-zeros case of Windows guests)

Anyway if the above does not make sense to you, I will just change it
into what you suggested

So ?

> 
> It would be better to account the deduplicated pages as part of the
> rate limiting calculations.
> 
> Regards,
> 
> Anthony Liguori
> 
> >   int ret;
> >
> >   ret = ram_save_block(f);
> > +data_read += ret * TARGET_PAGE_SIZE;
> >   bytes_transferred += ret * TARGET_PAGE_SIZE;
> >   if (ret == 0) { /* no more blocks */
> >   break;
> >
> 
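
For reference, a minimal sketch of the alternative suggested above: charge
every page that was read, duplicate or not, against the per-iteration budget.
This is an illustration and not the arch_init.c code. SKETCH_PAGE_SIZE,
is_dup_page_sketch() and save_pages_sketch() are hypothetical names, and a
duplicate page is assumed to cost one byte on the wire.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed; stands in for TARGET_PAGE_SIZE */

/* Simplified duplicate check: true when every byte equals the first one. */
static int is_dup_page_sketch(const uint8_t *page)
{
    for (size_t i = 1; i < SKETCH_PAGE_SIZE; i++) {
        if (page[i] != page[0]) {
            return 0;
        }
    }
    return 1;
}

/*
 * Walk up to npages pages and stop once budget bytes have been accounted.
 * Duplicate pages are cheap to send, but they are still charged a full
 * page because the host had to read them.
 * Returns the number of pages processed; *wire_bytes receives the bytes
 * that would actually be sent.
 */
static size_t save_pages_sketch(const uint8_t *ram, size_t npages,
                                uint64_t budget, uint64_t *wire_bytes)
{
    uint64_t accounted = 0;
    size_t done = 0;

    *wire_bytes = 0;
    while (done < npages && accounted < budget) {
        const uint8_t *page = ram + done * SKETCH_PAGE_SIZE;

        *wire_bytes += is_dup_page_sketch(page) ? 1 : SKETCH_PAGE_SIZE;
        accounted += SKETCH_PAGE_SIZE;   /* charge the read, not the send */
        done++;
    }
    return done;
}
```

With this accounting a long run of identical pages ends the iteration after
budget bytes of reads, even though almost nothing was sent.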




[Qemu-devel] Re: [SeaBIOS] [PATCH] smbios: avoid counting io hole as ram

2010-05-10 Thread Alex Williamson
On Mon, 2010-05-10 at 23:05 +0200, Sebastian Herbszt wrote:
> Kevin O'Connor wrote:
> > On Fri, May 07, 2010 at 01:38:55PM -0600, Alex Williamson wrote:
> >> Avoid counting the io hole as part of ram, a vm started with 4G
> >> should report 4G in smbios, not 4.5G.
> >> 
> >> Signed-off-by: Alex Williamson 
> > 
> > Looks okay to me.  If there are no other comments, I'll commit in the
> > next couple of days.
> > 
> > -Kevin
> 
> If I'm not mistaken, this seems to address what I reported back in January
> in the "wrong memsize in smbios_init()" thread [1], correct?
> 
> [1] http://www.seabios.org/pipermail/seabios/2010-January/000146.html

Yes, it should fix that.  I started out effectively where you did,
changing memsize to RamSize + RamSizeOver4G, but then quickly ran into
the problems Gleb pointed out with some tables wanting address ranges
and others wanting quantity.  With this patch tables 16, 17, 19, and 20
should make sense and match the spec.  Thanks,

Alex
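
To spell out the quantity versus address range distinction, here is a small
sketch. It assumes a guest started with 4G where 512M is remapped above 4G;
the 512M figure is inferred from the 4G vs. 4.5G numbers in the patch
description. MemLayoutSketch and both helpers are hypothetical names, not
SeaBIOS code.

```c
#include <stdint.h>

#define MiB (1024ull * 1024ull)
#define GiB (1024ull * MiB)

/* Hypothetical memory layout of one guest. */
typedef struct MemLayoutSketch {
    uint64_t ram_size;          /* RAM mapped below 4G (RamSize) */
    uint64_t ram_size_over_4g;  /* RAM remapped above 4G (RamSizeOver4G) */
} MemLayoutSketch;

/* Quantity of RAM: what a "size" field should report. */
static uint64_t ram_quantity(const MemLayoutSketch *m)
{
    return m->ram_size + m->ram_size_over_4g;
}

/* End of the highest RAM address range; this includes the io hole. */
static uint64_t ram_end_address(const MemLayoutSketch *m)
{
    return m->ram_size_over_4g ? 4 * GiB + m->ram_size_over_4g
                               : m->ram_size;
}
```

For ram_size = 3584M and ram_size_over_4g = 512M the quantity is 4G while the
last address is 4.5G, which is why tables that want a size and tables that
want a range cannot share one value.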





[Qemu-devel] Re: [SeaBIOS] [PATCH] smbios: avoid counting io hole as ram

2010-05-10 Thread Sebastian Herbszt

Kevin O'Connor wrote:
> On Fri, May 07, 2010 at 01:38:55PM -0600, Alex Williamson wrote:
>> Avoid counting the io hole as part of ram, a vm started with 4G
>> should report 4G in smbios, not 4.5G.
>>
>> Signed-off-by: Alex Williamson
>
> Looks okay to me.  If there are no other comments, I'll commit in the
> next couple of days.
>
> -Kevin


If I'm not mistaken, this seems to address what I reported back in January
in the "wrong memsize in smbios_init()" thread [1], correct?

[1] http://www.seabios.org/pipermail/seabios/2010-January/000146.html

Sebastian




[Qemu-devel] Re: [PATCH 1/2] Pad iommu with an empty slot (necessary for SunOS 4.1.4)

2010-05-10 Thread Blue Swirl
On 5/10/10, Artyom Tarasenko  wrote:
> 2010/5/10 Blue Swirl :
>
> > On 5/10/10, Artyom Tarasenko  wrote:
>  >> 2010/5/9 Blue Swirl :
>  >>  > On 5/9/10, Artyom Tarasenko  wrote:
>  >>  >> 2010/5/9 Blue Swirl :
>  >>  >>
>  >>  >> > On 5/8/10, Artyom Tarasenko  wrote:
>  >>  >>  >> On the real hardware (SS-5, LX) the MMU is not padded, but 
> aliased.
>  >>  >>  >>  Software shouldn't use aliased addresses, neither should it crash
>  >>  >>  >>  when it uses (on the real hardware it wouldn't). Using empty_slot
>  >>  >>  >>  instead of aliasing can help with debugging such accesses.
>  >>  >>  >
>  >>  >>  > TurboSPARC Microprocessor User's Manual shows that there are
>  >>  >>  > additional pages after the main IOMMU for AFX registers. So this is
>  >>  >>  > not board specific, but depends on CPU/IOMMU versions.
>  >>  >>
>  >>  >>
>  >>  >> I checked it on the real hw: on LX and SS-5 these are aliased MMU 
> addresses.
>  >>  >>  SS-20 doesn't have any aliasing.
>  >>  >
>  >>  > But are your machines equipped with TurboSPARC or some other CPU?
>  >>
>  >>
>  >> Good point, I must confess, I missed the word "Turbo" in your first
>  >>  answer. LX and SS-20 don't.
>  >>  But SS-5 must have a TurboSPARC CPU:
>  >>
>  >>  ok cd /FMI,MB86904
>  >>  ok .attributes
>  >>  context-table00 00 00 00 03 ff f0 00 00 00 10 00
>  >>  psr-implementation   
>  >>  psr-version  0004
>  >>  implementation   
>  >>  version  0004
>  >>  cache-line-size  0020
>  >>  cache-nlines 0200
>  >>  page-size1000
>  >>  dcache-line-size 0010
>  >>  dcache-nlines0200
>  >>  dcache-associativity 0001
>  >>  icache-line-size 0020
>  >>  icache-nlines0200
>  >>  icache-associativity 0001
>  >>  ncaches  0002
>  >>  mmu-nctx 0100
>  >>  sparc-version0008
>  >>  mask_rev 0026
>  >>  device_type  cpu
>  >>  name FMI,MB86904
>  >>
>  >>  and still it behaves the same as TI,TMS390S10 from the LX. This is done 
> on SS-5:
>  >>
>  >>  ok 1000 20 spacel@ .
>  >>  409
>  >>  ok 1400 20 spacel@ .
>  >>  409
>  >>  ok 1404 20 spacel@ .
>  >>  23000
>  >>  ok 1f04 20 spacel@ .
>  >>  23000
>  >>  ok 1008 20 spacel@ .
>  >>  409
>  >>  ok 1428 20 spacel@ .
>  >>  409
>  >>  ok 100c 20 spacel@ .
>  >>  23000
>  >>  ok 1010 20 spacel@ .
>  >>  409
>  >>
>  >>
>  >>  LX is the same except for the IOMMU-version:
>  >>
>  >>  ok 1000 20 spacel@ .
>  >>  405
>  >>  ok 1400 20 spacel@ .
>  >>  405
>  >>  ok 1800 20 spacel@ .
>  >>  405
>  >>  ok 1f00 20 spacel@ .
>  >>  405
>  >>  ok 1ff0 20 spacel@ .
>  >>  405
>  >>  ok 1fff0004 20 spacel@ .
>  >>  1fe000
>  >>  ok 1004 20 spacel@ .
>  >>  1fe000
>  >>  ok 1108 20 spacel@ .
>  >>  4105
>  >>  ok 1040 20 spacel@ .
>  >>  4105
>  >>  ok 1fff0040 20 spacel@ .
>  >>  4105
>  >>  ok 1fff0044 20 spacel@ .
>  >>  1fe000
>  >>  ok 1fff0024 20 spacel@ .
>  >>  1fe000
>  >>
>  >>
>  >>  >>  At what address the additional AFX registers are located?
>  >>  >
>  >>  > Here's complete TurboSPARC IOMMU address map:
>  >>  >  PA[30:0]  Register  Access
>  >>  > 1000_   IOMMU Control R/W
>  >>  > 1000_0004IOMMU Base Address   R/W
>  >>  > 1000_0014   Flush All IOTLB EntriesW
>  >>  > 1000_0018Address Flush W
>  >>  > 1000_1000  Asynchronous Fault Status  R/W
>  >>  > 1000_1004 Asynchronous Fault Address  R/W
>  >>  > 1000_1010  SBus Slot Configuration 0   R/W
>  >>  > 1000_1014  SBus Slot Configuration 1   R/W
>  >>  > 1000_1018  SBus Slot Configuration 2   R/W
>  >>  > 1000_101C  SBus Slot Configuration 3   R/W
>  >>  > 1000_1020  SBus Slot Configuration 4   R/W
>  >>  > 1000_1050 Memory Fault Status R/W
>  >>  > 1000_1054Memory Fault Address R/W
>  >>  > 1000_2000 Module IdentificationR/W
>  >>  > 1000_3018  Mask Identification  R
>  >>  > 1000_4000  AFX Queue Level W
>  >>  > 1000_6000  AFX Queue Level R
>  >>  > 1000_7000  AFX Queue StatusR
>  >>
>  >>
>  >>
>  >> But if I read it correctly 0x12fff294 (which makes SunOS crash with -m 
> 32) is
>  >>  well above this limit.
>  >
>  > Oh, so I also misread something. You are not talking about the
>  > adjacent pages, but 16MB increments.
>  >
>  > Earlier I sent a patch for a generic address alias device, would it be
>  > useful for this?
>
>
> Should do as well. But I thought empty_slot is less overhead and
>  easier to debug.
>
>
>  > Maybe we have a general design problem, perhaps unassigned access
>  > faults should only be triggered inside SBus slots and ignored
>  > elsewhere. If this is true, generic Sparc32 unassigned access han

[Qemu-devel] [PATCH 1/2] acpi: remove static pm_state

2010-05-10 Thread Blue Swirl
Signed-off-by: Blue Swirl 
---
 hw/acpi.c |8 +++-
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/hw/acpi.c b/hw/acpi.c
index e3b63b7..bb2974e 100644
--- a/hw/acpi.c
+++ b/hw/acpi.c
@@ -76,8 +76,6 @@ typedef struct PIIX4PMState {
 #define SMBHSTDAT1 0x06
 #define SMBBLKDAT 0x07

-static PIIX4PMState *pm_state;
-
 static uint32_t get_pmtmr(PIIX4PMState *s)
 {
 uint32_t d;
@@ -509,7 +507,6 @@ i2c_bus *piix4_pm_init(PCIBus *bus, int devfn,
uint32_t smb_io_base,
 s = (PIIX4PMState *)pci_register_device(bus,
  "PM", sizeof(PIIX4PMState),
  devfn, NULL, pm_write_config);
-pm_state = s;
 pci_conf = s->dev.config;
 pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL);
 pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_INTEL_82371AB_3);
@@ -747,6 +744,7 @@ static void disable_device(struct pci_status *p,
struct gpe_regs *g, int slot)

 static int piix4_device_hotplug(PCIDevice *dev, int state)
 {
+PIIX4PMState *s = container_of(dev, PIIX4PMState, dev);
 int slot = PCI_SLOT(dev->devfn);

 pci0_status.up = 0;
@@ -756,8 +754,8 @@ static int piix4_device_hotplug(PCIDevice *dev, int state)
 else
 disable_device(&pci0_status, &gpe, slot);
 if (gpe.en & 2) {
-qemu_set_irq(pm_state->irq, 1);
-qemu_set_irq(pm_state->irq, 0);
+qemu_set_irq(s->irq, 1);
+qemu_set_irq(s->irq, 0);
 }
 return 0;
 }
-- 
1.6.2.4
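
For readers unfamiliar with the idiom, a minimal sketch of container_of,
which the patch uses in place of the static pointer. The macro below is a
portable variant built on offsetof; DevSketch and PMStateSketch are
hypothetical stand-ins for PCIDevice and PIIX4PMState.

```c
#include <stddef.h>

/* Recover the enclosing struct from a pointer to one of its members. */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for PCIDevice; the field is illustrative only. */
typedef struct DevSketch {
    int devfn;
} DevSketch;

/* Stand-in for PIIX4PMState, which embeds its device as a member. */
typedef struct PMStateSketch {
    DevSketch dev;
    int irq_level;
} PMStateSketch;

/* Raise the "irq" of the PM state that embeds pm_dev. */
static void pm_raise_irq(DevSketch *pm_dev)
{
    PMStateSketch *s = container_of_sketch(pm_dev, PMStateSketch, dev);

    s->irq_level = 1;
}
```

The result is only meaningful when the pointer really refers to the member
embedded in that struct type; handing it some other device yields a bogus
state pointer.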



[Qemu-devel] Re: [PATCH 1/2] Pad iommu with an empty slot (necessary for SunOS 4.1.4)

2010-05-10 Thread Artyom Tarasenko
2010/5/10 Blue Swirl :
> On 5/10/10, Artyom Tarasenko  wrote:
>> 2010/5/9 Blue Swirl :
>>  > On 5/9/10, Artyom Tarasenko  wrote:
>>  >> 2010/5/9 Blue Swirl :
>>  >>
>>  >> > On 5/8/10, Artyom Tarasenko  wrote:
>>  >>  >> On the real hardware (SS-5, LX) the MMU is not padded, but aliased.
>>  >>  >>  Software shouldn't use aliased addresses, neither should it crash
>>  >>  >>  when it uses (on the real hardware it wouldn't). Using empty_slot
>>  >>  >>  instead of aliasing can help with debugging such accesses.
>>  >>  >
>>  >>  > TurboSPARC Microprocessor User's Manual shows that there are
>>  >>  > additional pages after the main IOMMU for AFX registers. So this is
>>  >>  > not board specific, but depends on CPU/IOMMU versions.
>>  >>
>>  >>
>>  >> I checked it on the real hw: on LX and SS-5 these are aliased MMU 
>> addresses.
>>  >>  SS-20 doesn't have any aliasing.
>>  >
>>  > But are your machines equipped with TurboSPARC or some other CPU?
>>
>>
>> Good point, I must confess, I missed the word "Turbo" in your first
>>  answer. LX and SS-20 don't.
>>  But SS-5 must have a TurboSPARC CPU:
>>
>>  ok cd /FMI,MB86904
>>  ok .attributes
>>  context-table            00 00 00 00 03 ff f0 00 00 00 10 00
>>  psr-implementation       
>>  psr-version              0004
>>  implementation           
>>  version                  0004
>>  cache-line-size          0020
>>  cache-nlines             0200
>>  page-size                1000
>>  dcache-line-size         0010
>>  dcache-nlines            0200
>>  dcache-associativity     0001
>>  icache-line-size         0020
>>  icache-nlines            0200
>>  icache-associativity     0001
>>  ncaches                  0002
>>  mmu-nctx                 0100
>>  sparc-version            0008
>>  mask_rev                 0026
>>  device_type              cpu
>>  name                     FMI,MB86904
>>
>>  and still it behaves the same as TI,TMS390S10 from the LX. This is done on 
>> SS-5:
>>
>>  ok 1000 20 spacel@ .
>>  409
>>  ok 1400 20 spacel@ .
>>  409
>>  ok 1404 20 spacel@ .
>>  23000
>>  ok 1f04 20 spacel@ .
>>  23000
>>  ok 1008 20 spacel@ .
>>  409
>>  ok 1428 20 spacel@ .
>>  409
>>  ok 100c 20 spacel@ .
>>  23000
>>  ok 1010 20 spacel@ .
>>  409
>>
>>
>>  LX is the same except for the IOMMU-version:
>>
>>  ok 1000 20 spacel@ .
>>  405
>>  ok 1400 20 spacel@ .
>>  405
>>  ok 1800 20 spacel@ .
>>  405
>>  ok 1f00 20 spacel@ .
>>  405
>>  ok 1ff0 20 spacel@ .
>>  405
>>  ok 1fff0004 20 spacel@ .
>>  1fe000
>>  ok 1004 20 spacel@ .
>>  1fe000
>>  ok 1108 20 spacel@ .
>>  4105
>>  ok 1040 20 spacel@ .
>>  4105
>>  ok 1fff0040 20 spacel@ .
>>  4105
>>  ok 1fff0044 20 spacel@ .
>>  1fe000
>>  ok 1fff0024 20 spacel@ .
>>  1fe000
>>
>>
>>  >>  At what address the additional AFX registers are located?
>>  >
>>  > Here's complete TurboSPARC IOMMU address map:
>>  >  PA[30:0]          Register          Access
>>  > 1000_       IOMMU Control         R/W
>>  > 1000_0004    IOMMU Base Address       R/W
>>  > 1000_0014   Flush All IOTLB Entries    W
>>  > 1000_0018        Address Flush         W
>>  > 1000_1000  Asynchronous Fault Status  R/W
>>  > 1000_1004 Asynchronous Fault Address  R/W
>>  > 1000_1010  SBus Slot Configuration 0   R/W
>>  > 1000_1014  SBus Slot Configuration 1   R/W
>>  > 1000_1018  SBus Slot Configuration 2   R/W
>>  > 1000_101C  SBus Slot Configuration 3   R/W
>>  > 1000_1020  SBus Slot Configuration 4   R/W
>>  > 1000_1050     Memory Fault Status     R/W
>>  > 1000_1054    Memory Fault Address     R/W
>>  > 1000_2000     Module Identification    R/W
>>  > 1000_3018      Mask Identification      R
>>  > 1000_4000      AFX Queue Level         W
>>  > 1000_6000      AFX Queue Level         R
>>  > 1000_7000      AFX Queue Status        R
>>
>>
>>
>> But if I read it correctly 0x12fff294 (which makes SunOS crash with -m 32) is
>>  well above this limit.
>
> Oh, so I also misread something. You are not talking about the
> adjacent pages, but 16MB increments.
>
> Earlier I sent a patch for a generic address alias device, would it be
> useful for this?

Should do as well. But I thought empty_slot is less overhead and
easier to debug.

> Maybe we have a general design problem, perhaps unassigned access
> faults should only be triggered inside SBus slots and ignored
> elsewhere. If this is true, generic Sparc32 unassigned access handler
> should just ignore the access and special fault generating slots
> should be installed for empty SBus address ranges.

My impression was that SS-5 and SS-20 do unassigned accesses a bit differently.
The current IOMMU implementation fits SS-20, which has no aliasing.

>>  >>  > One approach would be that IOMMU_NREGS would be increased to cover
>>  >>  > these registers (with the bump in savevm version field) and
>>  >>  > iommu_init1() should c

[Qemu-devel] [PATCH 2/2] acpi: remove static gpe and pci0_status variables

2010-05-10 Thread Blue Swirl
Make gpe and pci0_status fields in PIIX4PMState.

Signed-off-by: Blue Swirl 
---
 hw/acpi.c |   93 +---
 hw/pc.c   |1 -
 hw/pc.h   |1 -
 hw/pci.c  |   12 +--
 hw/pci.h  |6 ++-
 5 files changed, 63 insertions(+), 50 deletions(-)

diff --git a/hw/acpi.c b/hw/acpi.c
index bb2974e..6db1a12 100644
--- a/hw/acpi.c
+++ b/hw/acpi.c
@@ -30,6 +30,20 @@

 #define ACPI_DBG_IO_ADDR  0xb044

+#define GPE_BASE 0xafe0
+#define PCI_BASE 0xae00
+#define PCI_EJ_BASE 0xae08
+
+struct gpe_regs {
+uint16_t sts; /* status */
+uint16_t en;  /* enabled */
+};
+
+struct pci_status {
+uint32_t up;
+uint32_t down;
+};
+
 typedef struct PIIX4PMState {
 PCIDevice dev;
 uint16_t pmsts;
@@ -52,6 +66,8 @@ typedef struct PIIX4PMState {
 qemu_irq cmos_s3;
 qemu_irq smi_irq;
 int kvm_enabled;
+struct gpe_regs gpe;
+struct pci_status pci0_status;
 } PIIX4PMState;

 #define RSM_STS (1 << 15)
@@ -497,16 +513,19 @@ static void piix4_powerdown(void *opaque, int
irq, int power_failing)
 }
 }

+static void piix4_acpi_system_hot_add_init(PCIBus *bus, PCIDevice
*hotplug_dev);
+
 i2c_bus *piix4_pm_init(PCIBus *bus, int devfn, uint32_t smb_io_base,
qemu_irq sci_irq, qemu_irq cmos_s3, qemu_irq smi_irq,
int kvm_enabled)
 {
 PIIX4PMState *s;
+PCIDevice *d;
 uint8_t *pci_conf;

-s = (PIIX4PMState *)pci_register_device(bus,
- "PM", sizeof(PIIX4PMState),
- devfn, NULL, pm_write_config);
+d = pci_register_device(bus, "PM", sizeof(PIIX4PMState), devfn, NULL,
+pm_write_config);
+s = container_of(d, PIIX4PMState, dev);
 pci_conf = s->dev.config;
 pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL);
 pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_INTEL_82371AB_3);
@@ -557,26 +576,11 @@ i2c_bus *piix4_pm_init(PCIBus *bus, int devfn,
uint32_t smb_io_base,
 s->smi_irq = smi_irq;
 qemu_register_reset(piix4_reset, s);

+piix4_acpi_system_hot_add_init(bus, d);
+
 return s->smbus;
 }

-#define GPE_BASE 0xafe0
-#define PCI_BASE 0xae00
-#define PCI_EJ_BASE 0xae08
-
-struct gpe_regs {
-uint16_t sts; /* status */
-uint16_t en;  /* enabled */
-};
-
-struct pci_status {
-uint32_t up;
-uint32_t down;
-};
-
-static struct gpe_regs gpe;
-static struct pci_status pci0_status;
-
 static uint32_t gpe_read_val(uint16_t val, uint32_t addr)
 {
 if (addr & 1)
@@ -714,46 +718,51 @@ static void pciej_write(void *opaque, uint32_t
addr, uint32_t val)
 #endif
 }

-static int piix4_device_hotplug(PCIDevice *dev, int state);
+static int piix4_device_hotplug(PCIDevice *hotplug_dev, PCIDevice *dev,
+int state);

-void piix4_acpi_system_hot_add_init(PCIBus *bus)
+static void piix4_acpi_system_hot_add_init(PCIBus *bus, PCIDevice *hotplug_dev)
 {
-register_ioport_write(GPE_BASE, 4, 1, gpe_writeb, &gpe);
-register_ioport_read(GPE_BASE, 4, 1,  gpe_readb, &gpe);
+PIIX4PMState *s = container_of(hotplug_dev, PIIX4PMState, dev);

-register_ioport_write(PCI_BASE, 8, 4, pcihotplug_write, &pci0_status);
-register_ioport_read(PCI_BASE, 8, 4,  pcihotplug_read, &pci0_status);
+register_ioport_write(GPE_BASE, 4, 1, gpe_writeb, s);
+register_ioport_read(GPE_BASE, 4, 1,  gpe_readb, s);
+
+register_ioport_write(PCI_BASE, 8, 4, pcihotplug_write, s);
+register_ioport_read(PCI_BASE, 8, 4,  pcihotplug_read, s);

 register_ioport_write(PCI_EJ_BASE, 4, 4, pciej_write, bus);
 register_ioport_read(PCI_EJ_BASE, 4, 4,  pciej_read, bus);

-pci_bus_hotplug(bus, piix4_device_hotplug);
+pci_bus_hotplug(bus, piix4_device_hotplug, hotplug_dev);
 }

-static void enable_device(struct pci_status *p, struct gpe_regs *g, int slot)
+static void enable_device(PIIX4PMState *s, int slot)
 {
-g->sts |= 2;
-p->up |= (1 << slot);
+s->gpe.sts |= 2;
+s->pci0_status.up |= (1 << slot);
 }

-static void disable_device(struct pci_status *p, struct gpe_regs *g, int slot)
+static void disable_device(PIIX4PMState *s, int slot)
 {
-g->sts |= 2;
-p->down |= (1 << slot);
+s->gpe.sts |= 2;
+s->pci0_status.down |= (1 << slot);
 }

-static int piix4_device_hotplug(PCIDevice *dev, int state)
+static int piix4_device_hotplug(PCIDevice *hotplug_dev, PCIDevice *dev,
+int state)
 {
-PIIX4PMState *s = container_of(dev, PIIX4PMState, dev);
+PIIX4PMState *s = container_of(hotplug_dev, PIIX4PMState, dev);
 int slot = PCI_SLOT(dev->devfn);

-pci0_status.up = 0;
-pci0_status.down = 0;
-if (state)
-enable_device(&pci0_status, &gpe, slot);
-else
-disable_device(&pci0_status, &gpe, slot);
-if (gpe.en & 2) {
+s->pci0_status.up = 0;
+s->pci0_status.down = 0;
+if (state) {
+enable_device(s, slot);
+} else {
+

[Qemu-devel] Re: [PATCH] vdi: Fix image opening and creation for odd disk sizes

2010-05-10 Thread François Revol
On Mon, 10 May 2010 22:12:33 +0200, Stefan Weil wrote:
> The fix is based on a patch from Kevin Wolf. Here is his comment:
>
> "The number of blocks needs to be rounded up to cover all of the virtual
> hard disk. Without this fix, we can't even open our own images if their
> size is not a multiple of the block size."
>
> While Kevin's patch addressed vdi_create, my modification also fixes
> vdi_open which now accepts images with odd disk sizes as well as
> images created with old versions of qemu-img.
>
> Cc: Kevin Wolf 
> Cc: François Revol 

Looks good to me on first read.

François.
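
The rounding described in the quoted comment can be sketched as follows.
This is an illustration, not the vdi.c code: blocks_for_size() is a
hypothetical helper, and it assumes disk_size + block_size does not overflow.

```c
#include <stdint.h>

/* Number of blocks needed to cover disk_size bytes, rounding up. */
static uint64_t blocks_for_size(uint64_t disk_size, uint64_t block_size)
{
    return (disk_size + block_size - 1) / block_size;
}
```

Plain division would drop the partial last block, so an image whose size is
not a multiple of the block size would end up with too few blocks.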



[Qemu-devel] [PATCH 3/6] virtio-9p: modify create/open2 and mkdir for new security model.

2010-05-10 Thread Venkateswararao Jujjuri (JV)
Add required infrastructure and modify create/open2 and mkdir per the new
security model.

Signed-off-by: Venkateswararao Jujjuri 
---
 hw/file-op-9p.h  |   23 +-
 hw/virtio-9p-local.c |  117 +
 hw/virtio-9p.c   |   42 --
 3 files changed, 129 insertions(+), 53 deletions(-)

diff --git a/hw/file-op-9p.h b/hw/file-op-9p.h
index f84767f..1eceeb2 100644
--- a/hw/file-op-9p.h
+++ b/hw/file-op-9p.h
@@ -18,13 +18,32 @@
 #include 
 #include 
 #include 
+#define SM_LOCAL_MODE_BITS0600
+
+typedef enum
+{
+sm_none = 0,
+sm_passthrough, /* uid/gid set on fileserver files */
+sm_mapped,  /* uid/gid part of xattr */
+} SecModel;
+
+typedef struct FsCred
+{
+uid_t   fc_uid;
+gid_t   fc_gid;
+mode_t  fc_mode;
+dev_t   fc_rdev;
+} FsCred;
 
 typedef struct FsContext
 {
 char *fs_root;
+SecModel fs_sm;
 uid_t uid;
 } FsContext;
 
+extern void cred_init(FsCred *);
+
 typedef struct FileOperations
 {
 int (*lstat)(FsContext *, const char *, struct stat *);
@@ -42,7 +61,7 @@ typedef struct FileOperations
 int (*closedir)(FsContext *, DIR *);
 DIR *(*opendir)(FsContext *, const char *);
 int (*open)(FsContext *, const char *, int);
-int (*open2)(FsContext *, const char *, int, mode_t);
+int (*open2)(FsContext *, const char *, int, FsCred *);
 void (*rewinddir)(FsContext *, DIR *);
 off_t (*telldir)(FsContext *, DIR *);
 struct dirent *(*readdir)(FsContext *, DIR *);
@@ -50,7 +69,7 @@ typedef struct FileOperations
 ssize_t (*readv)(FsContext *, int, const struct iovec *, int);
 ssize_t (*writev)(FsContext *, int, const struct iovec *, int);
 off_t (*lseek)(FsContext *, int, off_t, int);
-int (*mkdir)(FsContext *, const char *, mode_t);
+int (*mkdir)(FsContext *, const char *, FsCred *);
 int (*fstat)(FsContext *, int, struct stat *);
 int (*rename)(FsContext *, const char *, const char *);
 int (*truncate)(FsContext *, const char *, off_t);
diff --git a/hw/virtio-9p-local.c b/hw/virtio-9p-local.c
index 1afb731..8ed8c66 100644
--- a/hw/virtio-9p-local.c
+++ b/hw/virtio-9p-local.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static const char *rpath(FsContext *ctx, const char *path)
 {
@@ -31,47 +32,49 @@ static int local_lstat(FsContext *ctx, const char *path, 
struct stat *stbuf)
 return lstat(rpath(ctx, path), stbuf);
 }
 
-static int local_setuid(FsContext *ctx, uid_t uid)
+static void local_set_cred(FsCred *credp)
 {
-struct passwd *pw;
-gid_t groups[33];
-int ngroups;
-static uid_t cur_uid = -1;
-
-if (cur_uid == uid) {
-return 0;
+if (credp->fc_uid != -1) {
+setuid(credp->fc_uid);
 }
-
-if (setreuid(0, 0)) {
-return -1;
+if (credp->fc_gid != -1) {
+setgid(credp->fc_gid);
 }
+}
 
-pw = getpwuid(uid);
-if (pw == NULL) {
-return -1;
+static int local_set_xattr(const char *path, FsCred *credp)
+{
+int err;
+if (credp->fc_uid != -1) {
+err = setxattr(path, "user.virtfs.uid", &credp->fc_uid, sizeof(uid_t),
+XATTR_CREATE);
+if (err) {
+return err;
+}
 }
-
-ngroups = 33;
-if (getgrouplist(pw->pw_name, pw->pw_gid, groups, &ngroups) == -1) {
-return -1;
+if (credp->fc_gid != -1) {
+err = setxattr(path, "user.virtfs.gid", &credp->fc_gid, sizeof(gid_t),
+XATTR_CREATE);
+if (err) {
+return err;
+}
 }
-
-if (setgroups(ngroups, groups)) {
-return -1;
+if (credp->fc_mode != -1) {
+err = setxattr(path, "user.virtfs.mode", &credp->fc_mode,
+sizeof(mode_t), XATTR_CREATE);
+if (err) {
+return err;
+}
 }
-
-if (setregid(-1, pw->pw_gid)) {
-return -1;
-}
-
-if (setreuid(-1, uid)) {
-return -1;
+if (credp->fc_rdev != -1) {
+err = setxattr(path, "user.virtfs.rdev", &credp->fc_rdev,
+sizeof(dev_t), XATTR_CREATE);
+if (err) {
+return err;
+}
 }
-
-cur_uid = uid;
-
-return 0;
-}
+ return 0;
+ }
 
 static ssize_t local_readlink(FsContext *ctx, const char *path,
 char *buf, size_t bufsz)
@@ -168,9 +171,26 @@ static int local_mksock(FsContext *ctx2, const char *path)
 return 0;
 }
 
-static int local_mkdir(FsContext *ctx, const char *path, mode_t mode)
+static int local_mkdir(FsContext *fs_ctx, const char *path, FsCred *credp)
 {
-return mkdir(rpath(ctx, path), mode);
+int err = -1;
+/* Determine the security model */
+if (fs_ctx->fs_sm == sm_mapped) {
+err = mkdir(rpath(fs_ctx, path), SM_LOCAL_MODE_BITS);
+if (err == -1) {
+return err;
+}
+credp->fc_mode = credp->fc_mode|S_IFDIR;
+err = local_set_xattr(rpath(fs_ctx, path), credp);
+if (e

Re: [Qemu-devel] [PATCH 2/3] dmg: use pread

2010-05-10 Thread Christoph Hellwig
On Mon, May 10, 2010 at 12:07:40PM +0200, Kevin Wolf wrote:
> >  
> > -info_begin=read_off(s->fd);
> > -if(info_begin==0)
> > -   goto fail;
> > -if(lseek(s->fd,info_begin,SEEK_SET)<0)
> > -   goto fail;
> 
> We seek to info_begin.
> 
> > -if(read_uint32(s->fd)!=0x100)
> > -   goto fail;
> 
> Now we are at info_begin + 4
> 
> > -if((count = read_uint32(s->fd))==0)
> > -   goto fail;
> 
> info_begin + 8
> 
> > -info_end = info_begin+count;
> > -if(lseek(s->fd,0xf8,SEEK_CUR)<0)
> 
> info_begin + 0x100
> 
> > +info_begin = read_off(s->fd, offset);
> > +if (info_begin == 0) {
> > goto fail;
> > +}
> > +
> > +if (read_uint32(s->fd, info_begin) != 0x100) {
> > +goto fail;
> > +}
> > +
> > +count = read_uint32(s->fd, info_begin + 4);
> > +if (count == 0) {
> > +goto fail;
> > +}
> > +info_end = info_begin + count;
> > +
> > +offset = info_begin + 0xfc;
> 
> So, wrong offset here?

Yeah, should be 0x100.  That's what you get for quickly doing hex
calculation in your head.

> > +   if (type == 0x6d697368 && count >= 244) {
> > int new_size, chunk_count;
> > -   if(lseek(s->fd,200,SEEK_CUR)<0)
> > -   goto fail;
> > +
> > +offset += 4;
> 
> Isn't this needed in the else case, too?

I don't think so.  For that case we previously did a

lseek(s->fd,count-4,SEEK_CUR)

to undo the 4 byte advance done by the read.

> > -   s->sectors[i] = last_out_offset+read_off(s->fd);
> > -   s->sectorcounts[i] = read_off(s->fd);
> > -   s->offsets[i] = last_in_offset+read_off(s->fd);
> > -   s->lengths[i] = read_off(s->fd);
> > +   read_uint32(s->fd, offset);
> 
> This read is useless. offset += 4 alone should be enough.

Thanks, fixed.

> > /* we need to buffer, because only the chunk as whole can be
> >  * inflated. */
> > i=0;
> > do {
> > -   ret = read(s->fd, s->compressed_chunk+i, s->lengths[chunk]-i);
> > +   ret = pread(s->fd, s->compressed_chunk+i, s->lengths[chunk]-i,
> > +s->offsets[chunk]);
> 
> This is in a loop, whereas the lseek was outside the loop. From the
> second iteration on you'll repeat the first read instead of advancing.

You're right.  The EINTR check confused me and I took this for just
retrying reads on EINTR.  Now this code is quite nasty for error returns
other than EINTR because we'll subtract one from the i loop iteration,
yikes.  I'll just reuse the i variable to keep the same kind of bug
for both sides of the equation.

God, do I hate this code..
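
For comparison, a minimal sketch of a pread loop that advances the file
offset on every iteration and retries only on EINTR. This is not the dmg.c
code; pread_full() is a hypothetical helper that uses only POSIX pread().

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly len bytes starting at file offset off.
 * Returns 0 on success, -1 on error or premature end of file.
 */
static int pread_full(int fd, void *buf, size_t len, off_t off)
{
    size_t done = 0;

    while (done < len) {
        ssize_t ret = pread(fd, (char *)buf + done, len - done,
                            off + (off_t)done);
        if (ret < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted: retry at the same position */
            }
            return -1;
        }
        if (ret == 0) {
            return -1;      /* end of file before len bytes arrived */
        }
        done += (size_t)ret;
    }
    return 0;
}
```

Since pread does not move the file position, both the buffer pointer and the
offset have to advance by the number of bytes already read.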



[Qemu-devel] [PATCH 0/6] virtio-9p:Introducing security model for VirtFS

2010-05-10 Thread Venkateswararao Jujjuri (JV)
This patch series introduces the security model for VirtFS.

Brief description of this patch series.

It introduces two types of security models for VirtFS.
They are: mapped and passthrough.

The following is common to both security models.

* Client's VFS determines/enforces the access control.
  Largely server should never return EACCES.

* Client sends gid/mode-bit information as part of creation only.


Security model: mapped
----------------------

VirtFS server(QEMU) intercepts and maps all the file object create requests.
Files on the fileserver will be created with QEMU's user credentials and the
client-user's credentials are stored in extended attributes.
During getattr() server extracts the client-user's credentials from extended
attributes and sends to the client.

Given that only the user space extended attributes are available to regular
files, special files are created as regular files on the fileserver and the
appropriate mode bits are stored in xattrs and will be extracted during
getattr.

If the extended attributes are missing, server sends back the filesystem
stat() unaltered. This provision will make the files created on the
fileserver usable to client.
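
The getattr behaviour described above can be sketched roughly as follows.
This is an illustration, not the virtio-9p-local.c code: StoredCredSketch,
AttrSketch and mapped_getattr() are hypothetical names, and the per-attribute
fallback is an assumption.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical view of the credentials found in one file's xattrs. */
typedef struct StoredCredSketch {
    bool has_uid, has_gid, has_mode;
    uid_t uid;
    gid_t gid;
    mode_t mode;
} StoredCredSketch;

/* Hypothetical subset of struct stat that is reported to the client. */
typedef struct AttrSketch {
    uid_t uid;
    gid_t gid;
    mode_t mode;
} AttrSketch;

/*
 * Override the host attributes with the stored client credentials where
 * present; anything without a stored value passes through unaltered.
 */
static void mapped_getattr(AttrSketch *attr, const StoredCredSketch *xattr)
{
    if (xattr->has_uid) {
        attr->uid = xattr->uid;
    }
    if (xattr->has_gid) {
        attr->gid = xattr->gid;
    }
    if (xattr->has_mode) {
        attr->mode = xattr->mode;
    }
}
```

A file created directly on the fileserver carries no such xattrs, so the
client simply sees the host's own stat() values for it.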

Points to be considered

* Filesystem will be VirtFS'ized. Meaning, other filesystems may not
  understand the credentials of the files created under this model.

* Regular utilities like 'df' may not report required results in this model.
  Need for special reporting utilities which can understand this security model.


Security model: passthrough
---------------------------

In this security model, VirtFS server passes down all requests to the
underlying filesystem. File system objects on the fileserver will be created
with client-user's credentials. This is done by calling setuid()/setgid()
during creation or ch* after file creation. At the end of create protocol
request, files on the fileserver will be owned by client-user's uid/gid.

Points to be considered

   * Fileserver should always run as 'root'.
   * Root squashing may be needed. Will be for future work.
   * Potential for user credential clash between guest's user space IDs and
 host's user space IDs.

It also adds security model attribute to -fsdev device and to -virtfs shortcut.

Usage model:
-fsdev local,id=jvrao,path=/tmp/,security_model=mapped
-virtfs local,path=/tmp/,security_model=passthrough,mnt_tag=v_tmp.

--
Signed-off-by: Venkateswararao Jujjuri 





[Qemu-devel] Re: [PATCH 0/1] [RFC][AHCI] add cdrom support for ahci.

2010-05-10 Thread Alexander Graf
Hi Chong,

On 10.05.2010, at 13:55, QiaoChong wrote:

> When ahci initializes, the driver sends the ATA_SRST command and the ahci
> device reports its device type through the port's sig register.
> The ahci disk lookup changes from IF_SD to IF_SCSI now, because IF_SD does
> not support cdrom media.
> I just copied ide_atapi_cmd from hw/ide/core.c into hw/ahci.c and changed it
> a little; the cdrom can then be identified and read by the OS.
> If qemu can change dma_buf_prepare, dma_buf_rw and dma_buf_commit into
> function pointers in BMDMAState, then I can rewrite the three functions to
> support ahci's prdt, because it is different from ide's.
>
> test a sata disk like this:
> ./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive if=scsi,file=/tmp/disk
> test a sata cd like this:
> ./i386-softmmu/qemu -cdrom KNOPPIX_V6.0.1CD-2009-02-08-EN.iso -drive if=scsi,media=cdrom,file=KNOPPIX_V6.0.1CD-2009-02-08-EN.iso

Thanks for improving the patch, but I have some nitpicks on how to proceed
here.

For starters, this patch is incremental to the previous one. Since the
previous patch did not get applied to qemu, it doesn't make sense to send an
incremental patch. Please send the full patchset, but bump up the version in
that case. You will find many examples of that on the mailing list. In most
cases it also makes sense to rethink the splitting between patches.

I also gave you several comments and suggestions on the previous patch which
I didn't see addressed. I think it's wrong to deal with the ATA commands in
ahci-specific code. The only viable way to make this code usable for upstream
is to reuse the IDE command processor. If there's anything missing in the
abstraction layers, just change the abstraction in an early patch in your
patchset.

Alex




[Qemu-devel] [PATCH 5/6] virtio-9p: Implemented security model for symlink and link.

2010-05-10 Thread Venkateswararao Jujjuri (JV)
Signed-off-by: Venkateswararao Jujjuri 
---
 hw/file-op-9p.h  |4 +-
 hw/virtio-9p-local.c |   74 +
 hw/virtio-9p.c   |   24 +++-
 3 files changed, 75 insertions(+), 27 deletions(-)

diff --git a/hw/file-op-9p.h b/hw/file-op-9p.h
index f87a35e..6ecc009 100644
--- a/hw/file-op-9p.h
+++ b/hw/file-op-9p.h
@@ -53,8 +53,8 @@ typedef struct FileOperations
 int (*mknod)(FsContext *, const char *, FsCred *);
 int (*utime)(FsContext *, const char *, const struct utimbuf *);
 int (*remove)(FsContext *, const char *);
-int (*symlink)(FsContext *, const char *, const char *);
-int (*link)(FsContext *, const char *, const char *);
+int (*symlink)(FsContext *, const char *, const char *, FsCred *);
+int (*link)(FsContext *, const char *, const char *, FsCred *);
 int (*setuid)(FsContext *, uid_t);
 int (*close)(FsContext *, int);
 int (*closedir)(FsContext *, DIR *);
diff --git a/hw/virtio-9p-local.c b/hw/virtio-9p-local.c
index 5589f72..89b17f0 100644
--- a/hw/virtio-9p-local.c
+++ b/hw/virtio-9p-local.c
@@ -74,12 +74,25 @@ static int local_set_xattr(const char *path, FsCred *credp)
 }
 }
  return 0;
- }
+}
 
-static ssize_t local_readlink(FsContext *ctx, const char *path,
-char *buf, size_t bufsz)
+static ssize_t local_readlink(FsContext *fs_ctx, const char *path,
+char *buf, size_t bufsz)
 {
-return readlink(rpath(ctx, path), buf, bufsz);
+ssize_t tsize = -1;
+if (fs_ctx->fs_sm == sm_mapped) {
+int fd;
+fd = open(rpath(fs_ctx, path), O_RDONLY);
+if (fd == -1) {
+return -1;
+}
+tsize = read(fd, (void *)buf, bufsz);
+close(fd);
+return tsize;
+} else if (fs_ctx->fs_sm == sm_passthrough) {
+tsize = readlink(rpath(fs_ctx, path), buf, bufsz);
+}
+return tsize;
 }
 
 static int local_close(FsContext *ctx, int fd)
@@ -219,32 +232,57 @@ err_end:
 return fd;
 }
 
-static int local_symlink(FsContext *ctx, const char *oldpath,
-const char *newpath)
+
+static int local_symlink(FsContext *fs_ctx, const char *oldpath,
+const char *newpath, FsCred *credp)
 {
-return symlink(oldpath, rpath(ctx, newpath));
+int err = -1;
+/* Determine the security model */
+if (fs_ctx->fs_sm == sm_mapped) {
+int fd;
+ssize_t oldpath_size, write_size;
+fd = open(rpath(fs_ctx, newpath), O_CREAT|O_EXCL|O_RDWR,
+SM_LOCAL_MODE_BITS);
+if (fd == -1) {
+return -1;
+}
+/* Write the oldpath (target) to the file. */
+oldpath_size = strlen(oldpath) + 1;
+write_size = write(fd, (void *)oldpath, oldpath_size);
+if (write_size != oldpath_size) {
+close(fd);
+remove(rpath(fs_ctx, newpath));
+return -1;
+}
+close(fd);
+/* Set client credentials in symlink's xattr */
+credp->fc_mode = credp->fc_mode|S_IFLNK;
+err = local_set_xattr(rpath(fs_ctx, newpath), credp);
+if (err == -1) {
+remove(rpath(fs_ctx, newpath));
+return err;
+}
+} else if (fs_ctx->fs_sm == sm_passthrough) {
+local_set_cred(credp);
+err = symlink(oldpath, rpath(fs_ctx, newpath));
+}
+return err;
 }
 
-static int local_link(FsContext *ctx, const char *oldpath, const char *newpath)
+static int local_link(FsContext *fs_ctx, const char *oldpath,
+const char *newpath, FsCred *credp)
 {
-char *tmp = qemu_strdup(rpath(ctx, oldpath));
-int err, serrno = 0;
+char *tmp = qemu_strdup(rpath(fs_ctx, oldpath));
+int err;
 
 if (tmp == NULL) {
 return -ENOMEM;
 }
 
-err = link(tmp, rpath(ctx, newpath));
-if (err == -1) {
-serrno = errno;
-}
+err = link(tmp, rpath(fs_ctx, newpath));
 
 qemu_free(tmp);
 
-if (err == -1) {
-errno = serrno;
-}
-
 return err;
 }
 
diff --git a/hw/virtio-9p.c b/hw/virtio-9p.c
index bbeba7c..9da14f8 100644
--- a/hw/virtio-9p.c
+++ b/hw/virtio-9p.c
@@ -197,15 +197,25 @@ static int v9fs_do_open2(V9fsState *s, V9fsCreateState *vs)
 return s->ops->open2(&s->ctx, vs->fullname.data, flags, &cred);
 }
 
-static int v9fs_do_symlink(V9fsState *s, V9fsString *oldpath,
-V9fsString *newpath)
+static int v9fs_do_symlink(V9fsState *s, V9fsCreateState *vs)
 {
-return s->ops->symlink(&s->ctx, oldpath->data, newpath->data);
+FsCred cred;
+cred_init(&cred);
+cred.fc_uid = vs->fidp->uid;
+cred.fc_mode = vs->perm | 0777;
+
+return s->ops->symlink(&s->ctx, vs->extension.data, vs->fullname.data,
+&cred);
 }
 
-static int v9fs_do_link(V9fsState *s, V9fsString *oldpath, V9fsString *newpath)
+static int v9fs_do_link(V9fsState *s, V9fsFidState *nfidp, V9fsCreateState *vs)
 {
-return s->ops->link(&s->c

[Qemu-devel] [PATCH 2/6] virtio-9p: Rearrange fileop structures

2010-05-10 Thread Venkateswararao Jujjuri (JV)
Signed-off-by: Venkateswararao Jujjuri 
---
 hw/virtio-9p.c |  185 ++--
 hw/virtio-9p.h |   92 
 2 files changed, 138 insertions(+), 139 deletions(-)

diff --git a/hw/virtio-9p.c b/hw/virtio-9p.c
index 62be770..365259c 100644
--- a/hw/virtio-9p.c
+++ b/hw/virtio-9p.c
@@ -21,6 +21,52 @@
 int dotu = 1;
 int debug_9p_pdu;
 
+enum {
+Oread   = 0x00,
+Owrite  = 0x01,
+Ordwr   = 0x02,
+Oexec   = 0x03,
+Oexcl   = 0x04,
+Otrunc  = 0x10,
+Orexec  = 0x20,
+Orclose = 0x40,
+Oappend = 0x80,
+};
+
+static int omode_to_uflags(int8_t mode)
+{
+int ret = 0;
+
+switch (mode & 3) {
+case Oread:
+ret = O_RDONLY;
+break;
+case Ordwr:
+ret = O_RDWR;
+break;
+case Owrite:
+ret = O_WRONLY;
+break;
+case Oexec:
+ret = O_RDONLY;
+break;
+}
+
+if (mode & Otrunc) {
+ret |= O_TRUNC;
+}
+
+if (mode & Oappend) {
+ret |= O_APPEND;
+}
+
+if (mode & Oexcl) {
+ret |= O_EXCL;
+}
+
+return ret;
+}
+
 static int v9fs_do_lstat(V9fsState *s, V9fsString *path, struct stat *stbuf)
 {
 return s->ops->lstat(&s->ctx, path->data, stbuf);
@@ -995,14 +1041,6 @@ out:
 v9fs_string_free(&aname);
 }
 
-typedef struct V9fsStatState {
-V9fsPDU *pdu;
-size_t offset;
-V9fsStat v9stat;
-V9fsFidState *fidp;
-struct stat stbuf;
-} V9fsStatState;
-
 static void v9fs_stat_post_lstat(V9fsState *s, V9fsStatState *vs, int err)
 {
 if (err == -1) {
@@ -1053,19 +1091,6 @@ out:
 qemu_free(vs);
 }
 
-typedef struct V9fsWalkState {
-V9fsPDU *pdu;
-size_t offset;
-int16_t nwnames;
-int name_idx;
-V9fsQID *qids;
-V9fsFidState *fidp;
-V9fsFidState *newfidp;
-V9fsString path;
-V9fsString *wnames;
-struct stat stbuf;
-} V9fsWalkState;
-
 static void v9fs_walk_complete(V9fsState *s, V9fsWalkState *vs, int err)
 {
 complete_pdu(s, vs->pdu, err);
@@ -1229,62 +1254,6 @@ out:
 v9fs_walk_complete(s, vs, err);
 }
 
-typedef struct V9fsOpenState {
-V9fsPDU *pdu;
-size_t offset;
-int8_t mode;
-V9fsFidState *fidp;
-V9fsQID qid;
-struct stat stbuf;
-
-} V9fsOpenState;
-
-enum {
-Oread   = 0x00,
-Owrite  = 0x01,
-Ordwr   = 0x02,
-Oexec   = 0x03,
-Oexcl   = 0x04,
-Otrunc  = 0x10,
-Orexec  = 0x20,
-Orclose = 0x40,
-Oappend = 0x80,
-};
-
-static int omode_to_uflags(int8_t mode)
-{
-int ret = 0;
-
-switch (mode & 3) {
-case Oread:
-ret = O_RDONLY;
-break;
-case Ordwr:
-ret = O_RDWR;
-break;
-case Owrite:
-ret = O_WRONLY;
-break;
-case Oexec:
-ret = O_RDONLY;
-break;
-}
-
-if (mode & Otrunc) {
-ret |= O_TRUNC;
-}
-
-if (mode & Oappend) {
-ret |= O_APPEND;
-}
-
-if (mode & Oexcl) {
-ret |= O_EXCL;
-}
-
-return ret;
-}
-
 static void v9fs_open_post_opendir(V9fsState *s, V9fsOpenState *vs, int err)
 {
 if (vs->fidp->dir == NULL) {
@@ -1387,25 +1356,6 @@ out:
 complete_pdu(s, pdu, err);
 }
 
-typedef struct V9fsReadState {
-V9fsPDU *pdu;
-size_t offset;
-int32_t count;
-int32_t total;
-int64_t off;
-V9fsFidState *fidp;
-struct iovec iov[128]; /* FIXME: bad, bad, bad */
-struct iovec *sg;
-off_t dir_pos;
-struct dirent *dent;
-struct stat stbuf;
-V9fsString name;
-V9fsStat v9stat;
-int32_t len;
-int32_t cnt;
-int32_t max_count;
-} V9fsReadState;
-
 static void v9fs_read_post_readdir(V9fsState *, V9fsReadState *, ssize_t);
 
 static void v9fs_read_post_seekdir(V9fsState *s, V9fsReadState *vs, ssize_t err)
@@ -1593,19 +1543,6 @@ out:
 qemu_free(vs);
 }
 
-typedef struct V9fsWriteState {
-V9fsPDU *pdu;
-size_t offset;
-int32_t len;
-int32_t count;
-int32_t total;
-int64_t off;
-V9fsFidState *fidp;
-struct iovec iov[128]; /* FIXME: bad, bad, bad */
-struct iovec *sg;
-int cnt;
-} V9fsWriteState;
-
 static void v9fs_write_post_writev(V9fsState *s, V9fsWriteState *vs,
ssize_t err)
 {
@@ -1702,19 +1639,6 @@ out:
 qemu_free(vs);
 }
 
-typedef struct V9fsCreateState {
-V9fsPDU *pdu;
-size_t offset;
-V9fsFidState *fidp;
-V9fsQID qid;
-int32_t perm;
-int8_t mode;
-struct stat stbuf;
-V9fsString name;
-V9fsString extension;
-V9fsString fullname;
-} V9fsCreateState;
-
 static void v9fs_post_create(V9fsState *s, V9fsCreateState *vs, int err)
 {
 if (err == 0) {
@@ -1934,12 +1858,6 @@ static void v9fs_flush(V9fsState *s, V9fsPDU *pdu)
 complete_pdu(s, pdu, 7);
 }
 
-typedef struct V9fsRemoveState {
-V9fsPDU *pdu;
-size_t offset;
-V9fsFidState *fidp;
-} V9fsRemoveState;
-
 static void v9fs_remove_post_remove(V9fsState *s, V9fsRemoveState *vs,
  

[Qemu-devel] [PATCH 1/6] virtio-9p: Introduces an option to specify the security model.

2010-05-10 Thread Venkateswararao Jujjuri (JV)
The new option is:

-fsdev local,id=jvrao,path=/tmp/,security_model=[mapped|passthrough]
-virtfs local,path=/tmp/,security_model=[mapped|passthrough],mnt_tag=v_tmp.

In the case of mapped security model, files are created with QEMU user
credentials and the client-user's credentials are saved in extended attributes.
Whereas in the case of passthrough security model, files on the
filesystem are directly created with client-user's credentials.
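
For illustration only, the check on the option value can be sketched as a
small standalone parser. The names SecModel and parse_security_model are
made up for this sketch and are not part of the patch; only the two strings
"mapped" and "passthrough" come from the option description above. The
sketch also treats a missing value as invalid, which is an assumption about
the desired behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, used only in this sketch. */
typedef enum {
    SEC_MODEL_INVALID = -1,
    SEC_MODEL_PASSTHROUGH,  /* files carry the client user's credentials */
    SEC_MODEL_MAPPED,       /* files carry QEMU's credentials, client
                             * credentials live in extended attributes */
} SecModel;

/* Map the security_model option string to an enum value. */
static SecModel parse_security_model(const char *value)
{
    if (value == NULL) {
        /* Option not given at all: treat as invalid (assumption). */
        return SEC_MODEL_INVALID;
    }
    if (strcmp(value, "passthrough") == 0) {
        return SEC_MODEL_PASSTHROUGH;
    }
    if (strcmp(value, "mapped") == 0) {
        return SEC_MODEL_MAPPED;
    }
    return SEC_MODEL_INVALID;
}
```

A caller would print the usage hint and exit when SEC_MODEL_INVALID comes
back, and otherwise store the result for the file operations to consult.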

Signed-off-by: Venkateswararao Jujjuri 
---
 fsdev/qemu-fsdev.c |2 ++
 fsdev/qemu-fsdev.h |1 +
 hw/virtio-9p.c |   14 ++
 qemu-config.c  |   12 +---
 qemu-options.hx|   15 +++
 vl.c   |8 +---
 6 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
index 813e1f7..8148518 100644
--- a/fsdev/qemu-fsdev.c
+++ b/fsdev/qemu-fsdev.c
@@ -50,6 +50,8 @@ int qemu_fsdev_add(QemuOpts *opts)
 
 fsle->fse.fsdev_id = qemu_strdup(qemu_opts_id(opts));
 fsle->fse.path = qemu_strdup(qemu_opt_get(opts, "path"));
+fsle->fse.security_model = qemu_strdup(qemu_opt_get(opts,
+"security_model"));
 fsle->fse.ops = FsTypes[i].ops;
 
 QTAILQ_INSERT_TAIL(&fstype_entries, fsle, next);
diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h
index b50fbe0..6c27881 100644
--- a/fsdev/qemu-fsdev.h
+++ b/fsdev/qemu-fsdev.h
@@ -40,6 +40,7 @@ typedef struct FsTypeTable {
 typedef struct FsTypeEntry {
 char *fsdev_id;
 char *path;
+char *security_model;
 FileOperations *ops;
 } FsTypeEntry;
 
diff --git a/hw/virtio-9p.c b/hw/virtio-9p.c
index e5d0112..62be770 100644
--- a/hw/virtio-9p.c
+++ b/hw/virtio-9p.c
@@ -2346,6 +2346,20 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf)
 exit(1);
 }
 
+if (!strcmp(fse->security_model, "passthrough")) {
+/* Files on the Fileserver set to client user credentials */
+} else if (!strcmp(fse->security_model, "mapped")) {
+/* Files on the fileserver are set to QEMU credentials.
+ * Client user credentials are saved in extended attributes.
+ */
+} else {
+/* user hasn't specified a correct security option */
+fprintf(stderr, "one of the following must be specified as the "
+"security option:\n\t security_model=passthrough \n\t "
+"security_model=mapped\n");
+exit(1);
+}
+
 if (lstat(fse->path, &stat)) {
 fprintf(stderr, "share path %s does not exist\n", fse->path);
 exit(1);
diff --git a/qemu-config.c b/qemu-config.c
index d500885..e1e3aa1 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -160,9 +160,12 @@ QemuOptsList qemu_fsdev_opts = {
 {
 .name = "fstype",
 .type = QEMU_OPT_STRING,
-}, {
+},{
 .name = "path",
 .type = QEMU_OPT_STRING,
+},{
+.name = "security_model",
+.type = QEMU_OPT_STRING,
 },
 { /*End of list */ }
 },
@@ -178,12 +181,15 @@ QemuOptsList qemu_virtfs_opts = {
 {
 .name = "fstype",
 .type = QEMU_OPT_STRING,
-}, {
+},{
 .name = "path",
 .type = QEMU_OPT_STRING,
-}, {
+},{
 .name = "mount_tag",
 .type = QEMU_OPT_STRING,
+},{
+.name = "security_model",
+.type = QEMU_OPT_STRING,
 },
 
 { /*End of list */ }
diff --git a/qemu-options.hx b/qemu-options.hx
index 12f6b51..d557c92 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -482,7 +482,7 @@ ETEXI
 DEFHEADING(File system options:)
 
 DEF("fsdev", HAS_ARG, QEMU_OPTION_fsdev,
-"-fsdev local,id=id,path=path\n",
+"-fsdev local,id=id,path=path,security_model=[mapped|passthrough]\n",
 QEMU_ARCH_ALL)
 
 STEXI
@@ -498,7 +498,7 @@ The specific Fstype will determine the applicable options.
 
 Options to each backend are described below.
 
-...@item -fsdev local ,i...@var{id} ,pa...@var{path}
+...@item -fsdev local ,i...@var{id} ,pa...@var{path} 
,security_mod...@var{security_model}
 
 Create a file-system-"device" for local-filesystem.
 
@@ -506,6 +506,9 @@ Create a file-system-"device" for local-filesystem.
 
 @option{path} specifies the path to be exported. @option{path} is required.
 
+...@option{security_model} specifies the security model to be followed.
+...@option{security_model} is required.
+
 @end table
 ETEXI
 #endif
@@ -514,7 +517,7 @@ ETEXI
 DEFHEADING(Virtual File system pass-through options:)
 
 DEF("virtfs", HAS_ARG, QEMU_OPTION_virtfs,
-"-virtfs local,path=path,mount_tag=tag\n",
+"-virtfs 
local,path=path,mount_tag=tag,security_model=[mapped|passthrough]\n",
 QEMU_ARCH_ALL)
 
 STEXI
@@ -530,7 +533,7 @@ The specific Fstype will determine the applicable options.
 
 Options to each backend are described below.
 
-...@item -virtfs local ,pa...@var{path} ,mount_t...@var{mount_tag}
+...

Re: [Qemu-devel] [PATCH] fix migration with large mem

2010-05-10 Thread Anthony Liguori

On 04/13/2010 04:33 AM, Izik Eidus wrote:

 From f881b371e08760a67bf1f5b992a586c3de600f7a Mon Sep 17 00:00:00 2001
From: Izik Eidus
Date: Tue, 13 Apr 2010 12:24:57 +0300
Subject: [PATCH] fix migration with large mem

In cases of guests with large mem that have pages
whose bytes are all the same, we will
spend a lot of time reading the memory from the guest
(is_dup_page())

This happens because the ram_save_live() function has a
limit on how much we can send to the dest but not on how
much we read from the guest, and in cases where we have many is_dup_page()
hits, we might read a huge amount of data without updating important
stuff like the timers...

The guest loses all its responsiveness and has many soft lockups
inside itself.

This patch adds a limit on the size we can read from the guest each
iteration.

 Thanks.

Signed-off-by: Izik Eidus
---
  arch_init.c |6 +-
  1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index cfc03ea..e27b1a0 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -88,6 +88,8 @@ const uint32_t arch_type = QEMU_ARCH;
  #define RAM_SAVE_FLAG_PAGE0x08
  #define RAM_SAVE_FLAG_EOS 0x10

+#define MAX_SAVE_BLOCK_READ (10 * 1024 * 1024)
+
  static int is_dup_page(uint8_t *page, uint8_t ch)
  {
  uint32_t val = ch << 24 | ch << 16 | ch << 8 | ch;
@@ -175,6 +177,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
  uint64_t bytes_transferred_last;
  double bwidth = 0;
  uint64_t expected_time = 0;
+int data_read = 0;

  if (stage < 0) {
  cpu_physical_memory_set_dirty_tracking(0);
@@ -205,10 +208,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
  bytes_transferred_last = bytes_transferred;
  bwidth = qemu_get_clock_ns(rt_clock);

-while (!qemu_file_rate_limit(f)) {
+while (!qemu_file_rate_limit(f) && data_read < MAX_SAVE_BLOCK_READ) {
   


The effect of this patch is that we'll never send more than 10mb/s 
during live migration?  If so, it's totally wrong as a fix to the problem.


It would be better to account the deduplicated pages as part of the rate 
limiting calculations.
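
One way to read that suggestion, sketched as standalone C rather than as a
patch against arch_init.c: every page that is examined is charged against
the per-iteration budget at full size, whether it goes out as a whole page
or as a single fill byte. The names Budget, budget_exhausted and save_pages,
and the 4096-byte page size, are invented for this example and do not exist
in QEMU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the example */

typedef struct {
    uint64_t limit;  /* bytes that may be charged per iteration */
    uint64_t used;   /* bytes charged so far in this iteration */
} Budget;

static bool budget_exhausted(const Budget *b)
{
    return b->used >= b->limit;
}

/*
 * Walk the pages until the budget runs out.  is_dup[i] is true when page i
 * consists of one repeated byte.  Returns the number of pages handled and
 * stores the payload bytes that would go on the wire (headers ignored).
 */
static size_t save_pages(Budget *b, const bool *is_dup, size_t npages,
                         uint64_t *bytes_on_wire)
{
    size_t i = 0;

    *bytes_on_wire = 0;
    while (i < npages && !budget_exhausted(b)) {
        if (is_dup[i]) {
            *bytes_on_wire += 1;                /* only the fill byte */
        } else {
            *bytes_on_wire += SKETCH_PAGE_SIZE; /* the whole page */
        }
        /* Charge the full page either way: it still had to be read. */
        b->used += SKETCH_PAGE_SIZE;
        i++;
    }
    return i;
}
```

With this accounting a run of duplicate pages ends the iteration as soon as
the same budget is used up, so no second, fixed limit is needed.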


Regards,

Anthony Liguori


  int ret;

  ret = ram_save_block(f);
+data_read += ret * TARGET_PAGE_SIZE;
  bytes_transferred += ret * TARGET_PAGE_SIZE;
  if (ret == 0) { /* no more blocks */
  break;
   





[Qemu-devel] [PATCH 4/6] virtio-9p: Implement Security model for mknod related files

2010-05-10 Thread Venkateswararao Jujjuri (JV)
In the mapped security model all the special files are created as regular files
on the fileserver and appropriate mode bits are added to the extended
attributes. These extended attributes are used to present this file
as special file to the client.

Signed-off-by: Venkateswararao Jujjuri 
---
 hw/file-op-9p.h  |3 +--
 hw/virtio-9p-local.c |   45 +++--
 hw/virtio-9p.c   |   21 +++--
 3 files changed, 31 insertions(+), 38 deletions(-)

diff --git a/hw/file-op-9p.h b/hw/file-op-9p.h
index 1eceeb2..f87a35e 100644
--- a/hw/file-op-9p.h
+++ b/hw/file-op-9p.h
@@ -50,8 +50,7 @@ typedef struct FileOperations
 ssize_t (*readlink)(FsContext *, const char *, char *, size_t);
 int (*chmod)(FsContext *, const char *, mode_t);
 int (*chown)(FsContext *, const char *, uid_t, gid_t);
-int (*mknod)(FsContext *, const char *, mode_t, dev_t);
-int (*mksock)(FsContext *, const char *);
+int (*mknod)(FsContext *, const char *, FsCred *);
 int (*utime)(FsContext *, const char *, const struct utimbuf *);
 int (*remove)(FsContext *, const char *);
 int (*symlink)(FsContext *, const char *, const char *);
diff --git a/hw/virtio-9p-local.c b/hw/virtio-9p-local.c
index 8ed8c66..5589f72 100644
--- a/hw/virtio-9p-local.c
+++ b/hw/virtio-9p-local.c
@@ -144,33 +144,27 @@ static int local_chmod(FsContext *ctx, const char *path, mode_t mode)
 return chmod(rpath(ctx, path), mode);
 }
 
-static int local_mknod(FsContext *ctx, const char *path, mode_t mode, dev_t dev)
+static int local_mknod(FsContext *fs_ctx, const char *path, FsCred *credp)
 {
-return mknod(rpath(ctx, path), mode, dev);
-}
-
-static int local_mksock(FsContext *ctx2, const char *path)
-{
-struct sockaddr_un addr;
-int s;
-
-addr.sun_family = AF_UNIX;
-snprintf(addr.sun_path, 108, "%s", rpath(ctx2, path));
-
-s = socket(PF_UNIX, SOCK_STREAM, 0);
-if (s == -1) {
-return -1;
-}
-
-if (bind(s, (struct sockaddr *)&addr, sizeof(addr))) {
-close(s);
-return -1;
-}
-
-close(s);
-return 0;
+int err = -1;
+/* Determine the security model */
+if (fs_ctx->fs_sm == sm_mapped) {
+err = mknod(rpath(fs_ctx, path), SM_LOCAL_MODE_BITS|S_IFREG, 0);
+if (err == -1) {
+goto err_end;
+}
+err = local_set_xattr(rpath(fs_ctx, path), credp);
+if (err == -1) {
+remove(rpath(fs_ctx, path));
+return err;
+}
+} else if (fs_ctx->fs_sm == sm_passthrough) {
+local_set_cred(credp);
+err = mknod(rpath(fs_ctx, path), credp->fc_mode, credp->fc_rdev);
+ }
+err_end:
+return err;
 }
-
 static int local_mkdir(FsContext *fs_ctx, const char *path, FsCred *credp)
 {
 int err = -1;
@@ -320,7 +314,6 @@ FileOperations local_ops = {
 .writev = local_writev,
 .chmod = local_chmod,
 .mknod = local_mknod,
-.mksock = local_mksock,
 .mkdir = local_mkdir,
 .fstat = local_fstat,
 .open2 = local_open2,
diff --git a/hw/virtio-9p.c b/hw/virtio-9p.c
index 9033541..bbeba7c 100644
--- a/hw/virtio-9p.c
+++ b/hw/virtio-9p.c
@@ -157,14 +157,15 @@ static int v9fs_do_chmod(V9fsState *s, V9fsString *path, mode_t mode)
 return s->ops->chmod(&s->ctx, path->data, mode);
 }
 
-static int v9fs_do_mknod(V9fsState *s, V9fsString *path, mode_t mode, dev_t dev)
+static int v9fs_do_mknod(V9fsState *s, V9fsCreateState *vs, mode_t mode,
+dev_t dev)
 {
-return s->ops->mknod(&s->ctx, path->data, mode, dev);
-}
-
-static int v9fs_do_mksock(V9fsState *s, V9fsString *path)
-{
-return s->ops->mksock(&s->ctx, path->data);
+FsCred cred;
+cred_init(&cred);
+cred.fc_uid = vs->fidp->uid;
+cred.fc_mode = mode;
+cred.fc_rdev = dev;
+return s->ops->mknod(&s->ctx, vs->fullname.data, &cred);
 }
 
 static int v9fs_do_mkdir(V9fsState *s, V9fsCreateState *vs)
@@ -1812,13 +1813,13 @@ static void v9fs_create_post_lstat(V9fsState *s, V9fsCreateState *vs, int err)
 }
 
 nmode |= vs->perm & 0777;
-err = v9fs_do_mknod(s, &vs->fullname, nmode, makedev(major, minor));
+err = v9fs_do_mknod(s, vs, nmode, makedev(major, minor));
 v9fs_create_post_perms(s, vs, err);
 } else if (vs->perm & P9_STAT_MODE_NAMED_PIPE) {
-err = v9fs_do_mknod(s, &vs->fullname, S_IFIFO | (vs->mode & 0777), 0);
+err = v9fs_do_mknod(s, vs, S_IFIFO | (vs->perm & 0777), 0);
 v9fs_post_create(s, vs, err);
 } else if (vs->perm & P9_STAT_MODE_SOCKET) {
-err = v9fs_do_mksock(s, &vs->fullname);
+err = v9fs_do_mknod(s, vs, S_IFSOCK | (vs->perm & 0777), 0);
 v9fs_create_post_mksock(s, vs, err);
 } else {
 vs->fidp->fd = v9fs_do_open2(s, vs);
-- 
1.6.5.2




[Qemu-devel] [PATCH 6/6] virtio-9p: Implemented Security model for lstat and fstat

2010-05-10 Thread Venkateswararao Jujjuri (JV)
Signed-off-by: Venkateswararao Jujjuri 
---
 hw/virtio-9p-local.c |   67 +++---
 1 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/hw/virtio-9p-local.c b/hw/virtio-9p-local.c
index 89b17f0..529de73 100644
--- a/hw/virtio-9p-local.c
+++ b/hw/virtio-9p-local.c
@@ -27,9 +27,40 @@ static const char *rpath(FsContext *ctx, const char *path)
 return buffer;
 }
 
-static int local_lstat(FsContext *ctx, const char *path, struct stat *stbuf)
+
+static int local_lstat(FsContext *fs_ctx, const char *path, struct stat *stbuf)
 {
-return lstat(rpath(ctx, path), stbuf);
+int err;
+err =  lstat(rpath(fs_ctx, path), stbuf);
+if (err) {
+return err;
+}
+if (fs_ctx->fs_sm == sm_mapped) {
+/* Actual credentials are part of extended attrs */
+uid_t tmp_uid;
+gid_t tmp_gid;
+mode_t tmp_mode;
+dev_t tmp_dev;
+if (getxattr(rpath(fs_ctx, path), "user.virtfs.uid", &tmp_uid,
+sizeof(uid_t)) > 0) {
+stbuf->st_uid = tmp_uid;
+}
+if (getxattr(rpath(fs_ctx, path), "user.virtfs.gid", &tmp_gid,
+sizeof(gid_t)) > 0) {
+stbuf->st_gid = tmp_gid;
+}
+if (getxattr(rpath(fs_ctx, path), "user.virtfs.mode", &tmp_mode,
+sizeof(mode_t)) > 0) {
+stbuf->st_mode = tmp_mode;
+}
+if (S_ISCHR(stbuf->st_mode) || S_ISBLK(stbuf->st_mode)) {
+if (getxattr(rpath(fs_ctx, path), "user.virtfs.rdev", &tmp_dev,
+sizeof(dev_t)) > 0) {
+stbuf->st_rdev = tmp_dev;
+}
+}
+}
+return err;
 }
 
 static void local_set_cred(FsCred *credp)
@@ -200,9 +231,37 @@ static int local_mkdir(FsContext *fs_ctx, const char *path, FsCred *credp)
 return err;
 }
 
-static int local_fstat(FsContext *ctx, int fd, struct stat *stbuf)
+static int local_fstat(FsContext *fs_ctx, int fd, struct stat *stbuf)
 {
-return fstat(fd, stbuf);
+int err;
+err = fstat(fd, stbuf);
+if (err) {
+return err;
+}
+if (fs_ctx->fs_sm == sm_mapped) {
+/* Actual credentials are part of extended attrs */
+uid_t tmp_uid;
+gid_t tmp_gid;
+mode_t tmp_mode;
+dev_t tmp_dev;
+
+if (fgetxattr(fd, "user.virtfs.uid", &tmp_uid, sizeof(uid_t)) > 0) {
+stbuf->st_uid = tmp_uid;
+}
+if (fgetxattr(fd, "user.virtfs.gid", &tmp_gid, sizeof(gid_t)) > 0) {
+stbuf->st_gid = tmp_gid;
+}
+if (fgetxattr(fd, "user.virtfs.mode", &tmp_mode, sizeof(mode_t)) > 0) {
+stbuf->st_mode = tmp_mode;
+}
+if (S_ISCHR(stbuf->st_mode) || S_ISBLK(stbuf->st_mode)) {
+if (fgetxattr(fd, "user.virtfs.rdev", &tmp_dev,
+sizeof(dev_t)) > 0) {
+stbuf->st_rdev = tmp_dev;
+}
+}
+}
+return err;
 }
 
 static int local_open2(FsContext *fs_ctx, const char *path, int flags,
-- 
1.6.5.2




Re: [Qemu-devel] [PATCH 1/5] SCSI: Add disk reset handler

2010-05-10 Thread Anthony Liguori

On 05/04/2010 07:20 AM, Jan Kiszka wrote:

Ensure that pending requests of an SCSI disk are purged on system reset
and also restore max_lba. The latter is now only present in the reset
handler as that one is called after init as well.

Signed-off-by: Jan Kiszka
   


Applied all (including v2 of 5/5).  Thanks.

Regards,

Anthony Liguori


---
  hw/scsi-disk.c |   35 +++
  1 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 77cb1da..b8d805f 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1010,22 +1010,45 @@ static int32_t scsi_send_command(SCSIDevice *d, uint32_t tag,
  }
  }

-static void scsi_destroy(SCSIDevice *dev)
+static void scsi_disk_purge_requests(SCSIDiskState *s)
  {
-SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
  SCSIDiskReq *r;

  while (!QTAILQ_EMPTY(&s->qdev.requests)) {
  r = DO_UPCAST(SCSIDiskReq, req, QTAILQ_FIRST(&s->qdev.requests));
+if (r->req.aiocb) {
+bdrv_aio_cancel(r->req.aiocb);
+}
  scsi_remove_request(r);
  }
+}
+
+static void scsi_disk_reset(DeviceState *dev)
+{
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev.qdev, dev);
+uint64_t nb_sectors;
+
+scsi_disk_purge_requests(s);
+
+bdrv_get_geometry(s->bs, &nb_sectors);
+nb_sectors /= s->cluster_size;
+if (nb_sectors) {
+nb_sectors--;
+}
+s->max_lba = nb_sectors;
+}
+
+static void scsi_destroy(SCSIDevice *dev)
+{
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
+
+scsi_disk_purge_requests(s);
  drive_uninit(s->qdev.conf.dinfo);
  }

  static int scsi_disk_initfn(SCSIDevice *dev)
  {
  SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
-uint64_t nb_sectors;

  if (!s->qdev.conf.dinfo || !s->qdev.conf.dinfo->bdrv) {
  error_report("scsi-disk: drive property not set");
@@ -1046,11 +1069,6 @@ static int scsi_disk_initfn(SCSIDevice *dev)
  s->cluster_size = s->qdev.blocksize / 512;

  s->qdev.type = TYPE_DISK;
-bdrv_get_geometry(s->bs,&nb_sectors);
-nb_sectors /= s->cluster_size;
-if (nb_sectors)
-nb_sectors--;
-s->max_lba = nb_sectors;
  qemu_add_vm_change_state_handler(scsi_dma_restart_cb, s);
  return 0;
  }
@@ -1059,6 +1077,7 @@ static SCSIDeviceInfo scsi_disk_info = {
  .qdev.name= "scsi-disk",
  .qdev.desc= "virtual scsi disk or cdrom",
  .qdev.size= sizeof(SCSIDiskState),
+.qdev.reset   = scsi_disk_reset,
  .init = scsi_disk_initfn,
  .destroy  = scsi_destroy,
  .send_command = scsi_send_command,
   





Re: [Qemu-devel] [PATCH] iov: Move from hw/ to topdir

2010-05-10 Thread Anthony Liguori

On 05/04/2010 06:09 AM, Amit Shah wrote:

The iov functions can be useful to other code as well.

Signed-off-by: Amit Shah
CC: Christoph Hellwig
   


Applied.  Thanks.

Regards,

Anthony Liguori


---
  hw/iov.c =>  iov.c |0
  hw/iov.h =>  iov.h |0
  2 files changed, 0 insertions(+), 0 deletions(-)
  rename hw/iov.c =>  iov.c (100%)
  rename hw/iov.h =>  iov.h (100%)

diff --git a/hw/iov.c b/iov.c
similarity index 100%
rename from hw/iov.c
rename to iov.c
diff --git a/hw/iov.h b/iov.h
similarity index 100%
rename from hw/iov.h
rename to iov.h
   





Re: [Qemu-devel] [PATCH 1/4] doc: Fix host forwarding monitor command documentation

2010-05-10 Thread Anthony Liguori

On 05/04/2010 06:20 AM, Markus Armbruster wrote:

Commit f3546deb replaced host_net_redir by hostfwd_add,
hostfwd_remove, but neglected to update documentation.

Signed-off-by: Markus Armbruster
   


Applied all.  Thanks.

Regards,

Anthony Liguori


---
  qemu-monitor.hx |   13 ++---
  1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/qemu-monitor.hx b/qemu-monitor.hx
index 5ea5748..21aeb6b 100644
--- a/qemu-monitor.hx
+++ b/qemu-monitor.hx
@@ -953,7 +953,14 @@ ETEXI
  .help   = "redirect TCP or UDP connections from host to guest (requires -net user)",
  .mhandler.cmd = net_slirp_hostfwd_add,
  },
+#endif
+STEXI
+...@item hostfwd_add
+...@findex hostfwd_add
+Redirect TCP or UDP connections from host to guest (requires -net user).
+ETEXI

+#ifdef CONFIG_SLIRP
  {
  .name   = "hostfwd_remove",
  .args_type  = "arg1:s,arg2:s?,arg3:s?",
@@ -964,9 +971,9 @@ ETEXI

  #endif
  STEXI
-...@item host_net_redir
-...@findex host_net_redir
-Redirect TCP or UDP connections from host to guest (requires -net user).
+...@item hostfwd_remove
+...@findex hostfwd_remove
+Remove host-to-guest TCP or UDP redirection.
  ETEXI

  {
   





Re: [Qemu-devel] [PATCH 1/2 v4] Support for multiple keyboard devices

2010-05-10 Thread Anthony Liguori

On 04/18/2010 02:21 PM, Shahar Havivi wrote:

This patch adds the QEMUPutKbdEntry structure, which describes one keyboard
entry; the entries are kept in a QEMU tail queue.
Adding a new keyboard adds it to the list and selects it; removing a
keyboard selects the previous keyboard in the list.
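
The add/select and remove/reselect behaviour described above can be sketched
in isolation. This is a simplified standalone model, not the code from the
patch: it uses a hand-rolled doubly linked list instead of QTAILQ, and the
names KbdEntry, KbdList, kbd_add, kbd_remove and kbd_put_keycode are
hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void KbdEvent(void *opaque, int keycode);

typedef struct KbdEntry {
    const char *name;
    KbdEvent *put_event;
    void *opaque;
    struct KbdEntry *older;  /* keyboard registered before this one */
    struct KbdEntry *newer;  /* keyboard registered after this one */
} KbdEntry;

typedef struct {
    KbdEntry *current;       /* selected keyboard: the newest entry */
} KbdList;

/* Register a keyboard and make it the selected one. */
static KbdEntry *kbd_add(KbdList *l, KbdEvent *fn, void *opaque,
                         const char *name)
{
    KbdEntry *e = calloc(1, sizeof(*e));

    if (e == NULL) {
        return NULL;
    }
    e->name = name;
    e->put_event = fn;
    e->opaque = opaque;
    e->older = l->current;
    if (l->current != NULL) {
        l->current->newer = e;
    }
    l->current = e;
    return e;
}

/* Unregister a keyboard; if it was selected, fall back to the previous. */
static void kbd_remove(KbdList *l, KbdEntry *e)
{
    if (e->older != NULL) {
        e->older->newer = e->newer;
    }
    if (e->newer != NULL) {
        e->newer->older = e->older;
    }
    if (l->current == e) {
        l->current = e->older;
    }
    free(e);
}

/* Deliver a keycode to the selected keyboard, if there is one. */
static void kbd_put_keycode(KbdList *l, int keycode)
{
    if (l->current != NULL && l->current->put_event != NULL) {
        l->current->put_event(l->current->opaque, keycode);
    }
}
```

Returning the entry from kbd_add() is what lets a device (a hot-unplugged
USB keyboard, say) remove exactly its own handler later.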

Signed-off-by: Shahar Havivi
   


Applied all.  Thanks.

Regards,

Anthony Liguori


---
  console.h|   14 -
  hw/adb.c |2 +-
  hw/escc.c|3 +-
  hw/musicpal.c|2 +-
  hw/nseries.c |4 +-
  hw/palm.c|2 +-
  hw/ps2.c |2 +-
  hw/pxa2xx_keypad.c   |3 +-
  hw/spitz.c   |3 +-
  hw/stellaris_input.c |2 +-
  hw/syborg_keyboard.c |2 +-
  hw/usb-hid.c |   10 ++--
  hw/xenfb.c   |5 ++-
  input.c  |   51 -
  14 files changed, 78 insertions(+), 27 deletions(-)

diff --git a/console.h b/console.h
index 6def115..91b66ea 100644
--- a/console.h
+++ b/console.h
@@ -41,7 +41,19 @@ typedef struct QEMUPutLEDEntry {
  QTAILQ_ENTRY(QEMUPutLEDEntry) next;
  } QEMUPutLEDEntry;

-void qemu_add_kbd_event_handler(QEMUPutKBDEvent *func, void *opaque);
+typedef struct QEMUPutKbdEntry {
+char *qemu_put_kbd_name;
+QEMUPutKBDEvent *qemu_put_kbd_event;
+void *qemu_put_kbd_event_opaque;
+int index;
+
+QTAILQ_ENTRY(QEMUPutKbdEntry) node;
+} QEMUPutKbdEntry;
+
+QEMUPutKbdEntry *qemu_add_kbd_event_handler(QEMUPutKBDEvent *func,
+void *opaque,
+const char *name);
+void qemu_remove_kbd_event_handler(QEMUPutKbdEntry *entry);
  QEMUPutMouseEntry *qemu_add_mouse_event_handler(QEMUPutMouseEvent *func,
  void *opaque, int absolute,
  const char *name);
diff --git a/hw/adb.c b/hw/adb.c
index 4fb7a62..09afcf9 100644
--- a/hw/adb.c
+++ b/hw/adb.c
@@ -304,7 +304,7 @@ void adb_kbd_init(ADBBusState *bus)
  s = qemu_mallocz(sizeof(KBDState));
  d = adb_register_device(bus, ADB_KEYBOARD, adb_kbd_request,
  adb_kbd_reset, s);
-qemu_add_kbd_event_handler(adb_kbd_put_keycode, d);
+qemu_add_kbd_event_handler(adb_kbd_put_keycode, d, "adb");
  register_savevm("adb_kbd", -1, 1, adb_kbd_save,
  adb_kbd_load, s);
  }
diff --git a/hw/escc.c b/hw/escc.c
index 6d2fd36..2b21d98 100644
--- a/hw/escc.c
+++ b/hw/escc.c
@@ -919,7 +919,8 @@ static int escc_init1(SysBusDevice *dev)
   "QEMU Sun Mouse");
  }
  if (s->chn[1].type == kbd) {
-qemu_add_kbd_event_handler(sunkbd_event,&s->chn[1]);
+qemu_add_kbd_event_handler(sunkbd_event,&s->chn[1],
+   "QEMU Sun Keyboard");
  }

  return 0;
diff --git a/hw/musicpal.c b/hw/musicpal.c
index ebd933e..e1a3b6a 100644
--- a/hw/musicpal.c
+++ b/hw/musicpal.c
@@ -1447,7 +1447,7 @@ static int musicpal_key_init(SysBusDevice *dev)

  qdev_init_gpio_out(&dev->qdev, s->out, ARRAY_SIZE(s->out));

-qemu_add_kbd_event_handler(musicpal_key_event, s);
+qemu_add_kbd_event_handler(musicpal_key_event, s, "Musicpal");

  return 0;
  }
diff --git a/hw/nseries.c b/hw/nseries.c
index 0273eee..abfcec3 100644
--- a/hw/nseries.c
+++ b/hw/nseries.c
@@ -262,7 +262,7 @@ static void n800_tsc_kbd_setup(struct n800_s *s)
  if (n800_keys[i]>= 0)
  s->keymap[n800_keys[i]] = i;

-qemu_add_kbd_event_handler(n800_key_event, s);
+qemu_add_kbd_event_handler(n800_key_event, s, "Nokia n800");

  tsc210x_set_transform(s->ts.chip,&n800_pointercal);
  }
@@ -371,7 +371,7 @@ static void n810_kbd_setup(struct n800_s *s)
  if (n810_keys[i]>  0)
  s->keymap[n810_keys[i]] = i;

-qemu_add_kbd_event_handler(n810_key_event, s);
+qemu_add_kbd_event_handler(n810_key_event, s, "Nokia n810");

  /* Attach the LM8322 keyboard to the I2C bus,
   * should happen in n8x0_i2c_setup and s->kbd be initialised here.  */
diff --git a/hw/palm.c b/hw/palm.c
index 6d19167..1b405d4 100644
--- a/hw/palm.c
+++ b/hw/palm.c
@@ -228,7 +228,7 @@ static void palmte_init(ram_addr_t ram_size,

  palmte_microwire_setup(cpu);

-qemu_add_kbd_event_handler(palmte_button_event, cpu);
+qemu_add_kbd_event_handler(palmte_button_event, cpu, "Palm Keyboard");

  palmte_gpio_setup(cpu);

diff --git a/hw/ps2.c b/hw/ps2.c
index f0b206a..886da37 100644
--- a/hw/ps2.c
+++ b/hw/ps2.c
@@ -596,7 +596,7 @@ void *ps2_kbd_init(void (*update_irq)(void *, int), void *update_arg)
  s->common.update_arg = update_arg;
  s->scancode_set = 2;
  vmstate_register(0,&vmstate_ps2_keyboard, s);
-qemu_add_kbd_event_handler(ps2_put_keycode, s);
+qemu_add_kbd_event_handler(ps2_put_keycode, s, "QEMU PS/2 Keyboard");
  qemu_register_reset(ps2_kbd_reset, s);
  return s;
  }
di

[Qemu-devel] [PATCH] vdi: Fix image opening and creation for odd disk sizes

2010-05-10 Thread Stefan Weil
The fix is based on a patch from Kevin Wolf. Here his comment:

"The number of blocks needs to be rounded up to cover all of the virtual hard
disk. Without this fix, we can't even open our own images if their size is not
a multiple of the block size."

While Kevin's patch addressed vdi_create, my modification also fixes
vdi_open which now accepts images with odd disk sizes as well as
images created with old versions of qemu-img.
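
The two roundings involved can be shown in isolation. This is a standalone
sketch: the constants are assumed to mirror SECTOR_SIZE (512) and the 1 MiB
block size that vdi_open checks for, and the helper names are invented for
the example.

```c
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512u             /* assumed sector size */
#define SKETCH_BLOCK_SIZE  (1024u * 1024u)  /* assumed block size, 1 MiB */

/* Round a byte count up to the next multiple of the sector size. */
static uint64_t round_up_to_sector(uint64_t bytes)
{
    /* Widen the mask to 64 bits before inverting it, so that the upper
     * half of a large disk size is not cleared. */
    return (bytes + SKETCH_SECTOR_SIZE - 1) &
           ~(uint64_t)(SKETCH_SECTOR_SIZE - 1);
}

/* Number of blocks needed to cover the given byte count (rounds up). */
static uint64_t blocks_needed(uint64_t bytes)
{
    return (bytes + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE;
}
```

A plain bytes / block_size would give 1 block for a disk of 1 MiB + 1 byte,
leaving the last byte uncovered; the rounded-up division gives 2.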

Cc: Kevin Wolf 
Cc: François Revol 
Signed-off-by: Stefan Weil 
---
 block/vdi.c |   29 +
 1 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 1ce18d5..362c898 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -405,19 +405,12 @@ static int vdi_open(BlockDriverState *bs, int flags)
 /* We only support data blocks which start on a sector boundary. */
 logout("unsupported data offset 0x%x B\n", header.offset_data);
 goto fail;
-} else if (header.disk_size % SECTOR_SIZE != 0) {
-logout("unsupported disk size %" PRIu64 " B\n", header.disk_size);
-goto fail;
 } else if (header.sector_size != SECTOR_SIZE) {
 logout("unsupported sector size %u B\n", header.sector_size);
 goto fail;
 } else if (header.block_size != 1 * MiB) {
 logout("unsupported block size %u B\n", header.block_size);
 goto fail;
-} else if ((header.disk_size + header.block_size - 1) / header.block_size !=
-   (uint64_t)header.blocks_in_image) {
-logout("unexpected block number %u B\n", header.blocks_in_image);
-goto fail;
 } else if (!uuid_is_null(header.uuid_link)) {
 logout("link uuid != 0, unsupported\n");
 goto fail;
@@ -426,6 +419,23 @@ static int vdi_open(BlockDriverState *bs, int flags)
 goto fail;
 }
 
+if (header.disk_size % SECTOR_SIZE != 0) {
+/* 'VBoxManage convertfromraw' can create images with odd disk sizes.
+   We accept them but round the disk size to the next multiple of
+   SECTOR_SIZE. */
+logout("odd disk size %" PRIu64 " B, round up\n", header.disk_size);
+header.disk_size += SECTOR_SIZE - 1;
+header.disk_size &= ~(SECTOR_SIZE - 1);
+}
+
+if (header.disk_size >
+(uint64_t)header.blocks_in_image * header.block_size) {
+/* Old versions of qemu-img could create images with too large
+   disk sizes. We accept them but truncate the disk size. */
+logout("large disk size %" PRIu64 " B, truncated\n", header.disk_size);
+header.disk_size = (uint64_t)header.blocks_in_image * header.block_size;
+}
+
 bs->total_sectors = header.disk_size / SECTOR_SIZE;
 
 s->block_size = header.block_size;
@@ -829,7 +839,10 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options)
 return -errno;
 }
 
-blocks = bytes / block_size;
+/* We need enough blocks to store the given disk size,
+   so always round up. */
+blocks = (bytes + block_size - 1) / block_size;
+
 bmap_size = blocks * sizeof(uint32_t);
 bmap_size = ((bmap_size + SECTOR_SIZE - 1) & ~(SECTOR_SIZE -1));
 
-- 
1.7.1
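
The two rounding steps used by the patch above can be tried in isolation.
This is only a minimal sketch: the helper names round_up_to_sector() and
blocks_needed() are made up for illustration and do not exist in block/vdi.c,
and SECTOR_SIZE is assumed to be 512 (a power of two, which the mask relies on).

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Round a byte count up to the next multiple of SECTOR_SIZE.
   The mask trick only works because SECTOR_SIZE is a power of two. */
static inline uint64_t round_up_to_sector(uint64_t bytes)
{
    return (bytes + SECTOR_SIZE - 1) & ~(uint64_t)(SECTOR_SIZE - 1);
}

/* Number of blocks needed to store 'bytes' bytes, always rounding up. */
static inline uint64_t blocks_needed(uint64_t bytes, uint64_t block_size)
{
    return (bytes + block_size - 1) / block_size;
}
```

With a 1 MiB block size, a disk of 1 MiB + 1 byte needs two blocks here,
whereas the old "bytes / block_size" computation would have yielded one.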




Re: [Qemu-devel] Print input opcodes before translation

2010-05-10 Thread Blue Swirl
On 5/10/10, John Vele  wrote:
> Greetings,
>
> I'm using qemu user mode for i386 and I would like to print the opcodes of
> an input executable one by one before any translation.
> Is the file target-i386/translate.c the one that I should edit? Is the
> function disas_insn() in the same file the appropriate one?

No need to edit anything, please RTFM for '-d in_asm' and
'-singlestep' parameters.



[Qemu-devel] Print input opcodes before translation

2010-05-10 Thread John Vele
Greetings,

I'm using qemu user mode for i386 and I would like to print the opcodes of
an input executable one by one before any translation.
Is the file target-i386/translate.c the one that I should edit? Is the
function disas_insn() in the same file the appropriate one?

Thanks in advance


[Qemu-devel] [PATCH] Compile virtio-9p-debug and virtio-9p-local once

2010-05-10 Thread Blue Swirl
Signed-off-by: Blue Swirl 
---
 Makefile.objs   |2 ++
 Makefile.target |2 +-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index ecdd53e..110f8fd 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -226,6 +226,8 @@ sound-obj-$(CONFIG_CS4231A) += cs4231a.o
 adlib.o fmopl.o: QEMU_CFLAGS += -DBUILD_Y8950=0
 hw-obj-$(CONFIG_SOUND) += $(sound-obj-y)

+hw-obj-$(CONFIG_LINUX) += virtio-9p-debug.o virtio-9p-local.o
+
 ##
 # libdis
 # NOTE: the disassembler code is only needed for debugging
diff --git a/Makefile.target b/Makefile.target
index c092900..b62117c 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -168,7 +168,7 @@ obj-y += virtio-blk.o virtio-balloon.o virtio-net.o virtio-serial-bus.o
 obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
 obj-y += vhost_net.o
 obj-$(CONFIG_VHOST_NET) += vhost.o
-obj-$(CONFIG_LINUX) += virtio-9p.o virtio-9p-debug.o virtio-9p-local.o
+obj-$(CONFIG_LINUX) += virtio-9p.o
 obj-y += rwhandler.o
 obj-$(CONFIG_KVM) += kvm.o kvm-all.o
 obj-$(CONFIG_NO_KVM) += kvm-stub.o
-- 
1.6.2.4



[Qemu-devel] [PATCH] block/vpc: Fix conversion from size to disk geometry

2010-05-10 Thread Stefan Weil
The VHD algorithm calculates a disk geometry
which is usually smaller than the requested size.

QEMU tried to round up but failed for certain sizes:

qemu-img create -f vpc disk.vpc 9437184
would create an image with 9435136 bytes
(which is too small for qemu-img convert).

Instead of hacking the geometry algorithm, the patch
increases the number of sectors until we get enough
sectors.

Cc: Kevin Wolf 
Signed-off-by: Stefan Weil 
---
 block/vpc.c |   21 -
 1 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index f94e469..214e9d1 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -463,9 +463,7 @@ static int calculate_geometry(int64_t total_sectors, uint16_t* cyls,
 }
 }
 
-// Note: Rounding up deviates from the Virtual PC behaviour
-// However, we need this to avoid truncating images in qemu-img convert
-*cyls = (cyls_times_heads + *heads - 1) / *heads;
+*cyls = cyls_times_heads / *heads;
 
 return 0;
 }
@@ -477,9 +475,9 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options)
 struct vhd_dyndisk_header* dyndisk_header =
 (struct vhd_dyndisk_header*) buf;
 int fd, i;
-uint16_t cyls;
-uint8_t heads;
-uint8_t secs_per_cyl;
+uint16_t cyls = 0;
+uint8_t heads = 0;
+uint8_t secs_per_cyl = 0;
 size_t block_size, num_bat_entries;
 int64_t total_sectors = 0;
 
@@ -496,9 +494,14 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options)
 if (fd < 0)
 return -EIO;
 
-// Calculate matching total_size and geometry
-if (calculate_geometry(total_sectors, &cyls, &heads, &secs_per_cyl))
-return -EFBIG;
+/* Calculate matching total_size and geometry. Increase the number of
+   sectors requested until we get enough (or fail). */
+for (i = 0; total_sectors > (int64_t)cyls * heads * secs_per_cyl; i++) {
+if (calculate_geometry(total_sectors + i,
+   &cyls, &heads, &secs_per_cyl)) {
+return -EFBIG;
+}
+}
 total_sectors = (int64_t) cyls * heads * secs_per_cyl;
 
 // Prepare the Hard Disk Footer
-- 
1.7.1
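
The "increase until we get enough sectors" loop from the patch above can be
sketched standalone. The geometry routine below follows the CHS calculation
described in the VHD specification as far as I recall it; it is a simplified
stand-in, not the code in block/vpc.c, and the names chs_from_sectors() and
sectors_for_geometry() are hypothetical.

```c
#include <stdint.h>

/* CHS calculation in the style of the VHD specification (simplified sketch).
   Returns 0 on success, -1 if the disk is too large for CHS addressing. */
static inline int chs_from_sectors(int64_t total_sectors, uint16_t *cyls,
                                   uint8_t *heads, uint8_t *secs_per_cyl)
{
    uint32_t spt, h, cth;   /* sectors per track, heads, cylinders * heads */

    if (total_sectors > 65535LL * 16 * 255) {
        return -1;
    }
    if (total_sectors >= 65535LL * 16 * 63) {
        spt = 255;
        h = 16;
        cth = (uint32_t)(total_sectors / spt);
    } else {
        spt = 17;
        cth = (uint32_t)(total_sectors / spt);
        h = (cth + 1023) / 1024;
        if (h < 4) {
            h = 4;
        }
        if (cth >= h * 1024 || h > 16) {
            spt = 31;
            h = 16;
            cth = (uint32_t)(total_sectors / spt);
        }
        if (cth >= h * 1024) {
            spt = 63;
            h = 16;
            cth = (uint32_t)(total_sectors / spt);
        }
    }
    *cyls = (uint16_t)(cth / h);    /* rounds down, as in Virtual PC */
    *heads = (uint8_t)h;
    *secs_per_cyl = (uint8_t)spt;
    return 0;
}

/* Ask for more and more sectors until the resulting geometry covers at
   least 'wanted' sectors.  Returns the sector count provided, or -1. */
static inline int64_t sectors_for_geometry(int64_t wanted)
{
    uint16_t cyls = 0;
    uint8_t heads = 0, spt = 0;
    int64_t i;

    for (i = 0; wanted > (int64_t)cyls * heads * spt; i++) {
        if (chs_from_sectors(wanted + i, &cyls, &heads, &spt)) {
            return -1;
        }
    }
    return (int64_t)cyls * heads * spt;
}
```

For the 9437184 byte example (18432 sectors), a single geometry calculation
gives 271 * 4 * 17 = 18428 sectors, i.e. the 9435136 bytes mentioned in the
commit message; the loop keeps asking for more until the result is large
enough.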




Re: [Qemu-devel] [PATCH] sh: sm501: add 2D engine support

2010-05-10 Thread Blue Swirl
On 5/9/10, Shin-ichiro KAWASAKI  wrote:
> In Linux kernel v2.6.33, the sm501 frame buffer driver was modified to support
>  the 2D graphics engine on the sm501 chip.  One example is the "fill rectangle"
>  operation.  But qemu's current sm501 emulation doesn't support it.  This
>  results in graphics console disturbance.
>
>  This patch introduces sm501 2D graphics engine emulation and solves this
>  problem.
>
>
>  Signed-off-by: Shin-ichiro KAWASAKI 
>
> Add SM501 2D hardware engine support.
>
>  - Add 2D engine register set read/write handlers.
>  - Support 'fill rectangle'. Other operations are left for future work.
>  - Update SM501 support status comment.
>
>  ---
>   hw/sm501.c |  168 ++-
>   1 files changed, 165 insertions(+), 3 deletions(-)
>
>  diff --git a/hw/sm501.c b/hw/sm501.c
>  index cd1f595..2f1610b 100644
>  --- a/hw/sm501.c
>  +++ b/hw/sm501.c
>  @@ -29,16 +29,16 @@
>   #include "devices.h"
>
>   /*
>  - * Status: 2008/11/02
>  + * Status: 2010/05/07
>   *   - Minimum implementation for Linux console : mmio regs and CRT layer.
>  - *   - Always updates full screen.
>  + *   - 2D graphics acceleration partially supported : only fill rectangle.
>   *
>   * TODO:
>   *   - Panel support
>  - *   - Hardware cursor support
>   *   - Touch panel support
>   *   - USB support
>   *   - UART support
>  + *   - More 2D graphics engine support
>   *   - Performance tuning
>   */
>
>  @@ -508,6 +508,18 @@ typedef struct SM501State {
>  uint32_t dc_crt_hwc_color_1_2;
>  uint32_t dc_crt_hwc_color_3;
>
>  +uint32_t _2d_destination;
>  +uint32_t _2d_dimension;
>  +uint32_t _2d_control;
>  +uint32_t _2d_pitch;
>  +uint32_t _2d_foreground;
>  +uint32_t _2d_stretch;
>  +uint32_t _2d_color_compare_mask;
>  +uint32_t _2d_mask;
>  +uint32_t _2d_window_width;
>  +uint32_t _2d_source_base;
>  +uint32_t _2d_destination_base;
>  +
>   } SM501State;
>
>   static uint32_t get_local_mem_size_index(uint32_t size)
>  @@ -617,6 +629,65 @@ static int within_hwc_y_range(SM501State *state, int y, int crt)
>  return (hwc_y <= y && y < hwc_y + SM501_HWC_HEIGHT);
>   }
>
>  +static void sm501_2d_operation(SM501State * s)
>  +{
>  +/* obtain operation parameters */
>  +int operation = (s->_2d_control >> 16) & 0x1f;
>  +int dst_x = (s->_2d_destination >> 16) & 0x01FFF;
>  +int dst_y = s->_2d_destination & 0x;
>  +int operation_width = (s->_2d_dimension >> 16) & 0x1FFF;
>  +int operation_height = s->_2d_dimension & 0x;
>  +uint32_t color = s->_2d_foreground;
>  +int format_flags = (s->_2d_stretch >> 20) & 0x3;
>  +int addressing = (s->_2d_stretch >> 16) & 0xF;
>  +
>  +/* get frame buffer info */
>  +#if 0 /* for future use */
>  +uint8_t * src = s->local_mem + (s->_2d_source_base & 0x03FF);
>  +#endif
>  +uint8_t * dst = s->local_mem + (s->_2d_destination_base & 0x03FF);
>  +int dst_width = (s->dc_crt_h_total & 0x0FFF) + 1;
>  +
>  +/* only XY addressing is supported */
>  +assert(addressing == 0x0);
>  +
>  +/* only local memory is supported */
>  +assert(!(s->_2d_source_base & 0x0800));
>  +assert(!(s->_2d_destination_base & 0x0800));

In general, assertions which can be triggered by the guest are bad.
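
One way to avoid guest-triggerable assertions is to validate the
guest-supplied values and refuse the operation instead of aborting. A minimal
sketch for an 8 bpp fill, assuming a hypothetical helper fill_rect8() that is
not part of hw/sm501.c:

```c
#include <stddef.h>
#include <stdint.h>

/* Fill a w x h rectangle of 8-bit pixels at (x, y) in a frame buffer that
   is fb_width pixels wide and fb_size bytes long.  Returns 0 on success and
   -1 if the rectangle does not fit; nothing is written in that case. */
static inline int fill_rect8(uint8_t *fb, size_t fb_size, uint32_t fb_width,
                             uint32_t x, uint32_t y, uint32_t w, uint32_t h,
                             uint8_t color)
{
    uint64_t rows;
    uint32_t i, j;

    if (w == 0 || h == 0) {
        return 0;                       /* nothing to draw */
    }
    if (fb_width == 0 || x >= fb_width || w > fb_width - x) {
        return -1;                      /* rectangle leaves the scan line */
    }
    rows = (uint64_t)y + h;             /* one past the last row touched */
    if (rows > fb_size / fb_width) {
        return -1;                      /* rectangle leaves the buffer */
    }
    for (j = 0; j < h; j++) {
        for (i = 0; i < w; i++) {
            fb[(size_t)(y + j) * fb_width + x + i] = color;
        }
    }
    return 0;
}
```

The caller can then log the rejected request (or ignore it, as real hardware
might) rather than taking the whole emulator down.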

>  +
>  +switch (operation) {
>  +case 0x01: /* fill rectangle */
>  +
>  +#define FILL_RECT(_bpp, _pixel_type) { \
>  +int y, x; \
>  +for (y = 0; y < operation_height; y++) { \
>  +for (x = 0; x < operation_width; x++) { \
>  +int index = ((dst_y + y) * dst_width + dst_x + x) * _bpp; \
>  +*(_pixel_type*)&dst[index] = (_pixel_type)color; \
>  +} \
>  +} \
>  +}
>  +
>  +switch (format_flags) {
>  +case 0:
>  +FILL_RECT(1, uint8_t);
>  +break;
>  +case 1:
>  +FILL_RECT(2, uint16_t);
>  +break;
>  +case 2:
>  +FILL_RECT(4, uint32_t);
>  +break;
>  +}
>  +break;
>  +
>  +default:
>  +printf("non-implemented SM501 2D operation. %d\n", operation);
>  +assert(0);
>  +break;
>  +}
>  +}
>  +
>   static uint32_t sm501_system_config_read(void *opaque, target_phys_addr_t addr)
>   {
>  SM501State * s = (SM501State *)opaque;
>  @@ -967,6 +1038,92 @@ static CPUWriteMemoryFunc * const sm501_disp_ctrl_writefn[] = {
>  &sm501_disp_ctrl_write,
>   };
>
>  +static uint32_t sm501_2d_engine_read(void *opaque, target_phys_addr_t addr)
>  +{
>  +SM501State * s = (SM501State

Re: [Qemu-devel] [RFC][MIPS][PATCH 3/6] Initial support of VIA IDE controller used by fulong mini pc

2010-05-10 Thread Blue Swirl
On 5/9/10, chen huacai  wrote:
> This patch adds initial support of VIA IDE controller used by fulong mini pc
>
>  Signed-off-by: Huacai Chen 
>  -
>  diff --git a/Makefile.objs b/Makefile.objs
>  index ecdd53e..75be9ce 100644
>  --- a/Makefile.objs
>  +++ b/Makefile.objs
>  @@ -195,6 +195,7 @@ hw-obj-$(CONFIG_IDE_ISA) += ide/isa.o
>   hw-obj-$(CONFIG_IDE_PIIX) += ide/piix.o
>   hw-obj-$(CONFIG_IDE_CMD646) += ide/cmd646.o
>   hw-obj-$(CONFIG_IDE_MACIO) += ide/macio.o
>  +hw-obj-$(CONFIG_IDE_VIA) += ide/via.o
>
>   # SCSI layer
>   hw-obj-y += lsi53c895a.o
>  diff --git a/default-configs/mips-softmmu.mak b/default-configs/mips-softmmu.mak
>  index 7793dbc..02fea47 100644
>  --- a/default-configs/mips-softmmu.mak
>  +++ b/default-configs/mips-softmmu.mak
>  @@ -18,6 +18,7 @@ CONFIG_IDE_QDEV=y
>   CONFIG_IDE_PCI=y
>   CONFIG_IDE_ISA=y
>   CONFIG_IDE_PIIX=y
>  +CONFIG_IDE_VIA=y
>   CONFIG_NE2000_ISA=y
>   CONFIG_SOUND=y
>   CONFIG_VIRTIO_PCI=y
>  diff --git a/default-configs/mips64-softmmu.mak b/default-configs/mips64-softmmu.mak
>  index aa65d92..8c13da6 100644
>  --- a/default-configs/mips64-softmmu.mak
>  +++ b/default-configs/mips64-softmmu.mak
>  @@ -18,6 +18,7 @@ CONFIG_IDE_QDEV=y
>   CONFIG_IDE_PCI=y
>   CONFIG_IDE_ISA=y
>   CONFIG_IDE_PIIX=y
>  +CONFIG_IDE_VIA=y
>   CONFIG_NE2000_ISA=y
>   CONFIG_SOUND=y
>   CONFIG_VIRTIO_PCI=y
>  diff --git a/default-configs/mips64el-softmmu.mak b/default-configs/mips64el-softmmu.mak
>  index b9b8c71..7e3a5ef 100644
>  --- a/default-configs/mips64el-softmmu.mak
>  +++ b/default-configs/mips64el-softmmu.mak
>  @@ -18,6 +18,7 @@ CONFIG_IDE_QDEV=y
>   CONFIG_IDE_PCI=y
>   CONFIG_IDE_ISA=y
>   CONFIG_IDE_PIIX=y
>  +CONFIG_IDE_VIA=y
>   CONFIG_NE2000_ISA=y
>   CONFIG_SOUND=y
>   CONFIG_VIRTIO_PCI=y
>  diff --git a/default-configs/mipsel-softmmu.mak b/default-configs/mipsel-softmmu.mak
>  index e14831e..c329fb2 100644
>  --- a/default-configs/mipsel-softmmu.mak
>  +++ b/default-configs/mipsel-softmmu.mak
>  @@ -18,6 +18,7 @@ CONFIG_IDE_QDEV=y
>   CONFIG_IDE_PCI=y
>   CONFIG_IDE_ISA=y
>   CONFIG_IDE_PIIX=y
>  +CONFIG_IDE_VIA=y
>   CONFIG_NE2000_ISA=y
>   CONFIG_SOUND=y
>   CONFIG_VIRTIO_PCI=y
>  diff --git a/hw/ide.h b/hw/ide.h
>  index 0e7d540..bb635b6 100644
>  --- a/hw/ide.h
>  +++ b/hw/ide.h
>  @@ -12,6 +12,7 @@ void pci_cmd646_ide_init(PCIBus *bus, DriveInfo **hd_table,
>   int secondary_ide_enabled);
>   void pci_piix3_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn);
>   void pci_piix4_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn);
>  +void vt82c686b_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn);
>
>   /* ide-macio.c */
>   int pmac_ide_init (DriveInfo **hd_table, qemu_irq irq,
>  diff --git a/hw/ide/via.c b/hw/ide/via.c
>  new file mode 100644
>  index 000..9adc5b5
>  --- /dev/null
>  +++ b/hw/ide/via.c
>  @@ -0,0 +1,189 @@
>  +/*
>  + * QEMU IDE Emulation: PCI VIA82C686B support.
>  + *
>  + * Copyright (c) 2003 Fabrice Bellard
>  + * Copyright (c) 2006 Openedhand Ltd.
>  + * Copyright (c) 2010 Huacai Chen 
>  + *
>  + * Permission is hereby granted, free of charge, to any person obtaining a copy
>  + * of this software and associated documentation files (the "Software"), to deal
>  + * in the Software without restriction, including without limitation the rights
>  + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>  + * copies of the Software, and to permit persons to whom the Software is
>  + * furnished to do so, subject to the following conditions:
>  + *
>  + * The above copyright notice and this permission notice shall be included in
>  + * all copies or substantial portions of the Software.
>  + *
>  + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>  + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>  + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>  + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>  + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>  + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>  + * THE SOFTWARE.
>  + */
>  +#include 
>  +#include 
>  +#include 
>  +#include 
>  +#include "block.h"
>  +#include "block_int.h"
>  +#include "sysemu.h"
>  +#include "dma.h"
>  +
>  +#include 
>  +
>  +static uint32_t bmdma_readb(void *opaque, uint32_t addr)
>  +{
>  +BMDMAState *bm = opaque;
>  +uint32_t val;
>  +
>  +switch(addr & 3) {
>  +case 0:
>  +val = bm->cmd;
>  +break;
>  +case 2:
>  +val = bm->status;
>  +break;
>  +default:
>  +val = 0xff;
>  +break;
>  +}
>  +#ifdef DEBUG_IDE
>  +printf("bmdma: readb 0x%02x : 0x%02x\n", addr, val);
>  +#endif
>  +return val;
>  +}
>  +
>  +static void bmdma_writeb(void *opaque, uint32_t addr, uint32_t val)
>  +{
>  +BMDMAState *bm = o

Re: [Qemu-devel] [RFC] default mac address issue

2010-05-10 Thread Anthony Liguori

Hi Bruce,

On 05/10/2010 02:07 PM, Bruce Rogers wrote:

> I know this behavior has worked this way all along, but I wanted to bring up
> the following concern and float a few ideas about possible solutions. Please
> provide your perspective, opinion, etc.
>
> qemu (or qemu-kvm) users can easily get into trouble when they don't specify
> the mac address for their vm's nic and don't realize that multiple vm's running
> this way on the same network segment are colliding, since they all get a
> default mac address that is the same. They may be under the assumption that a
> random mac would be the default, as in many higher level tools for vm creation.

This is certainly an important issue but it's one that's difficult to
resolve.

> Does it make sense to do any of the following:
>
> 1) have qemu print a warning to stdout/stderr that the default mac address is
> being used and that it will interfere with other vms running the same way on a
> common network segment

This is definitely reasonable.

> 2) what about changing the default behavior to randomizing the mac, and provide
> the legacy behavior with "-net nic,macaddr=default" or just "-use-default-mac"
>
> (or, as a flip side to #2):
>
> 3) to at least make it easy for people to get around the problem, and just
> use qemu directly (without additional tools to launch qemu), add an option such
> as "-net nic,macaddr=randomize" or "-use-random-mac" which randomizes the mac
> for you each time the machine is brought up, and hence avoids possible
> collisions.

A random mac address is almost always wrong.  If you run a guest twice
with this option, it's usually enough to trigger a new network detection
which will rename the network device to ethN + 1.  The result would be
broken networking for naive users since distros don't bother configuring
interfaces that weren't present during installation.

Regards,

Anthony Liguori

> Of course having the same vm come up with a different mac each time is an issue
> for some guest os's, but for some usages, that is not a problem.
>
> I imagine the consensus would be that the current behavior is appropriate, and
> it's up to management tools to do the randomization, but I think there is
> perhaps a place for randomization within qemu itself.  I'm interested to hear
> what you think.
>
> Bruce



   





Re: [Qemu-devel] [RFC][MIPS][PATCH 2/6] Initial support of vt82686b south bridge used by fulong mini pc

2010-05-10 Thread Blue Swirl
On 5/9/10, chen huacai  wrote:
> This patch adds initial support of vt82686b south bridge used by fulong mini pc
>
>  Signed-off-by: Huacai Chen 
>  -
>  diff --git a/Makefile.target b/Makefile.target
>  index fc4c59f..08968d6 100644
>  --- a/Makefile.target
>  +++ b/Makefile.target
>  @@ -219,7 +219,7 @@ obj-mips-y += mips_addr.o mips_timer.o mips_int.o
>   obj-mips-y += dma.o vga.o i8259.o
>   obj-mips-y += g364fb.o jazz_led.o
>   obj-mips-y += gt64xxx.o bonito.o pckbd.o mc146818rtc.o
>  -obj-mips-y += piix4.o cirrus_vga.o
>  +obj-mips-y += piix4.o vt82c686.o cirrus_vga.o
>
>   obj-microblaze-y = petalogix_s3adsp1800_mmu.o
>
>  diff --git a/hw/pc.h b/hw/pc.h
>  index d11a576..612fa06 100644
>  --- a/hw/pc.h
>  +++ b/hw/pc.h
>  @@ -115,6 +115,14 @@ void i440fx_init_memory_mappings(PCII440FXState *d);
>   extern PCIDevice *piix4_dev;
>   int piix4_init(PCIBus *bus, int devfn);
>
>  +/* vt82c686.c */
>  +extern PCIDevice *vt82c686b_dev;
>  +int vt82c686b_init(PCIBus * bus, int devfn);
>  +void vt82c686b_ac97_init(PCIBus *bus, int devfn);
>  +void vt82c686b_mc97_init(PCIBus *bus, int devfn);
>  +i2c_bus *vt82c686b_pm_init(PCIBus *bus, int devfn, uint32_t smb_io_base,
>  +qemu_irq sci_irq);

Can this device be used in a PC?

>  +
>   /* vga.c */
>   enum vga_retrace_method {
>  VGA_RETRACE_DUMB,
>  diff --git a/hw/pci_ids.h b/hw/pci_ids.h
>  index fe7a121..39e9f1d 100644
>  --- a/hw/pci_ids.h
>  +++ b/hw/pci_ids.h
>  @@ -78,6 +78,14 @@
>
>   #define PCI_VENDOR_ID_XILINX 0x10ee
>
>  +#define PCI_VENDOR_ID_VIA                0x1106
>  +#define PCI_DEVICE_ID_VIA_ISA_BRIDGE     0x0686
>  +#define PCI_DEVICE_ID_VIA_IDE            0x0571
>  +#define PCI_DEVICE_ID_VIA_UHCI           0x3038
>  +#define PCI_DEVICE_ID_VIA_ACPI           0x3057
>  +#define PCI_DEVICE_ID_VIA_AC97           0x3058
>  +#define PCI_DEVICE_ID_VIA_MC97           0x3068
>  +
>   #define PCI_VENDOR_ID_MARVELL            0x11ab
>
>   #define PCI_VENDOR_ID_ENSONIQ            0x1274
>  diff --git a/hw/vt82c686.c b/hw/vt82c686.c
>  new file mode 100644
>  index 000..7fc065f
>  --- /dev/null
>  +++ b/hw/vt82c686.c
>  @@ -0,0 +1,838 @@
>  +/*
>  + * VT82C686B south bridge support
>  + *
>  + * Copyright (c) 2008 yajin (ya...@vm-kernel.org)
>  + * Copyright (c) 2009 chenming (chenm...@rdc.faw.com.cn)
>  + * Copyright (c) 2009 Huacai Chen (zltjiang...@gmail.com)
>  + * This code is licensed under the GNU GPL v2.
>  + */
>  +
>  +#include "hw.h"
>  +#include "pc.h"
>  +#include "i2c.h"
>  +#include "smbus.h"
>  +#include "pci.h"
>  +#include "isa.h"
>  +#include "sysbus.h"
>  +#include "mips.h"
>  +
>  +typedef uint32_t pci_addr_t;
>  +#include "pci_host.h"
>  +//#define DEBUG

Please add the DPRINTF macro and use that.

>  +
>  +typedef struct SuperIOConfig
>  +{
>  +uint8_t config[0xff];
>  +uint8_t index;
>  +uint8_t data;
>  +} SuperIOConfig;
>  +
>  +PCIDevice *vt82c686b_dev;
>  +static SuperIOConfig *superio_conf;

Static device instances are ugly. While in some cases it's not
feasible to have more than one device in a machine, this usually
indicates poor design. Please pass the device state around.

I realize we have some pretty ugly code which may be used as reference
for new designs, this is of course not your fault.
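
"Passing the device state around" usually means handing each handler its
state through the opaque pointer that is registered together with it. A
rough sketch of the idea, using a cut-down structure and a hypothetical
superio_write() that leaves out the write-protection logic of the patch:

```c
#include <stdint.h>

/* Per-device state; one instance per emulated Super I/O block.
   The array is sized 0x100 here so that every 8-bit index is valid. */
typedef struct SuperIOConfig {
    uint8_t config[0x100];
    uint8_t index;
} SuperIOConfig;

/* I/O write handler.  'opaque' is whatever pointer was registered together
   with the handler, so no file-scope device pointer is needed. */
static inline void superio_write(void *opaque, uint32_t addr, uint32_t data)
{
    SuperIOConfig *s = opaque;

    if (addr == 0x3f0) {
        s->index = data & 0xff;             /* select a register */
    } else {
        s->config[s->index] = data & 0xff;  /* write the selected register */
    }
}
```

Because nothing lives in a static variable, two devices can coexist and each
handler only touches the instance it was registered with.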

>  +
>  +uint32_t smb_data[16];
>  +static void superio_ioport_writeb(void *opaque, uint32_t addr, uint32_t data)
>  +{
>  +int can_write;
>  +
>  +#ifdef DEBUG
>  +printf("superio_ioport_writeb  address 0x%x  val 0x%x  \n", addr, data);
>  +#endif
>  +if (addr == 0x3f0)
>  +{

CODING_STYLE.

>  +superio_conf->index = data & 0xff;
>  +}
>  +else
>  +{
>  +/*0x3f1 */
>  +switch (superio_conf->index)
>  +{
>  +case 0x00 ... 0xdf:
>  +case 0xe4:
>  +case 0xe5:
>  +case 0xe9 ... 0xed:
>  +case 0xf3:
>  +case 0xf5:
>  +case 0xf7:
>  +case 0xf9 ... 0xfb:
>  +case 0xfd ... 0xff:
>  +can_write = 0;
>  +break;
>  +default:
>  +can_write = 1;
>  +
>  +if (can_write)
>  +{
>  +switch (superio_conf->index)
>  +{
>  +case 0xe7:
>  +if ((data & 0xff) != 0xfe)
>  +{
>  +#ifdef DEBUG
>  +printf("chage uart 1 base. unsupported yet \n");
>  +#endif
>  +}
>  +break;
>  +case 0xe8:
>  +if ((data & 0xff) != 0xbe)
>  +{
>  +#ifdef DEBUG
>  +printf("chage uart 2 base. unsupported yet \n");
>  +#endif
>  +}
>  +break;
>  +
>  +default:
>  +superio_conf->config[superio_conf->index] = data & 0xff;
>  +}
>  +}
>  +
>  +}
>  +
>  +superio_conf->config[superio_conf->in

Re: [Qemu-devel] [RFC][MIPS][PATCH 1/6] Initial support of bonito north bridge used by fulong mini pc

2010-05-10 Thread Blue Swirl
On 5/9/10, chen huacai  wrote:
> This patch adds initial support of bonito north bridge used by fulong mini pc
>
>  Signed-off-by: Huacai Chen 
>  -
>  diff --git a/Makefile.target b/Makefile.target
>  index c092900..fc4c59f 100644
>  --- a/Makefile.target
>  +++ b/Makefile.target
>  @@ -218,7 +218,7 @@ obj-mips-y = mips_r4k.o mips_jazz.o mips_malta.o mips_mipssim.o
>   obj-mips-y += mips_addr.o mips_timer.o mips_int.o
>   obj-mips-y += dma.o vga.o i8259.o
>   obj-mips-y += g364fb.o jazz_led.o
>  -obj-mips-y += gt64xxx.o pckbd.o mc146818rtc.o
>  +obj-mips-y += gt64xxx.o bonito.o pckbd.o mc146818rtc.o
>   obj-mips-y += piix4.o cirrus_vga.o
>
>   obj-microblaze-y = petalogix_s3adsp1800_mmu.o
>  diff --git a/hw/bonito.c b/hw/bonito.c
>  new file mode 100644
>  index 000..7d1c8eb
>  --- /dev/null
>  +++ b/hw/bonito.c
>  @@ -0,0 +1,921 @@
>  +/*
>  + * bonito north bridge support
>  + *
>  + * Copyright (c) 2008 yajin (ya...@vm-kernel.org)
>  + * Copyright (c) 2010 Huacai Chen (zltjiang...@gmail.com)
>  + *
>  + * This code is licensed under the GNU GPL v2.
>  + */
>  +
>  +/*
>  +fulong 2e mini pc has a bonito north bridge.

Please add '*' before fulong.

Links to chipset docs would be nice.

>  +*/
>  +#include 
>  +
>  +#include "hw.h"
>  +#include "mips.h"
>  +#include "pci.h"
>  +#include "pc.h"
>  +
>  +
>  +typedef target_phys_addr_t pci_addr_t;
>  +#include "pci_host.h"
>  +
>  +//#define DEBUG

Please use a more specific name, like DEBUG_BONITO.

>  +#ifdef DEBUG
>  +#define dprintf(fmt, ...) fprintf(stderr, "%s: " fmt, __FUNCTION__, ##__VA_ARGS__)
>  +#define PCI_DPRINTF(fmt, ...) \
>  +do { printf("pci_host_data: " fmt , ## __VA_ARGS__); } while (0)
>  +#else
>  +#define dprintf(fmt, ...)
>  +#define PCI_DPRINTF(fmt, ...)
>  +#endif

I think this macro should be named just DPRINTF.
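
For illustration, a device-specific debug switch combined with a DPRINTF
macro could look roughly like this. It is a sketch, not necessarily the form
the tree ends up with; DEBUG_BONITO_ENABLED is a made-up helper name, and
"##__VA_ARGS__" is a GNU extension accepted by gcc and clang.

```c
#include <stdio.h>

/* Uncomment to enable debug output for this device model only. */
/* #define DEBUG_BONITO */

#ifdef DEBUG_BONITO
#define DEBUG_BONITO_ENABLED 1
#else
#define DEBUG_BONITO_ENABLED 0
#endif

/* The "if" keeps the format string and arguments visible to the compiler
   even when debugging is off, so they cannot silently bit-rot; the call
   itself is never executed in that case. */
#define DPRINTF(fmt, ...) \
    do { \
        if (DEBUG_BONITO_ENABLED) { \
            fprintf(stderr, "%s: " fmt, __func__, ##__VA_ARGS__); \
        } \
    } while (0)
```

A call such as DPRINTF("addr 0x%x\n", addr); then prints nothing unless
DEBUG_BONITO is defined when the file is compiled.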

>  +
>  +/*from linux source code. include/asm-mips/mips-boards/bonito64.h*/
>  +#define BONITO_BOOT_BASE       0x1fc0
>  +#define BONITO_BOOT_SIZE       0x0010
>  +#define BONITO_BOOT_TOP        (BONITO_BOOT_BASE+BONITO_BOOT_SIZE-1)
>  +#define BONITO_FLASH_BASE      0x1c00
>  +#define BONITO_FLASH_SIZE      0x0300
>  +#define BONITO_FLASH_TOP       (BONITO_FLASH_BASE+BONITO_FLASH_SIZE-1)
>  +#define BONITO_SOCKET_BASE     0x1f80
>  +#define BONITO_SOCKET_SIZE     0x0040
>  +#define BONITO_SOCKET_TOP      (BONITO_SOCKET_BASE+BONITO_SOCKET_SIZE-1)
>  +#define BONITO_REG_BASE        0x1fe0
>  +#define BONITO_REG_SIZE        0x0004
>  +#define BONITO_REG_TOP         (BONITO_REG_BASE+BONITO_REG_SIZE-1)
>  +#define BONITO_DEV_BASE        0x1ff0
>  +#define BONITO_DEV_SIZE        0x0010
>  +#define BONITO_DEV_TOP         (BONITO_DEV_BASE+BONITO_DEV_SIZE-1)
>  +#define BONITO_PCILO_BASE      0x1000
>  +#define BONITO_PCILO_BASE_VA   0xb000
>  +#define BONITO_PCILO_SIZE      0x0c00
>  +#define BONITO_PCILO_TOP       (BONITO_PCILO_BASE+BONITO_PCILO_SIZE-1)
>  +#define BONITO_PCILO0_BASE     0x1000
>  +#define BONITO_PCILO1_BASE     0x1400
>  +#define BONITO_PCILO2_BASE     0x1800
>  +#define BONITO_PCIHI_BASE      0x2000
>  +#define BONITO_PCIHI_SIZE      0x2000
>  +#define BONITO_PCIHI_TOP       (BONITO_PCIHI_BASE+BONITO_PCIHI_SIZE-1)
>  +#define BONITO_PCIIO_BASE      0x1fd0
>  +#define BONITO_PCIIO_BASE_VA   0xbfd0
>  +#define BONITO_PCIIO_SIZE      0x0001
>  +#define BONITO_PCIIO_TOP       (BONITO_PCIIO_BASE+BONITO_PCIIO_SIZE-1)
>  +#define BONITO_PCICFG_BASE     0x1fe8
>  +#define BONITO_PCICFG_SIZE     0x0008
>  +#define BONITO_PCICFG_TOP      (BONITO_PCICFG_BASE+BONITO_PCICFG_SIZE-1)
>  +
>  +
>  +
>  +#define BONITO_PCICONFIGBASE   0x00
>  +#define BONITO_REGBASE 0x100
>  +
>  +#define BONITO_PCICONFIG_BASE  (BONITO_PCICONFIGBASE+BONITO_REG_BASE)
>  +#define BONITO_PCICONFIG_SIZE  (0x100)
>  +
>  +#define BONITO_INTERNAL_REG_BASE  (BONITO_REGBASE+BONITO_REG_BASE)
>  +#define BONITO_INTERNAL_REG_SIZE  (0x70)
>  +
>  +#define BONITO_SPCICONFIG_BASE  (BONITO_PCICFG_BASE)
>  +#define BONITO_SPCICONFIG_SIZE  (BONITO_PCICFG_SIZE)
>  +
>  +
>  +
>  +/* 1. Bonito h/w Configuration */
>  +/* Power on register */
>  +
>  +#define BONITO_BONPONCFG   ( 0x00>>2)  /*0x100 */

Extra space between '(' and '0x00'. There should be spaces between
'0x00', '>>' and '2' as well as between '/*' and '0x100'.

In general, please pay attention to spaces, consistent use (and also
leaving the spaces out consistently) makes the code much easier to
read.

>  +#define BONITO_BONGENCFG_OFFSET 0x4
>  +#define BONITO_BONGENCFG   ( BONITO_BONGENCFG_OFFSET>>2)   /*0x104 */
>  +
>  +/* 2. IO & IDE configuration */
>  +#define BONITO

[Qemu-devel] [RFC] default mac address issue

2010-05-10 Thread Bruce Rogers
I know this behavior has worked this way all along, but I wanted to bring up 
the following concern and float a few ideas about possible solutions. Please 
provide your perspective, opinion, etc.

qemu (or qemu-kvm) users can easily get into trouble when they don't specify 
the mac address for their vm's nic and don't realize that multiple vm's running 
this way on the same network segment are colliding, since they all get a 
default mac address that is the same. They may be under the assumption that a 
random mac would be the default, as in many higher level tools for vm creation.

Does it make sense to do any of the following:

1) have qemu print a warning to stdout/stderr that the default mac address is 
being used and that it will interfere with other vms running the same way on a 
common network segment

2) what about changing the default behavior to randomizing the mac, and provide 
the legacy behavior with "-net nic,macaddr=default" or just "-use-default-mac"

(or, as a flip side to #2):

3) to at least make it easy for people to get around the problem, and just
use qemu directly (without additional tools to launch qemu), add an option such 
as "-net nic,macaddr=randomize" or "-use-random-mac" which randomizes the mac 
for you each time the machine is brought up, and hence avoids possible collisions.

Of course having the same vm come up with a different mac each time is an issue 
for some guest os's, but for some usages, that is not a problem.

I imagine the consensus would be that the current behavior is appropriate, and 
it's up to management tools to do the randomization, but I think there is 
perhaps a place for randomization within qemu itself.  I'm interested to hear 
what you think.

Bruce
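
If randomization were done inside qemu, the generated address would at least
have to be a valid unicast, locally administered MAC. A minimal sketch of
that one step, assuming the caller supplies six random bytes from wherever it
likes; make_local_mac() is a hypothetical name, not an existing qemu
function:

```c
#include <stddef.h>
#include <stdint.h>

/* Turn six arbitrary bytes into a unicast, locally administered MAC. */
static inline void make_local_mac(uint8_t mac[6], const uint8_t random_bytes[6])
{
    size_t i;

    for (i = 0; i < 6; i++) {
        mac[i] = random_bytes[i];
    }
    mac[0] &= (uint8_t)~0x01;   /* clear the multicast (I/G) bit */
    mac[0] |= 0x02;             /* set the locally administered (U/L) bit */
}
```

This says nothing about keeping the address stable across runs of the same
vm; that would still need the address to be stored or derived from something
persistent.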





Re: [Qemu-devel] Re: sparc64 lazy conditional codes evaluation

2010-05-10 Thread Blue Swirl
On 5/10/10, Mark Cave-Ayland  wrote:
> Blue Swirl wrote:
>
>
> > Thanks a lot, with this patch my tests passed! I applied the combined
> > patch.
>
>  Yes, I definitely see an improvement with this patch - at least my Debian
>  lenny SPARC boot cd doesn't randomly kernel panic any more. It looks as if
>  it now just can't find /init which could just be due to an incorrect device
>  mapping somewhere.
>
> > I also did a bit of refactoring to get the original Sparc64 issue fixed.
>
>  However, one thing I did notice is that this does introduce a noticeable
>  performance penalty. With OpenBIOS SVN head I see the following:
>
>  With commit 72139e83a98eba2bfed2dbc2db2818fb19e47ca0 (just
>  before the changes):
>
>  [   59.225406] Failed to execute /init
>  [   59.304088] Kernel panic - not syncing: No init found.  Try passing init= option to kernel.
>  [   59.450313] Press Stop-A (L1-A) to return to the boot prom
>
>  With commit 5a834bb47c373e887de5210b7ceae96e1ef413f7 (just
>  after the changes):
>
>  [   70.384466] Failed to execute /init
>  [   70.474804] Kernel panic - not syncing: No init found.  Try passing init= option to kernel.
>
>
>  So while it's technically correct, it seems to have added ~15% overhead to
>  the emulation :(

Guest time can be unreliable, it could also indicate that Linux
executes a lot more timer interrupts. Could you retest and measure the
wall clock time?

I think the C flag change should only increase performance. The next
commit may have negative effects because more work is done every
interrupt, but it's also more correct now.



[Qemu-devel] Re: [PATCH 1/2] Pad iommu with an empty slot (necessary for SunOS 4.1.4)

2010-05-10 Thread Blue Swirl
On 5/10/10, Artyom Tarasenko  wrote:
> 2010/5/9 Blue Swirl :
>  > On 5/9/10, Artyom Tarasenko  wrote:
>  >> 2010/5/9 Blue Swirl :
>  >>
>  >> > On 5/8/10, Artyom Tarasenko  wrote:
>  >>  >> On the real hardware (SS-5, LX) the MMU is not padded, but aliased.
>  >>  >>  Software shouldn't use aliased addresses, neither should it crash
>  >>  >>  when it uses (on the real hardware it wouldn't). Using empty_slot
>  >>  >>  instead of aliasing can help with debugging such accesses.
>  >>  >
>  >>  > TurboSPARC Microprocessor User's Manual shows that there are
>  >>  > additional pages after the main IOMMU for AFX registers. So this is
>  >>  > not board specific, but depends on CPU/IOMMU versions.
>  >>
>  >>
>  >> I checked it on the real hw: on LX and SS-5 these are aliased MMU addresses.
>  >>  SS-20 doesn't have any aliasing.
>  >
>  > But are your machines equipped with TurboSPARC or some other CPU?
>
>
> Good point, I must confess, I missed the word "Turbo" in your first
>  answer. LX and SS-20 don't.
>  But SS-5 must have a TurboSPARC CPU:
>
>  ok cd /FMI,MB86904
>  ok .attributes
>  context-table         00 00 00 00 03 ff f0 00 00 00 10 00
>  psr-implementation
>  psr-version           0004
>  implementation
>  version               0004
>  cache-line-size       0020
>  cache-nlines          0200
>  page-size             1000
>  dcache-line-size      0010
>  dcache-nlines         0200
>  dcache-associativity  0001
>  icache-line-size      0020
>  icache-nlines         0200
>  icache-associativity  0001
>  ncaches               0002
>  mmu-nctx              0100
>  sparc-version         0008
>  mask_rev              0026
>  device_type           cpu
>  name                  FMI,MB86904
>
>  and still it behaves the same as TI,TMS390S10 from the LX. This is done on SS-5:
>
>  ok 1000 20 spacel@ .
>  409
>  ok 1400 20 spacel@ .
>  409
>  ok 1404 20 spacel@ .
>  23000
>  ok 1f04 20 spacel@ .
>  23000
>  ok 1008 20 spacel@ .
>  409
>  ok 1428 20 spacel@ .
>  409
>  ok 100c 20 spacel@ .
>  23000
>  ok 1010 20 spacel@ .
>  409
>
>
>  LX is the same except for the IOMMU-version:
>
>  ok 1000 20 spacel@ .
>  405
>  ok 1400 20 spacel@ .
>  405
>  ok 1800 20 spacel@ .
>  405
>  ok 1f00 20 spacel@ .
>  405
>  ok 1ff0 20 spacel@ .
>  405
>  ok 1fff0004 20 spacel@ .
>  1fe000
>  ok 1004 20 spacel@ .
>  1fe000
>  ok 1108 20 spacel@ .
>  4105
>  ok 1040 20 spacel@ .
>  4105
>  ok 1fff0040 20 spacel@ .
>  4105
>  ok 1fff0044 20 spacel@ .
>  1fe000
>  ok 1fff0024 20 spacel@ .
>  1fe000
>
>
>  >>  At what address are the additional AFX registers located?
>  >
>  > Here's complete TurboSPARC IOMMU address map:
>  > PA[30:0]    Register                     Access
>  > 1000_       IOMMU Control                R/W
>  > 1000_0004   IOMMU Base Address           R/W
>  > 1000_0014   Flush All IOTLB Entries      W
>  > 1000_0018   Address Flush                W
>  > 1000_1000   Asynchronous Fault Status    R/W
>  > 1000_1004   Asynchronous Fault Address   R/W
>  > 1000_1010   SBus Slot Configuration 0    R/W
>  > 1000_1014   SBus Slot Configuration 1    R/W
>  > 1000_1018   SBus Slot Configuration 2    R/W
>  > 1000_101C   SBus Slot Configuration 3    R/W
>  > 1000_1020   SBus Slot Configuration 4    R/W
>  > 1000_1050   Memory Fault Status          R/W
>  > 1000_1054   Memory Fault Address         R/W
>  > 1000_2000   Module Identification        R/W
>  > 1000_3018   Mask Identification          R
>  > 1000_4000   AFX Queue Level              W
>  > 1000_6000   AFX Queue Level              R
>  > 1000_7000   AFX Queue Status             R
>
>
>
> But if I read it correctly 0x12fff294 (which makes SunOS crash with -m 32) is
>  well above this limit.

Oh, so I also misread something. You are not talking about the
adjacent pages, but 16MB increments.

Earlier I sent a patch for a generic address alias device, would it be
useful for this?

Maybe we have a general design problem, perhaps unassigned access
faults should only be triggered inside SBus slots and ignored
elsewhere. If this is true, generic Sparc32 unassigned access handler
should just ignore the access and special fault generating slots
should be installed for empty SBus address ranges.
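
The aliasing discussed above can be modelled by folding the access offset
into the register window before the lookup. This is only a minimal sketch
under stated assumptions: ALIAS_WINDOW, alias_offset() and alias_read() are
hypothetical names and sizes chosen for illustration, not code from
hw/iommu.c or from the alias device patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: size in bytes of the register block that the hardware
 * actually decodes; accesses beyond it wrap back into this window. */
#define ALIAS_WINDOW 0x20u

/* Fold an offset from the start of the MMIO region into the window. */
static inline uint32_t alias_offset(uint32_t offset)
{
    return offset & (ALIAS_WINDOW - 1);
}

/* Read a 32-bit register through the alias. */
static inline uint32_t alias_read(const uint32_t *regs, uint32_t offset)
{
    /* Masking is only a valid modulo for power-of-two window sizes. */
    assert((ALIAS_WINDOW & (ALIAS_WINDOW - 1)) == 0);
    return regs[alias_offset(offset) >> 2];
}
```

With such a decode, every address that differs only in the bits above the
window reads the same register, which matches the kind of repetition seen
in the spacel@ dumps.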

>  >>  > One approach would be that IOMMU_NREGS would be increased to cover
>  >>  > these registers (with the bump in savevm version field) and
>  >>  > iommu_init1() should check the version field to see how much MMIO to
>  >>  > provide.
>  >>
>  >>
>  >> The problem I see here is that we already have too many registers: we
>  >>  emulate SS-20 IOMMU (I guess), while SS-5 and LX seem to have only
>  >>  0x20 registers which are aliased all the way.
>  >>
>  >>
>  >>  > But in order to avoid the savevm version change, iommu_init1() could
>  >>  > just install dummy MMIO (i

[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Cam Macdonell
On Mon, May 10, 2010 at 11:52 AM, Anthony Liguori  wrote:
> On 05/10/2010 12:43 PM, Cam Macdonell wrote:
>>
>> On Mon, May 10, 2010 at 11:25 AM, Anthony Liguori
>>  wrote:
>>
>>>
>>> On 05/10/2010 11:59 AM, Avi Kivity wrote:
>>>

 On 05/10/2010 06:38 PM, Anthony Liguori wrote:

>
>
>>>
>>> Otherwise, if the BAR is allocated during initialization, I would
>>> have
>>> to use MAP_FIXED to mmap the memory.  This is what I did before the
>>> qemu_ram_mmap() function was added.
>>>
>>
>> What would happen to any data written to the BAR before the
>> handshake completed?  I think it would disappear.
>>
>
> You don't have to do MAP_FIXED.  You can allocate a ram area and map
> that
> in when disconnected.  When you connect, you create another ram area
> and
> memcpy() the previous ram area to the new one.  You then map the second
> ram
> area in.
>

 But it's a shared memory area.  Other peers could have connected and
 written some data in.  The memcpy() would destroy their data.

>>>
>>> Why try to attempt to support multi-master shared memory?  What's the
>>> use-case?
>>>
>>
>> I don't see it as multi-master, but that the latest guest to join
>> shouldn't have their contents take precedence.  In developing this
>> patch, my motivation has been to let the guests decide.  If the memcpy
>> is always done, even when no data is written, a guest cannot join
>> without overwriting everything.
>>
>> One use case we're looking at is running a map reduce framework
>> like Hadoop or Phoenix in VMs.  However, if a
>> workqueue is stored or data transfer passes through shared memory, a
>> system can't scale up the number of workers because each new guest
>> will erase the shared memory (and the workqueue or in progress data
>> transfer).
>>
>
> (Replying again to list)

Sorry about that.

> What data structure would you use?  For a lockless ring queue, you can only
> support a single producer and consumer.  To achieve bidirectional
> communication in virtio, we always use two queues.

MCS locks can work with multiple producer/consumers, either with busy
waiting or using the doorbell mechanism.

>
> If you're adding additional queues to support other levels of communication,
> you can always use different areas of shared memory.

True, and my point is simply that the memcpy would wipe those all out.

>
> I guess this is the point behind the doorbell mechanism?

Yes.

>
> Regards,
>
> Anthony Liguori
>
>> In cases where the latest guest to join wants to clear the memory, it
>> can do so without the automatic memcpy.  The guest can do a memset
>> once it knows the memory is attached.  My opinion is to leave it to
>> the guests and the application that is using the shared memory to
>> decide what to do on guest joins.
>>
>> Cam
>>
>>
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>>
>>>
>
>  From the guest's perspective, it's totally transparent.  For the
> backend,
> I'd suggest having an explicit "initialized" ack or something so that
> it
> knows that the data is now mapped to the guest.
>

  From the peers' perspective, it's non-transparent :(

 Also it doubles the transient memory requirement.


>
> If you're doing just a ring queue in shared memory, it should allow
> disconnect/reconnect during live migration asynchronously to the actual
> qemu
> live migration.
>
>

 Live migration of guests using shared memory is interesting.  You'd need
 to freeze all peers on one node, disconnect, reconnect, and restart them
 on
 the other node.


>>>
>>>
>>
>>
>
>



[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Anthony Liguori

On 05/10/2010 12:43 PM, Cam Macdonell wrote:

On Mon, May 10, 2010 at 11:25 AM, Anthony Liguori  wrote:
   

On 05/10/2010 11:59 AM, Avi Kivity wrote:
 

On 05/10/2010 06:38 PM, Anthony Liguori wrote:
   
 

Otherwise, if the BAR is allocated during initialization, I would have
to use MAP_FIXED to mmap the memory.  This is what I did before the
qemu_ram_mmap() function was added.
 

What would happen to any data written to the BAR before the
handshake completed?  I think it would disappear.
   

You don't have to do MAP_FIXED.  You can allocate a ram area and map that
in when disconnected.  When you connect, you create another ram area and
memcpy() the previous ram area to the new one.  You then map the second ram
area in.
 

But it's a shared memory area.  Other peers could have connected and
written some data in.  The memcpy() would destroy their data.
   

Why try to attempt to support multi-master shared memory?  What's the
use-case?
 

I don't see it as multi-master, but that the latest guest to join
shouldn't have their contents take precedence.  In developing this
patch, my motivation has been to let the guests decide.  If the memcpy
is always done, even when no data is written, a guest cannot join
without overwriting everything.

One use case we're looking at is running a map reduce framework
like Hadoop or Phoenix in VMs.  However, if a
workqueue is stored or data transfer passes through shared memory, a
system can't scale up the number of workers because each new guest
will erase the shared memory (and the workqueue or in progress data
transfer).
   


(Replying again to list)

What data structure would you use?  For a lockless ring queue, you can 
only support a single producer and consumer.  To achieve bidirectional 
communication in virtio, we always use two queues.
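
The single-producer, single-consumer restriction mentioned above is easiest
to see in code. What follows is a minimal sketch of such a ring using C11
atomics, not the virtio ring layout; struct spsc_ring, RING_SLOTS and the
function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8u   /* hypothetical capacity; must be a power of two */

struct spsc_ring {
    _Atomic uint32_t head;          /* advanced only by the consumer */
    _Atomic uint32_t tail;          /* advanced only by the producer */
    uint32_t slots[RING_SLOTS];
};

static void ring_init(struct spsc_ring *r)
{
    static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0, "power of two");
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side: returns false when the ring is full. */
static bool ring_push(struct spsc_ring *r, uint32_t value)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == RING_SLOTS) {
        return false;
    }
    r->slots[tail & (RING_SLOTS - 1)] = value;
    /* Publish the slot before the consumer can observe the new tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_pop(struct spsc_ring *r, uint32_t *value)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail) {
        return false;
    }
    *value = r->slots[head & (RING_SLOTS - 1)];
    /* Hand the slot back to the producer. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

Each index has exactly one writer, which is why no lock is needed and also
why a second producer or consumer would break it. The ring stores indices
rather than pointers, so the same layout would remain meaningful if the
structure sat in memory mapped at different addresses by different peers.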


If you're adding additional queues to support other levels of 
communication, you can always use different areas of shared memory.


I guess this is the point behind the doorbell mechanism?

Regards,

Anthony Liguori


In cases where the latest guest to join wants to clear the memory, it
can do so without the automatic memcpy.  The guest can do a memset
once it knows the memory is attached.  My opinion is to leave it to
the guests and the application that is using the shared memory to
decide what to do on guest joins.

Cam

   

Regards,

Anthony Liguori

 

 From the guest's perspective, it's totally transparent.  For the backend,
I'd suggest having an explicit "initialized" ack or something so that it
knows that the data is now mapped to the guest.
 

 From the peers' perspective, it's non-transparent :(

Also it doubles the transient memory requirement.

   

If you're doing just a ring queue in shared memory, it should allow
disconnect/reconnect during live migration asynchronously to the actual qemu
live migration.

 

Live migration of guests using shared memory is interesting.  You'd need
to freeze all peers on one node, disconnect, reconnect, and restart them on
the other node.

   


 

   





Re: [Qemu-devel] [PATCH 07/22] qemu-error: Introduce get_errno_string()

2010-05-10 Thread Markus Armbruster
Anthony Liguori  writes:

> On 04/21/2010 03:28 AM, Daniel P. Berrange wrote:
>> On Tue, Apr 20, 2010 at 06:09:37PM -0300, Luiz Capitulino wrote:
[...]
>> Wouldn't it be nicer to return strerror_r()  output instead of errno
>> names ?
>>
>
> Both are equally wrong :-)
>
> QMP should insult users from underlying platform quirks.  We should
> translate errnos to appropriate QMP error types.

While I find QMP's gold-plated errors pretty insulting, too, I wouldn't
go so far as to say QMP *should* insult its users.

SCNR



Re: [Qemu-devel] [PATCH 1/2] qemu-error: Introduce get_errno_name()

2010-05-10 Thread Markus Armbruster
Luiz Capitulino  writes:

> On Tue, 04 May 2010 16:56:19 -0500
> Anthony Liguori  wrote:
>
>> On 05/04/2010 03:30 PM, Luiz Capitulino wrote:
>> >
>> >   StateVmSaveFailed is not like CommandFailed, there are five errors
>> > in do_savevm() and StateVmSaveFailed happens to be one of them.
>> >
>> >   But I understand what you mean and I have considered doing something
>> > like it, one of the problems though is that I'm not sure 'source' is
>> > enough to determine where the error has happened.
>> >
>> >   Consider do_savevm() again. We have three 'operations' that might
>> > fail: delete an existing snapshot, save the VM state and create the
>> > snapshot. All those operations can return -EIO as an error.
>> >
>> 
>> Maybe those three operations should return distinct errnos?
>
>  I don't think this is possible, as we would have to guarantee that no
> function called by a handler returns the same errno.
>
>  Take the block layer as an example. Most block driver handlers check
> whether the driver really exists (-ENOMEDIUM) and whether it supports the
> operation being requested (-ENOTSUP).
>
>  How can we have unique errnos in this case?
>
>  Also remember that we're only talking about the surface. The call
> chain is deep. We have almost a hundred handlers, they use functions
> from almost all subsystems.
>
>> That way, we can make more useful QErrors.
>
>  Perhaps, the question boils down to how QErrors should be done.
>
>  Today, we're doing it like this. Consider handler foo(), which does the following:
>
>   1. connect somewhere
>   2. send some data
>   3. close
>
>  All operations performed can fail and we want to report that. Currently,
> afaiu we're doing the following (Markus correct me if I'm wrong):
>
>   1. ConnectFailed
>   2. SendFailed
>   3. CloseFailed
>
>  An obvious problem is that we don't say why it has failed. But this is
> what errno is for and I thought we could use it someway. The advantage
> of this approach is that, errors are high-level. It's easy to identify
> what went wrong and we can have very good error messages for them.
>
>  Now, if I got it right, you're suggesting we should do:
>
>   1. BadFileDescriptor, Interrupted, NoPermission ...
>  (or anything connect() may return)
>   2. IOError ...
>   3. IOError, BadFileDescriptor
>
>  This makes sense, but if I'm a user (or a QMP client) I don't want this:
>
> (qemu) savevm foobar
> Bad file descriptor
>
>  I'd prefer this instead:
>
> (qemu) savevm foobar
> Can't delete 'foobar': Bad file descriptor
>
>  Or:
>
> (qemu) savevm foobar
> Can't save VM state: I/O Error

These are of the form "what failed: how it failed".

Aside: not mentioned, because it's obvious for monitor commands: where
it failed.  error_report() adds that information, but only when it's not
obvious, see error_print_loc().

We keep these parts separate because squashing them together into a
single QError class multiplies the classes for no good reason, and makes
it needlessly hard to group errors.  E.g. sometimes we care only about
"what failed".  Then a squashed class is most unwelcome, because to
recognize the "what failed" part you have to know all the squashed
classes, hence all possible how-it-faileds, including the ones that will
be invented after your program ships.
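
To make the separation concrete, here is a minimal sketch that keeps "what
failed" and "how it failed" as two fields and only joins them when a message
is printed. It is not the QError implementation; struct op_error and the
helper names are hypothetical, and the reason table is deliberately tiny.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error record: the two parts are stored separately. */
struct op_error {
    const char *what;   /* which operation failed */
    int how;            /* errno-style reason, positive or negative */
};

/* Small illustrative table; a real one would cover many more codes. */
static const char *how_text(int err)
{
    switch (err < 0 ? -err : err) {
    case EIO:   return "I/O error";
    case EBADF: return "Bad file descriptor";
    default:    return "unknown error";
    }
}

/* Render "what failed: how it failed"; returns snprintf()'s result. */
static int op_error_format(const struct op_error *e, char *buf, size_t len)
{
    assert(e != NULL && buf != NULL);
    return snprintf(buf, len, "%s: %s", e->what, how_text(e->how));
}

/* Group errors by "what failed" without knowing every possible reason. */
static bool op_error_same_what(const struct op_error *a,
                               const struct op_error *b)
{
    return strcmp(a->what, b->what) == 0;
}
```

A client that only cares about which operation failed can call
op_error_same_what() and keeps working when new reasons are added later,
which is the property a squashed class would lose.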

[...]



[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Cam Macdonell
On Mon, May 10, 2010 at 11:25 AM, Anthony Liguori  wrote:
> On 05/10/2010 11:59 AM, Avi Kivity wrote:
>>
>> On 05/10/2010 06:38 PM, Anthony Liguori wrote:
>>>
> Otherwise, if the BAR is allocated during initialization, I would have
> to use MAP_FIXED to mmap the memory.  This is what I did before the
> qemu_ram_mmap() function was added.

 What would happen to any data written to the BAR before the
 handshake completed?  I think it would disappear.
>>>
>>> You don't have to do MAP_FIXED.  You can allocate a ram area and map that
>>> in when disconnected.  When you connect, you create another ram area and
>>> memcpy() the previous ram area to the new one.  You then map the second ram
>>> area in.
>>
>> But it's a shared memory area.  Other peers could have connected and
>> written some data in.  The memcpy() would destroy their data.
>
> Why try to attempt to support multi-master shared memory?  What's the
> use-case?

I don't see it as multi-master, but that the latest guest to join
shouldn't have their contents take precedence.  In developing this
patch, my motivation has been to let the guests decide.  If the memcpy
is always done, even when no data is written, a guest cannot join
without overwriting everything.

One use case we're looking at is running a map reduce framework
like Hadoop or Phoenix in VMs.  However, if a
workqueue is stored or data transfer passes through shared memory, a
system can't scale up the number of workers because each new guest
will erase the shared memory (and the workqueue or in progress data
transfer).

In cases where the latest guest to join wants to clear the memory, it
can do so without the automatic memcpy.  The guest can do a memset
once it knows the memory is attached.  My opinion is to leave it to
the guests and the application that is using the shared memory to
decide what to do on guest joins.

Cam

>
> Regards,
>
> Anthony Liguori
>
>>>
>>> From the guest's perspective, it's totally transparent.  For the backend,
>>> I'd suggest having an explicit "initialized" ack or something so that it
>>> knows that the data is now mapped to the guest.
>>
>> From the peers' perspective, it's non-transparent :(
>>
>> Also it doubles the transient memory requirement.
>>
>>>
>>> If you're doing just a ring queue in shared memory, it should allow
>>> disconnect/reconnect during live migration asynchronously to the actual qemu
>>> live migration.
>>>
>>
>> Live migration of guests using shared memory is interesting.  You'd need
>> to freeze all peers on one node, disconnect, reconnect, and restart them on
>> the other node.
>>
>
>



Re: [Qemu-devel] [PATCH 1/2] qemu-error: Introduce get_errno_name()

2010-05-10 Thread Markus Armbruster
Anthony Liguori  writes:

> On 05/03/2010 08:06 AM, Markus Armbruster wrote:
>> Luiz Capitulino  writes:
>>
>>
>>> We need to expose errno in QMP, for three reasons:
>>>
>>>1. Some error handling functions print errno codes to the user,
>>>   while it's debatable whether this is good or not from a user
>>>   perspective, sometimes it's the best we can do because it's
>>>   what system calls and libraries return
>>>
>>>2. Some events (eg. BLOCK_IO_ERROR) will be made even more
>>>   complete with errno information
>>>
>>>3. It's very good for debugging
>>>
>>> So, we need a way to expose those codes in QMP. We can't just use
>>> the codes themselves because they may vary between systems.
>>>
>>> The best solution I can think of is to return the string
>>> representation of the name. For example, EIO becomes "EIO".
>>>
>>> This is what get_errno_name() does.
>>>
>>> Signed-off-by: Luiz Capitulino
>>> ---
>>>   qemu-error.c |   85 ++
>>>   qemu-error.h |    1 +
>>>   2 files changed, 86 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/qemu-error.c b/qemu-error.c
>>> index 5a35e7c..7035417 100644
>>> --- a/qemu-error.c
>>> +++ b/qemu-error.c
>>> @@ -207,3 +207,88 @@ void error_report(const char *fmt, ...)
>>>   va_end(ap);
>>>   error_printf("\n");
>>>   }
>>> +
>>> +/*
>>> + * Probably only useful for QMP
>>> + */
>>> +const char *get_errno_name(int err)
>>> +{
>>> +switch (abs(err)) {
>>> +case EPERM:
>>> +return "EPERM";
>>> +case ENOENT:
>>> +return "ENOENT";
>>>  
>> [...]
>>
>>> +case EDOM:
>>> +return "EDOM";
>>> +case ERANGE:
>>> +return "ERANGE";
>>> +case ENOMEDIUM:
>>> +return "ENOMEDIUM";
>>> +case ENOTSUP:
>>> +return "ENOTSUP";
>>> +default:
>>> +return "unknown";
>>>  
>> How did you choose the codes to implement?  POSIX has many more...
>>
>
> Let me explain another way why I think this is a bad path to go down.
>
> In general, we could never just pass errno straight through.  Different
> host platforms are going to generate different errno values so we
> really need to filter and send reliable errno values so that clients
> don't have to have special code for when they're on Linux vs. AIX
> vs. Solaris.

Most instances are covered by POSIX.  The best way to deal with systems
that violate POSIX is to provide conforming replacements for the
offenders.  See gnulib (I just wish it wasn't joined at the hip to
automake).

> If we're white listing errno values, we should be able to trivially
> convert errnos to QError types via a table just like you have above.

Yes, but you'll get many tables.

Luiz encodes the information as event name + errno for events, and as
error class + errno for errors.

The same errno can mean different things for different events / errors.
Thus, if you flatten the two into a single error, you need on the order
of one table per event / error.
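
A minimal sketch of what "one table per event / error" would look like.
The table contents and every class name below are hypothetical, invented
only to show the shape of the problem; none of them exist in qerror.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct errno_map {
    int err;
    const char *flat_class;   /* hypothetical flattened error class */
};

/* Hypothetical table for a "delete snapshot" operation. */
static const struct errno_map snapshot_delete_map[] = {
    { EIO,    "SnapshotDeleteIOError" },
    { ENOENT, "SnapshotNotFound" },
};

/* Hypothetical table for a "save VM state" operation. */
static const struct errno_map vmstate_save_map[] = {
    { EIO,    "StateVmSaveIOError" },
    { ENOSPC, "StateVmSaveNoSpace" },
};

#define MAP_LEN(m) (sizeof(m) / sizeof((m)[0]))

/* Look up the flattened class for err in one context's table. */
static const char *flatten(const struct errno_map *map, size_t n, int err)
{
    assert(map != NULL);
    if (err < 0) {
        err = -err;
    }
    for (size_t i = 0; i < n; i++) {
        if (map[i].err == err) {
            return map[i].flat_class;
        }
    }
    return NULL;   /* this context has no class for that errno */
}
```

The same EIO maps to a different class in each table, so every new event or
error brings its own table along.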



[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Anthony Liguori

On 05/10/2010 11:59 AM, Avi Kivity wrote:

On 05/10/2010 06:38 PM, Anthony Liguori wrote:



Otherwise, if the BAR is allocated during initialization, I would have
to use MAP_FIXED to mmap the memory.  This is what I did before the
qemu_ram_mmap() function was added.


What would happen to any data written to the BAR before the
handshake completed?  I think it would disappear.


You don't have to do MAP_FIXED.  You can allocate a ram area and map 
that in when disconnected.  When you connect, you create another ram 
area and memcpy() the previous ram area to the new one.  You then map 
the second ram area in.


But it's a shared memory area.  Other peers could have connected and 
written some data in.  The memcpy() would destroy their data.


Why try to attempt to support multi-master shared memory?  What's the 
use-case?


Regards,

Anthony Liguori



From the guest's perspective, it's totally transparent.  For the 
backend, I'd suggest having an explicit "initialized" ack or 
something so that it knows that the data is now mapped to the guest.


From the peers' perspective, it's non-transparent :(

Also it doubles the transient memory requirement.



If you're doing just a ring queue in shared memory, it should allow 
disconnect/reconnect during live migration asynchronously to the 
actual qemu live migration.




Live migration of guests using shared memory is interesting.  You'd 
need to freeze all peers on one node, disconnect, reconnect, and 
restart them on the other node.







Re: [Qemu-devel] [RFC][MIPS][PATCH 1/6] Initial support of bonito north bridge used by fulong mini pc

2010-05-10 Thread Stefan Weil

Am 10.05.2010 13:21, schrieb chen huacai:

--- a/hw/mips.h
+++ b/hw/mips.h
@@ -5,6 +5,9 @@
 /* gt64xxx.c */
 PCIBus *pci_gt64120_init(qemu_irq *pic);

+/* bonito.c */
+PCIBus *bonito_init_2e(qemu_irq pic);
+
 /* ds1225y.c */
 void *ds1225y_init(target_phys_addr_t mem_base, const char *filename);
 void ds1225y_set_protection(void *opaque, int protection);
-



Please see my annotations above.

Kind regards,
Stefan Weil



Hi, Stefan, do you mean that I should do something like this?

#ifdef CONFIG_FULONG
/* bonito.c */
PCIBus *bonito_init_2e(qemu_irq pic);
#endif


You don't need CONFIG_FULONG here, because you may declare
bonito_init_2e even if it is not used.

By the way: why is it called bonito_init_2e (and not bonito_2e_init)?



I found that even if I put CONFIG_FULONG=y in
default-configs/mips64el-softmmu.mak, CONFIG_FULONG will not get
defined in config-target.h, because CONFIG_FULONG=y will appear in
config-device.mak, but not in config-target.mak.
Could you please give me some suggestions?


CONFIG_FULONG is only used in Makefile.target for the
object files which are only needed for fulong. You could also
use a CONFIG_XXX for each individual device XXX, for example

CONFIG_VT82C686=y (in default-configs/mips64el-softmmu.mak)
obj-mips-$(CONFIG_VT82C686) += vt82c686.o (in Makefile.target)


CONFIG_FULONG is not used in your source code,
so it is not needed in config-target.h.

Kind regards,
Stefan Weil




[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Avi Kivity

On 05/10/2010 06:38 PM, Anthony Liguori wrote:



Otherwise, if the BAR is allocated during initialization, I would have
to use MAP_FIXED to mmap the memory.  This is what I did before the
qemu_ram_mmap() function was added.


What would happen to any data written to the BAR before the
handshake completed?  I think it would disappear.


You don't have to do MAP_FIXED.  You can allocate a ram area and map 
that in when disconnected.  When you connect, you create another ram 
area and memcpy() the previous ram area to the new one.  You then map 
the second ram area in.


But it's a shared memory area.  Other peers could have connected and 
written some data in.  The memcpy() would destroy their data.




From the guest's perspective, it's totally transparent.  For the 
backend, I'd suggest having an explicit "initialized" ack or something 
so that it knows that the data is now mapped to the guest.


From the peers' perspective, it's non-transparent :(

Also it doubles the transient memory requirement.



If you're doing just a ring queue in shared memory, it should allow 
disconnect/reconnect during live migration asynchronously to the 
actual qemu live migration.




Live migration of guests using shared memory is interesting.  You'd need 
to freeze all peers on one node, disconnect, reconnect, and restart them 
on the other node.


--
error compiling committee.c: too many arguments to function




[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Anthony Liguori

On 05/10/2010 11:20 AM, Cam Macdonell wrote:

On Mon, May 10, 2010 at 9:38 AM, Anthony Liguori  wrote:
   

On 05/10/2010 10:28 AM, Avi Kivity wrote:
 

On 05/10/2010 06:22 PM, Cam Macdonell wrote:
   
 
   

+
+/* if the position is -1, then it's shared memory region fd */
+if (incoming_posn == -1) {
+
+s->num_eventfds = 0;
+
+if (check_shm_size(s, incoming_fd) == -1) {
+exit(-1);
+}
+
+/* creating a BAR in qemu_chr callback may be crazy */
+create_shared_memory_BAR(s, incoming_fd);

 

It probably is... why can't you create it during initialization?
   

This is for the shared memory server implementation, so the fd for the
shared memory has to be received (over the qemu char device) from the
server before the BAR can be created via qemu_ram_mmap() which adds
the necessary memory

 


We could do the handshake during initialization.  I'm worried that the
device will appear without the BAR, and strange things will happen.  But the
chardev API is probably not geared for passing data during init.

Anthony, any ideas?
   

Why can't we create the BAR with just normal RAM and then change it to a
mmap()'d fd after initialization?  This behavior would be important
for live migration as it would let you quickly migrate preserving the memory
contents without waiting for an external program to reconnect.

Regards,

Anthony Liguori

 

Otherwise, if the BAR is allocated during initialization, I would have
to use MAP_FIXED to mmap the memory.  This is what I did before the
qemu_ram_mmap() function was added.
 

What would happen to any data written to the BAR before the handshake
completed?  I think it would disappear.
   

You don't have to do MAP_FIXED.  You can allocate a ram area and map that in
when disconnected.  When you connect, you create another ram area and
memcpy() the previous ram area to the new one.  You then map the second ram
area in.
 

the memcpy() would overwrite the contents of the shared memory each
time a guest joins which would be dangerous.
   


I think those are reasonable semantics, and they are really the only way to
get guest-transparent reconnect.  The latter is pretty critical for
guest-transparent live migration.
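
The scheme under discussion (placeholder RAM while disconnected, then copy
into the shared mapping on connect) can be sketched with plain POSIX mmap().
This is only an illustration under assumptions: the helper names are
hypothetical, it does not use qemu_ram_mmap() or any other QEMU API, and it
leaves out remapping the BAR itself.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on strict C toolchains */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Placeholder RAM that backs the BAR while no server is connected. */
static void *placeholder_alloc(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Map the shared memory fd received from the server; NULL on failure. */
static void *shared_map(int shm_fd, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * On connect: carry the placeholder contents over into the shared mapping
 * and drop the placeholder. The memcpy() overwrites whatever other peers
 * had already written to the shared region.
 */
static void *attach_with_copy(void *placeholder, void *shared, size_t size)
{
    assert(placeholder != NULL && shared != NULL);
    memcpy(shared, placeholder, size);
    munmap(placeholder, size);
    return shared;
}
```

The comment in attach_with_copy() is where the two positions in this thread
meet: the copy keeps early guest writes, and it is also what replaces the
data of peers that joined earlier.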



 From the guest's perspective, it's totally transparent.  For the backend,
I'd suggest having an explicit "initialized" ack or something so that it
knows that the data is now mapped to the guest.
 

Yes, I think the ack is the way to go, so the guest has to be aware of
it.  Would setting a flag in the driver-specific config space be an
acceptable ack that the shared region is now mapped?
   


You know it's mapped because it's mapped when the pci map function 
returns.  You don't need the guest to explicitly tell you.


Regards,

Anthony Liguori


Cam
   





[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Cam Macdonell
On Mon, May 10, 2010 at 10:40 AM, Avi Kivity  wrote:
> On 05/10/2010 06:41 PM, Cam Macdonell wrote:
>>
>>> What would happen to any data written to the BAR before the handshake
>>> completed?  I think it would disappear.
>>>
>>
>> But, the BAR isn't there until the handshake is completed.  Only after
>> receiving the shared memory fd does my device call pci_register_bar()
>> in the callback function.  So there may be a case with BAR2 (the
>> shared memory BAR) missing during initialization.  FWIW, I haven't
>> encountered this.
>>
>
> Well, that violates PCI.  You can't have a PCI device with no BAR, then have
> a BAR appear.  It may work since the BAR is registered a lot faster than the
> BIOS is able to peek at it, but it's a race nevertheless.

Agreed.  I'll get Anthony's idea up and running.  It seems that is the
way forward.

Cam



[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Avi Kivity

On 05/10/2010 06:41 PM, Cam Macdonell wrote:



What would happen to any data written to the BAR before the handshake
completed?  I think it would disappear.
 

But, the BAR isn't there until the handshake is completed.  Only after
receiving the shared memory fd does my device call pci_register_bar()
in the callback function.  So there may be a case with BAR2 (the
shared memory BAR) missing during initialization.  FWIW, I haven't
encountered this.
   


Well, that violates PCI.  You can't have a PCI device with no BAR, then 
have a BAR appear.  It may work since the BAR is registered a lot faster 
than the BIOS is able to peek at it, but it's a race nevertheless.


--
error compiling committee.c: too many arguments to function




[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Cam Macdonell
On Mon, May 10, 2010 at 9:38 AM, Anthony Liguori  wrote:
> On 05/10/2010 10:28 AM, Avi Kivity wrote:
>>
>> On 05/10/2010 06:22 PM, Cam Macdonell wrote:
>>>

> +
> +    /* if the position is -1, then it's shared memory region fd */
> +    if (incoming_posn == -1) {
> +
> +        s->num_eventfds = 0;
> +
> +        if (check_shm_size(s, incoming_fd) == -1) {
> +            exit(-1);
> +        }
> +
> +        /* creating a BAR in qemu_chr callback may be crazy */
> +        create_shared_memory_BAR(s, incoming_fd);
>
 It probably is... why can't you create it during initialization?
>>>
>>> This is for the shared memory server implementation, so the fd for the
>>> shared memory has to be received (over the qemu char device) from the
>>> server before the BAR can be created via qemu_ram_mmap() which adds
>>> the necessary memory
>>>
>>
>>
>> We could do the handshake during initialization.  I'm worried that the
>> device will appear without the BAR, and strange things will happen.  But the
>> chardev API is probably not geared for passing data during init.
>>
>> Anthony, any ideas?
>
> Why can't we create the BAR with just normal RAM and then change it to a
> mmap()'d fd after initialization?  This behavior would be important
> for live migration as it would let you quickly migrate preserving the memory
> contents without waiting for an external program to reconnect.
>
> Regards,
>
> Anthony Liguori
>
>>> Otherwise, if the BAR is allocated during initialization, I would have
>>> to use MAP_FIXED to mmap the memory.  This is what I did before the
>>> qemu_ram_mmap() function was added.
>>
>> What would happen to any data written to the BAR before the handshake
>> completed?  I think it would disappear.
>
> You don't have to do MAP_FIXED.  You can allocate a ram area and map that in
> when disconnected.  When you connect, you create another ram area and
> memcpy() the previous ram area to the new one.  You then map the second ram
> area in.

the memcpy() would overwrite the contents of the shared memory each
time a guest joins which would be dangerous.

>
> From the guest's perspective, it's totally transparent.  For the backend,
> I'd suggest having an explicit "initialized" ack or something so that it
> knows that the data is now mapped to the guest.

Yes, I think the ack is the way to go, so the guest has to be aware of
it.  Would setting a flag in the driver-specific config space be an
acceptable ack that the shared region is now mapped?

Cam



[Qemu-devel] Re: [PATCH] virtio: invoke set_features on load

2010-05-10 Thread David Stevens
"Michael S. Tsirkin"  wrote on 05/09/2010 09:42:09 AM:

> After migration, vhost was not getting features
> acked because set_features callback was never invoked.
> The fix is just to invoke that callback.
> 
> Reported-by: David L Stevens 
> Signed-off-by: Michael S. Tsirkin 
> ---
> 
> David, a tested-by tag would be appreciated.

Tested-by: David L Stevens 

> 
>  hw/virtio.c |2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/virtio.c b/hw/virtio.c
> index 5d686f0..74c450d 100644
> --- a/hw/virtio.c
> +++ b/hw/virtio.c
> @@ -692,6 +692,8 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
>  features, supported_features);
>  return -1;
>  }
> +if (vdev->set_features)
> +vdev->set_features(vdev, features);
>  vdev->guest_features = features;
>  vdev->config_len = qemu_get_be32(f);
>  qemu_get_buffer(f, vdev->config, vdev->config_len);
> -- 
> 1.7.1.12.g42b7f
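
For readers skimming the diff, the shape of the fix can be sketched outside
QEMU as follows.  The struct and function names are hypothetical stand-ins
(not the real VirtIODevice API), and the feature check is an assumption
about the surrounding code that the hunk only partly shows.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for a virtio device. */
struct demo_vdev {
    uint32_t guest_features;     /* features recorded by the core */
    uint32_t backend_features;   /* what the backend (e.g. vhost) was told */
    void (*set_features)(struct demo_vdev *vdev, uint32_t features);
};

static void demo_backend_set_features(struct demo_vdev *vdev, uint32_t features)
{
    vdev->backend_features = features;
}

/* On load, notify the backend (if it registered a callback) before
 * recording the features restored from the migration stream. */
static int demo_load_features(struct demo_vdev *vdev, uint32_t features,
                              uint32_t supported_features)
{
    if (features & ~supported_features) {
        return -1;               /* assumed: reject unsupported features */
    }
    if (vdev->set_features) {
        vdev->set_features(vdev, features);
    }
    vdev->guest_features = features;
    return 0;
}
```

A device without a set_features callback still loads; only the backend
notification is skipped.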




[Qemu-devel] Re: [PATCHv2] Support for booting from virtio disks

2010-05-10 Thread Gleb Natapov
On Mon, May 10, 2010 at 10:58:45AM -0500, Anthony Liguori wrote:
> On 05/10/2010 10:54 AM, Gleb Natapov wrote:
> >On Mon, May 10, 2010 at 10:48:42AM -0500, Anthony Liguori wrote:
> >>On 05/10/2010 03:11 AM, Gleb Natapov wrote:
> >>>This patch adds native support for booting from virtio disks to Seabios.
> >>>
> >>>Signed-off-by: Gleb Natapov
> >>A related problem that I think we need to think about how we solve
> >>is indicating to Seabios which device we want to boot from
> >>
> >>With your patch, a user can select a virtio device explicitly or if
> >>they use only one virtio device, it will Just Work.
> >>
> >>However, if a user uses IDE and virtio, or a user has multiple
> >>disks, they cannot select a device via -boot.
> >>
> >Isn't this problem unrelated to this patch?  I mean if I start qemu with
> >two ide devices can I specify from qemu command line which one I want to
> >boot from?
> 
> That's sort of what I'm asking.  If you compare this approach to
> extboot, extboot provided a capability to select a disk.  I think it
> can be argued though that this isn't a necessary feature to carry
> over and I'm looking for additional opinions on that.
> 
Well, extboot is just a hack and shouldn't be used with ide disks at
all. With extboot it is not possible to switch to another disk from the
F12 menu, for instance (is it actually possible to read more than one
disk with bios int13 when extboot is in use?). As for specifying the boot
disk from the qemu command line, I think it would be very useful. It is not
clear how to pass the default boot device into Seabios though. We should
pass the bus the boot device is attached to (ide/virtio) and a unique id
of the device on the bus.

--
Gleb.



[Qemu-devel] Re: [PATCHv2] Support for booting from virtio disks

2010-05-10 Thread Anthony Liguori

On 05/10/2010 10:54 AM, Gleb Natapov wrote:
> On Mon, May 10, 2010 at 10:48:42AM -0500, Anthony Liguori wrote:
>> On 05/10/2010 03:11 AM, Gleb Natapov wrote:
>>> This patch adds native support for booting from virtio disks to Seabios.
>>>
>>> Signed-off-by: Gleb Natapov
>>
>> A related problem that I think we need to think about how we solve
>> is indicating to Seabios which device we want to boot from
>>
>> With your patch, a user can select a virtio device explicitly or if
>> they use only one virtio device, it will Just Work.
>>
>> However, if a user uses IDE and virtio, or a user has multiple
>> disks, they cannot select a device via -boot.
>
> Isn't this problem unrelated to this patch?  I mean if I start qemu with
> two ide devices can I specify from qemu command line which one I want to
> boot from?

That's sort of what I'm asking.  If you compare this approach to
extboot, extboot provided a capability to select a disk.  I think it can
be argued though that this isn't a necessary feature to carry over and
I'm looking for additional opinions on that.


Regards,

Anthony Liguori




[Qemu-devel] Re: [PATCHv2] Support for booting from virtio disks

2010-05-10 Thread Gleb Natapov
On Mon, May 10, 2010 at 10:48:42AM -0500, Anthony Liguori wrote:
> On 05/10/2010 03:11 AM, Gleb Natapov wrote:
> >This patch adds native support for booting from virtio disks to Seabios.
> >
> >Signed-off-by: Gleb Natapov
> 
> A related problem that I think we need to think about how we solve
> is indicating to Seabios which device we want to boot from
> 
> With your patch, a user can select a virtio device explicitly or if
> they use only one virtio device, it will Just Work.
> 
> However, if a user uses IDE and virtio, or a user has multiple
> disks, they cannot select a device via -boot.
> 
Isn't this problem unrelated to this patch?  I mean if I start qemu with
two ide devices can I specify from qemu command line which one I want to
boot from?

> Is this something we need to address?  I don't think we'd break
> libvirt if we didn't.
> 
> Regards,
> 
> Anthony Liguori
> 
> >---
> >
> >Changelog:
> >  v1->v2:
> >   - free memory in case of vq initialization error.
> >   - change license of virtio ring/pci to LGPLv3 with permission
> > of Laurent Vivier (aka the author).
> >
> >diff --git a/Makefile b/Makefile
> >index 327a1bf..d0b8881 100644
> >--- a/Makefile
> >+++ b/Makefile
> >@@ -14,7 +14,8 @@ OUT=out/
> >  SRCBOTH=misc.c pmm.c stacks.c output.c util.c block.c floppy.c ata.c 
> > mouse.c \
> >  kbd.c pci.c serial.c clock.c pic.c cdrom.c ps2port.c smp.c 
> > resume.c \
> >  pnpbios.c pirtable.c vgahooks.c ramdisk.c pcibios.c blockcmd.c \
> >-usb.c usb-uhci.c usb-ohci.c usb-ehci.c usb-hid.c usb-msc.c
> >+usb.c usb-uhci.c usb-ohci.c usb-ehci.c usb-hid.c usb-msc.c \
> >+virtio-ring.c virtio-pci.c virtio-blk.c
> >  SRC16=$(SRCBOTH) system.c disk.c apm.c font.c
> >  SRC32FLAT=$(SRCBOTH) post.c shadow.c memmap.c coreboot.c boot.c \
> >acpi.c smm.c mptable.c smbios.c pciinit.c optionroms.c mtrr.c \
> >diff --git a/src/block.c b/src/block.c
> >index ddf441f..b6b1902 100644
> >--- a/src/block.c
> >+++ b/src/block.c
> >@@ -11,6 +11,7 @@
> >  #include "util.h" // dprintf
> >  #include "ata.h" // process_ata_op
> >  #include "usb-msc.h" // process_usb_op
> >+#include "virtio-blk.h" // process_virtio_op
> >
> >  struct drives_s Drives VAR16VISIBLE;
> >
> >@@ -289,6 +290,8 @@ process_op(struct disk_op_s *op)
> >  return process_cdemu_op(op);
> >  case DTYPE_USB:
> >  return process_usb_op(op);
> >+case DTYPE_VIRTIO:
> >+return process_virtio_op(op);
> >  default:
> >  op->count = 0;
> >  return DISK_RET_EPARAM;
> >diff --git a/src/config.h b/src/config.h
> >index b101174..ad569c6 100644
> >--- a/src/config.h
> >+++ b/src/config.h
> >@@ -136,6 +136,9 @@
> >  #define CONFIG_SUBMODEL_ID   0x00
> >  #define CONFIG_BIOS_REVISION 0x01
> >
> >+// Support boot from virtio storage
> >+#define CONFIG_VIRTIO_BLK 1
> >+
> >  // Various memory addresses used by the code.
> >  #define BUILD_STACK_ADDR  0x7000
> >  #define BUILD_S3RESUME_STACK_ADDR 0x1000
> >diff --git a/src/disk.h b/src/disk.h
> >index 0cd1b74..9e5b083 100644
> >--- a/src/disk.h
> >+++ b/src/disk.h
> >@@ -197,6 +197,7 @@ struct drive_s {
> >  #define DTYPE_RAMDISK  0x04
> >  #define DTYPE_CDEMU    0x05
> >  #define DTYPE_USB  0x06
> >+#define DTYPE_VIRTIO   0x07
> >
> >  #define MAXDESCSIZE 80
> >
> >diff --git a/src/pci_ids.h b/src/pci_ids.h
> >index 1800f1d..e1cded2 100644
> >--- a/src/pci_ids.h
> >+++ b/src/pci_ids.h
> >@@ -2605,3 +2605,6 @@
> >  #define PCI_DEVICE_ID_RME_DIGI32   0x9896
> >  #define PCI_DEVICE_ID_RME_DIGI32_PRO   0x9897
> >  #define PCI_DEVICE_ID_RME_DIGI32_8 0x9898
> >+
> >+#define PCI_VENDOR_ID_REDHAT_QUMRANET   0x1af4
> >+#define PCI_DEVICE_ID_VIRTIO_BLK        0x1001
> >diff --git a/src/post.c b/src/post.c
> >index 638b0f7..25535e2 100644
> >--- a/src/post.c
> >+++ b/src/post.c
> >@@ -23,6 +23,7 @@
> >  #include "smbios.h" // smbios_init
> >  #include "paravirt.h" // qemu_cfg_port_probe
> >  #include "ps2port.h" // ps2port_setup
> >+#include "virtio-blk.h" // virtio_blk_setup
> >
> >  void
> >  __set_irq(int vector, void *loc)
> >@@ -184,6 +185,7 @@ init_hw(void)
> >  floppy_setup();
> >  ata_setup();
> >  ramdisk_setup();
> >+virtio_blk_setup();
> >  }
> >
> >  // Main setup code.
> >diff --git a/src/virtio-blk.c b/src/virtio-blk.c
> >new file mode 100644
> >index 000..a41c336
> >--- /dev/null
> >+++ b/src/virtio-blk.c
> >@@ -0,0 +1,155 @@
> >+// Virtio block boot support.
> >+//
> >+// Copyright (C) 2010 Red Hat Inc.
> >+//
> >+// Authors:
> >+//  Gleb Natapov
> >+//
> >+// This file may be distributed under the terms of the GNU LGPLv3 license.
> >+
> >+#include "util.h" // dprintf
> >+#include "pci.h" // foreachpci
> >+#include "config.h" // CONFIG_*
> >+#include "virtio-pci.h"
> >+#include "virtio-blk.h"
> >+#include "disk.h"
> >+
> >+struct virtiodrive_s {
> >+struct drive_s drive;
> >+struct vring_virtqueue *vq;
> >+u16 ioaddr;
> >+};
> >+
> >+static int
> >+virtio_blk_read(struct disk_

Re: [Qemu-devel] qemu-system-sh4 broken again.

2010-05-10 Thread Aurelien Jarno
Shin-ichiro KAWASAKI a écrit :
> Hello, Rob,
> 
> This mail might be too late, but I want to report you that I
> encountered similar trouble.
> 
> Using the linux kernel after the following commit, the qemu-sh
> serial console shows no output.
> 
>cd5f107628ab89c5dec5ad923f1c27f4cba41972
> 
> This trouble was discussed in sh-linux ML.
> 
>http://marc.info/?l=linux-sh&m=127183863325672&w=2
> 
> To avoid this,
> 
>- add "earlyprintk=sh-sci.1" to kernel command line, and
>    - modify CONFIG_SERIAL_SH_SCI_NR_UARTS value from 1 to 2,
> 
> in kernel configuration menu.
> 
> I hope that this is the trouble you see.
> 

The main problem here is that QEMU emulates both SCI channels as serial
ports, while the real board has one of the two channels configured as an
SPI bus (for the RTC clock) and the other as a serial port.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net



[Qemu-devel] Re: [PATCHv2] Support for booting from virtio disks

2010-05-10 Thread Anthony Liguori

On 05/10/2010 03:11 AM, Gleb Natapov wrote:
> This patch adds native support for booting from virtio disks to Seabios.
>
> Signed-off-by: Gleb Natapov

A related problem that I think we need to solve is indicating to Seabios
which device we want to boot from.

With your patch, a user can select a virtio device explicitly, or if they
use only one virtio device, it will Just Work.

However, if a user uses IDE and virtio, or a user has multiple disks,
they cannot select a device via -boot.

Is this something we need to address?  I don't think we'd break libvirt
if we didn't.


Regards,

Anthony Liguori


---

Changelog:
  v1->v2:
   - free memory in case of vq initialization error.
   - change license of virtio ring/pci to LGPLv3 with permission
 of Laurent Vivier (aka the author).

diff --git a/Makefile b/Makefile
index 327a1bf..d0b8881 100644
--- a/Makefile
+++ b/Makefile
@@ -14,7 +14,8 @@ OUT=out/
  SRCBOTH=misc.c pmm.c stacks.c output.c util.c block.c floppy.c ata.c mouse.c \
  kbd.c pci.c serial.c clock.c pic.c cdrom.c ps2port.c smp.c resume.c \
  pnpbios.c pirtable.c vgahooks.c ramdisk.c pcibios.c blockcmd.c \
-usb.c usb-uhci.c usb-ohci.c usb-ehci.c usb-hid.c usb-msc.c
+usb.c usb-uhci.c usb-ohci.c usb-ehci.c usb-hid.c usb-msc.c \
+virtio-ring.c virtio-pci.c virtio-blk.c
  SRC16=$(SRCBOTH) system.c disk.c apm.c font.c
  SRC32FLAT=$(SRCBOTH) post.c shadow.c memmap.c coreboot.c boot.c \
acpi.c smm.c mptable.c smbios.c pciinit.c optionroms.c mtrr.c \
diff --git a/src/block.c b/src/block.c
index ddf441f..b6b1902 100644
--- a/src/block.c
+++ b/src/block.c
@@ -11,6 +11,7 @@
  #include "util.h" // dprintf
  #include "ata.h" // process_ata_op
  #include "usb-msc.h" // process_usb_op
+#include "virtio-blk.h" // process_virtio_op

  struct drives_s Drives VAR16VISIBLE;

@@ -289,6 +290,8 @@ process_op(struct disk_op_s *op)
  return process_cdemu_op(op);
  case DTYPE_USB:
  return process_usb_op(op);
+case DTYPE_VIRTIO:
+   return process_virtio_op(op);
  default:
  op->count = 0;
  return DISK_RET_EPARAM;
diff --git a/src/config.h b/src/config.h
index b101174..ad569c6 100644
--- a/src/config.h
+++ b/src/config.h
@@ -136,6 +136,9 @@
  #define CONFIG_SUBMODEL_ID   0x00
  #define CONFIG_BIOS_REVISION 0x01

+// Support boot from virtio storage
+#define CONFIG_VIRTIO_BLK 1
+
  // Various memory addresses used by the code.
  #define BUILD_STACK_ADDR  0x7000
  #define BUILD_S3RESUME_STACK_ADDR 0x1000
diff --git a/src/disk.h b/src/disk.h
index 0cd1b74..9e5b083 100644
--- a/src/disk.h
+++ b/src/disk.h
@@ -197,6 +197,7 @@ struct drive_s {
  #define DTYPE_RAMDISK  0x04
  #define DTYPE_CDEMU    0x05
  #define DTYPE_USB  0x06
+#define DTYPE_VIRTIO   0x07

  #define MAXDESCSIZE 80

diff --git a/src/pci_ids.h b/src/pci_ids.h
index 1800f1d..e1cded2 100644
--- a/src/pci_ids.h
+++ b/src/pci_ids.h
@@ -2605,3 +2605,6 @@
  #define PCI_DEVICE_ID_RME_DIGI32  0x9896
  #define PCI_DEVICE_ID_RME_DIGI32_PRO  0x9897
  #define PCI_DEVICE_ID_RME_DIGI32_8    0x9898
+
+#define PCI_VENDOR_ID_REDHAT_QUMRANET  0x1af4
+#define PCI_DEVICE_ID_VIRTIO_BLK   0x1001
diff --git a/src/post.c b/src/post.c
index 638b0f7..25535e2 100644
--- a/src/post.c
+++ b/src/post.c
@@ -23,6 +23,7 @@
  #include "smbios.h" // smbios_init
  #include "paravirt.h" // qemu_cfg_port_probe
  #include "ps2port.h" // ps2port_setup
+#include "virtio-blk.h" // virtio_blk_setup

  void
  __set_irq(int vector, void *loc)
@@ -184,6 +185,7 @@ init_hw(void)
  floppy_setup();
  ata_setup();
  ramdisk_setup();
+virtio_blk_setup();
  }

  // Main setup code.
diff --git a/src/virtio-blk.c b/src/virtio-blk.c
new file mode 100644
index 000..a41c336
--- /dev/null
+++ b/src/virtio-blk.c
@@ -0,0 +1,155 @@
+// Virtio block boot support.
+//
+// Copyright (C) 2010 Red Hat Inc.
+//
+// Authors:
+//  Gleb Natapov
+//
+// This file may be distributed under the terms of the GNU LGPLv3 license.
+
+#include "util.h" // dprintf
+#include "pci.h" // foreachpci
+#include "config.h" // CONFIG_*
+#include "virtio-pci.h"
+#include "virtio-blk.h"
+#include "disk.h"
+
+struct virtiodrive_s {
+struct drive_s drive;
+struct vring_virtqueue *vq;
+u16 ioaddr;
+};
+
+static int
+virtio_blk_read(struct disk_op_s *op)
+{
+struct virtiodrive_s *vdrive_g =
+container_of(op->drive_g, struct virtiodrive_s, drive);
+struct vring_virtqueue *vq = GET_GLOBAL(vdrive_g->vq);
+struct virtio_blk_outhdr hdr = {
+.type = VIRTIO_BLK_T_IN,
+.ioprio = 0,
+.sector = op->lba,
+};
+u8 status = VIRTIO_BLK_S_UNSUPP;
+struct vring_list sg[] = {
+{
+.addr  = MAKE_FLATPTR(GET_SEG(SS),&hdr),
+.length= sizeof(hdr),
+},
+{
+.addr  = op->buf_fl,
+.length= GET_GLOBAL(vdrive_g->drive.blksize) * op->count,
+},
+{
+  

[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Cam Macdonell
On Mon, May 10, 2010 at 9:28 AM, Avi Kivity  wrote:
> On 05/10/2010 06:22 PM, Cam Macdonell wrote:
>>
>>>
>>>> +
>>>> +    /* if the position is -1, then it's shared memory region fd */
>>>> +    if (incoming_posn == -1) {
>>>> +
>>>> +        s->num_eventfds = 0;
>>>> +
>>>> +        if (check_shm_size(s, incoming_fd) == -1) {
>>>> +            exit(-1);
>>>> +        }
>>>> +
>>>> +        /* creating a BAR in qemu_chr callback may be crazy */
>>>> +        create_shared_memory_BAR(s, incoming_fd);
>>>>
>>>
>>> It probably is... why can't you create it during initialization?
>>>
>>
>> This is for the shared memory server implementation, so the fd for the
>> shared memory has to be received (over the qemu char device) from the
>> server before the BAR can be created via qemu_ram_mmap() which adds
>> the necessary memory
>>
>>
>
>
> We could do the handshake during initialization.  I'm worried that the
> device will appear without the BAR, and strange things will happen.  But the
> chardev API is probably not geared for passing data during init.

More specifically, the challenge I've found is that there is no
function to tell a chardev to block and wait for the initialization
data.

>
> Anthony, any ideas?
>
>> Otherwise, if the BAR is allocated during initialization, I would have
>> to use MAP_FIXED to mmap the memory.  This is what I did before the
>> qemu_ram_mmap() function was added.
>>
>
> What would happen to any data written to the BAR before the handshake
> completed?  I think it would disappear.

But, the BAR isn't there until the handshake is completed.  Only after
receiving the shared memory fd does my device call pci_register_bar()
in the callback function.  So there may be a case with BAR2 (the
shared memory BAR) missing during initialization.  FWIW, I haven't
encountered this.

>
> So it's a good idea to make the initialization process atomic.
>
> --
> error compiling committee.c: too many arguments to function
>
>



[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Anthony Liguori

On 05/10/2010 10:28 AM, Avi Kivity wrote:
> On 05/10/2010 06:22 PM, Cam Macdonell wrote:
>>>> +
>>>> +    /* if the position is -1, then it's shared memory region fd */
>>>> +    if (incoming_posn == -1) {
>>>> +
>>>> +        s->num_eventfds = 0;
>>>> +
>>>> +        if (check_shm_size(s, incoming_fd) == -1) {
>>>> +            exit(-1);
>>>> +        }
>>>> +
>>>> +        /* creating a BAR in qemu_chr callback may be crazy */
>>>> +        create_shared_memory_BAR(s, incoming_fd);
>>>
>>> It probably is... why can't you create it during initialization?
>>
>> This is for the shared memory server implementation, so the fd for the
>> shared memory has to be received (over the qemu char device) from the
>> server before the BAR can be created via qemu_ram_mmap() which adds
>> the necessary memory
>
> We could do the handshake during initialization.  I'm worried that the
> device will appear without the BAR, and strange things will happen.
> But the chardev API is probably not geared for passing data during init.
>
> Anthony, any ideas?

Why can't we create the BAR with just normal RAM and then change it to a
mmap()'d fd after initialization?  This behavior would be important for
live migration as it would let you quickly migrate preserving the memory
contents without waiting for an external program to reconnect.

Regards,

Anthony Liguori

>> Otherwise, if the BAR is allocated during initialization, I would have
>> to use MAP_FIXED to mmap the memory.  This is what I did before the
>> qemu_ram_mmap() function was added.
>
> What would happen to any data written to the BAR before the
> handshake completed?  I think it would disappear.

You don't have to do MAP_FIXED.  You can allocate a ram area and map
that in when disconnected.  When you connect, you create another ram
area and memcpy() the previous ram area to the new one.  You then map
the second ram area in.

From the guest's perspective, it's totally transparent.  For the
backend, I'd suggest having an explicit "initialized" ack or something
so that it knows that the data is now mapped to the guest.

If you're doing just a ring queue in shared memory, it should allow
disconnect/reconnect during live migration asynchronously to the actual
qemu live migration.

Regards,

Anthony Liguori

> So it's a good idea to make the initialization process atomic.






[Qemu-devel] Re: [PATCH v5 2/5] Support adding a file to qemu's ram allocation

2010-05-10 Thread Cam Macdonell
On Mon, May 10, 2010 at 4:39 AM, Avi Kivity  wrote:
> On 04/21/2010 08:53 PM, Cam Macdonell wrote:
>>
>> This avoids the need of using qemu_ram_alloc and mmap with MAP_FIXED to
>> map a
>> host file into guest RAM.  This function mmaps the opened file anywhere
>> and adds
>> the memory to the ram blocks.
>>
>> Usage is
>>
>> qemu_ram_mmap(fd, size, MAP_SHARED, offset);
>>
>
> Signoff?
>>
>> +ram_addr_t qemu_ram_mmap(int fd, ram_addr_t size, int flags, off_t
>> offset)
>> +{
>> +    RAMBlock *new_block;
>> +
>> +    size = TARGET_PAGE_ALIGN(size);
>> +    new_block = qemu_malloc(sizeof(*new_block));
>> +
>> +    /* map the file passed as a parameter to be this part of memory */
>> +    new_block->host = mmap(0, size, PROT_READ|PROT_WRITE, flags, fd,
>> offset);
>> +
>> +    if (new_block->host == MAP_FAILED)
>> +        exit(1);
>>
>
> Braces after if ()
>
>> +    if (kvm_enabled())
>> +        kvm_setup_guest_memory(new_block->host, size);
>> +
>>
>
> More braces.
>

This function is possibly made redundant by Marcelo's patch for qemu_ram_map

http://kerneltrap.org/mailarchive/linux-kvm/2010/4/26/6261299

qemu_ram_map isn't merged yet either, but I'm fine with either one.
Marcelo's requires the device to map the memory and then pass the
pointer to be added to the memory allocation, so it gives the device
full mapping control.  Alternatively, I could add the protection flag
to my function (I think that's all that is missing).

Let me know and I'll change my patch if necessary.

> --
> error compiling committee.c: too many arguments to function
>
>
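
As an illustration of what mapping a file "anywhere" with MAP_SHARED
amounts to, here is a minimal POSIX sketch.  It is not the qemu_ram_mmap()
patch itself: the demo_* names are hypothetical, the RAMBlock bookkeeping
is left out, and failure is reported to the caller rather than calling
exit().  The if bodies are braced, as requested in the review.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map `size` bytes of `fd` at `offset`, shared, wherever the kernel
 * chooses (no MAP_FIXED).  Returns NULL on failure. */
static void *demo_map_shared(int fd, size_t size, off_t offset)
{
    void *host = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                      fd, offset);

    if (host == MAP_FAILED) {
        return NULL;
    }
    return host;
}

/* Returns 1 if a write through one mapping is visible through a second
 * mapping of the same file, 0 otherwise. */
static int demo_shared_roundtrip(void)
{
    const size_t size = 4096;
    char *a = NULL, *b = NULL;
    int ok = 0;
    FILE *f = tmpfile();

    if (f == NULL) {
        return 0;
    }
    if (ftruncate(fileno(f), (off_t)size) == 0) {
        a = demo_map_shared(fileno(f), size, 0);
        b = demo_map_shared(fileno(f), size, 0);
        if (a != NULL && b != NULL) {
            a[0] = 'x';
            ok = (b[0] == 'x');
        }
        if (a != NULL) {
            munmap(a, size);
        }
        if (b != NULL) {
            munmap(b, size);
        }
    }
    fclose(f);
    return ok;
}
```

Two independent mappings of the same file see each other's writes, which is
the property the shared memory device relies on.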



[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Avi Kivity

On 05/10/2010 06:22 PM, Cam Macdonell wrote:
>>> +
>>> +    /* if the position is -1, then it's shared memory region fd */
>>> +    if (incoming_posn == -1) {
>>> +
>>> +        s->num_eventfds = 0;
>>> +
>>> +        if (check_shm_size(s, incoming_fd) == -1) {
>>> +            exit(-1);
>>> +        }
>>> +
>>> +        /* creating a BAR in qemu_chr callback may be crazy */
>>> +        create_shared_memory_BAR(s, incoming_fd);
>>
>> It probably is... why can't you create it during initialization?
>
> This is for the shared memory server implementation, so the fd for the
> shared memory has to be received (over the qemu char device) from the
> server before the BAR can be created via qemu_ram_mmap() which adds
> the necessary memory

We could do the handshake during initialization.  I'm worried that the
device will appear without the BAR, and strange things will happen.  But
the chardev API is probably not geared for passing data during init.

Anthony, any ideas?

> Otherwise, if the BAR is allocated during initialization, I would have
> to use MAP_FIXED to mmap the memory.  This is what I did before the
> qemu_ram_mmap() function was added.

What would happen to any data written to the BAR before the
handshake completed?  I think it would disappear.

So it's a good idea to make the initialization process atomic.

--
error compiling committee.c: too many arguments to function




[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device

2010-05-10 Thread Cam Macdonell
On Mon, May 10, 2010 at 5:59 AM, Avi Kivity  wrote:
> On 04/21/2010 08:53 PM, Cam Macdonell wrote:
>>
>> Support an inter-vm shared memory device that maps a shared-memory object
>> as a
>> PCI device in the guest.  This patch also supports interrupts between
>> guest by
>> communicating over a unix domain socket.  This patch applies to the
>> qemu-kvm
>> repository.
>>
>>     -device ivshmem,size=[,shm=]
>>
>> Interrupts are supported between multiple VMs by using a shared memory
>> server
>> by using a chardev socket.
>>
>>     -device ivshmem,size=[,shm=]
>>                     [,chardev=][,msi=on][,irqfd=on][,vectors=n]
>>     -chardev socket,path=,id=
>>
>> (shared memory server is qemu.git/contrib/ivshmem-server)
>>
>> Sample programs and init scripts are in a git repo here:
>>
>>
>> +typedef struct EventfdEntry {
>> +    PCIDevice *pdev;
>> +    int vector;
>> +} EventfdEntry;
>> +
>> +typedef struct IVShmemState {
>> +    PCIDevice dev;
>> +    uint32_t intrmask;
>> +    uint32_t intrstatus;
>> +    uint32_t doorbell;
>> +
>> +    CharDriverState * chr;
>> +    CharDriverState ** eventfd_chr;
>> +    int ivshmem_mmio_io_addr;
>> +
>> +    pcibus_t mmio_addr;
>> +    unsigned long ivshmem_offset;
>> +    uint64_t ivshmem_size; /* size of shared memory region */
>> +    int shm_fd; /* shared memory file descriptor */
>> +
>> +    int nr_allocated_vms;
>> +    /* array of eventfds for each guest */
>> +    int ** eventfds;
>> +    /* keep track of # of eventfds for each guest*/
>> +    int * eventfds_posn_count;
>>
>
> More readable:
>
>  typedef struct Peer {
>      int nb_eventfds;
>      int *eventfds;
>  } Peer;
>  int nb_peers;
>  Peer *peers;
>
> Does eventfd_chr need to be there as well?
>
>> +
>> +    int nr_alloc_guests;
>> +    int vm_id;
>> +    int num_eventfds;
>> +    uint32_t vectors;
>> +    uint32_t features;
>> +    EventfdEntry *eventfd_table;
>> +
>> +    char * shmobj;
>> +    char * sizearg;
>>
>
> Does this need to be part of the state?
>
>> +} IVShmemState;
>> +
>> +/* registers for the Inter-VM shared memory device */
>> +enum ivshmem_registers {
>> +    IntrMask = 0,
>> +    IntrStatus = 4,
>> +    IVPosition = 8,
>> +    Doorbell = 12,
>> +};
>> +
>> +static inline uint32_t ivshmem_has_feature(IVShmemState *ivs, int
>> feature) {
>> +    return (ivs->features&  (1<<  feature));
>> +}
>> +
>> +static inline int is_power_of_two(int x) {
>> +    return (x&  (x-1)) == 0;
>> +}
>>
>
> argument needs to be uint64_t to avoid overflow with large BARs.  Return
> type can be bool.
>
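
A sketch of the suggested signature, under a demo_ name to make clear it is
not the patch's code.  The extra zero check is an addition of this sketch,
not something asked for in the review: the original expression also returns
true for x == 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* 64-bit argument so BAR sizes of 4 GiB and above are not truncated. */
static inline bool demo_is_power_of_two(uint64_t x)
{
    /* x & (x - 1) clears the lowest set bit; a power of two has exactly
     * one bit set, so the result is zero.  Zero itself is rejected. */
    return x != 0 && (x & (x - 1)) == 0;
}
```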
>> +static void ivshmem_io_writel(void *opaque, uint8_t addr, uint32_t val)
>> +{
>> +    IVShmemState *s = opaque;
>> +
>> +    u_int64_t write_one = 1;
>> +    u_int16_t dest = val>>  16;
>> +    u_int16_t vector = val&  0xff;
>> +
>> +    addr&= 0xfe;
>>
>
> Why 0xfe?  Can understand 0xfc or 0xff.
>
>> +
>> +    switch (addr)
>> +    {
>> +        case IntrMask:
>> +            ivshmem_IntrMask_write(s, val);
>> +            break;
>> +
>> +        case IntrStatus:
>> +            ivshmem_IntrStatus_write(s, val);
>> +            break;
>> +
>> +        case Doorbell:
>> +            /* check doorbell range */
>> +            if ((vector>= 0)&&  (vector<  s->eventfds_posn_count[dest]))
>> {
>>
>
> What if dest is too big?  We overflow s->eventfds_posn_count.
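
One way to close that hole is to validate dest before using it as an index.
The sketch below is self-contained and uses hypothetical names (demo_peers,
nr_peers); it only mirrors the decoding shown in the quoted hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device view: how many peers exist and how many
 * eventfds each one has. */
struct demo_peers {
    int nr_peers;             /* number of valid entries in nr_eventfds */
    const int *nr_eventfds;   /* per-peer eventfd count */
};

/* Decode a doorbell write and check both indices before any array access. */
static bool demo_doorbell_valid(const struct demo_peers *p, uint32_t val)
{
    uint16_t dest = val >> 16;
    uint16_t vector = val & 0xff;

    if (dest >= p->nr_peers) {
        return false;         /* would index past the per-peer array */
    }
    return vector < p->nr_eventfds[dest];
}
```

Since vector is unsigned, the vector >= 0 test in the original is always
true and can be dropped.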
>>
>> +
>> +static void close_guest_eventfds(IVShmemState *s, int posn)
>> +{
>> +    int i, guest_curr_max;
>> +
>> +    guest_curr_max = s->eventfds_posn_count[posn];
>> +
>> +    for (i = 0; i<  guest_curr_max; i++)
>> +        close(s->eventfds[posn][i]);
>> +
>> +    free(s->eventfds[posn]);
>>
>
> qemu_free().
>
>> +/* this function increase the dynamic storage need to store data about
>> other
>> + * guests */
>> +static void increase_dynamic_storage(IVShmemState *s, int new_min_size) {
>> +
>> +    int j, old_nr_alloc;
>> +
>> +    old_nr_alloc = s->nr_alloc_guests;
>> +
>> +    while (s->nr_alloc_guests<  new_min_size)
>> +        s->nr_alloc_guests = s->nr_alloc_guests * 2;
>> +
>> +    IVSHMEM_DPRINTF("bumping storage to %d guests\n",
>> s->nr_alloc_guests);
>> +    s->eventfds = qemu_realloc(s->eventfds, s->nr_alloc_guests *
>> +                                                        sizeof(int *));
>> +    s->eventfds_posn_count = qemu_realloc(s->eventfds_posn_count,
>> +                                                    s->nr_alloc_guests *
>> +                                                        sizeof(int));
>> +    s->eventfd_table = qemu_realloc(s->eventfd_table, s->nr_alloc_guests
>> *
>> +
>>  sizeof(EventfdEntry));
>> +
>> +    if ((s->eventfds == NULL) || (s->eventfds_posn_count == NULL) ||
>> +            (s->eventfd_table == NULL)) {
>> +        fprintf(stderr, "Allocation error - exiting\n");
>> +        exit(1);
>> +    }
>> +
>> +    if (!ivshmem_has_feature(s, IVSHMEM_IRQFD)) {
>> +        s->eventfd_chr = (CharDriverState **)qemu_realloc(s->eventfd_chr,
>> +                                    s->nr_alloc_guests * siz

[Qemu-devel] Re: [PATCH v5 3/5] Add functions for assigning ioeventfd and irqfds.

2010-05-10 Thread Avi Kivity

On 05/10/2010 06:13 PM, Cam Macdonell wrote:
>>> +int kvm_set_ioeventfd_mmio_long(int fd, uint32_t addr, uint32_t val, bool
>>> assign)
>>> +{
>>> +
>>> +    int ret;
>>> +    struct kvm_ioeventfd iofd;
>>> +
>>> +    iofd.datamatch = val;
>>> +    iofd.addr = addr;
>>> +    iofd.len = 4;
>>> +    iofd.flags = KVM_IOEVENTFD_FLAG_DATAMATCH;
>>> +    iofd.fd = fd;
>>> +
>>> +    if (!kvm_enabled())
>>> +        return -ENOSYS;
>>> +    if (!assign)
>>> +        iofd.flags |= KVM_IOEVENTFD_FLAG_DEASSIGN;
>>
>> May be more usable to have separate assign and deassign functions (that can
>> call into a single internal implementation).
>
> I believe the convention so far is to use the 'assign' flag as
> Michael's patch and the PIO version kvm_set_ioeventfd_pio_word() do.

I dislike bool arguments since they're hard to understand at the call
site.  However if there's precedent we can stick to it and perhaps
change it all later.


--
error compiling committee.c: too many arguments to function
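
The "separate assign and deassign functions over one internal
implementation" idea could look roughly like this.  Everything here is a
hypothetical sketch: the demo_* names and flag values are placeholders, not
the KVM ABI, and the ioctl is replaced by recording the request.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FLAG_DATAMATCH (1u << 0)   /* placeholder, not the KVM value */
#define DEMO_FLAG_DEASSIGN  (1u << 1)   /* placeholder, not the KVM value */

struct demo_ioeventfd {
    uint32_t addr;
    uint32_t datamatch;
    uint32_t len;
    uint32_t flags;
    int fd;
};

/* Records the last request; a real version would issue the ioctl instead. */
static struct demo_ioeventfd demo_last_request;

/* Single internal implementation, still driven by a bool. */
static int demo_ioeventfd_mmio_long(int fd, uint32_t addr, uint32_t val,
                                    bool assign)
{
    struct demo_ioeventfd req = {
        .addr = addr,
        .datamatch = val,
        .len = 4,
        .flags = DEMO_FLAG_DATAMATCH,
        .fd = fd,
    };

    if (!assign) {
        req.flags |= DEMO_FLAG_DEASSIGN;
    }
    demo_last_request = req;
    return 0;
}

/* Thin wrappers so the call site says what it does. */
static int demo_assign_ioeventfd_mmio_long(int fd, uint32_t addr, uint32_t val)
{
    return demo_ioeventfd_mmio_long(fd, addr, val, true);
}

static int demo_deassign_ioeventfd_mmio_long(int fd, uint32_t addr, uint32_t val)
{
    return demo_ioeventfd_mmio_long(fd, addr, val, false);
}
```

Callers then never pass a bare true/false, while the shared body stays in
one place.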




[Qemu-devel] Re: [PATCH v5 3/5] Add functions for assigning ioeventfd and irqfds.

2010-05-10 Thread Cam Macdonell
On Mon, May 10, 2010 at 4:43 AM, Avi Kivity  wrote:
> On 04/21/2010 08:53 PM, Cam Macdonell wrote:
>>
>> Generic functions to assign irqfds and ioeventfds.
>>
>>
>
> Signoff.
>
>>  }
>>
>>  #ifdef KVM_IOEVENTFD
>> +int kvm_set_irqfd(int fd, uint16_t vector, uint32_t gsi)
>> +{
>> +    struct kvm_irqfd call = { };
>> +    int r;
>> +
>> +    call.fd = fd;
>> +    call.gsi = gsi;
>>
>
>> +
>> +    if (!kvm_enabled())
>> +        return -ENOSYS;
>>
>
> Braces, here and elsewhere.

This function is unnecessary as Michael added one that does the same thing.

>
>> +    r = kvm_vm_ioctl(kvm_state, KVM_IRQFD,&call);
>> +
>> +    if (r<  0) {
>> +        return r;
>>
>
> -errno
>
>> +    }
>> +    return 0;
>> +}
>> +
>> +int kvm_set_ioeventfd_mmio_long(int fd, uint32_t addr, uint32_t val, bool
>> assign)
>> +{
>> +
>> +    int ret;
>> +    struct kvm_ioeventfd iofd;
>> +
>> +    iofd.datamatch = val;
>> +    iofd.addr = addr;
>> +    iofd.len = 4;
>> +    iofd.flags = KVM_IOEVENTFD_FLAG_DATAMATCH;
>> +    iofd.fd = fd;
>> +
>> +    if (!kvm_enabled())
>> +        return -ENOSYS;
>> +    if (!assign)
>> +        iofd.flags |= KVM_IOEVENTFD_FLAG_DEASSIGN;
>>
>
> May be more usable to have separate assign and deassign functions (that can
> call into a single internal implementation).

I believe the convention so far is to use the 'assign' flag as
Michael's patch and the PIO version kvm_set_ioeventfd_pio_word() do.

>
>> +
>> +    ret = kvm_vm_ioctl(kvm_state, KVM_IOEVENTFD,&iofd);
>> +
>> +    if (ret<  0) {
>> +        return ret;
>>
>
> -errno
>
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>
>
> --
> error compiling committee.c: too many arguments to function
>
>
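
For reference, the "-errno" convention mentioned in the review can be
sketched in plain POSIX C as below.  The demo_* helpers are hypothetical,
and whether kvm_vm_ioctl() already returns a negative errno value is not
established in this thread, so this only shows the general pattern.

```c
#include <errno.h>
#include <unistd.h>

/* Turn the usual "-1 and errno" result into a negative errno value. */
static int demo_neg_errno(int ret)
{
    return ret < 0 ? -errno : ret;
}

/* Example caller: close() on an invalid descriptor fails with EBADF. */
static int demo_close(int fd)
{
    return demo_neg_errno(close(fd));
}
```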



[Qemu-devel] KVM call agenda for May 11

2010-05-10 Thread Chris Wright
Please send in any agenda items you are interested in covering.

If we have a lack of agenda items I'll cancel the week's call.

thanks,
-chris



Re: [Qemu-devel] qemu-system-sh4 broken again.

2010-05-10 Thread Shin-ichiro KAWASAKI

Hello, Rob,

This mail might be too late, but I want to report you that I
encountered similar trouble.

Using the linux kernel after the following commit, the qemu-sh
serial console shows no output.

  cd5f107628ab89c5dec5ad923f1c27f4cba41972

This trouble was discussed in sh-linux ML.

  http://marc.info/?l=linux-sh&m=127183863325672&w=2

To avoid this,

  - add "earlyprintk=sh-sci.1" to kernel command line, and
  - modify CONFIG_SERIAL_SH_SCI_NR_UARTS value from 1 to 2,

in kernel configuration menu.

I hope that this is the trouble you see.

Regards,
Shin-ichiro KAWASAKI


(2010/03/15 9:08), Rob Landley wrote:

On Sunday 14 March 2010 16:28:32 Aurelien Jarno wrote:

On Sat, Mar 13, 2010 at 05:11:43PM -0600, Rob Landley wrote:

I found out that "-serial stdio" is apparently trying to open /dev/stdio,
which Ubuntu 9.04 hasn't got.  If I say -serial /dev/tty it works from
the command line (but not in scripts).


This is actually not specific at all to sh4. See this thread:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg20920.html


http://www.mail-archive.com/qemu-devel@nongnu.org/msg22763.html


This is redundant.  -nographic implies -serial stdio.


Trying with just -nographic and no -serial lines, I get:

$ qemu-system-sh4 -M r2d -nographic -no-reboot -kernel zImage-sh4 -hda image-
sh4.sqf -append "root=/dev/sda rw init=/usr/sbin/init.sh panic=1 PATH=/usr/bin
console=ttySC0 noiotrap HOST=sh4"
long read to SH7750_WCR1_A7 (0x1f88) ignored
long read to SH7750_WCR2_A7 (0x1f8c) ignored
long read to SH7750_WCR3_A7 (0x1f800010) ignored
long read to SH7750_MCR_A7 (0x1f800014) ignored
long read to SH7750_MCR_A7 (0x1f800014) ignored

And it hangs.  No output from any of the kernel serial writes.

http://www.mail-archive.com/qemu-devel@nongnu.org/msg22775.html


eg this should work as you'd expect it

  qemu -nodefaults -nographic -serial stdio


-nographic is basically equivalent to -serial mon:stdio,signal=on -vga none
except it operates on defaults. Your invocation actually ends up being very
different as it doesn't multiplex the monitor and it doesn't disable ctrl-c.
Basically, your invocation is equivalent to qemu -vga none -serial stdio


http://www.mail-archive.com/qemu-devel@nongnu.org/msg22775.html

$ qemu-system-sh4 -M r2d -nographic -no-reboot -kernel zImage-sh4 -hda image-sh4.sqf -append "root=/dev/sda rw init=/usr/sbin/init.sh panic=1 PATH=/usr/bin console=ttySC0 noiotrap HOST=sh4"  -vga none -serial stdio
chardev: opening backend "stdio" failed
qemu: could not open serial device 'stdio': Inappropriate ioctl for device

http://www.mail-archive.com/qemu-devel@nongnu.org/msg22777.html

$ qemu-system-sh4 -M r2d -nodefaults -nographic -serial stdio -no-reboot -kernel zImage-sh4 -hda image-sh4.sqf -append "root=/dev/sda rw init=/usr/sbin/init.sh panic=1 PATH=/usr/bin console=ttySC0 noiotrap HOST=sh4"
long read to SH7750_WCR1_A7 (0x1f88) ignored
long read to SH7750_WCR2_A7 (0x1f8c) ignored
long read to SH7750_WCR3_A7 (0x1f800010) ignored
long read to SH7750_MCR_A7 (0x1f800014) ignored
long read to SH7750_MCR_A7 (0x1f800014) ignored

And the hang's back, no output...

Ok, this seems to work:

qemu-system-sh4 -M r2d -nodefaults -nographic -serial null -serial stdio -no-reboot -kernel zImage-sh4 -hda image-sh4.sqf -append "root=/dev/sda rw init=/usr/sbin/init.sh panic=1 PATH=/usr/bin console=ttySC0 noiotrap HOST=sh4"

I no longer even pretend to know why...

Do I have to say "-nodefaults" on every other target as well to disable the
unwanted monitor I never knew was there?

Rob





Re: [Qemu-devel] Re: Registering buffers with a qdict

2010-05-10 Thread Avi Kivity

On 05/10/2010 04:43 PM, Jan Kiszka wrote:

Avi Kivity wrote:
   

On 05/10/2010 01:59 PM, Jan Kiszka wrote:
 

   From a quick glance at the JSON spec, there is no room for a new type. I
think we have to overload an existing one and convert that into a
QBuffer (typically, we know the actual semantic). Hex string encoding is
most compact, so I went this road.
   

Base64 is even more compact.
 

For sure, still room for improvements. There is just no encode() service
in current qemu, so I went the lazy way so far. :)
   


Well, if we expose these encoded buffers via QMP, we can't unlazy the 
implementation.  It becomes an ABI, so we better think (and document) 
this through.



But I'm open to change it into a true
type if JSON actually allows it (or we are fine with breaking it).

   

That ruins any possibility of using a standard json encoder/decoder on
the other end.

 

That was my concern as well. Such decoders would not be able to tell
strings apart from buffers as that can only be derived from the context.
Still, they could visualize the result and/or forward it to some libqmp
for proper interpretation.
   


That's fine; the documentation for a command that accepts or returns 
buffers would mention that the value is a hex (or base64) encoded string.
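
To make the size trade-off between the two encodings concrete, here is a
minimal sketch in C. The function names are hypothetical and not part of
qemu; the hex encoder only illustrates the "lazy" approach discussed above.
Hex needs two characters per byte, while padded base64 needs four characters
per group of three bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these exist in qemu. */

/* Characters needed to hex-encode n bytes (two digits per byte). */
size_t hex_encoded_len(size_t n)
{
    return 2 * n;
}

/* Characters needed to base64-encode n bytes, including '=' padding. */
size_t base64_encoded_len(size_t n)
{
    return 4 * ((n + 2) / 3);
}

/* Hex-encode len bytes of buf into out; out must hold 2 * len + 1 bytes. */
void hex_encode(const uint8_t *buf, size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    for (i = 0; i < len; i++) {
        out[2 * i] = digits[buf[i] >> 4];
        out[2 * i + 1] = digits[buf[i] & 0x0f];
    }
    out[2 * len] = '\0';
}
```

For a 512-byte buffer this gives 1024 characters as hex and 684 characters
as base64, so base64 saves about a third of the string length at the cost of
a slightly more involved encoder.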


--
error compiling committee.c: too many arguments to function




[Qemu-devel] Re: AHCI support integration

2010-05-10 Thread Tejun Heo
Hello,

On 05/09/2010 09:11 PM, Alexander Graf wrote:
> Sebastian Herbszt wrote:
>> The ICH6 AHCI implementation submitted by Chong is an all-in-one
>> attempt (ahci.c).  It includes all needed parts of the ICH6, AHCI,
>> SATA and ATA specification.  The code in hw/ide/* on the other hand
>> is split (or could be split) into smaller parts like port based and
>> bus master access, IDE and ATA core.  I think it might be
>> reasonable to split ahci.c into ICH6, AHCI and SATA parts and strip
>> the limited ATA support and reuse it from the ide core.  This would
>> give us something like the following:
>>
>> hw/ide/piix.c (PIIX3 and PIIX4)
>> hw/ide/pci.c and core.c (IDE, BM)
>> hw/ata-core.c (ATA)
>> hw/sata/ich6.c (ICH6)
>> hw/sata-core.c (SATA)
>> hw/ahci-core.c (AHCI)
>>
>> Should this be a goal or am i over-engineering here?
> 
> [CC'ing Tejun - he volunteered to help out on this topic as well]
> 
> I think there's no need to split sata and ahci.
> Apart from that, I think we should take things incrementally. For now
> there's no need to split IDE further until we hit a technical limit. I
> have yet to see a patch trying to reuse the IDE command processing, so
> depending on how the respective person implements that, I'm open to
> suggestions.

I don't know the code but here are a few things which might be worth
considering.

* ahci and the IDE interfaces are pretty much independent in most
  implementations.

* On host emulation side, there might not be too much to separate out
  for generic sata part from ahci at this point.  Link state emulation
  should be pretty simple and I suppose multiple command processing
  would be a bit strongly tied into specific host emulation.

* If necessary, a separate IDE layer below PIIX3/4, ICH whatever would
  be nice.  Most IDE controllers behave about the same after all and
  depending on implementation except for link emulation and NCQ
  handling, ahci can probably just wrap the IDE layer for other stuff
  too.

* If necessary, implementing generic ahci would be nice.  Most ahci
  implementations only have small quirks on top of generic ahci
  behavior (even less deviance compared to IDE), so separating it out
  shouldn't be difficult at all.

Thanks.

-- 
tejun




[Qemu-devel] qemu-system-arm use for bootloader running

2010-05-10 Thread Belisko Marek
Hi,

I'm working on the eol bootloader
(http://vivien.chappelier.free.fr/typhoon/release/eol/20070609/eol-0.5.tar.gz)
This bootloader was written to load a kernel image from an SD card in SPL
mode on HTC phones (OMAP 730, OMAP 850), so that instead of
starting WinCE you can choose what to load.

First of all, I would like to ask whether qemu supports the
OMAP 730 and 850 MCUs. If not, can I use some generic
ARM target? Is it possible to use qemu for running and debugging bootloader code?
My problem is that the bootloader doesn't work on my HTC,
and it's hard to find where the problem is because there is no serial port,
only an LCD which remains white :).

Thanks for replies,

Marek

-- 
as simple and primitive as possible
-
Marek Belisko - OPEN-NANDRA
Freelance Developer

Ruska Nova Ves 219 | Presov, 08005 Slovak Republic
Tel: +421 915 052 184
skype: marekwhite
icq: 290551086
web: http://open-nandra.com




[Qemu-devel] [PATCH 2/2] ehci: Fix debug traces

2010-05-10 Thread Vincent Palatin
- fix build error when activating traces
- properly display the config flags register

Signed-off-by: Vincent Palatin 
---
 hw/usb-ehci.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/hw/usb-ehci.c b/hw/usb-ehci.c
index e724653..ab9a23e 100644
--- a/hw/usb-ehci.c
+++ b/hw/usb-ehci.c
@@ -469,6 +469,10 @@ static const char *addr2str(unsigned addr)
 case PORTSC_BEGIN ... PORTSC_END:
 r = "PORT STATUS";
 break;
+
+case CONFIGFLAG:
+r = "CONFIG FLAG";
+break;
 }
 
 return r;
@@ -1956,7 +1960,7 @@ static void ehci_map(PCIDevice *pci_dev, int region_num,
 {
 EHCIState *s =(EHCIState *)pci_dev;
 
-DPRINTF("ehci_map: region %d, addr %08lX, size %ld, s->mem %08X\n",
+DPRINTF("ehci_map: region %d, addr %08llX, size %lld, s->mem %08X\n",
 region_num, addr, size, s->mem);
 s->mem_base = addr;
 cpu_register_physical_memory(addr, size, s->mem);
-- 
1.5.6.5
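
As a side note to the format-string change above, here is a hedged sketch of
a portable alternative. It assumes addr and size are 64-bit unsigned
integers, which the patch does not state, and the helper name is
hypothetical. The <inttypes.h> macros select the right conversion on every
host, whereas %llX only matches when the type really is unsigned long long.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats a region-mapping trace line into buf.
 * Assumes 64-bit addr/size and a 32-bit memory handle.
 * Returns the number of characters snprintf() would have written.
 */
int format_map_trace(char *buf, size_t len, int region_num,
                     uint64_t addr, uint64_t size, uint32_t mem)
{
    return snprintf(buf, len,
                    "ehci_map: region %d, addr %08" PRIX64
                    ", size %" PRIu64 ", s->mem %08" PRIX32 "\n",
                    region_num, addr, size, mem);
}
```

Calling format_map_trace(buf, sizeof(buf), 0, 0xF2000000, 4096, 0x120)
produces "ehci_map: region 0, addr F2000000, size 4096, s->mem 00000120".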





[Qemu-devel] ehci fixes

2010-05-10 Thread Vincent Palatin
Dear developers,

While using the EHCI patchset, I found 2 minor issues.
So I am sending 2 proposed fixes in this email thread.

These patches apply on top of Jan Kiszka's ehci branch.
Thanks to Jan and David for gathering and updating this patchset.

--
Vincent






[Qemu-devel] [PATCH 1/2] ehci: Fix error detection when registering a new list base address

2010-05-10 Thread Vincent Palatin
We must check against the currently running command, not the list address.

Signed-off-by: Vincent Palatin 
---
 hw/usb-ehci.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/usb-ehci.c b/hw/usb-ehci.c
index 8be0780..e724653 100644
--- a/hw/usb-ehci.c
+++ b/hw/usb-ehci.c
@@ -774,7 +774,7 @@ static void ehci_mem_writel(void *ptr, target_phys_addr_t addr, uint32_t val)
 break;
 
 case PERIODICLISTBASE:
-if (val & USBCMD_PSE) {
+if (s->usbcmd & USBCMD_PSE) {
 fprintf(stderr, "Guest OS should not be setting the periodic"
" list base register while periodic schedule is enabled\n");
 return;
@@ -783,7 +783,7 @@ static void ehci_mem_writel(void *ptr, target_phys_addr_t addr, uint32_t val)
 break;
 
 case ASYNCLISTADDR:
-if (val & USBCMD_ASE) {
+if (s->usbcmd & USBCMD_ASE) {
 fprintf(stderr, "Guest OS should not be setting the async list"
" address register while async schedule is enabled\n");
 return;
-- 
1.5.6.5
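
The difference between the two checks in the patch above can be shown with a
reduced model. This is only a sketch: MiniEhci and both function names are
hypothetical stand-ins, not the real EHCIState or driver code. The bit
position follows the EHCI USBCMD register layout (bit 5 is Asynchronous
Schedule Enable).

```c
#include <stdbool.h>
#include <stdint.h>

#define USBCMD_ASE (1u << 5)   /* Asynchronous Schedule Enable */

/* Hypothetical, reduced stand-in for the controller state. */
typedef struct {
    uint32_t usbcmd;         /* last value written to USBCMD */
    uint32_t asynclistaddr;  /* async list base address */
} MiniEhci;

/* Old check: tests a bit of the address being written, not the command. */
bool old_check_rejects(uint32_t val)
{
    return (val & USBCMD_ASE) != 0;
}

/*
 * Fixed check: refuse the write only while the async schedule is
 * enabled in the command register. Returns true if the write happened.
 */
bool write_asynclistaddr(MiniEhci *s, uint32_t val)
{
    if (s->usbcmd & USBCMD_ASE) {
        return false;
    }
    s->asynclistaddr = val;
    return true;
}
```

With the old check, a 32-byte-aligned address such as 0x1020 is rejected
even when the schedule is stopped, because bit 5 of the address happens to
be set. An address such as 0x1000 is accepted even while the schedule is
running. Testing the command register instead avoids both cases.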




