[PATCH v2] virtio-blk: Respect discard granularity
Signed-off-by: Akihiko Odaki --- hw/block/virtio-blk.c | 8 +++- hw/core/machine.c | 9 - include/hw/virtio/virtio-blk.h | 1 + 3 files changed, 16 insertions(+), 2 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index bac2d6fa2b2..f4378e61182 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -962,10 +962,14 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config) blkcfg.wce = blk_enable_write_cache(s->blk); virtio_stw_p(vdev, _queues, s->conf.num_queues); if (virtio_has_feature(s->host_features, VIRTIO_BLK_F_DISCARD)) { +uint32_t discard_granularity = conf->discard_granularity; +if (discard_granularity == -1 || !s->conf.report_discard_granularity) { +discard_granularity = blk_size; +} virtio_stl_p(vdev, _discard_sectors, s->conf.max_discard_sectors); virtio_stl_p(vdev, _sector_alignment, - blk_size >> BDRV_SECTOR_BITS); + discard_granularity >> BDRV_SECTOR_BITS); /* * We support only one segment per request since multiple segments * are not widely used and there are no userspace APIs that allow @@ -1299,6 +1303,8 @@ static Property virtio_blk_properties[] = { IOThread *), DEFINE_PROP_BIT64("discard", VirtIOBlock, host_features, VIRTIO_BLK_F_DISCARD, true), +DEFINE_PROP_BOOL("report-discard-granularity", VirtIOBlock, + conf.report_discard_granularity, true), DEFINE_PROP_BIT64("write-zeroes", VirtIOBlock, host_features, VIRTIO_BLK_F_WRITE_ZEROES, true), DEFINE_PROP_UINT32("max-discard-sectors", VirtIOBlock, diff --git a/hw/core/machine.c b/hw/core/machine.c index de3b8f1b318..3ba976e5bbc 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -33,7 +33,9 @@ #include "migration/global_state.h" #include "migration/vmstate.h" -GlobalProperty hw_compat_5_2[] = {}; +GlobalProperty hw_compat_5_2[] = { +{ "virtio-blk-device", "report-discard-granularity", "off" }, +}; const size_t hw_compat_5_2_len = G_N_ELEMENTS(hw_compat_5_2); GlobalProperty hw_compat_5_1[] = { @@ -41,6 +43,7 @@ GlobalProperty hw_compat_5_1[] = { { "vhost-user-blk", "num-queues", "1"}, { "vhost-user-scsi", "num_queues", "1"}, { "virtio-blk-device", "num-queues", "1"}, +{ "virtio-blk-device", "report-discard-granularity", "off" }, { "virtio-scsi-device", "num_queues", "1"}, { "nvme", "use-intel-id", "on"}, { "pvpanic", "events", "1"}, /* PVPANIC_PANICKED */ @@ -50,6 +53,7 @@ const size_t hw_compat_5_1_len = G_N_ELEMENTS(hw_compat_5_1); GlobalProperty hw_compat_5_0[] = { { "pci-host-bridge", "x-config-reg-migration-enabled", "off" }, { "virtio-balloon-device", "page-poison", "false" }, +{ "virtio-blk-device", "report-discard-granularity", "off" }, { "vmport", "x-read-set-eax", "off" }, { "vmport", "x-signal-unsupported-cmd", "off" }, { "vmport", "x-report-vmx-type", "off" }, @@ -59,6 +63,7 @@ GlobalProperty hw_compat_5_0[] = { const size_t hw_compat_5_0_len = G_N_ELEMENTS(hw_compat_5_0); GlobalProperty hw_compat_4_2[] = { +{ "virtio-blk-device", "report-discard-granularity", "off" }, { "virtio-blk-device", "queue-size", "128"}, { "virtio-scsi-device", "virtqueue_size", "128"}, { "virtio-blk-device", "x-enable-wce-if-config-wce", "off" }, @@ -74,6 +79,7 @@ GlobalProperty hw_compat_4_2[] = { const size_t hw_compat_4_2_len = G_N_ELEMENTS(hw_compat_4_2); GlobalProperty hw_compat_4_1[] = { +{ "virtio-blk-device", "report-discard-granularity", "off" }, { "virtio-pci", "x-pcie-flr-init", "off" }, { "virtio-device", "use-disabled-flag", "false" }, }; @@ -83,6 +89,7 @@ GlobalProperty hw_compat_4_0[] = { { "VGA","edid", "false" }, { "secondary-vga", "edid", "false" }, { "bochs-display", "edid", "false" }, +{ "virtio-blk-device", "report-discard-granularity", "off" }, { "virtio-vga", "edid", "false" }, { "virtio-gpu-device", "edid", "false" }, { "virtio-device", "use-started", "false" }, diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index 214ab748229..29655a406dd 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -41,6 +41,7 @@ struct VirtIOBlkConf uint16_t num_queues; uint16_t queue_size; bool seg_max_adjust; +bool report_discard_granularity; uint32_t max_discard_sectors; uint32_t max_write_zeroes_sectors; bool x_enable_wce_if_config_wce; -- 2.24.3 (Apple Git-128)
Re: [PATCH V2 1/7] block/rbd: bump librbd requirement to luminous release
On Mon, Feb 15, 2021 at 8:29 AM Peter Lieven wrote: > > Am 15.02.21 um 13:13 schrieb Kevin Wolf: > > Am 15.02.2021 um 12:45 hat Peter Lieven geschrieben: > >> Am 15.02.21 um 12:41 schrieb Daniel P. Berrangé: > >>> On Mon, Feb 15, 2021 at 12:32:24PM +0100, Peter Lieven wrote: > Am 15.02.21 um 11:24 schrieb Daniel P. Berrangé: > > On Tue, Jan 26, 2021 at 12:25:34PM +0100, Peter Lieven wrote: > >> even luminous (version 12.2) is unmaintained for over 3 years now. > >> Bump the requirement to get rid of the ifdef'ry in the code. > > We have clear rules on when we bump minimum versions, determined by > > the OS platforms we target: > > > > https://qemu.readthedocs.io/en/latest/system/build-platforms.html > > > > At this time RHEL-7 is usually the oldest platform, and it > > builds with RBD 10.2.5, so we can't bump the version to 12.2. > > > > I'm afraid this patch has to be dropped. > I have asked exactly this question before I started work on this series > and got reply > > from Jason that he sees no problem in bumping to a release which is > already unmaintained > > for 3 years. > >>> I'm afraid Jason is wrong here. It doesn't matter what the upstream > >>> consider the support status to be. QEMU targets what the OS vendors > >>> ship, and they still consider this to be a supported version. > >> > >> Okay, but the whole coroutine stuff would get a total mess with all > >> the ifdef'ry. > > Hm, but how are these ifdefs even related to the coroutine conversation? > > It's a bit more code that you're moving around, but shouldn't it be > > unchanged from the old code, just moving from an AIO callback to a > > coroutine? Or am I missing some complications? > > > No, the ifdef's only come back in for the write zeroes part. > > > > > >> Would it be an option to make a big ifdef in the rbd driver? One with > >> old code for < 12.0.0 and one > >> > >> with new code for >= 12.0.0? > > I don't think this is a good idea, this would be a huge mess to > > maintain. > > > > The conversion is probably a good idea in general, simply because it's > > more in line with the rest of the block layer, but I don't think it adds > > anything per se, so it's hard to justify such duplication with the > > benefits it brings. > > > I would wait for Jasons comment on the rbd part of the series and then spin a > V3 > > with a for-6.1 tag. Sorry for the long delay -- I was delayed from being out-of-town. I've reviewed and play-tested the patches and it looks good for me. I'll wait for V3 before adding my official review. > > > Peter > -- Jason
Re: What prevents discarding a cluster during rewrite?
23.02.2021 00:30, Vladimir Sementsov-Ogievskiy wrote: Hi all! Thinking of how to prevent dereferencing to zero (and discard) of host cluster during flush of compressed cache (which I'm working on now), I have a question.. What prevents it for normal writes? I have no idea about why didn't it fail for years.. May be, I'm missing something? I have idea of fixing: increase the refcount of host cluster before write to data_file (it's important to increase refacount in same s->lock critical section where we get host_offset) and dereference it after write.. It should help. Any thoughts? -- Best regards, Vladimir
What prevents discarding a cluster during rewrite?
Hi all! Thinking of how to prevent dereferencing to zero (and discard) of host cluster during flush of compressed cache (which I'm working on now), I have a question.. What prevents it for normal writes? A simple interactive qemu-io session on master branch: ./qemu-img create -f qcow2 x 1M [root@kvm build]# ./qemu-io blkdebug::x do initial write: qemu-io> write -P 1 0 64K wrote 65536/65536 bytes at offset 0 64 KiB, 1 ops; 00.12 sec (556.453 KiB/sec and 8.6946 ops/sec) rewrite, and break before write (assume long write by fs or hardware for some reason) qemu-io> break write_aio A qemu-io> aio_write -P 2 0 64K blkdebug: Suspended request 'A' OK, we stopped before write. Everything is already allocated on initial write, mutex now resumed.. And suddenly we do discard: qemu-io> discard 0 64K discard 65536/65536 bytes at offset 0 64 KiB, 1 ops; 00.00 sec (146.034 MiB/sec and 2336.5414 ops/sec) Now, start another write, to another place.. But it will allocate same host cluster!!! qemu-io> write -P 3 128K 64K wrote 65536/65536 bytes at offset 131072 64 KiB, 1 ops; 00.08 sec (787.122 KiB/sec and 12.2988 ops/sec) Check it: qemu-io> read -P 3 128K 64K read 65536/65536 bytes at offset 131072 64 KiB, 1 ops; 00.00 sec (188.238 MiB/sec and 3011.8033 ops/sec) resume our old write: qemu-io> resume A blkdebug: Resuming request 'A' qemu-io> wrote 65536/65536 bytes at offset 0 64 KiB, 1 ops; 0:05:07.10 (213.400382 bytes/sec and 0.0033 ops/sec) of course it doesn't influence first cluster, as it is discarded: qemu-io> read -P 2 0 64K Pattern verification failed at offset 0, 65536 bytes read 65536/65536 bytes at offset 0 64 KiB, 1 ops; 00.00 sec (726.246 MiB/sec and 11619.9352 ops/sec) qemu-io> read -P 0 0 64K read 65536/65536 bytes at offset 0 64 KiB, 1 ops; 00.00 sec (632.348 MiB/sec and 10117.5661 ops/sec) But in 3rd cluster data is corrupted now: qemu-io> read -P 3 128K 64K Pattern verification failed at offset 131072, 65536 bytes read 65536/65536 bytes at offset 131072 64 KiB, 1 ops; 00.00 sec (163.922 MiB/sec and 2622.7444 ops/sec) qemu-io> read -P 2 128K 64K read 65536/65536 bytes at offset 131072 64 KiB, 1 ops; 00.00 sec (257.058 MiB/sec and 4112.9245 ops/sec So, that's a classical use-after-free... For user it looks like racy write/discard to one cluster may corrupt another cluster... It may be even worse, if use-after-free corrupts metadata. Note, that initial write is significant, as when we do allocate cluster we write L2 entry after data write (as I understand), so the race doesn't happen. But, if consider compressed writes, they allocate everything before write.. Let's check: [root@kvm build]# ./qemu-img create -f qcow2 x 1M; ./qemu-io blkdebug::x Formatting 'x', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 qemu-io> break write_compressed A qemu-io> aio_write -c -P 1 0 64K qemu-io> compressed: 327680 79 blkdebug: Suspended request 'A' qemu-io> discard 0 64K discarded: 327680 discard 65536/65536 bytes at offset 0 64 KiB, 1 ops; 00.01 sec (7.102 MiB/sec and 113.6297 ops/sec) qemu-io> write -P 3 128K 64K normal cluster alloc: 327680 wrote 65536/65536 bytes at offset 131072 64 KiB, 1 ops; 00.06 sec (1.005 MiB/sec and 16.0774 ops/sec) qemu-io> resume A blkdebug: Resuming request 'A' qemu-io> wrote 65536/65536 bytes at offset 0 64 KiB, 1 ops; 0:00:15.90 (4.026 KiB/sec and 0.0629 ops/sec) qemu-io> read -P 3 128K 64K Pattern verification failed at offset 131072, 65536 bytes read 65536/65536 bytes at offset 131072 64 KiB, 1 ops; 00.00 sec (237.791 MiB/sec and 3804.6539 ops/sec) (strange, but seems it didn't fail several times for me.. But now it fails several times... Anyway, it's all not good). -- Best regards, Vladimir
Re: [PATCH v2 5/5] hw/block/nvme: report non-mdts command size limit for dsm
On Feb 23 05:55, Keith Busch wrote: > On Mon, Feb 22, 2021 at 07:47:59PM +0100, Klaus Jensen wrote: > > +typedef struct NvmeIdCtrlNvm { > > +uint8_t vsl; > > +uint8_t wzsl; > > +uint8_t wusl; > > +uint8_t dmrl; > > +uint32_tdmrsl; > > +uint64_tdmsl; > > +uint8_t rsvd16[4080]; > > +} NvmeIdCtrlNvm; > > TP 4040a still displays these fields with preceding '...' indicating > something comes before this. Is that just left-over from the integration > for TBD offsets, or is there something that still hasn't been accounted > for? Good question. But since the TBDs have been assigned I believe it is just a left-over. I must admit that I have not cross checked this with all other TPs, but AFAIK this is the only ratified TP that adds something to the NVM-specific identify controller data structure. signature.asc Description: PGP signature
Re: [PATCH 0/3] hw/block/nvme: mdts/zasl cleanup
These look good. Reviewed-by: Keith Busch
Re: [PATCH v2 5/5] hw/block/nvme: report non-mdts command size limit for dsm
On Mon, Feb 22, 2021 at 07:47:59PM +0100, Klaus Jensen wrote: > +typedef struct NvmeIdCtrlNvm { > +uint8_t vsl; > +uint8_t wzsl; > +uint8_t wusl; > +uint8_t dmrl; > +uint32_tdmrsl; > +uint64_tdmsl; > +uint8_t rsvd16[4080]; > +} NvmeIdCtrlNvm; TP 4040a still displays these fields with preceding '...' indicating something comes before this. Is that just left-over from the integration for TBD offsets, or is there something that still hasn't been accounted for?
Re: [PATCH V2 0/6] hw/block/nvme: support namespace attachment
On Feb 11 01:09, Minwoo Im wrote: > Hello, > > This series supports namespace attachment: attach and detach. This is > the second version series with a fix a bug on choosing a controller to > attach for a namespace in the attach command handler. > > Since V1: > - Fix to take 'ctrl' which is given from the command rather than 'n'. > (Klaus) > - Add a [7/7] patch to support CNS 12h Identify command (Namespace > Attached Controller list). > Good stuff Minwoo! For the lot, Tested-by: Klaus Jensen Reviewed-by: Klaus Jensen signature.asc Description: PGP signature
Re: [PATCH V2 6/7] hw/block/nvme: support namespace attachment command
On Feb 11 01:09, Minwoo Im wrote: > This patch supports Namespace Attachment command for the pre-defined > nvme-ns device nodes. Of course, attach/detach namespace should only be > supported in case 'subsys' is given. This is because if we detach a > namespace from a controller, somebody needs to manage the detached, but > allocated namespace in the NVMe subsystem. > > Signed-off-by: Minwoo Im > --- > hw/block/nvme-subsys.h | 10 +++ > hw/block/nvme.c| 59 ++ > hw/block/nvme.h| 5 > hw/block/trace-events | 2 ++ > include/block/nvme.h | 5 > 5 files changed, 81 insertions(+) > > diff --git a/hw/block/nvme-subsys.h b/hw/block/nvme-subsys.h > index 14627f9ccb41..ef4bec928eae 100644 > --- a/hw/block/nvme-subsys.h > +++ b/hw/block/nvme-subsys.h > @@ -30,6 +30,16 @@ typedef struct NvmeSubsystem { > int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp); > int nvme_subsys_register_ns(NvmeNamespace *ns, Error **errp); > > +static inline NvmeCtrl *nvme_subsys_ctrl(NvmeSubsystem *subsys, > +uint32_t cntlid) > +{ > +if (!subsys) { > +return NULL; > +} > + > +return subsys->ctrls[cntlid]; > +} > + > /* > * Return allocated namespace of the specified nsid in the subsystem. > */ > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > index 697368a6ae0c..71bcd66f1956 100644 > --- a/hw/block/nvme.c > +++ b/hw/block/nvme.c > @@ -183,6 +183,7 @@ static const uint32_t nvme_cse_acs[256] = { > [NVME_ADM_CMD_SET_FEATURES] = NVME_CMD_EFF_CSUPP, > [NVME_ADM_CMD_GET_FEATURES] = NVME_CMD_EFF_CSUPP, > [NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFF_CSUPP, > +[NVME_ADM_CMD_NS_ATTACHMENT]= NVME_CMD_EFF_CSUPP, > }; > > static const uint32_t nvme_cse_iocs_none[256]; > @@ -3766,6 +3767,62 @@ static uint16_t nvme_aer(NvmeCtrl *n, NvmeRequest *req) > return NVME_NO_COMPLETE; > } > > +static void __nvme_select_ns_iocs(NvmeCtrl *n, NvmeNamespace *ns); > +static uint16_t nvme_ns_attachment(NvmeCtrl *n, NvmeRequest *req) > +{ > +NvmeNamespace *ns; > +NvmeCtrl *ctrl; > +uint16_t list[NVME_CONTROLLER_LIST_SIZE] = {}; > +uint32_t nsid = le32_to_cpu(req->cmd.nsid); > +uint32_t dw10 = le32_to_cpu(req->cmd.cdw10); > +bool attach = !(dw10 & 0xf); > +uint16_t *nr_ids = [0]; > +uint16_t *ids = [1]; > +uint16_t ret; > +int i; > + > +trace_pci_nvme_ns_attachment(nvme_cid(req), dw10 & 0xf); > + > +ns = nvme_subsys_ns(n->subsys, nsid); > +if (!ns) { > +return NVME_INVALID_FIELD | NVME_DNR; > +} > + > +ret = nvme_dma(n, (uint8_t *)list, 4096, > + DMA_DIRECTION_TO_DEVICE, req); > +if (ret) { > +return ret; > +} > + > +if (!*nr_ids) { > +return NVME_NS_CTRL_LIST_INVALID | NVME_DNR; > +} > + > +for (i = 0; i < *nr_ids; i++) { > +ctrl = nvme_subsys_ctrl(n->subsys, ids[i]); > +if (!ctrl) { > +return NVME_NS_CTRL_LIST_INVALID | NVME_DNR; > +} > + > +if (attach) { > +if (nvme_ns_is_attached(ctrl, ns)) { > +return NVME_NS_ALREADY_ATTACHED | NVME_DNR; > +} > + > +nvme_ns_attach(ctrl, ns); > +__nvme_select_ns_iocs(ctrl, ns); > +} else { > +if (!nvme_ns_is_attached(ctrl, ns)) { > +return NVME_NS_NOT_ATTACHED | NVME_DNR; > +} > + > +nvme_ns_detach(ctrl, ns); > +} > +} > + > +return NVME_SUCCESS; > +} > + > static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req) > { > trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), req->cmd.opcode, > @@ -3797,6 +3854,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest > *req) > return nvme_get_feature(n, req); > case NVME_ADM_CMD_ASYNC_EV_REQ: > return nvme_aer(n, req); > +case NVME_ADM_CMD_NS_ATTACHMENT: > +return nvme_ns_attachment(n, req); > default: > assert(false); > } > diff --git a/hw/block/nvme.h b/hw/block/nvme.h > index 1c7796b20996..5a1ab857d166 100644 > --- a/hw/block/nvme.h > +++ b/hw/block/nvme.h > @@ -222,6 +222,11 @@ static inline void nvme_ns_attach(NvmeCtrl *n, > NvmeNamespace *ns) > n->namespaces[nvme_nsid(ns) - 1] = ns; > } > > +static inline void nvme_ns_detach(NvmeCtrl *n, NvmeNamespace *ns) > +{ > +n->namespaces[nvme_nsid(ns) - 1] = NULL; > +} > + > static inline NvmeCQueue *nvme_cq(NvmeRequest *req) > { > NvmeSQueue *sq = req->sq; > diff --git a/hw/block/trace-events b/hw/block/trace-events > index b6e972d733a6..bf67fe7873d2 100644 > --- a/hw/block/trace-events > +++ b/hw/block/trace-events > @@ -80,6 +80,8 @@ pci_nvme_aer(uint16_t cid) "cid %"PRIu16"" > pci_nvme_aer_aerl_exceeded(void) "aerl exceeded" > pci_nvme_aer_masked(uint8_t type, uint8_t mask) "type 0x%"PRIx8" mask > 0x%"PRIx8"" > pci_nvme_aer_post_cqe(uint8_t typ, uint8_t info, uint8_t
[PATCH 2/3] hw/block/nvme: deduplicate bad mdts trace event
From: Klaus Jensen If mdts is exceeded, trace it from a single place. Signed-off-by: Klaus Jensen --- hw/block/nvme.c | 6 +- hw/block/trace-events | 2 +- 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 6a27b28f2c2d..25a7726ca05b 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1075,6 +1075,7 @@ static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len) uint8_t mdts = n->params.mdts; if (mdts && len > n->page_size << mdts) { +trace_pci_nvme_err_mdts(len); return NVME_INVALID_FIELD | NVME_DNR; } @@ -1945,7 +1946,6 @@ static uint16_t nvme_compare(NvmeCtrl *n, NvmeRequest *req) status = nvme_check_mdts(n, len); if (status) { -trace_pci_nvme_err_mdts(nvme_cid(req), len); return status; } @@ -2048,7 +2048,6 @@ static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req) status = nvme_check_mdts(n, data_size); if (status) { -trace_pci_nvme_err_mdts(nvme_cid(req), data_size); goto invalid; } @@ -2116,7 +2115,6 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest *req, bool append, if (!wrz) { status = nvme_check_mdts(n, data_size); if (status) { -trace_pci_nvme_err_mdts(nvme_cid(req), data_size); goto invalid; } } @@ -2610,7 +2608,6 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) status = nvme_check_mdts(n, data_size); if (status) { -trace_pci_nvme_err_mdts(nvme_cid(req), data_size); return status; } @@ -3052,7 +3049,6 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) status = nvme_check_mdts(n, len); if (status) { -trace_pci_nvme_err_mdts(nvme_cid(req), len); return status; } diff --git a/hw/block/trace-events b/hw/block/trace-events index b04f7a3e1890..e1a85661cf3f 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -114,7 +114,7 @@ pci_nvme_clear_ns_close(uint32_t state, uint64_t slba) "zone state=%"PRIu32", sl pci_nvme_clear_ns_reset(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Empty state" # nvme traces for error conditions -pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu" +pci_nvme_err_mdts(size_t len) "len %zu" pci_nvme_err_req_status(uint16_t cid, uint32_t nsid, uint16_t status, uint8_t opc) "cid %"PRIu16" nsid %"PRIu32" status 0x%"PRIx16" opc 0x%"PRIx8"" pci_nvme_err_addr_read(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_err_addr_write(uint64_t addr) "addr 0x%"PRIx64"" -- 2.30.1
[PATCH 0/3] hw/block/nvme: mdts/zasl cleanup
From: Klaus Jensen The gist of this series is about aligning the zoned.zasl parameter with the mdts parameter. I complained about this back when I was reviewing the zoned series but was shot down. I relented on the size/capacity debate (and still fully support that), but I never really liked that ZASL is different from MDTS. Changing the definition makes the validation code much simpler and, well, it aligns perfectly with the existing mdts parameter, which is the goal here. While the current definition of zasl is in master, it has not yet been released, so this is sort of our last chance to change this before v6.0. I'll repeat the commit message of [3/3] here for context: ZASL (Zone Append Size Limit) is defined exactly like MDTS (Maximum Data Transfer Size), that is, it is a value in units of the minimum memory page size (CAP.MPSMIN) and is reported as a power of two. The 'mdts' nvme device parameter is specified as in the spec, but the 'zoned.append_size_limit' parameter is specified in bytes. This is suboptimal for a number of reasons: 1. It is just plain confusing wrt. the definition of mdts. 2. There is a lot of complexity involved in validating the value; it must be a power of two, it should be larger than 4k, if it is zero we set it internally to mdts, but still report it as zero. 3. While "hw/block/nvme: improve invalid zasl value reporting" slightly improved the handling of the parameter, the validation is still wrong; it does not depend on CC.MPS, it depends on CAP.MPSMIN. And we are not even checking that it is actually less than or equal to MDTS, which is kinda the *one* condition it must satisfy. Fix this by defining zasl exactly like mdts and checking the one thing that it must satisfy (that it is less than or equal to mdts). Also, change the default value from 128KiB to 0 (aka, whatever mdts is). Klaus Jensen (3): hw/block/nvme: document 'mdts' nvme device parameter hw/block/nvme: deduplicate bad mdts trace event hw/block/nvme: align zoned.zasl with mdts hw/block/nvme.h | 4 +-- hw/block/nvme.c | 67 ++- hw/block/trace-events | 4 +-- 3 files changed, 25 insertions(+), 50 deletions(-) -- 2.30.1
[PATCH 3/3] hw/block/nvme: align zoned.zasl with mdts
From: Klaus Jensen ZASL (Zone Append Size Limit) is defined exactly like MDTS (Maximum Data Transfer Size), that is, it is a value in units of the minimum memory page size (CAP.MPSMIN) and is reported as a power of two. The 'mdts' nvme device parameter is specified as in the spec, but the 'zoned.append_size_limit' parameter is specified in bytes. This is suboptimal for a number of reasons: 1. It is just plain confusing wrt. the definition of mdts. 2. There is a lot of complexity involved in validating the value; it must be a power of two, it should be larger than 4k, if it is zero we set it internally to mdts, but still report it as zero. 3. While "hw/block/nvme: improve invalid zasl value reporting" slightly improved the handling of the parameter, the validation is still wrong; it does not depend on CC.MPS, it depends on CAP.MPSMIN. And we are not even checking that it is actually less than or equal to MDTS, which is kinda the *one* condition it must satisfy. Fix this by defining zasl exactly like mdts and checking the one thing that it must satisfy (that it is less than or equal to mdts). Also, change the default value from 128KiB to 0 (aka, whatever mdts is). Signed-off-by: Klaus Jensen --- hw/block/nvme.h | 4 +--- hw/block/nvme.c | 55 --- hw/block/trace-events | 2 +- 3 files changed, 17 insertions(+), 44 deletions(-) diff --git a/hw/block/nvme.h b/hw/block/nvme.h index cb2b5175f1a1..f45ace0cff5b 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -20,7 +20,7 @@ typedef struct NvmeParams { uint32_t aer_max_queued; uint8_t mdts; bool use_intel_id; -uint32_t zasl_bs; +uint8_t zasl; bool legacy_cmb; } NvmeParams; @@ -171,8 +171,6 @@ typedef struct NvmeCtrl { QTAILQ_HEAD(, NvmeAsyncEvent) aer_queue; int aer_queued; -uint8_t zasl; - NvmeSubsystem *subsys; NvmeNamespace namespace; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 25a7726ca05b..edd0b85c10ce 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -69,13 +69,11 @@ * as a power of two (2^n) and is in units of the minimum memory page size * (CAP.MPSMIN). The default value is 7 (i.e. 512 KiB). * - * - `zoned.append_size_limit` - * The maximum I/O size in bytes that is allowed in Zone Append command. - * The default is 128KiB. Since internally this this value is maintained as - * ZASL = log2( / ), some values assigned - * to this property may be rounded down and result in a lower maximum ZA - * data size being in effect. By setting this property to 0, users can make - * ZASL to be equal to MDTS. This property only affects zoned namespaces. + * - `zoned.zasl` + * Indicates the maximum data transfer size for the Zone Append command. Like + * `mdts`, the value is specified as a power of two (2^n) and is in units of + * the minimum memory page size (CAP.MPSMIN). The default value is 0 (i.e. + * defaulting to the value of `mdts`). * * nvme namespace device parameters * @@ -2135,10 +2133,9 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest *req, bool append, goto invalid; } -if (nvme_l2b(ns, nlb) > (n->page_size << n->zasl)) { -trace_pci_nvme_err_append_too_large(slba, nlb, n->zasl); -status = NVME_INVALID_FIELD; -goto invalid; +if (n->params.zasl && data_size > n->page_size << n->params.zasl) { +trace_pci_nvme_err_zasl(data_size); +return NVME_INVALID_FIELD | NVME_DNR; } slba = zone->w_ptr; @@ -3212,9 +3209,8 @@ static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) if (c->csi == NVME_CSI_NVM) { return nvme_rpt_empty_id_struct(n, req); } else if (c->csi == NVME_CSI_ZONED) { -if (n->params.zasl_bs) { -id.zasl = n->zasl; -} +id.zasl = n->params.zasl; + return nvme_dma(n, (uint8_t *), sizeof(id), DMA_DIRECTION_FROM_DEVICE, req); } @@ -4088,19 +4084,6 @@ static int nvme_start_ctrl(NvmeCtrl *n) nvme_init_sq(>admin_sq, n, n->bar.asq, 0, 0, NVME_AQA_ASQS(n->bar.aqa) + 1); -if (!n->params.zasl_bs) { -n->zasl = n->params.mdts; -} else { -if (n->params.zasl_bs < n->page_size) { -NVME_GUEST_ERR(pci_nvme_err_startfail_zasl_too_small, - "Zone Append Size Limit (ZASL) of %d bytes is too " - "small; must be at least %d bytes", - n->params.zasl_bs, n->page_size); -return -1; -} -n->zasl = 31 - clz32(n->params.zasl_bs / n->page_size); -} - nvme_set_timestamp(n, 0ULL); QTAILQ_INIT(>aer_queue); @@ -4609,17 +4592,10 @@ static void
[PATCH 1/3] hw/block/nvme: document 'mdts' nvme device parameter
From: Klaus Jensen Document the 'mdts' nvme device parameter. Signed-off-by: Klaus Jensen --- hw/block/nvme.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 1cd82fa3c9fe..6a27b28f2c2d 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -63,6 +63,12 @@ * completion when there are no outstanding AERs. When the maximum number of * enqueued events are reached, subsequent events will be dropped. * + * - `mdts` + * Indicates the maximum data transfer size for a command that transfers data + * between host-accessible memory and the controller. The value is specified + * as a power of two (2^n) and is in units of the minimum memory page size + * (CAP.MPSMIN). The default value is 7 (i.e. 512 KiB). + * * - `zoned.append_size_limit` * The maximum I/O size in bytes that is allowed in Zone Append command. * The default is 128KiB. Since internally this this value is maintained as -- 2.30.1
Re: [PATCH v2 0/4] improve do_strtosz precision
On 2/11/21 2:44 PM, Eric Blake wrote: > Parsing sizes with only 53 bits of precision is surprising; it's time > to fix it to use a full 64 bits of precision. > > v1 was here: > https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg01800.html > > Since then: > - split testsuite improvements from code changes [Vladimir] > - more tests for more corner cases [Vladimir, Rich, Dan] > - fix handling of '123-45' when endptr is non-NULL [Vladimir] > - fix handling of '1.k' > - actually enable deprecation of '0x1k' [Vladimir] > - include missing deprecation text for rounded fractions > - improved commit messages > > I'm still not sure I like patch 4, but it's at least worth considering. Ping. I've also just realized that this series will fix: https://bugzilla.redhat.com/show_bug.cgi?id=1909185 "The error message of "qemu-img convert -r" should advertise the correct maximum number" -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
[PATCH v2 0/5] hw/block/nvme: misc fixes
From: Klaus Jensen Small set of misc fixes from Gollu. v2 changes * Split off the trace event additions from "[PATCH 1/3] hw/block/nvme: nvme_identify fixes" and "[PATCH 2/3] hw/block/nvme: fix potential compilation error" into their own commits (Minwoo, Philippe) * Fix a missing check on the zasl_bs param in the nvme_identify_ctrl_csi refactor (Minwoo) Gollu Appalanaidu (5): hw/block/nvme: remove unnecessary endian conversion hw/block/nvme: add identify trace event hw/block/nvme: fix potential compilation error hw/block/nvme: add trace event for zone read check hw/block/nvme: report non-mdts command size limit for dsm hw/block/nvme.h | 1 + include/block/nvme.h | 11 +++ hw/block/nvme.c | 45 --- hw/block/trace-events | 2 ++ 4 files changed, 43 insertions(+), 16 deletions(-) -- 2.30.1
[PATCH v2 5/5] hw/block/nvme: report non-mdts command size limit for dsm
From: Gollu Appalanaidu Dataset Management is not subject to MDTS, but exceeded a certain size per range causes internal looping. Report this limit (DMRSL) in the NVM command set specific identify controller data structure. Signed-off-by: Gollu Appalanaidu Signed-off-by: Klaus Jensen --- hw/block/nvme.h | 1 + include/block/nvme.h | 11 +++ hw/block/nvme.c | 29 + hw/block/trace-events | 1 + 4 files changed, 34 insertions(+), 8 deletions(-) diff --git a/hw/block/nvme.h b/hw/block/nvme.h index cb2b5175f1a1..3046b82b3da1 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -172,6 +172,7 @@ typedef struct NvmeCtrl { int aer_queued; uint8_t zasl; +uint32_tdmrsl; NvmeSubsystem *subsys; diff --git a/include/block/nvme.h b/include/block/nvme.h index b23f3ae2279f..16d8c4c90f7e 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -1041,6 +1041,16 @@ typedef struct NvmeIdCtrlZoned { uint8_t rsvd1[4095]; } NvmeIdCtrlZoned; +typedef struct NvmeIdCtrlNvm { +uint8_t vsl; +uint8_t wzsl; +uint8_t wusl; +uint8_t dmrl; +uint32_tdmrsl; +uint64_tdmsl; +uint8_t rsvd16[4080]; +} NvmeIdCtrlNvm; + enum NvmeIdCtrlOacs { NVME_OACS_SECURITY = 1 << 0, NVME_OACS_FORMAT= 1 << 1, @@ -1396,6 +1406,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlZoned) != 4096); +QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlNvm) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeLBAF) != 4); QEMU_BUILD_BUG_ON(sizeof(NvmeLBAFE) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); diff --git a/hw/block/nvme.c b/hw/block/nvme.c index b641c7a45a2f..4013bf53d4ff 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1775,6 +1775,10 @@ static uint16_t nvme_dsm(NvmeCtrl *n, NvmeRequest *req) trace_pci_nvme_dsm_deallocate(nvme_cid(req), nvme_nsid(ns), slba, nlb); +if (nlb > n->dmrsl) { +trace_pci_nvme_dsm_single_range_limit_exceeded(nlb, n->dmrsl); +} + offset = nvme_l2b(ns, slba); len = nvme_l2b(ns, nlb); @@ -3200,21 +3204,27 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)>cmd; -NvmeIdCtrlZoned id = {}; +uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {}; trace_pci_nvme_identify_ctrl_csi(c->csi); -if (c->csi == NVME_CSI_NVM) { -return nvme_rpt_empty_id_struct(n, req); -} else if (c->csi == NVME_CSI_ZONED) { +switch (c->csi) { +case NVME_CSI_NVM: +((NvmeIdCtrlNvm *))->dmrsl = cpu_to_le32(n->dmrsl); +break; + +case NVME_CSI_ZONED: if (n->params.zasl_bs) { -id.zasl = n->zasl; +((NvmeIdCtrlZoned *))->zasl = n->zasl; } -return nvme_dma(n, (uint8_t *), sizeof(id), -DMA_DIRECTION_FROM_DEVICE, req); + +break; + +default: +return NVME_INVALID_FIELD | NVME_DNR; } -return NVME_INVALID_FIELD | NVME_DNR; +return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req); } static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) @@ -4668,6 +4678,9 @@ int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) n->namespaces[nsid - 1] = ns; +n->dmrsl = MIN_NON_ZERO(n->dmrsl, +BDRV_REQUEST_MAX_BYTES / nvme_l2b(ns, 1)); + return 0; } diff --git a/hw/block/trace-events b/hw/block/trace-events index 1f958d09d2a9..27940fe2e98a 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -51,6 +51,7 @@ pci_nvme_copy_cb(uint16_t cid) "cid %"PRIu16"" pci_nvme_block_status(int64_t offset, int64_t bytes, int64_t pnum, int ret, bool zeroed) "offset %"PRId64" bytes %"PRId64" pnum %"PRId64" ret 0x%x zeroed %d" pci_nvme_dsm(uint16_t cid, uint32_t nsid, uint32_t nr, uint32_t attr) "cid %"PRIu16" nsid %"PRIu32" nr %"PRIu32" attr 0x%"PRIx32"" pci_nvme_dsm_deallocate(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" nsid %"PRIu32" slba %"PRIu64" nlb %"PRIu32"" +pci_nvme_dsm_single_range_limit_exceeded(uint32_t nlb, uint32_t dmrsl) "nlb %"PRIu32" dmrsl %"PRIu32"" pci_nvme_compare(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" nsid %"PRIu32" slba 0x%"PRIx64" nlb %"PRIu32"" pci_nvme_compare_cb(uint16_t cid) "cid %"PRIu16"" pci_nvme_aio_discard_cb(uint16_t cid) "cid %"PRIu16"" -- 2.30.1
[PATCH v2 4/5] hw/block/nvme: add trace event for zone read check
From: Gollu Appalanaidu Add a trace event for the offline zone condition when checking zone read. Signed-off-by: Gollu Appalanaidu [k.jensen: split commit] Signed-off-by: Klaus Jensen --- hw/block/nvme.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index dcd51a52951c..b641c7a45a2f 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1241,6 +1241,7 @@ static uint16_t nvme_check_zone_state_for_read(NvmeZone *zone) case NVME_ZONE_STATE_READ_ONLY: return NVME_SUCCESS; case NVME_ZONE_STATE_OFFLINE: +trace_pci_nvme_err_zone_is_offline(zone->d.zslba); return NVME_ZONE_OFFLINE; default: assert(false); -- 2.30.1
[PATCH v2 1/5] hw/block/nvme: remove unnecessary endian conversion
From: Gollu Appalanaidu Remove an unnecessary le_to_cpu conversion in Identify. Signed-off-by: Gollu Appalanaidu Signed-off-by: Klaus Jensen Reviewed-by: Minwoo Im --- hw/block/nvme.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 1cd82fa3c9fe..c0b349dfab0d 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -3415,7 +3415,7 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)>cmd; -switch (le32_to_cpu(c->cns)) { +switch (c->cns) { case NVME_ID_CNS_NS: /* fall through */ case NVME_ID_CNS_NS_PRESENT: -- 2.30.1
[PATCH v2 2/5] hw/block/nvme: add identify trace event
From: Gollu Appalanaidu Add a trace event for the Identify command. Signed-off-by: Gollu Appalanaidu Signed-off-by: Klaus Jensen Reviewed-by: Minwoo Im --- hw/block/nvme.c | 3 +++ hw/block/trace-events | 1 + 2 files changed, 4 insertions(+) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index c0b349dfab0d..ddc83f7f7a19 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -3415,6 +3415,9 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)>cmd; +trace_pci_nvme_identify(nvme_cid(req), c->cns, le16_to_cpu(c->ctrlid), +c->csi); + switch (c->cns) { case NVME_ID_CNS_NS: /* fall through */ diff --git a/hw/block/trace-events b/hw/block/trace-events index b04f7a3e1890..1f958d09d2a9 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -61,6 +61,7 @@ pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d" pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16"" pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16"" +pci_nvme_identify(uint16_t cid, uint8_t cns, uint16_t ctrlid, uint8_t csi) "cid %"PRIu16" cns 0x%"PRIx8" ctrlid %"PRIu16" csi 0x%"PRIx8"" pci_nvme_identify_ctrl(void) "identify controller" pci_nvme_identify_ctrl_csi(uint8_t csi) "identify controller, csi=0x%"PRIx8"" pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32"" -- 2.30.1
[PATCH v2 3/5] hw/block/nvme: fix potential compilation error
From: Gollu Appalanaidu assert may be compiled to a noop and we could end up returning an uninitialized status. Fix this by always returning Internal Device Error as a fallback. Note that, as pointed out by Philippe, per commit 262a69f4282 ("osdep.h: Prohibit disabling assert() in supported builds") this shouldn't be possible. But clean it up so we don't worry about it again. Signed-off-by: Gollu Appalanaidu [k.jensen: split commit] Signed-off-by: Klaus Jensen --- hw/block/nvme.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index ddc83f7f7a19..dcd51a52951c 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1232,8 +1232,6 @@ static uint16_t nvme_check_zone_write(NvmeNamespace *ns, NvmeZone *zone, static uint16_t nvme_check_zone_state_for_read(NvmeZone *zone) { -uint16_t status; - switch (nvme_get_zone_state(zone)) { case NVME_ZONE_STATE_EMPTY: case NVME_ZONE_STATE_IMPLICITLY_OPEN: @@ -1241,16 +1239,14 @@ static uint16_t nvme_check_zone_state_for_read(NvmeZone *zone) case NVME_ZONE_STATE_FULL: case NVME_ZONE_STATE_CLOSED: case NVME_ZONE_STATE_READ_ONLY: -status = NVME_SUCCESS; -break; +return NVME_SUCCESS; case NVME_ZONE_STATE_OFFLINE: -status = NVME_ZONE_OFFLINE; -break; +return NVME_ZONE_OFFLINE; default: assert(false); } -return status; +return NVME_INTERNAL_DEV_ERROR; } static uint16_t nvme_check_zone_read(NvmeNamespace *ns, uint64_t slba, -- 2.30.1
Re: [PATCH v2 0/2] block: Use 'read-zeroes=true' mode by default with 'null-co' driver
On 2/22/21 7:15 PM, Daniel P. Berrangé wrote: > On Fri, Feb 19, 2021 at 03:09:43PM +0100, Philippe Mathieu-Daudé wrote: >> On 2/19/21 12:07 PM, Max Reitz wrote: >>> On 13.02.21 22:54, Fam Zheng wrote: On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote: > The null-co driver doesn't zeroize buffer in its default config, > because it is designed for testing and tests want to run fast. > However this confuses security researchers (access to uninit > buffers). I'm a little surprised. Is changing default the only way to fix this? I'm not opposed to changing the default but I'm not convinced this is the easiest way. block/nvme.c also doesn't touch the memory, but defers to the device DMA, why doesn't that confuse the security checker? >> >> Generally speaking, there is a balance between security and performance. >> We try to provide both, but when we can't, my understanding is security >> is more important. >> >> Customers expect a secure product. If they prefer performance and >> at the price of security, it is also possible by enabling an option >> that is not the default. >> >> I'm not sure why you mention block/nvme here. I have the understanding >> the null-co driver is only useful for testing. Are there production >> cases where null-co is used? > > Do we have any real world figures for the performance of null-co > with & without zero'ing ? Before worrying about a tradeoff of > security vs performance, it'd be good to know if there is actually > a real world performance problem in the first place. Personally I'd > go for zero'ing by defualt unless the performance hit was really > bad. I simply wanted to help the security team by removing the amount of reports using the null-co driver. This is probably easier to resolve with one documentation line. As I am not understanding where this thread is heading, I am taking a step back with this topic. Please disregard this series. Regards, Phil.
Re: [PATCH 3/3] hw/block/nvme: report non-mdts command size limit for dsm
On Feb 22 21:11, Minwoo Im wrote: > On 21-02-22 08:06:15, Klaus Jensen wrote: > > From: Gollu Appalanaidu > > > > Dataset Management is not subject to MDTS, but exceeded a certain size > > per range causes internal looping. Report this limit (DMRSL) in the NVM > > command set specific identify controller data structure. > > > > Signed-off-by: Gollu Appalanaidu > > Signed-off-by: Klaus Jensen > > --- > > hw/block/nvme.h | 1 + > > include/block/nvme.h | 11 +++ > > hw/block/nvme.c | 30 -- > > hw/block/trace-events | 1 + > > 4 files changed, 33 insertions(+), 10 deletions(-) > > > > diff --git a/hw/block/nvme.h b/hw/block/nvme.h > > index cb2b5175f1a1..3046b82b3da1 100644 > > --- a/hw/block/nvme.h > > +++ b/hw/block/nvme.h > > @@ -172,6 +172,7 @@ typedef struct NvmeCtrl { > > int aer_queued; > > > > uint8_t zasl; > > +uint32_tdmrsl; > > > > NvmeSubsystem *subsys; > > > > diff --git a/include/block/nvme.h b/include/block/nvme.h > > index b23f3ae2279f..16d8c4c90f7e 100644 > > --- a/include/block/nvme.h > > +++ b/include/block/nvme.h > > @@ -1041,6 +1041,16 @@ typedef struct NvmeIdCtrlZoned { > > uint8_t rsvd1[4095]; > > } NvmeIdCtrlZoned; > > > > +typedef struct NvmeIdCtrlNvm { > > +uint8_t vsl; > > +uint8_t wzsl; > > +uint8_t wusl; > > +uint8_t dmrl; > > +uint32_tdmrsl; > > +uint64_tdmsl; > > +uint8_t rsvd16[4080]; > > +} NvmeIdCtrlNvm; > > + > > enum NvmeIdCtrlOacs { > > NVME_OACS_SECURITY = 1 << 0, > > NVME_OACS_FORMAT= 1 << 1, > > @@ -1396,6 +1406,7 @@ static inline void _nvme_check_size(void) > > QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); > > QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); > > QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlZoned) != 4096); > > +QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlNvm) != 4096); > > QEMU_BUILD_BUG_ON(sizeof(NvmeLBAF) != 4); > > QEMU_BUILD_BUG_ON(sizeof(NvmeLBAFE) != 16); > > QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > > index 897b9ff0db91..5d6bba5fcb0d 100644 > > --- a/hw/block/nvme.c > > +++ b/hw/block/nvme.c > > @@ -1777,6 +1777,10 @@ static uint16_t nvme_dsm(NvmeCtrl *n, NvmeRequest > > *req) > > trace_pci_nvme_dsm_deallocate(nvme_cid(req), nvme_nsid(ns), > > slba, > >nlb); > > > > +if (nlb > n->dmrsl) { > > +trace_pci_nvme_dsm_single_range_limit_exceeded(nlb, > > n->dmrsl); > > +} > > + > > offset = nvme_l2b(ns, slba); > > len = nvme_l2b(ns, nlb); > > > > @@ -3202,21 +3206,24 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, > > NvmeRequest *req) > > static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) > > { > > NvmeIdentify *c = (NvmeIdentify *)>cmd; > > -NvmeIdCtrlZoned id = {}; > > +uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {}; > > > > trace_pci_nvme_identify_ctrl_csi(c->csi); > > > > -if (c->csi == NVME_CSI_NVM) { > > -return nvme_rpt_empty_id_struct(n, req); > > -} else if (c->csi == NVME_CSI_ZONED) { > > -if (n->params.zasl_bs) { > > -id.zasl = n->zasl; > > -} > > -return nvme_dma(n, (uint8_t *), sizeof(id), > > -DMA_DIRECTION_FROM_DEVICE, req); > > +switch (c->csi) { > > +case NVME_CSI_NVM: > > +((NvmeIdCtrlNvm *))->dmrsl = cpu_to_le32(n->dmrsl); > > +break; > > + > > +case NVME_CSI_ZONED: > > +((NvmeIdCtrlZoned *))->zasl = n->zasl; > > Question. Are we okay without checking this like above ? :) > > if (n->params.zasl_bs) { > ((NvmeIdCtrlZoned *))->zasl = n->zasl; > } > Ah yeah. Thanks! I forgot about that zasl_bs vs zasl quirkiness. > > +break; > > + > > +default: > > +return NVME_INVALID_FIELD | NVME_DNR; > > } > > > > -return NVME_INVALID_FIELD | NVME_DNR; > > +return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req); > > } > > > > static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) > > @@ -4670,6 +4677,9 @@ int nvme_register_namespace(NvmeCtrl *n, > > NvmeNamespace *ns, Error **errp) > > > > n->namespaces[nsid - 1] = ns; > > > > +n->dmrsl = MIN_NON_ZERO(n->dmrsl, > > +BDRV_REQUEST_MAX_BYTES / nvme_l2b(ns, 1)); > > + > > return 0; > > } > > > > diff --git a/hw/block/trace-events b/hw/block/trace-events > > index 1f958d09d2a9..27940fe2e98a 100644 > > --- a/hw/block/trace-events > > +++ b/hw/block/trace-events > > @@ -51,6 +51,7 @@ pci_nvme_copy_cb(uint16_t cid) "cid %"PRIu16"" > > pci_nvme_block_status(int64_t offset, int64_t bytes, int64_t pnum, int > > ret, bool zeroed) "offset %"PRId64" bytes %"PRId64" pnum %"PRId64" ret 0x%x > > zeroed %d" > > pci_nvme_dsm(uint16_t cid,
Re: [PATCH v2 0/2] block: Use 'read-zeroes=true' mode by default with 'null-co' driver
On Fri, Feb 19, 2021 at 03:09:43PM +0100, Philippe Mathieu-Daudé wrote: > On 2/19/21 12:07 PM, Max Reitz wrote: > > On 13.02.21 22:54, Fam Zheng wrote: > >> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote: > >>> The null-co driver doesn't zeroize buffer in its default config, > >>> because it is designed for testing and tests want to run fast. > >>> However this confuses security researchers (access to uninit > >>> buffers). > >> > >> I'm a little surprised. > >> > >> Is changing default the only way to fix this? I'm not opposed to > >> changing the default but I'm not convinced this is the easiest way. > >> block/nvme.c also doesn't touch the memory, but defers to the device > >> DMA, why doesn't that confuse the security checker? > > Generally speaking, there is a balance between security and performance. > We try to provide both, but when we can't, my understanding is security > is more important. > > Customers expect a secure product. If they prefer performance and > at the price of security, it is also possible by enabling an option > that is not the default. > > I'm not sure why you mention block/nvme here. I have the understanding > the null-co driver is only useful for testing. Are there production > cases where null-co is used? Do we have any real world figures for the performance of null-co with & without zero'ing ? Before worrying about a tradeoff of security vs performance, it'd be good to know if there is actually a real world performance problem in the first place. Personally I'd go for zero'ing by defualt unless the performance hit was really bad. > BTW block/nvme is a particular driver where performance matters more > than security (but we have to make sure the users are aware of that). Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
[RFC PATCH v3 1/3] hw/pflash_cfi*: Replace DPRINTF with trace events
Rather than having a device specific debug implementation in pflash_cfi01.c and pflash_cfi02.c, use the standard tracing facility. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: David Edmondson --- hw/block/pflash_cfi01.c | 78 + hw/block/pflash_cfi02.c | 75 +++ hw/block/trace-events | 39 - 3 files changed, 91 insertions(+), 101 deletions(-) diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c index 22287a1522..9e1f3b42c6 100644 --- a/hw/block/pflash_cfi01.c +++ b/hw/block/pflash_cfi01.c @@ -56,16 +56,6 @@ #include "sysemu/runstate.h" #include "trace.h" -/* #define PFLASH_DEBUG */ -#ifdef PFLASH_DEBUG -#define DPRINTF(fmt, ...) \ -do {\ -fprintf(stderr, "PFLASH: " fmt , ## __VA_ARGS__); \ -} while (0) -#else -#define DPRINTF(fmt, ...) do { } while (0) -#endif - #define PFLASH_BE 0 #define PFLASH_SECURE 1 @@ -152,10 +142,8 @@ static uint32_t pflash_cfi_query(PFlashCFI01 *pfl, hwaddr offset) * wider part. */ if (pfl->device_width != 1 || pfl->bank_width > 4) { -DPRINTF("%s: Unsupported device configuration: " -"device_width=%d, max_device_width=%d\n", -__func__, pfl->device_width, -pfl->max_device_width); +trace_pflash_unsupported_device_configuration( +pfl->name, pfl->device_width, pfl->max_device_width); return 0; } /* CFI query data is repeated, rather than zero padded for @@ -205,14 +193,14 @@ static uint32_t pflash_devid_query(PFlashCFI01 *pfl, hwaddr offset) switch (boff & 0xFF) { case 0: resp = pfl->ident0; -trace_pflash_manufacturer_id(resp); +trace_pflash_manufacturer_id(pfl->name, resp); break; case 1: resp = pfl->ident1; -trace_pflash_device_id(resp); +trace_pflash_device_id(pfl->name, resp); break; default: -trace_pflash_device_info(offset); +trace_pflash_device_info(pfl->name, offset); return 0; } /* Replicate responses for each device in bank. */ @@ -260,10 +248,9 @@ static uint32_t pflash_data_read(PFlashCFI01 *pfl, hwaddr offset, } break; default: -DPRINTF("BUG in %s\n", __func__); abort(); } -trace_pflash_data_read(offset, width, ret); +trace_pflash_data_read(pfl->name, offset, width, ret); return ret; } @@ -277,7 +264,7 @@ static uint32_t pflash_read(PFlashCFI01 *pfl, hwaddr offset, switch (pfl->cmd) { default: /* This should never happen : reset state & treat it as a read */ -DPRINTF("%s: unknown command state: %x\n", __func__, pfl->cmd); +trace_pflash_read_unknown_state(pfl->name, pfl->cmd); pfl->wcycle = 0; /* * The command 0x00 is not assigned by the CFI open standard, @@ -313,7 +300,7 @@ static uint32_t pflash_read(PFlashCFI01 *pfl, hwaddr offset, */ ret |= pfl->status << 16; } -DPRINTF("%s: status %x\n", __func__, ret); +trace_pflash_read_status(pfl->name, ret); break; case 0x90: if (!pfl->device_width) { @@ -328,14 +315,14 @@ static uint32_t pflash_read(PFlashCFI01 *pfl, hwaddr offset, switch (boff) { case 0: ret = pfl->ident0 << 8 | pfl->ident1; -trace_pflash_manufacturer_id(ret); +trace_pflash_manufacturer_id(pfl->name, ret); break; case 1: ret = pfl->ident2 << 8 | pfl->ident3; -trace_pflash_device_id(ret); +trace_pflash_device_id(pfl->name, ret); break; default: -trace_pflash_device_info(boff); +trace_pflash_device_info(pfl->name, boff); ret = 0; break; } @@ -380,7 +367,7 @@ static uint32_t pflash_read(PFlashCFI01 *pfl, hwaddr offset, break; } -trace_pflash_io_read(offset, width, ret, pfl->cmd, pfl->wcycle); +trace_pflash_io_read(pfl->name, offset, width, ret, pfl->cmd, pfl->wcycle); return ret; } @@ -410,7 +397,7 @@ static inline void pflash_data_write(PFlashCFI01 *pfl, hwaddr offset, { uint8_t *p = pfl->storage; -trace_pflash_data_write(offset, width, value, pfl->counter); +trace_pflash_data_write(pfl->name, offset, width, value, pfl->counter); switch (width) { case 1: p[offset] = value; @@ -449,7 +436,7 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset, cmd = value; -trace_pflash_io_write(offset, width, value, pfl->wcycle); +trace_pflash_io_write(pfl->name, offset, width, value, pfl->wcycle); if
Re: [PATCH v2 0/2] block: Use 'read-zeroes=true' mode by default with 'null-co' driver
On 2/22/21 6:35 PM, Fam Zheng wrote: > On 2021-02-19 15:09, Philippe Mathieu-Daudé wrote: >> On 2/19/21 12:07 PM, Max Reitz wrote: >>> On 13.02.21 22:54, Fam Zheng wrote: On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote: > The null-co driver doesn't zeroize buffer in its default config, > because it is designed for testing and tests want to run fast. > However this confuses security researchers (access to uninit > buffers). I'm a little surprised. Is changing default the only way to fix this? I'm not opposed to changing the default but I'm not convinced this is the easiest way. block/nvme.c also doesn't touch the memory, but defers to the device DMA, why doesn't that confuse the security checker? >> >> Generally speaking, there is a balance between security and performance. >> We try to provide both, but when we can't, my understanding is security >> is more important. > > Why is hiding the code path behind a non-default more secure? What is > not secure now? Se we are back to the problem of having default values. I'd like to remove the default and have the option explicit, but qemu_opt_get_bool() expects a 'default' value. Should we rename qemu_opt_get_bool() -> qemu_opt_get_bool_with_default() and add a simpler qemu_opt_get_bool()?
[RFC PATCH v3 2/3] hw/pflash_cfi01: Correct the type of PFlashCFI01.ro
PFlashCFI01.ro is a bool, declare it as such. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: David Edmondson --- hw/block/pflash_cfi01.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c index 9e1f3b42c6..6b21b4af52 100644 --- a/hw/block/pflash_cfi01.c +++ b/hw/block/pflash_cfi01.c @@ -72,7 +72,7 @@ struct PFlashCFI01 { uint8_t max_device_width; /* max device width in bytes */ uint32_t features; uint8_t wcycle; /* if 0, the flash is read normally */ -int ro; +bool ro; uint8_t cmd; uint8_t status; uint16_t ident0; @@ -738,7 +738,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp) return; } } else { -pfl->ro = 0; +pfl->ro = false; } if (pfl->blk) { -- 2.30.0
[RFC PATCH v3 3/3] hw/pflash_cfi01: Allow devices to have a smaller backing device
Allow the backing device to be smaller than the extent of the flash device by mapping it as a subregion of the flash device region. Return zeroes for all reads of the flash device beyond the extent of the backing device. For writes beyond the extent of the underlying device, fail on read-only devices and discard them for writable devices. Signed-off-by: David Edmondson --- hw/block/pflash_cfi01.c | 106 hw/block/trace-events | 3 ++ 2 files changed, 88 insertions(+), 21 deletions(-) diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c index 6b21b4af52..4b6cbc85f6 100644 --- a/hw/block/pflash_cfi01.c +++ b/hw/block/pflash_cfi01.c @@ -83,6 +83,8 @@ struct PFlashCFI01 { uint64_t counter; unsigned int writeblock_size; MemoryRegion mem; +MemoryRegion mem_outer; +char outer_name[64]; char *name; void *storage; VMChangeStateEntry *vmstate; @@ -425,7 +427,6 @@ static inline void pflash_data_write(PFlashCFI01 *pfl, hwaddr offset, } break; } - } static void pflash_write(PFlashCFI01 *pfl, hwaddr offset, @@ -646,8 +647,45 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset, } -static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_t *value, - unsigned len, MemTxAttrs attrs) +static MemTxResult pflash_outer_read_with_attrs(void *opaque, hwaddr addr, +uint64_t *value, +unsigned len, +MemTxAttrs attrs) +{ +PFlashCFI01 *pfl = opaque; + +trace_pflash_outer_read(pfl->name, addr, len); +*value = 0; +return MEMTX_OK; +} + +static MemTxResult pflash_outer_write_with_attrs(void *opaque, hwaddr addr, + uint64_t value, + unsigned len, + MemTxAttrs attrs) +{ +PFlashCFI01 *pfl = opaque; + +trace_pflash_outer_write(pfl->name, addr, len); +if (pfl->ro) { +return MEMTX_ERROR; +} else { +/* Discard writes. */ +warn_report_once("%s: attempt to write outside of the backing block device " + "(offset " TARGET_FMT_plx ") ignored", pfl->name, addr); +return MEMTX_OK; +} +} + +static const MemoryRegionOps pflash_cfi01_outer_ops = { +.read_with_attrs = pflash_outer_read_with_attrs, +.write_with_attrs = pflash_outer_write_with_attrs, +.endianness = DEVICE_NATIVE_ENDIAN, +}; + +static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, + uint64_t *value, unsigned len, + MemTxAttrs attrs) { PFlashCFI01 *pfl = opaque; bool be = !!(pfl->features & (1 << PFLASH_BE)); @@ -660,8 +698,9 @@ static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_ return MEMTX_OK; } -static MemTxResult pflash_mem_write_with_attrs(void *opaque, hwaddr addr, uint64_t value, - unsigned len, MemTxAttrs attrs) +static MemTxResult pflash_mem_write_with_attrs(void *opaque, hwaddr addr, + uint64_t value, unsigned len, + MemTxAttrs attrs) { PFlashCFI01 *pfl = opaque; bool be = !!(pfl->features & (1 << PFLASH_BE)); @@ -684,7 +723,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp) { ERRP_GUARD(); PFlashCFI01 *pfl = PFLASH_CFI01(dev); -uint64_t total_len; +uint64_t outer_len, inner_len; int ret; uint64_t blocks_per_device, sector_len_per_device, device_len; int num_devices; @@ -702,7 +741,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp) return; } -total_len = pfl->sector_len * pfl->nb_blocs; +outer_len = pfl->sector_len * pfl->nb_blocs; /* These are only used to expose the parameters of each device * in the cfi_table[]. @@ -717,33 +756,58 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp) } device_len = sector_len_per_device * blocks_per_device; -memory_region_init_rom_device( ->mem, OBJECT(dev), -_cfi01_ops, -pfl, -pfl->name, total_len, errp); -if (*errp) { -return; -} - -pfl->storage = memory_region_get_ram_ptr(>mem); -sysbus_init_mmio(SYS_BUS_DEVICE(dev), >mem); - if (pfl->blk) { uint64_t perm; + pfl->ro = !blk_supports_write_perm(pfl->blk); perm = BLK_PERM_CONSISTENT_READ | (pfl->ro ? 0 : BLK_PERM_WRITE); ret = blk_set_perm(pfl->blk, perm, BLK_PERM_ALL, errp); if (ret < 0) { return; } + +
[RFC PATCH v3 0/3] hw/flash_cfi01: Reduce memory consumption when flash image is smaller than region
As described in https://lore.kernel.org/r/20201116104216.439650-1-david.edmond...@oracle.com, I'd like to reduce the amount of memory consumed by QEMU mapping UEFI images on aarch64. To recap: > Currently ARM UEFI images are typically built as 2MB/768kB flash > images for code and variables respectively. These images are both > then padded out to 64MB before being loaded by QEMU. > > Because the images are 64MB each, QEMU allocates 128MB of memory to > read them, and then proceeds to read all 128MB from disk (dirtying > the memory). Of this 128MB less than 3MB is useful - the rest is > zero padding. > > On a machine with 100 VMs this wastes over 12GB of memory. There were objections to my previous patch because it changed the size of the regions reported to the guest via the memory map (the reported size depended on the size of the image). This is a smaller patch which changes the memory region that covers the entire region to be IO rather than RAM, and loads the flash image into a smaller sub-region that is the more traditional mixed IO/ROMD type. All read/write operations to areas outside of the underlying block device are handled directly (reads return 0, writes fail or are discarded). This reduces the memory consumption for the AAVMF code image from 64MiB to around 2MB and that for the AAVMF vars from 64MiB to 768KiB (presuming that the UEFI images are adjusted accordingly). For read-only devices (such as the AAVMF code) this seems completely safe. For writable devices there is a change in behaviour - previously it was possible to write anywhere in the extent of the flash device, read back the data written and have that data persist through a restart of QEMU. This is no longer the case - writes outside of the extent of the underlying backing block device will be discarded. That is, a read after a write will *not* return the written data, either immediately or after a QEMU restart - it will return zeros. Looking at the AAVMF implementation, it seems to me that if the initial VARS image is prepared as being 768KiB in size (which it is), none of AAVMF itself will attempt to write outside of that extent, and so I believe that this is an acceptable compromise. It would be relatively straightforward to allow writes outside of the backing device to persist for the lifetime of a particular QEMU by allocating memory on demand (i.e. when there is a write to the relevant region). This would allow a read to return the relevant data, but only until a QEMU restart, at which point the data would be lost. There was a suggestion in a previous thread that perhaps the pflash driver could be re-worked to use the block IO interfaces to access the underlying device "on demand" rather than reading in the entire image at startup (at least, that's how I understood the comment). I looked at implementing this and struggled to get it to work for all of the required use cases. Specifically, there are several code paths that expect to retrieve a pointer to the flat memory image of the pflash device and manipulate it directly (examples include the Malta board and encrypted memory support on x86), or write the entire image to storage (after migration). My implementation was based around mapping the flash region only for IO, which meant that every read or write had to be handled directly by the pflash driver (there was no ROMD style operation), which also made booting an aarch64 VM noticeably slower - getting through the firmware went from under 1 second to around 10 seconds. v2: - Unify the approach for both read-only and writable devices, saving another 63MiB per QEMU instance. v3: - Add Reviewed-by: for two changes (Philippe). - Call blk_pread() directly rather than using blk_check_size_and_read_all(), given that we know how much we want to read. David Edmondson (3): hw/pflash_cfi*: Replace DPRINTF with trace events hw/pflash_cfi01: Correct the type of PFlashCFI01.ro hw/pflash_cfi01: Allow devices to have a smaller backing device hw/block/pflash_cfi01.c | 188 +--- hw/block/pflash_cfi02.c | 75 +++- hw/block/trace-events | 42 +++-- 3 files changed, 181 insertions(+), 124 deletions(-) -- 2.30.0
Re: [PATCH v2 0/2] block: Use 'read-zeroes=true' mode by default with 'null-co' driver
On 2021-02-19 15:09, Philippe Mathieu-Daudé wrote: > On 2/19/21 12:07 PM, Max Reitz wrote: > > On 13.02.21 22:54, Fam Zheng wrote: > >> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote: > >>> The null-co driver doesn't zeroize buffer in its default config, > >>> because it is designed for testing and tests want to run fast. > >>> However this confuses security researchers (access to uninit > >>> buffers). > >> > >> I'm a little surprised. > >> > >> Is changing default the only way to fix this? I'm not opposed to > >> changing the default but I'm not convinced this is the easiest way. > >> block/nvme.c also doesn't touch the memory, but defers to the device > >> DMA, why doesn't that confuse the security checker? > > Generally speaking, there is a balance between security and performance. > We try to provide both, but when we can't, my understanding is security > is more important. Why is hiding the code path behind a non-default more secure? What is not secure now? Fam
Re: [PATCH v5 3/9] block-backend: Add device specific retry callback
On Fri, Feb 05, 2021 at 06:13:09PM +0800, Jiahui Cen wrote: > Add retry_request_cb in BlockDevOps to do device specific retry action. > Backend's timer would be registered only when the backend is set 'retry' > on errors and the device supports retry action. > > Signed-off-by: Jiahui Cen > Signed-off-by: Ying Fang > --- > block/block-backend.c | 8 > include/sysemu/block-backend.h | 4 > 2 files changed, 12 insertions(+) > > diff --git a/block/block-backend.c b/block/block-backend.c > index 3a9d55cbe3..a8bfaf6e4a 100644 > --- a/block/block-backend.c > +++ b/block/block-backend.c > @@ -995,6 +995,14 @@ void blk_set_dev_ops(BlockBackend *blk, const > BlockDevOps *ops, > blk->dev_ops = ops; > blk->dev_opaque = opaque; > > +if ((blk->on_read_error == BLOCKDEV_ON_ERROR_RETRY || > + blk->on_write_error == BLOCKDEV_ON_ERROR_RETRY) && > +ops->retry_request_cb) { > +blk->retry_timer = aio_timer_new(blk->ctx, QEMU_CLOCK_REALTIME, > + SCALE_MS, ops->retry_request_cb, > + opaque); > +} I didn't see anything that handles AioContext changes. If the BlockBackend is detached from the current AioContext and attached to a different one, then it is necessary to delete the old timer and create a new one. Stefan signature.asc Description: PGP signature
Re: [PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism
On Fri, Feb 05, 2021 at 06:13:06PM +0800, Jiahui Cen wrote: > This patch series propose to extend the werror=/rerror= mechanism to add > a 'retry' feature. It can automatically retry failed I/O requests on error > without sending error back to guest, and guest can get back running smoothly > when I/O is recovred. I posted a few questions but overall this looks promising. Thanks! Stefan signature.asc Description: PGP signature
Re: [PATCH v5 1/9] qapi/block-core: Add retry option for error action
On Fri, Feb 05, 2021 at 06:13:07PM +0800, Jiahui Cen wrote: > Add a new error action 'retry' to support retry on errors. > > Signed-off-by: Jiahui Cen > Signed-off-by: Ying Fang > --- > blockdev.c | 2 ++ > qapi/block-core.json | 9 +++-- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/blockdev.c b/blockdev.c > index b250b9b959..ece1d8ae58 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -342,6 +342,8 @@ static int parse_block_error_action(const char *buf, bool > is_read, Error **errp) > return BLOCKDEV_ON_ERROR_STOP; > } else if (!strcmp(buf, "report")) { > return BLOCKDEV_ON_ERROR_REPORT; > +} else if (!strcmp(buf, "retry")) { > +return BLOCKDEV_ON_ERROR_RETRY; > } else { > error_setg(errp, "'%s' invalid %s error action", > buf, is_read ? "read" : "write"); > diff --git a/qapi/block-core.json b/qapi/block-core.json > index 9f555d5c1d..30ea43cb77 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -1143,10 +1143,13 @@ > # > # @auto: inherit the error handling policy of the backend (since: 2.7) > # > +# @retry: for guest operations, retry the failing request; (since: 6.0) > +# for jobs, not supported Does this mean block_job_error_action() can now reach abort() in switch (on_err)? If yes, please add a check that reports an error when "retry" is specified so that abort() cannot be reached. Stefan signature.asc Description: PGP signature
Re: [PATCH v5 4/9] block-backend: Enable retry action on errors
On Fri, Feb 05, 2021 at 06:13:10PM +0800, Jiahui Cen wrote: > Enable retry action when backend's retry timer is available. It would > trigger the timer to do device specific retry action. > > Signed-off-by: Jiahui Cen > Signed-off-by: Ying Fang > --- > block/block-backend.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/block/block-backend.c b/block/block-backend.c > index a8bfaf6e4a..ab75cb1971 100644 > --- a/block/block-backend.c > +++ b/block/block-backend.c > @@ -1803,6 +1803,9 @@ BlockErrorAction blk_get_error_action(BlockBackend > *blk, bool is_read, > return BLOCK_ERROR_ACTION_REPORT; > case BLOCKDEV_ON_ERROR_IGNORE: > return BLOCK_ERROR_ACTION_IGNORE; > +case BLOCKDEV_ON_ERROR_RETRY: > +return (blk->retry_timer) ? > + BLOCK_ERROR_ACTION_RETRY : BLOCK_ERROR_ACTION_REPORT; > case BLOCKDEV_ON_ERROR_AUTO: > default: > abort(); > @@ -1850,6 +1853,12 @@ void blk_error_action(BlockBackend *blk, > BlockErrorAction action, > qemu_system_vmstop_request_prepare(); > send_qmp_error_event(blk, action, is_read, error); > qemu_system_vmstop_request(RUN_STATE_IO_ERROR); > +} else if (action == BLOCK_ERROR_ACTION_RETRY) { > +if (!blk->quiesce_counter) { > +timer_mod(blk->retry_timer, > qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + > +blk->retry_interval); QEMU should not make any guest-visible changes while the guest is stopped (vm_stop()). Please use QEMU_CLOCK_VIRTUAL so the timer does not fire while the guest is stopped. Stefan signature.asc Description: PGP signature
Re: [PATCH v5 6/9] block: Add error retry param setting
On Fri, Feb 05, 2021 at 06:13:12PM +0800, Jiahui Cen wrote: > @@ -581,6 +582,10 @@ static BlockBackend *blockdev_init(const char *file, > QDict *bs_opts, > } > } > > +retry_interval = qemu_opt_get_number(opts, "retry_interval", > + > BLOCK_BACKEND_DEFAULT_RETRY_INTERVAL); > +retry_timeout = qemu_opt_get_number(opts, "retry_timeout", 0); Please use "retry-interval" and "retry-timeout". "-" is used more often for command-line parameters than "_". signature.asc Description: PGP signature
Re: [PATCH] virtio-blk: Respect discard granularity
On Fri, Feb 19, 2021 at 07:19:19PM +0900, Akihiko Odaki wrote: > Signed-off-by: Akihiko Odaki > --- > hw/block/virtio-blk.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c > index bac2d6fa2b2..692fd17b0e0 100644 > --- a/hw/block/virtio-blk.c > +++ b/hw/block/virtio-blk.c > @@ -965,7 +965,7 @@ static void virtio_blk_update_config(VirtIODevice *vdev, > uint8_t *config) > virtio_stl_p(vdev, _discard_sectors, > s->conf.max_discard_sectors); > virtio_stl_p(vdev, _sector_alignment, > - blk_size >> BDRV_SECTOR_BITS); > + conf->discard_granularity >> BDRV_SECTOR_BITS); Please handle the -1 default value like this: uint32_t discard_granularity = conf->discard_granularity; if (conf->discard_granularity == -1) { discard_granularity = BDRV_SECTOR_SIZE; } I noticed this when comparing the blk_size and discard_granularity values when I run QEMU: $ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host -drive if=virtio,file=test.img,format=raw blk_size 512 discard_granularity 4294967295 Also, please add a compat prop in hw/core/machine.c to ensure that existing machine types are unaffected by this change. This can be done by adding DEFINE_PROP_BOOL("report-discard-granularity", ...) in hw/block/virtio-blk.c and then setting it to false for existing machine types in hw/core/machine.c. Then new machine types benefit from the new feature but existing machine types will be unchanged (eliminating the risk of live migration/snapshot incompatibilities when the device unexpectedly changes behavior while the guest is running). Stefan signature.asc Description: PGP signature
Re: [PATCH v2 01/22] block: add eMMC block device type
* Sai Pavan Boddu (saip...@xilinx.com) wrote: > Hi Philippe, > > > -Original Message- > > From: Philippe Mathieu-Daudé > > Sent: Monday, February 22, 2021 6:54 PM > > To: Dr. David Alan Gilbert ; Markus Armbruster > > > > Cc: Sai Pavan Boddu ; Kevin Wolf ; > > Max Reitz ; Vladimir Sementsov-Ogievskiy > > ; Eric Blake ; Joel Stanley > > ; Cédric Le Goater ; Vincent Palatin > > ; Thomas Huth ; Stefan > > Hajnoczi ; Peter Maydell ; > > Alistair Francis ; Edgar Iglesias > > ; > > Luc Michel ; Paolo Bonzini > > ; Sai Pavan Boddu ; qemu- > > de...@nongnu.org; qemu-block@nongnu.org > > Subject: Re: [PATCH v2 01/22] block: add eMMC block device type > > > > On 2/22/21 2:16 PM, Dr. David Alan Gilbert wrote: > > > * Markus Armbruster (arm...@redhat.com) wrote: > > >> Philippe Mathieu-Daudé writes: > > >> > > >>> On 2/22/21 9:20 AM, Sai Pavan Boddu wrote: > > From: Vincent Palatin > > > > Add new block device type. > > > > Signed-off-by: Vincent Palatin > > [SPB: Rebased over 5.1 version] > > Signed-off-by: Sai Pavan Boddu > > Signed-off-by: Joel Stanley > > Signed-off-by: Cédric Le Goater > > Reviewed-by: Alistair Francis > > --- > > include/sysemu/blockdev.h | 1 + > > blockdev.c| 1 + > > 2 files changed, 2 insertions(+) > > > > diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h > > index 3b5fcda..eefae9f 100644 > > --- a/include/sysemu/blockdev.h > > +++ b/include/sysemu/blockdev.h > > @@ -24,6 +24,7 @@ typedef enum { > > */ > > IF_NONE = 0, > > IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, > > IF_VIRTIO, IF_XEN, > > +IF_EMMC, > > IF_COUNT > > } BlockInterfaceType; > > > > diff --git a/blockdev.c b/blockdev.c index cd438e6..390d43c 100644 > > --- a/blockdev.c > > +++ b/blockdev.c > > @@ -83,6 +83,7 @@ static const char *const if_name[IF_COUNT] = { > > [IF_SD] = "sd", > > [IF_VIRTIO] = "virtio", > > [IF_XEN] = "xen", > > +[IF_EMMC] = "emmc", > > }; > > >>> > > >>> We don't need to introduce support for the legacy -drive magic. > > >>> > > >>> -device should be enough for this device, right? > > >> > > >> External interface extensions need rationale: why do we want / need it? > > >> The commit message neglects to provide one. > > >> > > >> Even more so when the interface in question is in a state like -drive > > >> is. > > > > > > I wouldn't be too nasty about -drive; for me I still find it the > > > easiest way to start a VM. > > > > But eMMC isn't a bus where you can plug drives, it is soldered on-board and > > is > > mmio mapped to a fixed address. I don't see the point of having a drive > > interface for it... > [Sai Pavan Boddu] Yeah, this makes sense but having a drive would be a simple > implementation without disturbing much in the sd card emulation code. And its > just easy to use, just as how sd cards are inserted. > > I need to see, how easy it would be with -device. Lets see what your command line looks like for starting it with emmc. Dave > Thanks, > Sai Pavan > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
RE: [PATCH v2 01/22] block: add eMMC block device type
Hi Philippe, > -Original Message- > From: Philippe Mathieu-Daudé > Sent: Monday, February 22, 2021 6:54 PM > To: Dr. David Alan Gilbert ; Markus Armbruster > > Cc: Sai Pavan Boddu ; Kevin Wolf ; > Max Reitz ; Vladimir Sementsov-Ogievskiy > ; Eric Blake ; Joel Stanley > ; Cédric Le Goater ; Vincent Palatin > ; Thomas Huth ; Stefan > Hajnoczi ; Peter Maydell ; > Alistair Francis ; Edgar Iglesias > ; > Luc Michel ; Paolo Bonzini > ; Sai Pavan Boddu ; qemu- > de...@nongnu.org; qemu-block@nongnu.org > Subject: Re: [PATCH v2 01/22] block: add eMMC block device type > > On 2/22/21 2:16 PM, Dr. David Alan Gilbert wrote: > > * Markus Armbruster (arm...@redhat.com) wrote: > >> Philippe Mathieu-Daudé writes: > >> > >>> On 2/22/21 9:20 AM, Sai Pavan Boddu wrote: > From: Vincent Palatin > > Add new block device type. > > Signed-off-by: Vincent Palatin > [SPB: Rebased over 5.1 version] > Signed-off-by: Sai Pavan Boddu > Signed-off-by: Joel Stanley > Signed-off-by: Cédric Le Goater > Reviewed-by: Alistair Francis > --- > include/sysemu/blockdev.h | 1 + > blockdev.c| 1 + > 2 files changed, 2 insertions(+) > > diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h > index 3b5fcda..eefae9f 100644 > --- a/include/sysemu/blockdev.h > +++ b/include/sysemu/blockdev.h > @@ -24,6 +24,7 @@ typedef enum { > */ > IF_NONE = 0, > IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, > IF_VIRTIO, IF_XEN, > +IF_EMMC, > IF_COUNT > } BlockInterfaceType; > > diff --git a/blockdev.c b/blockdev.c index cd438e6..390d43c 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -83,6 +83,7 @@ static const char *const if_name[IF_COUNT] = { > [IF_SD] = "sd", > [IF_VIRTIO] = "virtio", > [IF_XEN] = "xen", > +[IF_EMMC] = "emmc", > }; > >>> > >>> We don't need to introduce support for the legacy -drive magic. > >>> > >>> -device should be enough for this device, right? > >> > >> External interface extensions need rationale: why do we want / need it? > >> The commit message neglects to provide one. > >> > >> Even more so when the interface in question is in a state like -drive > >> is. > > > > I wouldn't be too nasty about -drive; for me I still find it the > > easiest way to start a VM. > > But eMMC isn't a bus where you can plug drives, it is soldered on-board and is > mmio mapped to a fixed address. I don't see the point of having a drive > interface for it... [Sai Pavan Boddu] Yeah, this makes sense but having a drive would be a simple implementation without disturbing much in the sd card emulation code. And its just easy to use, just as how sd cards are inserted. I need to see, how easy it would be with -device. Thanks, Sai Pavan
Re: [PATCH 0/2] sysemu: Let VMChangeStateHandler take boolean 'running' argument
Paolo, this series is fully reviewed, can it go via your misc tree? On 1/11/21 4:20 PM, Philippe Mathieu-Daudé wrote: > Trivial prototype change to clarify the use of the 'running' > argument of VMChangeStateHandler. > > Green CI: > https://gitlab.com/philmd/qemu/-/pipelines/239497352 > > Philippe Mathieu-Daudé (2): > sysemu/runstate: Let runstate_is_running() return bool > sysemu: Let VMChangeStateHandler take boolean 'running' argument > > include/sysemu/runstate.h | 12 +--- > target/arm/kvm_arm.h| 2 +- > target/ppc/cpu-qom.h| 2 +- > accel/xen/xen-all.c | 2 +- > audio/audio.c | 2 +- > block/block-backend.c | 2 +- > gdbstub.c | 2 +- > hw/block/pflash_cfi01.c | 2 +- > hw/block/virtio-blk.c | 2 +- > hw/display/qxl.c| 2 +- > hw/i386/kvm/clock.c | 2 +- > hw/i386/kvm/i8254.c | 2 +- > hw/i386/kvmvapic.c | 2 +- > hw/i386/xen/xen-hvm.c | 2 +- > hw/ide/core.c | 2 +- > hw/intc/arm_gicv3_its_kvm.c | 2 +- > hw/intc/arm_gicv3_kvm.c | 2 +- > hw/intc/spapr_xive_kvm.c| 2 +- > hw/misc/mac_via.c | 2 +- > hw/net/e1000e_core.c| 2 +- > hw/nvram/spapr_nvram.c | 2 +- > hw/ppc/ppc.c| 2 +- > hw/ppc/ppc_booke.c | 2 +- > hw/s390x/tod-kvm.c | 2 +- > hw/scsi/scsi-bus.c | 2 +- > hw/usb/hcd-ehci.c | 2 +- > hw/usb/host-libusb.c| 2 +- > hw/usb/redirect.c | 2 +- > hw/vfio/migration.c | 2 +- > hw/virtio/virtio-rng.c | 2 +- > hw/virtio/virtio.c | 2 +- > net/net.c | 2 +- > softmmu/memory.c| 2 +- > softmmu/runstate.c | 4 ++-- > target/arm/kvm.c| 2 +- > target/i386/kvm/kvm.c | 2 +- > target/i386/sev.c | 2 +- > target/i386/whpx/whpx-all.c | 2 +- > target/mips/kvm.c | 4 ++-- > ui/gtk.c| 2 +- > ui/spice-core.c | 2 +- > 41 files changed, 51 insertions(+), 45 deletions(-) >
Re: [RFC PATCH v2 3/3] hw/pflash_cfi01: Allow devices to have a smaller backing device
On Monday, 2021-02-22 at 15:06:35 +01, Philippe Mathieu-Daudé wrote: > Hi David, > > On 2/22/21 10:07 AM, David Edmondson wrote: >> Allow the backing device to be smaller than the extent of the flash >> device by mapping it as a subregion of the flash device region. >> >> Return zeroes for all reads of the flash device beyond the extent of >> the backing device. >> >> For writes beyond the extent of the underlying device, fail on >> read-only devices and discard them for writable devices. > > This looks much simpler now. Thanks for looking at it. >> Signed-off-by: David Edmondson >> --- >> hw/block/pflash_cfi01.c | 108 ++-- >> hw/block/trace-events | 3 ++ >> 2 files changed, 86 insertions(+), 25 deletions(-) > >> if (pfl->blk) { >> uint64_t perm; >> + >> pfl->ro = !blk_supports_write_perm(pfl->blk); >> perm = BLK_PERM_CONSISTENT_READ | (pfl->ro ? 0 : BLK_PERM_WRITE); >> ret = blk_set_perm(pfl->blk, perm, BLK_PERM_ALL, errp); >> if (ret < 0) { >> return; >> } >> + >> +inner_len = blk_getlength(pfl->blk); >> + >> +if (inner_len > outer_len) { >> +error_setg(errp, >> + "block backend provides %" HWADDR_PRIu " bytes, " >> + "device limited to %" PRIu64 " bytes", >> + inner_len, outer_len); >> +return; >> +} > > Do you mind extracting this change in a previous patch? Before this change that test was performed by blk_check_size_and_read_all(), which insisted that the block device was exactly the same size as the region (because it was told to read enough bytes to fill the region). Now that we pass the size of the backing device to that function, rather than the size of the region, it's necessary to add the extra check before the call. Hence, I don't think that it's useful to add this as an earlier patch. It would perhaps make sense to switch away from blk_check_size_and_read_all(), preferring to just call blk_pread() now, which would be cleaner? dme. -- Do I have to tell the story, of a thousand rainy days since we first met?
[PATCH v4 2/8] hw/sh4: Add missing Kconfig dependency on SH7750 for the R2D board
r2d_init() calls sh7750_init() so depends on SH7750.Harmless at the moment because nothing actually uses CONFIG_SH7750 (hw/sh4/meson.build always compiles sh7750.c and sh7750_regnames.c unconditionally). Fixes: 7ab58d4c841 ("sh4-softmmu.mak: express dependencies with Kconfig") Reported-by: Peter Maydell Reviewed-by: Peter Maydell Signed-off-by: Philippe Mathieu-Daudé --- hw/sh4/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/sh4/Kconfig b/hw/sh4/Kconfig index 4cbce3a0ed5..0452b4624ae 100644 --- a/hw/sh4/Kconfig +++ b/hw/sh4/Kconfig @@ -10,6 +10,7 @@ config R2D select PCI select SM501 select SH4 +select SH7750 config SHIX bool -- 2.26.2
Re: [PATCH v3 1/8] hw/sh4: Add missing license
On Mon, 22 Feb 2021 at 14:13, Philippe Mathieu-Daudé wrote: > > On 2/22/21 2:55 PM, Peter Maydell wrote: > > On Sun, 21 Feb 2021 at 21:59, Philippe Mathieu-Daudé > > wrote: > >> > >> This code was introduced in commit 27c7ca7e775, > >> ("SHIX board emulation (Samuel Tardieu)"). Use > >> the same license. > > I thought it wasn't generally recommended to convert > > a license text to a single SPDX line? The sh7750.c file > > has a full 3-para license text. > > Yes you are right, sorry. I'll respin. Also on the subject of sh4 code licenses, hw/sh4/sh7750_regs.h has * The license and distribution terms for this file may be * found in the file LICENSE in this distribution or at * http://www.rtems.com/license/LICENSE. because it's borrowed from RTEMS. (The license at the URL is gpl-2-or-later with a header exception). Maybe we should expand the license in-place so we're not dependent on a 3rd party website to stay up to tell us what the license on the file is? In particular I assume the comment really means "the file LICENSE in the rtems distribution" https://git.rtems.org/rtems/tree/LICENSE (same text as web page) and not the file LICENSE in QEMU, which is not the same thing, so that part is actively misleading. -- PMM
[PATCH v4 0/8] hw/sh4: Kconfig cleanups
Missing review: 1 (license) Since v3: - Include full MIT license text (Peter) Since v2: - Added missing TC58128/SH_PCI Kconfig entries (Peter) Since v1: - Addressed Peter Maydell review comments from https://www.mail-archive.com/qemu-block@nongnu.org/msg80599.html Philippe Mathieu-Daudé (8): hw/sh4: Add missing license hw/sh4: Add missing Kconfig dependency on SH7750 for the R2D board hw/intc: Introduce SH_INTC Kconfig entry hw/char: Introduce SH_SCI Kconfig entry hw/timer: Introduce SH_TIMER Kconfig entry hw/block: Introduce TC58128 eeprom Kconfig entry hw/pci-host: Introduce SH_PCI Kconfig entry hw/sh4: Remove now unused CONFIG_SH4 from Kconfig include/hw/sh4/sh.h | 31 --- hw/block/tc58128.c| 26 ++ hw/{sh4 => pci-host}/sh_pci.c | 0 MAINTAINERS | 6 ++ hw/block/Kconfig | 3 +++ hw/block/meson.build | 2 +- hw/char/Kconfig | 3 +++ hw/char/meson.build | 2 +- hw/intc/Kconfig | 3 +++ hw/intc/meson.build | 2 +- hw/pci-host/Kconfig | 4 hw/pci-host/meson.build | 1 + hw/sh4/Kconfig| 12 ++-- hw/sh4/meson.build| 1 - hw/timer/Kconfig | 4 hw/timer/meson.build | 2 +- 16 files changed, 88 insertions(+), 14 deletions(-) rename hw/{sh4 => pci-host}/sh_pci.c (100%) -- 2.26.2
[PATCH v4 4/8] hw/char: Introduce SH_SCI Kconfig entry
We want to be able to use the 'SH4' config for architecture specific features. Add more fine-grained selection by adding a CONFIG_SH_SCI selector for the SH4 serial controller. Add the missing MAINTAINERS entries. Suggested-by: Peter Maydell Reviewed-by: Peter Maydell Signed-off-by: Philippe Mathieu-Daudé --- MAINTAINERS | 2 ++ hw/char/Kconfig | 3 +++ hw/char/meson.build | 2 +- hw/sh4/Kconfig | 1 + 4 files changed, 7 insertions(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 9b2aa18e1fe..b53a937714a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1388,6 +1388,7 @@ R2D M: Yoshinori Sato R: Magnus Damm S: Odd Fixes +F: hw/char/sh_serial.c F: hw/sh4/r2d.c F: hw/intc/sh_intc.c F: include/hw/sh4/sh_intc.h @@ -1396,6 +1397,7 @@ Shix M: Yoshinori Sato R: Magnus Damm S: Odd Fixes +F: hw/char/sh_serial.c F: hw/sh4/shix.c F: hw/intc/sh_intc.c F: include/hw/sh4/sh_intc.h diff --git a/hw/char/Kconfig b/hw/char/Kconfig index 939bc447588..f6f4fffd1b7 100644 --- a/hw/char/Kconfig +++ b/hw/char/Kconfig @@ -50,6 +50,9 @@ config SCLPCONSOLE config TERMINAL3270 bool +config SH_SCI +bool + config RENESAS_SCI bool diff --git a/hw/char/meson.build b/hw/char/meson.build index 196ac91fa29..afe9a0af88c 100644 --- a/hw/char/meson.build +++ b/hw/char/meson.build @@ -31,7 +31,7 @@ softmmu_ss.add(when: 'CONFIG_RASPI', if_true: files('bcm2835_aux.c')) softmmu_ss.add(when: 'CONFIG_RENESAS_SCI', if_true: files('renesas_sci.c')) softmmu_ss.add(when: 'CONFIG_SIFIVE_UART', if_true: files('sifive_uart.c')) -softmmu_ss.add(when: 'CONFIG_SH4', if_true: files('sh_serial.c')) +softmmu_ss.add(when: 'CONFIG_SH_SCI', if_true: files('sh_serial.c')) softmmu_ss.add(when: 'CONFIG_STM32F2XX_USART', if_true: files('stm32f2xx_usart.c')) softmmu_ss.add(when: 'CONFIG_MCHP_PFSOC_MMUART', if_true: files('mchp_pfsoc_mmuart.c')) diff --git a/hw/sh4/Kconfig b/hw/sh4/Kconfig index c2008c6a0d2..47240aa97b7 100644 --- a/hw/sh4/Kconfig +++ b/hw/sh4/Kconfig @@ -20,6 +20,7 @@ config SHIX config SH7750 bool select SH_INTC +select SH_SCI config SH4 bool -- 2.26.2
[PATCH v4 8/8] hw/sh4: Remove now unused CONFIG_SH4 from Kconfig
As replaced the generic CONFIG_SH4 by more fine-grained selectors, we can remove this now unused config variable. Reviewed-by: Peter Maydell Signed-off-by: Philippe Mathieu-Daudé --- hw/sh4/Kconfig | 5 - 1 file changed, 5 deletions(-) diff --git a/hw/sh4/Kconfig b/hw/sh4/Kconfig index b9f0538d57f..ab733a3f760 100644 --- a/hw/sh4/Kconfig +++ b/hw/sh4/Kconfig @@ -9,14 +9,12 @@ config R2D select USB_OHCI_PCI select PCI select SM501 -select SH4 select SH7750 select SH_PCI config SHIX bool select SH7750 -select SH4 select TC58128 config SH7750 @@ -24,6 +22,3 @@ config SH7750 select SH_INTC select SH_SCI select SH_TIMER - -config SH4 -bool -- 2.26.2
[PATCH v4 6/8] hw/block: Introduce TC58128 eeprom Kconfig entry
Add more fine-grained selection by adding a CONFIG_TC58128 selector for the TC58128 eeprom. As this device is only used by the Shix machine, add an entry to the proper section in MAINTAINERS. Suggested-by: Peter Maydell Reviewed-by: Peter Maydell Signed-off-by: Philippe Mathieu-Daudé --- MAINTAINERS | 1 + hw/block/Kconfig | 3 +++ hw/block/meson.build | 2 +- hw/sh4/Kconfig | 1 + 4 files changed, 6 insertions(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index a03557f4f7c..a0cb89161cb 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1398,6 +1398,7 @@ Shix M: Yoshinori Sato R: Magnus Damm S: Odd Fixes +F: hw/block/tc58128.c F: hw/char/sh_serial.c F: hw/sh4/shix.c F: hw/intc/sh_intc.c diff --git a/hw/block/Kconfig b/hw/block/Kconfig index 2d17f481adc..4fcd1521668 100644 --- a/hw/block/Kconfig +++ b/hw/block/Kconfig @@ -22,6 +22,9 @@ config ECC config ONENAND bool +config TC58128 +bool + config NVME_PCI bool default y if PCI_DEVICES diff --git a/hw/block/meson.build b/hw/block/meson.build index 602ca6c8541..4bf994c64ff 100644 --- a/hw/block/meson.build +++ b/hw/block/meson.build @@ -12,7 +12,7 @@ softmmu_ss.add(when: 'CONFIG_SSI_M25P80', if_true: files('m25p80.c')) softmmu_ss.add(when: 'CONFIG_SWIM', if_true: files('swim.c')) softmmu_ss.add(when: 'CONFIG_XEN', if_true: files('xen-block.c')) -softmmu_ss.add(when: 'CONFIG_SH4', if_true: files('tc58128.c')) +softmmu_ss.add(when: 'CONFIG_TC58128', if_true: files('tc58128.c')) softmmu_ss.add(when: 'CONFIG_NVME_PCI', if_true: files('nvme.c', 'nvme-ns.c')) specific_ss.add(when: 'CONFIG_VIRTIO_BLK', if_true: files('virtio-blk.c')) diff --git a/hw/sh4/Kconfig b/hw/sh4/Kconfig index e569470a614..34c01dadde9 100644 --- a/hw/sh4/Kconfig +++ b/hw/sh4/Kconfig @@ -16,6 +16,7 @@ config SHIX bool select SH7750 select SH4 +select TC58128 config SH7750 bool -- 2.26.2
[PATCH v4 3/8] hw/intc: Introduce SH_INTC Kconfig entry
We want to be able to use the 'SH4' config for architecture specific features. Add more fine-grained selection by adding a CONFIG_SH_INTC selector for the SH4 interrupt controller. Suggested-by: Peter Maydell Reviewed-by: Peter Maydell Signed-off-by: Philippe Mathieu-Daudé --- hw/intc/Kconfig | 3 +++ hw/intc/meson.build | 2 +- hw/sh4/Kconfig | 1 + 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig index c18d11142a8..66bf0b90b47 100644 --- a/hw/intc/Kconfig +++ b/hw/intc/Kconfig @@ -53,6 +53,9 @@ config OMPIC config PPC_UIC bool +config SH_INTC +bool + config RX_ICU bool diff --git a/hw/intc/meson.build b/hw/intc/meson.build index 53cba115690..b3d9345a0d2 100644 --- a/hw/intc/meson.build +++ b/hw/intc/meson.build @@ -47,7 +47,7 @@ specific_ss.add(when: 'CONFIG_RX_ICU', if_true: files('rx_icu.c')) specific_ss.add(when: 'CONFIG_S390_FLIC', if_true: files('s390_flic.c')) specific_ss.add(when: 'CONFIG_S390_FLIC_KVM', if_true: files('s390_flic_kvm.c')) -specific_ss.add(when: 'CONFIG_SH4', if_true: files('sh_intc.c')) +specific_ss.add(when: 'CONFIG_SH_INTC', if_true: files('sh_intc.c')) specific_ss.add(when: 'CONFIG_SIFIVE_CLINT', if_true: files('sifive_clint.c')) specific_ss.add(when: 'CONFIG_SIFIVE_PLIC', if_true: files('sifive_plic.c')) specific_ss.add(when: 'CONFIG_XICS', if_true: files('xics.c')) diff --git a/hw/sh4/Kconfig b/hw/sh4/Kconfig index 0452b4624ae..c2008c6a0d2 100644 --- a/hw/sh4/Kconfig +++ b/hw/sh4/Kconfig @@ -19,6 +19,7 @@ config SHIX config SH7750 bool +select SH_INTC config SH4 bool -- 2.26.2
[PATCH v4 7/8] hw/pci-host: Introduce SH_PCI Kconfig entry
We want to be able to use the 'SH4' config for architecture specific features. Add more fine-grained selection by adding a CONFIG_SH_PCI selector for the SH4 PCI controller. Move the file with the other PCI host devices in hw/pci-host and add its missing MAINTAINERS entries. Suggested-by: Peter Maydell Reviewed-by: Peter Maydell Signed-off-by: Philippe Mathieu-Daudé --- hw/{sh4 => pci-host}/sh_pci.c | 0 MAINTAINERS | 1 + hw/pci-host/Kconfig | 4 hw/pci-host/meson.build | 1 + hw/sh4/Kconfig| 1 + hw/sh4/meson.build| 1 - 6 files changed, 7 insertions(+), 1 deletion(-) rename hw/{sh4 => pci-host}/sh_pci.c (100%) diff --git a/hw/sh4/sh_pci.c b/hw/pci-host/sh_pci.c similarity index 100% rename from hw/sh4/sh_pci.c rename to hw/pci-host/sh_pci.c diff --git a/MAINTAINERS b/MAINTAINERS index a0cb89161cb..e39997f129a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1391,6 +1391,7 @@ S: Odd Fixes F: hw/char/sh_serial.c F: hw/sh4/r2d.c F: hw/intc/sh_intc.c +F: hw/pci-host/sh_pci.c F: hw/timer/sh_timer.c F: include/hw/sh4/sh_intc.h diff --git a/hw/pci-host/Kconfig b/hw/pci-host/Kconfig index 8b8c763c28c..2ccc96f02ce 100644 --- a/hw/pci-host/Kconfig +++ b/hw/pci-host/Kconfig @@ -68,3 +68,7 @@ config PCI_POWERNV config REMOTE_PCIHOST bool + +config SH_PCI +bool +select PCI diff --git a/hw/pci-host/meson.build b/hw/pci-host/meson.build index 1847c69905c..87a896973e7 100644 --- a/hw/pci-host/meson.build +++ b/hw/pci-host/meson.build @@ -10,6 +10,7 @@ pci_ss.add(when: 'CONFIG_PCI_SABRE', if_true: files('sabre.c')) pci_ss.add(when: 'CONFIG_XEN_IGD_PASSTHROUGH', if_true: files('xen_igd_pt.c')) pci_ss.add(when: 'CONFIG_REMOTE_PCIHOST', if_true: files('remote.c')) +pci_ss.add(when: 'CONFIG_SH_PCI', if_true: files('sh_pci.c')) # PPC devices pci_ss.add(when: 'CONFIG_PREP_PCI', if_true: files('prep.c')) diff --git a/hw/sh4/Kconfig b/hw/sh4/Kconfig index 34c01dadde9..b9f0538d57f 100644 --- a/hw/sh4/Kconfig +++ b/hw/sh4/Kconfig @@ -11,6 +11,7 @@ config R2D select SM501 select SH4 select SH7750 +select SH_PCI config SHIX bool diff --git a/hw/sh4/meson.build b/hw/sh4/meson.build index 303c0f42879..424d5674dea 100644 --- a/hw/sh4/meson.build +++ b/hw/sh4/meson.build @@ -2,7 +2,6 @@ sh4_ss.add(files( 'sh7750.c', 'sh7750_regnames.c', - 'sh_pci.c' )) sh4_ss.add(when: 'CONFIG_R2D', if_true: files('r2d.c')) sh4_ss.add(when: 'CONFIG_SHIX', if_true: files('shix.c')) -- 2.26.2
[PATCH v4 5/8] hw/timer: Introduce SH_TIMER Kconfig entry
We want to be able to use the 'SH4' config for architecture specific features. Add more fine-grained selection by adding a CONFIG_SH_TIMER selector for the SH4 timer control unit. Add the missing MAINTAINERS entries. Suggested-by: Peter Maydell Reviewed-by: Peter Maydell Signed-off-by: Philippe Mathieu-Daudé --- MAINTAINERS | 2 ++ hw/sh4/Kconfig | 2 +- hw/timer/Kconfig | 4 hw/timer/meson.build | 2 +- 4 files changed, 8 insertions(+), 2 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index b53a937714a..a03557f4f7c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1391,6 +1391,7 @@ S: Odd Fixes F: hw/char/sh_serial.c F: hw/sh4/r2d.c F: hw/intc/sh_intc.c +F: hw/timer/sh_timer.c F: include/hw/sh4/sh_intc.h Shix @@ -1400,6 +1401,7 @@ S: Odd Fixes F: hw/char/sh_serial.c F: hw/sh4/shix.c F: hw/intc/sh_intc.c +F: hw/timer/sh_timer.c F: include/hw/sh4/sh_intc.h SPARC Machines diff --git a/hw/sh4/Kconfig b/hw/sh4/Kconfig index 47240aa97b7..e569470a614 100644 --- a/hw/sh4/Kconfig +++ b/hw/sh4/Kconfig @@ -21,7 +21,7 @@ config SH7750 bool select SH_INTC select SH_SCI +select SH_TIMER config SH4 bool -select PTIMER diff --git a/hw/timer/Kconfig b/hw/timer/Kconfig index 8749edfb6a5..18936ef55bf 100644 --- a/hw/timer/Kconfig +++ b/hw/timer/Kconfig @@ -36,6 +36,10 @@ config CMSDK_APB_DUALTIMER bool select PTIMER +config SH_TIMER +bool +select PTIMER + config RENESAS_TMR bool diff --git a/hw/timer/meson.build b/hw/timer/meson.build index be343f68fed..26c2701fd78 100644 --- a/hw/timer/meson.build +++ b/hw/timer/meson.build @@ -30,7 +30,7 @@ softmmu_ss.add(when: 'CONFIG_PUV3', if_true: files('puv3_ost.c')) softmmu_ss.add(when: 'CONFIG_PXA2XX', if_true: files('pxa2xx_timer.c')) softmmu_ss.add(when: 'CONFIG_RASPI', if_true: files('bcm2835_systmr.c')) -softmmu_ss.add(when: 'CONFIG_SH4', if_true: files('sh_timer.c')) +softmmu_ss.add(when: 'CONFIG_SH_TIMER', if_true: files('sh_timer.c')) softmmu_ss.add(when: 'CONFIG_SLAVIO', if_true: files('slavio_timer.c')) softmmu_ss.add(when: 'CONFIG_STM32F2XX_TIMER', if_true: files('stm32f2xx_timer.c')) softmmu_ss.add(when: 'CONFIG_XILINX', if_true: files('xilinx_timer.c')) -- 2.26.2
[PATCH v4 1/8] hw/sh4: Add missing license
This code was introduced in commit 27c7ca7e775, ("SHIX board emulation (Samuel Tardieu)"). Use the same license. Cc: Samuel Tardieu Signed-off-by: Philippe Mathieu-Daudé --- include/hw/sh4/sh.h | 31 --- hw/block/tc58128.c | 26 ++ 2 files changed, 54 insertions(+), 3 deletions(-) diff --git a/include/hw/sh4/sh.h b/include/hw/sh4/sh.h index 93f464bf4cd..becb5969790 100644 --- a/include/hw/sh4/sh.h +++ b/include/hw/sh4/sh.h @@ -1,6 +1,31 @@ -#ifndef QEMU_SH_H -#define QEMU_SH_H -/* Definitions for SH board emulation. */ +/* + * Definitions for SH board emulation + * + * Copyright (c) 2005 Samuel Tardieu + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + * + * SPDX-License-Identifier: MIT + */ +#ifndef QEMU_HW_SH_H +#define QEMU_HW_SH_H #include "hw/sh4/sh_intc.h" #include "target/sh4/cpu-qom.h" diff --git a/hw/block/tc58128.c b/hw/block/tc58128.c index 9888f01ac60..bfc27ad8993 100644 --- a/hw/block/tc58128.c +++ b/hw/block/tc58128.c @@ -1,3 +1,29 @@ +/* + * TC58128 NAND EEPROM emulation + * + * Copyright (c) 2005 Samuel Tardieu + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + * + * SPDX-License-Identifier: MIT + */ #include "qemu/osdep.h" #include "qemu/units.h" #include "hw/sh4/sh.h" -- 2.26.2
Re: [PATCH v3 1/8] hw/sh4: Add missing license
On 2/22/21 2:55 PM, Peter Maydell wrote: > On Sun, 21 Feb 2021 at 21:59, Philippe Mathieu-Daudé wrote: >> >> This code was introduced in commit 27c7ca7e775, >> ("SHIX board emulation (Samuel Tardieu)"). Use >> the same license. >> >> Cc: Samuel Tardieu >> Signed-off-by: Philippe Mathieu-Daudé >> --- >> include/hw/sh4/sh.h | 12 +--- >> hw/block/tc58128.c | 7 +++ >> 2 files changed, 16 insertions(+), 3 deletions(-) >> >> diff --git a/include/hw/sh4/sh.h b/include/hw/sh4/sh.h >> index 93f464bf4cd..33dde7a56dd 100644 >> --- a/include/hw/sh4/sh.h >> +++ b/include/hw/sh4/sh.h >> @@ -1,6 +1,12 @@ >> -#ifndef QEMU_SH_H >> -#define QEMU_SH_H >> -/* Definitions for SH board emulation. */ >> +/* >> + * Definitions for SH board emulation >> + * >> + * Copyright (c) 2005 Samuel Tardieu >> + * >> + * SPDX-License-Identifier: MIT >> + */ > > I thought it wasn't generally recommended to convert > a license text to a single SPDX line? The sh7750.c file > has a full 3-para license text. Yes you are right, sorry. I'll respin. Phil.
Re: [RFC PATCH v2 3/3] hw/pflash_cfi01: Allow devices to have a smaller backing device
Hi David, On 2/22/21 10:07 AM, David Edmondson wrote: > Allow the backing device to be smaller than the extent of the flash > device by mapping it as a subregion of the flash device region. > > Return zeroes for all reads of the flash device beyond the extent of > the backing device. > > For writes beyond the extent of the underlying device, fail on > read-only devices and discard them for writable devices. This looks much simpler now. > Signed-off-by: David Edmondson > --- > hw/block/pflash_cfi01.c | 108 ++-- > hw/block/trace-events | 3 ++ > 2 files changed, 86 insertions(+), 25 deletions(-) > if (pfl->blk) { > uint64_t perm; > + > pfl->ro = !blk_supports_write_perm(pfl->blk); > perm = BLK_PERM_CONSISTENT_READ | (pfl->ro ? 0 : BLK_PERM_WRITE); > ret = blk_set_perm(pfl->blk, perm, BLK_PERM_ALL, errp); > if (ret < 0) { > return; > } > + > +inner_len = blk_getlength(pfl->blk); > + > +if (inner_len > outer_len) { > +error_setg(errp, > + "block backend provides %" HWADDR_PRIu " bytes, " > + "device limited to %" PRIu64 " bytes", > + inner_len, outer_len); > +return; > +} Do you mind extracting this change in a previous patch? Thanks, Phil.
Re: [RFC PATCH v2 2/3] hw/pflash_cfi01: Correct the type of PFlashCFI01.ro
On 2/22/21 10:07 AM, David Edmondson wrote: > PFlashCFI01.ro is a bool, declare it as such. > > Signed-off-by: David Edmondson > --- > hw/block/pflash_cfi01.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [RFC PATCH v2 1/3] hw/pflash_cfi*: Replace DPRINTF with trace events
On 2/22/21 10:07 AM, David Edmondson wrote: > Rather than having a device specific debug implementation in > pflash_cfi01.c and pflash_cfi02.c, use the standard tracing facility. > > Signed-off-by: David Edmondson > --- > hw/block/pflash_cfi01.c | 78 + > hw/block/pflash_cfi02.c | 75 +++ > hw/block/trace-events | 39 - > 3 files changed, 91 insertions(+), 101 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH v3 1/8] hw/sh4: Add missing license
On Sun, 21 Feb 2021 at 21:59, Philippe Mathieu-Daudé wrote: > > This code was introduced in commit 27c7ca7e775, > ("SHIX board emulation (Samuel Tardieu)"). Use > the same license. > > Cc: Samuel Tardieu > Signed-off-by: Philippe Mathieu-Daudé > --- > include/hw/sh4/sh.h | 12 +--- > hw/block/tc58128.c | 7 +++ > 2 files changed, 16 insertions(+), 3 deletions(-) > > diff --git a/include/hw/sh4/sh.h b/include/hw/sh4/sh.h > index 93f464bf4cd..33dde7a56dd 100644 > --- a/include/hw/sh4/sh.h > +++ b/include/hw/sh4/sh.h > @@ -1,6 +1,12 @@ > -#ifndef QEMU_SH_H > -#define QEMU_SH_H > -/* Definitions for SH board emulation. */ > +/* > + * Definitions for SH board emulation > + * > + * Copyright (c) 2005 Samuel Tardieu > + * > + * SPDX-License-Identifier: MIT > + */ I thought it wasn't generally recommended to convert a license text to a single SPDX line? The sh7750.c file has a full 3-para license text. thanks -- PMM
Re: [PATCH v3 7/8] hw/pci-host: Introduce SH_PCI Kconfig entry
On Sun, 21 Feb 2021 at 21:59, Philippe Mathieu-Daudé wrote: > > We want to be able to use the 'SH4' config for architecture > specific features. Add more fine-grained selection by adding > a CONFIG_SH_PCI selector for the SH4 PCI controller. > Move the file with the other PCI host devices in hw/pci-host > and add its missing MAINTAINERS entries. > > Suggested-by: Peter Maydell > Signed-off-by: Philippe Mathieu-Daudé > --- > hw/{sh4 => pci-host}/sh_pci.c | 0 > MAINTAINERS | 1 + > hw/pci-host/Kconfig | 4 > hw/pci-host/meson.build | 1 + > hw/sh4/Kconfig| 1 + > hw/sh4/meson.build| 1 - > 6 files changed, 7 insertions(+), 1 deletion(-) > rename hw/{sh4 => pci-host}/sh_pci.c (100%) > Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v2 01/22] block: add eMMC block device type
On 2/22/21 2:16 PM, Dr. David Alan Gilbert wrote: > * Markus Armbruster (arm...@redhat.com) wrote: >> Philippe Mathieu-Daudé writes: >> >>> On 2/22/21 9:20 AM, Sai Pavan Boddu wrote: From: Vincent Palatin Add new block device type. Signed-off-by: Vincent Palatin [SPB: Rebased over 5.1 version] Signed-off-by: Sai Pavan Boddu Signed-off-by: Joel Stanley Signed-off-by: Cédric Le Goater Reviewed-by: Alistair Francis --- include/sysemu/blockdev.h | 1 + blockdev.c| 1 + 2 files changed, 2 insertions(+) diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h index 3b5fcda..eefae9f 100644 --- a/include/sysemu/blockdev.h +++ b/include/sysemu/blockdev.h @@ -24,6 +24,7 @@ typedef enum { */ IF_NONE = 0, IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, IF_VIRTIO, IF_XEN, +IF_EMMC, IF_COUNT } BlockInterfaceType; diff --git a/blockdev.c b/blockdev.c index cd438e6..390d43c 100644 --- a/blockdev.c +++ b/blockdev.c @@ -83,6 +83,7 @@ static const char *const if_name[IF_COUNT] = { [IF_SD] = "sd", [IF_VIRTIO] = "virtio", [IF_XEN] = "xen", +[IF_EMMC] = "emmc", }; >>> >>> We don't need to introduce support for the legacy -drive magic. >>> >>> -device should be enough for this device, right? >> >> External interface extensions need rationale: why do we want / need it? >> The commit message neglects to provide one. >> >> Even more so when the interface in question is in a state like -drive >> is. > > I wouldn't be too nasty about -drive; for me I still find it the > easiest way to start a VM. But eMMC isn't a bus where you can plug drives, it is soldered on-board and is mmio mapped to a fixed address. I don't see the point of having a drive interface for it...
Re: [PATCH v2 01/22] block: add eMMC block device type
* Markus Armbruster (arm...@redhat.com) wrote: > Philippe Mathieu-Daudé writes: > > > On 2/22/21 9:20 AM, Sai Pavan Boddu wrote: > >> From: Vincent Palatin > >> > >> Add new block device type. > >> > >> Signed-off-by: Vincent Palatin > >> [SPB: Rebased over 5.1 version] > >> Signed-off-by: Sai Pavan Boddu > >> Signed-off-by: Joel Stanley > >> Signed-off-by: Cédric Le Goater > >> Reviewed-by: Alistair Francis > >> --- > >> include/sysemu/blockdev.h | 1 + > >> blockdev.c| 1 + > >> 2 files changed, 2 insertions(+) > >> > >> diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h > >> index 3b5fcda..eefae9f 100644 > >> --- a/include/sysemu/blockdev.h > >> +++ b/include/sysemu/blockdev.h > >> @@ -24,6 +24,7 @@ typedef enum { > >> */ > >> IF_NONE = 0, > >> IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, IF_VIRTIO, > >> IF_XEN, > >> +IF_EMMC, > >> IF_COUNT > >> } BlockInterfaceType; > >> > >> diff --git a/blockdev.c b/blockdev.c > >> index cd438e6..390d43c 100644 > >> --- a/blockdev.c > >> +++ b/blockdev.c > >> @@ -83,6 +83,7 @@ static const char *const if_name[IF_COUNT] = { > >> [IF_SD] = "sd", > >> [IF_VIRTIO] = "virtio", > >> [IF_XEN] = "xen", > >> +[IF_EMMC] = "emmc", > >> }; > > > > We don't need to introduce support for the legacy -drive magic. > > > > -device should be enough for this device, right? > > External interface extensions need rationale: why do we want / need it? > The commit message neglects to provide one. > > Even more so when the interface in question is in a state like -drive > is. I wouldn't be too nasty about -drive; for me I still find it the easiest way to start a VM. Dave -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH 2/3] hw/block/nvme: fix potential compilation error
On 2/22/21 8:06 AM, Klaus Jensen wrote: > From: Gollu Appalanaidu > > assert may be compiled to a noop and we could end up returning an > uninitialized status. Per commit 262a69f4282 ("osdep.h: Prohibit disabling assert() in supported builds") this shouldn't be possible. Anyhow cleanup is good. > > Fix this by always returning Internal Device Error as a fallback. > > Signed-off-by: Gollu Appalanaidu > Signed-off-by: Klaus Jensen > --- > hw/block/nvme.c | 11 +-- > 1 file changed, 5 insertions(+), 6 deletions(-) > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > index ddc83f7f7a19..897b9ff0db91 100644 > --- a/hw/block/nvme.c > +++ b/hw/block/nvme.c > @@ -1232,7 +1232,7 @@ static uint16_t nvme_check_zone_write(NvmeNamespace > *ns, NvmeZone *zone, > > static uint16_t nvme_check_zone_state_for_read(NvmeZone *zone) > { > -uint16_t status; > +uint64_t zslba = zone->d.zslba; > > switch (nvme_get_zone_state(zone)) { > case NVME_ZONE_STATE_EMPTY: > @@ -1241,16 +1241,15 @@ static uint16_t > nvme_check_zone_state_for_read(NvmeZone *zone) > case NVME_ZONE_STATE_FULL: > case NVME_ZONE_STATE_CLOSED: > case NVME_ZONE_STATE_READ_ONLY: > -status = NVME_SUCCESS; > -break; > +return NVME_SUCCESS; > case NVME_ZONE_STATE_OFFLINE: > -status = NVME_ZONE_OFFLINE; > -break; > +trace_pci_nvme_err_zone_is_offline(zslba); > +return NVME_ZONE_OFFLINE; > default: > assert(false); > } > > -return status; > +return NVME_INTERNAL_DEV_ERROR; > } > > static uint16_t nvme_check_zone_read(NvmeNamespace *ns, uint64_t slba, >
Re: [PATCH 1/3] hw/block/nvme: nvme_identify fixes
On 2/22/21 1:00 PM, Minwoo Im wrote: > On 21-02-22 08:06:13, Klaus Jensen wrote: >> From: Gollu Appalanaidu >> >> Remove an unnecessary le_to_cpu conversion and add trace event for >> Identify. >> >> Signed-off-by: Gollu Appalanaidu >> Signed-off-by: Klaus Jensen >> --- >> hw/block/nvme.c | 5 - >> hw/block/trace-events | 1 + >> 2 files changed, 5 insertions(+), 1 deletion(-) >> >> diff --git a/hw/block/nvme.c b/hw/block/nvme.c >> index 1cd82fa3c9fe..ddc83f7f7a19 100644 >> --- a/hw/block/nvme.c >> +++ b/hw/block/nvme.c >> @@ -3415,7 +3415,10 @@ static uint16_t nvme_identify(NvmeCtrl *n, >> NvmeRequest *req) >> { >> NvmeIdentify *c = (NvmeIdentify *)>cmd; >> >> -switch (le32_to_cpu(c->cns)) { >> +trace_pci_nvme_identify(nvme_cid(req), c->cns, le16_to_cpu(c->ctrlid), >> +c->csi); > > I think it would be great if it can be separated into two. Agreed. > Anyway, changes look good to me. > > Reviewed-by: Minwoo Im
Re: [PATCH v2 01/22] block: add eMMC block device type
On 2/22/21 9:20 AM, Sai Pavan Boddu wrote: > From: Vincent Palatin > > Add new block device type. > > Signed-off-by: Vincent Palatin > [SPB: Rebased over 5.1 version] > Signed-off-by: Sai Pavan Boddu > Signed-off-by: Joel Stanley > Signed-off-by: Cédric Le Goater > Reviewed-by: Alistair Francis > --- > include/sysemu/blockdev.h | 1 + > blockdev.c| 1 + > 2 files changed, 2 insertions(+) > > diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h > index 3b5fcda..eefae9f 100644 > --- a/include/sysemu/blockdev.h > +++ b/include/sysemu/blockdev.h > @@ -24,6 +24,7 @@ typedef enum { > */ > IF_NONE = 0, > IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, IF_VIRTIO, IF_XEN, > +IF_EMMC, > IF_COUNT > } BlockInterfaceType; > > diff --git a/blockdev.c b/blockdev.c > index cd438e6..390d43c 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -83,6 +83,7 @@ static const char *const if_name[IF_COUNT] = { > [IF_SD] = "sd", > [IF_VIRTIO] = "virtio", > [IF_XEN] = "xen", > +[IF_EMMC] = "emmc", > }; We don't need to introduce support for the legacy -drive magic. -device should be enough for this device, right?
Re: [PATCH 2/3] hw/block/nvme: fix potential compilation error
On 21-02-22 08:06:14, Klaus Jensen wrote: > From: Gollu Appalanaidu > > assert may be compiled to a noop and we could end up returning an > uninitialized status. > > Fix this by always returning Internal Device Error as a fallback. > > Signed-off-by: Gollu Appalanaidu > Signed-off-by: Klaus Jensen > --- > hw/block/nvme.c | 11 +-- > 1 file changed, 5 insertions(+), 6 deletions(-) > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > index ddc83f7f7a19..897b9ff0db91 100644 > --- a/hw/block/nvme.c > +++ b/hw/block/nvme.c > @@ -1232,7 +1232,7 @@ static uint16_t nvme_check_zone_write(NvmeNamespace > *ns, NvmeZone *zone, > > static uint16_t nvme_check_zone_state_for_read(NvmeZone *zone) > { > -uint16_t status; > +uint64_t zslba = zone->d.zslba; > > switch (nvme_get_zone_state(zone)) { > case NVME_ZONE_STATE_EMPTY: > @@ -1241,16 +1241,15 @@ static uint16_t > nvme_check_zone_state_for_read(NvmeZone *zone) > case NVME_ZONE_STATE_FULL: > case NVME_ZONE_STATE_CLOSED: > case NVME_ZONE_STATE_READ_ONLY: > -status = NVME_SUCCESS; > -break; > +return NVME_SUCCESS; > case NVME_ZONE_STATE_OFFLINE: > -status = NVME_ZONE_OFFLINE; > -break; > +trace_pci_nvme_err_zone_is_offline(zslba); This also is a tiny addition to the potential error fix. Anyway, it can be shorten to: (if zslba is used in a place only) trace_pci_nvme_err_zone_is_offline(zone->d.zslba); > +return NVME_ZONE_OFFLINE; > default: > assert(false); > } > > -return status; > +return NVME_INTERNAL_DEV_ERROR; > } > > static uint16_t nvme_check_zone_read(NvmeNamespace *ns, uint64_t slba, > -- > 2.30.1 > >
Re: [PATCH v2 01/22] block: add eMMC block device type
Philippe Mathieu-Daudé writes: > On 2/22/21 9:20 AM, Sai Pavan Boddu wrote: >> From: Vincent Palatin >> >> Add new block device type. >> >> Signed-off-by: Vincent Palatin >> [SPB: Rebased over 5.1 version] >> Signed-off-by: Sai Pavan Boddu >> Signed-off-by: Joel Stanley >> Signed-off-by: Cédric Le Goater >> Reviewed-by: Alistair Francis >> --- >> include/sysemu/blockdev.h | 1 + >> blockdev.c| 1 + >> 2 files changed, 2 insertions(+) >> >> diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h >> index 3b5fcda..eefae9f 100644 >> --- a/include/sysemu/blockdev.h >> +++ b/include/sysemu/blockdev.h >> @@ -24,6 +24,7 @@ typedef enum { >> */ >> IF_NONE = 0, >> IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, IF_VIRTIO, IF_XEN, >> +IF_EMMC, >> IF_COUNT >> } BlockInterfaceType; >> >> diff --git a/blockdev.c b/blockdev.c >> index cd438e6..390d43c 100644 >> --- a/blockdev.c >> +++ b/blockdev.c >> @@ -83,6 +83,7 @@ static const char *const if_name[IF_COUNT] = { >> [IF_SD] = "sd", >> [IF_VIRTIO] = "virtio", >> [IF_XEN] = "xen", >> +[IF_EMMC] = "emmc", >> }; > > We don't need to introduce support for the legacy -drive magic. > > -device should be enough for this device, right? External interface extensions need rationale: why do we want / need it? The commit message neglects to provide one. Even more so when the interface in question is in a state like -drive is.
Re: [PATCH 1/3] hw/block/nvme: nvme_identify fixes
On 21-02-22 08:06:13, Klaus Jensen wrote: > From: Gollu Appalanaidu > > Remove an unnecessary le_to_cpu conversion and add trace event for > Identify. > > Signed-off-by: Gollu Appalanaidu > Signed-off-by: Klaus Jensen > --- > hw/block/nvme.c | 5 - > hw/block/trace-events | 1 + > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > index 1cd82fa3c9fe..ddc83f7f7a19 100644 > --- a/hw/block/nvme.c > +++ b/hw/block/nvme.c > @@ -3415,7 +3415,10 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest > *req) > { > NvmeIdentify *c = (NvmeIdentify *)>cmd; > > -switch (le32_to_cpu(c->cns)) { > +trace_pci_nvme_identify(nvme_cid(req), c->cns, le16_to_cpu(c->ctrlid), > +c->csi); I think it would be great if it can be separated into two. Anyway, changes look good to me. Reviewed-by: Minwoo Im > + > +switch (c->cns) { > case NVME_ID_CNS_NS: > /* fall through */ > case NVME_ID_CNS_NS_PRESENT: > diff --git a/hw/block/trace-events b/hw/block/trace-events > index b04f7a3e1890..1f958d09d2a9 100644 > --- a/hw/block/trace-events > +++ b/hw/block/trace-events > @@ -61,6 +61,7 @@ pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t > cqid, uint16_t qsize, > pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t > size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", > cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d" > pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16"" > pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16"" > +pci_nvme_identify(uint16_t cid, uint8_t cns, uint16_t ctrlid, uint8_t csi) > "cid %"PRIu16" cns 0x%"PRIx8" ctrlid %"PRIu16" csi 0x%"PRIx8"" > pci_nvme_identify_ctrl(void) "identify controller" > pci_nvme_identify_ctrl_csi(uint8_t csi) "identify controller, csi=0x%"PRIx8"" > pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32"" > -- > 2.30.1 > >
Re: [PATCH 3/3] hw/block/nvme: report non-mdts command size limit for dsm
On 21-02-22 08:06:15, Klaus Jensen wrote: > From: Gollu Appalanaidu > > Dataset Management is not subject to MDTS, but exceeded a certain size > per range causes internal looping. Report this limit (DMRSL) in the NVM > command set specific identify controller data structure. > > Signed-off-by: Gollu Appalanaidu > Signed-off-by: Klaus Jensen > --- > hw/block/nvme.h | 1 + > include/block/nvme.h | 11 +++ > hw/block/nvme.c | 30 -- > hw/block/trace-events | 1 + > 4 files changed, 33 insertions(+), 10 deletions(-) > > diff --git a/hw/block/nvme.h b/hw/block/nvme.h > index cb2b5175f1a1..3046b82b3da1 100644 > --- a/hw/block/nvme.h > +++ b/hw/block/nvme.h > @@ -172,6 +172,7 @@ typedef struct NvmeCtrl { > int aer_queued; > > uint8_t zasl; > +uint32_tdmrsl; > > NvmeSubsystem *subsys; > > diff --git a/include/block/nvme.h b/include/block/nvme.h > index b23f3ae2279f..16d8c4c90f7e 100644 > --- a/include/block/nvme.h > +++ b/include/block/nvme.h > @@ -1041,6 +1041,16 @@ typedef struct NvmeIdCtrlZoned { > uint8_t rsvd1[4095]; > } NvmeIdCtrlZoned; > > +typedef struct NvmeIdCtrlNvm { > +uint8_t vsl; > +uint8_t wzsl; > +uint8_t wusl; > +uint8_t dmrl; > +uint32_tdmrsl; > +uint64_tdmsl; > +uint8_t rsvd16[4080]; > +} NvmeIdCtrlNvm; > + > enum NvmeIdCtrlOacs { > NVME_OACS_SECURITY = 1 << 0, > NVME_OACS_FORMAT= 1 << 1, > @@ -1396,6 +1406,7 @@ static inline void _nvme_check_size(void) > QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); > QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); > QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlZoned) != 4096); > +QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlNvm) != 4096); > QEMU_BUILD_BUG_ON(sizeof(NvmeLBAF) != 4); > QEMU_BUILD_BUG_ON(sizeof(NvmeLBAFE) != 16); > QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > index 897b9ff0db91..5d6bba5fcb0d 100644 > --- a/hw/block/nvme.c > +++ b/hw/block/nvme.c > @@ -1777,6 +1777,10 @@ static uint16_t nvme_dsm(NvmeCtrl *n, NvmeRequest *req) > trace_pci_nvme_dsm_deallocate(nvme_cid(req), nvme_nsid(ns), slba, >nlb); > > +if (nlb > n->dmrsl) { > +trace_pci_nvme_dsm_single_range_limit_exceeded(nlb, > n->dmrsl); > +} > + > offset = nvme_l2b(ns, slba); > len = nvme_l2b(ns, nlb); > > @@ -3202,21 +3206,24 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, > NvmeRequest *req) > static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) > { > NvmeIdentify *c = (NvmeIdentify *)>cmd; > -NvmeIdCtrlZoned id = {}; > +uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {}; > > trace_pci_nvme_identify_ctrl_csi(c->csi); > > -if (c->csi == NVME_CSI_NVM) { > -return nvme_rpt_empty_id_struct(n, req); > -} else if (c->csi == NVME_CSI_ZONED) { > -if (n->params.zasl_bs) { > -id.zasl = n->zasl; > -} > -return nvme_dma(n, (uint8_t *), sizeof(id), > -DMA_DIRECTION_FROM_DEVICE, req); > +switch (c->csi) { > +case NVME_CSI_NVM: > +((NvmeIdCtrlNvm *))->dmrsl = cpu_to_le32(n->dmrsl); > +break; > + > +case NVME_CSI_ZONED: > +((NvmeIdCtrlZoned *))->zasl = n->zasl; Question. Are we okay without checking this like above ? :) if (n->params.zasl_bs) { ((NvmeIdCtrlZoned *))->zasl = n->zasl; } > +break; > + > +default: > +return NVME_INVALID_FIELD | NVME_DNR; > } > > -return NVME_INVALID_FIELD | NVME_DNR; > +return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req); > } > > static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) > @@ -4670,6 +4677,9 @@ int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace > *ns, Error **errp) > > n->namespaces[nsid - 1] = ns; > > +n->dmrsl = MIN_NON_ZERO(n->dmrsl, > +BDRV_REQUEST_MAX_BYTES / nvme_l2b(ns, 1)); > + > return 0; > } > > diff --git a/hw/block/trace-events b/hw/block/trace-events > index 1f958d09d2a9..27940fe2e98a 100644 > --- a/hw/block/trace-events > +++ b/hw/block/trace-events > @@ -51,6 +51,7 @@ pci_nvme_copy_cb(uint16_t cid) "cid %"PRIu16"" > pci_nvme_block_status(int64_t offset, int64_t bytes, int64_t pnum, int ret, > bool zeroed) "offset %"PRId64" bytes %"PRId64" pnum %"PRId64" ret 0x%x zeroed > %d" > pci_nvme_dsm(uint16_t cid, uint32_t nsid, uint32_t nr, uint32_t attr) "cid > %"PRIu16" nsid %"PRIu32" nr %"PRIu32" attr 0x%"PRIx32"" > pci_nvme_dsm_deallocate(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t > nlb) "cid %"PRIu16" nsid %"PRIu32" slba %"PRIu64" nlb %"PRIu32"" > +pci_nvme_dsm_single_range_limit_exceeded(uint32_t nlb, uint32_t dmrsl) "nlb > %"PRIu32" dmrsl %"PRIu32"" >
Re: [PATCH v2 04/22] sd: emmc: update OCR fields for eMMC
On 2/22/21 10:55 AM, Sai Pavan Boddu wrote: > Hi Cedric > >> -Original Message- >> From: Cédric Le Goater >> Sent: Monday, February 22, 2021 3:22 PM >> To: Sai Pavan Boddu ; Markus Armbruster >> ; Kevin Wolf ; Max Reitz >> ; Vladimir Sementsov-Ogievskiy >> ; Eric Blake ; Joel Stanley >> ; Vincent Palatin ; Dr. David Alan >> Gilbert ; Thomas Huth ; Stefan >> Hajnoczi ; Peter Maydell ; >> Alistair Francis ; Edgar Iglesias >> ; >> Luc Michel ; Paolo Bonzini >> >> Cc: qemu-block@nongnu.org; qemu-de...@nongnu.org; Sai Pavan Boddu >> >> Subject: Re: [PATCH v2 04/22] sd: emmc: update OCR fields for eMMC >> >> On 2/22/21 9:20 AM, Sai Pavan Boddu wrote: >>> From: Vincent Palatin >>> >>> eMMC OCR register doesn't has UHS-II field and voltage fields are >>> different. >> >> Can a patch be "From" a person without a "Signed-off-by" of the same person ? > [Sai Pavan Boddu] No I should not be like this. My mistake, I would respin > the series with corrections. Few other patches may have this after the split > I did. Please wait for more review before respining for this detail.
[PATCH v2] iotests: Drop deprecated 'props' from object-add
Signed-off-by: Alberto Garcia --- v2: Don't use the x-* interface to specify limits [Kevin] tests/qemu-iotests/087 | 8 ++-- tests/qemu-iotests/184 | 18 ++ tests/qemu-iotests/218 | 2 +- tests/qemu-iotests/235 | 2 +- tests/qemu-iotests/245 | 4 ++-- tests/qemu-iotests/258 | 6 +++--- tests/qemu-iotests/258.out | 4 ++-- tests/qemu-iotests/295 | 2 +- tests/qemu-iotests/296 | 2 +- 9 files changed, 19 insertions(+), 29 deletions(-) diff --git a/tests/qemu-iotests/087 b/tests/qemu-iotests/087 index edd43f1a28..d8e0e384cd 100755 --- a/tests/qemu-iotests/087 +++ b/tests/qemu-iotests/087 @@ -143,9 +143,7 @@ run_qemu <
Re: [PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism
On Wed, Feb 10, 2021 at 09:22:38AM +0800, Jiahui Cen wrote: > Kindly ping. > Any comments and reviews are wellcome :) I lost track of this series. Sorry! Feel free to ping me on #qemu IRC (my nick is "stefanha") if I'm not responding to emails. I will review the series now. Stefan signature.asc Description: PGP signature
RE: [PATCH v2 04/22] sd: emmc: update OCR fields for eMMC
Hi Cedric > -Original Message- > From: Cédric Le Goater > Sent: Monday, February 22, 2021 3:22 PM > To: Sai Pavan Boddu ; Markus Armbruster > ; Kevin Wolf ; Max Reitz > ; Vladimir Sementsov-Ogievskiy > ; Eric Blake ; Joel Stanley > ; Vincent Palatin ; Dr. David Alan > Gilbert ; Thomas Huth ; Stefan > Hajnoczi ; Peter Maydell ; > Alistair Francis ; Edgar Iglesias > ; > Luc Michel ; Paolo Bonzini > > Cc: qemu-block@nongnu.org; qemu-de...@nongnu.org; Sai Pavan Boddu > > Subject: Re: [PATCH v2 04/22] sd: emmc: update OCR fields for eMMC > > On 2/22/21 9:20 AM, Sai Pavan Boddu wrote: > > From: Vincent Palatin > > > > eMMC OCR register doesn't has UHS-II field and voltage fields are > > different. > > Can a patch be "From" a person without a "Signed-off-by" of the same person ? [Sai Pavan Boddu] No I should not be like this. My mistake, I would respin the series with corrections. Few other patches may have this after the split I did. Regards, Sai Pavan > > C. > > > Signed-off-by: Sai Pavan Boddu > > > > --- > > hw/sd/sd.c | 27 --- > > 1 file changed, 24 insertions(+), 3 deletions(-) > > > > diff --git a/hw/sd/sd.c b/hw/sd/sd.c > > index 42ee49c..430bea5 100644 > > --- a/hw/sd/sd.c > > +++ b/hw/sd/sd.c > > @@ -283,6 +283,15 @@ FIELD(OCR, UHS_II_CARD,29, 1) /* Only > UHS-II */ > > FIELD(OCR, CARD_CAPACITY, 30, 1) /* 0:SDSC, 1:SDHC/SDXC */ > > FIELD(OCR, CARD_POWER_UP, 31, 1) > > > > +/* > > + * eMMC OCR register > > + */ > > +FIELD(EMMC_OCR, VDD_WINDOW_0, 7, 1) > > +FIELD(EMMC_OCR, VDD_WINDOW_1, 8, 7) > > +FIELD(EMMC_OCR, VDD_WINDOW_2, 15, 9) > > +FIELD(EMMC_OCR, ACCESS_MODE, 29, 2) > > +FIELD(EMMC_OCR, POWER_UP, 31, 1) > > + > > #define ACMD41_ENQUIRY_MASK 0x00ff > > #define ACMD41_R3_MASK (R_OCR_VDD_VOLTAGE_WIN_HI_MASK \ > > | R_OCR_ACCEPT_SWITCH_1V8_MASK \ > > @@ -292,8 +301,16 @@ FIELD(OCR, CARD_POWER_UP, 31, 1) > > > > static void sd_set_ocr(SDState *sd) > > { > > -/* All voltages OK */ > > -sd->ocr = R_OCR_VDD_VOLTAGE_WIN_HI_MASK; > > +if (sd->emmc) { > > +/* > > + * Dual Voltage eMMC card > > + */ > > +sd->ocr = R_EMMC_OCR_VDD_WINDOW_0_MASK | > > + R_EMMC_OCR_VDD_WINDOW_2_MASK; > > +} else { > > +/* All voltages OK */ > > +sd->ocr = R_OCR_VDD_VOLTAGE_WIN_HI_MASK; > > +} > > } > > > > static void sd_ocr_powerup(void *opaque) @@ -521,7 +538,11 @@ static > > void sd_response_r1_make(SDState *sd, uint8_t *response) > > > > static void sd_response_r3_make(SDState *sd, uint8_t *response) { > > -stl_be_p(response, sd->ocr & ACMD41_R3_MASK); > > +if (sd->emmc) { > > +stl_be_p(response, sd->ocr); > > +} else { > > +stl_be_p(response, sd->ocr & ACMD41_R3_MASK); > > +} > > } > > > > static void sd_response_r6_make(SDState *sd, uint8_t *response) > >
Re: [PATCH v2 16/22] sd: emmc: Add Extended CSD register definitions
On 2/22/21 9:20 AM, Sai Pavan Boddu wrote: > From: Cédric Le Goater > > Add user friendly macros for EXT_CSD register. > > Signed-off-by: Cédric Le Goater > [spb: Rebased over versal emmc series, > updated commit message] > Signed-off-by: Sai Pavan Boddu You can merge this patch in patch 05 "sd: emmc: Add support for EXT_CSD & CSD for eMMC" if you wish. C. > --- > hw/sd/sdmmc-internal.h | 97 > ++ > hw/sd/sd.c | 54 +++- > 2 files changed, 126 insertions(+), 25 deletions(-) > > diff --git a/hw/sd/sdmmc-internal.h b/hw/sd/sdmmc-internal.h > index d8bf17d..7ab7b4d 100644 > --- a/hw/sd/sdmmc-internal.h > +++ b/hw/sd/sdmmc-internal.h > @@ -37,4 +37,101 @@ const char *sd_cmd_name(uint8_t cmd); > */ > const char *sd_acmd_name(uint8_t cmd); > > +/* > + * EXT_CSD fields > + */ > + > +#define EXT_CSD_CMDQ_MODE_EN15 /* R/W */ > +#define EXT_CSD_FLUSH_CACHE 32 /* W */ > +#define EXT_CSD_CACHE_CTRL33 /* R/W */ > +#define EXT_CSD_POWER_OFF_NOTIFICATION 34 /* R/W */ > +#define EXT_CSD_PACKED_FAILURE_INDEX 35 /* RO */ > +#define EXT_CSD_PACKED_CMD_STATUS 36 /* RO */ > +#define EXT_CSD_EXP_EVENTS_STATUS 54 /* RO, 2 bytes */ > +#define EXT_CSD_EXP_EVENTS_CTRL 56 /* R/W, 2 bytes */ > +#define EXT_CSD_DATA_SECTOR_SIZE 61 /* R */ > +#define EXT_CSD_GP_SIZE_MULT143 /* R/W */ > +#define EXT_CSD_PARTITION_SETTING_COMPLETED 155 /* R/W */ > +#define EXT_CSD_PARTITION_ATTRIBUTE 156 /* R/W */ > +#define EXT_CSD_PARTITION_SUPPORT 160 /* RO */ > +#define EXT_CSD_HPI_MGMT161 /* R/W */ > +#define EXT_CSD_RST_N_FUNCTION162 /* R/W */ > +#define EXT_CSD_BKOPS_EN163 /* R/W */ > +#define EXT_CSD_BKOPS_START 164 /* W */ > +#define EXT_CSD_SANITIZE_START165 /* W */ > +#define EXT_CSD_WR_REL_PARAM166 /* RO */ > +#define EXT_CSD_RPMB_MULT 168 /* RO */ > +#define EXT_CSD_FW_CONFIG 169 /* R/W */ > +#define EXT_CSD_BOOT_WP 173 /* R/W */ > +#define EXT_CSD_ERASE_GROUP_DEF 175 /* R/W */ > +#define EXT_CSD_PART_CONFIG 179 /* R/W */ > +#define EXT_CSD_ERASED_MEM_CONT 181 /* RO */ > +#define EXT_CSD_BUS_WIDTH 183 /* R/W */ > +#define EXT_CSD_STROBE_SUPPORT184 /* RO */ > +#define EXT_CSD_HS_TIMING 185 /* R/W */ > +#define EXT_CSD_POWER_CLASS 187 /* R/W */ > +#define EXT_CSD_REV 192 /* RO */ > +#define EXT_CSD_STRUCTURE 194 /* RO */ > +#define EXT_CSD_CARD_TYPE 196 /* RO */ > +#define EXT_CSD_DRIVER_STRENGTH 197 /* RO */ > +#define EXT_CSD_OUT_OF_INTERRUPT_TIME 198 /* RO */ > +#define EXT_CSD_PART_SWITCH_TIME199 /* RO */ > +#define EXT_CSD_PWR_CL_52_195 200 /* RO */ > +#define EXT_CSD_PWR_CL_26_195 201 /* RO */ > +#define EXT_CSD_PWR_CL_52_360 202 /* RO */ > +#define EXT_CSD_PWR_CL_26_360 203 /* RO */ > +#define EXT_CSD_SEC_CNT 212 /* RO, 4 bytes */ > +#define EXT_CSD_S_A_TIMEOUT 217 /* RO */ > +#define EXT_CSD_S_C_VCCQ 219 /* RO */ > +#define EXT_CSD_S_C_VCC 220 /* RO */ > +#define EXT_CSD_REL_WR_SEC_C222 /* RO */ > +#define EXT_CSD_HC_WP_GRP_SIZE221 /* RO */ > +#define EXT_CSD_ERASE_TIMEOUT_MULT 223 /* RO */ > +#define EXT_CSD_HC_ERASE_GRP_SIZE 224 /* RO */ > +#define EXT_CSD_ACC_SIZE225 /* RO */ > +#define EXT_CSD_BOOT_MULT 226 /* RO */ > +#define EXT_CSD_BOOT_INFO 228 /* RO */ > +#define EXT_CSD_SEC_TRIM_MULT 229 /* RO */ > +#define EXT_CSD_SEC_ERASE_MULT230 /* RO */ > +#define EXT_CSD_SEC_FEATURE_SUPPORT 231 /* RO */ > +#define EXT_CSD_TRIM_MULT 232 /* RO */ > +#define EXT_CSD_PWR_CL_200_195236 /* RO */ > +#define EXT_CSD_PWR_CL_200_360237 /* RO */ > +#define EXT_CSD_PWR_CL_DDR_52_195 238 /* RO */ > +#define EXT_CSD_PWR_CL_DDR_52_360 239 /* RO */ > +#define EXT_CSD_BKOPS_STATUS246 /* RO */ > +#define EXT_CSD_POWER_OFF_LONG_TIME 247 /* RO */ > +#define EXT_CSD_GENERIC_CMD6_TIME 248 /* RO */ > +#define EXT_CSD_CACHE_SIZE249 /* RO, 4 bytes */ > +#define EXT_CSD_PWR_CL_DDR_200_360 253 /* RO */ > +#define EXT_CSD_FIRMWARE_VERSION 254 /* RO, 8 bytes */ > +#define EXT_CSD_PRE_EOL_INFO267 /* RO */ > +#define EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A 268 /* RO */ > +#define EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B 269 /* RO */ > +#define EXT_CSD_CMDQ_DEPTH307 /* RO */ > +#define EXT_CSD_CMDQ_SUPPORT308 /* RO */ > +#define EXT_CSD_SUPPORTED_MODE493 /* RO */ > +#define EXT_CSD_TAG_UNIT_SIZE 498 /* RO */ > +#define EXT_CSD_DATA_TAG_SUPPORT 499 /* RO */ > +#define EXT_CSD_MAX_PACKED_WRITES 500 /* RO */ > +#define EXT_CSD_MAX_PACKED_READS 501 /* RO */ > +#define EXT_CSD_BKOPS_SUPPORT 502 /* RO */ > +#define EXT_CSD_HPI_FEATURES503 /* RO */ > +#define EXT_CSD_S_CMD_SET 504 /* RO */ > + > +/* > + * EXT_CSD field definitions > + */ > + > +#define EXT_CSD_WR_REL_PARAM_EN (1 << 2) > +#define EXT_CSD_WR_REL_PARAM_EN_RPMB_REL_WR (1 << 4) > + > +#define EXT_CSD_PART_CONFIG_ACC_MASK (0x7) > +#define EXT_CSD_PART_CONFIG_ACC_DEFAULT (0x0) > +#define
Re: [PATCH v2 04/22] sd: emmc: update OCR fields for eMMC
On 2/22/21 9:20 AM, Sai Pavan Boddu wrote: > From: Vincent Palatin > > eMMC OCR register doesn't has UHS-II field and voltage fields are > different. Can a patch be "From" a person without a "Signed-off-by" of the same person ? C. > Signed-off-by: Sai Pavan Boddu > > --- > hw/sd/sd.c | 27 --- > 1 file changed, 24 insertions(+), 3 deletions(-) > > diff --git a/hw/sd/sd.c b/hw/sd/sd.c > index 42ee49c..430bea5 100644 > --- a/hw/sd/sd.c > +++ b/hw/sd/sd.c > @@ -283,6 +283,15 @@ FIELD(OCR, UHS_II_CARD,29, 1) /* Only > UHS-II */ > FIELD(OCR, CARD_CAPACITY, 30, 1) /* 0:SDSC, 1:SDHC/SDXC */ > FIELD(OCR, CARD_POWER_UP, 31, 1) > > +/* > + * eMMC OCR register > + */ > +FIELD(EMMC_OCR, VDD_WINDOW_0, 7, 1) > +FIELD(EMMC_OCR, VDD_WINDOW_1, 8, 7) > +FIELD(EMMC_OCR, VDD_WINDOW_2, 15, 9) > +FIELD(EMMC_OCR, ACCESS_MODE, 29, 2) > +FIELD(EMMC_OCR, POWER_UP, 31, 1) > + > #define ACMD41_ENQUIRY_MASK 0x00ff > #define ACMD41_R3_MASK (R_OCR_VDD_VOLTAGE_WIN_HI_MASK \ > | R_OCR_ACCEPT_SWITCH_1V8_MASK \ > @@ -292,8 +301,16 @@ FIELD(OCR, CARD_POWER_UP, 31, 1) > > static void sd_set_ocr(SDState *sd) > { > -/* All voltages OK */ > -sd->ocr = R_OCR_VDD_VOLTAGE_WIN_HI_MASK; > +if (sd->emmc) { > +/* > + * Dual Voltage eMMC card > + */ > +sd->ocr = R_EMMC_OCR_VDD_WINDOW_0_MASK | > + R_EMMC_OCR_VDD_WINDOW_2_MASK; > +} else { > +/* All voltages OK */ > +sd->ocr = R_OCR_VDD_VOLTAGE_WIN_HI_MASK; > +} > } > > static void sd_ocr_powerup(void *opaque) > @@ -521,7 +538,11 @@ static void sd_response_r1_make(SDState *sd, uint8_t > *response) > > static void sd_response_r3_make(SDState *sd, uint8_t *response) > { > -stl_be_p(response, sd->ocr & ACMD41_R3_MASK); > +if (sd->emmc) { > +stl_be_p(response, sd->ocr); > +} else { > +stl_be_p(response, sd->ocr & ACMD41_R3_MASK); > +} > } > > static void sd_response_r6_make(SDState *sd, uint8_t *response) >
Re: [RFC PATCH 11/15] sd: emmc: Add Extended CSD register definitions
On 2/16/21 12:11 PM, Sai Pavan Boddu wrote: > Hi Luc, >> -Original Message- >> From: Luc Michel >> Sent: Saturday, February 13, 2021 6:26 PM >> To: Sai Pavan Boddu ; Markus Armbruster >> ; Kevin Wolf ; Max Reitz >> ; Vladimir Sementsov-Ogievskiy >> ; Eric Blake ; Joel Stanley >> ; Cédric Le Goater ; Vincent Palatin >> ; Dr. David Alan Gilbert ; >> Thomas Huth ; Stefan Hajnoczi ; >> Peter Maydell ; Alistair Francis >> ; Edgar Iglesias ; Paolo Bonzini >> >> Cc: qemu-block@nongnu.org; qemu-de...@nongnu.org; Sai Pavan Boddu >> >> Subject: Re: [RFC PATCH 11/15] sd: emmc: Add Extended CSD register >> definitions >> >> On 2/11/21 9:17 AM, Sai Pavan Boddu wrote: >>> From: Cédric Le Goater >>> >>> Add user friendly macros for EXT_CSD register. > >>> Signed-off-by: Cédric Le Goater >>> [spb: Rebased over versal emmc series, >>>updated commit message] >>> Signed-off-by: Sai Pavan Boddu >> >> Hi, >> >> If Cédric agrees, maybe you can squash this commit into patch 2, and add the >> missing register definitions? > [Sai Pavan Boddu] Yeah, that would be nice. I will leave @Cédric Le Goater > comment here. Sure. I hope you did in v2. Sorry I was out. Thanks, C.
[RFC PATCH v2 2/3] hw/pflash_cfi01: Correct the type of PFlashCFI01.ro
PFlashCFI01.ro is a bool, declare it as such. Signed-off-by: David Edmondson --- hw/block/pflash_cfi01.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c index 9e1f3b42c6..6b21b4af52 100644 --- a/hw/block/pflash_cfi01.c +++ b/hw/block/pflash_cfi01.c @@ -72,7 +72,7 @@ struct PFlashCFI01 { uint8_t max_device_width; /* max device width in bytes */ uint32_t features; uint8_t wcycle; /* if 0, the flash is read normally */ -int ro; +bool ro; uint8_t cmd; uint8_t status; uint16_t ident0; @@ -738,7 +738,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp) return; } } else { -pfl->ro = 0; +pfl->ro = false; } if (pfl->blk) { -- 2.30.0
[RFC PATCH v2 3/3] hw/pflash_cfi01: Allow devices to have a smaller backing device
Allow the backing device to be smaller than the extent of the flash device by mapping it as a subregion of the flash device region. Return zeroes for all reads of the flash device beyond the extent of the backing device. For writes beyond the extent of the underlying device, fail on read-only devices and discard them for writable devices. Signed-off-by: David Edmondson --- hw/block/pflash_cfi01.c | 108 ++-- hw/block/trace-events | 3 ++ 2 files changed, 86 insertions(+), 25 deletions(-) diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c index 6b21b4af52..94970816a6 100644 --- a/hw/block/pflash_cfi01.c +++ b/hw/block/pflash_cfi01.c @@ -83,6 +83,8 @@ struct PFlashCFI01 { uint64_t counter; unsigned int writeblock_size; MemoryRegion mem; +MemoryRegion mem_outer; +char outer_name[64]; char *name; void *storage; VMChangeStateEntry *vmstate; @@ -425,7 +427,6 @@ static inline void pflash_data_write(PFlashCFI01 *pfl, hwaddr offset, } break; } - } static void pflash_write(PFlashCFI01 *pfl, hwaddr offset, @@ -646,8 +647,45 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset, } -static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_t *value, - unsigned len, MemTxAttrs attrs) +static MemTxResult pflash_outer_read_with_attrs(void *opaque, hwaddr addr, +uint64_t *value, +unsigned len, +MemTxAttrs attrs) +{ +PFlashCFI01 *pfl = opaque; + +trace_pflash_outer_read(pfl->name, addr, len); +*value = 0; +return MEMTX_OK; +} + +static MemTxResult pflash_outer_write_with_attrs(void *opaque, hwaddr addr, + uint64_t value, + unsigned len, + MemTxAttrs attrs) +{ +PFlashCFI01 *pfl = opaque; + +trace_pflash_outer_write(pfl->name, addr, len); +if (pfl->ro) { +return MEMTX_ERROR; +} else { +/* Discard writes. */ +warn_report_once("%s: attempt to write outside of the backing block device " + "(offset " TARGET_FMT_plx ") ignored", pfl->name, addr); +return MEMTX_OK; +} +} + +static const MemoryRegionOps pflash_cfi01_outer_ops = { +.read_with_attrs = pflash_outer_read_with_attrs, +.write_with_attrs = pflash_outer_write_with_attrs, +.endianness = DEVICE_NATIVE_ENDIAN, +}; + +static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, + uint64_t *value, unsigned len, + MemTxAttrs attrs) { PFlashCFI01 *pfl = opaque; bool be = !!(pfl->features & (1 << PFLASH_BE)); @@ -660,8 +698,9 @@ static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_ return MEMTX_OK; } -static MemTxResult pflash_mem_write_with_attrs(void *opaque, hwaddr addr, uint64_t value, - unsigned len, MemTxAttrs attrs) +static MemTxResult pflash_mem_write_with_attrs(void *opaque, hwaddr addr, + uint64_t value, unsigned len, + MemTxAttrs attrs) { PFlashCFI01 *pfl = opaque; bool be = !!(pfl->features & (1 << PFLASH_BE)); @@ -684,7 +723,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp) { ERRP_GUARD(); PFlashCFI01 *pfl = PFLASH_CFI01(dev); -uint64_t total_len; +uint64_t outer_len, inner_len; int ret; uint64_t blocks_per_device, sector_len_per_device, device_len; int num_devices; @@ -702,7 +741,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp) return; } -total_len = pfl->sector_len * pfl->nb_blocs; +outer_len = pfl->sector_len * pfl->nb_blocs; /* These are only used to expose the parameters of each device * in the cfi_table[]. @@ -717,36 +756,55 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp) } device_len = sector_len_per_device * blocks_per_device; -memory_region_init_rom_device( ->mem, OBJECT(dev), -_cfi01_ops, -pfl, -pfl->name, total_len, errp); -if (*errp) { -return; -} - -pfl->storage = memory_region_get_ram_ptr(>mem); -sysbus_init_mmio(SYS_BUS_DEVICE(dev), >mem); - if (pfl->blk) { uint64_t perm; + pfl->ro = !blk_supports_write_perm(pfl->blk); perm = BLK_PERM_CONSISTENT_READ | (pfl->ro ? 0 : BLK_PERM_WRITE); ret = blk_set_perm(pfl->blk, perm, BLK_PERM_ALL, errp); if (ret < 0) { return; } + +
[RFC PATCH v2 0/3] hw/flash_cfi01: Reduce memory consumption when flash image is smaller than region
As described in https://lore.kernel.org/r/20201116104216.439650-1-david.edmond...@oracle.com, I'd like to reduce the amount of memory consumed by QEMU mapping UEFI images on aarch64. To recap: > Currently ARM UEFI images are typically built as 2MB/768kB flash > images for code and variables respectively. These images are both > then padded out to 64MB before being loaded by QEMU. > > Because the images are 64MB each, QEMU allocates 128MB of memory to > read them, and then proceeds to read all 128MB from disk (dirtying > the memory). Of this 128MB less than 3MB is useful - the rest is > zero padding. > > On a machine with 100 VMs this wastes over 12GB of memory. There were objections to my previous patch because it changed the size of the regions reported to the guest via the memory map (the reported size depended on the size of the image). This is a smaller patch which changes the memory region that covers the entire region to be IO rather than RAM, and loads the flash image into a smaller sub-region that is the more traditional mixed IO/ROMD type. All read/write operations to areas outside of the underlying block device are handled directly (reads return 0, writes fail or are discarded). This reduces the memory consumption for the AAVMF code image from 64MiB to around 2MB and that for the AAVMF vars from 64MiB to 768KiB (presuming that the UEFI images are adjusted accordingly). For read-only devices (such as the AAVMF code) this seems completely safe. For writable devices there is a change in behaviour - previously it was possible to write anywhere in the extent of the flash device, read back the data written and have that data persist through a restart of QEMU. This is no longer the case - writes outside of the extent of the underlying backing block device will be discarded. That is, a read after a write will *not* return the written data, either immediately or after a QEMU restart - it will return zeros. Looking at the AAVMF implementation, it seems to me that if the initial VARS image is prepared as being 768KiB in size (which it is), none of AAVMF itself will attempt to write outside of that extent, and so I believe that this is an acceptable compromise. It would be relatively straightforward to allow writes outside of the backing device to persist for the lifetime of a particular QEMU by allocating memory on demand (i.e. when there is a write to the relevant region). This would allow a read to return the relevant data, but only until a QEMU restart, at which point the data would be lost. There was a suggestion in a previous thread that perhaps the pflash driver could be re-worked to use the block IO interfaces to access the underlying device "on demand" rather than reading in the entire image at startup (at least, that's how I understood the comment). I looked at implementing this and struggled to get it to work for all of the required use cases. Specifically, there are several code paths that expect to retrieve a pointer to the flat memory image of the pflash device and manipulate it directly (examples include the Malta board and encrypted memory support on x86), or write the entire image to storage (after migration). My implementation was based around mapping the flash region only for IO, which meant that every read or write had to be handled directly by the pflash driver (there was no ROMD style operation), which also made booting an aarch64 VM noticeably slower - getting through the firmware went from under 1 second to around 10 seconds. v2: - Unify the approach for both read-only and writable devices, saving another 63MiB per QEMU instance. David Edmondson (3): hw/pflash_cfi*: Replace DPRINTF with trace events hw/pflash_cfi01: Correct the type of PFlashCFI01.ro hw/pflash_cfi01: Allow devices to have a smaller backing device hw/block/pflash_cfi01.c | 190 +--- hw/block/pflash_cfi02.c | 75 ++-- hw/block/trace-events | 42 +++-- 3 files changed, 179 insertions(+), 128 deletions(-) -- 2.30.0
[RFC PATCH v2 1/3] hw/pflash_cfi*: Replace DPRINTF with trace events
Rather than having a device specific debug implementation in pflash_cfi01.c and pflash_cfi02.c, use the standard tracing facility. Signed-off-by: David Edmondson --- hw/block/pflash_cfi01.c | 78 + hw/block/pflash_cfi02.c | 75 +++ hw/block/trace-events | 39 - 3 files changed, 91 insertions(+), 101 deletions(-) diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c index 22287a1522..9e1f3b42c6 100644 --- a/hw/block/pflash_cfi01.c +++ b/hw/block/pflash_cfi01.c @@ -56,16 +56,6 @@ #include "sysemu/runstate.h" #include "trace.h" -/* #define PFLASH_DEBUG */ -#ifdef PFLASH_DEBUG -#define DPRINTF(fmt, ...) \ -do {\ -fprintf(stderr, "PFLASH: " fmt , ## __VA_ARGS__); \ -} while (0) -#else -#define DPRINTF(fmt, ...) do { } while (0) -#endif - #define PFLASH_BE 0 #define PFLASH_SECURE 1 @@ -152,10 +142,8 @@ static uint32_t pflash_cfi_query(PFlashCFI01 *pfl, hwaddr offset) * wider part. */ if (pfl->device_width != 1 || pfl->bank_width > 4) { -DPRINTF("%s: Unsupported device configuration: " -"device_width=%d, max_device_width=%d\n", -__func__, pfl->device_width, -pfl->max_device_width); +trace_pflash_unsupported_device_configuration( +pfl->name, pfl->device_width, pfl->max_device_width); return 0; } /* CFI query data is repeated, rather than zero padded for @@ -205,14 +193,14 @@ static uint32_t pflash_devid_query(PFlashCFI01 *pfl, hwaddr offset) switch (boff & 0xFF) { case 0: resp = pfl->ident0; -trace_pflash_manufacturer_id(resp); +trace_pflash_manufacturer_id(pfl->name, resp); break; case 1: resp = pfl->ident1; -trace_pflash_device_id(resp); +trace_pflash_device_id(pfl->name, resp); break; default: -trace_pflash_device_info(offset); +trace_pflash_device_info(pfl->name, offset); return 0; } /* Replicate responses for each device in bank. */ @@ -260,10 +248,9 @@ static uint32_t pflash_data_read(PFlashCFI01 *pfl, hwaddr offset, } break; default: -DPRINTF("BUG in %s\n", __func__); abort(); } -trace_pflash_data_read(offset, width, ret); +trace_pflash_data_read(pfl->name, offset, width, ret); return ret; } @@ -277,7 +264,7 @@ static uint32_t pflash_read(PFlashCFI01 *pfl, hwaddr offset, switch (pfl->cmd) { default: /* This should never happen : reset state & treat it as a read */ -DPRINTF("%s: unknown command state: %x\n", __func__, pfl->cmd); +trace_pflash_read_unknown_state(pfl->name, pfl->cmd); pfl->wcycle = 0; /* * The command 0x00 is not assigned by the CFI open standard, @@ -313,7 +300,7 @@ static uint32_t pflash_read(PFlashCFI01 *pfl, hwaddr offset, */ ret |= pfl->status << 16; } -DPRINTF("%s: status %x\n", __func__, ret); +trace_pflash_read_status(pfl->name, ret); break; case 0x90: if (!pfl->device_width) { @@ -328,14 +315,14 @@ static uint32_t pflash_read(PFlashCFI01 *pfl, hwaddr offset, switch (boff) { case 0: ret = pfl->ident0 << 8 | pfl->ident1; -trace_pflash_manufacturer_id(ret); +trace_pflash_manufacturer_id(pfl->name, ret); break; case 1: ret = pfl->ident2 << 8 | pfl->ident3; -trace_pflash_device_id(ret); +trace_pflash_device_id(pfl->name, ret); break; default: -trace_pflash_device_info(boff); +trace_pflash_device_info(pfl->name, boff); ret = 0; break; } @@ -380,7 +367,7 @@ static uint32_t pflash_read(PFlashCFI01 *pfl, hwaddr offset, break; } -trace_pflash_io_read(offset, width, ret, pfl->cmd, pfl->wcycle); +trace_pflash_io_read(pfl->name, offset, width, ret, pfl->cmd, pfl->wcycle); return ret; } @@ -410,7 +397,7 @@ static inline void pflash_data_write(PFlashCFI01 *pfl, hwaddr offset, { uint8_t *p = pfl->storage; -trace_pflash_data_write(offset, width, value, pfl->counter); +trace_pflash_data_write(pfl->name, offset, width, value, pfl->counter); switch (width) { case 1: p[offset] = value; @@ -449,7 +436,7 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset, cmd = value; -trace_pflash_io_write(offset, width, value, pfl->wcycle); +trace_pflash_io_write(pfl->name, offset, width, value, pfl->wcycle); if (!pfl->wcycle) { /* Set the device in
[PATCH v2 14/22] sd: emmc: Add support for emmc erase
Add CMD35 and CMD36 which sets the erase start and end. Signed-off-by: Sai Pavan Boddu Signed-off-by: Edgar E. Iglesias --- hw/sd/sd.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index df82b61..6d2ef2b 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1544,6 +1544,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) /* Erase commands (Class 5) */ case 32:/* CMD32: ERASE_WR_BLK_START */ +case 35:/* CMD35: ERASE_GROUP_START */ switch (sd->state) { case sd_transfer_state: sd->erase_start = req.arg; @@ -1555,6 +1556,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) break; case 33:/* CMD33: ERASE_WR_BLK_END */ +case 36:/* CMD36: ERASE_GROUP_END */ switch (sd->state) { case sd_transfer_state: sd->erase_end = req.arg; -- 2.7.4
[PATCH v2 06/22] sd: emmc: Update CMD8 to send EXT_CSD register
From: Vincent Palatin Sends the EXT_CSD register as response to CMD8. Signed-off-by: Sai Pavan Boddu --- hw/sd/sd.c | 52 1 file changed, 36 insertions(+), 16 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 4c211ba..a4f93b5 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1133,24 +1133,37 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) } break; -case 8: /* CMD8: SEND_IF_COND */ -if (sd->spec_version < SD_PHY_SPECv2_00_VERS) { -break; -} -if (sd->state != sd_idle_state) { -break; -} -sd->vhs = 0; - -/* No response if not exactly one VHS bit is set. */ -if (!(req.arg >> 8) || (req.arg >> (ctz32(req.arg & ~0xff) + 1))) { -return sd->spi ? sd_r7 : sd_r0; -} +case 8:/* CMD8: SEND_IF_COND / SEND_EXT_CSD */ +if (sd->emmc) { +switch (sd->state) { +case sd_transfer_state: +/* MMC : Sends the EXT_CSD register as a Block of data */ +sd->state = sd_sendingdata_state; +memcpy(sd->data, sd->ext_csd, sizeof(sd->ext_csd)); +sd->data_start = addr; +sd->data_offset = 0; +return sd_r1; +default: +break; +} +} else { +if (sd->spec_version < SD_PHY_SPECv2_00_VERS) { +break; +} +if (sd->state != sd_idle_state) { +break; +} +sd->vhs = 0; -/* Accept. */ -sd->vhs = req.arg; -return sd_r7; +/* No response if not exactly one VHS bit is set. */ +if (!(req.arg >> 8) || (req.arg >> (ctz32(req.arg & ~0xff) + 1))) { +return sd->spi ? sd_r7 : sd_r0; +} +/* Accept. */ +sd->vhs = req.arg; +return sd_r7; +} case 9: /* CMD9: SEND_CSD */ switch (sd->state) { case sd_standby_state: @@ -2073,6 +2086,13 @@ uint8_t sd_read_byte(SDState *sd) sd->state = sd_transfer_state; break; +case 8: /* CMD8: SEND_EXT_CSD on MMC */ +ret = sd->data[sd->data_offset++]; +if (sd->data_offset >= sizeof(sd->ext_csd)) { +sd->state = sd_transfer_state; +} +break; + case 9: /* CMD9: SEND_CSD */ case 10:/* CMD10: SEND_CID */ ret = sd->data[sd->data_offset ++]; -- 2.7.4
[PATCH v2 21/22] docs: devel: emmc: Add a doc for emmc card emulation
Add few simple steps to create emmc card with boot and user data partitions. Signed-off-by: Sai Pavan Boddu --- docs/devel/emmc.txt | 16 1 file changed, 16 insertions(+) create mode 100644 docs/devel/emmc.txt diff --git a/docs/devel/emmc.txt b/docs/devel/emmc.txt new file mode 100644 index 000..2d098fe --- /dev/null +++ b/docs/devel/emmc.txt @@ -0,0 +1,16 @@ + +eMMC block emulation + + +Any eMMC devices has 3 kinds of partitions Boot, RPMB and User data. We +are supporting Boot and User data partitions. Boot area partitions are +expected to be 1MB size as hard coded in EXT_CSD register. + +Below is the example of combining two 1MB bootarea partition and +user data partitions. + + cat mmc-bootarea0.bin mmc-bootarea1.bin image.wic > mmc.img + qemu-img resize mmc.img 4G + +Note: mmc-bootarea0/1 are just raw paritions. User data can have +partition tables. -- 2.7.4
[PATCH v2 10/22] sd: emmc: support idle state in CMD2
eMMC is expected to be in idle-state post CMD1. Ready state is an intermediate stage which we don't come across in Device identification mode. Signed-off-by: Sai Pavan Boddu Signed-off-by: Edgar E. Iglesias Reviewed-by: Alistair Francis --- hw/sd/sd.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 8bc8d5d..ae5c5e8 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1051,6 +1051,10 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) if (sd->spi) goto bad_cmd; switch (sd->state) { +case sd_idle_state: +if (!sd->emmc) { +break; +} case sd_ready_state: sd->state = sd_identification_state; return sd_r2_i; -- 2.7.4
[PATCH v2 02/22] sd: sd: Remove usage of tabs in the file
Set tabstop as 4 and used expandtabs Signed-off-by: Sai Pavan Boddu --- hw/sd/sd.c | 190 ++--- 1 file changed, 95 insertions(+), 95 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 8517dbc..74b9162 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -332,39 +332,39 @@ static void sd_set_scr(SDState *sd) sd->scr[7] = 0x00; } -#define MID0xaa -#define OID"XY" -#define PNM"QEMU!" -#define PRV0x01 -#define MDT_YR 2006 -#define MDT_MON2 +#define MID 0xaa +#define OID "XY" +#define PNM "QEMU!" +#define PRV 0x01 +#define MDT_YR 2006 +#define MDT_MON 2 static void sd_set_cid(SDState *sd) { -sd->cid[0] = MID; /* Fake card manufacturer ID (MID) */ -sd->cid[1] = OID[0]; /* OEM/Application ID (OID) */ +sd->cid[0] = MID; /* Fake card manufacturer ID (MID) */ +sd->cid[1] = OID[0];/* OEM/Application ID (OID) */ sd->cid[2] = OID[1]; -sd->cid[3] = PNM[0]; /* Fake product name (PNM) */ +sd->cid[3] = PNM[0];/* Fake product name (PNM) */ sd->cid[4] = PNM[1]; sd->cid[5] = PNM[2]; sd->cid[6] = PNM[3]; sd->cid[7] = PNM[4]; -sd->cid[8] = PRV; /* Fake product revision (PRV) */ -sd->cid[9] = 0xde; /* Fake serial number (PSN) */ +sd->cid[8] = PRV; /* Fake product revision (PRV) */ +sd->cid[9] = 0xde; /* Fake serial number (PSN) */ sd->cid[10] = 0xad; sd->cid[11] = 0xbe; sd->cid[12] = 0xef; -sd->cid[13] = 0x00 | /* Manufacture date (MDT) */ +sd->cid[13] = 0x00 |/* Manufacture date (MDT) */ ((MDT_YR - 2000) / 10); sd->cid[14] = ((MDT_YR % 10) << 4) | MDT_MON; sd->cid[15] = (sd_crc7(sd->cid, 15) << 1) | 1; } -#define HWBLOCK_SHIFT 9 /* 512 bytes */ -#define SECTOR_SHIFT 5 /* 16 kilobytes */ -#define WPGROUP_SHIFT 7 /* 2 megs */ -#define CMULT_SHIFT9 /* 512 times HWBLOCK_SIZE */ -#define WPGROUP_SIZE (1 << (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT)) +#define HWBLOCK_SHIFT 9 /* 512 bytes */ +#define SECTOR_SHIFT5 /* 16 kilobytes */ +#define WPGROUP_SHIFT 7 /* 2 megs */ +#define CMULT_SHIFT 9 /* 512 times HWBLOCK_SIZE */ +#define WPGROUP_SIZE(1 << (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT)) static const uint8_t sd_csd_rw_mask[16] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, @@ -385,31 +385,31 @@ static void sd_set_csd(SDState *sd, uint64_t size) csize = (size >> (CMULT_SHIFT + hwblock_shift)) - 1; if (size <= SDSC_MAX_CAPACITY) { /* Standard Capacity SD */ -sd->csd[0] = 0x00; /* CSD structure */ -sd->csd[1] = 0x26; /* Data read access-time-1 */ -sd->csd[2] = 0x00; /* Data read access-time-2 */ +sd->csd[0] = 0x00; /* CSD structure */ +sd->csd[1] = 0x26; /* Data read access-time-1 */ +sd->csd[2] = 0x00; /* Data read access-time-2 */ sd->csd[3] = 0x32; /* Max. data transfer rate: 25 MHz */ -sd->csd[4] = 0x5f; /* Card Command Classes */ -sd->csd[5] = 0x50 |/* Max. read data block length */ +sd->csd[4] = 0x5f; /* Card Command Classes */ +sd->csd[5] = 0x50 | /* Max. read data block length */ hwblock_shift; -sd->csd[6] = 0xe0 |/* Partial block for read allowed */ +sd->csd[6] = 0xe0 | /* Partial block for read allowed */ ((csize >> 10) & 0x03); -sd->csd[7] = 0x00 |/* Device size */ +sd->csd[7] = 0x00 | /* Device size */ ((csize >> 2) & 0xff); -sd->csd[8] = 0x3f |/* Max. read current */ +sd->csd[8] = 0x3f | /* Max. read current */ ((csize << 6) & 0xc0); -sd->csd[9] = 0xfc |/* Max. write current */ +sd->csd[9] = 0xfc | /* Max. write current */ ((CMULT_SHIFT - 2) >> 1); -sd->csd[10] = 0x40 | /* Erase sector size */ +sd->csd[10] = 0x40 |/* Erase sector size */ (((CMULT_SHIFT - 2) << 7) & 0x80) | (sectsize >> 1); -sd->csd[11] = 0x00 | /* Write protect group size */ +sd->csd[11] = 0x00 |/* Write protect group size */ ((sectsize << 7) & 0x80) | wpsize; -sd->csd[12] = 0x90 | /* Write speed factor */ +sd->csd[12] = 0x90 |/* Write speed factor */ (hwblock_shift >> 2); -sd->csd[13] = 0x20 | /* Max. write data block length */ +sd->csd[13] = 0x20 |/* Max. write data block length */ ((hwblock_shift << 6) & 0xc0); -sd->csd[14] = 0x00;/* File format group */ -} else { /* SDHC */ +sd->csd[14] = 0x00; /* File format group */ +} else {/* SDHC */ size /= 512 * KiB; size -= 1; sd->csd[0] = 0x40; @@ -503,7 +503,7 @@ static int
[PATCH v2 17/22] sd: emmc: Support boot area in emmc image
From: Joel Stanley This assumes a specially constructued image: dd if=/dev/zero of=mmc-bootarea.img count=2 bs=1M dd if=u-boot-spl.bin of=mmc-bootarea.img conv=notrunc dd if=u-boot.bin of=mmc-bootarea.img conv=notrunc count=64 bs=1K cat mmc-bootarea.img obmc-phosphor-image.wic > mmc.img truncate --size 16GB mmc.img truncate --size 128MB mmc-bootarea.img Signed-off-by: Joel Stanley [clg: - changes on the definition names ] Signed-off-by: Cédric Le Goater [spb: use data_start property to access right emmc partition, Clean up PARTITION_ENABLE support as incomplete, Fix commit message to be generic.] Signed-off-by: Sai Pavan Boddu --- hw/sd/sd.c | 40 1 file changed, 40 insertions(+) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 60799aa..ab29e54 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1040,6 +1040,34 @@ static void sd_lock_command(SDState *sd) sd->card_status &= ~CARD_IS_LOCKED; } +/* + * This requires a disk image that has two boot partitions inserted at the + * beginning of it. The size of the boot partitions are configured in the + * ext_csd structure, which is hardcoded in qemu. They are currently set to + * 1MB each. + */ +static uint32_t sd_bootpart_offset(SDState *sd) +{ +unsigned int access = sd->ext_csd[EXT_CSD_PART_CONFIG] & +EXT_CSD_PART_CONFIG_ACC_MASK; +unsigned int boot_capacity = sd->ext_csd[EXT_CSD_BOOT_MULT] << 17; + +if (!sd->emmc) { +return 0; +} + +switch (access) { +case EXT_CSD_PART_CONFIG_ACC_DEFAULT: +return boot_capacity * 2; +case EXT_CSD_PART_CONFIG_ACC_BOOT0: +return 0; +case EXT_CSD_PART_CONFIG_ACC_BOOT0 + 1: +return boot_capacity * 1; +default: + g_assert_not_reached(); +} +} + static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) { uint32_t rca = 0x; @@ -1355,6 +1383,9 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) return sd_r1; } +if (sd->emmc) { +addr += sd_bootpart_offset(sd); +} sd->state = sd_sendingdata_state; sd->data_start = addr; sd->data_offset = 0; @@ -1374,6 +1405,9 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) return sd_r1; } +if (sd->emmc) { +addr += sd_bootpart_offset(sd); +} sd->state = sd_sendingdata_state; sd->data_start = addr; sd->data_offset = 0; @@ -1430,6 +1464,9 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) return sd_r1; } +if (sd->emmc) { +addr += sd_bootpart_offset(sd); +} sd->state = sd_receivingdata_state; sd->data_start = addr; sd->data_offset = 0; @@ -1460,6 +1497,9 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) return sd_r1; } +if (sd->emmc) { +addr += sd_bootpart_offset(sd); +} sd->state = sd_receivingdata_state; sd->data_start = addr; sd->data_offset = 0; -- 2.7.4
[PATCH v2 19/22] sd: sdhci: Support eMMC devices
Embedded device slots should be allowed as support of eMMC is available. Signed-off-by: Sai Pavan Boddu --- hw/sd/sdhci.c | 4 1 file changed, 4 deletions(-) diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c index 8ffa539..771212a 100644 --- a/hw/sd/sdhci.c +++ b/hw/sd/sdhci.c @@ -99,10 +99,6 @@ static void sdhci_check_capareg(SDHCIState *s, Error **errp) msk = FIELD_DP64(msk, SDHC_CAPAB, ASYNC_INT, 0); val = FIELD_EX64(s->capareg, SDHC_CAPAB, SLOT_TYPE); -if (val) { -error_setg(errp, "slot-type not supported"); -return; -} trace_sdhci_capareg("slot type", val); msk = FIELD_DP64(msk, SDHC_CAPAB, SLOT_TYPE, 0); -- 2.7.4
[PATCH v2 22/22] docs: arm: xlnx-versal-virt: Add eMMC support documentation
Add details of eMMC specific machine property. Signed-off-by: Sai Pavan Boddu --- docs/system/arm/xlnx-versal-virt.rst | 10 ++ 1 file changed, 10 insertions(+) diff --git a/docs/system/arm/xlnx-versal-virt.rst b/docs/system/arm/xlnx-versal-virt.rst index 2602d0f..a48a88d 100644 --- a/docs/system/arm/xlnx-versal-virt.rst +++ b/docs/system/arm/xlnx-versal-virt.rst @@ -29,6 +29,7 @@ Implemented devices: - 2 GEMs (Cadence MACB Ethernet MACs) - 8 ADMA (Xilinx zDMA) channels - 2 SD Controllers +* SDHCI0 can be configured as eMMC - OCM (256KB of On Chip Memory) - DDR memory @@ -43,6 +44,15 @@ Other differences between the hardware and the QEMU model: - QEMU provides 8 virtio-mmio virtio transports; these start at address ``0xa000`` and have IRQs from 111 and upwards. +Enabling eMMC +" +In order to enable eMMC pass the following machine property "emmc=on". +ex: +"-M xlnx-versal-virt,emmc=on" + +Above switch would configure SDHCI0 to accept eMMC card. More details on eMMC +emulation can be found in docs/devel/emmc.txt. + Running """ If the user provides an Operating System to be loaded, we expect users -- 2.7.4
[PATCH v2 05/22] sd: emmc: Add support for EXT_CSD & CSD for eMMC
From: Vincent Palatin eMMC CSD is similar to SD with an option to refer EXT_CSD for larger devices. Signed-off-by: Sai Pavan Boddu --- hw/sd/sd.c | 57 +++-- 1 file changed, 55 insertions(+), 2 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 430bea5..4c211ba 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -135,6 +135,7 @@ struct SDState { uint64_t data_start; uint32_t data_offset; uint8_t data[512]; +uint8_t ext_csd[512]; qemu_irq readonly_cb; qemu_irq inserted_cb; QEMUTimer *ocr_power_timer; @@ -389,6 +390,51 @@ static const uint8_t sd_csd_rw_mask[16] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xfc, 0xfe, }; +static void mmc_set_ext_csd(SDState *sd, uint64_t size) +{ +uint32_t sectcount = size >> HWBLOCK_SHIFT; + +memset(sd->ext_csd, 0, sizeof(sd->ext_csd)); +sd->ext_csd[504] = 0x1; /* supported command sets */ +sd->ext_csd[503] = 0x1; /* HPI features */ +sd->ext_csd[502] = 0x1; /* Background operations support */ +sd->ext_csd[241] = 0xA; /* 1st initialization time after partitioning */ +sd->ext_csd[232] = 0x1; /* Trim multiplier */ +sd->ext_csd[231] = 0x15; /* Secure feature support */ +sd->ext_csd[230] = 0x96; /* Secure erase support */ +sd->ext_csd[229] = 0x96; /* Secure TRIM multiplier */ +sd->ext_csd[228] = 0x7; /* Boot information */ +sd->ext_csd[226] = 0x8; /* Boot partition size */ +sd->ext_csd[225] = 0x6; /* Access size */ +sd->ext_csd[224] = 0x4; /* HC Erase unit size */ +sd->ext_csd[223] = 0x1; /* HC erase timeout */ +sd->ext_csd[222] = 0x1; /* Reliable write sector count */ +sd->ext_csd[221] = 0x4; /* HC write protect group size */ +sd->ext_csd[220] = 0x8; /* Sleep current VCC */ +sd->ext_csd[219] = 0x7; /* Sleep current VCCQ */ +sd->ext_csd[217] = 0x11; /* Sleep/Awake timeout */ +sd->ext_csd[215] = (sectcount >> 24) & 0xff; /* Sector count */ +sd->ext_csd[214] = (sectcount >> 16) & 0xff; /* ... */ +sd->ext_csd[213] = (sectcount >> 8) & 0xff; /* ... */ +sd->ext_csd[212] = (sectcount & 0xff); /* ... */ +sd->ext_csd[210] = 0xa; /* Min write perf for 8bit@52Mhz */ +sd->ext_csd[209] = 0xa; /* Min read perf for 8bit@52Mhz */ +sd->ext_csd[208] = 0xa; /* Min write perf for 4bit@52Mhz */ +sd->ext_csd[207] = 0xa; /* Min read perf for 4bit@52Mhz */ +sd->ext_csd[206] = 0xa; /* Min write perf for 4bit@26Mhz */ +sd->ext_csd[205] = 0xa; /* Min read perf for 4bit@26Mhz */ +sd->ext_csd[199] = 0x1; /* Partition switching timing */ +sd->ext_csd[198] = 0x1; /* Out-of-interrupt busy timing */ +sd->ext_csd[196] = 0xFF; /* Card type */ +sd->ext_csd[194] = 0x2; /* CSD Structure version */ +sd->ext_csd[192] = 0x5; /* Extended CSD revision */ +sd->ext_csd[168] = 0x1; /* RPMB size */ +sd->ext_csd[160] = 0x3; /* Partinioning support */ +sd->ext_csd[159] = 0x00; /* Max enhanced area size */ +sd->ext_csd[158] = 0x00; /* ... */ +sd->ext_csd[157] = 0xEC; /* ... */ +} + static void sd_set_csd(SDState *sd, uint64_t size) { int hwblock_shift = HWBLOCK_SHIFT; @@ -402,8 +448,11 @@ static void sd_set_csd(SDState *sd, uint64_t size) } csize = (size >> (CMULT_SHIFT + hwblock_shift)) - 1; -if (size <= SDSC_MAX_CAPACITY) { /* Standard Capacity SD */ -sd->csd[0] = 0x00; /* CSD structure */ +if (size <= SDSC_MAX_CAPACITY || sd->emmc) { /* Standard Capacity SD */ +if (sd->emmc && size >= SDSC_MAX_CAPACITY) { +csize = 0xfff; +} +sd->csd[0] = sd->emmc ? 0xd0 : 0x00; /* CSD structure */ sd->csd[1] = 0x26; /* Data read access-time-1 */ sd->csd[2] = 0x00; /* Data read access-time-2 */ sd->csd[3] = 0x32; /* Max. data transfer rate: 25 MHz */ @@ -447,6 +496,10 @@ static void sd_set_csd(SDState *sd, uint64_t size) sd->csd[14] = 0x00; } sd->csd[15] = (sd_crc7(sd->csd, 15) << 1) | 1; + +if (sd->emmc) { +mmc_set_ext_csd(sd, size); +} } static void sd_set_rca(SDState *sd, uint16_t value) -- 2.7.4
[PATCH v2 12/22] sd: emmc: add CMD21 tuning sequence
eMMC cards support tuning sequence for entering HS200 mode. Signed-off-by: Sai Pavan Boddu Signed-off-by: Edgar E. Iglesias --- hw/sd/sd.c | 47 +++ 1 file changed, 47 insertions(+) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index e50d40b..d702027 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1378,6 +1378,14 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) } break; +case 21:/* CMD21: mmc SEND TUNING_BLOCK */ +if (sd->emmc && (sd->state == sd_transfer_state)) { +sd->state = sd_sendingdata_state; +sd->data_offset = 0; +return sd_r1; +} +break; + case 23:/* CMD23: SET_BLOCK_COUNT */ if (sd->spec_version < SD_PHY_SPECv3_01_VERS) { break; @@ -2112,6 +2120,30 @@ static const uint8_t sd_tuning_block_pattern[SD_TUNING_BLOCK_SIZE] = { 0xbb, 0xff, 0xf7, 0xff, 0xf7, 0x7f, 0x7b, 0xde, }; +#define EXCSD_BUS_WIDTH_OFFSET 183 +#define BUS_WIDTH_8_MASK0x4 +#define BUS_WIDTH_4_MASK0x2 +#define MMC_TUNING_BLOCK_SIZE 128 + +static const uint8_t mmc_tunning_block_pattern[128] = { + 0xff, 0xff, 0x00, 0xff, 0xff, 0xff, 0x00, 0x00, + 0xff, 0xff, 0xcc, 0xcc, 0xcc, 0x33, 0xcc, 0xcc, + 0xcc, 0x33, 0x33, 0xcc, 0xcc, 0xcc, 0xff, 0xff, + 0xff, 0xee, 0xff, 0xff, 0xff, 0xee, 0xee, 0xff, + 0xff, 0xff, 0xdd, 0xff, 0xff, 0xff, 0xdd, 0xdd, + 0xff, 0xff, 0xff, 0xbb, 0xff, 0xff, 0xff, 0xbb, + 0xbb, 0xff, 0xff, 0xff, 0x77, 0xff, 0xff, 0xff, + 0x77, 0x77, 0xff, 0x77, 0xbb, 0xdd, 0xee, 0xff, + 0xff, 0xff, 0xff, 0x00, 0xff, 0xff, 0xff, 0x00, + 0x00, 0xff, 0xff, 0xcc, 0xcc, 0xcc, 0x33, 0xcc, + 0xcc, 0xcc, 0x33, 0x33, 0xcc, 0xcc, 0xcc, 0xff, + 0xff, 0xff, 0xee, 0xff, 0xff, 0xff, 0xee, 0xee, + 0xff, 0xff, 0xff, 0xdd, 0xff, 0xff, 0xff, 0xdd, + 0xdd, 0xff, 0xff, 0xff, 0xbb, 0xff, 0xff, 0xff, + 0xbb, 0xbb, 0xff, 0xff, 0xff, 0x77, 0xff, 0xff, + 0xff, 0x77, 0x77, 0xff, 0x77, 0xbb, 0xdd, 0xee, +}; + uint8_t sd_read_byte(SDState *sd) { /* TODO: Append CRCs */ @@ -2205,6 +2237,21 @@ uint8_t sd_read_byte(SDState *sd) ret = sd_tuning_block_pattern[sd->data_offset++]; break; +case 21:/* CMD21: SEND_TUNNING_BLOCK (MMC) */ +if (sd->data_offset >= MMC_TUNING_BLOCK_SIZE - 1) { +sd->state = sd_transfer_state; +} +if (sd->ext_csd[EXCSD_BUS_WIDTH_OFFSET] & BUS_WIDTH_8_MASK) { +ret = mmc_tunning_block_pattern[sd->data_offset++]; +} else { +/* Return LSB Nibbles of two byte from the 8bit tuning block + * for 4bit mode + */ +ret = mmc_tunning_block_pattern[sd->data_offset++] & 0x0F; +ret |= (mmc_tunning_block_pattern[sd->data_offset++] & 0x0F) << 4; +} +break; + case 22:/* ACMD22: SEND_NUM_WR_BLOCKS */ ret = sd->data[sd->data_offset ++]; -- 2.7.4
[PATCH v2 08/22] sd: emmc: Dont not update CARD_CAPACITY for eMMC cards
OCR.CARD_CAPACITY field is only valid for sd cards, So skip it for eMMC. Signed-off-by: Sai Pavan Boddu Signed-off-by: Edgar E. Iglesias Reviewed-by: Alistair Francis --- hw/sd/sd.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index a4f93b5..9835f52 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -324,7 +324,8 @@ static void sd_ocr_powerup(void *opaque) /* card power-up OK */ sd->ocr = FIELD_DP32(sd->ocr, OCR, CARD_POWER_UP, 1); -if (sd->size > SDSC_MAX_CAPACITY) { +/* eMMC supports only Byte mode */ +if (!sd->emmc && sd->size > SDSC_MAX_CAPACITY) { sd->ocr = FIELD_DP32(sd->ocr, OCR, CARD_CAPACITY, 1); } } -- 2.7.4
[PATCH v2 20/22] arm: xlnx-versal: Add emmc to versal
Configuring SDHCI-0 to act as eMMC controller. Signed-off-by: Sai Pavan Boddu --- include/hw/arm/xlnx-versal.h | 1 + hw/arm/xlnx-versal-virt.c| 30 +- hw/arm/xlnx-versal.c | 13 +++-- 3 files changed, 37 insertions(+), 7 deletions(-) diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h index 2b76885..440f3b4 100644 --- a/include/hw/arm/xlnx-versal.h +++ b/include/hw/arm/xlnx-versal.h @@ -76,6 +76,7 @@ struct Versal { struct { MemoryRegion *mr_ddr; uint32_t psci_conduit; +bool has_emmc; } cfg; }; diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c index 8482cd6..273e0c7 100644 --- a/hw/arm/xlnx-versal-virt.c +++ b/hw/arm/xlnx-versal-virt.c @@ -46,6 +46,7 @@ struct VersalVirt { struct { bool secure; +bool has_emmc; } cfg; }; @@ -333,6 +334,13 @@ static void fdt_add_sd_nodes(VersalVirt *s) qemu_fdt_setprop_sized_cells(s->fdt, name, "reg", 2, addr, 2, MM_PMC_SD0_SIZE); qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat)); +/* + * eMMC specific properties + */ +if (s->cfg.has_emmc && i == 0) { +qemu_fdt_setprop(s->fdt, name, "non-removable", NULL, 0); +qemu_fdt_setprop_sized_cells(s->fdt, name, "bus-width", 1, 8); +} g_free(name); } } @@ -512,7 +520,7 @@ static void create_virtio_regions(VersalVirt *s) } } -static void sd_plugin_card(SDHCIState *sd, DriveInfo *di) +static void sd_plugin_card(SDHCIState *sd, DriveInfo *di, bool emmc) { BlockBackend *blk = di ? blk_by_legacy_dinfo(di) : NULL; DeviceState *card; @@ -520,15 +528,22 @@ static void sd_plugin_card(SDHCIState *sd, DriveInfo *di) card = qdev_new(TYPE_SD_CARD); object_property_add_child(OBJECT(sd), "card[*]", OBJECT(card)); qdev_prop_set_drive_err(card, "drive", blk, _fatal); +object_property_set_bool(OBJECT(card), "emmc", emmc, _fatal); qdev_realize_and_unref(card, qdev_get_child_bus(DEVICE(sd), "sd-bus"), _fatal); } +static void versal_virt_set_emmc(Object *obj, bool value, Error **errp) +{ +VersalVirt *s = XLNX_VERSAL_VIRT_MACHINE(obj); + +s->cfg.has_emmc = value; +} + static void versal_virt_init(MachineState *machine) { VersalVirt *s = XLNX_VERSAL_VIRT_MACHINE(machine); int psci_conduit = QEMU_PSCI_CONDUIT_DISABLED; -int i; /* * If the user provides an Operating System to be loaded, we expect them @@ -560,6 +575,8 @@ static void versal_virt_init(MachineState *machine) _abort); object_property_set_int(OBJECT(>soc), "psci-conduit", psci_conduit, _abort); +object_property_set_bool(OBJECT(>soc), "has-emmc", s->cfg.has_emmc, + _abort); sysbus_realize(SYS_BUS_DEVICE(>soc), _fatal); fdt_create(s); @@ -581,10 +598,10 @@ static void versal_virt_init(MachineState *machine) memory_region_add_subregion_overlap(get_system_memory(), 0, >soc.fpd.apu.mr, 0); +sd_plugin_card(>soc.pmc.iou.sd[0], +drive_get_next(s->cfg.has_emmc ? IF_EMMC : IF_SD), s->cfg.has_emmc); /* Plugin SD cards. */ -for (i = 0; i < ARRAY_SIZE(s->soc.pmc.iou.sd); i++) { -sd_plugin_card(>soc.pmc.iou.sd[i], drive_get_next(IF_SD)); -} +sd_plugin_card(>soc.pmc.iou.sd[1], drive_get_next(IF_SD), false); s->binfo.ram_size = machine->ram_size; s->binfo.loader_start = 0x0; @@ -621,6 +638,9 @@ static void versal_virt_machine_class_init(ObjectClass *oc, void *data) mc->default_cpus = XLNX_VERSAL_NR_ACPUS; mc->no_cdrom = true; mc->default_ram_id = "ddr"; +object_class_property_add_bool(oc, "emmc", +NULL, versal_virt_set_emmc); + } static const TypeInfo versal_virt_machine_init_typeinfo = { diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c index 628e77e..67ed1af 100644 --- a/hw/arm/xlnx-versal.c +++ b/hw/arm/xlnx-versal.c @@ -230,6 +230,9 @@ static void versal_create_admas(Versal *s, qemu_irq *pic) } #define SDHCI_CAPABILITIES 0x280737ec6481 /* Same as on ZynqMP. */ +#define SDHCI_EMMC_CAPS ((SDHCI_CAPABILITIES & ~(3 << 30)) | \ + (1 << 30)) + static void versal_create_sds(Versal *s, qemu_irq *pic) { int i; @@ -244,9 +247,14 @@ static void versal_create_sds(Versal *s, qemu_irq *pic) object_property_set_uint(OBJECT(dev), "sd-spec-version", 3, _fatal); -object_property_set_uint(OBJECT(dev), "capareg", SDHCI_CAPABILITIES, +object_property_set_uint(OBJECT(dev), "capareg", SDHCI_EMMC_CAPS, _fatal); -object_property_set_uint(OBJECT(dev), "uhs", UHS_I, _fatal); +/* + * UHS is not applicable for eMMC + */
[PATCH v2 01/22] block: add eMMC block device type
From: Vincent Palatin Add new block device type. Signed-off-by: Vincent Palatin [SPB: Rebased over 5.1 version] Signed-off-by: Sai Pavan Boddu Signed-off-by: Joel Stanley Signed-off-by: Cédric Le Goater Reviewed-by: Alistair Francis --- include/sysemu/blockdev.h | 1 + blockdev.c| 1 + 2 files changed, 2 insertions(+) diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h index 3b5fcda..eefae9f 100644 --- a/include/sysemu/blockdev.h +++ b/include/sysemu/blockdev.h @@ -24,6 +24,7 @@ typedef enum { */ IF_NONE = 0, IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, IF_VIRTIO, IF_XEN, +IF_EMMC, IF_COUNT } BlockInterfaceType; diff --git a/blockdev.c b/blockdev.c index cd438e6..390d43c 100644 --- a/blockdev.c +++ b/blockdev.c @@ -83,6 +83,7 @@ static const char *const if_name[IF_COUNT] = { [IF_SD] = "sd", [IF_VIRTIO] = "virtio", [IF_XEN] = "xen", +[IF_EMMC] = "emmc", }; static int if_max_devs[IF_COUNT] = { -- 2.7.4
[PATCH v2 11/22] sd: emmc: Add mmc switch function support
switch operation in eMMC card updates the ext_csd register to request changes in card operations. Here we implement similar sequence but requests are mostly dummy and make no change. Implement SWITCH_ERROR if the write operation extends goes beyond length of ext_csd. Signed-off-by: Sai Pavan Boddu Signed-off-by: Edgar E. Iglesias --- hw/sd/sd.c | 56 ++-- 1 file changed, 50 insertions(+), 6 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index ae5c5e8..e50d40b 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -515,6 +515,7 @@ static void sd_set_rca(SDState *sd, uint16_t value) FIELD(CSR, AKE_SEQ_ERROR, 3, 1) FIELD(CSR, APP_CMD, 5, 1) FIELD(CSR, FX_EVENT,6, 1) +FIELD(CSR, SWITCH_ERROR,7, 1) FIELD(CSR, READY_FOR_DATA, 8, 1) FIELD(CSR, CURRENT_STATE, 9, 4) FIELD(CSR, ERASE_RESET,13, 1) @@ -878,6 +879,43 @@ static uint32_t sd_wpbits(SDState *sd, uint64_t addr) return ret; } +enum { +MMC_CMD6_ACCESS_COMMAND_SET = 0, +MMC_CMD6_ACCESS_SET_BITS, +MMC_CMD6_ACCESS_CLEAR_BITS, +MMC_CMD6_ACCESS_WRITE_BYTE, +}; + +static void mmc_function_switch(SDState *sd, uint32_t arg) +{ +uint32_t access = extract32(arg, 24, 2); +uint32_t index = extract32(arg, 16, 8); +uint32_t value = extract32(arg, 8, 8); +uint8_t b = sd->ext_csd[index]; + +switch (access) { +case MMC_CMD6_ACCESS_COMMAND_SET: +qemu_log_mask(LOG_UNIMP, "MMC Command set switching not supported\n"); +return; +case MMC_CMD6_ACCESS_SET_BITS: +b |= value; +break; +case MMC_CMD6_ACCESS_CLEAR_BITS: +b &= ~value; +break; +case MMC_CMD6_ACCESS_WRITE_BYTE: +b = value; +break; +} + +if (index >= 192) { +sd->card_status |= R_CSR_SWITCH_ERROR_MASK; +return; +} + +sd->ext_csd[index] = b; +} + static void sd_function_switch(SDState *sd, uint32_t arg) { int i, mode, new_func; @@ -1097,12 +1135,18 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) case 6: /* CMD6: SWITCH_FUNCTION */ switch (sd->mode) { case sd_data_transfer_mode: -sd_function_switch(sd, req.arg); -sd->state = sd_sendingdata_state; -sd->data_start = 0; -sd->data_offset = 0; -return sd_r1; - +if (sd->emmc) { +sd->state = sd_programming_state; +mmc_function_switch(sd, req.arg); +sd->state = sd_transfer_state; +return sd_r1b; +} else { +sd_function_switch(sd, req.arg); +sd->state = sd_sendingdata_state; +sd->data_start = 0; +sd->data_offset = 0; +return sd_r1; +} default: break; } -- 2.7.4
[PATCH v2 16/22] sd: emmc: Add Extended CSD register definitions
From: Cédric Le Goater Add user friendly macros for EXT_CSD register. Signed-off-by: Cédric Le Goater [spb: Rebased over versal emmc series, updated commit message] Signed-off-by: Sai Pavan Boddu --- hw/sd/sdmmc-internal.h | 97 ++ hw/sd/sd.c | 54 +++- 2 files changed, 126 insertions(+), 25 deletions(-) diff --git a/hw/sd/sdmmc-internal.h b/hw/sd/sdmmc-internal.h index d8bf17d..7ab7b4d 100644 --- a/hw/sd/sdmmc-internal.h +++ b/hw/sd/sdmmc-internal.h @@ -37,4 +37,101 @@ const char *sd_cmd_name(uint8_t cmd); */ const char *sd_acmd_name(uint8_t cmd); +/* + * EXT_CSD fields + */ + +#define EXT_CSD_CMDQ_MODE_EN15 /* R/W */ +#define EXT_CSD_FLUSH_CACHE 32 /* W */ +#define EXT_CSD_CACHE_CTRL33 /* R/W */ +#define EXT_CSD_POWER_OFF_NOTIFICATION 34 /* R/W */ +#define EXT_CSD_PACKED_FAILURE_INDEX 35 /* RO */ +#define EXT_CSD_PACKED_CMD_STATUS 36 /* RO */ +#define EXT_CSD_EXP_EVENTS_STATUS 54 /* RO, 2 bytes */ +#define EXT_CSD_EXP_EVENTS_CTRL 56 /* R/W, 2 bytes */ +#define EXT_CSD_DATA_SECTOR_SIZE 61 /* R */ +#define EXT_CSD_GP_SIZE_MULT143 /* R/W */ +#define EXT_CSD_PARTITION_SETTING_COMPLETED 155 /* R/W */ +#define EXT_CSD_PARTITION_ATTRIBUTE 156 /* R/W */ +#define EXT_CSD_PARTITION_SUPPORT 160 /* RO */ +#define EXT_CSD_HPI_MGMT161 /* R/W */ +#define EXT_CSD_RST_N_FUNCTION162 /* R/W */ +#define EXT_CSD_BKOPS_EN163 /* R/W */ +#define EXT_CSD_BKOPS_START 164 /* W */ +#define EXT_CSD_SANITIZE_START165 /* W */ +#define EXT_CSD_WR_REL_PARAM166 /* RO */ +#define EXT_CSD_RPMB_MULT 168 /* RO */ +#define EXT_CSD_FW_CONFIG 169 /* R/W */ +#define EXT_CSD_BOOT_WP 173 /* R/W */ +#define EXT_CSD_ERASE_GROUP_DEF 175 /* R/W */ +#define EXT_CSD_PART_CONFIG 179 /* R/W */ +#define EXT_CSD_ERASED_MEM_CONT 181 /* RO */ +#define EXT_CSD_BUS_WIDTH 183 /* R/W */ +#define EXT_CSD_STROBE_SUPPORT184 /* RO */ +#define EXT_CSD_HS_TIMING 185 /* R/W */ +#define EXT_CSD_POWER_CLASS 187 /* R/W */ +#define EXT_CSD_REV 192 /* RO */ +#define EXT_CSD_STRUCTURE 194 /* RO */ +#define EXT_CSD_CARD_TYPE 196 /* RO */ +#define EXT_CSD_DRIVER_STRENGTH 197 /* RO */ +#define EXT_CSD_OUT_OF_INTERRUPT_TIME 198 /* RO */ +#define EXT_CSD_PART_SWITCH_TIME199 /* RO */ +#define EXT_CSD_PWR_CL_52_195 200 /* RO */ +#define EXT_CSD_PWR_CL_26_195 201 /* RO */ +#define EXT_CSD_PWR_CL_52_360 202 /* RO */ +#define EXT_CSD_PWR_CL_26_360 203 /* RO */ +#define EXT_CSD_SEC_CNT 212 /* RO, 4 bytes */ +#define EXT_CSD_S_A_TIMEOUT 217 /* RO */ +#define EXT_CSD_S_C_VCCQ 219 /* RO */ +#define EXT_CSD_S_C_VCC 220 /* RO */ +#define EXT_CSD_REL_WR_SEC_C222 /* RO */ +#define EXT_CSD_HC_WP_GRP_SIZE221 /* RO */ +#define EXT_CSD_ERASE_TIMEOUT_MULT 223 /* RO */ +#define EXT_CSD_HC_ERASE_GRP_SIZE 224 /* RO */ +#define EXT_CSD_ACC_SIZE225 /* RO */ +#define EXT_CSD_BOOT_MULT 226 /* RO */ +#define EXT_CSD_BOOT_INFO 228 /* RO */ +#define EXT_CSD_SEC_TRIM_MULT 229 /* RO */ +#define EXT_CSD_SEC_ERASE_MULT230 /* RO */ +#define EXT_CSD_SEC_FEATURE_SUPPORT 231 /* RO */ +#define EXT_CSD_TRIM_MULT 232 /* RO */ +#define EXT_CSD_PWR_CL_200_195236 /* RO */ +#define EXT_CSD_PWR_CL_200_360237 /* RO */ +#define EXT_CSD_PWR_CL_DDR_52_195 238 /* RO */ +#define EXT_CSD_PWR_CL_DDR_52_360 239 /* RO */ +#define EXT_CSD_BKOPS_STATUS246 /* RO */ +#define EXT_CSD_POWER_OFF_LONG_TIME 247 /* RO */ +#define EXT_CSD_GENERIC_CMD6_TIME 248 /* RO */ +#define EXT_CSD_CACHE_SIZE249 /* RO, 4 bytes */ +#define EXT_CSD_PWR_CL_DDR_200_360 253 /* RO */ +#define EXT_CSD_FIRMWARE_VERSION 254 /* RO, 8 bytes */ +#define EXT_CSD_PRE_EOL_INFO267 /* RO */ +#define EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A 268 /* RO */ +#define EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B 269 /* RO */ +#define EXT_CSD_CMDQ_DEPTH307 /* RO */ +#define EXT_CSD_CMDQ_SUPPORT308 /* RO */ +#define EXT_CSD_SUPPORTED_MODE493 /* RO */ +#define EXT_CSD_TAG_UNIT_SIZE 498 /* RO */ +#define EXT_CSD_DATA_TAG_SUPPORT 499 /* RO */ +#define EXT_CSD_MAX_PACKED_WRITES 500 /* RO */ +#define EXT_CSD_MAX_PACKED_READS 501 /* RO */ +#define EXT_CSD_BKOPS_SUPPORT 502 /* RO */ +#define EXT_CSD_HPI_FEATURES503 /* RO */ +#define EXT_CSD_S_CMD_SET 504 /* RO */ + +/* + * EXT_CSD field definitions + */ + +#define EXT_CSD_WR_REL_PARAM_EN (1 << 2) +#define EXT_CSD_WR_REL_PARAM_EN_RPMB_REL_WR (1 << 4) + +#define EXT_CSD_PART_CONFIG_ACC_MASK (0x7) +#define EXT_CSD_PART_CONFIG_ACC_DEFAULT (0x0) +#define EXT_CSD_PART_CONFIG_ACC_BOOT0 (0x1) + +#define EXT_CSD_PART_CONFIG_EN_MASK (0x7 << 3) +#define EXT_CSD_PART_CONFIG_EN_BOOT0 (0x1 << 3) +#define EXT_CSD_PART_CONFIG_EN_USER (0x7 << 3) + #endif diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 3c24810..60799aa 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -409,41 +409,45 @@ static void mmc_set_ext_csd(SDState *sd, uint64_t size)
[PATCH v2 18/22] sd: emmc: Subtract bootarea size from blk
From: Joel Stanley The userdata size is derived from the file the user passes on the command line, but we must take into account the boot areas. Signed-off-by: Joel Stanley Signed-off-by: Cédric Le Goater --- hw/sd/sd.c | 5 + 1 file changed, 5 insertions(+) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index ab29e54..a0b4507 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -653,6 +653,11 @@ static void sd_reset(DeviceState *dev) } size = sect << 9; +if (sd->emmc) { +unsigned int boot_capacity = sd->ext_csd[EXT_CSD_BOOT_MULT] << 17; +size -= boot_capacity * 2; +} + sect = sd_addr_to_wpnum(size) + 1; sd->state = sd_idle_state; -- 2.7.4
[PATCH v2 07/22] sd: sdmmc-internal: Add command string for SEND_OP_CMD
From: Cédric Le Goater This adds extra info to trace log. Signed-off-by: Sai Pavan Boddu --- hw/sd/sdmmc-internal.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/sd/sdmmc-internal.c b/hw/sd/sdmmc-internal.c index 2053def..8648a78 100644 --- a/hw/sd/sdmmc-internal.c +++ b/hw/sd/sdmmc-internal.c @@ -14,7 +14,7 @@ const char *sd_cmd_name(uint8_t cmd) { static const char *cmd_abbrev[SDMMC_CMD_MAX] = { - [0]= "GO_IDLE_STATE", + [0]= "GO_IDLE_STATE", [1]= "SEND_OP_CMD", [2]= "ALL_SEND_CID",[3]= "SEND_RELATIVE_ADDR", [4]= "SET_DSR", [5]= "IO_SEND_OP_COND", [6]= "SWITCH_FUNC", [7]= "SELECT/DESELECT_CARD", -- 2.7.4
[PATCH v2 03/22] sd: emmc: Add support for eMMC cards
From: Vincent Palatin This patch adds support for eMMC cards, change SET_RELATIVE_ADDR command to assign relative address as requested by user. Signed-off-by: Vincent Palatin Signed-off-by: Joel Stanley Signed-off-by: Cédric Le Goater [spb: Split original patch series] Signed-off-by: Sai Pavan Boddu --- hw/sd/sd.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 74b9162..42ee49c 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -108,6 +108,7 @@ struct SDState { uint8_t spec_version; BlockBackend *blk; bool spi; +bool emmc; /* Runtime changeables */ @@ -431,9 +432,13 @@ static void sd_set_csd(SDState *sd, uint64_t size) sd->csd[15] = (sd_crc7(sd->csd, 15) << 1) | 1; } -static void sd_set_rca(SDState *sd) +static void sd_set_rca(SDState *sd, uint16_t value) { -sd->rca += 0x4567; +if (sd->emmc) { +sd->rca = value; +} else { +sd->rca += 0x4567; +} } FIELD(CSR, AKE_SEQ_ERROR, 3, 1) @@ -979,8 +984,8 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) case sd_identification_state: case sd_standby_state: sd->state = sd_standby_state; -sd_set_rca(sd); -return sd_r6; +sd_set_rca(sd, req.arg >> 16); +return sd->emmc ? sd_r1 : sd_r6; default: break; @@ -2176,6 +2181,7 @@ static Property sd_properties[] = { * board to ensure that ssi transfers only occur when the chip select * is asserted. */ DEFINE_PROP_BOOL("spi", SDState, spi, false), +DEFINE_PROP_BOOL("emmc", SDState, emmc, false), DEFINE_PROP_END_OF_LIST() }; -- 2.7.4
[PATCH v2 15/22] sd: emmc: Update CID structure for eMMC
CID structure is little different for eMMC, w.r.t to product name and manufacturing date. Signed-off-by: Sai Pavan Boddu Signed-off-by: Edgar E. Iglesias --- hw/sd/sd.c | 47 ++- 1 file changed, 30 insertions(+), 17 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 6d2ef2b..3c24810 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -361,23 +361,36 @@ static void sd_set_scr(SDState *sd) static void sd_set_cid(SDState *sd) { -sd->cid[0] = MID; /* Fake card manufacturer ID (MID) */ -sd->cid[1] = OID[0];/* OEM/Application ID (OID) */ -sd->cid[2] = OID[1]; -sd->cid[3] = PNM[0];/* Fake product name (PNM) */ -sd->cid[4] = PNM[1]; -sd->cid[5] = PNM[2]; -sd->cid[6] = PNM[3]; -sd->cid[7] = PNM[4]; -sd->cid[8] = PRV; /* Fake product revision (PRV) */ -sd->cid[9] = 0xde; /* Fake serial number (PSN) */ -sd->cid[10] = 0xad; -sd->cid[11] = 0xbe; -sd->cid[12] = 0xef; -sd->cid[13] = 0x00 |/* Manufacture date (MDT) */ -((MDT_YR - 2000) / 10); -sd->cid[14] = ((MDT_YR % 10) << 4) | MDT_MON; -sd->cid[15] = (sd_crc7(sd->cid, 15) << 1) | 1; +if (sd->emmc) { +sd->cid[0] = MID; +sd->cid[1] = 0x1; /* CBX */ +sd->cid[2] = OID[0];/* OEM/Application ID (OID) */ +sd->cid[8] = 0x0; +sd->cid[9] = PRV;/* Fake product revision (PRV) */ +sd->cid[10] = 0xde; /* Fake serial number (PSN) */ +sd->cid[11] = 0xad; +sd->cid[12] = 0xbe; +sd->cid[13] = 0xef; +sd->cid[14] = ((MDT_YR - 1997) % 0x10); /* MDT */ +} else { +sd->cid[0] = MID; /* Fake card manufacturer ID (MID) */ +sd->cid[1] = OID[0];/* OEM/Application ID (OID) */ +sd->cid[2] = OID[1]; +sd->cid[8] = PRV; /* Fake product revision (PRV) */ +sd->cid[9] = 0xde; /* Fake serial number (PSN) */ +sd->cid[10] = 0xad; +sd->cid[11] = 0xbe; +sd->cid[12] = 0xef; +sd->cid[13] = 0x00 |/* Manufacture date (MDT) */ +((MDT_YR - 2000) / 10); +sd->cid[14] = ((MDT_YR % 10) << 4) | MDT_MON; + } + sd->cid[3] = PNM[0];/* Fake product name (PNM) 48bit */ + sd->cid[4] = PNM[1]; + sd->cid[5] = PNM[2]; + sd->cid[6] = PNM[3]; + sd->cid[7] = PNM[4]; + sd->cid[15] = (sd_crc7(sd->cid, 15) << 1) | 1; } #define HWBLOCK_SHIFT 9 /* 512 bytes */ -- 2.7.4
[PATCH v2 13/22] sd: emmc: Make ACMD41 illegal for mmc
ACMD41 is not applicable for eMMC. Signed-off-by: Sai Pavan Boddu Signed-off-by: Edgar E. Iglesias --- hw/sd/sd.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index d702027..df82b61 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1729,6 +1729,9 @@ static sd_rsp_type_t sd_app_command(SDState *sd, break; case 41:/* ACMD41: SD_APP_OP_COND */ +if (sd->emmc) { +break; +} if (sd->spi) { /* SEND_OP_CMD */ sd->state = sd_transfer_state; -- 2.7.4
[PATCH v2 04/22] sd: emmc: update OCR fields for eMMC
From: Vincent Palatin eMMC OCR register doesn't has UHS-II field and voltage fields are different. Signed-off-by: Sai Pavan Boddu --- hw/sd/sd.c | 27 --- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 42ee49c..430bea5 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -283,6 +283,15 @@ FIELD(OCR, UHS_II_CARD,29, 1) /* Only UHS-II */ FIELD(OCR, CARD_CAPACITY, 30, 1) /* 0:SDSC, 1:SDHC/SDXC */ FIELD(OCR, CARD_POWER_UP, 31, 1) +/* + * eMMC OCR register + */ +FIELD(EMMC_OCR, VDD_WINDOW_0, 7, 1) +FIELD(EMMC_OCR, VDD_WINDOW_1, 8, 7) +FIELD(EMMC_OCR, VDD_WINDOW_2, 15, 9) +FIELD(EMMC_OCR, ACCESS_MODE, 29, 2) +FIELD(EMMC_OCR, POWER_UP, 31, 1) + #define ACMD41_ENQUIRY_MASK 0x00ff #define ACMD41_R3_MASK (R_OCR_VDD_VOLTAGE_WIN_HI_MASK \ | R_OCR_ACCEPT_SWITCH_1V8_MASK \ @@ -292,8 +301,16 @@ FIELD(OCR, CARD_POWER_UP, 31, 1) static void sd_set_ocr(SDState *sd) { -/* All voltages OK */ -sd->ocr = R_OCR_VDD_VOLTAGE_WIN_HI_MASK; +if (sd->emmc) { +/* + * Dual Voltage eMMC card + */ +sd->ocr = R_EMMC_OCR_VDD_WINDOW_0_MASK | + R_EMMC_OCR_VDD_WINDOW_2_MASK; +} else { +/* All voltages OK */ +sd->ocr = R_OCR_VDD_VOLTAGE_WIN_HI_MASK; +} } static void sd_ocr_powerup(void *opaque) @@ -521,7 +538,11 @@ static void sd_response_r1_make(SDState *sd, uint8_t *response) static void sd_response_r3_make(SDState *sd, uint8_t *response) { -stl_be_p(response, sd->ocr & ACMD41_R3_MASK); +if (sd->emmc) { +stl_be_p(response, sd->ocr); +} else { +stl_be_p(response, sd->ocr & ACMD41_R3_MASK); +} } static void sd_response_r6_make(SDState *sd, uint8_t *response) -- 2.7.4
[PATCH v2 09/22] sd: emmc: Update CMD1 definition for eMMC
Add support to Power up the card and send response r3 in case of eMMC. Signed-off-by: Sai Pavan Boddu Signed-off-by: Edgar E. Iglesias Reviewed-by: Alistair Francis --- hw/sd/sd.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 9835f52..8bc8d5d 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1033,8 +1033,16 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) break; case 1: /* CMD1: SEND_OP_CMD */ -if (!sd->spi) +/* MMC: Powerup & send r3 + * SD: send r1 in spi mode + */ +if (sd->emmc) { +sd_ocr_powerup(sd); +return sd->state == sd_idle_state ? + sd_r3 : sd_r0; +} else if (!sd->spi) { goto bad_cmd; +} sd->state = sd_transfer_state; return sd_r1; -- 2.7.4
[PATCH v2 00/22] eMMC support
Hi, This patch series add support for eMMC cards. This work was previosly submitted by Vincent, rebased few changes on top. Cedric & Joel has helped to added boot partition access support. I expect them to make a follow-up series to use it with aspeed machines. Present series adds eMMC support to Versal SOC. Initial patch series version is RFC Changes for V2: Split Patch 1 Add comments for eMMC Erase commands Added documentation about eMMC and Versal-virt board. Make eMMC optional for xlnx-versal-virt machines Regards, Sai Pavan Cédric Le Goater (2): sd: sdmmc-internal: Add command string for SEND_OP_CMD sd: emmc: Add Extended CSD register definitions Joel Stanley (2): sd: emmc: Support boot area in emmc image sd: emmc: Subtract bootarea size from blk Sai Pavan Boddu (13): sd: sd: Remove usage of tabs in the file sd: emmc: Dont not update CARD_CAPACITY for eMMC cards sd: emmc: Update CMD1 definition for eMMC sd: emmc: support idle state in CMD2 sd: emmc: Add mmc switch function support sd: emmc: add CMD21 tuning sequence sd: emmc: Make ACMD41 illegal for mmc sd: emmc: Add support for emmc erase sd: emmc: Update CID structure for eMMC sd: sdhci: Support eMMC devices arm: xlnx-versal: Add emmc to versal docs: devel: emmc: Add a doc for emmc card emulation docs: arm: xlnx-versal-virt: Add eMMC support documentation Vincent Palatin (5): block: add eMMC block device type sd: emmc: Add support for eMMC cards sd: emmc: update OCR fields for eMMC sd: emmc: Add support for EXT_CSD & CSD for eMMC sd: emmc: Update CMD8 to send EXT_CSD register docs/devel/emmc.txt | 16 + docs/system/arm/xlnx-versal-virt.rst | 10 + hw/sd/sdmmc-internal.h | 97 +++ include/hw/arm/xlnx-versal.h | 1 + include/sysemu/blockdev.h| 1 + blockdev.c | 1 + hw/arm/xlnx-versal-virt.c| 30 +- hw/arm/xlnx-versal.c | 13 +- hw/sd/sd.c | 545 ++- hw/sd/sdhci.c| 4 - hw/sd/sdmmc-internal.c | 2 +- 11 files changed, 571 insertions(+), 149 deletions(-) create mode 100644 docs/devel/emmc.txt -- 2.7.4