On Wed, Mar 30, 2022 at 01:15:58PM +0100, Jonathan Cameron wrote:
> On Tue, 29 Mar 2022 12:53:51 -0700
> Davidlohr Bueso <d...@stgolabs.net> wrote:
> 
> > On Tue, 29 Mar 2022, Adam Manzanares wrote:
> > >> +typedef struct cxl_device_state {
> > >> +    MemoryRegion device_registers;
> > >> +
> > >> +    /* mmio for device capabilities array - 8.2.8.2 */
> > >> +    MemoryRegion device;
> > >> +    MemoryRegion caps;
> > >> +
> > >> +    /* mmio for the mailbox registers 8.2.8.4 */
> > >> +    MemoryRegion mailbox;
> > >> +
> > >> +    /* memory region for persistent memory, HDM */
> > >> +    uint64_t pmem_size;  
> > >
> > >Can we switch this to mem_size and drop the persistent comment? It is my
> > >understanding that HDM is independent of persistence.  
> > 
> > Agreed, but ideally both volatile and persistent capacities would have been
> > supported in this patchset. I'm also probably missing specific reasons as to
> > why this isn't the case.
> 
> Whilst it doesn't add a huge amount of complexity it does add some
> and the software paths in Linux we were developing this for are pmem focused.
> Hence volatile is on the todo list rather than in this first patch set.
> Not sensible to aim for feature complete in one go.

Makes complete sense. We can help with the Linux development for the volatile 
side. I will add a couple of folks on cc. In addition, we would like to help
the CXL ecosystem in general so I anticipate we will have more reviews and 
patches for CXL in general.

> 
> > 
> > Looking at it briefly could it be just a matter of adding to cxl_type3_dev
> > a new hostmem along with it's AddressSpace for the volatile? If so, I'm
> > thinking something along these lines:
> > 
> > @@ -123,8 +123,8 @@ typedef struct cxl_device_state {
> >      uint64_t host_set;
> >       } timestamp;
> > 
> > -    /* memory region for persistent memory, HDM */
> > -    uint64_t pmem_size;
> > +    /* memory region for persistent and volatile memory, HDM */
> > +    uint64_t pmem_size, mem_size;
> >   } CXLDeviceState;
> > 
> >   /* Initialize the register block for a device */
> > @@ -235,9 +235,9 @@ typedef struct cxl_type3_dev {
> >       PCIDevice parent_obj;
> > 
> >       /* Properties */
> > -    AddressSpace hostmem_as;
> > +    AddressSpace hostmem_as, hostmemv_as;
> >       uint64_t size;
> > -    HostMemoryBackend *hostmem;
> > +    HostMemoryBackend *hostmem, *hostmemv;
> >       HostMemoryBackend *lsa;
> >       uint64_t sn;
> > 
> > Then for cxl_setup_memory(), with ct3d->hostmem and/or ct3d->hostmemv
> > non-nil, set the respective MemoryRegions:
> > 
> > +    if (ct3d->hostmem) {
> > +            memory_region_set_nonvolatile(mr, true);
> > +            memory_region_set_enabled(mr, true);
> > +            host_memory_backend_set_mapped(ct3d->hostmem, true);
> > +            address_space_init(&ct3d->hostmem_as, mr, name);
> > +            ct3d->cxl_dstate.pmem_size = ct3d->hostmem->size;
> > +    }
> > +    if (ct3d->hostmemv) {
> > +            memory_region_set_nonvolatile(mrv, false);
> > +            memory_region_set_enabled(mrv, true);
> > +            host_memory_backend_set_mapped(ct3d->hostmemv, true);
> > +            address_space_init(&ct3d->hostmem_as, mrv, name);
> > +            ct3d->cxl_dstate.pmem_size = ct3d->hostmem->size;
> > +    }
> > 
> > For corresponding MB commands, it's mostly IDENTIFY_MEMORY_DEVICE that needs
> > updating:
> > 
> > @@ -281,7 +281,7 @@ static ret_code cmd_identify_memory_device(struct 
> > cxl_cmd *cmd,
> > 
> >       CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
> >       CXLType3Class *cvc = CXL_TYPE3_DEV_GET_CLASS(ct3d);
> > -    uint64_t size = cxl_dstate->pmem_size;
> > +    uint64_t size = cxl_dstate->pmem_size + cxl_dstate->mem_size;
> > 
> >       if (!QEMU_IS_ALIGNED(size, 256 << 20)) {
> >      return CXL_MBOX_INTERNAL_ERROR;
> > @@ -290,11 +290,11 @@ static ret_code cmd_identify_memory_device(struct 
> > cxl_cmd *cmd,
> >       id = (void *)cmd->payload;
> >       memset(id, 0, sizeof(*id));
> > 
> > -    /* PMEM only */
> >       snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
> > 
> >       id->total_capacity = size / (256 << 20);
> > -    id->persistent_capacity = size / (256 << 20);
> > +    id->persistent_capacity = cxl_dstate->pmem_size / (256 << 20);
> > +    id->volatile_capacity = cxl_dstate->mem_size / (256 << 20);
> >       id->lsa_size = cvc->get_lsa_size(ct3d);
> > 
> >       *len = sizeof(*id);
> > @@ -312,16 +312,16 @@ static ret_code cmd_ccls_get_partition_info(struct 
> > cxl_cmd *cmd,
> >      uint64_t next_pmem;
> >       } QEMU_PACKED *part_info = (void *)cmd->payload;
> >       QEMU_BUILD_BUG_ON(sizeof(*part_info) != 0x20);
> > -    uint64_t size = cxl_dstate->pmem_size;
> > +    uint64_t psize = cxl_dstate->pmem_size;
> > +    uint64_t vsize = cxl_dstate->mem_size;
> > 
> > -    if (!QEMU_IS_ALIGNED(size, 256 << 20)) {
> > +    if (!QEMU_IS_ALIGNED(psize + vsize, 256 << 20)) {
> >      return CXL_MBOX_INTERNAL_ERROR;
> >       }
> > 
> > -    /* PMEM only */
> > -    part_info->active_vmem = 0;
> > -    part_info->next_vmem = 0;
> > -    part_info->active_pmem = size / (256 << 20);
> > +    part_info->active_vmem = vsize / (256 << 20);
> > +    part_info->next_vmem = part_info->active_vmem;
> > +    part_info->active_pmem = psize / (256 << 20);
> >       part_info->next_pmem = part_info->active_pmem;
> > 
> > Then for reads/writes, both cxl_type3_write() and _read() would, after 
> > computing the dpa_offset,
> > first try the volatile region then upon error attempt the same in the 
> > persistent memory - this
> > assuming the DPA space is consistent among both types of memory. Ie:
> > 
> > address_space_read(&ct3d->hostmemv_as, dpa_offset, attrs, data, size)
> > or
> > address_space_read(&ct3d->hostmem_as, dpa_offset, attrs, data, size)
> > 
> > ... but then again all this is probably just wishful thinking.
> 
> Without looking in detail, will indeed be something along those lines.
> Gets more fiddly if you include partitioning control that Alison was 
> interested
> in adding.
> 
> Also, we probably need to support multiple HDM decoders.  Also not a huge
> complexity to add, but left for now as main focus is to get the base
> patch set merged.
> 
> So I'm happy to queue stuff up on top of this series and carry it forward
> but I don't want to add features to what we try to merge initially.
> This set is already huge and hard to review even with what think is a
> minimum set of features to be useful.
> 
> Note I'm already carrying a few features on top if this on the gitlab
> branch gitlab.com/jic23/qemu (DOE + CDAT and serial numbers) and
> have a few other things out of tree for now (SPDM, emulating most
> of the PCI Config space controls). 
> 

Thanks for the updates. Do you have any suggestions on how to coordinate 
efforts? Ideally we can have a list of features that need to be developed and
some names of people that will lead the work. 

> Jonathan
> 
> > 
> > Thanks,
> > Davidlohr
> 

Reply via email to