Ok. Thanks for the information. Looking at cq.c in the mlx4 driver, it seems to be using a doorbell for the CQE. I think the CPU does some kind of uncacheable read from the PCI device to detect completions. Can someone please confirm this? There is so little information available online :(.
--Anuj

On Fri, Jan 24, 2014 at 12:53 AM, Dong Zhang <[email protected]> wrote:
> One thing I am sure of is that a PCIE device can write to host RAM without
> notifying the CPU (of course bus arbitration against the CPU is necessary),
> but only if the device supports bus mastering DMA, or, as it is otherwise
> named, third party DMA. The device uses PCIE address space to kick off the
> transport; the address is translated by the PCIE RC to host RAM, or, if
> there is an IOMMU, the IOMMU does a second translation after the RC's
> translation, into a host RAM physical address.
>
> Dong
>
> -----Original Message-----
> From: Anuj Kalia [mailto:[email protected]]
> Sent: January 24, 2014 13:35
> To: Dong Zhang
> Cc: [email protected]
> Subject: Re: Question about CPU-HCA interaction
>
> This is pretty late, but I'm pretty sure that the entire WQE is written via
> PIO to the HCA. Here are some references.
>
> 1. Performance analysis of ConnectX
> (http://nowlab.cse.ohio-state.edu/publications/conf-presentations/2007/surs-hoti07.pdf).
> It says that in ConnectX, PIO is used to write the WQE (along with the
> inlined payload) to the HCA.
>
> 2. http://comments.gmane.org/gmane.linux.drivers.openib/61682
> discusses doing PIO to BlueFlame pages.
>
> I had a related question. How is the CQE pushed from the HCA to the CPU? Is
> it a DMA write? Or can PCI devices write to system memory too (just as CPUs
> can write to the PCIe device's memory)?
>
> Anuj
>
>
>
> On Mon, Jan 13, 2014 at 9:08 PM, Dong Zhang <[email protected]> wrote:
>> Actually the first one is widely used by many PCIE IO cards. A similar
>> approach is used by SAS/FC HBAs, where there is also a similar queue pair
>> existing in both driver memory space and PCIE address space: one consumer
>> queue and one producer queue. In the PCIE BAR space there is a register
>> named "producer index", which is updated by the host driver when the upper
>> layer generates a new IO request and puts it into the producer queue in
>> the driver memory space. This register update causes an interrupt to the
>> PCIE card's IO controller, which then DMAs from the producer queue in host
>> driver memory to fetch the IO request and handle it, then DMA-writes a new
>> item to the driver's consumer queue to inform the driver that this IO is
>> done.
>>
>> Hope that helps.
>>
>> Dong
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Anuj Kalia
>> Sent: January 13, 2014 17:04
>> To: [email protected]
>> Subject: Question about CPU-HCA interaction
>>
>> Hi.
>>
>> I wanted to understand the PCIe usage while issuing Infiniband verbs.
>> When posting a verb to a queue pair, how is the request descriptor written
>> to the HCA? IMO, there are two options for this:
>>
>> 1. CPU prepares the descriptor in local memory. After preparing the
>> descriptor, it writes the location of the descriptor to a hardwired register
>> on the HCA (this process is called "ringing the doorbell"?).
>> The HCA then reads the descriptor via DMA.
>>
>> 2. The CPU writes the entire descriptor to the HCA's memory via PCIe MMIO.
>> Then, it rings the doorbell to alert the HCA.
>>
>> To me, the first one makes more sense, but I'm not sure. It would be great
>> if someone could tell more about this.
>>
>> Thanks for your help.
>>
>> --Anuj
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
>> in the body of a message to [email protected] More majordomo
>> info at http://vger.kernel.org/majordomo-info.html
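
For anyone reading this thread later, the producer-index scheme Dong described on Jan 13 can be modelled in a few lines. This is only a toy simulation in plain Python, not driver code: the class names (HostMemory, Device, Driver), the ring size, and the "done:" completion format are all made up for illustration, DMA is stood in for by ordinary list accesses, and it says nothing about how mlx4 or any real HCA lays out its queues.

```python
from dataclasses import dataclass, field

RING_SIZE = 4  # hypothetical queue depth, chosen only for the example


@dataclass
class HostMemory:
    """Driver-owned RAM holding both rings; the device touches it only by 'DMA'."""
    producer_queue: list = field(default_factory=lambda: [None] * RING_SIZE)
    consumer_queue: list = field(default_factory=lambda: [None] * RING_SIZE)


class Device:
    """Stand-in for the I/O card. write_producer_index models a BAR register."""

    def __init__(self, host):
        self.host = host
        self.fetched = 0  # number of requests the device has handled so far

    def write_producer_index(self, value):
        """MMIO write from the driver (the doorbell); kicks the device."""
        while self.fetched != value:
            slot = self.fetched % RING_SIZE
            request = self.host.producer_queue[slot]            # simulated DMA read
            self.host.consumer_queue[slot] = f"done:{request}"  # simulated DMA write
            self.fetched += 1


class Driver:
    def __init__(self, host, device):
        self.host = host
        self.device = device
        self.produced = 0
        self.consumed = 0

    def post(self, request):
        """Put a request in host memory, then ring the doorbell."""
        assert self.produced - self.consumed < RING_SIZE, "ring full"
        self.host.producer_queue[self.produced % RING_SIZE] = request
        self.produced += 1
        self.device.write_producer_index(self.produced)

    def poll(self):
        """Look for a completion in host memory; no read from the device."""
        slot = self.consumed % RING_SIZE
        entry = self.host.consumer_queue[slot]
        if entry is None:
            return None
        self.host.consumer_queue[slot] = None  # mark the slot free again
        self.consumed += 1
        return entry


if __name__ == "__main__":
    host = HostMemory()
    driver = Driver(host, Device(host))
    driver.post("read-A")
    print(driver.poll())
```

The point of the sketch is that the driver only ever does one small write to the device per request, and finds completions by looking at its own memory. In this toy model the device completes requests in order, so a completion reuses the slot index of its request.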
