Re: PCIe coherency in spec (was: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available)
On Wed, 2024-07-03 at 23:11 -0700, Christoph Hellwig wrote:
> On Thu, Jul 04, 2024 at 10:00:52AM +0800, Icenowy Zheng wrote:
> > So I here want to ask a question as an individual hacker: what's
> > the policy of linux-pci towards these non-coherent PCIe
> > implementations?
> >
> > If the sentences of Christian is right, these implementations are
> > just out-of-spec, should them get purged out of the kernel, or at
> > least raising a warning that some HW won't work because of
> > inconformant implementation?
>
> Nothing in the PCIe specifications that mandates a programming model.
> Non-coherent DMA is extremely common in lower end devices, and
> despite all the issues that it causes well supported in Linux.
>
> What are you trying to solve?

Currently the DRM TTM subsystem (and the GPU drivers using it) assumes
coherency and fails on these non-coherent systems with cryptic error
messages (such as `[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring
gfx test failed (-110)`) that do not mention coherency at all.

My original patchset tries to solve this by making the TTM subsystem
aware of the coherency status (and by preventing CPU-side cached
mappings when the system is non-coherent). The TTM maintainer objected,
however, and said that TTM ignoring non-coherent systems is
intentional.
Re: PCIe coherency in spec (was: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available)
On Wed, 2024-07-03 at 16:08 -0500, Bjorn Helgaas wrote:
> On Wed, Jul 03, 2024 at 04:52:30PM +0800, Jiaxun Yang wrote:
> > On July 2, 2024 at 6:03 PM, Jiaxun Yang wrote:
> > > On July 2, 2024 at 5:27 PM, Christian König wrote:
> > > > On 02.07.24 at 11:06, Icenowy Zheng wrote:
> > > > > [SNIP] However I don't think the definition of the AGP spec
> > > > > could apply on all PCI(e) implementations. The AGP spec
> > > > > itself don't apply on implementations that do not implement
> > > > > AGP (which is the most PCI(e) implementations today), and
> > > > > it's not in the reference list of the PCIe spec, so it does
> > > > > no help on this context.
> > > > No, exactly that is not correct.
> > > >
> > > > See as I explained the No-Snoop extension to PCIe was created
> > > > to help with AGP support and later merged into the base PCIe
> > > > specification.
> > > >
> > > > So the AGP spec is now part of the PCIe spec.
> >
> > Hi Bjorn & linux-pci folks,
> >
> > It seems like we have some disputes on interpretation of PCIe
> > specification.
> >
> > We are seeking your expertise on the question: Does PCIe
> > specification mandate Cache coherency via snoop?
>
> I'm not qualified to opine on this. I'd say it's a question for the
> PCI SIG protocol workgroup. https://forum.pcisig.com/ is a place to
> start.

Sorry for the disturbance. As an individual hacker, I am not eligible
to become a PCI-SIG member and join the discussion there.

So I want to ask a question here as an individual hacker: what is the
policy of linux-pci towards these non-coherent PCIe implementations?

If Christian's statements are right, these implementations are simply
out-of-spec. Should they be purged from the kernel, or should the
kernel at least raise a warning that some hardware won't work because
of the non-conformant implementation?

>
> Bjorn
Re: PCIe coherency in spec (was: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available)
On Wed, Jul 03, 2024 at 04:52:30PM +0800, Jiaxun Yang wrote:
> On July 2, 2024 at 6:03 PM, Jiaxun Yang wrote:
> > On July 2, 2024 at 5:27 PM, Christian König wrote:
> >> On 02.07.24 at 11:06, Icenowy Zheng wrote:
> >>> [SNIP] However I don't think the definition of the AGP spec could
> >>> apply on all PCI(e) implementations. The AGP spec itself don't
> >>> apply on implementations that do not implement AGP (which is the
> >>> most PCI(e) implementations today), and it's not in the reference
> >>> list of the PCIe spec, so it does no help on this context.
> >> No, exactly that is not correct.
> >>
> >> See as I explained the No-Snoop extension to PCIe was created to
> >> help with AGP support and later merged into the base PCIe
> >> specification.
> >>
> >> So the AGP spec is now part of the PCIe spec.
>
> Hi Bjorn & linux-pci folks,
>
> It seems like we have some disputes on interpretation of PCIe
> specification.
>
> We are seeking your expertise on the question: Does PCIe
> specification mandate Cache coherency via snoop?

I'm not qualified to opine on this. I'd say it's a question for the
PCI SIG protocol workgroup. https://forum.pcisig.com/ is a place to
start.

Bjorn
