Re: [PATCH v3 4/20] PCI/P2PDMA: introduce helpers for dma_map_sg implementations

2021-09-29 Thread Jason Gunthorpe via iommu
On Wed, Sep 29, 2021 at 05:00:43PM -0600, Logan Gunthorpe wrote:
> On 2021-09-29 4:46 p.m., Jason Gunthorpe wrote:
> > On Wed, Sep 29, 2021 at 03:30:42PM -0600, Logan Gunthorpe wrote:
> >> On 2021-09-28 4:05 p.m., Jason Gunthorpe wrote:
> >> No, that's not a correct reading of the code. Every time there is a new
> >> pagemap, this code calculates the mapping type and bus offset. If a page
> >> comes along with a different pagemap, it recalculates. This just
> >> reduces the overhead so that the calculation is done only every time a
> >> page with a different pgmap comes along and not doing it for every
> >> single page.
> > 
> > Each 'struct scatterlist *sg' refers to a range of contiguous pfns
> > starting at page_to_pfn(sg_page()) and going for approx sg->length/PAGE_SIZE
> > pfns long.
> > 
> 
> Ugh, right. A bit contrived for consecutive pages to have different
> pgmaps and still be next to each other in a DMA transaction. But I guess
> it is technically possible and should be protected against.

I worry it is something a hostile userspace could cook up using mmap
and use to cause some kind of kernel integrity problem.

> > @@ -470,7 +470,8 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
> >  
> > /* Merge contiguous pages into the last SG */
> > prv_len = sgt_append->prv->length;
> > -   while (n_pages && page_to_pfn(pages[0]) == paddr) {
> > +   while (n_pages && page_to_pfn(pages[0]) == paddr &&
> > +  sg_page(sgt_append->prv)->pgmap == pages[0]->pgmap) {
> 
> I don't think it's correct to use pgmap without first checking if it is
> a zone device page. But your point is taken. I'll try to address this.

Yes

Jason


Re: [PATCH v3 4/20] PCI/P2PDMA: introduce helpers for dma_map_sg implementations

2021-09-29 Thread Logan Gunthorpe

On 2021-09-29 4:46 p.m., Jason Gunthorpe wrote:
> On Wed, Sep 29, 2021 at 03:30:42PM -0600, Logan Gunthorpe wrote:
>> On 2021-09-28 4:05 p.m., Jason Gunthorpe wrote:
>> No, that's not a correct reading of the code. Every time there is a new
>> pagemap, this code calculates the mapping type and bus offset. If a page
>> comes along with a different pagemap, it recalculates. This just
>> reduces the overhead so that the calculation is done only every time a
>> page with a different pgmap comes along and not doing it for every
>> single page.
> 
> Each 'struct scatterlist *sg' refers to a range of contiguous pfns
> starting at page_to_pfn(sg_page()) and going for approx sg->length/PAGE_SIZE
> pfns long.
> 

Ugh, right. A bit contrived for consecutive pages to have different
pgmaps and still be next to each other in a DMA transaction. But I guess
it is technically possible and should be protected against.

> sg_page() returns the first page, but nothing says that sg_page()+1
> has the same pgmap.
> 
> The code in this patch only checks the first page of each sg in a
> larger sgl.
> 
>>> At least sg_alloc_append_table_from_pages() and probably something in
>>> the block world should be updated to not combine struct pages with
>>> different pgmaps, and this should be documented in scatterlist.*
>>> someplace.
>>
>> There's no sane place to do this check. The code is designed to support
>> mappings with different pgmaps.
> 
> All places that generate compound sg's by aggregating multiple pages
> need to include this check alongside the check for physical
> contiguity. There are not that many places but
> sg_alloc_append_table_from_pages() is one of them:

Yes. The block layer also does this. I believe a check in
page_is_mergeable() will be sufficient there.

> @@ -470,7 +470,8 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
>  
> /* Merge contiguous pages into the last SG */
> prv_len = sgt_append->prv->length;
> -   while (n_pages && page_to_pfn(pages[0]) == paddr) {
> +   while (n_pages && page_to_pfn(pages[0]) == paddr &&
> +  sg_page(sgt_append->prv)->pgmap == pages[0]->pgmap) {

I don't think it's correct to use pgmap without first checking if it is
a zone device page. But your point is taken. I'll try to address this.
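
For the record, the guarded form could look roughly like this (a sketch
with a made-up helper name, not the final patch):

/*
 * Sketch only: two pages are mergeable w.r.t. pgmaps if neither is a
 * ZONE_DEVICE page, or both are and they share the same pgmap.  The
 * pgmap field is only meaningful for ZONE_DEVICE pages, hence the guard.
 */
static inline bool pages_have_same_pgmap(const struct page *a,
					 const struct page *b)
{
	if (is_zone_device_page(a) != is_zone_device_page(b))
		return false;
	if (!is_zone_device_page(a))
		return true;
	return a->pgmap == b->pgmap;
}

The same shape of check would apply in both merge paths of
sg_alloc_append_table_from_pages() and in the block layer merge check
mentioned above.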

Logan


Re: [PATCH v3 4/20] PCI/P2PDMA: introduce helpers for dma_map_sg implementations

2021-09-29 Thread Jason Gunthorpe via iommu
On Wed, Sep 29, 2021 at 03:30:42PM -0600, Logan Gunthorpe wrote:
> On 2021-09-28 4:05 p.m., Jason Gunthorpe wrote:
> > On Thu, Sep 16, 2021 at 05:40:44PM -0600, Logan Gunthorpe wrote:
> > 
> >> +enum pci_p2pdma_map_type
> >> +pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev,
> >> + struct scatterlist *sg)
> >> +{
> >> +  if (state->pgmap != sg_page(sg)->pgmap) {
> >> +  state->pgmap = sg_page(sg)->pgmap;
> > 
> > This has built into it an assumption that every page in the sg element
> > has the same pgmap, but AFAIK nothing enforces this rule? There is no
> > requirement that the HW has pfn gaps between the pgmaps linux decides
> > to create over it.
> 
> No, that's not a correct reading of the code. Every time there is a new
> pagemap, this code calculates the mapping type and bus offset. If a page
> comes along with a different pagemap, it recalculates. This just
> reduces the overhead so that the calculation is done only every time a
> page with a different pgmap comes along and not doing it for every
> single page.

Each 'struct scatterlist *sg' refers to a range of contiguous pfns
starting at page_to_pfn(sg_page()) and going for approx sg->length/PAGE_SIZE
pfns long.

sg_page() returns the first page, but nothing says that sg_page()+1
has the same pgmap.

The code in this patch only checks the first page of each sg in a
larger sgl.
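
For illustration, actually verifying the property the patch relies on
would mean walking every pfn in the entry, e.g. (sketch only, with a
made-up helper name):

/*
 * Illustration only: does every page covered by this sg entry share
 * the pgmap of its first page?  The patch assumes this, but nothing
 * currently enforces it.
 */
static bool sg_entry_has_uniform_pgmap(struct scatterlist *sg)
{
	struct page *first = sg_page(sg);
	unsigned int npages = DIV_ROUND_UP(sg->offset + sg->length,
					   PAGE_SIZE);
	unsigned int i;

	for (i = 1; i < npages; i++) {
		struct page *page = nth_page(first, i);

		if (is_zone_device_page(page) != is_zone_device_page(first))
			return false;
		if (is_zone_device_page(page) && page->pgmap != first->pgmap)
			return false;
	}
	return true;
}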

> > At least sg_alloc_append_table_from_pages() and probably something in
> > the block world should be updated to not combine struct pages with
> > different pgmaps, and this should be documented in scatterlist.*
> > someplace.
> 
> There's no sane place to do this check. The code is designed to support
> mappings with different pgmaps.

All places that generate compound sg's by aggregating multiple pages
need to include this check alongside the check for physical
contiguity. There are not that many places but
sg_alloc_append_table_from_pages() is one of them:

@@ -470,7 +470,8 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
 
/* Merge contiguous pages into the last SG */
prv_len = sgt_append->prv->length;
-   while (n_pages && page_to_pfn(pages[0]) == paddr) {
+   while (n_pages && page_to_pfn(pages[0]) == paddr &&
+  sg_page(sgt_append->prv)->pgmap == pages[0]->pgmap) {
if (sgt_append->prv->length + PAGE_SIZE > max_segment)
break;
sgt_append->prv->length += PAGE_SIZE;
@@ -488,7 +489,8 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
for (i = 1; i < n_pages; i++) {
seg_len += PAGE_SIZE;
if (seg_len >= max_segment ||
-   page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
+   page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1 ||
+   pages[i]->pgmap != pages[i - 1]->pgmap) {
chunks++;
seg_len = 0;
}
@@ -505,9 +507,10 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
seg_len += PAGE_SIZE;
if (seg_len >= max_segment ||
page_to_pfn(pages[j]) !=
-   page_to_pfn(pages[j - 1]) + 1)
+   page_to_pfn(pages[j - 1]) + 1 ||
+   pages[j]->pgmap != pages[j - 1]->pgmap)
break;
}
 
/* Pass how many chunks might be left */
s = get_next_sg(sgt_append, s, chunks - i + left_pages,




Re: [PATCH v3 4/20] PCI/P2PDMA: introduce helpers for dma_map_sg implementations

2021-09-29 Thread Logan Gunthorpe

On 2021-09-28 4:05 p.m., Jason Gunthorpe wrote:
> On Thu, Sep 16, 2021 at 05:40:44PM -0600, Logan Gunthorpe wrote:
> 
>> +enum pci_p2pdma_map_type
>> +pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev,
>> +   struct scatterlist *sg)
>> +{
>> +if (state->pgmap != sg_page(sg)->pgmap) {
>> +state->pgmap = sg_page(sg)->pgmap;
> 
> This has built into it an assumption that every page in the sg element
> has the same pgmap, but AFAIK nothing enforces this rule? There is no
> requirement that the HW has pfn gaps between the pgmaps linux decides
> to create over it.

No, that's not a correct reading of the code. Every time there is a new
pagemap, this code calculates the mapping type and bus offset. If a page
comes along with a different pagemap, it recalculates. This just
reduces the overhead so that the calculation is done only every time a
page with a different pgmap comes along and not doing it for every
single page.
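
In other words, the pattern is a small cache keyed on the pgmap, roughly
along these lines (a paraphrase only; the names below are placeholders,
not the patch's identifiers):

struct p2pdma_state_sketch {
	struct dev_pagemap *pgmap;		/* last pgmap seen */
	enum pci_p2pdma_map_type map_type;
	u64 bus_off;
};

static enum pci_p2pdma_map_type
sketch_map_segment(struct p2pdma_state_sketch *state, struct device *dev,
		   struct scatterlist *sg)
{
	if (state->pgmap != sg_page(sg)->pgmap) {
		/*
		 * New pgmap: recompute and cache the mapping type and
		 * bus offset for @dev here (elided in this sketch), so
		 * consecutive sg entries with the same pgmap cost only
		 * the pointer compare above.
		 */
		state->pgmap = sg_page(sg)->pgmap;
	}

	return state->map_type;
}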

> At least sg_alloc_append_table_from_pages() and probably something in
> the block world should be updated to not combine struct pages with
> different pgmaps, and this should be documented in scatterlist.*
> someplace.

There's no sane place to do this check. The code is designed to support
mappings with different pgmaps.

Logan


Re: [PATCH v3 4/20] PCI/P2PDMA: introduce helpers for dma_map_sg implementations

2021-09-28 Thread Jason Gunthorpe via iommu
On Thu, Sep 16, 2021 at 05:40:44PM -0600, Logan Gunthorpe wrote:

> +enum pci_p2pdma_map_type
> +pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev,
> +struct scatterlist *sg)
> +{
> + if (state->pgmap != sg_page(sg)->pgmap) {
> + state->pgmap = sg_page(sg)->pgmap;

This has built into it an assumption that every page in the sg element
has the same pgmap, but AFAIK nothing enforces this rule? There is no
requirement that the HW has pfn gaps between the pgmaps linux decides
to create over it.

At least sg_alloc_append_table_from_pages() and probably something in
the block world should be updated to not combine struct pages with
different pgmaps, and this should be documented in scatterlist.*
someplace.
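
The documentation side could be as small as a note in
include/linux/scatterlist.h, something like (suggested wording only):

/*
 * Suggested wording only: all pages covered by a single scatterlist
 * entry must either be normal pages or ZONE_DEVICE pages sharing one
 * pgmap.  Code that merges physically contiguous pages into a single
 * entry, such as sg_alloc_append_table_from_pages(), must not merge
 * across a pgmap boundary.
 */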

Jason