Re: CMA enhancement - non-default areas in x86

2020-05-13 Thread gre...@linuxfoundation.org
On Wed, May 13, 2020 at 09:43:45AM +, Ravich, Leonid wrote:
> > On Wed, May 13, 2020 at 08:29:16AM +, Ravich, Leonid wrote:
> > > PCIe NTB
> > > Documentation/driver-api/ntb.rst
> > 
> > > 1) Basically a PCI bridge between two root complexes / PCI switches
> > > 2) Using out-of-OS memory is one solution, but then this memory is
> > > limited for usage by other stacks, e.g. get_user_pages() on this memory
> > > will fail. Therefore attempting to use it for the block layer with
> > > O_DIRECT will fail.
> > >
> > > Actually, any generic stack which attempts to "pin" this memory will fail.
> > 
> > So why isn't the BIOS/UEFI properly reserving this from the general
> > operating system's pages so that the driver knows to use them instead?
> > 
> > Is UEFI wrong here about these being valid memory ranges for general use?
> > If so, why not fix that?  If not, how in the world is the OS supposed to
> > know these memory ranges are _not_ for general use?  I feel like there is
> > something missing here...
> >
> Maybe I am misunderstanding something here, but if BIOS/UEFI reserves
> these pages, they will be "out of kernel", which will work for a
> proprietary driver, but this memory will not be usable by a generic
> driver which attempts to pin it with get_user_pages().
> So we can go and try to fix that (not sure this is the right way).

What do you mean by "proprietary" driver vs. "generic" driver?

Shouldn't there be some "generic" way that UEFI tells any driver where
these memory locations are that can not be used as general memory?  If
not, try fixing up UEFI for that.

> Another option here is to use some kernel infrastructure which on the one
> hand reserves the memory from general use, while on the other hand the
> kernel is aware of these pages, so get_user_pages() will work on this
> memory.
> 
> From what we saw, the CMA infrastructure can support such requirements.

CMA needs to be told where to reserve the memory at boot time.  If you
want to use that, great, but something has to tell it, so perhaps just
get that info from UEFI as that is the "equivalent" to a device tree,
right?
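
For reference, on x86 the boot-time hint usually arrives on the kernel
command line rather than from a device tree. The sketch below is a minimal
userspace illustration, assuming the documented
cma=nn[MG]@[start[MG][-end[MG]]] syntax. parse_size() and parse_cma_param()
are hypothetical helper names made up for this example, not kernel
functions, and the kernel does its own parsing. Note that this parameter
only sizes and places the single global area, so it does not by itself
dedicate memory to one device.

```python
import re

# Binary multipliers for the K/M/G suffixes used in this sketch.
_SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a string such as '20G' or '0x1000' into a byte count."""
    m = re.fullmatch(r"(0x[0-9a-f]+|\d+)([kmg]?)", text.strip(), re.IGNORECASE)
    if m is None:
        raise ValueError(f"not a size: {text!r}")
    digits, suffix = m.groups()
    base = 16 if digits.lower().startswith("0x") else 10
    return int(digits, base) * _SUFFIX[suffix.upper()]


def parse_cma_param(value):
    """Split 'size[@start[-end]]' into (size, start, end).

    Parts that are not given come back as None.
    """
    size_part, _, range_part = value.partition("@")
    start = end = None
    if range_part:
        start_part, _, end_part = range_part.partition("-")
        start = parse_size(start_part)
        if end_part:
            end = parse_size(end_part)
    return parse_size(size_part), start, end
```

With this sketch, parse_cma_param("20G@4G") describes a 20 GiB area placed
at the 4 GiB boundary, and parse_cma_param("512M") leaves the placement
unspecified.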

Try it all out and see, all of this is pointless without real patches,
which is why we almost never have these kinds of discussions without
working code.

thanks,

greg k-h


RE: CMA enhancement - non-default areas in x86

2020-05-13 Thread Ravich, Leonid
> A: http://en.wikipedia.org/wiki/Top_post
> Q: Where do I find info about this thing called top-posting?
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
> 
> A: No.
> Q: Should I include quotations after my reply?
> 
> http://daringfireball.net/2007/07/on_top

Sorry, bad habit.
 
> On Wed, May 13, 2020 at 08:29:16AM +, Ravich, Leonid wrote:
> > PCIe NTB
> > Documentation/driver-api/ntb.rst
> 
> > 1) Basically a PCI bridge between two root complexes / PCI switches
> > 2) Using out-of-OS memory is one solution, but then this memory is
> > limited for usage by other stacks, e.g. get_user_pages() on this memory
> > will fail. Therefore attempting to use it for the block layer with
> > O_DIRECT will fail.
> >
> > Actually, any generic stack which attempts to "pin" this memory will fail.
> 
> So why isn't the BIOS/UEFI properly reserving this from the general operating
> system's pages so that the driver knows to use them instead?
> 
> Is UEFI wrong here about these being valid memory ranges for general use?
> If so, why not fix that?  If not, how in the world is the OS supposed to know
> these memory ranges are _not_ for general use?  I feel like there is
> something missing here...
>
Maybe I am misunderstanding something here, but if BIOS/UEFI reserves
these pages, they will be "out of kernel", which will work for a
proprietary driver, but this memory will not be usable by a generic
driver which attempts to pin it with get_user_pages().
So we can go and try to fix that (not sure this is the right way).

Another option here is to use some kernel infrastructure which on the one
hand reserves the memory from general use, while on the other hand the
kernel is aware of these pages, so get_user_pages() will work on this
memory.

From what we saw, the CMA infrastructure can support such requirements.
Please let me know if you think I am missing something here.

Thanks, and sorry for the format mess.
> thanks,
> 
> greg k-h


Re: CMA enhancement - non-default areas in x86

2020-05-13 Thread gre...@linuxfoundation.org
A: http://en.wikipedia.org/wiki/Top_post
Q: Where do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Wed, May 13, 2020 at 08:29:16AM +, Ravich, Leonid wrote:
> PCIe NTB 
> Documentation/driver-api/ntb.rst

> 1) Basically a PCI bridge between two root complexes / PCI switches
> 2) Using out-of-OS memory is one solution, but then this memory is
> limited for usage by other stacks, e.g. get_user_pages() on this memory
> will fail. Therefore attempting to use it for the block layer with
> O_DIRECT will fail.
>
> Actually, any generic stack which attempts to "pin" this memory will fail.

So why isn't the BIOS/UEFI properly reserving this from the general
operating system's pages so that the driver knows to use them instead?

Is UEFI wrong here about these being valid memory ranges for general
use?  If so, why not fix that?  If not, how in the world is the OS
supposed to know these memory ranges are _not_ for general use?  I feel
like there is something missing here...
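
One way to see which ranges the firmware has reported as not for general
use is to look at /proc/iomem on the running system. The following is a
rough sketch only: the sample text is invented for illustration, the
function names are hypothetical, and it assumes the usual
"start-end : label" line layout with indented lines for nested resources.

```python
# Made-up excerpt in the style of /proc/iomem; real contents differ per machine.
SAMPLE = """\
00000000-00000fff : Reserved
00001000-0009ffff : System RAM
000a0000-000fffff : Reserved
00100000-bffdffff : System RAM
  01000000-01e00000 : Kernel code
bffe0000-bfffffff : Reserved
"""


def parse_iomem(text):
    """Return (start, end, label) tuples for the top-level lines of the text."""
    ranges = []
    for line in text.splitlines():
        if not line or line[0].isspace():
            continue  # skip blank lines and nested (indented) child resources
        span, sep, label = line.partition(" : ")
        if not sep:
            continue  # not a "start-end : label" line
        start, _, end = span.partition("-")
        ranges.append((int(start, 16), int(end, 16), label.strip()))
    return ranges


def reserved_ranges(ranges):
    """Keep only the ranges whose label is 'Reserved' (any capitalisation)."""
    return [r for r in ranges if r[2].lower() == "reserved"]
```

On a real machine the text would come from reading /proc/iomem, which
generally needs root privileges to show real addresses.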

thanks,

greg k-h


RE: CMA enhancement - non-default areas in x86

2020-05-13 Thread Ravich, Leonid
PCIe NTB 
Documentation/driver-api/ntb.rst

1) Basically a PCI bridge between two root complexes / PCI switches
2) Using out-of-OS memory is one solution, but then this memory is
limited for usage by other stacks, e.g. get_user_pages() on this memory
will fail. Therefore attempting to use it for the block layer with
O_DIRECT will fail.

Actually, any generic stack which attempts to "pin" this memory will fail.

Leonid Ravich 
> -Original Message-
> From: gre...@linuxfoundation.org 
> Sent: Wednesday, May 13, 2020 10:14 AM
> To: Idgar, Or
> Cc: linux-kernel@vger.kernel.org; linux...@kvack.org; Ravich, Leonid
> Subject: Re: CMA enhancement - non-default areas in x86
> 
> On Wed, May 13, 2020 at 07:00:12AM +, Idgar, Or wrote:
> > > For what type of device?
> > NTB (Non-Transparent Bridge).
> 
> 
> Very odd quoting style...
> 
> Anyway, what exactly is a non-transparent bridge, and why doesn't your
> bios/uefi implementation properly reserve the memory for it so that the OS
> does not use it?
> 
> thanks,
> 
> greg k-h


Re: CMA enhancement - non-default areas in x86

2020-05-13 Thread gre...@linuxfoundation.org
On Wed, May 13, 2020 at 07:00:12AM +, Idgar, Or wrote:
> > For what type of device?
> NTB (Non-Transparent Bridge).


Very odd quoting style...

Anyway, what exactly is a non-transparent bridge, and why doesn't your
bios/uefi implementation properly reserve the memory for it so that the
OS does not use it?

thanks,

greg k-h


RE: CMA enhancement - non-default areas in x86

2020-05-13 Thread Idgar, Or
> For what type of device?
NTB (Non-Transparent Bridge).

-Original Message-
From: gre...@linuxfoundation.org  
Sent: Wednesday, May 13, 2020 09:48
To: Idgar, Or
Cc: linux-kernel@vger.kernel.org; linux...@kvack.org; Ravich, Leonid
Subject: Re: CMA enhancement - non-default areas in x86


On Wed, May 13, 2020 at 06:13:55AM +, Idgar, Or wrote:
> Hi,
> I'm working with the Linux kernel on x86 and need a way to allocate a very
> large contiguous memory region (around 20GB) for DMA operations.

For what type of device?

> I've found out that CMA is one of the major ways to do so, but our problem is 
> that CMA's default behavior is to create one default area from which all 
> devices can allocate memory.
> When booting, some drivers allocated memory for DMA and used CMA memory
> if it existed. The problem is that this takes memory that we need for our
> device, and we want to make sure this area is dedicated to our device.
> 
> As I saw, the only way to reserve a dedicated area is by enabling 
> OF_RESERVED_MEM which is available for several architectures but excluding 
> x86 (and as far as I understand relies on device tree which is not in use 
> with x86 or at least cannot be configured with OF_RESERVED_MEM).
> 
> I really want to leverage this mechanism/API and thought about modifying the 
> code (and hopefully merge it upstream) so multiple non-default areas will be 
> available for x86 and with a way to consume it by mapping specific area to 
> specific device.
> 
> Is it something that will be open for merging if written properly?

We always will be glad to review patches, no need to ask us about that.
Just post them!

good luck,

greg k-h


Re: CMA enhancement - non-default areas in x86

2020-05-13 Thread gre...@linuxfoundation.org
On Wed, May 13, 2020 at 06:13:55AM +, Idgar, Or wrote:
> Hi,
> I'm working with the Linux kernel on x86 and need a way to allocate a very
> large contiguous memory region (around 20GB) for DMA operations.

For what type of device?

> I've found out that CMA is one of the major ways to do so, but our problem is 
> that CMA's default behavior is to create one default area from which all 
> devices can allocate memory.
> When booting, some drivers allocated memory for DMA and used CMA memory
> if it existed. The problem is that this takes memory that we need for our
> device, and we want to make sure this area is dedicated to our device.
> 
> As I saw, the only way to reserve a dedicated area is by enabling 
> OF_RESERVED_MEM which is available for several architectures but excluding 
> x86 (and as far as I understand relies on device tree which is not in use 
> with x86 or at least cannot be configured with OF_RESERVED_MEM).
> 
> I really want to leverage this mechanism/API and thought about modifying the 
> code (and hopefully merge it upstream) so multiple non-default areas will be 
> available for x86 and with a way to consume it by mapping specific area to 
> specific device.
> 
> Is it something that will be open for merging if written properly?

We always will be glad to review patches, no need to ask us about that.
Just post them!

good luck,

greg k-h
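
The idea in the original message, several non-default areas with a
specific area mapped to a specific device, can be modelled in a few lines.
This is only a toy userspace sketch under stated assumptions: Area and
AreaRegistry are hypothetical names, the addresses are invented, and the
bump allocator stands in for the real CMA allocator, which works quite
differently. It shows just the lookup rule: a device with a dedicated area
allocates from it, everything else falls back to the default area.

```python
from dataclasses import dataclass

GIB = 1 << 30


@dataclass
class Area:
    name: str
    base: int      # start address in bytes (made-up values in this sketch)
    size: int      # length in bytes
    used: int = 0  # bytes already handed out

    def alloc(self, nbytes):
        """Hand out the next nbytes of the area, or None if it does not fit."""
        if self.used + nbytes > self.size:
            return None
        addr = self.base + self.used
        self.used += nbytes
        return addr


class AreaRegistry:
    """Maps device names to dedicated areas, with one shared default area."""

    def __init__(self, default):
        self.default = default
        self.by_device = {}

    def assign(self, device, area):
        self.by_device[device] = area

    def alloc(self, device, nbytes):
        # Devices without a dedicated area share the default one.
        area = self.by_device.get(device, self.default)
        return area.alloc(nbytes)
```

With a 20 GiB area assigned to a hypothetical "ntb0" device, allocations
made by other drivers at boot come out of the default area and cannot eat
into the dedicated one, which is the property asked for above.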