On Thu, May 14, 2015 at 05:51:19PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2015-05-14 at 09:39 +0200, Vlastimil Babka wrote:
> > On 05/14/2015 01:38 AM, Benjamin Herrenschmidt wrote:
> > > On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
> > >> Sorry for reviving oldish thread...
>
On Thu, May 14, 2015 at 05:51:19PM +1000, Benjamin Herrenschmidt wrote:
On Thu, 2015-05-14 at 09:39 +0200, Vlastimil Babka wrote:
On 05/14/2015 01:38 AM, Benjamin Herrenschmidt wrote:
On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
Sorry for reviving oldish thread...
Well,
On Thu, 2015-05-14 at 09:39 +0200, Vlastimil Babka wrote:
> On 05/14/2015 01:38 AM, Benjamin Herrenschmidt wrote:
> > On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
> >> Sorry for reviving oldish thread...
> >
> > Well, that's actually appreciated since this is constructive discussion
>
On 05/14/2015 01:38 AM, Benjamin Herrenschmidt wrote:
On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
Sorry for reviving oldish thread...
Well, that's actually appreciated since this is constructive discussion
of the kind I was hoping to trigger initially :-) I'll look at
I hoped
On Thu, 2015-05-14 at 09:39 +0200, Vlastimil Babka wrote:
On 05/14/2015 01:38 AM, Benjamin Herrenschmidt wrote:
On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
Sorry for reviving oldish thread...
Well, that's actually appreciated since this is constructive discussion
of the
On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
> Sorry for reviving oldish thread...
Well, that's actually appreciated since this is constructive discussion
of the kind I was hoping to trigger initially :-) I'll look at
ZONE_MOVABLE, I wasn't aware of its existence.
Don't we still
Sorry for reviving oldish thread...
On 04/28/2015 01:54 AM, Benjamin Herrenschmidt wrote:
On Mon, 2015-04-27 at 11:48 -0500, Christoph Lameter wrote:
On Mon, 27 Apr 2015, Rik van Riel wrote:
Why would we want to avoid the sane approach that makes this thing
work with the fewest required
On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
Sorry for reviving oldish thread...
Well, that's actually appreciated since this is constructive discussion
of the kind I was hoping to trigger initially :-) I'll look at
ZONE_MOVABLE, I wasn't aware of its existence.
Don't we still have
On Tue, Apr 28, 2015 at 09:18:55AM -0500, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Jerome Glisse wrote:
>
> > > is the mechanism that DAX relies on in the VM.
> >
> > Which would require far more changes than you seem to think. First using
> > MIXED|PFNMAP means we lose any kind of
On Mon, 27 Apr 2015, Jerome Glisse wrote:
> > is the mechanism that DAX relies on in the VM.
>
> Which would require far more changes than you seem to think. First using
> MIXED|PFNMAP means we lose any kind of memory accounting and forget about
> memcg too. Second, it means we would need to
On Mon, 27 Apr 2015, Jerome Glisse wrote:
is the mechanism that DAX relies on in the VM.
Which would require far more changes than you seem to think. First using
MIXED|PFNMAP means we lose any kind of memory accounting and forget about
memcg too. Second, it means we would need to set
On Tue, Apr 28, 2015 at 09:18:55AM -0500, Christoph Lameter wrote:
On Mon, 27 Apr 2015, Jerome Glisse wrote:
is the mechanism that DAX relies on in the VM.
Which would require far more changes than you seem to think. First using
MIXED|PFNMAP means we lose any kind of memory
On Mon, 2015-04-27 at 11:48 -0500, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Rik van Riel wrote:
>
> > Why would we want to avoid the sane approach that makes this thing
> > work with the fewest required changes to core code?
>
> Because new ZONEs are a pretty invasive change to the memory
On Mon, Apr 27, 2015 at 02:26:04PM -0500, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Jerome Glisse wrote:
>
> > > We can drop the DAX name and just talk about mapping to external memory if
> > > that confuses the issue.
> >
> > DAX is for direct access block layer (X is for the cool name
On 04/27/2015 03:26 PM, Christoph Lameter wrote:
> DAX is about directly accessing memory. It is made for the purpose of
> serving as a block device for a filesystem right now but it can easily be
> used as a way to map any external memory into a process's space using the
> abstraction of a block
On Mon, 27 Apr 2015, Jerome Glisse wrote:
> > We can drop the DAX name and just talk about mapping to external memory if
> > that confuses the issue.
>
> DAX is for direct access block layer (X is for the cool name factor)
> there is zero code inside DAX that would be useful to us. Because it
>
On Mon, Apr 27, 2015 at 11:51:51AM -0500, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Jerome Glisse wrote:
>
> > > Well lets avoid that. Access to device memory comparable to what the
> > > drivers do today by establishing page table mappings or a generalization
> > > of DAX approaches would
On Mon, 27 Apr 2015, Jerome Glisse wrote:
> > Well lets avoid that. Access to device memory comparable to what the
> > drivers do today by establishing page table mappings or a generalization
> > of DAX approaches would be the most straightforward way of implementing it
> > and would build based
On Mon, 27 Apr 2015, Rik van Riel wrote:
> Why would we want to avoid the sane approach that makes this thing
> work with the fewest required changes to core code?
Because new ZONEs are a pretty invasive change to the memory management and
because there are other ways to handle references to
On Mon, Apr 27, 2015 at 11:17:43AM -0500, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Jerome Glisse wrote:
>
> > > Improvements to the general code would be preferred instead of
> > > having specialized solutions for a particular hardware alone. If the
> > > general code can then handle the
On Mon, 27 Apr 2015, Paul E. McKenney wrote:
> I would instead look on this as a way to try out use of hardware migration
> hints, which could lead to hardware vendors providing similar hints for
> node-to-node migrations. At that time, the benefits could be provided
> all the functionality
On 04/27/2015 12:17 PM, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Jerome Glisse wrote:
>
>>> Improvements to the general code would be preferred instead of
>>> having specialized solutions for a particular hardware alone. If the
>>> general code can then handle the special coprocessor
On Mon, 27 Apr 2015, Jerome Glisse wrote:
> > Improvements to the general code would be preferred instead of
> > having specialized solutions for a particular hardware alone. If the
> > general code can then handle the special coprocessor situation then we
> > avoid a lot of code development.
>
On Mon, Apr 27, 2015 at 10:08:29AM -0500, Christoph Lameter wrote:
> On Sat, 25 Apr 2015, Paul E. McKenney wrote:
>
> > Would you have a URL or other pointer to this code?
>
> linux/mm/migrate.c
Ah, I thought you were calling out something not yet in mainline.
> > > > Without modifying a
On Mon, Apr 27, 2015 at 10:08:29AM -0500, Christoph Lameter wrote:
> On Sat, 25 Apr 2015, Paul E. McKenney wrote:
>
> > Would you have a URL or other pointer to this code?
>
> linux/mm/migrate.c
>
> > > > Without modifying a single line of mm code, the only way to do this is
> > > > to
> > > >
On Sat, 25 Apr 2015, Paul E. McKenney wrote:
> Would you have a URL or other pointer to this code?
linux/mm/migrate.c
> > > Without modifying a single line of mm code, the only way to do this is to
> > > either unmap from the cpu page table the range being migrated or to
> > > mprotect
> > >
On 04/27/2015 03:26 PM, Christoph Lameter wrote:
DAX is about directly accessing memory. It is made for the purpose of
serving as a block device for a filesystem right now but it can easily be
used as a way to map any external memory into a process's space using the
abstraction of a block
On Mon, Apr 27, 2015 at 02:26:04PM -0500, Christoph Lameter wrote:
On Mon, 27 Apr 2015, Jerome Glisse wrote:
We can drop the DAX name and just talk about mapping to external memory if
that confuses the issue.
DAX is for direct access block layer (X is for the cool name factor)
there
On Mon, 2015-04-27 at 11:48 -0500, Christoph Lameter wrote:
On Mon, 27 Apr 2015, Rik van Riel wrote:
Why would we want to avoid the sane approach that makes this thing
work with the fewest required changes to core code?
Because new ZONEs are a pretty invasive change to the memory
On Sat, 25 Apr 2015, Paul E. McKenney wrote:
Would you have a URL or other pointer to this code?
linux/mm/migrate.c
Without modifying a single line of mm code, the only way to do this is to
either unmap from the cpu page table the range being migrated or to
mprotect
it in some
On Mon, Apr 27, 2015 at 10:08:29AM -0500, Christoph Lameter wrote:
On Sat, 25 Apr 2015, Paul E. McKenney wrote:
Would you have a URL or other pointer to this code?
linux/mm/migrate.c
Without modifying a single line of mm code, the only way to do this is
to
either unmap from
On Mon, 27 Apr 2015, Paul E. McKenney wrote:
I would instead look on this as a way to try out use of hardware migration
hints, which could lead to hardware vendors providing similar hints for
node-to-node migrations. At that time, the benefits could be provided
all the functionality relying
On Mon, 27 Apr 2015, Jerome Glisse wrote:
Well lets avoid that. Access to device memory comparable to what the
drivers do today by establishing page table mappings or a generalization
of DAX approaches would be the most straightforward way of implementing it
and would build based on
On Mon, Apr 27, 2015 at 10:08:29AM -0500, Christoph Lameter wrote:
On Sat, 25 Apr 2015, Paul E. McKenney wrote:
Would you have a URL or other pointer to this code?
linux/mm/migrate.c
Ah, I thought you were calling out something not yet in mainline.
Without modifying a single line of
On 04/27/2015 12:17 PM, Christoph Lameter wrote:
On Mon, 27 Apr 2015, Jerome Glisse wrote:
Improvements to the general code would be preferred instead of
having specialized solutions for a particular hardware alone. If the
general code can then handle the special coprocessor situation then
On Mon, 27 Apr 2015, Rik van Riel wrote:
Why would we want to avoid the sane approach that makes this thing
work with the fewest required changes to core code?
Because new ZONEs are a pretty invasive change to the memory management and
because there are other ways to handle references to
On Mon, 27 Apr 2015, Jerome Glisse wrote:
Improvements to the general code would be preferred instead of
having specialized solutions for a particular hardware alone. If the
general code can then handle the special coprocessor situation then we
avoid a lot of code development.
I think
On Mon, Apr 27, 2015 at 11:17:43AM -0500, Christoph Lameter wrote:
On Mon, 27 Apr 2015, Jerome Glisse wrote:
Improvements to the general code would be preferred instead of
having specialized solutions for a particular hardware alone. If the
general code can then handle the special
On Mon, Apr 27, 2015 at 11:51:51AM -0500, Christoph Lameter wrote:
On Mon, 27 Apr 2015, Jerome Glisse wrote:
Well lets avoid that. Access to device memory comparable to what the
drivers do today by establishing page table mappings or a generalization
of DAX approaches would be the most
On Mon, 27 Apr 2015, Jerome Glisse wrote:
We can drop the DAX name and just talk about mapping to external memory if
that confuses the issue.
DAX is for direct access block layer (X is for the cool name factor)
there is zero code inside DAX that would be useful to us. Because it
is all
On Sat, Apr 25, 2015 at 01:32:39PM +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2015-04-24 at 22:32 -0400, Rik van Riel wrote:
> > > The result would be that the kernel would allocate only
> > migratable
> > > pages within the CCAD device's memory, and even then only if
> > >
On Fri, Apr 24, 2015 at 10:49:28AM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Paul E. McKenney wrote:
>
> > can deliver, but where the cost of full-fledge hand tuning cannot be
> > justified.
> >
> > You seem to believe that this latter category is the empty set, which
> > I must
On Fri, Apr 24, 2015 at 03:00:18PM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > > Still no answer as to why is that not possible with the current scheme?
> > > You keep on talking about pointers and I keep on responding that this is a
> > > matter of making
On Fri, Apr 24, 2015 at 11:09:36AM -0400, Jerome Glisse wrote:
> On Fri, Apr 24, 2015 at 07:57:38AM -0700, Paul E. McKenney wrote:
> > On Fri, Apr 24, 2015 at 09:12:07AM -0500, Christoph Lameter wrote:
> > > On Thu, 23 Apr 2015, Paul E. McKenney wrote:
> > >
> > > >
> > > > DAX
> > > >
> > > >
On Fri, Apr 24, 2015 at 11:09:36AM -0400, Jerome Glisse wrote:
On Fri, Apr 24, 2015 at 07:57:38AM -0700, Paul E. McKenney wrote:
On Fri, Apr 24, 2015 at 09:12:07AM -0500, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
DAX
DAX is a mechanism
On Fri, Apr 24, 2015 at 03:00:18PM -0500, Christoph Lameter wrote:
On Fri, 24 Apr 2015, Jerome Glisse wrote:
Still no answer as to why is that not possible with the current scheme?
You keep on talking about pointers and I keep on responding that this is a
matter of making the address
On Fri, Apr 24, 2015 at 10:49:28AM -0500, Christoph Lameter wrote:
On Fri, 24 Apr 2015, Paul E. McKenney wrote:
can deliver, but where the cost of full-fledge hand tuning cannot be
justified.
You seem to believe that this latter category is the empty set, which
I must confess does
On Sat, Apr 25, 2015 at 01:32:39PM +1000, Benjamin Herrenschmidt wrote:
On Fri, 2015-04-24 at 22:32 -0400, Rik van Riel wrote:
The result would be that the kernel would allocate only
migratable
pages within the CCAD device's memory, and even then only if
memory was
On Fri, 2015-04-24 at 22:32 -0400, Rik van Riel wrote:
> > The result would be that the kernel would allocate only
> migratable
> > pages within the CCAD device's memory, and even then only if
> > memory was otherwise exhausted.
>
> Does it make sense to allocate the device's
On 04/21/2015 05:44 PM, Paul E. McKenney wrote:
> AUTONUMA
>
> The Linux kernel's autonuma facility supports migrating both
> memory and processes to promote NUMA memory locality. It was
> accepted into 3.13 and is available in RHEL 7.0 and SLES 12.
> It is enabled by
On Fri, 2015-04-24 at 11:58 -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > > What exactly is the more advanced version's benefit? What are the features
> > > that the other platforms do not provide?
> >
> > Transparent access to device memory from the CPU, you
On Fri, 24 Apr 2015, Jerome Glisse wrote:
> > Still no answer as to why is that not possible with the current scheme?
> > You keep on talking about pointers and I keep on responding that this is a
> > matter of making the address space compatible on both sides.
>
> So if do that in a naive way,
On Fri, Apr 24, 2015 at 01:56:45PM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > > Right this is how things work and you could improve on that. Stay with the
> > > scheme. Why would that not work if you map things the same way in both
> > > environments if both
On Fri, 24 Apr 2015, Jerome Glisse wrote:
> > Right this is how things work and you could improve on that. Stay with the
> > scheme. Why would that not work if you map things the same way in both
> > environments if both accelerator and host processor can access each
> > other's memory?
>
>
On 04/23/2015 07:22 PM, Jerome Glisse wrote:
On Thu, Apr 23, 2015 at 09:20:55AM -0500, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
There are hooks in glibc where you can replace the memory
management of the apps if you want that.
We don't control the app.
On Fri, Apr 24, 2015 at 11:58:39AM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > > What exactly is the more advanced version's benefit? What are the features
> > > that the other platforms do not provide?
> >
> > Transparent access to device memory from the
On Fri, 24 Apr 2015, Jerome Glisse wrote:
> > What exactly is the more advanced version's benefit? What are the features
> > that the other platforms do not provide?
>
> Transparent access to device memory from the CPU, you can map any of the GPU
> memory inside the CPU and have the whole cache
On Fri, Apr 24, 2015 at 11:03:52AM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > On Fri, Apr 24, 2015 at 09:29:12AM -0500, Christoph Lameter wrote:
> > > On Thu, 23 Apr 2015, Jerome Glisse wrote:
> > >
> > > > No this not have been solve properly. Today
On 04/24/2015 10:30 AM, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
>> If by "entire industry" you mean everyone who might want to use hardware
>> acceleration, for example, including mechanical computer-aided design,
>> I am skeptical.
>
> The industry designs GPUs
On 04/24/2015 11:49 AM, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Paul E. McKenney wrote:
>
>> can deliver, but where the cost of full-fledge hand tuning cannot be
>> justified.
>>
>> You seem to believe that this latter category is the empty set, which
>> I must confess does greatly
On Fri, 24 Apr 2015, Jerome Glisse wrote:
> On Fri, Apr 24, 2015 at 09:29:12AM -0500, Christoph Lameter wrote:
> > On Thu, 23 Apr 2015, Jerome Glisse wrote:
> >
> > > No this not have been solve properly. Today solution is doing an explicit
> > > copy and again and again when complex data struct
On Fri, Apr 24, 2015 at 09:30:40AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
> > If by "entire industry" you mean everyone who might want to use hardware
> > acceleration, for example, including mechanical computer-aided design,
> > I am skeptical.
>
> The
On 04/24/2015 10:01 AM, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
>>> As far as I know Jerome is talking about HPC loads and high performance
>>> GPU processing. This is the same use case.
>>
>> The difference is sensitivity to latency. You have latency-sensitive
On Fri, 24 Apr 2015, Paul E. McKenney wrote:
> > DAX is a mechanism to access memory not managed by the kernel and is the
> > successor to XIP. It just happens to be needed for persistent memory.
> > Fundamentally any driver can provide an MMAPPed interface to allow access
> > to a devices
On Fri, 24 Apr 2015, Paul E. McKenney wrote:
> can deliver, but where the cost of full-fledge hand tuning cannot be
> justified.
>
> You seem to believe that this latter category is the empty set, which
> I must confess does greatly surprise me.
If there are already compromises are being made
On Fri, Apr 24, 2015 at 07:57:38AM -0700, Paul E. McKenney wrote:
> On Fri, Apr 24, 2015 at 09:12:07AM -0500, Christoph Lameter wrote:
> > On Thu, 23 Apr 2015, Paul E. McKenney wrote:
> >
> > >
> > > DAX
> > >
> > > DAX is a mechanism for providing direct-memory access to
> > > high-speed
On Fri, Apr 24, 2015 at 09:29:12AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Jerome Glisse wrote:
>
> > No this not have been solve properly. Today solution is doing an explicit
> > copy and again and again when complex data struct are involve (list, tree,
> > ...) this is extremely
On Fri, Apr 24, 2015 at 09:12:07AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
> >
> > DAX
> >
> > DAX is a mechanism for providing direct-memory access to
> > high-speed non-volatile (AKA "persistent") memory. Good
> > introductions to DAX may be
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
> If by "entire industry" you mean everyone who might want to use hardware
> acceleration, for example, including mechanical computer-aided design,
> I am skeptical.
The industry designs GPUs with super fast special ram and accelerators
with special
On Thu, 23 Apr 2015, Jerome Glisse wrote:
> No this not have been solve properly. Today solution is doing an explicit
> copy and again and again when complex data struct are involve (list, tree,
> ...) this is extremely tedious and hard to debug. So today solution often
> restrict themself to easy
On Fri, Apr 24, 2015 at 09:01:47AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
> > > As far as I know Jerome is talking about HPC loads and high performance
> > > GPU processing. This is the same use case.
> >
> > The difference is sensitivity to latency.
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
> DAX
>
> DAX is a mechanism for providing direct-memory access to
> high-speed non-volatile (AKA "persistent") memory. Good
> introductions to DAX may be found in the following LWN
> articles:
DAX is a mechanism to access
On Thu, 23 Apr 2015, Jerome Glisse wrote:
> The numa code we have today for CPU case exist because it does make
> a difference but you keep trying to restrict GPU user to a workload
> that is specific. Go talk to people doing physic, biology, data
> mining, CAD most of them do not care about
On Thu, 23 Apr 2015, Austin S Hemmelgarn wrote:
Looking at this whole conversation, all I see is two different views on how to
present the asymmetric multiprocessing arrangements that have become
commonplace in today's systems to userspace. Your model favors performance,
while CAPI favors
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
> > As far as I know Jerome is talking about HPC loads and high performance
> > GPU processing. This is the same use case.
>
> The difference is sensitivity to latency. You have latency-sensitive
> HPC workloads, and Jerome is talking about HPC
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
If by entire industry you mean everyone who might want to use hardware
acceleration, for example, including mechanical computer-aided design,
I am skeptical.
The industry designs GPUs with super fast special ram and accelerators
with special ram
On Fri, Apr 24, 2015 at 09:01:47AM -0500, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
As far as I know Jerome is talking about HPC loads and high performance
GPU processing. This is the same use case.
The difference is sensitivity to latency. You have
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
As far as I know Jerome is talking about HPC loads and high performance
GPU processing. This is the same use case.
The difference is sensitivity to latency. You have latency-sensitive
HPC workloads, and Jerome is talking about HPC workloads
On Thu, 23 Apr 2015, Jerome Glisse wrote:
The numa code we have today for CPU case exist because it does make
a difference but you keep trying to restrict GPU user to a workload
that is specific. Go talk to people doing physic, biology, data
mining, CAD most of them do not care about latency.
On Thu, 23 Apr 2015, Jerome Glisse wrote:
No this not have been solve properly. Today solution is doing an explicit
copy and again and again when complex data struct are involve (list, tree,
...) this is extremely tedious and hard to debug. So today solution often
restrict themself to easy
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
DAX
DAX is a mechanism for providing direct-memory access to
high-speed non-volatile (AKA persistent) memory. Good
introductions to DAX may be found in the following LWN
articles:
DAX is a mechanism to access memory not
On Fri, Apr 24, 2015 at 09:30:40AM -0500, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
If by entire industry you mean everyone who might want to use hardware
acceleration, for example, including mechanical computer-aided design,
I am skeptical.
The industry
On Fri, 2015-04-24 at 11:58 -0500, Christoph Lameter wrote:
On Fri, 24 Apr 2015, Jerome Glisse wrote:
What exactly is the more advanced version's benefit? What are the features
that the other platforms do not provide?
Transparent access to device memory from the CPU, you can map any
On Fri, 24 Apr 2015, Paul E. McKenney wrote:
DAX is a mechanism to access memory not managed by the kernel and is the
successor to XIP. It just happens to be needed for persistent memory.
Fundamentally any driver can provide an MMAPPed interface to allow access
to a devices memory.
I
On Fri, 24 Apr 2015, Jerome Glisse wrote:
On Fri, Apr 24, 2015 at 09:29:12AM -0500, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Jerome Glisse wrote:
No this not have been solve properly. Today solution is doing an explicit
copy and again and again when complex data struct are involve
On 04/24/2015 11:49 AM, Christoph Lameter wrote:
On Fri, 24 Apr 2015, Paul E. McKenney wrote:
can deliver, but where the cost of full-fledge hand tuning cannot be
justified.
You seem to believe that this latter category is the empty set, which
I must confess does greatly surprise me.
If
On Fri, Apr 24, 2015 at 09:29:12AM -0500, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Jerome Glisse wrote:
No this not have been solve properly. Today solution is doing an explicit
copy and again and again when complex data struct are involve (list, tree,
...) this is extremely tedious
On Fri, Apr 24, 2015 at 07:57:38AM -0700, Paul E. McKenney wrote:
On Fri, Apr 24, 2015 at 09:12:07AM -0500, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
DAX
DAX is a mechanism for providing direct-memory access to
high-speed non-volatile (AKA
On Fri, Apr 24, 2015 at 09:12:07AM -0500, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
DAX
DAX is a mechanism for providing direct-memory access to
high-speed non-volatile (AKA persistent) memory. Good
introductions to DAX may be found in the
On 04/24/2015 10:01 AM, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
As far as I know Jerome is talking about HPC loads and high performance
GPU processing. This is the same use case.
The difference is sensitivity to latency. You have latency-sensitive
HPC
On Fri, 24 Apr 2015, Paul E. McKenney wrote:
can deliver, but where the cost of full-fledge hand tuning cannot be
justified.
You seem to believe that this latter category is the empty set, which
I must confess does greatly surprise me.
If there are already compromises are being made then
On 04/21/2015 05:44 PM, Paul E. McKenney wrote:
AUTONUMA
The Linux kernel's autonuma facility supports migrating both
memory and processes to promote NUMA memory locality. It was
accepted into 3.13 and is available in RHEL 7.0 and SLES 12.
It is enabled by the
On Fri, 2015-04-24 at 22:32 -0400, Rik van Riel wrote:
The result would be that the kernel would allocate only
migratable
pages within the CCAD device's memory, and even then only if
memory was otherwise exhausted.
Does it make sense to allocate the device's page tables