On Wed 31-05-17 23:35:48, Pasha Tatashin wrote:
> >OK, so why cannot we make zero_struct_page 8x 8B stores, other arches
> >would do memset. You said it would be slower but would that be
> >measurable? I am sorry to be so persistent here but I would be really
> >happier if this didn't depend on
OK, so why cannot we make zero_struct_page 8x 8B stores, other arches
would do memset. You said it would be slower but would that be
measurable? I am sorry to be so persistent here but I would be really
happier if this didn't depend on the deferred initialization. If this is
absolutely a no-go
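
For readers following the thread, the "8x 8B stores" idea can be sketched in userspace C. This is only an illustration: `page_stub` is a hypothetical 64-byte stand-in (the real struct page layout differs), and the function names are made up here, not taken from any posted patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-byte stand-in for struct page; not the kernel layout. */
struct page_stub {
    uint64_t words[8];
};

/* Zero one entry with eight explicit 8-byte stores (assumed variant). */
static inline void zero_page_stub_stores(struct page_stub *p)
{
    uint64_t *w = p->words;

    w[0] = 0; w[1] = 0; w[2] = 0; w[3] = 0;
    w[4] = 0; w[5] = 0; w[6] = 0; w[7] = 0;
}

/* Generic fallback that other architectures would keep using. */
static inline void zero_page_stub_memset(struct page_stub *p)
{
    memset(p, 0, sizeof(*p));
}
```

Both variants leave the entry in the same state; whether the explicit stores are measurably slower or faster than memset() on a given architecture is exactly the open question in the thread.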
From: Michal Hocko
Date: Wed, 31 May 2017 18:31:31 +0200
> On Tue 30-05-17 13:16:50, Pasha Tatashin wrote:
>> >Could you be more specific? E.g. how are other stores done in
>> >__init_single_page safe then? I am sorry to be dense here but how does
>> >the full 64B store differ from other stores
On Tue 30-05-17 13:16:50, Pasha Tatashin wrote:
> >Could you be more specific? E.g. how are other stores done in
> >__init_single_page safe then? I am sorry to be dense here but how does
> >the full 64B store differ from other stores done in the same function.
>
> Hi Michal,
>
> It is safe to do
Could you be more specific? E.g. how are other stores done in
__init_single_page safe then? I am sorry to be dense here but how does
the full 64B store differ from other stores done in the same function.
Hi Michal,
It is safe to do regular 8-byte and smaller stores (stx, st, sth, stb)
without
On Fri 26-05-17 12:45:55, Pasha Tatashin wrote:
> Hi Michal,
>
> I have considered your proposals:
>
> 1. Making memset(0) unconditional inside __init_single_page() is not going
> to work because it slows down SPARC, and ppc64. On SPARC even the BSTI
> optimization that I have proposed earlier
Hi Michal,
I have considered your proposals:
1. Making memset(0) unconditional inside __init_single_page() is not
going to work because it slows down SPARC, and ppc64. On SPARC even the
BSTI optimization that I have proposed earlier won't work, because after
consulting with other engineers I
On Fri, 2017-05-12 at 13:37 -0400, David Miller wrote:
> > Right now it is larger, but what I suggested is to add a new optimized
> > routine just for this case, which would do STBI for 64-bytes but
> > without membar (do membar at the end of memmap_init_zone() and
> > deferred_init_memmap()
> >
On Mon 15-05-17 16:44:26, Pasha Tatashin wrote:
> On 05/15/2017 03:38 PM, Michal Hocko wrote:
> >I do not think this is the right approach. Your measurements just show
> >that sparc could have a more optimized memset for small sizes. If you
> >keep the same memset only for the parallel
On 05/15/2017 03:38 PM, Michal Hocko wrote:
On Mon 15-05-17 14:12:10, Pasha Tatashin wrote:
Hi Michal,
After looking at your suggested memblock_virt_alloc_core() change again, I
decided to keep what I have. I do not want to inline
memblock_virt_alloc_internal(), because it is not a
On Mon 15-05-17 14:12:10, Pasha Tatashin wrote:
> Hi Michal,
>
> After looking at your suggested memblock_virt_alloc_core() change again, I
> decided to keep what I have. I do not want to inline
> memblock_virt_alloc_internal(), because it is not a performance critical
> path, and by inlining it
Hi Michal,
After looking at your suggested memblock_virt_alloc_core() change again,
I decided to keep what I have. I do not want to inline
memblock_virt_alloc_internal(), because it is not a performance critical
path, and by inlining it we will unnecessarily increase the text size on
all
From: Pasha Tatashin
Date: Fri, 12 May 2017 13:24:52 -0400
> Right now it is larger, but what I suggested is to add a new optimized
> routine just for this case, which would do STBI for 64-bytes but
> without membar (do membar at the end of memmap_init_zone() and
> deferred_init_memmap()
>
>
On 05/12/2017 12:57 PM, David Miller wrote:
From: Pasha Tatashin
Date: Thu, 11 May 2017 16:59:33 -0400
We should either keep memset() only for deferred struct pages as what
I have in my patches.
Another option is to add a new function struct_page_clear() which
would default to memset() and
From: Pasha Tatashin
Date: Thu, 11 May 2017 16:59:33 -0400
> We should either keep memset() only for deferred struct pages as what
> I have in my patches.
>
> Another option is to add a new function struct_page_clear() which
> would default to memset() and to something else on platforms that
>
From: Pasha Tatashin
Date: Thu, 11 May 2017 16:47:05 -0400
> So, moving memset() into __init_single_page() benefits Intel. I am
> actually surprised why memset() is so slow on intel when it is called
> from memblock. But, hurts SPARC, I guess these membars at the end of
> memset() kills the
We should either keep memset() only for deferred struct pages as what I
have in my patches.
Another option is to add a new function struct_page_clear() which would
default to memset() and to something else on platforms that decide to
optimize it.
On SPARC it would call STBIs, and we would
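
The struct_page_clear() proposal above (default to memset(), let a platform override it) could look roughly like the following userspace sketch. The override mechanism shown, an `#ifndef` guard that an architecture header would pre-empt with its own definition, is an assumption for illustration; `page_stub2` is a hypothetical stand-in for struct page.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-byte stand-in for struct page; not the kernel layout. */
struct page_stub2 {
    uint64_t words[8];
};

/*
 * An architecture that wants an optimized version would provide its own
 * function and "#define struct_page_clear struct_page_clear" before this
 * point (assumed convention); everyone else gets the memset() default.
 */
#ifndef struct_page_clear
static inline void struct_page_clear(struct page_stub2 *p)
{
    memset(p, 0, sizeof(*p));
}
#endif
```

With this shape, callers always use struct_page_clear() and never need a "zero" flag; only the platform header decides how the 64 bytes get cleared.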
Have you measured that? I do not think it would be super hard to
measure. I would be quite surprised if this added much if anything at
all as the whole struct page should be in the cache line already. We do
set reference count and other struct members. Almost nobody should be
looking at our page
From: Michal Hocko
Date: Thu, 11 May 2017 10:05:38 +0200
> Anyway, do you agree that doing the struct page initialization along
> with other writes to it shouldn't add a measurable overhead comparing
> to pre-zeroing of larger block of struct pages? We already have an
> exclusive cache line and
On Wed 10-05-17 11:19:43, David S. Miller wrote:
> From: Michal Hocko
> Date: Wed, 10 May 2017 16:57:26 +0200
>
> > Have you measured that? I do not think it would be super hard to
> > measure. I would be quite surprised if this added much if anything at
> > all as the whole struct page should
On Wed, May 10, 2017 at 02:00:26PM -0400, David Miller wrote:
> From: Matthew Wilcox
> Date: Wed, 10 May 2017 10:17:03 -0700
> > On Wed, May 10, 2017 at 11:19:43AM -0400, David Miller wrote:
> >> I guess it might be clearer if you understand what the block
> >> initializing stores do on sparc64.
From: Matthew Wilcox
Date: Wed, 10 May 2017 10:17:03 -0700
> On Wed, May 10, 2017 at 11:19:43AM -0400, David Miller wrote:
>> From: Michal Hocko
>> Date: Wed, 10 May 2017 16:57:26 +0200
>>
>> > Have you measured that? I do not think it would be super hard to
>> > measure. I would be quite
On Wed, May 10, 2017 at 11:19:43AM -0400, David Miller wrote:
> From: Michal Hocko
> Date: Wed, 10 May 2017 16:57:26 +0200
>
> > Have you measured that? I do not think it would be super hard to
> > measure. I would be quite surprised if this added much if anything at
> > all as the whole struct
From: Pasha Tatashin
Date: Wed, 10 May 2017 11:01:40 -0400
> Perhaps you are right, and I will measure on x86. But, I suspect the hit
> can become unacceptable on some platforms: there is an overhead of
> calling a function, even if it is leaf-optimized, and there is an
> overhead in memset() to
From: Michal Hocko
Date: Wed, 10 May 2017 16:57:26 +0200
> Have you measured that? I do not think it would be super hard to
> measure. I would be quite surprised if this added much if anything at
> all as the whole struct page should be in the cache line already. We do
> set reference count and
On 05/10/2017 10:57 AM, Michal Hocko wrote:
On Wed 10-05-17 09:42:22, Pasha Tatashin wrote:
Well, I didn't object to this particular part. I was mostly concerned
about
http://lkml.kernel.org/r/1494003796-748672-4-git-send-email-pasha.tatas...@oracle.com
and the "zero" argument for other
On Wed 10-05-17 09:42:22, Pasha Tatashin wrote:
> >
> >Well, I didn't object to this particular part. I was mostly concerned
> >about
> >http://lkml.kernel.org/r/1494003796-748672-4-git-send-email-pasha.tatas...@oracle.com
> >and the "zero" argument for other functions. I guess we can do without
>
Well, I didn't object to this particular part. I was mostly concerned
about
http://lkml.kernel.org/r/1494003796-748672-4-git-send-email-pasha.tatas...@oracle.com
and the "zero" argument for other functions. I guess we can do without
that. I _think_ that we should simply _always_ initialize the
On Tue 09-05-17 14:54:50, Pasha Tatashin wrote:
[...]
> >The implementation just looks too large to what I would expect. E.g. do
> >we really need to add zero argument to the large part of the memblock
> >API? Wouldn't it be easier to simply export memblock_virt_alloc_internal
> >(or its tiny
Hi Michal,
I like the idea of postponing the zeroing from the allocation to the
init time. To be honest the improvement looks much larger than I would
expect (Btw. this should be a part of the changelog rather than an
outside link).
The improvements are larger, because this time was never
On Fri 05-05-17 13:03:07, Pavel Tatashin wrote:
> Changelog:
> v2 - v3
> - Addressed David's comments about one change per patch:
> * Split changes to platforms into 4 patches
> * Made "do not zero vmemmap_buf" as a separate patch
> v1 - v2
> -
Changelog:
v2 - v3
- Addressed David's comments about one change per patch:
* Split changes to platforms into 4 patches
* Made "do not zero vmemmap_buf" as a separate patch
v1 - v2
- Per request, added s390 to deferred "struct page"