On 08/24/2017 01:44 PM, Mel Gorman wrote:
> On Thu, Aug 24, 2017 at 11:16:15AM -0700, Linus Torvalds wrote:
>> On Thu, Aug 24, 2017 at 10:49 AM, Tim Chen
>> wrote:
>>>
>>> These changes look fine. We are testing them now.
>>> Does the second patch in the series look okay to you?
>>
>> I didn't
On Thu, Aug 24, 2017 at 11:16:15AM -0700, Linus Torvalds wrote:
> On Thu, Aug 24, 2017 at 10:49 AM, Tim Chen wrote:
> >
> > These changes look fine. We are testing them now.
> > Does the second patch in the series look okay to you?
>
> I didn't really have any reaction to that one, as long as
On Thu, Aug 24, 2017 at 10:49 AM, Tim Chen wrote:
>
> These changes look fine. We are testing them now.
> Does the second patch in the series look okay to you?
I didn't really have any reaction to that one, as long as Mel is
ok with it, I'm fine with it.
Linus
On 08/23/2017 04:30 PM, Linus Torvalds wrote:
> On Wed, Aug 23, 2017 at 11:17 AM, Linus Torvalds
> wrote:
>> On Wed, Aug 23, 2017 at 8:58 AM, Tim Chen wrote:
>>>
>>> Will you still consider the original patch as a fail safe mechanism?
>>
>> I don't think we have much choice, although I would
On Wed, Aug 23, 2017 at 11:17 AM, Linus Torvalds
wrote:
> On Wed, Aug 23, 2017 at 8:58 AM, Tim Chen wrote:
>>
>> Will you still consider the original patch as a fail safe mechanism?
>
> I don't think we have much choice, although I would *really* want to
> get this root-caused rather than just
>
> On Wed, Aug 23, 2017 at 8:58 AM, Tim Chen
> wrote:
> >
> > Will you still consider the original patch as a fail safe mechanism?
>
> I don't think we have much choice, although I would *really* want to get this
> root-caused rather than just papering over the symptoms.
>
> Maybe still worth
On Wed, Aug 23, 2017 at 8:58 AM, Tim Chen wrote:
>
> Will you still consider the original patch as a fail safe mechanism?
I don't think we have much choice, although I would *really* want to
get this root-caused rather than just papering over the symptoms.
Maybe still worth testing that
On Tue, Aug 22, 2017 at 11:19:12AM -0700, Linus Torvalds wrote:
> On Tue, Aug 22, 2017 at 10:23 AM, Liang, Kan wrote:
> >
> > Although the patch doesn't trigger watchdog, the spin lock wait time
> > is not small (0.45s).
> > It may get worse again on larger systems.
>
> Yeah, I don't think Mel's
On 08/23/2017 07:49 AM, Liang, Kan wrote:
>> On Tue, Aug 22, 2017 at 12:55 PM, Liang, Kan wrote:
>>>
So I propose testing the attached trivial patch.
>>>
>>> It doesn’t work.
>>> The call stack is the same.
>>
>> So I would have expected the stack trace to be the same, and I would even
>>
> Subject: Re: [PATCH 1/2] sched/wait: Break up long wake list walk
>
> On Tue, Aug 22, 2017 at 2:24 PM, Andi Kleen wrote:
> >
> > I believe in this case it's used by threads, so a reference count
> > limit wouldn't help.
>
> For the first migration try, y
> On Tue, Aug 22, 2017 at 12:55 PM, Liang, Kan wrote:
> >
> >> So I propose testing the attached trivial patch.
> >
> > It doesn’t work.
> > The call stack is the same.
>
> So I would have expected the stack trace to be the same, and I would even
> expect the CPU usage to be fairly similar,
On Tue, Aug 22, 2017 at 3:52 PM, Linus Torvalds
wrote:
>
> The *other* memory policies look fairly sane. They basically have a
> fairly well-defined preferred node for the policy (although the
> "MPOL_INTERLEAVE" looks wrong for a hugepage). But
> MPOL_PREFERRED/MPOL_F_LOCAL really looks
On Tue, Aug 22, 2017 at 2:24 PM, Andi Kleen wrote:
>
> I believe in this case it's used by threads, so a reference count limit
> wouldn't help.
For the first migration try, yes. But if it's some kind of "try and
try again" pattern, the second time you try and there are people
waiting for the
On Tue, Aug 22, 2017 at 04:08:52PM -0500, Christopher Lameter wrote:
> On Tue, 22 Aug 2017, Andi Kleen wrote:
>
> > We only see it on 4S+ today. But systems are always getting larger,
> > so what's a large system today, will be a normal medium scale system
> > tomorrow.
> >
> > BTW we also
On Tue, 22 Aug 2017, Andi Kleen wrote:
> We only see it on 4S+ today. But systems are always getting larger,
> so what's a large system today, will be a normal medium scale system
> tomorrow.
>
> BTW we also collected PT traces for the long hang cases, but it was
> hard to find a consistent
On Tue, Aug 22, 2017 at 1:53 PM, Peter Zijlstra wrote:
>
> So _the_ problem with yield() is when you hit this with a RT task it
> will busy spin and possibly not allow the task that actually has the
> lock to make progress at all.
I thought we had explicitly defined yield() to not do that.
But
On Tue, Aug 22, 2017 at 01:42:13PM -0700, Linus Torvalds wrote:
> +void wait_on_page_bit_or_yield(struct page *page, int bit_nr)
> +{
> +	if (PageWaiters(page)) {
> +		yield();
> +		return;
> +	}
> +	wait_on_page_bit(page, bit_nr);
> +}
So _the_ problem with
On Tue, Aug 22, 2017 at 12:55 PM, Liang, Kan wrote:
>
>> So I propose testing the attached trivial patch.
>
> It doesn’t work.
> The call stack is the same.
So I would have expected the stack trace to be the same, and I would
even expect the CPU usage to be fairly similar, because you'd see
> So I propose testing the attached trivial patch.
It doesn’t work.
The call stack is the same.
100.00% (821af140)
|
---wait_on_page_bit
__migration_entry_wait
migration_entry_wait
do_swap_page
> > Still, generating such a migration storm would be fairly tricky I think.
>
> Well, Mel seems to have been unable to generate a load that reproduces
> the long page waitqueues. And I don't think we've had any other
> reports of this either.
It could be that it requires a fairly large system.
On Tue, Aug 22, 2017 at 12:08 PM, Peter Zijlstra wrote:
>
> So that migration stuff has a filter on, we need two consecutive numa
> faults from the same page_cpupid 'hash', see
> should_numa_migrate_memory().
Hmm. That is only called for MPOL_F_MORON.
We don't actually know what policy the
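For readers following along, here is a minimal userspace model of the two-consecutive-fault filter Peter describes. It is only an illustration: the names and the main() driver are invented here, and the kernel's should_numa_migrate_memory() applies further checks beyond this.

#include <stdbool.h>
#include <stdio.h>

/* Toy model of a page's "who faulted on me last" record. */
struct page_model {
	int last_cpupid;	/* -1 means no fault recorded yet */
};

/*
 * Migrate only when the previous recorded fault and the current one come
 * from the same cpupid "hash", so a single stray access from another node
 * does not trigger a migration on its own.
 */
static bool should_migrate(struct page_model *page, int this_cpupid)
{
	int last = page->last_cpupid;

	page->last_cpupid = this_cpupid;	/* record the current fault */
	return last == this_cpupid;		/* second hit in a row wins */
}

int main(void)
{
	struct page_model page = { .last_cpupid = -1 };

	printf("fault from 3: migrate=%d\n", should_migrate(&page, 3)); /* 0 */
	printf("fault from 5: migrate=%d\n", should_migrate(&page, 5)); /* 0 */
	printf("fault from 5: migrate=%d\n", should_migrate(&page, 5)); /* 1 */
	return 0;
}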
On Tue, Aug 22, 2017 at 11:56 AM, Peter Zijlstra wrote:
>
> Won't we now prematurely terminate the wait when we get a spurious
> wakeup?
I think there's two answers to that:
(a) do we even care?
(b) what spurious wakeup?
The "do we even care" quesiton is because wait_on_page_bit by
On Tue, Aug 22, 2017 at 11:19:12AM -0700, Linus Torvalds wrote:
> And since everybody touches it, as a result everybody eventually
> thinks that page should be migrated to their NUMA node.
So that migration stuff has a filter on, we need two consecutive numa
faults from the same page_cpupid
On Tue, Aug 22, 2017 at 11:19:12AM -0700, Linus Torvalds wrote:
> diff --git a/mm/filemap.c b/mm/filemap.c
> index a49702445ce0..75c29a1f90fb 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -991,13 +991,11 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
>
On Tue, Aug 22, 2017 at 11:19 AM, Linus Torvalds
wrote:
>
> So I propose testing the attached trivial patch. It may not do
> anything at all. But the existing code is actually doing extra work
> just to be fragile, in case the scenario above can happen.
Side note: the patch compiles for me. But
On Tue, Aug 22, 2017 at 10:23 AM, Liang, Kan wrote:
>
> Although the patch doesn't trigger watchdog, the spin lock wait time
> is not small (0.45s).
> It may get worse again on larger systems.
Yeah, I don't think Mel's patch is great - because I think we could do
so much better.
What I like
> > Covering both paths would be something like the patch below which
> > spins until the page is unlocked or it should reschedule. It's not
> > even boot tested as I spent what time I had on the test case that I
> > hoped would be able to prove it really works.
>
> I will give it a try.
> > Because that code sequence doesn't actually depend on
> > "wait_on_page_lock()" for _correctness_ anyway, afaik. Anybody who
> > does "migration_entry_wait()" _has_ to retry anyway, since the page
> > table contents may have changed by waiting.
> >
> > So I'm not proud of the attached patch,
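A small standalone sketch of the retry argument quoted above: a fault handler that hits a migration entry only needs the wait as a throttle, because it has to re-read the page table and retry the fault afterwards in any case. Every name below is illustrative, not kernel API.

#include <stdbool.h>
#include <stdio.h>

enum fault_result { FAULT_DONE, FAULT_RETRY };

struct pte_model {
	bool is_migration_entry;	/* page is being migrated right now */
};

/* Models the wait on the migration entry: block until migration finishes. */
static void wait_for_migration(struct pte_model *pte)
{
	pte->is_migration_entry = false;
}

static enum fault_result handle_fault(struct pte_model *pte)
{
	if (pte->is_migration_entry) {
		wait_for_migration(pte);
		return FAULT_RETRY;	/* page table may have changed: redo */
	}
	return FAULT_DONE;
}

int main(void)
{
	struct pte_model pte = { .is_migration_entry = true };

	while (handle_fault(&pte) == FAULT_RETRY)
		printf("retrying fault after migration wait\n");
	printf("fault handled\n");
	return 0;
}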
On Fri, Aug 18, 2017 at 12:14:12PM -0700, Linus Torvalds wrote:
> On Fri, Aug 18, 2017 at 11:54 AM, Mel Gorman
> wrote:
> >
> > One option to mitigate (but not eliminate) the problem is to record when
> > the page lock is contended and pass in TNF_PAGE_CONTENDED (new flag) to
> >
On Fri, Aug 18, 2017 at 1:29 PM, Liang, Kan wrote:
> Here is the profiling with THP disabled for wait_on_page_bit_common and
> wake_up_page_bit.
>
>
> The call stack of wait_on_page_bit_common
> # Overhead Trace output
> # ..
> #
>100.00% (821aefca)
>
On Fri, Aug 18, 2017 at 1:05 PM, Andi Kleen wrote:
>
> I think what's happening is that it allows more parallelism during wakeup:
>
> Normally it's like
>
> CPU 1                            CPU 2             CPU 3
> .
>
> LOCK
> wake up tasks on other CPUs      woken up
> >>
> >> That indicates that it may be a hot page and it's possible that the
> >> page is locked for a short time but waiters accumulate. What happens
> >> if you leave NUMA balancing enabled but disable THP?
> >
> > No, disabling THP doesn't help the case.
>
> Interesting. That particular
On Fri, Aug 18, 2017 at 12:58 PM, Andi Kleen wrote:
>> which is hacky, but there's a rationale for it:
>>
>> (a) avoid the crazy long wait queues ;)
>>
>> (b) we know that migration is *supposed* to be CPU-bound (not IO
>> bound), so yielding the CPU and retrying may just be the right thing
>>
> I was really hoping that we'd root-cause this and have a solution (and
> then apply Tim's patch as a "belt and suspenders" kind of thing), but
One thing I wanted to point out is that Tim's patch seems to make
several schedule intensive micro benchmarks faster.
I think what's happening is that
> which is hacky, but there's a rationale for it:
>
> (a) avoid the crazy long wait queues ;)
>
> (b) we know that migration is *supposed* to be CPU-bound (not IO
> bound), so yielding the CPU and retrying may just be the right thing
> to do.
So this would degenerate into a spin when the
On Fri, Aug 18, 2017 at 11:54 AM, Mel Gorman
wrote:
>
> One option to mitigate (but not eliminate) the problem is to record when
> the page lock is contended and pass in TNF_PAGE_CONTENDED (new flag) to
> task_numa_fault().
Well, finding it contended is fairly easy - just look at the page wait
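A hedged sketch of the mitigation being discussed here: the NUMA hinting fault checks cheaply whether the page already has waiters and, if so, passes a contention flag down so the placement code can back off. TNF_PAGE_CONTENDED is the flag Mel proposes (it does not exist upstream, and its value below is made up); the other names are likewise illustrative stand-ins, not kernel code.

#include <stdbool.h>
#include <stdio.h>

#define TNF_PAGE_CONTENDED	0x40	/* proposed flag, value invented here */

struct page_model {
	bool has_waiters;	/* models the "page has waiters" bit */
};

/* Stand-in for task_numa_fault(): just report the placement decision. */
static void task_numa_fault_model(int node, int flags)
{
	if (flags & TNF_PAGE_CONTENDED)
		printf("node %d: page lock contended, skip migration for now\n", node);
	else
		printf("node %d: page idle, migration may proceed\n", node);
}

/* Stand-in for the NUMA hinting fault path. */
static void numa_hinting_fault(struct page_model *page, int node)
{
	int flags = 0;

	if (page->has_waiters)		/* cheap check, no page lock taken */
		flags |= TNF_PAGE_CONTENDED;
	task_numa_fault_model(node, flags);
}

int main(void)
{
	struct page_model hot  = { .has_waiters = true };
	struct page_model cold = { .has_waiters = false };

	numa_hinting_fault(&hot, 1);
	numa_hinting_fault(&cold, 0);
	return 0;
}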
On Fri, Aug 18, 2017 at 10:48:23AM -0700, Linus Torvalds wrote:
> On Fri, Aug 18, 2017 at 9:53 AM, Liang, Kan wrote:
> >
> >> On Fri, Aug 18, 2017 Mel Gorman wrote:
> >>
> >> That indicates that it may be a hot page and it's possible that the page is
> >> locked for a short time but waiters
On Fri, Aug 18, 2017 at 9:53 AM, Liang, Kan wrote:
>
>> On Fri, Aug 18, 2017 Mel Gorman wrote:
>>
>> That indicates that it may be a hot page and it's possible that the page is
>> locked for a short time but waiters accumulate. What happens if you leave
>> NUMA balancing enabled but disable THP?
On Fri, Aug 18, 2017 at 5:23 AM, Mel Gorman wrote:
>
> 		new_page = alloc_pages_node(node,
> -			(GFP_TRANSHUGE_LIGHT | __GFP_THISNODE),
> +			(GFP_TRANSHUGE_LIGHT | __GFP_THISNODE) & ~__GFP_DIRECT_RECLAIM,
> 			HPAGE_PMD_ORDER);
That can't make any
> On Fri, Aug 18, 2017 at 02:20:38PM +, Liang, Kan wrote:
> > > Nothing fancy other than needing a comment if it works.
> > >
> >
> > No, the patch doesn't work.
> >
>
> That indicates that it may be a hot page and it's possible that the page is
> locked for a short time but waiters
> Still, I don't think this problem is THP specific. If there is a hot
> regular page getting migrated, we'll also see many threads get
> queued up quickly. THP may have made the problem worse as migrating
> it takes a longer time, meaning more threads could get queued up.
Also THP probably
On 08/18/2017 07:46 AM, Mel Gorman wrote:
> On Fri, Aug 18, 2017 at 02:20:38PM +, Liang, Kan wrote:
>>> Nothing fancy other than needing a comment if it works.
>>>
>>
>> No, the patch doesn't work.
>>
>
> That indicates that it may be a hot page and it's possible that the page is
> locked for
On Fri, Aug 18, 2017 at 02:20:38PM +, Liang, Kan wrote:
> > Nothing fancy other than needing a comment if it works.
> >
>
> No, the patch doesn't work.
>
That indicates that it may be a hot page and it's possible that the page is
locked for a short time but waiters accumulate. What
> On Thu, Aug 17, 2017 at 01:44:40PM -0700, Linus Torvalds wrote:
> > On Thu, Aug 17, 2017 at 1:18 PM, Liang, Kan wrote:
> > >
> > > Here is the call stack of wait_on_page_bit_common when the queue is
> > > long (entries >1000).
> > >
> > > # Overhead Trace output
> > > #
> On Thu, Aug 17, 2017 at 1:18 PM, Liang, Kan wrote:
> >
> > Here is the call stack of wait_on_page_bit_common when the queue is
> > long (entries >1000).
> >
> > # Overhead Trace output
> > # ..
> > #
> >100.00% (931aefca)
> > |
> >
On Thu, Aug 17, 2017 at 01:44:40PM -0700, Linus Torvalds wrote:
> On Thu, Aug 17, 2017 at 1:18 PM, Liang, Kan wrote:
> >
> > Here is the call stack of wait_on_page_bit_common
> > when the queue is long (entries >1000).
> >
> > # Overhead Trace output
> > # ..
> > #
> >
On Thu, Aug 17, 2017 at 1:18 PM, Liang, Kan wrote:
>
> Here is the call stack of wait_on_page_bit_common
> when the queue is long (entries >1000).
>
> # Overhead Trace output
> # ..
> #
>100.00% (931aefca)
> |
> ---wait_on_page_bit
>
> > Here is the wake_up_page_bit call stack when the workaround is running, which
> > is collected by perf record -g -a -e probe:wake_up_page_bit -- sleep 10
>
> It's actually not really wake_up_page_bit() that is all that
> interesting, it would be more interesting to see which path it is
On Thu, Aug 17, 2017 at 9:17 AM, Liang, Kan wrote:
>
> We tried both 12 and 16 bit table and that didn't make a difference.
> The long wake ups are mostly on the same page when we do instrumentation
Ok.
> Here is the wake_up_page_bit call stack when the workaround is running, which
> is
> On Mon, Aug 14, 2017 at 5:52 PM, Tim Chen
> wrote:
> > We encountered workloads that have very long wake up list on large
> > systems. A waker takes a long time to traverse the entire wake list
> > and execute all the wake functions.
> >
> > We saw page wait list that are up to 3700+ entries
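For reference, a standalone model of the idea in Tim's patch: wake waiters in bounded batches and release the wait-queue lock between batches instead of walking thousands of entries in one pass. The posted patch keeps its place with a bookmark wait-queue entry because the list can change while the lock is dropped; this simplified model just keeps a cursor and marks in comments where the lock would be taken and released. The batch size and all names are illustrative, not the kernel implementation.

#include <stdio.h>
#include <stdlib.h>

#define WALK_BREAK_CNT	64	/* wake at most this many per lock hold */

struct waiter {
	int id;
	struct waiter *next;
};

/*
 * Wake up to WALK_BREAK_CNT waiters starting at *pos and return where the
 * walk stopped; NULL means the whole list has been processed.
 */
static struct waiter *wake_batch(struct waiter *pos)
{
	int budget = WALK_BREAK_CNT;

	while (pos && budget--) {
		printf("waking waiter %d\n", pos->id);	/* the wake function */
		pos = pos->next;
	}
	return pos;
}

int main(void)
{
	struct waiter *head = NULL, *cursor, *next;
	int i;

	/* Build an artificially long wait list. */
	for (i = 0; i < 200; i++) {
		struct waiter *w = malloc(sizeof(*w));

		if (!w)
			return 1;
		w->id = i;
		w->next = head;
		head = w;
	}

	cursor = head;
	while (cursor) {
		/* the wait-queue spinlock would be taken here */
		cursor = wake_batch(cursor);
		/*
		 * ... and released here, so waiters already woken can
		 * dequeue themselves and run before the next batch.
		 */
	}

	for (cursor = head; cursor; cursor = next) {	/* free the list */
		next = cursor->next;
		free(cursor);
	}
	return 0;
}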