RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-27 Thread jimmie.davis
  

> Generally real-time applications should not be doing mlock calls during 
> their real-time execution for that reason. The required memory regions 
> should be locked during startup so that this kind of execution delay can 
> be avoided at runtime.

Total agreement on this.

Regards,
Bud Davis




  

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Bug 71331 - mlock yields processor to lower priority process

2014-03-27 Thread Robert Hancock

On 21/03/14 08:50 AM, jimmie.da...@l-3com.com wrote:
>
> From: Mike Galbraith [umgwanakikb...@gmail.com]
> Sent: Friday, March 21, 2014 9:41 AM
> To: Davis, Bud @ SSG - Link
> Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org;
> kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
> Subject: RE: Bug 71331 - mlock yields processor to lower priority process
>
> On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:
>
>> If you call mlock () from a SCHED_FIFO task, you expect it to return
>> when done.  You don't expect it to block, and your task to be
>> pre-empted.
>
> Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
> how do they get home, and what should we do meanwhile?
>
> -Mike
>
> Two options.
>
> #1. Return with a status value of EAGAIN.
>
> or
>
> #2.  Don't return until you can do it.
>
> If SCHED_FIFO is used, and mlock() is called, the intention of the
> user is very clear.  Run this task until it is completed or it blocks
> (and until a bit ago, mlock() did not block).


Returning EAGAIN is not something that the API definition from POSIX 
allows for; that is only for indicating a failure. If the memory that is 
being locked is not currently residing in RAM, then the memory will need 
to be swapped in before the call returns, which clearly cannot be done 
without blocking. Thus mlock can potentially block, which has not 
changed. Whether or not any kernel behavior has changed to cause this to 
happen in some cases where it didn't previously, the fact remains that 
this is allowed behavior.


Generally real-time applications should not be doing mlock calls during 
their real-time execution for that reason. The required memory regions 
should be locked during startup so that this kind of execution delay can 
be avoided at runtime.
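
A minimal sketch of the startup-time locking described above, assuming a Linux/glibc system. The helper names lock_process_memory() and prefault_buffer() are hypothetical and do not come from this thread; mlockall(), MCL_CURRENT and MCL_FUTURE are the standard interfaces. The call may still sleep, which is acceptable here because it runs before any real-time work begins.

```c
#define _GNU_SOURCE             /* assumption: Linux with glibc */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical startup helper: lock all current and future mappings.
 * Returns 0 on success or a negative errno value (for example -ENOMEM
 * or -EPERM when RLIMIT_MEMLOCK or privileges do not allow locking).
 */
static int lock_process_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -errno;
    return 0;
}

/*
 * Write one byte per page so each page of buf is faulted in now, during
 * startup, rather than later on the real-time path.  This overwrites the
 * first byte of every page, so use it on buffers that hold no data yet.
 */
static void prefault_buffer(volatile unsigned char *buf, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);

    assert(buf != NULL);
    if (page <= 0)
        page = 4096;            /* conservative fallback */
    for (size_t off = 0; off < len; off += (size_t)page)
        buf[off] = 0;
}
```

A program would call lock_process_memory() once, pre-fault its working buffers, and only then switch to its real-time scheduling policy.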


>
> SCHED_FIFO users don't care about fairness.  They want the system to
> do what it is told.
>
> regards,
> Bud Davis


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-26 Thread Mike Galbraith
On Thu, 2014-03-27 at 04:20 +, jimmie.da...@l-3com.com wrote: 


> The example code submitted into bugzilla (chase back on the thread a
> bit, there is a reference) shows the problem.
> 
> Two threads, TaskA (high priority) and TaskB (low priority).  Assigned
> to the same processor, explicitly for the guarantee that only one of
> them can execute at a time.

Your priority based serialization guarantee does not exist.  Tasks can
be and are put to sleep.  When that happens, a lower priority runnable
task will run.  Whether you like that fact or not, it remains a fact.

If you don't want your lower priority task to run, why do you wake it?

-Mike
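
One conventional way to avoid depending on priority for serialization is to protect the shared data with an explicit lock. The sketch below is an assumption, not something proposed in this thread: the struct and function names are hypothetical, and PTHREAD_PRIO_INHERIT is chosen so that a low-priority holder is boosted while a high-priority task waits.

```c
#define _GNU_SOURCE             /* assumption: Linux with glibc */
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared state that both TaskA and TaskB touch. */
struct shared_data {
    pthread_mutex_t lock;
    int value;
};

/*
 * Initialise the mutex with the priority-inheritance protocol.
 * Returns 0 on success or an error number from the pthread calls.
 */
static int shared_data_init(struct shared_data *d)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(d != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(&d->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    d->value = 0;
    return rc;
}

/*
 * Update the data only while holding the lock, so the update stays
 * consistent even if the caller sleeps and another task gets the CPU.
 */
static void shared_data_add(struct shared_data *d, int delta)
{
    pthread_mutex_lock(&d->lock);
    d->value += delta;
    pthread_mutex_unlock(&d->lock);
}
```

With this pattern a lower priority task that runs while the higher priority one sleeps cannot observe half-processed data, because it has to take the same lock first.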


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-26 Thread jimmie.davis


-----Original Message-----
From: Andy Lutomirski [mailto:l...@amacapital.net] 
Sent: Wednesday, March 26, 2014 7:40 PM
To: Davis, Bud @ SSG - Link; umgwanakikb...@gmail.com
Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; 
kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
Subject: Re: Bug 71331 - mlock yields processor to lower priority process

On 03/21/2014 07:50 AM, jimmie.da...@l-3com.com wrote:
> 
> 
> From: Mike Galbraith [umgwanakikb...@gmail.com]
> Sent: Friday, March 21, 2014 9:41 AM
> To: Davis, Bud @ SSG - Link
> Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; 
> kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
> Subject: RE: Bug 71331 - mlock yields processor to lower priority process
> 
> On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:
> 
>> If you call mlock () from a SCHED_FIFO task, you expect it to return
>> when done.  You don't expect it to block, and your task to be
>> pre-empted.
> 
> Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
> how do they get home, and what should we do meanwhile?
> 
> -Mike
> 
> Two options.
> 
> #1. Return with a status value of EAGAIN.
> 
> or 
> 
> #2.  Don't return until you can do it.
> 
> If SCHED_FIFO is used, and mlock() is called, the intention of the user is 
> very clear.  Run this task until
> it is completed or it blocks (and until a bit ago, mlock() did not block).
> 
> SCHED_FIFO users don't care about fairness.  They want the system to do what 
> it is told.

I use mlock in real-time processes, but I do it in a separate thread.

Seriously, though, what do you expect the kernel to do?  When you call
mlock on a page that isn't present, the kernel will *read* that page.
mlock will, therefore, block until the IO finishes.

Some time around 3.9, the behavior changed a little bit: IIRC mlock used
to hold mmap_sem while sleeping.  Or maybe just mmap with MCL_FUTURE did
that.  In any case, the mlock code is less lock-happy than it was.  Is
it possible that you have two threads, and the non-mlock-calling thread
got blocked behind mlock, so it looked better?

--Andy

===


Andy,

The example code submitted into bugzilla (chase back on the thread a bit, there 
is a reference) shows the problem.

Two threads, TaskA (high priority) and TaskB (low priority), are assigned to
the same processor, explicitly for the guarantee that only one of them can
execute at a time.  TaskA becomes eligible to run.  As part of its processing
(which normally ends with a call to sem_wait()), it calls mlock().  TaskA then
blocks, and TaskB begins running.  But wait, the system is designed so that
TaskA will run until it is done (thus SCHED_FIFO and a priority higher than
TaskB's).  TaskA, a higher priority task, is suspended and TaskB starts
running.  And in the code that led me on this endeavor :) {consisting of a lot
of Ada threads}, the result was a segfault due to data half-processed by TaskA.
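
For readers unfamiliar with this kind of design, here is a rough sketch of how a task can be pinned to one CPU and switched to SCHED_FIFO on Linux. It is not the bugzilla test program; the helper names pin_to_cpu() and make_fifo() are hypothetical, and the CPU number and priority are whatever the application chooses. On Linux, passing 0 as the pid applies the change to the calling thread.

```c
#define _GNU_SOURCE             /* needed for CPU_SET and sched_setaffinity */
#include <assert.h>
#include <errno.h>
#include <sched.h>

/* Pin the calling thread to a single CPU.  Returns 0 or a negative errno. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -errno;
    return 0;
}

/*
 * Switch the calling thread to SCHED_FIFO at the given priority.
 * Returns 0 or a negative errno (typically -EPERM without privileges).
 */
static int make_fifo(int priority)
{
    struct sched_param sp = { .sched_priority = priority };

    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -errno;
    return 0;
}
```

In the design described above, both tasks would call pin_to_cpu() with the same CPU number, and TaskA would pass a larger priority value to make_fifo() than TaskB.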

This is what I call 'blocking': the thread is no longer running and the
scheduler puts someone else on the processor.  I don't mean 'takes a long time
until it returns'.  Taking a long time is fine; the system design relies on
priority-based scheduling and CPU affinity to ensure ordered access to
application data.

mlock() now blocks.  I don't care how long mlock() takes; what I care about is
the lower priority process pre-empting me.  Only a limited number of syscalls
block; those that do are documented and usually have a way to obtain blocking
or non-blocking behavior.
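
One way to observe whether a particular mlock() call put the caller to sleep, as opposed to merely taking a long time, is to compare the thread's voluntary context switch counter before and after the call. This is only a diagnostic sketch under the assumption of a Linux system that supports RUSAGE_THREAD; the function name is hypothetical.

```c
#define _GNU_SOURCE             /* RUSAGE_THREAD is Linux-specific */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/*
 * Call mlock() on [addr, addr + len) and return how many voluntary
 * context switches the calling thread made during the call.  A non-zero
 * result means the thread slept inside mlock().  Returns -1 if the
 * counters could not be read.  The mlock() return value is stored in
 * *mlock_rc (it stays -1 if mlock() was never reached).
 */
static long voluntary_switches_during_mlock(const void *addr, size_t len,
                                            int *mlock_rc)
{
    struct rusage before, after;

    assert(mlock_rc != NULL);
    *mlock_rc = -1;
    if (getrusage(RUSAGE_THREAD, &before) != 0)
        return -1;
    *mlock_rc = mlock(addr, len);
    if (getrusage(RUSAGE_THREAD, &after) != 0)
        return -1;
    return after.ru_nvcsw - before.ru_nvcsw;
}
```

A result of zero on one run does not prove that mlock() can never sleep; it only reports what happened during that call.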

Can I change the system to deal with mlock() being a blocking syscall?  Yes,
but this is a situation where working code that meets the API has stopped
working.

Thanks for looking at it.

Regards,
Bud Davis


Re: Bug 71331 - mlock yields processor to lower priority process

2014-03-26 Thread Andy Lutomirski
On 03/21/2014 07:50 AM, jimmie.da...@l-3com.com wrote:
> 
> 
> From: Mike Galbraith [umgwanakikb...@gmail.com]
> Sent: Friday, March 21, 2014 9:41 AM
> To: Davis, Bud @ SSG - Link
> Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; 
> kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
> Subject: RE: Bug 71331 - mlock yields processor to lower priority process
> 
> On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:
> 
>> If you call mlock () from a SCHED_FIFO task, you expect it to return
>> when done.  You don't expect it to block, and your task to be
>> pre-empted.
> 
> Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
> how do they get home, and what should we do meanwhile?
> 
> -Mike
> 
> Two options.
> 
> #1. Return with a status value of EAGAIN.
> 
> or 
> 
> #2.  Don't return until you can do it.
> 
> If SCHED_FIFO is used, and mlock() is called, the intention of the user is 
> very clear.  Run this task until
> it is completed or it blocks (and until a bit ago, mlock() did not block).
> 
> SCHED_FIFO users don't care about fairness.  They want the system to do what 
> it is told.

I use mlock in real-time processes, but I do it in a separate thread.

Seriously, though, what do you expect the kernel to do?  When you call
mlock on a page that isn't present, the kernel will *read* that page.
mlock will, therefore, block until the IO finishes.

Some time around 3.9, the behavior changed a little bit: IIRC mlock used
to hold mmap_sem while sleeping.  Or maybe just mmap with MCL_FUTURE did
that.  In any case, the mlock code is less lock-happy than it was.  Is
it possible that you have two threads, and the non-mlock-calling thread
got blocked behind mlock, so it looked better?

--Andy
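
The "separate thread" approach mentioned above could look roughly like the following. This is a sketch under assumptions, not Andy's actual code: the struct and function names are hypothetical, and the helper is assumed to be created before the caller switches to SCHED_FIFO, because a thread created with default attributes inherits its creator's scheduling policy on Linux/glibc.

```c
#define _GNU_SOURCE             /* assumption: Linux with glibc */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical request handed to the helper thread. */
struct lock_request {
    void  *addr;
    size_t len;
    int    result;              /* 0 on success, negative errno on failure */
};

/* Helper thread body: this is the thread that may sleep inside mlock(). */
static void *lock_worker(void *arg)
{
    struct lock_request *req = arg;

    req->result = (mlock(req->addr, req->len) == 0) ? 0 : -errno;
    return NULL;
}

/*
 * Start the helper thread.  Returns 0 or an error number from
 * pthread_create().  The request must stay valid until the thread
 * has been joined.
 */
static int start_lock_worker(pthread_t *tid, struct lock_request *req)
{
    assert(tid != NULL && req != NULL);
    return pthread_create(tid, NULL, lock_worker, req);
}
```

The real-time thread keeps running while the helper waits for any I/O, and it should only touch the region on its time-critical path after pthread_join() shows that the request succeeded.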


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread Motohiro Kosaki
> Mike,
> 
> There are several problem domains where you protect critical sections by 
> assigning multiple threads to a single CPU and use priorities
> and SCHED_FIFO to ensure data integrity.
> 
> In this kind of design you don't make many syscalls.  The ones you do make, 
> have to be clearly understood
> if they block.
> 
> So, yes, I expect that a SCHED_FIFO task, that uses a subset of syscalls 
> known to be non-blocking, will not block.
> 
> If it is not 'unstoppable', then there is a defect in the OS.
> 
> In the past, a call to mlock() was known to be OK.  It would not block.  It 
> might take a while, but it would run to completion.  It does not
> do that any more.

False. mlock has been able to block ever since it was introduced.
mlock and mlockall need to allocate memory by definition. That can trigger
VM activity, which may block. At least, it can on Linux.

lru_add_drain_all() is not the only place where it can wait. Even if we
removed it, mlock could still block. I don't think this discussion makes sense.

> If mlock() is now a blocking call, then fine.  It only needs to be called on 
> occasion, and this can be accounted for in the application

Now? I have not seen any recent change.

Note: I'm not sure whether Artem's use-case is good or bad.  I am only saying
that a false assumption does not make for a good discussion.



RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread jimmie.davis


From: Mike Galbraith [umgwanakikb...@gmail.com]
Sent: Friday, March 21, 2014 9:41 AM
To: Davis, Bud @ SSG - Link
Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; 
kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
Subject: RE: Bug 71331 - mlock yields processor to lower priority process

On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:

> If you call mlock () from a SCHED_FIFO task, you expect it to return
> when done.  You don't expect it to block, and your task to be
> pre-empted.

Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
how do they get home, and what should we do meanwhile?

-Mike

Two options.

#1. Return with a status value of EAGAIN.

or 

#2.  Don't return until you can do it.

If SCHED_FIFO is used, and mlock() is called, the intention of the user is very 
clear.  Run this task until
it is completed or it blocks (and until a bit ago, mlock() did not block).

SCHED_FIFO users don't care about fairness.  They want the system to do what it 
is told.
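
mlock() itself has no non-blocking mode, but an application can at least ask whether a range is already resident before deciding when to lock it. The sketch below uses the Linux mincore() call for that; the function name is hypothetical, and the check is advisory only: a page reported as resident can be evicted again before mlock() runs, and mlock() may sleep for other reasons, such as the lru_add_drain_all() path discussed in this thread.

```c
#define _GNU_SOURCE             /* mincore() is not part of POSIX */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Count how many pages of the page-aligned range [addr, addr + len)
 * are not resident in RAM right now.  Returns -1 on error.
 */
static long count_nonresident_pages(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t npages, i;
    unsigned char *vec;
    long missing = 0;

    assert(addr != NULL && len > 0);
    if (page <= 0)
        return -1;
    npages = (len + (size_t)page - 1) / (size_t)page;
    vec = malloc(npages);
    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (i = 0; i < npages; i++)
        if (!(vec[i] & 1))      /* bit 0 set means the page is resident */
            missing++;
    free(vec);
    return missing;
}
```

A caller could treat a non-zero count as a hint to defer the mlock() call to a non-critical moment, which is roughly the application-level equivalent of the EAGAIN idea above.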

regards,
Bud Davis


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread Mike Galbraith
On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:

> If you call mlock () from a SCHED_FIFO task, you expect it to return
> when done.  You don't expect it to block, and your task to be
> pre-empted.

Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
how do they get home, and what should we do meanwhile?

-Mike


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread jimmie.davis


From: Mike Galbraith [umgwanakikb...@gmail.com]
Sent: Friday, March 21, 2014 8:14 AM
To: Davis, Bud @ SSG - Link
Cc: artem_fetis...@epam.com; pet...@infradead.org; 
kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
Subject: RE: Bug 71331 - mlock yields processor to lower priority process

On Fri, 2014-03-21 at 12:18 +, jimmie.da...@l-3com.com wrote:

> As the submitter of the bug, let me give you my perspective.
> SCHED_FIFO means run my task until it blocks or a higher priority task
> pre-empts it.  Period.

It blocked.
>
> mlock() doesn't block. check the man page.
>
I don't see that specified.

(or how it could be, but what do I know, IANIPL)

> Any other way and you are not able to use priority based scheduling.

Sure you can, allocate and lock down resources before entering critical
sections.

If you think donning a SCHED_FIFO super-suit should make your task
unstoppable, you're gonna be very disappointed.  Fact is if your
Juggernaut bumps ever so gently into a contended sleeping variety lock
(and in the rt kernel that means nearly every lock), it will block.

-Mike


Mike,

There are several problem domains where you protect critical sections by
assigning multiple threads to a single CPU and use priorities and SCHED_FIFO
to ensure data integrity.

In this kind of design you don't make many syscalls.  For the ones you do
make, it has to be clearly understood whether they block.

So, yes, I expect that a SCHED_FIFO task that uses a subset of syscalls known
to be non-blocking will not block.

If it is not 'unstoppable', then there is a defect in the OS.

In the past, a call to mlock() was known to be OK.  It would not block.  It 
might take a while, but it would run to completion.  It does not do that any 
more.

If mlock() is now a blocking call, then fine.  It only needs to be called on
occasion, and this can be accounted for in the application design.  Does
write() block?  Yes, the man page talks all about it.  Does clock_gettime()
block?  No, blocking is not mentioned in the man page.  Blocking behaviour is
rare; when it exists it is documented.

My point is, this is either a defect to be fixed, or a change that warrants 
updating the documentation.

regards,
Bud Davis


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread jimmie.davis


From: Oliver Neukum [oneu...@suse.de]
Sent: Friday, March 21, 2014 8:35 AM
To: Davis, Bud @ SSG - Link
Cc: umgwanakikb...@gmail.com; artem_fetis...@epam.com; pet...@infradead.org; 
kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
Subject: Re: Bug 71331 - mlock yields processor to lower priority process

On Fri, 2014-03-21 at 12:18 +, jimmie.da...@l-3com.com wrote:
>
> >How is that different from any other time a task has to yield the CPU
> >for a bit?  While your high priority task is blocked for whatever
> >reason, a lower priority task gets to use the CPU.
>
>
> As the submitter of the bug, let me give you my perspective.  SCHED_FIFO 
> means run my task until it blocks or a higher priority task pre-empts it.  
> Period.
>
> mlock() doesn't block.  check the man page.

It guarantees that all pages be in RAM. That means it has to read them
in if they aren't. How could it do that without blocking?

Regards
Oliver
--
Oliver,

I would assume it would touch some flag bits on every page, as part of the
thread of execution that called it.

If you call mlock () from a SCHED_FIFO task, you expect it to return when done.
You don't expect it to block, and your task to be pre-empted.

For many years it returned when finished.  Now, it blocks.

This makes code that used to work, not work.  

I consider it a defect.

regards,
Bud Davis


Re: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread Oliver Neukum
On Fri, 2014-03-21 at 12:18 +, jimmie.da...@l-3com.com wrote:
>  
> >How is that different from any other time a task has to yield the CPU
> >for a bit?  While your high priority task is blocked for whatever
> >reason, a lower priority task gets to use the CPU.
> 
>  
> As the submitter of the bug, let me give you my perspective.  SCHED_FIFO 
> means run my task until it blocks or a higher priority task pre-empts it.  
> Period.
> 
> mlock() doesn't block.  check the man page.

It guarantees that all pages be in RAM. That means it has to read them
in if they aren't. How could it do that without blocking?

Regards
Oliver




RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread Mike Galbraith
On Fri, 2014-03-21 at 12:18 +, jimmie.da...@l-3com.com wrote:

> As the submitter of the bug, let me give you my perspective.
> SCHED_FIFO means run my task until it blocks or a higher priority task
> pre-empts it.  Period.

It blocked.
> 
> mlock() doesn't block. check the man page.
> 
I don't see that specified.

(or how it could be, but what do I know, IANIPL)

> Any other way and you are not able to use priority based scheduling. 

Sure you can, allocate and lock down resources before entering critical
sections.

If you think donning a SCHED_FIFO super-suit should make your task
unstoppable, you're gonna be very disappointed.  Fact is if your
Juggernaut bumps ever so gently into a contended sleeping variety lock
(and in the rt kernel that means nearly every lock), it will block.

-Mike



RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread jimmie.davis
 

>How is that different from any other time a task has to yield the CPU
>for a bit?  While your high priority task is blocked for whatever
>reason, a lower priority task gets to use the CPU.

 
As the submitter of the bug, let me give you my perspective.  SCHED_FIFO means 
run my task until it blocks or a higher priority task pre-empts it.  Period.

mlock() doesn't block.  check the man page.

Any other way and you are not able to use priority based scheduling.   
 

--bud davis


Re: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread Mike Galbraith
On Fri, 2014-03-21 at 23:02 +0300, Artem Fetishev wrote:
> Hi all,
> 
> I am looking at a use-case when a real-time task (B) of higher
> priority is sometimes preempted by another real-time task (A) of lower
> priority. Well, B is not really preempted. It calls mlockall() which
> forces task B to yield the CPU. Under certain conditions, mlockall()
> calls lru_add_drain_all() which schedules a deferred work and wants
> the calling task to wait until that work is complete by putting the
> task into TASK_UNINTERRUPTIBLE state and calling schedule_timeout().
> 
> Tasks utilize SCHED_FIFO policy.
> 
> See details here: https://bugzilla.kernel.org/show_bug.cgi?id=71331
> 
> Besides mlockall, there are other kernel paths which make use of
> lru_add_drain_all() and schedule_timeout(), so I guess there are a bunch
> of other syscalls which may lead to the above use-case.
> 
> So the question is: is the above use-case expected behavior for
> real-time tasks, or is it a bug in mlockall (i.e. it should not
> interrupt a real-time process)?

How is that different from any other time a task has to yield the CPU
for a bit?  While your high priority task is blocked for whatever
reason, a lower priority task gets to use the CPU.

The bad thing is that in this case, your high priority task becomes
dependent upon kworker threads all over the box, with no mechanism to
guarantee that any of them will ever run.  No PI-boost to the rescue,
nada, say byebye to determinism.

That's true any time you depend upon some generic proxy.  Nothing tracks
IO for instance, to make sure your IO is handled all the way through the
chain by proxies of your priority.  What happens if, say, kjournald is
preempted by a low priority SCHED_FIFO hog?  Nobody needing kjournald to
make progress goes anywhere; SCHED_FIFO 99 may as well be SCHED_IDLE.

In short, yes, I think this is the expected behavior.  Don't do things
that grow dependencies upon generic kernel proxies at critical times.

-Mike



RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread Mike Galbraith
On Fri, 2014-03-21 at 12:18 +, jimmie.da...@l-3com.com wrote:

> As the submitter of the bug, let me give you my perspective.
> SCHED_FIFO means run my task until it blocks or a higher priority task
> pre-empts it.  Period.

It blocked.

> mlock() doesn't block. check the man page.

I don't see that specified.

(or how it could be, but what do I know, IANIPL)

> Any other way and you are not able to use priority based scheduling.

Sure you can, allocate and lock down resources before entering critical
sections.

If you think donning a SCHED_FIFO super-suit should make your task
unstoppable, you're gonna be very disappointed.  Fact is if your
Juggernaut bumps ever so gently into a contended sleeping variety lock
(and in the rt kernel that means nearly every lock), it will block.

-Mike



Re: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread Oliver Neukum
On Fri, 2014-03-21 at 12:18 +, jimmie.da...@l-3com.com wrote:
  
> >How is that different from any other time a task has to yield the CPU
> >for a bit?  While your high priority task is blocked for whatever
> >reason, a lower priority task gets to use the CPU.
>
> As the submitter of the bug, let me give you my perspective.  SCHED_FIFO
> means run my task until it blocks or a higher priority task pre-empts it.
> Period.
>
> mlock() doesn't block.  check the man page.

It guarantees that all pages be in RAM. That means it has to read them
in if they aren't. How could it do that without blocking?

Regards
Oliver




RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread jimmie.davis


From: Oliver Neukum [oneu...@suse.de]
Sent: Friday, March 21, 2014 8:35 AM
To: Davis, Bud @ SSG - Link
Cc: umgwanakikb...@gmail.com; artem_fetis...@epam.com; pet...@infradead.org; 
kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
Subject: Re: Bug 71331 - mlock yields processor to lower priority process

On Fri, 2014-03-21 at 12:18 +, jimmie.da...@l-3com.com wrote:

> >How is that different from any other time a task has to yield the CPU
> >for a bit?  While your high priority task is blocked for whatever
> >reason, a lower priority task gets to use the CPU.
>
> As the submitter of the bug, let me give you my perspective.  SCHED_FIFO
> means run my task until it blocks or a higher priority task pre-empts it.
> Period.
>
> mlock() doesn't block.  check the man page.

It guarantees that all pages be in RAM. That means it has to read them
in if they aren't. How could it do that without blocking?

Regards
Oliver
--
Oliver,

I would assume it would touch some flag bits on every page.  As part of the 
thread of execution that called it.

If you call mlock () from a SCHED_FIFO task, you expect it to return when
done.  You don't expect it to block, and your task to be pre-empted.

For many years it returned when finished.  Now, it blocks.

This makes code that used to work, not work.

I consider it a defect.

regards,
Bud Davis


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread jimmie.davis


From: Mike Galbraith [umgwanakikb...@gmail.com]
Sent: Friday, March 21, 2014 8:14 AM
To: Davis, Bud @ SSG - Link
Cc: artem_fetis...@epam.com; pet...@infradead.org; 
kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
Subject: RE: Bug 71331 - mlock yields processor to lower priority process

On Fri, 2014-03-21 at 12:18 +, jimmie.da...@l-3com.com wrote:

> As the submitter of the bug, let me give you my perspective.
> SCHED_FIFO means run my task until it blocks or a higher priority task
> pre-empts it.  Period.

It blocked.

> mlock() doesn't block. check the man page.

I don't see that specified.

(or how it could be, but what do I know, IANIPL)

> Any other way and you are not able to use priority based scheduling.

Sure you can, allocate and lock down resources before entering critical
sections.

If you think donning a SCHED_FIFO super-suit should make your task
unstoppable, you're gonna be very disappointed.  Fact is if your
Juggernaut bumps ever so gently into a contended sleeping variety lock
(and in the rt kernel that means nearly every lock), it will block.

-Mike


Mike,

There are several problem domains where you protect critical sections by
assigning multiple threads to a single CPU and use priorities and
SCHED_FIFO to ensure data integrity.

In this kind of design you don't make many syscalls.  For the ones you do
make, it has to be clearly understood whether they block.

So, yes, I expect that a SCHED_FIFO task that uses a subset of syscalls
known to be non-blocking will not block.

If it is not 'unstoppable', then there is a defect in the OS.

In the past, a call to mlock() was known to be OK.  It would not block.  It
might take a while, but it would run to completion.  It does not do that
any more.

If mlock() is now a blocking call, then fine.  It only needs to be called on
occasion, and this can be accounted for in the application design.  Does
write() block?  Yes, the man page talks all about it.  Does clock_gettime()
block?  No, blocking is not mentioned in the man page.  Blocking behaviour
is rare; when it exists, it is documented.

My point is, this is either a defect to be fixed, or a change that warrants
updating the documentation.

regards,
Bud Davis

 







   

 












RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread Mike Galbraith
On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:

> If you call mlock () from a SCHED_FIFO task, you expect it to return
> when done.  You don't expect it to block, and your task to be
> pre-empted.

Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
how do they get home, and what should we do meanwhile?

-Mike


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread jimmie.davis


From: Mike Galbraith [umgwanakikb...@gmail.com]
Sent: Friday, March 21, 2014 9:41 AM
To: Davis, Bud @ SSG - Link
Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; 
kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
Subject: RE: Bug 71331 - mlock yields processor to lower priority process

On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:

> If you call mlock () from a SCHED_FIFO task, you expect it to return
> when done.  You don't expect it to block, and your task to be
> pre-empted.

Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
how do they get home, and what should we do meanwhile?

-Mike

Two options.

#1. Return with a status value of EAGAIN.

or 

#2.  Don't return until you can do it.

If SCHED_FIFO is used, and mlock() is called, the intention of the user is
very clear.  Run this task until it is completed or it blocks (and until a
bit ago, mlock() did not block).

SCHED_FIFO users don't care about fairness.  They want the system to do what
it is told.

regards,
Bud Davis


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-21 Thread Motohiro Kosaki
> Mike,
>
> There are several problem domains where you protect critical sections by
> assigning multiple threads to a single CPU and use priorities
> and SCHED_FIFO to ensure data integrity.
>
> In this kind of design you don't make many syscalls.  The ones you do make,
> have to be clearly understood if they block.
>
> So, yes, I expect that a SCHED_FIFO task, that uses a subset of syscalls
> known to be non-blocking, will not block.
>
> If it is not 'unstoppable', then there is a defect in the OS.
>
> In the past, a call to mlock() was known to be OK.  It would not block.  It
> might take a while, but it would run to completion.  It does not
> do that any more.

False. mlock() has been able to block ever since it was introduced.
mlock() and mlockall() need to allocate memory by definition.  That can
trigger VM activity, and it may block.  At least, on Linux.

lru_add_drain_all() is not the only place where it can wait. Even if we
remove it, mlock() can still block. I don't think this discussion makes sense.

> If mlock() is now a blocking call, then fine.  It only needs to be called on
> occasion, and this can be accounted for in the application

Now? I have not seen any recent change.

Note: I'm not sure whether Artem's use-case is good or bad.  I'm only saying
that a false assumption doesn't make for a good discussion.
