[PATCH v2] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Sebastian Andrzej Siewior
efi_switch_mm() is a wrapper around switch_mm() which saves current's
->active_mm, sets the requests mm as ->active_mm and invokes
switch_mm().
I don't think that task_lock() is required during that procedure. It
protects ->mm which isn't changed here.

It needs to be mentioned that during the whole procedure (switch to
EFI's mm and back) the preemption needs to be disabled. A context switch
at this point would reset the cr3 value based on current->mm. Also, this
function may not be invoked at the same time on a different CPU because
it would overwrite the efi_scratch.prev_mm information.

Remove task_lock() and also update the comment to reflect it.

Signed-off-by: Sebastian Andrzej Siewior 
---
v1…v2:
- drop RFC
- don't remove assignment of ->active_mm (PeterZ)
- don't worry about concurrent invocations of the function, Ard
  said that the "mixed-mode" version has a lock now.

 arch/x86/platform/efi/efi_64.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 448267f1c073..6ab1fdeefa1a 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -619,18 +619,16 @@ void __init efi_dump_pagetable(void)
 
 /*
  * Makes the calling thread switch to/from efi_mm context. Can be used
- * for SetVirtualAddressMap() i.e. current->active_mm == init_mm as well
- * as during efi runtime calls i.e current->active_mm == current_mm.
- * We are not mm_dropping()/mm_grabbing() any mm, because we are not
- * losing/creating any references.
+ * in a kernel thread and user context. Preemption needs to remain disabled
+ * while the EFI-mm is borrowed. mmgrab()/mmdrop() is not used because the mm
+ * can not change under us.
+ * It should be ensured that there are no concurent calls to this function.
  */
 void efi_switch_mm(struct mm_struct *mm)
 {
-   task_lock(current);
efi_scratch.prev_mm = current->active_mm;
current->active_mm = mm;
switch_mm(efi_scratch.prev_mm, mm, NULL);
-   task_unlock(current);
 }
 
 #ifdef CONFIG_EFI_MIXED
-- 
2.18.0

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Prakhya, Sai Praneeth
> > Because local_save_flags() does not disable interrupts??
> 
> now that you say so, it does make sense…
> 
> > > Anyway, I would still like to get rid of task_lock() in efi_switch_mm().
> > > Any objections to that?
> >
> > No, not at all.
> okay.

I don’t have any either and thanks for the clear commit message. It makes sense
why task_lock()/task_unlock() should go away.

Regards,
Sai
N�r��y���b�X��ǧv�^�)޺{.n�+{�y��^n�r��z���h&���G���h�(�階�ݢj"���m�z�ޖ���f���h���~�m�

Re: [RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Sebastian Andrzej Siewior
On 2018-07-24 18:19:15 [+0200], Ard Biesheuvel wrote:
> > Regarding the workqueue in commit 3eb420e70d87 ("efi: Use a work queue
> > to invoke EFI Runtime Services"). The efi_call_virt_pointer() function
> > uses local_save_flags() while invoking the EFI function. Why does commit
> > message say "Since UEFI runtime services are typically invoked with
> > interrupts enabled,"?
> >
> 
> Because local_save_flags() does not disable interrupts??

now that you say so, it does make sense…

> > Anyway, I would still like to get rid of task_lock() in efi_switch_mm().
> > Any objections to that?
> 
> No, not at all.
okay.

> > If efi_switch_mm() is only invoked from kernel-thread (the mixed-mode
> > caller does not) the then you could use use_mm() / unuse_mm() instead.
> > Then I would be fine with task_lock() (but it would have to be moved to
> > the preemptible section (after efi_sync_low_kernel_mappings()).
> 
> There are some exceptions, unfortunately, although I don't think they
> matter to -rt: at early boot time and panic() time, alternative
> wrappers are used that bypass the work queue. The reset_system()
> service also bypasses the work queue but that does not really matter
> given the purpose of the call.
If you lose the task's real mm in panic() then it probably does not
matter since you are never going back to userland.
But the others might be invoked from user context as there are two
non-blocking calls. So no use_mm() then.

reset_system() service should not return but it may if a signal is
pending (still there is a fallback but wouldn't it make sense to drop
that interruptible part)? Also this might be called from
emergency_restart() / panic() right? So down() will open interrupts and
schedule() if the lock is contended.

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Ard Biesheuvel
On 24 July 2018 at 18:02, Sebastian Andrzej Siewior
 wrote:
> On 2018-07-24 17:32:14 [+0200], Ard Biesheuvel wrote:
>> Please refer to what has been queued up in tip:efi/core. Sai has
>> implemented a work queue for EFI calls so they occur from a kernel
>> thread, and the mixed mode locking has been fixed as well.
>
> I see commit 83a0a2ea0b99 ("efi/x86: Prevent reentrant firmware calls in
> mixed mode") which fixes the locking issue I mentioned. Will this make
> its way to the current kernel?
>

The next one, yes. If you mean 'backported to -stable', I haven't
really though about that yet, although it probable makes sense to do
so.

> Regarding the workqueue in commit 3eb420e70d87 ("efi: Use a work queue
> to invoke EFI Runtime Services"). The efi_call_virt_pointer() function
> uses local_save_flags() while invoking the EFI function. Why does commit
> message say "Since UEFI runtime services are typically invoked with
> interrupts enabled,"?
>

Because local_save_flags() does not disable interrupts??

> Anyway, I would still like to get rid of task_lock() in efi_switch_mm().
> Any objections to that?

No, not at all.

> If efi_switch_mm() is only invoked from kernel-thread (the mixed-mode
> caller does not) the then you could use use_mm() / unuse_mm() instead.
> Then I would be fine with task_lock() (but it would have to be moved to
> the preemptible section (after efi_sync_low_kernel_mappings()).

There are some exceptions, unfortunately, although I don't think they
matter to -rt: at early boot time and panic() time, alternative
wrappers are used that bypass the work queue. The reset_system()
service also bypasses the work queue but that does not really matter
given the purpose of the call.
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Sebastian Andrzej Siewior
On 2018-07-24 17:32:14 [+0200], Ard Biesheuvel wrote:
> Please refer to what has been queued up in tip:efi/core. Sai has
> implemented a work queue for EFI calls so they occur from a kernel
> thread, and the mixed mode locking has been fixed as well.

I see commit 83a0a2ea0b99 ("efi/x86: Prevent reentrant firmware calls in
mixed mode") which fixes the locking issue I mentioned. Will this make
its way to the current kernel?

Regarding the workqueue in commit 3eb420e70d87 ("efi: Use a work queue
to invoke EFI Runtime Services"). The efi_call_virt_pointer() function
uses local_save_flags() while invoking the EFI function. Why does commit
message say "Since UEFI runtime services are typically invoked with
interrupts enabled,"?

Anyway, I would still like to get rid of task_lock() in efi_switch_mm().
Any objections to that?
If efi_switch_mm() is only invoked from kernel-thread (the mixed-mode
caller does not) the then you could use use_mm() / unuse_mm() instead.
Then I would be fine with task_lock() (but it would have to be moved to
the preemptible section (after efi_sync_low_kernel_mappings()).

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Ard Biesheuvel
On 24 July 2018 at 17:29, Sebastian Andrzej Siewior
 wrote:
> On 2018-07-24 17:00:09 [+0200], Peter Zijlstra wrote:
>> On Tue, Jul 24, 2018 at 04:35:09PM +0200, Sebastian Andrzej Siewior wrote:
>> > I doubt that there any need to set ->active_mm. It is used by the
>> > scheduler to keep track of the "currently used mm" so it can reuse one
>> > for the kernel thread which does not own one and take a reference on it
>> > so it does not go away while the thread (that borrows it) is active.
>>
>> >  void efi_switch_mm(struct mm_struct *mm)
>> >  {
>> > -   task_lock(current);
>> > efi_scratch.prev_mm = current->active_mm;
>> > -   current->active_mm = mm;
>> > switch_mm(efi_scratch.prev_mm, mm, NULL);
>> > -   task_unlock(current);
>> >  }
>>
>> I think that's broken. Take for instance stuff like
>> perf_callchain_user32() -> get_segment_base(). That looks at active_mm
>> to get at the current LDT.
>
> right. I saw that briefly not sure why I dropped it. I have no idea
> where the LDT points to but it probably sense to return EFI's version of
> it.
>
>> Now, I'm not saying the whole perf vs EFI thing isn't already terminally
>> wrecked, but the rule is that active_mm really should point at the
>> current active mm, and the above breaks that.
> Right. Even if we not perform a context switch. Okay. Will update that
> part.
>

Please refer to what has been queued up in tip:efi/core. Sai has
implemented a work queue for EFI calls so they occur from a kernel
thread, and the mixed mode locking has been fixed as well.
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Sebastian Andrzej Siewior
On 2018-07-24 17:00:09 [+0200], Peter Zijlstra wrote:
> On Tue, Jul 24, 2018 at 04:35:09PM +0200, Sebastian Andrzej Siewior wrote:
> > I doubt that there any need to set ->active_mm. It is used by the
> > scheduler to keep track of the "currently used mm" so it can reuse one
> > for the kernel thread which does not own one and take a reference on it
> > so it does not go away while the thread (that borrows it) is active.
> 
> >  void efi_switch_mm(struct mm_struct *mm)
> >  {
> > -   task_lock(current);
> > efi_scratch.prev_mm = current->active_mm;
> > -   current->active_mm = mm;
> > switch_mm(efi_scratch.prev_mm, mm, NULL);
> > -   task_unlock(current);
> >  }
> 
> I think that's broken. Take for instance stuff like
> perf_callchain_user32() -> get_segment_base(). That looks at active_mm
> to get at the current LDT.

right. I saw that briefly not sure why I dropped it. I have no idea
where the LDT points to but it probably sense to return EFI's version of
it.

> Now, I'm not saying the whole perf vs EFI thing isn't already terminally
> wrecked, but the rule is that active_mm really should point at the
> current active mm, and the above breaks that.
Right. Even if we not perform a context switch. Okay. Will update that
part.

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Peter Zijlstra
On Tue, Jul 24, 2018 at 04:35:09PM +0200, Sebastian Andrzej Siewior wrote:
> I doubt that there any need to set ->active_mm. It is used by the
> scheduler to keep track of the "currently used mm" so it can reuse one
> for the kernel thread which does not own one and take a reference on it
> so it does not go away while the thread (that borrows it) is active.

>  void efi_switch_mm(struct mm_struct *mm)
>  {
> - task_lock(current);
>   efi_scratch.prev_mm = current->active_mm;
> - current->active_mm = mm;
>   switch_mm(efi_scratch.prev_mm, mm, NULL);
> - task_unlock(current);
>  }

I think that's broken. Take for instance stuff like
perf_callchain_user32() -> get_segment_base(). That looks at active_mm
to get at the current LDT.

Now, I'm not saying the whole perf vs EFI thing isn't already terminally
wrecked, but the rule is that active_mm really should point at the
current active mm, and the above breaks that.
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Sebastian Andrzej Siewior
During invocations of EFI functions efi_switch_mm() is used to set the
active mm to borrow a different MM for a while. This used for instance
by efi_call_virt_pointer():
efi_call_virt_pointer()
- arch_efi_call_virt_setup()
  - preempt_disable()
  - efi_switch_mm(&efi_mm)

- "EFI CALL"

- arch_efi_call_virt_teardown()
  - efi_switch_mm(efi_scratch.prev_mm)
  - preempt_enable()

efi_switch_mm() is a wrapper around switch_mm() which saves current's
->active_mm, sets the requests mm as ->active_mm and invokes
switch_mm().
I don't think that task_lock() is required during that procedure. It
protects ->mm which isn't changed here.
I doubt that there any need to set ->active_mm. It is used by the
scheduler to keep track of the "currently used mm" so it can reuse one
for the kernel thread which does not own one and take a reference on it
so it does not go away while the thread (that borrows it) is active.
The `->active_mm' that is used here needs to saved to we can use it in
the "ending" efi_switch_to() again.
Based on this, the ->active_mm assignment can go.
It needs to be mentioned that during the whole procedure (switch to
EFI's mm and back) the preemption needs to be disabled. A context switch
at this point would reset the cr3 value based on current->mm.

With this new information, also update the comment to reflect it.

The "efi_scratch.prev_mm" assignment looks racy. With two concurrent
efi.get_next_variable invocations, one would overrwrite the value from
the other. This however does not happen because "runtime-wrappers"
(virt_efi_get_next_variable()) use efi_runtime_lock during the
invocation. This is not the case for "efi_64"
(efi_thunk_get_next_variable()) which does not take any locks (or I
missed them). Maybe it would be better to let the caller save
->active_mm while the MM is borrowed.

Signed-off-by: Sebastian Andrzej Siewior 
---
 arch/x86/platform/efi/efi_64.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 77873ce700ae..64bf29462348 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -619,18 +619,16 @@ void __init efi_dump_pagetable(void)
 
 /*
  * Makes the calling thread switch to/from efi_mm context. Can be used
- * for SetVirtualAddressMap() i.e. current->active_mm == init_mm as well
- * as during efi runtime calls i.e current->active_mm == current_mm.
+ * in a kernel thread and user context. Preemption needs to remain disabled
+ * while the EFI-mm is borrowed.
  * We are not mm_dropping()/mm_grabbing() any mm, because we are not
- * losing/creating any references.
+ * losing/creating any references. Preemption must be disabled while the mm is
+ * switched.
  */
 void efi_switch_mm(struct mm_struct *mm)
 {
-   task_lock(current);
efi_scratch.prev_mm = current->active_mm;
-   current->active_mm = mm;
switch_mm(efi_scratch.prev_mm, mm, NULL);
-   task_unlock(current);
 }
 
 #ifdef CONFIG_EFI_MIXED
-- 
2.18.0

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html