Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-15 Thread Luis R. Rodriguez
On Mon, Sep 12, 2016 at 02:09:10PM +0200, Daniel Wagner wrote:
> On 09/10/2016 12:10 AM, Luis R. Rodriguez wrote:
> > I've personally come to the determination we can stuff all this crap under
> > CONFIG_FW_LOADER_USER_HELPER due to the firmware cache. Here's why:
> > 
> > As Daniel has it, the usermodehelper_read_lock_wait() case this unfolds 
> > into a:
> >
> > schedule_timeout(0) when the usermodehelper_disabled is true, since after
> > the freezer was introduced this can be any of these values:
> > 
> > enum umh_disable_depth {
> > 
> > UMH_ENABLED = 0,
> > 
> > UMH_FREEZING,   
> > 
> > UMH_DISABLED,   
> > 
> > };
> > 
> > It means we schedule_timeout(0) anytime usermodehelper_disabled is either
> > UMH_FREEZING or UMH_DISABLED. Contrary to usermodehelper_read_trylock() it
> > will also *not* call try_to_freeze().
> > 
> > Given how we setup timer for this:
> > 
> > expire = timeout + jiffies; 
> > 
> > 
> > 
> > setup_timer_on_stack(, process_timeout, (unsigned 
> > long)current);  
> > __mod_timer(, expire, false); 
> > 
> > schedule(); 
> > 
> > del_singleshot_timer_sync();
> > 
> > We will naturally *always* timeout on schedule_timeout(0), so this return 0
> > always.
> 
> The smallest possible timeout is 1 * HZ because:
> 
> static inline long firmware_loading_timeout(void)
> {
>   return loading_timeout > 0 ? loading_timeout * HZ : MAX_JIFFY_OFFSET;
> }

Sorry, I confused your setting:

+#define fw_status_wait_timeout(fw_st, long)0

for replacing the abvove firmware_loading_timeout() to be 0 when
!CONFIG_FW_LOADER_USER_HELPER 

> > As Daniel has it when CONFIG_FW_LOADER_USER_HELPER is enabled we do have a
> > timeout and that just ensures we schedule_timeout(timeout) and as such can 
> > run
> > in the loop for usermodehelper_read_trylock() trying to see if the
> > usermodehelper_disabled has finally flipped to UMH_ENABLED, after which we'd
> > grant entry, say you can pass Go, and collect $200.
> > 
> > Since we use schedule_timeout(0) for !CONFIG_FW_LOADER_USER_HELPER it means
> > we *always* break out of the loop immediately, and return 0. If we are
> > in !CONFIG_FW_LOADER_USER_HELPER and the usermodehelper_disabled is set
> > to UMH_ENABLED we *immediately* bail and return 0 as well.
> 
> The mainline version will use 60s timeout for the usermode helper
> for !CONFIG_FW_LOADER_USER_HELPER. In case of the 
> CONFIG_FW_LOADER_USER_HELPER we should be the range of 1 * HZ to
> MAX_JIFFY_OFFSET. I say should because we don't check if
> 'loading_timeout * HZ' overflows. 

Indeed.

> > So effectively this code:
> > 
> > _request_firmware()
> > {
> > ...
> > timeout = usermodehelper_read_lock_wait(timeout);   
> > 
> > if (!timeout) { 
> > 
> > dev_dbg(device, "firmware: %s loading timed out\n", 
> > 
> > name);  
> > 
> > ret = -EBUSY;   
> > 
> > goto out;   
> > 
> > }   
> > ...
> > }
> > 
> > Will always returns -EBUSY if timeout is 0.
> 
> But timeout is never 0 unless I got that wrong. 

No indeed, my mistake.

> > Our current default is 60 even for !CONFIG_FW_LOADER_USER_HELPER kernels, in
> > that case what this code does is only on FW_OPT_NOWAIT calls
> > (request_firmware_nowait()) it will wait for 60 seconds for the freezer to
> > clear.
> > 
> > In Daniel's code right now this would change the default timeout to 0,
> 
> No, it shouldn't. With this patch 1, the usermode helper code
> is not involved anymore, that's the main difference.

Indeed. I meant to say that if the timeout was 0 it would not be effectively
for using the umh lock for what we though we wanted before. As you clarify
though your changes do not set the default timeout to 0 though.

> > but that
> > would effectively break all request_firmware_nowait() calls. If we want to
> > keep the same functionality we'd have to keep the default timeout set to 60
> > then for !CONFIG_FW_LOADER_USER_HELPER.
> > 
> > For the non-FW_OPT_NOWAIT (regular sync request_firmware() calls) cases we 
> > have:
> > 
> > _request_firmware()
> > {
> > ...
> > ret = usermodehelper_read_trylock();
> > 
> > if (WARN_ON(ret)) {  

Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-15 Thread Luis R. Rodriguez
On Mon, Sep 12, 2016 at 02:09:10PM +0200, Daniel Wagner wrote:
> On 09/10/2016 12:10 AM, Luis R. Rodriguez wrote:
> > I've personally come to the determination we can stuff all this crap under
> > CONFIG_FW_LOADER_USER_HELPER due to the firmware cache. Here's why:
> > 
> > As Daniel has it, the usermodehelper_read_lock_wait() case this unfolds 
> > into a:
> >
> > schedule_timeout(0) when the usermodehelper_disabled is true, since after
> > the freezer was introduced this can be any of these values:
> > 
> > enum umh_disable_depth {
> > 
> > UMH_ENABLED = 0,
> > 
> > UMH_FREEZING,   
> > 
> > UMH_DISABLED,   
> > 
> > };
> > 
> > It means we schedule_timeout(0) anytime usermodehelper_disabled is either
> > UMH_FREEZING or UMH_DISABLED. Contrary to usermodehelper_read_trylock() it
> > will also *not* call try_to_freeze().
> > 
> > Given how we setup timer for this:
> > 
> > expire = timeout + jiffies; 
> > 
> > 
> > 
> > setup_timer_on_stack(, process_timeout, (unsigned 
> > long)current);  
> > __mod_timer(, expire, false); 
> > 
> > schedule(); 
> > 
> > del_singleshot_timer_sync();
> > 
> > We will naturally *always* timeout on schedule_timeout(0), so this return 0
> > always.
> 
> The smallest possible timeout is 1 * HZ because:
> 
> static inline long firmware_loading_timeout(void)
> {
>   return loading_timeout > 0 ? loading_timeout * HZ : MAX_JIFFY_OFFSET;
> }

Sorry, I confused your setting:

+#define fw_status_wait_timeout(fw_st, long)0

for replacing the abvove firmware_loading_timeout() to be 0 when
!CONFIG_FW_LOADER_USER_HELPER 

> > As Daniel has it when CONFIG_FW_LOADER_USER_HELPER is enabled we do have a
> > timeout and that just ensures we schedule_timeout(timeout) and as such can 
> > run
> > in the loop for usermodehelper_read_trylock() trying to see if the
> > usermodehelper_disabled has finally flipped to UMH_ENABLED, after which we'd
> > grant entry, say you can pass Go, and collect $200.
> > 
> > Since we use schedule_timeout(0) for !CONFIG_FW_LOADER_USER_HELPER it means
> > we *always* break out of the loop immediately, and return 0. If we are
> > in !CONFIG_FW_LOADER_USER_HELPER and the usermodehelper_disabled is set
> > to UMH_ENABLED we *immediately* bail and return 0 as well.
> 
> The mainline version will use 60s timeout for the usermode helper
> for !CONFIG_FW_LOADER_USER_HELPER. In case of the 
> CONFIG_FW_LOADER_USER_HELPER we should be the range of 1 * HZ to
> MAX_JIFFY_OFFSET. I say should because we don't check if
> 'loading_timeout * HZ' overflows. 

Indeed.

> > So effectively this code:
> > 
> > _request_firmware()
> > {
> > ...
> > timeout = usermodehelper_read_lock_wait(timeout);   
> > 
> > if (!timeout) { 
> > 
> > dev_dbg(device, "firmware: %s loading timed out\n", 
> > 
> > name);  
> > 
> > ret = -EBUSY;   
> > 
> > goto out;   
> > 
> > }   
> > ...
> > }
> > 
> > Will always returns -EBUSY if timeout is 0.
> 
> But timeout is never 0 unless I got that wrong. 

No indeed, my mistake.

> > Our current default is 60 even for !CONFIG_FW_LOADER_USER_HELPER kernels, in
> > that case what this code does is only on FW_OPT_NOWAIT calls
> > (request_firmware_nowait()) it will wait for 60 seconds for the freezer to
> > clear.
> > 
> > In Daniel's code right now this would change the default timeout to 0,
> 
> No, it shouldn't. With this patch 1, the usermode helper code
> is not involved anymore, that's the main difference.

Indeed. I meant to say that if the timeout was 0 it would not be effectively
for using the umh lock for what we though we wanted before. As you clarify
though your changes do not set the default timeout to 0 though.

> > but that
> > would effectively break all request_firmware_nowait() calls. If we want to
> > keep the same functionality we'd have to keep the default timeout set to 60
> > then for !CONFIG_FW_LOADER_USER_HELPER.
> > 
> > For the non-FW_OPT_NOWAIT (regular sync request_firmware() calls) cases we 
> > have:
> > 
> > _request_firmware()
> > {
> > ...
> > ret = usermodehelper_read_trylock();
> > 
> > if (WARN_ON(ret)) {  

Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-09 Thread Luis R. Rodriguez
On Fri, Sep 09, 2016 at 12:19:38PM +0800, Ming Lei wrote:
> On Fri, Sep 9, 2016 at 11:39 AM, Luis R. Rodriguez  wrote:
> > On Sep 8, 2016 6:22 PM, "Ming Lei"  wrote:
> >>
> >> On Fri, Sep 9, 2016 at 4:11 AM, Luis R. Rodriguez 
> >> wrote:
> >> > On Thu, Sep 08, 2016 at 11:37:54PM +0800, Ming Lei wrote:
> >> >> On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner  wrote:
> >> >> > From: Daniel Wagner 
> >> >> >
> >> >> > When we load the firmware directly we don't need to take the umh
> >> >> > lock.
> >> >>
> >> >> I am wondering if it can be wrong.
> >> >
> >> > If you disable the firmware UMH why would we need to lock if the lock is
> >> > being
> >> > shown only used for the firmare UMH ?
> >> >
> >> >> Actually in case of firmware loading, the usermode helper lock doesn't
> >> >> only mean the user helper is usable, and it also may serve to mark the
> >> >> filesystem/block device is ready for firmware loading, and of couse
> >> >> direct
> >> >> loading need fs/block to be ready too.
> >> >
> >> > Yes but that's a race I've identified a while ago present even if you
> >> > use initramfs *and*
> >> > use kernel_read_file_from_path() on any part of the kernel [0], I
> >> > proposed a possible
> >>
> >> Actualy I mean the situation of suspend vs. resume, and some drivers
> >> still may not benefit from firmware loading cache when requesting loading
> >> in .resume(), at that time it is still too early for direct loading.
> >> With UMH lock,
> >> we can get a warning or avoid the issue.
> >
> > Agreed, but that would seem odd and perhaps misleading to have a try lock
> > for UMH when no firmware UMH code is enabled. This should probably made
> > clear in comments for now as to why we have it then and we should just mark
> 
> That is very helpful, :-)

Upon a closer look at how we use usermodehelper_read_trylock() and
usermodehelper_read_lock_wait() for the non-CONFIG_FW_LOADER_USER_HELPER 
kernels,
it seems we have to change things a bit more if we wanted to keep this *and*
I've personally come to the determination we can stuff all this crap under
CONFIG_FW_LOADER_USER_HELPER due to the firmware cache. Here's why:

As Daniel has it, the usermodehelper_read_lock_wait() case this unfolds into a:

schedule_timeout(0) when the usermodehelper_disabled is true, since after
the freezer was introduced this can be any of these values:

enum umh_disable_depth {
UMH_ENABLED = 0,
UMH_FREEZING,   
UMH_DISABLED,   
};

It means we schedule_timeout(0) anytime usermodehelper_disabled is either
UMH_FREEZING or UMH_DISABLED. Contrary to usermodehelper_read_trylock() it
will also *not* call try_to_freeze().

Given how we setup timer for this:

expire = timeout + jiffies; 

setup_timer_on_stack(, process_timeout, (unsigned long)current);  
__mod_timer(, expire, false); 
schedule(); 
del_singleshot_timer_sync();

We will naturally *always* timeout on schedule_timeout(0), so this return 0
always.

As Daniel has it when CONFIG_FW_LOADER_USER_HELPER is enabled we do have a
timeout and that just ensures we schedule_timeout(timeout) and as such can run
in the loop for usermodehelper_read_trylock() trying to see if the
usermodehelper_disabled has finally flipped to UMH_ENABLED, after which we'd
grant entry, say you can pass Go, and collect $200.

Since we use schedule_timeout(0) for !CONFIG_FW_LOADER_USER_HELPER it means
we *always* break out of the loop immediately, and return 0. If we are
in !CONFIG_FW_LOADER_USER_HELPER and the usermodehelper_disabled is set
to UMH_ENABLED we *immediately* bail and return 0 as well.

So effectively this code:

_request_firmware()
{
...
timeout = usermodehelper_read_lock_wait(timeout);   
if (!timeout) { 
dev_dbg(device, "firmware: %s loading timed out\n", 
name);  
ret = -EBUSY;   
goto out;   
}   
...
}

Will always returns -EBUSY if timeout is 0.

Our current default is 60 even for !CONFIG_FW_LOADER_USER_HELPER kernels, in
that case what this code does is only on FW_OPT_NOWAIT calls
(request_firmware_nowait()) it will wait for 60 seconds for the 

Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-09 Thread Luis R. Rodriguez
On Fri, Sep 09, 2016 at 12:19:38PM +0800, Ming Lei wrote:
> On Fri, Sep 9, 2016 at 11:39 AM, Luis R. Rodriguez  wrote:
> > On Sep 8, 2016 6:22 PM, "Ming Lei"  wrote:
> >>
> >> On Fri, Sep 9, 2016 at 4:11 AM, Luis R. Rodriguez 
> >> wrote:
> >> > On Thu, Sep 08, 2016 at 11:37:54PM +0800, Ming Lei wrote:
> >> >> On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner  wrote:
> >> >> > From: Daniel Wagner 
> >> >> >
> >> >> > When we load the firmware directly we don't need to take the umh
> >> >> > lock.
> >> >>
> >> >> I am wondering if it can be wrong.
> >> >
> >> > If you disable the firmware UMH why would we need to lock if the lock is
> >> > being
> >> > shown only used for the firmare UMH ?
> >> >
> >> >> Actually in case of firmware loading, the usermode helper lock doesn't
> >> >> only mean the user helper is usable, and it also may serve to mark the
> >> >> filesystem/block device is ready for firmware loading, and of couse
> >> >> direct
> >> >> loading need fs/block to be ready too.
> >> >
> >> > Yes but that's a race I've identified a while ago present even if you
> >> > use initramfs *and*
> >> > use kernel_read_file_from_path() on any part of the kernel [0], I
> >> > proposed a possible
> >>
> >> Actualy I mean the situation of suspend vs. resume, and some drivers
> >> still may not benefit from firmware loading cache when requesting loading
> >> in .resume(), at that time it is still too early for direct loading.
> >> With UMH lock,
> >> we can get a warning or avoid the issue.
> >
> > Agreed, but that would seem odd and perhaps misleading to have a try lock
> > for UMH when no firmware UMH code is enabled. This should probably made
> > clear in comments for now as to why we have it then and we should just mark
> 
> That is very helpful, :-)

Upon a closer look at how we use usermodehelper_read_trylock() and
usermodehelper_read_lock_wait() for the non-CONFIG_FW_LOADER_USER_HELPER 
kernels,
it seems we have to change things a bit more if we wanted to keep this *and*
I've personally come to the determination we can stuff all this crap under
CONFIG_FW_LOADER_USER_HELPER due to the firmware cache. Here's why:

As Daniel has it, the usermodehelper_read_lock_wait() case this unfolds into a:

schedule_timeout(0) when the usermodehelper_disabled is true, since after
the freezer was introduced this can be any of these values:

enum umh_disable_depth {
UMH_ENABLED = 0,
UMH_FREEZING,   
UMH_DISABLED,   
};

It means we schedule_timeout(0) anytime usermodehelper_disabled is either
UMH_FREEZING or UMH_DISABLED. Contrary to usermodehelper_read_trylock() it
will also *not* call try_to_freeze().

Given how we setup timer for this:

expire = timeout + jiffies; 

setup_timer_on_stack(, process_timeout, (unsigned long)current);  
__mod_timer(, expire, false); 
schedule(); 
del_singleshot_timer_sync();

We will naturally *always* timeout on schedule_timeout(0), so this return 0
always.

As Daniel has it when CONFIG_FW_LOADER_USER_HELPER is enabled we do have a
timeout and that just ensures we schedule_timeout(timeout) and as such can run
in the loop for usermodehelper_read_trylock() trying to see if the
usermodehelper_disabled has finally flipped to UMH_ENABLED, after which we'd
grant entry, say you can pass Go, and collect $200.

Since we use schedule_timeout(0) for !CONFIG_FW_LOADER_USER_HELPER it means
we *always* break out of the loop immediately, and return 0. If we are
in !CONFIG_FW_LOADER_USER_HELPER and the usermodehelper_disabled is set
to UMH_ENABLED we *immediately* bail and return 0 as well.

So effectively this code:

_request_firmware()
{
...
timeout = usermodehelper_read_lock_wait(timeout);   
if (!timeout) { 
dev_dbg(device, "firmware: %s loading timed out\n", 
name);  
ret = -EBUSY;   
goto out;   
}   
...
}

Will always returns -EBUSY if timeout is 0.

Our current default is 60 even for !CONFIG_FW_LOADER_USER_HELPER kernels, in
that case what this code does is only on FW_OPT_NOWAIT calls
(request_firmware_nowait()) it will wait for 60 seconds for the freezer to
clear.

In Daniel's code right now this would change the default timeout to 0, but that
would 

Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Ming Lei
On Fri, Sep 9, 2016 at 11:39 AM, Luis R. Rodriguez  wrote:
> On Sep 8, 2016 6:22 PM, "Ming Lei"  wrote:
>>
>> On Fri, Sep 9, 2016 at 4:11 AM, Luis R. Rodriguez 
>> wrote:
>> > On Thu, Sep 08, 2016 at 11:37:54PM +0800, Ming Lei wrote:
>> >> On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner  wrote:
>> >> > From: Daniel Wagner 
>> >> >
>> >> > When we load the firmware directly we don't need to take the umh
>> >> > lock.
>> >>
>> >> I am wondering if it can be wrong.
>> >
>> > If you disable the firmware UMH why would we need to lock if the lock is
>> > being
>> > shown only used for the firmare UMH ?
>> >
>> >> Actually in case of firmware loading, the usermode helper lock doesn't
>> >> only mean the user helper is usable, and it also may serve to mark the
>> >> filesystem/block device is ready for firmware loading, and of couse
>> >> direct
>> >> loading need fs/block to be ready too.
>> >
>> > Yes but that's a race I've identified a while ago present even if you
>> > use initramfs *and*
>> > use kernel_read_file_from_path() on any part of the kernel [0], I
>> > proposed a possible
>>
>> Actualy I mean the situation of suspend vs. resume, and some drivers
>> still may not benefit from firmware loading cache when requesting loading
>> in .resume(), at that time it is still too early for direct loading.
>> With UMH lock,
>> we can get a warning or avoid the issue.
>
> Agreed, but that would seem odd and perhaps misleading to have a try lock
> for UMH when no firmware UMH code is enabled. This should probably made
> clear in comments for now as to why we have it then and we should just mark

That is very helpful, :-)

> a TODO item to generalize this to a common freezer check. Surprised we don't
> have one yet. Rafael ?
>
>   Luis


Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Ming Lei
On Fri, Sep 9, 2016 at 11:39 AM, Luis R. Rodriguez  wrote:
> On Sep 8, 2016 6:22 PM, "Ming Lei"  wrote:
>>
>> On Fri, Sep 9, 2016 at 4:11 AM, Luis R. Rodriguez 
>> wrote:
>> > On Thu, Sep 08, 2016 at 11:37:54PM +0800, Ming Lei wrote:
>> >> On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner  wrote:
>> >> > From: Daniel Wagner 
>> >> >
>> >> > When we load the firmware directly we don't need to take the umh
>> >> > lock.
>> >>
>> >> I am wondering if it can be wrong.
>> >
>> > If you disable the firmware UMH why would we need to lock if the lock is
>> > being
>> > shown only used for the firmare UMH ?
>> >
>> >> Actually in case of firmware loading, the usermode helper lock doesn't
>> >> only mean the user helper is usable, and it also may serve to mark the
>> >> filesystem/block device is ready for firmware loading, and of couse
>> >> direct
>> >> loading need fs/block to be ready too.
>> >
>> > Yes but that's a race I've identified a while ago present even if you
>> > use initramfs *and*
>> > use kernel_read_file_from_path() on any part of the kernel [0], I
>> > proposed a possible
>>
>> Actualy I mean the situation of suspend vs. resume, and some drivers
>> still may not benefit from firmware loading cache when requesting loading
>> in .resume(), at that time it is still too early for direct loading.
>> With UMH lock,
>> we can get a warning or avoid the issue.
>
> Agreed, but that would seem odd and perhaps misleading to have a try lock
> for UMH when no firmware UMH code is enabled. This should probably made
> clear in comments for now as to why we have it then and we should just mark

That is very helpful, :-)

> a TODO item to generalize this to a common freezer check. Surprised we don't
> have one yet. Rafael ?
>
>   Luis


Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Ming Lei
On Fri, Sep 9, 2016 at 4:11 AM, Luis R. Rodriguez  wrote:
> On Thu, Sep 08, 2016 at 11:37:54PM +0800, Ming Lei wrote:
>> On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner  wrote:
>> > From: Daniel Wagner 
>> >
>> > When we load the firmware directly we don't need to take the umh
>> > lock.
>>
>> I am wondering if it can be wrong.
>
> If you disable the firmware UMH why would we need to lock if the lock is being
> shown only used for the firmare UMH ?
>
>> Actually in case of firmware loading, the usermode helper lock doesn't
>> only mean the user helper is usable, and it also may serve to mark the
>> filesystem/block device is ready for firmware loading, and of couse direct
>> loading need fs/block to be ready too.
>
> Yes but that's a race I've identified a while ago present even if you use 
> initramfs *and*
> use kernel_read_file_from_path() on any part of the kernel [0], I proposed a 
> possible

Actualy I mean the situation of suspend vs. resume, and some drivers
still may not benefit from firmware loading cache when requesting loading
in .resume(), at that time it is still too early for direct loading.
With UMH lock,
we can get a warning or avoid the issue.


> solution recently [1], given tons of folks were *complaining* about this but 
> despite there
> being one solution proposed [2] it was still incorrect, as only *userspace* 
> can know

In case of initramfs and built-in drivers, it isn't a surprise to see this race,
and that shouldn't be a common case. But for suspend vs. resume, it can be
true.


> for sure when your critical filesystems are mounted. Since we now have a 
> shared
> "read file fs" helper kernel_read_file_from_path() it implicates the race is 
> possible
> elsewhere as well.
>
> The race is present given you simply cannot know when the root filesystem
> (consider a series of pivot_root() calls, hey -- anyone can do that, who are 
> we to
> tell them they can't), or in this particular case importance, only 
> considering firmware,
> when /lib/firmware/ is ready.  In short I am saying this whole race thing is a
> bigger issue, and as much as Linus hates my proposed solution I have not heard
> any proposal alternatives to address this properly, not even clarifications on
> what our expectations for drivers then are if they use 
> kernel_read_file_from_path()
> early on their driver code.
>
> The original goal of the usermode helper lock came can be traced back in
> time via a144c6a ("PM: Print a warning if firmware is requested when task
> are frozen) when Rafael noticed drivers were calling to request firmware
> on *resume* calls! Why would that be an issue? It was because of the stupid
> delays incurred on resume when firmware *was not found* __due__ to the stupid
> user mode helper timeout delay and people believing this often meant users
> confusing resume *stalling* for resume was *broken! Later b298d289c7
> ("PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()")
> addressed races. That code in turn was later massaged into shape to better
> accept direct FS loading via commit 4e0c92d015235 ("firmware: Refactoring for
> splitting user-mode helper code").
>
> So for a while we've been nagging driver developers to not add 
> request_firmware()
> calls on resume calls. In fact the drivers/base/firmware_class.c code *kills*
> firmware UMH calls when we go to suspend for the firmware cache, as such even
> on suspend its a stupid idea to use the UMH, not only because it can
> stall suspend but *also* because we kill these UMH calls. Its why I recently
> proposed removing the possibility to call the firmware UMH from the firmware
> cache [3]. If this patch is wrong please do chime in!
>
> So, as I see it the user mode helper lock should have *nothing* to do with
> the race between reading files from the kernel and having a path ready. If
> it was *used* for that, that was papering over the real issue. Its no
> different of a hack for instance as if a driver using request_firmware() tried
> to use async probe to avoid the same race. Yes it will help, but no, it does
> not mean it is deterministically safe.
>
> Reviewing commit 247bc03742545 ("PM / Sleep: Mitigate race between the freezer
> and request_firmware()") which originally extended umh state machine from
> just being enabled/disabled, with the concepts of UMH_ENABLED, UMH_FREEZING,
> UMH_DISABLED -- its goal was to prevent UMH uses during suspend. So -- the
> "UMH lock" on firmware was actually added to help avoid races between freezing
> and request_firmware(). We should not re-use UMH status notifiers when the
> firmware UMH is disabled for the same concepts -- if we needed such a concept
> then we should take this out from UMH code and generalize it.
>
> In fact this shows there's still an issue here for UMH code as well. The
> usermodehelper_enable() call
>
> rest_init() --> kernel_thread(kernel_init, NULL, CLONE_FS); -->
> kernel_init 

Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Ming Lei
On Fri, Sep 9, 2016 at 4:11 AM, Luis R. Rodriguez  wrote:
> On Thu, Sep 08, 2016 at 11:37:54PM +0800, Ming Lei wrote:
>> On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner  wrote:
>> > From: Daniel Wagner 
>> >
>> > When we load the firmware directly we don't need to take the umh
>> > lock.
>>
>> I am wondering if it can be wrong.
>
> If you disable the firmware UMH why would we need to lock if the lock is being
> shown only used for the firmare UMH ?
>
>> Actually in case of firmware loading, the usermode helper lock doesn't
>> only mean the user helper is usable, and it also may serve to mark the
>> filesystem/block device is ready for firmware loading, and of couse direct
>> loading need fs/block to be ready too.
>
> Yes but that's a race I've identified a while ago present even if you use 
> initramfs *and*
> use kernel_read_file_from_path() on any part of the kernel [0], I proposed a 
> possible

Actualy I mean the situation of suspend vs. resume, and some drivers
still may not benefit from firmware loading cache when requesting loading
in .resume(), at that time it is still too early for direct loading.
With UMH lock,
we can get a warning or avoid the issue.


> solution recently [1], given tons of folks were *complaining* about this but 
> despite there
> being one solution proposed [2] it was still incorrect, as only *userspace* 
> can know

In case of initramfs and built-in drivers, it isn't a surprise to see this race,
and that shouldn't be a common case. But for suspend vs. resume, it can be
true.


> for sure when your critical filesystems are mounted. Since we now have a 
> shared
> "read file fs" helper kernel_read_file_from_path() it implicates the race is 
> possible
> elsewhere as well.
>
> The race is present given you simply cannot know when the root filesystem
> (consider a series of pivot_root() calls, hey -- anyone can do that, who are 
> we to
> tell them they can't), or in this particular case importance, only 
> considering firmware,
> when /lib/firmware/ is ready.  In short I am saying this whole race thing is a
> bigger issue, and as much as Linus hates my proposed solution I have not heard
> any proposal alternatives to address this properly, not even clarifications on
> what our expectations for drivers then are if they use 
> kernel_read_file_from_path()
> early on their driver code.
>
> The original goal of the usermode helper lock came can be traced back in
> time via a144c6a ("PM: Print a warning if firmware is requested when task
> are frozen) when Rafael noticed drivers were calling to request firmware
> on *resume* calls! Why would that be an issue? It was because of the stupid
> delays incurred on resume when firmware *was not found* __due__ to the stupid
> user mode helper timeout delay and people believing this often meant users
> confusing resume *stalling* for resume was *broken! Later b298d289c7
> ("PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()")
> addressed races. That code in turn was later massaged into shape to better
> accept direct FS loading via commit 4e0c92d015235 ("firmware: Refactoring for
> splitting user-mode helper code").
>
> So for a while we've been nagging driver developers to not add 
> request_firmware()
> calls on resume calls. In fact the drivers/base/firmware_class.c code *kills*
> firmware UMH calls when we go to suspend for the firmware cache, as such even
> on suspend its a stupid idea to use the UMH, not only because it can
> stall suspend but *also* because we kill these UMH calls. Its why I recently
> proposed removing the possibility to call the firmware UMH from the firmware
> cache [3]. If this patch is wrong please do chime in!
>
> So, as I see it the user mode helper lock should have *nothing* to do with
> the race between reading files from the kernel and having a path ready. If
> it was *used* for that, that was papering over the real issue. Its no
> different of a hack for instance as if a driver using request_firmware() tried
> to use async probe to avoid the same race. Yes it will help, but no, it does
> not mean it is deterministically safe.
>
> Reviewing commit 247bc03742545 ("PM / Sleep: Mitigate race between the freezer
> and request_firmware()") which originally extended umh state machine from
> just being enabled/disabled, with the concepts of UMH_ENABLED, UMH_FREEZING,
> UMH_DISABLED -- its goal was to prevent UMH uses during suspend. So -- the
> "UMH lock" on firmware was actually added to help avoid races between freezing
> and request_firmware(). We should not re-use UMH status notifiers when the
> firmware UMH is disabled for the same concepts -- if we needed such a concept
> then we should take this out from UMH code and generalize it.
>
> In fact this shows there's still an issue here for UMH code as well. The
> usermodehelper_enable() call
>
> rest_init() --> kernel_thread(kernel_init, NULL, CLONE_FS); -->
> kernel_init --> kernel_init_freeable() --> 
> wait_for_completion(_done);

Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Luis R. Rodriguez
On Thu, Sep 08, 2016 at 11:37:54PM +0800, Ming Lei wrote:
> On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner  wrote:
> > From: Daniel Wagner 
> >
> > When we load the firmware directly we don't need to take the umh
> > lock.
> 
> I am wondering if it can be wrong.

If you disable the firmware UMH why would we need to lock if the lock is being
shown only used for the firmare UMH ?

> Actually in case of firmware loading, the usermode helper lock doesn't
> only mean the user helper is usable, and it also may serve to mark the
> filesystem/block device is ready for firmware loading, and of couse direct
> loading need fs/block to be ready too.

Yes but that's a race I've identified a while ago present even if you use 
initramfs *and*
use kernel_read_file_from_path() on any part of the kernel [0], I proposed a 
possible
solution recently [1], given tons of folks were *complaining* about this but 
despite there
being one solution proposed [2] it was still incorrect, as only *userspace* can 
know
for sure when your critical filesystems are mounted. Since we now have a shared
"read file fs" helper kernel_read_file_from_path() it implicates the race is 
possible
elsewhere as well.

The race is present given you simply cannot know when the root filesystem
(consider a series of pivot_root() calls, hey -- anyone can do that, who are we 
to
tell them they can't), or in this particular case importance, only considering 
firmware,
when /lib/firmware/ is ready.  In short I am saying this whole race thing is a
bigger issue, and as much as Linus hates my proposed solution I have not heard
any proposal alternatives to address this properly, not even clarifications on
what our expectations for drivers then are if they use 
kernel_read_file_from_path()
early on their driver code.

The original goal of the usermode helper lock came can be traced back in
time via a144c6a ("PM: Print a warning if firmware is requested when task
are frozen) when Rafael noticed drivers were calling to request firmware
on *resume* calls! Why would that be an issue? It was because of the stupid
delays incurred on resume when firmware *was not found* __due__ to the stupid
user mode helper timeout delay and people believing this often meant users
confusing resume *stalling* for resume was *broken! Later b298d289c7
("PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()")
addressed races. That code in turn was later massaged into shape to better
accept direct FS loading via commit 4e0c92d015235 ("firmware: Refactoring for
splitting user-mode helper code").

So for a while we've been nagging driver developers to not add 
request_firmware()
calls on resume calls. In fact the drivers/base/firmware_class.c code *kills*
firmware UMH calls when we go to suspend for the firmware cache, as such even
on suspend its a stupid idea to use the UMH, not only because it can
stall suspend but *also* because we kill these UMH calls. Its why I recently
proposed removing the possibility to call the firmware UMH from the firmware
cache [3]. If this patch is wrong please do chime in!

So, as I see it the user mode helper lock should have *nothing* to do with
the race between reading files from the kernel and having a path ready. If
it was *used* for that, that was papering over the real issue. Its no
different of a hack for instance as if a driver using request_firmware() tried
to use async probe to avoid the same race. Yes it will help, but no, it does
not mean it is deterministically safe.

Reviewing commit 247bc03742545 ("PM / Sleep: Mitigate race between the freezer
and request_firmware()") which originally extended umh state machine from
just being enabled/disabled, with the concepts of UMH_ENABLED, UMH_FREEZING,
UMH_DISABLED -- its goal was to prevent UMH uses during suspend. So -- the
"UMH lock" on firmware was actually added to help avoid races between freezing
and request_firmware(). We should not re-use UMH status notifiers when the
firmware UMH is disabled for the same concepts -- if we needed such a concept
then we should take this out from UMH code and generalize it.

In fact this shows there's still an issue here for UMH code as well. The
usermodehelper_enable() call 

rest_init() --> kernel_thread(kernel_init, NULL, CLONE_FS); -->
kernel_init --> kernel_init_freeable() --> 
wait_for_completion(_done);

So after kernel_init_freeable() kthreadd_done should be done but note that
only userspace will know what partition set up to use. Shortly after
this though in kernel_init_freeable() we call prepare_namespace() so if
initramfs was set up its set up after.

Meanwhile driver's inits are called via do_initcalls() called from 
do_basic_setup()
and that is called within kernel_init_freeable() *prior* to prepare_namespace().

So calling usermodehelper_enable() doesn't necessarily ensure that the paths
for the UMH used will actually work either. This path also has a race.

We need to address both 

Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Luis R. Rodriguez
On Thu, Sep 08, 2016 at 11:37:54PM +0800, Ming Lei wrote:
> On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner  wrote:
> > From: Daniel Wagner 
> >
> > When we load the firmware directly we don't need to take the umh
> > lock.
> 
> I am wondering if it can be wrong.

If you disable the firmware UMH why would we need to lock if the lock is being
shown only used for the firmare UMH ?

> Actually in case of firmware loading, the usermode helper lock doesn't
> only mean the user helper is usable, and it also may serve to mark the
> filesystem/block device is ready for firmware loading, and of couse direct
> loading need fs/block to be ready too.

Yes but that's a race I've identified a while ago present even if you use 
initramfs *and*
use kernel_read_file_from_path() on any part of the kernel [0], I proposed a 
possible
solution recently [1], given tons of folks were *complaining* about this but 
despite there
being one solution proposed [2] it was still incorrect, as only *userspace* can 
know
for sure when your critical filesystems are mounted. Since we now have a shared
"read file fs" helper kernel_read_file_from_path() it implicates the race is 
possible
elsewhere as well.

The race is present given you simply cannot know when the root filesystem
(consider a series of pivot_root() calls, hey -- anyone can do that, who are we 
to
tell them they can't), or in this particular case importance, only considering 
firmware,
when /lib/firmware/ is ready.  In short I am saying this whole race thing is a
bigger issue, and as much as Linus hates my proposed solution I have not heard
any proposal alternatives to address this properly, not even clarifications on
what our expectations for drivers then are if they use 
kernel_read_file_from_path()
early on their driver code.

The original goal of the usermode helper lock came can be traced back in
time via a144c6a ("PM: Print a warning if firmware is requested when task
are frozen) when Rafael noticed drivers were calling to request firmware
on *resume* calls! Why would that be an issue? It was because of the stupid
delays incurred on resume when firmware *was not found* __due__ to the stupid
user mode helper timeout delay and people believing this often meant users
confusing resume *stalling* for resume was *broken! Later b298d289c7
("PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()")
addressed races. That code in turn was later massaged into shape to better
accept direct FS loading via commit 4e0c92d015235 ("firmware: Refactoring for
splitting user-mode helper code").

So for a while we've been nagging driver developers to not add 
request_firmware()
calls on resume calls. In fact the drivers/base/firmware_class.c code *kills*
firmware UMH calls when we go to suspend for the firmware cache, as such even
on suspend its a stupid idea to use the UMH, not only because it can
stall suspend but *also* because we kill these UMH calls. Its why I recently
proposed removing the possibility to call the firmware UMH from the firmware
cache [3]. If this patch is wrong please do chime in!

So, as I see it the user mode helper lock should have *nothing* to do with
the race between reading files from the kernel and having a path ready. If
it was *used* for that, that was papering over the real issue. Its no
different of a hack for instance as if a driver using request_firmware() tried
to use async probe to avoid the same race. Yes it will help, but no, it does
not mean it is deterministically safe.

Reviewing commit 247bc03742545 ("PM / Sleep: Mitigate race between the freezer
and request_firmware()") which originally extended umh state machine from
just being enabled/disabled, with the concepts of UMH_ENABLED, UMH_FREEZING,
UMH_DISABLED -- its goal was to prevent UMH uses during suspend. So -- the
"UMH lock" on firmware was actually added to help avoid races between freezing
and request_firmware(). We should not re-use UMH status notifiers when the
firmware UMH is disabled for the same concepts -- if we needed such a concept
then we should take this out from UMH code and generalize it.

In fact this shows there's still an issue here for UMH code as well. The
usermodehelper_enable() call 

rest_init() --> kernel_thread(kernel_init, NULL, CLONE_FS); -->
kernel_init --> kernel_init_freeable() --> 
wait_for_completion(_done);

So after kernel_init_freeable() kthreadd_done should be done but note that
only userspace will know what partition set up to use. Shortly after
this though in kernel_init_freeable() we call prepare_namespace() so if
initramfs was set up its set up after.

Meanwhile driver's inits are called via do_initcalls() called from 
do_basic_setup()
and that is called within kernel_init_freeable() *prior* to prepare_namespace().

So calling usermodehelper_enable() doesn't necessarily ensure that the paths
for the UMH used will actually work either. This path also has a race.

We need to address both things then:

  1) *freeze* races / stalls
  2) 

Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Ming Lei
On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner  wrote:
> From: Daniel Wagner 
>
> When we load the firmware directly we don't need to take the umh
> lock.

I am wondering if it can be wrong.

Actually in case of firmware loading, the usermode helper lock doesn't
only mean the user helper is usable, and it also may serve to mark the
filesystem/block device is ready for firmware loading, and of couse direct
loading need fs/block to be ready too.

> So move this part inside fw_load_from_user_helper which is only
> available when CONFIG_FW_LOADER_USER_HELPER is set.
>
> This avoids a dependency on firmware_loading_timeout() even when
> !CONFIG_FW_LOADER_UER_HELPER.
>
> The usermodehelper locking code was added by b298d289c792 ("PM / Sleep:
> Fix freezer failures due to racy usermodehelper_is_disabled()").
>
> Signed-off-by: Daniel Wagner 
> Cc: Ming Lei 
> Cc: Luis R. Rodriguez 
> Cc: Greg Kroah-Hartman 
> ---
>  drivers/base/firmware_class.c | 52 
> +++
>  1 file changed, 28 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 960f8f7..d4fee06 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -981,13 +981,38 @@ static int fw_load_from_user_helper(struct firmware 
> *firmware,
> unsigned int opt_flags, long timeout)
>  {
> struct firmware_priv *fw_priv;
> +   int ret;
> +
> +   timeout = firmware_loading_timeout();
> +   if (opt_flags & FW_OPT_NOWAIT) {
> +   timeout = usermodehelper_read_lock_wait(timeout);
> +   if (!timeout) {
> +   dev_dbg(device, "firmware: %s loading timed out\n",
> +   name);
> +   return -EBUSY;
> +   }
> +   } else {
> +   ret = usermodehelper_read_trylock();
> +   if (WARN_ON(ret)) {
> +   dev_err(device, "firmware: %s will not be loaded\n",
> +   name);
> +   return ret;
> +   }
> +   }
>
> fw_priv = fw_create_instance(firmware, name, device, opt_flags);
> -   if (IS_ERR(fw_priv))
> -   return PTR_ERR(fw_priv);
> +   if (IS_ERR(fw_priv)) {
> +   ret = PTR_ERR(fw_priv);
> +   goto release_lock;
> +   }
>
> fw_priv->buf = firmware->priv;
> -   return _request_firmware_load(fw_priv, opt_flags, timeout);
> +   ret = _request_firmware_load(fw_priv, opt_flags, timeout);
> +
> +release_lock:
> +   usermodehelper_read_unlock();
> +
> +   return ret;
>  }
>
>  #ifdef CONFIG_PM_SLEEP
> @@ -1150,25 +1175,6 @@ _request_firmware(const struct firmware **firmware_p, 
> const char *name,
> if (ret <= 0) /* error or already assigned */
> goto out;
>
> -   ret = 0;
> -   timeout = firmware_loading_timeout();
> -   if (opt_flags & FW_OPT_NOWAIT) {
> -   timeout = usermodehelper_read_lock_wait(timeout);
> -   if (!timeout) {
> -   dev_dbg(device, "firmware: %s loading timed out\n",
> -   name);
> -   ret = -EBUSY;
> -   goto out;
> -   }
> -   } else {
> -   ret = usermodehelper_read_trylock();
> -   if (WARN_ON(ret)) {
> -   dev_err(device, "firmware: %s will not be loaded\n",
> -   name);
> -   goto out;
> -   }
> -   }
> -
> ret = fw_get_filesystem_firmware(device, fw->priv);
> if (ret) {
> if (!(opt_flags & FW_OPT_NO_WARN))
> @@ -1185,8 +1191,6 @@ _request_firmware(const struct firmware **firmware_p, 
> const char *name,
> if (!ret)
> ret = assign_firmware_buf(fw, device, opt_flags);
>
> -   usermodehelper_read_unlock();
> -
>   out:
> if (ret < 0) {
> release_firmware(fw);
> --
> 2.7.4


Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Ming Lei
On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner  wrote:
> From: Daniel Wagner 
>
> When we load the firmware directly we don't need to take the umh
> lock.

I am wondering if it can be wrong.

Actually in case of firmware loading, the usermode helper lock doesn't
only mean the user helper is usable, and it also may serve to mark the
filesystem/block device is ready for firmware loading, and of couse direct
loading need fs/block to be ready too.

> So move this part inside fw_load_from_user_helper which is only
> available when CONFIG_FW_LOADER_USER_HELPER is set.
>
> This avoids a dependency on firmware_loading_timeout() even when
> !CONFIG_FW_LOADER_UER_HELPER.
>
> The usermodehelper locking code was added by b298d289c792 ("PM / Sleep:
> Fix freezer failures due to racy usermodehelper_is_disabled()").
>
> Signed-off-by: Daniel Wagner 
> Cc: Ming Lei 
> Cc: Luis R. Rodriguez 
> Cc: Greg Kroah-Hartman 
> ---
>  drivers/base/firmware_class.c | 52 
> +++
>  1 file changed, 28 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 960f8f7..d4fee06 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -981,13 +981,38 @@ static int fw_load_from_user_helper(struct firmware 
> *firmware,
> unsigned int opt_flags, long timeout)
>  {
> struct firmware_priv *fw_priv;
> +   int ret;
> +
> +   timeout = firmware_loading_timeout();
> +   if (opt_flags & FW_OPT_NOWAIT) {
> +   timeout = usermodehelper_read_lock_wait(timeout);
> +   if (!timeout) {
> +   dev_dbg(device, "firmware: %s loading timed out\n",
> +   name);
> +   return -EBUSY;
> +   }
> +   } else {
> +   ret = usermodehelper_read_trylock();
> +   if (WARN_ON(ret)) {
> +   dev_err(device, "firmware: %s will not be loaded\n",
> +   name);
> +   return ret;
> +   }
> +   }
>
> fw_priv = fw_create_instance(firmware, name, device, opt_flags);
> -   if (IS_ERR(fw_priv))
> -   return PTR_ERR(fw_priv);
> +   if (IS_ERR(fw_priv)) {
> +   ret = PTR_ERR(fw_priv);
> +   goto release_lock;
> +   }
>
> fw_priv->buf = firmware->priv;
> -   return _request_firmware_load(fw_priv, opt_flags, timeout);
> +   ret = _request_firmware_load(fw_priv, opt_flags, timeout);
> +
> +release_lock:
> +   usermodehelper_read_unlock();
> +
> +   return ret;
>  }
>
>  #ifdef CONFIG_PM_SLEEP
> @@ -1150,25 +1175,6 @@ _request_firmware(const struct firmware **firmware_p, 
> const char *name,
> if (ret <= 0) /* error or already assigned */
> goto out;
>
> -   ret = 0;
> -   timeout = firmware_loading_timeout();
> -   if (opt_flags & FW_OPT_NOWAIT) {
> -   timeout = usermodehelper_read_lock_wait(timeout);
> -   if (!timeout) {
> -   dev_dbg(device, "firmware: %s loading timed out\n",
> -   name);
> -   ret = -EBUSY;
> -   goto out;
> -   }
> -   } else {
> -   ret = usermodehelper_read_trylock();
> -   if (WARN_ON(ret)) {
> -   dev_err(device, "firmware: %s will not be loaded\n",
> -   name);
> -   goto out;
> -   }
> -   }
> -
> ret = fw_get_filesystem_firmware(device, fw->priv);
> if (ret) {
> if (!(opt_flags & FW_OPT_NO_WARN))
> @@ -1185,8 +1191,6 @@ _request_firmware(const struct firmware **firmware_p, 
> const char *name,
> if (!ret)
> ret = assign_firmware_buf(fw, device, opt_flags);
>
> -   usermodehelper_read_unlock();
> -
>   out:
> if (ret < 0) {
> release_firmware(fw);
> --
> 2.7.4


Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Luis R. Rodriguez
On Thu, Sep 08, 2016 at 02:41:33PM +0200, Daniel Wagner wrote:
> So far I have it tested with kvm. I'll give it a spin on real
> hardware. Good point.

If you can also include testing with the test_firmware.ko driver
and describe your tests in the cover letter that would be appreciated.
Bonus points if you also indicate you've had 0-day have full success
with your delta. To do that just poke Fenguang that you need testing
for a branch, and then push to your branch. I noted I gave your changes
a test spin and it came back OK, so it should be good, but still, this
is rather good practice for any changes sent out.

  Luis


Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Luis R. Rodriguez
On Thu, Sep 08, 2016 at 02:41:33PM +0200, Daniel Wagner wrote:
> So far I have it tested with kvm. I'll give it a spin on real
> hardware. Good point.

If you can also include testing with the test_firmware.ko driver
and describe your tests in the cover letter that would be appreciated.
Bonus points if you also indicate you've had 0-day have full success
with your delta. To do that just poke Fenguang that you need testing
for a branch, and then push to your branch. I noted I gave your changes
a test spin and it came back OK, so it should be good, but still, this
is rather good practice for any changes sent out.

  Luis


Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Daniel Wagner

On 09/08/2016 01:33 AM, Luis R. Rodriguez wrote:

The usermodehelper locking code was added by b298d289c792 ("PM / Sleep:
Fix freezer failures due to racy usermodehelper_is_disabled()").


Thanks, this helps to give some perspective, I'll note that commit also refers
to commit a144c6a (PM: Print a warning if firmware is requested when tasks are
frozen) by Srivatsa a long time ago which added a warning print if a driver
requested firmware when tasks are frozen. That commit log further clarifies
that the issues is that some drivers erroneously use request_firmware() in
their driver's ->resume() (or ->thaw(), or ->restore()) callbacks, it further
clarifies that is not going to work unless the firmware has been built in.
It did not explain *why* it wouldn't work though. But note it also mentioned
how drivers that do have request_firmware() calls on resume stall resume --
the reason for the stalling is the stupid usermode helper. The kernel now
"fixed" these by returning an error in such cases, it does this by checking
kernel user mode helper is disabled, this is why it would not work. But note
that we should be disabling the usermode helper on suspend too, and likely
the reason we never ran into an issue with the cache stuff is we would fail
if the usermode helper was disabled anyway. This is a long winded way of
saying that these commits further confirm removal of using the usermode helper
from the firmware cache work for suspend/resume.


Okay, so let's finish this round of refactoring first. I prefer going in 
smaller steps and see if there are any regressions with those changes.




Signed-off-by: Daniel Wagner 
Cc: Ming Lei 
Cc: Luis R. Rodriguez 
Cc: Greg Kroah-Hartman 
---
 drivers/base/firmware_class.c | 52 +++
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 960f8f7..d4fee06 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -981,13 +981,38 @@ static int fw_load_from_user_helper(struct firmware 
*firmware,
unsigned int opt_flags, long timeout)
 {
struct firmware_priv *fw_priv;
+   int ret;
+
+   timeout = firmware_loading_timeout();
+   if (opt_flags & FW_OPT_NOWAIT) {
+   timeout = usermodehelper_read_lock_wait(timeout);
+   if (!timeout) {
+   dev_dbg(device, "firmware: %s loading timed out\n",
+   name);
+   return -EBUSY;
+   }
+   } else {
+   ret = usermodehelper_read_trylock();
+   if (WARN_ON(ret)) {
+   dev_err(device, "firmware: %s will not be loaded\n",
+   name);
+   return ret;
+   }
+   }


fw_load_from_user_helper() no longer needs the timeout parameter then.


Updated the patch accordingly.


Given this fact I'll chime in with some other, IMHO cosmetic things for
this series. This however is the just the biggest issue for this series
that I've found. That and testing this at run time didn't boot on my
system, it could be an issue with linux-next next-20160907 booting
on my system, I hadn't tried that yet. I did put your series through
0-day though and it went through fine though.


So far I have it tested with kvm. I'll give it a spin on real hardware. 
Good point.



Since you will need a respin I'd appreciate if you can Cc Takashi,
Bjorn, Daniel Vetter, and Arend van Spriel on these series as some
of them have expressed interest in the umh stuff, so best to get wider
review as well. While at it please Cc Rafael and Srivatsa.


Will do.

cheers,
daniel


Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-08 Thread Daniel Wagner

On 09/08/2016 01:33 AM, Luis R. Rodriguez wrote:

The usermodehelper locking code was added by b298d289c792 ("PM / Sleep:
Fix freezer failures due to racy usermodehelper_is_disabled()").


Thanks, this helps to give some perspective, I'll note that commit also refers
to commit a144c6a (PM: Print a warning if firmware is requested when tasks are
frozen) by Srivatsa a long time ago which added a warning print if a driver
requested firmware when tasks are frozen. That commit log further clarifies
that the issues is that some drivers erroneously use request_firmware() in
their driver's ->resume() (or ->thaw(), or ->restore()) callbacks, it further
clarifies that is not going to work unless the firmware has been built in.
It did not explain *why* it wouldn't work though. But note it also mentioned
how drivers that do have request_firmware() calls on resume stall resume --
the reason for the stalling is the stupid usermode helper. The kernel now
"fixed" these by returning an error in such cases, it does this by checking
kernel user mode helper is disabled, this is why it would not work. But note
that we should be disabling the usermode helper on suspend too, and likely
the reason we never ran into an issue with the cache stuff is we would fail
if the usermode helper was disabled anyway. This is a long winded way of
saying that these commits further confirm removal of using the usermode helper
from the firmware cache work for suspend/resume.


Okay, so let's finish this round of refactoring first. I prefer going in 
smaller steps and see if there are any regressions with those changes.




Signed-off-by: Daniel Wagner 
Cc: Ming Lei 
Cc: Luis R. Rodriguez 
Cc: Greg Kroah-Hartman 
---
 drivers/base/firmware_class.c | 52 +++
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 960f8f7..d4fee06 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -981,13 +981,38 @@ static int fw_load_from_user_helper(struct firmware 
*firmware,
unsigned int opt_flags, long timeout)
 {
struct firmware_priv *fw_priv;
+   int ret;
+
+   timeout = firmware_loading_timeout();
+   if (opt_flags & FW_OPT_NOWAIT) {
+   timeout = usermodehelper_read_lock_wait(timeout);
+   if (!timeout) {
+   dev_dbg(device, "firmware: %s loading timed out\n",
+   name);
+   return -EBUSY;
+   }
+   } else {
+   ret = usermodehelper_read_trylock();
+   if (WARN_ON(ret)) {
+   dev_err(device, "firmware: %s will not be loaded\n",
+   name);
+   return ret;
+   }
+   }


fw_load_from_user_helper() no longer needs the timeout parameter then.


Updated the patch accordingly.


Given this fact I'll chime in with some other, IMHO cosmetic things for
this series. This however is the just the biggest issue for this series
that I've found. That and testing this at run time didn't boot on my
system, it could be an issue with linux-next next-20160907 booting
on my system, I hadn't tried that yet. I did put your series through
0-day though and it went through fine though.


So far I have it tested with kvm. I'll give it a spin on real hardware. 
Good point.



Since you will need a respin I'd appreciate if you can Cc Takashi,
Bjorn, Daniel Vetter, and Arend van Spriel on these series as some
of them have expressed interest in the umh stuff, so best to get wider
review as well. While at it please Cc Rafael and Srivatsa.


Will do.

cheers,
daniel


Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-07 Thread Luis R. Rodriguez
On Wed, Sep 07, 2016 at 10:45:05AM +0200, Daniel Wagner wrote:
> From: Daniel Wagner 
> 
> When we load the firmware directly we don't need to take the umh
> lock. So move this part inside fw_load_from_user_helper which is only
> available when CONFIG_FW_LOADER_USER_HELPER is set.
> 
> This avoids a dependency on firmware_loading_timeout() even when
> !CONFIG_FW_LOADER_UER_HELPER.

Great work! Just one issue found, noted below.

> The usermodehelper locking code was added by b298d289c792 ("PM / Sleep:
> Fix freezer failures due to racy usermodehelper_is_disabled()").

Thanks, this helps to give some perspective, I'll note that commit also refers
to commit a144c6a (PM: Print a warning if firmware is requested when tasks are
frozen) by Srivatsa a long time ago which added a warning print if a driver
requested firmware when tasks are frozen. That commit log further clarifies
that the issues is that some drivers erroneously use request_firmware() in
their driver's ->resume() (or ->thaw(), or ->restore()) callbacks, it further
clarifies that is not going to work unless the firmware has been built in.
It did not explain *why* it wouldn't work though. But note it also mentioned
how drivers that do have request_firmware() calls on resume stall resume --
the reason for the stalling is the stupid usermode helper. The kernel now
"fixed" these by returning an error in such cases, it does this by checking
kernel user mode helper is disabled, this is why it would not work. But note
that we should be disabling the usermode helper on suspend too, and likely
the reason we never ran into an issue with the cache stuff is we would fail
if the usermode helper was disabled anyway. This is a long winded way of
saying that these commits further confirm removal of using the usermode helper
from the firmware cache work for suspend/resume.

> Signed-off-by: Daniel Wagner 
> Cc: Ming Lei 
> Cc: Luis R. Rodriguez 
> Cc: Greg Kroah-Hartman 
> ---
>  drivers/base/firmware_class.c | 52 
> +++
>  1 file changed, 28 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 960f8f7..d4fee06 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -981,13 +981,38 @@ static int fw_load_from_user_helper(struct firmware 
> *firmware,
>   unsigned int opt_flags, long timeout)
>  {
>   struct firmware_priv *fw_priv;
> + int ret;
> +
> + timeout = firmware_loading_timeout();
> + if (opt_flags & FW_OPT_NOWAIT) {
> + timeout = usermodehelper_read_lock_wait(timeout);
> + if (!timeout) {
> + dev_dbg(device, "firmware: %s loading timed out\n",
> + name);
> + return -EBUSY;
> + }
> + } else {
> + ret = usermodehelper_read_trylock();
> + if (WARN_ON(ret)) {
> + dev_err(device, "firmware: %s will not be loaded\n",
> + name);
> + return ret;
> + }
> + }

fw_load_from_user_helper() no longer needs the timeout parameter then.
Given this fact I'll chime in with some other, IMHO cosmetic things for
this series. This however is the just the biggest issue for this series
that I've found. That and testing this at run time didn't boot on my
system, it could be an issue with linux-next next-20160907 booting
on my system, I hadn't tried that yet. I did put your series through
0-day though and it went through fine though.

Since you will need a respin I'd appreciate if you can Cc Takashi,
Bjorn, Daniel Vetter, and Arend van Spriel on these series as some
of them have expressed interest in the umh stuff, so best to get wider
review as well. While at it please Cc Rafael and Srivatsa.

  Luis


Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-07 Thread Luis R. Rodriguez
On Wed, Sep 07, 2016 at 10:45:05AM +0200, Daniel Wagner wrote:
> From: Daniel Wagner 
> 
> When we load the firmware directly we don't need to take the umh
> lock. So move this part inside fw_load_from_user_helper which is only
> available when CONFIG_FW_LOADER_USER_HELPER is set.
> 
> This avoids a dependency on firmware_loading_timeout() even when
> !CONFIG_FW_LOADER_UER_HELPER.

Great work! Just one issue found, noted below.

> The usermodehelper locking code was added by b298d289c792 ("PM / Sleep:
> Fix freezer failures due to racy usermodehelper_is_disabled()").

Thanks, this helps to give some perspective, I'll note that commit also refers
to commit a144c6a (PM: Print a warning if firmware is requested when tasks are
frozen) by Srivatsa a long time ago which added a warning print if a driver
requested firmware when tasks are frozen. That commit log further clarifies
that the issues is that some drivers erroneously use request_firmware() in
their driver's ->resume() (or ->thaw(), or ->restore()) callbacks, it further
clarifies that is not going to work unless the firmware has been built in.
It did not explain *why* it wouldn't work though. But note it also mentioned
how drivers that do have request_firmware() calls on resume stall resume --
the reason for the stalling is the stupid usermode helper. The kernel now
"fixed" these by returning an error in such cases, it does this by checking
kernel user mode helper is disabled, this is why it would not work. But note
that we should be disabling the usermode helper on suspend too, and likely
the reason we never ran into an issue with the cache stuff is we would fail
if the usermode helper was disabled anyway. This is a long winded way of
saying that these commits further confirm removal of using the usermode helper
from the firmware cache work for suspend/resume.

> Signed-off-by: Daniel Wagner 
> Cc: Ming Lei 
> Cc: Luis R. Rodriguez 
> Cc: Greg Kroah-Hartman 
> ---
>  drivers/base/firmware_class.c | 52 
> +++
>  1 file changed, 28 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 960f8f7..d4fee06 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -981,13 +981,38 @@ static int fw_load_from_user_helper(struct firmware 
> *firmware,
>   unsigned int opt_flags, long timeout)
>  {
>   struct firmware_priv *fw_priv;
> + int ret;
> +
> + timeout = firmware_loading_timeout();
> + if (opt_flags & FW_OPT_NOWAIT) {
> + timeout = usermodehelper_read_lock_wait(timeout);
> + if (!timeout) {
> + dev_dbg(device, "firmware: %s loading timed out\n",
> + name);
> + return -EBUSY;
> + }
> + } else {
> + ret = usermodehelper_read_trylock();
> + if (WARN_ON(ret)) {
> + dev_err(device, "firmware: %s will not be loaded\n",
> + name);
> + return ret;
> + }
> + }

fw_load_from_user_helper() no longer needs the timeout parameter then.
Given this fact I'll chime in with some other, IMHO cosmetic things for
this series. This however is the just the biggest issue for this series
that I've found. That and testing this at run time didn't boot on my
system, it could be an issue with linux-next next-20160907 booting
on my system, I hadn't tried that yet. I did put your series through
0-day though and it went through fine though.

Since you will need a respin I'd appreciate if you can Cc Takashi,
Bjorn, Daniel Vetter, and Arend van Spriel on these series as some
of them have expressed interest in the umh stuff, so best to get wider
review as well. While at it please Cc Rafael and Srivatsa.

  Luis


[PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-07 Thread Daniel Wagner
From: Daniel Wagner 

When we load the firmware directly we don't need to take the umh
lock. So move this part inside fw_load_from_user_helper which is only
available when CONFIG_FW_LOADER_USER_HELPER is set.

This avoids a dependency on firmware_loading_timeout() even when
!CONFIG_FW_LOADER_UER_HELPER.

The usermodehelper locking code was added by b298d289c792 ("PM / Sleep:
Fix freezer failures due to racy usermodehelper_is_disabled()").

Signed-off-by: Daniel Wagner 
Cc: Ming Lei 
Cc: Luis R. Rodriguez 
Cc: Greg Kroah-Hartman 
---
 drivers/base/firmware_class.c | 52 +++
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 960f8f7..d4fee06 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -981,13 +981,38 @@ static int fw_load_from_user_helper(struct firmware 
*firmware,
unsigned int opt_flags, long timeout)
 {
struct firmware_priv *fw_priv;
+   int ret;
+
+   timeout = firmware_loading_timeout();
+   if (opt_flags & FW_OPT_NOWAIT) {
+   timeout = usermodehelper_read_lock_wait(timeout);
+   if (!timeout) {
+   dev_dbg(device, "firmware: %s loading timed out\n",
+   name);
+   return -EBUSY;
+   }
+   } else {
+   ret = usermodehelper_read_trylock();
+   if (WARN_ON(ret)) {
+   dev_err(device, "firmware: %s will not be loaded\n",
+   name);
+   return ret;
+   }
+   }
 
fw_priv = fw_create_instance(firmware, name, device, opt_flags);
-   if (IS_ERR(fw_priv))
-   return PTR_ERR(fw_priv);
+   if (IS_ERR(fw_priv)) {
+   ret = PTR_ERR(fw_priv);
+   goto release_lock;
+   }
 
fw_priv->buf = firmware->priv;
-   return _request_firmware_load(fw_priv, opt_flags, timeout);
+   ret = _request_firmware_load(fw_priv, opt_flags, timeout);
+
+release_lock:
+   usermodehelper_read_unlock();
+
+   return ret;
 }
 
 #ifdef CONFIG_PM_SLEEP
@@ -1150,25 +1175,6 @@ _request_firmware(const struct firmware **firmware_p, 
const char *name,
if (ret <= 0) /* error or already assigned */
goto out;
 
-   ret = 0;
-   timeout = firmware_loading_timeout();
-   if (opt_flags & FW_OPT_NOWAIT) {
-   timeout = usermodehelper_read_lock_wait(timeout);
-   if (!timeout) {
-   dev_dbg(device, "firmware: %s loading timed out\n",
-   name);
-   ret = -EBUSY;
-   goto out;
-   }
-   } else {
-   ret = usermodehelper_read_trylock();
-   if (WARN_ON(ret)) {
-   dev_err(device, "firmware: %s will not be loaded\n",
-   name);
-   goto out;
-   }
-   }
-
ret = fw_get_filesystem_firmware(device, fw->priv);
if (ret) {
if (!(opt_flags & FW_OPT_NO_WARN))
@@ -1185,8 +1191,6 @@ _request_firmware(const struct firmware **firmware_p, 
const char *name,
if (!ret)
ret = assign_firmware_buf(fw, device, opt_flags);
 
-   usermodehelper_read_unlock();
-
  out:
if (ret < 0) {
release_firmware(fw);
-- 
2.7.4


[PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()

2016-09-07 Thread Daniel Wagner
From: Daniel Wagner 

When we load the firmware directly we don't need to take the umh
lock. So move this part inside fw_load_from_user_helper which is only
available when CONFIG_FW_LOADER_USER_HELPER is set.

This avoids a dependency on firmware_loading_timeout() even when
!CONFIG_FW_LOADER_UER_HELPER.

The usermodehelper locking code was added by b298d289c792 ("PM / Sleep:
Fix freezer failures due to racy usermodehelper_is_disabled()").

Signed-off-by: Daniel Wagner 
Cc: Ming Lei 
Cc: Luis R. Rodriguez 
Cc: Greg Kroah-Hartman 
---
 drivers/base/firmware_class.c | 52 +++
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 960f8f7..d4fee06 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -981,13 +981,38 @@ static int fw_load_from_user_helper(struct firmware 
*firmware,
unsigned int opt_flags, long timeout)
 {
struct firmware_priv *fw_priv;
+   int ret;
+
+   timeout = firmware_loading_timeout();
+   if (opt_flags & FW_OPT_NOWAIT) {
+   timeout = usermodehelper_read_lock_wait(timeout);
+   if (!timeout) {
+   dev_dbg(device, "firmware: %s loading timed out\n",
+   name);
+   return -EBUSY;
+   }
+   } else {
+   ret = usermodehelper_read_trylock();
+   if (WARN_ON(ret)) {
+   dev_err(device, "firmware: %s will not be loaded\n",
+   name);
+   return ret;
+   }
+   }
 
fw_priv = fw_create_instance(firmware, name, device, opt_flags);
-   if (IS_ERR(fw_priv))
-   return PTR_ERR(fw_priv);
+   if (IS_ERR(fw_priv)) {
+   ret = PTR_ERR(fw_priv);
+   goto release_lock;
+   }
 
fw_priv->buf = firmware->priv;
-   return _request_firmware_load(fw_priv, opt_flags, timeout);
+   ret = _request_firmware_load(fw_priv, opt_flags, timeout);
+
+release_lock:
+   usermodehelper_read_unlock();
+
+   return ret;
 }
 
 #ifdef CONFIG_PM_SLEEP
@@ -1150,25 +1175,6 @@ _request_firmware(const struct firmware **firmware_p, 
const char *name,
if (ret <= 0) /* error or already assigned */
goto out;
 
-   ret = 0;
-   timeout = firmware_loading_timeout();
-   if (opt_flags & FW_OPT_NOWAIT) {
-   timeout = usermodehelper_read_lock_wait(timeout);
-   if (!timeout) {
-   dev_dbg(device, "firmware: %s loading timed out\n",
-   name);
-   ret = -EBUSY;
-   goto out;
-   }
-   } else {
-   ret = usermodehelper_read_trylock();
-   if (WARN_ON(ret)) {
-   dev_err(device, "firmware: %s will not be loaded\n",
-   name);
-   goto out;
-   }
-   }
-
ret = fw_get_filesystem_firmware(device, fw->priv);
if (ret) {
if (!(opt_flags & FW_OPT_NO_WARN))
@@ -1185,8 +1191,6 @@ _request_firmware(const struct firmware **firmware_p, 
const char *name,
if (!ret)
ret = assign_firmware_buf(fw, device, opt_flags);
 
-   usermodehelper_read_unlock();
-
  out:
if (ret < 0) {
release_firmware(fw);
-- 
2.7.4