Re: Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Dr. H. Nikolaus Schaller

Am 18.05.2015 um 16:09 schrieb Krzysztof Kozlowski :

> 2015-05-18 22:30 GMT+09:00 Dr. H. Nikolaus Schaller :
>> Hi Krzysztof,
>> 
>> Am 18.05.2015 um 14:32 schrieb Krzysztof Kozlowski :
>> 
>>> 2015-05-18 18:40 GMT+09:00 Dr. H. Nikolaus Schaller :
 Hi,
 we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
 problem within the bq27x00 driver.
 
 It appears to be introduced by your patch 
 297d716f6260cc9421d971b124ca196b957ee458
 
 The problem appears to be that bq27x00_powersupply_init() calls 
 power_supply_register_no_ws() and
 sets di->bat *after* return. The old code did pass an uninitialized struct 
 pointer.
 
 Now for reasons I don’t understand, the power_supply_register_no_ws() 
 appears to call
 uevent related stuff which in turn calls bq27x00_battery_get_property() 
 before di->bat
 is properly initialized.
 
 I have checked with printk in bq27x00_battery_get_property() that di>bat 
 == NULL in this case and
 right before we see the segfault.
 
 The old code simply did pass a zeroed struct power_supply and perhaps 
 initialized its components
 during registration.
 
 Returning some -EINVAL if di->bat == NULL would likely solve the NULL 
 pointer dereference but
 I don’t know what it does to the uevent and if it restores previous 
 operation.
 
 It could have been that it was for good purpose that 
 power_supply_register_no_ws() did not return
 by value, but by reference to the di->bat struct:
 
 -   ret = power_supply_register_no_ws(di->dev, >bat, NULL);
 +   di->bat = power_supply_register_no_ws(di->dev, psy_desc, _cfg);
 
 So that code called within the context of power_supply_register_no_ws() 
 could already
 refer to initialized di->bat.
>>> 
>>> Indeed this issue was not present in previous design however I think
>>> the architecture was still error-prone. Starting from driver's probe:
>>> - some_driver_probe()
>>>  - power_supply_register()
>>>- device_add()
>>>  - kobject_uevent() - notify userspace
>>>   - power_supply_uevent()
>>> - power_supply_show_property()
>>>   - power_supply_get_property()
>>> - some_driver_get_property()
>>> 
>>> So before probe() ends the power supply core calls driver's
>>> get_property(). In that time the driver internal structures may not be
>>> ready yet, because the probe did not end.
>> 
>> Yes, that is indeed a problem. Maybe it did work with the bq27x00 driver
>> earlier because the call to power_supply_register() was the last
>> activity.
> 
> Right, for most of the drivers it is the last part of probe.
> 
>> 
>>> The get_property() could
>>> access some registers or other core functions which will be ready
>>> after power_supply_register() call. For example the
>>> bq27x00_battery_get_property() accesses delayed work which could be
>>> (by mistake) not yet initialized.
>>> 
>>> I looked at other dev_uevent implementations and almost all of them do
>>> not call back the driver.
>>> 
>>> Of course this does not change that I introduced the issue and I feel
>>> bad about it.
>>> I got some ideas for resolving it:
>>> 1. Fix bq27x00_battery and maybe others so they won't access the 'psy'
>>> member of its state container. This does not solve the issue from
>>> architectural point of view - still some uninitialised data may be
>>> accessed because probe() is in progress. However this would be the
>>> fastest and the least intrusive.
>> 
>> The problem might be that it fundamentally changes the driver code
>> architecture. For example one call using di->bat is in
>> 
>> bq27x00_battery_status() {
>> …
>>else if (power_supply_am_i_supplied(di->bat))
>>status = POWER_SUPPLY_STATUS_NOT_CHARGING;
>> …
>> }
>> 
>>> 
>>> 2. Remove calls to get_property() (and other functions provided by
>>> driver) from power_supply_uevent(). Unfortunately this may break
>>> userspace which expects that some data will be present in uevent.
>>> 
>>> 3. Change the API to the previous convention, which you pointed as a remedy:
>>> ret = power_supply_register_no_ws(di->dev, >bat, psy_desc, _cfg);
>>> This also won't solve the issue from 1. that uevent will be called
>>> before probe ends.
>> 
>> Well, we could require that power_supply_register_no_ws() is the last
>> activity to be done in any_driver_probe().
> 
> Some drivers (like drivers/power/lp8727_charger.c) register multiple
> power supplies and there can be only one last call. What is even
> important in this lp8727 case, is that it registers:
> - a battery,
> - two chargers which supply this battery.
> 
> So when power supplies are registered, the delayed work is executed
> for each supplied battery. Hopefully the the battery is registered as
> last... but I am not quite sure that this is still safe.
> 
>>> 4. Block the power_supply_uevent() from calling 

Re: Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Krzysztof Kozlowski
2015-05-18 22:30 GMT+09:00 Dr. H. Nikolaus Schaller :
> Hi Krzysztof,
>
> Am 18.05.2015 um 14:32 schrieb Krzysztof Kozlowski :
>
>> 2015-05-18 18:40 GMT+09:00 Dr. H. Nikolaus Schaller :
>>> Hi,
>>> we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
>>> problem within the bq27x00 driver.
>>>
>>> It appears to be introduced by your patch 
>>> 297d716f6260cc9421d971b124ca196b957ee458
>>>
>>> The problem appears to be that bq27x00_powersupply_init() calls 
>>> power_supply_register_no_ws() and
>>> sets di->bat *after* return. The old code did pass an uninitialized struct 
>>> pointer.
>>>
>>> Now for reasons I don’t understand, the power_supply_register_no_ws() 
>>> appears to call
>>> uevent related stuff which in turn calls bq27x00_battery_get_property() 
>>> before di->bat
>>> is properly initialized.
>>>
>>> I have checked with printk in bq27x00_battery_get_property() that di>bat == 
>>> NULL in this case and
>>> right before we see the segfault.
>>>
>>> The old code simply did pass a zeroed struct power_supply and perhaps 
>>> initialized its components
>>> during registration.
>>>
>>> Returning some -EINVAL if di->bat == NULL would likely solve the NULL 
>>> pointer dereference but
>>> I don’t know what it does to the uevent and if it restores previous 
>>> operation.
>>>
>>> It could have been that it was for good purpose that 
>>> power_supply_register_no_ws() did not return
>>> by value, but by reference to the di->bat struct:
>>>
>>> -   ret = power_supply_register_no_ws(di->dev, >bat, NULL);
>>> +   di->bat = power_supply_register_no_ws(di->dev, psy_desc, _cfg);
>>>
>>> So that code called within the context of power_supply_register_no_ws() 
>>> could already
>>> refer to initialized di->bat.
>>
>> Indeed this issue was not present in previous design however I think
>> the architecture was still error-prone. Starting from driver's probe:
>> - some_driver_probe()
>>   - power_supply_register()
>> - device_add()
>>   - kobject_uevent() - notify userspace
>>- power_supply_uevent()
>>  - power_supply_show_property()
>>- power_supply_get_property()
>>  - some_driver_get_property()
>>
>> So before probe() ends the power supply core calls driver's
>> get_property(). In that time the driver internal structures may not be
>> ready yet, because the probe did not end.
>
> Yes, that is indeed a problem. Maybe it did work with the bq27x00 driver
> earlier because the call to power_supply_register() was the last
> activity.

Right, for most of the drivers it is the last part of probe.

>
>> The get_property() could
>> access some registers or other core functions which will be ready
>> after power_supply_register() call. For example the
>> bq27x00_battery_get_property() accesses delayed work which could be
>> (by mistake) not yet initialized.
>>
>> I looked at other dev_uevent implementations and almost all of them do
>> not call back the driver.
>>
>> Of course this does not change that I introduced the issue and I feel
>> bad about it.
>> I got some ideas for resolving it:
>> 1. Fix bq27x00_battery and maybe others so they won't access the 'psy'
>> member of its state container. This does not solve the issue from
>> architectural point of view - still some uninitialised data may be
>> accessed because probe() is in progress. However this would be the
>> fastest and the least intrusive.
>
> The problem might be that it fundamentally changes the driver code
> architecture. For example one call using di->bat is in
>
> bq27x00_battery_status() {
> …
> else if (power_supply_am_i_supplied(di->bat))
> status = POWER_SUPPLY_STATUS_NOT_CHARGING;
> …
> }
>
>>
>> 2. Remove calls to get_property() (and other functions provided by
>> driver) from power_supply_uevent(). Unfortunately this may break
>> userspace which expects that some data will be present in uevent.
>>
>> 3. Change the API to the previous convention, which you pointed as a remedy:
>> ret = power_supply_register_no_ws(di->dev, >bat, psy_desc, _cfg);
>> This also won't solve the issue from 1. that uevent will be called
>> before probe ends.
>
> Well, we could require that power_supply_register_no_ws() is the last
> activity to be done in any_driver_probe().

Some drivers (like drivers/power/lp8727_charger.c) register multiple
power supplies and there can be only one last call. What is even
important in this lp8727 case, is that it registers:
 - a battery,
 - two chargers which supply this battery.

So when power supplies are registered, the delayed work is executed
for each supplied battery. Hopefully the the battery is registered as
last... but I am not quite sure that this is still safe.

>> 4. Block the power_supply_uevent() from calling the get_property()
>> functions until device_add() finishes. This would solve this issue but
>> first uevent messages (from adding device) won't contain all of device
>> data (just like in solution no 

Re: Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Krzysztof Kozlowski
2015-05-18 21:32 GMT+09:00 Krzysztof Kozlowski :
> 2015-05-18 18:40 GMT+09:00 Dr. H. Nikolaus Schaller :
>> Hi,
>> we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
>> problem within the bq27x00 driver.
>>
>> It appears to be introduced by your patch 
>> 297d716f6260cc9421d971b124ca196b957ee458
>>
>> The problem appears to be that bq27x00_powersupply_init() calls 
>> power_supply_register_no_ws() and
>> sets di->bat *after* return. The old code did pass an uninitialized struct 
>> pointer.
>>
>> Now for reasons I don’t understand, the power_supply_register_no_ws() 
>> appears to call
>> uevent related stuff which in turn calls bq27x00_battery_get_property() 
>> before di->bat
>> is properly initialized.
>>
>> I have checked with printk in bq27x00_battery_get_property() that di>bat == 
>> NULL in this case and
>> right before we see the segfault.
>>
>> The old code simply did pass a zeroed struct power_supply and perhaps 
>> initialized its components
>> during registration.
>>
>> Returning some -EINVAL if di->bat == NULL would likely solve the NULL 
>> pointer dereference but
>> I don’t know what it does to the uevent and if it restores previous 
>> operation.
>>
>> It could have been that it was for good purpose that 
>> power_supply_register_no_ws() did not return
>> by value, but by reference to the di->bat struct:
>>
>> -   ret = power_supply_register_no_ws(di->dev, >bat, NULL);
>> +   di->bat = power_supply_register_no_ws(di->dev, psy_desc, _cfg);
>>
>> So that code called within the context of power_supply_register_no_ws() 
>> could already
>> refer to initialized di->bat.
>
> Indeed this issue was not present in previous design however I think
> the architecture was still error-prone. Starting from driver's probe:
>  - some_driver_probe()
>- power_supply_register()
>  - device_add()
>- kobject_uevent() - notify userspace
> - power_supply_uevent()
>   - power_supply_show_property()
> - power_supply_get_property()
>   - some_driver_get_property()
>
> So before probe() ends the power supply core calls driver's
> get_property(). In that time the driver internal structures may not be
> ready yet, because the probe did not end. The get_property() could
> access some registers or other core functions which will be ready
> after power_supply_register() call. For example the
> bq27x00_battery_get_property() accesses delayed work which could be
> (by mistake) not yet initialized.
>
> I looked at other dev_uevent implementations and almost all of them do
> not call back the driver.
>
> Of course this does not change that I introduced the issue and I feel
> bad about it.
> I got some ideas for resolving it:
> 1. Fix bq27x00_battery and maybe others so they won't access the 'psy'
> member of its state container. This does not solve the issue from
> architectural point of view - still some uninitialised data may be
> accessed because probe() is in progress. However this would be the
> fastest and the least intrusive.
>
> 2. Remove calls to get_property() (and other functions provided by
> driver) from power_supply_uevent(). Unfortunately this may break
> userspace which expects that some data will be present in uevent.
>
> 3. Change the API to the previous convention, which you pointed as a remedy:
> ret = power_supply_register_no_ws(di->dev, >bat, psy_desc, _cfg);
> This also won't solve the issue from 1. that uevent will be called
> before probe ends.
>
> 4. Block the power_supply_uevent() from calling the get_property()
> functions until device_add() finishes. This would solve this issue but
> first uevent messages (from adding device) won't contain all of device
> data (just like in solution no 2.) and this won't solve other race -
> someone may call sysfs show() on created device nodes and thus access
> the get_property() before probe finishes.

Unfortunately similar issue is with changed_work of power supply. The
power_supply_register() calls at the end power_supply_changed_work()
which through leds calls get_property().

Best regards,
Krzysztof
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Dr. H. Nikolaus Schaller
Hi Krzysztof,

Am 18.05.2015 um 14:32 schrieb Krzysztof Kozlowski :

> 2015-05-18 18:40 GMT+09:00 Dr. H. Nikolaus Schaller :
>> Hi,
>> we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
>> problem within the bq27x00 driver.
>> 
>> It appears to be introduced by your patch 
>> 297d716f6260cc9421d971b124ca196b957ee458
>> 
>> The problem appears to be that bq27x00_powersupply_init() calls 
>> power_supply_register_no_ws() and
>> sets di->bat *after* return. The old code did pass an uninitialized struct 
>> pointer.
>> 
>> Now for reasons I don’t understand, the power_supply_register_no_ws() 
>> appears to call
>> uevent related stuff which in turn calls bq27x00_battery_get_property() 
>> before di->bat
>> is properly initialized.
>> 
>> I have checked with printk in bq27x00_battery_get_property() that di>bat == 
>> NULL in this case and
>> right before we see the segfault.
>> 
>> The old code simply did pass a zeroed struct power_supply and perhaps 
>> initialized its components
>> during registration.
>> 
>> Returning some -EINVAL if di->bat == NULL would likely solve the NULL 
>> pointer dereference but
>> I don’t know what it does to the uevent and if it restores previous 
>> operation.
>> 
>> It could have been that it was for good purpose that 
>> power_supply_register_no_ws() did not return
>> by value, but by reference to the di->bat struct:
>> 
>> -   ret = power_supply_register_no_ws(di->dev, >bat, NULL);
>> +   di->bat = power_supply_register_no_ws(di->dev, psy_desc, _cfg);
>> 
>> So that code called within the context of power_supply_register_no_ws() 
>> could already
>> refer to initialized di->bat.
> 
> Indeed this issue was not present in previous design however I think
> the architecture was still error-prone. Starting from driver's probe:
> - some_driver_probe()
>   - power_supply_register()
> - device_add()
>   - kobject_uevent() - notify userspace
>- power_supply_uevent()
>  - power_supply_show_property()
>- power_supply_get_property()
>  - some_driver_get_property()
> 
> So before probe() ends the power supply core calls driver's
> get_property(). In that time the driver internal structures may not be
> ready yet, because the probe did not end.

Yes, that is indeed a problem. Maybe it did work with the bq27x00 driver
earlier because the call to power_supply_register() was the last
activity.

> The get_property() could
> access some registers or other core functions which will be ready
> after power_supply_register() call. For example the
> bq27x00_battery_get_property() accesses delayed work which could be
> (by mistake) not yet initialized.
> 
> I looked at other dev_uevent implementations and almost all of them do
> not call back the driver.
> 
> Of course this does not change that I introduced the issue and I feel
> bad about it.
> I got some ideas for resolving it:
> 1. Fix bq27x00_battery and maybe others so they won't access the 'psy'
> member of its state container. This does not solve the issue from
> architectural point of view - still some uninitialised data may be
> accessed because probe() is in progress. However this would be the
> fastest and the least intrusive.

The problem might be that it fundamentally changes the driver code
architecture. For example one call using di->bat is in

bq27x00_battery_status() {
…
else if (power_supply_am_i_supplied(di->bat))
status = POWER_SUPPLY_STATUS_NOT_CHARGING;
…
}

> 
> 2. Remove calls to get_property() (and other functions provided by
> driver) from power_supply_uevent(). Unfortunately this may break
> userspace which expects that some data will be present in uevent.
> 
> 3. Change the API to the previous convention, which you pointed as a remedy:
> ret = power_supply_register_no_ws(di->dev, >bat, psy_desc, _cfg);
> This also won't solve the issue from 1. that uevent will be called
> before probe ends.

Well, we could require that power_supply_register_no_ws() is the last
activity to be done in any_driver_probe().

> 
> 4. Block the power_supply_uevent() from calling the get_property()
> functions until device_add() finishes. This would solve this issue but
> first uevent messages (from adding device) won't contain all of device
> data (just like in solution no 2.) and this won't solve other race -
> someone may call sysfs show() on created device nodes and thus access
> the get_property() before probe finishes.

> 
> Any ideas?

5. is it possible to delay the call to kobject_uevent() after 
some_driver_probe()
is finished? E.g. by registering a one-shot delayed work?

There seems to be a shared workqueue (mentioned e.g. in
)
but that is an area of the kernel I am not familiar with.

BR,
Nikolaus

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  

Re: Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Krzysztof Kozlowski
2015-05-18 18:40 GMT+09:00 Dr. H. Nikolaus Schaller :
> Hi,
> we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
> problem within the bq27x00 driver.
>
> It appears to be introduced by your patch 
> 297d716f6260cc9421d971b124ca196b957ee458
>
> The problem appears to be that bq27x00_powersupply_init() calls 
> power_supply_register_no_ws() and
> sets di->bat *after* return. The old code did pass an uninitialized struct 
> pointer.
>
> Now for reasons I don’t understand, the power_supply_register_no_ws() appears 
> to call
> uevent related stuff which in turn calls bq27x00_battery_get_property() 
> before di->bat
> is properly initialized.
>
> I have checked with printk in bq27x00_battery_get_property() that di>bat == 
> NULL in this case and
> right before we see the segfault.
>
> The old code simply did pass a zeroed struct power_supply and perhaps 
> initialized its components
> during registration.
>
> Returning some -EINVAL if di->bat == NULL would likely solve the NULL pointer 
> dereference but
> I don’t know what it does to the uevent and if it restores previous operation.
>
> It could have been that it was for good purpose that 
> power_supply_register_no_ws() did not return
> by value, but by reference to the di->bat struct:
>
> -   ret = power_supply_register_no_ws(di->dev, >bat, NULL);
> +   di->bat = power_supply_register_no_ws(di->dev, psy_desc, _cfg);
>
> So that code called within the context of power_supply_register_no_ws() could 
> already
> refer to initialized di->bat.

Indeed this issue was not present in previous design however I think
the architecture was still error-prone. Starting from driver's probe:
 - some_driver_probe()
   - power_supply_register()
 - device_add()
   - kobject_uevent() - notify userspace
- power_supply_uevent()
  - power_supply_show_property()
- power_supply_get_property()
  - some_driver_get_property()

So before probe() ends the power supply core calls driver's
get_property(). In that time the driver internal structures may not be
ready yet, because the probe did not end. The get_property() could
access some registers or other core functions which will be ready
after power_supply_register() call. For example the
bq27x00_battery_get_property() accesses delayed work which could be
(by mistake) not yet initialized.

I looked at other dev_uevent implementations and almost all of them do
not call back the driver.

Of course this does not change that I introduced the issue and I feel
bad about it.
I got some ideas for resolving it:
1. Fix bq27x00_battery and maybe others so they won't access the 'psy'
member of its state container. This does not solve the issue from
architectural point of view - still some uninitialised data may be
accessed because probe() is in progress. However this would be the
fastest and the least intrusive.

2. Remove calls to get_property() (and other functions provided by
driver) from power_supply_uevent(). Unfortunately this may break
userspace which expects that some data will be present in uevent.

3. Change the API to the previous convention, which you pointed as a remedy:
ret = power_supply_register_no_ws(di->dev, >bat, psy_desc, _cfg);
This also won't solve the issue from 1. that uevent will be called
before probe ends.

4. Block the power_supply_uevent() from calling the get_property()
functions until device_add() finishes. This would solve this issue but
first uevent messages (from adding device) won't contain all of device
data (just like in solution no 2.) and this won't solve other race -
someone may call sysfs show() on created device nodes and thus access
the get_property() before probe finishes.

Any ideas?

Best regards,
Krzysztof
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Dr. H. Nikolaus Schaller
Hi,
we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
problem within the bq27x00 driver.

It appears to be introduced by your patch 
297d716f6260cc9421d971b124ca196b957ee458

The problem appears to be that bq27x00_powersupply_init() calls 
power_supply_register_no_ws() and
sets di->bat *after* return. The old code did pass an uninitialized struct 
pointer.

Now for reasons I don’t understand, the power_supply_register_no_ws() appears 
to call
uevent related stuff which in turn calls bq27x00_battery_get_property() before 
di->bat
is properly initialized.

I have checked with printk in bq27x00_battery_get_property() that di>bat == 
NULL in this case and
right before we see the segfault.

The old code simply did pass a zeroed struct power_supply and perhaps 
initialized its components
during registration.

Returning some -EINVAL if di->bat == NULL would likely solve the NULL pointer 
dereference but
I don’t know what it does to the uevent and if it restores previous operation.

It could have been that it was for good purpose that 
power_supply_register_no_ws() did not return
by value, but by reference to the di->bat struct:

-   ret = power_supply_register_no_ws(di->dev, >bat, NULL);
+   di->bat = power_supply_register_no_ws(di->dev, psy_desc, _cfg);

So that code called within the context of power_supply_register_no_ws() could 
already
refer to initialized di->bat.


BR and thanks,
Nikolaus Schaller


[   11.879943] Unable to handle kernel NULL pointer dereference at virtual 
address 000c
[   11.888519] pgd = c0004000
[   11.891357] [000c] *pgd=
[   11.895141] Internal error: Oops: 5 [#1] SMP ARM
[   11.899963] Modules linked in: bq27x00_battery w1_bq27000 ov9655 v4l2_common 
omap_hdq snd_soc_omap_mcbsp videodev hmc5843_i2c(C) lis3lv02d_i2c snd_soc_omap 
lis3lv02d itg3200 snd_pcm_dmaengine tsc2007 bmp085_i2c bma150 hmc5843_core(C) 
media at24 input_polldev leds_tca6507 rtc_twl twl4030_pwrbutton twl4030_keypad 
twl4030_madc snd_soc_twl4030 twl4030_vibra ehci_omap
[   11.933898] CPU: 0 PID: 1164 Comm: w1_bus_master1 Tainted: G C  
4.1.0-rc3-gta04+ #1086
[   11.943267] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[   11.949829] task: dd2729c0 ti: dd276000 task.ti: dd276000
[   11.955505] PC is at __power_supply_is_supplied_by+0x8/0xc0
[   11.961364] LR is at __power_supply_am_i_supplied+0x18/0x48
[   11.967193] pc : []lr : []psr: a013
[   11.967193] sp : dd277bb8  ip :   fp : c092f82c
[   11.979217] r10: de348000  r9 : dd18cc20  r8 : dd18cc20
[   11.984680] r7 : de5c9420  r6 : de5c9400  r5 : de5c9400  r4 : 
[   11.991516] r3 : de5c9648  r2 :   r1 :   r0 : de5c9400
[   11.998352] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment 
kernel
[   12.006011] Control: 10c5387d  Table: 9c1b0019  DAC: 0015
[   12.012054] Process w1_bus_master1 (pid: 1164, stack limit = 0xdd276218)
[   12.019073] Stack: (0xdd277bb8 to 0xdd278000)
[   12.023620] 7ba0:   
de5c9648 
[   12.032196] 7bc0: de5c9400 c046a040 de5c9420 c046a058 de5c9420  
 dd277bf4
[   12.040771] 7be0:  c0374ebc c05c4268 dd18eec4 dd18cc20 de27fea4 
de5c9648 
[   12.049346] 7c00: dd18eee8  dd277c3c dd18ee10 dd18eec4 bf0d0de4 
 
[   12.057922] 7c20: dd377000 c092f82c dd18cc20 c0469508 b6db6db7 c046a140 
dd18cc20 c092f82c
[   12.066467] 7c40:  dd377000   dd18cc00 c046a42c 
dd18cc20 de348000
[   12.075042] 7c60: c046a3b0 de348000 dd18cc28 dd18cc20 de0918c0 c0608dd8 
 
[   12.083618] 7c80: c07b5bb1 c0371f0c c07b5bb1 c02c72c4 de348000 dd277ca8 
c07b5bb1 de348000
[   12.092193] 7ca0: de348000 dd18cc28  c02c7544 0007 0006 
 c0083394
[   12.100769] 7cc0: dd2729c0 dd2f6540 c07b2ae3 dc1f7950 dd18c810 dd18cc20 
 dd18cc28
[   12.109344] 7ce0: dc1f7950 dd18c810    c0371938 
dd18cc20 
[   12.117919] 7d00: c0008280 0080 dd18cc00 dd18cc20  dc1f7950 
dd18c810 c0469cd8
[   12.126464] 7d20: bf0d124f dd277d44 1f1c4000 c05ae99c bf0d124f dd18ee10 
dc1f7950 bf0cc121
[   12.135040] 7d40: 001b dd014c50 0001 bf0d0864  dd18ee10 
 
[   12.143615] 7d60: ffed dd18c810 bf0d1670 c03752c0 dd18c810  
bf0d1670 c0373bd4
[   12.152191] 7d80: 0007 dd18c810 bf0d1670 c0373ed8  c0373e34 
 dd18c810
[   12.160766] 7da0: dd277da8 c037265c de0948d4 dd2f6194 dd18c810 dd18c810 
dd18c844 dd18c810
[   12.169342] 7dc0: bf0cc160 c0373dbc dd18c810 c0918590 dd18c810 c0373294 
dd18c810 
[   12.177917] 7de0: dd18c818 c0371940 dd18c810 002f dd18c810  
dd18c800 
[   12.186492] 7e00:  bf0cc160 dd0144c0 c03754bc dd014c50 dd18c800 
dd014c00 
[   12.195037] 7e20: bf0cc160 bf0cc058 dd014c00 dd014c00 bf0cc174 c04663c0 
dd014c00 
[   12.203613] 7e40: 

Re: Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Krzysztof Kozlowski
2015-05-18 21:32 GMT+09:00 Krzysztof Kozlowski k.kozlow...@samsung.com:
 2015-05-18 18:40 GMT+09:00 Dr. H. Nikolaus Schaller h...@goldelico.com:
 Hi,
 we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
 problem within the bq27x00 driver.

 It appears to be introduced by your patch 
 297d716f6260cc9421d971b124ca196b957ee458

 The problem appears to be that bq27x00_powersupply_init() calls 
 power_supply_register_no_ws() and
 sets di-bat *after* return. The old code did pass an uninitialized struct 
 pointer.

 Now for reasons I don’t understand, the power_supply_register_no_ws() 
 appears to call
 uevent related stuff which in turn calls bq27x00_battery_get_property() 
 before di-bat
 is properly initialized.

 I have checked with printk in bq27x00_battery_get_property() that dibat == 
 NULL in this case and
 right before we see the segfault.

 The old code simply did pass a zeroed struct power_supply and perhaps 
 initialized its components
 during registration.

 Returning some -EINVAL if di-bat == NULL would likely solve the NULL 
 pointer dereference but
 I don’t know what it does to the uevent and if it restores previous 
 operation.

 It could have been that it was for good purpose that 
 power_supply_register_no_ws() did not return
 by value, but by reference to the di-bat struct:

 -   ret = power_supply_register_no_ws(di-dev, di-bat, NULL);
 +   di-bat = power_supply_register_no_ws(di-dev, psy_desc, psy_cfg);

 So that code called within the context of power_supply_register_no_ws() 
 could already
 refer to initialized di-bat.

 Indeed this issue was not present in previous design however I think
 the architecture was still error-prone. Starting from driver's probe:
  - some_driver_probe()
- power_supply_register(psy)
  - device_add()
- kobject_uevent() - notify userspace
 - power_supply_uevent()
   - power_supply_show_property()
 - power_supply_get_property()
   - some_driver_get_property()

 So before probe() ends the power supply core calls driver's
 get_property(). In that time the driver internal structures may not be
 ready yet, because the probe did not end. The get_property() could
 access some registers or other core functions which will be ready
 after power_supply_register() call. For example the
 bq27x00_battery_get_property() accesses delayed work which could be
 (by mistake) not yet initialized.

 I looked at other dev_uevent implementations and almost all of them do
 not call back the driver.

 Of course this does not change that I introduced the issue and I feel
 bad about it.
 I got some ideas for resolving it:
 1. Fix bq27x00_battery and maybe others so they won't access the 'psy'
 member of its state container. This does not solve the issue from
 architectural point of view - still some uninitialised data may be
 accessed because probe() is in progress. However this would be the
 fastest and the least intrusive.

 2. Remove calls to get_property() (and other functions provided by
 driver) from power_supply_uevent(). Unfortunately this may break
 userspace which expects that some data will be present in uevent.

 3. Change the API to the previous convention, which you pointed as a remedy:
 ret = power_supply_register_no_ws(di-dev, di-bat, psy_desc, psy_cfg);
 This also won't solve the issue from 1. that uevent will be called
 before probe ends.

 4. Block the power_supply_uevent() from calling the get_property()
 functions until device_add() finishes. This would solve this issue but
 first uevent messages (from adding device) won't contain all of device
 data (just like in solution no 2.) and this won't solve other race -
 someone may call sysfs show() on created device nodes and thus access
 the get_property() before probe finishes.

Unfortunately similar issue is with changed_work of power supply. The
power_supply_register() calls at the end power_supply_changed_work()
which through leds calls get_property().

Best regards,
Krzysztof
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Dr. H. Nikolaus Schaller
Hi Krzysztof,

Am 18.05.2015 um 14:32 schrieb Krzysztof Kozlowski k.kozlow...@samsung.com:

 2015-05-18 18:40 GMT+09:00 Dr. H. Nikolaus Schaller h...@goldelico.com:
 Hi,
 we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
 problem within the bq27x00 driver.
 
 It appears to be introduced by your patch 
 297d716f6260cc9421d971b124ca196b957ee458
 
 The problem appears to be that bq27x00_powersupply_init() calls 
 power_supply_register_no_ws() and
 sets di-bat *after* return. The old code did pass an uninitialized struct 
 pointer.
 
 Now for reasons I don’t understand, the power_supply_register_no_ws() 
 appears to call
 uevent related stuff which in turn calls bq27x00_battery_get_property() 
 before di-bat
 is properly initialized.
 
 I have checked with printk in bq27x00_battery_get_property() that dibat == 
 NULL in this case and
 right before we see the segfault.
 
 The old code simply did pass a zeroed struct power_supply and perhaps 
 initialized its components
 during registration.
 
 Returning some -EINVAL if di-bat == NULL would likely solve the NULL 
 pointer dereference but
 I don’t know what it does to the uevent and if it restores previous 
 operation.
 
 It could have been that it was for good purpose that 
 power_supply_register_no_ws() did not return
 by value, but by reference to the di-bat struct:
 
 -   ret = power_supply_register_no_ws(di-dev, di-bat, NULL);
 +   di-bat = power_supply_register_no_ws(di-dev, psy_desc, psy_cfg);
 
 So that code called within the context of power_supply_register_no_ws() 
 could already
 refer to initialized di-bat.
 
 Indeed this issue was not present in previous design however I think
 the architecture was still error-prone. Starting from driver's probe:
 - some_driver_probe()
   - power_supply_register(psy)
 - device_add()
   - kobject_uevent() - notify userspace
- power_supply_uevent()
  - power_supply_show_property()
- power_supply_get_property()
  - some_driver_get_property()
 
 So before probe() ends the power supply core calls driver's
 get_property(). In that time the driver internal structures may not be
 ready yet, because the probe did not end.

Yes, that is indeed a problem. Maybe it did work with the bq27x00 driver
earlier because the call to power_supply_register(psy) was the last
activity.

 The get_property() could
 access some registers or other core functions which will be ready
 after power_supply_register() call. For example the
 bq27x00_battery_get_property() accesses delayed work which could be
 (by mistake) not yet initialized.
 
 I looked at other dev_uevent implementations and almost all of them do
 not call back the driver.
 
 Of course this does not change that I introduced the issue and I feel
 bad about it.
 I got some ideas for resolving it:
 1. Fix bq27x00_battery and maybe others so they won't access the 'psy'
 member of its state container. This does not solve the issue from
 architectural point of view - still some uninitialised data may be
 accessed because probe() is in progress. However this would be the
 fastest and the least intrusive.

The problem might be that it fundamentally changes the driver code
architecture. For example one call using di-bat is in

bq27x00_battery_status() {
…
else if (power_supply_am_i_supplied(di-bat))
status = POWER_SUPPLY_STATUS_NOT_CHARGING;
…
}

 
 2. Remove calls to get_property() (and other functions provided by
 driver) from power_supply_uevent(). Unfortunately this may break
 userspace which expects that some data will be present in uevent.
 
 3. Change the API to the previous convention, which you pointed as a remedy:
 ret = power_supply_register_no_ws(di-dev, di-bat, psy_desc, psy_cfg);
 This also won't solve the issue from 1. that uevent will be called
 before probe ends.

Well, we could require that power_supply_register_no_ws() is the last
activity to be done in any_driver_probe().

 
 4. Block the power_supply_uevent() from calling the get_property()
 functions until device_add() finishes. This would solve this issue but
 first uevent messages (from adding device) won't contain all of device
 data (just like in solution no 2.) and this won't solve other race -
 someone may call sysfs show() on created device nodes and thus access
 the get_property() before probe finishes.

 
 Any ideas?

5. is it possible to delay the call to kobject_uevent() after 
some_driver_probe()
is finished? E.g. by registering a one-shot delayed work?

There seems to be a shared workqueue (mentioned e.g. in
http://www.makelinux.net/ldd3/chp-7-sect-6)
but that is an area of the kernel I am not familiar with.

BR,
Nikolaus

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Krzysztof Kozlowski
2015-05-18 22:30 GMT+09:00 Dr. H. Nikolaus Schaller h...@goldelico.com:
 Hi Krzysztof,

 Am 18.05.2015 um 14:32 schrieb Krzysztof Kozlowski k.kozlow...@samsung.com:

 2015-05-18 18:40 GMT+09:00 Dr. H. Nikolaus Schaller h...@goldelico.com:
 Hi,
 we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
 problem within the bq27x00 driver.

 It appears to be introduced by your patch 
 297d716f6260cc9421d971b124ca196b957ee458

 The problem appears to be that bq27x00_powersupply_init() calls 
 power_supply_register_no_ws() and
 sets di-bat *after* return. The old code did pass an uninitialized struct 
 pointer.

 Now for reasons I don’t understand, the power_supply_register_no_ws() 
 appears to call
 uevent related stuff which in turn calls bq27x00_battery_get_property() 
 before di-bat
 is properly initialized.

 I have checked with printk in bq27x00_battery_get_property() that dibat == 
 NULL in this case and
 right before we see the segfault.

 The old code simply did pass a zeroed struct power_supply and perhaps 
 initialized its components
 during registration.

 Returning some -EINVAL if di-bat == NULL would likely solve the NULL 
 pointer dereference but
 I don’t know what it does to the uevent and if it restores previous 
 operation.

 It could have been that it was for good purpose that 
 power_supply_register_no_ws() did not return
 by value, but by reference to the di-bat struct:

 -   ret = power_supply_register_no_ws(di-dev, di-bat, NULL);
 +   di-bat = power_supply_register_no_ws(di-dev, psy_desc, psy_cfg);

 So that code called within the context of power_supply_register_no_ws() 
 could already
 refer to initialized di-bat.

 Indeed this issue was not present in previous design however I think
 the architecture was still error-prone. Starting from driver's probe:
 - some_driver_probe()
   - power_supply_register(psy)
 - device_add()
   - kobject_uevent() - notify userspace
- power_supply_uevent()
  - power_supply_show_property()
- power_supply_get_property()
  - some_driver_get_property()

 So before probe() ends the power supply core calls driver's
 get_property(). In that time the driver internal structures may not be
 ready yet, because the probe did not end.

 Yes, that is indeed a problem. Maybe it did work with the bq27x00 driver
 earlier because the call to power_supply_register(psy) was the last
 activity.

Right, for most of the drivers it is the last part of probe.


 The get_property() could
 access some registers or other core functions which will be ready
 after power_supply_register() call. For example the
 bq27x00_battery_get_property() accesses delayed work which could be
 (by mistake) not yet initialized.

 I looked at other dev_uevent implementations and almost all of them do
 not call back the driver.

 Of course this does not change that I introduced the issue and I feel
 bad about it.
 I got some ideas for resolving it:
 1. Fix bq27x00_battery and maybe others so they won't access the 'psy'
 member of its state container. This does not solve the issue from
 architectural point of view - still some uninitialised data may be
 accessed because probe() is in progress. However this would be the
 fastest and the least intrusive.

 The problem might be that it fundamentally changes the driver code
 architecture. For example one call using di-bat is in

 bq27x00_battery_status() {
 …
 else if (power_supply_am_i_supplied(di-bat))
 status = POWER_SUPPLY_STATUS_NOT_CHARGING;
 …
 }


 2. Remove calls to get_property() (and other functions provided by
 driver) from power_supply_uevent(). Unfortunately this may break
 userspace which expects that some data will be present in uevent.

 3. Change the API to the previous convention, which you pointed as a remedy:
 ret = power_supply_register_no_ws(di-dev, di-bat, psy_desc, psy_cfg);
 This also won't solve the issue from 1. that uevent will be called
 before probe ends.

 Well, we could require that power_supply_register_no_ws() is the last
 activity to be done in any_driver_probe().

Some drivers (like drivers/power/lp8727_charger.c) register multiple
power supplies and there can be only one last call. What is even
important in this lp8727 case, is that it registers:
 - a battery,
 - two chargers which supply this battery.

So when power supplies are registered, the delayed work is executed
for each supplied battery. Hopefully the the battery is registered as
last... but I am not quite sure that this is still safe.

 4. Block the power_supply_uevent() from calling the get_property()
 functions until device_add() finishes. This would solve this issue but
 first uevent messages (from adding device) won't contain all of device
 data (just like in solution no 2.) and this won't solve other race -
 someone may call sysfs show() on created device nodes and thus access
 the get_property() before probe finishes.


 Any ideas?

 5. is it 

Re: Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Krzysztof Kozlowski
2015-05-18 18:40 GMT+09:00 Dr. H. Nikolaus Schaller h...@goldelico.com:
 Hi,
 we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
 problem within the bq27x00 driver.

 It appears to be introduced by your patch 
 297d716f6260cc9421d971b124ca196b957ee458

 The problem appears to be that bq27x00_powersupply_init() calls 
 power_supply_register_no_ws() and
 sets di-bat *after* return. The old code did pass an uninitialized struct 
 pointer.

 Now for reasons I don’t understand, the power_supply_register_no_ws() appears 
 to call
 uevent related stuff which in turn calls bq27x00_battery_get_property() 
 before di-bat
 is properly initialized.

 I have checked with printk in bq27x00_battery_get_property() that dibat == 
 NULL in this case and
 right before we see the segfault.

 The old code simply did pass a zeroed struct power_supply and perhaps 
 initialized its components
 during registration.

 Returning some -EINVAL if di-bat == NULL would likely solve the NULL pointer 
 dereference but
 I don’t know what it does to the uevent and if it restores previous operation.

 It could have been that it was for good purpose that 
 power_supply_register_no_ws() did not return
 by value, but by reference to the di-bat struct:

 -   ret = power_supply_register_no_ws(di-dev, di-bat, NULL);
 +   di-bat = power_supply_register_no_ws(di-dev, psy_desc, psy_cfg);

 So that code called within the context of power_supply_register_no_ws() could 
 already
 refer to initialized di-bat.

Indeed this issue was not present in previous design however I think
the architecture was still error-prone. Starting from driver's probe:
 - some_driver_probe()
   - power_supply_register(psy)
 - device_add()
   - kobject_uevent() - notify userspace
- power_supply_uevent()
  - power_supply_show_property()
- power_supply_get_property()
  - some_driver_get_property()

So before probe() ends the power supply core calls driver's
get_property(). In that time the driver internal structures may not be
ready yet, because the probe did not end. The get_property() could
access some registers or other core functions which will be ready
after power_supply_register() call. For example the
bq27x00_battery_get_property() accesses delayed work which could be
(by mistake) not yet initialized.

I looked at other dev_uevent implementations and almost all of them do
not call back the driver.

Of course this does not change that I introduced the issue and I feel
bad about it.
I got some ideas for resolving it:
1. Fix bq27x00_battery and maybe others so they won't access the 'psy'
member of its state container. This does not solve the issue from
architectural point of view - still some uninitialised data may be
accessed because probe() is in progress. However this would be the
fastest and the least intrusive.

2. Remove calls to get_property() (and other functions provided by
driver) from power_supply_uevent(). Unfortunately this may break
userspace which expects that some data will be present in uevent.

3. Change the API to the previous convention, which you pointed as a remedy:
ret = power_supply_register_no_ws(di-dev, di-bat, psy_desc, psy_cfg);
This also won't solve the issue from 1. that uevent will be called
before probe ends.

4. Block the power_supply_uevent() from calling the get_property()
functions until device_add() finishes. This would solve this issue but
first uevent messages (from adding device) won't contain all of device
data (just like in solution no 2.) and this won't solve other race -
someone may call sysfs show() on created device nodes and thus access
the get_property() before probe finishes.

Any ideas?

Best regards,
Krzysztof
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Dr. H. Nikolaus Schaller

Am 18.05.2015 um 16:09 schrieb Krzysztof Kozlowski k.kozlow...@samsung.com:

 2015-05-18 22:30 GMT+09:00 Dr. H. Nikolaus Schaller h...@goldelico.com:
 Hi Krzysztof,
 
 Am 18.05.2015 um 14:32 schrieb Krzysztof Kozlowski k.kozlow...@samsung.com:
 
 2015-05-18 18:40 GMT+09:00 Dr. H. Nikolaus Schaller h...@goldelico.com:
 Hi,
 we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
 problem within the bq27x00 driver.
 
 It appears to be introduced by your patch 
 297d716f6260cc9421d971b124ca196b957ee458
 
 The problem appears to be that bq27x00_powersupply_init() calls 
 power_supply_register_no_ws() and
 sets di-bat *after* return. The old code did pass an uninitialized struct 
 pointer.
 
 Now for reasons I don’t understand, the power_supply_register_no_ws() 
 appears to call
 uevent related stuff which in turn calls bq27x00_battery_get_property() 
 before di-bat
 is properly initialized.
 
 I have checked with printk in bq27x00_battery_get_property() that dibat 
 == NULL in this case and
 right before we see the segfault.
 
 The old code simply did pass a zeroed struct power_supply and perhaps 
 initialized its components
 during registration.
 
 Returning some -EINVAL if di-bat == NULL would likely solve the NULL 
 pointer dereference but
 I don’t know what it does to the uevent and if it restores previous 
 operation.
 
 It could have been that it was for good purpose that 
 power_supply_register_no_ws() did not return
 by value, but by reference to the di-bat struct:
 
 -   ret = power_supply_register_no_ws(di-dev, di-bat, NULL);
 +   di-bat = power_supply_register_no_ws(di-dev, psy_desc, psy_cfg);
 
 So that code called within the context of power_supply_register_no_ws() 
 could already
 refer to initialized di-bat.
 
 Indeed this issue was not present in previous design however I think
 the architecture was still error-prone. Starting from driver's probe:
 - some_driver_probe()
  - power_supply_register(psy)
- device_add()
  - kobject_uevent() - notify userspace
   - power_supply_uevent()
 - power_supply_show_property()
   - power_supply_get_property()
 - some_driver_get_property()
 
 So before probe() ends the power supply core calls driver's
 get_property(). In that time the driver internal structures may not be
 ready yet, because the probe did not end.
 
 Yes, that is indeed a problem. Maybe it did work with the bq27x00 driver
 earlier because the call to power_supply_register(psy) was the last
 activity.
 
 Right, for most of the drivers it is the last part of probe.
 
 
 The get_property() could
 access some registers or other core functions which will be ready
 after power_supply_register() call. For example the
 bq27x00_battery_get_property() accesses delayed work which could be
 (by mistake) not yet initialized.
 
 I looked at other dev_uevent implementations and almost all of them do
 not call back the driver.
 
 Of course this does not change that I introduced the issue and I feel
 bad about it.
 I got some ideas for resolving it:
 1. Fix bq27x00_battery and maybe others so they won't access the 'psy'
 member of its state container. This does not solve the issue from
 architectural point of view - still some uninitialised data may be
 accessed because probe() is in progress. However this would be the
 fastest and the least intrusive.
 
 The problem might be that it fundamentally changes the driver code
 architecture. For example one call using di-bat is in
 
 bq27x00_battery_status() {
 …
else if (power_supply_am_i_supplied(di-bat))
status = POWER_SUPPLY_STATUS_NOT_CHARGING;
 …
 }
 
 
 2. Remove calls to get_property() (and other functions provided by
 driver) from power_supply_uevent(). Unfortunately this may break
 userspace which expects that some data will be present in uevent.
 
 3. Change the API to the previous convention, which you pointed as a remedy:
 ret = power_supply_register_no_ws(di-dev, di-bat, psy_desc, psy_cfg);
 This also won't solve the issue from 1. that uevent will be called
 before probe ends.
 
 Well, we could require that power_supply_register_no_ws() is the last
 activity to be done in any_driver_probe().
 
 Some drivers (like drivers/power/lp8727_charger.c) register multiple
 power supplies and there can be only one last call. What is even
 important in this lp8727 case, is that it registers:
 - a battery,
 - two chargers which supply this battery.
 
 So when power supplies are registered, the delayed work is executed
 for each supplied battery. Hopefully the the battery is registered as
 last... but I am not quite sure that this is still safe.
 
 4. Block the power_supply_uevent() from calling the get_property()
 functions until device_add() finishes. This would solve this issue but
 first uevent messages (from adding device) won't contain all of device
 data (just like in solution no 2.) and this won't solve other race -
 someone may call sysfs show() on 

Bug: power supply - NULL pointer dereference in bq27x00 driver

2015-05-18 Thread Dr. H. Nikolaus Schaller
Hi,
we tried to upgrade from 4.0 to 4.1-rc3 kernel and ran into a NULL pointer 
problem within the bq27x00 driver.

It appears to be introduced by your patch 
297d716f6260cc9421d971b124ca196b957ee458

The problem appears to be that bq27x00_powersupply_init() calls 
power_supply_register_no_ws() and
sets di-bat *after* return. The old code did pass an uninitialized struct 
pointer.

Now for reasons I don’t understand, the power_supply_register_no_ws() appears 
to call
uevent related stuff which in turn calls bq27x00_battery_get_property() before 
di-bat
is properly initialized.

I have checked with printk in bq27x00_battery_get_property() that dibat == 
NULL in this case and
right before we see the segfault.

The old code simply did pass a zeroed struct power_supply and perhaps 
initialized its components
during registration.

Returning some -EINVAL if di-bat == NULL would likely solve the NULL pointer 
dereference but
I don’t know what it does to the uevent and if it restores previous operation.

It could have been that it was for good purpose that 
power_supply_register_no_ws() did not return
by value, but by reference to the di-bat struct:

-   ret = power_supply_register_no_ws(di-dev, di-bat, NULL);
+   di-bat = power_supply_register_no_ws(di-dev, psy_desc, psy_cfg);

So that code called within the context of power_supply_register_no_ws() could 
already
refer to initialized di-bat.


BR and thanks,
Nikolaus Schaller


[   11.879943] Unable to handle kernel NULL pointer dereference at virtual 
address 000c
[   11.888519] pgd = c0004000
[   11.891357] [000c] *pgd=
[   11.895141] Internal error: Oops: 5 [#1] SMP ARM
[   11.899963] Modules linked in: bq27x00_battery w1_bq27000 ov9655 v4l2_common 
omap_hdq snd_soc_omap_mcbsp videodev hmc5843_i2c(C) lis3lv02d_i2c snd_soc_omap 
lis3lv02d itg3200 snd_pcm_dmaengine tsc2007 bmp085_i2c bma150 hmc5843_core(C) 
media at24 input_polldev leds_tca6507 rtc_twl twl4030_pwrbutton twl4030_keypad 
twl4030_madc snd_soc_twl4030 twl4030_vibra ehci_omap
[   11.933898] CPU: 0 PID: 1164 Comm: w1_bus_master1 Tainted: G C  
4.1.0-rc3-gta04+ #1086
[   11.943267] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[   11.949829] task: dd2729c0 ti: dd276000 task.ti: dd276000
[   11.955505] PC is at __power_supply_is_supplied_by+0x8/0xc0
[   11.961364] LR is at __power_supply_am_i_supplied+0x18/0x48
[   11.967193] pc : [c0469f4c]lr : [c046a058]psr: a013
[   11.967193] sp : dd277bb8  ip :   fp : c092f82c
[   11.979217] r10: de348000  r9 : dd18cc20  r8 : dd18cc20
[   11.984680] r7 : de5c9420  r6 : de5c9400  r5 : de5c9400  r4 : 
[   11.991516] r3 : de5c9648  r2 :   r1 :   r0 : de5c9400
[   11.998352] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment 
kernel
[   12.006011] Control: 10c5387d  Table: 9c1b0019  DAC: 0015
[   12.012054] Process w1_bus_master1 (pid: 1164, stack limit = 0xdd276218)
[   12.019073] Stack: (0xdd277bb8 to 0xdd278000)
[   12.023620] 7ba0:   
de5c9648 
[   12.032196] 7bc0: de5c9400 c046a040 de5c9420 c046a058 de5c9420  
 dd277bf4
[   12.040771] 7be0:  c0374ebc c05c4268 dd18eec4 dd18cc20 de27fea4 
de5c9648 
[   12.049346] 7c00: dd18eee8  dd277c3c dd18ee10 dd18eec4 bf0d0de4 
 
[   12.057922] 7c20: dd377000 c092f82c dd18cc20 c0469508 b6db6db7 c046a140 
dd18cc20 c092f82c
[   12.066467] 7c40:  dd377000   dd18cc00 c046a42c 
dd18cc20 de348000
[   12.075042] 7c60: c046a3b0 de348000 dd18cc28 dd18cc20 de0918c0 c0608dd8 
 
[   12.083618] 7c80: c07b5bb1 c0371f0c c07b5bb1 c02c72c4 de348000 dd277ca8 
c07b5bb1 de348000
[   12.092193] 7ca0: de348000 dd18cc28  c02c7544 0007 0006 
 c0083394
[   12.100769] 7cc0: dd2729c0 dd2f6540 c07b2ae3 dc1f7950 dd18c810 dd18cc20 
 dd18cc28
[   12.109344] 7ce0: dc1f7950 dd18c810    c0371938 
dd18cc20 
[   12.117919] 7d00: c0008280 0080 dd18cc00 dd18cc20  dc1f7950 
dd18c810 c0469cd8
[   12.126464] 7d20: bf0d124f dd277d44 1f1c4000 c05ae99c bf0d124f dd18ee10 
dc1f7950 bf0cc121
[   12.135040] 7d40: 001b dd014c50 0001 bf0d0864  dd18ee10 
 
[   12.143615] 7d60: ffed dd18c810 bf0d1670 c03752c0 dd18c810  
bf0d1670 c0373bd4
[   12.152191] 7d80: 0007 dd18c810 bf0d1670 c0373ed8  c0373e34 
 dd18c810
[   12.160766] 7da0: dd277da8 c037265c de0948d4 dd2f6194 dd18c810 dd18c810 
dd18c844 dd18c810
[   12.169342] 7dc0: bf0cc160 c0373dbc dd18c810 c0918590 dd18c810 c0373294 
dd18c810 
[   12.177917] 7de0: dd18c818 c0371940 dd18c810 002f dd18c810  
dd18c800 
[   12.186492] 7e00:  bf0cc160 dd0144c0 c03754bc dd014c50 dd18c800 
dd014c00 
[   12.195037] 7e20: bf0cc160 bf0cc058 dd014c00 dd014c00 bf0cc174 c04663c0 
dd014c00 
[   12.203613]