Re: [PATCH] [RFT] drm: adv7511/33: fix adv7511_cec_init() failure handling

2017-11-17 Thread Naresh Kamboju
On 15 November 2017 at 18:28, Hans Verkuil  wrote:
> On 15/11/17 13:37, Arnd Bergmann wrote:
>> An otherwise correct cleanup patch from Dan Carpenter turned a broken
>> failure handling from a feature patch by Hans Verkuil into a kernel
>> Oops, so bisection points to commit 7af35b0addbc ("drm/kirin: Checking
>> for IS_ERR() instead of NULL") rather than 3b1b975003e4 ("drm:
>> adv7511/33: add HDMI CEC support").
>>
>> I've managed to piece together several partial problems, though
>> I'm still struggling with the bigger picture:
>>
>> adv7511_probe() registers a drm_bridge structure that was allocated
>> with devm_kzalloc(). It calls adv7511_cec_init(), which fails for an
>> unknown reason, which in turn triggers the registered structure to be
>> removed.
>>
>> Elsewhere, kirin_drm_platform_probe() gets called, which calls
>> of_graph_get_remote_node(), and that returns NULL. Before Dan's
>> patch we would go on with a NULL pointer here and register that,
>> now kirin_drm_platform_probe() fails with -ENODEV.
>>
>> In a third driver, dsi_parse_dt() calls drm_of_find_panel_or_bridge(),
>> which after not finding a panel goes on to call of_drm_find_bridge(),
>> and that crashes due to the earlier list corruption.
>>
>> This addresses the first issue by making sure that adv7511_probe()
>> does not leave behind any corrupted list entries. This should
>> get the system back to boot but needs testing. From my understanding,
>> there is at least one more bug that needs to be resolved to actually
>> get everything to work again.
>>
>> Reported-by: Naresh Kamboju 
>> Cc: Xinliang Liu 
>> Cc: Dan Carpenter 
>> Cc: Sean Paul 
>> Cc: Hans Verkuil 
>> Cc: Archit Taneja 
>> Link: https://bugs.linaro.org/show_bug.cgi?id=3345
>> Link: https://lkft.validation.linaro.org/scheduler/job/48017#L3551
>> Fixes: 7af35b0addbc ("drm/kirin: Checking for IS_ERR() instead of NULL")
>> Fixes: 3b1b975003e4 ("drm: adv7511/33: add HDMI CEC support")
>> Signed-off-by: Arnd Bergmann 
>> ---
>> Untested so far, this is what I came up with after reading the
>> WARN_ON log from a modified kernel.
>>
>> Naresh, can you give this one a go?

Tested with this patch.
I did not notice kernel crash/WARNING in dmesg log on HiKey (arm64) board.

Ref test log:
Link: https://pastebin.com/t8iLEFwF

Tested-by: Naresh Kamboju 

>>
>> Hans and others, can you review in the meantime?
>>
>> Signed-off-by: Arnd Bergmann 
>> ---
>>  drivers/gpu/drm/bridge/adv7511/adv7511_drv.c | 7 ++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c 
>> b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
>> index 0e14f1572d05..93d1ecafe8fa 100644
>> --- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
>> +++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
>> @@ -1204,7 +1204,7 @@ static int adv7511_probe(struct i2c_client *i2c, const 
>> struct i2c_device_id *id)
>>  #ifdef CONFIG_DRM_I2C_ADV7511_CEC
>>   ret = adv7511_cec_init(dev, adv7511, offset);
>>   if (ret)
>> - goto err_unregister_cec;
>> + goto err_unregister_bridge;
>
> Rather than adding the err_unregister_bridge label, I think it is better to 
> move
> this code up to just before the call to drm_bridge_add().
>
> I think I just didn't realize that doing it after would require additional 
> cleanup.
> But it should be perfectly fine to move it up so we can avoid doing that.
>
> I can't test it until Monday as I don't have access to the hardware at the 
> moment.
>
> Regards,
>
> Hans
>
>>  #else
>>   regmap_write(adv7511->regmap, ADV7511_REG_CEC_CTRL + offset,
>>ADV7511_CEC_CTRL_POWER_DOWN);
>> @@ -1212,6 +1212,11 @@ static int adv7511_probe(struct i2c_client *i2c, 
>> const struct i2c_device_id *id)
>>
>>   return 0;
>>
>> +#ifdef CONFIG_DRM_I2C_ADV7511_CEC
>> +err_unregister_bridge:
>> + adv7511_audio_exit(adv7511);
>> + drm_bridge_remove(>bridge);
>> +#endif
>>  err_unregister_cec:
>>   i2c_unregister_device(adv7511->i2c_cec);
>>   if (adv7511->cec_clk)
>>
>
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] [RFT] drm: adv7511/33: fix adv7511_cec_init() failure handling

2017-11-16 Thread John Stultz
On Thu, Nov 16, 2017 at 2:20 PM, John Stultz  wrote:
> On Thu, Nov 16, 2017 at 1:50 PM, John Stultz  wrote:
>> On Wed, Nov 15, 2017 at 4:37 AM, Arnd Bergmann  wrote:
>>> An otherwise correct cleanup patch from Dan Carpenter turned a broken
>>> failure handling from a feature patch by Hans Verkuil into a kernel
>>> Oops, so bisection points to commit 7af35b0addbc ("drm/kirin: Checking
>>> for IS_ERR() instead of NULL") rather than 3b1b975003e4 ("drm:
>>> adv7511/33: add HDMI CEC support").
>>>
>>> I've managed to piece together several partial problems, though
>>> I'm still struggling with the bigger picture:
>>>
>>> adv7511_probe() registers a drm_bridge structure that was allocated
>>> with devm_kzalloc(). It calls adv7511_cec_init(), which fails for an
>>> unknown reason, which in turn triggers the registered structure to be
>>> removed.
>>>
>>> Elsewhere, kirin_drm_platform_probe() gets called, which calls
>>> of_graph_get_remote_node(), and that returns NULL. Before Dan's
>>> patch we would go on with a NULL pointer here and register that,
>>> now kirin_drm_platform_probe() fails with -ENODEV.
>>>
>>> In a third driver, dsi_parse_dt() calls drm_of_find_panel_or_bridge(),
>>> which after not finding a panel goes on to call of_drm_find_bridge(),
>>> and that crashes due to the earlier list corruption.
>>>
>>> This addresses the first issue by making sure that adv7511_probe()
>>> does not leave behind any corrupted list entries. This should
>>> get the system back to boot but needs testing. From my understanding,
>>> there is at least one more bug that needs to be resolved to actually
>>> get everything to work again.
>>
>> So I've started hitting the issue this patch tries to address (now
>> that the related code landed in Linus' tree). The only issue is that
>> with this fix, I don't see graphics initializing properly, so I
>> suspect something is still wrong with the error handling (though what
>> exactly I'm not sure).
>
> So this seems to only happen when CONFIG_DRM_I2C_ADV7511_CEC is
> enabled. If it is on, I don't get any graphics, but if its disabled
> graphics works.
>
> Tying this with Arnd's patch, I'm guessing adv7511_cec_init() is
> failing, but it seems like instead of just disabling the CEC feature,
> we're failing to load the driver entirely.
>
> Maybe should the logic be something like:
> #ifdef CONFIG_DRM_I2C_ADV7511_CEC
> ret = adv7511_cec_init(dev, adv7511, offset);
> if (ret)
> #endif
> regmap_write(adv7511->regmap, ADV7511_REG_CEC_CTRL + offset,
> ADV7511_CEC_CTRL_POWER_DOWN);
>
> ?

I just tested with this, and this approach seems to work for me.

thanks
-john
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] [RFT] drm: adv7511/33: fix adv7511_cec_init() failure handling

2017-11-16 Thread John Stultz
On Thu, Nov 16, 2017 at 1:50 PM, John Stultz  wrote:
> On Wed, Nov 15, 2017 at 4:37 AM, Arnd Bergmann  wrote:
>> An otherwise correct cleanup patch from Dan Carpenter turned a broken
>> failure handling from a feature patch by Hans Verkuil into a kernel
>> Oops, so bisection points to commit 7af35b0addbc ("drm/kirin: Checking
>> for IS_ERR() instead of NULL") rather than 3b1b975003e4 ("drm:
>> adv7511/33: add HDMI CEC support").
>>
>> I've managed to piece together several partial problems, though
>> I'm still struggling with the bigger picture:
>>
>> adv7511_probe() registers a drm_bridge structure that was allocated
>> with devm_kzalloc(). It calls adv7511_cec_init(), which fails for an
>> unknown reason, which in turn triggers the registered structure to be
>> removed.
>>
>> Elsewhere, kirin_drm_platform_probe() gets called, which calls
>> of_graph_get_remote_node(), and that returns NULL. Before Dan's
>> patch we would go on with a NULL pointer here and register that,
>> now kirin_drm_platform_probe() fails with -ENODEV.
>>
>> In a third driver, dsi_parse_dt() calls drm_of_find_panel_or_bridge(),
>> which after not finding a panel goes on to call of_drm_find_bridge(),
>> and that crashes due to the earlier list corruption.
>>
>> This addresses the first issue by making sure that adv7511_probe()
>> does not leave behind any corrupted list entries. This should
>> get the system back to boot but needs testing. From my understanding,
>> there is at least one more bug that needs to be resolved to actually
>> get everything to work again.
>
> So I've started hitting the issue this patch tries to address (now
> that the related code landed in Linus' tree). The only issue is that
> with this fix, I don't see graphics initializing properly, so I
> suspect something is still wrong with the error handling (though what
> exactly I'm not sure).

So this seems to only happen when CONFIG_DRM_I2C_ADV7511_CEC is
enabled. If it is on, I don't get any graphics, but if its disabled
graphics works.

Tying this with Arnd's patch, I'm guessing adv7511_cec_init() is
failing, but it seems like instead of just disabling the CEC feature,
we're failing to load the driver entirely.

Maybe should the logic be something like:
#ifdef CONFIG_DRM_I2C_ADV7511_CEC
ret = adv7511_cec_init(dev, adv7511, offset);
if (ret)
#endif
regmap_write(adv7511->regmap, ADV7511_REG_CEC_CTRL + offset,
ADV7511_CEC_CTRL_POWER_DOWN);

?
thanks
-john
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] [RFT] drm: adv7511/33: fix adv7511_cec_init() failure handling

2017-11-16 Thread John Stultz
On Wed, Nov 15, 2017 at 4:37 AM, Arnd Bergmann  wrote:
> An otherwise correct cleanup patch from Dan Carpenter turned a broken
> failure handling from a feature patch by Hans Verkuil into a kernel
> Oops, so bisection points to commit 7af35b0addbc ("drm/kirin: Checking
> for IS_ERR() instead of NULL") rather than 3b1b975003e4 ("drm:
> adv7511/33: add HDMI CEC support").
>
> I've managed to piece together several partial problems, though
> I'm still struggling with the bigger picture:
>
> adv7511_probe() registers a drm_bridge structure that was allocated
> with devm_kzalloc(). It calls adv7511_cec_init(), which fails for an
> unknown reason, which in turn triggers the registered structure to be
> removed.
>
> Elsewhere, kirin_drm_platform_probe() gets called, which calls
> of_graph_get_remote_node(), and that returns NULL. Before Dan's
> patch we would go on with a NULL pointer here and register that,
> now kirin_drm_platform_probe() fails with -ENODEV.
>
> In a third driver, dsi_parse_dt() calls drm_of_find_panel_or_bridge(),
> which after not finding a panel goes on to call of_drm_find_bridge(),
> and that crashes due to the earlier list corruption.
>
> This addresses the first issue by making sure that adv7511_probe()
> does not leave behind any corrupted list entries. This should
> get the system back to boot but needs testing. From my understanding,
> there is at least one more bug that needs to be resolved to actually
> get everything to work again.

So I've started hitting the issue this patch tries to address (now
that the related code landed in Linus' tree). The only issue is that
with this fix, I don't see graphics initializing properly, so I
suspect something is still wrong with the error handling (though what
exactly I'm not sure).

If you have a new version of the patch with Hans' feedback in it, I'd
be happy to re-test.

thanks
-john
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] [RFT] drm: adv7511/33: fix adv7511_cec_init() failure handling

2017-11-15 Thread Hans Verkuil
On 15/11/17 13:37, Arnd Bergmann wrote:
> An otherwise correct cleanup patch from Dan Carpenter turned a broken
> failure handling from a feature patch by Hans Verkuil into a kernel
> Oops, so bisection points to commit 7af35b0addbc ("drm/kirin: Checking
> for IS_ERR() instead of NULL") rather than 3b1b975003e4 ("drm:
> adv7511/33: add HDMI CEC support").
> 
> I've managed to piece together several partial problems, though
> I'm still struggling with the bigger picture:
> 
> adv7511_probe() registers a drm_bridge structure that was allocated
> with devm_kzalloc(). It calls adv7511_cec_init(), which fails for an
> unknown reason, which in turn triggers the registered structure to be
> removed.
> 
> Elsewhere, kirin_drm_platform_probe() gets called, which calls
> of_graph_get_remote_node(), and that returns NULL. Before Dan's
> patch we would go on with a NULL pointer here and register that,
> now kirin_drm_platform_probe() fails with -ENODEV.
> 
> In a third driver, dsi_parse_dt() calls drm_of_find_panel_or_bridge(),
> which after not finding a panel goes on to call of_drm_find_bridge(),
> and that crashes due to the earlier list corruption.
> 
> This addresses the first issue by making sure that adv7511_probe()
> does not leave behind any corrupted list entries. This should
> get the system back to boot but needs testing. From my understanding,
> there is at least one more bug that needs to be resolved to actually
> get everything to work again.
> 
> Reported-by: Naresh Kamboju 
> Cc: Xinliang Liu 
> Cc: Dan Carpenter 
> Cc: Sean Paul 
> Cc: Hans Verkuil 
> Cc: Archit Taneja 
> Link: https://bugs.linaro.org/show_bug.cgi?id=3345
> Link: https://lkft.validation.linaro.org/scheduler/job/48017#L3551
> Fixes: 7af35b0addbc ("drm/kirin: Checking for IS_ERR() instead of NULL")
> Fixes: 3b1b975003e4 ("drm: adv7511/33: add HDMI CEC support")
> Signed-off-by: Arnd Bergmann 
> ---
> Untested so far, this is what I came up with after reading the
> WARN_ON log from a modified kernel.
> 
> Naresh, can you give this one a go?
> 
> Hans and others, can you review in the meantime?
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/gpu/drm/bridge/adv7511/adv7511_drv.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c 
> b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
> index 0e14f1572d05..93d1ecafe8fa 100644
> --- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
> +++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
> @@ -1204,7 +1204,7 @@ static int adv7511_probe(struct i2c_client *i2c, const 
> struct i2c_device_id *id)
>  #ifdef CONFIG_DRM_I2C_ADV7511_CEC
>   ret = adv7511_cec_init(dev, adv7511, offset);
>   if (ret)
> - goto err_unregister_cec;
> + goto err_unregister_bridge;

Rather than adding the err_unregister_bridge label, I think it is better to move
this code up to just before the call to drm_bridge_add().

I think I just didn't realize that doing it after would require additional 
cleanup.
But it should be perfectly fine to move it up so we can avoid doing that.

I can't test it until Monday as I don't have access to the hardware at the 
moment.

Regards,

Hans

>  #else
>   regmap_write(adv7511->regmap, ADV7511_REG_CEC_CTRL + offset,
>ADV7511_CEC_CTRL_POWER_DOWN);
> @@ -1212,6 +1212,11 @@ static int adv7511_probe(struct i2c_client *i2c, const 
> struct i2c_device_id *id)
>  
>   return 0;
>  
> +#ifdef CONFIG_DRM_I2C_ADV7511_CEC
> +err_unregister_bridge:
> + adv7511_audio_exit(adv7511);
> + drm_bridge_remove(>bridge);
> +#endif
>  err_unregister_cec:
>   i2c_unregister_device(adv7511->i2c_cec);
>   if (adv7511->cec_clk)
> 

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel