On 07/05/2024 19.23, Tom Rini wrote:
> On Tue, May 07, 2024 at 09:17:46AM +0200, Yoann Congal wrote:
>> Le 06/05/2024 à 19:43, Tom Rini a écrit :
>>> On Sun, May 05, 2024 at 03:53:53PM +0200, Yoann Congal wrote:
>>>
>>>> From: Masahiro Yamada <masahi...@kernel.org>
>>>>
>>>> This is a cherry-pick from the kernel commit:
>>>> 6262afa10ef7c (kconfig: default to zero if int/hex symbol lacks default 
>>>> property, 2023-11-26)
>>>>
>>>> When a default property is missing in an int or hex symbol, it defaults
>>>> to an empty string, which is not a valid symbol value.
>>>>
>>>> It results in an incorrect .config, and can also lead to an infinite
>>>> loop in scripting.
>>>>
>>>> Use "0" for int and "0x0" for hex as a default value.
>>>>
>>>> Signed-off-by: Masahiro Yamada <masahi...@kernel.org>
>>>> Reviewed-by: Yoann Congal <yoann.con...@smile.fr>
>>>>
>>>> Signed-off-by: Yoann Congal <yoann.con...@smile.fr>
>>>> ---
>>>> Added context that was not in the upstream commit:
>>>> The infinite loop case happens with a configuration defined like this
>>>> (a hex config without a valid default value):
>>>>   config HEX_TEST
>>>>    hex "Hex config without default"
>>>>
>>>> And using:
>>>>   $ make oldconfig < /dev/null
>>>>   scripts/kconfig/conf  --oldconfig Kconfig
>>>>   *
>>>>   * General setup
>>>>   *
>>>>
>>>>   Error in reading or end of file.
>>>>
>>>>   Error in reading or end of file.
>>>>   Hex config without default (HEX_TEST) [] (NEW)
>>>>
>>>>   Error in reading or end of file.
>>>>   Hex config without default (HEX_TEST) [] (NEW)
>>>>   # This loops forever
>>>>
>>>> NB: Scripted config manipulation often call make with /dev/null as
>>>> stdin (Yocto recipe, CI build, ...)
>>>>
>>>> This was discovered when working on Yocto bug:
>>>> https://bugzilla.yoctoproject.org/show_bug.cgi?id=14136
>>
>> Hi Tom,
>>
>>> I'm surprised this was accepted. In the past I've wanted to avoid this
>>> kind of change in Kconfig because while the empty string can be easily
>>> be checked in the code as "user didn't really configure this, do
>>> nothing" a value of zero is a valid option in these cases and so then in
>>> the code we need a bool symbol to decide if the hex/int symbol is set or
>>> not.
>>
>> For context (and what it's worth), before this patch was merged, I've tried 
>> to fix this problem with a patch of my own which only exited the config 
>> process when the infinite loop would start:
>> "[PATCH v4] kconfig: avoid an infinite loop in oldconfig/syncconfig" [0]
>> ... the v5 version was a bit more severe and exited as soon as an error was 
>> hit, it was then removed from -next because it triggered a build failure [1].
>>
>>> Today this is less of an issue than it used to be in U-Boot with
>>> everything CONFIG-related migrated to Kconfig and so there's no longer
>>> the question of if we missed migrating a file that defined the value but
>>> there's still places we have in the code where hex symbol is undefined
>>> is not the same thing as hex symbol is 0x0.
>>>
>>> Is there a specific use case you have for this in U-Boot? It's been a
>>> while, but it's also been cases of newly introduced symbols in Kconfig
>>> files with incorrect dependencies, where the infinite loop in kconfig
>>> happened, CI failed and we caught the problem.
>>
>> For a specific example, one can trigger this with CMD_PSTORE_MEM_ADDR like 
>> this (tested on today's master):
>>   make qemu_arm64_defconfig
>>   make menuconfig
>>    # activate CONFIG_CMD_PSTORE=y but forget to fill CMD_PSTORE_MEM_ADDR
>>   make < /dev/null # This emulates a Yocto/scripted/CI build and loops 
>> infinitely
>>
>> But, my more general use case is the Yocto dev trying to change the U-Boot 
>> (or any kconfig based software) config and accidentally trigger this.
>> In Yocto, what they will see is a do_configure task taking a very long time 
>> but they won't see the looping logs the detect the problem (This is caught 
>> later by filing the RAM and the build process is killed by OOM, or the disk 
>> run out of space filed multi-GB log files).
>>
>> I'm sadly aware that defining a default value for this kind of config (fixed 
>> addresses) is not really sensible but the build time infinite loop is not 
>> good either.
> 
> Exactly, but to me it's worse to cover up the build issue and introduce
> a runtime issue instead. Something builds but then crashes at run time
> is worse to me than builds fail / get stuck. Using your example,
> CMD_PSTORE_MEM_ADDR depends on CMD_PSTORE and so if someone enables
> CMD_PSTORE in a config fragment but doesn't define the address, the
> build should not complete. Using 0x0 means that oops, now it fails to
> work and might not be obvious why either.

I agree with Tom. Yes, we've also hit those yocto failures with a
multi-GB do_configure log file, and that's pretty annoying when one just
starts a build cooking over night and then the next day one's laptop has
a filled disk. So clearly there is/was a problem to solve.

However, blindly adding a "default default" of 0 is not the right thing
to do. It seems much better that, when there's no configured default and
kconfig gets EOF when asking the user for input (as in the 'make
oldconfig < /dev/null' case), kconfig should exit with an error
explicitly saying "no value given for CONFIG_XYZ which also doesn't have
a default value". That stops the build, and leaves precisely an
indication what the problem is. Then the developer must either provide a
value for CONFIG_XYZ in the fragments, or send a patch to add a
reasonable default when that is at all possible (not so for e.g. PSTORE).

Rasmus


Reply via email to