Re: [PATCH v2 2/2] xen: Fix latent check-endbr.sh bug with 32bit build environments
On 18.07.2022 14:07, Andrew Cooper wrote: > On 18/07/2022 10:49, Jan Beulich wrote: >> On 18.07.2022 11:31, Andrew Cooper wrote: >>> On 18/07/2022 10:11, Jan Beulich wrote: On 15.07.2022 15:26, Andrew Cooper wrote: > --- a/xen/tools/check-endbr.sh > +++ b/xen/tools/check-endbr.sh > @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | > cut -f 1 -d ':' > $VALID & > #the lower bits, rounding integers to the nearest 4k. > # > #Instead, use the fact that Xen's .text is within a 1G aligned > region, and > -#split the VMA in half so AWK's numeric addition is only working on > 32 bit > -#numbers, which don't lose precision. > +#split the VMA so AWK's numeric addition is only working on <32 bit > +#numbers, which don't lose precision. (See point 5) > # > # 4) MAWK doesn't support plain hex constants (an optional part of the > POSIX > #spec), and GAWK and MAWK can't agree on how to work with hex > constants in > #a string. Use the shell to convert $vma_lo to decimal before > passing to > #AWK. > # > +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be > +#evaluated as long, which in 32bit shells turns negative if bit 31 > of the > +#VMA is set. AWK then interprets this negative number as a double > before > +#adding the offsets from the binary grep. > +# > +#Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split. > +# > +#The consequence of this is that for all offsets, $vma_lo + offset > needs > +#to be less that 256M (i.e. 7 nibbles) so as to be successfully > recombined > +#with the 9 nibbles of $vma_hi. This is fine; .text is at the start > of a > +#1G aligned region, and Xen is far far smaller than 256M, but leave > safety > +#check nevertheless. > +# > eval $(${OBJDUMP} -j .text $1 -h | > -$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, > 8), substr($4, 9, 16)}') > +$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, > 9), substr($4, 10, 16)}') > > ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN > > +bin_sz=$(stat -c '%s' $TEXT_BIN) > +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] && > +{ echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; } ... s/can/cannot/ ? >>> Why? "Can" is correct here. If the offsets can't exceed 256M, then >>> everything is good. >> Hmm, the wording then indeed is ambiguous. > > I see your point. In this case it's meant as "are able to", but this is > still clearer than using "can't" because at least the text matches the > check which triggered it. > >> I read "can" as "are allowed >> to", when we mean "aren't allowed to". Maybe ".text is 256M or more in >> size"? If you mention "offsets", then I think the check should be based >> on actually observing an offset which is too large (which .text size >> alone doesn't guarantee will happen). > > It's not just .text on its own because the VMA of offset by 2M, hence > the subtraction of $vma_lo in the main calculation. > > There's no point searching for offsets. There will be one near the end, > so all searching for an offset would do is complicate the critical loop. > > How about ".text offsets must not exceed 256M" ? > > That should be unambiguous. Yes, that reads fine. Thanks. Jan
Re: [PATCH v2 2/2] xen: Fix latent check-endbr.sh bug with 32bit build environments
On 18/07/2022 10:49, Jan Beulich wrote: > On 18.07.2022 11:31, Andrew Cooper wrote: >> On 18/07/2022 10:11, Jan Beulich wrote: >>> On 15.07.2022 15:26, Andrew Cooper wrote: --- a/xen/tools/check-endbr.sh +++ b/xen/tools/check-endbr.sh @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | cut -f 1 -d ':' > $VALID & #the lower bits, rounding integers to the nearest 4k. # #Instead, use the fact that Xen's .text is within a 1G aligned region, and -#split the VMA in half so AWK's numeric addition is only working on 32 bit -#numbers, which don't lose precision. +#split the VMA so AWK's numeric addition is only working on <32 bit +#numbers, which don't lose precision. (See point 5) # # 4) MAWK doesn't support plain hex constants (an optional part of the POSIX #spec), and GAWK and MAWK can't agree on how to work with hex constants in #a string. Use the shell to convert $vma_lo to decimal before passing to #AWK. # +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be +#evaluated as long, which in 32bit shells turns negative if bit 31 of the +#VMA is set. AWK then interprets this negative number as a double before +#adding the offsets from the binary grep. +# +#Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split. +# +#The consequence of this is that for all offsets, $vma_lo + offset needs +#to be less that 256M (i.e. 7 nibbles) so as to be successfully recombined +#with the 9 nibbles of $vma_hi. This is fine; .text is at the start of a +#1G aligned region, and Xen is far far smaller than 256M, but leave safety +#check nevertheless. +# eval $(${OBJDUMP} -j .text $1 -h | -$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), substr($4, 9, 16)}') +$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 9), substr($4, 10, 16)}') ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN +bin_sz=$(stat -c '%s' $TEXT_BIN) +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] && +{ echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; } >>> ... s/can/cannot/ ? >> Why? "Can" is correct here. If the offsets can't exceed 256M, then >> everything is good. > Hmm, the wording then indeed is ambiguous. I see your point. In this case it's meant as "are able to", but this is still clearer than using "can't" because at least the text matches the check which triggered it. > I read "can" as "are allowed > to", when we mean "aren't allowed to". Maybe ".text is 256M or more in > size"? If you mention "offsets", then I think the check should be based > on actually observing an offset which is too large (which .text size > alone doesn't guarantee will happen). It's not just .text on its own because the VMA of offset by 2M, hence the subtraction of $vma_lo in the main calculation. There's no point searching for offsets. There will be one near the end, so all searching for an offset would do is complicate the critical loop. How about ".text offsets must not exceed 256M" ? That should be unambiguous. ~Andrew
Re: [PATCH v2 2/2] xen: Fix latent check-endbr.sh bug with 32bit build environments
On 18.07.2022 11:31, Andrew Cooper wrote: > On 18/07/2022 10:11, Jan Beulich wrote: >> On 15.07.2022 15:26, Andrew Cooper wrote: >>> --- a/xen/tools/check-endbr.sh >>> +++ b/xen/tools/check-endbr.sh >>> @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | >>> cut -f 1 -d ':' > $VALID & >>> #the lower bits, rounding integers to the nearest 4k. >>> # >>> #Instead, use the fact that Xen's .text is within a 1G aligned region, >>> and >>> -#split the VMA in half so AWK's numeric addition is only working on 32 >>> bit >>> -#numbers, which don't lose precision. >>> +#split the VMA so AWK's numeric addition is only working on <32 bit >>> +#numbers, which don't lose precision. (See point 5) >>> # >>> # 4) MAWK doesn't support plain hex constants (an optional part of the >>> POSIX >>> #spec), and GAWK and MAWK can't agree on how to work with hex >>> constants in >>> #a string. Use the shell to convert $vma_lo to decimal before passing >>> to >>> #AWK. >>> # >>> +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be >>> +#evaluated as long, which in 32bit shells turns negative if bit 31 of >>> the >>> +#VMA is set. AWK then interprets this negative number as a double >>> before >>> +#adding the offsets from the binary grep. >>> +# >>> +#Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split. >>> +# >>> +#The consequence of this is that for all offsets, $vma_lo + offset >>> needs >>> +#to be less that 256M (i.e. 7 nibbles) so as to be successfully >>> recombined >>> +#with the 9 nibbles of $vma_hi. This is fine; .text is at the start >>> of a >>> +#1G aligned region, and Xen is far far smaller than 256M, but leave >>> safety >>> +#check nevertheless. >>> +# >>> eval $(${OBJDUMP} -j .text $1 -h | >>> -$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, >>> 8), substr($4, 9, 16)}') >>> +$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, >>> 9), substr($4, 10, 16)}') >>> >>> ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN >>> >>> +bin_sz=$(stat -c '%s' $TEXT_BIN) >>> +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] && >>> +{ echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; } >> ... s/can/cannot/ ? > > Why? "Can" is correct here. If the offsets can't exceed 256M, then > everything is good. Hmm, the wording then indeed is ambiguous. I read "can" as "are allowed to", when we mean "aren't allowed to". Maybe ".text is 256M or more in size"? If you mention "offsets", then I think the check should be based on actually observing an offset which is too large (which .text size alone doesn't guarantee will happen). Jan
Re: [PATCH v2 2/2] xen: Fix latent check-endbr.sh bug with 32bit build environments
On 18/07/2022 10:11, Jan Beulich wrote: > On 15.07.2022 15:26, Andrew Cooper wrote: >> While Xen's current VMA means it works, the mawk fix (i.e. using $((0xN)) in >> the shell) isn't portable in 32bit shells. See the code comment for the fix. >> >> The fix found a second latent bug. Recombining $vma_hi/lo should have used >> printf "%s%08x" and only worked previously because $vma_lo had bits set in >> it's top nibble. Combining with the main fix, %08x becomes %07x. >> >> Fixes: $XXX patch 1 >> Reported-by: Jan Beulich >> Signed-off-by: Andrew Cooper > Reviewed-by: Jan Beulich Thanks, but... > with, I guess, ... > >> --- a/xen/tools/check-endbr.sh >> +++ b/xen/tools/check-endbr.sh >> @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep 'endbr64 *$' | >> cut -f 1 -d ':' > $VALID & >> #the lower bits, rounding integers to the nearest 4k. >> # >> #Instead, use the fact that Xen's .text is within a 1G aligned region, >> and >> -#split the VMA in half so AWK's numeric addition is only working on 32 >> bit >> -#numbers, which don't lose precision. >> +#split the VMA so AWK's numeric addition is only working on <32 bit >> +#numbers, which don't lose precision. (See point 5) >> # >> # 4) MAWK doesn't support plain hex constants (an optional part of the POSIX >> #spec), and GAWK and MAWK can't agree on how to work with hex constants >> in >> #a string. Use the shell to convert $vma_lo to decimal before passing >> to >> #AWK. >> # >> +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be >> +#evaluated as long, which in 32bit shells turns negative if bit 31 of >> the >> +#VMA is set. AWK then interprets this negative number as a double >> before >> +#adding the offsets from the binary grep. >> +# >> +#Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split. >> +# >> +#The consequence of this is that for all offsets, $vma_lo + offset needs >> +#to be less that 256M (i.e. 7 nibbles) so as to be successfully >> recombined >> +#with the 9 nibbles of $vma_hi. This is fine; .text is at the start of >> a >> +#1G aligned region, and Xen is far far smaller than 256M, but leave >> safety >> +#check nevertheless. >> +# >> eval $(${OBJDUMP} -j .text $1 -h | >> -$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), >> substr($4, 9, 16)}') >> +$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 9), >> substr($4, 10, 16)}') >> >> ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN >> >> +bin_sz=$(stat -c '%s' $TEXT_BIN) >> +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] && >> +{ echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; } > ... s/can/cannot/ ? Why? "Can" is correct here. If the offsets can't exceed 256M, then everything is good. ~Andrew
Re: [PATCH v2 2/2] xen: Fix latent check-endbr.sh bug with 32bit build environments
On 15.07.2022 15:26, Andrew Cooper wrote: > While Xen's current VMA means it works, the mawk fix (i.e. using $((0xN)) in > the shell) isn't portable in 32bit shells. See the code comment for the fix. > > The fix found a second latent bug. Recombining $vma_hi/lo should have used > printf "%s%08x" and only worked previously because $vma_lo had bits set in > it's top nibble. Combining with the main fix, %08x becomes %07x. > > Fixes: $XXX patch 1 > Reported-by: Jan Beulich > Signed-off-by: Andrew Cooper Reviewed-by: Jan Beulich with, I guess, ... > --- a/xen/tools/check-endbr.sh > +++ b/xen/tools/check-endbr.sh > @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | > cut -f 1 -d ':' > $VALID & > #the lower bits, rounding integers to the nearest 4k. > # > #Instead, use the fact that Xen's .text is within a 1G aligned region, > and > -#split the VMA in half so AWK's numeric addition is only working on 32 > bit > -#numbers, which don't lose precision. > +#split the VMA so AWK's numeric addition is only working on <32 bit > +#numbers, which don't lose precision. (See point 5) > # > # 4) MAWK doesn't support plain hex constants (an optional part of the POSIX > #spec), and GAWK and MAWK can't agree on how to work with hex constants > in > #a string. Use the shell to convert $vma_lo to decimal before passing to > #AWK. > # > +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be > +#evaluated as long, which in 32bit shells turns negative if bit 31 of the > +#VMA is set. AWK then interprets this negative number as a double before > +#adding the offsets from the binary grep. > +# > +#Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split. > +# > +#The consequence of this is that for all offsets, $vma_lo + offset needs > +#to be less that 256M (i.e. 7 nibbles) so as to be successfully > recombined > +#with the 9 nibbles of $vma_hi. This is fine; .text is at the start of a > +#1G aligned region, and Xen is far far smaller than 256M, but leave > safety > +#check nevertheless. > +# > eval $(${OBJDUMP} -j .text $1 -h | > -$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), > substr($4, 9, 16)}') > +$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 9), > substr($4, 10, 16)}') > > ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN > > +bin_sz=$(stat -c '%s' $TEXT_BIN) > +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] && > +{ echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; } ... s/can/cannot/ ? Jan
[PATCH v2 2/2] xen: Fix latent check-endbr.sh bug with 32bit build environments
While Xen's current VMA means it works, the mawk fix (i.e. using $((0xN)) in the shell) isn't portable in 32bit shells. See the code comment for the fix. The fix found a second latent bug. Recombining $vma_hi/lo should have used printf "%s%08x" and only worked previously because $vma_lo had bits set in it's top nibble. Combining with the main fix, %08x becomes %07x. Fixes: $XXX patch 1 Reported-by: Jan Beulich Signed-off-by: Andrew Cooper --- CC: George Dunlap CC: Jan Beulich CC: Stefano Stabellini CC: Wei Liu CC: Julien Grall CC: Anthony PERARD CC: Luca Fancellu CC: Mathieu Tarral CC: Bertrand Marquis v2: * New --- xen/tools/check-endbr.sh | 25 + 1 file changed, 21 insertions(+), 4 deletions(-) diff --git a/xen/tools/check-endbr.sh b/xen/tools/check-endbr.sh index b3febd6a4ccc..d6aa117de13b 100755 --- a/xen/tools/check-endbr.sh +++ b/xen/tools/check-endbr.sh @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | cut -f 1 -d ':' > $VALID & #the lower bits, rounding integers to the nearest 4k. # #Instead, use the fact that Xen's .text is within a 1G aligned region, and -#split the VMA in half so AWK's numeric addition is only working on 32 bit -#numbers, which don't lose precision. +#split the VMA so AWK's numeric addition is only working on <32 bit +#numbers, which don't lose precision. (See point 5) # # 4) MAWK doesn't support plain hex constants (an optional part of the POSIX #spec), and GAWK and MAWK can't agree on how to work with hex constants in #a string. Use the shell to convert $vma_lo to decimal before passing to #AWK. # +# 5) Point 4 isn't fully portable. POSIX only requires that $((0xN)) be +#evaluated as long, which in 32bit shells turns negative if bit 31 of the +#VMA is set. AWK then interprets this negative number as a double before +#adding the offsets from the binary grep. +# +#Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split. +# +#The consequence of this is that for all offsets, $vma_lo + offset needs +#to be less that 256M (i.e. 7 nibbles) so as to be successfully recombined +#with the 9 nibbles of $vma_hi. This is fine; .text is at the start of a +#1G aligned region, and Xen is far far smaller than 256M, but leave safety +#check nevertheless. +# eval $(${OBJDUMP} -j .text $1 -h | -$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), substr($4, 9, 16)}') +$AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 9), substr($4, 10, 16)}') ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN +bin_sz=$(stat -c '%s' $TEXT_BIN) +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] && +{ echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; } + # instruction:hex: oct: # endbr64 f3 0f 1e fa363 017 036 372 # endbr32 f3 0f 1e fb363 017 036 373 @@ -84,7 +101,7 @@ then else grep -aob -e "$(printf '\363\17\36\372')" -e "$(printf '\363\17\36\373')" \ -e "$(printf '\146\17\37\1')" $TEXT_BIN -fi | $AWK -F':' '{printf "%s%x\n", "'$vma_hi'", int('$((0x$vma_lo))') + $1}' > $ALL +fi | $AWK -F':' '{printf "%s%07x\n", "'$vma_hi'", int('$((0x$vma_lo))') + $1}' > $ALL # Wait for $VALID to become complete wait -- 2.11.0