Re: misc/uniutils fix %n

Theo de Raadt Fri, 10 Sep 2021 17:25:42 -0700

I do not like your style of adjusting the counter with -=.

The character counting approach looks fragile.


I suspect some of these refactorings are better if MULTIPLE *printf calls
are used, instead of one call.

Cut the call into two at the %n.
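
A minimal sketch of that split, not the committed fix. The format string
is taken from the patch below; print_padded and its parameters are
hypothetical names that stand in for the code around the call in
ExplicateUTF8.c. The return value of the first printf is the number of
characters written before the point where %n used to sit, so no
subtraction is needed.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of uniutils.
 * Prints the "padded" line with two printf calls, split where %n was.
 * Returns the width of the text before the bit string,
 * or -1 if either printf call fails.
 */
int
print_padded(int zeros, const char *bits)
{
    int prefix;

    /* First call: everything that preceded %n in the original format. */
    prefix = printf("This is padded to 32 places with %d zeros: ", zeros);
    if (prefix < 0)
        return -1;

    /* Second call: everything that followed %n. */
    if (printf("%s\n", bits) < 0)
        return -1;

    return prefix;
}
```

In the port this would presumably be used as
spaces = print_padded(32 - GotBits, binfmtl(ch)); with a two-digit zero
count the returned width is 43.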

Stefan Hagen <[email protected]> wrote:

> Hi,
> 
> Another %n fix.
> 
> There's actually a change in behavior, which I can't explain:
> 
> BEFORE:
> $ ExplicateUTF8 CanterburyPieces.txt
> The sequence 0xEF     0xBB     0xBF
>              11101111 10111011 10111111
> is a valid UTF-8 character encoding equivalent to UTF32 0x0000FEFF.
> The first byte tells us that there should be 2
> continuation bytes since it begins with 3 contiguous 1s.
> There are 2 following bytes and all are valid
> continuation bytes since they all have high bits 10.
> The first byte contributes its low 4 bits.
> The remaining bytes each contribute their low 6 bits,
> for a total of 16 bits: 1111 111011 111111
> Abort trap
> 
> AFTER:
> $ ExplicateUTF8 CanterburyPieces.txt
> The sequence 0xEF     0xBB     0xBF
>              11101111 10111011 10111111
> is a valid UTF-8 character encoding equivalent to UTF32 0x0000FEFF.
> The first byte tells us that there should be 2
> continuation bytes since it begins with 3 contiguous 1s.
> There are 2 following bytes and all are valid
> continuation bytes since they all have high bits 10.
> The first byte contributes its low 4 bits.
> The remaining bytes each contribute their low 6 bits,
> for a total of 16 bits: 1111 111011 111111
> This is padded to 32 places with 16 zeros: 
> 0000000000000000000000000000000000000000000000001111111011111111
>                                             0   0   0   0   F   E   F   F
> 
> 
> If anyone wants to try it, my test file is here:
> https://codevoid.de/0/p/CanterburyPieces.txt
> 
> OK?
> 
> Best regards,
> Stefan
> 
> Index: misc/uniutils/Makefile
> ===================================================================
> RCS file: /cvs/ports/misc/uniutils/Makefile,v
> retrieving revision 1.9
> diff -u -p -u -p -r1.9 Makefile
> --- misc/uniutils/Makefile    28 Jun 2021 21:34:19 -0000      1.9
> +++ misc/uniutils/Makefile    10 Sep 2021 21:09:48 -0000
> @@ -3,7 +3,7 @@
>  COMMENT=     Unicode utilities
>  
>  DISTNAME=    uniutils-2.27
> -REVISION=    3
> +REVISION=    4
>  CATEGORIES=  misc
>  
>  HOMEPAGE=    http://billposer.org/Software/unidesc.html
> Index: misc/uniutils/patches/patch-ExplicateUTF8_c
> ===================================================================
> RCS file: misc/uniutils/patches/patch-ExplicateUTF8_c
> diff -N misc/uniutils/patches/patch-ExplicateUTF8_c
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ misc/uniutils/patches/patch-ExplicateUTF8_c       10 Sep 2021 21:09:48 -0000
> @@ -0,0 +1,17 @@
> +$OpenBSD$
> +
> +Remove %n format specifier
> +
> +Index: ExplicateUTF8.c
> +--- ExplicateUTF8.c.orig
> ++++ ExplicateUTF8.c
> +@@ -214,7 +214,8 @@ main(int ac, char **av){
> +     printf("%s ",tempstr); 
> +   }
> +   printf("\n");
> +-  printf("This is padded to 32 places with %d zeros: %n%s\n",(32-GotBits),&spaces,binfmtl(ch));
> ++  spaces = printf("This is padded to 32 places with %d zeros: %s\n",(32-GotBits),binfmtl(ch));
> ++  spaces -= strlen(binfmtl(ch));
> ++  spaces -= strlen(binfmtl(ch));
> +   sprintf(tempstr,"                                ");
> +   sprintf(tempstr,"%08lX",ch);
> +   tempstr[28] = tempstr[7];
>
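
One concrete way the subtraction in the patch above can drift, sketched
under the assumption that spaces is meant to hold the same value %n
stored. printf returns the count of everything it wrote, including the
trailing newline, so subtracting only strlen() of the bit string leaves
the result one larger than the %n value. prefix_by_subtraction is a
hypothetical name; whether later code in ExplicateUTF8.c compensates for
the difference is not visible in this diff.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the arithmetic in the patch:
 * one printf call, then subtract the length of the bit string.
 * The result still counts the '\n' that follows the bit string.
 * Returns -1 if printf fails.
 */
int
prefix_by_subtraction(int zeros, const char *bits)
{
    int n;

    n = printf("This is padded to 32 places with %d zeros: %s\n",
        zeros, bits);
    if (n < 0)
        return -1;

    /* n = prefix width + strlen(bits) + 1 for the newline. */
    return n - (int)strlen(bits);
}
```

For a zero count of 16 and a 32-character bit string this returns 44,
while %n at the same position would have stored 43.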
