I do not like your style of adjusting the counter with -=; the
character-counting approach looks fragile.
I suspect some of these refactorings are better if MULTIPLE *printf
calls are used instead of one call. Cut the call into two at the %n.

Stefan Hagen <[email protected]> wrote:

> Hi,
>
> Another %n fix.
>
> There's actually a change in behavior, which I can't explain:
>
> BEFORE:
> $ ExplicateUTF8 CanterburyPieces.txt
> The sequence 0xEF 0xBB 0xBF
> 11101111 10111011 10111111
> is a valid UTF-8 character encoding equivalent to UTF32 0x0000FEFF.
> The first byte tells us that there should be 2
> continuation bytes since it begins with 3 contiguous 1s.
> There are 2 following bytes and all are valid
> continuation bytes since they all have high bits 10.
> The first byte contributes its low 4 bits.
> The remaining bytes each contribute their low 6 bits,
> for a total of 16 bits: 1111 111011 111111
> Abort trap
>
> AFTER:
> $ ExplicateUTF8 CanterburyPieces.txt
> The sequence 0xEF 0xBB 0xBF
> 11101111 10111011 10111111
> is a valid UTF-8 character encoding equivalent to UTF32 0x0000FEFF.
> The first byte tells us that there should be 2
> continuation bytes since it begins with 3 contiguous 1s.
> There are 2 following bytes and all are valid
> continuation bytes since they all have high bits 10.
> The first byte contributes its low 4 bits.
> The remaining bytes each contribute their low 6 bits,
> for a total of 16 bits: 1111 111011 111111
> This is padded to 32 places with 16 zeros:
> 0000000000000000000000000000000000000000000000001111111011111111
> 0 0 0 0 F E F F
>
>
> If anyone wants to try it, my test file is here:
> https://codevoid.de/0/p/CanterburyPieces.txt
>
> OK?
>
> Best regards,
> Stefan
>
> Index: misc/uniutils/Makefile
> ===================================================================
> RCS file: /cvs/ports/misc/uniutils/Makefile,v
> retrieving revision 1.9
> diff -u -p -u -p -r1.9 Makefile
> --- misc/uniutils/Makefile 28 Jun 2021 21:34:19 -0000 1.9
> +++ misc/uniutils/Makefile 10 Sep 2021 21:09:48 -0000
> @@ -3,7 +3,7 @@
> COMMENT= Unicode utilities
>
> DISTNAME= uniutils-2.27
> -REVISION= 3
> +REVISION= 4
> CATEGORIES= misc
>
> HOMEPAGE= http://billposer.org/Software/unidesc.html
> Index: misc/uniutils/patches/patch-ExplicateUTF8_c
> ===================================================================
> RCS file: misc/uniutils/patches/patch-ExplicateUTF8_c
> diff -N misc/uniutils/patches/patch-ExplicateUTF8_c
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ misc/uniutils/patches/patch-ExplicateUTF8_c 10 Sep 2021 21:09:48 -0000
> @@ -0,0 +1,17 @@
> +$OpenBSD$
> +
> +Remove %n format specifier
> +
> +Index: ExplicateUTF8.c
> +--- ExplicateUTF8.c.orig
> ++++ ExplicateUTF8.c
> +@@ -214,7 +214,8 @@ main(int ac, char **av){
> + printf("%s ",tempstr);
> + }
> + printf("\n");
> +- printf("This is padded to 32 places with %d zeros: %n%s\n",(32-GotBits),&spaces,binfmtl(ch));
> ++ spaces = printf("This is padded to 32 places with %d zeros: %s\n",(32-GotBits),binfmtl(ch));
> ++ spaces -= strlen(binfmtl(ch));
> + sprintf(tempstr," ");
> + sprintf(tempstr,"%08lX",ch);
> + tempstr[28] = tempstr[7];
>
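
To illustrate cutting the call in two at the %n, here is a minimal
sketch, not a tested patch against uniutils. print_padded() and its
parameter names are hypothetical; in ExplicateUTF8.c the arguments
would be (32-GotBits) and binfmtl(ch). The return value of the first
printf is the width of the prefix, which is what %n used to store.
The single-call variant with -= also counts the trailing newline, so
it ends up one larger than the old %n value.

```c
#include <stdio.h>

/*
 * Hypothetical helper: print the line in two printf calls, split at
 * the point where the %n used to sit.  Returns the number of bytes
 * written before the value (the old %n result), or -1 on error.
 */
static int
print_padded(int zeros, const char *value)
{
    int spaces;

    /* First call: everything that preceded the %n. */
    spaces = printf("This is padded to 32 places with %d zeros: ", zeros);
    if (spaces < 0)
        return -1;

    /* Second call: everything that followed the %n. */
    if (printf("%s\n", value) < 0)
        return -1;

    return spaces;
}
```

With this shape there is no strlen() arithmetic, and the function that
produces the value is called only once.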
