Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

J Decker Sat, 17 Feb 2018 15:51:58 -0800

On Sat, Feb 17, 2018 at 3:36 PM, Richard Hipp <[email protected]> wrote:


> On 2/17/18, Ralf Junker <[email protected]> wrote:
> > Example SQL:
> >
> > select
> >    length(printf ('%4s', 'abc')),
> >    length(printf ('%4s', 'äöü')),
> >    length(printf ('%-4s', 'abc')),
> >    length(printf ('%-4s', 'äöü'))
> >
> > Output is 4, 3, 4, 3. Padding seems to take into account UTF-8 bytes
> > instead of UTF-8 code points.
> >
> > Should padding not work on code points and output 4 in all cases as
> > requested?
>
> The current behavior of the printf() function in SQLite, goofy though
> it may be, exactly mirrors the behavior of the printf() C function in
> the standard library in this regard.
>
> So I'm not sure whether or not this is something that ought to be "fixed".
>
The length() SQL function and the other character functions (rtrim/ltrim)
attempt to deal with code points, not bytes...

Maybe add a function, something like  `u8length( string, count )`,  which
returns the number of bytes needed for the string to fill count characters.
That could be passed as the width to
printf( '%-*s',  u8length( 'äöü', 4 ),  'äöü' )
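
To make the idea concrete, here is a rough C sketch of the same trick against the standard library printf() family. The names u8_codepoints, u8_width and u8_pad_left are made up for illustration and are not part of SQLite or libc; the sketch assumes the input is valid UTF-8 and that the C library counts the %s field width in bytes.

```c
#include <stdio.h>
#include <string.h>

/* Count UTF-8 code points: every byte that is not a
 * continuation byte (bit pattern 10xxxxxx) starts a new one.
 * Assumes s is valid, NUL-terminated UTF-8. */
static size_t u8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* Hypothetical helper: the byte width to hand to "%*s" so that
 * s ends up padded to `count` code points. Each multi-byte
 * code point needs the width widened by its extra bytes. */
static int u8_width(const char *s, int count)
{
    size_t bytes = strlen(s);
    size_t chars = u8_codepoints(s);
    return count + (int)(bytes - chars);
}

/* Left-align s in a field of `count` code points, writing the
 * result to out. Returns what snprintf() returns. */
static int u8_pad_left(char *out, size_t outsz, const char *s, int count)
{
    return snprintf(out, outsz, "%-*s", u8_width(s, count), s);
}
```

For 'äöü' (3 code points, 6 bytes) and count 4, u8_width() gives 7, so the padded result is 7 bytes long and 4 code points wide. For plain 'abc' it gives 4, same as today.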



> --
> D. Richard Hipp
> [email protected]
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
