Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-24 Thread Ralf Junker
On 18.02.2018 00:36, Richard Hipp wrote: So I'm not sure whether or not this is something that ought to be "fixed". I want to send a big Thank You! for your efforts to enhance the printf() string formatter: http://www.sqlite.org/src/info/c883c4d33f4cd722 I saw the check-in just now as I

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-20 Thread John McKown
On Tue, Feb 20, 2018 at 11:44 AM, Jens Alfke wrote: > > > > On Feb 19, 2018, at 7:49 PM, petern wrote: > > > > 3. Why can't SQLite have the expected common static SQL functions for > > getting rapid development done without external tools? > >

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-20 Thread Jens Alfke
> On Feb 19, 2018, at 7:49 PM, petern wrote: > > 3. Why can't SQLite have the expected common static SQL functions for > getting rapid development done without external tools? Because its primary use case is as an embedded library for programs, not as a

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-20 Thread J Decker
On Mon, Feb 19, 2018 at 7:49 PM, petern wrote: > There are other uses for padding strings besides user reports. Consider > scalar representations of computations, for example. Also: > > 1. There was no mention of user display formatting in Ralf's original > report.

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-20 Thread petern
There are other uses for padding strings besides user reports. Consider scalar representations of computations, for example. Also: 1. There was no mention of user display formatting in Ralf's original report. It was a bug report about missing inverse functionality for padding/trimming strings.

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread Simon Slavin
On 20 Feb 2018, at 1:38am, petern wrote: > Yet even so, as Ralf pointed out, the PostgreSQL lpad() and rpad() fill > with arbitrary string functionality would still be missing despite the > checked in printf() being more directly equivalent to the PostgreSQL >
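
For readers unfamiliar with the lpad()/rpad() behavior referred to here, the following is a minimal Python sketch of padding with an arbitrary fill string, counted in code points. It assumes the commonly documented PostgreSQL semantics (the fill string is repeated and cut to fit; an over-long input is truncated on the right). The function names mirror the SQL ones but this is an illustration only, not SQLite or PostgreSQL code.

```python
def _fill(fill: str, missing: int) -> str:
    """Repeat `fill` and cut it so that exactly `missing` code points remain."""
    return (fill * (missing // len(fill) + 1))[:missing]


def lpad(text: str, length: int, fill: str = " ") -> str:
    """Left-pad `text` to `length` code points (assumed lpad() semantics)."""
    if length <= 0:
        return ""
    if len(text) >= length or not fill:
        # Too long already (or nothing to pad with): truncate on the right.
        return text[:length]
    return _fill(fill, length - len(text)) + text


def rpad(text: str, length: int, fill: str = " ") -> str:
    """Right-pad `text` to `length` code points (assumed rpad() semantics)."""
    if length <= 0:
        return ""
    if len(text) >= length or not fill:
        return text[:length]
    return text + _fill(fill, length - len(text))


print(lpad("hi", 5, "xy"))                 # xyxhi
print(rpad("hi", 5, "xy"))                 # hixyx
print(lpad("\u00e4\u00f6\u00fc", 5, "*"))  # two asterisks, then the 3 umlauts
```

Because Python strings are sequences of code points, the multi-byte input is padded to five characters regardless of how many UTF-8 bytes it occupies.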

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread J Decker
On Mon, Feb 19, 2018 at 5:38 PM, petern wrote: > FYI. See http://www.sqlite.org/src/timeline for the equivalent DRH > checkins: http://www.sqlite.org/src/info/c883c4d33f4cd722 > Hopefully that branch will make a forthcoming trunk merge. [Printing > explicit nul

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread petern
FYI. See http://www.sqlite.org/src/timeline for the equivalent DRH checkins: http://www.sqlite.org/src/info/c883c4d33f4cd722 Hopefully that branch will make a forthcoming trunk merge. [Printing explicit nul terminator by formatting an interesting twist.] Yet even so, as Ralf pointed out, the

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread Cezary H. Noweta
Hello, On 2018-02-17 18:39, Ralf Junker wrote: Example SQL: select   length(printf ('%4s', 'abc')),   length(printf ('%4s', 'äöü')),   length(printf ('%-4s', 'abc')),   length(printf ('%-4s', 'äöü')) Output is 4, 3, 4, 3. Padding seems to take into account UTF-8 bytes instead of UTF-8

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread Keith Medcalf
traffic volume. >-Original Message- >From: sqlite-users [mailto:sqlite-users- >boun...@mailinglists.sqlite.org] On Behalf Of Ralf Junker >Sent: Saturday, 17 February, 2018 10:40 >To: sqlite-users@mailinglists.sqlite.org >Subject: [sqlite] printf() problem padding multi-byte UTF-8

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread Jens Alfke
> On Feb 19, 2018, at 2:54 AM, Ralf Junker wrote: > > 'です' are 2 codepoints according to > > http://www.fontspace.com/unicode/analyzer/?q=%E3%81%A7%E3%81%99 > > > The requested overall width is 4, so I

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread petern
As d3ck0r suggested, adding a byte_length() function would enable padding with spaces [but not general padding with arbitrary characters as lpad() and rpad() afford]. WITH points(p) AS (VALUES ('abc'), ('äöü'), ('です')) ,format(f) AS (VALUES ('%*s'), ('%-*s')) ,pad AS (SELECT p, f,
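
The query above is cut off in the archive, so the following Python sketch only illustrates the arithmetic such a workaround would rely on; it is not a reconstruction of the original SQL. It assumes a formatter that counts the field width in UTF-8 bytes (simulated by pad_by_bytes) and a byte_length() helper as proposed in the thread, emulated here in Python. Both names, and adjusted_width, are hypothetical.

```python
def byte_length(text: str) -> int:
    """Number of bytes in the UTF-8 encoding (stand-in for the proposed function)."""
    return len(text.encode("utf-8"))


def pad_by_bytes(text: str, width: int, left_justify: bool = False) -> str:
    """Simulate a '%*s' / '%-*s' formatter that counts width in bytes."""
    padding = " " * max(0, width - byte_length(text))
    return text + padding if left_justify else padding + text


def adjusted_width(text: str, width: int) -> int:
    """Widen the byte-based field by the bytes that are not code points."""
    return width + byte_length(text) - len(text)


# 'abc', 'äöü' and 'です', written with escapes to pin down the code points.
points = ["abc", "\u00e4\u00f6\u00fc", "\u3067\u3059"]

for p in points:
    padded = pad_by_bytes(p, adjusted_width(p, 4))
    print(repr(padded), len(padded))  # every result is 4 code points long
```

The adjustment works for space padding only, which matches the limitation noted in the message.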

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread J Decker
On Mon, Feb 19, 2018 at 2:54 AM, Ralf Junker wrote: > On 19.02.2018 09:50, Rowan Worth wrote: > > What is your expected answer for: >> >> select length(printf ('%4s', 'です')) >> > > 'です' are 2 codepoints according to > >

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread J Decker
On Mon, Feb 19, 2018 at 3:21 AM, Cezary H. Noweta wrote: > Hello, > > On 2018-02-18 00:36, Richard Hipp wrote: > >> The current behavior of the printf() function in SQLite, goofy though >> it may be, exactly mirrors the behavior of the printf() C function in >> the standard

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread Cezary H. Noweta
Hello, On 2018-02-18 00:36, Richard Hipp wrote: The current behavior of the printf() function in SQLite, goofy though it may be, exactly mirrors the behavior of the printf() C function in the standard library in this regard. So I'm not sure whether or not this is something that ought to be

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread Ralf Junker
On 19.02.2018 09:50, Rowan Worth wrote: What is your expected answer for: select length(printf ('%4s', 'です')) 'です' are 2 codepoints according to http://www.fontspace.com/unicode/analyzer/?q=%E3%81%A7%E3%81%99 The requested overall width is 4, so I would expect two added spaces
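
The expectation stated in this message can be checked with a small Python sketch that pads by code points. The helper pad_by_code_points is a hypothetical name used only for illustration; it is not how SQLite implements printf().

```python
def pad_by_code_points(text: str, width: int, left_justify: bool = False) -> str:
    """Pad with spaces until `text` is at least `width` code points long."""
    padding = " " * max(0, width - len(text))
    return text + padding if left_justify else padding + text


desu = "\u3067\u3059"  # 'です': U+3067 followed by U+3059

padded = pad_by_code_points(desu, 4)
print(len(desu))                  # 2 code points
print(len(desu.encode("utf-8")))  # 6 bytes in UTF-8
print(len(padded))                # 4: two spaces were added
```

Under byte-based counting, by contrast, the 6-byte string already exceeds the width of 4, so no spaces would be added and the result would stay at 2 code points.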

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread Rowan Worth
What is your expected answer for: select length(printf ('%4s', 'です')) -Rowan On 18 February 2018 at 01:39, Ralf Junker wrote: > Example SQL: > > select > length(printf ('%4s', 'abc')), > length(printf ('%4s', 'äöü')), > length(printf ('%-4s', 'abc')), >

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-19 Thread Ralf Junker
On 18.02.2018 00:36, Richard Hipp wrote: The current behavior of the printf() function in SQLite, goofy though it may be, exactly mirrors the behavior of the printf() C function in the standard library in this regard. SQLite3 is not C. SQLite3 text storage is always Unicode. Thus SQL text

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-17 Thread Dominique Pellé
Richard Hipp wrote: > On 2/17/18, Ralf Junker wrote: >> Example SQL: >> >> select >>length(printf ('%4s', 'abc')), >>length(printf ('%4s', 'äöü')), >>length(printf ('%-4s', 'abc')), >>length(printf ('%-4s', 'äöü')) >> >> Output is 4, 3, 4, 3.

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-17 Thread Cezary H. Noweta
Hello, On 2018-02-18 01:46, Peter Da Silva wrote: Printf's handling of unicode is inconsistent in other ways, too. I suspect that there's still undefined behavior floating around in there too. Even wprintf isn't entirely unsurprising: You have supplied examples which are exchanged with each

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-17 Thread Peter Da Silva
On 2018-02-17, at 17:36, Richard Hipp wrote: > The current behavior of the printf() function in SQLite, goofy though > it may be, exactly mirrors the behavior of the printf() C function in > the standard library in this regard. > > So I'm not sure whether or not this is

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-17 Thread J Decker
On Sat, Feb 17, 2018 at 3:36 PM, Richard Hipp wrote: > On 2/17/18, Ralf Junker wrote: > > Example SQL: > > > > select > >length(printf ('%4s', 'abc')), > >length(printf ('%4s', 'äöü')), > >length(printf ('%-4s', 'abc')), > >length(printf

Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-17 Thread Richard Hipp
On 2/17/18, Ralf Junker wrote: > Example SQL: > > select >length(printf ('%4s', 'abc')), >length(printf ('%4s', 'äöü')), >length(printf ('%-4s', 'abc')), >length(printf ('%-4s', 'äöü')) > > Output is 4, 3, 4, 3. Padding seems to take into account UTF-8 bytes >

[sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-17 Thread Ralf Junker
Example SQL: select length(printf ('%4s', 'abc')), length(printf ('%4s', 'äöü')), length(printf ('%-4s', 'abc')), length(printf ('%-4s', 'äöü')) Output is 4, 3, 4, 3. Padding seems to take into account UTF-8 bytes instead of UTF-8 code points. Should padding not work on code points
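
The reported output can be reproduced outside SQLite with a short Python sketch, assuming the formatter counts the field width in UTF-8 bytes while length() counts code points. The helper pad_by_bytes is a hypothetical stand-in for that behavior, not SQLite's actual implementation.

```python
def pad_by_bytes(text: str, width: int, left_justify: bool = False) -> str:
    """Pad with spaces until the UTF-8 encoding is at least `width` bytes long."""
    missing = max(0, width - len(text.encode("utf-8")))
    padding = " " * missing
    return text + padding if left_justify else padding + text


umlauts = "\u00e4\u00f6\u00fc"  # 'äöü': 3 code points, 6 bytes in UTF-8

results = [
    len(pad_by_bytes("abc", 4)),                       # like '%4s'
    len(pad_by_bytes(umlauts, 4)),                     # like '%4s'
    len(pad_by_bytes("abc", 4, left_justify=True)),    # like '%-4s'
    len(pad_by_bytes(umlauts, 4, left_justify=True)),  # like '%-4s'
]
print(results)  # [4, 3, 4, 3]
```

The umlaut string already occupies 6 bytes, so a byte-based width of 4 adds no padding and its length in code points stays at 3.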