Re: start cleaning up UTF-8 processing in less(1)

2019-02-25 Thread Nicholas Marriott
Looks good, ok nicm

On Tue, Feb 26, 2019 at 02:39:14AM +0100, Ingo Schwarze wrote:
> Hi Todd,
>
> Todd C. Miller wrote on Mon, Feb 25, 2019 at 01:06:02PM -0700:
> > On Mon, 25 Feb 2019 19:43:36 +0100, Ingo Schwarze wrote:
> >> Todd C. Miller wrote on Mon, Feb 25, 2019 at 09:45:12AM -0700:
>

Re: start cleaning up UTF-8 processing in less(1)

2019-02-25 Thread Todd C . Miller
On Tue, 26 Feb 2019 02:39:14 +0100, Ingo Schwarze wrote:
> In that case, let's just use an index rather than a pointer;
> diff otherwise unchanged.

OK millert@

 - todd

Re: start cleaning up UTF-8 processing in less(1)

2019-02-25 Thread Ingo Schwarze
Hi Todd, Todd C. Miller wrote on Mon, Feb 25, 2019 at 01:06:02PM -0700: > On Mon, 25 Feb 2019 19:43:36 +0100, Ingo Schwarze wrote: >> Todd C. Miller wrote on Mon, Feb 25, 2019 at 09:45:12AM -0700: >>> On Mon, 25 Feb 2019 12:39:41 +0100, Ingo Schwarze wrote: Index: line.c >> [...] @@

Re: start cleaning up UTF-8 processing in less(1)

2019-02-25 Thread Todd C . Miller
On Mon, 25 Feb 2019 19:43:36 +0100, Ingo Schwarze wrote:
> Todd C. Miller wrote on Mon, Feb 25, 2019 at 09:45:12AM -0700:
>
> > On Mon, 25 Feb 2019 12:39:41 +0100, Ingo Schwarze wrote:
>
> >> Index: line.c
> [...]
> >> @@ -469,11 +469,10 @@ in_ansi_esc_seq(void)
> >> * Search backwards for

Re: start cleaning up UTF-8 processing in less(1)

2019-02-25 Thread Ingo Schwarze
Hi Todd,

Todd C. Miller wrote on Mon, Feb 25, 2019 at 09:45:12AM -0700:
> One question inline.
> On Mon, 25 Feb 2019 12:39:41 +0100, Ingo Schwarze wrote:
>> Index: line.c
[...]
>> @@ -469,11 +469,10 @@ in_ansi_esc_seq(void)
>> * Search backwards for either an ESC (which means we ARE in

Re: start cleaning up UTF-8 processing in less(1)

2019-02-25 Thread Todd C . Miller
On Mon, 25 Feb 2019 12:39:41 +0100, Ingo Schwarze wrote:

One question inline.

 - todd

> Index: line.c
> ===
> RCS file: /cvs/src/usr.bin/less/line.c,v
> retrieving revision 1.23
> diff -u -p -r1.23 line.c
> --- line.c 24 Feb

Re: start cleaning up UTF-8 processing in less(1)

2019-02-25 Thread Ingo Schwarze
Hi,

Nicholas Marriott wrote on Mon, Feb 25, 2019 at 10:14:03AM +:
> Ingo Schwarze wrote:
>> During the upcoming cleanup steps, let us retain full support for
>> the first (ESC-[) syntax and let us completely delete support for
>> the second and third CSI syntaxes (single-byte CSI and UTF-8

Re: start cleaning up UTF-8 processing in less(1)

2019-02-25 Thread Nicholas Marriott
> During the upcoming cleanup steps, let us retain full support for
> the first (ESC-[) syntax and let us completely delete support for
> the second and third CSI syntaxes (single-byte CSI and UTF-8
> single-character two-byte CSI).
>
> If you are OK with that plan, i'll send diffs implementing

Re: start cleaning up UTF-8 processing in less(1)

2019-02-24 Thread Todd C . Miller
On Sun, 24 Feb 2019 11:40:10 +0100, Ingo Schwarze wrote:
> During the upcoming cleanup steps, let us retain full support for
> the first (ESC-[) syntax and let us completely delete support for
> the second and third CSI syntaxes (single-byte CSI and UTF-8
> single-character two-byte CSI).

That
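
For orientation, the three CSI introducer forms named in the quoted plan can be told apart at the byte level roughly as follows. This is only a sketch and is not taken from less(1): the names `enum csi_form` and `classify_csi()` are hypothetical, and the sketch assumes that "single-byte CSI" means the C1 control byte 0x9b and that the "two-byte" form is U+009B encoded in UTF-8 as 0xc2 0x9b.

```c
#include <stddef.h>

/* Hypothetical labels for the three CSI introducer forms. */
enum csi_form {
	CSI_NONE,		/* buffer does not start with a CSI introducer */
	CSI_ESC_BRACKET,	/* ESC [  (0x1b 0x5b), the form to be retained */
	CSI_SINGLE_BYTE,	/* raw C1 control byte 0x9b */
	CSI_UTF8_TWO_BYTE	/* U+009B encoded in UTF-8: 0xc2 0x9b */
};

/*
 * Classify the introducer at the start of buf, if any.
 * Hypothetical helper for illustration only.
 */
static enum csi_form
classify_csi(const unsigned char *buf, size_t len)
{
	if (len >= 2 && buf[0] == 0x1b && buf[1] == '[')
		return CSI_ESC_BRACKET;
	if (len >= 2 && buf[0] == 0xc2 && buf[1] == 0x9b)
		return CSI_UTF8_TWO_BYTE;
	if (len >= 1 && buf[0] == 0x9b)
		return CSI_SINGLE_BYTE;
	return CSI_NONE;
}
```

Under the quoted plan, only input classified as `CSI_ESC_BRACKET` would keep being treated as the start of an escape sequence.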

Re: start cleaning up UTF-8 processing in less(1)

2019-02-24 Thread Ingo Schwarze
Hi,

Stefan Sperling wrote on Sat, Feb 23, 2019 at 04:19:02PM +0100:
> Your diff looks good to me.
> And I can't see how it could make this situation any worse either.

thanks for checking; i committed the first patch.

To be able to continue with the less cleanup i started, i have to explain

Re: start cleaning up UTF-8 processing in less(1)

2019-02-23 Thread Stefan Sperling
On Sat, Feb 23, 2019 at 03:45:28PM +0100, Ingo Schwarze wrote:
>  * charset.c, control_char(), calls iscntrl((unsigned char)c)
>    on an LWCHAR value - which is absurd to the point of not even being
>    funny any longer, rather making you cry.

I got a good chuckle out of this :)

Your diff
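
To make the quoted complaint concrete, here is a small sketch of what goes wrong when a wide code point is cast to unsigned char before calling iscntrl(). It is not the code from charset.c: the helper name `truncating_is_control()` is hypothetical, the `LWCHAR` typedef is the one quoted in the thread, and the sketch assumes an 8-bit `unsigned char` and the default "C" locale.

```c
#include <ctype.h>

typedef unsigned long LWCHAR;	/* custom type quoted in the thread */

/*
 * Hypothetical helper mirroring the criticized pattern: the cast
 * keeps only the low 8 bits of the code point, so iscntrl() ends up
 * classifying an unrelated byte.
 */
static int
truncating_is_control(LWCHAR c)
{
	return iscntrl((unsigned char)c) != 0;
}
```

With this helper, U+010A (LATIN CAPITAL LETTER C WITH DOT ABOVE) is truncated to 0x0a, the newline byte, and is reported as a control character, while U+0141 is truncated to 0x41 ('A') and is not. Neither answer says anything about the original character.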

start cleaning up UTF-8 processing in less(1)

2019-02-23 Thread Ingo Schwarze
Hi,

Evan Silberman, by posting a well-founded bug report to bugs@, just drew my attention to the pigsty of having a hand-rolled re-implementation of Unicode character property handling within less(1).

The root of the evil is the existence of the custom definition

  typedef unsigned long LWCHAR;