Alejandro, Hola.

Alejandro Colomar wrote in
 <aZeOM3sbAm6t-mq6@devuan>:
 |On 2026-02-19T23:18:16+0100, Steffen Nurpmeso wrote:
 |> Alejandro Colomar wrote in
 |>  <aZeGabzZUpS4fT4B@devuan>:
 |>|On 2026-02-19T22:15:29+0100, Steffen Nurpmeso wrote:
 |>|> Crystal Kolipe via Mutt-dev wrote in
 |>|>  <[email protected]>:
 |>|>|On Thu, Feb 19, 2026 at 02:32:45PM +0100, Alejandro Colomar via
 |>|>|Mutt-dev wrote:
 |>|> 
 |>|> I think "the best" is a table lookup.
 ...
 |>|By "best" are you optimizing for performance or readability?  I prefer
 |>|readability any day.  We're not libc nor cc(1); we don't need to
 |>|have the absolute best performance, or we'd be writing assembly (this is
 |>|what glibc does, indeed).  We can get 80% of the juice with trivial
 |>|one-liners that get optimized by the compiler and libc.
 |> 
 |> You were talking assembly!  But i am too, yes, i mean x86 has
 ...
 |> If you make the table 256 bytes you can even scratch "is-7-bit"
 |> check and simply do table[(u8)..char..] & BIT1[|BIT2..].
 |> (Ie, that is the other nice thing, you could test certain things
 |> with one lookup, ie space|punct|alnum, or whatever.)
 |
 |That's indeed what I'd expect of a good libc implementation.  But
 |I don't want to do that.  Readability is much more important to me.

Not that i insist on anything, but it is just as readable, since
the table sits behind macros.
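
A minimal sketch of such a table behind macros, only to illustrate
the idea.  All names (CT_SPACE, ct_table, ct_init(), ct_is() etc.)
are made up for this example and are not taken from mutt or any
libc; it assumes an ASCII execution character set, and it fills the
table at startup for brevity where a real one would more likely be
a static initializer:

```c
#include <stdint.h>

/* Hypothetical class bits; the names are made up for this sketch. */
enum {
    CT_SPACE = 1 << 0,
    CT_PUNCT = 1 << 1,
    CT_ALNUM = 1 << 2
};

/* One byte of class bits per possible char value; bytes >= 0x80 stay 0,
 * which is why no separate "is-7-bit" check is needed. */
static uint8_t ct_table[256];

/* Fill the table once at startup (assumes an ASCII execution charset). */
static void ct_init(void)
{
    static const char spaces[] = " \t\n\v\f\r";
    const char *p;
    int c;

    /* Printable, non-space ASCII is either alphanumeric or punctuation. */
    for (c = 0x21; c <= 0x7E; c++) {
        if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
            (c >= 'a' && c <= 'z'))
            ct_table[c] = CT_ALNUM;
        else
            ct_table[c] = CT_PUNCT;
    }
    for (p = spaces; *p != '\0'; p++)
        ct_table[(uint8_t)*p] = CT_SPACE;
}

/* The caller only ever sees the macros; one lookup can test several bits. */
#define ct_is(c, mask)  ((ct_table[(uint8_t)(c)] & (mask)) != 0)
#define ct_isspace(c)   ct_is(c, CT_SPACE)
#define ct_iswordish(c) ct_is(c, CT_PUNCT | CT_ALNUM)
```

After one ct_init() call, ct_isspace(*s) reads much like isspace(3)
would, and a combined test like ct_is(c, CT_SPACE | CT_PUNCT) still
costs a single lookup.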

 |I was checking the assembly to show that even though the code looked
 |quite heavy at first glance, it wasn't that much (compared to other
 |implementations that use more basic string functions).
 |
 |Of course, a LUT beats anything, but the implementation is far from
 |readable.  And it only works for isascii(3) functions, but not so much
 |for skipws(), because you need to write a loop, at which point you don't
 |necessarily beat strspn(s,CTYPE_SPACE_C) anymore.  So isascii(3) APIs
 |are only so fast in theory; most often, the fastest approach is to not
 |use isascii(3) at all, but use strspn(s," \t") or something similar.

That surely depends on the implementation.  DragonFly BSD via
Matthew Dillon for example threw away all the assembler optimized
variants of strlen() because the simplest C variant was
optimized so well by the compiler.  That is really true; i tried
it out, and i was so shocked, having spent *time* with x86
assembler optimizations, including even alignment directives
(which, at least CPU dependently, really make a difference; just
like reverse vs forward string stuff, even if you use repetition
prefixes so that the CPU *should* be able to preload or whatever
as it knows the size; all that stuff that was used to get things
better in the past..).  (This i *think* was over a decade ago.)

It must be said, in the meantime i think they brought them back:
at least on FreeBSD a German (Clausecker) had a tremendous run on
assembler optimizations of lots of string etc functions on various
CPUs, doing benchmarks and all that.  (I think a friend of Jörg
Schilling, and all that started after Jörg had died, if i recall
the timeline correctly.)

Anyway.. i for myself would always prefer a table thing if that is
possible.  Ie avoid functions.  On the other hand .. compilers are
so aggressive that you possibly get loop unfolding to whatever
extent, so my claim on code reduction may not necessarily be true.
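
To make the two candidates concrete, here is a rough sketch of
skipping " \t" both ways.  The function and table names are made up
for illustration; nothing is claimed here about which one is faster,
that depends on the compiler and libc and has to be measured:

```c
#include <string.h>

/* Variant 1: a one-liner that leaves the scanning to libc. */
static const char *skipws_strspn(const char *s)
{
    return s + strspn(s, " \t");
}

/* Variant 2: a hand-written loop over a 256-entry table.
 * All entries not named here, including the one for NUL, are 0,
 * so the loop also stops at the end of the string. */
static const unsigned char ws_table[256] = { [' '] = 1, ['\t'] = 1 };

static const char *skipws_table(const char *s)
{
    while (ws_table[(unsigned char)*s])
        s++;
    return s;
}
```

Both return a pointer to the first byte that is neither space nor
tab; the table variant avoids the function call, but whether the
compiler then unfolds the loop is out of the programmer's hands.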

(And to add that in relation to what else has to be done in that
misdesigned email area -- i have just been pointed to a BSDCon
presentation of Eric Allman from 2019, "Lessons learned from
Sendmail", and watched it on the 18th; he said some very true
things about the email area of the IETF, as i see it, and not
meaning the personal comment -- iterating over for example " \t"
is not the pain point, performance wise; ie, if it were up to me,
then ISO C would disallow loop unfolding, unless i tag it via some
keyword like "unfold" or whatever.)

Ciao and good night from Germany, dear Alejandro.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
