Re: [hackers] help

2023-08-03 Thread Laslo Hunhold
On Thu, 3 Aug 2023 12:29:04 +0100
Christopher Lang  wrote:

Dear Christopher,

> I was trying to subscribe to the mailing list but didn't get a
> response from  so I tried this. I think I
> have figured it out now though.
> Sorry if I pinged you. And I'm fine haha.
> PS. I'm sorry if this is html mail, still trying to set up a better
> email client.

how can we be sure it's you and not your kidnapper? HANG IN THERE,
CHRISTOPHER, IF YOU CAN HEAR THIS!

With best regards

Laslo



Re: [hackers] [quark][PATCH] Fix buffer over-read in decode()

2023-02-26 Thread Laslo Hunhold
On Sun, 21 Aug 2022 20:09:16 +
HushBugger  wrote:

> On Wed, 2022-08-17 at 08:49 +0600, NRK wrote:
> > I think the `s++` should be removed from the for loop and `s` should
> > be incremented as needed inside the loop instead.  
> 
> Agreed. I've changed it.

Thank you for working out this patch, I have applied it! :)
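
For readers without the patch at hand, the suggestion can be sketched as follows. This is a minimal, hypothetical percent-decoder (the name decode_sketch() and its signature are assumptions, not quark's actual code) that advances `s` inside the loop body, so a truncated escape such as a trailing "%4" is never read past the terminating NUL.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical percent-decoder: writes the decoded form of src into
 * dest (which must hold at least strlen(src) + 1 bytes) and returns
 * the decoded length.
 */
size_t
decode_sketch(const char *src, char *dest)
{
	const char *s = src;
	size_t i = 0;

	while (*s != '\0') {
		/* the short-circuit stops at a NUL before s[2] is read */
		if (s[0] == '%' && isxdigit((unsigned char)s[1]) &&
		    isxdigit((unsigned char)s[2])) {
			char hex[3] = { s[1], s[2], '\0' };

			dest[i++] = (char)strtoul(hex, NULL, 16);
			s += 3; /* consume the whole escape */
		} else {
			dest[i++] = *s;
			s += 1; /* consume a single byte */
		}
	}
	dest[i] = '\0';

	return i;
}
```

Because the increment depends on what was consumed, there is no unconditional `s++` in a for-header that could step over the terminator.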



Re: [hackers] [quark][PATCH] Fix strftime error handling

2023-02-26 Thread Laslo Hunhold
On Fri,  8 Jul 2022 11:12:17 -0700
robert  wrote:

> Unlike snprintf, strftime buffer contents are undefined when it fails,
> so make sure the buffer is null-terminated. To prevent garbage from
> being printed out, we simply set the timestamp to the empty string,
> but maybe setting it to "unknown time" or something similar would be
> better. Either way, I don't think this can fail until year 1, so
> it's not a big deal.
> ---
>  connection.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/connection.c b/connection.c
> index 8aca2ab..24de809 100644
> --- a/connection.c
> +++ b/connection.c
> @@ -31,7 +31,8 @@ connection_log(const struct connection *c)
>          if (!strftime(tstmp, sizeof(tstmp), "%Y-%m-%dT%H:%M:%SZ",
>                        gmtime(&(time_t){time(NULL)}))) {
>                  warn("strftime: Exceeded buffer capacity");
> -                /* continue anyway (we accept the truncation) */
> +                tstmp[0] = '\0'; /* tstmp contents are undefined on failure */
> +                /* continue anyway */
>          }
>  
>   /* generate address-string */
> -- 
> 2.17.1

Thank you, I have applied your patch!
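
The pattern from the patch can be sketched in isolation as below. The helper name format_timestamp() and its signature are assumptions made for illustration and are not part of quark; the point is only that the buffer is reset whenever strftime() reports failure by returning 0.

```c
#include <stddef.h>
#include <time.h>

/*
 * Formats t as a UTC timestamp into buf. If the result does not fit,
 * strftime() returns 0 and leaves buf in an unspecified state, so buf
 * is reset to the empty string. Returns the number of bytes written
 * (excluding the NUL), or 0 on failure.
 */
size_t
format_timestamp(char *buf, size_t len, time_t t)
{
	struct tm *tm = gmtime(&t);
	size_t n = 0;

	if (len == 0)
		return 0;
	if (tm != NULL)
		n = strftime(buf, len, "%Y-%m-%dT%H:%M:%SZ", tm);
	if (n == 0)
		buf[0] = '\0'; /* never hand out undefined contents */

	return n;
}
```

A 21-byte buffer is enough for four-digit years; a deliberately small buffer shows the failure path, where the caller gets an empty string instead of garbage.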



Re: [hackers] [quark][PATCH] Remove superfluous byteorder conversion

2023-02-26 Thread Laslo Hunhold
On Tue, 19 Apr 2022 12:20:40 +0200
Thomas Oltmann  wrote:

> When comparing IPv4 addresses in sock_same_addr() we don't need
> to correct their byteorder just to see if they are equal or not.
> Byte swapping would only be needed if we needed to know
> which address had the greater value.
> ---
>  sock.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/sock.c b/sock.c
> index ecb73ef..f1385ca 100644
> --- a/sock.c
> +++ b/sock.c
> @@ -200,8 +200,8 @@ sock_same_addr(const struct sockaddr_storage *sa1, const struct sockaddr_storage
>                                ((struct sockaddr_in6 *)sa2)->sin6_addr.s6_addr,
>                                sizeof(((struct sockaddr_in6 *)sa1)->sin6_addr.s6_addr));
>          case AF_INET:
> -                return ntohl(((struct sockaddr_in *)sa1)->sin_addr.s_addr) ==
> -                       ntohl(((struct sockaddr_in *)sa2)->sin_addr.s_addr);
> +                return ((struct sockaddr_in *)sa1)->sin_addr.s_addr ==
> +                       ((struct sockaddr_in *)sa2)->sin_addr.s_addr;
>          default: /* AF_UNIX */
>                  return strcmp(((struct sockaddr_un *)sa1)->sun_path,
>                                ((struct sockaddr_un *)sa2)->sun_path) == 0;
> -- 
> 2.35.1

Thanks, applied!
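
The reasoning in the patch description can be checked with a small sketch. The two helper names below are made up for this example; they compare IPv4 addresses once in network byte order and once after ntohl(). Since ntohl() maps distinct values to distinct values, both comparisons always agree, and the conversion only matters for ordering.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Equality on the raw network-order value, as in the patched code. */
int
same_ipv4(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
	return a->sin_addr.s_addr == b->sin_addr.s_addr;
}

/* Equality after converting to host order, as in the old code. */
int
same_ipv4_swapped(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
	return ntohl(a->sin_addr.s_addr) == ntohl(b->sin_addr.s_addr);
}
```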



Re: [hackers] [quark][PATCH] Fix inverted conditional in sock_same_addr()

2023-02-26 Thread Laslo Hunhold
On Tue, 19 Apr 2022 12:04:57 +0200
Thomas Oltmann  wrote:

> sock_same_addr() is supposed to return 0 if sa1 and sa2 are different
> addresses. Since memcmp() returns 0 if its arguments are equal, we
> need to flip the return value by comparing it to 0.
> ---
>  sock.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sock.c b/sock.c
> index ecb73ef..e6e7754 100644
> --- a/sock.c
> +++ b/sock.c
> @@ -198,7 +198,7 @@ sock_same_addr(const struct sockaddr_storage *sa1, const struct sockaddr_storage
>          case AF_INET6:
>                  return memcmp(((struct sockaddr_in6 *)sa1)->sin6_addr.s6_addr,
>                                ((struct sockaddr_in6 *)sa2)->sin6_addr.s6_addr,
> -                              sizeof(((struct sockaddr_in6 *)sa1)->sin6_addr.s6_addr));
> +                              sizeof(((struct sockaddr_in6 *)sa1)->sin6_addr.s6_addr)) == 0;
>          case AF_INET:
>                  return ntohl(((struct sockaddr_in *)sa1)->sin_addr.s_addr) ==
>                         ntohl(((struct sockaddr_in *)sa2)->sin_addr.s_addr);
> -- 
> 2.35.1

Thank you, I have applied your patch! You really have eagle-eyes. :)
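
The inverted conditional is easy to reproduce in isolation. The helper below is a hypothetical stand-in for the AF_INET6 branch (the name same_ipv6() is an assumption): memcmp() returns 0 for equal buffers, so a "same address" predicate has to compare its result against 0.

```c
#include <netinet/in.h>
#include <string.h>

/* Returns 1 if both IPv6 addresses are identical, 0 otherwise. */
int
same_ipv6(const struct sockaddr_in6 *a, const struct sockaddr_in6 *b)
{
	/* memcmp() yields 0 on equality, hence the explicit "== 0" */
	return memcmp(a->sin6_addr.s6_addr, b->sin6_addr.s6_addr,
	              sizeof(a->sin6_addr.s6_addr)) == 0;
}
```

Without the "== 0", the function would report identical addresses as different and vice versa.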



Re: [hackers][quark] Quark don't print-out output after dettach

2022-12-31 Thread Laslo Hunhold
On Sat, 31 Dec 2022 06:35:43 -0500
fo...@dnmx.org wrote:

Dear fossy,

> Quark does not print out connection messages to the terminal or a log
> file once detached.
> Can someone help me?

could you please provide a minimal reproducible example?

With best regards

Laslo Hunhold



Re: [hackers] [slstatus] More LICENSE updates || drkhsh

2022-12-29 Thread Laslo Hunhold
On Wed, 21 Dec 2022 14:36:31 +
drkhsh  wrote:

Dear Aaron,

> So, before this discussion becomes untechnical, here are some facts
> from a quick research.
> 
> https://www.copyright.gov/title17/92chap4.html#408
> 
> "(2) in the case of a work other than an anonymous or pseudonymous
> work, the name and nationality or domicile of the author or authors,
> and, if one or more of the authors is dead, the dates of their deaths;
> 
> (3) if the work is anonymous or pseudonymous, the nationality or
> domicile of the author or authors;"
> 
> For me that means that publishing pseudonymous copyrighted work under
> any license should be fine as long as the license does not explicitly
> mention it?

you are citing US copyright law and it's a whole can of worms across
the world. I could go on citing EU, Russian, etc. (all signers of the
Berne convention) laws, which are lenient or dismissive in regard to
pseudonymous attributions, but always at least require unique
identification of a pseudonym, which can always be a matter of dispute.

If a license is invalid in a certain country's jurisdiction (e.g. if
pseudonyms are used even though not allowed), the code ends up being
"all rights reserved" and thus not satisfying the OSI license criteria.
Even if pseudonyms were allowed by the Berne convention (didn't check),
it would probably at least require unique identifiability of the
pseudonym, casting doubt on the overall license text. I know many
people and companies who are very careful about only building their
software using components with watertight licenses.

Regarding the pseudonym uniqueness, I think that "pseudonym" in the
law's sense is very thinly stretched in regard to arbitrary
web-nicknames, and the only reason, I think, it's included in some laws
is that when an author writes a book under pseudonym the book doesn't
immediately go into the public domain. We could discuss this forever,
but this ends up being territory where one would have to ask a judge if
a pseudonym is unique enough or not.

Having real names only in a license has other, more practical,
advantages, though: You actually have the chance to reach out to people
even years after the software release, and I've had one positive
experience with this a few years ago. There is simply no chance if you
just have a nickname with a throwaway-e-mail-address, e.g. "PBC
", to ever reach out to this person in most
cases after just a few years. Reaching out could be regarding a
relicensing (e.g. ISC/MIT -> GPL or the other way round), technical
inquiry or simply to invite them to something.

I'm amazed at how many people are scared of putting their names on
their works; in a sense it reduces the software's trustworthiness, even
if only by a subjective factor.

With best regards

Laslo



Re: [hackers] [slstatus] Update LICENSE || drkhsh

2022-12-18 Thread Laslo Hunhold
On Mon, 19 Dec 2022 02:44:40 +0100 (CET)
g...@suckless.org wrote:

Dear Aaron,

> commit 1ae616190cb3f88221571343a284fdf9f55b683f
> Author: drkhsh 
> AuthorDate: Mon Dec 19 02:40:00 2022 +0100
> Commit: drkhsh 
> CommitDate: Mon Dec 19 02:44:21 2022 +0100
> 
> Update LICENSE
> 
> diff --git a/LICENSE b/LICENSE
> index 70b9fb3..b7e3aa6 100644
> --- a/LICENSE
> +++ b/LICENSE
> @@ -27,6 +27,8 @@ Copyright 2020 Alexandre Ratchov 
>  Copyright 2020 Mart Lubbers 
>  Copyright 2020 Daniel Moch 
>  Copyright 2022 NRK 
> +Copyright 2022 Patrick Iacob 
> +Copyright 2021-2022 planet36 
>  
>  Permission to use, copy, modify, and/or distribute this software for
> any purpose with or without fee is hereby granted, provided that the
> above

planet36's real name is "Steven Ward" (as can be extracted from his
GitHub[0]) and his canonical E-Mail-address is plane...@gmail.com.

Adding pseudonyms to license files should be avoided, as the license
is formally and legally binding.

With best regards

Laslo

[0]:https://github.com/planet36/organize-roms/commit/24f10204297b74939a8676a864fa5c605e9f0306



Re: [hackers] [slstatus] config.mk: Fix PREFIX assignment || planet36

2022-12-18 Thread Laslo Hunhold
On Mon, 19 Dec 2022 02:44:40 +0100 (CET)
g...@suckless.org wrote:

Dear Aaron,

> commit c225c4315161a992b9e44dd990d083ee57f7f713
> Author: planet36 
> AuthorDate: Wed May 26 14:29:32 2021 -0400
> Commit: drkhsh 
> CommitDate: Mon Dec 19 02:44:21 2022 +0100
> 
> config.mk: Fix PREFIX assignment
> 
> Signed-off-by: drkhsh 
> 
> diff --git a/config.mk b/config.mk
> index ead1859..8f06800 100644
> --- a/config.mk
> +++ b/config.mk
> @@ -4,7 +4,7 @@ VERSION = 0
>  # customize below to fit your system
>  
>  # paths
> -PREFIX = /usr/local
> +PREFIX ?= /usr/local
>  MANPREFIX = $(PREFIX)/share/man
>  
>  X11INC = /usr/X11R6/include
> 

I would interject here that "?=" is not POSIX and assume that there was
a push by some packager. Based on my experience, I would recommend going
back to "=" and encouraging packagers to simply do

make PREFIX=...

which overrides any assignments in config.mk.

With best regards

Laslo



Re: [hackers] [libgrapheme] Do not falsely read entire buffer instead of simply the filled with || Laslo Hunhold

2022-11-24 Thread Laslo Hunhold
On Thu, 24 Nov 2022 20:32:53 +0600
NRK  wrote:

Dear NRK,

> Small nitpick: ASan (and the other sanitizers) are *dynamic*
> analyzers, as they happen during runtime.
> 
> Static analysis is analyzing without executing anything. Examples of
> static analyzers would be clang-tidy or cppcheck. Newer GCC versions
> also have a `-fanalyzer` flag for statically analyzing C code, but in
> my experience it's not mature yet - but the direction looks promising.

yes, thanks, you are totally right, of course. :)

With best regards

Laslo



Re: [hackers] [libgrapheme] Add a check make-target as an alias for test || Laslo Hunhold

2022-11-21 Thread Laslo Hunhold
On Mon, 21 Nov 2022 11:06:33 +
"Tom Schwindl"  wrote:

Dear Tom,

> This should probably be added to the PHONY target as a prerequisite.

thank you, I have added this in commit[0].

With best regards

Laslo

[0]:https://git.suckless.org/libgrapheme/commit/84bd5ee67bb9cbd317c8fa44ae4da768e2af922d.html



Re: [hackers] [lchat] Makefile: add dist target to create release tarballs || Jan Klemkow

2022-10-21 Thread Laslo Hunhold
On Thu, 20 Oct 2022 19:18:42 -0400
Steve Ward  wrote:

Dear Steve,

> If you want to stick with git, the mkdir, cp, tar, and rm commands
> could be replaced with:
> git archive --prefix lchat-$(VERSION)/ HEAD | gzip >
> lchat-$(VERSION).tar.gz

this would add an implicit dependency on git, though.

With best regards

Laslo



Re: [hackers] [tabbed] Makefile: simplify and remove hiding the build process || Hiltjo Posthuma

2022-10-13 Thread Laslo Hunhold
On Wed, 12 Oct 2022 23:02:14 +0200 (CEST)
g...@suckless.org wrote:

> -# Solaris
> -#CFLAGS = -fast ${INCS} -DVERSION=\"${VERSION}\"
> -#LDFLAGS = ${LIBS}

Noo, not Solaris!



Re: [hackers] [libgrapheme] Switch to semantic versioning and improve dynamic library handling || Laslo Hunhold

2022-10-08 Thread Laslo Hunhold
On Fri, 7 Oct 2022 23:32:08 +0600
NRK  wrote:

Dear NRK,

> Curious, what makes you change your mind about putting these back in
> config.mk instead of keeping them in the Makefile ? Since they aren't
> meant to be changed by the user.

you're totally right. I put them back at first because I would risk not
rebuilding when I changed them in the Makefile (which is actually
easily fixable by just adding a dependency on the Makefile in addition
to config.mk, which I did just now[0]).

Another aspect was that the VERSION_* variables are used in some of the
variables, but I admit that it makes more sense to simply use them and
optionally add a comment at the top. To be honest, though, they are
pretty much self-explanatory.

With best regards

Laslo

[0]:https://git.suckless.org/libgrapheme/commit/d42f53b5baafe01caa48477e204b63e065660117.html



Re: [hackers] [libgrapheme] Convert GRAPHEME_STATE to uint_least16_t and remove it || Laslo Hunhold

2022-10-04 Thread Laslo Hunhold
On Tue, 4 Oct 2022 05:07:13 +0600
NRK  wrote:

Dear NRK,

> Another possibility is wrapping the integer inside a struct:
> 
>   typedef struct { unsigned internal_state; } GRAPHEME_STATE;
> 
> the benefit of this is that the type GRAPHEME_STATE clearly states the
> purpose, whereas a `uint_least16_t` doesn't.
> 
> Wrapping an enum into a struct is also a common trick to get stronger
> type-checking from the compiler; I don't think it matters in this case
> though, since the state is always passed via pointer.
> 
> >  and I want all of the semantics to be crystal clear to the
> > end-user.  
> 
> Other way of looking at it is that the state is an internal thing so
> the user shouldn't be concerned about what's going on behind the
> scene.

yeah, you bring up good points that I also thought of. What one should
not forget is that those shenanigans also complicate the use of FFIs.

I really originally thought that the state type would be used in more
than one place, but that's not the case. Enough meaning is given to it
by the name of the variable, so it's cool.
 
> The `(uint_least16_t)1` casts don't really do much since `int` is
> guaranteed to be at least 16 bits anyway. But if you want to be explicit, you
> can still use `UINT16_C(1)`, which is shorter thus less noisy,
> instead of casting:
> 
> - out->prop_set = in & (((uint_least16_t)(1)) <<  8);
> + out->prop_set = in & (UINT16_C(1) << 8);

ah yeah, I always seem to forget about this macro, even though I use it
so often in the code. Fixed[0] now.

> I'd also return by value in these 2 functions. Expressions like these
> are more clear compared to out pointers:
> 
>   *s = state_deserialize();
>   state = state_serialize(*s);

I prefer to always pass structs by reference. Call me old-fashioned in
this regard.

Thank you for reviewing the changes, though. I really appreciate it! :)

With best regards

Laslo

[0]:https://git.suckless.org/libgrapheme/commit/0aa5d262f8d0975341bcc60916e12044c7d64d0d.html
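
The points discussed above (a plain uint_least16_t as the serialized state, UINT16_C() instead of casts, and structs passed by reference) can be put together in a short sketch. The struct layout and the names sketch_state, state_pack() and state_unpack() are assumptions for illustration only and do not mirror libgrapheme's internals.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical internal state that is packed into 16 bits. */
struct sketch_state {
	uint_least8_t prop;     /* stored in bits 0-7 */
	bool          prop_set; /* stored in bit 8 */
};

/* Packs *in into *out; both are passed by reference. */
void
state_pack(const struct sketch_state *in, uint_least16_t *out)
{
	*out = (uint_least16_t)(in->prop & UINT8_C(0xff));
	if (in->prop_set)
		*out = (uint_least16_t)(*out | (UINT16_C(1) << 8));
}

/* Unpacks the 16-bit value in into *out. */
void
state_unpack(uint_least16_t in, struct sketch_state *out)
{
	out->prop = (uint_least8_t)(in & UINT16_C(0xff));
	out->prop_set = (in & (UINT16_C(1) << 8)) != 0;
}
```

A caller (or an FFI binding) only ever sees the plain integer, which is the practical advantage mentioned above over a wrapper struct.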



Re: [hackers] [libgrapheme][PATCH] fix manpage

2022-10-02 Thread Laslo Hunhold
On Sun, 2 Oct 2022 09:29:18 +0600
NRK  wrote:

Dear NRK,

> - to_case: there's no `len` parameter. it should be `srclen` and
> `dstlen`.
> - is_case: `caselen` should be a pointer.
> ---
> 
> P.S: one more thing that caught my eye; the "next" manpages for the
> codepoint versions states:
> 
>   If len is set to SIZE_MAX the string str is interpreted to be
>   NUL-terminated and processing stops when a NUL-byte is
>   encountered.
> 
> is this correct? what if the integer contains a nul-byte?
> 
> it seems to be that it should be an integer (uint_least32_t) with the
> value 0, not a nul-byte, which are different things.

thanks for reporting these problems! I actually had on my TODO to take
a look at the manuals, given there were also some other problems.

I now took the time to fix them all[0], including your suggestions.
Thank you very much!

The wording regarding the "NUL-byte" was a bit unfortunate for the
codepoint-based functions. I fixed it up accordingly.

With best regards

Laslo

[0]:https://git.suckless.org/libgrapheme/commit/995e37182dc53da55dc4cf34868513610215c79e.html
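
To illustrate the distinction NRK raised: for the codepoint-based functions the terminator is an array element with the value 0, not any zero byte inside an element. The counting helper below is a made-up example (codepoint_count() is not a libgrapheme function) that follows the SIZE_MAX convention quoted from the manual.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Counts the codepoints in str. If len is SIZE_MAX, str is treated as
 * terminated by an element with the value 0; otherwise exactly len
 * elements are counted.
 */
size_t
codepoint_count(const uint_least32_t *str, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		/* compare the whole element, not its individual bytes */
		if (len == SIZE_MAX && str[i] == 0)
			break;
	}

	return i;
}
```

A codepoint such as U+0100 contains zero bytes in its in-memory representation, yet it does not end the string; only a full element equal to 0 does.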



Re: [hackers] [libgrapheme] Update to Unicode 15.0.0 || Laslo Hunhold

2022-09-15 Thread Laslo Hunhold
On Thu, 15 Sep 2022 09:44:11 +0200
Hiltjo Posthuma  wrote:

Dear Hiltjo,

> Finally support for the duck emoji!
> 
> https://blog.unicode.org/2022/09/announcing-unicode-standard-version-150.html
> https://www.unicode.org/announcements/u15-emoji-annc-large.png

yes, and not to forget that we can finally express an old old meme in
Unicode: 🫸🫷.

With best regards

Laslo



Re: [hackers] [libgrapheme] Add manuals for the grapheme_to_*case_utf8-functions || Laslo Hunhold

2022-08-29 Thread Laslo Hunhold
On Sun, 28 Aug 2022 20:00:42 +0200
Quentin Rameau  wrote:

Dear Quentin,

> But of course, that's why the construction ${variable} exists,
> for this very common case.
> It's clear that UNITs isn't a variable,
> so you need to separate the variable from the string.
> You don't do that with a subshell and printf,
> you do that with just ${UNIT}s.

thanks for your explanation and pointing this out! I totally forgot
about this and have now pushed a change to use the proper parameter
expansion[0]. For those interested, here's the excerpt from the
POSIX-standard[1].

Also thanks to you, Thomas Oltmann, for pointing this out as well.

With best regards

Laslo

[0]:https://git.suckless.org/libgrapheme/commit/6e6c538e4efb4d191a2f0391466556eb758d76bd.html
[1]:https://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_02



Re: [hackers] [libgrapheme] Add manuals for the grapheme_to_*case_utf8-functions || Laslo Hunhold

2022-08-28 Thread Laslo Hunhold
On Sun, 28 Aug 2022 18:33:21 +0200
Quentin Rameau  wrote:

> > +function returns the number of $(printf $UNIT)s in the array
> > resulting  
>   ^-- But… Why?!

Doesn't work otherwise in the heredoc. If I write $UNITs it doesn't
interpret it correctly, so I reformulate it as a subshell-expression.
It's not elegant, but works. Is there a better way to address this?



Re: [hackers] [dwm][PATCH RESEND 0/2] Const-correctness fixes

2022-08-22 Thread Laslo Hunhold
On Mon, 22 Aug 2022 11:15:19 +0100
Chris Down  wrote:

Dear Chris,

> Hmm? For example, for the FC/Xft types, fontname is declared as const
> by xfont_create, but then we cast away its constness when passing it
> to FCNameParse. The same goes for text, which we claim is const in
> the drw_font_getexts signature, but then we remove its constness.
> 
> In general the existing code seems confused, no? Either we shouldn't
> pass them in as const in the first place, or we should maintain the
> constness that we declare in the function parameters.
> 
> There shouldn't be any logical change here, but it seems weird to say
> things are not mutable up front and then waver about it later. Right
> now there's no UB, but making sure we don't cast away the const
> mitigates the risk altogether.

I agree here. Not only should const be used to at least have a partial
"contract" for the function parameters (C doesn't offer a lot in this
regard and it's an easy way to prevent problems), it also allows the
compiler to optimize the code better.

With best regards

Laslo
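
As a small illustration of keeping the "contract" intact, here is a hypothetical helper (text_len() is not a dwm function) that accepts a const pointer and keeps it const all the way through, so no cast is ever needed and the compiler rejects accidental writes.

```c
#include <stddef.h>

/* Returns the length of text without ever dropping its constness. */
size_t
text_len(const char *text)
{
	const char *p = text; /* the cursor stays const as well */

	while (*p != '\0')
		p++;

	return (size_t)(p - text);
}
```

If a callee really has to modify its argument, the honest fix is to drop the const from the parameter rather than to cast it away inside the function.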



Re: [hackers] [libgrapheme] [PATCH] Remove dead file `src/util.c'

2022-08-16 Thread Laslo Hunhold
On Wed, 10 Aug 2022 15:03:06 +
Tom Schwindl  wrote:

Dear Tom,

> Since commit 072bb271868a3583da1f ("Introduce mostly branchless
> character break detection") removed the code from the file, it no
> longer serves a purpose.

while it was true, it now is used again. :) It often makes more sense
to keep util-stubs that are guaranteed to be used later rather than
ripping them out of the build-system, only to have to include them
again later.

With best regards

Laslo



Re: [hackers] [sbase] [PATCH] Use ar(1)'s s-flag instead of invoking ranlib(1)

2022-08-16 Thread Laslo Hunhold
On Mon, 1 Aug 2022 11:09:16 +0200
"Roberto E. Vargas Caballero"  wrote:

Dear Roberto,

> Because then you will support only the last systems. If you keep
> the ranlib you will support systems that support all versions of
> the standard. Again, if you find a system without ranlib then
> we can talk and consider what to do, but removing only for the sake
> of "the standard does not include anymore ranlib" is a horrible idea.
> For example, scc requires the use of ranlib, if you remove it then
> I will not be able to continue testing scc with suckless software.
> What happens if I want to compile sbase in an old SunOs workstation?

I thought about it a bit more in the last few weeks and added ranlib
again.

The main reason is that I find it convincing that POSIX would not try
to define varying binary formats, which is why the toolchain-tool
ranlib(1) was probably never included.

Adding the s-flag to ar is simply an unexpected and ill-fitting
feature-creep that bloats up an otherwise simple archive-tool.

Thanks for this very interesting discussion and sharing your
experience!

With best regards

Laslo



Re: [hackers][quark][patch] pre-compression

2022-08-16 Thread Laslo Hunhold
On Tue, 16 Aug 2022 11:19:55 -0400
fo...@dnmx.org wrote:

Dear fossy,

> Ah.. so very complicated, huh? Oh, well..

it's not low-hanging fruit by any means, yes.

> Hey, so.. I have a question.. how come that Suckless' web-site isn't
> hosted using Quark? If I remember correctly - it's Nginx?
> Like - not even OpenBSD's httpd??

The site is currently served using nginx. As far as I can tell, we just
haven't gotten around to switching over to OpenBSD's httpd.

To be completely frank, quark is a tool for a very limited scope, and
I've been battling with certain aspects for quite a while. It's good
for quickly hosting something from the command line, but lacks in other
aspects.

On OpenBSD, I would even recommend its httpd over quark in most cases.

> I find Quark fine enough.. just couldn't manage to log messages with a
> command.. probably could add a few lines for logging.. but I just..
> I'm too lazy for that :/

It's as simple as

quark > log

and a daily cron for log-rotation that amounts to

mv log log.2022-08-16

or something (fired daily at midnight, the date of course generated on
the fly for the filename).

With best regards

Laslo



Re: [hackers][quark][patch] pre-compression

2022-08-16 Thread Laslo Hunhold
On Mon, 8 Aug 2022 07:57:51 -0400
fo...@dnmx.org wrote:

Dear fossy,

> I'll try to do it by myself, but I don't promise anything.. It seems
> like the resp files were moved to esnprintf or something.
> 
> Sorry for no original e-mail text, DNMX is broken :d

given quark's new structure, compression is not easily possible anymore,
unless you add a complicated stream-compression on top within the
individual connection-structs.

Thanks anyway for your offer!

With best regards

Laslo



Re: [hackers] [libgrapheme] Use (size_t)(-1) instead of SIZE_MAX and fix style || Laslo Hunhold

2022-08-16 Thread Laslo Hunhold
On Sun, 31 Jul 2022 12:18:22 +0200
Mattias Andrée  wrote:

Dear Mattias,

> Why wouldn't SIZE_MAX be the maximum of size_t?

you're totally right and I changed it. Thanks!

With best regards

Laslo
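
Mattias' point can be verified directly: size_t is an unsigned type, so converting -1 to it wraps around to its maximum value, which is exactly what SIZE_MAX is defined to be. The function name below is made up for this check.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if SIZE_MAX and (size_t)(-1) denote the same value. */
int
size_max_matches_cast(void)
{
	return SIZE_MAX == (size_t)(-1);
}
```

Both spellings are therefore interchangeable; SIZE_MAX merely states the intent more clearly.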



Re: [hackers] [libgrapheme] Rename reallocarray() to reallocate_array() to prevent mangling || Laslo Hunhold

2022-08-16 Thread Laslo Hunhold
On Mon, 1 Aug 2022 21:53:45 +0600
NRK  wrote:

Dear NRK,

> Given that this no longer shadows a libc/conventional function, I'd go
> one step further and move the `fprintf + exit` check inside
> reallocate_array() so the calling code doesn't need to worry about
> null returns.

thanks for your remark, but I prefer it this way.

With best regards

Laslo



Re: [hackers] [libgrapheme][PATCH] Add reallocarray implementation

2022-07-31 Thread Laslo Hunhold
On Sat, 30 Jul 2022 14:29:05 -0700
robert  wrote:

Dear Robert,

> reallocarray is nonstandard and glibc declares it only when
> _GNU_SOURCE is defined. Without this patch or _GNU_SOURCE defined, I
> get a seg fault from reallocarray being implicitly declared with the
> wrong signature.

thanks for your patch! I applied it with a few modifications. As a
matter of fact, glibc exports reallocarray() with _DEFAULT_SOURCE since
version 2.29 (from January 2019), however, you are still totally
correct that using this function reduces portability.

With best regards

Laslo
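
For context, a portable replacement along these lines can be sketched in a few lines. This is not the code that was applied; the name reallocate_array_sketch() is an assumption, and the sketch only shows the overflow check that distinguishes such a function from a bare realloc(p, len * size).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Resizes p to hold len elements of size bytes each. Returns NULL and
 * sets errno to ENOMEM if len * size would overflow; the original
 * allocation is left untouched in that case.
 */
void *
reallocate_array_sketch(void *p, size_t len, size_t size)
{
	if (len > 0 && size > SIZE_MAX / len) {
		errno = ENOMEM;
		return NULL;
	}

	return realloc(p, len * size);
}
```

Declaring the helper in the project itself also avoids the implicit-declaration problem described in the report, because the prototype no longer depends on feature-test macros.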



Re: [hackers] [sbase] [PATCH] Use ar(1)'s s-flag instead of invoking ranlib(1)

2022-07-30 Thread Laslo Hunhold
On Fri, 22 Jul 2022 17:28:38 +0200
"Roberto E. Vargas Caballero"  wrote:

Dear Roberto,

> I disagree with this change. I think it adds nothing and reduce
> portability of the Makefiles.

why would it reduce the portability of the Makefiles? It can be
expected that all ar-implementations support the s-flag, and ranlib is
simply legacy.

With best regards

Laslo



Re: [hackers] [dwm][PATCH] spawn: reduce 2 lines, change fprintf() + perror() + exit() to die("... :")

2022-07-30 Thread Laslo Hunhold
On Fri, 29 Jul 2022 18:26:04 -0500
explosion0men...@gmail.com wrote:

Dear explosion0mental,

> when calling die and the last character of the string corresponds to
> ':', die() will call perror(). See util.c
> 
> Cuz muh lines of code!1

> - fprintf(stderr, "dwm: execvp %s", ((char **)arg->v)[0]);
> - perror(" failed");
> - exit(EXIT_SUCCESS);
> + die("dwm: execvp '%s' failed:", ((char

as far as I can tell this is not correct, given the existing code exits
with EXIT_SUCCESS, whereas die() exits with a failure status.

With best regards

Laslo
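
The trailing-colon convention mentioned in the patch description can be sketched without actually terminating the process. The function below is a hypothetical, testable variant (format_failure() does not exist in dwm) that writes the message into a buffer instead of printing it: when the format string ends in ':', a space and the text for the given errno value are appended.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Formats a die()-style message into buf. If fmt ends with ':', the
 * string for err (as returned by strerror()) is appended after a
 * space. Returns the message length, or -1 if it did not fit.
 */
int
format_failure(char *buf, size_t len, int err, const char *fmt, ...)
{
	va_list ap;
	int n, m;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= len)
		return -1;

	if (fmt[0] != '\0' && fmt[strlen(fmt) - 1] == ':') {
		m = snprintf(buf + n, len - (size_t)n, " %s", strerror(err));
		if (m < 0 || (size_t)m >= len - (size_t)n)
			return -1;
		n += m;
	}

	return n;
}
```

The message itself would thus stay equivalent after the proposed change; what differs is the exit status, which is the objection raised above.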



Re: [hackers] [quark][PATCH] Fix strftime error handling

2022-07-10 Thread Laslo Hunhold
On Fri,  8 Jul 2022 11:12:17 -0700
robert  wrote:

Dear Robert,

> Unlike snprintf, strftime buffer contents are undefined when it fails,
> so make sure the buffer is null-terminated. To prevent garbage from
> being printed out, we simply set the timestamp to the empty string,
> but maybe setting it to "unknown time" or something similar would be
> better. Either way, I don't think this can fail until year 1, so
> it's not a big deal.

nice catch, thanks! I'll merge it with the next window.

With best regards

Laslo



Re: [hackers] [PATCH][libgrapheme] macro-hygiene: wrap arguments in parenthesis

2022-06-29 Thread Laslo Hunhold
On Wed, 29 Jun 2022 09:07:49 +0600
NRK  wrote:

Dear NRK,

> reported by clang-tidy.

thank you very much! I pushed it!

Also, even though I appreciate you checking the code, there are
admittedly multiple ugly spots that need refactoring and also known
bugs (especially for the _utf8-functions) that need fixing or rather
refactoring.
The library is currently in a phase of "expansion" to check technical
feasibility. Shared concepts will be integrated into common concepts to
ultimately simplify the code.

Stay tuned for more! :)

With best regards

Laslo
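
The kind of problem this class of warning is about can be shown with two toy macros. Neither macro is taken from libgrapheme; they only demonstrate why macro arguments are wrapped in parentheses.

```c
/* Without parentheses the argument is pasted into the expression raw. */
#define DOUBLE_UNSAFE(x) (x * 2)
/* With parentheses the argument is evaluated as one unit. */
#define DOUBLE_SAFE(x)   ((x) * 2)

int
unsafe_result(void)
{
	return DOUBLE_UNSAFE(1 + 2); /* expands to (1 + 2 * 2) */
}

int
safe_result(void)
{
	return DOUBLE_SAFE(1 + 2); /* expands to ((1 + 2) * 2) */
}
```

Here unsafe_result() returns 5 and safe_result() returns 6, because operator precedence applies to the expanded text rather than to the argument as written.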



Re: [hackers] [sent] [PATCH 1/3] sent.c: Drop unnecessary NULL checks

2022-06-26 Thread Laslo Hunhold
On Sun, 26 Jun 2022 21:53:49 +0300
Greg Minshall  wrote:

Dear Greg,

> for what it's worth, i'd probably code with the checks, so as to avoid
> future code editors (including myself) doing a double-take, thinking,
> "hmm, did the author consider that case?".  (though, of course, you
> did that same -- if opposite -- double-take when you saw that code.)

come on, it is general knowledge that free() accepts NULL arguments.
The extra checks just add more cruft.

With best regards

Laslo
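
A short sketch of the idiom under discussion: the C standard defines free(NULL) as a no-op, so cleanup code can call free() on every member unconditionally. The struct and function names are invented for this example and are not taken from sent.c.

```c
#include <stdlib.h>

/* Hypothetical slide with two optional heap-allocated members. */
struct slide {
	char *text;
	char *img;
};

/* Releases all members; unset (NULL) members need no special case. */
void
slide_cleanup(struct slide *s)
{
	free(s->text); /* no-op if text is NULL */
	free(s->img);  /* no-op if img is NULL */
	s->text = NULL;
	s->img = NULL;
}
```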



Re: [hackers] [libgrapheme] Explicitly use object-files in library-generation || Laslo Hunhold

2022-06-24 Thread Laslo Hunhold
On Fri, 24 Jun 2022 11:51:51 +0200
Quentin Rameau  wrote:

Dear Quentin,

> >  libgrapheme.a: $(SRC:=.o)
> > -   $(AR) rc $@ $?
> > +   $(AR) rc $@ $(SRC:=.o)
> > $(RANLIB) $@  
> 
> This works as intended with $?, because then you only update objects
> that are out of date, not *all* objects unconditionally (just note
> that you might want the -u flag too).

today I learned, thank you! :)

I pushed the change, but kept out the -u flag, as it's a bit redundant
and might lead to unexpected results when you override something in
make. Please let me know if I'm missing something there.

With best regards

Laslo



Re: [hackers] [libgrapheme] Implement line-segmentation || Laslo Hunhold

2022-06-17 Thread Laslo Hunhold
On Fri, 17 Jun 2022 13:47:32 -0400
fo...@dnmx.org wrote:

> Is there no better way of sending that long message?
> You ended up in my spam folder, and this is a rarity in itself.
> Just thought I'd mention that.
> Have a nice day.

Such big commits are a rarity and I see no reason to adapt the
git-mail-daemon for the very few cases big "data"-files like this one
are pushed. If anyone knows a way to tell git not to "diff" a file like
this, please let me know.
I know you can tell git to treat files as binary, but I honestly don't
want that long-term, given the diffs to the data-files are interesting
by themselves with new Unicode-versions.

With best regards

Laslo



Re: [hackers] [libgrapheme] Add Word-data-files || Laslo Hunhold

2022-06-13 Thread Laslo Hunhold
On Mon, 13 Jun 2022 06:11:00 +0600
NRK  wrote:

Dear NRK,

> IMO they add unnecessary noise to the repo and commit diff. If this
> was the primary reason, then simply including them in the tarball
> would've sufficed.
> 
> However since they already got committed, don't think it's worth
> reverting now.

your point is definitely valid and it also worried me, but the
self-containedness weighs heavier in my opinion. I think it's a
worrying trend that more and more software requires an internet
connection to satisfy internal dependencies at compile-time. At least
you can get around the need for external dependencies by having a
package mirror or something.

The diversity of systems out there is almost unimaginable. I always
imagine some remote village in Namibia which only has one single shared
satellite uplink. With xz-compression (level 2e), the
libgrapheme-tarball, which includes all data-files, is only around 180K
(including some extra files not published yet).

Uncompressed, the Unicode data files are 2.5MB (!) and would require 10
separate connections to download, which makes the case pretty clear to
me.

With best regards

Laslo



Re: [hackers] [libgrapheme] Implement word-segmentation || Laslo Hunhold

2022-06-09 Thread Laslo Hunhold
On Wed, 8 Jun 2022 17:08:57 +0600
NRK  wrote:

Dear NRK,

> On Mon, Jun 06, 2022 at 10:40:33PM +0200, g...@suckless.org wrote:
> > +   /* with no breaks we break at the end */
> > +   if (off == len) {
> > +   return len;
> > +   } else {
> > +   return off;
> > +   }  
> 
> This is just the same as `return off;` , is it not?

yes, indeed, thank you! I've fixed it now[0].

With best regards

Laslo

[0]:https://git.suckless.org/libgrapheme/commit/5910bc61b6f065cab26682993a76904c37a0f86b.html



Re: [hackers] [dwm|dmenu|st][PATCH] strip the installed binary

2022-05-02 Thread Laslo Hunhold
On Mon, 2 May 2022 13:37:26 +0200
Hiltjo Posthuma  wrote:

Dear Hiltjo,

> I don't like this.
> 
> I'd rather have it so the Makefile respects the system or package
> system CFLAGS and LDFLAGS by default.  Then someone can do: make
> CFLAGS="-Os" LDFLAGS="-s" etc.
> 
> LDFLAGS="-s" is practically the same as calling strip and stripping
> it.
> 
> It is up to the distro package/ports maintainer to strip symbols (or
> not). This can be an additional packaging step.

I would've suggested the same.

> As an off-topic side-note I think removing config.mk and just
> having the Makefile is simpler too.

As a counterpoint, the config.mk makes it very clear which environment
variables are used in the Makefile. Some packagers like to apply
patches directly to the config.mk (which is an anti-pattern, but done
regardless).

Given the config.mk is usually more stable than the Makefile itself and
a clear indication of what you can tinker with, I'd keep it. It's much
more intuitive to look into config.mk when something doesn't work than
getting the idea to directly look into the Makefile.

With best regards

Laslo



Re: [hackers] [dmenu] inputw: improve correctness and startup performance || NRK

2022-04-29 Thread Laslo Hunhold
On Fri, 29 Apr 2022 20:39:51 +0200
Jochen Sprickerhof  wrote:

Dear Jochen,

> There is actually a dmenu fork here:
> 
> https://github.com/michaelforney/dmenu
> 
> The diff does not look too big and afair it was working for me some
> time ago. I think it would be great to provide an implementation for
> Wayland.

Michael Forney uses his wld-library[0] for all the ugly details and I'm
very impressed by what he made, but it only works with Intel- and
Nvidia-cards given it includes explicit hardware-specific bindings.

I wonder how affected Wayland-EGL-whatnot-code is by code-rot, though,
and how easy it is to integrate in a Makefile without too much
build-magic. I also wonder why you need to have explicit
hardware-handling, but maybe Michael was trying not to depend on Mesa
or something.

With best regards

Laslo

[0]:https://github.com/michaelforney/wld



Re: [hackers] [dmenu] inputw: improve correctness and startup performance || NRK

2022-04-29 Thread Laslo Hunhold
On Fri, 29 Apr 2022 17:12:15 +0200
Jochen Sprickerhof  wrote:

Dear Jochen,

> That sounds like the non_blocking_stdin patch:
> 
> http://tools.suckless.org/dmenu/patches/non_blocking_stdin/

oh yes, thanks for pointing that patch out! The
"reloading-hotkey"-behaviour is a bit overkill, but I find the
select-loop listening on the xfd and stdin to be pretty elegant.
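
The select-loop idea can be sketched roughly as follows. This is not the
code from the non_blocking_stdin patch; the function name
wait_for_input() and the READY_* flags are made up for illustration, and
xfd is assumed to be the file descriptor of the X connection (e.g. as
obtained via ConnectionNumber()).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

enum { READY_X = 1, READY_STDIN = 2 };

/*
 * Block until xfd or infd becomes readable and return a bitmask of
 * READY_X and READY_STDIN, or -1 on error. Pass infd < 0 once stdin
 * has reached EOF, so that only the X connection is watched.
 */
int
wait_for_input(int xfd, int infd)
{
	fd_set rfds;
	int maxfd = xfd, ready = 0;

	assert(xfd >= 0);

	FD_ZERO(&rfds);
	FD_SET(xfd, &rfds);
	if (infd >= 0) {
		FD_SET(infd, &rfds);
		if (infd > maxfd)
			maxfd = infd;
	}

	/* no timeout: sleep until at least one descriptor is readable */
	if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
		return -1;

	if (FD_ISSET(xfd, &rfds))
		ready |= READY_X;
	if (infd >= 0 && FD_ISSET(infd, &rfds))
		ready |= READY_STDIN;

	return ready;
}
```

A caller would loop over this, handling X events when READY_X is set and
reading more input lines when READY_STDIN is set.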

With best regards

Laslo



Re: [hackers] [dmenu] inputw: improve correctness and startup performance || NRK

2022-04-29 Thread Laslo Hunhold
On Fri, 29 Apr 2022 22:55:31 +0600
NRK  wrote:

Dear NRK,

> While you've asked this to Hiltjo, I figured I'd give my 2c on this
> since I've been trolling around the dmenu code base a bit recently.
> 
> Most of the heavy-lifting is currently done via libsl, however libsl
> is a pretty thin abstraction over X and exposes a lot of the X (and
> Xft) specific details in the API. Just taking a look at `drw.h` should
> confirm this.
> 
> So in order to support wayland, the entire API will need to be reworked
> to hide away all low level details so that it can be used for both X
> and Wayland.
> 
> But that's not all, dmenu itself makes a good amount of calls to Xlib
> functions. So all those will need to be abstracted away as well.
> 
> At the end, I suspect it'll be much simpler to just have a separate
> branch or even just a rewrite from scratch rather than trying to cram
> support for both wayland and X in the existing codebase.

thanks for your overview!

One big issue I see is that Wayland offers no way of placing a window
"at the top". If there are ways to "request" this, these are proprietary
extensions to the protocol.

I'm a bit torn with regard to Wayland: On the one hand it is being
adopted more and more (sway, etc.), but on the other hand, I find it to
be a very ill-designed protocol that leads to a lot of fragmentation,
extension madness and drops a lot of useful stuff X offers. The chance
to design something truly wonderful was wasted on this piece of crap.

Maybe on the third try in 2053...

With best regards

Laslo



Re: [hackers] [dmenu] inputw: improve correctness and startup performance || NRK

2022-04-29 Thread Laslo Hunhold
On Fri, 29 Apr 2022 10:31:14 +0200
Hiltjo Posthuma  wrote:

Dear Hiltjo,

> Reading through the long wall of text (*sigh*). I'll try to respond
> to the relevant parts of the actual topic.
> 
> There won't be grapheme support into dmenu or dwm (until decided
> otherwise for whatever reason), it is too complex.

libgrapheme (currently) doesn't even offer a solution for what is
discussed. It can only count graphemes, not give any information on how
large a "rendered" grapheme is. The Unicode consortium points at font
rendering/shaping engines, so the way it's done right now in dmenu is
correct, and I didn't advocate using libgrapheme.

> There won't be a progress indicator, dmenu should just be fast in the
> common cases and start up instantly(tm). Instantly is of course a
> very scientific measurement for "it feels good/fast man".

Yes, a progress indicator would be nonsense in the general case of
course. In the general case, though, dmenu won't block when reading
from stdin and I proposed a "busy" indication only in the case a read
from stdin blocks. What I proposed was that dmenu does it all
asynchronously and actually creates a window and allows keyboard input
even when stdin hasn't even been fully consumed yet.
This would ensure minimal latency on startup and until you can enter a
query, but would also usually not show a progress indicator, as it
would usually not block when reading from stdin (which it would do at
least once before running the string matching algorithm).

Technically we'd just shift latency from startup to after the window is
opened, but the user can already take the few milliseconds to enter a
query until it is run on the, now consumed, data.

For the rare case the read from stdin blocks, a busy indication could
be shown, but as I said, it'll not be done in the usual case.

> Performance improvements in drawing and searching for dmenu are fine
> as long as they are simple and fix a real practical issue. (Relative)
> simplicity is still one of the most important goals.

Yeah, I totally agree. I must admit that I'm not too familiar with
dmenu's code base, but what I propose would more or less boil down to
some reordering. With good data structures the "growing" selection-data
would be relatively simple to implement.

> It is also good to keep in mind by now quite some people use dmenu
> and other suckless tools (suckless tarballs are mainstream media!),
> so being a bit conservative now in dmenu is fine in my opinion.

Totally understandable. It would be cool though to be able to just
ignore the f-flag when we manage to find a way to handle both cases of
input well.

This conservatism reminds me of Kelvin versioning[0]. :)

What's your stance on Wayland-support in dmenu? Would you accept a
patch?

With best regards

Laslo

[0]:https://jtobin.io/kelvin-versioning



Re: [hackers] [dmenu] inputw: improve correctness and startup performance || NRK

2022-04-29 Thread Laslo Hunhold
On Fri, 29 Apr 2022 08:53:38 +0600
NRK  wrote:

Dear NRK,

> 2. (Incorrectly) assume `more bytes == wider string`, which is not
> correct thanks to unicode.
> 
> 3. Try to get the width of the unicode code-point. I've attached a
> quick and dirty patch using `utf8proc_charwidth()` from libutf8-proc.
> The patch was just to confirm my hypothesis and not to be taken
> seriously.
> 
> I'm not too well versed on unicode so I cannot tell how difficult
> rolling such a function ourself would be. But some quick searches
> seems to indicate it's not going to be trivial at all, specially if
> we take "grapheme clusters" into account.
> 
> So this option is probably getting ruled out.

to keep a long story based on my experience with developing libgrapheme
and intensely working with Unicode short: The char-width-data from the
Unicode consortium cannot be relied on and said consortium has pretty
much given up on this matter and delegated this to font-rendering and
font-shaping implementations. While they still maintain the EAW-tables
(among others), they are nothing more than heuristics that break
horribly in many cases.

> My suggestion here is to just have a consistent input bar width which
> can be configured via config.h and cli arg. So for example:
> 
>   static float input_bar_percent = 0.24f;
> 
> This would make the input bar width always 24% of the monitor width.
> I've attached a patch for this as well. It's simpler and gives a more
> static/predictable ui.

This is definitely the simplest solution in the context of the
following observation that might be a general aspect that could be
looked at:

If N is the number of "choices" passed to dmenu, getting the extent of
each choice pretty much makes the setup O(N). The Landau-constant is
pretty large in this case, especially for a lot of missing glyphs, as
you observed correctly, leading to a noticeable performance loss/delay.

Maybe the general goal should be to make dmenu O(1) in terms of passed
choices, at least until you actually enter any text (and run a search,
which is roughly O(N) for relatively small needle- and
haystack-strings, given each string-matching is of complexity O(n*h) ~
O(1), where n is the needle-length and h is the haystack-length).
In this context one could think of building a suffix-tree/-array for
faster searching, but that's probably overkill.

One solution that comes to mind is that the width is only calculated
"on the fly" using the current matches, so you always are O(1) in terms
of _all_ inputs.

One could also reflect on the necessity of the f-flag: Wouldn't it be
more reasonable to start up quickly and also allow the entering of text
even while dmenu is "waiting" for stdin? It could display "waiting
for stdin" or something similar instead of blocking and being
unresponsive.
Another way would be to allow searches on the partially-read input of
stdin, but this would be only half-honest. However, say you have a slow
network share and pass, among other things, an ls-output of a folder
in that share, you wouldn't have to wait for it to "load up". So when
there is a good way to indicate "busyness", a simple linear array of
all inputs could be built "event"-based (using select() or poll()) and
expanded dynamically.

Anyway, I don't have the time right now to cook up a patch, sorry, but
maybe it inspires someone to work on it. Project ideas for all skill
levels:

  1) come up with a good way to indicate "busyness", i.e. waiting for
     stdin. Given this is rare, it should at best be text, maybe
     displayed right next to the input prompt in a different colour.
  2) implement it, i.e. start up quickly, create the window, lock the
     keyboard before reading stdin and then have a select() or poll()
     on stdin reading in data, optionally indicating busyness and
     re-running searches when the string expands (but only on the added
     items of course).
  3) calculate the width of the results on-the-fly only and use that for
     window dimensions.
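
The "growing" item list and the "search only the added items" part of
the ideas above could look roughly like this. It is only a sketch and not
taken from dmenu: struct items, items_add() and items_match_new() are
hypothetical names, and plain strstr() stands in for whatever matching
dmenu would actually use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct items {
	char **text;     /* all lines read so far */
	size_t len, cap; /* used and allocated slots */
	size_t matched;  /* items [0, matched) were already searched */
};

/* append a copy of line; returns 0 on success, -1 on allocation failure */
int
items_add(struct items *it, const char *line)
{
	char **tmp, *copy;
	size_t newcap;

	assert(it != NULL && line != NULL);

	if (it->len == it->cap) {
		/* grow geometrically so that appending stays cheap */
		newcap = it->cap ? 2 * it->cap : 64;
		if (!(tmp = realloc(it->text, newcap * sizeof(*tmp))))
			return -1;
		it->text = tmp;
		it->cap = newcap;
	}
	if (!(copy = malloc(strlen(line) + 1)))
		return -1;
	strcpy(copy, line);
	it->text[it->len++] = copy;

	return 0;
}

/*
 * Search only the items added since the last call and return how many
 * of them contain needle. The caller resets it->matched to 0 whenever
 * the query itself changes.
 */
size_t
items_match_new(struct items *it, const char *needle)
{
	size_t hits = 0;

	for (; it->matched < it->len; it->matched++)
		if (strstr(it->text[it->matched], needle))
			hits++;

	return hits;
}
```

Each time poll() or select() reports new data on stdin, the new lines
would be appended with items_add() and the current query re-run with
items_match_new(), so existing results never have to be recomputed.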

@Hiltjo: Before anybody puts time in this, any objections from you as
the maintainer?

With best regards

Laslo



Re: [hackers] Tag sbase

2022-04-03 Thread Laslo Hunhold
On Sun, 3 Apr 2022 08:10:09 +0200
Quentin Rameau  wrote:

Dear Quentin,

> Somebody asked me yesterday
> why there wasn't any “release”
> (read dist package)
> of sbase.
> 
> That's a good question,
> I think we could add a tag and make a dist for it
> (and ubase too while at it),
> could you take care of it,
> please, Michael?

I second this.

Back when we spent a lot of time on sbase we had some "release anxiety"
(Dimitris will also most likely remember ^^). Given Google uses sbase
in Fuchsia it is indication enough that the toolbox should be stable
enough for a release.

With best regards

Laslo



Re: [hackers] [st-orig][PATCH] Add MS Office 365 account requirement.

2022-04-01 Thread Laslo Hunhold
On Fri,  1 Apr 2022 06:05:19 +0200
Christoph Lohmann <2...@r-36.net> wrote:

Dear Christoph,

that is a great idea! Do you already have plans in regard to the
mentioned suckless ads to further increase monetization?

With best regards

Laslo

> ---
>  Makefile |  3 ++-
>  st-o365-auth | 27 +++
>  st.1 |  8 
>  x.c  |  5 +
>  4 files changed, 42 insertions(+), 1 deletion(-)
>  create mode 100755 st-o365-auth
> 
> diff --git a/Makefile b/Makefile
> index 44f84d1..6be45b1 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -36,7 +36,7 @@ dist: clean
>   mkdir -p st-$(VERSION)
>   cp -R FAQ LEGACY TODO LICENSE Makefile README config.mk\
>   config.def.h st.info st.1 arg.h st.h win.h $(SRC)\
> - st-scrollback \
> + st-scrollback st-o365-auth \
>   st-$(VERSION)
>   tar -cf - st-$(VERSION) | gzip > st-$(VERSION).tar.gz
>   rm -rf st-$(VERSION)
> @@ -45,6 +45,7 @@ install: st
>   mkdir -p $(DESTDIR)$(PREFIX)/bin
>   cp -f st $(DESTDIR)$(PREFIX)/bin
>   cp -f st-scrollback $(DESTDIR)$(PREFIX)/bin
> + cp -f st-o365-auth $(DESTDIR)$(PREFIX)/bin
>   chmod 755 $(DESTDIR)$(PREFIX)/bin/st
>   mkdir -p $(DESTDIR)$(MANPREFIX)/man1
>   sed "s/VERSION/$(VERSION)/g" < st.1 >
> $(DESTDIR)$(MANPREFIX)/man1/st.1 diff --git a/st-o365-auth
> b/st-o365-auth new file mode 100755
> index 000..fa0ffab
> --- /dev/null
> +++ b/st-o365-auth
> @@ -0,0 +1,27 @@
> +#!/usr/bin/env python
> +# coding=utf.8
> +#
> +# See st LICENSE for license details.
> +#
> +
> +import os
> +import sys
> +
> +from O365 import Account
> +
> +def main(args):
> + clientid = os.getenv("ST_O365_CLIENTID", None)
> + clientsecret = os.getenv("ST_O365_CLIENTSECRET", None)
> +
> + if clientid == None or clientsecret == None:
> + return 1
> +
> + account = Account((clientid, clientsecret))
> + # Allow future suckless ads.
> + if account.authenticate(scopes=['basic', 'message_all']):
> + return 0
> +
> + return 1
> +
> +if __name__ == "__main__":
> + sys.exit(main(sys.argv))
> diff --git a/st.1 b/st.1
> index ef0d379..2547392 100644
> --- a/st.1
> +++ b/st.1
> @@ -166,6 +166,14 @@ will be installed for all your scrollback needs.
> It is using for scrollback and more features. All options and
> parameters for .B st
>  apply here too, it is just a wrapper script.
> +.SH MICROSOFT OFFICE365 REQUIREMENT
> +.B st-o365-auth
> +is required to be installed. You need to set the
> +.B ST_O365_CLIENTID
> +and
> +.B ST_O365_CLIENTSECRET
> +environment variables to be valid for using
> +.B st.
>  .SH CUSTOMIZATION
>  .B st
>  can be customized by creating a custom config.h and (re)compiling the source
> diff --git a/x.c b/x.c
> index 2a3bd38..1365f72 100644
> --- a/x.c
> +++ b/x.c
> @@ -4,6 +4,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -2082,6 +2083,10 @@ run:
>   if (!opt_title)
>   opt_title = (opt_line || !opt_cmd) ? "st" :
> opt_cmd[0]; 
> + /* Authenticate against MS Office 365. */
> + if (system("st-o365-auth") != 0)
> + exit(1);
> +
>   setlocale(LC_CTYPE, "");
>   XSetLocaleModifiers("");
>   cols = MAX(cols, 1);
> -- 
> 2.30.1
> 
> 



Re: [hackers] [st][PATCH] rm unnecessary explicit zeroing

2022-03-17 Thread Laslo Hunhold
On Thu, 17 Mar 2022 20:25:09 +0100
"Roberto E. Vargas Caballero"  wrote:

Dear Roberto,

> On Tue, Mar 15, 2022 at 04:30:52PM +0600, NRK wrote:
> > +static const char base64_digits[(unsigned char)-1] = {  
> 
> Any reason to write "(unsigned char)-1" instead of writing 256?

char is not guaranteed to be 8 bits wide (unless we assume POSIX, which
is reasonable within POSIX), and you probably meant 255. An alternative
would be to go with UCHAR_MAX from limits.h.
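
As a small illustration (not the code from the patch): (unsigned char)-1
equals UCHAR_MAX, which is 255 when char has 8 bits. If a table is meant
to be indexable by every possible unsigned char value, it needs
UCHAR_MAX + 1 elements. The names lookup, lookup_init() and
lookup_digit() below are made up for this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* one slot per possible unsigned char value, i.e. indices 0..UCHAR_MAX */
static signed char lookup[UCHAR_MAX + 1];

/* mark every slot as invalid (-1), then map '0'..'9' to 0..9 */
void
lookup_init(void)
{
	size_t i;
	int c;

	/* the conversion wraps around, so both spellings agree */
	assert((unsigned char)-1 == UCHAR_MAX);

	for (i = 0; i <= UCHAR_MAX; i++)
		lookup[i] = -1;
	for (c = '0'; c <= '9'; c++)
		lookup[(unsigned char)c] = (signed char)(c - '0');
}

/* return the digit value of c, or -1 if c is not a decimal digit */
int
lookup_digit(unsigned char c)
{
	return lookup[c];
}
```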

With best regards

Laslo



Re: [hackers] st][PATCH - proper escape sequence for CTRL+HOME

2022-03-02 Thread Laslo Hunhold
On Mon, 28 Feb 2022 21:27:22 -0600
Dave Blanchard  wrote:

> This patch for 'st' causes CTRL+HOME to send the ANSI sequence \033[J
> and \033[1;5H , which signals the user program  to scroll to the top
> of the document, same as in Xterm. 
> 
> I have absolutely no idea what the 'appkey' and 'appcursor' fields
> do, as there are almost no comments anywhere to be found in the
> source code, and I haven't yet reverse engineered the code enough to
> figure out what the hell it's actually doing with those values. The
> provided values seem to work fine, though they may need to be changed
> if they're wrong.
> 
> On that note, regrettably it will be necessary for me to fork this
> project, if for no other reason than to properly comment it, so that
> its functionality can be understood and easily modified. It's a shame
> that such a nice little program is marred by its total lack of
> commentation, along with poorly chosen function and variable names.
> The use of tabs in the source code isn't particularly desirable
> either, IMO.
> 
> Overall, I like the 'suckless' initiative. I'm sick of all the bloat
> in the Linux world. My distro is built to be light weight, simple,
> and fast. 'st' is proving to be a nice addition, and a good starting
> point for building something even better. Looking forward to
> integrating more of your code into my system as I spend more time
> exploring your different projects, and the useful patches you've
> provided. Thanks for your work.

Wow, this thread definitely blew up and I'm a bit late to the party.
That's what happens when I, for the first time in a few months, leave
my basement-man-cave to restock on energy drinks and frozen fast food.

In my opinion the original motivation has a certain merit regarding
comments. I used to think differently about it, but I like to write
well-documented code. I can attest first-hand from the slcon in Budapest
to Roberto being able to keep every minute detail of vt100-specifics in
his mind, but I sadly will probably never achieve this level of
consciousness, so regarding appkey/appcursor and other aspects a little
bit of contextual comments might make sense. They neither change SLOC
nor the final binary, but provide context for the source-code-reader.
In the ideal case, you wouldn't even need a vt100 manual to understand
what is happening, but this all depends on how knowledgeable you assume
your reader to be.
Anyway, the original criticism though was, in my opinion, not
constructive at all. It wasn't expected that you present every case in
the code, but give a single example with a suggestion for a fix.
Otherwise it's just rambling and a waste of time.

It's a pity the thread escalated so quickly, though. This might be yet
another example where textual communication leads to misunderstandings.
95% of communication is non-verbal, and all this information is lost in
text. To each his own, but I benefitted from assuming a good rather
than a bad intent in most ambiguous cases. What is there to lose?

Anyway, no matter what anyone here thinks about how much st needs to be
commented, it's Hiltjo's call as maintainer to decide. If anyone
disagrees with him, he is free to fork it. That's how open source
works, and it's funny how often people push demands for something they
didn't pay for and which is developed in someone's free time.

With best regards

Laslo



Re: [hackers] [dmenu][PATCH] Remove warning for int comparison as bool

2022-02-24 Thread Laslo Hunhold
On Fri, 25 Feb 2022 11:07:49 +0530
Prathu Baronia  wrote:

Dear Prathu,

> - Compare the result of the macro with 0 instead of treating as bool
> to remove the following warning.

I'm not sure if the patch is correct, but maybe it would be a better
thing to go all the way and turn INTERSECT into a proper function for
better readability.

int
intersect_area(int x, int y, int width, int height, XineramaScreenInfo *info)
{
	/* product of the x- and y-overlap of the rectangle with the screen */
	return (MIN(x + width,  info->x_org + info->width)  - MAX(x, info->x_org)) *
	       (MIN(y + height, info->y_org + info->height) - MAX(y, info->y_org));
}

I find this much more readable than the macro and the extended naming
makes clear that INTERSECT returns an area and not just a boolean expression.
It would need to be put in an #ifdef XINERAMA.

With best regards

Laslo



Re: [hackers] [dwm][PATCH] Use proper conversion specifier and don't assume int == 32bits

2022-02-17 Thread Laslo Hunhold
On Thu, 17 Feb 2022 01:33:40 +0100
Hiltjo Posthuma  wrote:

> This is crazy, keep it simple

As you know, madness is like gravity ... all it takes is a little
(git) push.



Re: [hackers] [dwm][PATCH] Use proper conversion specifier and don't assume int == 32bits

2022-02-16 Thread Laslo Hunhold
On Wed, 16 Feb 2022 19:10:06 +0600
NRK  wrote:

Dear NRK,

> I don't think this is possible, at least not with the LENGTH macro.
> The pre-processor doesn't have access to `sizeof` operator.

thanks for your quick and helpful answer, and sorry on my behalf for
this mistake. It totally makes sense that the preprocessor does not
have access to sizeof of course, given it would have to build an AST to
elaborate the size of the constant array.
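
Since the check cannot live in an #if, one way to keep it at compile
time is to move it into the compiler proper, where sizeof is available.
The following is only a sketch using the common C99 negative-array-size
idiom; the tags array here is a stand-in for the one in config.h, and
tag_bitmap, tags_fit_check and tag_bit() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define LENGTH(x) (sizeof(x) / sizeof((x)[0]))

/* stand-in for the tags array from config.h */
const char *tags[] = { "1", "2", "3", "4", "5", "6", "7", "8", "9" };

/* one bit per tag */
typedef uint_least32_t tag_bitmap;

/* the array size becomes -1, a compile error, if there are too many tags */
typedef char tags_fit_check[LENGTH(tags) < 32 ? 1 : -1];

#define TAGMASK ((tag_bitmap)((UINT32_C(1) << LENGTH(tags)) - 1))

/* return the bitmask with only the bit of tag number i set */
tag_bitmap
tag_bit(unsigned int i)
{
	assert(i < LENGTH(tags));
	return (tag_bitmap)(UINT32_C(1) << i);
}
```

This gives up on choosing the type by tag count, but it keeps the
"tags-array too long" case a hard compile-time failure.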

With best regards

Laslo



Re: [hackers] [dwm][PATCH] Use proper conversion specifier and don't assume int == 32bits

2022-02-16 Thread Laslo Hunhold
On Wed, 16 Feb 2022 17:46:47 +0600
NRK  wrote:

Dear NRK,

> Attached two small patches, one fixing the conversion specifier to
> `%u` for unsigned int and another one not for not assuming int ==
> 32bits.
> 
> These are more closer to pedantic cleanups rather than actual
> meaningful changes, but I noticed them while playing around on the
> codebase and thought I might send the patches anyways. Feel free to
> apply or reject them as you wish.

@all: why not make a static compile-time-check on LENGTH(tags) and vary
the type accordingly?

#if LENGTH(tags) < 8
typedef uint_least8_t tag_bitmap;
#elif LENGTH(tags) < 16
typedef uint_least16_t tag_bitmap;
#elif LENGTH(tags) < 32
typedef uint_least32_t tag_bitmap;
#elif LENGTH(tags) < 64
typedef uint_least64_t tag_bitmap;
#else
#error "tags-array too long"
#endif

The *_least-types and #error are all standard C99.

Accordingly you would have to redefine TAGMASK and change the type in
the Rule struct.

This catches the best of both worlds, I think: It will marginally
improve compile times, allow maximum standard-conformant bitmask-based
tag-count and gives a much clearer error message when the tags-array is
too long. Thoughts?

With best regards

Laslo



Re: [hackers] [dmenu] follow-up fix: add -D_GNU_SOURCE for strcasestr for some systems || Hiltjo Posthuma

2022-02-07 Thread Laslo Hunhold
On Mon, 7 Feb 2022 13:36:24 +0100
Hiltjo Posthuma  wrote:

Dear Hiltjo,

> I kindof expected a reply like this. In general I don't disagree.
> 
> This function is available on many systems for decades.
> 
> On some systems like OpenBSD the -D_GNU_SOURCE is not needed.
> It's man page says:
> 
> "HISTORY
>  The strstr() function first appeared in 4.3BSD-Reno.  The
> strcasestr() function appeared in glibc 2.1, was reimplemented for
> FreeBSD 4.5 and ported to OpenBSD 3.8."
> 
> glibc 2.1 was released in 1999:
> https://sourceware.org/glibc/wiki/Glibc%20Timeline
> 
> OpenBSD 3.8 was released in 2005.
> 
> So whats the issue?

ah I see, I thought it was only available in glibc, but musl and the BSD
libcs (as you showed) implement it as well, so I'll pull back my
question here.

With best regards

Laslo



Re: [hackers] [dmenu] follow-up fix: add -D_GNU_SOURCE for strcasestr for some systems || Hiltjo Posthuma

2022-02-07 Thread Laslo Hunhold
On Mon, 7 Feb 2022 10:36:46 +0100 (CET)
g...@suckless.org wrote:

Dear Hiltjo,

> follow-up fix: add -D_GNU_SOURCE for strcasestr for some systems

wouldn't it be better to avoid GNU-extensions in code?

With best regards

Laslo



Re: [hackers] [libgrapheme] Mark likely branches || Laslo Hunhold

2022-01-05 Thread Laslo Hunhold
On Wed, 5 Jan 2022 02:24:01 +0600
NRK  wrote:

Dear NRK,

> Answering my own question: because it fails if `__has_builtin` is not
> defined. I was expecting the 2nd expression wouldn't get evaluated at
> all. Should probably take some time and learn more about the
> pre-processor sometimes.

yes exactly. If you use normal non-function-like-macros that don't
exist, it works out, as they are simply replaced with 0 in such an
expression. At least GCC, as far as I know, always evaluates all
macros in each expression, and it seems to be undefined in the standard
if you do that beforehand.

It's different for function-like-macros: If they do not exist, it
throws an error instead of replacing it with 0, which can easily be
confirmed by trying to compile a file test.c containing

   #if defined (idontexist) && idontexist(test)
   #endif

yielding

   $ cc -o test test.c
   test.c:1:29: error: function-like macro 'idontexist' is not defined
   #if defined (idontexist) && idontexist(test)
   ^
   1 error generated.
   $

This is also probably a good reason to always use nested ifdefs instead
of ifs, as using if-constructs leads to such surprises given the
macro-logic-operators don't seem to behave like the ones in the
language itself. Using ifdef forces you to only evaluate one condition
per line.
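
A minimal sketch of the nested form, assuming the goal is a likely()
hint as in the commit under discussion; the macro names and the example
function count_ascii() are made up here and are not libgrapheme's actual
code.

```c
#include <assert.h>
#include <stddef.h>

/* test for the function-like macro on its own line before invoking it */
#ifdef __has_builtin
	#if __has_builtin(__builtin_expect)
		#define likely(expr)   __builtin_expect(!!(expr), 1)
		#define unlikely(expr) __builtin_expect(!!(expr), 0)
	#endif
#endif

/* fallback for compilers without __has_builtin or __builtin_expect */
#ifndef likely
	#define likely(expr)   (expr)
	#define unlikely(expr) (expr)
#endif

/* count the bytes below 0x80, hinting that this is the common case */
size_t
count_ascii(const unsigned char *buf, size_t len)
{
	size_t i, n = 0;

	assert(buf != NULL || len == 0);

	for (i = 0; i < len; i++)
		if (likely(buf[i] < 0x80))
			n++;

	return n;
}
```

Because the inner #if is only reached when __has_builtin is defined, the
"function-like macro is not defined" error cannot occur.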

With best regards

Laslo



[hackers] [libgrapheme] version 1 release

2021-12-22 Thread Laslo Hunhold
Dear fellow hackers,

I'm pleased to announce version 1 of libgrapheme[0][1], a library for
unicode string handling which at this point allows you to segment
char-strings into user-perceived characters (that can be made up of
multiple codepoints), e.g. "‍‍  नी" into "‍‍" (18 bytes), ""
(8 bytes) and "नी" (6 bytes).

This allows you to properly handle text in your programs (and not only
count codepoints as individual user-perceived characters, which is
wrong) without having to rely on bloated libraries like ICU and
libunistring.
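
To illustrate why counting codepoints is not enough, here is a small
sketch that does not use libgrapheme at all. count_codepoints() is a
made-up helper that assumes valid UTF-8 input and simply skips
continuation bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count codepoints by counting every byte that is not a UTF-8
 * continuation byte (bit pattern 10xxxxxx).
 */
size_t
count_codepoints(const char *str, size_t len)
{
	size_t i, n = 0;

	assert(str != NULL || len == 0);

	for (i = 0; i < len; i++)
		if (((unsigned char)str[i] & 0xC0) != 0x80)
			n++;

	return n;
}
```

For the 6 bytes "\xe0\xa4\xa8\xe0\xa5\x80" (the "नी" from above) this
yields 2 codepoints, although the user perceives a single character.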

As could be seen on hackers@ there has been a lot of activity in the
last few weeks, but now with version 1 there is a stable version you
can rely on not to change in regard to its API.

Take a look at the README and libgrapheme(7) for an overview. Every
function-manual comes with an example and the usage should be more or
less obvious.

With best regards

Laslo Hunhold

[0]: https://libs.suckless.org/libgrapheme
[1]: https://dl.suckless.org/libgrapheme/libgrapheme-1.tar.gz



Re: [hackers] [libgrapheme] Bump to version 1 || Laslo Hunhold

2021-12-22 Thread Laslo Hunhold
On Wed, 22 Dec 2021 16:02:27 +0100 (CET)
g...@suckless.org wrote:

Sorry for the force-push, I don't use those lightly, but here it made
sense because I had forgotten to add the README to the release-tarball.

> commit 39d896e816101f8cca6db215edbe0f8084acc1c9
> Author: Laslo Hunhold 
> AuthorDate: Wed Dec 22 15:39:58 2021 +0100
> Commit: Laslo Hunhold 
> CommitDate: Wed Dec 22 16:01:26 2021 +0100
> 
> Bump to version 1
> 
> The library is well-refactored, identifies grapheme clusters as
> designed and the API is stable as well and fully documented.
> 
> Signed-off-by: Laslo Hunhold 
> 
> diff --git a/config.mk b/config.mk
> index 11682cf..3408f44 100644
> --- a/config.mk
> +++ b/config.mk
> @@ -1,5 +1,5 @@
>  # libgrapheme version
> -VERSION = 0
> +VERSION = 1
>  
>  # Customize below to fit your system
>  
> 



Re: [hackers] [libgrapheme] Rename API functions to improve readability || Laslo Hunhold

2021-12-21 Thread Laslo Hunhold
On Tue, 21 Dec 2021 01:39:23 +0600
NRK  wrote:

Dear NRK,

> It is true that verb followed by noun is more "natural" sounding, e.g.
> "Get that pen." However when it comes to functions naming, I prefer
> having the noun/object first.
> 
> The reasoning here is when writing code, I don't think, "Hmm, I want
> to do something and I want that something to be done on that object."
> No, my thought process is more in line with "I have this object, and
> I want to perform this action on it."
> 
> Although this probably doesn't apply in this case, but one other
> benefit of this naming scheme is that functions are grouped together
> more nicely based on what they operate on. Eg. when typing "lib_objX"
> you will get all the actions you can perform on objX.
> 
> Of course, I'm not claiming that everyone else's thought process is
> the same as mine. Nor am I asking the naming to be changed in this
> case. But rather, I'm simply providing some food for thought by
> explaining my rationale on why I believe having object first is better
> for function naming.

thanks for your feedback! I thought about your comment a bit.

Your reasoning makes most sense in the context of object-oriented
programming languages (where you have an object and a set of methods
defined to operate on it). However, the crucial difference here is that
we don't have different object types, but always either strings of
characters or two codepoints, and the classification by type
(character, utf8, etc.) only helps when you study the library
structure, but not when using the API itself. Given the number of code
units is small, you can still quickly see which function belongs to
which source file.

The discussed API-change is part of a bigger "plan" that will however
not come to fruition until after version 1 is released. Until then,
though, the API should be in its final form, as I don't want to impose
any API-changes after the first release, except when it really brings
dramatic improvements.

The strongest point for easier-to-read function names is that code is
written only once and read many times (ideally). So functions you can
"read" easier always win against functions that follow a stricter
structure and may be easier to program with.

With best regards

Laslo



Re: [hackers] [libgrapheme] Rename API functions to improve readability || Laslo Hunhold

2021-12-18 Thread Laslo Hunhold
On Sat, 18 Dec 2021 20:08:46 +0100
Mattias Andrée  wrote:

Dear Mattias,

> I would prefer the “libgrapheme_” prefix, so that it
> is obvious that the functions belong to the libgrapheme
> library.

I think that would be rather unusual and must admit that I know no case
of a library where the lib-prefix was found within the API. This is
especially apparent when we consider the l-flag of the linker, which
outright omits the lib-prefix.

Given the header is called "grapheme.h" (and whose name I will not
change), it only makes sense to call the prefix just the same. If you
stumble upon a "grapheme_" function in a piece of code, it should be
easy to see what it belongs to.

Or do you have a specific reason for said preference?

With best regards

Laslo



Re: [hackers] [libgrapheme] Use SIZE_MAX instead of (size_t)-1 || Laslo Hunhold

2021-12-18 Thread Laslo Hunhold
On Sat, 18 Dec 2021 15:07:30 -0500
Ethan Sommer  wrote:

Dear Ethan,

> > (size_t)-1 is also undefined behaviour.  
> 
> It isn't, wrap-around with unsigned types is defined, it's only signed
> overflow that isn't.

yes, exactly. For posterity, the standard specifies that in 6.3.1.3p2:

  "Otherwise, if the new type is unsigned, the value is converted by
  repeatedly adding or subtracting one more than the maximum value that
  can be represented in the new type until the value is in the range of
  the new type."
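
As a short sketch of what this means in practice: converting -1 to
size_t adds SIZE_MAX + 1 once, which gives SIZE_MAX. The helper
find_byte() is a made-up example that uses this value as a "not found"
marker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* return the index of c within the first len bytes of s, or SIZE_MAX */
size_t
find_byte(const char *s, size_t len, char c)
{
	const char *p;

	assert(s != NULL);

	p = memchr(s, c, len);

	return p ? (size_t)(p - s) : SIZE_MAX;
}
```

A caller can compare the result against either SIZE_MAX or (size_t)-1;
both spell the same value.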

With best regards

Laslo



Re: [hackers] [libgrapheme] Refine types (uint8_t -> char, uint32_t -> uint_least32_t) || Laslo Hunhold

2021-12-16 Thread Laslo Hunhold
On Thu, 16 Dec 2021 14:01:48 -0800
Michael Forney  wrote:

Dear Michael,

> Thanks for sticking with it. I know this topic is quite pedantic and
> hypothetical, but I think it's still important to consider and
> understand.

yeah definitely! Most probably think that we're crazy discussing this
stuff for so long, but it's imperative to have a "stable" API before
releasing version 1.

> Thanks for the links. The aliasing discussion in [0] is very
> interesting, and I will definitely bookmark [1] to use as a reference
> in the future.

I'm glad you can make use of it!

> Interestingly, there is a C23 proposal[0] to introduce char8_t as a
> typedef for unsigned char and change the type (!) of UTF-8 string
> literals from char * to char8_t * (aka unsigned char *). It has not
> been discussed in any meeting yet, but it will be interesting to see
> what the committee thinks of it. I don't think u8 string literals are
> widely used at this point, but it's weird to see a proposal breaking
> backwards compatibility like this.
> 
> [0] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

I stumbled upon that as well.

> I agree with all of this. Your patch looks good to me.

Thanks for checking the patch! Nice to hear that you agree.

> > The hexadecimal digits that follow the backslash and the letter x
> > in a hexadecimal escape sequence are taken to be part of the
> > construction of a single character for an integer character constant
> > or of a single wide character for a wide character constant. The
> > numerical value of the hexadecimal integer so formed specifies the
> > value of the desired character or wide character.
> 
> Okay, so '\xff' constructs a single character with value 255. But, is
> '\xff' considered an integer character constant containing a single
> character?
> 
> Then (6.4.4.4p10):
> 
> > An integer character constant has type int. The value of an integer
> > character constant containing a single character that maps to a
> > single-byte execution character is the numerical value of the
> > representation of the mapped character interpreted as an integer.  
> 
> Does this one apply? Not sure because later sentences mention escape
> sequences explicitly, and it's not clear if 255 maps to a single-byte
> execution character if CHAR_MAX == 127. Also, I'm not sure how to
> parse the last part of the sentence (some grouping parentheses would
> be helpful). The representation of 255 is 11111111, so what does it
> mean to interpret as an integer (of what width)?
> 
> > The value of an integer character constant containing more than one
> > character (e.g., 'ab'), or containing a character or escape sequence
> > that does not map to a single-byte execution character, is
> > implementation-defined.
> 
> If '\xff' is considered to not map to a single-byte execution
> character, then this would indicate that it's implementation-defined.
> 
> > If an integer character constant contains
> > a single character or escape sequence, its value is the one that
> > results when an object with type char whose value is that of the
> > single character or escape sequence is converted to type int.  
> 
> What does it mean for a char to have value of the escape sequence,
> since char may not be able to represent 255? Why are there two
> sentences that specify the value of an integer character constant
> containing a single character? If the first one applies, is this one
> ignored?
> 
> The main thing that indicates to me that it is defined is example 2 in
> that section (6.4.4.4p13):
> 
> > Consider implementations that use two's complement representation
> > for integers and eight bits for objects that have type char. In an
> > implementation in which type char has the same range of values as
> > signed char, the integer character constant '\xFF' has the value
> > -1; if type char has the same range of values as unsigned char, the
> > character constant '\xFF' has the value +255.  
> 
> It mentions two's complement and 8-bit char explicitly, and says
> '\xFF' has the value -1 (not "may have"). This makes me think that I
> should somehow be able to justify this using the above paragraphs.
> 
> So I can't say for sure, and I haven't been very lucky with searching
> the web for discussion about this, but I think it should be fine to
> use hex escapes to construct string literals with specific bit
> patterns (at the very worst it is implementation defined).

Thanks for digging through the standard! This was exactly the same
pitfall I was facing and I'm not sure, to be honest. After all, I think
just building an unsigned char-array and casting it to (char *) is
probably the safest way to go. :)
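
A minimal sketch of that approach, assuming a made-up accessor name
(a_macron) and using the UTF-8 encoding of U+0100 as sample data: the
bit patterns are spelled out as unsigned char values, which are always
representable, and only the pointer is cast.

```c
#include <assert.h>
#include <string.h>

/* UTF-8 encoding of U+0100 (0xC4 0x80), NUL-terminated. */
static const unsigned char a_macron_bytes[] = { 0xC4, 0x80, 0x00 };

/*
 * Hypothetical accessor: hand the bytes out as a char string. No value
 * is ever converted to char here, only the pointer type changes, and
 * character types may alias each other.
 */
static const char *
a_macron(void)
{
	return (const char *)a_macron_bytes;
}
```

The result can be passed to string.h functions as usual, for example
strlen(a_macron()) yields 2.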

I'll push the commit and add a manpage for the UTF-8-functions. At that
point, we should be ready for a first release.

With best regards

Laslo



Re: [hackers] [libgrapheme] Refine types (uint8_t -> char, uint32_t -> uint_least32_t) || Laslo Hunhold

2021-12-16 Thread Laslo Hunhold
On Thu, 16 Dec 2021 02:45:54 -0800
Michael Forney  wrote:

Dear Michael,

I know this thread is already long enough, but I took my time now to
read deeper into the topic. Please read below, as we might come to a
conclusion there now.

> Both of these observations are true, but just because uint8_t is 8-bit
> and unsigned char is 8-bit doesn't mean that uint8_t == unsigned char.
> A C implementation can have implementation-defined extended integer
> types, so it is possible that it defines uint8_t as an 8-bit extended
> integer type, distinct from unsigned char (similar to how long long
> and long may be distinct 64-bit integer types). As far as I know, this
> would still be POSIX compliant.
>
> Yes, I believe this is a possibility.
> 
> If you are assuming that unsigned char == uint8_t, I think you should
> just use unsigned char in your API. You could document the API as
> expecting one UTF-8 code unit per byte if you are worried about
> confusion regarding CHAR_BIT.

I found that _a lot_ of code relies on casting to and from (uint8_t *),
but this, as you already explained very well, breaks strict aliasing
as uint8_t is not necessarily a character type. This is not a problem
in practice because only gcc enforces strict aliasing and uint8_t is
typedef'd to unsigned char in all (?) cases, which lets uint8_t inherit
the aliasing-exception. However, nothing stops an implementer from
defining a separate integral type, which then does not work.
Many projects I found that cast to and from (uint8_t *) explicitly
disable strict aliasing with the flag -fno-strict-aliasing and
technically have no problem in this regard. But this is such a
technical detail that most users of the library would simply not know
about it if we pretty much forced them to cast to and from (uint8_t *).
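
Whether uint8_t really is unsigned char on a given implementation can
be checked at compile time. The following is only a sketch, assuming a
C11 compiler (for _Generic) and a made-up function name:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical check: 1 if uint8_t is a typedef for unsigned char on
 * this implementation, 0 if it is some other (extended) integer type
 * that would not inherit the aliasing-exception.
 */
static int
uint8_is_unsigned_char(void)
{
	return _Generic((uint8_t)0, unsigned char: 1, default: 0);
}
```

On the common platforms this evaluates to 1. A library that insists on
the cast could feed the same _Generic expression to static_assert and
refuse to compile elsewhere.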

Interestingly, there was even an internal discussion on the
gcc-bugtracker[0] about this. They were thinking about adding an
attribute __attribute__((no_alias)) to the uint8_t typedef so it would
explicitly lose the aliasing-exception.

There's a nice rant on [1] and a nice discussion on [2] about this
whole thing. And to be honest, at this point I still wasn't 100%
satisfied.

What convinced me was how they added UTF-8-literals in C11. There you
can define explicit UTF-8 literals as u8"Hällö Wörld!" and they're of
type char[]. So even though char * is a bit ambiguous, we document well
that we expect a UTF-8 string. C11 goes further and accommodates us
with ways to portably define them.
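
As a small sketch of such a literal (assuming a C11 compiler; the
names hallo and hallo_bytes are made up), the non-ASCII letters can be
written as universal character names so the source stays ASCII-only:

```c
#include <assert.h>
#include <string.h>

/*
 * H, U+00E4, l, l, U+00F6 as a UTF-8 literal. The compiler emits
 * 0xC3 0xA4 for U+00E4 and 0xC3 0xB6 for U+00F6.
 */
static const char hallo[] = u8"H\u00E4ll\u00F6";

/* Hypothetical helper: length in bytes (code units), not characters. */
static size_t
hallo_bytes(void)
{
	return strlen(hallo);
}
```

The five characters occupy seven bytes plus the terminating NUL, which
is exactly the kind of char buffer the API would be handed.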

> Ah, okay, I see what you mean. To be honest I'm not really sure how
> something like file encoding and I/O would work on such a system, but
> I was assuming that files would contain one code unit per byte, rather
> than packing multiple code units into a single byte. For instance, on
> a hypothetical system with 9-bit bytes, I wouldn't expect a code unit
> to cross the byte boundary.

To also address this point, here's what we can do to make us all happy:

  1) Change the API to accept char *.
  2) Cast the pointers internally to (unsigned char *) for bitwise
     modifications. We may do that as we may alias with char, unsigned
     char and signed char.
  3) Treat it as an invalid code point when any bit higher than the 8th
     is set. This is actually already in the implementation, as we have
     strict ranges.
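
The three steps above can be sketched as follows. This is not the code
from the attached diff; is_continuation and sample are made-up names,
and the range check stands in for the "strict ranges" mentioned in 3):

```c
#include <assert.h>
#include <stddef.h>

/* Sample input: a continuation byte followed by ASCII 'A'. */
static const unsigned char sample[] = { 0x80, 0x41 };

/*
 * Hypothetical helper: is the i-th code unit of s a UTF-8
 * continuation byte (0x80..0xBF)? The API takes char * (step 1), the
 * access goes through unsigned char * (step 2), and any value above
 * 0xBF, including one with bits beyond the 8th set on a platform with
 * wider chars, falls outside the range and is rejected (step 3).
 */
static int
is_continuation(const char *s, size_t i)
{
	unsigned char b = ((const unsigned char *)s)[i];

	return b >= 0x80 && b <= 0xBF;
}
```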

Please take a look at the attached diff and let me know what you think.
Is this portable, and am I correct to assume that we might even handle
chars wider than 8 bits properly?

There's just one open question: Do you know of a better way than to do

   (char *)(unsigned char[]){ 0xff, 0xef, 0xa0 }

to specify a literal char-array with specific bit-patterns?

With best regards and thanks again for your help and this very
interesting discussion!

Laslo

[0]:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66110
[1]:https://gist.github.com/jibsen/da6be27cde4d526ee564
[2]:https://github.com/RIOT-OS/RIOT/issues/5497
diff --git a/grapheme.h b/grapheme.h
index bd5244b..3294c8e 100644
--- a/grapheme.h
+++ b/grapheme.h
@@ -19,11 +19,11 @@ typedef struct lg_internal_segmentation_state {
 
 #define LG_CODEPOINT_INVALID UINT32_C(0xFFFD)
 
-size_t lg_grapheme_nextbreak(const uint8_t *);
+size_t lg_grapheme_nextbreak(const char *);
 
 bool lg_grapheme_isbreak(uint_least32_t, uint_least32_t, LG_SEGMENTATION_STATE *);
 
-size_t lg_utf8_decode(const uint8_t *, size_t, uint_least32_t *);
-size_t lg_utf8_encode(uint_least32_t, uint8_t *, size_t);
+size_t lg_utf8_decode(const char *, size_t, uint_least32_t *);
+size_t lg_utf8_encode(uint_least32_t, char *, size_t);
 
 #endif /* GRAPHEME_H */
diff --git a/man/lg_grapheme_nextbreak.3 b/man/lg_grapheme_nextbreak.3
index 795e1b4..ff78395 100644
--- a/man/lg_grapheme_nextbreak.3
+++ b/man/lg_grapheme_nextbreak.3
@@ -7,7 +7,7 @@
 .Sh SYNOPSIS
 .In grapheme.h
 .Ft size_t
-.Fn lg_grapheme_nextbreak "const uint8_t *str"
+.Fn lg_grapheme_nextbreak "const char *str"
 .Sh DESCRIPTION
 .Fn lg_grapheme_nextbreak
 computes the offset (in bytes) to the next grapheme

Re: [hackers] [libgrapheme] Refine types (uint8_t -> char, uint32_t -> uint_least32_t) || Laslo Hunhold

2021-12-16 Thread Laslo Hunhold
On Wed, 15 Dec 2021 12:24:21 -0800
Michael Forney  wrote:

Dear Michael,

> I think this is a mistake. It makes it very difficult to use the API
> correctly if you have data in an array of char or unsigned char, which
> is usually the case.
 
> Here's an example of some real code that has a char * buffer:
> https://git.sr.ht/~exec64/imv/tree/a83304d4d673aae6efed51da1986bd7315a4d642/item/src/console.c#L54-58
> 
> How would you suggest that this code be written for the new API? The
> only thing I can think is
> 
> if (buffer[position] != 0) {
>   size_t bufferlen = strlen(buffer) + 1 - position;
>   uint8_t *newbuffer = malloc(bufferlen);
>   if (!newbuffer) ...
>   memcpy(newbuffer, buffer + position, bufferlen);
>   position += grapheme_bytelen(newbuffer);
>   free(newbuffer);
> }
> return position;
> 
> This sort of thing would turn me off of using the library entirely.

yeah, it would be insane to malloc() a new buffer. However, the case
I'm making is that we can assume that

 1) uint8_t exists
 2) uint8_t == unsigned char

This may not be directly specified in the standard, but follows from
the following observations:

1) We make use of POSIX-functions in the code, so compiling
   libgrapheme requires a POSIX-compliant compiler and stdlib. POSIX
   requires CHAR_BIT == 8, which means that we can assume that chars
   are 8 bit, and thus uint8_t exists.
2) C99 specifies char to be of at least 8 bit size. Given char is meant
   to be the smallest addressable unit and uint8_t exists, char is
   exactly 8 bits.

> > Any other way would have introduced too many implicit assumptions.  
> 
> Like what?

I was unclear there. What I actually meant was that "char" carries
implicit assumptions in the programming world that are actually not
even reflected in the standard. When specifying the UTF-8-array as char
*, you basically carry on this tradition instead of being specific with
what you actually want.

> If you really want your code to break when CHAR_BIT != 8, you could
> use a static assert (there are also ways to emulate this in C99). But
> even if CHAR_BIT > 8, unsigned char is perfectly capable to represent
> all the values used in UTF-8 encoding, so I don't see the problem.
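
For reference, a sketch of both variants mentioned in the quote above
(the typedef name is made up; the C11 branch uses static_assert from
<assert.h>):

```c
#include <assert.h>
#include <limits.h>

/*
 * C99 emulation: a negative array size is a constraint violation, so
 * this typedef only compiles when CHAR_BIT == 8.
 */
typedef char assert_char_bit_is_8[(CHAR_BIT == 8) ? 1 : -1];

/* C11 and later: <assert.h> provides static_assert directly. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
static_assert(CHAR_BIT == 8, "this sketch expects 8-bit bytes");
#endif
```

Either form stops the build on a platform with wider bytes instead of
letting the code misbehave at runtime.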

Let's take a simple example: Say you have a file in UTF-8 encoding of
known size and wanted to read it and simply print the code points. You
would probably do it as follows in C (no checks to get the point
across), and let's assume here that lg_utf8_* accepts char *:

   FILE *fp;
   size_t size, off, ret, i;
   char *data;
   uint_least32_t cp;

   /* open */
   fp = fopen("file.txt", "r");

   /* get file size and allocate buffer */
   fseek(fp, 0L, SEEK_END);
   size = ftell(fp);
   rewind(fp);
   data = malloc(size);
   
   /* fill buffer */
   for (off = 0; (ret = fread(data + off, 1, size - off, fp)) > 0; off += ret)
      ;

   /* print code points */
   for (i = lg_utf8_decode(data, size, &cp); data[i] != '\0';
        i += lg_utf8_decode(data + i, size - i, &cp)) {
      printf("code point: %"PRIuLEAST32"\n", cp);
   }

However, here you have a problem if char suddenly is 16 bits wide (which
the standard permits), because then you read in two UTF-8 code units at
once, but lg_utf8_decode silently discards half of the data in the high
bits.
But this wouldn't even happen: POSIX mandates char to be 8 bits, C99
mandates char to be of integral type, and C99 also mandates that char
shouldn't have any padding. Thus there is only one unique way to
specify an unsigned integer of a certain bit-length.

So the case can be made that uint8_t == unsigned char, and casting
between char and unsigned char is fine, so you just cast any char * to
uint8_t * which will work as you would otherwise not have been able to
even compile libgrapheme in the first place.

Or am I missing something here, apart from the standard making a
semantic distinction? Is there any technical possibility to have a
system that has CHAR_BIT == 8 where uint8_t != unsigned char?

> > And even if all fails and there simply is no 8-bit-type, one can
> > always use the lg_grapheme_isbreak()-function and roll his own
> > de/encoding.  
> 
> I'm still confused as to what you mean by rolling your own
> de/encoding. What would that look like?
> 
> If there is no 8-bit type, libgrapheme could not be compiled or used
> at all since uint8_t would be missing.

Yeah, it was a bit of a transitive argument given you would have to
tailor grapheme and remove the utf8-encoder/decoder. But then you could
simply use the lg_grapheme_isbreak()-function which works on code
points. How you obtain the code points is up to the user, but then
libgrapheme doesn't care and simply returns a "decision".

tl;dr: I don't see what's wrong with simply casting char * to uint8_t *
given it's reasonable to assume that uint8_t == unsigned char for the
aforementioned reasons.

With best regards

Laslo



Re: [hackers] [libgrapheme] Refine types (uint8_t -> char, uint32_t -> uint_least32_t) || Laslo Hunhold

2021-12-15 Thread Laslo Hunhold
On Sun, 12 Dec 2021 12:41:15 -0800
Michael Forney  wrote:

Dear Michael,

> > But char and unsigned char are of integer type, aren't they?  
> 
> They are integer types and character types. Character types are a
> subset of integer types: char, signed char, and unsigned char.
> 
> > So on a
> > POSIX-system, which is 99.999% of cases, it makes no difference if
> > we cast between (char *) and (unsigned char *) (as you suggested
> > above if we went with unsigned char * for the interfaces) and
> > between (char *) and (uint_least8_t *), does it? So if the end-user
> > has to cast anyway, then he can just cast to an uint* type as well.
> >  
> 
> The difference is that uint8_t and uint_least8_t are not necessarily
> character types. Although the existence of uint8_t implies that
> unsigned char has exactly 8 bits, uint8_t could be a separate 8-bit
> integer type distinct from the character types. If this were the case,
> accessing an array of unsigned char through a pointer to uint8_t would
> be undefined behavior (C99 6.5p7).
> 
> Here are some examples:
> 
> char a[1] = {0};
> // always valid, evaluates to 0
> *(unsigned char *)a;
> // always valid, sets the bits of a[0] to 11111111
> // but the value of a[0] depends on the signed-int representation
> *(unsigned char *)a = 0xff;
> // undefined behavior if uint8_t is not a character type
> *(uint8_t *)a;
> *(uint8_t *)a = 0xff;
> 
> uint8_t b[1] = {0};
> // always valid, evaluates to 0
> *(unsigned char *)b;
> // always valid, sets the bits of b[0] to 11111111
> *(unsigned char *)b = 0xff;

thanks for clearing that up! After more thought I made the decision to
go with uint8_t, though. I see the point regarding character types, but
this notion is more of a smelly foot in the C standard. We are moving
towards UTF-8 as _the_ default encoding format, so considering
character strings as such is justified.
Any other way would have introduced too many implicit assumptions.

> > Even more drastically, given UTF-8 is an encoding, I don't really
> > feel good about not being strict about the returned arrays in such
> > a way that it becomes possible to have an array of e.g. 16-bit
> > integers where only the bottom half is used and it becomes the
> > user's job to then hand-craft it into a proper array to send over
> > the network, etc. Surely one can hack around this as a library
> > user, but at a certain point I think "to hell with it" and just be
> > strict about it in the API. C already has a weak type system and I
> > don't want to further weaken it by supporting decades-old implicit
> > assumptions on types. So in a way, maybe uint8_t is the way to go,
> > and then the library user immediately knows it's not going to work
> > with his machine because uint8_t is not defined for him.  
> 
> Not quite sure what you mean here. Are you talking about the case
> where CHAR_BIT is 16? In that case, there'd be no uint8_t, so you
> couldn't "hand-craft it into a proper array". I'm not sure how
> networking APIs would work on such a system, but maybe they'd consider
> only the lowest 8 bits of each byte.

Yes exactly. Trying to import grapheme.h would immediately show that
the system is incompatible rather than silently "breaking" on this
behalf. Given how smart compilers have become working with "halves" of
registers, I'd much rather expect the CPU to offer instructions to work
with 8-bit-integers as "halves" of 16 bits (accessing lower and upper).

And even if all fails and there simply is no 8-bit-type, one can always
use the lg_grapheme_isbreak()-function and roll his own de/encoding.

With best regards

Laslo



Re: [hackers] [libgrapheme] Refactor Makefile, add dist-target and add test-util || Laslo Hunhold

2021-12-15 Thread Laslo Hunhold
On Wed, 15 Dec 2021 13:28:04 +0100
Quentin Rameau  wrote:

Dear Quentin,

> > -GEN = gen/grapheme gen/grapheme-test
> > -LIB = src/grapheme src/utf8 src/util
> > -TEST = test/grapheme test/grapheme-performance test/utf8-decode test/utf8-encode
> > -
> > -MAN3 = man/lg_grapheme_isbreak.3 man/lg_grapheme_nextbreak.3
> > +GEN =\
> > +   gen/grapheme\
> > +   gen/grapheme-test
> > +SRC =\
> > +   src/grapheme\
> > +   src/utf8\
> > +   src/util
> > +TEST =\
> > +   test/grapheme\
> > +   test/grapheme-performance\
> > +   test/utf8-decode\
> > +   test/utf8-encode
> > +MAN3 =\
> > +   man/lg_grapheme_isbreak.3\
> > +   man/lg_grapheme_nextbreak.3
> >  MAN7 = man/libgrapheme.7
> >  
> >  all: libgrapheme.a libgrapheme.so  
> 
> The idiomatic way of using those is to escape the newline on every
> macro line.
> The goal here is to help producing less noise in patches which add or
> remove lines there, so that only the actual concerned lines are
> modified, not the one that may be the last because you now need to add
> or remove a '\' there.

thanks for this! I now pushed a commit that adopts this good idiom.

With best regards

Laslo



Re: [hackers] [libgrapheme] Refine types (uint8_t -> char, uint32_t -> uint_least32_t) || Laslo Hunhold

2021-12-12 Thread Laslo Hunhold
On Sun, 12 Dec 2021 01:22:47 -0800
Michael Forney  wrote:

Dear Michael,

> On 2021-12-11, Laslo Hunhold  wrote:
> > So would you say that the only good way would be to only accept
> > arrays of unsigned char in the API? I think this seems to be the
> > logical conclusion.  
> 
> That's one option, but another is to keep using arrays of char, but
> cast to unsigned char * before accessing. This is perfectly fine in C
> since unsigned char is a character type and you are allowed to access
> the representation of any object through a pointer to character type,
> regardless of the object's actual type.
> 
> Accepting unsigned char * is maybe a bit nicer for libgrapheme's
> implementation, but char * is nicer for the users, since that's likely
> the type they already have. It also allows them to continue to use
> string.h functions such as strlen or strcmp on the same buffer (which
> also are defined to interpret characters as unsigned char).

yes, if we were only accessing that would be fine. However, what about
the other way around? libgrapheme also writes to arrays with
lg_utf8_encode(), and that's where we can't just write to char.
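
One way around the write problem, following the suggestion quoted
above, is to store through an unsigned char pointer as well. The
following is only a sketch of that idea, not lg_utf8_encode() itself;
encode_2byte is a made-up name and only covers U+0080..U+07FF:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical two-byte encoder. Returns the number of bytes written,
 * or 0 if cp is out of the two-byte range or the buffer is too small.
 * The stores go through unsigned char *, so no out-of-range value is
 * ever converted to (possibly signed) char.
 */
static size_t
encode_2byte(uint_least32_t cp, char *out, size_t n)
{
	unsigned char *u = (unsigned char *)out;

	if (cp < 0x80 || cp > 0x7FF || n < 2)
		return 0;

	u[0] = (unsigned char)(0xC0 | (cp >> 6));   /* 110xxxxx */
	u[1] = (unsigned char)(0x80 | (cp & 0x3F)); /* 10xxxxxx */

	return 2;
}
```

Encoding U+0100 this way stores the bytes 0xC4 0x80 into the caller's
char buffer.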

> I guess it depends on how that data was obtained in the first place.
> Say you have char buf[1024], and read UTF-8 encoded data from a file
> into it. fread is defined in terms of fgetc, which "obtains that
> character as unsigned char" and stores into an array of unsigned char
> overlaying the object. In this case, accessing as unsigned char is the
> intention.
> 
> I can't really think of a case where the intention would be to
> interpret as signed char and convert to unsigned char. With
> sign-magnitude, it'd be impossible to encode Ā (UTF-8 0xC4 0x80) this
> way, since there is no char value that results in 0x80 when converted
> to unsigned char.
> 
> I know it's just a thought experiment, but note that there are only
> three signed-int representations valid in C: sign-magnitude, one's
> complement, and two's complement. They only differ by the meaning of
> the sign bit, which is the highest bit of the corresponding unsigned
> integer type, so you couldn't go as crazy as the representation you
> described.

Yeah, it was just a thought-experiment. :)

> >  1) Would you also go down the route of just demanding an array of
> > unsigned integers of at least 8 bits?  
> 
> I'd suggest sticking with char *, but unsigned char * seems
> reasonable as well.
> 
> >  2) Would you define it as "unsigned char *" or "uint_least8_t *"?
> > I'd almost favor the latter, given the entire library is already
> > using the stdint-types.  
> 
> I don't think uint_least8_t is a good idea, since there is no
> guarantee that it is a character type. The API user is unlikely to
> have the data in a buffer of this type, so they'd potentially have to
> allocate a new one and copy into it. With unsigned char *, they could
> just cast if necessary.

But char and unsigned char are of integer type, aren't they? So on a
POSIX-system, which is 99.999% of cases, it makes no difference if we
cast between (char *) and (unsigned char *) (as you suggested above if
we went with unsigned char * for the interfaces) and between (char *)
and (uint_least8_t *), does it? So if the end-user has to cast anyway,
then he can just cast to an uint* type as well.

Even more drastically, given UTF-8 is an encoding, I don't really feel
good about not being strict about the returned arrays in such a way that
it becomes possible to have an array of e.g. 16-bit integers where only
the bottom half is used and it becomes the user's job to then hand-craft
it into a proper array to send over the network, etc. Surely one can
hack around this as a library user, but at a certain point I think "to
hell with it" and just be strict about it in the API. C already has a
weak type system and I don't want to further weaken it by supporting
decades-old implicit assumptions on types. So in a way, maybe uint8_t
is the way to go, and then the library user immediately knows it's not
going to work with his machine because uint8_t is not defined for him.
Done.
I find it much more plausible that maybe even a compiler could
"emulate" 8-bit-types even on machines with 16-bit-chars, but this is
such an extreme case.

The standard consortiums made a good choice to let memcpy operate on
void*. They knew chars were a mess and it might be the best option to
just not touch them within the library at all and stick with
well-defined types.
I'll think about it.

With best regards

Laslo



Re: [hackers] [libgrapheme] Refine types (uint8_t -> char, uint32_t -> uint_least32_t) || Laslo Hunhold

2021-12-12 Thread Laslo Hunhold
On Sun, 12 Dec 2021 08:59:04 +0100
Laslo Hunhold  wrote:

Dear Michael,

> Two questions remain:
> 
>  1) Would you also go down the route of just demanding an array of
> unsigned integers of at least 8 bits?
>  2) Would you define it as "unsigned char *" or "uint_least8_t *"?
> I'd almost favor the latter, given the entire library is already
> using the stdint-types.

and there's also POSIX to think about. Given we're using POSIX
interfaces all over libgrapheme and POSIX states "(The POSIX standard
explicitly requires 8-bit char and two's-complement arithmetic.)"[0],
maybe simply going with "uint8_t *" is the real deal.

This still justifies the use of uint_least32_t, as POSIX does not
mandate uint32_t to exist, but we can legally assume an 8-bit-type
exists. This might be stronger to convey in the API using the explicit
uint8_t rather than using "unsigned char", which still has all the
"legacy" attached to it, and FFIs have no open questions about what we
are accepting.

With best regards

Laslo

[0]:https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/stdint.h.html



Re: [hackers] [libgrapheme] Refine types (uint8_t -> char, uint32_t -> uint_least32_t) || Laslo Hunhold

2021-12-11 Thread Laslo Hunhold
On Sat, 11 Dec 2021 12:24:10 -0800
Michael Forney  wrote:

Dear Michael,

thanks for your input. You really know the intricacies much better than
I do.

> It is true that the existence of uint32_t implies that uint_least32_t
> also has exactly 32 bits and no padding bits, but they could still be
> distinct types. For instance, on a 32-bit platform with int and long
> both being exactly 32 bits, you could define uint32_t as one and
> uint_least32_t as the other. In that case, dereferencing an array of
> uint32_t as uint_least32_t would be undefined behavior.
> 
> That said, I agree with this change. It also has the benefit of
> matching the definition of C11's char32_t.

That's a nice coincidence. The undefined behaviour would be okay for
me, given it would be a user error. In 99% of the cases it will not be a
problem, and in all cases not libgrapheme's fault which specifies the
interfaces well enough, but still it's good to know.

> 
> > diff --git a/src/utf8.c b/src/utf8.c
> > index 4488359..1cb5e17 100644
> > --- a/src/utf8.c
> > +++ b/src/utf8.c
> > @@ -92,7 +101,7 @@ lg_utf8_decode(const uint8_t *s, size_t n, uint32_t *cp)
> >  * (i.e. between 0x80 (10000000) and 0xBF (10111111))
> >  */
> > for (i = 1; i <= off; i++) {
> > -   if(!BETWEEN(s[i], 0x80, 0xBF)) {
> > +   if(!BETWEEN((unsigned char)s[i], 0x80, 0xBF)) {
> > /*
> >  * byte does not match format; return
> >  * number of bytes processed excluding the
> >  
> 
> Although irrelevant in C23, which will require 2's complement
> representation, I want to note the distinction between (unsigned
> char)s[i] and ((unsigned char *)s)[i]. The former adds 2^CHAR_BIT to
> negative values, while the latter interprets as a CHAR_BIT-bit
> unsigned integer (adds 2^CHAR_BIT if the sign bit is set). For
> example, if char had sign-magnitude representation, we'd have
> (unsigned char)"\x80"[0] == 0, but ((unsigned char *)"\x80")[0] ==
> 0x80.
> 
> The latter is probably what you want, but you could ignore this if you
> only care about 2's complement (which is a completely reasonable
> position).
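
The two expressions from the quote above, side by side as a sketch
(function and variable names are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* One byte with only the top bit set (0x80). */
static const unsigned char sample[] = { 0x80 };

/* Value conversion: read a char, then convert its value. */
static unsigned char
by_value(const char *s, size_t i)
{
	return (unsigned char)s[i];
}

/* Representation access: reinterpret the stored bits. */
static unsigned char
by_representation(const char *s, size_t i)
{
	return ((const unsigned char *)s)[i];
}
```

With two's complement (or an unsigned char type) both return 0x80 for
the sample byte. Only by_representation() is independent of the signed
representation, which is why it is the safer choice.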

Okay, maybe I misunderstood something here, but from what I understand
casting between signed and unsigned char is well-defined, no matter the
implementation. However, if you want to work bitwise it's only
well-defined if you do it on an unsigned type (i.e. unsigned char in
this case), which is why I cast to unsigned char. Where is the
undefined behaviour here? Is it undefined behaviour to cast between
signed and unsigned char when the value is larger than 127?

> > -   .arr = (uint8_t[]){ 0xFD },
> > +   .arr = (char[]){
> > +   (unsigned char)0xFD,
> > +   },  
> 
> This cast doesn't do anything here. Both 0xFD and (unsigned char)0xFD
> have the same value (0xFD), which can't necessarily be represented as
> char. For example if CHAR_MAX is 127, this conversion is
> implementation defined and could raise a signal (C99 6.3.1.3p2).
> 
> I think using hex escapes in a string literal ("\xFD") has the
> behavior you want here. You could also create an array of unsigned
> char and cast to char *.

From how I understood the standard, it does make a difference. "0xFD" as
is is an int-literal, and the compiler prints a warning stating that it
cannot be converted to a (signed) char. However, it does not complain
with unsigned char, so I assumed that the standard somehow safeguards it.

But if I got it correctly, you are saying that this only works
because I assume two's complement, right? So what's the portable way to
work with chars? :)

With best regards

Laslo



Re: [hackers] [quark][PATCH 1/7] arg.h: visual separation for blocks

2021-09-13 Thread Laslo Hunhold
On Sun,  4 Jul 2021 20:54:53 +0500
Nikita Zlobin  wrote:

Dear Nikita,

> 

thanks for your patchset, but I will not merge it, given it just
consists of style changes which I do not approve of. Regarding NULL,
using it would require importing something like stddef.h, which can be
avoided by just casting 0 to a pointer.

There are many low-hanging fruits in suckless tools, like porting
manpages to mandoc, among other things. :)

I appreciate that you took the time to work on the patches.

With best regards

Laslo



Re: [hackers] [dmenu][PATCH] turn -b into a toggle

2021-08-16 Thread Laslo Hunhold
On Mon, 16 Aug 2021 19:30:03 +0600
NRK  wrote:

Dear NRK,

> Fair enough. I suppose it should be better fit as a user patch in the
> wiki then?

I personally don't think that this makes sense as a user-patch, given
there's maintenance involved and such a change usually just leads to
failed hunks when using multiple patches.

Everyone is free to upload a patch in the wiki, though, but there are
already too many unmaintained and dead patches.

With best regards

Laslo



Re: [hackers] [dmenu][PATCH] turn -b into a toggle

2021-08-16 Thread Laslo Hunhold
On Mon, 16 Aug 2021 10:28:36 +0200
Hiltjo Posthuma  wrote:

Dear Hiltjo,

> Thanks for the patch. I'd rather not add another option for it.
> 
> I think if the default is not changed it still makes sense. Either
> way the option works as documented.

I understand, it's your call as the maintainer. Thanks for the quick
response!

With best regards

Laslo



Re: [hackers] [dmenu][PATCH] turn -b into a toggle

2021-08-16 Thread Laslo Hunhold
On Sun, 15 Aug 2021 23:44:58 +0600
NRK  wrote:

Dear NRK,

> currently config.h allows users to set the value of topbar to 0.
> however if one does that, there's no way for him to get a topbar
> again. it makes more sense to have -b as a toggle instead.

this trades one problem for another given dmenu suddenly changes
behaviour unexpectedly.

Imagine someone having set up dmenu to be at the top by default (using
config.h), but having a certain launcher script that invokes dmenu to
be at the bottom by passing "-b". Now said user, using the launcher
script often, learns to prefer the bottom bar and sets it as such in
config.h. Assuming the b-flag will just be redundant, he keeps it in
his launcher, only to be surprised that dmenu suddenly shows up at the
top.

In my opinion, and the maintainer's may differ in that regard,
behaviour of flags should not be surprising when the defaults have been
changed.

Because of that, why not add another flag "-t" that forces dmenu to
appear at the top of the screen? This makes it immediately obvious what
happens (rather than turning b into a position-toggle, which makes zero
phonetic sense and effectively renders the b-flag unusable for scripts
because it's inconsistent) and adds very little overhead.
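
To illustrate the difference to a toggle, here is a stand-alone sketch
of flags that set an absolute position. It is not the attached patch:
resolve_topbar is a made-up helper, and letting the last of several
flags win is an assumption of this sketch.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the menu should appear at the top
 * and 0 if it should appear at the bottom. default_topbar stands in
 * for the config.h setting; "-b" and "-t" override it regardless of
 * its value, so a script passing "-b" always gets the bottom.
 */
static int
resolve_topbar(int argc, char *argv[], int default_topbar)
{
	int i, topbar = default_topbar;

	for (i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "-b"))
			topbar = 0;
		else if (!strcmp(argv[i], "-t"))
			topbar = 1;
	}

	return topbar;
}
```

With a toggle, the result for "-b" would depend on default_topbar; here
it is 0 for either default, which is what makes the flag safe to use in
scripts.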

See the attached patch (also @Hiltjo, what do you think?). :)

With best regards

Laslo
From 6499e6a6313a7dda8fc75329e01d37e585839ba6 Mon Sep 17 00:00:00 2001
From: Laslo Hunhold 
Date: Mon, 16 Aug 2021 09:36:49 +0200
Subject: [PATCH] Add t-flag complementing b-flag (top/bottom-positioning)

There currently is no way to override the bottom-positioning if it
has been set as a default in config.h. The simplest solution is to
just add a complementary t-flag which overrides whatever behaviour
has been set.

This is more favourable compared to turning the b-flag into a toggle,
given it would lead to inconsistent behaviour (scripts can't rely on it)
and break the phonetic readability of the letter "b".

Separate t- and b-flags are very clear and add negligible overhead.

Signed-off-by: Laslo Hunhold 
---
 LICENSE  |  1 +
 config.def.h |  2 +-
 dmenu.1  |  6 +-
 dmenu.c  | 20 +++-
 4 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/LICENSE b/LICENSE
index 3afd28e..f4a0e4f 100644
--- a/LICENSE
+++ b/LICENSE
@@ -10,6 +10,7 @@ MIT/X Consortium License
 © 2010-2012 Connor Lane Smith 
 © 2014-2020 Hiltjo Posthuma 
 © 2015-2019 Quentin Rameau 
+© 2021 Laslo Hunhold 
 
 Permission is hereby granted, free of charge, to any person obtaining a
 copy of this software and associated documentation files (the "Software"),
diff --git a/config.def.h b/config.def.h
index 1edb647..6a3839a 100644
--- a/config.def.h
+++ b/config.def.h
@@ -1,7 +1,7 @@
 /* See LICENSE file for copyright and license details. */
 /* Default settings; can be overriden by command line. */
 
-static int topbar = 1;  /* -b  option; if 0, dmenu appears at bottom */
+static int topbar = 1;  /* -b/-t option; if 0, dmenu appears at bottom */
 /* -fn option overrides fonts[0]; default X11 font or font set */
 static const char *fonts[] = {
 	"monospace:size=10"
diff --git a/dmenu.1 b/dmenu.1
index 323f93c..15c5e68 100644
--- a/dmenu.1
+++ b/dmenu.1
@@ -3,7 +3,8 @@
 dmenu \- dynamic menu
 .SH SYNOPSIS
 .B dmenu
-.RB [ \-bfiv ]
+.RB [ \-b | \-t ]
+.RB [ \-fiv ]
 .RB [ \-l
 .IR lines ]
 .RB [ \-m
@@ -75,6 +76,9 @@ defines the selected background color.
 .BI \-sf " color"
 defines the selected foreground color.
 .TP
+.B \-t
+dmenu appears at the top of the screen.
+.TP
 .B \-v
 prints version information to stdout, then exits.
 .TP
diff --git a/dmenu.c b/dmenu.c
index 98507d9..85f1fa5 100644
--- a/dmenu.c
+++ b/dmenu.c
@@ -709,18 +709,20 @@ int
 main(int argc, char *argv[])
 {
 	XWindowAttributes wa;
-	int i, fast = 0;
+	int i, bflag = 0, tflag = 0, fast = 0;
 
 	for (i = 1; i < argc; i++)
 		/* these options take no arguments */
 		if (!strcmp(argv[i], "-v")) {  /* prints version information */
 			puts("dmenu-"VERSION);
 			exit(0);
-		} else if (!strcmp(argv[i], "-b")) /* appears at the bottom of the screen */
-			topbar = 0;
-		else if (!strcmp(argv[i], "-f"))   /* grabs keyboard before reading stdin */
+		} else if (!strcmp(argv[i], "-b")) { /* appears at the bottom of the screen */
+			bflag = 1;
+		} else if (!strcmp(argv[i], "-t")) {  /* appears at the top of the screen */
+			tflag = 1;
+		} else if (!strcmp(argv[i], "-f")) { /* grabs keyboard before reading stdin */
 			fast = 1;
-		else if (!strcmp(argv[i], "-i")) { /* case-insensitive item matching */
+		} else if (!strcmp(argv[i], "-i")) { /* case-insensitive item matching */
 			fstrncmp = strncasecmp;
 			fstrstr = cistrstr;
 		} else if (i + 1 == argc)
@@ -747,6 +749,14 @@ main(int argc, char *argv[])
 	

Re: [hackers] [sbase] tar: check if reallocarray failed

2021-07-18 Thread Laslo Hunhold
On Sat, 17 Jul 2021 21:04:04 +0200
Hiltjo Posthuma  wrote:

Dear Hiltjo,

> The patch below is for sbase tar:
>
> From 2eec3e07a5bd1ed1fa41ca02865297ab7d8b5fa8 Mon Sep 17 00:00:00 2001
> From: Hiltjo Posthuma 
> Date: Sat, 17 Jul 2021 21:03:27 +0200
> Subject: [PATCH] tar: check if reallocarray failed
> 
> ---
>  tar.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tar.c b/tar.c
> index b74c134..122f30a 100644
> --- a/tar.c
> +++ b/tar.c
> @@ -78,7 +78,7 @@ static const char *filtertools[] = {
>  static void
>  pushdirtime(char *name, time_t mtime)
>  {
> - dirtimes = reallocarray(dirtimes, dirtimeslen + 1,
> sizeof(*dirtimes));
> + dirtimes = ereallocarray(dirtimes, dirtimeslen + 1,
> sizeof(*dirtimes)); dirtimes[dirtimeslen].name = strdup(name);
>   dirtimes[dirtimeslen].mtime = mtime;
>   dirtimeslen++;

ah yes, good catch! While at it, we might also want to check the
strdup() in the consecutive line, right?
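
Just as a rough sketch of what I mean, assuming a small wrapper that
terminates on failure (the name xstrdup_or_die() is made up here; the
project may already have a suitable helper):

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: duplicate s or exit with an error message. */
static char *
xstrdup_or_die(const char *s)
{
	char *p;

	if (!(p = strdup(s))) {
		perror("strdup");
		exit(1);
	}

	return p;
}
```

The call site would then never see a NULL name, the same way
ereallocarray() guarantees a valid array.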

With best regards

Laslo



Re: [hackers] [PATCH] Add a configuration option for fullscreen locking

2021-07-14 Thread Laslo Hunhold
On Tue, 13 Jul 2021 14:33:31 -0400
Sebastian LaVine  wrote:

Dear Sebastian,

> I am the "some people" that Quentin mentioned above :)
> 
> I brought this up in the #suckless channel yesterday, when I was
> having a problem with Firefox: When I entered into fullscreen mode, I
> could no longer switch windows with my Alt+J/K keybindings. I use the 
> fakefullscreen patch, so for me fullscreen windows just expand as
> much as possible.
> 
> Quentin (quinq), Ingvix and I discussed this on IRC. To me, this 
> behavior seems rather unintuitive and therefore should not be
> included in mainline. To my knowledge, it isn't documented anywhere
> except the commit message, which is:

count me in in that regard. If an application (most likely a game) wants
exclusive fullscreen, it can capture the mouse in the window. I always
set it like this in wine and have had no problems with that, and it
still allows workspace-switching.

For what it's worth, in my humble opinion dwm should always guarantee
that you can switch workspaces. "Exclusive" fullscreen is a hack as we
know from slock.

Having a configuration-option is the best compromise, though. The
maintainer decides what should be the default. :)

With best regards

Laslo



Re: [hackers] [st][PATCH] arg.h: optimize & style

2021-07-04 Thread Laslo Hunhold
On Sun, 4 Jul 2021 11:55:53 +0200
Hiltjo Posthuma  wrote:

Dear Hiltjo,

> Thanks, but I prefer the current style one.
> 
> I'm not confident this patch doesn't modify any behaviour.
> For example I see the `i` variable was removed, but it is actually
> important to not modify argv as this causes issues on NetBSD and
> OpenBSD process listing (see commit
> a5a928bfc1dd049780a45e072cb4ee42de7219bf).
> 
> This is just one example. Unless it fixes a bug I rather keep the
> current code.

the arg.h in st has some "low-hanging" fruits regarding improvements,
and I modified it accordingly in quark and farbfeld back in 2017 (see
[0]) to fix some issues.

One example is that in my modified form, you can actually access
EARGF()/ARGF() multiple times (instead of silently corrupting the
state), and it properly handles the case when argv is NULL (which is
allowed by POSIX). There was also a bit of
code-deduplication/refactoring with fewer local variables and abort()
was replaced with exit(1).

Of course and as the license permits, feel free to use it in st or
other projects as well, if you like. :)

With best regards

Laslo

[0]:https://git.suckless.org/quark/file/arg.h.html



Re: [hackers] [st][patch] Mild const-correctness improvements.

2021-05-06 Thread Laslo Hunhold
On Thu, 6 May 2021 17:48:33 +0200
Hiltjo Posthuma  wrote:

Dear Hiltjo,

> The patch looks fine. I'm not in favor of some of the const changes,
> but I think it makes sense to make function parameters like for
> xstrdup() const.
> 
> I'll review and push it later.

const-correctness saved me from quite a few bugs in the past, so I
personally changed my mind about it a few years ago.

It's always good to write down "contracts", because you can only keep
so many things in your head at the same time.

With best regards

Laslo



Re: [hackers] [st][patch] Mild const-correctness improvements.

2021-05-06 Thread Laslo Hunhold
On Thu, 6 May 2021 16:11:33 +0200
"Markus F.X.J. Oberhumer"  wrote:

Dear Markus,

> this is my first post to this list, so I hope I got the email patch
> right.
> 
> GitHub repo is at
> https://github.com/markus-oberhumer/suckless-st/compare/mild-const-correctness-improvements

thanks for your input, but please save the patch as a file (using git
diff or git format-patch) and attach it to your E-Mail. A GitHub-link
is not good because it doesn't satisfy archivability and is overkill,
among other things.

With best regards

Laslo



Re: [hackers] [st][PATCH] Set custom environment variables in config.h

2021-04-02 Thread Laslo Hunhold
On Fri, 2 Apr 2021 07:42:24 +
Subhaditya Nath  wrote:

Dear Subhaditya,

> From 79e69338725563e1bdba32e856726e8fa5151e4c Mon Sep 17 00:00:00 2001
> From: Subhaditya Nath 
> Date: Thu, 1 Apr 2021 19:42:51 +0530
> Subject: [PATCH] Set custom environment variables in config.h
> 
> This patch enables setting custom environment variables in config.h.
> This patch changes config.def.h, and sets $EDITOR to /usr/bin/vim by
> default. Beware.

that's what .profile files are for. I personally don't see the benefit
and, to the contrary, see a lot of potential for unexpected behaviour,
but maybe I'm missing something.

With best regards

Laslo



Re: [hackers] [tabbed][PATCH] Remove quotes around variables in Makefile

2021-04-02 Thread Laslo Hunhold
On Fri, 2 Apr 2021 11:03:05 +0200
Hiltjo Posthuma  wrote:

Dear Hiltjo,

> I prefer with quotes. You can still do make PREFIX=~/.local or
> whatever. Otherwise you could use $HOME.

aren't the quotes also necessary in case one of the variables (DESTDIR,
MANPREFIX, etc.) contains spaces?

With best regards

Laslo



Re: [hackers] [svkbd] [merge request] various patches for svkbd

2021-03-27 Thread Laslo Hunhold
On Sat, 27 Mar 2021 14:03:05 +0100
Maarten van Gompel  wrote:

Dear Maarten,

> I wonder if the svkbd patches I submitted last week arrived properly
> and if you have the opportunity to look at them soon?
> 
> (I only see 2 of the 24(!) patches in the mailing list archives, there
> may be some caught in a filter?)
> 
> Once possible issues are resolved and things are merged we'd like a
> new svkbd release tag (0.3.0) so I can pick up the packaging end for
> Alpine Linux and we can subsequently do our sxmo 1.4.0 release, for
> which the new svkbd is a major dependency.

no worries, all 24 patches arrived fine! I'm surprised, though, that
the archive apparently only lists 2. Let's wait until the maintainer
reaches out to you.

With best regards

Laslo



Re: [hackers] [quark] Apply (D)DoS-Hardening || Laslo Hunhold

2021-02-08 Thread Laslo Hunhold
On Sun, 07 Feb 2021 21:41:58 +0300
Greg Minshall  wrote:

Dear Greg,

> thanks for your reply and detailed explanation, which i should have
> understood from your earlier e-mail (and, if not, from looking at the
> code).

don't worry about it; the algorithm-code is a bit convoluted given the
two optimizations taking place at the same time.

With best regards

Laslo



Re: [hackers] [quark] Apply (D)DoS-Hardening || Laslo Hunhold

2021-02-07 Thread Laslo Hunhold
On Sun, 07 Feb 2021 17:07:24 +0300
Greg Minshall  wrote:

Dear Greg,

> just a comment from the outside.
> 
> if i read get_connection_to_drop_candidate() correctly, your algorithm
> selects the first, in terms of location in 'connection' array, "best"
> (lowest state) candidate to drop.
> 
> you might think of, when finding an *equally* "best" candidate,
> flipping some (weighted, by?) coin, and either taking your current
> candidate, or taking the newly discovered "best".  as someone's
> e-mail tag says, "when in doubt, randomize" :).
> 
> (as a *research* experiment, in some other life, i might flip a coin
> for *every* element in the array, based maybe on the relative states?)

thanks for your input! I may have been imprecise with the description of
my algorithm: Of all connection slots (where every one of them is
occupied), it first finds out which in-address takes up the most (e.g.
20 connections from 127.217.17.131). Among those 20, it finds the
(first) one with the smallest progression (i.e. state in this case).
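
The two steps can be sketched as follows. This is a simplified
illustration, not the actual quark code: the struct, the textual address
field and the function names are assumptions, and it uses a plain
quadratic scan without any optimizations.

```c
#include <stddef.h>
#include <string.h>

/* Simplified connection states, ordered by progression. */
enum conn_state { C_RECV_HEADER, C_SEND_HEADER, C_SEND_BODY };

/* Hypothetical connection slot; all slots are assumed to be occupied. */
struct conn {
	char addr[46];          /* textual in-address, for illustration */
	enum conn_state state;
};

/* Count how many slots are taken by the given address. */
static size_t
count_addr(const struct conn *c, size_t n, const char *addr)
{
	size_t i, cnt = 0;

	for (i = 0; i < n; i++)
		if (!strcmp(c[i].addr, addr))
			cnt++;

	return cnt;
}

/* Return the index of the connection to drop (n must be > 0). */
static size_t
drop_candidate(const struct conn *c, size_t n)
{
	size_t i, cnt, maxcnt = 0, greedy = 0, best;

	/* step 1: find the address that occupies the most slots */
	for (i = 0; i < n; i++) {
		cnt = count_addr(c, n, c[i].addr);
		if (cnt > maxcnt) {
			maxcnt = cnt;
			greedy = i;
		}
	}

	/* step 2: among its connections, take the first with the lowest state */
	best = greedy;
	for (i = 0; i < n; i++)
		if (!strcmp(c[i].addr, c[greedy].addr) &&
		    c[i].state < c[best].state)
			best = i;

	return best;
}
```

Because both comparisons are strict, ties are resolved in favour of the
lowest array index, which is the "first" mentioned above.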

Indeed, this "minimizer" is not unique and we can have more than one
connection from this client in the "minimal" state (e.g. 10 of those
connections might be in the state C_RECV_HEADER, so not even finished
with sending the request header).

One could refine the algorithm to also minimize e.g. over the number of
bytes received (for C_RECV_HEADER) or how much data has already been
sent (for C_SEND_HEADER, C_SEND_BODY) and then find an even "better"
candidate among those connections from this one greedy client, but
maybe that goes too far.
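
Such a refinement could look roughly like the comparison below. The
"progress" field (bytes received or sent within the current state) is a
hypothetical addition for this sketch and not something I claim quark
tracks in this form.

```c
#include <stddef.h>

/* Simplified connection states, ordered by progression. */
enum conn_state { C_RECV_HEADER, C_SEND_HEADER, C_SEND_BODY };

struct conn {
	enum conn_state state;
	size_t progress;  /* hypothetical: bytes handled in this state */
};

/*
 * Return nonzero if a has progressed less than b, i.e. a is the
 * better candidate to drop. The state decides first; the byte
 * progress only breaks ties within the same state.
 */
static int
less_progressed(const struct conn *a, const struct conn *b)
{
	if (a->state != b->state)
		return a->state < b->state;

	return a->progress < b->progress;
}
```

Used in place of the plain state comparison, this shrinks the set of
equally good candidates considerably.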

Your randomness approach might give a little peace of mind to select
from multiple candidates, as mentioned before, but the placement in the
connection-array itself is non-deterministic if a non-trivial number of
clients access the server, especially near saturation, where any slot
at any point in the connection-array might become free at any time. If I
wanted a more refined behaviour, I'd probably just reduce my
drop-candidate set further (with the previously mentioned further
criteria). With this reduction, we'd be talking about 1-2 minimizers
(how likely is it that we have matching byte-progresses?) which would
not need a randomized approach anyway. If we still have a large set of
candidates despite the refined criteria, one can reasonably assume that
the client is just spamming the server with connections, and then it
doesn't really matter which one of the connections we drop.

Maybe I'll further refine it in the future. Thanks for reaching out and
raising this very interesting point about randomization!

With best regards for a nice Sunday

Laslo



Re: [hackers] [quark][PATCH] Return -1 in case of errors in queue event wrapper functions.

2021-01-30 Thread Laslo Hunhold
On Sat, 30 Jan 2021 13:54:58 +0100
Rainer Holzner  wrote:

Dear Rainer,

> Use same data type for nready (number of events) as returned by
> queue_wait(). ---
> [...]
> - int qfd, nready, fd;
> + int qfd, fd;
> + ssize_t nready;
> [...]
> + return -1;

thanks for spotting these mistakes and submitting a patch! I've
pushed it.

With best regards

Laslo



Re: [hackers] [quark] Ignore queries and fragments in URIs || Laslo Hunhold

2021-01-30 Thread Laslo Hunhold
On Sat, 30 Jan 2021 14:30:12 +0100
Hiltjo Posthuma  wrote:

> Cool story, bro.

To be continued ;)



Re: [hackers] [lchat][PATCH] Point that libutf is available in the sbase

2021-01-29 Thread Laslo Hunhold
On Thu, 28 Jan 2021 16:06:28 -0300
Pedro Lucas Porcellis  wrote:

> ---
>  README.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/README.md b/README.md
> index 6ece4c0..d3a815f 100644
> --- a/README.md
> +++ b/README.md
> @@ -18,7 +18,7 @@ Programs you can use lchat as a front end for:
>  Requirements
>  
>  
> - * libutf
> + * [libutf](https://git.suckless.org/sbase)
>   * tail(1)
>   * grep(1)
>  
> -- 
> 2.30.0
> 
> 

I don't know if that's really correct, given sbase does not install the
library on the system. Instead, sbase just has a local copy, compiles
it as a static library (.a) and links it into each binary statically.
It would make more sense to do the same for lchat, but I'd rather check
and see what really is needed.
From what I can tell, reading the slackline-source, one could easily
port it to libgrapheme, for example, because the utf.h-requirement is
only necessary for "Rune"-handling (i.e. Codepoints). By porting lchat
to libgrapheme you would get grapheme-cluster-support for free on top
of that.

With best regards

Laslo



Re: [hackers] [quark][PATCH] Add a config switch to enable/disable NPROC limit

2021-01-25 Thread Laslo Hunhold
On Mon, 25 Jan 2021 14:17:17 +0100
Giulio Picierro  wrote:

Dear Giulio,

> sorry for the late reply, I had a really busy week (course to teach)
> :/.

don't worry about it! This is a mailing list and meant to be
asynchronous. Don't feel pressured to respond and rather take your
time; it's a pastime after all.

> I didn't have the chance to test the new code, which I will do soon, 
> hopefully.
> 
> If I understand correctly your solution is to read the rlimit from
> the system and then update accordingly.
> 
> Now it seems to me a good solution, however I have to ask: do we
> really need to set the rlimit on the number of processes (which are
> fixed)?
> 
> I mean, what is the rationale behind it? Security purposes?
> 
> I'm just asking out of curiosity, to understand better the design 
> choices :D.

There are two aspects at play here, the limit on open file descriptors
of the current process (RLIMIT_NOFILE) and the limit on threads per user
(RLIMIT_NPROC).

The former is simple: It's a per-process-limit and we just apply a
heuristic and find a coarse upper bound for file descriptors the quark
process will consume. You see we set both cur and max, which is because
a program usually gets a signal when it exceeds the "soft" cur-limit,
but we don't want that. We want to either succeed or fail hard. The
nice thing about setrlimit() is that even if the process is not
privileged (i.e. does not have CAP_SYS_RESOURCE set) it can always set
the soft limit up to the hard limit and irrevocably reduce the hard
limit, so in case the limits are already large enough, we don't even
need CAP_SYS_RESOURCE.
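
As a minimal sketch of setting both limits at once (the function name is
made up and the heuristic that computes n is left out; this is an
illustration, not the code in quark):

```c
#include <sys/resource.h>

/*
 * Set the soft and the hard RLIMIT_NOFILE to the same value n, so
 * there is no gap between the two. Returns 0 on success and -1 on
 * error, with errno set by setrlimit().
 */
static int
set_nofile_limit(rlim_t n)
{
	struct rlimit lim;

	lim.rlim_cur = lim.rlim_max = n;

	return setrlimit(RLIMIT_NOFILE, &lim);
}
```

As long as n does not exceed the current hard limit, this call works
without any privileges, but the hard limit cannot be raised again
afterwards.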

The latter case is much more difficult, because the thread-limit is not
per-process but per-user. However, there isn't really a portable way to
find out how many current threads a user has. Say we give quark the
flag "-t 50" and are thus telling quark to spawn 50 serving-threads.
However, if the user has a thread-limit of 1000 and already has 990
threads active, pthread_create() will fail after creating 10 threads.
However, if the user has a thread-limit of 90 and only 20 active
threads, the thread creation will succeed (we suppose in both cases
that the number of "foreign" threads is constant).

Usually the thread limit is very high and not an issue, but there might
be configurations where that is not the case. The approach I worked out
for quark is kind of a hack, because what it does is just ask the
system to increase the thread-limit by the number of threads quark
needs. If that fails, e.g. if we are already at the kernel limit, we
just ignore that error, because that's as much as we can hope to do.
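
Roughly, and only as a sketch under assumptions (made-up function names,
RLIM_INFINITY assumed to be the largest rlim_t value as on Linux, and
RLIMIT_NPROC being available, which POSIX does not guarantee), the idea
is:

```c
#include <sys/resource.h>

/* Add n to v without overflowing; an unlimited value stays unlimited. */
static rlim_t
bump(rlim_t v, rlim_t n)
{
	if (v == RLIM_INFINITY || n > RLIM_INFINITY - v)
		return RLIM_INFINITY;

	return v + n;
}

/*
 * Ask the system to raise the per-user thread limit by n.
 * Returns 1 if the request was accepted and 0 if it was not.
 */
static int
raise_nproc_limit(rlim_t n)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_NPROC, &lim) < 0)
		return 0;

	lim.rlim_cur = bump(lim.rlim_cur, n);
	lim.rlim_max = bump(lim.rlim_max, n);

	/* a refusal is deliberately not treated as an error */
	return setrlimit(RLIMIT_NPROC, &lim) == 0;
}
```

The caller can ignore the return value, because a later
pthread_create() reports the problem anyway if the limit really is too
low.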

The only case where quark now could possibly fail with this heuristic
is on a system where a user is close to thread-saturation (in terms of
its limits, which must be way below the kernel limits) and some
user-program rapidly spawns threads in the few milliseconds between
quark's thread-limit-increase (triggering a TOCTOU) and
thread-allocation. However, even in this extreme case, quark would just
error out on pthread_create() and this is no security issue or
anything.

One could get rid of setting the thread limit by spawning the worker
threads before dropping root, but I just don't feel comfortable with
that. The earlier you lobotomize yourself, the better.

With best regards, hoping this was helpful to you

Laslo



Re: [hackers] [quark] http: fix default index serving

2021-01-24 Thread Laslo Hunhold
On Sun, 24 Jan 2021 19:12:30 +0100
Quentin Rameau  wrote:

Dear Quentin,

> I would prefer that you keep rightful authors of patches instead of
> changing the style a bit and committing in your own name.
> This isn't respectful of contributors and seems to be a recurring
> issue with you.
> 
> If you want to change the style, you can discuss it with the authors,
> and amend the commit before pushing instead of doing that.
> 
> I would prefer that you revert the commit, and do it properly (which
> would be good as it would also be explained in the development
> history).

as you know I mean neither disrespect nor offense and strictly add even
one-time-contributors to the LICENSEs of my projects, because I believe
in proper attribution.

The way I handled the application of patches is probably due to the
fact that I read a lot of OpenBSD-commits. To give you an insight, look
at [0] to see how frequently they "credit" external patches in the commit
messages. However, I see and agree with your point and have reverted
and split up the commits[1][2][3][4] and updated the license[5].

The main reason for this split is that git distinguishes between
committer and author, a feature CVS doesn't have and which is likely
the reason they chose this form, and it's good to have a distinctive
history with clear authorship of patches and credit, as you also stated
as your preference.

> > The http_prepare_response()-function is pretty messy, especially in
> > regard to stale data, which this bug is also based on. I'm working
> > on making it more resilient by splitting the discrete sub-problems
> > into separate functions.  
> 
> Yes, but that's also partly due to the style, these are no
> “fallthrough” cases, there are early returns, and it's easier to read
> them as such instead of putting them into if-then-else blocks
> everywhere.

This is a style/code-readability-matter indeed, however, I also added
the change to the part regarding mime-type-handling, which is not
style. As an afterthought, though, it makes more sense to do that in a
separate commit, which I did now.

Anyways, if I do something wrong or something bothers you, please let
me know right then so I can have a chance to correct this. Otherwise,
it's likely I won't notice. If you call it "recurrent", it basically
implies an ill intent, which I don't have at all.

With best regards

Laslo

[0]:https://freshbsd.org/search?q=heavily+based+on%5B%5D=openbsd=commit_date
[1]:https://git.suckless.org/quark/commit/a4ea7cbe676adffd1dbd98b2bb7f68591b24d46c.html
[2]:https://git.suckless.org/quark/commit/deeec27c56d8f5049abac0dad3782f5daf95a1a3.html
[3]:https://git.suckless.org/quark/commit/8afc6416647585ec2695d57eee7c226216e4111c.html
[4]:https://git.suckless.org/quark/commit/67c29aaba8a8194685677586338688e82c619e93.html
[5]:https://git.suckless.org/quark/commit/c6a9055e5a30be570e30da8d216c39662c3a3f99.html



Re: [hackers] [quark] http: fix default index serving

2021-01-24 Thread Laslo Hunhold
On Sun, 24 Jan 2021 14:48:23 +0100
Quentin Rameau  wrote:

Dear Quentin,

> bump

sadly this patch was part of the mails that kept bumping due to the
DMARC/SPF-signing-issue we discussed earlier at admins@. Now that I see
your bump I hope this issue is resolved and I can see your future mails
again. :)

I took a look at the archives[0] and have merged it in [1], however,
changed it into an else-case (to make it not depend on the fallthroughs
in the if only) and changed the mime-check so the mime-type is matched
against the docindex-path.

The http_prepare_response()-function is pretty messy, especially in
regard to stale data, which this bug is also based on. I'm working on
making it more resilient by splitting the discrete sub-problems into
separate functions.

Thanks for finding this issue and your patch! I must admit that this
issue slipped past me because I only checked quark's behaviour with
"curl -I", which effectively masked this problem.

With best regards

Laslo

[0]:https://lists.suckless.org/hackers/2101/17763.html
[1]:https://git.suckless.org/quark/commit/87ae2e9212c5cc7309eefa2a3f49a758862db6c7.html



Re: [hackers] [quark][PATCH] Add a config switch to enable/disable NPROC limit

2021-01-20 Thread Laslo Hunhold
On Mon, 18 Jan 2021 23:03:11 +0100
Laslo Hunhold  wrote:

> that is a really nice observation! Thanks for pointing it out and your
> elaborate explanation. I must honestly admit that I assumed the limit
> was per process and not per user. I'll think about how to approach
> this the best way; given your aforementioned fact, I only see two
> options:
> 
>1) Don't touch the rlimits and let it fail, giving a proper error
>   message (might be problematic for open file descriptors that
>   might get exhausted at runtime). One can also check the limits
>   beforehand and error out (e.g. if we cannot guarantee 4 fds per
>   slot).
>2) Increment the rlimits by first reading them and setting the
>   incremented value. Possible problems here are TOCTOU (even
>   though the risk here is not too high) and a possible
>   interference in things that shouldn't be touched by convention.

To give a followup, I went with option 2), because it allows the
smoothest operation. Quark is run as root and thus can seize all the
assets it needs. I'm sure many set their global resource limits to
reasonable values, but if you run your server and give it a thread and
slot count, you can estimate that it might exceed your resources. In
that respect, it is forgivable by quark to just raise the bar instead
of failing.

Thanks again, Giulio, for your input and patch suggestion. I hope this
fixes your issues in your case!

With best regards

Laslo



Re: [hackers] [quark][PATCH] Add a config switch to enable/disable NPROC limit

2021-01-18 Thread Laslo Hunhold
On Sun, 17 Jan 2021 17:29:53 +0100
Giulio Picierro  wrote:

Dear Giulio,

> Quoting the book "The Linux Programming Interface" by Michael
> Kerrisk: "the RLIMIT_NPROC limit, which places a limit on the number
> of processes that can be created, is measured against not just that
> process’s consumption of the corresponding resource, but also against
> the sum of resources consumed by all processes with the same real
> user ID."
> 
> This leads quark to easily fail on Linux when launched with the same
> userid of a logged user.
> 
> For example if the user 'giulio' has an active desktop session, the
> following command:
> 
> $ sudo ./quark -p 8080 -u giulio -g giulio -l
> 
> fails with the following error:
> 
> $ ./quark: pthread_create: Resource temporarily unavailable
> 
> No error occurs if instead quark is launched with a userid that does
> not have a session, such as the 'http' user, usually reserved for web
> servers.
> 
> I don't know if this is expected or this could be considered a bug:
> in the end for production servers we could expect that the limit
> works correctly.
> 
> In any case, the least invasive way that I have found to solve the
> issue is to introduce a config switch to disable the limit, retaining
> it enabled by default.

that is a really nice observation! Thanks for pointing it out and your
elaborate explanation. I must honestly admit that I assumed the limit
was per process and not per user. I'll think about how to approach this
the best way; given your aforementioned fact, I only see two options:

   1) Don't touch the rlimits and let it fail, giving a proper error
  message (might be problematic for open file descriptors that
  might get exhausted at runtime). One can also check the limits
  beforehand and error out (e.g. if we cannot guarantee 4 fds per
  slot).
   2) Increment the rlimits by first reading them and setting the
  incremented value. Possible problems here are TOCTOU (even though
  the risk here is not too high) and a possible interference in
  things that shouldn't be touched by convention.

What do the others think?

With best regards

Laslo



Re: [hackers] [quark] Use epoll/kqueue and worker threads to handle connections || Laslo Hunhold

2021-01-17 Thread Laslo Hunhold
On Sun, 17 Jan 2021 12:48:38 +0100
Hiltjo Posthuma  wrote:

Dear Hiltjo,

> This does not work on OpenBSD and it does not compile.

thanks for letting me know! I didn't get around to testing it on
OpenBSD yet, but did it now and pushed a fix[0].

With best regards

Laslo

[0]:https://git.suckless.org/quark/commit/959c855734e3af12f35532d76deb1ab85474f8f4.html



Re: [hackers] [quark] Prevent overflow in strtonum()-parameters || Laslo Hunhold

2020-11-01 Thread Laslo Hunhold
On Sun, 1 Nov 2020 11:17:42 +0100
Quentin Rameau  wrote:

Dear Quentin,

> SIZE_MAX is the tangible guarantee for the upper limit of size_t.

indeed, but strtonum's maxval argument is a signed long long, and given
size_t can be unsigned long long, we could overflow it.

With best regards

Laslo



Re: [hackers] [quark][PATCH] Fix overflow when calling strtonum in parse_range

2020-10-31 Thread Laslo Hunhold
On Sun, 1 Nov 2020 01:15:32 +0100
José Miguel Sánchez García  wrote:

Dear José,

> Good point! It could be the case that SIZE_MAX is smaller than 
> LLONG_MAX. Honestly I don't know, but I would do what you are
> proposing just to be sure: it is the safest option, and maybe the
> compiler will take care of replacing the correct value at compile
> time. Way better than leaving another bug lingering until someone
> else finds it again.

we should be safe at that point and I have committed the MIN-solution
in commit 7d26fc695. Thanks for your report!

With best regards

Laslo



Re: [hackers] [quark][PATCH] Fix overflow when calling strtonum in parse_range

2020-10-31 Thread Laslo Hunhold
On Sat, 31 Oct 2020 21:58:26 +
José Miguel Sánchez García  wrote:

Dear José,

> The value passed as maxval, SIZE_MAX, doesn't fit on a long long int
> due to signedness. It was causing legitimate range requests to be
> discarded as bad.
> 
> I tested it serving an mp4 and opening it with Firefox. A "range=0-"
> was requested, and it triggered the bug.

this is a great catch, thanks! But wouldn't it be better to use
MIN(SIZE_MAX, LLONG_MAX)?

I haven't found anything in the standard that puts "long long" and
"size_t" into any relation, which means, for me, that any case is
possible where either value could be larger, but please correct me if
I'm wrong.
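
What I have in mind is roughly the following sketch. The function name
is made up; the only assumption about strtonum() is that its maxval
parameter is a long long, as in the OpenBSD interface.

```c
#include <limits.h>
#include <stdint.h>

#define MIN(x, y) ((x) < (y) ? (x) : (y))

/*
 * Largest value that fits into both a size_t and the long long
 * maxval parameter of strtonum(), whichever of the two is smaller
 * on the given platform.
 */
static long long
size_maxval(void)
{
	return (long long)MIN(SIZE_MAX, LLONG_MAX);
}
```

The comparison is safe despite the mixed signedness, because both
constants are positive and are converted to a common type that can hold
them before they are compared.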

With best regards

Laslo



Re: [hackers] [quark] Thoughts on CGI and authentication?

2020-10-31 Thread Laslo Hunhold
On Mon, 26 Oct 2020 11:49:33 +0100
José Miguel Sánchez García  wrote:

Dear José,

> Funny, that's my current use case. All my CGI is through forms, so
> I'm currently running a separate server for the form handlers,
> regenerating the HTML and then redirecting to the recently updated
> page through a "303 See Other" code.
> 
> My motivation behind integrating CGI into quark was leveraging the 
> quality of its implementation to avoid the security pitfalls of 
> badly-written HTTP servers out there. I would only have to worry
> about writing a simple script to handle the form data.
> 
> Also, if CGI was integrated into the web server itself, I could use
> the same domain/port/endpoint to serve the static page (via a GET
> request) and to handle the form (via a POST request). Moot point but
> it goes a long way towards usability.

another approach would be to have a very small interposer that splits
GET and POST requests and forwards them to quark and the CGI-handler
respectively.

> Finally, CGI is often used to customize the content of a page for a 
> given user. Imagine a logged in user in a forum: they must see a link 
> that points to their profile. Anonymous users would see a login/signup
> bar instead.
> 
> I must say that, even with these advantages in mind, I've come to
> think that CGI would not be appropriate for quark. Its goals are at
> odds with the needs of a CGI implementation, and that's fine (there
> are alternatives for those who want CGI). Feel free to prove me wrong
> :)

Software gets really complex if you try covering the last 5% of
use-cases. Given the massive flexibility of the static web and how many
CGI-applications really are just far away from the original idea of the
web I really don't see a reason to tailor quark towards CGI. It was
there before, but it just made everything really complicated.

With best regards

Laslo



Re: [hackers] [quark][PATCH] Add skeleton for keep-alive connections

2020-10-30 Thread Laslo Hunhold
On Thu, 29 Oct 2020 10:16:36 +
José Miguel Sánchez García  wrote:

Dear José,

> The bare minimum has been implemented, it is currently unused. It
> allows the server to maintain a stateful connection with the client.
> Also, keep-alive connections are more efficient than successive
> request/response pairs of connections.

thanks for your patch, but this can definitely be implemented much
more simply. It's sufficient to have a "binary" field "int keepalive" in
the response struct and set it when we prepare the request-struct
(defaulting to 0 of course, i.e. close) depending on the request-fields.
There's no need to add new data-structures or anything.

At the end of serve(), we then check the response-struct and either
close the connection or return to receiving the header.
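
A minimal sketch of the idea, with a stripped-down struct and a made-up
function name (the real structs in quark have more fields and the header
parsing is left out here):

```c
#include <strings.h>

/* Hypothetical, stripped-down response struct; only the new field. */
struct response {
	int keepalive; /* 0: close after the response (default), 1: keep */
};

/*
 * Set res->keepalive from the value of the request's "Connection"
 * field. Pass NULL if the field was absent, which yields "close".
 */
static void
set_keepalive(struct response *res, const char *connection_field)
{
	res->keepalive = connection_field &&
	                 !strcasecmp(connection_field, "keep-alive");
}
```

At the end of serve(), a simple check of res.keepalive then decides
between closing the connection and going back to receiving a header.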

I respect your systematic approach, but it's not like there will be any
more than a binary state (close or keep-alive) to this process. I'd
love to chime in further, but I've got a lot to do at the moment and
definitely have the keep-alive-connections as a big thing on my
todo-list.

With best regards

Laslo



Re: [hackers] [quark][PATCH] Don't erase response on http_send_error_response

2020-10-26 Thread Laslo Hunhold
On Mon, 26 Oct 2020 11:34:17 +0100
José Miguel Sánchez García  wrote:

Dear José,

> > I also don't see a reason for the constraints you mention. Just add
> > an array of group-auth-pairs to the server struct and also add a
> > group-auth-pair to the req-struct that you then fill when you parse
> > the request fields in http_parse_header(). Then later, in
> > http_prepare_header_buf(), you check if they match and either send
> > an error-header (access denied) or allow access.
> > 
> > In case the auth-field is empty but the file requires a password,
> > you, in turn, send the desired header to ask for auth.  
> 
> You are absolutely right, and I just didn't see it when I was working
> on it. Sorry for wasting your time.

no problem! Sometimes it takes a few refactorings of an idea until it
is implemented the best way.

With best regards

Laslo



Re: [hackers] [quark] Thoughts on CGI and authentication?

2020-10-26 Thread Laslo Hunhold
On Sun, 25 Oct 2020 18:00:30 +0300
Platon Ryzhikov  wrote:

Dear Platon,

> I've recently had an idea that instead of adding support for running
> scripts by HTTP server (which in any case leads to new fork() calls)
> one could use a library providing HTTP server itself while all the
> logic is created separately and is performed using callbacks from
> library main loop. In that case one could attempt to handle dynamic
> (and static using proper callbacks) content within fixed number of
> threads.

there is theoretically no limit to that, but IPC is a difficult thing
here given you are within a chroot. One could think of another
Unix-domain socket (besides the one that would be created with the -U
option) that could be used to "send" and "receive" data, but to be
honest, it really is not within quark's scope.

Tell me one example where you need CGI which isn't a web forum? To give
an example of how you can solve something statically: A comment section
could be built by having a static web server and also a very thin
"handler" that is called when the form is submitted that adds the
comment to a database and updates the static data on the fly. The
advantage of this is that if someone manages to "crash" the
comment-handler or kill the database process or something, the website
is not affected.
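
As a rough sketch of the "updates the static data" part only (receiving
the form and the database are left out), such a handler could append an
escaped HTML fragment to the file the static server delivers. The names
put_escaped() and append_comment() are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Write s to fp with the HTML-special characters escaped. */
static void
put_escaped(FILE *fp, const char *s)
{
	size_t n;

	while (*s) {
		/* copy the run up to the next special character verbatim */
		n = strcspn(s, "&<>\"");
		fwrite(s, 1, n, fp);
		s += n;

		switch (*s) {
		case '&': fputs("&amp;", fp); s++; break;
		case '<': fputs("&lt;", fp); s++; break;
		case '>': fputs("&gt;", fp); s++; break;
		case '"': fputs("&quot;", fp); s++; break;
		default: break; /* end of string */
		}
	}
}

/*
 * Append one comment as an HTML fragment to an already opened file.
 * Returns 0 on success and -1 if the stream reported a write error.
 */
int
append_comment(FILE *fp, const char *author, const char *text)
{
	fputs("<p><b>", fp);
	put_escaped(fp, author);
	fputs("</b>: ", fp);
	put_escaped(fp, text);
	fputs("</p>\n", fp);

	return ferror(fp) ? -1 : 0;
}
```

The web server itself never runs this code; it only serves the resulting
file.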

Still, maybe I'm missing something here. Please let me know what you
need CGI for!

With best regards

Laslo



Re: [hackers] [quark][PATCH] Don't erase response on http_send_error_response

2020-10-26 Thread Laslo Hunhold
On Sun, 25 Oct 2020 11:04:26 +0100
José Miguel Sánchez García  wrote:

Dear José,

> I'm currently relying on the req struct NOT being erased, because I'm 
> storing the realm the file belongs to there. Then, I'm using that
> realm information to build the WWW-Authenticate header for the 401
> error response.
> 
> I could just save that field before erasing everything else, but I 
> wonder if that's the way to go. If you are getting rid of everything, 
> maybe I shouldn't make exceptions?

Definitely don't make exceptions here, because erasing the entire
struct is a consistency measure and being inconsistent there
complicates the semantics.

I also don't see a reason for the constraints you mention. Just add an
array of group-auth-pairs to the server struct and also add a
group-auth-pair to the req-struct that you then fill when you parse the
request fields in http_parse_header(). Then later, in
http_prepare_header_buf(), you check if they match and either send
an error-header (access denied) or allow access.

In case the auth-field is empty but the file requires a password, you,
in turn, send the desired header to ask for auth.
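
To make this concrete, here is a small sketch of the check. All types and
names (struct group_auth, check_auth(), enum auth_result) are assumptions
for illustration and not quark's actual definitions; in quark the check
would sit in http_prepare_header_buf().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types; not quark's actual definitions. */
struct group_auth {
	const char *group; /* group the file belongs to (must not be NULL) */
	const char *auth;  /* credentials, e.g. "user:pw" */
};

struct server {
	const struct group_auth *auth; /* configured group-auth-pairs */
	size_t auth_len;
};

struct request {
	struct group_auth ga; /* filled while parsing the request fields */
};

enum auth_result { AUTH_OK, AUTH_REQUIRED, AUTH_DENIED };

enum auth_result
check_auth(const struct server *srv, const struct request *req)
{
	size_t i;
	int is_protected = 0;

	for (i = 0; i < srv->auth_len; i++) {
		if (strcmp(srv->auth[i].group, req->ga.group)) {
			continue;
		}
		is_protected = 1;
		if (req->ga.auth && !strcmp(srv->auth[i].auth, req->ga.auth)) {
			return AUTH_OK; /* allow access */
		}
	}

	if (!is_protected) {
		return AUTH_OK; /* no pair configured for this group */
	}

	/* empty auth-field: ask for auth; wrong one: access denied */
	return (req->ga.auth && req->ga.auth[0]) ? AUTH_DENIED : AUTH_REQUIRED;
}
```

The request struct can still be erased completely on every parse, because
everything the check needs is filled in again by the parser.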

With best regards

Laslo



Re: [hackers] [quark][PATCH] Don't erase response on http_send_error_response

2020-10-25 Thread Laslo Hunhold
On Sat, 24 Oct 2020 16:19:13 +
José Miguel Sánchez García  wrote:

Dear José,

thanks for taking the time to read the code and report this!

> The comment before the offending line indicated it was intended to
> only erase the fields, but it erased the whole response. It was most
> likely a bug.
>
>   /* empty all fields */
> - memset(req, 0, sizeof(*req));
> + memset(&(req->fields), 0, sizeof(req->fields));

No, this is supposed to be like this. I agree that the comment is a bit
misleading, but http_parse_header() really builds a request from
scratch and first sets it all to zero. With "fields" I'm referring to
the struct fields in request, and this misleading comment will be fixed
in an upcoming commit.
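
The difference between the two calls, shown on a made-up struct (the
members below are assumptions and do not mirror quark's request struct):

```c
#include <string.h>

/* Hypothetical request struct; quark's real one has different members. */
struct request {
	int method;
	char target[64];
	char fields[2][32]; /* header fields */
};

/* Variant from the patch: clears only the "fields" member. */
void
reset_fields_only(struct request *req)
{
	memset(&req->fields, 0, sizeof(req->fields));
}

/* Current behaviour: clears every member, so parsing starts from scratch. */
void
reset_all(struct request *req)
{
	memset(req, 0, sizeof(*req));
}
```

With reset_fields_only(), method and target would keep whatever the
previous request left there.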

With best regards

Laslo



Re: [hackers] [quark] Thoughts on CGI and authentication?

2020-10-23 Thread Laslo Hunhold
On Fri, 23 Oct 2020 17:10:37 +0200
José Miguel Sánchez García  wrote:

Dear José,

> That was the whole reasoning behind supporting digest authentication. 
> Sure, TLS protects the connection from third parties messing around
> with your connection, but nothing prevents an evil/misconfigured
> server from stealing your cleartext password. At least with digest
> authentication, you know that the server is not seeing your password
> either (at least you would if the login UI for HTTP auth were barely
> usable and told you info about the security mechanism being used...
> I'm getting off track sorry).

I see what you mean. Still, when you go via TLS, the authenticity of the
server is assured as well.

> > Keeping with the spirit of the current set of command line arguments
> > (e.g. -m for maps, of which you can specify as many as you want),
> > one could have a flag -p (protect/password/whatever) that takes a
> > group name and a cleartext password and applies it to all files
> > matching that group in the serving folder, for example '-p "nogroup
> > user:pw"'.
> 
> I like that: simple and intuitive. Will do that, thanks!

You might also go with "group user pw", which saves us one more
"token"-format.
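
A possible way to split such an argument, as a sketch only: struct
protect, next_token() and parse_protect() are made-up names, the buffer
sizes are arbitrary, and passwords containing whitespace are not handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for one "group user pw" argument. */
struct protect {
	char group[64];
	char user[64];
	char pw[64];
};

/* Copy the next whitespace-delimited token from *s into buf. */
static int
next_token(const char **s, char *buf, size_t len)
{
	size_t n;

	*s += strspn(*s, " \t");
	n = strcspn(*s, " \t");
	if (n == 0 || n >= len) {
		return -1; /* missing or over-long token */
	}
	memcpy(buf, *s, n);
	buf[n] = '\0';
	*s += n;

	return 0;
}

/* Returns 0 on success, -1 unless arg consists of exactly three tokens. */
int
parse_protect(const char *arg, struct protect *p)
{
	if (next_token(&arg, p->group, sizeof(p->group)) ||
	    next_token(&arg, p->user, sizeof(p->user)) ||
	    next_token(&arg, p->pw, sizeof(p->pw))) {
		return -1;
	}
	arg += strspn(arg, " \t");

	return *arg ? -1 : 0; /* reject trailing tokens */
}
```

Splitting on whitespace alone means there is no separate "user:pw"
format to parse.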

> I hope it ends up being a drop-in solution, looking at the code it
> seems like it will. We'll know when it's done ;)

It most probably will be.

With best regards

Laslo


