Re: [dev] [tabbed] utf8 characters not displayed in tabs

2022-08-31 Thread NRK
On Tue, Aug 30, 2022 at 10:47:46PM +0200, Storkman wrote:
> but more generally,
> the "set of fonts covering as wide variety of code-points as possible"
> ...is just the set of all installed fonts, isn't it?

Not entirely. Let's say a system has 3 fonts installed: A, B and C. A
and B both *only* cover English characters and C covers Chinese ones.

In that case, our list would be [A, C]. B can be eliminated from the
fallback-list since it doesn't add anything new. (Or it could be [B, C]
with A eliminated).
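
A minimal sketch of that elimination in C, under stated assumptions:
coverage is modelled as a plain 256-bit bitmap per font rather than a
real FontConfig charset, and `Font`, `adds_coverage` and `prune` are
hypothetical names, not drw functions. Fonts are visited in order and
one is kept only if it covers a code-point that no earlier kept font
covers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NWORDS 4 /* toy universe: 4 * 64 = 256 code-points */

/* Hypothetical font record: a name plus a coverage bitmap. */
typedef struct {
	const char *name;
	uint64_t cover[NWORDS];
} Font;

/* Non-zero if `cover` has at least one bit that `acc` lacks. */
static int adds_coverage(const uint64_t *acc, const uint64_t *cover)
{
	for (size_t w = 0; w < NWORDS; w++)
		if (cover[w] & ~acc[w])
			return 1;
	return 0;
}

/*
 * Walk the fonts in order and keep only those that add new coverage.
 * Pointers to the kept fonts are written to `out` (room for n entries);
 * the number of kept fonts is returned.
 */
static size_t prune(const Font *fonts, size_t n, const Font **out)
{
	uint64_t acc[NWORDS] = {0};
	size_t kept = 0;

	assert(out != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!adds_coverage(acc, fonts[i].cover))
			continue; /* nothing new: eliminate */
		for (size_t w = 0; w < NWORDS; w++)
			acc[w] |= fonts[i].cover[w];
		out[kept++] = &fonts[i];
	}
	return kept;
}
```

With A and B covering the same bits and C covering different ones,
`prune` keeps A and C and drops B; visiting B first would give [B, C]
instead. With real FontConfig charsets the same check could presumably
be expressed with FcCharSetIsSubset() and FcCharSetUnion(), but that is
an assumption and has not been tried here.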

We could of course add a reasonable upper-limit to this list, say
128/196 or something along those lines.

And just so everyone is on the same page: drw ALREADY maintains such a
list. I'm only proposing we change *how* we construct the list (assuming
it's possible).

This is roughly how things currently work:

0. at startup push the user specified fonts (in config.h) to the list
// list = [ userfont0, userfont1 ]
foreach code-point:
1. walk the list to find a font that can render the code-point.
   if any of the fonts in the list can render it, just use that.
2. if not, then run XftFontMatch() to see if we can find
   a match.
3. If XftFontMatch() finds a match then append the
   matched font to the list (so that step 1 can find it
   next time around).
   // list = [ userfont0, userfont1, matchedfont0, matchedfont1 ... ]
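
For illustration, the steps above can be modelled as the following C
sketch. It is not the drw code: `Font`, `installed`, `system_match` and
`font_for` are hypothetical, each font covers a single code-point
range, `system_match` merely stands in for the XftFontMatch() query,
and the `nomatches` cache is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXFONTS 8

/* Hypothetical stand-in for a loaded font: covers [lo, hi]. */
typedef struct {
	const char *name;
	uint32_t lo, hi;
} Font;

/* Toy data: fonts present on the system but not yet in the list. */
static const Font installed[] = {
	{ "cjk",   0x4e00, 0x9fff },
	{ "greek", 0x0370, 0x03ff },
};

/* Step 0: the list starts out with the user-specified font(s). */
static Font list[MAXFONTS] = { { "userfont0", 0x20, 0x7e } };
static size_t nlist = 1;
static unsigned match_calls; /* how often the expensive path ran */

static int covers(const Font *f, uint32_t cp)
{
	return cp >= f->lo && cp <= f->hi;
}

/* Stand-in for step 2, the system-wide font query. */
static const Font *system_match(uint32_t cp)
{
	match_calls++;
	for (size_t i = 0; i < sizeof installed / sizeof *installed; i++)
		if (covers(&installed[i], cp))
			return &installed[i];
	return NULL;
}

/* Steps 1-3 for one code-point; NULL means "missing glyph". */
static const Font *font_for(uint32_t cp)
{
	assert(nlist <= MAXFONTS);
	for (size_t i = 0; i < nlist; i++) /* step 1: walk the list */
		if (covers(&list[i], cp))
			return &list[i];

	const Font *m = system_match(cp);  /* step 2: ask the system */
	if (m == NULL || nlist == MAXFONTS)
		return m;

	list[nlist] = *m;                  /* step 3: remember the match */
	return &list[nlist++];
}
```

In this model a code-point that no font covers reaches `system_match`
again on every lookup, which is the cost being discussed.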

My idea is to simply remove steps 2 and 3 from the loop and construct the
list at startup so we don't have to constantly keep calling
XftFontMatch() inside the loop for each unknown code-point.

> I'm a big fan of pre-calculating this list before compiling the program.

I was talking about doing it on program startup since I don't think you
should have to recompile the program every time you install a new font
for font fallback to work.

> If your question is simply "which code-points ARE NOT supported by ANY font",

Wasn't exactly what I was looking for but even this *could be* an
improvement over the current system if it can be done in reasonable
space, speed and code-size.

- NRK



Re: [dev] [tabbed] utf8 characters not displayed in tabs

2022-08-30 Thread Storkman
On Tue, Aug 30, 2022 at 10:23:44PM +0600, NRK wrote:
> On Tue, Aug 30, 2022 at 05:36:31PM +0200, Hiltjo Posthuma wrote:
> > "
> > The font fallback in dwm and dmenu is handled via libsl (i.e drw.c) and
> > it's a huge mess [0].
> > 
> > The way it works is also very inefficient (it calls XftFontMatch() for
> > every single "unknown" code-point). The `nomatches` cache is merely
> > there to stop the bleeding and is not really a proper fix.
> > "
> > 
> > This part specifically. The tone is very whiny.
> 
> I see; the intention there was to just describe/explain why I think
> replicating drw is not a good idea. It wasn't meant to be whiny.
> 
> > It doesn't help complaining the code is a mess or improper without
> > proposing a patch.
> 
> The "proper" way (IMO) would be to build up a list of fonts which would
> be capable of representing as many code-points available in the system
> *right at startup* - instead of checking each unknown code-point as we
> go.

I'm a big fan of pre-calculating this list before compiling the program.

> This way if the code-point cannot be found within the list; we'll know
> right away that it's a missing glyph and there won't be any need to call
> XftFontMatch for each "unknown" code-point.

In my text editor, I just use a static pre-configured list of font names.
It's very efficient, since there's only like four of them, and Xft keeps
a bitmap of supported code-points in every loaded font.
I can define exactly which fonts I want it to use without ever delving
into the FontConfig XML Hell again.

I suppose it's a difference in interpretation of what the "correct" behavior is.

> The problem is, as I said, I'm not sure if it's even possible/feasible
> with Xft/FontConfig as I'm not very familiar with those libraries. If
> someone knows the answer, then feel free to speak up.

There's the 'lang' and 'charset' properties, but more generally,
the "set of fonts covering as wide variety of code-points as possible"
...is just the set of all installed fonts, isn't it?
What exactly are we looking for here?

If your question is simply "which code-points ARE NOT supported by ANY font",
you could just iterate over all fonts and OR their charsets together.

$ time fc-list '' fullname charset >/dev/null
0m00.06s real 0m00.05s user 0m00.01s system
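
The OR-ing suggested above could look roughly like this in C. This is
only a sketch under assumptions: `Charset` is a toy 256-bit bitmap and
`charset_add`, `charset_has` and `charset_union_all` are hypothetical
helpers. With FontConfig itself the equivalent operations would
presumably be FcCharSetUnion() and FcCharSetHasChar().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NWORDS 4 /* toy universe: 4 * 64 = 256 code-points */

/* Hypothetical charset: one bit per code-point. */
typedef struct {
	uint64_t bits[NWORDS];
} Charset;

/* Mark code-point `cp` as supported. */
static void charset_add(Charset *cs, unsigned cp)
{
	assert(cp < NWORDS * 64);
	cs->bits[cp / 64] |= UINT64_C(1) << (cp % 64);
}

/* Non-zero if `cp` is in the set; out-of-range values are never in it. */
static int charset_has(const Charset *cs, unsigned cp)
{
	return cp < NWORDS * 64 && ((cs->bits[cp / 64] >> (cp % 64)) & 1);
}

/* OR the charsets of all fonts together into a single set. */
static Charset charset_union_all(const Charset *fonts, size_t n)
{
	Charset all = {{0}};

	for (size_t i = 0; i < n; i++)
		for (size_t w = 0; w < NWORDS; w++)
			all.bits[w] |= fonts[i].bits[w];
	return all;
}
```

Any code-point for which `charset_has` returns 0 on the combined set is
one that no font supports, so it can be treated as a missing glyph
without querying further.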

> If it is possible and someone can point out which routines I should be
> looking at then I can try to take a crack at it. In case that's not
> possible, then there's probably not a whole lot that can be done about
> the situation.
> 
> - NRK
> 

 - Storkman



Re: [dev] [tabbed] utf8 characters not displayed in tabs

2022-08-30 Thread Laslo Hunhold
On Tue, 30 Aug 2022 22:23:44 +0600
NRK wrote:

Dear NRK,

> The "proper" way (IMO) would be to build up a list of fonts which
> would be capable of representing as many code-points available in the
> system *right at startup* - instead of checking each unknown
> code-point as we go.
> 
> This way if the code-point cannot be found within the list; we'll know
> right away that it's a missing glyph and there won't be any need to
> call XftFontMatch for each "unknown" code-point.
> 
> The problem is, as I said, I'm not sure if it's even possible/feasible
> with Xft/FontConfig as I'm not very familiar with those libraries. If
> someone knows the answer, then feel free to speak up.
> 
> If it is possible and someone can point out which routines I should be
> looking at then I can try to take a crack at it. In case that's not
> possible, then there's probably not a whole lot that can be done about
> the situation.

this aspect was discussed a while back and we all know that
Xft/Fontconfig is cancer. This entire font-rendering-topic is a huge
rabbit hole though, given it covers a very wide range of topics. You
have complex file parsing (OTF, TTF), font shaping (a field in which a
single library, harfbuzz, has a monopoly and which Unicode treats as
"specification by implementation", which is horrible) and
rendering/rasterization.

It's difficult to even get a foot in the door. As far as I remember
Thomas Oltmann worked on a rendering library and has a good insight
into the difficulties of this.

As a middle ground, maybe one could design a simple frontend for
fontconfig. I can imagine that caching the rendering-ability by
codepoints in a compressed format into metadata might be a cool
approach; my experience from working on libgrapheme is that such tables
are highly compressible, down to a few kilobytes per complete
codepoint-table.
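
One simple way to picture such a compressed table is range encoding,
sketched below in C. This is an assumption chosen for illustration and
is not claimed to be what libgrapheme or fontconfig do; `Range`,
`compress` and `covered` are hypothetical names. Runs of consecutive
code-points collapse into one inclusive range each, and lookups use
binary search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of supported code-points. */
typedef struct {
	uint32_t lo, hi;
} Range;

/*
 * Collapse a sorted, duplicate-free array of code-points into ranges.
 * `out` needs room for up to n entries; returns the number written.
 */
static size_t compress(const uint32_t *cp, size_t n, Range *out)
{
	size_t nr = 0;

	assert(out != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (nr > 0 && cp[i] == out[nr - 1].hi + 1)
			out[nr - 1].hi = cp[i];          /* extend the run */
		else
			out[nr++] = (Range){ cp[i], cp[i] }; /* start a new run */
	}
	return nr;
}

/* Binary search over the sorted ranges; non-zero if `cp` is covered. */
static int covered(const Range *r, size_t nr, uint32_t cp)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (cp < r[mid].lo)
			hi = mid;
		else if (cp > r[mid].hi)
			lo = mid + 1;
		else
			return 1;
	}
	return 0;
}
```

Since fonts tend to cover contiguous blocks, the number of ranges is
usually far smaller than the number of code-points, which is where the
space saving in this particular scheme would come from.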

With best regards

Laslo



Re: [dev] [tabbed] utf8 characters not displayed in tabs

2022-08-30 Thread NRK
On Tue, Aug 30, 2022 at 05:36:31PM +0200, Hiltjo Posthuma wrote:
> "
> The font fallback in dwm and dmenu is handled via libsl (i.e drw.c) and
> it's a huge mess [0].
> 
> The way it works is also very inefficient (it calls XftFontMatch() for
> every single "unknown" code-point). The `nomatches` cache is merely
> there to stop the bleeding and is not really a proper fix.
> "
> 
> This part specifically. The tone is very whiny.

I see; the intention there was to just describe/explain why I think
replicating drw is not a good idea. It wasn't meant to be whiny.

> It doesn't help complaining the code is a mess or improper without
> proposing a patch.

The "proper" way (IMO) would be to build up a list of fonts which would
be capable of representing as many code-points available in the system
*right at startup* - instead of checking each unknown code-point as we
go.

This way if the code-point cannot be found within the list; we'll know
right away that it's a missing glyph and there won't be any need to call
XftFontMatch for each "unknown" code-point.

The problem is, as I said, I'm not sure if it's even possible/feasible
with Xft/FontConfig as I'm not very familiar with those libraries. If
someone knows the answer, then feel free to speak up.

If it is possible and someone can point out which routines I should be
looking at then I can try to take a crack at it. In case that's not
possible, then there's probably not a whole lot that can be done about
the situation.

- NRK



Re: [dev] [tabbed] utf8 characters not displayed in tabs

2022-08-30 Thread Hiltjo Posthuma
On Tue, Aug 30, 2022 at 09:11:03PM +0600, NRK wrote:
> On Tue, Aug 30, 2022 at 04:50:14PM +0200, Hiltjo Posthuma wrote:
> > Then write a patch to improve it instead of whining about it.
> 
> The topic was font fallback; I advised not to replicate drw, explained
> my reasoning why, and then advised to look into ST instead.
> 
> I don't see any whining.
> 
> - NRK
> 

"
The font fallback in dwm and dmenu is handled via libsl (i.e drw.c) and
it's a huge mess [0].

The way it works is also very inefficient (it calls XftFontMatch() for
every single "unknown" code-point). The `nomatches` cache is merely
there to stop the bleeding and is not really a proper fix.
"

This part specifically. The tone is very whiny. It doesn't help complaining
the code is a mess or improper without proposing a patch.

-- 
Kind regards,
Hiltjo



Re: [dev] [tabbed] utf8 characters not displayed in tabs

2022-08-30 Thread NRK
On Tue, Aug 30, 2022 at 04:50:14PM +0200, Hiltjo Posthuma wrote:
> Then write a patch to improve it instead of whining about it.

The topic was font fallback; I advised not to replicate drw, explained
my reasoning why, and then advised to look into ST instead.

I don't see any whining.

- NRK



Re: [dev] [tabbed] utf8 characters not displayed in tabs

2022-08-30 Thread Hiltjo Posthuma
On Tue, Aug 30, 2022 at 08:08:21PM +0600, NRK wrote:
> On Tue, Aug 30, 2022 at 11:42:33AM +0900, Pontus Stenetorp wrote:
> > My intuition is that it is related to font loading and fall backs, so
> > I am tempted to take a stab at it by comparing the font
> > loading/handling between tabbed, dwm, st, and sent to see if I can
> > spot any difference, as the latter three have no issues. If I am
> > correct, it should not be an awfully difficult patch.
> 
> The font fallback in dwm and dmenu is handled via libsl (i.e drw.c) and
> it's a huge mess [0].
> 
> The way it works is also very inefficient (it calls XftFontMatch() for
> every single "unknown" code-point). The `nomatches` cache is merely
> there to stop the bleeding and is not really a proper fix.
> 
> I wasn't very happy with the situation and had intended to rework the
> system to be something more simple, sensible and efficient. But I'm not
> sure if what I have planned is even possible with Xft/FontConfig or not.
> And quickly grepping through the function list wasn't that helpful. I'll
> probably have to allocate more time to read the docs properly.
> 
> But in any case I absolutely do NOT recommend trying to replicate drw's
> approach. Not sure what ST is doing, so that might be worth taking a
> look into instead.
> 
> [0]: https://git.suckless.org/libsl/file/drw.c.html#l251
> 
> - NRK
> 

Then write a patch to improve it instead of whining about it.

To be honest we all know the code is not perfect and never will be.

When rewriting it, also be mindful of people's preferences.

-- 
Kind regards,
Hiltjo



Re: [dev] [tabbed] utf8 characters not displayed in tabs

2022-08-30 Thread NRK
On Tue, Aug 30, 2022 at 11:42:33AM +0900, Pontus Stenetorp wrote:
> My intuition is that it is related to font loading and fall backs, so
> I am tempted to take a stab at it by comparing the font
> loading/handling between tabbed, dwm, st, and sent to see if I can
> spot any difference, as the latter three have no issues. If I am
> correct, it should not be an awfully difficult patch.

The font fallback in dwm and dmenu is handled via libsl (i.e drw.c) and
it's a huge mess [0].

The way it works is also very inefficient (it calls XftFontMatch() for
every single "unknown" code-point). The `nomatches` cache is merely
there to stop the bleeding and is not really a proper fix.

I wasn't very happy with the situation and had intended to rework the
system to be something more simple, sensible and efficient. But I'm not
sure if what I have planned is even possible with Xft/FontConfig or not.
And quickly grepping through the function list wasn't that helpful. I'll
probably have to allocate more time to read the docs properly.

But in any case I absolutely do NOT recommend trying to replicate drw's
approach. Not sure what ST is doing, so that might be worth taking a
look into instead.

[0]: https://git.suckless.org/libsl/file/drw.c.html#l251

- NRK



Re: [dev] [tabbed] utf8 characters not displayed in tabs

2022-08-29 Thread Pontus Stenetorp
On Mon 29 Aug 2022, Seb wrote:
> 
> I am using xterm under tabbed (v.0.6) and some of the Polish
> characters which are in the directory name are not displayed in the
> tabs and the rest of the name is truncated, for instance:
> 
> /home/seb/żółty is truncated to /home/seb/żó
> 
> when I set the Xresource:
> 
> *XTerm*utf8Title: True
> 
> then the whole name is displayed with all Polish characters but only
> after I change directory to a new location, so the name in the tab does
> not refer to the actual directory.

I only started using tabbed the other day, but I also have issues with Unicode 
(in my case Japanese showing up as “tofu”).

My intuition is that it is related to font loading and fall backs, so I am 
tempted to take a stab at it by comparing the font loading/handling between 
tabbed, dwm, st, and sent to see if I can spot any difference, as the latter 
three have no issues. If I am correct, it should not be an awfully difficult 
patch.



[dev] [tabbed] utf8 characters not displayed in tabs

2022-08-29 Thread Seb

hi,
I am using xterm under tabbed (v.0.6) and some of the Polish characters
which are in the directory name are not displayed in the tabs and the
rest of the name is truncated, for instance:

/home/seb/żółty is truncated to /home/seb/żó

when I set the Xresource:

*XTerm*utf8Title: True

then the whole name is displayed with all Polish characters but only
after I change directory to a new location, so the name in the tab does
not refer to the actual directory.