Hi,

On Sunday, 7 November 2021 13:08:38 CET Matthew Vernon wrote:
> On 24/12/2020 11:56, Diederik de Haas wrote:
> > On my system I have the 8/16/32 bit versions of the pcre2 library
> > installed.
> > The description only tells me that this is the 8-bit runtime version.
> > But I have no idea why I/anyone would want an 8-bit runtime on my 64-bit
> > machine, where I'd normally expect (only) a 64-bit version, which
> > apparently, doesn't exist.
> 
> The short answer is because you installed something that depends on the
> 8-bit runtime version.

I actually knew that, but I should've phrased my request more clearly.

$ aptitude why libpcre2-8-0
i   git Depends libpcre2-8-0 (>= 10.34)

Let's take the Linux kernel as an example, which of course uses git and has
contributors from all over the world, including those whose names won't always
fit in ANSI/UTF-8 chars. Take Japanese, for instance.
So if I want to query git's log (assuming it uses RE for that) for commits by
a Japanese person, it won't be able to find them because it's using the 8-bit
variant of libpcre2?

> The slightly longer answer is that the X-bit naming refers to the size
> of code points - so the 8-bit version takes strings composed of chars,
> representing single-byte characters, or UTF-8 strings. The 16 and 32
> libraries instead take strings contained in arrays of 16 or 32-bit code
> units (which again might be single-unit characters or UTF-16 or UTF-32
> strings).
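
To make the code-unit widths above concrete, here is a minimal sketch that
uses only Python's built-in codecs, not PCRE2 itself. The sample name and the
`code_units` helper are made up for illustration. It splits the same Japanese
string into 8-, 16- and 32-bit code units:

```python
# Illustration only: this uses Python's built-in codecs, not PCRE2.
# The name below is a hypothetical example, not a real contributor.
name = "山田太郎"


def code_units(text, encoding, width):
    """Split the encoded form of `text` into code units of `width` bytes."""
    data = text.encode(encoding)
    return [data[i:i + width] for i in range(0, len(data), width)]


utf8_units = code_units(name, "utf-8", 1)       # 8-bit code units
utf16_units = code_units(name, "utf-16-le", 2)  # 16-bit code units
utf32_units = code_units(name, "utf-32-le", 4)  # 32-bit code units

print(len(name), "characters")
print(len(utf8_units), "code units of 8 bits")
print(len(utf16_units), "code units of 16 bits")
print(len(utf32_units), "code units of 32 bits")
```

The same text can be represented in all three forms; only the number and
width of the units differ. Going by the explanation quoted above, the number
in the package name would then be the width of each unit the library reads,
not a limit on which characters can appear.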

My suspicion is that the choice between the 8-, 16- and 32-bit versions isn't
made as consciously as it should be. Until your reply I wouldn't have known
which one to pick if I wanted to package a program, and I would likely just
have copied what someone else had done (who may have followed the same
'procedure').

But if this information is added to the long description (plus possible
trade-offs*), then people can make a better-informed and thereby (more)
deliberate choice.

*) I don't know, but I can imagine that the 8-bit version is faster than the
16-bit one. The downside would be that it can't take strings of UTF-16/32
code units, which text like the Japanese above may be stored as. Depending on
the use case, that can be acceptable. Or not.

Cheers,
  Diederik
