Good afternoon,

There's something I'd like to use pcre2 for, but I'm not sure whether it's possible (or, if it is, how to get there).

Given a valid character class string (e.g. "[a-z0-9_?$]" in a very simple case), I'd like to get back some representation which describes all of the characters included in the class. A bit mask would be fine, but a list of code point ranges would do as well.
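To make the two shapes concrete, here is a minimal sketch in plain C of what I have in mind. It does not use pcre2 at all: every type and function name below is hypothetical, the bitmap covers only 8-bit character values, and the ranges for the example class are written out by hand rather than derived from the class string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type: one bit per 8-bit character value (256 bits). */
typedef struct {
    uint32_t bits[8];
} class_bitmap;

/* Hypothetical type: an inclusive range of code points. */
typedef struct {
    uint32_t first;
    uint32_t last;
} cp_range;

/* Set the bit for every value in [first, last]; values above 255 are ignored. */
static void class_bitmap_add_range(class_bitmap *bm, uint32_t first, uint32_t last)
{
    for (uint32_t c = first; c <= last && c < 256; c++)
        bm->bits[c >> 5] |= (uint32_t)1 << (c & 31);
}

/* Constant-time membership test against the bitmap. */
static bool class_bitmap_test(const class_bitmap *bm, uint32_t c)
{
    return c < 256 && ((bm->bits[c >> 5] >> (c & 31)) & 1u) != 0;
}

/* Membership test against a list of ranges (simple linear scan). */
static bool cp_ranges_test(const cp_range *ranges, size_t count, uint32_t c)
{
    for (size_t i = 0; i < count; i++)
        if (c >= ranges[i].first && c <= ranges[i].last)
            return true;
    return false;
}

/* The ranges that "[a-z0-9_?$]" would expand to, written out by hand. */
static const cp_range example_ranges[] = {
    { '$', '$' }, { '0', '9' }, { '?', '?' }, { '_', '_' }, { 'a', 'z' },
};
#define EXAMPLE_RANGE_COUNT (sizeof example_ranges / sizeof example_ranges[0])

/* Fill a bitmap from the hand-written range list above. */
static void build_example_bitmap(class_bitmap *bm)
{
    memset(bm, 0, sizeof *bm);
    for (size_t i = 0; i < EXAMPLE_RANGE_COUNT; i++)
        class_bitmap_add_range(bm, example_ranges[i].first, example_ranges[i].last);
}

/* Usage: both representations should agree on membership. */
static void example_self_check(void)
{
    class_bitmap bm;
    build_example_bitmap(&bm);
    assert(class_bitmap_test(&bm, 'k'));
    assert(!class_bitmap_test(&bm, 'K'));
    assert(cp_ranges_test(example_ranges, EXAMPLE_RANGE_COUNT, '9'));
}
```

The bitmap gives a constant-time test for 8-bit characters, while the range list extends to arbitrary code points at the cost of a scan (or a binary search, if the list is kept sorted).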

My use case is that I need to rapidly test whether a given character matches a user-specified character class. I know I can do this by compiling a pattern and then attempting to match, but that's a little "heavy" for my use case.

I haven't looked at how character class matching works, but I *assume* that some sort of representation of the class is compiled that allows rapid testing. So a way to expose that, or to convert it into a bitmask or a list of code point ranges, would be ideal.

Is this currently possible, or could it be? (I'll be happy to write this up in Bugzilla if you think it's feasible, but I'm looking for a sense of whether it's doable.)

Thanks for any advice,

R.

--
Rich Siegel                                 Bare Bones Software, Inc.
<sie...@barebones.com>                      <https://www.barebones.com/>

Someday I'll look back on all this and laugh... until they sedate me.

--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
