On Thu, May 15, 2025, 6:06 PM Eric Blake <ebl...@redhat.com> wrote:

> On Thu, May 15, 2025 at 07:20:48AM -0500, Eric Blake wrote:
> > On Thu, May 15, 2025 at 01:26:16AM -0400, Nikolaos Chatzikonstantinou wrote:
> > > > $ python
> > > > Python 3.13.3 (main, Apr 22 2025, 00:00:00) [GCC 15.0.1 20250418 (Red Hat 15.0.1-0)] on linux
> > > > Type "help", "copyright", "credits" or "license" for more information.
> > > > >>> import pygnuregex
> > > > >>> pygnuregex.compile(b"a")
> > > > Segmentation fault (core dumped)
> > > > (.venv)
> > >
> > > Someone else on a forum helped me debug this. He noticed that the
> > > Python pointer was different from the one received by the underlying C
> > > function (by printing the Python pointer and inspecting the C pointer
> > > with gdb). In fact it was truncated to 32 bits. We eventually
> > > brainstormed that it was because I had neglected to add the argument
> > > types for the C functions. I'm not sure why there is a difference in
> > > Python 3.10 (where I verified the crash in a VM) versus Python 3.11.
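
For reference, a minimal sketch of the kind of declaration that was missing. It uses the C library's malloc, free and strlen rather than the actual pygnuregex bindings, and it assumes a POSIX system where ctypes.CDLL(None) exposes libc symbols. The helper name roundtrip is hypothetical.

```python
import ctypes

# Assumption: POSIX, where CDLL(None) gives access to the C library's symbols.
libc = ctypes.CDLL(None)

# Without argtypes/restype, ctypes passes Python integers as C int and assumes
# a C int return value, so a 64-bit pointer can lose its upper 32 bits.
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None
libc.strlen.argtypes = [ctypes.c_void_p]
libc.strlen.restype = ctypes.c_size_t


def roundtrip(data: bytes) -> int:
    """Copy data into malloc'd memory and return strlen() of the copy."""
    size = len(data) + 1
    buf = libc.malloc(size)  # full-width address (or None) thanks to restype
    if not buf:
        raise MemoryError("malloc failed")
    try:
        ctypes.memmove(buf, data + b"\0", size)
        return libc.strlen(buf)  # pointer passed intact thanks to argtypes
    finally:
        libc.free(buf)


print(roundtrip(b"hello"))
```

With the declarations removed, the address returned by malloc may be truncated before it ever reaches strlen or free, which matches the symptom seen in gdb.
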
> >
> > I was testing with Python 3.13, not 3.10, but I can confirm that your
> > latest fix finally solves it.
>
> I can't help but wonder: Do you NEED to call into the C functions, or
> would it be possible to write pure python code that translates any m4
> (emacs-style) regex into a similar Python regex?  For example, most
> characters translate straight over, ^ and $ in anchor positions
> translate to \A and \Z (better \z, but that is only available in
> Python 3.13 and newer), \( \| \) in m4 (outside of []) translate to (
> | ) in Python while bare ( | ) in m4 translate to \( \| \), and so
> forth.  A quick google search found
> https://www.regexbuddy.com/convert.html as a non-free resource; but
> there may be other sites that can summarize how to translate between
> flavors without needing foreign function interfacing.
>

At the time, calling into C seemed to be the simplest solution, and it also
contributes a package to the Python ecosystem. I will add your suggestion to
the features page for now, but a translator would be one more source of edge
cases to worry about, which is why I didn't elect to do it that way. I'll
take a closer look soon.
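
To give an idea of the scope, here is a rough sketch of such a translator. It covers only the rules listed above (swapping \( \| \) with bare ( | ), and ^ and $ in anchor positions), plus bracket expressions copied through. The name m4_to_python is hypothetical and not part of pygnuregex. It also assumes that a backslash is an ordinary character inside an m4 bracket expression, and it ignores things like \< \>, back-references and POSIX classes such as [[:alpha:]].

```python
import re


def m4_to_python(pattern: str) -> str:
    """Translate a small subset of m4 (emacs-style) regex syntax to Python's."""
    out = []
    i, n = 0, len(pattern)
    at_start = True  # True at the pattern start and right after \( or \|
    while i < n:
        c = pattern[i]
        rest = pattern[i + 1:]
        step = 1
        next_at_start = False
        if c == "\\" and rest:
            if rest[0] in "(|)":
                # \( \| \) are operators in m4; Python spells them bare.
                out.append(rest[0])
                next_at_start = rest[0] != ")"
            else:
                # Other escapes are passed through unchanged (not translated).
                out.append(c + rest[0])
            step = 2
        elif c == "[":
            # Find the closing ]; a ] right after [ or [^ is a literal.
            j = i + 1
            if pattern[j:j + 1] == "^":
                j += 1
            if pattern[j:j + 1] == "]":
                j += 1
            j = pattern.find("]", j)
            if j == -1:
                out.append(r"\[")  # unterminated: treat [ as a literal
            else:
                # Assumption: backslash is literal inside m4 brackets.
                out.append(pattern[i:j + 1].replace("\\", "\\\\"))
                step = j + 1 - i
        elif c == "^":
            out.append(r"\A" if at_start else r"\^")
        elif c == "$":
            at_end = rest == "" or rest.startswith(("\\)", "\\|"))
            out.append(r"\Z" if at_end else r"\$")
        elif c in "(|)\\":
            # Bare ( | ) are literals in m4; so is a trailing lone backslash.
            out.append("\\" + c)
        else:
            out.append(c)
        at_start = next_at_start
        i += step
    return "".join(out)


translated = m4_to_python(r"\(ab\|cd\)")
print(translated, bool(re.fullmatch(translated, "cd")))
```

Even this subset needs context tracking for the anchors and special handling for brackets, which is the sort of edge-case burden I mean.
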

Regards,
Nikolaos Chatzikonstantinou
