On Thu, May 15, 2025 at 07:20:48AM -0500, Eric Blake wrote:
> On Thu, May 15, 2025 at 01:26:16AM -0400, Nikolaos Chatzikonstantinou wrote:
> > > $ python
> > > Python 3.13.3 (main, Apr 22 2025, 00:00:00) [GCC 15.0.1 20250418 (Red Hat 15.0.1-0)] on linux
> > > Type "help", "copyright", "credits" or "license" for more information.
> > > >>> import pygnuregex
> > > >>> pygnuregex.compile(b"a")
> > > Segmentation fault (core dumped)
> > > (.venv)
> > 
> > Someone else on a forum helped me debug this. He noticed that the
> > Python pointer was different from the one received by the underlying C
> > function (by printing the Python pointer and inspecting the C pointer
> > with gdb). In fact it was truncated to 32 bits. We eventually
> > brainstormed that it was because I had neglected to add the argument
> > types for the C functions. I'm not sure why there is a difference in
> > Python 3.10 (where I verified the crash in a VM) versus Python 3.11.
> 
> I was testing with Python 3.13, not 3.10, but I can confirm that your
> latest fix finally solves it.
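
For anyone finding this thread later, here is a rough sketch of the
kind of declaration that avoids the truncation described above.  It
uses libc's malloc/strlen/free as stand-ins; it is NOT the actual
pygnuregex code, and it assumes a POSIX system where ctypes can load
the C library.  Without argtypes, ctypes passes a Python int as a C
int, and without restype it treats the return value as a C int, so a
64-bit pointer can lose its upper half either way.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system; fall back to the running process's own
# symbols if find_library() cannot locate the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# Declare pointer-sized arguments and return values explicitly.
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p    # default restype is c_int
libc.strlen.argtypes = [ctypes.c_void_p]
libc.strlen.restype = ctypes.c_size_t
libc.free.argtypes = [ctypes.c_void_p]   # else a Python int is passed as C int
libc.free.restype = None

buf = libc.malloc(16)                    # full-width address as a Python int
ctypes.memmove(buf, b"m4 regex\0", 9)    # copy 8 characters plus the NUL
length = libc.strlen(buf)
libc.free(buf)
```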

I can't help but wonder: Do you NEED to call into the C functions, or
would it be possible to write pure Python code that translates any m4
(emacs-style) regex into a similar Python regex?  For example, most
characters translate straight over, ^ and $ in anchor positions
translate to \A and \Z (better \z, but that is only available in
Python 3.14 and newer), \( \| \) in m4 (outside of []) translate to (
| ) in Python while bare ( | ) in m4 translate to \( \| \), and so
forth.  A quick Google search found
https://www.regexbuddy.com/convert.html as a non-free resource; but
there may be other sites that can summarize how to translate between
flavors without needing foreign function interfacing.
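
To make the idea concrete, here is a minimal, untested-against-m4
sketch of such a translator.  The function name m4_to_python_regex is
made up for illustration, and it covers only a subset: grouping and
alternation, anchors, \` \' \< \>, backreferences, and simple bracket
expressions.  POSIX classes such as [[:alpha:]] and other corner cases
are not handled, and it emits \Z rather than \z so the result works on
older Python versions.

```python
import re


def m4_to_python_regex(pattern: str) -> str:
    """Translate a subset of m4 (emacs-style) regex syntax to Python's re."""
    out = []
    i, n = 0, len(pattern)
    at_start = True  # True at pattern start and right after \( or \|
    while i < n:
        c = pattern[i]
        opens_branch = False
        if c == "\\" and i + 1 < n:
            d = pattern[i + 1]
            if d in "(|)":                 # \( \| \) are operators in m4
                out.append(d)
                opens_branch = d != ")"
            elif d == "`":                 # start of string
                out.append(r"\A")
            elif d == "'":                 # end of string
                out.append(r"\Z")
            elif d == "<":                 # start of word
                out.append(r"\b(?=\w)")
            elif d == ">":                 # end of word
                out.append(r"\b(?<=\w)")
            elif d in "wWbB123456789":     # same meaning in both flavors
                out.append("\\" + d)
            else:                          # any other escaped char is literal
                out.append(re.escape(d))
            i += 2
        elif c == "[":
            j = i + 1
            negate = j < n and pattern[j] == "^"
            if negate:
                j += 1
            start = j
            if j < n and pattern[j] == "]":  # a leading ] is a literal
                j += 1
            while j < n and pattern[j] != "]":
                j += 1
            if j >= n:
                raise ValueError("unterminated bracket expression")
            # Backslash is literal inside [] in m4 but not in Python.
            body = "".join(
                "\\" + ch if ch in "\\[]" else ch for ch in pattern[start:j]
            )
            out.append("[" + ("^" if negate else "") + body + "]")
            i = j + 1
        else:
            closes = i == n - 1 or pattern[i + 1:i + 3] in (r"\)", r"\|")
            if c == "^" and at_start:
                out.append(r"\A")
            elif c == "$" and closes:
                out.append(r"\Z")
            elif c == "*" and at_start:    # leading * is literal in m4
                out.append(r"\*")
            elif c in "^$(){}|\\":         # literal in m4, special in Python
                out.append("\\" + c)
            else:
                out.append(c)
            i += 1
        at_start = opens_branch
    return "".join(out)
```

With this sketch, m4_to_python_regex(r"^\(foo\|bar\)(x)$") gives
\A(foo|bar)\(x\)\Z, which can be handed straight to re.compile().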

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

