On Thu, May 15, 2025 at 07:20:48AM -0500, Eric Blake wrote:
> On Thu, May 15, 2025 at 01:26:16AM -0400, Nikolaos Chatzikonstantinou wrote:
> > > $ python
> > > Python 3.13.3 (main, Apr 22 2025, 00:00:00) [GCC 15.0.1 20250418 (Red Hat
> > > 15.0.1-0)] on linux
> > > Type "help", "copyright", "credits" or "license" for more information.
> > > >>> import pygnuregex
> > > >>> pygnuregex.compile(b"a")
> > > Segmentation fault (core dumped)
> > > (.venv)
> >
> > Someone else on a forum helped me debug this. He noticed that the
> > Python pointer was different from the one received by the underlying C
> > function (by printing the Python pointer and inspecting the C pointer
> > with gdb). In fact it was truncated to 32 bits. We eventually
> > brainstormed that it was because I had neglected to add the argument
> > types for the C functions. I'm not sure why there is a difference in
> > Python 3.10 (where I verified the crash in a VM) versus Python 3.11.
>
> I was testing with Python 3.13, not 3.10, but I can confirm that your
> latest fix finally solves it.

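For the archives, the truncation described above is what ctypes does by
default: a Python integer passed without declared argument types is
converted to a C int, and an undeclared return value is also treated as
a C int, so a 64-bit pointer can lose its upper half. A minimal sketch of
the kind of declaration that avoids it follows. The actual pygnuregex
bindings are not shown in this thread, so libc's malloc, strlen and free
stand in for them here; that substitution is an assumption made purely
for illustration, and the snippet assumes a POSIX system.

```python
import ctypes
import ctypes.util

# Load the C library. find_library() may return None; on POSIX systems
# CDLL(None) then exposes the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare pointer-sized arguments and return values explicitly.
# Without these lines ctypes would treat them as C int.
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.strlen.argtypes = [ctypes.c_void_p]
libc.strlen.restype = ctypes.c_size_t
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None

buf = libc.malloc(16)            # a Python int holding the full address
if buf is None:                  # a c_void_p restype maps NULL to None
    raise MemoryError("malloc failed")

ctypes.memmove(buf, b"hello\0", 6)   # copy a NUL-terminated string in
length = libc.strlen(buf)            # the pointer is passed at full width
libc.free(buf)
print(length)
```

The same pattern (set argtypes and restype once, right after loading the
library) should apply to any C function that takes or returns a pointer.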
I can't help but wonder: do you NEED to call into the C functions, or
would it be possible to write pure Python code that translates any m4
(emacs-style) regex into an equivalent Python regex? For example, most
characters translate straight over; ^ and $ in anchor positions
translate to \A and \Z (\z would be better, but it is only available in
Python 3.14 and newer); \( \| \) in m4 (outside of []) translate to
( | ) in Python, while bare ( | ) in m4 translate to \( \| \); and so
forth.

A quick Google search found https://www.regexbuddy.com/convert.html as
a non-free resource, but there may be other sites that summarize how to
translate between flavors without needing a foreign function interface.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
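
To make the translation idea concrete, here is a rough sketch that
covers only the rules named above: \( \| \) versus bare ( | ), and ^ and
$ in anchor positions. The function name m4_to_python_regex is
hypothetical. It also rests on several assumptions that have not been
checked against m4 itself: that bare { and } are literal, that a
backslash inside [] is literal, and that every other backslash sequence
means the same in both flavors (which is not true for all emacs
escapes). POSIX classes such as [[:alpha:]] are not handled.

```python
import re


def m4_to_python_regex(pattern: str) -> str:
    """Translate a small subset of m4 (emacs-style) regex syntax."""
    out = []
    i, n = 0, len(pattern)
    at_start = True  # True where a following ^ would act as an anchor
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            nxt = pattern[i + 1]
            if nxt in "(|)":
                out.append(nxt)        # \( \| \) are the operators in m4
                at_start = nxt in "(|"
            else:
                out.append(ch + nxt)   # assumption: same meaning in both
                at_start = False
            i += 2
            continue
        if ch == "[":
            # Copy a bracket expression, doubling backslashes because
            # they are assumed literal in m4 but are escapes in Python.
            j = i + 1
            if j < n and pattern[j] == "^":
                j += 1
            if j < n and pattern[j] == "]":
                j += 1                 # a leading ] is a literal
            while j < n and pattern[j] != "]":
                j += 1
            out.append(pattern[i:j + 1].replace("\\", "\\\\"))
            i = j + 1
        elif ch == "^":
            out.append(r"\A" if at_start else r"\^")
            i += 1
        elif ch == "$":
            # Anchor only at the end or right before \) or \|
            at_end = pattern[i + 1:i + 3] in ("", r"\)", r"\|")
            out.append(r"\Z" if at_end else r"\$")
            i += 1
        elif ch in "(|){}":
            out.append("\\" + ch)      # bare forms are literals in m4
            i += 1
        elif ch == "\\":
            out.append(re.escape(ch))  # lone trailing backslash
            i += 1
        else:
            out.append(ch)             # . * + ? and ordinary characters
            i += 1
        at_start = False
    return "".join(out)


translated = m4_to_python_regex(r"^a\(b\|c\)$")
print(translated)                      # \Aa(b|c)\Z
print(bool(re.search(translated, "ac")))
```

A real converter would need a full table of the emacs escapes, so this
is best read as a starting point for experiments rather than a drop-in
replacement for the C bindings.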