Re: DRAFT Introduce CFFI-based Python bindings
On Wed, Nov 29 2017, Floris Bruynooghe wrote: > David Bremnerwrites: > >> Floris Bruynooghe writes: >> >>> >>> Lastly there are some downsides to the choices I made: >>> - I ended up going squarely for CPython 3.6+. Choosing Python >>> 3 allowed better API design, e.g. with keyword-only parameters >>> etc. Choosing CPython 3.4+ restricts the madness that can >>> happen with __del__ and gives some newer (tho now unused) >>> features in weakref.finalizer. >>> - This is no longer drop-in compatible. >>> - I haven't got to a stage where my initial goal of speed has >>> been proven yet. >> >> I guess you'll have to convince the maintainers / users of alot and afew >> that this makes sense before we go much further. I'd point out that >> Debian stable is only at python 3.5, so that makes me a bit wary of this >> (being able to run the test suite on debian stable and similar aged >> distros useful for me, and I suspect other developers). >> >> I know there are issues with memory management in the current bindings, >> so that may be a strong reason to push to python 3.6; it seems to need >> more investigation at the moment. > > So on earlier Python versions, sure this is possible at not too much > cost. > > - Python 3.4+ would just cost the use of some f-strings. Not major, was > just nice to use. > - Python <3.4 afaik would only need a tweak to the Database.tags and > Message.tags properties. I *think* swapping the caching of these > using a weakref should suffice and not break the brittle > Python-libnotmuch memory management. > Mind you I think Python 3.0-3.3 are pretty old and not much point in > supporting them. But this would also apply for 2.7 support. > - Python 2.7 is probably the worst, in that keyword-only arguments would > be gone. If python 2.7 is required I'd be much keener to have another > go at a drop-in replacement with the memory safety features and then > build the "notdb" API on top off it. But for that to be worth it > people need to be convinced enough that maintaining a CFFI version is > nicer than a ctypes version I guess. IMO Python 3.4+ would be OK, if python 2 support can be dropped. Even Ubuntu 14.04 has python 3.4. One notable distribution that has Python 3.3 by default is RHEL 7, but there seems to be quite a few packaged alternatives available... > > Kind regards, > Floris Tomi ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
Re: DRAFT Introduce CFFI-based Python bindings
Florian Klinkwrites: >>> I guess you'll have to convince the maintainers / users of alot and afew >>> that this makes sense before we go much further. I'd point out that >>> Debian stable is only at python 3.5, so that makes me a bit wary of this >>> (being able to run the test suite on debian stable and similar aged >>> distros useful for me, and I suspect other developers). >>> >>> I know there are issues with memory management in the current bindings, >>> so that may be a strong reason to push to python 3.6; it seems to need >>> more investigation at the moment. >> >>I am generally in favour of modernizing the notmuch python bindings, >>especially when it comes to memory management and exception handling. >> >>At the moment, the alot interface officially only supports python v2.7 >>but our dependencies have now mostly been updated and we are working on >>port to python 3, see here: https://github.com/pazz/alot/pull/1055 > > afew maintainer here ;-) > > I'm also very much in favor of a more modern and pythonic interface, and would > gladly support retiring python 2, moving to the new interface. > > I had a quick glimpse on the code, and would like to do some annotations. I > fear > it's a bit awkward to do this inside the huge patch, which might already have > changed, and send back via email. > Did you publish a changeset to github, or somewhere else where I could comment > on it? I now also pushed it to github.com/flub/notmuch in the now, it's in the cffi branch and all code is in bindings/notmuch-cffi. Not sure if this helps with commenting, I can make a pull-request to somewhere if you want - just let me know where. Cheers, Floris ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
Re: DRAFT Introduce CFFI-based Python bindings
I guess you'll have to convince the maintainers / users of alot and afew that this makes sense before we go much further. I'd point out that Debian stable is only at python 3.5, so that makes me a bit wary of this (being able to run the test suite on debian stable and similar aged distros useful for me, and I suspect other developers). I know there are issues with memory management in the current bindings, so that may be a strong reason to push to python 3.6; it seems to need more investigation at the moment. I am generally in favour of modernizing the notmuch python bindings, especially when it comes to memory management and exception handling. At the moment, the alot interface officially only supports python v2.7 but our dependencies have now mostly been updated and we are working on port to python 3, see here: https://github.com/pazz/alot/pull/1055 afew maintainer here ;-) I'm also very much in favor of a more modern and pythonic interface, and would gladly support retiring python 2, moving to the new interface. I had a quick glimpse on the code, and would like to do some annotations. I fear it's a bit awkward to do this inside the huge patch, which might already have changed, and send back via email. Did you publish a changeset to github, or somewhere else where I could comment on it? Cheers, Florian signature.asc Description: PGP signature ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
Re: DRAFT Introduce CFFI-based Python bindings
Patrick Totzkewrites: > Quoting David Bremner (2017-11-28 23:59:26) >> Floris Bruynooghe writes: >> >> > >> > Lastly there are some downsides to the choices I made: >> > - I ended up going squarely for CPython 3.6+. Choosing Python >> > 3 allowed better API design, e.g. with keyword-only parameters >> > etc. Choosing CPython 3.4+ restricts the madness that can >> > happen with __del__ and gives some newer (tho now unused) >> > features in weakref.finalizer. >> > - This is no longer drop-in compatible. >> > - I haven't got to a stage where my initial goal of speed has >> > been proven yet. >> >> I guess you'll have to convince the maintainers / users of alot and afew >> that this makes sense before we go much further. I'd point out that >> Debian stable is only at python 3.5, so that makes me a bit wary of this >> (being able to run the test suite on debian stable and similar aged >> distros useful for me, and I suspect other developers). >> >> I know there are issues with memory management in the current bindings, >> so that may be a strong reason to push to python 3.6; it seems to need >> more investigation at the moment. >> >> d > > > I am generally in favour of modernizing the notmuch python bindings, > especially when it comes to memory management and exception handling. > > At the moment, the alot interface officially only supports python v2.7 > but our dependencies have now mostly been updated and we are working on > port to python 3, see here: https://github.com/pazz/alot/pull/1055 > > @Floris, you are welcome to join #alot on freenode if you want to > discuss details on that. > > You mention that your new API breaks compatibility with the existing > ones. Do you have some demo code that uses the new API for reference? Short, untested, example which works with what's posted: db = notdb.Database.create() # or db = notdb.Database(path=None, mode=notdb.Database.MODE.READ_WRITE) print(db.path) -> pathlib.Path (a py34 dependency) if 'unread' in db.tags: # tags behaves like a set print('unread mail!') with db.atomic(): msg = db.add('/path/to/file') mdg = db.get('/path/to/file') msg = db.find('some-msgid') db.remove('path/to/other/file') # sorry, don't have a query interface yet assert 'unread' in msg.tags for tag in msg.tags: print(f'a tag: {tag}') with msg.frozen(): # Message.frozen() not yet implemented msg.tags.clear() # all set operations supported msg.tags.add('atag') msg.tags_to_flags() I imagine the query interface would be something like: with db.query('tag:atag') as query: print(f'results: {query.count}') for msg in query: print(msg.path) But to be honest I've been spending most time on getting the memory-safety figured (which I hope I finally did) so far and I think the tag handling is so far the nicest thing to show off. They're completely normal Python sets with no special behaviour at all (well, that's not true - there's the binary interface, see the code posted for this). Actually, this last point is kind of important and I failed to mention it before too. The existing Python bindings convert many bytes from libnotmuch to Python strings, that is unicode on Python 3. For many it uses b'bytes'.decode('utf-8', errors='ignore') which is a sane default if you want to display things. But if you need to round-trip a tag and store it again you might be changing the tag. I've not found the right way to handle binary data (e.g. also needed for messageid) everywhere yet but for tags I've gone with: for tag in iter(msg.tags): # iter() normally called implicitly by for loop print(f'All unicode this: {tag}') for tag in msg.tags.iter(encoding=None): other_msg.tags.add(tag) # This passes pure bytes around, loses nothing # What I've used for message ID for now is a "BinString" type print(f'All just unicode: {msg.messageid}') binary_msgid = bytes(msg.messageid) # No lossy conversion This BinString stuff is somewhat hacky, not sure how sane that is. The second iterator on tags feels somewhat cleaner. Likewise tags could be BinString as well instead of plain str. Cheers, Floris ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
Re: DRAFT Introduce CFFI-based Python bindings
David Bremnerwrites: > Floris Bruynooghe writes: > >> >> Lastly there are some downsides to the choices I made: >> - I ended up going squarely for CPython 3.6+. Choosing Python >> 3 allowed better API design, e.g. with keyword-only parameters >> etc. Choosing CPython 3.4+ restricts the madness that can >> happen with __del__ and gives some newer (tho now unused) >> features in weakref.finalizer. >> - This is no longer drop-in compatible. >> - I haven't got to a stage where my initial goal of speed has >> been proven yet. > > I guess you'll have to convince the maintainers / users of alot and afew > that this makes sense before we go much further. I'd point out that > Debian stable is only at python 3.5, so that makes me a bit wary of this > (being able to run the test suite on debian stable and similar aged > distros useful for me, and I suspect other developers). > > I know there are issues with memory management in the current bindings, > so that may be a strong reason to push to python 3.6; it seems to need > more investigation at the moment. So on earlier Python versions, sure this is possible at not too much cost. - Python 3.4+ would just cost the use of some f-strings. Not major, was just nice to use. - Python <3.4 afaik would only need a tweak to the Database.tags and Message.tags properties. I *think* swapping the caching of these using a weakref should suffice and not break the brittle Python-libnotmuch memory management. Mind you I think Python 3.0-3.3 are pretty old and not much point in supporting them. But this would also apply for 2.7 support. - Python 2.7 is probably the worst, in that keyword-only arguments would be gone. If python 2.7 is required I'd be much keener to have another go at a drop-in replacement with the memory safety features and then build the "notdb" API on top off it. But for that to be worth it people need to be convinced enough that maintaining a CFFI version is nicer than a ctypes version I guess. Kind regards, Floris ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
Re: DRAFT Introduce CFFI-based Python bindings
Quoting David Bremner (2017-11-28 23:59:26) > Floris Bruynooghewrites: > > > > > Lastly there are some downsides to the choices I made: > > - I ended up going squarely for CPython 3.6+. Choosing Python > > 3 allowed better API design, e.g. with keyword-only parameters > > etc. Choosing CPython 3.4+ restricts the madness that can > > happen with __del__ and gives some newer (tho now unused) > > features in weakref.finalizer. > > - This is no longer drop-in compatible. > > - I haven't got to a stage where my initial goal of speed has > > been proven yet. > > I guess you'll have to convince the maintainers / users of alot and afew > that this makes sense before we go much further. I'd point out that > Debian stable is only at python 3.5, so that makes me a bit wary of this > (being able to run the test suite on debian stable and similar aged > distros useful for me, and I suspect other developers). > > I know there are issues with memory management in the current bindings, > so that may be a strong reason to push to python 3.6; it seems to need > more investigation at the moment. > > d I am generally in favour of modernizing the notmuch python bindings, especially when it comes to memory management and exception handling. At the moment, the alot interface officially only supports python v2.7 but our dependencies have now mostly been updated and we are working on port to python 3, see here: https://github.com/pazz/alot/pull/1055 @Floris, you are welcome to join #alot on freenode if you want to discuss details on that. You mention that your new API breaks compatibility with the existing ones. Do you have some demo code that uses the new API for reference? Cheers, P signature.asc Description: signature ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
Re: DRAFT Introduce CFFI-based Python bindings
Floris Bruynooghewrites: > > Lastly there are some downsides to the choices I made: > - I ended up going squarely for CPython 3.6+. Choosing Python > 3 allowed better API design, e.g. with keyword-only parameters > etc. Choosing CPython 3.4+ restricts the madness that can > happen with __del__ and gives some newer (tho now unused) > features in weakref.finalizer. > - This is no longer drop-in compatible. > - I haven't got to a stage where my initial goal of speed has > been proven yet. I guess you'll have to convince the maintainers / users of alot and afew that this makes sense before we go much further. I'd point out that Debian stable is only at python 3.5, so that makes me a bit wary of this (being able to run the test suite on debian stable and similar aged distros useful for me, and I suspect other developers). I know there are issues with memory management in the current bindings, so that may be a strong reason to push to python 3.6; it seems to need more investigation at the moment. d ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
DRAFT Introduce CFFI-based Python bindings
Hi all, Here are the beginnings off CFFI-based Python bindings, rather than the ctypes-based ones currently available. I started this work in order to get faster bindings on pypy since a script of mine was running slower on pypy than CPython. Initially aiming for a drop-in replacement of the existing bindings I ended up abandoning this to help enforce correct usage of the API. The benefits of this approach are: - More "Pythonic" API, e.g. tags behave like sets, iterators which get consumed can easily be re-created as is usual with collections, avoid allowing invalid combinations of args and calls on a Python-API level. - CFFI, this works on both CPython and PyPy, on the latter it is (supposed to be) a lot faster as the JIT can cross the boundary between C and Python code where it otherwise has extra overheads to emulate the C-Python API. Additionally it makes it safer to use compared to ctypes, it works on the API level using the compiler to figure out the correct details of the platform. Compared to ctypes which only works on the ABI level and you need to rely on knowing the layout of code when writing the Python bindings. Additionaly I belive these bindings fix a memory safety issue, certain situations in my test-suite would lead to coredump which is not something which should be possible from within Python. I believe I have seen similar reports in the list archives so am not the only one seeing these. Sadly these are hard to isolate and I have not managed to re-create this in a nice minimal example, however I believe the root cause is that in some situations, mostly interpreter shutdown, the __del__ method can have been called while there are still references to the object and while child-objects are still alive. This effectively results in double-frees as the child object frees memory already freed by the parent. These bindings solve this by adding the .alive property and using this to check parent objects are still alive before destroying themselves. This is somewhat expensive, but works and is easy to implement. Lastly there are some downsides to the choices I made: - I ended up going squarely for CPython 3.6+. Choosing Python 3 allowed better API design, e.g. with keyword-only parameters etc. Choosing CPython 3.4+ restricts the madness that can happen with __del__ and gives some newer (tho now unused) features in weakref.finalizer. - This is no longer drop-in compatible. - I haven't got to a stage where my initial goal of speed has been proven yet. In theory I think it's possible to create a CFFI-based drop-in replacement to the bindings, only adding the memory-safety fixes and keeping the Python 2.7 compatibility. It would then be possible to build the API proposed in these bindings on top of this, but once I was making these bindings safer it felt strange to still allow the API to be misused. There are a lot of details about this which can be discussed, also many finer implementation points and even just getting the proposed API right (you'll notice large gaps for now). But this mail is already too long. I look forward to your comments and feedback on the approach taken and on whether some form of this could make it into the main repo. Lastly a small note on the AUTHORS file patch, due to my own unfortunate choice of employer I have strict rules to follow on how to submit patches. One of which is to add this line if an AUTHORS file exists. Given clearly not everyone is listed here though maybe this is not appropriate. I would also rather receive email on f...@devork.be rather than the address I have to use in the git commits. Kind Regards, Floris ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch