[PATCH 1/2] python/notmuch2: do not destroy messages owned by a query
From: Anton Khirnov Any messages retrieved from a query - either directly via search_messages() or indirectly via thread objects - are owned by that query. Retrieving the same message (i.e. corresponding to the same message ID / database object) several times will always yield the same C object. The caller is allowed to destroy message objects owned by a query before the query itself - which can save memory for long-lived queries. However, that message must then never be retrieved again from that query. The python-notmuch2 bindings will currently destroy every message object in Message._destroy(), which will lead to an invalid free if the same message is then retrieved again. E.g. the following python program leads to libtalloc abort()ing: import notmuch2 db = notmuch2.Database(mode = notmuch2.Database.MODE.READ_ONLY) t= next(db.threads('*')) msgs = list(zip(t.toplevel(), t.toplevel())) msgs = list(zip(t.toplevel(), t.toplevel())) Fix this issue by creating a subclass of Message, which is used for "standalone" message which have to be freed by the caller. Message class is then used only for messages descended from a query, which do not need to be freed by the caller. --- bindings/python-cffi/notmuch2/_database.py | 6 ++-- bindings/python-cffi/notmuch2/_message.py | 42 ++ 2 files changed, 30 insertions(+), 18 deletions(-) diff --git a/bindings/python-cffi/notmuch2/_database.py b/bindings/python-cffi/notmuch2/_database.py index 95f59ca0..f14eac78 100644 --- a/bindings/python-cffi/notmuch2/_database.py +++ b/bindings/python-cffi/notmuch2/_database.py @@ -399,7 +399,7 @@ class Database(base.NotmuchObject): capi.lib.NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID] if ret not in ok: raise errors.NotmuchError(ret) -msg = message.Message(self, msg_pp[0], db=self) +msg = message.StandaloneMessage(self, msg_pp[0], db=self) if sync_flags: msg.tags.from_maildir_flags() return self.AddedMessage( @@ -468,7 +468,7 @@ class Database(base.NotmuchObject): msg_p = msg_pp[0] if msg_p == capi.ffi.NULL: raise LookupError -msg = message.Message(self, msg_p, db=self) +msg = message.StandaloneMessage(self, msg_p, db=self) return msg def get(self, filename): @@ -501,7 +501,7 @@ class Database(base.NotmuchObject): msg_p = msg_pp[0] if msg_p == capi.ffi.NULL: raise LookupError -msg = message.Message(self, msg_p, db=self) +msg = message.StandaloneMessage(self, msg_p, db=self) return msg @property diff --git a/bindings/python-cffi/notmuch2/_message.py b/bindings/python-cffi/notmuch2/_message.py index c5fdbf6d..416ce7ca 100644 --- a/bindings/python-cffi/notmuch2/_message.py +++ b/bindings/python-cffi/notmuch2/_message.py @@ -14,7 +14,7 @@ __all__ = ['Message'] class Message(base.NotmuchObject): -"""An email message stored in the notmuch database. +"""An email message stored in the notmuch database retrieved via a query. This should not be directly created, instead it will be returned by calling methods on :class:`Database`. A message keeps a @@ -61,22 +61,10 @@ class Message(base.NotmuchObject): @property def alive(self): -if not self._parent.alive: -return False -try: -self._msg_p -except errors.ObjectDestroyedError: -return False -else: -return True - -def __del__(self): -self._destroy() +return self._parent.alive def _destroy(self): -if self.alive: -capi.lib.notmuch_message_destroy(self._msg_p) -self._msg_p = None +pass @property def messageid(self): @@ -375,6 +363,30 @@ class Message(base.NotmuchObject): if isinstance(other, self.__class__): return self.messageid == other.messageid +class StandaloneMessage(Message): +"""An email message stored in the notmuch database. + +This subclass of Message is used for messages that are retrieved from the +database directly and are not owned by a query. +""" +@property +def alive(self): +if not self._parent.alive: +return False +try: +self._msg_p +except errors.ObjectDestroyedError: +return False +else: +return True + +def __del__(self): +self._destroy() + +def _destroy(self): +if self.alive: +capi.lib.notmuch_message_destroy(self._msg_p) +self._msg_p = None class FilenamesIter(base.NotmuchIter): """Iterator for binary filenames objects.""" -- 2.27.0 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
Re: [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query
Hi Anton, On Thu, May 21 2020, Anton Khirnov wrote: > ping Great (an important) stuff in this series ! ...just that... Floris did good review on your changes the next day. What I remember was there (looked 30 mins ago) - style: somewhat important (I press more there). Easy to get better, good and consistent style makes it better - docs in code: like Floris mentioned, the docs in commit messages are are not seen by user. The relevant documentation is to be with the code. commit message tells what was done and why - tests: since pytest is used to test the current notmuch2 cffi python tests, all new code that can be tested pytest code to do so is to be added And these have to be done now -- postponing such a things into future will just mean that it never happens, and the messines and entropy of the notmuch code gets even greater in such a case... > > -- > Anton Khirnov Tomi ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
Re: [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query
Hi Anton, Thanks for improving the bindings! Any my apologies for the late response, I failed to spot this mail the first time round. Also, this is a pretty serious bug, thanks for finding it. This looks pretty solid, a few small style comments that aren't very important notwithstanding. Though I do think you should be able to add a test for this in the pytest tests. I think just triggering the use-after-free you demonstrate in the commit message is fine as it will work with your patch. The fact it will crash rather then fail without is unfortunate but probably fine. On Sat 09 May 2020 at 07:05 +0200, Anton Khirnov wrote: > Any messages retrieved from a query - either directly via > search_messages() or indirectly via thread objects - are owned by that > query. Retrieving the same message (i.e. corresponding to the same > message ID / database object) several times will always yield the same > C object. > > The caller is allowed to destroy message objects owned by a query before > the query itself - which can save memory for long-lived queries. > However, that message must then never be retrieved again from that > query. > > The python-notmuch2 bindings will currently destroy every message object > in Message._destroy(), which will lead to an invalid free if the same > message is then retrieved again. E.g. the following python program leads > to libtalloc abort()ing: > > import notmuch2 > db = notmuch2.Database(mode = notmuch2.Database.MODE.READ_ONLY) > t= next(db.threads('*')) > msgs = list(zip(t.toplevel(), t.toplevel())) > msgs = list(zip(t.toplevel(), t.toplevel())) > > Fix this issue by creating a subclass of Message, which is used for > "standalone" message which have to be freed by the caller. Message class > is then used only for messages descended from a query, which do not need > to be freed by the caller. > --- > bindings/python-cffi/notmuch2/_database.py | 6 ++-- > bindings/python-cffi/notmuch2/_message.py | 42 ++ > 2 files changed, 30 insertions(+), 18 deletions(-) > > diff --git a/bindings/python-cffi/notmuch2/_database.py > b/bindings/python-cffi/notmuch2/_database.py > index 95f59ca0..f14eac78 100644 > --- a/bindings/python-cffi/notmuch2/_database.py > +++ b/bindings/python-cffi/notmuch2/_database.py > @@ -399,7 +399,7 @@ class Database(base.NotmuchObject): >capi.lib.NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID] > if ret not in ok: > raise errors.NotmuchError(ret) > -msg = message.Message(self, msg_pp[0], db=self) > +msg = message.StandaloneMessage(self, msg_pp[0], db=self) > if sync_flags: > msg.tags.from_maildir_flags() > return self.AddedMessage( > @@ -468,7 +468,7 @@ class Database(base.NotmuchObject): > msg_p = msg_pp[0] > if msg_p == capi.ffi.NULL: > raise LookupError > -msg = message.Message(self, msg_p, db=self) > +msg = message.StandaloneMessage(self, msg_p, db=self) > return msg > > def get(self, filename): > @@ -501,7 +501,7 @@ class Database(base.NotmuchObject): > msg_p = msg_pp[0] > if msg_p == capi.ffi.NULL: > raise LookupError > -msg = message.Message(self, msg_p, db=self) > +msg = message.StandaloneMessage(self, msg_p, db=self) > return msg > > @property > diff --git a/bindings/python-cffi/notmuch2/_message.py > b/bindings/python-cffi/notmuch2/_message.py > index c5fdbf6d..416ce7ca 100644 > --- a/bindings/python-cffi/notmuch2/_message.py > +++ b/bindings/python-cffi/notmuch2/_message.py > @@ -14,7 +14,7 @@ __all__ = ['Message'] > > > class Message(base.NotmuchObject): > -"""An email message stored in the notmuch database. > +"""An email message stored in the notmuch database retrieved via a query. > > This should not be directly created, instead it will be returned > by calling methods on :class:`Database`. A message keeps a > @@ -61,22 +61,10 @@ class Message(base.NotmuchObject): > > @property > def alive(self): > -if not self._parent.alive: > -return False > -try: > -self._msg_p > -except errors.ObjectDestroyedError: > -return False > -else: > -return True > - > -def __del__(self): > -self._destroy() > +return self._parent.alive > > def _destroy(self): > -if self.alive: > -capi.lib.notmuch_message_destroy(self._msg_p) > -self._msg_p = None > +pass I feel like some more description from the commit message could live here to explain why this isn't destroying the message rather than having to find it in the commit message. > > @property > def messageid(self): > @@ -375,6 +363,30 @@ class Message(base.NotmuchObject): > if isinstance(other, self.__class__): > return self.messageid == other.messageid > > +class
Re: [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query
Anton Khirnov writes: > ping I'm hoping Floris will find a chance to review your patches. d ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
Re: [PATCH 1/2] python/notmuch2: do not destroy messages owned by a query
ping -- Anton Khirnov ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH 1/2] python/notmuch2: do not destroy messages owned by a query
Any messages retrieved from a query - either directly via search_messages() or indirectly via thread objects - are owned by that query. Retrieving the same message (i.e. corresponding to the same message ID / database object) several times will always yield the same C object. The caller is allowed to destroy message objects owned by a query before the query itself - which can save memory for long-lived queries. However, that message must then never be retrieved again from that query. The python-notmuch2 bindings will currently destroy every message object in Message._destroy(), which will lead to an invalid free if the same message is then retrieved again. E.g. the following python program leads to libtalloc abort()ing: import notmuch2 db = notmuch2.Database(mode = notmuch2.Database.MODE.READ_ONLY) t= next(db.threads('*')) msgs = list(zip(t.toplevel(), t.toplevel())) msgs = list(zip(t.toplevel(), t.toplevel())) Fix this issue by creating a subclass of Message, which is used for "standalone" message which have to be freed by the caller. Message class is then used only for messages descended from a query, which do not need to be freed by the caller. --- bindings/python-cffi/notmuch2/_database.py | 6 ++-- bindings/python-cffi/notmuch2/_message.py | 42 ++ 2 files changed, 30 insertions(+), 18 deletions(-) diff --git a/bindings/python-cffi/notmuch2/_database.py b/bindings/python-cffi/notmuch2/_database.py index 95f59ca0..f14eac78 100644 --- a/bindings/python-cffi/notmuch2/_database.py +++ b/bindings/python-cffi/notmuch2/_database.py @@ -399,7 +399,7 @@ class Database(base.NotmuchObject): capi.lib.NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID] if ret not in ok: raise errors.NotmuchError(ret) -msg = message.Message(self, msg_pp[0], db=self) +msg = message.StandaloneMessage(self, msg_pp[0], db=self) if sync_flags: msg.tags.from_maildir_flags() return self.AddedMessage( @@ -468,7 +468,7 @@ class Database(base.NotmuchObject): msg_p = msg_pp[0] if msg_p == capi.ffi.NULL: raise LookupError -msg = message.Message(self, msg_p, db=self) +msg = message.StandaloneMessage(self, msg_p, db=self) return msg def get(self, filename): @@ -501,7 +501,7 @@ class Database(base.NotmuchObject): msg_p = msg_pp[0] if msg_p == capi.ffi.NULL: raise LookupError -msg = message.Message(self, msg_p, db=self) +msg = message.StandaloneMessage(self, msg_p, db=self) return msg @property diff --git a/bindings/python-cffi/notmuch2/_message.py b/bindings/python-cffi/notmuch2/_message.py index c5fdbf6d..416ce7ca 100644 --- a/bindings/python-cffi/notmuch2/_message.py +++ b/bindings/python-cffi/notmuch2/_message.py @@ -14,7 +14,7 @@ __all__ = ['Message'] class Message(base.NotmuchObject): -"""An email message stored in the notmuch database. +"""An email message stored in the notmuch database retrieved via a query. This should not be directly created, instead it will be returned by calling methods on :class:`Database`. A message keeps a @@ -61,22 +61,10 @@ class Message(base.NotmuchObject): @property def alive(self): -if not self._parent.alive: -return False -try: -self._msg_p -except errors.ObjectDestroyedError: -return False -else: -return True - -def __del__(self): -self._destroy() +return self._parent.alive def _destroy(self): -if self.alive: -capi.lib.notmuch_message_destroy(self._msg_p) -self._msg_p = None +pass @property def messageid(self): @@ -375,6 +363,30 @@ class Message(base.NotmuchObject): if isinstance(other, self.__class__): return self.messageid == other.messageid +class StandaloneMessage(Message): +"""An email message stored in the notmuch database. + +This subclass of Message is used for messages that are retrieved from the +database directly and are not owned by a query. +""" +@property +def alive(self): +if not self._parent.alive: +return False +try: +self._msg_p +except errors.ObjectDestroyedError: +return False +else: +return True + +def __del__(self): +self._destroy() + +def _destroy(self): +if self.alive: +capi.lib.notmuch_message_destroy(self._msg_p) +self._msg_p = None class FilenamesIter(base.NotmuchIter): """Iterator for binary filenames objects.""" -- 2.20.1 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch