Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected
Le 19/10/2012 06:55, Ethan Glasser-Camp a écrit : Adrien Bustany writes: This makes notmuch appropriately free the underlying notmuch C objects when garbage collecting their Go wrappers. To make sure we don't break the underlying links between objects (for example, a notmuch_messages_t being GC'ed before a notmuch_message_t belonging to it), we add for each wraper struct a pointer to the owner object (Go objects with a reference pointing to them don't get garbage collected). Hi Adrien! This whole series is marked moreinfo, but I don't think that's just. It looks like there were some unresolved issues about reference tracking and garbage collection, and some suggestions to use the C values of enums instead of regenerating them with iota, but there's definitely valid code that I assume would be useful if anyone ever wanted to write in Go ;). Are you figuring to clean this series up? This comment should s/wraper/wrapper/. Ethan Hello Ethan, thanks for the heads up, I still have this on my table, and yes there is additional work to do for the patches to be really clean. I can't give an estimate for now, let's hope sooner than later :/ Cheers Adrien ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected
Adrien Bustany writes: > This makes notmuch appropriately free the underlying notmuch C objects > when garbage collecting their Go wrappers. To make sure we don't break > the underlying links between objects (for example, a notmuch_messages_t > being GC'ed before a notmuch_message_t belonging to it), we add for each > wraper struct a pointer to the owner object (Go objects with a reference > pointing to them don't get garbage collected). Hi Adrien! This whole series is marked moreinfo, but I don't think that's just. It looks like there were some unresolved issues about reference tracking and garbage collection, and some suggestions to use the C values of enums instead of regenerating them with iota, but there's definitely valid code that I assume would be useful if anyone ever wanted to write in Go ;). Are you figuring to clean this series up? This comment should s/wraper/wrapper/. Ethan ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected
Quoth Adrien Bustany on Jul 24 at 1:03 am: > Le 20/07/2012 06:23, Austin Clements a écrit : > >Quoth Adrien Bustany on Jul 19 at 9:25 pm: > >>Le 18/07/2012 23:40, Austin Clements a écrit : > >>>This is subtle enough that I think it deserves a comment in the source > >>>code explaining that tracking the talloc owner reference, combined > >>>with the fact that Go finalizers are run in dependency order, ensures > >>>that the C objects will always be destroyed from the talloc leaves up. > >> > >>Definitely, I tend to comment in the commit message and forget about > >>the code... > >> > >>> > >>>Just one inline comment below. Otherwise, I think this is all > >>>correct. > >> > >>Agree with the comment, the Database should be the parent. I guess I > >>wasn't sure of the talloc parenting. > >> > >>> > >>>Is reproducing the talloc hierarchy in all of the bindings really the > >>>right approach? I feel like there has to be a better way (or that the > >>>way we use talloc in the library is slightly broken). What if the > >>>bindings created an additional talloc reference to each managed > >>>object, just to keep the object alive, and used talloc_unlink instead > >>>of the destroy functions? > >> > >>Reproducing the hierarchy is probably error prone, and not that > >>simple indeed :/ > >>I haven't checked at all the way you suggest, but if we use > >>talloc_reference/unlink, we get the same issue no? > >>- If we do for each new wrapped object talloc_reference(NULL, > >>wrapped_object), the the object will be kept alive until we > >>talloc_unlink(NULL, wrapped_object), but what about its parents? For > >>example will doing that on a notmuch_message_t keep the > >>notmuch_messages_t alive? > > > >Hmm. This is what I was thinking. You have an interesting point; I > >think it's slightly wrong, but it exposes something deeper. I believe > >there are two different things going on here: some of the talloc > >relationships are for convenience, while some are structural. In the > >former case, I'm pretty sure my suggestion will work, but in the > >latter case the objects should *never* be freed by the finalizer! > > > >For example, notmuch_query_search_messages returns a new > >notmuch_messages_t with the query as the talloc parent, but that > >notmuch_messages_t doesn't depend on the query object; this is just so > >you can conveniently delete everything retrieved from the query by > >deleting the query. In this case, you can either use parent > >references like you did---which will prevent a double-free by forcing > >destruction to happen from the leaves up but at the cost of having to > >encode these relationships and of extending the parent object > >lifetimes beyond what's strictly necessary---or you can use my > >suggestion of creating an additional talloc reference. > > Actually, checking the code of notmuch_query_search_messages, it > seems that the notmuch_messages_t (and the notmuch_message_t as > well) object *does* depend on the database and the query... So in > that case I think we need the "owner" Object reference as I > currently have (we want the Messages to keep the Query alive, and > the Query keeps the Database alive). It does depends on the database (I think just about everything depends on the database, directly or indirectly, so I suppose everything will need some parent pointer), but could you explain how it depends on the query? It uses the MSet derived from the query, but Xapian internally handles the sharing and referencing counting of all of its objects. > That said, you example below looks valid, and it seems I'll need to > add a flag to createMessage() (and some others) to disable the > SetFinalizer call for certain instances (we probably want to keep it > for eg. SearchMessageByFilename). > > - The candidates I found for adding a tmalloc reference and not a > "full" Go reference (therefore preventing to keep the parent alive > too long needlessly) are GetAllTags, Thread.GetTags, > Messages.CollectTags, and Message.GetTags (those are basically > string lists) Sounds reasonable (but I haven't gone through carefully). > - The methods for which I should remove the SetFinalizer on the > wrapper (as you showed in the example below) while keeping the Go > reference are Threads.Get and Messages.Get Sounds right. I think those are the only cases where the object is still owned by a container, other than strings (which Go has to copy anyway). > I would also maybe remove all the Destroy() functions, since they > now seem more dangerous than anything else... Yeah, probably. > I tried to write a test using runtime.GC to test the behaviour of > the bindings, but for some reasons some cases which are supposed to > crash don't, which makes me sceptical about the validity of the test > :-/ Hmm. Go's collector is partially conservative, IIRC, so maybe it's following a technically dead pointer? > Cheers > > Adrien > > > > >However, in your example, the notmuch_message_t's are structurally
Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected
Le 20/07/2012 06:23, Austin Clements a écrit : Quoth Adrien Bustany on Jul 19 at 9:25 pm: Le 18/07/2012 23:40, Austin Clements a écrit : This is subtle enough that I think it deserves a comment in the source code explaining that tracking the talloc owner reference, combined with the fact that Go finalizers are run in dependency order, ensures that the C objects will always be destroyed from the talloc leaves up. Definitely, I tend to comment in the commit message and forget about the code... Just one inline comment below. Otherwise, I think this is all correct. Agree with the comment, the Database should be the parent. I guess I wasn't sure of the talloc parenting. Is reproducing the talloc hierarchy in all of the bindings really the right approach? I feel like there has to be a better way (or that the way we use talloc in the library is slightly broken). What if the bindings created an additional talloc reference to each managed object, just to keep the object alive, and used talloc_unlink instead of the destroy functions? Reproducing the hierarchy is probably error prone, and not that simple indeed :/ I haven't checked at all the way you suggest, but if we use talloc_reference/unlink, we get the same issue no? - If we do for each new wrapped object talloc_reference(NULL, wrapped_object), the the object will be kept alive until we talloc_unlink(NULL, wrapped_object), but what about its parents? For example will doing that on a notmuch_message_t keep the notmuch_messages_t alive? Hmm. This is what I was thinking. You have an interesting point; I think it's slightly wrong, but it exposes something deeper. I believe there are two different things going on here: some of the talloc relationships are for convenience, while some are structural. In the former case, I'm pretty sure my suggestion will work, but in the latter case the objects should *never* be freed by the finalizer! For example, notmuch_query_search_messages returns a new notmuch_messages_t with the query as the talloc parent, but that notmuch_messages_t doesn't depend on the query object; this is just so you can conveniently delete everything retrieved from the query by deleting the query. In this case, you can either use parent references like you did---which will prevent a double-free by forcing destruction to happen from the leaves up but at the cost of having to encode these relationships and of extending the parent object lifetimes beyond what's strictly necessary---or you can use my suggestion of creating an additional talloc reference. Actually, checking the code of notmuch_query_search_messages, it seems that the notmuch_messages_t (and the notmuch_message_t as well) object *does* depend on the database and the query... So in that case I think we need the "owner" Object reference as I currently have (we want the Messages to keep the Query alive, and the Query keeps the Database alive). That said, you example below looks valid, and it seems I'll need to add a flag to createMessage() (and some others) to disable the SetFinalizer call for certain instances (we probably want to keep it for eg. SearchMessageByFilename). - The candidates I found for adding a tmalloc reference and not a "full" Go reference (therefore preventing to keep the parent alive too long needlessly) are GetAllTags, Thread.GetTags, Messages.CollectTags, and Message.GetTags (those are basically string lists) - The methods for which I should remove the SetFinalizer on the wrapper (as you showed in the example below) while keeping the Go reference are Threads.Get and Messages.Get I would also maybe remove all the Destroy() functions, since they now seem more dangerous than anything else... I tried to write a test using runtime.GC to test the behaviour of the bindings, but for some reasons some cases which are supposed to crash don't, which makes me sceptical about the validity of the test :-/ Cheers Adrien However, in your example, the notmuch_message_t's are structurally related to the notmuch_messages_t from whence they came. They're all part of one data structure and hence it *never* makes sense for a caller to delete the notmuch_message_t's. For example, even with the code in this patch, I think the following could lead to a crash: 1. Obtain a Messages object, say ms. 2. m1 := ms.Get() 3. m1 = nil 4. m2 := ms.Get() 5. m2.whatever() If a garbage collection happens between steps 3 and 4, the Message in m1 will get finalized and destroyed. But step 4 will return the same, now dangling, pointer, leading to a potential crash in step 5. Maybe the answer in the structural case is to include the parent pointer in the Go struct and not set a finalizer on the child? That way, if there's a Go reference to the parent wrapper, it won't go away and the children won't get destroyed (collecting wrappers of children is fine) and if there's a Go reference to the child wrapper, it will keep the parent alive so it won't get des
Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected
Quoth Adrien Bustany on Jul 19 at 9:25 pm: > Le 18/07/2012 23:40, Austin Clements a écrit : > >This is subtle enough that I think it deserves a comment in the source > >code explaining that tracking the talloc owner reference, combined > >with the fact that Go finalizers are run in dependency order, ensures > >that the C objects will always be destroyed from the talloc leaves up. > > Definitely, I tend to comment in the commit message and forget about > the code... > > > > >Just one inline comment below. Otherwise, I think this is all > >correct. > > Agree with the comment, the Database should be the parent. I guess I > wasn't sure of the talloc parenting. > > > > >Is reproducing the talloc hierarchy in all of the bindings really the > >right approach? I feel like there has to be a better way (or that the > >way we use talloc in the library is slightly broken). What if the > >bindings created an additional talloc reference to each managed > >object, just to keep the object alive, and used talloc_unlink instead > >of the destroy functions? > > Reproducing the hierarchy is probably error prone, and not that > simple indeed :/ > I haven't checked at all the way you suggest, but if we use > talloc_reference/unlink, we get the same issue no? > - If we do for each new wrapped object talloc_reference(NULL, > wrapped_object), the the object will be kept alive until we > talloc_unlink(NULL, wrapped_object), but what about its parents? For > example will doing that on a notmuch_message_t keep the > notmuch_messages_t alive? Hmm. This is what I was thinking. You have an interesting point; I think it's slightly wrong, but it exposes something deeper. I believe there are two different things going on here: some of the talloc relationships are for convenience, while some are structural. In the former case, I'm pretty sure my suggestion will work, but in the latter case the objects should *never* be freed by the finalizer! For example, notmuch_query_search_messages returns a new notmuch_messages_t with the query as the talloc parent, but that notmuch_messages_t doesn't depend on the query object; this is just so you can conveniently delete everything retrieved from the query by deleting the query. In this case, you can either use parent references like you did---which will prevent a double-free by forcing destruction to happen from the leaves up but at the cost of having to encode these relationships and of extending the parent object lifetimes beyond what's strictly necessary---or you can use my suggestion of creating an additional talloc reference. However, in your example, the notmuch_message_t's are structurally related to the notmuch_messages_t from whence they came. They're all part of one data structure and hence it *never* makes sense for a caller to delete the notmuch_message_t's. For example, even with the code in this patch, I think the following could lead to a crash: 1. Obtain a Messages object, say ms. 2. m1 := ms.Get() 3. m1 = nil 4. m2 := ms.Get() 5. m2.whatever() If a garbage collection happens between steps 3 and 4, the Message in m1 will get finalized and destroyed. But step 4 will return the same, now dangling, pointer, leading to a potential crash in step 5. Maybe the answer in the structural case is to include the parent pointer in the Go struct and not set a finalizer on the child? That way, if there's a Go reference to the parent wrapper, it won't go away and the children won't get destroyed (collecting wrappers of children is fine) and if there's a Go reference to the child wrapper, it will keep the parent alive so it won't get destroyed and neither will the child. > - If we do talloc_reference(parent, wrapped), then we reproduce the > hierarchy again? > > Note that I have 0 experience with talloc, so I might as well be > getting things wrong here. > > > > >Quoth Adrien Bustany on Jul 18 at 9:34 pm: > >>This makes notmuch appropriately free the underlying notmuch C objects > >>when garbage collecting their Go wrappers. To make sure we don't break > >>the underlying links between objects (for example, a notmuch_messages_t > >>being GC'ed before a notmuch_message_t belonging to it), we add for each > >>wraper struct a pointer to the owner object (Go objects with a reference > >>pointing to them don't get garbage collected). > >>--- > >> bindings/go/src/notmuch/notmuch.go | 153 > >> +++- > >> 1 files changed, 134 insertions(+), 19 deletions(-) > >> > >>diff --git a/bindings/go/src/notmuch/notmuch.go > >>b/bindings/go/src/notmuch/notmuch.go > >>index 1d77fd2..3f436a0 100644 > >>--- a/bindings/go/src/notmuch/notmuch.go > >>+++ b/bindings/go/src/notmuch/notmuch.go > >>@@ -11,6 +11,7 @@ package notmuch > >> #include "notmuch.h" > >> */ > >> import "C" > >>+import "runtime" > >> import "unsafe" > >> > >> // Status codes used for the return values of most functions > >>@@ -47,40 +48,152 @@ func (self Status) String() string { > >> /* V
Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected
Le 18/07/2012 23:40, Austin Clements a écrit : This is subtle enough that I think it deserves a comment in the source code explaining that tracking the talloc owner reference, combined with the fact that Go finalizers are run in dependency order, ensures that the C objects will always be destroyed from the talloc leaves up. Definitely, I tend to comment in the commit message and forget about the code... Just one inline comment below. Otherwise, I think this is all correct. Agree with the comment, the Database should be the parent. I guess I wasn't sure of the talloc parenting. Is reproducing the talloc hierarchy in all of the bindings really the right approach? I feel like there has to be a better way (or that the way we use talloc in the library is slightly broken). What if the bindings created an additional talloc reference to each managed object, just to keep the object alive, and used talloc_unlink instead of the destroy functions? Reproducing the hierarchy is probably error prone, and not that simple indeed :/ I haven't checked at all the way you suggest, but if we use talloc_reference/unlink, we get the same issue no? - If we do for each new wrapped object talloc_reference(NULL, wrapped_object), the the object will be kept alive until we talloc_unlink(NULL, wrapped_object), but what about its parents? For example will doing that on a notmuch_message_t keep the notmuch_messages_t alive? - If we do talloc_reference(parent, wrapped), then we reproduce the hierarchy again? Note that I have 0 experience with talloc, so I might as well be getting things wrong here. Quoth Adrien Bustany on Jul 18 at 9:34 pm: This makes notmuch appropriately free the underlying notmuch C objects when garbage collecting their Go wrappers. To make sure we don't break the underlying links between objects (for example, a notmuch_messages_t being GC'ed before a notmuch_message_t belonging to it), we add for each wraper struct a pointer to the owner object (Go objects with a reference pointing to them don't get garbage collected). --- bindings/go/src/notmuch/notmuch.go | 153 +++- 1 files changed, 134 insertions(+), 19 deletions(-) diff --git a/bindings/go/src/notmuch/notmuch.go b/bindings/go/src/notmuch/notmuch.go index 1d77fd2..3f436a0 100644 --- a/bindings/go/src/notmuch/notmuch.go +++ b/bindings/go/src/notmuch/notmuch.go @@ -11,6 +11,7 @@ package notmuch #include "notmuch.h" */ import "C" +import "runtime" import "unsafe" // Status codes used for the return values of most functions @@ -47,40 +48,152 @@ func (self Status) String() string { /* Various opaque data types. For each notmuch__t see the various * notmuch_ functions below. */ +type Object interface {} + type Database struct { db *C.notmuch_database_t } +func createDatabase(db *C.notmuch_database_t) *Database { + self := &Database{db: db} + + runtime.SetFinalizer(self, func(x *Database) { + if (x.db != nil) { + C.notmuch_database_destroy(x.db) + } + }) + + return self +} + type Query struct { query *C.notmuch_query_t + owner Object +} + +func createQuery(query *C.notmuch_query_t, owner Object) *Query { + self := &Query{query: query, owner: owner} + + runtime.SetFinalizer(self, func(x *Query) { + if (x.query != nil) { + C.notmuch_query_destroy(x.query) + } + }) + + return self } type Threads struct { threads *C.notmuch_threads_t + owner Object +} + +func createThreads(threads *C.notmuch_threads_t, owner Object) *Threads { + self := &Threads{threads: threads, owner: owner} + + runtime.SetFinalizer(self, func(x *Threads) { + if (x.threads != nil) { + C.notmuch_threads_destroy(x.threads) + } + }) + + return self } type Thread struct { thread *C.notmuch_thread_t + owner Object +} + +func createThread(thread *C.notmuch_thread_t, owner Object) *Thread { + self := &Thread{thread: thread, owner: owner} + + runtime.SetFinalizer(self, func(x *Thread) { + if (x.thread != nil) { + C.notmuch_thread_destroy(x.thread) + } + }) + + return self } type Messages struct { messages *C.notmuch_messages_t + owner Object +} + +func createMessages(messages *C.notmuch_messages_t, owner Object) *Messages { + self := &Messages{messages: messages, owner: owner} + + return self } type Message struct { message *C.notmuch_message_t + owner Object +} + +func createMessage(message *C.notmuch_message_t, owner Object) *Message { + self := &Message{message: message, owner: owner} + + runtime.SetFinalizer(self, func(x *Message) { + if (x.message != nil) { + C.notmuc
Re: [PATCH 3/7] go: Allow notmuch objects to be garbage collected
This is subtle enough that I think it deserves a comment in the source code explaining that tracking the talloc owner reference, combined with the fact that Go finalizers are run in dependency order, ensures that the C objects will always be destroyed from the talloc leaves up. Just one inline comment below. Otherwise, I think this is all correct. Is reproducing the talloc hierarchy in all of the bindings really the right approach? I feel like there has to be a better way (or that the way we use talloc in the library is slightly broken). What if the bindings created an additional talloc reference to each managed object, just to keep the object alive, and used talloc_unlink instead of the destroy functions? Quoth Adrien Bustany on Jul 18 at 9:34 pm: > This makes notmuch appropriately free the underlying notmuch C objects > when garbage collecting their Go wrappers. To make sure we don't break > the underlying links between objects (for example, a notmuch_messages_t > being GC'ed before a notmuch_message_t belonging to it), we add for each > wraper struct a pointer to the owner object (Go objects with a reference > pointing to them don't get garbage collected). > --- > bindings/go/src/notmuch/notmuch.go | 153 > +++- > 1 files changed, 134 insertions(+), 19 deletions(-) > > diff --git a/bindings/go/src/notmuch/notmuch.go > b/bindings/go/src/notmuch/notmuch.go > index 1d77fd2..3f436a0 100644 > --- a/bindings/go/src/notmuch/notmuch.go > +++ b/bindings/go/src/notmuch/notmuch.go > @@ -11,6 +11,7 @@ package notmuch > #include "notmuch.h" > */ > import "C" > +import "runtime" > import "unsafe" > > // Status codes used for the return values of most functions > @@ -47,40 +48,152 @@ func (self Status) String() string { > /* Various opaque data types. For each notmuch__t see the various > * notmuch_ functions below. */ > > +type Object interface {} > + > type Database struct { > db *C.notmuch_database_t > } > > +func createDatabase(db *C.notmuch_database_t) *Database { > + self := &Database{db: db} > + > + runtime.SetFinalizer(self, func(x *Database) { > + if (x.db != nil) { > + C.notmuch_database_destroy(x.db) > + } > + }) > + > + return self > +} > + > type Query struct { > query *C.notmuch_query_t > + owner Object > +} > + > +func createQuery(query *C.notmuch_query_t, owner Object) *Query { > + self := &Query{query: query, owner: owner} > + > + runtime.SetFinalizer(self, func(x *Query) { > + if (x.query != nil) { > + C.notmuch_query_destroy(x.query) > + } > + }) > + > + return self > } > > type Threads struct { > threads *C.notmuch_threads_t > + owner Object > +} > + > +func createThreads(threads *C.notmuch_threads_t, owner Object) *Threads { > + self := &Threads{threads: threads, owner: owner} > + > + runtime.SetFinalizer(self, func(x *Threads) { > + if (x.threads != nil) { > + C.notmuch_threads_destroy(x.threads) > + } > + }) > + > + return self > } > > type Thread struct { > thread *C.notmuch_thread_t > + owner Object > +} > + > +func createThread(thread *C.notmuch_thread_t, owner Object) *Thread { > + self := &Thread{thread: thread, owner: owner} > + > + runtime.SetFinalizer(self, func(x *Thread) { > + if (x.thread != nil) { > + C.notmuch_thread_destroy(x.thread) > + } > + }) > + > + return self > } > > type Messages struct { > messages *C.notmuch_messages_t > + owner Object > +} > + > +func createMessages(messages *C.notmuch_messages_t, owner Object) *Messages { > + self := &Messages{messages: messages, owner: owner} > + > + return self > } > > type Message struct { > message *C.notmuch_message_t > + owner Object > +} > + > +func createMessage(message *C.notmuch_message_t, owner Object) *Message { > + self := &Message{message: message, owner: owner} > + > + runtime.SetFinalizer(self, func(x *Message) { > + if (x.message != nil) { > + C.notmuch_message_destroy(x.message) > + } > + }) > + > + return self > } > > type Tags struct { > tags *C.notmuch_tags_t > + owner Object > +} > + > +func createTags(tags *C.notmuch_tags_t, owner Object) *Tags { > + self := &Tags{tags: tags, owner: owner} > + > + runtime.SetFinalizer(self, func(x *Tags) { > + if (x.tags != nil) { > + C.notmuch_tags_destroy(x.tags) > + } > + }) > + > + return self > } > > type Directory struct { > dir *C.notmuch_directory_t > + owner Object > +} > + > +func createDirectory(directory *C.notmuch_directory_t, owner Object) > *Directory { > + self := &Directory{dir: directory, owner: owner} > + > + runtime.SetFinalizer(self