On Apr 18, 2009, at 6:51 AM, C. Titus Brown wrote:
> > On Fri, Apr 17, 2009 at 09:58:44PM -0700, Christopher Lee wrote: > > -> The only idea that comes to mind is to make explicitly closing a > -> sequence database object a requirement of the SequenceDB API? i.e. > -> you must execute db.close() when you are done with the database. > Does > -> that seem like a good requirement to add? > > Yes; I thought about suggesting it during my review but couldn't think > of a functional reason for doing so; it just seemed neat. Now we > have a > reason, though ;) OK, I will go through Pygr and recommend objects that keep a file open, for adding mandatory close() methods. The obvious examples include SequenceFileDB, NLMSA, and various shelve based storages like Graph. > > > (Providing the function is sufficient; I don't think we need to > require > that it be called in most circumstances.) I think the documentation has to say "you should always close() the object when done with it." It is true that you can probably get away without that in most cases (especially on UNIX), because the object will close itself when garbage collected. We've been relying on that behavior all these years on UNIX without any problem. But mandating this as standard operating procedure will save some people from baffling bugs that arise due to some interaction between the order of deleting a file vs. when garbage collection actually deletes the object. This seems to matter on Windows, as even our initial testing of the test suite hits that issue twice. IMPLEMENTATION I want to implement this in a way that avoids having to insert code that checks whether the file or shelve object is valid in each routine that accesses them (i.e. to catch whether the object has already been closed, and to raise a clear error message in that case). I propose to create a descriptor class that only gets accessed if a particular file attribute is missing. It will have a __set__() method that allows you to save the file attribute (saves it to the object's __dict__, so that it will be retrieved in future requests for this attribute); a __del__() method that deletes it from __dict__; and a __get__() which will therefore only be accessed if the file attribute is missing (because it was already closed) and will raise an exception with a clear explanatory message ("you already closed this object, you oaf"). We would then add this descriptor as a property to any class that keeps such an open file object. The file opening methods in the class __init__ would be unchanged -- they just save the file / shelve to that attribute as usual. All code accessing the attribute remains unchanged. The new close() method for the class would close the shelve, then delete the attribute, exposing this descriptor. Any future attempt to access the attribute will invoke the descriptor and produce the appropriate error message. Does that sound OK to you? Any criticisms or suggestions? I want to implement a systematic solution with a good error message, because I have unpleasant memories of tracking down a problem with a prematurely closed shelve object... When you close() a shelve, what actually happens is it overwrites its berkeleyDB hash attribute with the value zero (0)! So any further attempt to look up a key in the shelve gets a cryptic error message like "int object has no index method"!!! --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "pygr-dev" group. To post to this group, send email to pygr-dev@googlegroups.com To unsubscribe from this group, send email to pygr-dev+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/pygr-dev?hl=en -~----------~----~----~----~------~----~------~--~---