On Apr 18, 2009, at 6:51 AM, C. Titus Brown wrote:

>
> On Fri, Apr 17, 2009 at 09:58:44PM -0700, Christopher Lee wrote:
>
> -> The only idea that comes to mind is to make explicitly closing a
> -> sequence database object a requirement of the SequenceDB API?  i.e.
> -> you must execute db.close() when you are done with the database.   
> Does
> -> that seem like a good requirement to add?
>
> Yes; I thought about suggesting it during my review but couldn't think
> of a functional reason for doing so; it just seemed neat.  Now we  
> have a
> reason, though ;)

OK, I will go through Pygr and recommend objects that keep a file  
open, for adding mandatory close() methods.  The obvious examples  
include SequenceFileDB, NLMSA, and various shelve based storages like  
Graph.

>
>
> (Providing the function is sufficient; I don't think we need to  
> require
> that it be called in most circumstances.)

I think the documentation has to say "you should always close() the  
object when done with it."  It is true that you can probably get away  
without that in most cases (especially on UNIX), because the object  
will close itself when garbage collected.  We've been relying on that  
behavior all these years on UNIX without any problem.  But mandating  
this as standard operating procedure will save some people from  
baffling bugs that arise due to some interaction between the order of  
deleting a file vs. when garbage collection actually deletes the  
object.  This seems to matter on Windows, as even our initial testing  
of the test suite hits that issue twice.

IMPLEMENTATION
I want to implement this in a way that avoids having to insert code  
that checks whether the file or shelve object is valid in each routine  
that accesses them (i.e. to catch whether the object has already been  
closed, and to raise a clear error message in that case).  I propose  
to create a descriptor class that only gets accessed if a particular  
file attribute is missing.  It will have a __set__() method that  
allows you to save the file attribute (saves it to the object's  
__dict__, so that it will be retrieved in future requests for this  
attribute); a __del__() method that deletes it from __dict__; and a  
__get__() which will therefore only be accessed if the file attribute  
is missing (because it was already closed) and will raise an exception  
with a clear explanatory message ("you already closed this object, you  
oaf").  We would then add this descriptor as a property to any class  
that keeps such an open file object.  The file opening methods in the  
class __init__ would be unchanged -- they just save the file / shelve  
to that attribute as usual.  All code accessing the attribute remains  
unchanged.  The new close() method for the class would close the  
shelve, then delete the attribute, exposing this descriptor.  Any  
future attempt to access the attribute will invoke the descriptor and  
produce the appropriate error message.

Does that sound OK to you?  Any criticisms or suggestions?  I want to  
implement a systematic solution with a good error message, because I  
have unpleasant memories of tracking down a problem with a prematurely  
closed shelve object...  When you close() a shelve, what actually  
happens is it overwrites its berkeleyDB hash attribute with the value  
zero (0)!  So any further attempt to look up a key in the shelve gets  
a cryptic error message like "int object has no index method"!!!




--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"pygr-dev" group.
To post to this group, send email to pygr-dev@googlegroups.com
To unsubscribe from this group, send email to 
pygr-dev+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/pygr-dev?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to