On Dec 18, 2008, at 12:05 PM, Istvan Albert wrote:

> On the other hand saving an object, then retrieving the same object
> later seems a common thing to do. It is really strange when you start
> getting back different things just because another module may have
> reloaded pygr.Data. Imagine a threaded webserver that reloads a
> changed module, or a failed data attempt that now wants to obtain a
> fresh copy of the data.
>
> More to the point, in this particular case I don't even know what else
> should one be doing (other than reload) to actually get the file
> itself.
Sure. Back in July I floated a proposal to eliminate this reload() behavior:

http://groups.google.com/group/pygr-dev/browse_thread/thread/d309166f7ca0ee36/31e771979f92504e#31e771979f92504e

No one seemed particularly interested, so I haven't yet followed that up. I'll briefly summarize the issue:

- For users to access names from the pygr.Data module namespace, those names have to be loaded into that namespace during the module import, since Python provides no dynamic attribute lookup (__getattr__) mechanism for modules. Names like pygr.Data.Bio or pygr.Data.Physics have to be added during import (based on reading top-level names like Bio and Physics from the resource databases), so that users can access them.

This has some annoying consequences:

- pygr.Data must connect to the resource databases listed in PYGRDATAPATH *during the import*, and it creates an object (pygr.Data.getResource) that keeps a cache of all access activity to those resource databases.

- If you reload the module, you of course get a new getResource object with an empty cache. When you reload a module you expect a clean reload, with no possible "side-effects" persisting from usage of the module before the reload...

Possible Solutions:

- Require that all pygr.Data access go through a "root" name, e.g. pygr.Data.root.Bio.Seq.Genome.HUMAN.hg17. This requires users to type a few more characters, but eliminates most of these issues. The root object's __getattr__ will be run whenever the user requests a new name, so pygr.Data would no longer need to connect to resource databases during module import. That could wait until the user actually requests some data.

- Don't cache objects that undergo unpickling transformations, since the current behavior of retrieving the object from the cache will not give the expected transformation.

If you or others think this needs to be addressed in the 0.8 release, we could include it.
In the past, no one expressed much interest, so it was deferred, presumably to the 1.0 "pygr.Data improvements" release.

Thanks for raising this!

-- Chris
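
To make the "root" name idea above concrete, here is a rough sketch of how a root object's __getattr__ could defer the database connection until the first name is requested. It is only an illustration under stated assumptions: ResourceRoot, ResourcePath, _lookup and fake_connect are hypothetical names invented for this sketch, not part of pygr.Data, and the "resource database" is just a dict keyed by dotted resource names.

```python
class ResourcePath(object):
    """Hypothetical intermediate node such as root.Bio.Seq (not pygr API)."""

    def __init__(self, root, path):
        self._root = root
        self._path = path

    def __getattr__(self, name):
        # Only called when normal lookup fails; refuse private names so
        # that copy/pickle machinery cannot trigger endless recursion.
        if name.startswith('_'):
            raise AttributeError(name)
        return self._root._lookup(self._path + '.' + name)


class ResourceRoot(object):
    """Hypothetical 'root' object: connects lazily, caches per instance."""

    def __init__(self, connect):
        # connect: zero-argument callable returning a dict-like database.
        # Nothing is contacted here, so creating the root is cheap.
        self._connect = connect
        self._db = None
        self._cache = {}

    def __getattr__(self, name):
        if name.startswith('_'):
            raise AttributeError(name)
        return self._lookup(name)

    def _lookup(self, path):
        if path in self._cache:
            return self._cache[path]
        if self._db is None:
            # The first real request triggers the connection.
            self._db = self._connect()
        if path in self._db:
            obj = self._cache[path] = self._db[path]
            return obj
        # A prefix of some stored name: hand back another lazy node.
        prefix = path + '.'
        if any(key.startswith(prefix) for key in self._db):
            return ResourcePath(self, path)
        raise AttributeError(path)


# Stand-in for connecting to the resource databases (assumption: a dict).
calls = []

def fake_connect():
    calls.append(1)
    return {'Bio.Seq.Genome.HUMAN.hg17': 'hg17 placeholder'}

root = ResourceRoot(fake_connect)      # no connection made yet
genome = root.Bio.Seq.Genome.HUMAN.hg17  # connects here, on first use
```

With this shape the cache lives on the root object rather than in module-level state that is rebuilt at import time, so what a reload should do with it becomes an explicit design choice instead of a side effect of re-running the import.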
