Hi Kenny,
strictly speaking, the behavior you reported is not a bug but rather a
usage that pygr.Data doesn't support. The problem is that you're
using regular pickling, which doesn't know about pygr.Data. pygr.Data
itself uses pickling, but the dependency only runs in that direction. I
believe we have a pygr.Data-aware pickling method, but I'd have to find
that information for you.
This is an issue of modularity of design. pygr.Data is designed to
work with any class that is picklable; the code for those classes does
not need to know anything about pygr.Data or even that it exists. If
you want to be able to pickle your objects with the regular Python
pickle / cPickle and have that magically invoke pygr.Data, then the
code for those objects' classes would have to include special
__getstate__, __setstate__, or __reduce__ methods that specifically
invoke pygr.Data methods. That would break the modularity principle
stated above, because each such class would have to be written to work
specifically with pygr.Data.
Simple option: instead of using the standard pickle function, use a
variant that we supply that saves data in a pygr.Data-aware way, and
unpickle using a matching variant that we supply, which will invoke
pygr.Data as needed to reconstitute your data.
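
As a rough illustration of the idea (a sketch only, not pygr.Data's
actual code), the standard pickle module lets a Pickler subclass define
persistent_id() and an Unpickler subclass define persistent_load(), so
that shared resources are stored by ID rather than by value. The names
REGISTRY, Genome, ResourcePickler, ResourceUnpickler, dumps and loads
below are all made up for this example; REGISTRY is just a dict standing
in for the resource database.

```python
import io
import pickle

# Hypothetical stand-in for the resource database; not pygr.Data's API.
REGISTRY = {}


class Genome(object):
    """Toy shared resource that carries an ID in _persistent_id."""

    def __init__(self, resource_id):
        self._persistent_id = resource_id


class ResourcePickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Objects that carry an ID are stored as that ID only.
        # Returning None tells pickle to serialize the object normally.
        return getattr(obj, "_persistent_id", None)


class ResourceUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Swap each stored ID back for the live object in the registry.
        return REGISTRY[pid]


def dumps(obj):
    """Pickle obj, replacing registered resources with their IDs."""
    buf = io.BytesIO()
    ResourcePickler(buf, protocol=2).dump(obj)
    return buf.getvalue()


def loads(data):
    """Unpickle data, resolving stored IDs through the registry."""
    return ResourceUnpickler(io.BytesIO(data)).load()


genome = Genome("Bio.Seq.Genome.TESTING.test_ty3")
REGISTRY[genome._persistent_id] = genome

data = {1: [genome]}
restored = loads(dumps(data))
```

Here restored[1][0] is the same registered object, not a copy, because
only its ID went into the pickle. The Genome class itself needs no
pickling methods, which is the modularity point above; a plain
pickle.dumps(data) would instead serialize the Genome instance itself.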
Would that work for you?
-- Chris
On Jan 28, 2009, at 11:41 AM, Kenny Daily wrote:
>
> OK, so I re-ran my scripts double-checking everywhere that I used the
> genome from pygr.Data, and it still fails. So I worked up a test case,
> data and code follows. Before pickling, the _persistent_id is there.
> After pickling, it's not.
>
> *** START test_genome.fa ***
> >test
> ACGCAGACTGACCTACGATCAAATAAGCCGAGCTAGCAAGCCCGCCGTAATCGATCGACGTACGTCGATCGATCGACCCC
> GAATAGACTCCGATAAGCGTAGTGTATATAGCGCGCCCGTATATAGGATGAGAAGAATATAAAGCTCCTCTCGAGATCGA
> *** END test_genome.fa ***
>
> *** START GENOME BUILD CODE ***
> from pygr import seqdb
> g = seqdb.BlastDB("/home/baldig/projects/genomics/nonsvn/results/yeast/Ty3/TEST/test_genome.fa")
> g.__doc__ = "Genome for testing Ty3 read clustering pipeline"
> import pygr.Data
> pygr.Data.getResource.addResource("Bio.Seq.Genome.TESTING.test_ty3", g)
> pygr.Data.save()
> *** END GENOME BUILD CODE ***
>
> *** START DATA CREATION CODE ***
> import cPickle
> import pygr.Data
> from pygr.sequence import Sequence
>
> # mimics the data structure I'm using
> class HTS:
>     def __init__(self, seqs=[]):
>         self.seqs = seqs
>
> genome = pygr.Data.getResource("Bio.Seq.Genome.TESTING.test_ty3")
> foo = Sequence('CCCGCCGTAATCGATCGAC', 'foo')
> b = genome.blast(foo)
> s,d,e = b[foo].edges()[0]
>
> # check that persistent_id
> d.path.db._persistent_id
>
> H = HTS(seqs=[d])
> x = H.seqs[0]
>
> # check that persistent_id, again
> x.path.db._persistent_id
>
> Hs = {1: H}
> cPickle.dump(Hs, file("test_pickle_seqslice.pkl", "w"))
>
> *** END DATA CREATION CODE ***
>
>
> *** START DATA READ CODE ***
>
> import cPickle
> import pygr.Data
>
> class HTS:
>     def __init__(self, seqs=[]):
>         self.seqs = seqs
>
> d = cPickle.load(file("test_pickle_seqslice.pkl"))
> x = d[1]
> s = x.seqs[0]
>
> # I get <type 'exceptions.AttributeError'>: 'BlastDB' object has no
> # attribute '_persistent_id'
> s.path.db._persistent_id
>
> *** END DATA READ CODE ***
>
>
> On Jan 28, 3:57 am, Christopher Lee <[email protected]> wrote:
>> On Jan 28, 2009, at 1:53 PM, Kenny Daily wrote:
>>
>>
>>
>>> OK. These things make sense. However, I think what I'm doing is a
>>> little more complicated, and I've left out some of the important
>>> steps that may help explain. First, I'm sure that I'm using the
>>> pygr.Data object every time, i.e. genome is always set by:
>>
>>> genome = pygr.Data.getResource("Bio.Seq.Genome.YEAST.sacCer")
>>
>> Kenny, could you check the c.sequence.path.db._persistent_id on the
>> case from your example that gives the KeyError? If this attribute is
>> missing, the data was *not* loaded with a pygr.Data ID. Let me know
>> what you find.