That would definitely work! There's nothing about cPickle that I'm inherently attached to; I just need a way to store our data structure. Thanks!

On Jan 30, 11:05 am, Christopher Lee <[email protected]> wrote:
> Hi Kenny,
> strictly speaking the behavior you reported is not a bug but rather a
> usage that pygr.Data doesn't support. The problem is that you're
> using regular pickling: pygr.Data uses pickling, but regular pickling
> doesn't know about pygr.Data. I believe we have a pygr.Data-aware
> pickling method but I'd have to find that information for you.
>
> This is an issue of modularity of design. Pygr.Data is designed to
> work with any class that is picklable; the code for those classes does
> not need to know anything about pygr.Data or even that it exists. If
> you want to be able to pickle your objects with the regular python
> pickler / cPickler and have that magically invoke pygr.Data, then the
> code for those objects' classes would have to include special
> __getstate__, __setstate__, __reduce__ methods that specifically
> invoke pygr.Data methods. That would break the modularity principle
> stated above, because each such class would have to be written to work
> specifically with pygr.Data.
>
> Simple option: instead of using the standard pickle function, use a
> variant that we supply that saves data in a pygr.Data aware way, and
> unpickle using a variant method that we supply that again will invoke
> pygr.Data as needed to reconstitute your data.
>
> Would that work for you?
>
> -- Chris
>
> On Jan 28, 2009, at 11:41 AM, Kenny Daily wrote:
>
> > OK, so I re-ran my scripts double-checking everywhere that I used the
> > genome from pygr.Data, and it still fails. So I worked up a test case;
> > data and code follow. Before pickling, the _persistent_id is there.
> > After pickling, it's not.
> >
> > *** START test_genome.fa ***
> > >test
> > ACGCAGACTGACCTACGATCAAATAAGCCGAGCTAGCAAGCCCGCCGTAATCGATCGACGTACGTCGATCGATCGACCCC
> > GAATAGACTCCGATAAGCGTAGTGTATATAGCGCGCCCGTATATAGGATGAGAAGAATATAAAGCTCCTCTCGAGATCGA
> > *** END test_genome.fa ***
> >
> > *** START GENOME BUILD CODE ***
> > from pygr import seqdb
> > g = seqdb.BlastDB("/home/baldig/projects/genomics/nonsvn/results/yeast/Ty3/TEST/test_genome.fa")
> > g.__doc__ = "Genome for testing Ty3 read clustering pipeline"
> > import pygr.Data
> > pygr.Data.getResource.addResource("Bio.Seq.Genome.TESTING.test_ty3", g)
> > pygr.Data.save()
> > *** END GENOME BUILD CODE ***
> >
> > *** START DATA CREATION CODE ***
> > import cPickle
> > import pygr.Data
> > from pygr.sequence import Sequence
> >
> > # mimics the data structure I'm using
> > class HTS:
> >     def __init__(self, seqs=[]):
> >         self.seqs = seqs
> >
> > genome = pygr.Data.getResource("Bio.Seq.Genome.TESTING.test_ty3")
> > foo = Sequence('CCCGCCGTAATCGATCGAC', 'foo')
> > b = genome.blast(foo)
> > s,d,e = b[foo].edges()[0]
> >
> > # check that persistent_id
> > d.path.db._persistent_id
> >
> > H = HTS(seqs=[d])
> > x = H.seqs[0]
> >
> > # check that persistent_id, again
> > x.path.db._persistent_id
> >
> > Hs = {1: H}
> > cPickle.dump(Hs, file("test_pickle_seqslice.pkl", "w"))
> >
> > *** END DATA CREATION CODE ***
> >
> > *** START DATA READ CODE ***
> >
> > import cPickle
> > import pygr.Data
> >
> > class HTS:
> >     def __init__(self, seqs=[]):
> >         self.seqs = seqs
> >
> > d = cPickle.load(file("test_pickle_seqslice.pkl"))
> > x = d[1]
> > s = x.seqs[0]
> >
> > # I get <type 'exceptions.AttributeError'>: 'BlastDB' object has no
> > # attribute '_persistent_id'
> > s.path.db._persistent_id
> >
> > *** END DATA READ CODE ***
> >
> > On Jan 28, 3:57 am, Christopher Lee <[email protected]> wrote:
> >> On Jan 28, 2009, at 1:53 PM, Kenny Daily wrote:
> >
> >>> OK. These things make sense.
> >>> However, I think what I'm doing is a
> >>> little more complicated, and I've left out some of the important
> >>> steps that may help explain. First, I'm sure that I'm using the
> >>> pygr.Data object every time, i.e. genome is always set by:
> >
> >>> genome = pygr.Data.getResource("Bio.Seq.Genome.YEAST.sacCer")
> >
> >> Kenny, could you check the c.sequence.path.db._persistent_id on the
> >> case from your example that gives the KeyError? If this attribute is
> >> missing, the data was *not* loaded with a pygr.Data ID. Let me know
> >> what you find.

You received this message because you are subscribed to the Google Groups "pygr-dev" group.
For more options, visit this group at http://groups.google.com/group/pygr-dev?hl=en
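
Note on the "simple option" Chris describes in the thread: Python's standard pickle module has a hook for this kind of indirection. A Pickler subclass can define persistent_id() to store an ID in place of an object, and an Unpickler subclass can define persistent_load() to turn that ID back into the live object. The sketch below illustrates the idea only. It is not pygr.Data's actual API: REGISTRY, add_resource, Genome, ResourcePickler and ResourceUnpickler are hypothetical names, the registry is a plain dict standing in for a resource database, and the code targets Python 3's pickle rather than the cPickle used in the thread.

```python
import io
import pickle

# Hypothetical stand-in for a resource database such as pygr.Data:
# maps a string ID to a live, shared object.
REGISTRY = {}


class Genome:
    """Toy resource that should be stored by reference, not by value."""

    def __init__(self, name):
        self.name = name


def add_resource(resource_id, obj):
    # Tag the object with its ID (the thread's test case checks an
    # attribute of the same name) and register it.
    obj._persistent_id = resource_id
    REGISTRY[resource_id] = obj


class ResourcePickler(pickle.Pickler):
    def persistent_id(self, obj):
        # A non-None return value is written to the pickle in place of
        # the object; None means "pickle this object normally".
        return getattr(obj, "_persistent_id", None)


class ResourceUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolve the stored ID back to the registered object.
        try:
            return REGISTRY[pid]
        except KeyError:
            raise pickle.UnpicklingError("unknown resource ID: %r" % (pid,))


def dumps(obj):
    buf = io.BytesIO()
    ResourcePickler(buf).dump(obj)
    return buf.getvalue()


def loads(data):
    return ResourceUnpickler(io.BytesIO(data)).load()


genome = Genome("toy genome")
add_resource("Bio.Seq.Genome.TESTING.test_ty3", genome)

# The container is pickled normally; the genome is stored only as its ID.
payload = dumps({1: {"seqs": [genome]}})
restored = loads(payload)
```

With this arrangement the container classes need no __getstate__ or __reduce__ methods of their own, which is the modularity point made in the thread: only the pickler and unpickler know about the registry. A plain pickle.loads() on the same payload would fail, because the standard unpickler has no persistent_load defined.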
