Re: problems using sequence slices as index to NLMSA

Kenny Daily Wed, 28 Jan 2009 11:41:04 -0800

OK, so I re-ran my scripts, double-checking everywhere that I used the
genome from pygr.Data, and it still fails. So I worked up a test case;
data and code follow. Before pickling, the _persistent_id attribute is
there. After pickling, it's not.


*** START test_genome.fa ***
>test
ACGCAGACTGACCTACGATCAAATAAGCCGAGCTAGCAAGCCCGCCGTAATCGATCGACGTACGTCGATCGATCGACCCC
GAATAGACTCCGATAAGCGTAGTGTATATAGCGCGCCCGTATATAGGATGAGAAGAATATAAAGCTCCTCTCGAGATCGA
*** END test_genome.fa ***

*** START GENOME BUILD CODE ***
from pygr import seqdb
g = seqdb.BlastDB("/home/baldig/projects/genomics/nonsvn/results/yeast/Ty3/TEST/test_genome.fa")
g.__doc__ = "Genome for testing Ty3 read clustering pipeline"
import pygr.Data
pygr.Data.getResource.addResource("Bio.Seq.Genome.TESTING.test_ty3",
g)
pygr.Data.save()
*** END GENOME BUILD CODE ***

*** START DATA CREATION CODE ***
import cPickle
import pygr.Data
from pygr.sequence import Sequence

# mimics the data structure I'm using
class HTS:
    def __init__(self, seqs=None):
        # avoid sharing one default list across instances
        self.seqs = seqs if seqs is not None else []

genome = pygr.Data.getResource("Bio.Seq.Genome.TESTING.test_ty3")
foo = Sequence('CCCGCCGTAATCGATCGAC', 'foo')
b = genome.blast(foo)
s,d,e = b[foo].edges()[0]

# check that persistent_id
d.path.db._persistent_id

H = HTS(seqs=[d])
x = H.seqs[0]

# check that persistent_id, again
x.path.db._persistent_id

Hs = {1: H}
cPickle.dump(Hs, file("test_pickle_seqslice.pkl", "w"))

*** END DATA CREATION CODE ***
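
For contrast with the plain cPickle.dump call above, here is a minimal
sketch of the standard library's persistent_id hook, which lets a pickler
write an external ID in place of the object itself. This is not pygr.Data's
code, and whether pygr.Data works exactly this way is an assumption; ToyDB,
REGISTRY, RefPickler and RefUnpickler are hypothetical names, and the sketch
targets Python 3's pickle module rather than cPickle.

```python
import io
import pickle


class ToyDB:
    """Hypothetical stand-in for a database object registered under an ID."""

    def __init__(self, name):
        self.name = name


# Hypothetical registry mapping resource IDs to live objects.
RESOURCE_ID = "Bio.Seq.Genome.TESTING.test_ty3"
REGISTRY = {RESOURCE_ID: ToyDB("test genome")}


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Objects carrying _persistent_id are saved as a reference only;
        # returning None tells pickle to serialize the object normally.
        return getattr(obj, "_persistent_id", None)


class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolve the stored ID back to the registered object.
        return REGISTRY[pid]


db = REGISTRY[RESOURCE_ID]
db._persistent_id = RESOURCE_ID

buf = io.BytesIO()
RefPickler(buf).dump({"db": db})
buf.seek(0)
restored = RefUnpickler(buf).load()

# The loaded value is the registered object itself, ID attribute included.
same_object = restored["db"] is db
```

A plain dump/load pair has no such hook, so any object reached through the
pickled structure is serialized by value using whatever state its class
chooses to save.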


*** START DATA READ CODE ***

import cPickle
import pygr.Data

class HTS:
    def __init__(self, seqs=None):
        # avoid sharing one default list across instances
        self.seqs = seqs if seqs is not None else []

d = cPickle.load(file("test_pickle_seqslice.pkl"))
x = d[1]
s = x.seqs[0]

# I get <type 'exceptions.AttributeError'>: 'BlastDB' object has no
# attribute '_persistent_id'
s.path.db._persistent_id

*** END DATA READ CODE ***
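
One way an attribute can vanish across a pickle round trip, sketched with a
toy class: if a class defines __getstate__ and saves only a fixed set of
attributes, anything attached to the instance afterwards is not written to
the pickle. Whether BlastDB behaves this way is an assumption, not something
confirmed in this thread; ToyDB and _pickle_attrs are hypothetical names,
and the sketch uses Python 3's pickle module.

```python
import pickle


class ToyDB:
    """Hypothetical class that pickles only a fixed set of attributes."""

    _pickle_attrs = ("filepath",)

    def __init__(self, filepath):
        self.filepath = filepath

    def __getstate__(self):
        # Only attributes named in _pickle_attrs are written to the pickle.
        return {name: getattr(self, name) for name in self._pickle_attrs}

    def __setstate__(self, state):
        self.__dict__.update(state)


db = ToyDB("test_genome.fa")
# Attached after construction, the way a resource ID might be tagged on.
db._persistent_id = "Bio.Seq.Genome.TESTING.test_ty3"

restored = pickle.loads(pickle.dumps(db))

has_before = hasattr(db, "_persistent_id")
has_after = hasattr(restored, "_persistent_id")
```

Here has_before is true and has_after is false, which matches the symptom in
the test case above: the attribute exists on the original object but not on
the unpickled copy.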


On Jan 28, 3:57 am, Christopher Lee <[email protected]> wrote:
> On Jan 28, 2009, at 1:53 PM, Kenny Daily wrote:
>
>
>
> > OK. These things make sense. However, I think what I'm doing is a
> > little more complicated, and I've left out some of the important steps
> > that may help explain. First, I'm sure that I'm using the pygr.Data
> > object every time, i.e. genome is always set by:
>
> > genome = pygr.Data.getResource("Bio.Seq.Genome.YEAST.sacCer")
>
> Kenny, could you check the c.sequence.path.db._persistent_id on the  
> case from your example that gives the KeyError?  If this attribute is  
> missing, the data was *not* loaded with a pygr.Data ID.  Let me know  
> what you find.
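
The check described above can be made without triggering an AttributeError
by using getattr with a default. This is a small sketch only;
persistent_id_or_none and ToyDB are hypothetical names, not part of pygr.

```python
def persistent_id_or_none(db):
    """Return db._persistent_id if it is set, otherwise None."""
    return getattr(db, "_persistent_id", None)


class ToyDB:
    """Hypothetical stand-in for a sequence database object."""


tagged = ToyDB()
tagged._persistent_id = "Bio.Seq.Genome.TESTING.test_ty3"
untagged = ToyDB()

tagged_id = persistent_id_or_none(tagged)
untagged_id = persistent_id_or_none(untagged)
```

With a real slice, the call would be persistent_id_or_none(s.path.db), where
None indicates the database object carries no ID.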