Scott Melnyk wrote:
Hello!

I recently suffered a loss of programming files (and I had been
putting off my backups...)

[snip]

#regular expression to pull out gene, transcript and exon ids

info=re.compile('^(ENSG\d+\.\d).+(ENST\d+\.\d).+(ENSE\d+\.\d)+')
#above is match gene, transcript, then one or more exons


#TFILE = open(sys.argv[1], 'r' ) #read the various transcripts from
WFILE=open(sys.argv[1], 'w') # file to write 2 careful with 'w' will overwrite old info in file
W2FILE=open(sys.argv[2], 'w') #this file will have the names of redundant exons
import sets
def getintersections(fname='Z:\datasets\h35GroupedDec15b.txt'):
    exonSets = {}
    f = open(fname)
    for line in f:
        if line.startswith('ENS'):
            parts = line.split()
            gene = parts[0]
            transcript = parts[1]
            exons = parts[2:]
            exonSets.setdefault(gene, sets.Set(exons)).intersection(sets.Set(exons))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    return exonSets



Hi Scott,

There may be other problems, but here's one thing I noticed:

exonSets.setdefault(gene,
    sets.Set(exons)).intersection(sets.Set(exons))

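intersection() returns a new set and leaves the one stored in exonSets
untouched, so the result is thrown away. To update the stored set in
place, it should be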

exonSets.setdefault(gene,
   sets.Set(exons)).intersection_update(sets.Set(exons))
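
To make the difference concrete, here is a minimal sketch, not your actual script. It assumes the built-in set type in place of sets.Set (both provide intersection() and intersection_update()); the function name common_exons, the in_place flag, and the sample lines are made up for illustration.

```python
# Sketch only: built-in set stands in for the older sets.Set used in the thread.
# common_exons, in_place and the sample data below are hypothetical names/data.

def common_exons(lines, in_place=True):
    """Map each gene to the exons shared by all of its transcripts.

    `lines` is an iterable of 'gene transcript exon...' strings.
    """
    exon_sets = {}
    for line in lines:
        if not line.startswith('ENS'):
            continue
        parts = line.split()
        gene, exons = parts[0], set(parts[2:])
        # setdefault returns the set already stored for this gene,
        # or stores and returns a copy of this line's exons if there is none.
        stored = exon_sets.setdefault(gene, set(exons))
        if in_place:
            stored.intersection_update(exons)  # mutates the stored set
        else:
            stored.intersection(exons)         # builds a new set, which is discarded
    return exon_sets


# Made-up sample data, for illustration only.
sample = [
    'ENSG1.1 ENST1.1 ENSE1.1 ENSE2.1 ENSE3.1',
    'ENSG1.1 ENST2.1 ENSE2.1 ENSE3.1',
]

print(common_exons(sample, in_place=False))  # still holds all exons of the first transcript
print(common_exons(sample, in_place=True))   # holds only the exons common to both
```

With in_place=False the stored set never shrinks past the first transcript's exons, which matches the symptom of the original line.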

Hope that helps.

Rich



_______________________________________________
Tutor maillist  -  [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/tutor
