Scott Melnyk wrote:
Hello!
I recently suffered a loss of programming files (and I had been
putting off my backups...)
[snip]
#regular expression to pull out gene, transcript and exon ids
info=re.compile('^(ENSG\d+\.\d).+(ENST\d+\.\d).+(ENSE\d+\.\d)+')
#above is match gene, transcript, then one or more exons
#TFILE = open(sys.argv[1], 'r' ) #read the various transcripts from
WFILE=open(sys.argv[1], 'w') # file to write to; careful, 'w' will overwrite old info in file
W2FILE=open(sys.argv[2], 'w') #this file will have the names of redundant exons
import sets
def getintersections(fname='Z:\datasets\h35GroupedDec15b.txt'):
    exonSets = {}
    f = open(fname)
    for line in f:
        if line.startswith('ENS'):
            parts = line.split()
            gene = parts[0]
            transcript = parts[1]
            exons = parts[2:]
            exonSets.setdefault(gene,
                sets.Set(exons)).intersection(sets.Set(exons))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    return exonSets
Hi Scott,
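There may be other problems, but here's one thing I noticed:
    exonSets.setdefault(gene,
        sets.Set(exons)).intersection(sets.Set(exons))
should be
    exonSets.setdefault(gene,
        sets.Set(exons)).intersection_update(sets.Set(exons))
intersection() returns a new set and leaves the set stored in exonSets
unchanged, so the result is thrown away. intersection_update() modifies
the stored set in place.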
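Here is a minimal sketch of the difference, in case it helps. It assumes the built-in set type rather than the sets module, and it reads from a list of lines instead of your file. The name get_intersections and the sample IDs are made up for illustration.

```python
def get_intersections(lines):
    """Map each gene ID to the exons shared by all of its transcripts."""
    exon_sets = {}
    for line in lines:
        if line.startswith('ENS'):
            parts = line.split()
            gene = parts[0]
            exons = set(parts[2:])  # parts[1] is the transcript ID
            # setdefault stores a copy of the first transcript's exons;
            # intersection_update then narrows that stored set in place.
            exon_sets.setdefault(gene, set(exons)).intersection_update(exons)
    return exon_sets


# Made-up sample lines: gene, transcript, then exon IDs.
sample = [
    'ENSG1.1 ENST1.1 ENSE1.1 ENSE2.1 ENSE3.1',
    'ENSG1.1 ENST2.1 ENSE2.1 ENSE3.1 ENSE4.1',
]
common = get_intersections(sample)

# For contrast: intersection() builds a new set and leaves 'a' untouched.
a = set(['x', 'y'])
b = a.intersection(set(['y']))
```

With the sample above, common should hold only the two exons that appear in both transcripts, while a keeps both of its elements after the intersection() call.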
Hope that helps.
Rich
_______________________________________________
Tutor maillist - [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/tutor