Hi everyone,
Can someone please help me with the following Python script? When I run it, I
get this message:
    DeprecationWarning: the sets module is deprecated
      from sets import Set
After googling, I tried what others suggest: changing sets to set, or deleting
the "from sets import Set" line, but neither works.
Can someone also suggest how to modify the code below so that the input file
is read from standard input? I'd like to run it with the Unix command
    script.py < sequence.fna
Thanks a bunch.
#!/usr/local/bin/python
import math
from sets import Set

line = file("sequence.fna", "r")
for x in line:
    if x[0] == ">":
        # determine the length of sequences
        s = line.next()
        s = s.rstrip()
        length = len(s)
        # determine the GC content
        G = s.count('G')
        C = s.count('C')
        GC = 100 * (float(G + C) / length)
        stList = list(s)
        alphabet = list(Set(stList))
        freqList = []
        for symbol in alphabet:
            ctr = 0
            for sym in stList:
                if sym == symbol:
                    ctr += 1
            freqList.append(float(ctr) / len(stList))
        # Shannon entropy
        ent = 0.0
        for freq in freqList:
            ent = ent + freq * math.log(freq, 2)
        ent = -ent
        print x
        print "Length:", length
        print "G+C:", round(GC), "%"
        print 'Shannon entropy:'
        print ent
        print 'Minimum number of bits required to encode each symbol:'
        print int(math.ceil(ent))
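
For reference, here is a minimal sketch of how both changes might look together. It is not a tested replacement for the script above. It uses the built-in set type instead of sets.Set and iterates over sys.stdin instead of opening "sequence.fna". The function names shannon_entropy, report and main are hypothetical, made up for this illustration. It also assumes, as the loop above does, that each sequence sits on the single line right after its ">" header.

```python
import math
import sys


def shannon_entropy(seq):
    """Return the Shannon entropy of seq in bits per symbol."""
    ent = 0.0
    for symbol in set(seq):  # built-in set, so no "from sets import Set"
        freq = float(seq.count(symbol)) / len(seq)
        ent -= freq * math.log(freq, 2)
    return ent


def report(stream):
    """Yield (header, length, gc_percent, entropy) for each ">" record."""
    stream = iter(stream)  # so next() works for lists as well as files
    for x in stream:
        if x.startswith(">"):
            # Assumption: the sequence is the one line after the header.
            s = next(stream, "").rstrip()
            if not s:
                continue  # header without a sequence line; skip it
            gc = 100.0 * (s.count("G") + s.count("C")) / len(s)
            yield x.rstrip(), len(s), gc, shannon_entropy(s)


def main(stream=None):
    """Print the statistics for every record; reads sys.stdin by default."""
    if stream is None:
        stream = sys.stdin
    for header, length, gc, ent in report(stream):
        print(header)
        print("Length: %d" % length)
        print("G+C: %d %%" % round(gc))
        print("Shannon entropy:")
        print(ent)
        print("Minimum number of bits required to encode each symbol:")
        print(int(math.ceil(ent)))
```

Ending the script with a call to main() should let it run as "script.py < sequence.fna", since the shell redirects the file to standard input and the script no longer opens it by name.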