Spyros Charonis wrote:
> Hello,
>
> I've written a script that scans a biological database and extracts some
> information. A sample of output from my script is as follows:
>
> LYLGILLSHAN AA3R_SHEEP 263 31
>
> LYMGILLSHAN AA3R_HUMAN 264 31
>
> MCLGILLSHAN AA3R_RAT 266 31
>
> LLVGILLSHAN AA3R_RABIT 265 31
>
> The leftmost strings are the ones I want to keep, while I would like to
> get rid of the ones to the right (AA3R_SHEEP, 263 31), which are just
> indicators of where the sequence came from and genomic coordinates. Is
> there any way to do this with a string processing command? The loop which
> builds my list goes like this:
>
> for line in query_lines:
>     if line.startswith('fd;'):  # find motif sequences
>         #print "Found an FD for your query!", line.rstrip().lstrip('fd;')
>         print line.lstrip('fd;')
>         motif.append(line.rstrip().lstrip('fd;'))
>
> Is there a del command I can use to preserve only the actual sequences
> themselves? Many thanks in advance!

You don't have to delete anything; instead, extract the piece you are
interested in:

    with open("prints41_1.kdat") as instream:
        for line in instream:
            if line.startswith("fd;"):
                print line.split()[1]

To see what the last line does, let's perform it in two steps:

    >>> line = 'fd; RVNIENPSRADSYNPRAG A1YQH4_ORYSJ 310 310\n'
    >>> parts = line.split()
    >>> parts
    ['fd;', 'RVNIENPSRADSYNPRAG', 'A1YQH4_ORYSJ', '310', '310']
    >>> wanted = parts[1]
    >>> wanted
    'RVNIENPSRADSYNPRAG'
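
If you want the sequences collected in a list (like the motif list in your loop) rather than printed, the same split-and-index idea can be wrapped in a small function. This is only a sketch: the name extract_motifs, the tag parameter, and the sample lines are made up for illustration, and it assumes every record looks like the 'fd; SEQUENCE ID START END' line shown above.

```python
def extract_motifs(lines, tag="fd;"):
    """Collect the second whitespace-separated field of each line starting with tag.

    Hypothetical helper: assumes records look like 'fd; SEQUENCE ID START END'.
    """
    motifs = []
    for line in lines:
        if line.startswith(tag):
            parts = line.split()   # split on any run of whitespace
            if len(parts) > 1:     # skip a bare tag line with no sequence
                motifs.append(parts[1])
    return motifs


# Made-up sample input in the assumed format.
sample = [
    "fd; LYLGILLSHAN AA3R_SHEEP 263 31\n",
    "fd; LYMGILLSHAN AA3R_HUMAN 264 31\n",
    "gc; not a motif line\n",
]
motifs = extract_motifs(sample)
```

One more reason to prefer split() here: lstrip('fd;') does not remove the prefix 'fd;' as a unit. It strips any leading 'f', 'd', or ';' characters, and it only stops in the right place because a space follows the tag.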