On 14/03/2013 11:28, taserian wrote:

Top posting fixed

On Thu, Mar 14, 2013 at 6:56 AM, Spyros Charonis <s.charo...@gmail.com> wrote:

Hello Pythoners,

I am trying to extract certain fields from a file whose text looks like this:

    COMPND 2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;
    COMPND 3 CHAIN: A, B;
    COMPND 10 MOL_ID: 2;
    COMPND 11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;
    COMPND 12 CHAIN: D, F;
    COMPND 13 ENGINEERED: YES;
    COMPND 14 MOL_ID: 3;
    COMPND 15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;
    COMPND 16 CHAIN: E, G;

I would like the chain IDs, but only those following the text heading "ANTIBODY FAB FRAGMENT", i.e. I need to create a list with D, F, E, G that excludes A, B, which have a non-antibody text heading.

I am using the following syntax:

    with open(filename) as file:
        scanfile=file.readlines()
        for line in scanfile:
            if line[0:6]=='COMPND' and 'FAB FRAGMENT' in line:
                continue
            elif line[0:6]=='COMPND' and 'CHAIN' in line:
                print line

But this yields:

    COMPND 3 CHAIN: A, B;
    COMPND 12 CHAIN: D, F;
    COMPND 16 CHAIN: E, G;

I would like to ignore the first line, since A, B correspond to non-antibody text headings, and instead extract only D, F and E, G, whose text headings are specified as antibody fragments.

Many thanks,
Spyros

Since the identifier and the item that you want to keep are on different lines, you'll need to set a "flag".

    with open(filename) as file:
        scanfile=file.readlines()
        flag = 0
        for line in scanfile:
            if line[0:6]=='COMPND' and 'FAB FRAGMENT' in line:
                flag = 1
            elif line[0:6]=='COMPND' and 'CHAIN' in line and flag = 1:
                print line
                flag = 0

Notice that the flag is set to 1 only on "FAB FRAGMENT", and it's reset to 0 after the next "CHAIN" line that follows the "FAB FRAGMENT" line.

AR

Notice that this code won't run due to a syntax error: the elif condition uses "flag = 1", which is an assignment, where the comparison "flag == 1" is needed.

--
Cheers.

Mark Lawrence

_______________________________________________
Tutor maillist - Tutor@python.org
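
For reference, here is a minimal sketch of the flag approach from this thread with the comparison repaired, collecting the chain IDs into a list as the original question asked. The function name antibody_chains and the inline sample list are hypothetical and only for illustration; the sketch also matches on "CHAIN:" (with the colon) and assumes the IDs are comma-separated and terminated by a semicolon, as in the sample shown in the question.

```python
def antibody_chains(lines, heading="FAB FRAGMENT"):
    """Return chain IDs from CHAIN records that follow a matching heading.

    Hypothetical helper: 'lines' is any iterable of COMPND-style strings.
    """
    chains = []
    flag = False  # True once a line containing the heading has been seen
    for line in lines:
        if not line.startswith("COMPND"):
            continue
        if heading in line:
            flag = True
        elif "CHAIN:" in line and flag:
            # Keep the text after "CHAIN:", drop the trailing ";" and split on commas.
            ids = line.split("CHAIN:", 1)[1].strip().rstrip(";")
            chains.extend(c.strip() for c in ids.split(",") if c.strip())
            flag = False  # reset until the next matching heading
    return chains


# Sample records copied from the question.
sample = [
    "COMPND 2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;",
    "COMPND 3 CHAIN: A, B;",
    "COMPND 10 MOL_ID: 2;",
    "COMPND 11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;",
    "COMPND 12 CHAIN: D, F;",
    "COMPND 13 ENGINEERED: YES;",
    "COMPND 14 MOL_ID: 3;",
    "COMPND 15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;",
    "COMPND 16 CHAIN: E, G;",
]

print(antibody_chains(sample))  # ['D', 'F', 'E', 'G']
```

To run it against a real file, something like antibody_chains(open(filename)) should work, since a file object yields its lines one at a time and the trailing newline is stripped before the IDs are split.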