On 14 March 2013 10:56, Spyros Charonis <s.charo...@gmail.com> wrote:
> Hello Pythoners,
>
> I am trying to extract certain fields from a file whose text looks like
> this:
>
> COMPND 2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;
> COMPND 3 CHAIN: A, B;
> COMPND 10 MOL_ID: 2;
> COMPND 11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;
> COMPND 12 CHAIN: D, F;
> COMPND 13 ENGINEERED: YES;
> COMPND 14 MOL_ID: 3;
> COMPND 15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;
> COMPND 16 CHAIN: E, G;
>
> I would like the chain IDs, but only those following the text heading
> "ANTIBODY FAB FRAGMENT", i.e. I need to create a list with D,F,E,G which
> excludes A,B which have a non-antibody text heading. I am using the
> following syntax:
>
> with open(filename) as file:
>     scanfile=file.readlines()
>     for line in scanfile:
>         if line[0:6]=='COMPND' and 'FAB FRAGMENT' in line: continue
>         elif line[0:6]=='COMPND' and 'CHAIN' in line:
>             print line
>
> But this yields:
>
> COMPND 3 CHAIN: A, B;
> COMPND 12 CHAIN: D, F;
> COMPND 16 CHAIN: E, G;
>
> I would like to ignore the first line since A,B correspond to non-antibody
> text headings, and instead want to extract only D,F & E,G whose text
> headings are specified as antibody fragments.
>
> Many thanks,
> Spyros

This is how I would do it.

    with open(filename) as file:
        scanfile = file.readlines()

    wanted = "CHAIN:"
    unwanted = [" A", " B"]

    for line in scanfile:
        # Print a CHAIN line once, and only if it contains none of the
        # unwanted chain IDs.
        if wanted in line and not any(item in line for item in unwanted):
            print(line)

HTH,
Bodsda
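
For what it is worth, here is a minimal sketch of an alternative that does not hard-code the unwanted chain IDs. It remembers whether the most recent MOLECULE record contained the wanted heading and only collects the IDs from a CHAIN record when it did. The names `antibody_chains`, `heading` and `sample` are hypothetical, made up for this illustration, and the sketch assumes each CHAIN record fits on a single COMPND line, as in the excerpt in the question.

```python
def antibody_chains(lines, heading="ANTIBODY FAB FRAGMENT"):
    """Collect chain IDs from CHAIN records whose preceding MOLECULE
    record contains `heading`."""
    chains = []
    in_wanted = False  # True while the latest MOLECULE record matched
    for line in lines:
        if not line.startswith("COMPND"):
            continue
        if "MOLECULE:" in line:
            # A new molecule starts: remember whether it is one we want.
            in_wanted = heading in line
        elif "CHAIN:" in line and in_wanted:
            # Take the text after "CHAIN:", drop the ";", split on commas.
            ids = line.split("CHAIN:", 1)[1].replace(";", "")
            chains.extend(c.strip() for c in ids.split(",") if c.strip())
    return chains


# Sample records copied from the question.
sample = [
    "COMPND 2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;",
    "COMPND 3 CHAIN: A, B;",
    "COMPND 10 MOL_ID: 2;",
    "COMPND 11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;",
    "COMPND 12 CHAIN: D, F;",
    "COMPND 13 ENGINEERED: YES;",
    "COMPND 14 MOL_ID: 3;",
    "COMPND 15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;",
    "COMPND 16 CHAIN: E, G;",
]

print(antibody_chains(sample))  # ['D', 'F', 'E', 'G']
```

With a real file, the same function can be fed the open file object directly, e.g. `antibody_chains(open(filename))`, since it only iterates over lines.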