Re: [Tutor] Text Processing Query

Mark Lawrence Thu, 14 Mar 2013 09:35:17 -0700

On 14/03/2013 11:28, taserian wrote:

Top posting fixed


On Thu, Mar 14, 2013 at 6:56 AM, Spyros Charonis wrote:

    Hello Pythoners,

    I am trying to extract certain fields from a file whose text
    looks like this:

    COMPND   2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;
    COMPND   3 CHAIN: A, B;
    COMPND  10 MOL_ID: 2;
    COMPND  11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;
    COMPND  12 CHAIN: D, F;
    COMPND  13 ENGINEERED: YES;
    COMPND  14 MOL_ID: 3;
    COMPND  15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;
    COMPND  16 CHAIN: E, G;

    I would like the chain IDs, but only those following the text
    heading "ANTIBODY FAB FRAGMENT", i.e. I need to create a list with
    D,F,E,G  which excludes A,B which have a non-antibody text heading.
    I am using the following syntax:

    with open(filename) as file:
         scanfile=file.readlines()
         for line in scanfile:
             if line[0:6]=='COMPND' and 'FAB FRAGMENT' in line: continue
             elif line[0:6]=='COMPND' and 'CHAIN' in line:
                 print line


    But this yields:

    COMPND   3 CHAIN: A, B;
    COMPND  12 CHAIN: D, F;
    COMPND  16 CHAIN: E, G;

    I would like to ignore the first line since A,B correspond to
    non-antibody text headings, and instead want to extract only D,F &
    E,G whose text headings are specified as antibody fragments.

    Many thanks,
    Spyros

Since the identifier and the item that you want to keep are on different
lines, you'll need to set a "flag".

with open(filename) as file:
     scanfile=file.readlines()
     flag = 0
     for line in scanfile:
         if line[0:6]=='COMPND' and 'FAB FRAGMENT' in line: flag = 1
         elif line[0:6]=='COMPND' and 'CHAIN' in line and flag = 1:
             print line
             flag = 0


Notice that the flag is set to 1 only on "FAB FRAGMENT", and it's reset
to 0 after the next "CHAIN" line that follows the "FAB FRAGMENT" line.


AR


Notice that this code won't run due to a syntax error: the elif test
says "flag = 1" (an assignment) where it needs "flag == 1" (a
comparison).
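
For what it's worth, here is a rough sketch of the same flag idea with the test written as a plain truth check, collecting the chain IDs into a list as the original question asked. It is only one way to do it: the function name antibody_chains and the sample text are made up for illustration, it is written for Python 3 (print is a function there), and it assumes every CHAIN record looks like "CHAIN: X, Y;" on a single line.

```python
def antibody_chains(lines):
    """Collect chain IDs from CHAIN records that follow a FAB FRAGMENT record."""
    chains = []
    flag = False  # True once a FAB FRAGMENT line has been seen
    for line in lines:
        if not line.startswith('COMPND'):
            continue
        if 'FAB FRAGMENT' in line:
            flag = True
        elif 'CHAIN:' in line and flag:
            # Everything after "CHAIN:" is the ID list, e.g. " D, F;"
            ids = line.split('CHAIN:', 1)[1].strip().rstrip(';')
            chains.extend(part.strip() for part in ids.split(','))
            flag = False  # reset until the next FAB FRAGMENT line
    return chains


# Sample records copied from the original question.
sample = """\
COMPND   2 MOLECULE: POTASSIUM CHANNEL SUBFAMILY K MEMBER 4;
COMPND   3 CHAIN: A, B;
COMPND  10 MOL_ID: 2;
COMPND  11 MOLECULE: ANTIBODY FAB FRAGMENT LIGHT CHAIN;
COMPND  12 CHAIN: D, F;
COMPND  13 ENGINEERED: YES;
COMPND  14 MOL_ID: 3;
COMPND  15 MOLECULE: ANTIBODY FAB FRAGMENT HEAVY CHAIN;
COMPND  16 CHAIN: E, G;
"""

print(antibody_chains(sample.splitlines()))  # ['D', 'F', 'E', 'G']
```

With a real file you could pass the open file object straight in, since iterating over it yields lines; there is no need for readlines().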

--
Cheers.

Mark Lawrence

