>Here is all-in-one-line string that I want to parse:
>$string = "biological process|mitosis|IEA|GO:0007067|MGD|na|biological
>process|cell cycle|IEA|GO:0007049|MGD|na|cellular
>component|intracellular|IEA|GO:0005622|MGD|na|molecular function|protein
>tyrosine phosphatase|IEA|GO:0004725|MGD|na|biological process|M phase of
>mitotic cell cycle|IEA|GO:0000087|MGD|na|biological process|protein amino
>acid dephosphorylation|IEA|GO:0006470|MGD|na";
>
>I want to extract recursively all words delimited by two pipes ( | ) that
>follow a specific pattern like "process" or "component". For example, I want
>mitosis, cell cycle, M phase of mitotic cell cycles, and protein amino acid
>dephosphorylation, extracted that are associated with the pattern "process".
>I tried several combinations of the following code but it gets only one word
>or everything or nothing at all.
>
>while ($string  =~  /process(\S+(?!\|)(\s\S+)*)/g) { 
>   print "\tbiological process\t$1\n"; 
>}

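The pattern above has only two capture groups, so each match gives you
just $1 and $2. Note also that \S+ matches the pipe characters as well,
so it runs straight past the delimiters instead of stopping at them.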
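If you would rather stay with a regex, here is a minimal sketch of one way
to stop at the delimiter. It assumes the term you want is always the field
immediately after the one containing "process"; $sample is a shortened,
made-up-for-illustration copy of your string, not the real data.

```perl
use strict;
use warnings;

# Shortened sample (an assumption for illustration): three records only.
my $sample =
      'biological process|mitosis|IEA|GO:0007067|MGD|na|'
    . 'cellular component|intracellular|IEA|GO:0005622|MGD|na|'
    . 'biological process|cell cycle|IEA|GO:0007049|MGD|na';

# [^|]+ matches everything up to, but not including, the next pipe.
# In list context, /g returns every captured group from every match.
my @found = $sample =~ /process\|([^|]+)/g;

print "\tbiological process\t$_\n" for @found;
```

The negated character class [^|] is what keeps the capture from running
past the pipe, which \S+ cannot do.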

If you want to get at something that has a delimiter, splitting the string
into an array is a good idea. The pipe has to be escaped, because a bare |
means alternation in a regex, e.g.

my @Temp = split(/\|/, $string);

Then you can process each element with a foreach loop.
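As a rough sketch of that approach, and only under the assumption that
every record has exactly six pipe-separated fields (category, term, and
four more) and that $string has no embedded newlines, the loop could look
like this. The names @fields and @terms are just illustrative.

```perl
use strict;
use warnings;

# The string from the question, assumed here to be on a single line.
my $string =
      'biological process|mitosis|IEA|GO:0007067|MGD|na|'
    . 'biological process|cell cycle|IEA|GO:0007049|MGD|na|'
    . 'cellular component|intracellular|IEA|GO:0005622|MGD|na|'
    . 'molecular function|protein tyrosine phosphatase|IEA|GO:0004725|MGD|na|'
    . 'biological process|M phase of mitotic cell cycle|IEA|GO:0000087|MGD|na|'
    . 'biological process|protein amino acid dephosphorylation|IEA|GO:0006470|MGD|na';

# Escape the pipe so it is treated as a literal delimiter.
my @fields = split /\|/, $string;

my @terms;
# Assumption: records are six fields wide; field 0 is the category,
# field 1 is the term we want.
for (my $i = 0; $i + 1 < @fields; $i += 6) {
    my ($category, $term) = @fields[$i, $i + 1];
    push @terms, $term if $category =~ /process/;
}

print "\tbiological process\t$_\n" for @terms;
```

If the record width is not fixed, this stepping by six will not work and
you would need another way to tell where each record starts.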

I suggest this because I can't make out which particular
word, and which relative position, interests you: you say 'all'
in one sentence, then in the next you give an example
that relates to seemingly random positions (maybe it's
me instead ;).

Does $string really have newlines in it?