>Here is all-in-one-line string that I want to parse:
>$string = "biological process|mitosis|IEA|GO:0007067|MGD|na|biological
>process|cell cycle|IEA|GO:0007049|MGD|na|cellular
>component|intracellular|IEA|GO:0005622|MGD|na|molecular function|protein
>tyrosine phosphatase|IEA|GO:0004725|MGD|na|biological process|M phase of
>mitotic cell cycle|IEA|GO:0000087|MGD|na|biological process|protein amino
>acid dephosphorylation|IEA|GO:0006470|MGD|na";
>
>I want to extract recursively all words delimited by two pipes ( | ) that
>follow a specific pattern like "process" or "component". For example, I want
>mitosis, cell cycle, M phase of mitotic cell cycles, and protein amino acid
>dephosphorylation, extracted that are associated with the pattern "process".
>I tried several combinations of the following code but it gets only one word
>or everything or nothing at all.
>
>while ($string =~ /process(\S+(?!\|)(\s\S+)*)/g) {
> print "\tbiological process\t$1\n";
>}
The regex above has only two capture groups, so each match returns just two things, in $1 and $2. If you want to get at something that has a delimiter, splitting the string into an array is a good idea, e.g.

    my @Temp = split(/\|/, $string);

(The pipe has to be escaped; a bare | means alternation in a regex.) Then you can process each element with a foreach loop.

I suggest this because I can't make out which particular word and relative position interests you: you say 'all' in one sentence, then in the next you give an example that relates to seemingly random positions (maybe it's me instead ;).

Does $string really have newlines in it?
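For what it's worth, here is a rough sketch of the split-and-loop idea. It assumes $string really is a single line with no embedded newlines, and that the term you want is always the field immediately after a field ending in "process"; that layout is my guess from your example, not something I can confirm. The names @fields and @terms are just made up for illustration. Note the escaped pipe in the split pattern.

```perl
use strict;
use warnings;

# The string from the question, assumed to be one line (no embedded newlines).
my $string = join '|',
    'biological process|mitosis|IEA|GO:0007067|MGD|na',
    'biological process|cell cycle|IEA|GO:0007049|MGD|na',
    'cellular component|intracellular|IEA|GO:0005622|MGD|na',
    'molecular function|protein tyrosine phosphatase|IEA|GO:0004725|MGD|na',
    'biological process|M phase of mitotic cell cycle|IEA|GO:0000087|MGD|na',
    'biological process|protein amino acid dephosphorylation|IEA|GO:0006470|MGD|na';

# Escape the pipe: an unescaped | is alternation in a regex.
my @fields = split /\|/, $string;

# Assumption: a field ending in "process" is followed by the term of interest.
my @terms;
for my $i (0 .. $#fields - 1) {
    push @terms, $fields[$i + 1] if $fields[$i] =~ /process$/;
}

print "\tbiological process\t$_\n" for @terms;
```

Swapping /process$/ for /component$/ should pick out the "cellular component" terms the same way, if the layout assumption holds.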