I don't have code to do what I want, but here are the pieces I'm trying to string together:

Abbreviation dictionary consists of a file like this:

SRED.   SREDNE
SEV.    SEVERN
etc.

Each abbreviation is turned into four regexes, like this (doubtless they could be made more efficient, but they work well enough at present):

# Sred. = SREDNE
$cgname =~ s/^SRED\.(?=[\W\s\-\d]+)/SREDNE:/g ;          # Match it at beginning of line
$cgname =~ s/[\W\s\-]+SRED\.(?=[\W\s\-\d]+)/:SREDNE:/g ; # Match it within the line
$cgname =~ s/[\W\s\-]+SRED\.$/:SREDNE:/g ;               # Match it at end of line
$cgname =~ s/^SRED\.$/:SREDNE:/g ;                       # Match if it begins & ends line
# Sev. = SEVERN
$cgname =~ s/^SEV\.(?=[\W\s\-\d]+)/SEVERN:/g ;           # Match it at beginning of line
$cgname =~ s/[\W\s\-]+SEV\.(?=[\W\s\-\d]+)/:SEVERN:/g ;  # Match it within the line
$cgname =~ s/[\W\s\-]+SEV\.$/:SEVERN:/g ;                # Match it at end of line
$cgname =~ s/^SEV\.$/:SEVERN:/g ;                        # Match if it begins & ends line
etc.

Right now I'm generating the regexes in a standalone script, then inserting the output code into the subroutine that processes names into a "matchable" form. What I'd like to be able to do is take a *set* of abbreviation "dictionaries," concatenate them together and dynamically generate the regex code in the routine that is going to execute it.

Thanks,

Scott

Scott E. Robinson
SWAT Team
UTC Onsite User Support
RR-690 -- 281-654-5169
EMB-2813N -- 713-656-3629


"David Kirol" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
cc:
Subject: Re: There has to be a way to do this
06/20/03 08:38 PM

Scott,
Sounds like a fun problem. Can you post some code and an (abbreviated) set of example data?
David

"Scott E Robinson" <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> I'm still working on the well-name matching program that I've brought up
> here before. I've received invaluable help to solve the toughest questions
> in its development, for which I'm very grateful.
>
> Now I'm trying to automate some steps which were previously manual in the
> process, to make it more end-user-friendly.
> There has to be a way to do this with Perl.
>
> The script uses a "dictionary" of abbreviations to aid its matching. The
> abbreviations are implemented as a series of substitutions with the "s"
> operator. I have a Perl script which builds the substitution statements
> from a tab-delimited list of abbreviations and their equivalent long forms.
> I then manually insert these statements into the subroutine that uses them.
>
> I kept the abbreviation translation hardcoded into the subroutine for
> performance reasons (this thing compares 14,000 unknown well names against
> 680,000 match candidates). Is there a way in Perl to read the abbreviation
> dictionary (the tab-delimited list), generate the code, insert it into the
> right subroutine, and start executing the program, all in one script?
> (Maybe you can tell me that the performance hit from using variables in the
> substitution statements is negligible, and if so, I'd be happy to go that
> route.)
>
> Thanks in advance,
>
> Scott
>
> Scott E. Robinson
> Data SWAT Team
> UTC Onsite User Support
> RR-690 -- 281-654-5169
> EMB-2813N -- 713-656-3629
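
For what it's worth, here is one way the "read the dictionaries, generate the code, and run it, all in one script" idea could look. It is only a sketch, not something tested against the real program. It builds the text of an anonymous subroutine containing the same four substitutions per abbreviation shown above, then compiles that text once with a string eval. The name build_expander, the in-memory dictionary strings, and the sample well names are all made up for illustration; the real script would presumably read the tab-delimited files from disk instead.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for two dictionary files, each holding
# "ABBREVIATION<TAB>LONG FORM" pairs, one per line.
my @dictionaries = ( "SRED.\tSREDNE\n", "SEV.\tSEVERN\n" );

# Concatenate the dictionaries, write out the four substitutions for
# each abbreviation as Perl source text, and compile that text into
# an anonymous subroutine.
sub build_expander {
    my $dict_text = join "\n", @_;
    my $code = "sub {\n    my (\$cgname) = \@_;\n";
    for my $line ( split /\n/, $dict_text ) {
        $line =~ s/\r$//;                  # tolerate DOS line endings
        next unless $line =~ /\S/;         # skip blank lines
        my ( $abbr, $long ) = split /\t/, $line, 2;
        next unless defined $long;         # skip lines without a tab
        my $pat = quotemeta $abbr;         # "SRED." becomes "SRED\."
        my $rep = quotemeta $long;         # keep $, @ and / literal
        $code .= "    \$cgname =~ s/^${pat}(?=[\\W\\s\\-\\d]+)/${rep}:/g;\n";
        $code .= "    \$cgname =~ s/[\\W\\s\\-]+${pat}(?=[\\W\\s\\-\\d]+)/:${rep}:/g;\n";
        $code .= "    \$cgname =~ s/[\\W\\s\\-]+${pat}\$/:${rep}:/g;\n";
        $code .= "    \$cgname =~ s/^${pat}\$/:${rep}:/g;\n";
    }
    $code .= "    return \$cgname;\n}\n";
    my $sub = eval $code;                  # compiled once, at startup
    die "Could not compile generated code: $@" unless $sub;
    return $sub;
}

my $expand = build_expander(@dictionaries);
print $expand->("SRED. KAMENNOE 12"), "\n";    # prints "SREDNE: KAMENNOE 12"
```

Because the eval runs once at startup and the generated subroutine contains only literal patterns, it should behave much like the hand-pasted version when called in the comparison loop, though that is worth measuring on the real data before relying on it.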