In message <[EMAIL PROTECTED]>, "Robert Taylor" writes:
>I need to parse the search string into tokens in the manner that search engines
>would.

Lexical analysis (i.e., tokenization) and parsing are two separate activities. Sometimes you can get away with combining the two, but you'll find you can only do so much with split. Define a regular expression for each of your tokens and consume the input, matching against each in a specified order. In your case, each token appears to be one of \s+ (i.e., the separator, which would be discarded), \S+, or "[^"]+"?. You have to test for the last token first; otherwise a quoted phrase is broken at its internal whitespace and its pieces are misidentified as \S+ tokens. It so happens that you can manage this particular case with split. You appear to have almost gotten there already with:

>Here is what I've tried (but it doesn't cover escaping metacharacters
>which might be in the search string):
>
> /"(.*?)"|(\w+)/

I don't understand where your search string and escaped metacharacters enter the picture. If you need to escape metacharacters in a string, use Perl5Compiler.quotemeta.

I hope that helps.

daniel
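
For illustration, here is a minimal sketch of the token-by-token matching described above. It assumes java.util.regex rather than the ORO classes so that it runs without extra jars; the class name SearchTokenizer and the method tokenize are made up for this example. The quoted-phrase alternative comes first in the pattern, and whitespace is never matched, so it is simply skipped as the separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ORO or of the original poster's code.
class SearchTokenizer {
    // Alternatives are tried left to right, so the quoted-phrase token is
    // tested before the bare-word token. The closing quote is optional,
    // mirroring the "[^"]+"? token definition.
    private static final Pattern TOKEN =
        Pattern.compile("\"([^\"]+)\"?|(\\S+)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Group 1 holds a phrase without its quotes; group 2 holds a bare word.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [foo, bar baz, qux]
        System.out.println(tokenize("foo \"bar baz\" qux"));
    }
}
```

If any part of the search string later has to be embedded literally in another pattern, Pattern.quote plays roughly the role in java.util.regex that Perl5Compiler.quotemeta plays in ORO.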