At 23:11 27.11.2002, Ernest E Vogelsinger said:
>If I understand you correctly you want to isolate either quoted strings
>(with or without whitespace), or tokens separated by whitespace, as array
>For this you would first have to isolate the first quoted sentence, then
>tokenize the part before, and loop this as long you're not done.
>Should work something like that:
> [...]
>Disclaimer: untested as usual. _Should_ behave like this:

I _should_ have tested. This script actually works the way you expect it to:


function tokenize_search($input) {
    $re = '/\s*(.*?)\s*"\s*([^"]*?)\s*"\s*(.*)/s';
    /* look for 3 groups:
       a - prematch - anything up to the first quote
       b - match - anything until the next quote
       c - postmatch - rest of the string
    */
    $tokens = array();
    while (preg_match($re, $input, $aresult)) {
        // aresult contains: [0]-total [1]-a [2]-b [3]-c
        // tokenize the prematch
        if ($aresult[1]) $tokens = array_merge($tokens, explode(' ', $aresult[1]));
        array_push($tokens, $aresult[2]);
        $input = $aresult[3];
    }
    // $input has the rest of the line
    if ($input) $tokens = array_merge($tokens, explode(' ', $input));
    return $tokens;
}

$string = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
$tokens = tokenize_search($string);
print_r($tokens);


The output of this script is:

Array (
    [0] => search for this sentence
    [1] => -NotForThisWord
    [2] => ButDefinitelyForThisWord
)
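
If you would rather avoid the loop, the same idea (quoted phrases first, whitespace-separated words otherwise) can probably be done in one pass with preg_match_all. This is only a sketch: the function name tokenize_search_single_pass and its regex are my own illustration, not part of the script above, and unlike the loop version it skips runs of several spaces and silently drops an unbalanced quote.

```php
<?php
// Hypothetical single-pass variant; name and regex are illustrative assumptions.
function tokenize_search_single_pass($input) {
    // Branch 1: a double-quoted phrase, inner text captured in group 1
    //           (surrounding whitespace inside the quotes is trimmed).
    // Branch 2: a run of characters that are neither whitespace nor a quote,
    //           captured in group 2.
    $re = '/"\s*([^"]*?)\s*"|([^\s"]+)/';
    preg_match_all($re, $input, $matches, PREG_SET_ORDER);

    $tokens = array();
    foreach ($matches as $m) {
        // With PREG_SET_ORDER, index 2 is only present when branch 2 matched.
        $tokens[] = isset($m[2]) ? $m[2] : $m[1];
    }
    return $tokens;
}

$string = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
print_r(tokenize_search_single_pass($string));
```

For the test string this should give the same three tokens as the loop version.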

