Re: [PHP] Re: [PHP-DEV] Parse search string a la Google (Regular expression?)

2002-11-28 Thread Benny Rasmussen
Works like a charm, thank you!

Benny

"Ernest E Vogelsinger" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
> At 23:11 27.11.2002, Ernest E Vogelsinger said:
> [snip]
> >If I understand you correctly you want to isolate either quoted strings
> >(with or without whitespace), or tokens separated by whitespace, as array
> >elements?
> >
> >For this you would first have to isolate the first quoted sentence, then
> >tokenize the part before, and loop this as long as you're not done.
> >
> >Should work something like this:
> >
> > [...]
> >Disclaimer: untested as usual. _Should_ behave like this:
> [snip]
>
> I _should_ have tested. This script actually works the way you expect it
> to:
>
> <?php
>
> function tokenize_search($input) {
>     $re = '/\s*(.*?)\s*"\s*([^"]*?)\s*"\s*(.*)/s';
>     /* look for 3 groups:
>        a - prematch  - anything up to the first quote
>        b - match     - anything until the next quote
>        c - postmatch - rest of the string
>     */
>     $tokens = array();
>     while (preg_match($re, $input, $aresult)) {
>         // aresult contains: [0]-total [1]-a [2]-b [3]-c
>         // tokenize the prematch
>         if ($aresult[1])
>             $tokens = array_merge($tokens, explode(' ', $aresult[1]));
>         array_push($tokens, $aresult[2]);
>         $input = $aresult[3];
>     }
>     // $input has the rest of the line
>     if ($input) $tokens = array_merge($tokens, explode(' ', $input));
>     return $tokens;
> }
>
> $string = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
> $tokens = tokenize_search($string);
> print_r($tokens);
>
> ?>
> 
>
> The output of this script is:
>
> Array (
> [0] => search for this sentence
> [1] => -NotForThisWord
> [2] => ButDefinitelyForThisWord
> )
>
>
> --
>    >O Ernest E. Vogelsinger
>    (\)ICQ #13394035
> ^ http://www.vogelsinger.at/
>
>



-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php




Re: [PHP] Re: [PHP-DEV] Parse search string a la Google (Regular expression?)

2002-11-27 Thread Ernest E Vogelsinger
At 23:11 27.11.2002, Ernest E Vogelsinger said:
[snip]
>If I understand you correctly you want to isolate either quoted strings
>(with or without whitespace), or tokens separated by whitespace, as array
>elements?
>
>For this you would first have to isolate the first quoted sentence, then
>tokenize the part before, and loop this as long as you're not done.
>
>Should work something like this:
>
> [...]
>Disclaimer: untested as usual. _Should_ behave like this:
[snip] 

I _should_ have tested. This script actually works the way you expect it to:

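<?php

function tokenize_search($input) {
    $re = '/\s*(.*?)\s*"\s*([^"]*?)\s*"\s*(.*)/s';
    /* look for 3 groups:
       a - prematch  - anything up to the first quote
       b - match     - anything until the next quote
       c - postmatch - rest of the string
    */
    $tokens = array();
    while (preg_match($re, $input, $aresult)) {
        // aresult contains: [0]-total [1]-a [2]-b [3]-c
        // tokenize the prematch
        if ($aresult[1])
            $tokens = array_merge($tokens, explode(' ', $aresult[1]));
        array_push($tokens, $aresult[2]);
        $input = $aresult[3];
    }
    // $input has the rest of the line
    if ($input) $tokens = array_merge($tokens, explode(' ', $input));
    return $tokens;
}

$string = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
$tokens = tokenize_search($string);
print_r($tokens);

?>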

The output of this script is:

Array (
[0] => search for this sentence
[1] => -NotForThisWord
[2] => ButDefinitelyForThisWord
) 
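
For readers comparing the two versions of tokenize_search() in this thread: the earlier one passes the result of explode() to array_push(), the later one to array_merge(). A minimal sketch of the difference follows; the sample values are made up for illustration only and are not from the original posts.

```php
<?php
// Hypothetical sample data, only to contrast the two calls.

// array_push() appends its argument as ONE element, so an exploded
// array ends up nested inside the token list.
$pushed = array('a');
array_push($pushed, explode(' ', 'b c'));
// $pushed is now array('a', array('b', 'c'))

// array_merge() appends each piece individually, giving a flat list.
$merged = array('a');
$merged = array_merge($merged, explode(' ', 'b c'));
// $merged is now array('a', 'b', 'c')

print_r($pushed);
print_r($merged);
```

The flat list is what the expected output in this thread shows, which is presumably why the working version merges instead of pushing.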


-- 
   >O Ernest E. Vogelsinger
   (\)ICQ #13394035
^ http://www.vogelsinger.at/







Re: [PHP] Re: [PHP-DEV] Parse search string a la Google (Regular expression?)

2002-11-27 Thread Ernest E Vogelsinger
>Benny Rasmussen wrote:
>> Hi,
>>
>> In my application I would like to offer a search interface like Google
>> and other popular search engines. The complication for me is to explode
>> the search string into proper array elements, like this:
>>
>> $search_str = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
>>
>> $array[0]: "search for this sentence"
>> $array[1]: "-NotForThisWord"
>> $array[2]: "ButDefinitelyForThisWord"
>>
>> I have tried to use regular expressions but my case seems to be a bit
>> too complicated for that (?).
>>
>> Does anybody have a code snippet, a class or something, that can help
>> me with this?
[snip] 

If I understand you correctly you want to isolate either quoted strings
(with or without whitespace), or tokens separated by whitespace, as array
elements?

For this you would first have to isolate the first quoted sentence, then
tokenize the part before, and loop this as long as you're not done.

Should work something like this:



function tokenize_search($input) {
    $re = '/\s*(.*?)\s*"\s*([^"]*?)\s*"\s*(.*)/s';
    /* look for 3 groups:
       a - prematch  - anything up to the first quote
       b - match     - anything until the next quote
       c - postmatch - rest of the string
    */
    $tokens = array();
    while (preg_match($re, $input, $aresult)) {
        // aresult contains: [0]-total [1]-a [2]-b [3]-c
        // tokenize the prematch
        array_push($tokens, explode(' ', $aresult[1]));
        array_push($tokens, $aresult[2]);
        $input = $aresult[3];
    }
    // $input has the rest of the line
    array_push($tokens, explode(' ', $input));
    return $tokens;
}


Disclaimer: untested as usual. _Should_ behave like this:

$string = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
$tokens = tokenize_search($string);
print_r($tokens);

Array(
[0] - search for this sentence
[1] - -NotForThisWord
[2] - ButDefinitelyForThisWord
)
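
The same split (quoted phrases kept whole, everything else broken on whitespace) can probably also be done in one pass with preg_match_all() and an alternation. This is a different technique from the loop posted here, offered only as a sketch: the function name tokenize_search_single_pass is hypothetical, and it assumes that runs of whitespace, empty quotes and an unterminated quote should simply be skipped.

```php
<?php
// Hypothetical helper, not part of the original post.
function tokenize_search_single_pass($input) {
    // Alternative 1: a double-quoted phrase, captured without the quotes.
    // Alternative 2: a run of characters that are neither whitespace nor quotes.
    $re = '/"([^"]*)"|([^\s"]+)/';
    $tokens = array();
    if (preg_match_all($re, $input, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // Group 2 is non-empty only when the bare-word alternative matched.
            $token = (isset($m[2]) && $m[2] !== '') ? $m[2] : trim($m[1]);
            if ($token !== '') {
                $tokens[] = $token;
            }
        }
    }
    return $tokens;
}

$string = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
print_r(tokenize_search_single_pass($string));
```

For the example string this yields the three elements listed above. Unlike explode(' ', ...), it does not produce empty tokens when words are separated by several spaces.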


-- 
   >O Ernest E. Vogelsinger
   (\)ICQ #13394035
^ http://www.vogelsinger.at/


