Re: [PHP] Re: [PHP-DEV] Parse search string a la Google (Regular expression?)
Works like a charm, thank you!

Benny

"Ernest E Vogelsinger" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
> At 23:11 27.11.2002, Ernest E Vogelsinger said:
> [snip]
> >If I understand you correctly you want to isolate either quoted strings
> >(with or without whitespace), or tokens separated by whitespace, as array
> >elements?
> >
> >For this you would first have to isolate the first quoted sentence, then
> >tokenize the part before, and loop this as long as you're not done.
> >
> >Should work something like that:
> >
> > [...]
> >Disclaimer: untested as usual. _Should_ behave like this:
> [snip]
>
> I _should_ have tested. This script actually works the way you expect it to:
>
> <?php
>
> function tokenize_search($input) {
>     $re = '/\s*(.*?)\s*"\s*([^"]*?)\s*"\s*(.*)/s';
>     /* look for 3 groups:
>        a - prematch - anything up to the first quote
>        b - match - anything until the next quote
>        c - postmatch - rest of the string
>     */
>     $tokens = array();
>     while (preg_match($re, $input, $aresult)) {
>         // aresult contains: [0]-total [1]-a [2]-b [3]-c
>         // tokenize the prematch
>         if ($aresult[1]) $tokens = array_merge($tokens, explode(' ', $aresult[1]));
>         array_push($tokens, $aresult[2]);
>         $input = $aresult[3];
>     }
>     // $input has the rest of the line
>     if ($input) $tokens = array_merge($tokens, explode(' ', $input));
>     return $tokens;
> }
>
> $string = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
> $tokens = tokenize_search($string);
> print_r($tokens);
>
> ?>
>
> The output of this script is:
>
> Array (
>     [0] => search for this sentence
>     [1] => -NotForThisWord
>     [2] => ButDefinitelyForThisWord
> )
>
> --
>    >O     Ernest E. Vogelsinger
>    (\)    ICQ #13394035
>     ^     http://www.vogelsinger.at/

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Re: [PHP-DEV] Parse search string a la Google (Regular expression?)
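At 23:11 27.11.2002, Ernest E Vogelsinger said:
[snip]
>If I understand you correctly you want to isolate either quoted strings
>(with or without whitespace), or tokens separated by whitespace, as array
>elements?
>
>For this you would first have to isolate the first quoted sentence, then
>tokenize the part before, and loop this as long as you're not done.
>
>Should work something like that:
>
> [...]
>Disclaimer: untested as usual. _Should_ behave like this:
[snip]

I _should_ have tested. This script actually works the way you expect it to:

<?php

function tokenize_search($input) {
    $re = '/\s*(.*?)\s*"\s*([^"]*?)\s*"\s*(.*)/s';
    /* look for 3 groups:
       a - prematch - anything up to the first quote
       b - match - anything until the next quote
       c - postmatch - rest of the string
    */
    $tokens = array();
    while (preg_match($re, $input, $aresult)) {
        // aresult contains: [0]-total [1]-a [2]-b [3]-c
        // tokenize the prematch
        if ($aresult[1]) $tokens = array_merge($tokens, explode(' ', $aresult[1]));
        array_push($tokens, $aresult[2]);
        $input = $aresult[3];
    }
    // $input has the rest of the line
    if ($input) $tokens = array_merge($tokens, explode(' ', $input));
    return $tokens;
}

$string = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
$tokens = tokenize_search($string);
print_r($tokens);

?>

The output of this script is:

Array (
    [0] => search for this sentence
    [1] => -NotForThisWord
    [2] => ButDefinitelyForThisWord
)

--
   >O     Ernest E. Vogelsinger
   (\)    ICQ #13394035
    ^     http://www.vogelsinger.at/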
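
The correction in this message comes down to how the word list from explode() is added to $tokens. The untested draft used array_push(), the tested version uses array_merge(). A minimal sketch of the difference follows. The two-word input and the variable names $words, $pushed and $merged are made up for illustration and do not come from the thread:

```php
<?php
// Hypothetical word list, standing in for explode(' ', $aresult[1]).
$words = explode(' ', '-NotForThisWord ButDefinitelyForThisWord');

// array_push() appends its argument as ONE element, so the word list
// ends up as a nested array inside the token list.
$pushed = array('search for this sentence');
array_push($pushed, $words);

// array_merge() appends every word as its own element, so the token
// list stays flat.
$merged = array_merge(array('search for this sentence'), $words);

print_r($pushed);   // 2 elements, the second one is itself an array
print_r($merged);   // 3 string elements
```

The tested version also adds an if () guard in front of each explode() call, because explode(' ', '') returns an array holding one empty string and would otherwise add an empty token.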
Re: [PHP] Re: [PHP-DEV] Parse search string a la Google (Regular expression?)
>Benny Rasmussen wrote:
>> Hi,
>>
>> In my application I would like to offer a search interface like Google
>> and other popular search engines. The complication for me is to explode
>> the search string into proper array elements, like this:
>>
>> $search_str = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
>>
>> $array[0]: "search for this sentence"
>> $array[1]: "-NotForThisWord"
>> $array[2]: "ButDefinitelyForThisWord"
>>
>> I have tried to use regular expressions but my case seems to be a bit
>> more complicated for this (?).
>>
>> Does anybody have a code snippet, a class or something, that can help
>> me with this?
[snip]

If I understand you correctly you want to isolate either quoted strings
(with or without whitespace), or tokens separated by whitespace, as array
elements?

For this you would first have to isolate the first quoted sentence, then
tokenize the part before, and loop this as long as you're not done.

Should work something like that:

function tokenize_search($input) {
    $re = '/\s*(.*?)\s*"\s*([^"]*?)\s*"\s*(.*)/s';
    /* look for 3 groups:
       a - prematch - anything up to the first quote
       b - match - anything until the next quote
       c - postmatch - rest of the string
    */
    $tokens = array();
    while (preg_match($re, $input, $aresult)) {
        // aresult contains: [0]-total [1]-a [2]-b [3]-c
        // tokenize the prematch
        array_push($tokens, explode(' ', $aresult[1]));
        array_push($tokens, $aresult[2]);
        $input = $aresult[3];
    }
    // $input has the rest of the line
    array_push($tokens, explode(' ', $input));
    return $tokens;
}

Disclaimer: untested as usual. _Should_ behave like this:

$string = "\"search for this sentence\" -NotForThisWord ButDefinitelyForThisWord";
$tokens = tokenize_search($string);
print_r($tokens);

Array(
    [0] - search for this sentence
    [1] - -NotForThisWord
    [2] - ButDefinitelyForThisWord
)

--
   >O     Ernest E. Vogelsinger
   (\)    ICQ #13394035
    ^     http://www.vogelsinger.at/
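
The approach described in this message loops over the input, peeling off one quoted phrase per iteration. A different technique, not the one used in the thread, is to collect every token in a single pass with preg_match_all() and an alternation that matches either a quoted phrase or a bare word. The sketch below is only an illustration under a few assumptions: the function name tokenize_search_single_pass is hypothetical, quotes are taken to be plain double quotes with no escaping, and a stray unbalanced quote is simply dropped.

```php
<?php
// Sketch only: tokenize_search_single_pass() is a hypothetical name,
// it does not appear anywhere in the thread.
function tokenize_search_single_pass($input) {
    // Alternative 1: a double-quoted phrase, captured without the quotes.
    // Alternative 2: a run of characters that are neither whitespace nor quotes.
    $re = '/"([^"]*)"|([^\s"]+)/';
    $tokens = array();
    if (preg_match_all($re, $input, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // With PREG_SET_ORDER, index 2 only exists when the bare-word
            // alternative matched; otherwise group 1 holds the phrase.
            $token = isset($m[2]) ? $m[2] : trim($m[1]);
            // Skip empty phrases such as "" or "   ".
            if ($token !== '') {
                $tokens[] = $token;
            }
        }
    }
    return $tokens;
}

// Same search string as in the thread, with extra spaces between the words.
$string = "\"search for this sentence\" -NotForThisWord   ButDefinitelyForThisWord";
print_r(tokenize_search_single_pass($string));
```

Because the bare-word alternative matches runs of non-whitespace, repeated spaces between words do not produce empty tokens, which a plain explode(' ', ...) would.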