OK, this is less elegant (to me, anyway), and probably a bit slower, but:
function word_occurrence($word,$phrase) {
    $word = strtolower($word);       # this way,
    $phrase = strtolower($phrase);   # case is irrelevant
    $Bits = split('[^[:alnum:]]+', $phrase);
    $count = 0;
    for ($i=0; $i<count($Bits); $i++) {
        if ($Bits[$i] == $word) { $count++; }
    }
    return ($count);
}
It should also handle hyphenated & apostrophied (is that a word?)
words correctly, such as
    coffe's a drag
or
    coffe-heads are strange
If you want to count words that INCLUDE dashes or apostrophes, you'd
have to use "[^[:alnum:]'-]+" as the pattern in the split() call. Or, to
break the string up by whitespace only, use '[[:space:]]+'.
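
To make the difference between those three patterns concrete, here is a rough sketch. It assumes a current PHP, where split() no longer exists, so preg_split() stands in for it; the POSIX classes are unchanged. count_word() and its $delimiter parameter are made-up names for illustration, not anything from the function above.

```php
<?php
// Sketch only: preg_split() replaces split(), which current PHP no longer has.
// count_word() and $delimiter are hypothetical names used for illustration.
function count_word($word, $phrase, $delimiter) {
    $word = strtolower($word);
    // PREG_SPLIT_NO_EMPTY drops the empty pieces left by leading/trailing delimiters
    $bits = preg_split($delimiter, strtolower($phrase), -1, PREG_SPLIT_NO_EMPTY);
    $count = 0;
    foreach ($bits as $bit) {
        if ($bit === $word) {
            $count++;
        }
    }
    return $count;
}

$phrase = "Coffe's a drag, coffe-heads are strange, coffe is fine";

// Apostrophes and dashes separate words: "coffe's" becomes "coffe" and "s"
$a = count_word('coffe', $phrase, "/[^[:alnum:]]+/");    // 3
// Apostrophes and dashes are part of words: "coffe's" stays one word
$b = count_word('coffe', $phrase, "/[^[:alnum:]'-]+/");  // 1
// Whitespace only: punctuation stays glued to the word ("drag,")
$c = count_word('coffe', $phrase, "/[[:space:]]+/");     // 1
$d = count_word('drag', $phrase, "/[[:space:]]+/");      // 0
```

Note the whitespace-only pattern leaves trailing commas and periods attached, so it only suits text without punctuation.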
-steve
---Original Message ---
At 11:07 PM +0200 4/14/01, n e t b r a i n wrote:
>Hi all,
>anyone have or know where I can find a small function in order to extract
>from a string the most relevant words in it?
>
>something like this:
>
>$var="I love coffe ... Coffe is from Brazil and coffe with milk ..";
>$occurence=2;
>//$occurence means word that are repeat 2 or more times
>my_dream_funct($var,$occurence);
>//the funct now return the word _ coffe _
>
>many thanks in advance
>max
>
>ps.plz note: I need that it works on php3
>
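
The request quoted above asks for every word that occurs at least a given number of times, rather than a count for one known word. A rough sketch of that is below. It assumes a current PHP and not php3: array_count_values() is not available there, and preg_split() only from 3.0.9 on. repeated_words() and $min_occurrences are hypothetical names chosen for this example.

```php
<?php
// Sketch only: repeated_words() is a hypothetical name, and this targets
// current PHP, not the php3 the question asks for.
function repeated_words($phrase, $min_occurrences) {
    // Lowercase so case is irrelevant, then split on runs of non-alphanumerics
    $bits = preg_split('/[^[:alnum:]]+/', strtolower($phrase), -1, PREG_SPLIT_NO_EMPTY);
    $result = array();
    // array_count_values() maps each distinct word to how often it occurs
    foreach (array_count_values($bits) as $word => $count) {
        if ($count >= $min_occurrences) {
            $result[] = (string) $word;  // numeric words come back as int keys
        }
    }
    return $result;
}

$var = "I love coffe ... Coffe is from Brazil and coffe with milk ..";
$words = repeated_words($var, 2);  // array('coffe')
```

On php3 itself, the same idea would need a hand-rolled loop that tallies each word into an associative array.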
Well, just offthetopofmyhead:
function word_occurrence($word,$phrase) {
    $word = strtolower($word);       # this way,
    $phrase = strtolower($phrase);   # case is irrelevant
    $Bits = split($word.'[^[:alnum:]]*', $phrase);
    return (count($Bits)-1);
}
I tested this, and it works fine (php 3.0.12) EXCEPT it counts
'coffecoffe' as TWO occurrences, not zero. If that's the behavior you
want, then it's fine. Now I'm intrigued...I want to find a single regular
expression that will NOT match 'coffecoffe'. Perhaps the preg_ functions
(available in PHP >= 3.0.9) could do it.
And, I tried things like
    split('[^[:alnum:]]*'.$word.'[^[:alnum:]]*', " $phrase ")
...but that didn't work.
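
For what it's worth, one way the preg_ route might look is sketched below. Lookaround assertions require that no letter or digit touches the match on either side, so 'coffecoffe' does not match at all. This assumes a current PHP with PCRE; count_whole_word() is a made-up name, and this has not been checked against php 3.0.x.

```php
<?php
// Sketch only: count_whole_word() is a hypothetical name, not from the thread.
function count_whole_word($word, $phrase) {
    // preg_quote() escapes any regex metacharacters in $word ('/' is the delimiter)
    $quoted = preg_quote($word, '/');
    // (?<!...) and (?!...) reject a match with an alphanumeric right before or after it;
    // the trailing i makes the match case-insensitive
    $pattern = '/(?<![[:alnum:]])' . $quoted . '(?![[:alnum:]])/i';
    // preg_match_all() returns the number of full pattern matches
    return preg_match_all($pattern, $phrase, $matches);
}

$n1 = count_whole_word('coffe', "I love coffe ... Coffe is from Brazil and coffe with milk ..");  // 3
$n2 = count_whole_word('coffe', "coffecoffe");               // 0
$n3 = count_whole_word('coffe', "coffe-heads are strange");  // 1
```

Like the split('[^[:alnum:]]+', ...) version, this treats apostrophes and dashes as word breaks, so "coffe's" still counts as one occurrence of coffe.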
-steve
--
+----------- 12 April 2001: Forty years of manned spaceflight -----------+
| Steve Edberg University of California, Davis |
| [EMAIL PROTECTED] Computer Consultant |
| http://aesric.ucdavis.edu/ http://pgfsun.ucdavis.edu/ |
+-------------------------- www.yurisnight.net --------------------------+