[PHP] String searching
I need to find the position of the first character in a string (searching from the end) that is not one of the characters in a set. In this case the set is [0-9a-zA-Z_-].

To be more specific, I want to split a string into two parts: the first part can contain anything, and the second part must contain only characters from the set described above. What is the easiest way to do this?

--
Chris W
KE5GIX

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] String searching
On Sat, May 17, 2008 at 2:17 AM, Chris W [EMAIL PROTECTED] wrote:
> I need to find the position of the first character in the string
> (searching from the end) that is not one of the characters in a set.
> In this case the set is [0-9a-zA-Z_-]

To find the position of a specific character, RTFM on strpos(). For characters that are not in your set, I'd recommend everythingbut(), but it has yet to be included in the core. ;-P

> I guess to be even more specific, I want to split a string into two
> parts: the first part can contain anything and the second part must be
> only in the set described above.

You can pick a character out of a string by its index with something as simple as this:

    <?php
        $str = 'abcdefghijklmnopqrstuvwxyz';
        $d = $str[5]; // $d is 'f', the sixth character: index == position - 1, because counting begins at 0
    ?>

So to walk backward through the string, while it's not very clean, you could do:

    <?php
        $str = 'ABCDEF01234567789';
        // Start at the last valid index, strlen($str) - 1, and include index 0.
        for ($i = strlen($str) - 1; $i >= 0; $i--) {
            if (preg_match('/[g-z]/i', $str[$i])) {
                // Handle your "this is a bad character" condition(s).
                // break; /* Or, optionally, continue. */
            }
        }
    ?>

Not pretty, but if my mind is still working at 2:30a (EDT), it should help you out.

--
/Daniel P. Brown
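The backward walk described above can also be done without an explicit loop. This is only a minimal sketch, assuming PHP 7 or later (for the type declarations) and the character set from the original question. lastDisallowedIndex() is a hypothetical helper name, not something from the thread or from PHP itself; strspn(), strrev() and substr() are standard PHP functions.

```php
<?php
// Hypothetical helper (not from the thread): returns the index of the last
// character that is NOT in $allowed, or -1 if every character is allowed.
function lastDisallowedIndex(string $str, string $allowed): int
{
    // strspn() counts how many leading characters belong to $allowed;
    // on the reversed string that count is the length of the allowed tail.
    $tailLength = strspn(strrev($str), $allowed);
    return strlen($str) - $tailLength - 1;
}

// The set from the question, spelled out character by character.
$allowed = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_';

$str  = 'some text: file_name-01';
$pos  = lastDisallowedIndex($str, $allowed);
$head = substr($str, 0, $pos + 1);  // may contain anything
$tail = substr($str, $pos + 1);     // only characters from the set
```

Because strspn() stops at the first character outside the set, the work is a single pass over the string, which may matter for long inputs.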
Re: [PHP] String searching
Chris W wrote:
> I need to find the position of the first character in the string
> (searching from the end) that is not one of the characters in a set.
> In this case the set is [0-9a-zA-Z_-]
>
> I guess to be even more specific, I want to split a string into two
> parts: the first part can contain anything and the second part must be
> only in the set described above.
>
> What is the easiest way to do this?

There's something here, imaginatively called blah(), which does what you require:

http://www.phpguru.org/preg/example.phps

--
Richard Heyes
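The contents of the linked example are not reproduced in the thread, so the following is not that code. It is a minimal sketch of one way to do the requested split with a single regular expression, assuming PHP 7 or later. splitAllowedTail() is a hypothetical name chosen for illustration.

```php
<?php
// Hypothetical helper: splits $str into [head, tail], where tail is the
// longest suffix made up only of characters from [0-9A-Za-z_-].
function splitAllowedTail(string $str): array
{
    // (.*?) is lazy, so the head is as short as possible and the tail as
    // long as possible. \z anchors at the very end of the string, and the
    // s modifier lets . match newlines inside the head.
    if (preg_match('/^(.*?)([0-9A-Za-z_-]*)\z/s', $str, $m) !== 1) {
        // On a regex error, fall back to "everything is head".
        return [$str, ''];
    }
    return [$m[1], $m[2]];
}

list($head, $tail) = splitAllowedTail('some text: file_name-01');
```

The lazy head forces the engine to retry at each position, so this is best suited to short strings; it has not been measured on large inputs.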
Re: [PHP] String searching performance
On Mon, 24 Feb 2003 22:35:35 +0100, Ernest E Vogelsinger wrote:
> At 21:22 24.02.2003, {R}ichard Ashton spoke out and said:
> [snip]
> > while ( $flag == true ) if (strpos($body, $word[]) > 0) {$flag=false}
> >
> > What I really need to know is which is the fastest loop? Which is the
> > fastest match, strpos? Which is the fastest comparison? What other
> > optimisations are possible to maximise search speed? I would expect to
> > keep a frequency hit count and sort the $words[] so that the most
> > frequent hits are found first, remembering that only a $body with NO
> > $words in it is automatically posted, so every word must be tested.
> >
> > I don't know sufficient internals to pick the fastest method. I will
> > be running with 4.3.1 on FreeBSD 4.6
> [snip]
>
> I'd suggest something like this:
>
>     $buzzwords = array('idiot', 'fool', 'shit', 'FOAD');
>     $re = '/(' . implode('|', $buzzwords) . ')/is';
>     if (preg_match($re, $posting)) {
>         // bad word found
>     } else {
>         // cleared
>     }

Thank you very much, that is brilliant. I would never have thought of that.

Do you think that

    if (preg_match($re, $posting, $hits))

would slow it down at all? The $buzzwords will be kept in a file to be loaded before each run, every 5 minutes. I could therefore keep a count of which words hit most frequently and move them to the top of the list.

{R}
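The hit count mentioned above could be kept along these lines. This is a sketch only, assuming PHP 7 or later (for the ?? operator and type declarations); recordHit() and the $counts array are hypothetical names, and the sample postings are made up for illustration.

```php
<?php
// Hypothetical tally of which buzzword matched, keyed by lower-cased word.
function recordHit(string $re, string $posting, array &$counts): bool
{
    if (preg_match($re, $posting, $hits) !== 1) {
        return false;               // no buzzword found (or a regex error)
    }
    $word = strtolower($hits[1]);   // group 1 holds the word that matched
    $counts[$word] = ($counts[$word] ?? 0) + 1;
    return true;
}

$buzzwords = array('idiot', 'fool', 'FOAD');
$re = '/(' . implode('|', $buzzwords) . ')/is';

$counts = array();
recordHit($re, 'Do not be a FOOL about it', $counts);
recordHit($re, 'what a fool', $counts);
recordHit($re, 'Me too', $counts);

arsort($counts);   // most frequently hit words first
```

Only the first matching word in each posting is counted, because preg_match() stops at the first match. Whether the capture costs anything measurable is something to benchmark rather than assume.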
Re: [PHP] String searching performance
At 09:49 26.02.2003, {R}ichard Ashton said:
[snip]
> Do you think that
>
>     if (preg_match($re, $posting, $hits))
>
> would slow it down at all? The $buzzwords will be kept in a file to be
> loaded before each run, every 5 minutes. I could therefore keep a count
> of which words hit most frequently and move them to the top of the list.
[snip]

If you have a lot of buzzwords, I believe this could have quite a performance impact.

--
Ernest E. Vogelsinger
ICQ #13394035
http://www.vogelsinger.at/
Re: [PHP] String searching performance
On Wednesday 26 February 2003 16:49, {R}ichard Ashton wrote:
> Do you think that
>
>     if (preg_match($re, $posting, $hits))
>
> would slow it down at all? The $buzzwords will be kept in a file to be
> loaded before each run, every 5 minutes. I could therefore keep a count
> of which words hit most frequently and move them to the top of the list.

No idea whether this would be faster (it's certainly easier to code):

  - explode() the text into an array
  - place your banned words into an array
  - use array_intersect() to find the words common to both

Do your own benchmarking!

--
Jason Wong
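The three steps above might look roughly like this. It is a sketch, assuming PHP 7 or later; bannedWordsIn() is a hypothetical name. Note one substitution: preg_split() is used in place of explode(), so that punctuation attached to a word ("fool," or "idiot!") does not hide it.

```php
<?php
// Hypothetical helper: returns the banned words that occur in $body as
// whole words, compared case-insensitively.
function bannedWordsIn(string $body, array $banned): array
{
    // Split on any run of characters other than letters, digits or
    // apostrophes (a swapped-in replacement for a plain explode(' ', ...)).
    $words = preg_split("/[^a-z0-9']+/", strtolower($body), -1, PREG_SPLIT_NO_EMPTY);
    $banned = array_map('strtolower', $banned);

    // array_intersect() keeps the values of $words that also appear in $banned.
    return array_values(array_unique(array_intersect($words, $banned)));
}

$found = bannedWordsIn('You FOOL, you idiot! fool.', array('idiot', 'fool', 'FOAD'));
if (count($found) > 0) {
    // divert the post for manual moderation
}
```

Unlike a substring search, this only matches whole words, so "foolish" would not count as a hit for "fool". Whether that is desirable depends on the moderation policy.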
Re: [PHP] String searching performance
On Wed, 26 Feb 2003 17:47:41 +0800, Jason Wong wrote:
> On Wednesday 26 February 2003 16:49, {R}ichard Ashton wrote:
> > Do you think that
> >
> >     if (preg_match($re, $posting, $hits))
> >
> > would slow it down at all? The $buzzwords will be kept in a file to be
> > loaded before each run, every 5 minutes. I could therefore keep a
> > count of which words hit most frequently and move them to the top of
> > the list.
>
> No idea whether this would be faster (it's certainly easier to code):
>
>   - explode() the text into an array
>   - place your banned words into an array
>   - use array_intersect() to find the words common to both

Thanks for another method.

> Do your own benchmarking!

That is the easy bit.

{R}
[PHP] String searching performance
I am looking for the most efficient way to search for "trigger" words in a big string.

I have a string, $body, which is the whole body of a particular Usenet post, so it can be as short as "Me too" or as long as some as-yet-undecided limit, say around 10 KB. I have a list of words in an array, maybe 50 words of up to 8 characters each. I need to search $body for the presence of each $word, and if any $word in the array is found in $body, trigger an action.

This is a moderation bot for a beginners' group: no flaming, no swearing, no abuse. You can imagine the list of words: idiot, fool, shit, FOAD, and so on. When *any* one is found, the post is diverted for manual intervention and searching stops.

Just guessing, I would imagine:

    while ( $flag == true ) if (strpos($body, $word[]) > 0) {$flag=false}

What I really need to know is: which is the fastest loop? Which is the fastest match, strpos? Which is the fastest comparison? What other optimisations are possible to maximise search speed? I would expect to keep a frequency hit count and sort the $words[] so that the most frequent hits are found first, remembering that only a $body with NO $words in it is automatically posted, so every word must be tested.

I don't know enough about the internals to pick the fastest method. I will be running PHP 4.3.1 on FreeBSD 4.6.

{R}
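For reference, the guessed loop above could be written out as follows. This is a minimal sketch, not a benchmarked answer: it assumes PHP 7 or later for the type declarations (on PHP 4.3.1 the same logic works without them), and containsTrigger() is a hypothetical name. The main point is the comparison: strpos() returns 0 when the word sits at the very start of the string and false when there is no match, so a "> 0" test would miss a post that begins with a trigger word.

```php
<?php
// Hypothetical helper: returns true as soon as any trigger word occurs
// anywhere in $body. Assumes $words contains no empty strings.
function containsTrigger(string $body, array $words): bool
{
    $haystack = strtolower($body);   // lower-case once, outside the loop
    foreach ($words as $word) {
        // Compare with !== false: a match at offset 0 is still a match.
        if (strpos($haystack, strtolower($word)) !== false) {
            return true;             // stop searching at the first hit
        }
    }
    return false;                    // every word was tested, none found
}

$words = array('idiot', 'fool', 'FOAD');
if (containsTrigger('Idiot! That is not how you do it.', $words)) {
    // divert the post for manual moderation
}
```

Sorting $words so that frequent offenders come first only shortens the loop for posts that do contain a trigger word; a clean post still has to be checked against every word.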
Re: [PHP] String searching performance
At 21:22 24.02.2003, {R}ichard Ashton spoke out and said:
[snip]
> while ( $flag == true ) if (strpos($body, $word[]) > 0) {$flag=false}
>
> What I really need to know is which is the fastest loop? Which is the
> fastest match, strpos? Which is the fastest comparison? What other
> optimisations are possible to maximise search speed? I would expect to
> keep a frequency hit count and sort the $words[] so that the most
> frequent hits are found first, remembering that only a $body with NO
> $words in it is automatically posted, so every word must be tested.
>
> I don't know sufficient internals to pick the fastest method. I will be
> running with 4.3.1 on FreeBSD 4.6
[snip]

I'd suggest something like this:

    $buzzwords = array('idiot', 'fool', 'shit', 'FOAD');
    $re = '/(' . implode('|', $buzzwords) . ')/is';
    if (preg_match($re, $posting)) {
        // bad word found
    } else {
        // cleared
    }

You only need to make sure that your buzzwords don't contain a '/'. If they do, you could change the regex delimiter, or simply escape the '/' in the array.

--
Ernest E. Vogelsinger
ICQ #13394035
http://www.vogelsinger.at/
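The escaping mentioned above can be automated with preg_quote(), whose optional second argument names the delimiter to escape along with the usual regex metacharacters. A minimal sketch, assuming PHP 7 or later; buildBuzzwordRegex() is a hypothetical name and the sample words are made up for illustration.

```php
<?php
// Hypothetical helper: builds one case-insensitive pattern from a word list.
// Assumes a non-empty list; an empty list would yield '/()/is', which
// matches every posting.
function buildBuzzwordRegex(array $buzzwords): string
{
    $escaped = array();
    foreach ($buzzwords as $word) {
        // Escapes regex metacharacters and, via the second argument,
        // the '/' delimiter as well.
        $escaped[] = preg_quote($word, '/');
    }
    return '/(' . implode('|', $escaped) . ')/is';
}

$re = buildBuzzwordRegex(array('idiot', 'and/or', 'a.b'));

if (preg_match($re, 'this and/or that')) {
    // bad word found
}
```

With the words escaped, "a.b" matches only a literal dot, so a word list loaded from a file cannot accidentally change the meaning of the pattern.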