Re: [PHP] Regex in PHP
On Thu, 2008-06-05 at 00:24 -0400, Nathan Nobbe wrote:
> you really know how to rub it in there rob. but i was looking at the implementation in the php code, looks like somebody likes my idea (this code found in ext/standard/string.c). on the second line the haystack is converted to lower case[1], then if it passes a couple of checks, the needle is converted to lower case[2], and lastly the comparison is performed[3]. there is no logic to check both cases. (i have placed a star beside the statements i've referred to).
>
> ...
>         haystack_dup = estrndup(haystack, haystack_len);
>     *[1]php_strtolower(haystack_dup, haystack_len);
>
>         if (Z_TYPE_P(needle) == IS_STRING) {
>             if (Z_STRLEN_P(needle) == 0 || Z_STRLEN_P(needle) > haystack_len) {
>                 efree(haystack_dup);
>                 RETURN_FALSE;
>             }
>
>             needle_dup = estrndup(Z_STRVAL_P(needle), Z_STRLEN_P(needle));
>     *[2]    php_strtolower(needle_dup, Z_STRLEN_P(needle));
>     *[3]    found = php_memnstr(haystack_dup + offset, needle_dup, Z_STRLEN_P(needle), haystack_dup + haystack_len);
>         }

Funny, I guess they took the quick route. This code could obviously be optimized :) But let's go with something used more often... such as more traditional string comparison where you're more likely to want to eke out efficiency:

    ZEND_API int zend_binary_strcasecmp(char *s1, uint len1, char *s2, uint len2)
    {
        int len;
        int c1, c2;

        len = MIN(len1, len2);

        while (len--) {
            c1 = zend_tolower((int)*(unsigned char *)s1++);
            c2 = zend_tolower((int)*(unsigned char *)s2++);
            if (c1 != c2) {
                return c1 - c2;
            }
        }

        return len1 - len2;
    }

Well, looks like they do indeed do a conversion... but on a char by char basis. Strange that. Could more than likely speed it up by doing an initial exactness comparison and then falling back on the above. Maybe I'll compile and test out the following later:

    ZEND_API int zend_binary_strcasecmp(char *s1, uint len1, char *s2, uint len2)
    {
        int len;
        int c1, c2;

        len = MIN(len1, len2);

        while (len--) {
            c1 = (int)*(unsigned char *)s1++;
            c2 = (int)*(unsigned char *)s2++;
            if (c1 != c2) {
                /* Only fold case when the raw bytes differ. */
                c1 = zend_tolower(c1);
                c2 = zend_tolower(c2);
                if (c1 != c2) {
                    return c1 - c2;
                }
            }
        }

        return len1 - len2;
    }

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Regex in PHP
> sorry to bother you richard.

You didn't, I just wanted to make sure I wasn't losing it (more).

--
Richard Heyes
Access SSH with a Windows mapped drive
http://www.phpguru.org/sftpdrive
Re: [PHP] Regex in PHP
Hi,

> and the case insensitive versions are a hair faster still ;)

Are they? I always thought that case-sensitive functions were faster because they have to make fewer comparisons. E.g., to test if i == I in a case-insensitive fashion requires two comparisons (i == I and i == i) whereas a case-sensitive comparison requires only one (i == i).

Cheers.

--
Richard Heyes
Re: [PHP] Regex in PHP
On Wed, Jun 4, 2008 at 10:10 AM, Richard Heyes [EMAIL PROTECTED] wrote:
> Hi,
>
>> and the case insensitive versions are a hair faster still ;)
>
> Are they? I always thought that case-sensitive functions were faster because they have to test fewer comparisons. Eg To test if i == I in a case-insensitive fashion requires two comparisons (i == I and i == i) whereas a case-sensitive comparison requires only one (i == i).

umm, isn't it like the other way around? in the case of case-sensitive, you have to be able to distinguish between i and I, whereas w/ the case insensitive, you don't care. so, basically, you strtolower() first thing, then just compare to lower case characters.

-nathan
Re: [PHP] Regex in PHP
On Wed, 2008-06-04 at 10:18 -0600, Nathan Nobbe wrote:
> On Wed, Jun 4, 2008 at 10:10 AM, Richard Heyes [EMAIL PROTECTED] wrote:
>> Hi,
>>
>>> and the case insensitive versions are a hair faster still ;)
>>
>> Are they? I always thought that case-sensitive functions were faster because they have to test fewer comparisons. Eg To test if i == I in a case-insensitive fashion requires two comparisons (i == I and i == i) whereas a case-sensitive comparison requires only one (i == i).
>
> umm, isnt it like the other way around. in the case of case-sensitive, you have to be able to distinguish between i and I, whereas w/ the case insensitive, you dont care so, basically, you strtolower() first thing, then just compare to lower case characters.

Nope, case insensitive is slower since you must make two tests for characters having a lower and upper case version. With case sensitive comparisons you only need to make a single comparison.

Cheers,
Rob.
Re: [PHP] Regex in PHP
On Wed, Jun 4, 2008 at 10:26 AM, Robert Cummings [EMAIL PROTECTED] wrote:
> Nope, case insensitive is slower since you must make two tests for characters having a lower and upper case version. With case sensitive comparisons you only need to make a single comparison.

a quick test shows stripos beating strpos.

    <?php
    $str = 'asSAFAASFDADSfasfjhalskfjhlaseAERQWERQWER;.dafasjhflasfjd';
    $search = 'fdasASDFAafdas';

    $start = microtime();
    strpos($str, $search);
    $end = microtime();
    $r1 = $end - $start;

    $start = microtime();
    stripos($str, $search);
    $end2 = microtime();
    $r2 = $end2 - $start;

    echo "strpos: $r1\n";
    echo "stripos: $r2\n";

    if($r2 < $r1) {
        echo 'stripos is faster' . PHP_EOL;
    }
    ?>

-nathan
Re: [PHP] Regex in PHP
I can't find any good reason for regex in this case. You can try to split it with explode() or stristr(), or write a function of your own which walks over the string and checks where an @ is found, something like:

    function GetDomainName ($a)
    {
        $returnDomain = "";
        $beigale = false; // set once the '@' has been handled

        for ($i = 0; $i < strlen($a) && !$beigale; $i++)
            if ($a[$i] == '@')
            {
                // copy everything after the '@'
                for ($z = ($i+1); $z < strlen($a); $z++)
                    $returnDomain .= $a[$z];
                $beigale = true;
            }

        return $returnDomain;
    }

(there is probably a better way to do this, this is just what came to mind right now..)

On 04/06/2008, VamVan [EMAIL PROTECTED] wrote:
> Hello All,
>
> For example I have these email addresses -
>
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
>
> What would be my PHP function [regular expression] that can give me something like
>
> yahoo.com
> hotmail.com
> gmail.com
>
> Thanks
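The strstr() route mentioned above can be sketched in a few lines. This is only an illustration, assuming PHP 7.1 or later for the nullable return type; the function name domainFromEmail() and the example.com address are made up for the example and do not come from the thread.

```php
<?php
// Hypothetical helper: returns the part after the first '@', or null
// when the input has no '@' or nothing follows it.
function domainFromEmail(string $email): ?string
{
    // strstr() returns the string from the first '@' onwards,
    // or false when '@' does not occur at all.
    $tail = strstr($email, '@');
    if ($tail === false || $tail === '@') {
        return null;
    }

    // Drop the leading '@'.
    return substr($tail, 1);
}

var_dump(domainFromEmail('user@example.com'));
```

Unlike the character-by-character loop, this leaves the scanning to a built-in and makes the "no @ present" case explicit.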
Re: [PHP] Regex in PHP
On Wed, 2008-06-04 at 10:56 -0600, Nathan Nobbe wrote:
> On Wed, Jun 4, 2008 at 10:26 AM, Robert Cummings [EMAIL PROTECTED] wrote:
>> Nope, case insensitive is slower since you must make two tests for characters having a lower and upper case version. With case sensitive comparisons you only need to make a single comparison.
>
> a quick test shows stripos beating strpos.
>
>     <?php
>     $str = 'asSAFAASFDADSfasfjhalskfjhlaseAERQWERQWER;.dafasjhflasfjd';
>     $search = 'fdasASDFAafdas';
>
>     $start = microtime();
>     strpos($str, $search);
>     $end = microtime();
>     $r1 = $end - $start;
>
>     $start = microtime();
>     stripos($str, $search);
>     $end2 = microtime();
>     $r2 = $end2 - $start;
>
>     echo "strpos: $r1\n";
>     echo "stripos: $r2\n";
>
>     if($r2 < $r1) {
>         echo 'stripos is faster' . PHP_EOL;
>     }
>     ?>

Did you just try to use a test that used a single iteration to prove me wrong? OMFG ponies!!! Loop each one of those 10 million times, use a separate script for each, and use the system time program to appropriately measure the time the system takes.

:)

Cheers,
Rob.
Re: [PHP] Regex in PHP
At least he has some humor ;-)

On 04/06/2008, Robert Cummings [EMAIL PROTECTED] wrote:
> On Wed, 2008-06-04 at 10:56 -0600, Nathan Nobbe wrote:
>> On Wed, Jun 4, 2008 at 10:26 AM, Robert Cummings [EMAIL PROTECTED] wrote:
>>> Nope, case insensitive is slower since you must make two tests for characters having a lower and upper case version. With case sensitive comparisons you only need to make a single comparison.
>>
>> a quick test shows stripos beating strpos.
>>
>>     <?php
>>     $str = 'asSAFAASFDADSfasfjhalskfjhlaseAERQWERQWER;.dafasjhflasfjd';
>>     $search = 'fdasASDFAafdas';
>>
>>     $start = microtime();
>>     strpos($str, $search);
>>     $end = microtime();
>>     $r1 = $end - $start;
>>
>>     $start = microtime();
>>     stripos($str, $search);
>>     $end2 = microtime();
>>     $r2 = $end2 - $start;
>>
>>     echo "strpos: $r1\n";
>>     echo "stripos: $r2\n";
>>
>>     if($r2 < $r1) {
>>         echo 'stripos is faster' . PHP_EOL;
>>     }
>>     ?>
>
> Did you just try to use a test that used a single iteration to prove me wrong? OMFG ponies!!! Loop each one of those 10 million times, use a separate script for each, and use the system time program to appropriately measure the time the system takes.
>
> :)
>
> Cheers,
> Rob.
Re: [PHP] Regex in PHP
On Wed, Jun 4, 2008 at 11:12 AM, Robert Cummings [EMAIL PROTECTED] wrote:
> Did you just try to use a test that used a single iteration to prove me wrong? OMFG ponies!!! Loop each one of those 10 million times, use a separate script for each, and use the system time program to appropriately measure the time the system takes.

    <?php
    $str = 'asSAFAASFDADSfasfjhalskfjhlaseAERQWERQWER;.dafasjhflasfjd';
    $search = 'fdasASDFAafdas';

    $start = microtime();
    for($i = 0; $i < 1000; $i++)
        strpos($str, $search);
    $end = microtime();
    $r1 = $end - $start;

    $start = microtime();
    for($i = 0; $i < 1000; $i++)
        stripos($str, $search);
    $end2 = microtime();
    $r2 = $end2 - $start;

    echo "strpos: $r1\n";
    echo "stripos: $r2\n";

    if($r2 < $r1) {
        echo 'stripos is faster' . PHP_EOL;
    }
    --
    strpos: 0.730519
    stripos: -0.098887
    stripos is faster

stripos still dominates ;) what is this system time program you speak of? and, i'll put them into separate programs when i get home this evening, and have more time to screw around.

-nathan
Re: [PHP] Regex in PHP
On Wed, 2008-06-04 at 13:12 -0400, Robert Cummings wrote:
> On Wed, 2008-06-04 at 10:56 -0600, Nathan Nobbe wrote:
>> On Wed, Jun 4, 2008 at 10:26 AM, Robert Cummings [EMAIL PROTECTED] wrote:
>>> Nope, case insensitive is slower since you must make two tests for characters having a lower and upper case version. With case sensitive comparisons you only need to make a single comparison.
>>
>> a quick test shows stripos beating strpos.
>>
>>     <?php
>>     $str = 'asSAFAASFDADSfasfjhalskfjhlaseAERQWERQWER;.dafasjhflasfjd';
>>     $search = 'fdasASDFAafdas';
>>
>>     $start = microtime();
>>     strpos($str, $search);
>>     $end = microtime();
>>     $r1 = $end - $start;
>>
>>     $start = microtime();
>>     stripos($str, $search);
>>     $end2 = microtime();
>>     $r2 = $end2 - $start;
>>
>>     echo "strpos: $r1\n";
>>     echo "stripos: $r2\n";
>>
>>     if($r2 < $r1) {
>>         echo 'stripos is faster' . PHP_EOL;
>>     }
>>     ?>
>
> Did you just try to use a test that used a single iteration to prove me wrong? OMFG ponies!!! Loop each one of those 10 million times, use a separate script for each, and use the system time program to appropriately measure the time the system takes.

Here's my results on my Athlon 2400, 10 million loops on each type using your script settings for $str and $search and making 3 runs each time:

    strpos()
    ========
    real    0m7.133s
    user    0m6.480s
    sys     0m0.020s

    real    0m6.134s
    user    0m6.068s
    sys     0m0.016s

    real    0m6.527s
    user    0m6.476s
    sys     0m0.012s

    stripos()
    =========
    real    0m13.720s
    user    0m13.517s
    sys     0m0.072s

    real    0m13.158s
    user    0m13.009s
    sys     0m0.016s

    real    0m13.151s
    user    0m13.013s
    sys     0m0.012s

Now, that's how you test efficiency. Doing a single run is very, very subject to whatever else your processor might be doing and as such is usually garbage for any kind of analysis.

Cheers,
Rob.
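The script behind these timings was not posted, so the following is only a guess at its general shape: one function under test per process, many iterations, and all timing left to the external time program. The file name bench.php, the function runBenchmark() and the command-line arguments are assumptions made for this sketch.

```php
<?php
// bench.php (hypothetical name). Intended use, one run per function:
//   time php bench.php strpos
//   time php bench.php stripos

// Calls $fn on the thread's test strings $iterations times and
// returns how many calls were made.
function runBenchmark(callable $fn, int $iterations): int
{
    $str = 'asSAFAASFDADSfasfjhalskfjhlaseAERQWERQWER;.dafasjhflasfjd';
    $search = 'fdasASDFAafdas';

    $calls = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $fn($str, $search);
        $calls++;
    }

    return $calls;
}

// Only run when a function name is given on the command line.
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    $name = $argv[1];
    if (!in_array($name, ['strpos', 'stripos'], true)) {
        fwrite(STDERR, "usage: php bench.php strpos|stripos [iterations]\n");
        exit(1);
    }

    // 10 million iterations by default, as suggested in the thread.
    $iterations = (int) ($argv[2] ?? 10000000);
    runBenchmark($name, $iterations);
}
```

Calling the function through a variable adds the same dispatch overhead to both runs, so the absolute numbers will differ from a script that calls strpos() or stripos() directly, but the comparison between the two should still be meaningful.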
Re: [PHP] Regex in PHP
On Wed, 2008-06-04 at 11:18 -0600, Nathan Nobbe wrote:
> On Wed, Jun 4, 2008 at 11:12 AM, Robert Cummings [EMAIL PROTECTED] wrote:
>> Did you just try to use a test that used a single iteration to prove me wrong? OMFG ponies!!! Loop each one of those 10 million times, use a separate script for each, and use the system time program to appropriately measure the time the system takes.
>
>     <?php
>     $str = 'asSAFAASFDADSfasfjhalskfjhlaseAERQWERQWER;.dafasjhflasfjd';
>     $search = 'fdasASDFAafdas';
>
>     $start = microtime();
>     for($i = 0; $i < 1000; $i++)
>         strpos($str, $search);
>     $end = microtime();
>     $r1 = $end - $start;
>
>     $start = microtime();
>     for($i = 0; $i < 1000; $i++)
>         stripos($str, $search);
>     $end2 = microtime();
>     $r2 = $end2 - $start;
>
>     echo "strpos: $r1\n";
>     echo "stripos: $r2\n";
>
>     if($r2 < $r1) {
>         echo 'stripos is faster' . PHP_EOL;
>     }
>     --
>     strpos: 0.730519
>     stripos: -0.098887
>     stripos is faster

Negative time eh!? Your code must be buggy :| The time program works like this under most *nix systems:

    time php -q foo.php

And then it returns a report of how much time was taken for various types of time. I've already sent an email with the appropriate timing of both versions. BTW, as primitive as microtime() is for this kind of measurement... you might want to read the manual to use it properly:

    http://ca3.php.net/manual/en/function.microtime.php

You probably want:

    microtime( true )

> stripos still dominates ;) what is this system time program you speak of ? and, ill put them into separate programs when i get home this evening, and have more time to screw around.

It's a simple thought process to understand that unless someone coding the PHP internals buggered their code, stripos() cannot possibly be faster than strpos(). I really don't need benchmarks for something this simple to know which SHOULD be faster.

Cheers,
Rob.
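The negative figure has a simple explanation. Called without arguments, microtime() returns a string of the form "msec sec", and subtracting two such strings only uses the leading fractional part, which wraps around every second. The snippet below is a small sketch of the difference; the loop body and iteration count are arbitrary choices for illustration, not the thread's benchmark.

```php
<?php
// Without arguments, microtime() returns a string like
// "0.73051900 1212600000": the fractional part first, then the seconds.
$raw = microtime();
[$fraction, $seconds] = explode(' ', $raw);

// The leading fractional part is all that numeric conversion of the
// raw string would pick up; it is always in the range [0, 1).
$fractionOnly = (float) $fraction;

// microtime(true) returns the whole timestamp as a float, so a
// difference of two readings is a real elapsed time in seconds.
$start = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    strpos('haystack with a needle in it', 'needle');
}
$elapsed = microtime(true) - $start;

printf("fraction only: %.6f\n", $fractionOnly);
printf("elapsed: %.6f seconds\n", $elapsed);
```

If the second reading happens to fall just after a second boundary, its fractional part is smaller than the first one, which is how a result such as -0.098887 can appear.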
Re: [PHP] Regex in PHP
On Wed, Jun 4, 2008 at 2:06 PM, Robert Cummings [EMAIL PROTECTED] wrote:
> On Wed, 2008-06-04 at 11:18 -0600, Nathan Nobbe wrote:
>> On Wed, Jun 4, 2008 at 11:12 AM, Robert Cummings [EMAIL PROTECTED] wrote:
>>> Did you just try to use a test that used a single iteration to prove me wrong? OMFG ponies!!! Loop each one of those 10 million times, use a separate script for each, and use the system time program to appropriately measure the time the system takes.
>>
>>     <?php
>>     $str = 'asSAFAASFDADSfasfjhalskfjhlaseAERQWERQWER;.dafasjhflasfjd';
>>     $search = 'fdasASDFAafdas';
>>
>>     $start = microtime();
>>     for($i = 0; $i < 1000; $i++)
>>         strpos($str, $search);
>>     $end = microtime();
>>     $r1 = $end - $start;
>>
>>     $start = microtime();
>>     for($i = 0; $i < 1000; $i++)
>>         stripos($str, $search);
>>     $end2 = microtime();
>>     $r2 = $end2 - $start;
>>
>>     echo "strpos: $r1\n";
>>     echo "stripos: $r2\n";
>>
>>     if($r2 < $r1) {
>>         echo 'stripos is faster' . PHP_EOL;
>>     }
>>     --
>>     strpos: 0.730519
>>     stripos: -0.098887
>>     stripos is faster
>
> Negative time eh!? You're code must be buggy :| The time program works like this unde rmost nix systems:
>
>     time php -q foo.php
>
> And then it returns a report of how much time was taken for various types of time. I've already sent an email with the appropriate timing of both versions. BTW, as primtive as microtime() is for this kind of measurement... you might want to read the manual to use it properly:
>
>     http://ca3.php.net/manual/en/function.microtime.php
>
> You probably want:
>
>     microtime( true )
>
>> stripos still dominates ;) what is this system time program you speak of ? and, ill put them into separate programs when i get home this evening, and have more time to screw around.
>
> It's a simple thought process to understand that unless someone coding the PHP internals buggered their code, that stripos() cannot possibly be faster than strpos(). I really don't need benchmarks for something this simple to know which SHOULD be faster.

i repeated your test using the time program and splitting the script into 2, one for each strpos and stripos, to find similar results.

imo, there is no need for 2 comparisons for case-insensitive searches, because both arguments can be converted to a single case prior to the search. obviously, there is a small amount of overhead there the case-sensitive search is unencumbered by. i guess i never sat down and thought about how that algorithm would work (case-sensitive) =/.

thanks for the tips rob.

sorry to bother you richard.

-nathan
Re: [PHP] Regex in PHP
On Wed, 2008-06-04 at 23:20 -0400, Nathan Nobbe wrote:
> i repeated your test using the time program and splitting the script into 2, one for each strpos and stripos, to find similar results.
>
> imo, there is no need for 2 comparisons for case-insensitive searches, because both arguments can be converted to a single case prior to the search. obviously, there is a small amount of overhead there the case-sensitive search is unencumbered by. i guess i never sat down and thought about how that algorithm would work (case-sensitive) =/.
>
> thanks for the tips rob.
>
> sorry to bother you richard.

You would do two comparisons... why incur the overhead of a conversion if one is not necessary? First you do a case sensitive match, and if that fails then you try the alternative version comparison. It is inefficient to perform 2 conversions and a single comparison in contrast. Similarly, it's very inefficient to convert two entire strings and then perform a comparison. If the first characters differ then conversion of the rest of the strings was pointless. This is basic algorithms in computer science.

Cheers,
Rob.
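The strategy described above (exact comparison first, case folding only when two characters differ, and stopping at the first real mismatch) can be written out in userland PHP. This is purely an illustration of the control flow: the function name caseInsensitiveCompare() is hypothetical, it only makes sense for single-byte text, and in real code the built-in strcasecmp() is the practical choice.

```php
<?php
// Hypothetical illustration of "compare exactly, fold only on mismatch".
// Returns 0 when equal ignoring case, a negative number when $a sorts
// first, and a positive number when $b sorts first.
function caseInsensitiveCompare(string $a, string $b): int
{
    $lenA = strlen($a);
    $lenB = strlen($b);
    $len = min($lenA, $lenB);

    for ($i = 0; $i < $len; $i++) {
        $c1 = $a[$i];
        $c2 = $b[$i];

        // Fast path: identical bytes need no conversion at all.
        if ($c1 === $c2) {
            continue;
        }

        // Slow path: the bytes differ, so fold both and compare again.
        $l1 = ord(strtolower($c1));
        $l2 = ord(strtolower($c2));
        if ($l1 !== $l2) {
            // First real mismatch: the rest is never looked at.
            return $l1 - $l2;
        }
    }

    // All shared positions matched; the shorter string sorts first.
    return $lenA - $lenB;
}

var_dump(caseInsensitiveCompare('Hello', 'hELLO'));
```

Nothing past the first differing position is ever converted, which is the saving the message argues for.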
Re: [PHP] Regex in PHP
On Wed, Jun 4, 2008 at 11:43 PM, Robert Cummings [EMAIL PROTECTED] wrote:
> On Wed, 2008-06-04 at 23:20 -0400, Nathan Nobbe wrote:
>> i repeated your test using the time program and splitting the script into 2, one for each strpos and stripos, to find similar results.
>>
>> imo, there is no need for 2 comparisons for case-insensitive searches, because both arguments can be converted to a single case prior to the search. obviously, there is a small amount of overhead there the case-sensitive search is unencumbered by. i guess i never sat down and thought about how that algorithm would work (case-sensitive) =/.
>>
>> thanks for the tips rob.
>>
>> sorry to bother you richard.
>
> You would do two comparisons... why incur the overhead of a conversion if one is not necessary.

because it simplifies the algorithm, there is no need for conditional logic.

> First you do case sensitive match, if that fails then you try the alternative version comparison. It is inefficient to perform 2 conversions and a single comparison in contrast.

3 operations vs. 1 or potentially 2, sure.

> Similarly, it's very inefficient to convert two entire strings then perform a comparison.

then they could be converted one at a time as the strings were traversed to increase efficiency.

> If the first characters differ then conversion of the rest of the strings was pointless.

good point.

> This is basic algorithms in computer science.

you really know how to rub it in there rob. but i was looking at the implementation in the php code, looks like somebody likes my idea (this code found in ext/standard/string.c). on the second line the haystack is converted to lower case[1], then if it passes a couple of checks, the needle is converted to lower case[2], and lastly the comparison is performed[3]. there is no logic to check both cases. (i have placed a star beside the statements i've referred to).

...
        haystack_dup = estrndup(haystack, haystack_len);
    *[1]php_strtolower(haystack_dup, haystack_len);

        if (Z_TYPE_P(needle) == IS_STRING) {
            if (Z_STRLEN_P(needle) == 0 || Z_STRLEN_P(needle) > haystack_len) {
                efree(haystack_dup);
                RETURN_FALSE;
            }

            needle_dup = estrndup(Z_STRVAL_P(needle), Z_STRLEN_P(needle));
    *[2]    php_strtolower(needle_dup, Z_STRLEN_P(needle));
    *[3]    found = php_memnstr(haystack_dup + offset, needle_dup, Z_STRLEN_P(needle), haystack_dup + haystack_len);
        }
...

-nathan
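For readers who do not want to follow the C, the three starred steps map onto userland PHP roughly as below. This is a sketch of the approach in the quoted snippet as it stood at the time, not a statement about how current PHP versions implement stripos(); the function name lowercaseThenSearch() is made up, and the offset handling of the original is left out.

```php
<?php
// Hypothetical userland mirror of the quoted C: lowercase both strings,
// then run an ordinary case-sensitive search.
// Returns the match position as an int, or false when there is none.
function lowercaseThenSearch(string $haystack, string $needle)
{
    $needleLen = strlen($needle);

    // Same early exit as the C: empty or oversized needles never match.
    if ($needleLen === 0 || $needleLen > strlen($haystack)) {
        return false;
    }

    $haystackLower = strtolower($haystack);       // step [1]
    $needleLower = strtolower($needle);           // step [2]

    return strpos($haystackLower, $needleLower);  // step [3]
}

var_dump(lowercaseThenSearch('Hello World', 'WORLD'));
```

Both full-string conversions happen before any comparison, which is the cost Rob objects to and the simplicity Nathan defends.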
Re: [PHP] Regex in PHP
You can use this:

    $str = '[EMAIL PROTECTED]';
    preg_match('/[EMAIL PROTECTED]@(.+)/', $str, $matches);
    var_dump($matches); // will be in $matches[1]

Or without regex:

    echo substr($str, strpos($str, '@')+1);

Liran

----- Original Message -----
From: VamVan [EMAIL PROTECTED]
To: php-general@lists.php.net
Sent: Wednesday, June 04, 2008 3:39 AM
Subject: [PHP] Regex in PHP

> Hello All,
>
> For example I have these email addresses -
>
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
>
> What would be my PHP function [regular expression] that can give me something like
>
> yahoo.com
> hotmail.com
> gmail.com
>
> Thanks
Re: [PHP] Regex in PHP
On Tue, Jun 3, 2008 at 8:39 PM, VamVan [EMAIL PROTECTED] wrote:
> Hello All,
>
> For example I have these email addresses -
>
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
>
> What would be my PHP function [regular expression] that can give me something like
>
> yahoo.com
> hotmail.com
> gmail.com

if you know the values are valid email addresses, use a combination of strripos() and substr(). it will be nice and fast that way. as an aside, this is what the manual says on preg_match():

    Do not use preg_match() if you only want to check if one string is contained in another string. Use strpos() (http://www.php.net/manual/en/function.strpos.php) or strstr() (http://www.php.net/manual/en/function.strstr.php) instead as they will be faster.

and the case insensitive versions are a hair faster still ;)

-nathan
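A minimal sketch of the reverse-search-plus-substr() suggestion follows. It uses strrpos() rather than strripos(), on the assumption that case does not matter when the needle is '@'. The function name domainAfterLastAt() and the sample addresses are invented for the example, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns everything after the last '@',
// or null when there is no '@' or nothing after it.
function domainAfterLastAt(string $email): ?string
{
    // Search from the end so an '@' inside a quoted local part
    // does not cut the result short.
    $at = strrpos($email, '@');
    if ($at === false) {
        return null;
    }

    $domain = substr($email, $at + 1);

    return $domain === '' ? null : $domain;
}

var_dump(domainAfterLastAt('user@example.com'));
```

As the message says, this relies on the input already being a valid address; it extracts the domain but does not validate anything.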