Re: [PHP] link counting

2007-04-06 Thread Paul Novitski

At 4/6/2007 06:01 AM, Sebe wrote:

i thought of an idea of counting the number of links to reduce comment spam.



I do this by counting the number of 'http://' instances in the 
text.  You can use a variety of PHP functions:


- substr_count()
- preg_match_all() then count() the result array
- explode() then count(), minus one
- preg_split() then count(), minus one

preg_split() is useful if you want to split the text by more than one 
string; simply separate alternative strings in the pattern with the 
pipe: '(http://|

However, in my personal experience, contact form spam links always 
contain 'http://' but they're not always couched in anchor tags, so 
I've never found the need to search for more than the one pattern.
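The preg_split() approach described above might be sketched roughly as follows. The function name count_by_split() and the two alternatives in the pattern are assumptions chosen for illustration, not the pattern from the original message.

```php
<?php
// Count delimiter matches by splitting the text and counting the pieces.
// count_by_split() and the pattern alternatives are illustrative assumptions.
function count_by_split($text)
{
    // '~' is the delimiter, so the slashes need no escaping;
    // '|' separates the alternative strings; 'i' ignores case.
    $pieces = preg_split('~http://|https://~i', $text);

    if ($pieces === false) {
        return 0; // the regular expression failed to run
    }

    // N matches always produce N + 1 pieces.
    return count($pieces) - 1;
}

echo count_by_split('See HTTP://a.example and https://b.example'), "\n"; // prints 2
```

Text with no match comes back as a single piece, so the result is zero.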


substr_count() is case-sensitive so you'll want to make a copy of the 
message text lowercase using strtolower() to catch all variants of 
http|HTTP|Http|...  substr_count() is probably also faster than the 
regular expression functions -- not that a difference of microseconds 
or milliseconds need necessarily concern you if you're not executing 
many iterations.
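As a minimal sketch of the lowercase-then-count idea (the function name count_http_links() is made up for this example):

```php
<?php
// Case-insensitive count of 'http://' using substr_count().
// count_http_links() is a hypothetical helper name.
function count_http_links($message)
{
    // Work on a lowercase copy so HTTP://, Http:// etc. are all counted;
    // the original message text is left untouched.
    $lower = strtolower($message);

    return substr_count($lower, 'http://');
}

echo count_http_links('Visit HTTP://a.example or Http://b.example'), "\n"; // prints 2
```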


I usually set the limit of permissible links to three.  Since it's 
entirely possible that a genuine correspondent might send more than 
three links someday, I don't throw away suspect messages but instead 
send them to my own mailbox coded so they're easy to catch and file 
on receipt; that way I can monitor the health of the system and watch 
for false positives while still shielding my clients from spam.
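One way the routing decision could look, as a sketch only: MAX_LINKS, route_message(), the '[SUSPECT] ' subject prefix and both addresses are hypothetical names and values, and the actual mail sending is left out.

```php
<?php
// Decide where a contact-form message should go based on its link count.
// All names and addresses here are illustrative assumptions.
define('MAX_LINKS', 3);

function route_message($message, $recipient, $monitor)
{
    $links = substr_count(strtolower($message), 'http://');

    if ($links > MAX_LINKS) {
        // Suspect: keep the message, but send it to the monitoring
        // mailbox with a subject prefix that is easy to filter on.
        return array('to' => $monitor, 'subject_prefix' => '[SUSPECT] ');
    }

    return array('to' => $recipient, 'subject_prefix' => '');
}

$route = route_message(str_repeat('http://x.example ', 4),
                       'client@example.com', 'monitor@example.com');
echo $route['to'], "\n"; // prints monitor@example.com
```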


Typically I'll display an error message when someone fills out a 
contact form incorrectly, for example asking them to enter a valid 
email address.  Recently, however, I've stopped warning the sender if 
they try to send a message that looks like spam because I don't want 
to teach spammers how to circumvent my criteria.  I send the suspect 
message to my monitoring mailbox instead of to the intended recipient 
and let the spammers think they've succeeded.  I feared at first that 
this would encourage spammers to use my contact forms more, but it 
hasn't appeared to have had that effect.
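The difference between the two kinds of feedback might be sketched like this. handle_submission(), the reply texts and the 'deliver' values are assumptions for illustration; $isSuspect stands for the result of whatever spam test is in use.

```php
<?php
// Warn about genuine form errors, but answer suspect messages exactly
// like accepted ones. Names and reply texts are illustrative assumptions.
function handle_submission($email, $isSuspect)
{
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        // An ordinary mistake: tell the sender what to fix.
        return array('deliver' => 'none',
                     'reply'   => 'Please enter a valid email address.');
    }

    // Suspect or not, the sender sees the same confirmation;
    // only the destination differs.
    return array('deliver' => $isSuspect ? 'monitor' : 'recipient',
                 'reply'   => 'Thank you, your message has been sent.');
}

$result = handle_submission('someone@example.com', true);
echo $result['deliver'], "\n"; // prints monitor
```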


Documentation links:
http://php.net/count
http://php.net/pcre.pattern.syntax
http://php.net/preg_match_all
http://php.net/preg_split
http://php.net/strtolower
http://php.net/substr-count

Regards,

Paul
__

Paul Novitski
Juniper Webcraft Ltd.
http://juniperwebcraft.com 


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] link counting

2007-04-06 Thread Jay Blanchard
[snip]
i thought of an idea of counting the number of links to reduce comment
spam.

unfortunately my methods is not reliable, i haven't tested it yet 
though.. anyone have maybe a better solution using some regexp?

$links = array('http://', 'https://', 'www.');

$total_links = 0;
foreach($links as $link)
{
 $total_links += substr_count($string, $link);
}
   
if($total_links > X)
{
.
}
[/snip]

External links or internal links? Regardless, start with counting anchor
tags. If you need to make it more granular you can work from there with
regex.
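Counting anchor tags could be sketched as below. The function name count_anchor_tags() is hypothetical, and the pattern is a simple assumption that only looks for opening tags, not a full HTML parser.

```php
<?php
// Count opening anchor tags in a piece of markup.
// count_anchor_tags() is a hypothetical helper name.
function count_anchor_tags($html)
{
    // '<a' must be followed by whitespace or '>', so tags such as
    // <abbr> or <address> are not counted; 'i' also matches '<A'.
    $found = preg_match_all('~<a[\s>]~i', $html, $matches);

    return $found === false ? 0 : $found;
}

$html = '<a href="http://a.example">x</a> <A HREF="/b">y</A> <abbr>z</abbr>';
echo count_anchor_tags($html), "\n"; // prints 2
```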




Re: [PHP] link counting

2007-04-06 Thread Tijnema !

On 4/6/07, Sebe <[EMAIL PROTECTED]> wrote:

i thought of an idea of counting the number of links to reduce comment spam.

unfortunately my methods is not reliable, i haven't tested it yet
though.. anyone have maybe a better solution using some regexp?

$links = array('http://', 'https://', 'www.');

$total_links = 0;
foreach($links as $link)
{
$total_links += substr_count($string, $link);
}

if($total_links > X)
{
   .
}



I don't have a better way, but links starting with http://www. or
https://www. are counted twice in your script...
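One possible way around the double counting, offered as a sketch rather than a tested spam filter: use a single regular expression in which 'www.' only counts when it is not directly preceded by '://'. The function name count_links_once() is made up for this example.

```php
<?php
// Count each link once, whether it starts with http://, https:// or www.
// count_links_once() is a hypothetical helper name.
function count_links_once($text)
{
    // 'https?://' matches either scheme. The negative lookbehind
    // '(?<!://)' skips a 'www.' that directly follows a scheme, so
    // 'http://www.' contributes one match instead of two.
    $pattern = '~https?://|(?<!://)www\.~i';

    $found = preg_match_all($pattern, $text, $matches);

    return $found === false ? 0 : $found;
}

$text = 'http://www.a.example https://www.b.example www.c.example http://d.example';
echo count_links_once($text), "\n"; // prints 4
```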

Tijnema




[PHP] link counting

2007-04-06 Thread Sebe

i thought of an idea of counting the number of links to reduce comment spam.

unfortunately my methods is not reliable, i haven't tested it yet 
though.. anyone have maybe a better solution using some regexp?


$links = array('http://', 'https://', 'www.');

$total_links = 0;
foreach($links as $link)
{
$total_links += substr_count($string, $link);
}
  
if($total_links > X)

{
   .
}
