Ricardo:

I don't have anything to contribute to the discussion other than to offer you
a big, appreciative "thank you" for your efforts. (I'm not technically savvy
enough to test things the way you have.)

Cuspid


"Ricardo Marques" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
> This time, the topic is address harvesters, also known as spambots. You
> know: the kind that scan web pages and newsgroups to collect email
> addresses.
>
> I have read many things about address harvesters, but I was never sure
> what they actually capture and what they don't. So I downloaded one
> from the net: @spider (also known as atspider) from
> http://www.atspider.com (current version: 1.18)
>
>
> My objective is - obviously - to find out how a webmaster can set up
> mail links to avoid being harvested by a spider.
>
>
> I chose atspider for testing for several reasons:
>
> 1) It was listed in Filedudes (a Tucows look-alike, unreachable at
> the moment) and is also listed in Softpile.com
> 2) It claimed to capture addresses from CGI, ASP, PHP and other
> dynamically generated pages. Having programmed in ASP and PHP, that
> _really_ caught my attention
> 3) The site - http://www.atspider.com - didn't look like the usual
> "personal page set up by a spammer in Geocities to sell its garbage"
>
>
> Conclusions:
>
> 1) The installer is as professional as one would expect from any
> program using InstallShield or similar technology
>
> 2) It successfully harvests addresses in mailto links like:
> <a href="mailto:[EMAIL PROTECTED]">Ricardo Dias Marques</a>
>
> 3) It successfully harvests addresses in web pages, even if they are
> not in a mailto link like:
> [EMAIL PROTECTED]
>
> 4) It FAILS to harvest addresses in web pages if one puts spaces in
> them, like:
> ricmarques @ spamcop . net
>
> 5) It FAILS to harvest addresses in "human-readable" form like:
> ricmarques at spamcop dot net
>
> 6) It FAILS to harvest addresses if one has a mailto link with some
> characters replaced by HTML entities like:
> <a href="&#109;ailto&#58;ricmarques&#64;spamcop&#46;net">
> ricmarques&#64;spamcop&#46;net</a>
> where:
>
> &#46;  = . (period)
> &#58;  = : (colon)
> &#64;  = @ (at sign)
> &#109; = m
>
> You can check a table for HTML entities at:
> http://www.sandia.gov/sci_compute/iso_symbol.html
>
>
> 7) Atspider FAILS to harvest addresses if one uses JavaScript to
> build the mailto link. For example, the page containing the link could
> be like:
>
> <html>
> <head>
> ...
> <script type="text/javascript"
> src="http://www.example.com/irc2/scripts.js">
> </script>
> ...
> </head>
>
> <body>
> ...
> Send me email to <script
> type="text/javascript">send_email('spamcop','ricmarques','net')</script>
> ...
> </body>
> </html>
>
>
> And the file scripts.js has this (or similar) JavaScript:
>
> // Writes a mailto link into the page. The "mailto", ":" and "@" parts are
> // stored as HTML entities, so no complete address appears in the source.
> function send_email(domain, name, tld)
> {
>     var atsign = "&#64;";
>     var m_a_i_l_t_o = "&#109;&#97;&#105;&#108;&#116;&#111;";
>     var colon = "&#58;";
>     var address = name + atsign + domain + '.' + tld;
>
>     var expression = '<a href="' + m_a_i_l_t_o + colon + address + '">' +
>         address + '</a>';
>     document.write(expression);
> }
>
> 8) @spider FAILS to harvest mail links in PHP pages _if_ the page
> that has the mail link (example.htm, example.php or example.php3)
> builds it like a "normal page" link:
> <a href="mail_send.php">Mail me!</a>
>
> and mail_send.php has only (needs only to have) these lines:
> <?php
>
> $url = "mailto:[EMAIL PROTECTED]";
> header("Location: $url");
> ?>
>
> (for a human, clicking on this link has the same effect as clicking
> a mailto link)
>
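
Conclusions 2 to 6 are what one would expect from a harvester that only
pattern-matches the raw page source. Here is a minimal sketch of that idea.
The regular expression and the names naiveHarvest and toEntities are
assumptions made up for illustration; this is not @spider's actual logic.

```javascript
// Hypothetical stand-in for a harvester that pattern-matches raw page source.
// The pattern is an assumption for illustration, not @spider's real one.
const ADDRESS_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+/g;

// Return every substring of the source that looks like a plain address.
function naiveHarvest(source) {
  return source.match(ADDRESS_PATTERN) || [];
}

// Encode every character as a decimal HTML entity (e.g. "@" becomes "&#64;").
function toEntities(text) {
  return Array.from(text, (ch) => `&#${ch.codePointAt(0)};`).join("");
}

const address = "ricmarques@spamcop.net";
const samples = {
  plain: address,                                 // conclusion 3
  spaced: "ricmarques @ spamcop . net",           // conclusion 4
  humanReadable: "ricmarques at spamcop dot net", // conclusion 5
  entities: toEntities(address),                  // conclusion 6, fully encoded
};

// Print what the naive matcher finds in each variant.
for (const [label, text] of Object.entries(samples)) {
  console.log(label, naiveHarvest(text));
}
```

Only the plain variant yields a match here. A harvester that decoded HTML
entities or ran scripts before matching would behave differently, so this
says nothing about other spambots.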
>
> In _newsgroup scans_, @spider successfully harvests addresses from the
> From: header and from the message body, but NOT from the Reply-To: or
> Cc: headers. It also DOES NOT harvest addresses in the message body if
> one puts spaces in them, like:
> ricmarques @ spamcop . net
>
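
The newsgroup result would fit a scanner that reads only the From: header
and the body. A rough sketch under that assumption follows; scanArticle, the
pattern and the sample article are all hypothetical and are not taken from
@spider.

```javascript
// Hypothetical sketch: a scanner that reads only the From: header and the body.
const ADDRESS = /[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+/g;

function scanArticle(rawArticle) {
  // Headers and body are separated by the first blank line.
  const splitAt = rawArticle.indexOf("\n\n");
  const headerText = splitAt === -1 ? rawArticle : rawArticle.slice(0, splitAt);
  const body = splitAt === -1 ? "" : rawArticle.slice(splitAt + 2);

  // Keep only From: lines; Reply-To:, Cc: and other headers are ignored.
  const fromLines = headerText
    .split("\n")
    .filter((line) => /^From:/i.test(line));

  const scanned = fromLines.join("\n") + "\n" + body;
  return scanned.match(ADDRESS) || [];
}

// Made-up article using reserved example domains.
const article = [
  "From: poster@example.com",
  "Reply-To: replies@example.org",
  "Cc: copy@example.net",
  "Subject: test",
  "",
  "Write to body@example.com or to spaced @ example . com",
].join("\n");

console.log(scanArticle(article));
```

With this sample, only the From: address and the unspaced body address are
returned.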
>
> To be sure, the atspider site points out that they are against spam,
> and that they stopped selling another product, @caster, which was
> "designed for
> mail list management, such as sending a company newsletter announcing
> new products and/or services to customers, but an ignorant few used
> @Caster to spam. Since @Caster carries an advertisement for our
> products and our site it looked like we were doing the spamming!"
>
>
> So, after reading this supposedly anti-spam viewpoint, my guess is
> that @Spider Software also _doesn't want_ to build the harvester to
> circumvent anti-harvesting techniques. Why? Because if they did that,
> they would become more closely associated with spamming.
>
> Anyway, the objective of this discussion is more about spambots than
> about @spider.
> I hope these conclusions are useful for other webmasters trying to
> avoid being harvested by spambots.
>
>
> I would be _rather interested_ to hear about similar
> experiences/knowledge with other address harvesters (like EmailSiphon,
> Cherry Picker, EmailWolf, ExtractorPro, EmailCollector or any other).
>
>
> Other input is also very welcome.
>
> Thanks.
> Ricardo Dias Marques
> [EMAIL PROTECTED]


_______________________________________________
SpamCop-List mailing list
[EMAIL PROTECTED]
http://news.spamcop.net/mailman/listinfo/spamcop-list
