Re: How does Security affect search engine spiders?

2008-10-13 Thread Claude Schneegans
 >>Looks to me as though it is blocking SQL injection attacks

It doesn't block anything, it SENDS SQL injection attacks!
MY application blocked it. ;-)

~|
Adobe® ColdFusion® 8 software 8 is the most important and dramatic release to 
date
Get the Free Trial
http://ad.doubleclick.net/clk;207172674;29440083;f

Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313834
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: 
http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=11502.10531.4


RE: How does Security affect search engine spiders?

2008-10-13 Thread Larry Juncker
Looks to me as though it is blocking SQL injection attacks
 


Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313833


Re: How does Security affect search engine spiders?

2008-10-13 Thread Matt Robertson
Oh yes... I found out early on that I would get all bent out of shape
and cranky doing log analysis without filtering out that declare stuff
before it hit the logs in the first place.

I figured there was no point to relying on user agent info but wanted
to see if anyone had anything that I might pick over.

Good old FunWebProducts ... a.k.a. "I Am a Moron" ...

-- 
[EMAIL PROTECTED]
Janitor, The Robertson Team
mysecretbase.com

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313832


Re: How does Security affect search engine spiders?

2008-10-13 Thread Claude Schneegans
 >>Not that I know of; anyway, one cannot rely on user agents, which can be faked
so easily.

Just to illustrate this: as I was writing my last message, I received a
notice from my server reporting that a new bad bot had been detected.
Its user agent is "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1;
FunWebProducts; SpamBlockerUtility 10.2.217.0)"
and it was trapped because
"p=releases';[EMAIL PROTECTED](4000);[EMAIL PROTECTED](0x4445434C4152452040"
was found in the URL.
I just wonder what this "SpamBlockerUtility" is supposed to block ;-)
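
For readers who want to see what a trap like this might look like, here is a minimal sketch in Python (the list is about ColdFusion, but the idea is language-neutral). The function names and the three patterns are assumptions made for illustration; they are not the filter described in the message. The sketch also shows that the hex prefix quoted above decodes to the text "DECLARE @", so a filter that only searches for the plain word would miss the encoded copy.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical patterns, chosen to match the two tests mentioned in the thread
# (DECLARE or http in the URL) plus the long hex literal seen in the trapped request.
SUSPICIOUS_PATTERNS = [
    re.compile(r"\bdeclare\b", re.IGNORECASE),        # T-SQL DECLARE keyword
    re.compile(r"\b0x[0-9a-f]{16,}", re.IGNORECASE),  # long hex-encoded payload
    re.compile(r"https?://", re.IGNORECASE),          # a URL passed as a parameter value
]


def looks_like_attack(query_string: str) -> bool:
    """Return True if the URL-decoded query string matches any suspicious pattern."""
    decoded = unquote_plus(query_string)
    return any(pattern.search(decoded) for pattern in SUSPICIOUS_PATTERNS)


def decode_hex_literal(literal: str) -> str:
    """Decode a literal such as 0x4445... into ASCII text."""
    return bytes.fromhex(literal[2:]).decode("ascii")


print(decode_hex_literal("0x4445434C4152452040"))  # prints: DECLARE @
```

A site that legitimately passes URLs in its query strings would have to drop or narrow the third pattern.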

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313829


Re: How does Security affect search engine spiders?

2008-10-13 Thread Claude Schneegans
 >>is there a good bot/bad bot list? 

Not that I know of; anyway, one cannot rely on user agents, which can be faked 
so easily.
Personally, I let just a few known bots in, based on the IP address, the 
only parameter that cannot be faked easily.
For every other request, I have some tools that automatically analyze 
every visitor according to criteria such as:
- does it read robots.txt?
- does it fall into a robot trap?
- does it read robots.txt but request forbidden pages anyway?
- does it request pages at too high a rate?
- does it fetch JavaScript but not execute it?
- does it skip the CSS?
- does it clearly identify itself in the user agent or not?
etc...
And of course, the presence of DECLARE or http in URLs is the first test ;-)

 >>I have an IP- and bot-identifying based
system that works pretty well but I'm always up for newer and better

Such a system can only identify good bots for sure, but not bad bots and 
fakes.
And the problem is not with good bots, but with bad guys.
I also have a white list and a black list, but their only purpose is to 
bypass the rest of the tests.
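
The approach described in this message (lists that short-circuit the checks, then behavioural tests for everyone else) could be sketched roughly as follows. This is only an illustration in Python: the field names, the weights, the threshold, and the example addresses (taken from the reserved documentation ranges) are all assumptions, not the actual tool.

```python
from dataclasses import dataclass

# Placeholder lists; the addresses come from reserved documentation ranges.
WHITELIST = {"192.0.2.10"}
BLACKLIST = {"198.51.100.7"}

MAX_REQUESTS_PER_MINUTE = 60  # assumed limit
BLOCK_THRESHOLD = 3           # assumed score at which a visitor is blocked


@dataclass
class VisitorStats:
    ip: str
    read_robots_txt: bool = False
    hit_robot_trap: bool = False
    read_forbidden_page: bool = False
    requests_per_minute: float = 0.0
    fetched_js_without_running_it: bool = False
    fetched_css: bool = True
    identifies_itself: bool = False


def classify(visitor: VisitorStats) -> str:
    """Return "allow" or "block" for one visitor."""
    # The lists only exist to bypass the rest of the tests.
    if visitor.ip in WHITELIST:
        return "allow"
    if visitor.ip in BLACKLIST:
        return "block"

    score = 0
    if visitor.hit_robot_trap:
        score += 3
    if visitor.read_robots_txt and visitor.read_forbidden_page:
        score += 3
    if visitor.requests_per_minute > MAX_REQUESTS_PER_MINUTE:
        score += 2
    if visitor.fetched_js_without_running_it:
        score += 1
    if not visitor.fetched_css:
        score += 1
    if visitor.read_robots_txt and not visitor.identifies_itself:
        score += 1
    return "block" if score >= BLOCK_THRESHOLD else "allow"
```

Weighting the criteria instead of blocking on any single one is a design choice of this sketch; it keeps one ambiguous signal, such as a browser with CSS disabled, from banning a human visitor.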

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313828


Re: How does Security affect search engine spiders?

2008-10-13 Thread Matt Robertson
Is there a good bot/bad bot list?  Not that I would trust it, but it
can't hurt to at least look at whether it's feasible to use it as
another weapon in the arsenal. I have an IP- and bot-identifying based
system that works pretty well, but I'm always up for newer and better
info.



-- 
[EMAIL PROTECTED]
Janitor, The Robertson Team
mysecretbase.com

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313827


Re: How does Security affect search engine spiders?

2008-10-13 Thread Claude Schneegans
 >>My only thought on that is to detect the fact that they are a spider 
(not sure how to do that though) and not implement security in that case.

Oops, not a good idea. There are mainly two sorts of spiders: good bots 
(e.g. Google)
and bad bots (e.g. those looking for mail addresses to spam).
In neither case should they be reading your secured pages: good bots because 
there is no need to index secured pages,
and bad bots because they should be banned from any page anyway.

So just let the login page do its work: good bots will never try to 
submit the login form;
bad bots may try, but with no password they'll be kicked out anyway.
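
As a rough illustration of "let the login page do its work", the gate below treats every request the same way and never looks at the user agent. It is a Python sketch under assumed names (`handle_request`, `LOGIN_PATH`, and the session argument are hypothetical), not code from any ColdFusion application.

```python
from typing import Optional, Tuple

LOGIN_PATH = "/login"  # hypothetical location of the login page


def handle_request(path: str, session_user: Optional[str],
                   user_agent: str = "") -> Tuple[int, str]:
    """Return (HTTP status, path to serve or redirect to).

    user_agent is accepted but deliberately ignored: spiders and humans
    get exactly the same treatment.
    """
    if path == LOGIN_PATH:
        return (200, LOGIN_PATH)
    if session_user is None:
        # No credentials in the session: send the visitor to the login page.
        return (302, LOGIN_PATH)
    return (200, path)
```

With this in place a crawler only ever sees the login page, which is the behaviour the original question asked about.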

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313822


Re: How does Security affect search engine spiders?

2008-10-12 Thread Michael Dinowitz
correct
https://www.google.com/adsense/list-auth
Use this section to allow the AdSense crawler to access pages that are
behind a login. Our crawler will access these pages only to determine
content for ad targeting purposes and will fully comply with Google's privacy
policy <http://www.google.com/privacy.html>

While they are determining content for ad targeting, they are looking at the
page's content. This content will not show up in their index, but logic says
that it will be used to affect your ranking. I've seen search results that
show a site and, when I click on it, I get the login. This says that Google
has indexed past the login.

On 10/12/08, Adrian Lynch <[EMAIL PROTECTED]> wrote:
>
> And not include what it finds in its index?
>
> Adrian

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313794


RE: How does Security affect search engine spiders?

2008-10-12 Thread Adrian Lynch
And not include what it finds in its index?

Adrian

-Original Message-
From: Michael Dinowitz
Sent: 12 October 2008 22:51
To: cf-talk
Subject: Re: How does Security affect search engine spiders?


There is an option in adsense to bypass login based security in order to
index pages for ads. While your pages may not have ads on them, using this
option guarantees that Google will get through your security.



Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313793


Re: How does Security affect search engine spiders?

2008-10-12 Thread Michael Dinowitz
There is an option in AdSense to bypass login-based security in order to
index pages for ads. While your pages may not have ads on them, using this
option guarantees that Google will get through your security.


Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313792


Re: How does Security affect search engine spiders?

2008-10-12 Thread s. isaac dealey
I agree with Ade. 

My general practice is to not secure reads for anything I want google to
index and only apply security on pages where a user is entering some
information that might require it. A lot of forum systems do this. You
can browse the forum without being logged in, but you have to log in if
you want to post something. 
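
The forum pattern described above (open reads, protected writes) might be expressed like this. It is a small Python sketch; the action names are invented for the example.

```python
# Hypothetical action names for a forum application.
PUBLIC_ACTIONS = {"view_forum", "view_thread"}


def is_allowed(action: str, logged_in: bool) -> bool:
    """Anyone, including a spider, may read; everything else needs a login."""
    if action in PUBLIC_ACTIONS:
        return True
    # Unknown or write actions are denied to anonymous visitors by default.
    return logged_in
```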



-- 
s. isaac dealey  ^  new epoch
 isn't it time for a change? 
 ph: 781.769.0723

http://onTap.riaforge.org/blog



Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313791


RE: How does Security affect search engine spiders?

2008-10-12 Thread Adrian Lynch
Oooo, not sure that's gonna be the best solution.

Firstly, if you let Google crawl the secure content, I'll go to Google and
view the cached version of your content.

Secondly, the way to detect whether it's Google can be spoofed. It'll make itself
known to you via the user agent (dump the CGI scope to find that), and anyone
can send the same user agent string.

Thirdly, I reckon, and I'll need someone else to confirm this, that if you
let Google in to index content that's not available to a non-member and
Google finds out, it'll penalise you.
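
On the second point: because the user agent is just a string the client sends, a check that is harder to spoof is the reverse-then-forward DNS lookup that Google documents for verifying its crawler. The Python sketch below is one way to write it. The function name and the injectable resolver arguments are assumptions made so the logic can be exercised without network access, and the default forward lookup only handles IPv4.

```python
import socket
from typing import Callable, List

GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_lookup(ip: str) -> str:
    # gethostbyaddr returns (hostname, aliases, addresses).
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(host: str) -> List[str]:
    # gethostbyname_ex returns (hostname, aliases, IPv4 addresses).
    return socket.gethostbyname_ex(host)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = _reverse_lookup,
    forward: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """True only if the IP reverse-resolves to a Google host name
    and that host name resolves back to the same IP."""
    try:
        host = reverse(ip).lower().rstrip(".")
    except OSError:
        return False
    if not host.endswith(GOOGLE_SUFFIXES):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False
```

DNS lookups are slow, so in practice the result would probably be cached per IP address rather than computed on every request.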

Adrian
Building a database of ColdFusion errors at http://cferror.org/



Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313790


How does Security affect search engine spiders?

2008-10-12 Thread Doug Boude (rhymes with 'loud')
Hi all. I am curious if anybody knows how securing a site affects a search 
engine spider's ability to crawl it. For instance, if I have my entire site 
secured by means of authentication so that any page request is redirected to 
the login page if the appropriate security creds are not present in session, do 
spiders receive the same treatment? Are they also prohibited by my security 
from crawling any page except the login page? If this is true, what can I do to 
allow spiders to have access to crawl content but still apply security to 
regular "human" visitors? My only thought on that is to detect the fact that 
they are a spider (not sure how to do that though) and not implement security 
in that case.

Thanks for your ideas and thoughts. Feel free to email them to me at
[EMAIL PROTECTED]

Doug  :0) 

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:313789