Google and Struts

2003-09-15 Thread Christian Bollmeyer
Hi,

Just a simple question:

1.) Does Google follow links to Struts actions (ending in .do)?
2.) If so, does it make a difference if the ;jsessionid part is
appended to the URL?

Background: I'm working on a larger site that uses
Struts for request processing and content delivery, with
most Actions being mere forwards to static HTML and
JSP pages. From the logs I can tell that Googlebot
fetches the top page / (which contains plenty of
additional .do links), but then just goes away. That
should not happen. We have searched the other
resources available on the web, but none of them
seems to deal with Struts in particular, so I'm
asking here.

In the meantime, we've already tried suppressing all
session information. Calling the 'default' forward action
always opens a session and appends the 'jsessionid' to
all page links if cookies are disabled (which is the case
with search robots), even if the JSPs don't require a
session and even though this is explicitly switched off
via a <%@ page %> directive stating session="false".
Quick solution: write a custom PageAction that does the
same thing but does not open a session.

If anybody here has any experience with this Google
problem, I'd be glad to learn from it. Or does Google
generally follow .do links, and something else on the
site is probably wrong?
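
For illustration only, here is a minimal sketch of what the rewritten
links look like and how the session part differs from the plain URL.
SessionIdStripper and strip are hypothetical names, not part of Struts
or the servlet API; the sketch works on plain strings and does not
show how the container is stopped from appending the id in the first
place.

```java
import java.util.regex.Pattern;

public class SessionIdStripper {
    // Matches the ";jsessionid=..." path parameter up to the next
    // '?', '#', ';' or the end of the URL.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#;]*", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns the URL without the session id part.
    public static String strip(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        // A link as a cookie-less client such as a crawler would see it.
        System.out.println(strip("/showTopic.do;jsessionid=ABC123?topic=5"));
        // A link that was never rewritten is returned unchanged.
        System.out.println(strip("/index.do"));
    }
}
```

The query string survives because the pattern stops at the '?', so
only the session part is dropped.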

To share some of my thoughts: one countermeasure
might be to have Struts also handle requests ending
in .htm (three letters), tricking Google into believing
it is fetching a static file when in fact a Struts Action
is invoked. It won't be able to tell, not even when
forwarding etc. Static HTML files could keep the
.html (four letters) ending and be delivered and
indexed as-is. Or the other way round. I'm not sure
I really like this approach, but still. Suppressing the
'jsessionid' on the top page hasn't made any
difference so far.
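
A minimal sketch of why a client could not tell the two endings
apart, assuming the controller servlet were mapped to both *.do and
*.htm (the thread does not confirm that this setup works).
ActionPathDemo and actionPath are hypothetical names; the method only
mimics the general idea of extension mapping, where the suffix is
dropped to find the action, and is not Struts code.

```java
public class ActionPathDemo {
    // Hypothetical helper: derive an action path from a request path by
    // dropping the extension, if one follows the last slash.
    public static String actionPath(String servletPath) {
        int slash = servletPath.lastIndexOf('/');
        int period = servletPath.lastIndexOf('.');
        if (period >= 0 && period > slash) {
            return servletPath.substring(0, period);
        }
        // No extension in the last path segment: leave the path alone.
        return servletPath;
    }

    public static void main(String[] args) {
        // Both endings resolve to the same action path.
        System.out.println(actionPath("/showTopic.do"));
        System.out.println(actionPath("/showTopic.htm"));
    }
}
```

Under this assumption the ending is purely cosmetic: the same action
is selected whether the link says .do or .htm.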

What are your experiences? 

-- Chris (SCPJ2)
 


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Google and Struts

2003-09-15 Thread Louise Pryor
Google certainly crawls my site, which uses .do actions, and does a
pretty good job; I get a number of visitors referred straight into the
.do pages (which all look like showTopic.do?topic=nn) from Google
searches. On the other hand, I don't do URL rewriting, so I don't have
the jsessionid thingies.

I took the decision not to do URL rewriting because
- I don't need to track sessions
- the jsessionid thingies screwed up the crawler from Atomz, which I
use to provide site searching.

HTH.

Louise


On Monday, September 15, 2003 at 8:57:56 PM, Christian Bollmeyer wrote:

CB Hi,

CB just a simple Question:

CB 1.) does Google follow links to Struts actions (ending with .do)?
CB 2.) if so, does it make a difference if the ;jsessionid thing is
CB appended to the URL?


-- 
Louise Pryor
http://www.louisepryor.com


