Hello,

I still haven't found the time to write a blog post about this, so I just
put the code on Pastebin:

http://pastebin.org/31242

I'm looking forward to your feedback :)

I tested this filter on Jetty and Tomcat (with Firefox's User Agent
Switcher), where it worked fine. However, as noted in the code, some app
servers might behave a little differently, so YMMV.


greetings,

Rüdiger



On Monday, 14.04.2008, at 16:37 +0200, Korbinian Bachl - privat wrote:
> Yeah, it's quite a shame that Google doesn't open source their logic ;)
> 
> It would be nice if you could share the code with us, though, so we could 
> have a look at it :)
> 
> Rüdiger Schulz wrote:
> > Hm, SEO really is a bit of a black art sometimes *g*
> > 
> > This (German) article states that SID cloaking is OK with Google:
> > http://www.trafficmaxx.de/blog/google/gutes-cloaking-schlechtes-cloaking
> > 
> > Some more googling, and here someone seems to confirm this:
> > http://www.webmasterworld.com/cloaking/3201743.htm
> > "I was actually at SMX West and Matt Cutts specifically said that this
> > is OK"
> > 
> > All I can say in our case is that I added this filter several months ago,
> > and I haven't seen any negative effects so far.
> > 
> > 
> > greetings,
> > 
> > Rüdiger
> > 
> > 
> > 2008/4/14, Korbinian Bachl - privat <[EMAIL PROTECTED]>:
> >> Hi Rüdiger,
> >>
> >> AFAIK this could lead to some punishment by Google, as it crawls the site
> >> multiple times using different user agents and origin IPs, and if it sees
> >> different behaviour it suspects cloaking/prepared content and will act
> >> accordingly;
> >>
> >> This is usually noticed after one of the regular Google index refreshes
> >> that happen a few times a year - you should keep an eye on this;
> >>
> >> Best,
> >>
> >> Korbinian
> >>
> >> Rüdiger Schulz wrote:
> >>
> >>> Hello everybody,
> >>>
> >>> I just want to add my 2 cents to this discussion.
> >>>
> >>> At IndyPhone we too wanted to get rid of jsessionid URLs in Google's
> >>> index.
> >>> Yeah, it would be nice if the Googlebot were as clever as the one from
> >>> Yahoo and just removed them itself. But it doesn't.
> >>>
> >>> So I implemented a servlet filter which checks the User-Agent header for
> >>> the Googlebot and skips the URL rewriting just for those clients. As this
> >>> would generate lots of new sessions, the filter invalidates the session
> >>> right after the request. Also, if a crawler makes a request containing a
> >>> jsessionid (which it stored before the filter was in place), the filter
> >>> redirects the crawler to the same URL without the jsessionid parameter.
> >>> That way, the index gets updated for those old URLs.
> >>>
> >>> Now we have almost none of those URLs in google's index.
> >>>
> >>> If anyone is interested in the code, I'd be willing to publish it. As it
> >>> is not Wicket specific, I could contribute it to some generic servlet
> >>> tools open source project - is there something like that at Apache or
> >>> elsewhere?
> >>>
> >>> But maybe Google is smarter by now, and this is no longer necessary?
> >>>
> >>>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> For additional commands, e-mail: [EMAIL PROTECTED]
> >>
> >>
> > 
> > 
> 
> 



