Wow. That's a pretty intense SEO solution. I hope the shadowing thing doesn't bite you; it's definitely a cool attempt.
Can't you just tickle your analytics urchin directly? See e.g. http://jdwyah.blogspot.com/2008/05/gwt-and-tickling-your-google-analytics.html

In your onLoad() you should just be able to tickle it with the current #anchor if it exists, no?

-Jeff

On Feb 27, 10:26 am, Nicolas Wetzel <[email protected]> wrote:
> Hi all,
>
> I'm working for a company that builds a music-broadcasting web site based
> on GWT: www.awdio.com
>
> On SEO, we've found some interesting ways to cope with an Ajax-specific
> problem: search engines have no JavaScript engine, so they are not able to
> retrieve the entire HTML produced by a GWT script (or by any other Ajax
> framework script). So each page (i.e. each "GWT screen") cannot be indexed
> by them. Rather than duplicate each page with a hand-made static HTML page
> accessible through the noscript tag, we produce them with a Java program
> that launches an SWTBrowser (Eclipse 3.4) with the start URL:
> http://www.awdio.com.
>
> The main issue with this approach is that the client program has no means
> of knowing when the page has been fully rendered by the JavaScript process.
> In the GWT awdio code we implemented a "semaphore" (flag) which notifies
> the SWTBrowser-based client of the completion.
>
> This semaphore works with a hidden <DIV> drawing</DIV> which is present or
> not in the DOM, i.e. the HTML content produced contains it or not.
> With org.eclipse.swt.browser.Browser.getText() we can retrieve the HTML
> content and test for the presence of the above-mentioned flag.
>
> To do that, the Java program listens for the Browser status-text event
> (org.eclipse.swt.browser.StatusTextListener).
>
> Also, when the page is loaded, the program gets the content and looks up
> all the internal links <a href="# built by the GWT Hyperlink widget. Before
> storing the HTML content in a cacheable static page, all the '#' are
> replaced by a '/' so that the bot will get fully qualified URLs (the
> crawlers do not handle anchors).
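To make the '#' to '/' step concrete, here is a minimal sketch of how the snapshot could be rewritten before it is cached. It is only an illustration, not the awdio code: the class name `AnchorRewriter` and the method `rewriteHashLinks` are made up, and it assumes the rewrite should touch only `href` attributes that start with '#' (the message says "all the '#'", which may be broader than this).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not taken from the awdio code base. */
final class AnchorRewriter {

    // Matches the start of an href attribute whose value begins with '#',
    // e.g. href="#events" or HREF='#events'. Group 1 is everything before '#'.
    private static final Pattern HASH_HREF =
            Pattern.compile("(href\\s*=\\s*[\"'])#", Pattern.CASE_INSENSITIVE);

    /** Returns the HTML with every href="#token" turned into href="/token". */
    static String rewriteHashLinks(String html) {
        Matcher m = HASH_HREF.matcher(html);
        // Keep the captured prefix and swap the leading '#' for '/'.
        return m.replaceAll("$1/");
    }

    public static void main(String[] args) {
        String snapshot = "<a href=\"#events\">Events</a>";
        System.out.println(rewriteHashLinks(snapshot));
    }
}
```

Restricting the match to `href` values leaves any '#' in visible text (track numbers, for instance) alone, which a blanket character replacement would not.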
>
> Finally, the program follows each link with the SWTBrowser so that the
> static versions of all the pages can be produced automatically.
>
> Lastly, on the awdio server, a front-end servlet inspects the User-Agent
> of the request: if it's a search engine, the statically produced page is
> returned; otherwise the GWT host page is returned.
> As far as I understand, this might be considered "shadowing". But the
> content seen by the crawler is exactly the same as the one seen by the
> user (after JavaScript execution).
>
> In the onModuleLoad of the awdio EntryPoint, the right part of the URL is
> parsed to build the corresponding historyToken. So when
> www.awdio.com/events is requested in a browser, it reacts in the same way
> as if the user had clicked on an internal link (#events).
>
> Everything looks fine, but there is still a big issue...
>
> If a user copies and pastes one of our URLs on his own site, it will
> contain the hash sign (e.g. http://www.awdio.com/#events). Which means
> that the search engine will not rank pages independently (all pages will
> be considered as a single one: http://www.awdio.com).
>
> We can still add "link to this page" buttons wherever necessary, but it's
> not satisfactory.
>
> To conclude, it seems that this whole solution solves the AJAX indexing
> issue, with the very annoying exception of page ranking (due to the
> #anchor URLs). Maybe Google should start to consider #anchors as having a
> new meaning for our Web 2.0 generation? Maybe by considering a specific
> value of the "rel" attribute (e.g. <A HREF="#mypage" rel="ispagelink">My
> Page</A>)?

You received this message because you are subscribed to the Google Groups "Google Web Toolkit" group.
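For the server side described in the quoted message (user-agent check in the front-end servlet, and turning the path back into a history token in onModuleLoad), a rough sketch of the two decisions as plain functions might look like this. Everything here is an assumption for illustration: the class `CrawlerRouting`, both method names, and the short list of crawler markers are invented, and the real servlet and EntryPoint are not shown in the thread.

```java
import java.util.Locale;

/** Hypothetical helpers, not taken from the awdio code base. */
final class CrawlerRouting {

    // Assumed, deliberately incomplete list of substrings found in crawler
    // User-Agent headers. A real deployment would need a maintained list.
    private static final String[] BOT_MARKERS = {"googlebot", "slurp", "msnbot"};

    /** True if the User-Agent header looks like one of the listed crawlers. */
    static boolean isCrawler(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String marker : BOT_MARKERS) {
            if (ua.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    /** Maps a request path such as "/events" to the history token "events". */
    static String historyTokenFromPath(String path) {
        if (path == null) {
            return "";
        }
        // Drop one leading and one trailing slash, if present.
        String token = path.startsWith("/") ? path.substring(1) : path;
        if (token.endsWith("/")) {
            token = token.substring(0, token.length() - 1);
        }
        return token;
    }

    public static void main(String[] args) {
        System.out.println(isCrawler("Mozilla/5.0 (compatible; Googlebot/2.1)"));
        System.out.println(historyTokenFromPath("/events"));
    }
}
```

A servlet could call `isCrawler` on the request's User-Agent header to pick between the cached static page and the GWT host page, and the EntryPoint could feed the result of `historyTokenFromPath` to its history handling so that /events and #events end up on the same screen.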
