Wow. That's a pretty intense SEO solution. I hope the shadowing thing
doesn't bite you, it's definitely a cool attempt.

Can't you just tickle your analytics urchin directly? See:
http://jdwyah.blogspot.com/2008/05/gwt-and-tickling-your-google-analytics.html

In your onLoad() you should just be able to tickle it with the current
#anchor if it exists, no?

-Jeff

On Feb 27, 10:26 am, Nicolas Wetzel <[email protected]> wrote:
> Hi all,
>
> I work for a company that is building a music-broadcasting web site based
> on GWT: www.awdio.com
>
> On the SEO side, we've found an interesting way to cope with an
> Ajax-specific problem: search engine crawlers have no JavaScript engine, so
> they cannot retrieve the full HTML produced by the GWT script (or by any
> other Ajax framework script). As a result, no page (i.e. no "GWT screen")
> can be indexed by them. Rather than duplicating each page with a hand-made
> static HTML page accessible through the noscript tag, we generate them with
> a Java program which launches an SWTBrowser (Eclipse 3.4) with the start
> URL: http://www.awdio.com.
>
> The main issue with this approach is that the client program has no means of
> knowing when the page is fully rendered by the javascript process.
> In the gwt awdio code we implemented a "semaphore" (flag) which notifies the
> SWTBrowser based client of the completion.
>
> This semaphore works with a hidden <DIV> drawing</DIV> which is either
> present in the DOM or not, i.e. the HTML content produced contains it or not.
> With org.eclipse.swt.browser.Browser.getText() we can retrieve the HTML
> content and test for the presence of the above-mentioned flag.
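
A minimal sketch of the flag check described above, in plain Java so it runs without SWT. The marker string and the class and method names are my assumptions (the post does not give the flag's exact markup); in the real program the html argument would be whatever Browser.getText() returns when the status text event fires.

```java
import java.util.Locale;

final class RenderFlag {
    // Hypothetical marker: the original post does not show the flag's exact markup.
    static final String MARKER = "<div id=\"render-complete\"";

    /** Returns true once the HTML snapshot contains the completion marker. */
    static boolean isRenderComplete(String html) {
        // Compare in lower case so <DIV ...> and <div ...> are treated alike.
        return html != null && html.toLowerCase(Locale.ROOT).contains(MARKER);
    }

    public static void main(String[] args) {
        String loading = "<html><body><div id=\"app\"></div></body></html>";
        String done = "<html><body><div id=\"app\">content</div>"
                + "<DIV id=\"render-complete\" style=\"display:none\"></DIV></body></html>";
        System.out.println(isRenderComplete(loading)); // false
        System.out.println(isRenderComplete(done));    // true
    }
}
```

The listener would keep ignoring events until this returns true, then take the snapshot.
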
>
> To do that, the Java program listens for the Browser's status text event
> (org.eclipse.swt.browser.StatusTextListener).
>
> Also, when the page is loaded, the program gets the content and looks up
> all the internal links (<a href="#) built by the GWT Hyperlink widget. Before
> storing the HTML content in a cacheable static page, all the '#' are
> replaced by a '/' so that the bot will get fully qualified URLs (the
> crawlers do not handle anchors).
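
For illustration, the '#' to '/' rewrite could be sketched like this. This is not the awdio code: the class name and the regex are my assumptions, and the sketch is deliberately narrower than "replace every '#'", touching only double-quoted href attributes that start with '#'.

```java
import java.util.regex.Pattern;

final class AnchorRewriter {
    // Matches href="#token" with double quotes only; single-quoted or
    // unquoted attributes are left alone in this sketch.
    private static final Pattern INTERNAL_LINK =
            Pattern.compile("href=\"#([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Turns href="#events" into href="/events"; other text is untouched. */
    static String rewriteInternalLinks(String html) {
        return INTERNAL_LINK.matcher(html).replaceAll("href=\"/$1\"");
    }

    public static void main(String[] args) {
        String html = "<a href=\"#events\">Events</a>";
        System.out.println(rewriteInternalLinks(html)); // <a href="/events">Events</a>
    }
}
```

Restricting the match to href attributes avoids mangling any '#' that appears in visible text or in external URLs.
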
>
> Finally, the program follows each link with the SWTBrowser so that static
> versions of all the pages can be produced automatically.
>
> Lastly, on the awdio server, a front-end servlet checks the user-agent of
> each request: if it's a search engine, the generated static page is
> returned; otherwise the GWT host page is returned.
> As far as I understand, this might be considered "shadowing". But the
> content seen by the crawler is exactly the same as the one seen by the user
> (after Javascript execution).
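
The user-agent switch might look roughly like the sketch below, written without the servlet API so it runs on its own. The bot substrings, the /static/ layout and the /index.html host page are all hypothetical; in a servlet the userAgent argument would come from request.getHeader("User-Agent").

```java
import java.util.Locale;

final class CrawlerDispatch {
    // Illustrative substrings only; a real list would need regular upkeep.
    private static final String[] BOT_HINTS = {"googlebot", "bingbot", "slurp"};

    static boolean isSearchEngine(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String hint : BOT_HINTS) {
            if (ua.contains(hint)) {
                return true;
            }
        }
        return false;
    }

    /** Picks the resource to serve: a pre-rendered snapshot for crawlers, the host page otherwise. */
    static String selectPage(String userAgent, String requestPath) {
        if (!isSearchEngine(userAgent)) {
            return "/index.html"; // hypothetical GWT host page
        }
        // Strip leading and trailing slashes: "/events" becomes "events".
        // No further sanitizing is done here; this is a sketch only.
        String name = requestPath == null ? "" : requestPath.replaceAll("^/+|/+$", "");
        if (name.isEmpty()) {
            name = "index";
        }
        return "/static/" + name + ".html"; // hypothetical snapshot location
    }

    public static void main(String[] args) {
        String bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";
        String human = "Mozilla/5.0 (Windows NT 10.0; rv:115.0) Gecko/20100101 Firefox/115.0";
        System.out.println(selectPage(bot, "/events"));   // /static/events.html
        System.out.println(selectPage(human, "/events")); // /index.html
    }
}
```
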
>
> In the onModuleLoad of the awdio EntryPoint, the right-hand part of the URL
> is parsed to build the corresponding historyToken. So when
> www.awdio.com/events is requested in a browser, it reacts in the same way as
> if the user clicked on an internal link (#events).
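
A small sketch of the path-to-token step, in plain Java. The helper name and the trimming rules are my assumptions; in the GWT entry point the path would presumably come from Window.Location.getPath() and the result would be handed to History.newItem(token).

```java
final class HistoryTokens {
    /** Derives a history token from a URL path: "/events" becomes "events". */
    static String tokenFromPath(String path) {
        if (path == null) {
            return "";
        }
        String p = path;
        // Drop a query string if one was passed along with the path.
        int query = p.indexOf('?');
        if (query >= 0) {
            p = p.substring(0, query);
        }
        // Trim leading and trailing slashes.
        while (p.startsWith("/")) {
            p = p.substring(1);
        }
        while (p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(tokenFromPath("/events")); // events
        System.out.println(tokenFromPath("/").isEmpty()); // true
    }
}
```

An empty token would simply mean "show the default screen".
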
>
> Everything looks fine, but there is still a big issue...
>
> If a user copies and pastes one of our URLs on his own site, it will contain
> the hash sign (e.g. http://www.awdio.com/#events), which means that the
> search engine will not rank pages independently (all pages will be
> considered a single one: http://www.awdio.com).
>
> We can still add "link to this page" buttons wherever necessary, but it's
> not satisfactory.
>
> To conclude, it seems that this whole solution solves the AJAX indexing
> issue, with the very annoying exception of page ranking (due to the #anchor
> URLs). Maybe Google should start to consider #anchors as having a new
> meaning for our Web 2.0 generation? Perhaps through a specific value
> of the "rel" attribute (e.g. <A HREF="#mypage" rel="ispagelink">My
> Page</A>)?