Hi Mike,

People who have seen Exhibit very often ask whether Google will see 
their data. To convince a professor to put all of her publications into 
Exhibit, we need to assure her that they will remain as searchable as 
they are in their current form.

Thank you for the link to Google sitemap. I can't seem to find out how 
to make Google index Exhibit JSON data linked from an HTML page as if 
the data itself were inside that HTML page. (This is because I don't 
want Google searchers to end up looking at the JSON rather than the 
HTML.) Do you know how?
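
In the meantime, the only fallback I can picture is along the lines of 
your separate-page suggestion: generate a static HTML listing from the 
JSON at publish time and link to it from the Exhibit page. Below is a 
minimal sketch in Python, not anything Exhibit provides. The function 
name render_static_listing and the sample record are made up for 
illustration, and I am assuming the data file has a top-level "items" 
array whose entries carry a "label" property.

```python
import html
import json


def render_static_listing(exhibit_json: str, title: str) -> str:
    """Render the items of an Exhibit-style JSON document as plain HTML.

    Hypothetical helper: assumes a top-level "items" list of flat records.
    """
    data = json.loads(exhibit_json)
    rows = []
    for item in data.get("items", []):
        label = html.escape(str(item.get("label", "")))
        # Every other property becomes "name: value" text a crawler can read.
        details = "; ".join(
            f"{html.escape(str(key))}: {html.escape(str(value))}"
            for key, value in item.items()
            if key != "label"
        )
        rows.append(f"<li><b>{label}</b> {details}</li>")
    body = "\n".join(rows)
    safe_title = html.escape(title)
    return (
        f"<html><head><title>{safe_title}</title></head>\n"
        f"<body><h1>{safe_title}</h1>\n<ul>\n{body}\n</ul>\n</body></html>"
    )


# Made-up sample record, only to show the shape of the output.
sample = '{"items": [{"label": "A Paper", "type": "Publication", "year": 2007}]}'
page = render_static_listing(sample, "Publications")
print(page)
```

The generated page would sit next to the Exhibit page as ordinary 
static HTML, so a searcher who finds it lands on HTML rather than on 
the raw JSON.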

Thanks,

David

Michael K. Bergman wrote:
> Hi David,
>
> I don't mean to be cavalier or dismissive of this concern, but the truth 
> is that MOST dynamic content (from Ajax or underlying databases) has 
> always been invisible to crawlers, Google or otherwise.  The so-called 
> 'deep Web' or 'invisible Web' has been written about for years (some by 
> me).  Estimates are that 'deep Web' content may range from 2x to 10x or 
> more of the 'surface Web' content that is discoverable by crawlers. 
> Indeed, I rather suspect that most data in Google spreadsheets itself is 
> unindexed by Google (use the site:http://spreadsheets.google.com search; 
> only about 700 sites are listed, which I suspect have had links embedded 
> in standard static pages).
>
> Since the content in an Exhibit display is equivalent to a standard 
> database record, the availability of the records themselves should not 
> be of terrible concern (like individual addresses in an address book or 
> individual events).  However, the overall nature of the database itself 
> is LIKELY important.  Thus, one good practice is to make sure that an 
> Exhibit display has an intro section in standard HTML describing the 
> datasets and the display, or is linked to from another page that 
> provides a similar description.
>
> Another alternative is to create a sitemap with a separate page showing 
> some information for all of the records in the database (this can be an 
> unobtrusive link to the JSON records themselves; while ugly, the content 
> would still get indexed).
>
> At any rate, there ARE good practices to overcome the crawl limitations 
> of dynamic content.  I definitely would not call this "perhaps the 
> biggest impediment to adoption" for Exhibit since it is shared by so 
> many sites and applications.
>
> If you are concerned, you can check the crawl status of a Web site on 
> Google by going to:  https://www.google.com/webmasters/tools/sitestatus. 
> That screen will also take you to a series of Webmaster options provided 
> by Google (more if you can verify you are the site owner).
>
> So, I don't recommend any changes be made directly to the Exhibit code 
> itself.  If desirable, I could draft a short note for the wiki to 
> inform Exhibit users of the steps (including SEO) they might take, 
> better capturing some of the items above.
>
> Thanks, Mike
>
> David Huynh wrote:
>   
>> Hi all,
>>
>> Exhibit suffers from the same Achilles heel as other Ajax applications: 
>> the dynamic content that gets inserted on-the-fly is totally invisible 
>> to Google. My whole web site is now invisible to Google :-) Perhaps this 
>> is the biggest impediment to adoption.
>>
>>     
> _______________________________________________
> General mailing list
> [email protected]
> http://simile.mit.edu/mailman/listinfo/general
>   

