Interesting ... I wonder whether that might also help with scalability for large datasets. Is there some middle ground to be found on this front? I'm not sure we can get meaningful gains from a mixed client-side/server-side hybrid (and still retain the "exhibit experience"), but I can't help but wonder.
On Tue, May 26, 2009 at 5:49 PM, David Huynh <[email protected]> wrote:
>
> Search engines are (probably) only interested in crawling visible HTML
> content, so anything to be crawled must be in HTML, and that spoils the
> whole point of separating data from presentation. I think the only way
> to have both separation of data and presentation as well as
> crawl-ability is to store the data in JSON files or whatever, and have a
> cached rendering of *some* of the data in HTML. Maybe you can specify
> some ordering of the items as well as a cut-off limit, and that
> determines which items--potentially the most interesting ones--get
> rendered into HTML. That way you won't duplicate the data 100%.
>
> So your PHP file will look something like this:
>
>   <html>
>     <head>
>       <link rel="exhibit/data" href="data1.json" type="application/json" />
>       <link rel="exhibit/data" href="data2.rdf" type="application/rdf+xml" />
>     </head>
>     <body>
>       ...
>       <div ex:role="lens" id="template-1" ...>...</div>
>
>       <noscript>
>         <?php
>           $curl_handle = curl_init();
>           curl_setopt($curl_handle, CURLOPT_URL,
>             'http://service.simile-widgets.org/exhibit-render?');
>           curl_exec($curl_handle);
>           curl_close($curl_handle);
>         ?>
>       </noscript>
>     </body>
>   </html>
>
> The trouble is how to pass data1.json, data2.rdf, and the lens template
> to the web service exhibit-render. We could potentially make a PHP
> library file that, when you include it into another PHP file, parses
> the containing PHP file, extracts the data links and lens templates,
> and calls the web service exhibit-render automatically:
>
>   <?php
>     include("exhibit-rendering-lib.php");
>     # id of lens template to use, sort-by expression, sort ascending, limit
>     renderExhibit("template-1", ".age", true, 10);
>   ?>
>
> I don't know enough PHP to know if that's possible / easy.
>
> David
>
> John Clarke Mills wrote:
> > Vincent,
> >
> > Although the idea of detecting the user agent is a sound one, this can
> > also be construed as cloaking, for which, if caught, you will be
> > penalized by Google. I often flip a coin in my head on a subject like
> > this, because what you are saying makes perfect sense; however, we
> > don't always know how Googlebot is going to react.
> >
> > Just some food for thought. There's a good chance I will be attempting
> > to combat this problem in the near future, and I will report back.
> >
> > Cheers.
> >
> > On May 26, 1:02 am, Vincent Borghi <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> On Sat, May 23, 2009 at 2:36 AM, David Huynh <[email protected]> wrote:
> >>
> >>> Hi all,
> >>>
> >>> Google recently introduced "rich snippets", which are basically
> >>> microformats and RDFa:
> >>>
> >>> http://googlewebmastercentral.blogspot.com/2009/05/introducing-rich-s...
> >>>
> >>> The idea is that if your web page is marked up with certain
> >>> attributes, then search results from your web page will look better
> >>> on Google.
> >>>
> >>> So far, exhibits' contents are not crawl-able at all by search
> >>> engines, because they are contained inside JSON files rather than in
> >>> HTML, and they are then rendered dynamically in the browser.
> >>>
> >>> Since Google is starting to pay attention to structured data within
> >>> web pages, I think it might be a really good time to start thinking
> >>> about how to make exhibits crawl-able *and* compatible with Google's
> >>> support for microformats and RDFa at the same time. Two birds with
> >>> one stone.
> >>>
> >>> One possible solution is that if you use Exhibit within a PHP file,
> >>> then you could make the PHP file get some service like Babel to take
> >>> your JSON file and generate HTML with microformats or RDFa, and
> >>> inject that into a <noscript> block.
> >>>
> >>> Please let me know if you have any thoughts on that!
> >>>
> >> As far as I understand, in the possible solution you mention, you
> >> always end up doubling the volume of the served data: you serve the
> >> original JSON plus a specially tagged version in a <noscript>.
> >>
> >> This works and is surely appropriate in many cases.
> >>
> >> I just add, as a remark, that since it costs bandwidth to serve
> >> additional data (data specially tagged for Google) that in the
> >> general case (a human visitor using a browser) is not used, an
> >> alternative solution may be preferable in certain cases, when it is
> >> possible:
> >>
> >> For those of us who can customize the httpd.conf configuration of our
> >> Apache server, we may prefer to implement the solution of serving, on
> >> the same URL, two different versions:
> >> - one version being the "normal" exhibit, for "normal" human visitors,
> >> - and the other, for (Google)bots, being an ad-hoc HTML page (either
> >>   static, or dynamically generated by CGI or similar, with or without
> >>   Babel).
> >>
> >> This assumes we configure Apache to serve, for the same given URL,
> >> one version or the other, depending on the user agent that visits the
> >> URL (using an appropriate "RewriteCond %{HTTP_USER_AGENT} ..." /
> >> "RewriteRule ..." pair in the Apache httpd.conf).
> >>
> >> Regards

--
You received this message because you are subscribed to the Google Groups
"SIMILE Widgets" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at
http://groups.google.com/group/simile-widgets?hl=en
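[Editor's note] David's exhibit-render idea above can be sketched in PHP. To be clear about what's assumed: the exhibit-render web service is hypothetical (the thread is proposing it, not describing an existing API), and the query-parameter names (`lens`, `orders`, `limit`, `data1`, `data2`, ...) are made up here for illustration. The one concrete fix over the quoted snippet is `CURLOPT_RETURNTRANSFER`, so the fetched HTML is returned to the caller (who can cache it and wrap it in `<noscript>`) rather than echoed immediately.

```php
<?php
// Sketch only: the exhibit-render service and its parameter names are
// assumptions from the thread, not a real API.

// Build the service URL from the exhibit's data links and lens settings.
function buildRenderUrl($dataLinks, $lensId, $sortExpr, $ascending, $limit) {
    $params = array(
        'lens'   => $lensId,
        'orders' => $sortExpr . ($ascending ? ' ascending' : ' descending'),
        'limit'  => $limit,
    );
    foreach ($dataLinks as $i => $link) {
        $params['data' . ($i + 1)] = $link;   // data1=..., data2=...
    }
    return 'http://service.simile-widgets.org/exhibit-render?'
        . http_build_query($params);
}

// Fetch the pre-rendered HTML; returns '' on failure so the page still loads.
function renderExhibit($dataLinks, $lensId, $sortExpr, $ascending, $limit) {
    $url = buildRenderUrl($dataLinks, $lensId, $sortExpr, $ascending, $limit);
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body, don't echo
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $html = curl_exec($ch);
    curl_close($ch);
    return $html === false ? '' : $html;
}
?>
```

The page would then call `echo renderExhibit(array('data1.json', 'data2.rdf'), 'template-1', '.age', true, 10);` inside its `<noscript>` block. Parsing the data links out of the containing file automatically, as David suggests, is left out here; passing them explicitly is simpler.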
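[Editor's note] For readers unfamiliar with what the crawl-able `<noscript>` output would need to contain: Google's rich snippets announcement used the `rdf.data-vocabulary.org` RDFa vocabulary. A minimal fragment of the kind of markup a Babel-style renderer might emit for one exhibit item could look like the following (the names and values are invented examples):

```html
<!-- Illustrative only: vocabulary from Google's rich-snippets examples;
     the item data here is made up. -->
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
  <span property="v:name">Alice Example</span>,
  <span property="v:title">Curator</span> at
  <span property="v:affiliation">Example Museum</span>
</div>
```

One such block per rendered item, wrapped in `<noscript>`, is what the cut-off limit David proposes would bound.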
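[Editor's note] Vincent's Apache approach can be sketched with mod_rewrite. The file names and the bot list below are assumptions for illustration, not a recommendation; note John's caveat above that serving different content by user agent can be construed as cloaking.

```apache
# Sketch only: adjust file names and the user-agent list to your setup.
# Serve a pre-rendered static page to known crawlers, and the normal
# Exhibit page to everyone else, at the same URL.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (Googlebot|Slurp|msnbot) [NC]
RewriteRule ^exhibit\.html$ /exhibit-static.html [L]
```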
