Maybe instead of physical separation we can settle for logical separation.

Suppose we allow <link rel="exhibit/data" href="#local"> to specify
that the data can be found in the element with name or id "local" in the
HTML doc?  That data can be CDATA-encoded and meets the goal of being
machine readable.  It does require XML parsing, but that's a relatively
small cost.
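
To make that concrete, here is a rough sketch of how a consumer might resolve
such a link, written in Python only because it is quick to try out. Everything
in it is an assumption for illustration: the sample page, the id "local", the
JSON payload, and the function name load_local_data are made up, and the page
is assumed to be well-formed XML with no namespace on its elements.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical page: the id "local" and the JSON payload are invented examples.
PAGE = """<html>
  <head>
    <link rel="exhibit/data" href="#local" type="application/json" />
  </head>
  <body>
    <div id="local"><![CDATA[
      {"items": [{"label": "Alice", "age": 30}, {"label": "Bob & Co", "age": 25}]}
    ]]></div>
  </body>
</html>"""


def load_local_data(page_source):
    """Return the parsed JSON for every same-document exhibit/data link."""
    root = ET.fromstring(page_source)
    datasets = []
    for link in root.iter("link"):
        href = link.get("href", "")
        if link.get("rel") != "exhibit/data" or not href.startswith("#"):
            continue  # links to external files (e.g. data1.json) are skipped
        target = href[1:]
        for element in root.iter():
            if element.get("id") == target or element.get("name") == target:
                # The XML parser has already unwrapped the CDATA section,
                # so element.text is the raw JSON string.
                datasets.append(json.loads(element.text))
                break
    return datasets
```

With the sample page above, load_local_data(PAGE) returns a one-element list,
and the CDATA wrapper is what lets the bare "&" in "Bob & Co" survive the XML
parse without escaping.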

David Huynh wrote:
> Search engines are only interested in crawling (probably) visible HTML 
> content, so anything to be crawled must be in HTML, and that spoils the 
> whole point of separating data from presentation. I think the only way 
> to have both separation of data and presentation as well as 
> crawl-ability is to store the data in JSON files or whatever, and have a 
> cached rendering of *some* of the data in HTML. Maybe you can specify 
> some ordering of the items as well as a cut-off limit, and that 
> determines which items--potentially the most interesting ones--get 
> rendered into HTML. That way you won't duplicate the data 100%.
>
> So your PHP file will look something like this
>
>     <html>
>         <head>
>        
>             <link rel="exhibit/data" href="data1.json" type="application/json" />
>             <link rel="exhibit/data" href="data2.rdf" type="application/rdf+xml" />
>            
>         </head>
>         <body>
>             ...
>            
>             <div ex:role="lens" id="template-1" ...>...</div>
>
>             <noscript>
>             <?php
>             $curl_handle = curl_init();
>             curl_setopt($curl_handle, CURLOPT_URL, 'http://service.simile-widgets.org/exhibit-render?');
>             curl_exec($curl_handle);
>             curl_close($curl_handle);
>             ?>
>             </noscript>
>
>         </body>
>     </html>
>
> The trouble is how to pass data1.json, data2.rdf, and the lens template 
> to the web service exhibit-render. We could potentially make a php 
> library file that when you include it into another php file, it parses 
> the containing php file, extracts out the data links and lens templates, 
> and calls the web service exhibit-render automatically.
>
>     <?php
>        include("exhibit-rendering-lib.php");
>        # arguments: id of lens template to use, sort-by expression, sort ascending, limit
>        renderExhibit("template-1", ".age", true, 10);
>     ?>
>
> I don't know enough php to know if that's possible / easy.
>
> David
>
>
> John Clarke Mills wrote:
>   
>> Vincent,
>>
>> Although the idea of detecting the user agent is a sound one, this can
>> also be construed as cloaking, for which you will be penalized by
>> Google if caught.  I often flip a coin in my head on a subject like this
>> because what you are saying makes perfect sense; however, we don't always
>> know how Googlebot is going to react.
>>
>> Just some food for thought.  There's a good chance I will be
>> attempting to combat this problem in the near future and I will report
>> back.
>>
>> Cheers.
>>
>> On May 26, 1:02 am, Vincent Borghi <[email protected]> wrote:
>>   
>>     
>>> Hi,
>>>
>>>
>>>
>>> On Sat, May 23, 2009 at 2:36 AM, David Huynh <[email protected]> wrote:
>>>
>>>     
>>>       
>>>> Hi all,
>>>>       
>>>> Google recently introduced "rich snippets", which are basically
>>>> microformats and RDFa:
>>>>       
>>>> http://googlewebmastercentral.blogspot.com/2009/05/introducing-rich-s...
>>>>       
>>>> The idea is that if your web page is marked up with certain attributes
>>>> then search results from your web page will look better on Google.
>>>>       
>>>> So far exhibits' contents are not crawl-able at all by search engines,
>>>> because they are contained inside JSON files rather than in HTML, and
>>>> they are then rendered dynamically in the browser.
>>>>       
>>>> Since Google is starting to pay attention to structured data within web
>>>> pages, I think it might be a really good time to start thinking about
>>>> how to make exhibits crawl-able *and* compatible with Google's support
>>>> for microformats and RDFa at the same time. Two birds with one stone.
>>>>       
>>>> One possible solution is that if you use Exhibit within a php file, then
>>>> you could make the php file get some service like Babel to take your
>>>> JSON file and generate HTML with microformats or RDFa, and inject that
>>>> into a <noscript> block.
>>>>       
>>>> Please let me know if you have any thought on that!
>>>>       
>>>>         
>>> As far as I understand, in the possible solution you mention, you
>>> always end up doubling the volume of the served data: you serve the
>>> original JSON plus a specially tagged version in a <noscript>.
>>>
>>> This works and is surely appropriate in many cases.
>>>
>>> I would just add that, since it costs bandwidth to serve additional
>>> data (data specially tagged for Google) that in the general case
>>> (a human visitor using a browser) is not used, an alternative solution
>>> may be preferable in certain cases, when it is possible:
>>>
>>> Those of us who can customize the httpd.conf configuration
>>> of our Apache server may prefer to serve, on the same URL,
>>> two different versions:
>>>  - one version being the "normal" exhibit, for "normal" human visitors,
>>>  - and the other, for (Google)bots, being an ad-hoc HTML page (either
>>> static or dynamically generated by CGI or similar, with or without Babel).
>>>
>>> This assumes we configure Apache to serve, for the same given URL,
>>> one version or the other, depending on the user agent that visits
>>> this URL (using an appropriate "RewriteCond %{HTTP_USER_AGENT} ..." /
>>> "RewriteRule ..." in the Apache httpd.conf).
>>>
>>> Regards
