Testing Google's Rich Snippets RDFa support

Philip Taylor Sat, 12 Sep 2009 09:08:29 -0700

As a followup to the old news linked from <http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009May/0064.html>: Google has now made available a testing tool at <http://www.google.com/webmasters/tools/richsnippets>. As far as I'm aware it's using the same code that the real search engine results use.

I tested it a bit, and it seems that what's implemented in that tool bears very little relation to RDFa. It's not simply a buggy implementation - it's not even attempting to handle RDFa remotely correctly.

http://philip.html5.org/demos/rdfa/google-rich-snippets.html shows a few examples. It rejects some perfectly correct RDFa markup; it interprets some perfectly correct RDFa markup incorrectly; and it accepts some totally broken RDFa markup.

For example, the documentation at http://www.google.com/support/webmasters/bin/answer.py?answer=146646 includes:


  <a href="http://darryl-blog.example.com/" rel="v:friend">Darryl</a>

Google's tool says the output has "friend = Darryl", whereas RDFa says to ignore the element content and output a triple "... <http://rdf.data-vocabulary.org/#friend> <http://darryl-blog.example.com/>" instead, so the markup is being interpreted incorrectly.
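To make the difference concrete, here is a rough Python sketch of how an RDFa processor picks the object of a triple. It is a heavy simplification of the real processing rules, and the function name and the dictionary representation of attributes are just assumptions for illustration:

```python
def rdfa_object(attrs, text):
    """Simplified sketch: choose the object for one element's triple.

    attrs is a dict of the element's attributes, text is its text content.
    Real RDFa processing has many more rules; this covers only the cases
    discussed in this message.
    """
    if "rel" in attrs:
        # @rel produces a URI object taken from @resource or @href;
        # the element content is not used.
        return ("uri", attrs.get("resource", attrs.get("href")))
    if "property" in attrs:
        # @property produces a literal: @content if present, else the text.
        return ("literal", attrs.get("content", text))
    return None


link = {"href": "http://darryl-blog.example.com/", "rel": "v:friend"}
print(rdfa_object(link, "Darryl"))
```

Under this model the object is the URI from 'href', never the string "Darryl".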

With input like John <span property="v:nickname">Smith, Google's tool only extracts the name and ignores the nickname triple that an RDFa processor would generate, so it's again failing to interpret the markup correctly.

With input like error, it returns "name = error".

So it seems to totally ignore attributes like 'datatype' and 'content', and treats 'rel' identically to 'property', as far as I can tell.



Also, the tool accepts input like:

  <div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
    <span property="v:name">John Smith</span>
  </div>

while it rejects equivalent input like:

  <div xmlns:v="http://rdf.data-vocabulary.or" typeof="v:g/#Person">
    <span property="v:g/#name">John Smith</span>
  </div>
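
The two are equivalent because a CURIE is expanded by plain string concatenation of the prefix's URI and the part after the colon. A minimal sketch in Python (the function name is made up for illustration, and it ignores the finer points of the CURIE syntax):

```python
def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' by concatenating the prefix URI and the reference."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"cannot expand {curie!r}")
    return prefixes[prefix] + reference


# The accepted and the rejected markup expand to the same URI.
first = expand_curie("v:Person", {"v": "http://rdf.data-vocabulary.org/#"})
second = expand_curie("v:g/#Person", {"v": "http://rdf.data-vocabulary.or"})
print(first == second)
```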

It also accepts input like:

  <div xmlns:v="http://arbitrary.example.org/#" typeof="v:Person">
    <span property="v:name">John Smith</span>
  </div>

and apparently entirely ignores that it's in a different namespace, and processes the data as if it were in "http://rdf.data-vocabulary.org/#" (it still shows up in the search result preview regardless of namespace, as long as you have the right string after the colon).


It also accepts input like:

  <div typeof="zzz:Person">
    <span property="#:name">John Smith</span>
  </div>

and emits a warning about the undeclared namespaces but otherwise processes it as if it were all using the correct namespace.

So it seems that Google doesn't attempt to do any kind of namespace/CURIE processing at all (other than a little bit for the harmless warning) - it simply looks at the part of the attribute value after the colon (case-insensitively), and ignores everything else.
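
As a guess at what the tool is doing, the behaviour I see is consistent with something like the following Python sketch. This is purely a model inferred from the test results above, not Google's actual code, and the function name is hypothetical:

```python
def apparent_key(attr_value):
    """Hypothetical model of the observed behaviour: drop everything up to
    the first colon and compare the rest case-insensitively."""
    _, _, suffix = attr_value.partition(":")
    return suffix.lower()


# All of these collapse to the same key, whatever the prefix is bound to.
for value in ["v:name", "#:name", "zzz:Name"]:
    print(value, "->", apparent_key(value))

# The split-namespace form yields a different key, which would explain
# why that markup is rejected.
print(apparent_key("v:g/#name"))
```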

Am I doing something wrong here, or am I missing a good reason for this apparent behaviour? It seems very disappointing that Google is claiming to support RDFa while failing to implement it in a way that is remotely correct or compatible with other RDFa processors.


--
Philip Taylor
pj...@cam.ac.uk
