Update to RDFa extraction stylesheet
------------------------------------
Key: ANY23-65
URL: https://issues.apache.org/jira/browse/ANY23-65
Project: Apache Any23
Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Ben Companjen
The RDFa 1.1 Core specification says that namespace prefixes in HTML5 should be
declared in a "prefix" attribute, with whitespace between each prefix and its
namespace URI, like this: "ns1: http://example.org/ ns2: http://example.com/"
My sample HTML page does this, but Sindice, which uses Any23, did not read my
namespace correctly. I narrowed the problem down to the XSLT template
"tokenize2" in the rdfa.xslt stylesheet and changed it accordingly. The template
expected "ns1:http://example.org/ ns2:http://example.com/" (no space between
prefix and namespace URI) and did not normalize whitespace such as line breaks
(although I'm not sure the latter broke the functionality).
I use Any23 0.6.1 locally, but
http://svn.apache.org/viewvc/incubator/any23/trunk/core/src/main/resources/org/apache/any23/extractor/rdfa/rdfa.xslt?revision=1231556&view=markup
shows that the template is the same in the trunk.
One possible problem is that the new template will not accept non-spaced
namespace definitions, such as those found in the RDFa produced by Best Buy. A
further improvement to my template would be to accept namespace definitions
both with and without spaces.
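
For illustration only, the tolerant behaviour could be sketched in Python as
follows. This is not the XSLT from rdfa.xslt or from my changed template; the
function name parse_prefix_attribute and the regular expression are assumptions
made up for this sketch. It treats the whitespace after the colon as optional
and lets line breaks count as ordinary whitespace.

```python
import re

# One "prefix:URI" pair. \s* makes the whitespace after the colon optional,
# and \s also matches line breaks inside the attribute value.
# (Hypothetical pattern for this sketch; empty default prefixes are ignored.)
_PAIR = re.compile(r"([^\s:]+):\s*(\S+)")


def parse_prefix_attribute(value):
    """Return a dict mapping each prefix to its namespace URI."""
    return {prefix: uri for prefix, uri in _PAIR.findall(value)}


# Spaced form (with a line break in the middle) and non-spaced form.
spaced = "ns1: http://example.org/ ns2:\nhttp://example.com/"
unspaced = "ns1:http://example.org/ ns2:http://example.com/"

print(parse_prefix_attribute(spaced))
print(parse_prefix_attribute(unspaced))
```

Both inputs should yield the same two prefix-to-URI mappings, which is the
behaviour an improved "tokenize2" would need to reproduce in XSLT.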