[ https://issues.apache.org/jira/browse/ANY23-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237505#comment-13237505 ]
Ben Companjen commented on ANY23-65:
------------------------------------
One step further: the build fails with my updated stylesheet because the tests
try to extract triples from rdfa-11-curies.html, which has
prefix="db:http://database.org/ dc:http://purl.org/dc/01/" (no space between
each prefix and its namespace URI). Even though this doesn't follow the spec, I
am starting to believe it is important to support namespace definitions both
with and without a space between prefix and namespace. Perhaps an extra
template "tokenize2a", called when not(contains(@prefix,': ')), is all it
takes. (Just thinking out loud here.)
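
The idea of accepting both forms can be sketched outside XSLT. The Python
below only illustrates the tokenizing logic and is not the rdfa.xslt template;
the function name parse_prefix_attribute is hypothetical, and it assumes that
prefixes and namespace URIs never contain whitespace.

```python
def parse_prefix_attribute(value):
    """Return a {prefix: namespace URI} dict for spaced and unspaced forms."""
    mappings = {}
    # split() with no argument also collapses line breaks and repeated spaces.
    tokens = value.split()
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.endswith(":"):
            # Spec form: "ns1:" followed by the URI as a separate token.
            if i + 1 < len(tokens):
                mappings[token[:-1]] = tokens[i + 1]
            i += 2
        elif ":" in token:
            # Unspaced form: "ns1:http://example.org/" in a single token.
            prefix, uri = token.split(":", 1)
            mappings[prefix] = uri
            i += 1
        else:
            # Token fits neither form; skip it.
            i += 1
    return mappings


spaced = parse_prefix_attribute("ns1: http://example.org/ ns2: http://example.com/")
unspaced = parse_prefix_attribute("db:http://database.org/ dc:http://purl.org/dc/01/")
```

Deciding per token, rather than once for the whole attribute value, would also
cover an attribute that mixes both forms.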
> Update to RDFa extraction stylesheet
> ------------------------------------
>
> Key: ANY23-65
> URL: https://issues.apache.org/jira/browse/ANY23-65
> Project: Apache Any23
> Issue Type: Improvement
> Affects Versions: 0.7.0
> Reporter: Ben Companjen
> Labels: patch, xslt
> Attachments: rdfa.xslt, stylesheet.patch
>
> Original Estimate: 3h
> Remaining Estimate: 3h
>
> The RDFa 1.1 Core specification requests namespace prefixes in HTML5 be put
> in a "prefix" attribute like this: "ns1: http://example.org/ ns2:
> http://example.com/"
> My sample HTML page does this, but Sindice, which uses Any23, didn't read my
> namespaces correctly. I narrowed the problem down to the XSLT template
> "tokenize2" in the rdfa.xslt stylesheet and changed it accordingly. The
> template expected "ns1:http://example.org/ ns2:http://example.com/" (no
> spaces between prefix and namespace URI) and did not normalize whitespace
> such as line breaks (although I'm not sure that broke the functionality).
> I use Any23 0.6.1 locally, but
> http://svn.apache.org/viewvc/incubator/any23/trunk/core/src/main/resources/org/apache/any23/extractor/rdfa/rdfa.xslt?revision=1231556&view=markup
> shows that the template is the same in the trunk.
> A possible problem is that the new template will not accept non-spaced
> namespace definitions, such as those found in the RDFa produced by Best Buy.
> A further improvement to my template would be to accept namespace
> definitions both with and without spaces.