Add a new extractor for RDFa using java-rdfa
--------------------------------------------

                 Key: ANY23-18
                 URL: https://issues.apache.org/jira/browse/ANY23-18
             Project: Apache Any23
          Issue Type: Improvement
            Reporter: Paolo Castagna
            Priority: Minor


I wonder if it is possible to add a new RDFa extractor which uses java-rdfa [1].

java-rdfa is (according to its creator, Damian Steer :-)) "the cruftiest RDFa 
parser in the world" (and he is probably right!). java-rdfa is currently 
passing all conformance tests for XHTML, and the HTML 4 and 5 tests with one 
exception [2]. An online demo service [3] is also available. As far as I 
understand, java-rdfa is currently licensed under a BSD license. The Maven 
artifacts are available in the Maven Central repository [4].

From my limited understanding of Any23, doing this requires implementing 
BlindExtractor (which extends Extractor<URI>) and ContentExtractor 
(which extends Extractor<InputStream>).
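
As a rough illustration of the adapter shape described above, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the Extractor, ContentExtractor, TripleSink and RdfaParser types are hypothetical stand-ins declared locally, not the real Any23 or java-rdfa interfaces, whose method signatures differ. A BlindExtractor variant would follow the same pattern with a URI as input. The sketch only shows the idea of wrapping a parser behind an Extractor<InputStream> and forwarding the triples it produces.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class RdfaExtractorSketch {

    // Hypothetical stand-in for Any23's Extractor<Input>; the real interface differs.
    interface Extractor<I> {
        void run(I in, String documentUri, TripleSink out) throws IOException;
    }

    // Hypothetical stand-in mirroring "ContentExtractor extends Extractor<InputStream>".
    interface ContentExtractor extends Extractor<InputStream> { }

    // Hypothetical receiver of extracted triples.
    interface TripleSink {
        void triple(String subject, String predicate, String object);
    }

    // Hypothetical stand-in for the java-rdfa parser entry point.
    interface RdfaParser {
        void parse(InputStream in, String baseUri, TripleSink sink) throws IOException;
    }

    // The adapter: a content extractor that delegates all parsing to the wrapped parser.
    static final class JavaRdfaExtractor implements ContentExtractor {
        private final RdfaParser parser;

        JavaRdfaExtractor(RdfaParser parser) {
            this.parser = parser;
        }

        @Override
        public void run(InputStream in, String documentUri, TripleSink out) throws IOException {
            parser.parse(in, documentUri, out);
        }
    }

    // Convenience helper: runs an extractor over a string and collects the triples as text.
    static List<String> extract(ContentExtractor extractor, String html, String documentUri) {
        List<String> triples = new ArrayList<>();
        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        try {
            extractor.run(in, documentUri, (s, p, o) -> triples.add(s + " " + p + " " + o));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return triples;
    }

    public static void main(String[] args) {
        // A fake parser that ignores the markup and emits one fixed triple.
        RdfaParser fake = (in, base, sink) ->
                sink.triple(base, "http://purl.org/dc/terms/title", "Example");
        for (String t : extract(new JavaRdfaExtractor(fake), "<html></html>", "http://example.org/")) {
            System.out.println(t);
        }
    }
}
```

In a real integration the fake parser would be replaced by a call into java-rdfa, and the sink would be bridged to Any23's own result-writing mechanism.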

 [1] https://github.com/shellac/java-rdfa
 [2] http://github.com/shellac/java-rdfa/issues#issue/15
 [3] http://rdf-in-html.appspot.com/
 [4] http://repo1.maven.org/maven2/net/rootdev/java-rdfa/


        
