[
https://issues.apache.org/jira/browse/ANY23-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253175#comment-13253175
]
Peter Ansell commented on ANY23-19:
-----------------------------------
Hi Paolo,
The example library that is continually being referred to here, java-rdfa,
abstracts away from clerezza, sesame and jena interfaces by representing
everything as strings. See
https://github.com/shellac/java-rdfa/blob/master/core/src/main/java/net/rootdev/javardfa/StatementSink.java
The java-rdfa library is also not being developed or used actively, so it
isn't the best example of a successful library that doesn't use a type-safe RDF
Statement/Value API. I worked on java-rdfa a little but there is no push behind
it. Even its author refers to it as "The cruftiest RDFa parser in the world".
Two of the three goals of Any23, the command line utility and the web service,
are completely indifferent to the technology being used internally, as long as
it is high quality, and the Sesame libraries are very high quality in my opinion
and experience. The only goal that would be affected is the use of Any23 as a
library. How often is Any23 currently being used as a library? Could another
project easily implement the same functionality using Jena with less effort
than it would take to either create a custom string based solution, or move to
another framework that may have just as many or more dependencies?
In terms of the use of Any23 as a library, is there anything about the Sesame
Model hierarchy (Value/Resource/Literal/BNode/URI) that would be better
represented using a custom solution? As one example, I have been working with
OWLAPI recently and its RDF handling is shocking: it merges URIs with Blank
Nodes to form what it refers to as IRIs. It has a custom internal solution that
only recognises two types of triples, those with an IRI in the object position
(where IRI is not type-safely defined between URI and BlankNode) and those with
Literals in the object position. I can't imagine Any23 going down this route,
but it is the worst case scenario if the API is converted without a good reason.
In the simplest scenario, it may be possible to reuse the Sesame Model hierarchy
to produce Values that work across all three libraries, using custom ValueImpl
etc. implementations that actually implement the relevant interfaces from the
other libraries, along with a custom ValueFactory to produce these
multi-library-compatible Values (custom ValueFactories can be plugged into any
Rio parser using Rio.createParser(RDFFormat, ValueFactory), a functionality
which I haven't seen in other libraries).
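To make the type-safety point concrete, here is a minimal sketch of a
Value/Resource/Literal/BNode/URI style hierarchy. The types below are
hypothetical stand-ins written for this illustration, not the
org.openrdf.model interfaces, and the class and method names are assumptions.
The sketch shows that keeping URI and BNode as distinct types lets the compiler
reject a Literal in the subject position, which is lost when the two are merged
or everything is passed as strings.

```java
// Hypothetical stand-in types for illustration only; these are NOT the
// org.openrdf.model interfaces, they only mirror the shape of the hierarchy.
class TypedValuesSketch {

    interface Value { String stringValue(); }

    // Anything that may appear in the subject position.
    interface Resource extends Value { }

    static final class Uri implements Resource {
        private final String uri;
        Uri(String uri) { this.uri = uri; }
        public String stringValue() { return uri; }
    }

    static final class BNode implements Resource {
        private final String id;
        BNode(String id) { this.id = id; }
        public String stringValue() { return id; }
    }

    static final class Literal implements Value {
        private final String label;
        Literal(String label) { this.label = label; }
        public String stringValue() { return label; }
    }

    // Names the kind of a value without parsing its string form.
    static String kind(Value v) {
        if (v instanceof Uri) return "URI";
        if (v instanceof BNode) return "BNode";
        return "Literal";
    }

    // The signature itself encodes the rules: the subject must be a Resource
    // and the predicate must be a Uri.
    static String triple(Resource subject, Uri predicate, Value object) {
        return kind(subject) + " " + kind(predicate) + " " + kind(object);
    }

    public static void main(String[] args) {
        Uri knows = new Uri("http://example.org/knows");
        System.out.println(triple(new BNode("b1"), knows, new Uri("http://example.org/alice")));
        System.out.println(triple(new Uri("http://example.org/bob"), knows, new Literal("Alice")));
        // triple(new Literal("x"), knows, knows);  // would not compile
    }
}
```

With a string-based sink, the commented-out call would compile and the mistake
would only surface at runtime, if at all.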
In terms of the actual packages that are currently used, there are five basic
packages: sesame-model, sesame-rio-api, sesame-repository-api, sesame-sail-api,
and sesame-sail-memory. These base libraries are small dependencies. One other
dependency is sesame-util, a set of small utilities used by sesame-model
and other sesame libraries.
82K - sesame-model-2.6.5.jar
36K - sesame-repository-api-2.6.5.jar
22K - sesame-rio-api-2.6.5.jar
56K - sesame-sail-api-2.6.5.jar
54K - sesame-sail-memory-2.6.5.jar
53K - sesame-util-2.6.5.jar
The Value Impl classes should not be directly referenced. They should be
created using a ValueFactory and used through their interfaces. This doesn't
change any of the libraries that are used, but it is better practice.
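As a rough sketch of that practice, the following uses hypothetical stand-in
types (again not the Sesame classes; every name here is an assumption made for
the example). Calling code names only the interfaces, so a custom factory can
be swapped in without touching the callers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; not the org.openrdf.model types.
class ValueFactorySketch {

    interface Uri { String stringValue(); }

    interface ValueFactory { Uri createUri(String uri); }

    // Plays the role of a library's built-in implementation class.
    static final class SimpleUri implements Uri {
        private final String uri;
        SimpleUri(String uri) { this.uri = uri; }
        public String stringValue() { return uri; }
    }

    static final class SimpleFactory implements ValueFactory {
        public Uri createUri(String uri) { return new SimpleUri(uri); }
    }

    // A custom factory: hands back the same instance for a repeated URI.
    static final class CachingFactory implements ValueFactory {
        private final Map<String, Uri> cache = new HashMap<>();
        public Uri createUri(String uri) {
            return cache.computeIfAbsent(uri, SimpleUri::new);
        }
    }

    // Caller code refers only to the interfaces, never to SimpleUri.
    static String angleBracketed(ValueFactory vf, String uri) {
        return "<" + vf.createUri(uri).stringValue() + ">";
    }

    public static void main(String[] args) {
        String ex = "http://example.org/a";
        System.out.println(angleBracketed(new SimpleFactory(), ex));
        System.out.println(angleBracketed(new CachingFactory(), ex));
    }
}
```

Code that instead wrote `new SimpleUri(...)` everywhere would have to be edited
in every place before a different implementation could be used.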
The other libraries that pull in the Rio parsers can be linked in dynamically
without compiling in the dependency, so the use of Any23 as a library would
enable people to pull them in as needed. See the Rio.createParser(RDFFormat,
ValueFactory) and Rio.createWriter(RDFFormat, OutputStream) methods. It would
be valuable if Any23 could dynamically pull in all of its parsers and writers
using the Rio.*
static methods. Then it could be used with the absolute minimum number of
parsers and writers for the current user.
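The idea of looking parsers up at runtime rather than compiling them in can be
sketched as a small registry. This is only an illustration under assumed names
(ParserRegistrySketch, Parser, register, createParser are all hypothetical); it
does not reproduce how Rio itself discovers parsers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical registry for illustration; not Rio's actual mechanism.
class ParserRegistrySketch {

    // Minimal stand-in for a parser; a real one would also parse input.
    interface Parser { String formatName(); }

    private final Map<String, Supplier<Parser>> factories = new LinkedHashMap<>();

    // Each optional parser module would register itself when it is present.
    void register(String format, Supplier<Parser> factory) {
        factories.put(format, factory);
    }

    // Returns a parser only if a module for that format was registered.
    Optional<Parser> createParser(String format) {
        Supplier<Parser> factory = factories.get(format);
        return factory == null ? Optional.empty() : Optional.of(factory.get());
    }

    public static void main(String[] args) {
        ParserRegistrySketch registry = new ParserRegistrySketch();
        registry.register("turtle", () -> () -> "turtle");

        System.out.println(registry.createParser("turtle").isPresent());
        System.out.println(registry.createParser("rdfxml").isPresent());
    }
}
```

The core code depends only on the registry, so a user who needs one format
ships one parser module and nothing else.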
4.6K - sesame-rio-n3-2.6.5.jar
14K - sesame-rio-ntriples-2.6.5.jar
33K - sesame-rio-rdfxml-2.6.5.jar
17K - sesame-rio-turtle-2.6.5.jar
Switching to another library may cause the bloat that you say you do not want.
For example, Jena and its immediate dependencies are quite large compared to the
modular Sesame jar files, and that doesn't include the SPARQL parsing libraries
from ARQ, just as the Sesame libraries quoted above do not include the SPARQL
libraries.
1.7M - jena-core-2.7.0-incubating.jar
151K - jena-iri-0.9.0-incubating.jar
3.1M - icu4j-3.4.4.jar
1.4M - xercesImpl-2.10.0.jar
> Abstract away any specific RDF APIs
> -----------------------------------
>
> Key: ANY23-19
> URL: https://issues.apache.org/jira/browse/ANY23-19
> Project: Apache Any23
> Issue Type: New Feature
> Affects Versions: 0.7.0
> Reporter: Paolo Castagna
> Fix For: 0.8.0
>
>
> Any23 currently uses Sesame to work with or parse RDF. Specifically Any23
> uses these classes from org.openrdf.* packages:
> org.openrdf.model.BNode
> org.openrdf.model.datatypes.XMLDatatypeUtil
> org.openrdf.model.impl.LiteralImpl
> org.openrdf.model.impl.URIImpl
> org.openrdf.model.impl.ValueFactoryImpl
> org.openrdf.model.Literal
> org.openrdf.model.Resource
> org.openrdf.model.Statement
> org.openrdf.model.URI
> org.openrdf.model.Value
> org.openrdf.model.ValueFactory
> org.openrdf.model.vocabulary.OWL
> org.openrdf.model.vocabulary.RDF
> org.openrdf.model.vocabulary.RDFS
> org.openrdf.model.vocabulary.XMLSchema
> org.openrdf.repository.RepositoryConnection
> org.openrdf.repository.RepositoryException
> org.openrdf.repository.RepositoryResult
> org.openrdf.repository.sail.SailRepository
> org.openrdf.rio.helpers.RDFParserBase
> org.openrdf.rio.ntriples.NTriplesParser
> org.openrdf.rio.ntriples.NTriplesUtil
> org.openrdf.rio.ntriples.NTriplesWriter
> org.openrdf.rio.ParseErrorListener
> org.openrdf.rio.ParseLocationListener
> org.openrdf.rio.RDFFormat
> org.openrdf.rio.RDFHandler
> org.openrdf.rio.RDFHandlerException
> org.openrdf.rio.RDFParseException
> org.openrdf.rio.RDFParser
> org.openrdf.rio.rdfxml.RDFXMLParser
> org.openrdf.rio.rdfxml.RDFXMLWriter
> org.openrdf.rio.turtle.TurtleWriter
> org.openrdf.sail.memory.MemoryStore
> org.openrdf.sail.Sail
> org.openrdf.sail.SailException
> Would it be possible to abstract away any specific RDF APIs to allow Any23
> users to choose between, say: Apache Clerezza [1], Apache Jena [2], Sesame [3]
> and/or others?
> An example of small RDF distiller which does this is java-rdfa [4]. Maybe a
> similar agnostic (but easy to integrate) approach is possible for Any23.
> Although, java-rdfa does not need to parse RDF content itself.
> [1] http://incubator.apache.org/clerezza/
> [2] http://incubator.apache.org/jena/
> [3] http://www.openrdf.org/
> [4] https://github.com/shellac/java-rdfa