Hi all,

let me share my thoughts. Because this mail is rather long, I tried to split it into three separate sections: (1) RDF, (2) RESTful / Web Interface, and (3) other related topics.

RDF libs:
====

From the viewpoint of Apache Stanbol, one needs to ask whether it makes sense to maintain its own RDF API. I expect the Semantic Web standards to evolve quite a bit in the coming years, and I am concerned about whether the Clerezza RDF modules will be updated/extended to provide implementations of those. One example of such a situation is SPARQL 1.1, which has been around for quite some time and is still not supported by Clerezza.

While I do like the small API, the flexibility to use different TripleStores, and the fact that Clerezza comes with OSGI support, I think that given the current situation we need to discuss all options, and those also include a switch to Apache Jena or Sesame. Sesame would be an especially attractive option, as its RDF Graph API [1] is very similar to what Clerezza uses. Apache Jena's counterparts (Model [2] and Graph [3]) are considerably different and more complex interfaces. In addition, Jena will only change to org.apache packages with the next major release, so a switch before that release would mean two incompatible API changes.

My personal opinion is that we should keep using Clerezza for now, invest some effort to improve the Clerezza RDF modules, and then see how it develops further. Such an effort should include:

* implement the SPARQL fast lane (as already discussed with Reto during ApacheCon). The fast lane would allow Clerezza to use the native SPARQL engine of the underlying TripleStore. This means that Clerezza parses only those parts of the SPARQL query needed to determine the RDF graph(s) the query is executed on. This information is then used to pass the query to the native SPARQL engine via an extended interface of the TcProvider. The Clerezza SPARQL implementation would only be used if the TcProvider does not provide a native SPARQL implementation or if the query spans RDF graphs managed by different TcProvider instances. That way, Clerezza users would be able to use any SPARQL feature provided by the underlying TripleStore.
* update to the newest Jena versions (see also STANBOL-621; Peter Ansell's Clerezza fork on github [4] as well as Sebastian Schaffert's Jena bundle used for the Stanbol/LMF integration [5])
* finish and release the SingleTdbDatasetTcProvider.java (CLEREZZA-691), as this is important for the Stanbol Ontology Manager component
* move the Indexed in-memory graph (CLEREZZA-683) from the Stanbol code base to Clerezza and release it, so that we can use it from there in Stanbol
* provide a Clerezza JSON-LD parser/serializer. This is critical for Stanbol, as several CMS use this as their preferred RDF serialization.

[1] http://www.openrdf.org/doc/sesame2/api/org/openrdf/model/package-summary.html
[2] http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/rdf/model/Model.html
[3] http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/graph/Graph.html
[4] https://github.com/ansell/clerezza/commit/37747324d980fad6a33caa3da00491da66900c37
[5] https://bitbucket.org/srfgkmt/stanbol-lmf/src/f41c6c93f08872469dc2e2d64fc06ad75f76f003/lmf-jena/pom.xml

RESTful API / Web Interface:
=====================

There are several shortcomings in the current implementation of the Stanbol RESTful services / Web UI modules (o.a.stanbol.commons.web, o.a.stanbol.*.web, o.a.stanbol.*.jersey modules):

* Jersey's use of java.util.ServiceLoader forces manual configuration of the JAX-RS components. A switch to an OSGI-compatible implementation such as Apache Wink would be very welcome.
* The RESTful API documentation is currently written as HTML into Freemarker templates. This makes it really hard to maintain this documentation.
I would really appreciate the possibility to use Markdown (as used on the webpage) for that.
* For Stanbol deployments it should be possible to exclude the Web UI so that only the RESTful services are available.

Regarding:

> Stanbol drops it's interretation of "REST" as "not for humans" and want to go to
> allow integrating (wherever possible as modular and optional components)
> media types designed for human consumptions and support REST approaches
> there as well (thinking of the current back-button unfriendly UI).

Adding support for a simple table-based representation of RDF data would indeed be an important feature. However, Resource (Entity) type-specific rendering is out of the scope of Apache Stanbol (at least in my opinion). AFAIK, as soon as we switch to an OSGI-compatible JAX-RS implementation, users could easily add such rendering by providing the corresponding JAX-RS MessageBodyWriter.

If there are people who would like to work on this, that would be really great. If we could (re)use some stuff from Clerezza, even better. But things would need to stay simple, as Stanbol is not a semantic CMS. I would suggest starting development in a separate branch and then having a discussion/vote based on an early prototype/demonstration.

Other Topics
=========

### Scala and JSR 223 (scripting in the JVM)

I do have an issue with Scala, as it adds >150MByte to the PermGen as soon as it is loaded. But as long as it is an optional dependency and users are aware of that when adding the dependency, I am fine with it.

### Shell

Personally, I do not find the shell very useful. For installing Bundles/Service configurations I prefer to use the Apache Sling FileInstaller. For deployment during development I like to use the Sling Maven Installer plugin. For creating new Stanbol modules I would rather suggest creating an extensive set of Maven archetypes (e.g. for Stanbol EnhancementEngines). As the Shell also depends on Scala, the "+150MByte to the PermGen" issue also applies to the Shell.

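### SPARQL fast lane (sketch)

Coming back to the SPARQL fast lane from the RDF section: the routing decision described there could be sketched roughly as follows. This is only an illustration under stated assumptions. All type and method names (GraphProvider, NativeSparqlProvider, PlainProvider, NativeProvider, FastLaneRouter, executeNative) are hypothetical and are not the Clerezza TcProvider API. The regular expression is a crude stand-in for real query parsing: it only looks at FROM / FROM NAMED clauses and ignores GRAPH patterns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a provider that manages a set of named graphs.
interface GraphProvider {
    Set<String> graphNames();
}

// Hypothetical extension for providers whose store ships its own SPARQL engine.
interface NativeSparqlProvider extends GraphProvider {
    String executeNative(String query);
}

// Provider without native SPARQL support.
class PlainProvider implements GraphProvider {
    private final Set<String> graphs;
    PlainProvider(String... graphs) { this.graphs = new HashSet<String>(Arrays.asList(graphs)); }
    public Set<String> graphNames() { return graphs; }
}

// Provider that pretends to hand the query to a native engine.
class NativeProvider extends PlainProvider implements NativeSparqlProvider {
    private final String name;
    NativeProvider(String name, String... graphs) { super(graphs); this.name = name; }
    public String executeNative(String query) { return "native[" + name + "]: " + query; }
}

class FastLaneRouter {
    // Assumption: graphs are named only via FROM <uri> or FROM NAMED <uri>.
    private static final Pattern FROM =
        Pattern.compile("(?i)\\bFROM\\s+(?:NAMED\\s+)?<([^>]+)>");

    private final List<GraphProvider> providers;

    FastLaneRouter(List<GraphProvider> providers) {
        this.providers = new ArrayList<GraphProvider>(providers);
    }

    // Extract only the graph URIs; the rest of the query is not interpreted.
    static Set<String> referencedGraphs(String query) {
        Set<String> graphs = new LinkedHashSet<String>();
        Matcher m = FROM.matcher(query);
        while (m.find()) {
            graphs.add(m.group(1));
        }
        return graphs;
    }

    String execute(String query) {
        Set<String> graphs = referencedGraphs(query);
        if (!graphs.isEmpty()) {
            for (GraphProvider p : providers) {
                // Fast lane: a single provider with a native engine manages all graphs.
                if (p instanceof NativeSparqlProvider && p.graphNames().containsAll(graphs)) {
                    return ((NativeSparqlProvider) p).executeNative(query);
                }
            }
        }
        // Fallback: no native engine, or the query spans several providers.
        return "generic: " + query;
    }
}

class FastLaneDemo {
    public static void main(String[] args) {
        FastLaneRouter router = new FastLaneRouter(Arrays.<GraphProvider>asList(
            new PlainProvider("urn:g1"),
            new NativeProvider("tdb", "urn:g2", "urn:g3")));
        // Handled by the native engine of the (hypothetical) "tdb" provider.
        System.out.println(router.execute("SELECT * FROM <urn:g2> WHERE { ?s ?p ?o }"));
        // Spans two providers, so it falls back to the generic path.
        System.out.println(router.execute(
            "SELECT * FROM <urn:g1> FROM <urn:g2> WHERE { ?s ?p ?o }"));
    }
}
```

The point of the sketch is that the router never interprets the body of the query on the fast path, so any SPARQL feature the native engine understands passes through untouched. Only the fallback path depends on what the generic implementation supports.
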
### Security

Having a security model in Apache Stanbol might be important for some use cases. Because of this I consider it an important topic, although it is one I have very little experience with. I would like to get rid of the dependencies on org.apache.clerezza:platform (AFAIK this is only needed for the configuration, and that could easily be provided by the sling.properties file at runtime; defaults can be provided in the commons.properties file already included in all Stanbol Launchers). I would also suggest moving the PermissionParser utility over to the Apache Stanbol Security modules. These two changes would allow the security module to be activated for the Stable (Stateless) launcher as well.

best
Rupert

On Thu, Nov 8, 2012 at 2:39 PM, Hasan Hasan <ha...@trialox.org> wrote:
> Comments inline...
>
> On Thu, Nov 8, 2012 at 1:00 PM, Reto Bachmann-Gmür <r...@apache.org> wrote:
>
>> Ok, sorry for jumping into this discussion so lately. I've been having
>> quite some discussion on the matter here at apacheconeu. Also I had
>> prositive feedback from my resentation of Clerezza yesterday.
>>
>> I think two things:
>> - For high level platform component it is often not clear if the fit better
>> into Stanbol or into Clerezza
>> - The RDF Api shoud actually be independen both from triple store provider
>> as well as from consumer
>>
>> So I think a good solution would be to have the RDF liraries comprising:
>> - A modular and very spec oriented API for RDF and related standards
>> - A set of serializing and parsing providers
>> - Adapters to triple stores (where the api isn't provided by the triple
>> store)
>> basically that's what in the org.apache.clerezza.rdf.* packages
>>
>> That's the stuff that would fit well into Stanbol.
>> Provided that stanbol
>> drops it's interretation of "REST" as "not for humans" and want to go to
>> allow integrating (wherever possible as modular and optional components)
>> media types designed for human consumptions and support REST approaches
>> there as well (thinking of the current back-button unfriendly UI).
>>
>
> IMO, Clerezza is just too big for existing committers. If we could reduce it to the
> essential components dealing with rdf and leaving out templating and rendering,
> it may be easier to graduate.
>
>> - Scala Server Pages
>> - TypeRendering (selection of templates based on the rdf type of the
>> returned response)
>> - Security (already integrated to some degree, code based security to run
>> bundles in a sandboxed manner is not)
>> - Shell (already ships in the stanbol launcher, so here it's about
>> 'adopting' the sources)
>> - Dev tools: rapid development support (create sample projects, have source
>> files as bundles)
>>
>> To the attic:
>> - Triaxrs: The Clerezza jax-rs implementation is no longer needed as the
>> same support (jax-rs components asosgi services) is now provided by apache
>> wink
>> - jssr 223 support
>>
>> In my opinion there is no urgent need for action, it is true that there
>> hasn't been a lot of action in clerezza but imho the project os going on
>> even at a low pace (as other projects like e.g. the recently graduated
>> wink).
>>
>
> Not sure about no urgent need for action. Maybe we should list the requirements
> to fulfil in order to be able to graduate. Wonder if we are able to meet them.
>
> Cheers
> Hasan
>
>
>>
>> Cheers,
>> Reto
>>
>> On Thu, Nov 8, 2012 at 12:02 PM, Bertrand Delacretaz <bdelacre...@apache.org> wrote:
>>
>> > On Thu, Nov 8, 2012 at 11:33 AM, Andy Seaborne <a...@apache.org> wrote:
>> > > ...It's good to have the existing released artifacts remain - what about after
>> > > the donation?
>> > >
>> > > Presumably the moved modules will be released by the new host - will they
>> > > use group id org.apache.clerezza? or move to the new host project group id?
>> > > I'd suggest renaming the group to the new project but realise it is a bit
>> > > more disruptive...
>> >
>> > I think that's really up to whatever project adopts that code. In
>> > theory package names should change but that's probably not convenient.
>> >
>> > Or maybe it's time to create a semantic module or two at
>> > http://commons.apache.org/ ? If existing committers are willing to
>> > support that with their work it should be easy to make it happen.
>> >
>> > -Bertrand
>> >
>>

--
| Rupert Westenthaler rupert.westentha...@gmail.com
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen