Jakob,

Thank you for your quick answer. This discussion is important for me; I hope
we can clarify things together. If you confirm that this feature is a plus
for Marmotta, then I could work on it, and we could make decisions together
about the best way to implement the different functionalities.

The goal of the functionality, as described in the document I prepared for
the Marmotta team [1], seems to me different from LDCache, even though the
two are quite similar. That goal would be to set up a triple store for a
specific purpose and build apps on top of that "controlled and validated"
data. But the data would mainly come from external, distributed sources.

For instance, when creating an app for engineers in the building field, we
might want to base that app on data coming from different building material
providers, each publishing their catalogue in RDF (the publishing itself is
not our concern in this project). They could publish it as an .rdf file, as
RDFa directly on their website, or even through a SPARQL endpoint. We then
define the data sources we want to include (those catalogues, for instance),
and the system helps an administrator validate the data (it should not
contain unwanted or unexpected content) and keep it up to date (as soon as
the original data is updated, the system must detect it and react,
automatically or semi-automatically).

As I understand LDCache, it transparently caches data from the LOD cloud
when a triple contains a reference to a URI that can be reached through one
of the defined "LD Cache Endpoints". There is not much control over exactly
which information is retrieved, how to validate the content, or how that
information is automatically updated (I don't know yet how the expiry time
is handled). In our opinion, such control is mandatory to bring the LOD to
its full potential for real-world applications (and not just for research
purposes): you need to know which data you are working with, know that it
is reliable, etc.

So that would be the goal of the "External Data Sources" module, which was
originally called "overLOD Referencer" in the document [1]:

- define precisely the RDF data to be cached on the server: that could be
  an RDF file, a SPARQL CONSTRUCT on an endpoint, etc.
- find a way to validate the content of that data. Here we might not want
  to reason under the open world assumption: if a property is defined with
  a certain range, we would want to check that the objects in the file ARE
  effectively instances of that class, for instance using SPARQL queries
  instead of a reasoner (see the sketch below).
- find a way to manage the updates automatically: it could be a "pull" from
  Marmotta based on some VoID data provided by the source, or the source
  could put in place a "ping" to Marmotta, RSS-like features, as was done
  by Ping-The-Semantic-Web or Sindice.

Please refer to [1] for more detailed information, and let me know if the
purpose is still not clear. I hope you can tell me whether I misunderstood
LDCache and it can in fact play that exact role. If LDCache cannot do that
right now, do you think I should work on a new module, or rather add some
functionality to LDCache?

Hope we can have an interesting discussion.

Thank you for your help
Fabian

[1] https://dl.dropboxusercontent.com/u/852552/Marmotta_OverLOD%20Surfer%20presentation_0.2.pdf
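P.S. To make the validation idea concrete, here is a rough, untested sketch
of such a closed-world range check, written as a SPARQL ASK query run
through the Sesame API (the ex: vocabulary is invented for the example;
"con" would be a connection to the local store):

import org.openrdf.query.QueryLanguage;
import org.openrdf.repository.RepositoryConnection;

public class RangeValidator {

    /**
     * Closed-world range check: every object of ex:producedBy must be
     * explicitly typed as ex:Manufacturer. No reasoner involved, just a
     * SPARQL ASK that looks for violating triples.
     */
    public static boolean isValid(RepositoryConnection con) throws Exception {
        String findViolations =
            "PREFIX ex: <http://example.org/building#>\n" +
            "ASK {\n" +
            "  ?product ex:producedBy ?maker .\n" +
            "  FILTER NOT EXISTS { ?maker a ex:Manufacturer }\n" +
            "}";
        // ASK evaluates to true as soon as one violation exists
        return !con.prepareBooleanQuery(QueryLanguage.SPARQL, findViolations).evaluate();
    }
}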
>>> Jakob Frank <[email protected]> 02.09.2014 12:45 >>>
Hi Fabian,

looks like you chose a big one for starting ;-)

LDCache plugs into the Sesame-Sail stack to automatically retrieve remote
resources that are referenced in the local triple store. Sesame does not
use CDI but the built-in Java ServiceLoader [1], so plugging in there is
not as easy.

On the other hand: why do you want to implement a module similar to
LDCache? What feature do you need that can't be solved using LDCache?
For me, your "external data source" module sounds exactly like LDCache
in action...

Best,
Jakob

[1] http://docs.oracle.com/javase/6/docs/api/java/util/ServiceLoader.html

On 2 September 2014 09:06, Fabian Cretton <[email protected]> wrote:
> Hi,
>
> I would like to implement a module that is similar to LDCache (following
> the previous discussions with Sergio about the overLOD project).
> I am currently reading about the LDCache functionality at
> http://marmotta.apache.org/ldcache/
> and having a look at the code.
>
> (It is a pretty steep learning curve for me to get into the Marmotta
> project, but I think it is worth it, instead of starting from scratch.)
>
> As I am new to this kind of project infrastructure, is there anything I
> should read to better understand the whole framework? Maybe Java EE
> tutorials, as the project description says "The Apache Marmotta Platform
> is implemented as a light-weight Service-Oriented Architecture (SOA)
> using the CDI/Weld service framework (i.e. the core components of Java
> EE 6)."
>
> Then, to create the new module, would it be a good idea to duplicate the
> LDCache files (libraries and platform, I guess) and modify them, or
> should I rather start from a new empty module as described here:
> http://wiki.apache.org/marmotta/Customizing#Modules
>
> Thank you for any help
> Fabian
>
>>>> Sergio Fernández <[email protected]> 27.08.2014 16:31 >>>
> Hi Fabian,
>
> On 27/08/14 14:49, Fabian Cretton wrote:
>> My first goal was: to build the whole project locally, run my locally
>> built Marmotta, and then start adding components.
>> But my first concern, now that I am digging deeper, is that Marmotta is
>> a pretty big project (about 80 projects), so you might recommend that I
>> not import the main "pom.xml" into my Eclipse environment, but start
>> smaller?
>
> Then start from the platform modules.
>
>> If there is already documentation about how to proceed, please point me
>> there; I didn't find any by myself.
>
> Well, the overall build process is entirely managed by Maven, see
> http://marmotta.apache.org/installation#source
>
>> Nevertheless, I do have problems and errors in Eclipse, and hope you
>> can help me with that.
>
> Eclipse should be able to handle this number of modules with Maven.
>
>> The first problems I have are many "Plugin execution not covered by
>> lifecycle configuration" errors.
>
> Some plugin lifecycles might not be supported inside Eclipse. Just
> ignore them, you should not need them.
>
>> Then I have 6-7 of these: "Project build error: Non-resolvable parent
>> POM: Could not find artifact
>> org.apache.marmotta:marmotta-parent:pom:3.2.1-SNAPSHOT and
>> 'parent.relativePath' points at wrong local POM pom.xml
>> /marmotta-backend-sparql line 23 Maven pom Loading Problem"
>> and here I am pretty confused: it seems that some POM files are not
>> up to date in this current 3.3.0 version, as they still point to a
>> "3.2.1" parent POM, while the parent is already at version "3.3.0"?
>
> Sorry for the error. Those modules are outside the default profile, so
> the release plugin did not update the versions accordingly. It's already
> fixed in the develop branch; please update your fork.
>
>> Then, apart from those Maven errors, I have a few Java errors with many
>> "imports" or "types" that can't be resolved, and this seems very
>> strange to me. But maybe solving the main Maven problems above would
>> correct that?
>
> All dependencies are available from Maven Central. Try running a "mvn
> install" from the root.
>
>> A first goal for me would be to update Marmotta's main menu so that
>> under "Others", next to "Linked Data Caching", I could have an
>> "External Data Sources" menu, and then work on that new module as
>> discussed earlier with you.
>
> Then you need to create a custom module and add it to your custom webapp
> launcher. The whole process is supported by Maven artifacts, as
> described at:
>
> http://wiki.apache.org/marmotta/Customizing#Modules
>
> Hope that helps.
>
> Cheers,
>
> --
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 660 2747 925
> e: [email protected]
> w: http://redlink.co
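As background on Jakob's ServiceLoader remark above, the mechanism works
roughly as in the following minimal sketch. DataSourceProvider is a made-up
SPI interface, not a Sesame type; a provider jar registers an implementation
by listing its class name in a file under META-INF/services/ named after the
fully qualified interface name.

import java.util.ServiceLoader;

public class ServiceLoaderDemo {

    // Hypothetical SPI: a provider jar ships an implementation plus a
    // META-INF/services entry pointing at the implementing class.
    public interface DataSourceProvider {
        String name();
    }

    public static void main(String[] args) {
        // ServiceLoader scans the classpath for all registered
        // implementations; unlike CDI there is no injection, you
        // iterate and pick what you need.
        for (DataSourceProvider provider : ServiceLoader.load(DataSourceProvider.class)) {
            System.out.println("found provider: " + provider.name());
        }
    }
}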

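On the update-management point from the first mail, one lightweight "pull"
strategy is an HTTP conditional request against the source document. A
minimal sketch, assuming the publisher sends ETag headers (which not all
sources will):

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class UpdateChecker {

    /**
     * Polls a registered source with a conditional HEAD request. Returns
     * true when the document changed since we cached it (anything but
     * "304 Not Modified"). 'etag' is the value stored at the last fetch,
     * null on the first run.
     */
    public static boolean hasChanged(URL source, String etag) throws IOException {
        HttpURLConnection con = (HttpURLConnection) source.openConnection();
        con.setRequestMethod("HEAD"); // headers are enough to decide
        if (etag != null) {
            con.setRequestProperty("If-None-Match", etag);
        }
        return con.getResponseCode() != HttpURLConnection.HTTP_NOT_MODIFIED;
    }
}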