Jakob,
 
Thank you for your quick answer.
This discussion is important to me; I hope we can clarify things
together. If you confirm that this feature would be a plus for
Marmotta, I could work on it, and we could decide together on the best
ways to implement the different functionalities.
 
The goal of the functionality, as described in the document I prepared
for the discussion with the Marmotta team [1], seems to me different
from LDCache, even though the two are quite similar.
 
That goal would be to set up a triple store for a specific purpose, and
build apps on top of that "controlled and validated" data. But the data
would mainly come from external, distributed sources. For instance,
when creating an app for engineers in the building field, we might want
to base that app on data coming from different building-material
providers, each publishing their catalogue in RDF (publishing itself is
not our concern in this project). They could publish it as an .rdf
file, as RDFa directly on their website, or even through a SPARQL
endpoint. We then define the data sources we want to include (those
catalogues, for instance), and the system helps an administrator
validate the data (it should not contain unwanted or unexpected
content) and keep it up to date (as soon as the original data is
updated, the system must detect it and react automatically or
semi-automatically).
 
As I understand LDCache, it transparently caches data from the LOD when
a triple contains a reference to a URI that can be reached through one
of the defined "LD Cache Endpoints". There is not much control over
exactly which information is retrieved, how to validate the content, or
how that information is automatically updated (I don't know yet how the
expiry time is handled).
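 
From my reading of the code so far, standalone usage seems to go
roughly along these lines (I copied the class and method names from the
sources as well as I could, so please correct me if I misread them; the
resource URI is just an example):

  import org.apache.marmotta.ldcache.api.LDCachingBackend;
  import org.apache.marmotta.ldcache.backend.file.LDCachingFileBackend;
  import org.apache.marmotta.ldcache.model.CacheConfiguration;
  import org.apache.marmotta.ldcache.services.LDCache;
  import org.openrdf.model.Model;
  import org.openrdf.model.URI;
  import org.openrdf.model.impl.ValueFactoryImpl;

  import java.io.File;

  public class LDCacheSketch {
      public static void main(String[] args) throws Exception {
          // a file-based caching backend (others exist, e.g. Infinispan)
          LDCachingBackend backend =
                  new LDCachingFileBackend(new File("/tmp/ldcache"));
          backend.initialize();

          LDCache ldcache = new LDCache(new CacheConfiguration(), backend);

          URI resource = ValueFactoryImpl.getInstance()
                  .createURI("http://dbpedia.org/resource/Geneva");

          // fetches the triples describing the resource from the remote
          // source if they are not yet cached (or expired), and returns them
          Model triples = ldcache.get(resource);
          System.out.println(triples.size() + " triples for " + resource);
      }
  }

As far as I can see, this is "all or nothing" per resource: I don't see
where I could restrict which triples are retrieved, or validate them
before they enter the store.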
 
Such a functionality is, in our opinion, mandatory to bring the LOD to
its full potential in real-world applications (and not just for
research purposes): there you need to know which data you are working
with, know that it is reliable, etc.
 
So that would be the goal of the "External Data Sources" module, which
was originally called "overLOD Referencer" in the document [1]:
- define precisely the RDF data to be cached on the server: that could
be an RDF file, a SPARQL CONSTRUCT against an endpoint, etc.
- find a way to validate the content of that data: here we might not
want to reason under the open-world assumption; if a property is
defined with a certain range, we would want to check that the objects
in the file effectively ARE instances of that class (for instance using
SPARQL queries to validate the content, instead of a reasoner) - see
the sketch right after this list.
- find a way to manage the updates automatically: it could be a 'pull'
from Marmotta based on some VoID data provided by the source, or the
source could put in place a "ping" to Marmotta, RSS-like features, as
was done by Ping-The-Semantic-Web or Sindice.
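 
To make the validation idea concrete, here is a minimal sketch of the
kind of check I have in mind, using the plain Sesame API (the catalogue
URL and the ex:manufacturer / ex:Company names are made up for the
example): an ASK query that matches exactly when some value of a
property is NOT typed with the property's declared range:

  import org.openrdf.query.QueryLanguage;
  import org.openrdf.repository.Repository;
  import org.openrdf.repository.RepositoryConnection;
  import org.openrdf.repository.sail.SailRepository;
  import org.openrdf.rio.RDFFormat;
  import org.openrdf.sail.memory.MemoryStore;

  import java.net.URL;

  public class RangeCheck {
      public static void main(String[] args) throws Exception {
          // stage the fetched data in an in-memory store before accepting it
          Repository staging = new SailRepository(new MemoryStore());
          staging.initialize();
          RepositoryConnection con = staging.getConnection();
          try {
              // a hypothetical catalogue published as an RDF/XML file
              con.add(new URL("http://provider.example.org/catalogue.rdf"),
                      null, RDFFormat.RDFXML);

              // closed-world range check: is there any ex:manufacturer
              // value that is not declared an instance of ex:Company?
              String ask =
                  "PREFIX ex: <http://example.org/schema#> " +
                  "ASK { ?s ex:manufacturer ?o . " +
                  "      FILTER NOT EXISTS { ?o a ex:Company } }";
              boolean invalid = con.prepareBooleanQuery(
                      QueryLanguage.SPARQL, ask).evaluate();

              System.out.println(invalid ? "validation failed"
                                         : "validation passed");
          } finally {
              con.close();
          }
      }
  }

Only data that passes such checks would then be imported into the
production store. For the updates, the same pull mechanism could
periodically query the source's VoID description (e.g. its
dcterms:modified date) to decide when the cached copy must be
refreshed.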
 
Please refer to [1] for more detailed information, and let me know if
the purpose is still not clear.
I hope you will be able to tell me whether I have misunderstood LDCache
and whether it can in fact play that exact role.
If LDCache cannot do that right now, do you think I should work on a
new module, or rather add some functionality to LDCache?
 
I hope we can have an interesting discussion.
Thank you for your help
Fabian
 
[1]
https://dl.dropboxusercontent.com/u/852552/Marmotta_OverLOD%20Surfer%20presentation_0.2.pdf

>>> Jakob Frank <[email protected]> 02.09.2014 12:45 >>>
Hi Fabian,

looks like you chose a big one to start with ;-)

LDCache plugs into the Sesame Sail stack to automatically retrieve
remote resources and make them available in the local triple store.

Sesame does not use CDI but the built-in Java ServiceLoader [1], so
plugging in there is not as easy.
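 
Just for illustration, a service lookup with the ServiceLoader looks
like this (plain JDK; Sesame registers its implementations in
META-INF/services files on the classpath, there is no injection
container involved - the Greeter interface is only a toy example):

  import java.util.ServiceLoader;

  public class ServiceLoaderDemo {
      // the provider interface; each implementation is registered by
      // listing its class name in a META-INF/services/<interface> file
      public interface Greeter {
          String greet();
      }

      public static void main(String[] args) {
          // scans the classpath for all registered implementations
          for (Greeter greeter : ServiceLoader.load(Greeter.class)) {
              System.out.println(greeter.greet());
          }
      }
  }

So to hook into that stack you would have to provide such service
registrations rather than CDI beans.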

On the other hand: why do you want to implement a module similar to
LDCache? What feature do you need that can't be solved using LDCache?
For me, your "external data source" module sounds exactly like LDCache
in action...

Best,
Jakob

[1]
http://docs.oracle.com/javase/6/docs/api/java/util/ServiceLoader.html

On 2 September 2014 09:06, Fabian Cretton <[email protected]>
wrote:
> Hi,
>
> I would like to implement a module that is similar to LDCache
> (following the previous discussions with Sergio about the overLOD
> project).
> I am currently reading about the LDCache functionalities here
> http://marmotta.apache.org/ldcache/
> and having a look at the code.
>
> (It is a pretty steep learning curve for me to get into the Marmotta
> project, but I think it is worth it, rather than starting from
> scratch.)
>
> As I am new to this kind of project infrastructure, is there
> anything I should read to better understand the whole framework?
> Maybe Java EE tutorials, as the project description says "The Apache
> Marmotta Platform is implemented as a light-weight Service-Oriented
> Architecture (SOA) using the CDI/Weld service framework (i.e. the
> core components of Java EE 6)."
>
> Then, to create the new module, would it be a good idea to duplicate
> the LDCache files (libraries and platform, I guess) and modify them,
> or would it be better to start from a new empty module as described
> here:
> http://wiki.apache.org/marmotta/Customizing#Modules
>
> Thank you for any help
> Fabian
>
>
>>>> Sergio Fernández <[email protected]> 27.08.2014 16:31 >>>
> Hi Fabian,
>
> On 27/08/14 14:49, Fabian Cretton wrote:
>> My first goal was: to build the whole project locally, run my
>> locally built Marmotta, and then start adding components.
>> But my first concern, now that I am digging deeper, is that Marmotta
>> is a pretty big project (about 80 projects), and so you might
>> recommend me not to import the main "pom.xml" into my Eclipse
>> environment, but start smaller?
>
> Then start from the platform modules.
>
>> If there is already documentation about how to proceed, please
>> point me to it; I didn't find any by myself.
>
> Well, the overall build process is entirely managed by Maven, check
> http://marmotta.apache.org/installation#source
>
>> Nevertheless, I do have problems and errors in Eclipse, and hope
>> you can help me with that.
>
> Eclipse should be able to manage such a number of modules with
> Maven.
>
>> The first problems I have are with many "Plugin execution not
>> covered by lifecycle configuration" errors.
>
> Some plugin lifecycles might not be supported inside Eclipse. Just
> ignore them, you should not need them.
>
>> Then I have 6-7 of these: "Project build error: Non-resolvable
>> parent POM: Could not find artifact
>> org.apache.marmotta:marmotta-parent:pom:3.2.1-SNAPSHOT and
>> 'parent.relativePath' points at wrong local POM pom.xml
>> /marmotta-backend-sparql line 23 Maven pom Loading Problem"
>> and here I am pretty confused: it seems that some POM files are not
>> up-to-date in this current 3.3.0 version, as they still point to a
>> "3.2.1" parent POM file, while the parent is already at version
>> "3.3.0"?
>
> Sorry for the error. Those modules are out of the default profile,
> so the release plugin did not update the versions accordingly. It's
> already fixed in the develop branch; please update your fork.
>
>> Then, apart from those Maven errors, I do have a few Java errors
>> with many "imports" or "types" which can't be resolved, and this
>> seems very strange to me. But maybe solving the main Maven problems
>> above would correct that?
>
> All dependencies are available from Maven central. Try to run a
> "mvn install" from the root.
>
>> A first goal for me would be to update Marmotta's main menu so
>> that under "Others", next to "Linked Data Caching", I could have an
>> "External Data Sources" menu and then work on that new module as
>> discussed earlier with you.
>
> Then you need to create a custom module and add it to your custom
> webapp launcher. The whole process is supported by Maven artifacts,
> as described at:
>
> http://wiki.apache.org/marmotta/Customizing#Modules
>
> Hope that helps.
>
> Cheers,
>
> --
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 660 2747 925
> e: [email protected]
> w: http://redlink.co
