Sergio,

I hope you are well. I am coming back to you as I will start these days to get a better understanding of Marmotta and see (hopefully with your help) how the features we have in mind for OverLOD could be implemented. There are quite a few questions in this email; I hope you can answer them so that I can move on more efficiently. Thank you in advance.

In a former email you pointed me towards Fusepool, saying: "Actually looking to the idea, technologically talking it does not look so different to what the Fusepool P3 FP7 project tries to do." I thus had a look at Fusepool, but from what I saw, Fusepool is about "creating" RDF from non-RDF resources, whereas OverLOD is mainly about consuming existing RDF data and having it at the disposal of a specific platform and use case. So one goal of OverLOD is more about the "next steps" of the Semantic Web: how to consume RDF efficiently, and then how to make it easier for non-RDF developers to use the data.

For instance, one use case could be to develop software for engineers, based on construction material data from different providers. The providers publish their catalogs in RDF, and this instance of OverLOD keeps a read-only copy of all the catalogs, managing updates efficiently and automatically, possibly handling data validation, and so on. The software might need other information as well, such as regulations or weather data, and all those data are maintained in the OverLOD triple store for the applications built on that instance of OverLOD. This feature is called "OverLOD Referencer" in the document "OverLOD Surfer – Marmotta discussion":
https://dl.dropboxusercontent.com/u/852552/Marmotta_OverLOD%20Surfer%20presentation_0.2.pdf

The OverLOD triple store (i.e. Marmotta) should be able to handle its own RDF data, but also to include RDF data from other sources on the web: RDF files, RDF from SPARQL CONSTRUCT queries on different endpoints, possibly RDFa and Microformats/Microdata with RDFization. There is quite some work to do here, and we haven't decided yet how far we will get, as there are other features we need to implement. If I am not mistaken, neither Marmotta, nor Fusepool, nor LDP has this feature, right? (I am also reading the LDP specification right now.)

At the end of the document "OverLOD Surfer – Marmotta discussion" I came up with some questions which, if I am not mistaken, haven't been answered yet, so I copy them here:

· Both Marmotta and OverLOD handle LOD data with a local copy of the data. How does Marmotta plan to put automatic updates in place once the original data is modified?
· Data validation: does Marmotta plan to validate the local data (in a CWA manner, à la SPIN maybe)? See the validation sketch after this list.
· Data "chunks": does Marmotta provide ways to import only a part of a data source, for instance by running SPARQL CONSTRUCT queries on an endpoint, or on the content of an imported RDF file? See the chunk-import sketch after this list.

It should be noted that OverLOD was designed before the W3C advancements on LDP and JSON-LD, but we now want to make good use of those specifications.
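To make the validation question more concrete, here is a minimal sketch of the kind of closed-world check I have in mind, written against the Sesame/OpenRDF API that Marmotta builds on. The ex:Material vocabulary and the constraint itself are just invented examples for illustration:

    import org.openrdf.query.QueryLanguage;
    import org.openrdf.repository.Repository;
    import org.openrdf.repository.RepositoryConnection;
    import org.openrdf.repository.sail.SailRepository;
    import org.openrdf.sail.memory.MemoryStore;

    public class ValidationSketch {
        public static void main(String[] args) throws Exception {
            // In-memory store standing in for the local copy of a catalog
            Repository repo = new SailRepository(new MemoryStore());
            repo.initialize();
            RepositoryConnection con = repo.getConnection();
            try {
                // ... load the imported catalog data here ...

                // SPIN-style constraint (CWA): flag materials without a label
                String constraint =
                    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
                    "PREFIX ex: <http://example.org/ns#> " +  // invented vocabulary
                    "ASK { ?m a ex:Material . " +
                    "      FILTER NOT EXISTS { ?m rdfs:label ?l } }";
                boolean violated =
                    con.prepareBooleanQuery(QueryLanguage.SPARQL, constraint).evaluate();
                if (violated) {
                    System.err.println("Validation failed: a material has no rdfs:label");
                }
            } finally {
                con.close();
            }
        }
    }

The question is whether Marmotta plans to run such constraints itself (e.g. on import, rejecting or flagging invalid data), or whether this would be entirely up to OverLOD.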
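Similarly, for the "chunks" question, this is the kind of selective import I mean: run a SPARQL CONSTRUCT against a provider's endpoint and keep only the resulting triples. Again a minimal sketch with the Sesame API; the endpoint URL and the vocabulary are invented:

    import org.openrdf.query.GraphQueryResult;
    import org.openrdf.query.QueryLanguage;
    import org.openrdf.repository.RepositoryConnection;
    import org.openrdf.repository.sparql.SPARQLRepository;

    public class ChunkImportSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical provider endpoint
            SPARQLRepository endpoint =
                new SPARQLRepository("http://provider.example.org/sparql");
            endpoint.initialize();
            RepositoryConnection con = endpoint.getConnection();
            try {
                // Import only the chunk we need: materials and their labels
                String query =
                    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
                    "PREFIX ex: <http://example.org/ns#> " +
                    "CONSTRUCT { ?m a ex:Material ; rdfs:label ?label } " +
                    "WHERE     { ?m a ex:Material ; rdfs:label ?label }";
                GraphQueryResult chunk =
                    con.prepareGraphQuery(QueryLanguage.SPARQL, query).evaluate();
                while (chunk.hasNext()) {
                    // In OverLOD these statements would go into the local store
                    System.out.println(chunk.next());
                }
                chunk.close();
            } finally {
                con.close();
            }
        }
    }

The open point for us is whether Marmotta could schedule such a query and refresh the local copy automatically, rather than doing a one-shot import.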
About development: Apache development is a new world for me, but some colleagues here might help me. I will also follow the instructions you gave to QiHong earlier this year, so, if I am not mistaken, I will start by forking Marmotta (Git and GitHub are also new to me). You told him:

· Fork our mirror there [1] and give me (wikier) admin permissions.
· Create, at least, a branch from 'develop' for your project; according our development guidelines [2], I'd recommend you to use the issue [3] as name for the branch: MARMOTTA-444.
· I'll closely follow your development there, using the comments on the code committed to provide you early feedback.
· Create issues there for internal issues of the project.

So I will do the same and, in my fork, create a branch from 'develop'. Is there a name you would recommend? Do I need to create 'Issues' for the OverLOD features?

As you pointed out: "For me the "OverLOD Referencer" has a big potential of reusing the infrastructure provided by LDClient [2 (http://marmotta.apache.org/ldclient/)] and LDCache [3 (http://marmotta.apache.org/ldcache/)]."

LDClient seems to be the way to import external data into Marmotta, although I don't see LDClient on the "Platform Architecture Overview". This external data could be RDF, or data that needs to be RDFized, right? My first question is the one already expressed above: does LDClient already handle the automatic update of the data once the data source has been modified? I have only read about some time-out features, but I don't know yet what they mean; see the sketch below. My second question is about the RDFizers: there is a nice list of RDFizers, but nothing about Microdata/Microformats; would that be something to implement if needed?

Then, LDCache handles where the incoming data from LDClient is stored. But is this storage different from the main Marmotta storage? Are the imported data part of the default graph, queryable transparently together with the other data from the LDP? I guess so, but from what I read I had some doubts.
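To check my understanding of LDClient, here is roughly how I imagine using it. The class and method names below (ClientConfiguration, LDClient, retrieveResource, getExpires) are taken from my reading of the LDClient pages, so they are assumptions on my side; please correct me if they do not match the actual API:

    import org.apache.marmotta.ldclient.model.ClientConfiguration;
    import org.apache.marmotta.ldclient.model.ClientResponse;
    import org.apache.marmotta.ldclient.services.ldclient.LDClient;

    public class LDClientSketch {
        public static void main(String[] args) throws Exception {
            // Default configuration; I assume the time-outs are set here
            ClientConfiguration config = new ClientConfiguration();
            LDClient client = new LDClient(config);

            // Fetch any dereferenceable Linked Data resource
            ClientResponse response =
                client.retrieveResource("http://dbpedia.org/resource/Valais");

            // My assumption: the response carries an expiry date, and this is
            // the "time-out" LDCache uses to decide when to re-fetch the
            // resource -- i.e. refreshing on access, not a push notification
            // when the source changes?
            System.out.println("Expires: " + response.getExpires());

            client.shutdown();
        }
    }

If that reading is right, then "automatic update" here means the cached copy is refreshed when it is accessed after its expiry date, which is close to, but not quite, what OverLOD needs (we would also like scheduled or triggered refreshes).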
That's all for now. Thank you again,

Fabian

>>> Sergio Fernández <[email protected]> 18.07.2014 10:53 >>>
Hi Fabian.

On 13/07/14 07:00, Fabian Cretton wrote:
> As I said, we are not really working on that until mid-august. However,
> what document would you recommand me to read until then, in order to
> really understand Marmotta ?

The platform description is a good starting point:
http://marmotta.apache.org/platform

Just let us know whatever we can help.

Cheers,

--
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 660 2747 925
e: [email protected]
w: http://redlink.co