Hi Dileepa,
On 13 March 2014 20:28, Dileepa Jayakody <[email protected]> wrote:

> Hi Raffaele,
>
> Thanks again for your suggestions.
> I think it will be a great addition to this project to make the data
> imported from OpenRefine interoperable with other datasets in Marmotta.
> I will follow up with the OpenRefine community to check whether they
> support the DCAT vocab in their latest release. If it doesn't support
> DCAT, do you think implementing DCAT support in OpenRefine is a task
> within this project's scope?

Basically yes, it could be a task within this project's scope. I think a
preliminary check within RDF Refine is needed first.

> On Thu, Mar 13, 2014 at 4:21 PM, Raffaele Palmieri <
> [email protected]> wrote:
>
> > Hi Dileepa,
> > some thoughts that I have also shared with other Marmotta team members
> > regarding the integration with OpenRefine.
> > For the second level of integration, which fundamentally exports both
> > CSV and other data towards Marmotta to produce RDF, it would be
> > interesting to try to add functionality to OpenRefine for supplying
> > additional metadata for a dataset, using for example the DCAT
> > vocabulary [1].
> > I don't remember if this feature is covered by the GRefine RDF
> > Extension, of which a new release (ALPHA 0.9.0) is available [2].
> > If a dataset is supplied with DCAT metadata, Marmotta could expose it
> > to facilitate its interoperability with other datasets.
> > To do that, Marmotta also needs to store structured datasets, not
> > necessarily instantiated as RDF triples.
>
> I think Marmotta's KiWi triple store can be connected to RDBMS back ends
> (MySQL, Postgres, H2), therefore the above requirement of storing
> structured data in Marmotta's backend is fulfilled. Please correct me if
> I'm wrong.

No, the KiWi triple store doesn't manage simple structured files (e.g.
CSV), only instances of triples. The storage I have in mind is quite
simple; even the file system could be used, precisely so that the dataset
can be retrieved from Marmotta at a later time using, for example,
dcat:downloadURL. Clearly this dataset would be a copy of the one tooled
with Refine, and could be overwritten at any time.
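To make this concrete, here is a minimal sketch of the kind of DCAT
description Marmotta could expose for a stored CSV file. I'm using Python
with rdflib purely for illustration; the URIs are hypothetical and this
does not reflect any existing Marmotta API:

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, DCTERMS

DCAT = Namespace("http://www.w3.org/ns/dcat#")

g = Graph()
g.bind("dcat", DCAT)
g.bind("dcterms", DCTERMS)

# Hypothetical URIs for a dataset refined with OpenRefine and stored in
# Marmotta.
dataset = URIRef("http://localhost:8080/marmotta/resource/dataset/example")
dist = URIRef("http://localhost:8080/marmotta/resource/dataset/example/csv")

g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("Example OpenRefine dataset")))
g.add((dataset, DCAT.distribution, dist))

g.add((dist, RDF.type, DCAT.Distribution))
g.add((dist, DCTERMS.format, Literal("text/csv")))
# dcat:downloadURL is what lets a client fetch the raw (3-star) file back
# from wherever Marmotta stores it.
g.add((dist, DCAT.downloadURL,
       URIRef("http://localhost:8080/marmotta/files/example.csv")))

print(g.serialize(format="turtle"))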
> In summary I think we are looking at 2 main tasks now.
> 1. Ability to import data from the OpenRefine process

Yes, and in addition to linked datasets (4 and 5 stars), also structured
datasets in simpler formats (CSV, etc.), supplied for example with DCAT
metadata.

> 2. Ability to make the imported OpenRefine data interoperable with other
> datasets in Marmotta (potentially using the DCAT vocab)

Yes, with the possibility to retrieve them from Marmotta, so also 3-star
datasets.
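On task 1: once OpenRefine (or its RDF extension) has produced triples,
pushing them into Marmotta should boil down to an HTTP POST against the
import web service. A rough sketch with Python's requests library; the
/import/upload path and the context parameter follow my reading of the
import documentation and should be double-checked against a running
instance:

import requests

MARMOTTA = "http://localhost:8080/marmotta"

def import_rdf(turtle_data, context=None):
    """Upload serialized RDF to Marmotta's import web service."""
    params = {"context": context} if context else {}
    resp = requests.post(
        MARMOTTA + "/import/upload",
        data=turtle_data.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        params=params,
    )
    resp.raise_for_status()

# Usage, assuming a Turtle file exported from the refining step:
# import_rdf(open("refined.ttl").read())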
> More ideas, suggestions are most welcome.

Before you prepare the proposal, we should seek advice from the Marmotta
team.

> Thanks,
> Dileepa

Regards,
Raffaele.

> > What do you think about it?
> > Regards,
> > Raffaele.
> >
> > [1] http://www.w3.org/TR/vocab-dcat/
> > [2] https://github.com/fadmaa/grefine-rdf-extension/releases/tag/v0.9.0
> >
> > On 11 March 2014 10:29, Dileepa Jayakody <[email protected]>
> > wrote:
> >
> > > Thank you very much Raffaele for the detailed explanation.
> > >
> > > I will do some more background research on Marmotta data import and
> > > OpenRefine and come up with the questions and ideas I get.
> > >
> > > Also, any new suggestions and directions to evolve this project idea
> > > are welcome.
> > >
> > > Thanks,
> > > Dileepa
> > >
> > > On Tue, Mar 11, 2014 at 3:14 AM, Raffaele Palmieri <
> > > [email protected]> wrote:
> > >
> > > > Hi Dileepa,
> > > > pleased to meet you, and good to know of your interest in
> > > > contributing to Marmotta.
> > > > As discussed on Marmotta's mailing list, this integration could be
> > > > reached at various levels.
> > > > A first level is reached by refining your messy data with the
> > > > Refine tools, using the RDF extension, which already offers a
> > > > graphical UI to model RDF data by producing an RDF skeleton, and
> > > > then importing the new data into Marmotta, compliant with the
> > > > created skeleton.
> > > > This integration mode was implemented in the past using [1], but
> > > > needs to be updated because:
> > > > 1) Google Refine became OpenRefine;
> > > > 2) LMF became Marmotta in its linked-data core functionalities.
> > > > This update also requires work on project configuration, because
> > > > OpenRefine has a different configuration than Apache Marmotta.
> > > > Whatever kind of integration is achieved, I think that work on
> > > > project configuration is required.
> > > > A second level of integration is reached if you break up the RDF
> > > > into CSV and a set of RDF mappings (aka the RDF skeleton).
> > > > So, starting from an exported project that contains the CSV and
> > > > the related actions to produce the RDF skeleton, the integration
> > > > is expected to produce the final RDF in Marmotta's world, probably
> > > > performing steps similar to the GRefine RDF Extension.
> > > > For that second level of integration, the export functionality and
> > > > the RDF skeleton should be explored to verify what is easily
> > > > exportable.
> > > > At the moment these are the integration hypotheses; clearly the
> > > > second appears more complex, but the first also involves
> > > > non-trivial work.
> > > > Since you have experience with other projects related to the
> > > > Semantic Web, such as Apache Stanbol, feel free to propose other
> > > > integration hypotheses.
> > > > Regards,
> > > > Raffaele.
> > > >
> > > > [1] https://code.google.com/p/lmf/wiki/GoogleRefineExtension
> > > >
> > > > On 10 March 2014 21:35, Dileepa Jayakody <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I'm Dileepa, a research student from the University of Moratuwa,
> > > > > Sri Lanka, with a keen interest in the linked-data and
> > > > > semantic-web domains. I have worked on linked-data related
> > > > > projects such as Apache Stanbol, and I'm experienced with
> > > > > related technologies like RDF, SPARQL, FOAF etc. I'm very much
> > > > > interested in applying for GSoC this year with Apache Marmotta.
> > > > >
> > > > > I would like to open up a discussion on the OpenRefine
> > > > > integration project idea [1]. AFAIU, the goal of this project is
> > > > > to import data into the Marmotta triple store (the KiWi triple
> > > > > store by default) from OpenRefine, after the data has been
> > > > > refined and exported.
> > > > >
> > > > > I did some background reading on the Marmotta data import
> > > > > process [2], which explains different ways to import RDF data
> > > > > into the back-end triple store.
> > > > > Currently OpenRefine exports data in several formats: CSV, TSV,
> > > > > XLS, HTML tables [3]. So I think the main task of this project
> > > > > will be to convert this exported data into RDF and make it
> > > > > compatible with the Marmotta data import process. I did some
> > > > > quick research on how to do so; there are several options to
> > > > > convert such data to RDF:
> > > > >
> > > > > 1. RDF extension to OpenRefine:
> > > > > https://github.com/sparkica/rdf-extension
> > > > > 2. RDF Refine: http://refine.deri.ie/
> > > > > 3. D2R Server: http://d2rq.org/d2r-server (if the OpenRefine
> > > > > data is imported from a SQL database)
> > > > >
> > > > > Apart from the data conversion process from OpenRefine to RDF,
> > > > > what are the other tasks to be done in this project?
> > > > > Appreciate your thoughts and suggestions.
> > > > >
> > > > > Thanks,
> > > > > Dileepa
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/MARMOTTA-202
> > > > > [2] http://wiki.apache.org/marmotta/ImportData
> > > > > [3]
> > > > > https://github.com/OpenRefine/OpenRefine/wiki/Exporters#exporting-projects
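P.S. On the second level of integration quoted above (a CSV export plus
the RDF-skeleton mappings), a minimal sketch of the kind of replay step
involved. The column-to-property mapping below is an ad-hoc stand-in for
the real RDF skeleton, whose actual GRefine format would need to be
parsed instead; rdflib is again used just for illustration:

import csv
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/")

# Hypothetical stand-in for the RDF skeleton: column name -> RDF property.
MAPPING = {"name": EX.name, "city": EX.city}

def csv_to_rdf(path):
    """Replay a (simplified) skeleton over a CSV export into a graph."""
    g = Graph()
    with open(path, newline="") as f:
        for i, row in enumerate(csv.DictReader(f)):
            subject = URIRef("http://example.org/row/%d" % i)
            for column, prop in MAPPING.items():
                if row.get(column):
                    g.add((subject, prop, Literal(row[column])))
    return g

# The resulting graph could then be pushed to Marmotta, e.g. with the
# import_rdf() helper sketched earlier in this mail:
# import_rdf(csv_to_rdf("refined-export.csv").serialize(format="turtle"))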
