Hi all, any feedback on the OpenRefine integration? I'm asking Sergio in particular, who initiated the Jira issue. Cheers, Raffaele.
---------- Forwarded message ----------
From: Raffaele Palmieri <[email protected]>
Date: Thursday, 13 March 2014
Subject: [GSoC 2014] MARMOTTA-202 : OpenRefine import engine
To: [email protected]

Hi Dileepa,

On 13 March 2014 20:28, Dileepa Jayakody <[email protected]> wrote:

> Hi Raffaele,
>
> Thanks again for your suggestions.
> I think it will be a great addition to this project to make the data
> imported from OpenRefine interoperable with other datasets in Marmotta. I
> will follow up with the OpenRefine community to check whether their latest
> release supports the DCAT vocabulary. If it doesn't support DCAT, do you
> think implementing DCAT support in OpenRefine is a task within this
> project's scope?

Basically yes, it could be a task within this project's scope. I think a
preliminary check is needed within RDF Refine.

> On Thu, Mar 13, 2014 at 4:21 PM, Raffaele Palmieri <
> [email protected]> wrote:
>
> > Hi Dileepa,
> > some thoughts that I also share with the other Marmotta team members
> > regarding the integration with OpenRefine.
> > For the second level of integration, which essentially exports both CSV
> > and other data towards Marmotta to produce RDF, it would be interesting
> > to try adding a feature in OpenRefine to supply additional metadata for
> > the dataset, using for example the DCAT vocabulary [1].
> > I don't remember whether this feature is covered by the GRefine RDF
> > Extension, of which a new release (ALPHA 0.9.0) is available [2].
> > If the dataset is supplied with DCAT metadata, Marmotta could expose it
> > to facilitate its interoperability with other datasets.
> > To do that, Marmotta also needs to store structured datasets, not
> > necessarily instantiated as RDF triples.
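[Editor's note: the DCAT metadata discussed above could look roughly like the following. This is a minimal sketch in plain Python that assembles a Turtle fragment by hand; the dataset URI, title, and download URL are invented placeholders, not part of the actual Marmotta or OpenRefine code.]

```python
# Sketch: a DCAT description for a CSV dataset exported from OpenRefine,
# assembled as a Turtle string. All URIs and titles below are hypothetical
# placeholders for illustration only.

def dcat_description(dataset_uri, title, download_url, media_type="text/csv"):
    """Return a Turtle fragment describing one dataset with DCAT."""
    return f"""\
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

<{dataset_uri}> a dcat:Dataset ;
    dct:title "{title}" ;
    dcat:distribution [
        a dcat:Distribution ;
        dcat:downloadURL <{download_url}> ;
        dcat:mediaType "{media_type}"
    ] .
"""

turtle = dcat_description(
    "http://example.org/dataset/refined-cities",    # hypothetical dataset URI
    "Cities cleaned with OpenRefine",               # hypothetical title
    "http://example.org/files/refined-cities.csv",  # the copy Marmotta would serve
)
print(turtle)
```

The dcat:downloadURL property is what would let Marmotta hand back the stored CSV copy later, as suggested in the message above.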
> I think Marmotta's KiWi triple store can be connected to RDBMS back ends
> (MySQL, Postgres, H2), therefore the above requirement of storing
> structured data in Marmotta's backend is fulfilled. Please correct me if
> I'm wrong.

No, the KiWi triple store doesn't manage simple structured files (e.g. CSV),
but only instances of triples. The storage I have in mind is quite simple;
even the file system could be used, precisely so that the dataset can be
retrieved from Marmotta at a later time, for example via dcat:downloadURL.
Clearly this dataset is a copy of the one processed with Refine, and could
be overwritten at any time.

> In summary I think we are looking at 2 main tasks now.
> 1. Ability to import data from an OpenRefine process

Yes, and in addition to linked datasets (4 and 5 stars), also structured
datasets in simpler formats (CSV, etc.), supplied for example with DCAT
metadata.

> 2. Ability to configure the imported OpenRefine data to be interoperable
> with other datasets in Marmotta (potentially using the DCAT vocab)

Yes, with the possibility to retrieve them from Marmotta, so also 3-star
datasets.

> More ideas and suggestions are most welcome.

Before you prepare the proposal, we should seek advice from the Marmotta
team.

> Thanks,
> Dileepa

Regards, Raffaele.

> > What do you think about it?
> > Regards,
> > Raffaele.
> >
> > [1] http://www.w3.org/TR/vocab-dcat/
> > [2] https://github.com/fadmaa/grefine-rdf-extension/releases/tag/v0.9.0
> >
> > On 11 March 2014 10:29, Dileepa Jayakody <[email protected]>
> > wrote:
> >
> > > Thank you very much Raffaele for the detailed explanation.
> > >
> > > I will do some more background research on Marmotta data import and
> > > OpenRefine, and come up with questions and ideas.
> > >
> > > Also, any new suggestions or directions to evolve this project idea
> > > are welcome.
> > > Thanks,
> > > Dileepa
> > >
> > > On Tue, Mar 11, 2014 at 3:14 AM, Raffaele Palmieri <
> > > [email protected]> wrote:
> > >
> > > > Hi Dileepa,
> > > > pleased to meet you, and good to know of your interest in
> > > > contributing to Marmotta.
> > > > As discussed on Marmotta's mailing list, this integration could be
> > > > achieved at various levels.
> > > > A first level is reached by refining your messy data with the Refine
> > > > tools, using the RDF extension, which already offers a graphical UI
> > > > to model RDF data by producing an RDF skeleton, and then importing
> > > > the new data into Marmotta, compliant with the created skeleton.
> > > > This integration mode was implemented in the past using [1], but
> > > > needs to be updated because:
> > > > 1) Google Refine became OpenRefine
> > > > 2) LMF became Marmotta in its linked-data core functionalities
> > > > This update also requires work on the project configuration, because
> > > > OpenRefine has a different configuration than Apache Marmotta.
> > > > Whatever kind of integration is achieved, I think work on the
> > > > project configuration will be required.
> > > > A second level of integration is reached if you break up the RDF
> > > > into a CSV and a set of RDF mappings (aka the RDF skeleton).
> > > > So, starting from an exported project that contains the CSV and the
> > > > related actions to produce the RDF skeleton, the integration is
> > > > expected to produce the final RDF in Marmotta's world, probably
> > > > performing steps similar to those of the GRefine RDF Extension.
> > > > For this second level of integration, the export functionality and
> > > > the RDF skeleton should be explored to verify what is easily
> > > > exportable.
> > > > At the moment these are the hypotheses for the integration; clearly
> > > > the second appears to be more complex, but the first also involves
> > > > non-trivial work.
> > > > Since you have experience on other Semantic Web projects, such as
> > > > Apache Stanbol, feel free to propose other integration hypotheses.
> > > > Regards,
> > > > Raffaele.
> > > > [1] https://code.google.com/p/lmf/wiki/GoogleRefineExtension
> > > >
> > > > On 10 March 2014 21:35, Dileepa Jayakody <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I'm Dileepa, a research student from the University of Moratuwa,
> > > > > Sri Lanka, with a keen interest in the linked-data and
> > > > > semantic-web domains. I have worked on linked-data related
> > > > > projects such as Apache Stanbol, and I'm experienced with related
> > > > > technologies like RDF, SPARQL, FOAF etc. I'm very much interested
> > > > > in applying for GSoC this year with Apache Marmotta.
> > > > >
> > > > > I would like to open up a discussion on the OpenRefine
> > > > > integration project idea [1]. AFAIU, the goal of this project is
> > > > > to impo
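[Editor's note: the second integration level described in the thread — breaking an OpenRefine project into a CSV plus a set of RDF mappings (the "RDF skeleton") — can be sketched roughly as below. This is an illustrative approximation in Python, not the GRefine RDF Extension's actual algorithm; the column-to-predicate mapping, URIs, and sample data are all invented.]

```python
import csv
import io

# Rough sketch of applying an "RDF skeleton" (here simplified to a plain
# column -> predicate mapping) to CSV rows, emitting N-Triples. The real
# GRefine RDF Extension skeleton is richer (nested nodes, GREL
# expressions); this only illustrates the data flow CSV -> triples.

SKELETON = {
    # hypothetical mapping: CSV column -> RDF predicate URI
    "name":    "http://xmlns.com/foaf/0.1/name",
    "country": "http://example.org/vocab/country",
}

def csv_to_ntriples(csv_text, subject_base, skeleton):
    """Yield one N-Triples line per mapped cell, one subject per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for i, row in enumerate(reader):
        subject = f"<{subject_base}/row{i}>"
        for column, predicate in skeleton.items():
            value = row.get(column, "").strip()
            if value:  # skip empty cells, as a skeleton mapping would
                yield f'{subject} <{predicate}> "{value}" .'

data = "name,country\nTurin,Italy\nColombo,Sri Lanka\n"
for triple in csv_to_ntriples(data, "http://example.org/resource", SKELETON):
    print(triple)
```

In the integration discussed above, the resulting triples would be pushed into Marmotta, while the source CSV itself would also be stored (and described with DCAT) so that the 3-star dataset remains retrievable.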
