Hi All, I'm Dileepa, a research student from the University of Moratuwa, Sri Lanka, with a keen interest in the linked-data and semantic-web domains. I have worked on linked-data projects such as Apache Stanbol and am experienced with related technologies like RDF, SPARQL, and FOAF. I'm very interested in applying to GSoC this year with Apache Marmotta.
I would like to open a discussion on the OpenRefine integration project idea [1]. As I understand it, the goal of this project is to import data into the Marmotta triple store (the KiWi triple store by default) from OpenRefine after the data has been refined and exported. I did some background reading on the Marmotta data import process [2], which explains the different ways to import RDF data into the back-end triple store.

Currently OpenRefine exports data in several formats: csv, tsv, xls, and html tables [3]. So I think the main task of this project will be to convert this exported data into RDF and make it compatible with the Marmotta data import process. I did some quick research on how to do so, and there are several options for converting such data to RDF:

1. RDF extension for OpenRefine: https://github.com/sparkica/rdf-extension
2. RDF Refine: http://refine.deri.ie/
3. D2R Server: http://d2rq.org/d2r-server (if the OpenRefine data was imported from a SQL database)

Apart from the data conversion from OpenRefine to RDF, what are the other tasks to be done in this project? I'd appreciate your thoughts and suggestions.

Thanks,
Dileepa

[1] https://issues.apache.org/jira/browse/MARMOTTA-202
[2] http://wiki.apache.org/marmotta/ImportData
[3] https://github.com/OpenRefine/OpenRefine/wiki/Exporters#exporting-projects
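P.S. To make the conversion step concrete, here is a minimal sketch of turning a CSV export into Turtle. All URIs, the sample data, and the column-to-property mapping are hypothetical; in a real integration the mapping would come from the user (as RDF Refine's mapping UI does), and the resulting RDF would then be handed to the Marmotta import process [2]:

```python
import csv
import io

# Hypothetical OpenRefine CSV export: a header row plus data rows.
CSV_EXPORT = """name,homepage
Alice,http://example.org/alice
Bob,http://example.org/bob
"""

# Hypothetical base URI and column-to-property mapping.
BASE = "http://example.org/resource/"
PROPS = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
}

def csv_to_turtle(text):
    """Emit one RDF subject per CSV row, one triple per column."""
    lines = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text))):
        subject = "<%s%d>" % (BASE, i)
        for column, value in row.items():
            prop = "<%s>" % PROPS[column]
            # Crude heuristic: treat http:// values as resources,
            # everything else as plain literals.
            if value.startswith("http://"):
                obj = "<%s>" % value
            else:
                obj = '"%s"' % value
            lines.append("%s %s %s ." % (subject, prop, obj))
    return "\n".join(lines)

print(csv_to_turtle(CSV_EXPORT))
```

This is just to illustrate the shape of the task; presumably the project would reuse one of the existing converters above rather than hand-rolling the serialization.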
