Hi Andy,

Thanks for your instructions!

Last week, I wrote the documentation for this GSoC project (jena-csv):
- http://jena.staging.apache.org/documentation/csv/
- http://jena.staging.apache.org/documentation/csv/get_started.html
- http://jena.staging.apache.org/documentation/csv/design.html
- http://jena.staging.apache.org/documentation/csv/implementation.html

I'll write more tests and improve the documentation in the coming weeks.

Best regards,
Ying Jiang

On Tue, Jul 29, 2014 at 6:59 PM, Andy Seaborne <a...@apache.org> wrote:
> On 27/07/14 16:20, Ying Jiang wrote:
>>
>> Hi Andy,
>>
>> Thanks for your comments!
>>
>> I just submitted the code of the csv2rdf tool. It's based on
>> CmdLangParse, because parsing is the first step for transforming.
>> csv2rdf inherits all of the command line arguments (for parsing) from
>> CmdLangParse, plus the new "-dest=file" argument for the
>> destination output file. You can try out:
>> java -cp ... riotcmd.csv2rdf -dest=test.ntriples
>> src/test/resources/test.csv
>>
>> The warnings you pointed out have been fixed already. It's now clean
>> for packaging jena-csv.
>>
>> As to the release, do you mean releasing jena-csv itself, or the whole
>> jena
>
>
> jena-csv on its own. It means it can be released more frequently and
> out-of-step with the rest of Jena. As we are (trying) to release only 6
> monthly for the main distribution, (!!), coupling now does not work.
>
>
>> (i.e. recent [VOTE] Release Jena 2.12.0 and Fuseki 1.1.0 )?
>> Actually, I've written some code in jena-arq (e.g. LangCSV in RIOT),
>> while more of the code resides in jena-csv (e.g. PropertyTable, csv2rdf
>> tool). jena-csv depends on jena-arq. Shall I integrate jena-csv into
>> jena-arq, or just leave jena-csv as a separate module to release alone?
>
>
> For now, I'd leave it separate.
>
> The changes to jena-arq should have been picked up for 2.12.0.
>
> Hopefully, you can switch to using a released Jena - the POM is using
> 2.12.0-SNAPSHOT so it'll be good for the 2.12.0 release.
>
>
>> PropertyTable and its implementations now have good test coverage.
>> However, other tests are still under development, including some unit
>> tests and the tests for the real world csv data. I'll make them more
>> complete. In the remaining weeks, I can also compose the
>> documentation you mentioned. After that, I think it's OK to release it
>> to the world. In short, I believe I can go with the plan. Thanks a lot
>> for your help during the project!
>
>
> Great!
>
> Andy
>
>
>>
>> Best,
>> Ying Jiang
>>
>> [1]
>> https://svn.apache.org/repos/asf/jena/Experimental/jena-csv/src/main/java/riotcmd/csv2rdf.java
>>
>> On Thu, Jul 24, 2014 at 8:47 PM, Andy Seaborne <a...@apache.org> wrote:
>>>
>>> Ying,
>>>
>>> jena-csv is looking good. How's the csv2rdf tool coming along?
>>>
>>> Using the StageGenerator route is OK; if it were OpExecutor you could do
>>> some filtering as well but without a value based index (a general comment
>>> about Jena stores - not CSV specific) it will not make a lot of
>>> performance difference. There is material in the join engine rewrite
>>> "quack" but it's not ready. I don't see any major thing you've done that
>>> would not be applicable at some later (post-summer) date for someone
>>> interested in moving it on.
>>>
>>> Looking at the time left, there are a couple of weeks and then the room
>>> for manoeuvre.
>>>
>>> I think that the most important thing to do is to get this out to people
>>> to use: release it on the world!
>>>
>>> That means:
>>>
>>> 1/ Documentation for use
>>>
>>> "Getting started page"
>>> A page for full details.
>>> A page about the code (?)
>>>
>>> 2/ An Apache release.
>>>
>>> If everyone is OK with this, I suggest that this is a release in its own
>>> right with its own VOTE, etc etc.
>>> It's good experience to
>>>
>>> Take a look at our release process documentation to know what's involved:
>>>
>>> https://cwiki.apache.org/confluence/display/JENA/Release+Process
>>>
>>> and reply quite soon about whether I've missed anything that needs doing
>>> before a release and whether this plan works for you. I'm very open to
>>> doing something different if you suggest something else. The goal is to
>>> get stuff to other people - there can be various different ways to do that.
>>>
>>> Andy
>>>
>>> Comments:
>>>
>>> I checked out a clean copy
>>>
>>> 1/ "mvn clean test" and I got a test failure.
>>>
>>> ------------
>>> Running org.apache.jena.propertytable.TS_PropertyTable
>>> log4j:ERROR Could not read configuration file from URL
>>> [file:src/test/resources/log4j.properties].
>>> java.io.FileNotFoundException: src/test/resources/log4j.properties (No
>>> such file or directory)
>>> ------------
>>> Missing file?
>>>
>>>
>>> 2/ Faking that file, I then got some output from the tests:
>>>
>>>> 1(?x
>>>> <file:///home/afs/Projects/jena-csv/src/test/resources/test.csv#Town>
>>>> ?townName)
>>>
>>> (?x
>>>
>>> <file:///home/afs/Projects/jena-csv/src/test/resources/test.csv#Population>
>>> ?pop)
>>> <=1 (?x
>>>
>>> <file:///home/afs/Projects/jena-csv/src/test/resources/test.csv#Predicate%20With%20Space>
>>> "PredicateWithSpace2")
>>>
>>> Are there by any chance some stray System.out.println in the code? :-)
>>>
>>> 3/ I also got some javadoc warnings when packaging.
>>>
>>> e.g.
>>> [WARNING]
>>>
>>> /home/afs/Projects/jena-csv/src/main/java/org/apache/jena/propertytable/PropertyTable.java:45:
>>> warning - @param argument "column," is not a parameter name.
>>>
>>> 4/ Is the test coverage sufficiently complete?
>
>