[ 
https://issues.apache.org/jira/browse/JENA-699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno P. Kinoshita updated JENA-699:
------------------------------------
    Attachment: JENA-699.patch

Hello, 

*TL;DR*: added commons-csv-1.0 as a dependency in the pom and removed the 
classes specific to the existing parser, but kept the existing parser as a 
wrapper around commons-csv to maintain the API - https://issues.apache.org/jira/browse/JENA-699

I'm using Jena in a project for a customer, and am slowly learning the code 
base and going through existing issues. Since I contribute to commons, I 
thought I'd give JENA-699 a try.

I started by removing the CSVParser from the project, as well as the whole 
org.apache.jena.atlas.csv package, and then slowly tried to replace it in the 
rest of the code. 

However, I had to change code in several places, not only in the methods 
handling CSV. Because of that I decided to stick with the existing API, but 
wrap the commons-csv component within the existing parser.

As a consequence, the CSVToken, CSVTokenIterator, and CSVTokenType classes were 
removed and their tests updated. The CSVParser now forwards calls to the commons 
component, and keeps throwing the same CSVParseException.
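
To make the wrapping idea concrete, here is a minimal, self-contained sketch of the pattern. It is not the code from the patch: LegacyStyleParser, nextRow() and delegateParse() are hypothetical names, the CSVParseException below is a local stand-in, and the naive comma splitter only takes the place of the commons-csv call so the example runs without extra dependencies.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CsvWrapperSketch {

    /** Local stand-in for the exception the existing API already throws. */
    static class CSVParseException extends RuntimeException {
        CSVParseException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Hypothetical stand-in for the third-party parser. It splits on commas
     * only and ignores quoting, which a real CSV library would handle.
     */
    static Iterator<List<String>> delegateParse(Reader input) throws IOException {
        List<List<String>> rows = new ArrayList<>();
        BufferedReader reader = new BufferedReader(input);
        String line;
        while ((line = reader.readLine()) != null) {
            rows.add(Arrays.asList(line.split(",", -1)));
        }
        return rows.iterator();
    }

    /** Facade that keeps the caller-facing API while delegating the parsing. */
    static class LegacyStyleParser {
        private final Iterator<List<String>> rows;

        LegacyStyleParser(Reader input) {
            try {
                this.rows = delegateParse(input);
            } catch (IOException e) {
                // Translate the delegate's failure into the exception callers expect.
                throw new CSVParseException("Failed to parse CSV input", e);
            }
        }

        /** Returns the next row, or null once the input is exhausted. */
        List<String> nextRow() {
            return rows.hasNext() ? rows.next() : null;
        }
    }

    public static void main(String[] args) {
        LegacyStyleParser parser = new LegacyStyleParser(
                new StringReader("Town,Population\nSouthton,123000\nNorthville,654000"));
        List<String> row;
        while ((row = parser.nextRow()) != null) {
            System.out.println(row);
        }
    }
}
```

The point of the facade is that callers keep seeing the same class and the same exception type, so code outside the CSV handling does not need to change when the underlying parser is swapped.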

The commons-csv parser is created as follows to comply with existing tests:

{code:borderStyle=solid}
CSVFormat.EXCEL.withQuote('\'').parse(input);
{code}

With the code from the patch, using the CSV example from \[1\], and the 
following snippet:

{code:borderStyle=solid}
    public static void main(String[] args) throws Exception {
        // Parse a small in-memory CSV and stream the resulting triples to stdout.
        LangRIOT csv = new LangCSV(
                new StringReader("Town,Population\n" +
                        "Southton,123000\n" +
                        "Northville,654000"),
                "http://example/b", "",
                ErrorHandlerFactory.getDefaultErrorHandler(),
                new WriterStreamRDFBlocks(System.out));
        csv.parse();
    }
{code}

The result is identical before and after the patch:

{code:borderStyle=solid}
_:b0    <http://w3c/future-csv-vocab/row>  1 ;
        <file:///home/kinow/java/tupilabs/jena/jena-arq/#Town>  "Southton" ;
        <file:///home/kinow/java/tupilabs/jena/jena-arq/#Population>  "123000"^^<http://www.w3.org/2001/XMLSchema#double> .

_:b1    <http://w3c/future-csv-vocab/row>  2 ;
        <file:///home/kinow/java/tupilabs/jena/jena-arq/#Town>  "Northville" ;
        <file:///home/kinow/java/tupilabs/jena/jena-arq/#Population>  "654000"^^<http://www.w3.org/2001/XMLSchema#double> .
{code}

Hope that helps,
Bruno

\[1\] https://www.w3.org/2013/csvw/wiki/CSV2RDF

> Replace the CSV/TSV parsing with Apache Commons CSV
> ---------------------------------------------------
>
>                 Key: JENA-699
>                 URL: https://issues.apache.org/jira/browse/JENA-699
>             Project: Apache Jena
>          Issue Type: Improvement
>            Reporter: Andy Seaborne
>            Priority: Minor
>         Attachments: JENA-699.patch
>
>
> When Apache Commons CSV is released, use that and remove the current parsers 
> in favour of a properly written and designed component.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
