[jira] [Commented] (CSV-239) Cannot get headers in column order from CSVRecord

Dave Moten (JIRA) Sun, 19 May 2019 13:24:31 -0700


    [ 
https://issues.apache.org/jira/browse/CSV-239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843536#comment-16843536
 ]


Dave Moten commented on CSV-239:
--------------------------------

Thanks! I see headerNames is created once in CSVParser and a field added to 
CSVRecord. The additional field in CSVRecord will increase allocation 
pressure (e.g. when parsing a very big file). Given that getHeaderNames returns 
a copy anyway, perhaps you could build it on demand from the mapping object? 
Also, you don't need to sort because you already have the position of the 
columns. Just fill an array according to the position in the map. 
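
A minimal sketch of that suggestion, assuming the mapping is a 
Map<String, Integer> from header name to zero-based column index with no 
gaps or duplicate indices. The class and method names here are hypothetical 
and are not part of Commons CSV:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HeaderOrder {

    // Hypothetical helper (not Commons CSV API): rebuilds the header names in
    // column order from a name -> column index map, without sorting.
    // Assumes the indices are exactly 0..size-1, each used once.
    static List<String> headerNames(Map<String, Integer> mapping) {
        String[] names = new String[mapping.size()];
        for (Map.Entry<String, Integer> entry : mapping.entrySet()) {
            // The stored index says directly where the name belongs.
            names[entry.getValue()] = entry.getKey();
        }
        return Arrays.asList(names);
    }

    public static void main(String[] args) {
        // A HashMap gives no ordering guarantee, which is the point:
        // the column order comes from the values, not from iteration order.
        Map<String, Integer> mapping = new HashMap<>();
        mapping.put("name", 1);
        mapping.put("id", 0);
        mapping.put("email", 2);
        System.out.println(headerNames(mapping)); // [id, name, email]
    }
}
```

This is a single O(n) pass per call and needs no extra field on each record, 
at the cost of building the list every time it is requested.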

I also see the change in createHeaders of a LinkedHashMap to a TreeSet and the 
return of a LinkedHashMap copy in CSVRecord.toMap. If this is an attempt to 
make the entries iterable in column order, it doesn't work! TreeSets order on 
key value (header name) for starters. 
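
To illustrate the ordering point, here is a small standalone sketch. It uses 
TreeMap, the sorted-map counterpart of TreeSet, which is an assumption about 
what the patch does; the header names and the class name are made up for the 
example:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrderDemo {

    // Inserts three hypothetical headers in column order: zip, city, address.
    static void fill(Map<String, Integer> map) {
        map.put("zip", 0);
        map.put("city", 1);
        map.put("address", 2);
    }

    // Returns the keys in whatever order the map iterates them.
    static List<String> keys(Map<String, Integer> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> linked = new LinkedHashMap<>();
        fill(linked);
        Map<String, Integer> tree = new TreeMap<>();
        fill(tree);

        // LinkedHashMap iterates in insertion order, here the column order.
        System.out.println(keys(linked)); // [zip, city, address]
        // TreeMap iterates in sorted key order, regardless of column index.
        System.out.println(keys(tree)); // [address, city, zip]
        // Copying the sorted map into a LinkedHashMap freezes the sorted
        // order; it does not restore the column order.
        System.out.println(keys(new LinkedHashMap<>(tree))); // [address, city, zip]
    }
}
```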

Btw, this would be easier to review as a PR on the GitHub repo. Is that not 
normal practice for this project?

> Cannot get headers in column order from CSVRecord
> -------------------------------------------------
>
>                 Key: CSV-239
>                 URL: https://issues.apache.org/jira/browse/CSV-239
>             Project: Commons CSV
>          Issue Type: Improvement
>          Components: Parser
>    Affects Versions: 1.6
>            Reporter: Dave Moten
>            Priority: Minor
>
> I have a use case where I read many lines from an arbitrary csv file with a 
> given CSVFormat as List<CSVRecord>, transform that list and then want to 
> write the transformed list to another file. 
> When I specify the format as CSVFormat.DEFAULT.withFirstRecordAsHeader() the 
> headers from the first line are available in the CSVRecord object via the 
> map returned by CSVRecord.toMap, but their column positions are not (the 
> iteration order of the returned map does not reflect column order). 
> Consequently I cannot write a header line in the correct order to the output 
> csv file (which I do when the first CSVRecord is to be written).
> Another option would be to ensure that the CSVPrinter object writes the 
> header on the first call to CSVPrinter.printRecord, but we should also be able 
> to cover the use case where we are writing to a non-csv format and we still 
> want to write the headers in the correct order. 
> My preference at minimum is that the headers with column order are available 
> from CSVRecord (after all, the data to supply this is already present in 
> CSVRecord). The addition of a method `getHeaders` returning a `List<String>` 
> would do the job. I'm happy to submit a PR if desired.
> I've marked this as of minor importance but I think it's a pretty important 
> flaw in the library at the moment that prevents even the simplest of 
> round-trip (read then write) scenarios when the headers are read from the 
> file rather than known up-front.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
