[ https://issues.apache.org/jira/browse/CSV-226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anson Schwabecher updated CSV-226:
----------------------------------
Description:
Hello, I'd like to contribute a CSVParser test suite covering the standard
charsets defined in java.nio.charset.StandardCharsets, plus UTF-32.
This is a standalone test, but it also supports a fix for CSV-107. It also
refactors and unifies the testing around your established workaround of
inserting a BOMInputStream ahead of the CSVParser.
The test takes a single base UTF-8 encoded file (cstest.csv) and copies it to
multiple output files (in the target dir), each in a different character set,
similar to the iconv tool. Each file is then fed into the parser to test all
the BOM/NOBOM Unicode variants. I think a file-based approach is still
important here, rather than just encoding a character stream inline as a
string: that way, if issues develop, it's easy to inspect the data.
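To make the transcoding step concrete, here is a rough, JDK-only sketch of the idea, not the contributed test itself. The class CharsetVariants and its methods are hypothetical names used only for illustration, the output goes to a temp directory rather than target, and readVariant is a simple stand-in for what BOMInputStream plus CSVParser would do in the real suite.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper (not part of Commons CSV): writes charset variants of one UTF-8 file. */
final class CharsetVariants {

    /** Returns the byte order mark for a Unicode charset, or an empty array if it has none. */
    static byte[] bomFor(String charsetName) {
        switch (charsetName) {
            case "UTF-8":    return new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
            case "UTF-16BE": return new byte[] {(byte) 0xFE, (byte) 0xFF};
            case "UTF-16LE": return new byte[] {(byte) 0xFF, (byte) 0xFE};
            case "UTF-32BE": return new byte[] {0x00, 0x00, (byte) 0xFE, (byte) 0xFF};
            case "UTF-32LE": return new byte[] {(byte) 0xFF, (byte) 0xFE, 0x00, 0x00};
            default:         return new byte[0];
        }
    }

    /** Re-encodes the UTF-8 base file into charsetName, optionally prefixed with a BOM. */
    static Path writeVariant(Path baseUtf8, Path outDir, String charsetName, boolean withBom) {
        try {
            String text = new String(Files.readAllBytes(baseUtf8), StandardCharsets.UTF_8);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            if (withBom) {
                byte[] bom = bomFor(charsetName);
                out.write(bom, 0, bom.length);
            }
            // The explicit-endianness charsets do not emit a BOM on their own when encoding.
            byte[] body = text.getBytes(Charset.forName(charsetName));
            out.write(body, 0, body.length);
            String suffix = withBom ? "-BOM" : "-NOBOM";
            Path target = outDir.resolve("cstest-" + charsetName + suffix + ".csv");
            return Files.write(target, out.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stand-in for BOMInputStream + CSVParser: skips a leading BOM, then decodes the rest. */
    static String readVariant(Path file, String charsetName) {
        try {
            byte[] data = Files.readAllBytes(file);
            byte[] bom = bomFor(charsetName);
            boolean hasBom = bom.length > 0 && data.length >= bom.length
                    && Arrays.equals(Arrays.copyOf(data, bom.length), bom);
            int offset = hasBom ? bom.length : 0;
            return new String(data, offset, data.length - offset, Charset.forName(charsetName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes text as a UTF-8 base file, transcodes it, and reads the variant back. */
    static String roundTrip(String text, String charsetName, boolean withBom) {
        try {
            Path dir = Files.createTempDirectory("cstest");
            Path base = Files.write(dir.resolve("cstest.csv"), text.getBytes(StandardCharsets.UTF_8));
            Path variant = writeVariant(base, dir, charsetName, withBom);
            return readVariant(variant, charsetName);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because every variant lands on disk as its own file (for example cstest-UTF-16LE-BOM.csv), a failing charset can be opened and inspected directly, which is the main reason I prefer this over inline strings.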
I noticed in the project’s pom.xml (rat config) that you are excluding
individual test resource files by name rather than using a wildcard expression
to exclude every file in the directory. Is there a reason for this? It would be
much easier if devs did not have to maintain this configuration by hand.
{code:language=xml|title=i.e.: switch over to a single exclude expression}
<exclude>src/test/resources/**/*</exclude>
{code}
> Add CSVParser test case for standard charsets
> ---------------------------------------------
>
> Key: CSV-226
> URL: https://issues.apache.org/jira/browse/CSV-226
> Project: Commons CSV
> Issue Type: Test
> Components: Parser
> Affects Versions: 1.5
> Reporter: Anson Schwabecher
> Priority: Minor