[
https://issues.apache.org/jira/browse/CSV-290?focusedWorklogId=812263&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-812263
]
ASF GitHub Bot logged work on CSV-290:
--------------------------------------
Author: ASF GitHub Bot
Created on: 26/Sep/22 21:43
Start Date: 26/Sep/22 21:43
Worklog Time Spent: 10m
Work Description: angusdev commented on PR #265:
URL: https://github.com/apache/commons-csv/pull/265#issuecomment-1258667702
Added test for tab characters (ASCII 9) in values.
For QUOTE and ESCAPE, see the example below:
```
postgres=# insert into COMMONS_CSV_PSQL_TEST select 1, '"quoted"',
'|quoted2|', null, null;
INSERT 0 1
postgres=# copy COMMONS_CSV_PSQL_TEST to STDOUT with CSV;
1,"""quoted""",|quoted2|,,
postgres=# copy COMMONS_CSV_PSQL_TEST to STDOUT with CSV QUOTE '|';;
1,"quoted",|||quoted2|||,,
postgres=# copy COMMONS_CSV_PSQL_TEST to STDOUT with CSV ESCAPE '~';
1,"~"quoted~"",|quoted2|,,
postgres=# copy COMMONS_CSV_PSQL_TEST to STDOUT with CSV QUOTE '|' ESCAPE
'~';
1,"quoted",|~|quoted2~||,,
postgres=# copy COMMONS_CSV_PSQL_TEST to STDOUT QUOTE '|';
ERROR: COPY quote available only in CSV mode
postgres=# copy COMMONS_CSV_PSQL_TEST to STDOUT ESCAPE '~';
ERROR: COPY escape available only in CSV mode
```
In PG (CSV), ESCAPE escapes the quote character, while in Commons CSV,
ESCAPE escapes the delimiter and other special characters.
In PG (TEXT), QUOTE is not needed: the format is tab-delimited, and a tab
inside a value is escaped as '\t'.
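To make the difference concrete, here is a minimal Java sketch that mimics the PG output rules seen in the psql session above. It is not the Commons CSV implementation and not code from this PR; the class and method names (PgCopySketch, encodeCsv, encodeText) are made up for illustration, and NULL handling and the other COPY options are ignored.

```java
// Hypothetical helper, for illustration only: reproduces how COPY ... TO STDOUT
// renders a single non-null value, based on the psql transcript above.
final class PgCopySketch {

    // CSV mode: a value is quoted only when it contains the delimiter, the
    // quote character, or a line break. Inside the quotes, ESCAPE is written
    // before every quote character (and before the escape character itself).
    static String encodeCsv(String value, char delimiter, char quote, char escape) {
        boolean needsQuote = false;
        for (char c : value.toCharArray()) {
            if (c == delimiter || c == quote || c == '\n' || c == '\r') {
                needsQuote = true;
                break;
            }
        }
        if (!needsQuote) {
            return value;
        }
        StringBuilder sb = new StringBuilder();
        sb.append(quote);
        for (char c : value.toCharArray()) {
            if (c == quote || c == escape) {
                sb.append(escape);
            }
            sb.append(c);
        }
        sb.append(quote);
        return sb.toString();
    }

    // TEXT mode (assuming the default tab delimiter): there is no quoting;
    // the delimiter and a few control characters get backslash escapes.
    static String encodeText(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // with CSV                      -> """quoted"""
        System.out.println(encodeCsv("\"quoted\"", ',', '"', '"'));
        // with CSV QUOTE '|'            -> |||quoted2|||
        System.out.println(encodeCsv("|quoted2|", ',', '|', '|'));
        // with CSV ESCAPE '~'           -> "~"quoted~""
        System.out.println(encodeCsv("\"quoted\"", ',', '"', '~'));
        // with CSV QUOTE '|' ESCAPE '~' -> |~|quoted2~||
        System.out.println(encodeCsv("|quoted2|", ',', '|', '~'));
        // TEXT mode: a literal tab becomes backslash-t
        System.out.println(encodeText("a\tb"));
    }
}
```

When ESCAPE is not given, it is passed here as the same character as QUOTE, which is what produces the doubled quotes in the first two cases.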
Issue Time Tracking
-------------------
Worklog Id: (was: 812263)
Time Spent: 1h 20m (was: 1h 10m)
> Produced CSV using PostgreSQL format cannot be read
> ---------------------------------------------------
>
> Key: CSV-290
> URL: https://issues.apache.org/jira/browse/CSV-290
> Project: Commons CSV
> Issue Type: Bug
> Components: Parser
> Affects Versions: 1.6, 1.9.0
> Reporter: Anatoliy Artemenko
> Priority: Major
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> CSV, produced using printer:
>
> CSVPrinter printer = new CSVPrinter(sw,
> CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
>
> cannot be read with the same format parser:
>
> CSVParser parser = new CSVParser(new StringReader(sw.toString()),
> CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
>
> To reproduce:
>
> {code:java}
> StringWriter sw = new StringWriter();
> CSVPrinter printer = new CSVPrinter(sw,
> CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
> printer.printRecord("column1", "column2");
> printer.printRecord("v11", "v12");
> printer.printRecord("v21", "v22");
> printer.close();
> CSVParser parser = new CSVParser(new StringReader(sw.toString()),
> CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
> System.out.println("headers: " +
> Arrays.equals(parser.getHeaderNames().toArray(), new String[] {"column1",
> "column2"}));
> Iterator<CSVRecord> i = parser.iterator();
> System.out.println("row: " + Arrays.equals(i.next().toList().toArray(), new
> String[] {"v11", "v12"}));
> System.out.println("row: " + Arrays.equals(i.next().toList().toArray(), new
> String[] {"v21", "v22"}));{code}
> I'd expect the above code to work, but it fails:
> {code:java}
> java.io.IOException: (startline 1) EOF reached before encapsulated token finished
> at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:371)
> at org.apache.commons.csv.Lexer.nextToken(Lexer.java:285)
> at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:701)
> at org.apache.commons.csv.CSVParser.createHeaders(CSVParser.java:480)
> at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:432)
> at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:398)
> at Test.main(Test.java:25)
> {code}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)