[
https://issues.apache.org/jira/browse/CSV-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15602069#comment-15602069
]
Vladimir Eatwell commented on CSV-200:
--------------------------------------
As far as I can tell, the issue is that I can set both a quote mode and an
escape character. In the CSVPrinter, values are not escaped if a quote mode is
set:
{code}
private void print(final Object object, final CharSequence value, final int offset, final int len,
        final Appendable out, final boolean newRecord) throws IOException {
    if (!newRecord) {
        out.append(getDelimiter());
    }
    if (object == null) {
        out.append(value);
    } else if (isQuoteCharacterSet()) {
        // the original object is needed so can check for Number
        printAndQuote(object, value, offset, len, out, newRecord);
    } else if (isEscapeCharacterSet()) {
        printAndEscape(value, offset, len, out);
    } else {
        out.append(value, offset, offset + len);
    }
}
{code}
i.e. we either printAndQuote OR printAndEscape, never both.
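To make the either/or concrete, here is a minimal standalone sketch of that dispatch. It is not the Commons CSV code: the class PrintDispatchSketch and its helpers quoteOnly and escapeOnly are hypothetical stand-ins, and I am assuming the quoting branch always wraps the value and doubles embedded quote characters, while the escaping branch only prefixes the delimiter and the escape character.

```java
public class PrintDispatchSketch {

    // Hypothetical stand-in for the quoting branch: wraps the value in the
    // quote character and doubles embedded quotes. The escape character is
    // never consulted here.
    static String quoteOnly(String value, char quote) {
        StringBuilder sb = new StringBuilder().append(quote);
        for (char ch : value.toCharArray()) {
            if (ch == quote) {
                sb.append(quote); // double the embedded quote
            }
            sb.append(ch);
        }
        return sb.append(quote).toString();
    }

    // Hypothetical stand-in for the escaping branch: prefixes the delimiter
    // and the escape character itself with the escape character.
    static String escapeOnly(String value, char delimiter, char escape) {
        StringBuilder sb = new StringBuilder();
        for (char ch : value.toCharArray()) {
            if (ch == delimiter || ch == escape) {
                sb.append(escape);
            }
            sb.append(ch);
        }
        return sb.toString();
    }

    // Mirrors the if/else-if chain: once a quote character is set, the
    // escaping branch is unreachable.
    static String print(String value, Character quote, Character escape, char delimiter) {
        if (quote != null) {
            return quoteOnly(value, quote);
        } else if (escape != null) {
            return escapeOnly(value, delimiter, escape);
        }
        return value;
    }

    public static void main(String[] args) {
        // Quote '*' and escape '/' are both set, as in the reported test.
        System.out.println(print("bob/*", '*', '/', ','));
    }
}
```

With both characters set, the '/' in the value is written out unchanged, directly in front of the doubled quote.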
However, in the CSVParser, characters are unescaped inside quoted values:
{code}
private Token parseEncapsulatedToken(final Token token) throws IOException {
    // save current line number in case needed for IOE
    final long startLineNumber = getCurrentLineNumber();
    int c;
    while (true) {
        c = reader.read();
        if (isEscape(c)) {
            final int unescaped = readEscape();
            if (unescaped == Constants.END_OF_STREAM) { // unexpected char after escape
                token.content.append((char) c).append((char) reader.getLastChar());
            } else {
                token.content.append((char) unescaped);
            }
        } else if (isQuoteChar(c)) {
...
{code}
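The effect on a value written by the quote-only path can be sketched as follows. Again, this is a simplified model and not the library's lexer: EncapsulatedParseSketch and parseQuoted are hypothetical names, and I am assuming that an escape character inside quotes always consumes the next character, and that a doubled quote stands for one literal quote.

```java
public class EncapsulatedParseSketch {

    // Reads one quoted value from the start of input and returns its content.
    // The escape check comes before the quote check, as in the parser excerpt.
    static String parseQuoted(String input, char quote, char escape) {
        if (input.isEmpty() || input.charAt(0) != quote) {
            throw new IllegalArgumentException("value must start with the quote character");
        }
        StringBuilder content = new StringBuilder();
        int i = 1;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == escape && i + 1 < input.length()) {
                // The escape swallows the next character, even a quote.
                content.append(input.charAt(i + 1));
                i += 2;
            } else if (c == quote) {
                if (i + 1 < input.length() && input.charAt(i + 1) == quote) {
                    content.append(quote); // doubled quote is a literal quote
                    i += 2;
                } else {
                    return content.toString(); // closing quote
                }
            } else {
                content.append(c);
                i++;
            }
        }
        throw new IllegalStateException("EOF reached before the quoted value was closed");
    }

    public static void main(String[] args) {
        // "*bob/***" is what a quote-only writer emits for the value bob/*
        // when the quote character is '*'.
        try {
            System.out.println(parseQuoted("*bob/***", '*', '/'));
        } catch (IllegalStateException e) {
            System.out.println("parse failed: " + e.getMessage());
        }
    }
}
```

In this model the '/' pairs with the first '*' of the doubled quote, the remaining two '*' are then read as another doubled quote, and no closing quote is left, so the value never terminates.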
> CSVFormat cannot read its own output if input contain escape character
> followed by quote character
> --------------------------------------------------------------------------------------------------
>
> Key: CSV-200
> URL: https://issues.apache.org/jira/browse/CSV-200
> Project: Commons CSV
> Issue Type: Bug
> Components: Parser
> Affects Versions: 1.4
> Reporter: Vladimir Eatwell
>
> I can format CSV using CSVFormat that CSVFormat subsequently cannot parse.
> The test below illustrates the failure:
> {code}
> import org.apache.commons.csv.CSVFormat;
> import org.apache.commons.csv.CSVRecord;
> import org.apache.commons.csv.QuoteMode;
> import org.junit.Test;
> import java.io.StringReader;
> import java.util.List;
> public class CSVFormatTest {
>     @Test
>     public void parseFailure() throws Exception {
>         CSVFormat formatter = CSVFormat.DEFAULT;
>         formatter = formatter.withDelimiter(',');
>         formatter = formatter.withQuote('*');
>         formatter = formatter.withEscape('/');
>         formatter = formatter.withNullString("NULL");
>         formatter = formatter.withIgnoreSurroundingSpaces(true);
>         formatter = formatter.withQuoteMode(QuoteMode.MINIMAL);
>         String formatted = formatter.format("bob/*", "token");
>         List<CSVRecord> parsed = formatter.parse(new StringReader(formatted)).getRecords();
>         for (CSVRecord record : parsed) {
>             System.out.println(record.size());
>         }
>     }
> }
> {code}