[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033891#comment-14033891
 ] 

Thomas Neidhart commented on CSV-35:
------------------------------------

Right now the lexer does not use the record separator(s) specified in the 
format to be parsed.

In the MySQL example, "\n" (LF) is the record separator.

The record looks as follows:
3;Value\r
\\nwith a line break,c\n

The CRLF sequence is escaped, so its \n must not be treated as a record 
separator. Only the second, unescaped \n finishes the record.

So I would suggest the following:

 * a format supports multiple record separators, e.g. \n, \r, or \r\n
 * the lexer uses the record separators defined for the format
 * an escape character indicates that the following character cannot be used 
as a record separator
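
To make the three points concrete, here is a minimal sketch of the splitting 
rule. It is not the Commons CSV lexer: the class RecordSplitter and its 
methods are hypothetical names used only for illustration. It also assumes 
that an escape in front of a multi-character separator such as \r\n protects 
the whole sequence, which goes slightly beyond "the following character".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of escape-aware record splitting; not the Commons CSV lexer. */
public class RecordSplitter {

    /**
     * Splits the input into raw records using the separators configured for the format.
     * An escaped character (or an escaped separator sequence) is copied verbatim,
     * escape included, and never terminates a record. Unescaping is left to a later stage.
     */
    public static List<String> split(String input, char escape, String... separators) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == escape && i + 1 < input.length()) {
                // Assumption: an escape protects a whole separator sequence if one follows,
                // otherwise just the next character.
                String escaped = longestMatch(input, i + 1, separators);
                int len = (escaped != null) ? escaped.length() : 1;
                current.append(input, i, i + 1 + len);
                i += 1 + len;
                continue;
            }
            String sep = longestMatch(input, i, separators);
            if (sep != null) {
                // Unescaped separator: the current record is complete.
                records.add(current.toString());
                current.setLength(0);
                i += sep.length();
            } else {
                current.append(c);
                i++;
            }
        }
        if (current.length() > 0) {
            // Trailing record without a final separator.
            records.add(current.toString());
        }
        return records;
    }

    /** Returns the longest configured separator starting at pos, or null if none matches. */
    private static String longestMatch(String input, int pos, String[] separators) {
        String best = null;
        for (String sep : separators) {
            if (!sep.isEmpty() && input.startsWith(sep, pos)
                    && (best == null || sep.length() > best.length())) {
                best = sep;
            }
        }
        return best;
    }
}
```

With '\\' as the escape character and "\n" as the only separator, a call such 
as RecordSplitter.split("value1;value2;value3a\\\nvalue3b\n", '\\', "\n") 
keeps the escaped line break inside a single record. Passing "\r\n", "\n" and 
"\r" together lets one format accept all three terminators, with the longest 
match winning so that \r\n is not counted as two separators.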


> Escaped line separators are not supported
> -----------------------------------------
>
>                 Key: CSV-35
>                 URL: https://issues.apache.org/jira/browse/CSV-35
>             Project: Commons CSV
>          Issue Type: Bug
>            Reporter: Emmanuel Bourg
>             Fix For: 1.0
>
>         Attachments: mysql-export-line-terminated-by-crlf.csv, 
> mysql-export-line-terminated-by-lf.csv
>
>
> Commons CSV doesn't handle escaped line separators, for example:
> {code}
> value1;value2;value3a\
> value3b
> {code}
> In this case the expected result is:
> {code}["value1", "value2", "value3a\nvalue3b"]{code}
> This kind of escaping is produced by MySQL, whether the field enclosing is 
> enabled or not. It's possible to see enclosing quotes and escaped line 
> separators like this:
> {code}
> "value1";"value2";"value3a\
> value3b"
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
