[jira] [Commented] (CSV-35) Escaped line separators are not supported

2014-07-07 Thread Benedikt Ritter (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053910#comment-14053910
 ] 

Benedikt Ritter commented on CSV-35:


I've asked the ML to comment on this fix.

 Escaped line separators are not supported
 -

 Key: CSV-35
 URL: https://issues.apache.org/jira/browse/CSV-35
 Project: Commons CSV
  Issue Type: Bug
Reporter: Emmanuel Bourg
 Fix For: 1.0

 Attachments: CSV-35.patch, commons-csv CSV-35 escapeCRLFOnce 
 test.patch, commons-csv CSV-35 escapeCRLFOnce.patch, 
 mysql-export-line-terminated-by-crlf.csv, 
 mysql-export-line-terminated-by-lf.csv


 Commons CSV doesn't handle escaped line separators, for example:
 {code}
 value1;value2;value3a\
 value3b
 {code}
 In this case the expected result is:
 {code}[value1, value2, value3a\nvalue3b]{code}
 This kind of escaping is produced by MySQL, whether the field enclosing is 
 enabled or not. It's possible to see enclosing quotes and escaped line 
 separators like this:
 {code}
 "value1";"value2";"value3a\
 value3b"
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CSV-35) Escaped line separators are not supported

2014-06-29 Thread Benedikt Ritter (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047217#comment-14047217
 ] 

Benedikt Ritter commented on CSV-35:


[~tn] go ahead and commit the patch yourself ;-)



[jira] [Commented] (CSV-35) Escaped line separators are not supported

2014-06-18 Thread Thomas Neidhart (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035579#comment-14035579
 ] 

Thomas Neidhart commented on CSV-35:


This issue is related to CSV-102.
The patch there adds support for custom record separators when parsing, similar 
to my patch.

However, I think that the ExtendedBufferedReader should not be changed, as 
CR/LF is used there only to track the line number of the parsed file for error 
handling / debug information.



[jira] [Commented] (CSV-35) Escaped line separators are not supported

2014-06-17 Thread Thomas Neidhart (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033891#comment-14033891
 ] 

Thomas Neidhart commented on CSV-35:


Right now the lexer does not use the record separator(s) specified in the 
format to be parsed.

In the MySQL example, LF (\n) is the record separator.

The record looks as follows:
3;Value\r
\\nwith a line break,c\n

The CRLF sequence is escaped so that its \n is not treated as a record 
separator; the second \n then ends the record.

So I would suggest that:

 * a format supports multiple record separators, e.g. \n, \r, or \r\n
 * the lexer uses the record separators defined for the format
 * an escape character indicates that the following character cannot be used 
as a record separator
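
A minimal sketch of the three suggestions above, written in Python for brevity. It is not the attached patch and not the Commons CSV lexer; the function name split_records and its parameters are hypothetical. It only splits the input into raw records and leaves the escape pairs in place for a later unescaping step.

```python
def split_records(text, separators=("\r\n", "\n", "\r"), escape="\\"):
    """Split text into raw records using the format's record separators.

    An escaped character is never allowed to start a record separator.
    """
    # Try longer separators first so that "\r\n" wins over a lone "\r".
    ordered = sorted(separators, key=len, reverse=True)
    records, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == escape and i + 1 < len(text):
            # Keep the escape pair as-is; unescaping happens later.
            current.append(text[i:i + 2])
            i += 2
            continue
        sep = next((s for s in ordered if text.startswith(s, i)), None)
        if sep is not None:
            # An unescaped separator closes the current record.
            records.append("".join(current))
            current = []
            i += len(sep)
            continue
        current.append(ch)
        i += 1
    if current:
        records.append("".join(current))
    return records
```

With separators=("\n",), the record quoted above (a raw CR, then an escaped LF, then a terminating LF) comes back as a single record, because the escape protects exactly one following character.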




[jira] [Commented] (CSV-35) Escaped line separators are not supported

2014-06-17 Thread Gary Gregory (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033941#comment-14033941
 ] 

Gary Gregory commented on CSV-35:
-

It seems like a bug not to use the format's record separator.



[jira] [Commented] (CSV-35) Escaped line separators are not supported

2014-06-17 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034404#comment-14034404
 ] 

Sebb commented on CSV-35:
-

Gary: I suspect the record separator was originally intended for output only.

Thomas: I agree. 
However there is a possibility that the record separator (RS) could contain the 
escape character.
How should it handle that case?
I suspect this should be disallowed, as it will cause issues.

In the case of the MySQL examples:
If the escape char is set to '\' , then if the input is unescaped before 
checking for the RS, it would be possible to parse the input OK, by choosing 
RS=LF or RS=CRLF. i.e. there is no need to use the escape character in the RS 
because the unescaping is done first. This should of course be tested ...

If one checks for RS before unescaping, then it would not be possible to escape 
the RS sequence.
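
The difference between the two orderings can be sketched as follows. This is an illustration in Python, not Commons CSV code; rs_first, unescape and unescape_first are hypothetical names, and the escape character is assumed to be a backslash.

```python
ESC = "\\"

def unescape(raw):
    """Drop each escape character and keep the character that follows it."""
    out, i = [], 0
    while i < len(raw):
        if raw[i] == ESC and i + 1 < len(raw):
            out.append(raw[i + 1])
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)

def rs_first(text, rs):
    """Check for RS before unescaping: split blindly, then unescape."""
    return [unescape(r) for r in text.split(rs) if r]

def unescape_first(text, rs):
    """Unescape while scanning; only unescaped input is compared with RS."""
    records, cur, i = [], [], 0
    while i < len(text):
        if text[i] == ESC and i + 1 < len(text):
            cur.append(text[i + 1])  # escaped char is data, never part of RS
            i += 2
        elif text.startswith(rs, i):
            records.append("".join(cur))
            cur = []
            i += len(rs)
        else:
            cur.append(text[i])
            i += 1
    if cur:
        records.append("".join(cur))
    return records
```

For an input holding an escaped LF inside the first record, rs_first with RS=LF cuts that record in two and leaves a dangling escape character, while unescape_first keeps it whole. The same scanner also keeps an escaped CR followed by LF inside the field when RS=CRLF, since the escaped CR can no longer start the separator.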



[jira] [Commented] (CSV-35) Escaped line separators are not supported

2014-03-27 Thread Gary Gregory (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949990#comment-13949990
 ] 

Gary Gregory commented on CSV-35:
-

This looks to me like the main serious issue remaining before 1.0. There is 
also [CSV-58].

Handling MySQL exports sounds pretty basic. We already have 
{{CSVFormat.MYSQL}}, so we are telling the world we know how to do MySQL...

Gary



[jira] [Commented] (CSV-35) Escaped line separators are not supported

2014-01-23 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879966#comment-13879966
 ] 

Sebb commented on CSV-35:
-

Interesting.
Do you have to tell MySQL what the EOL is when reloading from the CSV file?
Or does it work this out for itself? This could be tricky if there are CR/LF at 
the end of the record.

So long as the CSV code knows whether to treat LF or CRLF as the (only) line 
terminator it should be easy to parse this format.

For EOL=LF, only LF needs to be escaped on output, and only an unescaped LF 
acts as an EOL on parse.
For EOL=CRLF, in theory either (or both) need to be escaped on output; only if 
CR and LF are both unescaped is EOL detected on parse.
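
The output side of these two rules might look like the sketch below (Python, for illustration only; escape_value and write_record are hypothetical names, not Commons CSV methods). For EOL=CRLF it takes the conservative choice of escaping both CR and LF. Escaping the delimiter and the escape character itself is an added assumption so that the output stays unambiguous.

```python
ESC = "\\"

def escape_value(value, eol, delimiter=";"):
    """Escape every character of the EOL sequence found in a value.

    The escape character and the delimiter are escaped as well
    (an assumption of this sketch).
    """
    special = set(eol) | {ESC, delimiter}
    return "".join(ESC + ch if ch in special else ch for ch in value)

def write_record(values, eol, delimiter=";"):
    """Join the escaped values and terminate the record with an unescaped EOL."""
    return delimiter.join(escape_value(v, eol, delimiter) for v in values) + eol
```

With eol="\n" a CR inside a value is written untouched and only the LF gets an escape; with eol="\r\n" both characters are escaped, so no unescaped CR LF pair can appear inside a field.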

However, if one wants to support a variable EOL - or the EOL is not known at 
the start - it quickly becomes very tricky to parse.

It would be interesting to know how MySQL handles \CR\LF  and LF\CR as input in 
the CRLF case.

Also, how does it handle a bare CR on output?




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CSV-35) Escaped line separators are not supported

2014-01-21 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877640#comment-13877640
 ] 

Sebb commented on CSV-35:
-

I would be OK with adding an option to support this, but it should not be 
called anything to do with MySQL.

It so happens that MySQL is known to generate the escape sequence, but other 
CSV exports may do so as well, so I think the option name should relate only to 
the functionality. The Javadoc can mention that the option may be needed for 
MySQL parsing.

For example the option could be called:   withEscapeCRLF(boolean).
Default should be false (i.e. only escape the CR).
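
A rough sketch of how such a switch could behave on the parsing side, in Python for brevity. The flag escape_crlf mirrors the proposed withEscapeCRLF(boolean), which is only a suggestion in this thread and not an existing API; parse_records is a hypothetical name. CR, LF and CRLF are all treated as end of record here.

```python
def parse_records(text, escape_crlf=False, escape="\\"):
    """Split text into unescaped records.

    escape_crlf=False: the escape covers only the next character.
    escape_crlf=True:  an escape in front of CR LF covers both characters.
    """
    records, cur, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == escape and i + 1 < len(text):
            if escape_crlf and text.startswith("\r\n", i + 1):
                cur.append("\r\n")  # whole CR LF pair becomes field data
                i += 3
            else:
                cur.append(text[i + 1])
                i += 2
        elif ch in "\r\n":
            # CR, LF and CRLF all end a record.
            i += 2 if text.startswith("\r\n", i) else 1
            records.append("".join(cur))
            cur = []
        else:
            cur.append(ch)
            i += 1
    if cur:
        records.append("".join(cur))
    return records
```

With the default, an escaped CR followed by LF yields a field ending in CR and the LF ends the record. With escape_crlf=True the same input keeps CR LF inside the field.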



[jira] [Commented] (CSV-35) Escaped line separators are not supported

2014-01-20 Thread Gary Gregory (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877129#comment-13877129
 ] 

Gary Gregory commented on CSV-35:
-

So, not addressing this means that we cannot deal with MySQL exports? That 
seems harsh (for users).

It sounds like, for users, the way to tell [csv] about this is with a 
withMySQLEol(boolean) option? Which would be a special case as Sebb mentioned.



[jira] [Commented] (CSV-35) Escaped line separators are not supported

2013-08-01 Thread Benedikt Ritter (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726343#comment-13726343
 ] 

Benedikt Ritter commented on CSV-35:


Do we want to fix the parsing behavior for MySQL exports? I think we don't 
need to fix this for 1.0. MySQL is just one format among many.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CSV-35) Escaped line separators are not supported

2013-06-30 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696333#comment-13696333
 ] 

Sebb commented on CSV-35:
-

As already noted, this would require special-casing, and will mean one can no 
longer parse or generate escCR followed by plain LF.

At the very least, this needs to be carefully documented to avoid surprises (and 
complaints).



[jira] [Commented] (CSV-35) Escaped line separators are not supported

2013-06-28 Thread Emmanuel Bourg (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13695422#comment-13695422
 ] 

Emmanuel Bourg commented on CSV-35:
---

I rechecked the output of MySQL and it does indeed produce escCRLF and not 
escCRescLF. I think we should handle that case and extend the escaping to 
the second new line character.



[jira] [Commented] (CSV-35) Escaped line separators are not supported

2013-06-24 Thread Sebb (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692015#comment-13692015
 ] 

Sebb commented on CSV-35:
-

Yes - see my comment dated 29/Mar/12 16:13.

i.e. the escape char only affects the subsequent character.

I suppose escCRLF could be special-cased if it is likely to be needed.

However, how does one then support the current behaviour - again, if there is a 
use case for it?
There could be a switchable option, but that would only work for the complete 
file.



[jira] [Commented] (CSV-35) Escaped line separators are not supported

2012-03-29 Thread Sebb (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241295#comment-13241295
 ] 

Sebb commented on CSV-35:
-

The Lexer does currently (r1036896) handle escLF and escCR.

The code currently treats escCRLF as escCR followed by LF. The LF is 
handled as EOL.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CSV-35) Escaped line separators are not supported

2012-03-23 Thread Sebb (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236649#comment-13236649
 ] 

Sebb commented on CSV-35:
-

What should happen in the case of escapeCRLF?
I presume only the CR should be subject to the escape.
If an application wants to include CRLF in a field, then the application should 
generate escapeCRescapeLF.
