Very good arguments (as always), Sebb. I'd also be OK with leaving it as is until
we have a user with a good reason for changing the send/create behavior.


And thanks for including the author of the quote. Going through his Wikipedia 
page, lots of things to read later.

Bruno


________________________________
From: sebb <seb...@gmail.com>
To: Commons Developers List <dev@commons.apache.org>; Bruno P. Kinoshita 
<brunodepau...@yahoo.com.br> 
Sent: Thursday, 23 August 2018 11:23 AM
Subject: Re: [CSV] Inconsistent record separator behavior



On 23 August 2018 at 00:01, Bruno P. Kinoshita
<brunodepau...@yahoo.com.br.invalid> wrote:
>
>>Maybe I'm just not getting it, but it feels pretty messed up :-)
>
>
> Mutual feeling, and +1 for consistency. From what I understood, users should
> be able to parse these crazy CSVs, but if they tried to re-create them with
> comments, they wouldn't be able to avoid the println/newline (so the output
> wouldn't be parseable later with the same reader).
>
>
> We probably need a ticket to aggregate the discussion and, possibly, a
> solution.

I'm wondering whether we need to be as flexible when *creating* the CSV files.

"Be liberal in what you accept, and conservative in what you send" (Jon Postel)

In this case send == create, as the file might be sent to other, less liberal readers.

I don't have a problem with the output being less flexible, so long as
it is sufficiently flexible (which I think it likely is already).

I don't think consistency is necessary - or even desirable - here.
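
A minimal sketch of that asymmetry, for illustration only: the reader accepts
whatever record separator the caller names (such as the | example mentioned
further down the thread), while the writer always emits CRLF. The names
PostelSketch, parse and write are hypothetical and are not part of Commons CSV;
quoting and escaping are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration; not the Commons CSV API.
class PostelSketch {

    // Liberal reader: split on whatever record separator the caller names.
    // No quoting or escaping is handled in this sketch.
    static List<String[]> parse(String input, String recordSeparator, char delimiter) {
        List<String[]> records = new ArrayList<>();
        for (String record : input.split(Pattern.quote(recordSeparator))) {
            if (!record.isEmpty()) {
                records.add(record.split(Pattern.quote(String.valueOf(delimiter)), -1));
            }
        }
        return records;
    }

    // Conservative writer: every record is terminated by CRLF, nothing else.
    static String write(List<String[]> records, char delimiter) {
        StringBuilder out = new StringBuilder();
        for (String[] record : records) {
            out.append(String.join(String.valueOf(delimiter), record)).append("\r\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String[]> records = parse("a,b|c,d|", "|", ',');
        System.out.print(write(records, ','));
    }
}
```

With this shape, input such as a,b|c,d| parses into two records and is written
back as a,b and c,d, each terminated by CRLF.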

> Cheers
>
> ________________________________
> From: Benedikt Ritter <brit...@apache.org>
> To: Commons Developers List <dev@commons.apache.org>; 
> brunodepau...@yahoo.com.br
> Sent: Thursday, 23 August 2018 7:10 AM
> Subject: Re: [CSV] Inconsistent record separator behavior
>
>
>
> Hi Bruno,
>
> On Wed, 22 Aug 2018 at 15:10, Bruno P. Kinoshita
> <brunodepau...@yahoo.com.br.invalid> wrote:
>
>> Hi,
>>
>>
>> Will try to look at the code and give a better answer during the weekend.
>> But risking a silly question, would it mean that users are not able to
>> parse a CSV unless each CSV row is separated by LF or CRLF?
>
>
> Yes.
>
>
>> I remember getting a CSV from a government website some time ago that was
>> formatted in a very strange way. If I remember correctly, it was a small
>> file, but without LF or CRLF. I think it was using | to separate the rows,
>> and , for columns.
>>
>
> I didn't know that there are formats that don't use a new line as line
> separator.
>
>
>>
>>
>> A quick search turned up at least one other person with a similar issue:
>> https://stackoverflow.com/questions/29903202/how-to-read-csv-on-python-with-newline-separator
>>
>>
>> Not sure if I understood the problem well, but in case it makes sense...
>> my suggestion would be to perhaps confirm if we could change
>> CSVPrinter.printComment to accept other characters for line ending?
>>
>
> The inconsistency I'm seeing is that, on the one hand, we accept any
> character sequence as a record separator. Comments are, in a way, like special
> records to me. But our implementation seems to put them on a new "line"
> using the println() method. The println() method in turn uses the record
> separator to start a new record, so it's not necessarily a new line.
> Nevertheless, while processing a comment, we look out for CR and LF and then
> call println() again. Maybe I'm just not getting it, but it feels pretty
> messed up :-)
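
To make the behavior described above concrete, here is a minimal,
self-contained sketch. It is not the Commons CSV source: CommentSketch and its
printComment signature are hypothetical stand-ins that only mimic the logic as
described in this thread, where each line break inside a comment is replaced by
the configured record separator followed by a fresh comment marker.

```java
// Hypothetical stand-in for the printer discussed in the thread;
// it mirrors the described logic, not the actual library code.
class CommentSketch {

    static String printComment(String comment, char commentMarker, String recordSeparator) {
        StringBuilder out = new StringBuilder();
        out.append(commentMarker).append(' ');
        for (int i = 0; i < comment.length(); i++) {
            char c = comment.charAt(i);
            if (c == '\r' || c == '\n') {
                // Treat a CR immediately followed by LF as a single line break.
                if (c == '\r' && i + 1 < comment.length() && comment.charAt(i + 1) == '\n') {
                    i++;
                }
                // The "println()" step: emit the record separator, which need
                // not be a newline, then start the next comment line.
                out.append(recordSeparator);
                out.append(commentMarker).append(' ');
            } else {
                out.append(c);
            }
        }
        // Final "println()" that closes the comment.
        out.append(recordSeparator);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(printComment("first\nsecond", '#', "!"));
    }
}
```

With ! as the record separator, the two-line comment comes out as
# first!# second!, so no actual line break is written.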
>
> Regards,
> Benedikt
>
>
>
>>
>>
>> Thanks!
>>
>> Bruno
>>
>>
>> ________________________________
>> From: Benedikt Ritter <brit...@apache.org>
>> To: Commons Developers List <dev@commons.apache.org>
>> Sent: Tuesday, 21 August 2018 7:13 PM
>> Subject: [CSV] Inconsistent record separator behavior
>>
>>
>>
>> Hi,
>>
>>
>> we have this strange handling of record separator / line endings in CSV:
>>
>>
>> Users can use whatever character sequence they like as a record separator.
>> I could for example use the ! character to mark the end of a record.
>> Then we have CSVPrinter.printComment(String). This inserts comments into a
>> CSV output. It detects CRLF and calls println() on the CSVFormat, which in
>> turn uses the record separator to indicate a new record...
>>
>>
>> So now I'm thinking: Does it make sense to use anything else but LF or CRLF
>> as record separator? Maybe we should deprecate
>> CSVFormat.recordSeparator(String) and introduce a LineEnding enum where
>> users can choose between LF and CRLF. This way we can make the behavior
>> between parsing and printing consistent.
>>
>>
>> Thoughts?
>>
>> Benedikt
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org