>Maybe I'm just not getting it, but it feels pretty messed up :-)
Mutual feeling, and +1 for consistency.

From what I understood, users should be able to parse these crazy CSVs, but if they tried to re-create them with comments, they wouldn't be able to avoid the println/newline (so the output wouldn't be parseable later with the same reader).

We probably need a ticket to aggregate the discussion and maybe a possible solution.

Cheers

________________________________
From: Benedikt Ritter <brit...@apache.org>
To: Commons Developers List <dev@commons.apache.org>; brunodepau...@yahoo.com.br
Sent: Thursday, 23 August 2018 7:10 AM
Subject: Re: [CSV] Inconsistent record separator behavior

Hi Bruno,

On Wed, 22 Aug 2018 at 15:10, Bruno P. Kinoshita <brunodepau...@yahoo.com.br.invalid> wrote:

> Hi,
>
> Will try to look at the code and give a better answer during the weekend.
> But risking a silly question, would it mean that users are not able to
> parse a CSV unless each CSV row is separated by LF or CRLF?

Yes.

> I remember getting a CSV in a government website some time ago that was
> formatted in a very strange way, and if I remember well it was a small
> file, but without LF or CRLF. I think it was using | to separate the rows,
> and , for columns.

I didn't know that there are formats that don't use a new line as line separator.

> Quick search returned at least another person with similar issue
> https://stackoverflow.com/questions/29903202/how-to-read-csv-on-python-with-newline-separator
>
> Not sure if I understood the problem well, but in case it makes sense...
> my suggestion would be to perhaps confirm if we could change
> CSVPrinter.printComment to accept other characters for line ending?

The inconsistency I'm seeing is that we, on the one hand, accept any character sequence as a record separator. Comments are, in a way, like special records to me. But our implementation seems to put them on a new "line" using the println() method. The println() method in turn uses the record separator to start a new record. So it's not necessarily a new line. Nevertheless, while processing a comment, we look out for CR and LF and then we call println() again.

Maybe I'm just not getting it, but it feels pretty messed up :-)

Regards,
Benedikt

> Thanks!
>
> Bruno
>
> ________________________________
> From: Benedikt Ritter <brit...@apache.org>
> To: Commons Developers List <dev@commons.apache.org>
> Sent: Tuesday, 21 August 2018 7:13 PM
> Subject: [CSV] Inconsistent record separator behavior
>
> > Hi,
> >
> > we have this strange handling of record separator / line endings in CSV:
> > Users can use whatever character sequence they like as a record separator.
> > I could for example use the ! character to mark the end of a record.
> > Then we have CSVPrinter.printComment(String). This inserts comments into a
> > CSV output. It detects CRLF and calls println() on the CSVFormat, which in
> > turn uses the record separator to indicate a new record...
> >
> > So now I'm thinking: Does it make sense to use anything else but LF or CRLF
> > as record separator? Maybe we should deprecate
> > CSVFormat.recordSeparator(String) and introduce a LineEnding enum where
> > users can choose between LF and CRLF. This way we can make the behavior
> > between parsing and printing consistent.
> >
> > Thoughts?
> > Benedikt
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
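
For reference, the behaviour discussed in this thread can be illustrated with a small sketch in plain Java. It does not use Commons CSV: SketchPrinter and all of its methods are hypothetical stand-ins that only mimic the behaviour as described in the thread (println() writes the configured record separator, and printComment() calls println() whenever it sees CR, LF or CRLF). The real implementation may well differ in the details.

```java
/**
 * Hypothetical stand-in for a CSV printer. This is NOT the Commons CSV
 * implementation; it only mimics the behaviour described in the thread.
 */
final class SketchPrinter {
    private final StringBuilder out = new StringBuilder();
    private final String recordSeparator;
    private final char commentMarker;

    SketchPrinter(String recordSeparator, char commentMarker) {
        this.recordSeparator = recordSeparator;
        this.commentMarker = commentMarker;
    }

    /** Ends the current record by writing the record separator (not necessarily a newline). */
    void println() {
        out.append(recordSeparator);
    }

    /** Writes the values joined by commas (no quoting or escaping), then ends the record. */
    void printRecord(String... values) {
        out.append(String.join(",", values));
        println();
    }

    /** Writes a comment; CR, LF and CRLF in the text each start a new comment "line" via println(). */
    void printComment(String comment) {
        out.append(commentMarker).append(' ');
        for (int i = 0; i < comment.length(); i++) {
            char c = comment.charAt(i);
            if (c == '\r' || c == '\n') {
                // Treat CRLF as a single line break.
                if (c == '\r' && i + 1 < comment.length() && comment.charAt(i + 1) == '\n') {
                    i++;
                }
                println();
                out.append(commentMarker).append(' ');
            } else {
                out.append(c);
            }
        }
        println();
    }

    String output() {
        return out.toString();
    }

    /** Prints a two-line comment and one record using '!' as the record separator. */
    static String demo() {
        SketchPrinter printer = new SketchPrinter("!", '#');
        printer.printComment("first line\nsecond line");
        printer.printRecord("a", "b");
        return printer.output();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With "!" as the record separator, demo() returns "# first line!# second line!a,b!". The LF inside the comment is detected as a line break but written out as "!", so comment handling looks for CR/LF on input while emitting the record separator on output, which is the inconsistency raised above.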