On 30.09.2018 at 12:42, Nikita Popov wrote:

> On Thu, Sep 27, 2018 at 12:29 PM Christoph M. Becker <cmbecke...@gmx.de>
> wrote:
> 
>> I hereby put the “Kill proprietary CSV escaping mechanism” under
>> discussion:
>>
>> <https://wiki.php.net/rfc/kill-csv-escaping>
>>
>> Any comments are welcome!
> 
> Could you please add a description of how the escaping mechanism currently
> works for read and write? My vague recollection is that the write and read
> behavior actually have nothing to do with each other and the fputcsv
> $escape parameter would be better described as the $corrupt parameter. We
> may want to treat both cases in different ways.

It's hard for me to describe something whose sense escapes me.  How can
two competing escaping mechanisms work at the same time?

Anyhow, from looking at the implementation, the write case (fputcsv)[1]
is pretty clear:

  * if a field contains an escape character, it is enclosed with the
enclosure character
  * if an enclosure character is preceded by an escape character in a
field, both are written verbatim (i.e. the enclosure character is not
doubled)
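
A minimal sketch of these two write rules, assuming the default delimiter
(,), enclosure (") and escape (\) characters are passed explicitly.  The
helper name csv_line() is made up for illustration, and the commented
results are what the rules above predict; other PHP versions may differ:

```php
<?php
// Hypothetical helper: write one record with fputcsv() into an
// in-memory stream and return the resulting line without its newline.
function csv_line(array $fields, string $escape = "\\"): string
{
    $stream = fopen('php://memory', 'w+');
    fputcsv($stream, $fields, ',', '"', $escape);
    rewind($stream);
    $line = stream_get_contents($stream);
    fclose($stream);
    return rtrim($line, "\n");
}

// Field a\b: contains the escape character, so it gets enclosed.
echo csv_line(['a\\b']), "\n";    // "a\b"

// Field a"b: a plain enclosure character is doubled as usual.
echo csv_line(['a"b']), "\n";     // "a""b"

// Field a\"b: the enclosure follows an escape character, so both are
// written verbatim and the enclosure is not doubled.
echo csv_line(['a\\"b']), "\n";   // "a\"b"
```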

The implementation of the CSV reading[2] is way more complex, but
appears to be broken anyway.  For instance:

  $str = '"\\\\"a"';
  print_r(str_getcsv($str));

outputs:

  Array
  (
      [0] => \\a"
  )

Note that $str could have been produced by fputcsv($stream, ['\\\\"a']).
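
To check that claim, here is a small round-trip sketch.  It assumes the
default delimiter, enclosure and escape characters (passed explicitly), and
uses a php://memory stream only to capture what fputcsv writes; results on
PHP versions other than the one referenced below may differ:

```php
<?php
// The field consists of the four characters: \ \ " a
$original = '\\\\"a';

// Write the single-field record and capture the produced line.
$stream = fopen('php://memory', 'w+');
fputcsv($stream, [$original], ',', '"', "\\");
rewind($stream);
$line = rtrim(stream_get_contents($stream), "\n");
fclose($stream);

// Parse the produced line again with the same settings.
$readBack = str_getcsv($line, ',', '"', "\\");

echo "written:   ", $line, "\n";
echo "read back: ", $readBack[0], "\n";
echo "lossless:  ", var_export($readBack[0] === $original, true), "\n";
```

If the round trip were lossless, the last line would print true; with the
behavior described above, the field read back differs from the one written.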

[1]
<https://github.com/php/php-src/blob/php-7.3.0RC2/ext/standard/file.c#L1932-L1990>
[2]
<https://github.com/php/php-src/blob/php-7.3.0RC2/ext/standard/file.c#L2092-L2349>

-- 
Christoph M. Becker

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
