Package: libpython3.5-stdlib
Version: 3.5.4-2
Severity: normal

Dear Maintainer,

The csv module has a Sniffer class that tries to detect the dialect used
by a given CSV file. To do so, at some point (in the method
_guess_quote_and_delimiter), it runs several regular expressions on the
data provided. Unfortunately, the regexes are written such that the
search time can grow quadratically with the size of the input when there
is no match.

For instance on the data:
1234,"foobar"
1234,"foobar"
...
10000 lines like this

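The first regex, roughly simplified to r',".*?",', can take several
seconds on 10000 lines, growing like the square of the number of lines:
every ," is a candidate starting point, and from each one the .*? scans
to the end of the input without ever finding a closing ",.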
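
A minimal sketch to reproduce the timing, using the simplified pattern
rather than the exact Sniffer regex. It assumes the pattern is compiled
with re.DOTALL (as I believe the Sniffer does), since otherwise .*? stops
at each newline and the effect disappears. The names SIMPLIFIED,
make_data and time_search are only for illustration.

```python
import re
import time

# Simplified form of the first Sniffer regex. DOTALL is an assumption
# here: it lets .*? run across newlines, which is what makes each failed
# attempt scan the whole remaining input.
SIMPLIFIED = re.compile(r',".*?",', re.DOTALL)


def make_data(n_lines):
    """Build n_lines copies of the sample line from this report."""
    return '1234,"foobar"\n' * n_lines


def time_search(n_lines):
    """Return (match, seconds) for one search over n_lines of data."""
    data = make_data(n_lines)
    start = time.perf_counter()
    match = SIMPLIFIED.search(data)
    elapsed = time.perf_counter() - start
    return match, elapsed


if __name__ == "__main__":
    # Doubling the line count should roughly quadruple the time.
    for n in (1000, 2000, 4000):
        match, elapsed = time_search(n)
        print(n, match, "%.3f s" % elapsed)
```

The search never matches on this data because every closing quote is
followed by a newline, not by a comma.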

To avoid this, I would suggest preventing the .*? from matching an
unescaped and un-doubled quote.
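
As a rough illustration of that suggestion, applied to the simplified
pattern only (this is not a patch against the real Sniffer regexes), the
quoted part could be restricted as below. Treating the backslash as the
escape character is an assumption, and the name BOUNDED is hypothetical.

```python
import re

# Hypothetical replacement for the simplified pattern r',".*?",'.
# Between the quotes, accept only:
#   [^"\\]  any character that is neither a quote nor a backslash
#   \\.     a backslash followed by any character (an escaped character)
#   ""      a doubled quote
# so the scan stops at the first unescaped, un-doubled quote.
BOUNDED = re.compile(r',"(?:[^"\\]|\\.|"")*",', re.DOTALL)

if __name__ == "__main__":
    data = '1234,"foobar"\n' * 10000
    # Each attempt now fails at the end of its own quoted field instead
    # of scanning the rest of the input.
    print(BOUNDED.search(data))
    # A field that really is followed by the delimiter still matches.
    print(BOUNDED.search('a,"b,c",d'))
```

With this form, a failed attempt costs time proportional to the length of
one quoted field, so the total search time should grow linearly with the
number of lines.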

Best regards,
Celelibi


-- System Information:
Debian Release: buster/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.12.0-2-amd64 (SMP w/8 CPU cores)
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8), LANGUAGE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages libpython3.5-stdlib depends on:
ii  libbz2-1.0            1.0.6-8.1
ii  libc6                 2.24-17
ii  libdb5.3              5.3.28-13.1
ii  liblzma5              5.2.2-1.3
ii  libmpdec2             2.4.2-1
ii  libncursesw5          6.0+20170902-1
ii  libpython3.5-minimal  3.5.4-2
ii  libreadline7          7.0-3
ii  libsqlite3-0          3.20.1-1
ii  libtinfo5             6.0+20170902-1
ii  mime-support          3.60

libpython3.5-stdlib recommends no packages.

libpython3.5-stdlib suggests no packages.

-- no debconf information
