On Mon, Jan 13, 2014 at 07:58:43PM -0500, Terry Reedy wrote:
> This discussion strikes me as more appropriate for python-ideas. That
> said, I am leery of a heuristics module in the stdlib. When is a change
> a 'bug fix'? and when is it an 'enhancement'?

Depends on the nature of the heuristic. For example, there's a simple
"guess the encoding of text files" heuristic which uses the presence of
a BOM to pick the encoding:

- read the first four bytes in binary mode;
- if bytes 0 through 3 are 0000FEFF or FFFE0000, then the encoding is
  UTF-32 (this has to be tested before UTF-16, since FFFE0000 starts
  with FFFE);
- if bytes 0 and 1 are FEFF or FFFE, then the encoding is UTF-16;
- if bytes 0 through 2 are EFBBBF, then the encoding is UTF-8;
- if bytes 0 through 2 are 2B2F76 and byte 3 is 38, 39, 2B or 2F, then
  the encoding is UTF-7;
- otherwise the encoding is unknown.

Here, telling a bug fix from an enhancement is easy: a bug fix is (say)
correcting a BOM the code got wrong (suppose it tested for EFFF instead
of FEFF; that would be a bug), while an enhancement would be adding a
new BOM/encoding detector (say, F7644C for UTF-1).

The same would not apply to, for instance, the chardet library, where
detection is based on statistics. If the library adjusts a frequency
table, does that reflect a bug fix, an enhancement, or both?

-- 
Steven
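
The BOM heuristic described in this message can be sketched in Python
roughly as follows. This is only an illustration: the function name
guess_encoding_from_bom and the returned labels are assumptions made
for this sketch, not an existing stdlib API. It tests the UTF-32 BOMs
before the UTF-16 ones, because FFFE0000 begins with FFFE.

```python
import codecs


def guess_encoding_from_bom(head):
    """Guess an encoding from the first four bytes of a file.

    Hypothetical helper: returns an encoding label, or None if no
    known BOM is present.
    """
    # UTF-32 first: its little-endian BOM (FF FE 00 00) starts with
    # the little-endian UTF-16 BOM (FF FE).
    if head.startswith((codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # UTF-7: bytes 2B 2F 76 ("+/v") followed by 38, 39, 2B or 2F.
    if head[:3] == b"+/v" and head[3:4] in (b"8", b"9", b"+", b"/"):
        return "utf-7"
    return None  # encoding unknown
```

A caller would read the first four bytes in binary mode, for example
with open(path, "rb") and f.read(4), and pass them in. With a table
like this, a wrong byte sequence is plainly a bug, and a new branch
(say, for UTF-1) is plainly an enhancement.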