[issue42096] zipfile.is_zipfile incorrectly identifying a gzipped file as a zip archive

2020-10-27 Thread Brian Kohan
Brian Kohan added the comment: I concur with Gregory. It seems the action here is simply to make the very real possibility of false positives apparent in the docs. In my experience processing data from the wild, I see a fairly high false-positive rate of about 1 in 1,000. I'm sure the probability

[issue42096] zipfile.is_zipfile incorrectly identifying a gzipped file as a zip archive

2020-10-22 Thread Brian Kohan
Brian Kohan added the comment: Hi all, I'm experiencing the same issue. I took a look at the is_zipfile code: it seems it's not checking the start of the file for the magic numbers, but looking deeper into the file. I presume that is because the magic numbers at the start are considered unreliable for some
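
For readers who want a stricter test than is_zipfile alone, here is a minimal sketch that also requires a ZIP signature at the very start of the data. The helper names starts_with_zip_magic and looks_like_zip are hypothetical and not part of the zipfile module. The sketch assumes the input is an ordinary archive: it deliberately rejects valid archives that have other data prepended (for example, self-extracting executables), so it trades false positives for false negatives.

```python
import gzip
import io
import zipfile

# Signatures from the ZIP format: a local file header, or an
# end-of-central-directory record (all that an empty archive contains).
ZIP_LEADING_SIGNATURES = (b"PK\x03\x04", b"PK\x05\x06")


def starts_with_zip_magic(data: bytes) -> bool:
    """Hypothetical helper: True if the data begins with a ZIP signature."""
    return data.startswith(ZIP_LEADING_SIGNATURES)


def looks_like_zip(data: bytes) -> bool:
    """Hypothetical stricter check: leading magic AND zipfile.is_zipfile.

    Assumption: archives with prepended data are not expected, so they
    are rejected even though zipfile itself may be able to open them.
    """
    return starts_with_zip_magic(data) and zipfile.is_zipfile(io.BytesIO(data))


# Build a small real archive and a small gzip stream in memory.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("hello.txt", "hello")
zip_bytes = zip_buffer.getvalue()
gzip_bytes = gzip.compress(b"hello")
```

With these helpers, looks_like_zip(gzip_bytes) is rejected by the leading-bytes check before is_zipfile is ever consulted, because a gzip stream begins with the bytes 1f 8b rather than "PK".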