On Jan 8, 2020, at 01:09, Abdur-Rahmaan Janhangeer <arj.pyt...@gmail.com> wrote:
> 
> But now, a malicious program might try to modify the info file
> and modify the hash. One way to protect even the metadata is
> to hash the entire content
> 
> folder/
>     file.py # we can add those in a folder if needed
>     __main__.py
>     infofile
> 
> Then after zipping it, we hash the zipfile then append the hash to the zip 
> binary
> 
> [zipfile binary][hash value]

How does this solve the problem? A malicious program that could modify the hash 
inside the info file could even more easily modify the hash at the end of the 
zip.

Existing systems deal with this by recognizing that you can’t prevent anyone 
from hashing anything they want, so you either have to store the hashes in a 
trusted central repo or (more commonly, since it has multiple advantages) sign 
them with a trusted key. If a malicious app modifies the program and updates 
the hash to match, that’s a valid hash; there’s nothing you can do about that. 
But it won’t be the hash in the repo, or it’ll be signed by the untrusted 
author of the malicious program rather than the trusted author of the app, and 
that’s why you don’t let it run. And this works just as well for hashes 
embedded in an info file inside the zip as for hashes appended to the zip.
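
A minimal sketch of the point above, using only hashlib and hmac from the 
stdlib. The function names and the 32-byte SHA-256 trailer layout are 
assumptions made for illustration, not anything zipapp does. The "trusted" 
digest stands in for a hash fetched from a central repo; a real system would 
more likely verify a signature over the hash instead.

```python
import hashlib
import hmac

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def append_hash(payload: bytes) -> bytes:
    """Hypothetical packer: [zipfile binary][hash value]."""
    return payload + hashlib.sha256(payload).digest()


def self_check(blob: bytes) -> bool:
    """Check the blob only against the hash it carries itself."""
    payload, digest = blob[:-DIGEST_SIZE], blob[-DIGEST_SIZE:]
    return hmac.compare_digest(hashlib.sha256(payload).digest(), digest)


def trusted_check(blob: bytes, trusted_digest: bytes) -> bool:
    """Check the blob against a digest obtained from a trusted source."""
    payload = blob[:-DIGEST_SIZE]
    return hmac.compare_digest(hashlib.sha256(payload).digest(), trusted_digest)


original = append_hash(b"print('hello')")
trusted = original[-DIGEST_SIZE:]  # assumed to be published by the real author

# The attacker changes the payload and simply recomputes the trailer.
tampered = append_hash(b"print('evil')")
```

Here self_check(tampered) succeeds, because anyone can recompute the appended 
hash, while trusted_check(tampered, trusted) fails. Where the hash is stored 
(appended or inside the info file) makes no difference to that result.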

And there are advantages to putting the hash inside. For example, if you want 
to allow downstream packagers or automated systems to add distribution info 
(this is important if you want to be able to pass a second code signing 
requirement, e.g., Apple’s, as well as the zipapp one), you just have a list of 
escape patterns that say which files are allowed to be unhashed. Anything that 
appears in the info file must match its hash or the archive is invalid. 
Anything that doesn’t appear in the info file is fine as long as it matches 
the escape patterns; if it doesn’t match them, the archive is invalid. So 
downstream distributors can add extra files that match the escape patterns. 
(The escape patterns can be configurable; you just need them to be specified 
by something the hash covers. But you definitely want a default that works 
99% of the time, because if developers and packagers have to think it through 
in every case instead of only in exceptional cases, they’re going to get it 
wrong, and nobody will have any idea who to trust to get it right.)
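
The validation rule described above can be sketched roughly as follows. 
Everything named here is an assumption for illustration: the JSON layout of 
the info file, the "hashes" and "escape_patterns" keys, the default pattern, 
and treating a listed-but-missing file as invalid. The sketch also assumes the 
info file itself has already been checked against a signature or a trusted 
repo, which is not shown.

```python
import fnmatch
import hashlib
import io
import json
import zipfile

INFO_NAME = "infofile"                  # hypothetical info file name
DEFAULT_ESCAPES = ["_CodeSignature/*"]  # hypothetical default pattern


def make_archive(files, hashes, escapes=None):
    """Build an in-memory zip with an info file listing the given hashes."""
    info = {"hashes": hashes}
    if escapes is not None:
        info["escape_patterns"] = escapes
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
        zf.writestr(INFO_NAME, json.dumps(info))
    return buf.getvalue()


def verify(archive_bytes):
    """Return True if every member is either correctly hashed or escaped."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        info = json.loads(zf.read(INFO_NAME))
        hashes = info["hashes"]
        escapes = info.get("escape_patterns", DEFAULT_ESCAPES)
        names = {n for n in zf.namelist()
                 if n != INFO_NAME and not n.endswith("/")}

        # Anything listed in the info file must be present and match its hash.
        for name, expected in hashes.items():
            if name not in names:
                return False
            if hashlib.sha256(zf.read(name)).hexdigest() != expected:
                return False

        # Anything not listed must match one of the escape patterns.
        for name in names - set(hashes):
            if not any(fnmatch.fnmatchcase(name, pat) for pat in escapes):
                return False
    return True
```

With this rule, a downstream packager can add _CodeSignature/CodeResources 
without invalidating the archive, but adding an unlisted evil.py, or changing 
the bytes of a listed file, makes verify() return False.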

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QFNO6GU4QRQVPZVUP6N6DPUNXFI7Y33W/
Code of Conduct: http://python.org/psf/codeofconduct/
