Hi Dan,

Dan Price wrote:
Stephen mentioned to me today that we need a jar file hashing algorithm.
Can you explain what the requirement/use-case is for jar file hashing?
I took a quick look at what our ON "wsdiff" tool does: it breaks the jar up using 'jar xf', removes the MANIFEST.MF file (for reasons not clear to me), and then basically does a brute-force comparison of the directory trees.
How does it generate the hash though?
We could certainly do the same to generate a hash, but it's expensive;
in search of a slightly better way, I put this together as a jumping-off
point for further discussion. I wonder what people think of it? I wasn't
sure of the security implications of relying on a SHA-1 hash that is in
turn derived from metadata which relies on CRC32.
----------------- ----------------- ----------------- -----------------
import zipfile
import sha
import sys

z = zipfile.ZipFile(sys.argv[1], "r")
if z.testzip() is not None:
    print "Zip file contents do not match headers"
    sys.exit(1)

zipinfos = sorted(z.infolist(), key = lambda x: x.filename)
hashstring = ""
for zi in zipinfos:
    hashstring += zi.filename + " "
    #
    # These numbers seem to be signed upon printing even with the
    # %u, which is annoying.
    #
    hashstring += "%u " % long(zi.file_size)
    hashstring += "%u\n" % long(zi.CRC)

#print hashstring
hash = sha.new(hashstring)
print "%-38s %s" % (sys.argv[1], hash.hexdigest())
z.close()
sys.exit(0)
----------------- ----------------- ----------------- -----------------
You get output such as:
/usr/lib/krb5/gkadmin.jar              6b88c4ebc058743a51a1dac8d56d2d844ca427fc
/usr/lib/krb5/visualrt.jar             cdedce797cd984fd6752d94e0df6fdbbb9cad189
/usr/lib/patch/csmauth.jar             d592eca902e409b13183306eb9cc2b68e9d997c3
/usr/lib/patch/crl.jar                 e6c5d7f8ae51a3cc12806962cf8d886b021a20fc
/usr/lib/patch/patchpro.jar            3a70bc449736e8dbcf6d2af2c3fc8e53e5c1c3a4
Depending on your answer for the use-case/requirements of this hash, I'd be inclined to include the date_time and external_attr too. It could be possible to generate a different file with the same size and CRC in a new JAR, in which case maybe only the date would differ. Unlikely, I admit, but theoretically possible.
Other than that, the algorithm seems ok - assuming that it is fast enough of course.
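
To make the date_time/external_attr suggestion concrete, here is a rough sketch of how the extra fields might be folded into the hashed string. It is written for Python 3 with hashlib rather than the sha module used above, and the function name jar_metadata_hash and the line layout are my own illustration, not anything agreed on the list:

```python
import hashlib
import zipfile


def jar_metadata_hash(path):
    """Return a SHA-1 hex digest over sorted per-entry metadata of a jar.

    Hypothetical helper: hashes name, size, CRC, date_time and
    external_attr of every entry, never the file contents themselves.
    """
    with zipfile.ZipFile(path, "r") as z:
        # testzip() returns the name of the first corrupt member, or None.
        bad = z.testzip()
        if bad is not None:
            raise ValueError("Zip file contents do not match headers: %s" % bad)

        h = hashlib.sha1()
        # Sort by name so the order of entries in the archive is irrelevant.
        for zi in sorted(z.infolist(), key=lambda x: x.filename):
            # date_time is a 6-tuple: (year, month, day, hour, minute, second).
            stamp = "%04d-%02d-%02dT%02d:%02d:%02d" % zi.date_time
            line = "%s %d %d %s %d\n" % (
                zi.filename, zi.file_size, zi.CRC, stamp, zi.external_attr)
            h.update(line.encode("utf-8"))
        return h.hexdigest()
```

The obvious trade-off is that a jar rebuilt from identical sources would then hash differently whenever the build stamps fresh timestamps on its entries, so whether this is wanted depends on the use-case question above.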
Trev
_______________________________________________ pkg-discuss mailing list [email protected] http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
