On Wed, Nov 16, 2005 at 04:33:02AM +0100, Marius Mauch wrote: > On Wed, 16 Nov 2005 04:07:56 +0100 > Marius Mauch <[EMAIL PROTECTED]> wrote: > > > Hey, > > > > IIRC we (=Gentoo as a whole) pretty much agreed to drop the digest > > files in favor of a extended Manifest format. Well, today I wrote some > > initial code for this new format, you can find the result in the > > attachment. It's far from complete, but most basic stuff seems to work > > with some restrictions: > > - `ebuild foo digest` works if you have all SRC_URI files for the > > package (= all ebuilds of that package), however it will not verify > > the Manifest before overwriting it. Also it currently won't create > > the old digests (digestgen is a bit nasty). > > - `emerge foo` should also work (thankfully the parsing could be > > implemented in a simple compat layer) > > - repoman will likely have problems (didn't touch it yet) > > - I didn't add al the fancy exception handling and cvs code (which IMO > > shouldn't be in portage itself but repoman) > > > > Opinions about adding Manifest2 support to the next "minor" version > > (2.2.0 or 2.0.54 or 2.1.0 or ...)? I'm game, assuming it's sane.
> Index: pym/portage.py > =================================================================== > --- pym/portage.py (revision 2312) > +++ pym/portage.py (working copy) > @@ -95,6 +95,7 @@ > import portage_gpg > import portage_locks > import portage_exec > + import portage_manifest > from portage_locks import unlockfile,unlockdir,lockfile,lockdir > import portage_checksum > from portage_checksum import > perform_md5,perform_checksum,prelink_capable > @@ -1723,9 +1724,9 @@ > mymirrors += [x] > > mydigests = {} > - digestfn = mysettings["FILESDIR"]+"/digest-"+mysettings["PF"] > + digestfn = > portdb.finddigest(mysettings["CATEGORY"]+"/"+mysettings["PF"]) Like this mod personally, but don't like the '/'. You're passing in two chunks of data, pass them in seperate imo, rather then collapsing it for every call. > if os.path.exists(digestfn): > - mydigests = digestParseFile(digestfn) > + mydigests = digestParseFile(digestfn, calledFor="digest") > > fsmirrors = [] > for x in range(len(mymirrors)-1,-1,-1): > @@ -2136,8 +2137,10 @@ > # Track the old digest so we can assume checksums without > requiring > # all files to be downloaded. 'Assuming' > myolddigest = {} > - if os.path.exists(digestfn): > - myolddigest = digestParseFile(digestfn) > + if portage_manifest.getManifestVersion(manifestfn) == 2: > + myolddigests = digestParseFile(manifestfn, > calledFor="digest") > + elif os.path.exists(digestfn): > + myolddigest = digestParseFile(digestfn, > calledFor="digest") Kind of disliking the 'calledFor' being thrown around here, plus the v2 check, then resorting to a seperate method of checking. Integrate 'em potentially? You're basically treating it as two seperate funcs (from how I look at it), should split it up (even if the seperated funcs call each other, or a common func). > @@ -2218,7 +2229,7 @@ > return 1 > > > -def digestParseFile(myfilename): > +def digestParseFile(myfilename, calledFor="manifest"): > """(filename) -- Parses a given file for entries matching: > MD5 MD5_STRING_OF_HEX_CHARS FILE_NAME FILE_SIZE > Ignores lines that do not begin with 'MD5' and returns a > @@ -2230,6 +2241,10 @@ > > mydigests={} > for x in mylines: > + tmp = portage_manifest.convertManifest2ToDigest(x, > calledAs=calledFor) > + if tmp[0] == True: > + mydigests[tmp[1]] = tmp[2] > + continue > myline=string.split(x) > if len(myline) < 4: > #invalid line > @@ -2258,7 +2273,10 @@ > the basedir given. Returns 1 only if all files exist and match the md5s. > """ > for x in myfiles: > - if not mydigests.has_key(x): > + if portage_manfest.getManifestVersion(basedir+"/Manifest") == 2 > and x.startswith("files/digest-"): hardcoding sucks, kthnx If possible, try to shove it into the func so modification (down the line) occurs in one spot, rather then having to resort to search/replace :) > Index: pym/portage_checksum.py > =================================================================== > --- pym/portage_checksum.py (revision 2312) > +++ pym/portage_checksum.py (working copy) > @@ -126,3 +126,12 @@ > portage_locks.unlockfile(mylock) > > return (myhash,mysize) > + > + > +hashfunc_map = {"MD5": md5hash, "SHA1": sha1hash} > + > +def perform_multiple_checksums(filename, hashes=["MD5"], calc_prelink=0): > + rVal = {} > + for x in hashes: > + rVal[x] = perform_checksum(filename, hashfunc_map[x], > calc_prelink)[0] > + return rVal Not sure if I like the [0] tbh. Still think the size test should be broken off as it's own pseudo hashfunc, and implemented in that way. Course, would need to weight it somehow, so that size test is done first, then the more expensive chf tests... > Index: pym/portage_manifest.py > =================================================================== > --- pym/portage_manifest.py (revision 0) > +++ pym/portage_manifest.py (revision 0) > @@ -0,0 +1,163 @@ > +import os > + > +import portage, portage_exception, portage_checksum, portage_versions > + > +class Manifest2Entry: > + def __init__(self, entrytype, name, config, > methods=portage_checksum.get_valid_checksum_keys()): The methods trick you're doing is a one time evaluation... not sure if that's the best approach tbh. > + if entrytype not in ["EBUILD", "AUXFILE", "MISCFILE", "SRCURI"]: > + raise portage_exception.PortageException("unknown > Manifest entry type: %s" % entrytype) break that out into a class constant, and please kill off the capps (they kill kittens, or something). > + self.type = entrytype > + self.name = name > + self.methods = methods > + self.config = config I realize stable does this all over, but I really am not much for storing a config instance on this attribute. Thoughts on yanking the settings you need up front? > + def generate(self, pkgdir, genOldStyle=True): > + if self.type == "EBUILD" or self.type == "MISCFILE": > + filename = pkgdir+"/"+self.name > + elif self.type == "AUXFILE": > + filename = pkgdir+"/files/"+self.name > + elif self.type == "SRCURI": > + filename = self.config["DISTDIR"]+"/"+self.name > + if not os.path.exists(filename): > + raise portage_exception.PortageException("file for > Manifest entry doesn't exist: %s" % filename) > + > + self.size = os.stat(filename).st_size > + self.checksums = > portage_checksum.perform_multiple_checksums(filename, self.methods) See comment about making size into a pseudo chf func- would eliminate this case. Chunking this up into type handlers might be wise- would make it easier to extend (modify a class dict rather then N funcs). You've already got the seperated beast, just throw it into a mapping and it'll be nice and pretty. > + cs_list = [x+" "+self.checksums[x] for x in > self.checksums.keys()] > + self.entry = " ".join([self.type, self.name, > str(self.size)]+cs_list) > + if genOldStyle and self.type != "SRCURI": > + if "MD5" in self.methods: > + oldentry = Manifest2Entry._makeOldEntry(pkgdir, > filename, size=self.size, md5sum=self.checksums["MD5"]) > + else: > + oldentry = Manifest2Entry._makeOldEntry(pkgdir, > filename, size=self.size) Kinda iffy on this approach also, seems unclean (quite subjective I realize). > + self.entry += "\n"+oldentry > + return self.entry > + > + def generateOldDigestEntries(self, pkgdir, digestfiles): > + rVal = [] > + for d in digestfiles: > + filename = pkgdir+"/"+d > + if not os.path.exists(filename): > + raise > portage_exception.PortageException("digest file for Manifest compatibility > entry doesn't exist: %s" % filename) > + rVal.append(Manifest2Entry._makeOldEntry(pkgdir, > filename)) > + return rVal > + generateOldDigestEntries = classmethod(generateOldDigestEntries) > + > + def _makeOldEntry(self, pkgdir, filename, size=None, md5sum=None): > + oldname = "/".join(filename.split("/")[len(pkgdir.split("/")):]) > + if md5sum == None: > + md5sum = portage_checksum.perform_md5(filename) > + if size == None: > + size = os.stat(filename).st_size > + return " ".join(["MD5", md5sum, oldname, str(size)]) > + _makeOldEntry = classmethod(_makeOldEntry) > + > +def manifest2AuxfileFilter(filename): > + filename = filename.strip("/") > + return not (filename in ["CVS", ".svn"] or filename[:len("digest-")] == > "digest-") > + > +def manifest2MiscfileFilter(filename): > + filename = filename.strip("/") > + return not (filename in ["CVS", ".svn", "files", "Manifest"] or > filename[-7:] == ".ebuild") > + > +def verifyManifest2(cpv, config): > + return (True, None) > + > +def createFullManifest2(pkgdir, config, checkExisting=False, > requireAllSrcUriFiles=True, createOldDigests=True, manifestCompatible=True, > debug=1): > + mydb = portage.db["/"]["porttree"].dbapi Really don't like that. Pass it in. > + > + if debug: > + print "pkgdir:", pkgdir > + > + cpv_list = [pkgdir.rstrip("/").split("/")[-2]+"/"+x[:-7] for x in > portage.listdir(pkgdir) if x.endswith(".ebuild")] os.path.sep... not much for the portage.listdir reference there either, kind of an indication the evil listdir/cachedir should be broken out into their own module. > + srcuri_list = [] > + auxfile_list = [] > + miscfile_list = [] > + > + if checkExisting: > + (isOk, brokenFile) = verifyManifest2(pkgdir, config) > + if not isOk: > + return (1, brokenFile) > + for cpv in cpv_list: > + srcuri_list.extend([x.split("/")[-1] for x in mydb.aux_get(cpv, > ["SRC_URI"])[0].split()]) > + auxfile_list = filter(manifest2AuxfileFilter, > portage.listdir(pkgdir+"/files/", recursive=1, filesonly=1)) > + miscfile_list = filter(manifest2MiscfileFilter, portage.listdir(pkgdir, > recursive=0, filesonly=1)) > + > + if requireAllSrcUriFiles: > + for f in srcuri_list: > + if not os.path.exists(config["DISTDIR"]+"/"+f): > + return (2, f) > + else: > + srcuri_list = [f for f in srcuri_list if > os.path.exists(config["DISTDIR"]+"/"+f)] > + > + if debug: > + print "pkgdir:", pkgdir > + print "CPV:", cpv_list > + print "SRC_URI:", srcuri_list > + print "auxfiles:", auxfile_list > + print "miscfiles:", miscfile_list > + > + myentries = [] > + myentries += [Manifest2Entry("EBUILD", > portage_versions.catsplit(x)[1]+".ebuild", config).generate(pkgdir, > genOldStyle=manifestCompatible) for x in cpv_list] > + myentries += [Manifest2Entry("AUXFILE", x, config).generate(pkgdir, > genOldStyle=manifestCompatible) for x in auxfile_list] > + myentries += [Manifest2Entry("MISCFILE", x, config).generate(pkgdir, > genOldStyle=manifestCompatible) for x in miscfile_list] > + myentries += [Manifest2Entry("SRCURI", x, config).generate(pkgdir, > genOldStyle=manifestCompatible) for x in srcuri_list] > + > + if createOldDigests: > + # TODO > + pass > + > + if manifestCompatible: > + digestfiles = ["files/"+x for x in > portage.listdir(pkgdir+"/files/") if x.startswith("digest-")] > + myentries += Manifest2Entry.generateOldDigestEntries(pkgdir, > digestfiles) > + > + # hack to allow sorting > + myentries = ("\n".join(myentries)).split("\n") > + myentries.sort() > + > + if debug: > + for x in myentries: > + print x > + manifest = "\n".join(myentries) > + manifestfile = open(pkgdir+"/Manifest", "w") > + manifestfile.write(manifest) > + manifestfile.close() Not commenting too much on the bits above atm, but I'd personally open a tmpfile alongside the manifest and write to it directly, same for digest, mving it in afterwards (or unlinking on failure). > + > + return (0, None) Eh? Exceptions seem like a better route here then magic constants and string error msgs. > +def convertManifest2ToDigest(manifestline, calledAs="manifest"): > + mysplit = manifestline.split() > + if mysplit[0] not in ["EBUILD", "AUXFILE", "MISCFILE", "SRCURI"] or > ((len(mysplit) - 3) % 2) != 0: > + return (False, None, None) > + if mysplit[0] == "SRCURI" and calledAs == "manifest": > + return (False, None, None) > + if mysplit[0] in ["EBUILD", "AUXFILE", "MISCFILE"] and calledAs == > "digest": > + return (False, None, None) > + > + checksums = {} > + > + mytype = mysplit.pop(0) > + filename = mysplit.pop(0) > + if mytype == "AUXFILE": > + filename = "files/"+filename > + size = long(mysplit.pop(0)) > + checksums = dict(zip(mysplit[::2], mysplit[1::2])) > + if not checksums.has_key("size"): > + checksums["size"] = size > + > + return (True, filename, checksums) > + > +def getManifestVersion(manifest): > + rVal = 0 > + > + if not os.path.exists(manifest): > + return rVal > + myfile = open(manifest, "r") > + lines = myfile.readlines() > + for l in lines: > + mysplit = l.split() > + if mysplit[0] == "MD5" and len(mysplit) == 4 and rVal == 0: > + rVal = 1 > + elif mysplit[0] in ["EBUILD", "AUXFILE", "MISCFILE", "SRCURI"] > and ((len(mysplit) - 3) % 2) == 0 and rVal in [0,1]: > + rVal = 2 > + return rVal Offhand... I really think you need to do some subclassing above. Flow's mostly there, but a lot of icky stuff for long term maintenance (imo). Seems like a refactoring of the patch would bring line count down, and up readability. Note that was just a read through... haven't tested it yet, just code comments. Meanwhile, damn nice to see manifest/digest code (ty) :) ~harring
pgpFNSFAr5ydY.pgp
Description: PGP signature