Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
On Thu, Mar 16, 2006 at 11:06:12AM +0200, tvali wrote:
> Just in case ...what i have to do to test those results

The command line invocation is listed above each test result, for example:

python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'

> ...ps. should i send it here if i have a working c++ class for forking

In a separate thread; post it here if you've got a patch you would like people to look at.
~harring
Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
Just in case ...what do I have to do to test those results?

...ps. Should I send it here if I have a working C++ class for forking? It evolved from that Python thought here, which evolved into a general interest in pipes and in interacting with other apps (which is, as I have understood, important in "unix-like operating systems"). Therefore I am encapsulating it into a generic C++ class that does the piping and adds some error checking, which would give a simple way to use scripting languages, too. I have still not installed the C++ IDE I like, but anyway, Kate is not so bad :)

I have it almost ready, but it seems that I have to do some other work for a while. Do I send it here, too, when done, or does no one need such a thing anymore? It just runs some command and gives a simple way to send messages to its stdin and read its stdout, so that interacting with things like Python could be simple, too.

2006/3/16, Brian Harring <[EMAIL PROTECTED]>:
> On Wed, Mar 15, 2006 at 11:14:04PM -0800, Donnie Berkholz wrote:
> > Brian Harring wrote:
> > > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
> > > 100 loops, best of 3: 0.88 usec per loop
> > >
> > > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
> > > 100 loops, best of 3: 0.564 usec per loop
> > >
> > > Use endswith
> > >
> > > oddly, worth noting that startswith differs in this behaviour...
> > > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
> > > 100 loops, best of 3: 0.592 usec per loop
> > >
> > > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
> > > 100 loops, best of 3: 0.842 usec per loop
> >
> > Um, those both read the same way to me. You just switched the ordering
> > around, so the (starts|ends)with is on the bottom instead of the top,
> > but both times (starts|ends)with is longer.
>
> This is why crack is bad, mm'kay.
>
> /me lights the pipe and goes back to his corner.
>
> Pardon, just did a quick test and screwed the results ;)
> ~harring

--
tvali
(e-mail: "[EMAIL PROTECTED]"; msn: "[EMAIL PROTECTED]"; icq: "317-492-912")

On the website of an Estonian internet company I came across a quote:
If you don't do it excellently, don't do it at all. Because if it's not excellent, it won't be profitable or fun, and if you're not in business for fun or profit, what the hell are you doing here?
Robert Townsend
--
gentoo-portage-dev@gentoo.org mailing list
Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
On Wed, Mar 15, 2006 at 11:14:04PM -0800, Donnie Berkholz wrote:
> Brian Harring wrote:
> > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
> > 100 loops, best of 3: 0.88 usec per loop
> >
> > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
> > 100 loops, best of 3: 0.564 usec per loop
> >
> > Use endswith
> >
> > oddly, worth noting that startswith differs in this behaviour...
> > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
> > 100 loops, best of 3: 0.592 usec per loop
> >
> > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
> > 100 loops, best of 3: 0.842 usec per loop
>
> Um, those both read the same way to me. You just switched the ordering
> around, so the (starts|ends)with is on the bottom instead of the top,
> but both times (starts|ends)with is longer.

This is why crack is bad, mm'kay.

/me lights the pipe and goes back to his corner.

Pardon, just did a quick test and screwed the results ;)
~harring
Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
Brian Harring wrote:
> python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
> 100 loops, best of 3: 0.88 usec per loop
>
> python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
> 100 loops, best of 3: 0.564 usec per loop
>
> Use endswith
>
> oddly, worth noting that startswith differs in this behaviour...
> python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
> 100 loops, best of 3: 0.592 usec per loop
>
> python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
> 100 loops, best of 3: 0.842 usec per loop

Um, those both read the same way to me. You just switched the ordering around, so the (starts|ends)with is on the bottom instead of the top, but both times (starts|ends)with is longer.

Thanks,
Donnie
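The command-line measurements quoted in this thread can also be reproduced from a script. What follows is a minimal sketch, assuming Python 3 and the standard-library timeit module; the helper names `best_usec` and `compare` and the loop counts are illustrative choices, not something from the thread. Absolute numbers depend on the interpreter version and the machine, so none are shown here.

```python
import timeit

# Same setup string as in the thread: a long name ending in ".ebuild".
SETUP = 's = "asdf" * 400; s += "fdsa.ebuild"'

# The four expressions that were compared on the command line.
STATEMENTS = [
    's.endswith(".ebuild")',
    's[-7:] == ".ebuild"',
    's.startswith(".ebuild")',
    's[:7] == ".ebuild"',
]


def best_usec(stmt, number=100000, repeat=3):
    """Return the best per-loop time in microseconds (hypothetical helper)."""
    timings = timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat)
    return min(timings) / number * 1e6


def compare():
    """Map each statement to its best per-loop time in microseconds."""
    return {stmt: best_usec(stmt) for stmt in STATEMENTS}


if __name__ == "__main__":
    for stmt, usec in compare().items():
        print("%-28s %.3f usec per loop" % (stmt, usec))
```

Running all four expressions in one process makes it harder to mix up which number belongs to which statement.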
Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
On Wed, Mar 15, 2006 at 09:53:24PM -0800, Zac Medico wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Marius Mauch wrote:
> > Marius Mauch wrote:
> >> The first should be delayed until there is some consensus how the gpg
> >> stuff should work in the future, the others I don't see the use for.
> >> Also I only checked portage.py for changes, so emerge/repoman/... might
> >> still have to be fixed.
> >> Last but not least: I did some basic testing with this and the
> >> important stuff seems to work, but I'm quite sure the code still has a
> >> lot of bugs/issues, and this being a core functionality it needs a
> >> *lot* of testing, so I'd really appreciate if you could all give it a
> >> spin (but do not commit anything to the tree without manually checking
> >> it first).
> >
> > Does the lack of feedback (only got a reaction from Brian so far) mean
> > that no one tried it or that it doesn't have any issues?
>
> The patch applies and seems to work well. At a quick glance the code looks
> pretty clean and it's nice to migrate more code out of portage.py to a
> separate module. I've attached a refreshed version of the patch that applies
> cleanly against current svn (I've made no changes).
>
> Zac
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.2.2 (GNU/Linux)
>
> iD8DBQFEGP1S/ejvha5XGaMRAl/7AJ9cZbjhWtjCz+ac2/tjQNUoivj0twCg7xAG
> cYvDbMiqU5HtpNrVk7fs6RM=
> =Eqlo
> -----END PGP SIGNATURE-----

> === added file 'pym/portage_manifest.py'
> --- /dev/null
> +++ pym/portage_manifest.py
> @@ -0,0 +1,314 @@
> +import os, sets
> +
> +import portage, portage_exception, portage_versions, portage_const
> +from portage_checksum import *
> +from portage_exception import *
> +
> +class FileNotInManifestException(PortageException):
> +    pass
> +
> +def manifest2AuxfileFilter(filename):
> +    filename = filename.strip("/")
> +    return not (filename in ["CVS", ".svn"] or filename[:len("digest-")] == "digest-")
> +
> +def manifest2MiscfileFilter(filename):
> +    filename = filename.strip("/")
> +    return not (filename in ["CVS", ".svn", "files", "Manifest"] or filename[-7:] == ".ebuild")

python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
100 loops, best of 3: 0.88 usec per loop

python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
100 loops, best of 3: 0.564 usec per loop

Use endswith

oddly, worth noting that startswith differs in this behaviour...
python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
100 loops, best of 3: 0.592 usec per loop

python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
100 loops, best of 3: 0.842 usec per loop

> +class Manifest(object):
> +    def __init__(self, pkgdir, db, mysettings, hashes=portage_const.MANIFEST2_HASH_FUNCTIONS, manifest1_compat=True, fromScratch=False):
> +        self.pkgdir = pkgdir+os.sep

rstrip os.sep prior to adding it

> +        self.fhashdict = {}
> +        self.hashes = hashes
> +        self.hashes.append("size")
> +        if manifest1_compat:
> +            self.hashes.extend(portage_const.MANIFEST1_HASH_FUNCTIONS)
> +        self.hashes = sets.Set(self.hashes)
> +        for t in portage_const.MANIFEST2_IDENTIFIERS:
> +            self.fhashdict[t] = {}
> +        self._read()
> +        self.compat = manifest1_compat
> +        self.db = db
> +        self.mysettings = mysettings
> +        if mysettings.has_key("PORTAGE_ACTUAL_DISTDIR"):
> +            self.distdir = mysettings["PORTAGE_ACTUAL_DISTDIR"]
> +        else:
> +            self.distdir = mysettings["DISTDIR"]

Why pass in mysettings? Have the code push it in; manifest shouldn't know about the DISTDIR key nor PORTAGE_ACTUAL_DISTDIR, it should just have a directory to look in.

> +    def guessType(self, filename):
> +        if filename.startswith("files/digest-"):
> +            return None
> +        if filename.startswith("files/"):

If you're intent on using os.sep, you might want to correct the two '/' uses above to use os.path.join/os.path.sep. If concerned about cost, just calculate it once in the class namespace as a constant.

Related, might I suggest converting away from internal strings to a class level enumeration? Int comparison is faster than string, plus it unbinds the internal code from the on-disk symbols used (e.g., just because on disk it is AUX doesn't mean internally it should be throwing around "AUX").

> +            return "AUX"
> +        elif filename.endswith(".ebuild"):
> +            return "EBUILD"
> +        elif filename in ["ChangeLog", "metadata.xml"]:
> +            return "MISC"
> +        else:
> +            return "DIST"
> +
> +    def getFullname(self):
> +        return self.pkgdir+"Manifest"

Err... move that into
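Three of the review suggestions above (strip a trailing separator before appending one, hand the class a directory instead of the settings object, and use class-level integer constants rather than passing the on-disk strings around internally) might look roughly like this. It is only a sketch in Python 3: `ManifestSketch`, `guess_type`, the constant names and `TYPE_NAMES` are hypothetical and not part of the patch.

```python
import os


class ManifestSketch(object):
    """Hypothetical illustration of the review comments, not the real class."""

    # Class-level "enumeration": internal code compares ints, not strings.
    AUX, EBUILD, MISC, DIST = range(4)

    # Mapping to the on-disk symbols, needed only when reading or writing.
    TYPE_NAMES = {AUX: "AUX", EBUILD: "EBUILD", MISC: "MISC", DIST: "DIST"}

    # Computed once in the class namespace, as suggested in the review.
    FILES_PREFIX = "files" + os.sep
    DIGEST_PREFIX = FILES_PREFIX + "digest-"

    def __init__(self, pkgdir, distdir):
        # rstrip os.sep before adding it, so "pkg/" and "pkg" behave the same.
        self.pkgdir = pkgdir.rstrip(os.sep) + os.sep
        # The caller passes the directory in; no settings lookup happens here.
        self.distdir = distdir

    def guess_type(self, filename):
        """Return one of the integer type constants, or None for digests."""
        if filename.startswith(self.DIGEST_PREFIX):
            return None
        if filename.startswith(self.FILES_PREFIX):
            return self.AUX
        if filename.endswith(".ebuild"):
            return self.EBUILD
        if filename in ("ChangeLog", "metadata.xml"):
            return self.MISC
        return self.DIST
```

With this layout the strings "AUX", "DIST" and so on appear only in `TYPE_NAMES`, so a change to the on-disk symbols would not touch the comparison logic.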
Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Marius Mauch wrote:
> Marius Mauch wrote:
>> The first should be delayed until there is some consensus how the gpg
>> stuff should work in the future, the others I don't see the use for.
>> Also I only checked portage.py for changes, so emerge/repoman/... might
>> still have to be fixed.
>> Last but not least: I did some basic testing with this and the
>> important stuff seems to work, but I'm quite sure the code still has a
>> lot of bugs/issues, and this being a core functionality it needs a
>> *lot* of testing, so I'd really appreciate if you could all give it a
>> spin (but do not commit anything to the tree without manually checking
>> it first).
>
> Does the lack of feedback (only got a reaction from Brian so far) mean
> that no one tried it or that it doesn't have any issues?

The patch applies and seems to work well. At a quick glance the code looks pretty clean and it's nice to migrate more code out of portage.py to a separate module. I've attached a refreshed version of the patch that applies cleanly against current svn (I've made no changes).

Zac
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iD8DBQFEGP1S/ejvha5XGaMRAl/7AJ9cZbjhWtjCz+ac2/tjQNUoivj0twCg7xAG
cYvDbMiqU5HtpNrVk7fs6RM=
=Eqlo
-----END PGP SIGNATURE-----

=== added file 'pym/portage_manifest.py'
--- /dev/null
+++ pym/portage_manifest.py
@@ -0,0 +1,314 @@
+import os, sets
+
+import portage, portage_exception, portage_versions, portage_const
+from portage_checksum import *
+from portage_exception import *
+
+class FileNotInManifestException(PortageException):
+    pass
+
+def manifest2AuxfileFilter(filename):
+    filename = filename.strip("/")
+    return not (filename in ["CVS", ".svn"] or filename[:len("digest-")] == "digest-")
+
+def manifest2MiscfileFilter(filename):
+    filename = filename.strip("/")
+    return not (filename in ["CVS", ".svn", "files", "Manifest"] or filename[-7:] == ".ebuild")
+
+class Manifest(object):
+    def __init__(self, pkgdir, db, mysettings, hashes=portage_const.MANIFEST2_HASH_FUNCTIONS, manifest1_compat=True, fromScratch=False):
+        self.pkgdir = pkgdir+os.sep
+        self.fhashdict = {}
+        self.hashes = hashes
+        self.hashes.append("size")
+        if manifest1_compat:
+            self.hashes.extend(portage_const.MANIFEST1_HASH_FUNCTIONS)
+        self.hashes = sets.Set(self.hashes)
+        for t in portage_const.MANIFEST2_IDENTIFIERS:
+            self.fhashdict[t] = {}
+        self._read()
+        self.compat = manifest1_compat
+        self.db = db
+        self.mysettings = mysettings
+        if mysettings.has_key("PORTAGE_ACTUAL_DISTDIR"):
+            self.distdir = mysettings["PORTAGE_ACTUAL_DISTDIR"]
+        else:
+            self.distdir = mysettings["DISTDIR"]
+
+    def guessType(self, filename):
+        if filename.startswith("files/digest-"):
+            return None
+        if filename.startswith("files/"):
+            return "AUX"
+        elif filename.endswith(".ebuild"):
+            return "EBUILD"
+        elif filename in ["ChangeLog", "metadata.xml"]:
+            return "MISC"
+        else:
+            return "DIST"
+
+    def getFullname(self):
+        return self.pkgdir+"Manifest"
+
+    def getDigests(self):
+        rval = {}
+        for t in portage_const.MANIFEST2_IDENTIFIERS:
+            rval.update(self.fhashdict[t])
+        return rval
+
+    def _readDigests(self):
+        mycontent = ""
+        for d in portage.listdir(self.pkgdir+"files", filesonly=True, recursive=False):
+            if d.startswith("digest-"):
+                mycontent += open(self.pkgdir+"files"+os.sep+d, "r").read()
+        return mycontent
+
+    def _read(self):
+        if not os.path.exists(self.getFullname()):
+            return
+        fd = open(self.getFullname(), "r")
+        mylines = fd.readlines()
+        fd.close()
+        mylines.extend(self._readDigests().split("\n"))
+        for l in mylines:
+            myname = ""
+            mysplit = l.split()
+            if len(mysplit) == 4 and mysplit[0] in portage_const.MANIFEST1_HASH_FUNCTIONS:
+                myname = mysplit[2]
+                mytype = self.guessType(myname)
+                if mytype == "AUX" and myname.startswith("files/"):
+                    myname = myname[6:]
+                if mytype == None:
+                    continue
+                mysize = int(mysplit[3])
+                myhashes = {mysplit[0]: mysplit[1]}
+            if len(mysplit) > 4 and mysplit[0] in portage_const.MANIFEST2_IDENTIFIERS:
+                mytype = mysplit[0]
+                myname = mysplit[1]
+                mysize = int(mysplit[2])
+                myhashes = dict(zip(mysplit[3::2], mysplit[4::2]))
+            if len(myname) == 0:
+                continue
+            if not self.fhashdict[mytype].has_key(myname):
+                self.fhashdict[mytype][myname] = {}
+            self.fhashdict[mytype][myname].update(myhashes)
+            self.fhashdict[mytype][myname]["size"] = mysize
+
+    def _writeDigests(self):
+        cpvlist = [self.pkgdir.rstrip("/").split("/")[-2]+"/"+x[:-7] for x in portage.listdir(self.pkgdir) if x.endswith(".ebuild")]
+        rval = []
+        for cpv in cpvlist:
+            dname = self.pkgdir+"files"+os.sep+"digest-"+portage.catsplit(cpv)[1]
+            mylines = []
+            distlist = self._getCpvDistfiles(cpv)
+            for f in self.fhashdict["DIST"].keys():
+                if f in distlist:
+                    for h in self.fhashdict["DIST"][f].keys():
+                        if h not in portage_const.MANIFEST1_HASH_FU
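The Manifest2 branch of `_read()` in the patch splits a line into type, name and size, followed by alternating hash-name/hash-value fields. A standalone sketch of that one step, assuming Python 3: `parse_manifest2_line` is a hypothetical helper, the identifier tuple is an assumption based on the values `guessType()` returns (the patch reads it from portage_const), and the sample line with its hash values is made up.

```python
# Assumed identifier set; the patch takes it from portage_const.MANIFEST2_IDENTIFIERS.
MANIFEST2_IDENTIFIERS = ("AUX", "MISC", "DIST", "EBUILD")


def parse_manifest2_line(line):
    """Return (type, name, size, hashes), or None if the line does not match."""
    fields = line.split()
    if len(fields) <= 4 or fields[0] not in MANIFEST2_IDENTIFIERS:
        return None
    ftype, name, size = fields[0], fields[1], int(fields[2])
    # Remaining fields alternate: hash name, hash value, hash name, ...
    hashes = dict(zip(fields[3::2], fields[4::2]))
    return ftype, name, size, hashes


# Made-up example line; the hash values are placeholders, not real digests.
sample = "DIST foo-1.0.tar.gz 1234 MD5 0123abcd SHA1 4567ef89"
print(parse_manifest2_line(sample))
```

The two slices `fields[3::2]` and `fields[4::2]` pick the hash names and the hash values respectively, and `zip` pairs them up again.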
Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
Marius Mauch wrote:
> The first should be delayed until there is some consensus how the gpg
> stuff should work in the future, the others I don't see the use for.
> Also I only checked portage.py for changes, so emerge/repoman/... might
> still have to be fixed.
> Last but not least: I did some basic testing with this and the
> important stuff seems to work, but I'm quite sure the code still has a
> lot of bugs/issues, and this being a core functionality it needs a
> *lot* of testing, so I'd really appreciate if you could all give it a
> spin (but do not commit anything to the tree without manually checking
> it first).

Does the lack of feedback (only got a reaction from Brian so far) mean that no one tried it or that it doesn't have any issues?

Marius
--
gentoo-portage-dev@gentoo.org mailing list
[gentoo-portage-dev] [PATCH] Manifest2 reloaded
So while on my way to FOSDEM I decided to do something useful with the time and wrote a new manifest2 implementation. This has nothing to do with the original prototype I posted a while ago; it's been written completely from scratch.

Basically all functionality (creation, parsing, validation) is encapsulated in the new portage_manifest.Manifest class, including compatibility code to read/write old style digests. The changes to portage.py only change the digest*() functions to use this new class instead of handling the task themselves (exception: digestCheckFiles(), which apparently was only used internally by other digest* functions); they should behave more or less like the old code. Any new code, however, should use the Manifest() class directly.

While this patch implements the basic functionality, some extra stuff that was in the old code isn't included yet:
- gpg verification
- FEATURES=autoaddcvs
- FEATURES=cvs (probably obsolete anyway)
- emerge --digest / FEATURES=digest (may or may not work)

The first should be delayed until there is some consensus how the gpg stuff should work in the future, the others I don't see the use for. Also I only checked portage.py for changes, so emerge/repoman/... might still have to be fixed.

Last but not least: I did some basic testing with this and the important stuff seems to work, but I'm quite sure the code still has a lot of bugs/issues, and this being a core functionality it needs a *lot* of testing, so I'd really appreciate if you could all give it a spin (but do not commit anything to the tree without manually checking it first).

Marius

--
Public Key at http://www.genone.de/info/gpg-key.pub

In the beginning, there was nothing. And God said, 'Let there be Light.'
And there was still nothing, but you could see a bit better.
diff -ru --exclude=CVS --exclude=.svn -N pym/portage.py.org pym/portage.py
--- pym/portage.py.org 2006-03-04 02:25:20.957635000 +
+++ pym/portage.py 2006-03-04 03:12:19.545785750 +
@@ -90,6 +90,7 @@
     from portage_data import ostype, lchown, userland, secpass, uid, wheelgid, \
                              portage_uid, portage_gid
+    from portage_manifest import Manifest
     import portage_util
     from portage_util import atomic_ofstream, dump_traceback, getconfig, grabdict, \
@@ -2049,181 +2050,67 @@
         return 0
     return 1
-
-def digestCreate(myfiles,basedir,oldDigest={}):
-    """Takes a list of files and the directory they are in and returns the
-    dict of dict[filename][CHECKSUM_KEY] = hash
-    returns None on error."""
-    mydigests={}
-    for x in myfiles:
-        print "<<<",x
-        myfile=os.path.normpath(basedir+"///"+x)
-        if os.path.exists(myfile):
-            if not os.access(myfile, os.R_OK):
-                print "!!! Given file does not appear to be readable. Does it exist?"
-                print "!!! File:",myfile
-                return None
-            mydigests[x] = portage_checksum.perform_multiple_checksums(myfile, hashes=portage_const.MANIFEST1_HASH_FUNCTIONS)
-            mysize = os.stat(myfile)[stat.ST_SIZE]
-        else:
-            if x in oldDigest:
-                # DeepCopy because we might not have a unique reference.
-                mydigests[x] = copy.deepcopy(oldDigest[x])
-                mysize = copy.deepcopy(oldDigest[x]["size"])
-            else:
-                print "!!! We have a source URI, but no file..."
-                print "!!! File:",myfile
-                return None
-
-        if mydigests[x].has_key("size") and (mydigests[x]["size"] != mysize):
-            raise portage_exception.DigestException, "Size mismatch during checksums"
-        mydigests[x]["size"] = copy.deepcopy(mysize)
-    return mydigests
-
-def digestCreateLines(filelist, mydict):
-    mylines = []
-    mydigests = copy.deepcopy(mydict)
-    for myarchive in filelist:
-        mysize = mydigests[myarchive]["size"]
-        if len(mydigests[myarchive]) == 0:
-            raise portage_exception.DigestException, "No generate digest for '%(file)s'" % {"file":myarchive}
-        for sumName in mydigests[myarchive].keys():
-            if sumName not in portage_checksum.get_valid_checksum_keys():
-                continue
-            mysum = mydigests[myarchive][sumName]
-
-            myline = sumName[:]
-            myline += " "+mysum
-            myline += " "+myarchive
-            myline += " "+str(mysize)
-            mylines.append(myline)
-    return mylines
-
-def digestgen(myarchives,mysettings,overwrite=1,manifestonly=0):
+def digestgen(myarchives,mysettings,db=None,overwrite=1,manifestonly=0):
     """generates digest file if missing. Assumes all files are available. If
-    overwrite=0, the digest will only be created if it doesn't already exist."""
-
-    # archive files
-    basedir=mysettings["DISTDIR"]+"/"
-    digestfn=mysettings["FILESDIR"]+"/digest-"+mysettings["PF"]
-
-    # portage files -- p(ortagefiles)basedir
-    pbasedir=mysettings["O"]+"/"
-    manifestfn=pbasedir+"Manifest"
-
-    if not manifestonly:
-        if not os.path.isdir(mysettings["FILESDIR"]):
-            os.makedirs(mysettings["FILESDIR"])
-        mycvstree=cvstree.getentries(pbasedir, recursive=1)
-
-        if ("cvs" in features) and os.path.exists(pbasedir+"/CVS"):
-            if not cvstree.isadded(mycvstree,"files"):
-                if "autoaddcvs" in features:
-
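The removed digestCreateLines() documents the old-style digest layout: one line per checksum, in the order checksum name, checksum, file name, size. A Python 3 sketch of the same formatting, assuming a plain dict as input: `format_digest_lines`, the default `valid_keys` tuple and the sample values are hypothetical. The tuple stands in for portage_checksum.get_valid_checksum_keys(), and keys are sorted here only to make the output deterministic.

```python
def format_digest_lines(filelist, digests,
                        valid_keys=("MD5", "SHA1", "RMD160", "SHA256")):
    """Build old-style digest lines: '<hash name> <hash> <file> <size>'."""
    lines = []
    for archive in filelist:
        entry = digests[archive]
        size = entry["size"]
        for sum_name in sorted(entry):
            # Skip "size" and anything else that is not a known checksum.
            if sum_name not in valid_keys:
                continue
            lines.append("%s %s %s %d" % (sum_name, entry[sum_name], archive, size))
    return lines


# Made-up sample data; the hash values are placeholders, not real digests.
sample = {"foo-1.0.tar.gz": {"size": 1234, "MD5": "0123abcd", "SHA1": "4567ef89"}}
for line in format_digest_lines(["foo-1.0.tar.gz"], sample):
    print(line)
```

Each such line has exactly four fields, which is what the Manifest1 branch of the new `_read()` checks for.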