Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded

2006-03-16 Thread tvali
Just in case: what do I have to do to test those results?


PS. Should I send it here if I have a working C++ class for forking?
It evolved from the Python idea discussed here, which grew into a
general interest in pipes and in interacting with other applications
(which, as I understand it, is important in Unix-like operating
systems). So I am encapsulating it in a generic C++ class that does the
piping and adds some error checking, which would also give a simple way
to use scripting languages. I still have not installed the C++ IDE I
like, but anyway, Kate is not so bad :) I have it almost ready, but it
seems I have to do some other work for a while now. Should I send it
here when it is done, or does no one need such a thing anymore? It just
runs a command and gives a simple way to send messages to its stdin and
read its stdout, so that interacting with things like Python could be
simple, too.
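
For comparison, the same idea (run a command, write to its stdin, read its stdout) can be sketched in Python with the standard `subprocess` module. This is only an illustration of the technique, not the C++ class described above; the helper name `run_with_input` and the upper-casing child command are made up for the example.

```python
import subprocess
import sys

def run_with_input(argv, text):
    """Run argv, send text to its stdin, and return (exit_code, stdout)."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # communicate() writes the input, closes stdin, and reads stdout to EOF,
    # which avoids deadlocking on a full pipe.
    out, _ = proc.communicate(text)
    return proc.returncode, out

# Hypothetical child process: a one-liner that upper-cases whatever it reads.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
code, out = run_with_input(child, "hello\n")
```

For a long-running conversation with the child, the pipes would have to be read and written incrementally, which is where most of the error checking mentioned above would be needed.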

2006/3/16, Brian Harring [EMAIL PROTECTED]:
 On Wed, Mar 15, 2006 at 11:14:04PM -0800, Donnie Berkholz wrote:
  Brian Harring wrote:
   python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
   100 loops, best of 3: 0.88 usec per loop
 
   python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
   100 loops, best of 3: 0.564 usec per loop
 
   Use endswith
 
   oddly, worth noting that startswith differs in this behaviour...
   python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
   100 loops, best of 3: 0.592 usec per loop
 
   python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
   100 loops, best of 3: 0.842 usec per loop
 
  Um, those both read the same way to me. You just switched the ordering
  around, so the (starts|ends)with is on the bottom instead of the top,
  but both times (starts|ends)with is longer.

 This is why crack is bad, mm'kay.

 /me lights the pipe and goes back to his corner.

 Pardon, just did a quick test and screwed the results ;)
 ~harring






--
tvali
(e-mail: [EMAIL PROTECTED]; msn: [EMAIL PROTECTED];
icq: 317-492-912)

On the website of an Estonian internet company I came across this quote:
If you don't do it excellently, don't do it at all. Because if it's not
excellent, it won't be profitable or fun, and if you're not in
business for fun or profit, what the hell are you doing here?
Robert Townsend

-- 
gentoo-portage-dev@gentoo.org mailing list



Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded

2006-03-15 Thread Marius Mauch

Marius Mauch schrieb:

The first should be delayed until there is some consensus how the gpg
stuff should work in the future, the others I don't see the use for.
Also I only checked portage.py for changes, so emerge/repoman/... might
still have to be fixed.
Last but not least: I did some basic testing with this and the
important stuff seems to work, but I'm quite sure the code still has a
lot of bugs/issues, and this being a core functionality it needs a
*lot* of testing, so I'd really appreciate if you could all give it a
spin (but do not commit anything to the tree without manually checking
it first).


Does the lack of feedback (only got a reaction from Brian so far) mean 
that no one tried it or that it doesn't have any issues?


Marius



Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded

2006-03-15 Thread Zac Medico
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Marius Mauch wrote:
 Marius Mauch schrieb:
 The first should be delayed until there is some consensus how the gpg
 stuff should work in the future, the others I don't see the use for.
 Also I only checked portage.py for changes, so emerge/repoman/... might
 still have to be fixed.
 Last but not least: I did some basic testing with this and the
 important stuff seems to work, but I'm quite sure the code still has a
 lot of bugs/issues, and this being a core functionality it needs a
 *lot* of testing, so I'd really appreciate if you could all give it a
 spin (but do not commit anything to the tree without manually checking
 it first).
 
 Does the lack of feedback (only got a reaction from Brian so far) mean
 that no one tried it or that it doesn't have any issues?

The patch applies and seems to work well.  At a quick glance the code looks 
pretty clean and it's nice to migrate more code out of portage.py to a separate 
module.  I've attached a refreshed version of the patch that applies cleanly 
against current svn (I've made no changes).

Zac
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.2 (GNU/Linux)

iD8DBQFEGP1S/ejvha5XGaMRAl/7AJ9cZbjhWtjCz+ac2/tjQNUoivj0twCg7xAG
cYvDbMiqU5HtpNrVk7fs6RM=
=Eqlo
-END PGP SIGNATURE-
=== added file 'pym/portage_manifest.py'
--- /dev/null	
+++ pym/portage_manifest.py	
@@ -0,0 +1,314 @@
+import os, sets
+
+import portage, portage_exception, portage_versions, portage_const
+from portage_checksum import *
+from portage_exception import *
+
+class FileNotInManifestException(PortageException):
+	pass
+
+def manifest2AuxfileFilter(filename):
+	filename = filename.strip("/")
+	return not (filename in ["CVS", ".svn"] or filename[:len("digest-")] == "digest-")
+
+def manifest2MiscfileFilter(filename):
+	filename = filename.strip("/")
+	return not (filename in ["CVS", ".svn", "files", "Manifest"] or filename[-7:] == ".ebuild")
+
+class Manifest(object):
+	def __init__(self, pkgdir, db, mysettings, hashes=portage_const.MANIFEST2_HASH_FUNCTIONS, manifest1_compat=True, fromScratch=False):
+		self.pkgdir = pkgdir+os.sep
+		self.fhashdict = {}
+		self.hashes = hashes
+		self.hashes.append("size")
+		if manifest1_compat:
+			self.hashes.extend(portage_const.MANIFEST1_HASH_FUNCTIONS)
+		self.hashes = sets.Set(self.hashes)
+		for t in portage_const.MANIFEST2_IDENTIFIERS:
+			self.fhashdict[t] = {}
+		self._read()
+		self.compat = manifest1_compat
+		self.db = db
+		self.mysettings = mysettings
+		if mysettings.has_key("PORTAGE_ACTUAL_DISTDIR"):
+			self.distdir = mysettings["PORTAGE_ACTUAL_DISTDIR"]
+		else:
+			self.distdir = mysettings["DISTDIR"]
+		
+	def guessType(self, filename):
+		if filename.startswith("files/digest-"):
+			return None
+		if filename.startswith("files/"):
+			return "AUX"
+		elif filename.endswith(".ebuild"):
+			return "EBUILD"
+		elif filename in ["ChangeLog", "metadata.xml"]:
+			return "MISC"
+		else:
+			return "DIST"
+	
+	def getFullname(self):
+		return self.pkgdir+"Manifest"
+	
+	def getDigests(self):
+		rval = {}
+		for t in portage_const.MANIFEST2_IDENTIFIERS:
+			rval.update(self.fhashdict[t])
+		return rval
+	
+	def _readDigests(self):
+		mycontent = ""
+		for d in portage.listdir(self.pkgdir+"files", filesonly=True, recursive=False):
+			if d.startswith("digest-"):
+				mycontent += open(self.pkgdir+"files"+os.sep+d, "r").read()
+		return mycontent
+		
+	def _read(self):
+		if not os.path.exists(self.getFullname()):
+			return
+		fd = open(self.getFullname(), "r")
+		mylines = fd.readlines()
+		fd.close()
+		mylines.extend(self._readDigests().split("\n"))
+		for l in mylines:
+			myname = ""
+			mysplit = l.split()
+			if len(mysplit) == 4 and mysplit[0] in portage_const.MANIFEST1_HASH_FUNCTIONS:
+				myname = mysplit[2]
+				mytype = self.guessType(myname)
+				if mytype == "AUX" and myname.startswith("files/"):
+					myname = myname[6:]
+				if mytype == None:
+					continue
+				mysize = int(mysplit[3])
+				myhashes = {mysplit[0]: mysplit[1]}
+			if len(mysplit) > 4 and mysplit[0] in portage_const.MANIFEST2_IDENTIFIERS:
+				mytype = mysplit[0]
+				myname = mysplit[1]
+				mysize = int(mysplit[2])
+				myhashes = dict(zip(mysplit[3::2], mysplit[4::2]))
+			if len(myname) == 0:
+				continue
+			if not self.fhashdict[mytype].has_key(myname):
+				self.fhashdict[mytype][myname] = {}
+			self.fhashdict[mytype][myname].update(myhashes)
+			self.fhashdict[mytype][myname]["size"] = mysize
+	
+	def _writeDigests(self):
+		cpvlist = [self.pkgdir.rstrip("/").split("/")[-2]+"/"+x[:-7] for x in portage.listdir(self.pkgdir) if x.endswith(".ebuild")]
+		rval = []
+		for cpv in cpvlist:
+			dname = self.pkgdir+"files"+os.sep+"digest-"+portage.catsplit(cpv)[1]
+			mylines = []
+			distlist = self._getCpvDistfiles(cpv)
+			for f in self.fhashdict["DIST"].keys():
+				if f in distlist:
+					for h in self.fhashdict["DIST"][f].keys():
+						if h not in portage_const.MANIFEST1_HASH_FUNCTIONS:
+							continue
+						myline = " ".join([h, str(self.fhashdict["DIST"][f][h]), f, 

Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded

2006-03-15 Thread Brian Harring
On Wed, Mar 15, 2006 at 09:53:24PM -0800, Zac Medico wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 Marius Mauch wrote:
  Marius Mauch schrieb:
  The first should be delayed until there is some consensus how the gpg
  stuff should work in the future, the others I don't see the use for.
  Also I only checked portage.py for changes, so emerge/repoman/... might
  still have to be fixed.
  Last but not least: I did some basic testing with this and the
  important stuff seems to work, but I'm quite sure the code still has a
  lot of bugs/issues, and this being a core functionality it needs a
  *lot* of testing, so I'd really appreciate if you could all give it a
  spin (but do not commit anything to the tree without manually checking
  it first).
  
  Does the lack of feedback (only got a reaction from Brian so far) mean
  that no one tried it or that it doesn't have any issues?
 
 The patch applies and seems to work well.  At a quick glance the code looks 
 pretty clean and it's nice to migrate more code out of portage.py to a 
 separate module.  I've attached a refreshed version of the patch that applies 
 cleanly against current svn (I've made no changes).
 
 Zac
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.2.2 (GNU/Linux)
 
 iD8DBQFEGP1S/ejvha5XGaMRAl/7AJ9cZbjhWtjCz+ac2/tjQNUoivj0twCg7xAG
 cYvDbMiqU5HtpNrVk7fs6RM=
 =Eqlo
 -END PGP SIGNATURE-

 === added file 'pym/portage_manifest.py'
 --- /dev/null 
 +++ pym/portage_manifest.py   
 @@ -0,0 +1,314 @@
 +import os, sets
 +
 +import portage, portage_exception, portage_versions, portage_const
 +from portage_checksum import *
 +from portage_exception import *
 +
 +class FileNotInManifestException(PortageException):
 + pass
 +
 +def manifest2AuxfileFilter(filename):
 +	filename = filename.strip("/")
 +	return not (filename in ["CVS", ".svn"] or filename[:len("digest-")] == "digest-")
 +
 +def manifest2MiscfileFilter(filename):
 +	filename = filename.strip("/")
 +	return not (filename in ["CVS", ".svn", "files", "Manifest"] or filename[-7:] == ".ebuild")

python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
100 loops, best of 3: 0.88 usec per loop

python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
100 loops, best of 3: 0.564 usec per loop

Use endswith

oddly, worth noting that startswith differs in this behaviour...
python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
100 loops, best of 3: 0.592 usec per loop

python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
100 loops, best of 3: 0.842 usec per loop
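
For anyone wanting to repeat the measurement, here is a minimal sketch that drives the `timeit` module from a script instead of the command line. It targets current Python 3 rather than the Python 2 of the patch; the helper name `bench` and the loop count are illustrative choices. Absolute timings depend on the interpreter and machine, so no expected numbers are given.

```python
import timeit

# Same test string as in the commands above: a long prefix plus an ebuild name.
SETUP = 's = "asdf" * 400; s += "fdsa.ebuild"'

# The method calls and their slicing counterparts compared in this thread.
CANDIDATES = {
    "endswith": 's.endswith(".ebuild")',
    "slice_end": 's[-7:] == ".ebuild"',
    "startswith": 's.startswith(".ebuild")',
    "slice_start": 's[:7] == ".ebuild"',
}

def bench(stmt, number=100000, repeat=3):
    """Return the best per-loop time in microseconds for stmt."""
    best = min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))
    return best / number * 1e6

results = {name: bench(stmt) for name, stmt in CANDIDATES.items()}
for name, usec in results.items():
    print("%-12s %.3f usec per loop" % (name, usec))
```

Taking the minimum over several repeats, as `python -m timeit` does with "best of 3", reduces noise from other activity on the machine.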

 +class Manifest(object):
 +	def __init__(self, pkgdir, db, mysettings, hashes=portage_const.MANIFEST2_HASH_FUNCTIONS, manifest1_compat=True, fromScratch=False):
 +		self.pkgdir = pkgdir+os.sep

rstrip os.sep prior to adding it
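
A minimal sketch of that suggestion, so a caller passing a path that already ends in a separator does not produce a doubled one. The function name `normalize_pkgdir` is hypothetical and not part of the patch.

```python
import os

def normalize_pkgdir(pkgdir):
    """Return pkgdir with exactly one trailing os.sep."""
    # Strip any separators the caller already appended, then add one back.
    return pkgdir.rstrip(os.sep) + os.sep
```

In the constructor this would replace the plain `pkgdir+os.sep` concatenation.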

 +		self.fhashdict = {}
 +		self.hashes = hashes
 +		self.hashes.append("size")
 +		if manifest1_compat:
 +			self.hashes.extend(portage_const.MANIFEST1_HASH_FUNCTIONS)
 +		self.hashes = sets.Set(self.hashes)
 +		for t in portage_const.MANIFEST2_IDENTIFIERS:
 +			self.fhashdict[t] = {}
 +		self._read()
 +		self.compat = manifest1_compat
 +		self.db = db
 +		self.mysettings = mysettings
 +		if mysettings.has_key("PORTAGE_ACTUAL_DISTDIR"):
 +			self.distdir = mysettings["PORTAGE_ACTUAL_DISTDIR"]
 +		else:
 +			self.distdir = mysettings["DISTDIR"]

Why pass in mysettings?  
Have the code push it in, manifest shouldn't know about the DISTDIR 
key nor PORTAGE_ACTUAL_DISTDIR, should just have a directory to look 
in.
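
One way to read this suggestion, as a rough sketch: the caller resolves the directory and the class only stores it. The reduced `Manifest` below is a stand-in for illustration, `resolve_distdir` is a hypothetical helper name, and a plain dict stands in for the real settings object.

```python
class Manifest(object):
    """Reduced stand-in for the patch's class; only the constructor is sketched."""

    def __init__(self, pkgdir, distdir):
        self.pkgdir = pkgdir
        # The directory arrives ready to use; no settings keys are known here.
        self.distdir = distdir


def resolve_distdir(settings):
    """Caller-side helper (hypothetical name): pick the distfiles directory."""
    # Same preference order as the constructor in the patch.
    if "PORTAGE_ACTUAL_DISTDIR" in settings:
        return settings["PORTAGE_ACTUAL_DISTDIR"]
    return settings["DISTDIR"]


settings = {"DISTDIR": "/usr/portage/distfiles"}  # plain dict standing in for the config
m = Manifest("/usr/portage/app-misc/foo", resolve_distdir(settings))
```

The lookup logic still exists, but it lives with the code that owns the settings, and the class can be tested with nothing more than two directory paths.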


 +	def guessType(self, filename):
 +		if filename.startswith("files/digest-"):
 +			return None
 +		if filename.startswith("files/"):

if you're intent on using os.sep, might want to correct the two '/' 
uses above to use os.path.join/os.path.sep

If concerned about cost, just calculate it once in the class namespace 
as a constant.

related, might I suggest converting away from internal strings to a 
class level enumeration?

int comparison is faster than string comparison, plus it unbinds the
internal code from the on-disk symbols used (e.g., just because on disk
it is "AUX" doesn't mean internally it should be throwing around "AUX").
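
A rough sketch of what the two suggestions above could look like together: path prefixes computed once in the class namespace, and integer type codes that are mapped to on-disk tokens only at the parse/write boundary. All names here (`guess_type`, `TOKEN_TO_TYPE`, `FILES_PREFIX`, and so on) are illustrative assumptions, not code from the patch.

```python
import os

class Manifest(object):
    # Internal type codes: plain ints, independent of the on-disk tokens.
    AUX, EBUILD, MISC, DIST = range(4)

    # On-disk tokens are only touched when parsing or writing.
    TOKEN_TO_TYPE = {"AUX": AUX, "EBUILD": EBUILD, "MISC": MISC, "DIST": DIST}
    TYPE_TO_TOKEN = dict((v, k) for k, v in TOKEN_TO_TYPE.items())

    # Computed once at class creation instead of on every call.
    FILES_PREFIX = "files" + os.sep
    DIGEST_PREFIX = FILES_PREFIX + "digest-"

    def guess_type(self, filename):
        """Return an internal type code, or None for old-style digest files."""
        if filename.startswith(self.DIGEST_PREFIX):
            return None
        if filename.startswith(self.FILES_PREFIX):
            return self.AUX
        if filename.endswith(".ebuild"):
            return self.EBUILD
        if filename in ("ChangeLog", "metadata.xml"):
            return self.MISC
        return self.DIST
```

Whether the integer comparison is measurably faster would need to be benchmarked; the decoupling from the file format holds either way.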

 +			return "AUX"
 +		elif filename.endswith(".ebuild"):
 +			return "EBUILD"
 +		elif filename in ["ChangeLog", "metadata.xml"]:
 +			return "MISC"
 +		else:
 +			return "DIST"
 +	
 +	def getFullname(self):
 +		return self.pkgdir+"Manifest"

Err... move that into the initializer.

If you're concerned folks will screw up the var, use a property to 
make it immutable.

Either way, func calls aren't cheap, and that's not really needed :)
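
A minimal sketch of the property approach, assuming the path is computed once in the initializer. The attribute names `fullname` and `_fullname` are made up for the example, and the class is reduced to the relevant part.

```python
import os

class Manifest(object):
    def __init__(self, pkgdir):
        self.pkgdir = pkgdir.rstrip(os.sep) + os.sep
        # Computed once here instead of on every getFullname() call.
        self._fullname = self.pkgdir + "Manifest"

    @property
    def fullname(self):
        """Path of the Manifest file; no setter is defined, so assignment fails."""
        return self._fullname
```

This only guards against accidental assignment to `fullname`; the underscore-prefixed attribute can still be changed by code that ignores the convention.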

 + 
 +   

Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded

2006-03-15 Thread Donnie Berkholz
Brian Harring wrote:
 python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
 100 loops, best of 3: 0.88 usec per loop

 python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
 100 loops, best of 3: 0.564 usec per loop

 Use endswith

 oddly, worth noting that startswith differs in this behaviour...
 python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
 100 loops, best of 3: 0.592 usec per loop

 python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
 100 loops, best of 3: 0.842 usec per loop

Um, those both read the same way to me. You just switched the ordering
around, so the (starts|ends)with is on the bottom instead of the top,
but both times (starts|ends)with is longer.

Thanks,
Donnie





[gentoo-portage-dev] [PATCH] Manifest2 reloaded

2006-03-03 Thread Marius Mauch
So while on my way to FOSDEM I decided to do something useful with the
time and wrote a new manifest2 implementation. This has nothing to do
with the original prototype I posted a while ago, it's been written
completely from scratch.
Basically all functionality (creation, parsing, validation) is
encapsulated in the new portage_manifest.Manifest class, including
compatibility code to read/write old-style digests.
The changes to portage.py only switch the digest*() functions over to
this new class instead of handling the task themselves (exception:
digestCheckFiles(), which apparently was only used internally by other
digest*() functions); they should behave more or less as they did with
the old code. Any new code, however, should use the Manifest() class
directly.
While this patch implements the basic functionality, some extra stuff
that was in the old code isn't included yet:
- gpg verification
- FEATURES=autoaddcvs
- FEATURES=cvs (probably obsolete anyway)
- emerge --digest / FEATURES=digest (may or may not work)

The first should be delayed until there is some consensus how the gpg
stuff should work in the future, the others I don't see the use for.
Also I only checked portage.py for changes, so emerge/repoman/... might
still have to be fixed.
Last but not least: I did some basic testing with this and the
important stuff seems to work, but I'm quite sure the code still has a
lot of bugs/issues, and this being a core functionality it needs a
*lot* of testing, so I'd really appreciate if you could all give it a
spin (but do not commit anything to the tree without manually checking
it first).

Marius

-- 
Public Key at http://www.genone.de/info/gpg-key.pub

In the beginning, there was nothing. And God said, 'Let there be
Light.' And there was still nothing, but you could see a bit better.
diff -ru --exclude=CVS --exclude=.svn -N pym/portage.py.org pym/portage.py
--- pym/portage.py.org	2006-03-04 02:25:20.957635000 +
+++ pym/portage.py	2006-03-04 03:12:19.545785750 +
@@ -90,6 +90,7 @@
 
 	from portage_data import ostype, lchown, userland, secpass, uid, wheelgid, \
 	 portage_uid, portage_gid
+	from portage_manifest import Manifest
 
 	import portage_util
 	from portage_util import atomic_ofstream, dump_traceback, getconfig, grabdict, \
@@ -2049,181 +2050,67 @@
 			return 0
 	return 1
 
-
-def digestCreate(myfiles,basedir,oldDigest={}):
-	"""Takes a list of files and the directory they are in and returns the
-	dict of dict[filename][CHECKSUM_KEY] = hash
-	returns None on error."""
-	mydigests={}
-	for x in myfiles:
-		print ,x
-		myfile=os.path.normpath(basedir+"///"+x)
-		if os.path.exists(myfile):
-			if not os.access(myfile, os.R_OK):
-				print "!!! Given file does not appear to be readable. Does it exist?"
-				print "!!! File:",myfile
-				return None
-			mydigests[x] = portage_checksum.perform_multiple_checksums(myfile, hashes=portage_const.MANIFEST1_HASH_FUNCTIONS)
-			mysize   = os.stat(myfile)[stat.ST_SIZE]
-		else:
-			if x in oldDigest:
-				# DeepCopy because we might not have a unique reference.
-				mydigests[x] = copy.deepcopy(oldDigest[x])
-				mysize   = copy.deepcopy(oldDigest[x]["size"])
-			else:
-				print "!!! We have a source URI, but no file..."
-				print "!!! File:",myfile
-				return None
-
-		if mydigests[x].has_key("size") and (mydigests[x]["size"] != mysize):
-			raise portage_exception.DigestException, "Size mismatch during checksums"
-		mydigests[x]["size"] = copy.deepcopy(mysize)
-	return mydigests
-
-def digestCreateLines(filelist, mydict):
-	mylines = []
-	mydigests = copy.deepcopy(mydict)
-	for myarchive in filelist:
-		mysize = mydigests[myarchive]["size"]
-		if len(mydigests[myarchive]) == 0:
-			raise portage_exception.DigestException, "No generate digest for '%(file)s'" % {"file":myarchive}
-		for sumName in mydigests[myarchive].keys():
-			if sumName not in portage_checksum.get_valid_checksum_keys():
-				continue
-			mysum = mydigests[myarchive][sumName]
-
-			myline  = sumName[:]
-			myline += " "+mysum
-			myline += " "+myarchive
-			myline += " "+str(mysize)
-			mylines.append(myline)
-	return mylines
-
-def digestgen(myarchives,mysettings,overwrite=1,manifestonly=0):
+def digestgen(myarchives,mysettings,db=None,overwrite=1,manifestonly=0):
 	"""generates digest file if missing.  Assumes all files are available.	If
-	overwrite=0, the digest will only be created if it doesn't already exist."""
-
-	# archive files
-	basedir=mysettings["DISTDIR"]+"/"
-	digestfn=mysettings["FILESDIR"]+"/digest-"+mysettings["PF"]
-
-	# portage files -- p(ortagefiles)basedir
-	pbasedir=mysettings["O"]+"/"
-	manifestfn=pbasedir+"Manifest"
-
-	if not manifestonly:
-		if not os.path.isdir(mysettings["FILESDIR"]):
-			os.makedirs(mysettings["FILESDIR"])
-		mycvstree=cvstree.getentries(pbasedir, recursive=1)
-
-		if ("cvs" in features) and os.path.exists(pbasedir+"/CVS"):
-			if not cvstree.isadded(mycvstree,"files"):
-				if "autoaddcvs" in features:
-	print  Auto-adding files/ dir to CVS...
-	spawn(cd +pbasedir+; cvs