Robin Becker wrote:
> Martin v. Löwis wrote:
>
> 0 the ideal hash
>
> :)
>
> can't be argued with
>
>> .......
>> So: what are your input data, and what is the
>> distribution among them?
>>
>> Regards,
>> Martin
>>
> I'm trying to create UniqueID's for dynamic postscript fonts. According
> to my resources we don't actually need to use these, but if they are
> required by a particular postscript program (perhaps to make a print run
> efficient) then the private range of these ID's is 4000000<=UID<=4999999
> ie a range of one million.
>
> So I probably really need an 18 bit hash
>
> The data going into the font consists of
>
> fontBBox '[-415 -431 2014 2033]'
> charmaps ['dup (\000) 0 get /C0 put',......]
> metrics ['/C0 1251 def',.....]
> bboxes ['/C29 [0 0 512 0] def',.......]
> chardefs ['/C0 {newpath 224 418 m 234 336 ......def}',......]
>
> ie a bunch of lists of strings which are eventually joined together and
> written out with a template to make the postscript definition.
>
> The UniqueID is used by PS interpreters to avoid recreating particular
> glyphs so ideally I would number these fonts sequentially using a global
> count, but in practice several processes separated by application and
> time can produce postscript which eventually gets merged back together.
>
> If the UID's clash then the printer produces very strange output.
>
> I'm fairly sure there's no obvious python way to ensure the separated
> processes can communicate except via the printer. So either I use a
> python based scheme which reduces the risk of clashes ie random or some
> data based hash scheme or I attempt to produce a postscript solution
> like looking for a private global sequence number.
>
> I'm not sure my postscript is really good enough to do the latter so I
> hoped to pursue a python based approach which has a low probability of
> busting. Originally I thought the range was a 16bit number which is why
> I started with 16bit hashes.

For identifying something like this, I suggest you use a hash function such as SHA-1 and truncate the digest to as many bits as you can use, similar to what Jon Ribbens suggested.
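
A minimal sketch of what that could look like, assuming the font data is available as a flat list of strings before it is joined into the template. The names font_unique_id and clash_probability are made up for illustration; only hashlib.sha1 is a real library call here. Reducing the digest modulo the size of the private range is one way of "truncating" that uses the whole 4000000..4999999 range rather than a power-of-two slice of it.

```python
import hashlib
import math

UID_MIN = 4000000
UID_MAX = 4999999
UID_RANGE = UID_MAX - UID_MIN + 1  # one million private UniqueIDs


def font_unique_id(parts):
    """Map the font's definition strings to a UniqueID in the private range.

    `parts` is assumed to be an iterable of str (fontBBox, charmaps,
    metrics, bboxes, chardefs, flattened in a fixed order).
    """
    h = hashlib.sha1()
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each string so ["ab", "c"] and ["a", "bc"]
        # do not feed identical bytes to the hash.
        h.update(str(len(data)).encode("ascii") + b":")
        h.update(data)
    # "Truncate" the 160-bit digest by reducing it modulo the range size.
    return UID_MIN + int.from_bytes(h.digest(), "big") % UID_RANGE


def clash_probability(n, size=UID_RANGE):
    """Birthday-bound estimate that n fonts produce at least one clash."""
    return -math.expm1(-n * (n - 1) / (2.0 * size))


parts = [
    "[-415 -431 2014 2033]",
    "dup (\\000) 0 get /C0 put",
    "/C0 1251 def",
]
uid = font_unique_id(parts)
```

The same font data always gives the same UID, so separate processes need no communication. The cost is that clashes are only improbable, not impossible: clash_probability(n) gives a rough idea of how quickly the risk grows with the number of distinct fonts that end up in one merged job.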