On Wed, 5 Sep 2012 18:48:00 +0200 Lionel Cons wrote:
> On 5 September 2012 17:00, Glenn Fowler <[email protected]> wrote:
> >
> > (1) are we talking about libast mktemp(3) or ksh mktemp(1)
> AST mktemp as plain command. We replaced the machine's native
> /usr/bin/mktemp with AST mktemp since GNU coreutils and (especially!)
> the Solaris /usr/bin/mktemp are prone to even more collisions (Solaris
> mktemp in Solaris 2.6-10 and 11 (Opensolaris didn't have the problem
> since it used the ksh93 mktemp) suffers from printing random garbage
> in rare occasions, too).
> > (2) did the original temp file exist when the dup name was generated
> I don't know. I have to ask. I'm just the messenger.
roland is correct that the low level ast routine is pathtemp()
the reason for question (2) is that pathtemp() has a collision detection loop
and mktemp(1) calls pathtemp() with an fd pointer that instructs it to
create the temp file with
open(path, O_CREAT|O_RDWR|O_EXCL, mode)
if this fails then another pseudo-random tmp path is generated
until there is no collision
even if the range of the generated paths were limited, pathtemp(3) as called
by mktemp(1) should never return success on a path that already exists
unless mktemp were called with --unsafe or with --directory
pathtemp() in this mode could fail (by never returning) if the entire range
were covered by existing files -- it would loop indefinitely, trying
random paths in the hope of hitting an unused name
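The create-or-retry loop described above can be sketched in shell. This is only an illustration under stated assumptions, not the pathtemp() source: it uses the shell noclobber option (set -C), which shells such as bash and dash implement with an exclusive create, as a stand-in for open(path, O_CREAT|O_RDWR|O_EXCL, mode); the function name create_first_free is hypothetical; and the candidate names are passed in as a finite list rather than generated pseudo-randomly, so the loop gives up instead of spinning when every name is taken.

```shell
# Hypothetical helper: create_first_free DIR NAME...
# Tries each candidate name in turn, creates the first one that does not
# already exist, and prints its path. Returns 1 if every candidate collides.
create_first_free() {
    dir=$1
    shift
    for name in "$@"; do
        path=$dir/$name
        # With noclobber set, ">" fails if the path already exists, so the
        # existence check and the create happen as a single step.
        if (set -C; : > "$path") 2>/dev/null; then
            printf '%s\n' "$path"
            return 0
        fi
        # Collision: fall through and try the next candidate.
    done
    return 1
}
```

For example, `create_first_free /tmp a b c` creates and prints the first of /tmp/a, /tmp/b and /tmp/c that does not already exist.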
since the collision in your case was after 96 hours I don't think it's
any kind of weird filesystem timing problem
can you send the mktemp command line used?
I looked at the pathtemp() code and the range can be improved by switching
from base 32 (should have been 36! : [0-9]+[a-z]) to base 62 ([0-9]+[a-z]+[A-Z])
for the numeric representation of the pseudorandom hash
and by fixing the mktemp(1) user supplied prefix logic to use more of the hash
when the prefix length is less than the max 5 chars
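To make the base change concrete, here is a minimal sketch of encoding a number with a 0-9, a-z, A-Z digit table. The function name to_base and the digit order are assumptions for illustration; the actual pathtemp() encoding may differ in detail.

```shell
# Hypothetical helper: to_base NUMBER BASE
# Prints a non-negative integer in the given base (2..62) using the
# assumed digit order 0-9, a-z, A-Z.
to_base() {
    n=$1
    base=$2
    digits=0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
    out=
    while :; do
        # cut counts characters from 1, so add 1 to the digit value
        ch=$(printf '%s' "$digits" | cut -c "$((n % base + 1))")
        out=$ch$out
        n=$((n / base))
        [ "$n" -gt 0 ] || break
    done
    printf '%s\n' "$out"
}
```

With these assumptions, `to_base 4294967295 32` gives a 7-character string and `to_base 4294967295 62` gives a 6-character one, so the same number of name characters carries more of the hash in base 62.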
I ran this test with the old and new pathtemp()
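ksh -c '
    integer n
    typeset f
    typeset -A seen
    builtin mktemp
    # n and seen are never reset, so the counts are cumulative across prefixes
    for p in _____ ____ ___ __ _ ""
    do  while :
        do  ((n++))
            f=$(mktemp -u "$p")
            if  [[ ${seen[$f]} ]]
            then    # name, call that first produced it, call that repeated it
                printf "%11s %9d %9d\n" "$f" "${seen[$f]}" "$n"
                break
            fi
            seen[$f]=$n
        done
    done
'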
the test uses -u (--unsafe) so it doesn't generate any temp files
mktemp -u does still check for an existing path using access(2), but that's
not a factor for the test
the results show that the old alg collides after ~10^4 calls regardless of
prefix size
and the new alg collides after ~10^6 calls with prefix length 0
the new results are better for avoiding predictability
and they also make the collision detection loop more efficient
with mktemp prefix "":
the old alg is limited to 32^5 = ~10^7 different names
the new alg is limited to 32^10 = ~10^15 different names
using the max #X's (14) in the template notation
mktemp ${prefix}XXXXXXXXXXXXXX
the old and new alg are limited to 32^14 = ~10^21 different names
again, the test calls mktemp with -u (collision detection disabled),
so the results show collisions in name generation only
old
       name     first    repeat
_____3e.3vj      2657      3178
 ____2c.vks      3828      4723
  ___0c.7td      4814      5268
   __1i.2q7      8004     10667
    _26.6kb     13720     14311
     3n.dc5     16113     16578
new
       name     first    repeat
_____03.44w        76       359
____041.2Nu      2460      2890
___02Vu.1De      7965     23963
__02eD1.3p1    144094    246462
_04p47i.3E5    316344    855852
04yrNdG.0NY   1375224   2357065
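As a rough sanity check on the figures above, the usual birthday approximation says that drawing uniformly at random from N possible names gives a first repeat after about sqrt(pi*N/2) draws. The sketch below evaluates that with awk; the function name birthday_estimate is hypothetical, and uniform, independent names are an assumption that the real hash may not satisfy.

```shell
# Hypothetical helper: birthday_estimate N
# Prints the approximate number of uniform random draws from N names
# before a repeat is expected, using sqrt(pi * N / 2).
birthday_estimate() {
    awk -v n="$1" 'BEGIN { printf "%.0f\n", sqrt(3.141592653589793 * n / 2) }'
}

# 32^5 names, the limit quoted for the old alg with an empty prefix
birthday_estimate 33554432
```

For N = 32^5 = 33554432 this gives roughly 7260, the same order of magnitude as the old results. For N = 32^10 it gives a figure in the 10^7 range, higher than the new results observed, so the uniform model is at best a loose guide there.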
I looked further into the code and also modified pathtemp() to
call mkdir() atomically rather than open(O_CREAT) if the prefix
arg ends with "/"
this will move the mktemp --directory collision detection from
mktemp(1) to pathtemp(3)
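A minimal shell sketch of collision detection through mkdir, assuming a finite list of candidate names passed by the caller (the real code generates them pseudo-randomly) and a hypothetical function name mkdir_first_free. mkdir without -p fails when the name already exists, so the existence check and the create are one step.

```shell
# Hypothetical helper: mkdir_first_free DIR NAME...
# Tries each candidate name in turn, creates the first directory whose
# name is not already taken, and prints its path. Returns 1 if all collide.
mkdir_first_free() {
    dir=$1
    shift
    for name in "$@"; do
        # No -p here: an existing path must count as a collision.
        if mkdir -m 700 "$dir/$name" 2>/dev/null; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

Because the create itself reports the collision, no separate existence test is needed before the call, which avoids a window between checking and creating.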