Wed Dec 28 06:49:51 2011: Request 73491 was acted upon. Transaction: Correspondence added by RSCHUPP Queue: PAR-Packer Subject: cache directory naming problem Broken in: 1.012 Severity: (no value) Owner: RSCHUPP Requestors: gyp...@gmail.com Status: open Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=73491 >
On 2011-12-27 01:31:33, gyp...@gmail.com wrote: > I looked into the error message and I found that the original username > is changed to be an illegal byte sequence: > * from 0x ba ce be fb c0 cc (3character, 6bytes, encoded with cp949) > * to 0x ba ce be 5f c0 cc ... > or, equivalent C-code in mktmpdir.c: > > /* replace all non-alphanumeric letters with '_' */ > for ( c = username ; *c != '\0' ; c++ ) { > if ( !isalnum(*c) ) { > *c = '_'; > } > } Excellent analysis. There's actually a problem with the C version: the type of c is char* and char is signed on Intel i386. At least the Mingw32 implementation of isalnum() doesn't correctly account for that and marks only one of the 6 bytes above as !isalnum. It should've marked them all, resulting in "______". That wouldn't have caused an error when creating the cache directory, but has a high probability for collision with another username (though that wouldn't have shown in a default Windows environment where the temp directory is per-user anyway). > Would you please check it? I think that it would be good idea to use "%- > encoding" for all non-latin characters, so that the username in the > screenshot would be changed into "par-%ba%ce%be%fb%c0%cc". ('%' may be > removed) Yeah, but what is a "non-latin" character if we don't consider the charset? I think we shouldn't make any assumption about it and simply encode _all_ bytes unconditionally in username as two hex bytes. The only thing we loose is easy recognizability of which cache directory belongs to which user. >From cursory looking at CP949 (or EUC-KR) I believe that the sequence of bytes "par-bacebefbc0cc" is a legal string in CP949, right? That should also work with ASCII and all ISO Latin encodings, as well as EUC-CN and EUC-JN and UTF-8. Could you please try the two attached patches (the first is for PAR, the second for PAR::Packer)? Cheers, Roderich
Index: PAR/lib/PAR/SetupTemp.pm =================================================================== --- PAR/lib/PAR/SetupTemp.pm (revision 1336) +++ PAR/lib/PAR/SetupTemp.pm (working copy) @@ -98,7 +98,11 @@ qw( C:\\TEMP /tmp . ) ) { next unless defined $path and -d $path and -w $path; - $temp_path = File::Spec->catdir($path, "par-$username"); + # create a temp directory that is unique per user + # NOTE: $username may be in an unspecified charset/encoding; + # use a name that hopefully works for all of them; + # also avoid problems with platform-specific meta characters in the name + $temp_path = File::Spec->catdir($path, "par-".unpack("H*", $username)); ($temp_path) = $temp_path =~ /^(.*)$/s; unless (mkdir($temp_path, 0700) || $!{EEXIST}) { warn "creation of private subdirectory $temp_path failed (errno=$!)"; @@ -140,7 +144,6 @@ else { $username = $ENV{USERNAME} || $ENV{USER} || 'SYSTEM'; } - $username =~ s/\W/_/g; return $username; }
Index: PAR-Packer/myldr/mktmpdir.c =================================================================== --- PAR-Packer/myldr/mktmpdir.c (revision 1336) +++ PAR-Packer/myldr/mktmpdir.c (working copy) @@ -76,7 +76,6 @@ char *par_mktmpdir ( char **argv ) { int i; - char *c; const char *tmpdir = NULL; const char *key = NULL , *val = NULL; @@ -112,6 +111,7 @@ DWORD buflen = MAXPATHLEN; username = malloc(MAXPATHLEN); GetUserName((LPTSTR)username, &buflen); + // FIXME this is uncondifionally overwritten below - WTF? } #endif @@ -123,18 +123,17 @@ username = strdup(val); } } - - if ( username == NULL ) { + if ( username == NULL ) username = "SYSTEM"; + + /* sanitize username: encode all bytes as 2 hex digits */ + { + char *hexname = malloc(2 * strlen(username) + 1); + char *u, *h; + for ( u = username, h = hexname ; *u != '\0' ; u++, h += 2) + sprintf(h, "%02x", *(unsigned char*)u); + username = hexname; } - else { - /* replace all non-alphanumeric letters with '_' */ - for ( c = username ; *c != '\0' ; c++ ) { - if ( !isalnum(*c) ) { - *c = '_'; - } - } - } /* Try temp environment variables */ for ( i = 0 ; tmpdir == NULL && (key = temp_keys[i]); i++ ) {