jlh <[EMAIL PROTECTED]> wrote:

> Ok, here's an update. I was able to track down the cause of this
> problem. To match file names against patterns, tar uses
> fnmatch(3), which is provided by glibc. This happens in
> lib/exclude.c:149:exclude_fnmatch(). fnmatch() is documented to
> return 0 on a successful match, FNM_NOMATCH (defined to be 1) on a
> non-match, and anything else on error. exclude_fnmatch() only
> compares the return value to 0 and thus treats a non-match and an
> error the same way. The particular problem I'm experiencing
> triggers an error: fnmatch() returns -1, and perror() says
> "Invalid or incomplete multibyte or wide character". The message
> is correct, since the byte is invalid in UTF-8, but I was under
> the impression that a path component may consist of any sequence
> of non-NUL, non-slash bytes. Since fnmatch() is specifically aimed
> at matching paths, I would think it should also handle the case
> where a path component contains arbitrary bytes. I've been able to
> reproduce this error in a small stand-alone test case that calls
> fnmatch(), so this is not a tar problem anymore (except that tar
> doesn't check for errors). I will take it to the glibc list.
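To illustrate the three outcomes described above, here is a minimal sketch of a wrapper that keeps an error distinct from a non-match. The names match_result and classify_fnmatch are hypothetical and are not taken from tar or gnulib; only fnmatch() and FNM_NOMATCH come from <fnmatch.h>.

```c
#include <fnmatch.h>

/* Hypothetical tri-state result; not part of tar or gnulib. */
enum match_result { MATCH_YES, MATCH_NO, MATCH_ERROR };

/* Hypothetical helper: classify fnmatch()'s return value instead of
   collapsing it into a bool. */
enum match_result
classify_fnmatch (const char *pattern, const char *name, int flags)
{
  int r = fnmatch (pattern, name, flags);

  if (r == 0)
    return MATCH_YES;           /* the name matches the pattern */
  if (r == FNM_NOMATCH)
    return MATCH_NO;            /* a clean non-match */
  return MATCH_ERROR;           /* any other value signals an error */
}
```

A caller that must still return bool would have to decide explicitly what MATCH_ERROR maps to, which is the open question here.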
Thanks for reporting. I am not sure what exclude_fnmatch is supposed to
return on error, since it returns bool, so I'm CC-ing this to
[EMAIL PROTECTED]

> One other comment: I also noticed that tar makes the call to
> fnmatch with the flag value 0x50000008 in this particular case.
> The low bit corresponds to the flag FNM_LEADING_DIR, but the two
> high bits have no meaning to fnmatch() as far as I can see,
> they're only used by tar itself for internal use. Does it say
> somewhere that one may set undefined bits in flags and expect
> things to still work?

It works with fnmatch from gnulib, but I agree that it is a risky thing
to do with an arbitrary third-party fnmatch implementation.

Regards,
Sergey
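One way to avoid passing undefined bits is to mask the options before the call. The following is only a sketch under assumptions: KNOWN_FNM_FLAGS and fnmatch_known_flags are hypothetical names, the set of flags in the mask is my guess at what a libc fnmatch() accepts, and this is not what tar or gnulib currently does.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE   /* glibc: exposes FNM_LEADING_DIR and FNM_CASEFOLD */
#endif
#include <fnmatch.h>

/* Assumption: these are the only bits the underlying fnmatch() knows.  */
#define KNOWN_FNM_FLAGS \
  (FNM_PATHNAME | FNM_NOESCAPE | FNM_PERIOD | FNM_LEADING_DIR | FNM_CASEFOLD)

/* Hypothetical wrapper: drop application-internal bits (such as the
   high bits in 0x50000008) before handing the flags to fnmatch().  */
int
fnmatch_known_flags (const char *pattern, const char *name, int options)
{
  return fnmatch (pattern, name, options & KNOWN_FNM_FLAGS);
}
```

With the value quoted above, 0x50000008 & KNOWN_FNM_FLAGS leaves only FNM_LEADING_DIR set.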