Michael Wagner:

Decoding the UTF-8 encoded file name (again with an additional print
statement):

$ REQUEST_METHOD=GET \
  QUERY_STRING='p=notes.git;a=blob_plain;f=work/G%C3%83%C2%BCtekriterien.txt;hb=HEAD' \
  ./gitweb.cgi

work/Gütekriterien.txt
Content-disposition: inline; filename="work/Gütekriterien.txt"

You should fix the code path that created that URI, though, as it is not what you expected.

%C3%83 decodes to U+00C3 Latin Capital Letter A With Tilde
%C2%BC decodes to U+00BC Vulgar Fraction One Quarter
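
To check this yourself, here is a minimal Python sketch (not part of gitweb, which is Perl) that percent-decodes the file name from the query string as UTF-8 and prints the code point and Unicode name of each non-ASCII character. The helper name describe_non_ascii is made up for this illustration.

```python
import unicodedata
from urllib.parse import unquote


def describe_non_ascii(escaped):
    """Percent-decode `escaped` as UTF-8 and describe each non-ASCII character.

    Hypothetical helper for illustration only.
    """
    decoded = unquote(escaped, encoding="utf-8")
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch)}"
        for ch in decoded
        if ord(ch) > 0x7F
    ]


if __name__ == "__main__":
    for line in describe_non_ascii("work/G%C3%83%C2%BCtekriterien.txt"):
        print(line)
```

Run against the file name from the request above, this should list two characters rather than the single ü you intended.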

The proper UTF-8 encoding for ü (U+00FC) is, as you can probably guess from looking at which two characters the sequence above yielded, C3 BC, which in a URI is represented as %C3%BC.

Your QUERY_STRING should thus be

  p=notes.git;a=blob_plain;f=work/G%C3%BCtekriterien.txt;hb=HEAD

which probably works as expected.
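
As a cross-check, the expected escaping can be reproduced with Python's standard library. This is only a sketch of what a correct URI generator should emit for the f= parameter; whatever actually builds your URI may use a different mechanism.

```python
from urllib.parse import quote

# quote() encodes the string as UTF-8 by default and leaves "/" unescaped.
path = "work/G\u00fctekriterien.txt"  # "\u00fc" is U+00FC, the letter ü
escaped = quote(path)

if __name__ == "__main__":
    print(escaped)
```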

What is happening is that whatever is generating the URI is UTF-8-encoding the string twice (i.e., it generates a string with the proper C3 BC in it, then interprets that as ISO-8859-1 data and runs it through a UTF-8 encoder again, yielding the C3 83 C2 BC sequence you see above).
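
The double encoding can be reproduced in a few lines of Python. This is a sketch of the suspected mistake, not the code that generated your URI; the function names double_encode and undo_double_encode are invented for the example.

```python
def double_encode(text):
    """Reproduce the bug: UTF-8 encode, misread the bytes as ISO-8859-1,
    then UTF-8 encode again. Hypothetical helper for illustration."""
    once = text.encode("utf-8")
    misread = once.decode("iso-8859-1")
    return misread.encode("utf-8")


def undo_double_encode(data):
    """Reverse the steps above to recover the original string."""
    return data.decode("utf-8").encode("iso-8859-1").decode("utf-8")


if __name__ == "__main__":
    broken = double_encode("\u00fc")  # U+00FC, the letter ü
    print(broken.hex(" "))
    print(undo_double_encode(broken))
```

Reversing the steps is useful for diagnosing the problem, but the real fix is still to remove the second encoding pass from whatever produces the URI.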

--
\\// Peter - http://www.softwolves.pp.se/