I don't think Python is the only one with that problem; try saving a file with
non-UTF-8 characters in its name in Subversion and see what happens.

We should be liberal in what we accept and strict in what we send, as we really
don't know what the filesystem will return to us. I guess a file name read from
the filesystem could have PathBinary as its default object type, and the
programmer would have the option of converting it.

But you are right, we have the same problem with Perl 5 today:

my $file = readdir $dir;

Should $file have the utf8 flag set if my locale is set to utf8?

Or should I have to do:

$file = eval { decode("utf8", $file, Encode::FB_CROAK) };

every time I get a filename?
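
To make the question concrete, here is a rough sketch of what that per-filename dance could look like in Perl 5 if we wanted to stay liberal in what we accept. The helper name decode_filename is made up for this mail, and I am assuming that "not valid UTF-8" should mean "keep the raw bytes" rather than "throw the name away":

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: try a strict UTF-8 decode of a file name.
# Returns (1, $text) on success and (0, $bytes) when the name is not valid
# UTF-8, so the caller never loses a directory entry.
sub decode_filename {
    my ($bytes) = @_;
    # Work on a copy: decode() with a CHECK argument may modify its input.
    my $copy = $bytes;
    my $text = eval { decode("UTF-8", $copy, Encode::FB_CROAK) };
    return defined $text ? (1, $text) : (0, $bytes);
}

opendir(my $dir, ".") or die "Cannot open directory: $!";
while (defined(my $file = readdir $dir)) {
    my ($ok, $name) = decode_filename($file);
    # Report only status and length, so nothing undecodable is printed.
    printf "%d chars, %s\n", length($name), $ok ? "decoded" : "raw bytes";
}
closedir $dir;
```

The point is only that every entry stays reachable: a name that fails to decode is still returned, just as bytes.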

Regards Troels.

On Tue, Aug 18, 2009 at 16:37, Nicholas Clark<n...@ccl4.org> wrote:
> On Tue, Aug 18, 2009 at 01:10:58PM +0200, Jan Ingvoldstad wrote:
>> On Tue, Aug 18, 2009 at 12:54 PM, Troels Liebe Bentsen<t...@rapanden.dk> wrote:
>
>> > Besides that, a simple check on Unix for what the locale is set to might
>> > also be nice, so we don't write UTF8 files on a filesystem where the rest
>> > of the files are in Latin1.
>>
>> The locale doesn't say what format the filenames are on the
>> filesystem, though, merely the current user's language preferences may
>> be.
>
> We don't want to make the same mistakes as Python 3:
>
> http://mail.python.org/pipermail/python-dev/2008-December/083856.html
>
> The summary is that different file names in the same directory might be
> in different encodings, and your programming language runtime sucks big time
> if it doesn't offer you a way to iterate over all of them somehow, even if
> you can't render their names.
>
> [Consider a security critical program scanning using glob('*'), which gives
> a clean bill of health because it opened "all" files and found no problems.]
>
> I don't know how Python 3 resolved this.
>
> Nicholas Clark
>
