Dmitrij had some questions about my intent; I'll try to clarify.
2014/12/02 18:57 Joel Rees joel.r...@gmail.com:
(apologies for the html.)
2014/12/02 9:52 Dmitrij D. Czarkoff czark...@gmail.com:
[ ... and others
Snipped context:
There was some discussion of what kind of file names should be
Joel Rees writes:
You can even handle broken UTF-8 and unconverted UTF-16/32 of whatever byte
order spit into the file name as a sequence of bytes if and only if you
escape NUL, slash, and your escape character properly, restoring the
escaped characters when putting the file names on the
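A minimal sketch of the escaping idea in the message above, treating a file name as an opaque byte sequence. The choice of `%` as the escape character and the two-digit hex encoding are assumptions made for illustration; the thread does not specify either.

```python
ESC = 0x25  # '%': a hypothetical choice of escape character, not from the thread


def escape_name(raw: bytes) -> bytes:
    """Escape NUL, slash, and the escape character itself as ESC + two hex digits."""
    out = bytearray()
    for b in raw:
        if b in (0x00, 0x2F, ESC):
            out.append(ESC)
            out += b"%02X" % b
        else:
            out.append(b)  # every other byte passes through, valid UTF-8 or not
    return bytes(out)


def unescape_name(escaped: bytes) -> bytes:
    """Restore the original bytes; raises ValueError on a malformed escape."""
    out = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == ESC:
            out.append(int(escaped[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(escaped[i])
            i += 1
    return bytes(out)
```

Because only three byte values are rewritten, broken UTF-8 or UTF-16 bytes survive a round trip unchanged, and the escaped form never contains NUL or slash.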
Anthony J. Bentley said:
I haven't used Apple OSes since around 10.4, but Mac OS X was doing a
thing where certain well-known directory names were aliased according to
the current locale. For instance, the user's music directory was shown
as 「音楽」 when the locale was set to ja_JP.UTF-8.
On Wed, Dec 3, 2014 at 9:09 PM, Dmitrij D. Czarkoff czark...@gmail.com wrote:
Anthony J. Bentley said:
I haven't used Apple OSes since around 10.4, but Mac OS X was doing a
thing where certain well-known directory names were aliased according to
the current locale. For instance, the user's
First of all, I really don't believe that preservation of non-canonical
form should be a consideration for any software. There is no single
reason to allow non-canonical forms to exist at all, while there are
several reasons to avoid them. More so for foreign encodings in
filenames - if you are
2014/12/03 22:23 Dmitrij D. Czarkoff czark...@gmail.com:
First of all, I really don't believe that preservation of non-canonical
form should be a consideration for any software.
There is no particular canonical form for some kinds of software.
Unix, in particular, happens to have file name
Joel Rees writes:
2014/12/03 22:23 Dmitrij D. Czarkoff czark...@gmail.com:
First of all, I really don't believe that preservation of non-canonical
form should be a consideration for any software.
There is no particular canonical form for some kinds of software.
Unix, in particular,
Joel Rees writes:
2014/12/03 22:23 Dmitrij D. Czarkoff czark...@gmail.com:
First of all, I really don't believe that preservation of non-canonical
form should be a consideration for any software.
There is no particular canonical form for some kinds of software.
Unix, in particular,
Joel Rees said:
Maybe it would be better just to not make those directories until they
are needed by an application, and then ask the user to name them
instead of providing standard names.
Actually, it is still workable if you carry your ~/.config/user-dirs.dir
around, so that you could
(apologies for the html.)
2014/12/02 9:52 Dmitrij D. Czarkoff czark...@gmail.com:
Joel Rees said:
Now, what would you do with this?
ジョエル
Why not decompose it to the following?
ｼﾞｮｴﾙ
Because it is not what Unicode normalization is.
Well, it definitely isn't
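The exchange above turns on the difference between canonical and compatibility normalization. As a rough illustration, assuming the two strings under discussion are the fullwidth and halfwidth katakana spellings of the same name, Python's `unicodedata.normalize` shows that NFD never produces halfwidth forms, while the compatibility forms (NFKC/NFKD) fold halfwidth into fullwidth. The helper name `forms` is made up for this sketch.

```python
import unicodedata

FULLWIDTH = "\u30b8\u30e7\u30a8\u30eb"        # fullwidth katakana spelling
HALFWIDTH = "\uff7c\uff9e\uff6e\uff74\uff99"  # halfwidth katakana spelling


def forms(text: str) -> dict:
    """Return the four Unicode normalization forms of text."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


# NFD splits the voiced first syllable into a base letter plus U+3099
# (combining voiced sound mark); it does not switch to halfwidth letters.
nfd_fullwidth = forms(FULLWIDTH)["NFD"]

# Halfwidth letters have only compatibility mappings, so NFD leaves them
# alone, while NFKC maps them to the fullwidth spelling.
nfkc_halfwidth = forms(HALFWIDTH)["NFKC"]
```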
Hi Ingo,
Ingo Schwarze writes:
While the article is old, the essence of what Schneier said here
still stands, and it is not likely to fall in the future:
https://www.schneier.com/crypto-gram-0007.html#9
The most interesting sentence here is:
Unicode is just too complex to ever be secure.
On Sat, Nov 29, 2014 at 09:48:53PM +0100, Dmitrij D. Czarkoff wrote:
That said, the standard provides just enough facilities to make
filesystem-related aspects of Unicode work nicely, particularly in case
of utf-8. Eg. ability to enforce NFD for all operations on file names
could actually
pizdel...@gmail.com said:
How do you 'enforce' NFD?
Let the kernel normalize (ie /destructively/ transform) the file names
behind user's back, so that a file will be listed with a different name
than that with which it was created? That's very nice and secure, indeed.
I would enforce
2014-12-01 10:20 GMT+01:00 Dmitrij D. Czarkoff czark...@gmail.com:
pizdel...@gmail.com said:
How do you 'enforce' NFD?
Let the kernel normalize (ie /destructively/ transform) the file names
behind user's back, so that a file will be listed with a different name
than that with which it
On Mon, Dec 01, 2014 at 10:38:40AM +0200, pizdel...@gmail.com wrote:
On Sat, Nov 29, 2014 at 09:48:53PM +0100, Dmitrij D. Czarkoff wrote:
That said, the standard provides just enough facilities to make
filesystem-related aspects of Unicode work nicely, particularly in case
of utf-8. Eg.
On Mon, Dec 01, 2014 at 10:20:08AM +0100, Dmitrij D. Czarkoff wrote:
I would enforce normalization at filename access time (open(), fopen(),
readdir(), etc). Yes, destructively transform. I would reject
filenames that won't decode. If this is documented, I just don't see
how it is behind
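A user-space sketch of the policy proposed above: normalize at access time and reject names that will not decode. This is not how any kernel or OpenBSD implements it. The function name `normalize_name` is hypothetical, and the choice of NFD follows the earlier messages in the thread.

```python
import unicodedata


def normalize_name(raw: bytes) -> bytes:
    """Return the NFD-normalized UTF-8 form of a file name.

    Raises UnicodeDecodeError for names that are not valid UTF-8,
    which models "reject filenames that won't decode".
    """
    text = raw.decode("utf-8")  # strict decoding: invalid input is rejected
    return unicodedata.normalize("NFD", text).encode("utf-8")


# Two spellings of the same visible name collapse to one stored form:
precomposed = "\u00e9".encode("utf-8")  # e-acute as a single code point
decomposed = "e\u0301".encode("utf-8")  # 'e' followed by a combining acute accent
same = normalize_name(precomposed) == normalize_name(decomposed)
```

The transformation is destructive in the sense raised earlier in the thread: a file created under the precomposed spelling would be listed under the decomposed one.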
Stefan Sperling said:
Bad idea. See my other post. Apple did this and broke existing applications.
OpenBSD changed time_t and broke existing applications, but hardly
anyone thinks it was a bad idea. Fancy filenames are long known to be
problematic, so filename policy enforcement is a breakage
2014-12-01 12:05 GMT+01:00 Dmitrij D. Czarkoff czark...@gmail.com:
Stefan Sperling said:
Bad idea. See my other post. Apple did this and broke existing
applications.
OpenBSD changed time_t and broke existing applications, but hardly
anyone thinks it was a bad idea. Fancy filenames are
Janne Johansson said:
There is quite a bit of difference between changing the storage format and
making some dates impossible that previously did work.
Don't think so. Something got changed, things got broken and need to be
fixed. The only real question is: is the change worth the trouble. I
On Mon, Dec 1, 2014 at 8:43 PM, Dmitrij D. Czarkoff czark...@gmail.com wrote:
Janne Johansson said:
There is quite a bit of difference between changing the storage format and
making some dates impossible that previously did work.
Don't think so. Something got changed, things got broken and
Joel Rees said:
Hmm. What would you suggest doing with the following file name?
/etc
(You may need a Japanese font to display it.)
If you try to normalize it on a *nix box, it will hopefully conflict
with your system file permissions. But, then what do you do with it?
If you throw it
On Mon, Dec 01, 2014 at 12:43, Dmitrij D. Czarkoff wrote:
Janne Johansson said:
There is quite a bit of difference between changing the storage format and
making some dates impossible that previously did work.
Don't think so. Something got changed, things got broken and need to be
fixed.
Joel Rees, 01 Dec 2014 22:04:
Hmm. What would you suggest doing with the following file name?
/etc
(You may need a Japanese font to display it.)
If you try to normalize it on a *nix box, it will hopefully conflict
with your system file permissions. But, then what do you do with it?
Stefan Sperling, 29 Nov 2014 18:17:
Are you aware of 'detox' package?
There's also converters/convmv
$ touch »´ÁÉǑÄ«
$ convmv *
wrong/unknown from encoding!
$ convmv -f utf8 -t latin1 *
Starting a dry run without changes...
iso-8859-1 doesn't cover all needed characters for: ./»´ÁÉǑÄ«
To
On Mon, Dec 1, 2014 at 11:13 PM, Dmitrij D. Czarkoff czark...@gmail.com wrote:
Joel Rees said:
Hmm. What would you suggest doing with the following file name?
/etc
(You may need a Japanese font to display it.)
If you try to normalize it on a *nix box, it will hopefully conflict
with your
Ted Unangst writes:
On Mon, Dec 01, 2014 at 12:43, Dmitrij D. Czarkoff wrote:
Janne Johansson said:
There is quite a bit of difference between changing the storage format and
making some dates impossible that previously did work.
Don't think so. Something got changed, things got
Joel Rees said:
Now, what would you do with this?
ジョエル
Why not decompose it to the following?
ｼﾞｮｴﾙ
Because it is not what Unicode normalization is.
I know what the Unicode rules say, but my boss says, if I'm going to
play with file names, he wants it done his way.
And now you
Joel Rees said:
That said, the standard provides just enough facilities to make
filesystem-related aspects of Unicode work nicely, particularly in case
of utf-8. Eg. ability to enforce NFD for all operations on file names
could actually make several things more secure by preventing homograph
Thomas Bohl said:
# ls | cat
Will display the characters right.
Not entirely sure why though.
From ls(1) manual:
| -q Force printing of non-graphic characters in file names as the
| character `?'; this is the default when output is to a terminal.
--
Dmitrij D. Czarkoff
On Sun, Nov 30, 2014 at 6:31 PM, Dmitrij D. Czarkoff czark...@gmail.com wrote:
Joel Rees said:
That said, the standard provides just enough facilities to make
filesystem-related aspects of Unicode work nicely, particularly in case
of utf-8. Eg. ability to enforce NFD for all operations on
On 2014-11-29, Ingo Schwarze schwa...@usta.de wrote:
But Unicode must never be allowed near anything that might get
executed as program code, including scripts in interpreted languages,
including, but not limited to, the shell. In particular, that means
trying to handle Unicode in filenames
/read/renamed/deleted
without problems.
is it true to say then, that ffs is entirely utf8 safe,
and/or that ffs is actually an utf-8 encoded filesystem
as IIRC Mac OS is? or is it some kind of happy accident
that it works? :)
-f
--
mips = meaningless index of processor speed
Hello,
On 29 November 2014 at 14:02, frantisek holop min...@obiit.org wrote:
i have written for myself a small python3 script that
removes accented characters and all utf8 symbols
from filenames, a kind of utf-8 to ascii sanitizer.
Are you aware of 'detox' package?
--
Regards,
Ville
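For readers wondering what such a UTF-8 to ASCII sanitizer might look like, here is a small sketch. It is not the script discussed in the thread, whose source is not shown. The function name `to_ascii_name` and the set of allowed characters are assumptions chosen for illustration.

```python
import unicodedata

# Hypothetical whitelist: lowercase ASCII letters, digits, and a few safe symbols.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789._-")


def to_ascii_name(name: str) -> str:
    """Strip accents and drop everything outside a small ASCII whitelist."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    lowered = decomposed.lower()
    # Combining marks, whitespace, and other symbols are not in ALLOWED.
    return "".join(ch for ch in lowered if ch in ALLOWED)
```

A name made up entirely of symbols reduces to the empty string, and distinct names can collapse to the same result, so a real tool would need a fallback and a collision check before renaming anything.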
frantisek holop, 29 Nov 2014 13:02:
while working on it, i created some strange test cases
(e.g. »´ÁÉǑÄ«) for filenames and i was pleasantly
surprised that the files were created/read/renamed/deleted
without problems.
i think i should clarify this a bit:
they show perfect in midnight
Ville Valkonen, 29 Nov 2014 14:08:
Are you aware of 'detox' package?
$ touch »´ÁÉǑÄ«
$ detox *
$ ls
A_A_A_A_C_A_A_
$ touch »´ÁÉǑÄ«
$ my_silly_script
$ ls
aeoa
perhaps with some massaging detox can be made
to work like my script, i dont know. but that is
actually beside the point.
i wrote my
Shouldn't the aim in 2014 be to have everything working in UTF-8?
Paolo Aglialoro, 29 Nov 2014 13:56:
Shouldn't the aim in 2014 be to have everything working in UTF-8?
sure.
but i like my filenames ascii and whitespaceless.
shows my age.
-f
--
what a nice night for an evening. -- steven wright
frantisek holop said:
is it true to say then, that ffs is entirely utf8 safe,
and/or that ffs is actually an utf-8 encoded filesystem
as IIRC Mac OS is? or is it some kind of happy accident
that it works? :)
As I get it, ffs is entirely utf8 safe because it is not encoding
aware
Hi,
On 29.11.2014 13:20, frantisek holop wrote:
i think i should clarify this a bit:
they show perfect in midnight commander, not in shell.
$ touch »´ÁÉǑÄ«
$ ls
??
-f
I had a similar problem some time ago and have been told that the ls
tool is not aware of UTF-8. See here for
On Sat, Nov 29, 2014 at 13:02, frantisek holop wrote:
is it true to say then, that ffs is entirely utf8 safe,
and/or that ffs is actually an utf-8 encoded filesystem
as IIRC Mac OS is? or is it some kind of happy accident
that it works? :)
FFS stores filenames as bytes.
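The "names are just bytes" model can be sketched as two separate checks: whether a byte string is acceptable as a single path component, and whether it happens to be valid UTF-8. Both function names are hypothetical. The sketch ignores length limits and does not touch any real filesystem.

```python
def is_valid_component(name: bytes) -> bool:
    """A path component may hold any bytes except NUL and slash (0x2f)."""
    return len(name) > 0 and b"\x00" not in name and b"/" not in name


def is_utf8(name: bytes) -> bool:
    """Encoding is a separate question that the filesystem never asks."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A name can be perfectly acceptable to the filesystem yet not be UTF-8:
latin1_name = "\u00c1\u00c9".encode("latin-1")  # bytes 0xC1 0xC9
acceptable = is_valid_component(latin1_name)    # no NUL or slash present
decodable = is_utf8(latin1_name)                # not a valid UTF-8 sequence
```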
On 2014-11-29, frantisek holop min...@obiit.org wrote:
is it true to say then, that ffs is entirely utf8 safe,
and/or that ffs is actually an utf-8 encoded filesystem
as IIRC Mac OS is?
The former. Unix filesystems accept all bytes for filenames with
the exception of 0x2f, which serves
On 2014-11-29, frantisek holop min...@obiit.org wrote:
$ touch »´ÁÉǑÄ«
$ ls
??
If you need a locale-aware ls(1), use the one from the colorls package.
(Don't worry, colored output is entirely optional.)
--
Christian naddy Weisgerber na...@mips.inka.de
Hi,
Paolo Aglialoro wrote on Sat, Nov 29, 2014 at 01:56:23PM +0100:
Shouldn't the aim in 2014 be to have everything working in UTF-8?
Most definitely not, that would directly run contrary to some of
OpenBSD's most important project goals: Correctness, simplicity,
security.
While the article is old, the
On Nov 29 13:02:34, min...@obiit.org wrote:
is it true to say then, that ffs is entirely utf8 safe,
and/or that ffs is actually an utf-8 encoded filesystem
The file names are just strings of bytes.
There is nothing UTF8 about them.
On Nov 29 14:23:35, czark...@gmail.com wrote:
(Interestingly
On Sat, Nov 29, 2014 at 02:08:32PM +0200, Ville Valkonen wrote:
Hello,
On 29 November 2014 at 14:02, frantisek holop min...@obiit.org wrote:
i have written for myself a small python3 script that
removes accented characters and all utf8 symbols
from filenames, a kind of utf-8 to ascii
Ingo Schwarze said:
While the article is old, the essence of what Schneier said here
still stands, and it is not likely to fall in the future:
https://www.schneier.com/crypto-gram-0007.html#9
Sorry, but this article is mostly based on lack of understanding of
Unicode.
that would directly
On Sun, Nov 30, 2014 at 5:48 AM, Dmitrij D. Czarkoff czark...@gmail.com wrote:
Ingo Schwarze said:
While the article is old, the essence of what Schneier said here
still stands, and it is not likely to fall in the future:
https://www.schneier.com/crypto-gram-0007.html#9
Sorry, but this
Am 29.11.2014 um 13:20 schrieb frantisek holop:
i think i should clarify this a bit:
they show perfect in midnight commander, not in shell.
$ touch »´ÁÉǑÄ«
$ ls
??
# ls | cat
Will display the characters right.
Not entirely sure why though.