Re: NAME_MAX bumping consequences

2012-12-25 Thread Marc Espie
On Mon, Dec 24, 2012 at 09:30:53PM -0800, Philip Guenther wrote:
 On Mon, Dec 24, 2012 at 4:13 PM, Vadim Zhukov persg...@gmail.com wrote:
 ...
  Thanks a lot, Philip! Now I feel much braver than a few hours
  ago. :) I'm thinking about raising NAME_MAX to 1535: this should be enough for any
  255 UTF-8 characters (and even a bit more). Oh, PATH_MAX is smaller... But
  nobody said it would be easy; otherwise somebody would probably have done this work in
  OpenBSD already. :)
 
 My response made you feel *more* brave about making the change?
 Changing both the on-disk format *and* ABI makes this worse than a
 normal flag day: not only would you be unable to use normal OpenBSD
 binaries, but you wouldn't be able to view the filesystem with a
 normal OpenBSD bsd.rd.  If something goes wrong, digging out your data
 could be really painful.

Vadim is crazy enough to work on kde4, so it doesn't sound too bad by
comparison. :)



Re: NAME_MAX bumping consequences

2012-12-24 Thread Philip Guenther
On Mon, Dec 24, 2012 at 5:08 AM, Vadim Zhukov persg...@gmail.com wrote:
...
 I understand that simply changing those constants will (not merely could)
 break some system parts, break ABI, break apps assuming 255 and so on.
 All I want to know is: if I build a release with those
 constants changed, what is known to break by definition? I'm
 fine with coping with problems; I just want to know about the predictable ones.
 FWIW, I'm using FFS, NTFS and FAT filesystems only.

Problem 1: the on-disk format of directories in FFS uses a single byte
to store the length of the filename.  To quote ufs/ufs/dir.h:

struct  direct {
        u_int32_t d_ino;                /* inode number of entry */
        u_int16_t d_reclen;             /* length of this record */
        u_int8_t  d_type;               /* file type, see below */
        u_int8_t  d_namlen;             /* length of string in d_name */
        char      d_name[MAXNAMLEN + 1];/* name with length <= MAXNAMLEN */
};

Since such a change is obviously not backwards compatible, you should
change the filesystem name and magic number when doing that so that
the kernel can know how to handle a given file system.

Problem 2: the dirent structure used by getdirentries(2) and thus
readdir(3) also uses a single byte for the filename length, so you'll
also break that ABI and need to rebuild the world.  Note that the UFS
code actually assumes that those two structures (direct and dirent)
have the same layout, so if you do change dirent, but want to support
the existing UFS filesystem layout, you'll have to write *fun*
conversion code in ufs_readdir()...



 Also, while looking through sources, I've found some XXX in
 sys/compat/linux/linux_misc.c. Am I right with the patch below?

Not until all the filesystems actually *set* f_namemax.  Looks like
FAT, for example, returns zero right now...



Philip Guenther



Re: NAME_MAX bumping consequences

2012-12-24 Thread Vadim Zhukov
On 24.12.2012 at 23:34, Philip Guenther guent...@gmail.com wrote:

 On Mon, Dec 24, 2012 at 5:08 AM, Vadim Zhukov persg...@gmail.com wrote:
 ...
  I understand that simply changing those constants will (not merely could)
  break some system parts, break ABI, break apps assuming 255 and so on.
  All I want to know is: if I build a release with those
  constants changed, what is known to break by definition? I'm
  fine with coping with problems; I just want to know about the predictable ones.
  FWIW, I'm using FFS, NTFS and FAT filesystems only.

 Problem 1: the on-disk format of directories in FFS uses a single byte
 to store the length of the filename.  To quote ufs/ufs/dir.h:

 struct  direct {
         u_int32_t d_ino;                /* inode number of entry */
         u_int16_t d_reclen;             /* length of this record */
         u_int8_t  d_type;               /* file type, see below */
         u_int8_t  d_namlen;             /* length of string in d_name */
         char      d_name[MAXNAMLEN + 1];/* name with length <= MAXNAMLEN */
 };

 Since such a change is obviously not backwards compatible, you should
 change the filesystem name and magic number when doing that so that
 the kernel can know how to handle a given file system.

 Problem 2: the dirent structure used by getdirentries(2) and thus
 readdir(3) also uses a single byte for the filename length, so you'll
 also break that ABI and need to rebuild the world.  Note that the UFS
 code actually assumes that those two structures (direct and dirent)
 have the same layout, so if you do change dirent, but want to support
 the existing UFS filesystem layout, you'll have to write *fun*
 conversion code in ufs_readdir()...

Thanks a lot, Philip! Now I feel much braver than a few hours
ago. :) I'm thinking about raising NAME_MAX to 1535: this should be enough for
any 255 UTF-8 characters (and even a bit more). Oh, PATH_MAX is smaller...
But nobody said it would be easy; otherwise somebody would probably have done
this work in OpenBSD already. :)

  Also, while looking through sources, I've found some XXX in
  sys/compat/linux/linux_misc.c. Am I right with the patch below?

 Not until all the filesystems actually *set* f_namemax.  Looks like
 FAT, for example, returns zero right now...

Hmmm, is there any interest in fixing the code of those filesystems? I can put
together a patch now that I'm in that land anyway.



Re: NAME_MAX bumping consequences

2012-12-24 Thread Philip Guenther
On Mon, Dec 24, 2012 at 4:13 PM, Vadim Zhukov persg...@gmail.com wrote:
...
 Thanks a lot, Philip! Now I feel much braver than a few hours
 ago. :) I'm thinking about raising NAME_MAX to 1535: this should be enough for any
 255 UTF-8 characters (and even a bit more). Oh, PATH_MAX is smaller... But
 nobody said it would be easy; otherwise somebody would probably have done this work in
 OpenBSD already. :)

My response made you feel *more* brave about making the change?
Changing both the on-disk format *and* ABI makes this worse than a
normal flag day: not only would you be unable to use normal OpenBSD
binaries, but you wouldn't be able to view the filesystem with a
normal OpenBSD bsd.rd.  If something goes wrong, digging out your data
could be really painful.

That doesn't mean it's a bad idea.  4.4 BSD changed the BSD world by
changing the size of off_t and we're all the better off for it, but
even that was just an ABI change.  Changing NAME_MAX is a big
change...


  Also, while looking through sources, I've found some XXX in
  sys/compat/linux/linux_misc.c. Am I right with the patch below?

 Not until all the filesystems actually *set* f_namemax.  Looks like
 FAT, for example, returns zero right now...

 Hmmm, is there any interest in fixing the code of those filesystems? I can put
 together a patch now that I'm in that land anyway.

Sure.  Please verify that pathconf(path, _PC_NAME_MAX) returns the
correct value for those filesystems too.  (This can be tested from the
shell via getconf NAME_MAX path.)


Philip Guenther