Re: How to go on with ISO 9660 large file support ?
Hi Thomas, unfortunately, I'm a bit scarce on round tuits right now, so here only some random comments: - cd9660_readdir() serving as VOP_READDIR(9) The case of mount -o norrip,nogens already used a delivery function with delayed file candidates: cd9660_vnops.c : iso_shipdir(). Originally it only had the task to find the youngest version of a ISO 9660 data file. Now it also counts the follow-up directory records of the same file and skips over them. Don't forget that in case of -o norrip, iso_shipdir is also complicated by associated files - cd9660_lookup() as VOP_LOOKUP(9) mount -o norrip returned the last record of matching name, whereas -o rrip returned the first matching record. RRIP doesn't have the concept of multiple versions of a file. There is also an ISO image emerging which (by hex editor) exposes exotic or even illegal situations. 6.1.3 and the host operating system show interesting effects when mounting it. What 6.1.3 do you refer to here? Certainly not to the entry in ECMA-119 labelled Volume Space? - I could not yet find ISO images or software which would provide test opportunities for ISO 9660 Associated Files or Extented Attributes (which are not related to getextattr(1)/extattr(9)). Regarding associated files, I saw those used by Apple for their resource fork. Albeit I haven't seen any ISOs with them recently. Ciao, Wolfgang -- wolfg...@solfrank.net Wolfgang Solfrank
Re: How to go on with ISO 9660 large file support ?
Hi, Wolfgang Solfrank: I'm a bit scarce on round tuits right now, The global tuits shortage crisis is of course a big obstacle. Especially since ISO 9660 is essential when booting installation CDs or DVDs. I plan to repack NetBSD-6.1.3-i386.iso with my development kernel and to perform an installation. (It is not nice of makefs to hide the El Torito boot image in blind space. No i have to guess its size.) Currently i am looking for critical comments and questions about the overall implementation concept. A decision would be nice, whether API compatibility for cd9660_node.h is worth an extra pointer in each iso_node (= vnode.v_data). As stated, fstat(1) would be affected by an API break. On the other hand, half of my CVS /usr/src seems to suffer from API problems with the NetBSD-6.1.3 system installation. Heavy testing makes sense only after this decision. RRIP doesn't have the concept of multiple versions of a file. Yes. It looks funny if there is more than one ISO file version which all have the same RRIP name. I tested this. My proposal showed the best results among the tested systems. What 6.1.3 do you refer to here? NetBSD-6.1.3 i386. The host operating system is Debian 6 amd64. My development subject from CVS refers to itself as NetBSD-6.99.40 (#190 meanwhile). Regarding associated files, I saw those used by Apple for their resource fork. Albeit I haven't seen any ISOs with them recently. Probably i will have to fake some by hand. This implies the risk, though, that i got the specs wrong in both cases. A real test case would be much more significant. In general i would say that large files are much more common than Associated Files, multiple Versions, or ISO 9660 Extended Attributes. So even if there is a risk to break those, it still seems worth. Have a nice day :) Thomas
Re: How to go on with ISO 9660 large file support ?
Hi, New code should use kmem(9) for variable-sized or one-time allocations, Currently it is running on kmem(9). There is old malloc(9) usage in cd9660 which i tried to mimic until i came to the deprecation statement in the man page. Shall i change the old malloc(9) usage, too ? Or in a later PR, maybe ? For the most part, struct iso_node is private to the kernel implementation of cd9660. Regrettably there is a #ifdef _KERNEL in cd9660_node.h and struct iso_node is defined outside of that area. Since fstat(1) and pmap(1) use it, you should avoid changing the offsets within the structure of the members that fstat(1) and pmap(1) use, I hope to have achieved that on all non-128-bit machine types by: struct iso_file_section { unsigned long isofsc_extent;/* will not be used by cd9660 */ unsigned long isofsc_size; /* byte count */ unsigned long isofsc_start; /* block address */ }; union iso_file_data { struct iso_file_section single; /* used if only one extent exists: * CD9660_FSECT_FROM_INO(.i_number)==1 */ struct iso_file_section *many; /* allocated with enough array elements * for CD9660_FSECT_FROM_INO(.i_number) * if this is larger than 1. */ }; ... struct iso_node { ... #ifdef CD9660_MULTI_EXTENT union iso_file_data iso_fsect; /* File sections. Their number is given * by CD9660_FSECT_FROM_INO(.i_number) */ #else unsigned long iso_extent; /* extent of file */ unsigned long i_size; unsigned long iso_start;/* actual start of data of file (may be different */ /* from iso_extent, if file has extended attributes) */ #endif /* ! CD9660_MULTI_EXTENT */ ... }; Some day, we ought to find a better way to do what fstat(1) and pmap(1) are doing with kernel guts. At least they should make sure that the binary and the kernel are compatible. As is done with dynamic linking and .so numbers. It would be nice if you could integrate your tests into the atf file system tests under src/tests/fs so that they get run automatically. I promise to learn what's needed and to finally provide such a test. As said, currently it is neither complete nor easily portable to a different computer. This sounds bad (but not at all surprising). If the `interesting effects' are anything more than harmless messages printed to the console and unreadable images -- e.g., programs crash, kernel panics or discloses private memory to userland, c. -- it would be nice if you could identify little fixes or at least early checks that can be easily back-ported. It's inbetween: Confusing to userland because file names get shown twice or more, bear all the same attributes and content, although the causing files do differ. That's why i propose to drop duplicate names in favor of a probably best and youngest survivor. No crashes yet. The NetBSD-6.1.3 bug of kern/48787 is reported and fixed. PR 48815 is still open. It shows behavior in the above confusion class. I.e. ISO files hardly usable but rest of system still healthy. The hardest reason why this information has to be encoded in ino_t is the desire to implement method VFS_VGET(9). NFS uses VFS_VGET, so it would be good to continue to support that. Isn't it rather using VFS_FHTOVP(9) ? Currently its implementation cd9660_fhtovp() indeed uses VFS_VGET(), but it could as well use other means to identify an iso_node and its boss vnode. The address encoding in ino_t would be a short-sighted hack, if not NetBSD was far-sighted enough to have 64 bit ino_t. It even provides enough bits to work around a shortcomming in the list format of ISO 9660 directory records. Nevertheless, after bin/48799 and applying my large-file proposal, you will get to see things like this: netbsd# ls -li /mnt/iso/my total 8455812 281479306324160 -r 1 thomas dbus 4329375744 May 6 15:30 large_file (Inode number in hex = 0x100010210a8c0 : A file with two sections. Directory records located above 4 GiB. Only 49 of 61 possible bits used.) But embedding an ino_t in struct iso_node is probably the easiest way to do it, and will avoid breaking fstat(1) and pmap(1). A unique inode number is a must-have. But for that, ISO 9660 would not need 64 bit. pmap(1) will not be affected by the API change, because it is only interested in members of iso_node before the ones which have to change. fstat(1) is reading the size (of the first file section), which still works well on i386 with the changed kernel, but would fail to compile with the new cd9660_node.h. (I plan to fix this, of course.) It is also interpreting the
Re: How to go on with ISO 9660 large file support ?
In article 20140527142117.6807560...@jupiter.mumble.net, Taylor R Campbell campbell+netbsd-tech-k...@mumble.net wrote: For the most part, struct iso_node is private to the kernel implementation of cd9660. Since fstat(1) and pmap(1) use it, you should avoid changing the offsets within the structure of the members that fstat(1) and pmap(1) use, but I don't think you need to worry about preserving the whole API or the structure size. Some day, we ought to find a better way to do what fstat(1) and pmap(1) are doing with kernel guts. I'd say it is not critical that those work if we have to run through hoops to achieve compatibility. I.e. clean code would be my priority. christos
Re: How to go on with ISO 9660 large file support ?
Hi, Christos Zoulas: I'd say it is not critical that those work if we have to run through hoops to achieve compatibility. I.e. clean code would be my priority. The fully API compatible code is slightly cleaner than the only ABI compatible code, which saves sizeof(void *) with every vnode/iso_node. The main advantage of the API-compatible-wasteful variant is that kmem_free() is a bit less frightening. (Actually fully non-scary would be malloc(9)/free(9).) if (ip-iso_sections != NULL) kmem_free(ip-iso_sections, CD9660_FSECT_FROM_INO(ip-i_number) * sizeof(struct iso_file_section)); ip-iso_sections = NULL; versus if (CD9660_FSECT_FROM_INO(ip-i_number) 1) kmem_free(ip-iso_fsect.many, CD9660_FSECT_FROM_INO(ip-i_number) * sizeof(struct iso_file_section)); iso_set_ino_nfsect(ip-i_number, 1); Another advantage of API-compatible-wasteful is that there will be no new union defined in cd9660_node.h -- API compatible struct iso_node (freestyle diff): + struct iso_file_section { + unsigned long isofsc_size; /* byte count */ + unsigned long isofsc_start; /* block address */ + }; + struct iso_node { ... - unsigned long iso_extent; /* extent of file */ - unsigned long i_size; - unsigned long iso_start;/* actual start of data of file (may be different */ + /* These three numbers represent the +* first extent and file section. +* If CD9660_FSECT_FROM_INO(.i_number) +* is larger than 1, then .iso_sections +* is valid. +*/ + unsigned long iso_extent; /* will not be used by cd9660 */ + unsigned long i_size; /* byte count */ + unsigned long iso_start;/* block address */ ... + struct iso_file_section *iso_sections; /* Holds info about all file +* sections if +* CD9660_FSECT_FROM_INO( +* .i_number) +* is larger than 1. +*/ }; -- ABI-only compatible: + struct iso_file_section { + unsigned long isofsc_extent;/* will not be used by cd9660 */ + unsigned long isofsc_size; /* byte count */ + unsigned long isofsc_start; /* block address */ + }; + union iso_file_data { + struct iso_file_section single; /* used if only one extent exists: +* CD9660_FSECT_FROM_INO(.i_number)==1 +*/ + struct iso_file_section *many; /* allocated with enough array elements +* for CD9660_FSECT_FROM_INO(.i_number) +* if this is larger than 1. +*/ + }; + struct iso_node { ... - unsigned long iso_extent; /* extent of file */ - unsigned long i_size; - unsigned long iso_start;/* actual start of data of file (may be different */ + union iso_file_data iso_fsect; /* File sections. Their number is given +* by CD9660_FSECT_FROM_INO(.i_number) +*/ ... }; -- Have a nice day :) Thomas