Source: linux Version: 4.1.6 Severity: normal Dear Maintainer,
I experience coarse name truncation with very long Rock Ridge filenames in mounted ISO 9660 filesystems. It happens with names of 255 or 254 characters. 253 is ok. Now i riddle about the reasoning behind constant 254 in fs/isofs/rock.c function get_rock_ridge_filename() line 270. Is there a limit in the data structures which would forbid filenames of 254 or 255 bytes length ? Affected are at least Debian kernels 4.1.0-2-amd64 and 3.16.0-4-amd64 but probably it reaches far more back. I seemingly fixed the problem for length 254 and 255 by changing 254 to 256 and building a new kernel. VM makes courageous. -------------------------------------------------------------- Debian's /usr/src/linux-4.1.6/fs/isofs/rock.c and also https://github.com/torvalds/linux/blob/master/fs/isofs/rock.c function get_rock_ridge_filename(), line 270 say if ((strlen(retname) + rr->len - 5) >= 254) { truncate = 1; break; } The value 1 in "truncate" causes interpretation of NM entries to end prematurely. "break" prevents the current NM name piece from being added to the result. Different distribution of the name over NM entries causes different effective truncation lengths. It depends on ECMA-119 name and the SUSP data in the ECMA-119 directory record how much of the name fits into the first NM, and how much must go to the second NM in Continuation Area. This second NM gets dropped. Further it seems that truncation without any attempt to generate a unique filename is suboptimal for the further processing of such names. In the times of UTF-8, truncation would have to show some additional intelligence to avoid incomplete multi-byte characters. -------------------------------------------------------------- To reproduce the problem, create an ISO with a Rock Ridge filename of at least 254 bytes length $ touch my_dummy_file $ genisoimage -R -o test.iso -graft-points /12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234=my_dummy_file Then mount it and list the file # mount -o loop test.iso /mnt/iso $ ls /mnt/iso 12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456 That's 146 characters only. Filenames of length 245 or 255 with same 146 first characters all get mapped to the same name. readdir(3) delivers several dirents with the same .dname. Different production yields different truncations. $ xorriso -as mkisofs ...same.options.as.above... yields after mount 155 characters. $ xorriso -outdev /dvdbuffer/test.iso -blank as_needed \ -map my_dummy_file /...254.characters... yields 93 characters. xorriso as alternative reader shows in all three cases 254 characters. -------------------------------------------------------------- Investigating the possible impact of a change from 254 to 256 in rock.c ------------------------------------------------------------- dir.c: static do_isofs_readdir() calls get_rock_ridge_filename() and hands over the result to dir_emit(), which i understand from include/linux/fs.h is a "filldir" function type, by which the kernel receives dirent information. Such a receiver and its consumers should be good for 255 bytes plus 0, shouldn't it ? The name buffer tmpname is a parameter handed over by the only caller isofs_readdir(). It obtains the buffer by tmpname = (char *)__get_free_page(GFP_KERNEL); "struct page" sends me on a log journey into regions of the memory management which i do not understand. Nevertheless a widely used object named "page" that can take 254 bytes should be able to take 256 bytes, shouldn't it ?. This buffer is not used after the call of do_isofs_readdir() but rather gets freed immediately. ------------------------------------------------------------- namei.c: static isofs_find_entry() uses the result of get_rock_ridge_filename() as argument of static isofs_cmp(). I believe to trace it through struct dentry_operations to isofs_dentry_cmp_common() in fs/isofs/inode.c. Its use of struct qstr *name seems harmless with a potential name->len == 255. The name buffer tmpname is a parameter handed over by the only caller isofs_lookup(): struct page *page; page = alloc_page(GFP_USER); The parameter tmpname is page_address(page) As above: a "page" that can take 254 bytes should be able to take 256 bytes. This buffer is not used after the call of isofs_find_entry() but rather gets freed immediately. ------------------------------------------------------------- So s/254/256/ _should_ be safe ... if i am not mistaken. On my Debian "unstable" VM i worked my way through http://kernel-handbook.alioth.debian.org/ch-common-tasks.html (mainly paragraph 4.5) and did the smallest possible change from "4" to "6". The VM is now running with the changed kernel and shows filenames up to 255 (US-ASCII) characters length. I gave it a stress test by taring the content of a 3.2 GB backup ISO with 67000 files, stemming from 20 years of linuxing. Not really a challenge with exotic characters, but it has the long names (for a HFS+ mangling test). genisoimage is obviously willing to write Rock Ridge names longer than 255 characters. This gave a good opportunity to verify that the kernel still prevents file names larger than MAX_SIZE. (If this is mandatory at all ...) (xorriso or libisofs silently truncate the filename to 255 bytes. As their upstream maintainer i will have to investigate whether it's UTF-8-safe and why there is no warning.) ------------------------------------------------------------- Have a nice day :) Thomas -- System Information: Debian Release: stretch/sid APT prefers unstable APT policy: (500, 'unstable') Architecture: amd64 (x86_64) Kernel: Linux 4.1.6 (SMP w/1 CPU core) Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system)

