https://bugzilla.kernel.org/show_bug.cgi?id=219586
Bug ID: 219586
Summary: Unable to find file after unicode change
Product: File System
Version: 2.5
Hardware: All
OS: Linux
Status: NEW
Severity: blocking
Priority: P3
Component: f2fs
Assignee: filesystem_f...@kernel-bugs.kernel.org
Reporter: ha...@vivo.com
Regression: No

Hi everybody,

The f2fs filesystem is unable to read some files whose names contain special characters, such as ❤️, after the kernel was updated with the following patch:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=18b5f47e7da46d3a0d7331e48befcaf151ed2ddf

We can reproduce this with the following steps:

1. First, roll back the unicode-related change above and create a file or folder with a special character in its name:

    ./tools/mkfs.f2fs -f -O casefold -C utf8 f2fs.img
    mount f2fs.img f2fs_dir/
    mkdir Picture
    ./f2fs_io setflags casefold Picture
    cd Picture
    touch ❤️

2. Then apply the unicode patch above. After mounting the filesystem again, we get a message that the special-character file was not found:

    mount f2fs.img f2fs_dir/
    cd Picture
    ls -alh
    ls: cannot access '❤️': No such file or directory
    total 8
    drwxr-xr-x 2 root root 3488 Dec 10 06:11 .
    drwxr-xr-x 3 root root 4096 Dec 9 10:21 ..
    -????????? ? ? ? ? ? ❤️

Here are the conclusions of my preliminary analysis.

In casefold-enabled f2fs filesystems, a file name is converted to lowercase by the utf8_casefold function when a file is looked up, and the hash is then calculated from the lowercase name. This hash is also what is stored on disk. The call path is:

    f2fs_lookup
      f2fs_prepare_lookup
        __f2fs_setup_filename
          f2fs_init_casefolded_name
            utf8_casefold
          f2fs_hash_filename
      __f2fs_find_entry

For some file names that contain special characters, such as ❤️, we found that the length of the output of utf8_casefold differs before and after the patch, which ultimately changes the calculated hash. As a result, files created before the patch cannot be found once the patch is applied.
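To illustrate the analysis above, here is a minimal Python sketch (not kernel code). It assumes that the length difference comes from whether the casefold step keeps or drops the variation selector U+FE0F that follows U+2764 in ❤️. The two fold functions and toy_hash are hypothetical stand-ins for utf8_casefold and f2fs_hash_filename, and CRC32 is not the hash that f2fs uses.

```python
import zlib

# The name as typed: U+2764 HEAVY BLACK HEART + U+FE0F VARIATION SELECTOR-16.
HEART = "\u2764\ufe0f"


def fold_keep_selector(name: str) -> bytes:
    """Hypothetical casefold that keeps U+FE0F in its output."""
    return name.casefold().encode("utf-8")


def fold_drop_selector(name: str) -> bytes:
    """Hypothetical casefold that drops U+FE0F from its output."""
    return name.casefold().replace("\ufe0f", "").encode("utf-8")


def toy_hash(folded: bytes) -> int:
    """Stand-in for f2fs_hash_filename (assumption: CRC32, not the real hash)."""
    return zlib.crc32(folded)


kept = fold_keep_selector(HEART)
dropped = fold_drop_selector(HEART)

# The folded names differ in length, so they are different hash inputs.
print(len(kept), len(dropped))  # 6 3
print(hex(toy_hash(kept)), hex(toy_hash(dropped)))
```

If the hash stored on disk was computed from one form and the lookup computes it from the other, the lookup searches for a hash that was never stored, which would match the "No such file or directory" result in step 2.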
I think we need to modify the f2fs filesystem to be compatible with the unicode-related changes.

_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel