https://bugzilla.kernel.org/show_bug.cgi?id=219586
Bug ID: 219586
Summary: Unable to find file after unicode change
Product: File System
Version: 2.5
Hardware: All
OS: Linux
Status: NEW
Severity: blocking
Priority: P3
Component: f2fs
Assignee: [email protected]
Reporter: [email protected]
Regression: No
Hi everybody,
The f2fs filesystem is unable to read some files with special characters,
such as ❤️, after the kernel was updated with the following patch:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=18b5f47e7da46d3a0d7331e48befcaf151ed2ddf
We can reproduce this in the following steps:
1. First, revert the unicode-related patch above, then create a file or
directory whose name contains the special character:
./tools/mkfs.f2fs -f -O casefold -C utf8 f2fs.img
mount f2fs.img f2fs_dir/
mkdir Picture
./f2fs_io setflags casefold Picture
cd Picture
touch ❤️
2. Then apply the unicode patch above. After mounting the filesystem again,
the file with the special character is reported as not found:
mount f2fs.img f2fs_dir/
cd Picture
ls -alh
ls: cannot access '❤️': No such file or directory
total 8
drwxr-xr-x 2 root root 3488 Dec 10 06:11 .
drwxr-xr-x 3 root root 4096 Dec 9 10:21 ..
-????????? ? ? ? ? ? ❤️
Here are the conclusions of my preliminary analysis.
In casefold-enabled f2fs filesystems, file names are case-folded by the
utf8_casefold function when a file is looked up, and the hash is then
calculated from the case-folded name and stored on disk. The call path is:
f2fs_lookup
  f2fs_prepare_lookup
    __f2fs_setup_filename
      f2fs_init_casefolded_name
        utf8_casefold
      f2fs_hash_filename
  __f2fs_find_entry
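
As a rough illustration of why the lookup depends on the case-folded name, here
is a minimal Python sketch of a hashed, case-insensitive directory. It is a toy
model, not f2fs code: toy_fold, toy_hash and ToyDir are hypothetical names,
str.casefold() stands in for utf8_casefold, and zlib.crc32 stands in for
f2fs_hash_filename (the real f2fs hash is different).

```python
import zlib


def toy_fold(name: str) -> bytes:
    # Stand-in for utf8_casefold: fold case, then encode as UTF-8.
    return name.casefold().encode("utf-8")


def toy_hash(folded: bytes) -> int:
    # Stand-in for f2fs_hash_filename; NOT the real f2fs hash.
    return zlib.crc32(folded)


class ToyDir:
    """Toy directory that indexes entries by the hash of the folded name."""

    def __init__(self):
        self.entries = {}  # hash -> list of names as originally created

    def create(self, name: str) -> None:
        # The hash is computed once, at creation time, and "stored".
        self.entries.setdefault(toy_hash(toy_fold(name)), []).append(name)

    def lookup(self, name: str):
        # Lookup recomputes the hash from the folded query name and only
        # inspects entries stored under that hash.
        folded = toy_fold(name)
        for stored in self.entries.get(toy_hash(folded), []):
            if toy_fold(stored) == folded:
                return stored
        return None


d = ToyDir()
d.create("Holiday.JPG")
print(d.lookup("holiday.jpg"))  # found via the folded name
print(d.lookup("other.jpg"))    # no matching entry
```

In this model, an entry can only be found if the fold function used at lookup
time produces the same bytes as the one used when the entry was created.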
For some file names that contain special characters, such as ❤️, we found that
the length of the output of the utf8_casefold function differs before and
after the patch, which ultimately changes the calculated hash. Files created
before the patch are therefore not accessible after the patch is applied.
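
The length change can be sketched in Python under one assumption that is not
stated in this report: that the difference comes from whether the variation
selector U+FE0F (the second code point of ❤️) is dropped or kept during case
folding. The two fold functions and toy_hash below are hypothetical stand-ins,
not the kernel's utf8_casefold or f2fs_hash_filename.

```python
import zlib

VS16 = "\ufe0f"  # VARIATION SELECTOR-16, second code point of the emoji


def fold_dropping_vs16(name: str) -> bytes:
    # Assumed model of one behaviour: the selector is removed while folding.
    return name.replace(VS16, "").casefold().encode("utf-8")


def fold_keeping_vs16(name: str) -> bytes:
    # Assumed model of the other behaviour: the selector is kept as-is.
    return name.casefold().encode("utf-8")


def toy_hash(folded: bytes) -> int:
    # Stand-in hash over the folded bytes; NOT the real f2fs hash.
    return zlib.crc32(folded)


name = "\u2764\ufe0f"  # the file name from the reproduction steps
dropped = fold_dropping_vs16(name)
kept = fold_keeping_vs16(name)

print([f"U+{ord(c):04X}" for c in name])
print(len(dropped), dropped.hex())
print(len(kept), kept.hex())
print(toy_hash(dropped) == toy_hash(kept))
```

Under this assumption the folded name is 3 bytes in one case and 6 bytes in the
other, so a hash stored on disk from one form would not match a hash computed
from the other form at lookup time.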
I think we need to modify the f2fs filesystem so that it stays compatible
with the unicode-related changes.