These two patches change the readdir cookies to a format that should last much longer before collisions occur. In my performance testing, the results depended on the size of the directory. For small directories the new code performed slightly worse, and the gap grew with directory size up to a point: at around 100000 entries it was worst, with an "ls -f" time of 0.095s for the new code vs 0.084s for the old. Beyond that you reach the point where hash indexes hit their maximum depth, the new code no longer needs to sort them, and its performance quickly surpasses the old code's. For instance, when I contrived a situation with 1000 dirents sharing the same hash index, the new code's "ls -f" time was less than a tenth of the old code's, 0.003s vs 0.036s. However, that is a pretty unrealistic case: with 131072 hash buckets, you shouldn't expect that many dirents per bucket on average until a directory holds around 130 million files.
The only other real issue with the new code is that, since we have to compute and save the cookie when we first process the dirent in the readdir code instead of at sort time, we need to double the space used to save the dirents for sorting. We could avoid this by using part of the dirent padding as scratch space to store the computed cookie.

Benjamin Marzinski (2):
  gfs2: keep offset when splitting dir leaf blocks
  gfs2: change gfs2 readdir cookie

 fs/gfs2/dir.c                    | 189 ++++++++++++++++++++++++++++++---------
 fs/gfs2/incore.h                 |   3 +
 fs/gfs2/ops_fstype.c             |   3 +
 fs/gfs2/super.c                  |  12 +++
 include/uapi/linux/gfs2_ondisk.h |   2 +
 5 files changed, 167 insertions(+), 42 deletions(-)

-- 
1.8.3.1
