These two patches change the readdir cookies to a format that should last a
lot longer before collisions occur. In my performance testing, the results
depended on how large the directory was. For small directories, the new code
performed slightly worse, and the gap grew with directory size up to a point.
It looked worst at around 100000 entries, with an "ls -f" time of 0.095s for
the new code vs 0.084s for the old code.
Past that point, the hash indexes start reaching the maximum depth, the new
code no longer needs to sort them, and its performance quickly surpasses the
old code. For instance, when I contrived a situation where 1000 dirents shared
the same hash index, the new code's "ls -f" time was less than a tenth of the
old code's, 0.003s vs 0.036s. However, this is a pretty unrealistic size:
with 131072 hash buckets, you shouldn't expect that many dirents per bucket
on average until the directory holds around 130 million files.

The only other real issue with the new code is that since we have to compute
and save the cookie when we first process the dirent in the readdir code,
instead of at sort time, we need double the space to save the dirents for
sorting. We could avoid this by using part of the dirent padding as scratch
space to store the computed cookie.

Benjamin Marzinski (2):
  gfs2: keep offset when splitting dir leaf blocks
  gfs2: change gfs2 readdir cookie

 fs/gfs2/dir.c                    | 189 ++++++++++++++++++++++++++++++---------
 fs/gfs2/incore.h                 |   3 +
 fs/gfs2/ops_fstype.c             |   3 +
 fs/gfs2/super.c                  |  12 +++
 include/uapi/linux/gfs2_ondisk.h |   2 +
 5 files changed, 167 insertions(+), 42 deletions(-)

-- 
1.8.3.1
