On 2025/7/5 09:15, Hongzhen Luo wrote:
On 2025/7/5 05:09, Gao Xiang wrote:
On 2025/7/3 20:23, Christian Brauner wrote:
From: Hongzhen Luo <hongz...@linux.alibaba.com>
When using .fadvise to release a file's page cache, it frees page cache
pages that were first read by this file. To achieve this, an interval
tree is added in the inode of that file to track the segments first
read by that inode.
Signed-off-by: Hongzhen Luo <hongz...@linux.alibaba.com>
Link: https://lore.kernel.org/20240902110620.2202586-5-hongz...@linux.alibaba.com
Signed-off-by: Christian Brauner <brau...@kernel.org>
---
fs/erofs/data.c | 38 ++++++++++++++++++++--
fs/erofs/internal.h | 5 +++
fs/erofs/pagecache_share.c | 81 ++++++++++++++++++++++++++++++++++++++++++++--
fs/erofs/pagecache_share.h | 2 ++
fs/erofs/super.c | 9 ++++++
5 files changed, 131 insertions(+), 4 deletions(-)
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index fb54162f4c54..61a42a95d26b 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -7,6 +7,7 @@
#include "internal.h"
#include <linux/sched/mm.h>
#include <trace/events/erofs.h>
+#include "pagecache_share.h"
void erofs_unmap_metabuf(struct erofs_buf *buf)
{
@@ -353,6 +354,7 @@ static int erofs_read_folio(struct file *file, struct folio *folio)
{
#ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
struct erofs_inode *vi = NULL;
+ struct interval_tree_node *seg;
int ret;
if (file && file->private_data) {
@@ -363,8 +365,22 @@ static int erofs_read_folio(struct file *file, struct folio *folio)
vi = NULL;
}
ret = iomap_read_folio(folio, &erofs_iomap_ops);
- if (vi)
+ if (vi) {
folio->mapping->host = file_inode(file);
+ seg = erofs_pcs_alloc_seg();
+ if (!seg)
+ return -ENOMEM;
+ seg->start = folio->index;
+ seg->last = seg->start + (folio_size(folio) >> PAGE_SHIFT);
+ if (seg->last > (vi->vfs_inode.i_size >> PAGE_SHIFT))
+ seg->last = vi->vfs_inode.i_size >> PAGE_SHIFT;
+ if (seg->last >= seg->start) {
+ mutex_lock(&vi->segs_mutex);
+ interval_tree_insert(seg, &vi->segs);
+ mutex_unlock(&vi->segs_mutex);
+ } else
+ erofs_pcs_free_seg(seg);
+ }
I don't understand what Hongzhen is trying to do in this patch, and
it looks odd to me. It may need to be reimplemented later, but we
should support .fadvise().
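For readers following the hunk above, the range arithmetic can be modelled in userspace. This is only a sketch under stated assumptions, not the patch's code: `seg_range`, `first_read_range()` and `clip_to_range()` are hypothetical names, pages are assumed to be 4 KiB, and `last` is treated as an inclusive page index (which is how the kernel interval tree is usually described), so it subtracts one where the quoted hunk does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for an interval tree node: closed range [start, last]. */
struct seg_range {
	uint64_t start;	/* first page index covered */
	uint64_t last;	/* last page index covered (inclusive) */
};

/*
 * Compute the page-index range a folio covers, clamped to end of file.
 * Returns false when the folio lies entirely beyond EOF, in which case
 * nothing should be recorded.
 */
static bool first_read_range(uint64_t folio_index, uint64_t folio_bytes,
			     uint64_t i_size, struct seg_range *out)
{
	uint64_t nr_pages, eof_last;

	assert(folio_bytes >= SKETCH_PAGE_SIZE);
	if (i_size == 0)
		return false;

	nr_pages = folio_bytes >> SKETCH_PAGE_SHIFT;
	eof_last = (i_size - 1) >> SKETCH_PAGE_SHIFT;	/* last page holding data */
	if (folio_index > eof_last)
		return false;

	out->start = folio_index;
	out->last = folio_index + nr_pages - 1;
	if (out->last > eof_last)
		out->last = eof_last;
	return true;
}

/*
 * Clip a requested release range [req_start, req_last] to a recorded
 * segment. Returns false when the two ranges do not overlap.
 */
static bool clip_to_range(const struct seg_range *seg, uint64_t req_start,
			  uint64_t req_last, struct seg_range *out)
{
	if (req_last < seg->start || req_start > seg->last)
		return false;
	out->start = req_start > seg->start ? req_start : seg->start;
	out->last = req_last < seg->last ? req_last : seg->last;
	return true;
}
```

In this model, a release request would be clipped against each recorded segment, and only the overlapping page indexes would be dropped.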
The original approach aimed to maintain a first-read interval tree per inode,
ensuring that .fadvise would only release cached pages within its own mapped
ranges, thereby preventing interference with other file operations. However,
this introduced unnecessary complexity. The latest patch series adopts
overlayfs-style handling:
https://lore.kernel.org/all/20250301145002.2420830-8-hongz...@linux.alibaba.com/
Yes, that patch makes more sense to me, since the mm
code will handle it as page cache ops.
Thanks,
Gao Xiang
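
For context, the userspace side of the .fadvise discussion above is typically a posix_fadvise(2) call with POSIX_FADV_DONTNEED. The following is a minimal sketch of such a caller, not something taken from the patch series; `drop_file_cache()` is a hypothetical helper name.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* for posix_fadvise() under strict modes */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Ask the kernel to drop cached pages for the whole file at `path`.
 * Returns 0 on success or a positive errno-style code on failure.
 * drop_file_cache() is a hypothetical helper for this sketch.
 */
static int drop_file_cache(const char *path)
{
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;

	/* offset 0 with len 0 means "from the start to the end of the file". */
	err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	close(fd);
	return err;	/* posix_fadvise() returns the error number directly */
}
```

The advice is only a hint. Which pages actually get released depends on how the filesystem and the mm code handle the request, which is the point under discussion in this thread.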