Hi Chunhai,

On 2024/1/12 09:58, Chunhai Guo wrote:
On 2024/1/10 14:45, Chunhai Guo wrote:
On 2024/1/9 21:08, Gao Xiang wrote:

Hi Chunhai,

On 2024/1/9 15:41, Chunhai Guo wrote:
Using a global page pool for LZ4 decompression significantly reduces the
time spent on page allocation in low memory scenarios.

The table below shows the reduction in time spent on page allocation for
LZ4 decompression when using a global page pool.  The results were
obtained from multi-app launch benchmarks on ARM64 Android devices
running the 5.15 kernel with an 8-core CPU and 8GB of memory. In the
benchmark, we launched 16 frequently-used apps, and the camera app was
the last one in each round. The data in the table is the average time
for the camera app in each round.
After using the page pool, there was an average improvement of 150ms in
the launch time of the camera app, as measured from the systrace log.
+--------------+---------------+--------------+---------+
|              | w/o page pool | w/ page pool |  diff   |
+--------------+---------------+--------------+---------+
| Average (ms) |     3434      |      21      | -99.38% |
+--------------+---------------+--------------+---------+

Based on the benchmark logs, 64 pages are sufficient for 95% of
scenarios. This value can be adjusted from the module parameter. The
default value is 0.
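
For illustration only, the idea can be sketched in userspace C as a small
fixed-size pool that is tried first, with a fallback to the regular
allocator when the pool is empty. This is not the code from the patch:
the names (page_pool, pool_get_page, pool_put_page), the 4096-byte page
size and the use of malloc() are all assumptions, and a real in-kernel
pool would also need locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_PAGE_SIZE 4096 /* assumed page size, for illustration */
#define POOL_MAX_PAGES 64   /* the "64 pages" figure quoted above */

/* Hypothetical pool: a stack of preallocated page-sized buffers. */
struct page_pool {
	void *pages[POOL_MAX_PAGES];
	size_t nr_free;  /* buffers currently held in the pool */
	size_t capacity; /* 0 disables the pool, like the default of 0 */
};

/* Pre-fill the pool; returns how many buffers were actually reserved. */
size_t pool_init(struct page_pool *pool, size_t capacity)
{
	if (capacity > POOL_MAX_PAGES)
		capacity = POOL_MAX_PAGES;
	pool->capacity = capacity;
	pool->nr_free = 0;
	while (pool->nr_free < capacity) {
		void *page = malloc(POOL_PAGE_SIZE);

		if (!page)
			break;
		pool->pages[pool->nr_free++] = page;
	}
	return pool->nr_free;
}

/* Take a buffer from the pool, or fall back to the allocator. */
void *pool_get_page(struct page_pool *pool)
{
	if (pool->nr_free > 0)
		return pool->pages[--pool->nr_free];
	return malloc(POOL_PAGE_SIZE);
}

/* Return a buffer to the pool, or free it if the pool is full. */
void pool_put_page(struct page_pool *pool, void *page)
{
	assert(page != NULL);
	if (pool->nr_free < pool->capacity)
		pool->pages[pool->nr_free++] = page;
	else
		free(page);
}

/* Release every buffer still held by the pool. */
void pool_destroy(struct page_pool *pool)
{
	while (pool->nr_free > 0)
		free(pool->pages[--pool->nr_free]);
	pool->capacity = 0;
}
```

With a capacity of 0 every request goes straight to the allocator, so the
sketch behaves as if no pool existed; a non-zero capacity lets requests
skip the allocator as long as the pool has buffers left.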

This patch currently only supports the LZ4 decompressor; other
decompressors will be supported in the next step.

Signed-off-by: Chunhai Guo <guochun...@vivo.com>

This patch looks good to me, yet we're in the merge window for v6.8.
I will address it after -rc1 is out since no stable tag these days.

Also it would be better to add some results of changing max_distance
if you have more time to test.

OK. I will reply to this email when the experiment is finished.

Dear Xiang,

The experiment is done and the table below shows the results. We can see
that a 16k sliding window reduces the time spent on page allocation for
LZ4 decompression by 38.2% compared to a 64k sliding window. However,
using a global page pool is still far better than either of them.

+--------------+---------------+--------------+---------+
|              |   64k window  |  16k window  |  diff   |
+--------------+---------------+--------------+---------+
| Average (ms) |     3364      |      2079    | -38.2%  |
+--------------+---------------+--------------+---------+

Thanks,

Let's rebase this onto
commit ("erofs: relaxed temporary buffers allocation on readahead")

I will merge these after the rebase patch is received.

Thanks,
Gao Xiang
