On 5/16/25 09:40, wangtao wrote:
>
>
>> -----Original Message-----
>> From: Christian König <christian.koe...@amd.com>
>> Sent: Thursday, May 15, 2025 10:26 PM
>> To: wangtao <tao.wang...@honor.com>; sumit.sem...@linaro.org;
>> benjamin.gaign...@collabora.com; brian.star...@arm.com;
>> jstu...@google.com; tjmerc...@google.com
>> Cc: linux-me...@vger.kernel.org; dri-devel@lists.freedesktop.org;
>> linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org;
>> wangbintian(BintianWang) <bintian.w...@honor.com>; yipengxiang
>> <yipengxi...@honor.com>; liulu 00013167 <liulu....@honor.com>;
>> hanfeng 00012985 <feng....@honor.com>
>> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
>> DMA_BUF_IOCTL_RW_FILE for system_heap
>>
>> On 5/15/25 16:03, wangtao wrote:
>>> [wangtao] My test configuration (CPU 1 GHz, average of 5 runs):
>>> Allocation: 32x32 MB buffer creation
>>> - dmabuf 53 ms vs. udmabuf 694 ms (10x slower)
>>> - Note: shmem shows excessive allocation time
>>
>> Yeah, that is something already noted by others as well. But that is
>> orthogonal.
>>
>>>
>>> Read 1024 MB file:
>>> - dmabuf direct 326 ms vs. udmabuf direct 461 ms (40% slower)
>>> - Note: pin_user_pages_fast consumes the majority of CPU cycles
>>>
>>> Key function call timing: see details below.
>>
>> Those aren't valid, you are comparing different functionalities here.
>>
>> Please try using udmabuf with sendfile() as confirmed to be working by T.J.
>
> [wangtao] Reading a file into a dmabuf with buffered I/O requires one
> memory copy. Direct I/O removes that copy and enables zero-copy. The
> sendfile system call reduces the number of copies from two (read/write)
> to one, but with udmabuf, sendfile still keeps at least one copy, so
> zero-copy is not achieved.
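
For context, a minimal user-space sketch of the udmabuf direct-read path
described above (this is not the DMA_BUF_IOCTL_RW_FILE interface from the
patch): the memfd backing a udmabuf is mapped, and the data file, opened
with O_DIRECT, is read straight into that mapping. The file path and chunk
size are assumptions for illustration; error handling is omitted.

/* Sketch: read a file chunk with O_DIRECT into the shmem pages backing a
 * udmabuf. This is the path where pin_user_pages_fast shows up in profiles. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/udmabuf.h>

#define CHUNK_SIZE (32UL << 20)              /* 32 MB, as in the test */

int main(void)
{
	int back_fd = open("/data/test_1024mb", O_RDONLY | O_DIRECT); /* placeholder path */
	int dev_fd  = open("/dev/udmabuf", O_RDWR);

	int memfd = memfd_create("udmabuf-src", MFD_ALLOW_SEALING);
	ftruncate(memfd, CHUNK_SIZE);
	/* udmabuf requires F_SEAL_SHRINK (and no F_SEAL_WRITE) on the memfd */
	fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

	struct udmabuf_create create = {
		.memfd  = memfd,
		.flags  = UDMABUF_FLAGS_CLOEXEC,
		.offset = 0,
		.size   = CHUNK_SIZE,
	};
	int dmabuf_fd = ioctl(dev_fd, UDMABUF_CREATE, &create);

	/* The page-aligned mmap of the memfd satisfies O_DIRECT alignment,
	 * so the read lands in the buffer without a page-cache copy. */
	void *buf = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
			 MAP_SHARED, memfd, 0);
	ssize_t n = read(back_fd, buf, CHUNK_SIZE);
	printf("read %zd bytes into dmabuf fd %d\n", n, dmabuf_fd);

	munmap(buf, CHUNK_SIZE);
	close(dmabuf_fd);
	close(memfd);
	close(dev_fd);
	close(back_fd);
	return 0;
}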
Then please work on fixing this.

Regards,
Christian.

>
> If udmabuf sendfile uses buffered I/O (the file page cache), read latency
> matches a dmabuf buffered read, but allocation time is much longer.
> With Direct I/O, the default 16-page pipe size makes sendfile slower than
> buffered I/O.
>
> Test data shows:
> udmabuf direct read is much faster than udmabuf sendfile.
> dmabuf direct read outperforms udmabuf direct read by a large margin.
>
> Issue: after a udmabuf has been mapped via map_dma_buf, applications that
> use the memfd or udmabuf for Direct I/O might cause errors, but there are
> no safeguards to prevent this.
>
> Test: allocate 32x32 MB buffers and read a 1024 MB file:
>
> Metric                  | alloc (ms) | read (ms) | total (ms)
> ------------------------|------------|-----------|-----------
> udmabuf buffer read     |        539 |      2017 |       2555
> udmabuf direct read     |        522 |       658 |       1179
> udmabuf buffer sendfile |        505 |      1040 |       1546
> udmabuf direct sendfile |        510 |      2269 |       2780
> dmabuf buffer read      |         51 |      1068 |       1118
> dmabuf direct read      |         52 |       297 |        349
>
> udmabuf sendfile test steps:
> 1. Open the data file (1024 MB), get back_fd
> 2. Create a memfd (32 MB)          # loop steps 2-6
> 3. Allocate a udmabuf from the memfd
> 4. Call sendfile(memfd, back_fd)
> 5. Close the memfd after sendfile
> 6. Close the udmabuf
> 7. Close back_fd
>
>>
>> Regards,
>> Christian.
>
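
The udmabuf sendfile test steps quoted above could look roughly like the
following user-space sketch (for the "direct sendfile" variant, back_fd
would be opened with O_DIRECT). The path and sizes are assumptions for
illustration; error handling and short-transfer retries are omitted.

/* Sketch of the udmabuf + sendfile() test steps quoted above (1-7). */
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <unistd.h>
#include <linux/udmabuf.h>

#define CHUNK_SIZE (32UL << 20)         /* 32 MB per buffer */
#define NR_CHUNKS  32                   /* 32 x 32 MB = 1024 MB */

int main(void)
{
	/* 1. Open the data file (1024 MB), get back_fd */
	int back_fd = open("/data/test_1024mb", O_RDONLY);  /* placeholder path */
	int dev_fd  = open("/dev/udmabuf", O_RDWR);
	off_t off   = 0;

	for (int i = 0; i < NR_CHUNKS; i++) {
		/* 2. Create a 32 MB memfd (steps 2-6 are looped) */
		int memfd = memfd_create("chunk", MFD_ALLOW_SEALING);
		ftruncate(memfd, CHUNK_SIZE);
		fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

		/* 3. Allocate a udmabuf backed by the memfd */
		struct udmabuf_create create = {
			.memfd = memfd, .offset = 0, .size = CHUNK_SIZE,
		};
		int udmabuf_fd = ioctl(dev_fd, UDMABUF_CREATE, &create);

		/* 4. sendfile(memfd, back_fd): one copy from the backing
		 *    file into the shmem pages behind the udmabuf */
		sendfile(memfd, back_fd, &off, CHUNK_SIZE);

		/* 5. + 6. Close the memfd and the udmabuf */
		close(memfd);
		close(udmabuf_fd);
	}

	/* 7. Close back_fd */
	close(back_fd);
	close(dev_fd);
	return 0;
}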