Re: [PATCH v6 0/2] Block layer support ZAC/ZBC commands

2016-08-09 Thread Damien Le Moal
ck device backend driver and everything is fine. Can't tell the difference with direct-to-drive SG_IO accesses. But unlike these, the zone ioctls keep the zone information RB-tree cache up to date. > > I will be updating my patchset accordingly. I need to cleanup my code and rebase on top o

Re: [PATCH v8 1/2 RESEND] Add bio/request flags to issue ZBC/ZAC commands

2016-08-25 Thread Damien Le Moal
let me know what you think. If we drop this, we can get a clean and full ZBC support patch set ready in no time at all. Best regards. -- Damien Le Moal, Ph.D. Sr. Manager, System Software Group, HGST Research, HGST, a Western Digital brand damien.lem...@hgst.com (+81) 0466-98-3593 (ext. 51

Re: [PATCH] uapi: linux/blkzoned.h: fix BLKGETZONESZ and BLKGETNRZONES definitions

2018-12-16 Thread Damien Le Moal
_zone_range) > -#define BLKGETZONESZ _IOW(0x12, 132, __u32) > -#define BLKGETNRZONES _IOW(0x12, 133, __u32) > +#define BLKGETZONESZ _IOR(0x12, 132, __u32) > +#define BLKGETNRZONES _IOR(0x12, 133, __u32) > > #endif /* _UAPI_BLKZONED_H */ > Indeed, my bad. Reviewed-by: Damien Le Moal -- Damien Le Moal Western Digital Research

Re: [PATCH 09/19] zonefs: remove duplicate cleanup in zonefs_fill_super

2023-09-13 Thread Damien Le Moal
On 9/13/23 20:10, Christoph Hellwig wrote: > When ->fill_super fails, ->kill_sb is called which already cleans up > the inodes and zgroups. > > Drop the extra cleanup code in zonefs_fill_super. > > Signed-off-by: Christoph Hellwig Looks good to me. Acked-by: Dami

[PATCH v2 2/2] riscv: Disable text-data gap in flat binaries

2021-04-07 Thread Damien Le Moal
incorrect and prevent correct execution of a flatbin executable. Avoid this problem by enabling CONFIG_BINFMT_FLAT_NO_TEXT_DATA_GAP automatically when CONFIG_RISCV is enabled and CONFIG_MMU disabled. Signed-off-by: Damien Le Moal --- arch/riscv/Kconfig | 1 + 1 file changed, 1 insertion(+) diff

[PATCH v2 1/2] binfmt_flat: allow not offsetting data start

2021-04-07 Thread Damien Le Moal
s the use of a single RAM region for loading (equivalent to FLAT_FLAG_RAM being set). Signed-off-by: Damien Le Moal --- fs/Kconfig.binfmt | 3 +++ fs/binfmt_flat.c | 21 +++-- 2 files changed, 18 insertions(+), 6 deletions(-) diff --git a/fs/Kconfig.binfmt b/fs/Kconfig.

[PATCH v2 0/2] Fix binfmt_flat loader for RISC-V

2021-04-07 Thread Damien Le Moal
addition of riscv/include/asm/flat.h and set CONFIG_BINFMT_FLAT_NO_TEXT_DATA_GAP for RISCV and !MMU Damien Le Moal (2): binfmt_flat: allow not offsetting data start riscv: Disable text-data gap in flat binaries arch/riscv/Kconfig | 1 + fs/Kconfig.binfmt | 3 +++ fs/binfmt_flat.c | 21

Re: [PATCH] soc: canaan: Sort the Makefile alphabetically

2021-02-22 Thread Damien Le Moal
obj-y+= fsl/ > @@ -29,4 +30,3 @@ obj-$(CONFIG_ARCH_U8500)+= ux500/ > obj-$(CONFIG_PLAT_VERSATILE) += versatile/ > obj-y+= xilinx/ > obj-$(CONFIG_ARCH_ZX) += zte/ > -obj-$(CONFIG_SOC_CANAAN) +

Re: [PATCH v4 6/6] io_uring: add support for zone-append

2020-07-30 Thread Damien Le Moal
>>>>>> mind, here is a quick question: if we do work_add(task) when the task is running in the userspace, wouldn't the work execution wait until the next syscall/allotted time ends up?
>>>>>
>>>>> It'll get the task to enter the kernel, just like signal delivery. The only tricky part is really if we have a dependency waiting in the kernel, like the recent eventfd fix.
>>>>
>>>> I see, thanks for sorting this out!
>>>
>>> Few more doubts about this (please mark me wrong if that is the case):
>>>
>>> - Task-work makes me feel like N completions waiting to be served by a single task. Currently completions keep arriving and CQEs would be updated with the result, but the user-space (submitter task) would not be poked.
>>>
>>> - Completion-code will set the task-work. But post that it cannot go immediately to its regular business of picking the cqe and updating res/flags, as we cannot afford user-space to see the cqe before the pointer update. So it seems completion-code needs to spawn another work which will allocate/update the cqe after waiting for the pointer update from task-work?
>>
>> The task work would post the completion CQE for the request after writing the offset.
>
> Got it, thank you for making it simple.
> Overall, if I try to put the tradeoffs of moving to indirect-offset (compared to the current scheme):
>
> Upside:
> - cqe res/flags would be intact, avoids future headaches as you mentioned
> - short-write cases do not have to be failed in lower layers (as cqe->res is there to report bytes copied)

I personally think it is a super bad idea to allow short asynchronous append writes. The interface should allow the async zone append write to proceed if and only if it can be stuffed entirely into a single BIO, which necessarily will be a single request on the device side. Otherwise, the application would have no guarantees as to where a split may happen, and since this is zone append, the next async append will not leave any hole to complete a previous short write. This will wreck the structure of the application data.

For the sync case, this is fine. The application can just issue a new append write with the remaining unwritten data from the previous append write. But in the async case, if one write == one data record (e.g. a key-value tuple for an SSTable in an LSM tree), then allowing a short write will destroy the record: the partial write will be garbage data that will need garbage collection...

> Downside:
> - We may not be able to use RWF_APPEND, and need exposing a new type/flag (RWF_INDIRECT_OFFSET etc.) to user-space. Not sure if this sounds outrageous, but is it OK to have a uring-only flag which can be combined with RWF_APPEND?

Why? Where is the problem? O_APPEND/RWF_APPEND is currently meaningless for raw block device accesses. We could certainly define a meaning for these in the context of zoned block devices. I already commented on the need for first defining an interface (flags etc.) and its semantics (e.g. do we allow short zone append or not? What happens for regular files? etc.). Did you read my comment? We really need to first agree on something to clarify what needs to be done.

> - Expensive compared to sending results in the cqe itself. But I agree that this may not be major, and only for one type of write.

-- Damien Le Moal Western Digital Research
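To make the sync-case recovery described above concrete, here is a minimal userspace sketch (plain POSIX write(), not the proposed io_uring zone-append interface; the function name and record handling are made up for illustration):

/*
 * Minimal sketch: synchronous append of one record with short-write retry.
 * Illustrates why the sync case is recoverable: the caller still holds the
 * unwritten tail and can simply issue it again. Assumes a single writer
 * appending sequentially; purely illustrative.
 */
#include <unistd.h>
#include <errno.h>
#include <stddef.h>

static int append_record(int fd, const char *buf, size_t len)
{
	while (len) {
		ssize_t ret = write(fd, buf, len);

		if (ret < 0) {
			if (errno == EINTR)
				continue;
			return -1;	/* real error */
		}
		/* Short write: retry with the remaining tail. */
		buf += ret;
		len -= ret;
	}
	return 0;
}

The async zone-append case has no equivalent loop: another append may land between the partial write and the retry, so the partial record can never be completed in place. That is why the thread insists an async zone append must fit entirely in a single BIO/request.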

Re: [PATCH v4 6/6] io_uring: add support for zone-append

2020-07-31 Thread Damien Le Moal
On 2020/07/31 15:45, h...@infradead.org wrote: > On Fri, Jul 31, 2020 at 06:42:10AM +0000, Damien Le Moal wrote: >>> - We may not be able to use RWF_APPEND, and need exposing a new >>> type/flag (RWF_INDIRECT_OFFSET etc.) user-space. Not sure if this >>> sounds ou

Re: [PATCH v4 6/6] io_uring: add support for zone-append

2020-07-31 Thread Damien Le Moal
On 2020/07/31 16:59, Kanchan Joshi wrote: > On Fri, Jul 31, 2020 at 12:29 PM Damien Le Moal wrote: >> >> On 2020/07/31 15:45, h...@infradead.org wrote: >>> On Fri, Jul 31, 2020 at 06:42:10AM +, Damien Le Moal wrote: >>>>> - We may not be able to

Re: [PATCH v4 6/6] io_uring: add support for zone-append

2020-07-31 Thread Damien Le Moal
On 2020/07/31 18:14, h...@infradead.org wrote: > On Fri, Jul 31, 2020 at 08:14:22AM +0000, Damien Le Moal wrote: >> >>> This was one of the reason why we chose to isolate the operation by a >>> different IOCB flag and not by IOCB_APPEND alone. >> >> For zonef

Re: [PATCH v4 6/6] io_uring: add support for zone-append

2020-07-31 Thread Damien Le Moal
On 2020/07/31 18:41, h...@infradead.org wrote: > On Fri, Jul 31, 2020 at 09:34:50AM +0000, Damien Le Moal wrote: >> Sync writes are done under the inode lock, so there cannot be other writers >> at >> the same time. And for the sync case, since the actual written offset is &g

Re: [PATCH] riscv: Setup exception vector for K210 properly

2020-08-10 Thread Damien Le Moal
register to 0, indicating to exception vector > + * that we are presently executing in the kernel > + */ > + csr_write(CSR_SCRATCH, 0); > + /* Set the exception vector address */ > + csr_write(CSR_TVEC, &handle_exception); > +#endif > } > Looks OK to me. But out of curiosity, how did you trigger a problem ? I never got any weird exceptions with my busybox userspace. -- Damien Le Moal Western Digital Research

Re: [PATCH] riscv: Setup exception vector for K210 properly

2020-08-10 Thread Damien Le Moal
#ifndef CONFIG_MMU > + /* > + * Set sup0 scratch register to 0, indicating to exception vector > + * that we are presently executing in the kernel > + */ > + csr_write(CSR_SCRATCH, 0); > + /* Set the exception vector address */ > + csr_write(CSR_TVEC, &handle_exception); > +#endif > } > -- Damien Le Moal Western Digital Research

Re: [PATCH] riscv: Setup exception vector for K210 properly

2020-08-11 Thread Damien Le Moal
046511054ns > [0.008254] Console: colour dummy device 80x25 Interesting. Never saw that happening... Thanks ! > > > > > -Original Message- > > From: "Damien Le Moal" > > Sent: 2020-08-11 14:42:15 (Tuesday) > > To: "Qiu Wenbo" , "Palmer Dab

Re: [PATCH v2] riscv: Setup exception vector for nommu platform

2020-08-12 Thread Damien Le Moal
_bss_done: > call relocate > #endif /* CONFIG_MMU */ > > + call setup_trap_vector > /* Restore C environment */ > la tp, init_task > sw zero, TASK_TI_CPU(tp) > -- Damien Le Moal Western Digital Research

Re: [PATCH v2] riscv: Setup exception vector for nommu platform

2020-08-13 Thread Damien Le Moal
On 2020/08/13 15:45, Atish Patra wrote: > On Wed, Aug 12, 2020 at 10:44 PM Damien Le Moal wrote: >> >> On 2020/08/13 12:40, Qiu Wenbo wrote: >>> Exception vector is missing on nommu platform and that is an issue. >>> This patch is tested in Sipeed Maix Bit Dev B

Re: [PATCH 2/2] nvme: add emulation for zone-append

2020-08-19 Thread Damien Le Moal
any performance impact over regular writes *and* zone write locking does not in general degrade HDD write performance (only a few corner cases suffer from it). Comparing things equally, the same could be said of NVMe drives that do not have zone append native support: performance will be essentially the same using regular writes and emulated zone append. But mq-deadline and zone write locking will significantly lower performance for emulated zone append compared to a native zone append support by the drive. -- Damien Le Moal Western Digital Research

Re: [PATCH 1/2] nvme: set io-scheduler requirement for ZNS

2020-08-19 Thread Damien Le Moal
rite lock is taken and released by the emulation driver itself, ELEVATOR_F_ZBD_SEQ_WRITE is required only if the user will also be issuing regular writes at high QD. And that is trivially controllable by the user by simply setting the drive elevator to mq-deadline. Conclusion: setting ELEVATOR_F_Z

Re: [PATCH 1/2] nvme: set io-scheduler requirement for ZNS

2020-08-19 Thread Damien Le Moal
On 2020/08/19 19:32, Kanchan Joshi wrote: > On Wed, Aug 19, 2020 at 3:08 PM Damien Le Moal wrote: >> >> On 2020/08/19 18:27, Kanchan Joshi wrote: >>> On Tue, Aug 18, 2020 at 12:46 PM Christoph Hellwig wrote: >>>> >>>> On Tue, Aug 18, 2020 at

Re: [PATCH 4.19 13/35] null_blk: Fix zone size initialization

2021-01-10 Thread Damien Le Moal
_capacity_sects, > dev->zone_size_sects)? Would that be faster, more readable and robust > against weird dev->zone_size_sects sizes? Yes, we can change to this to be more readable. Will send a cleanup patch. Thanks ! > > Best regards, > Pavel > -- Damien Le Moal Western Digital Research

Re: [PATCH] RISC-V: simplify BUILTIN_DTB processing

2021-01-11 Thread Damien Le Moal
; dtb_early_pa = dtb_pa; > } > > Tested this with a nommu kernel on a MAIX bit board (K210 SoC). No problems detected. Tested-by: Damien Le Moal -- Damien Le Moal Western Digital Research

Re: [PATCH] bio: limit bio max size.

2021-01-12 Thread Damien Le Moal
inline bool bio_full(struct bio *bio, unsigned len) > if (bio->bi_vcnt >= bio->bi_max_vecs) > return true; > > - if (bio->bi_iter.bi_size > UINT_MAX - len) > + if (bio->bi_iter.bi_size > BIO_MAX_SIZE - len) > return true; > > return false; > -- Damien Le Moal Western Digital Research

Re: [RFC PATCH v2 0/2] add simple copy support

2020-12-07 Thread Damien Le Moal
On 2020/12/07 17:16, javier.g...@samsung.com wrote: > On 07.12.2020 08:06, Damien Le Moal wrote: >> On 2020/12/07 16:46, javier.g...@samsung.com wrote: >>> On 04.12.2020 23:40, Keith Busch wrote: >>>> On Fri, Dec 04, 2020 at 11:25:12AM +, Damien Le Moal wr

Re: [PATCH] bio: limit bio max size.

2021-01-12 Thread Damien Le Moal
x/bio.h >>> +++ b/include/linux/bio.h >>> @@ -20,6 +20,7 @@ >>> #endif >>> >>> #define BIO_MAX_PAGES 256 >>> +#define BIO_MAX_SIZE (BIO_MAX_PAGES * PAGE_SIZE) >>> >>> #define bio_prio(bio)

Re: [PATCH] bio: limit bio max size.

2021-01-12 Thread Damien Le Moal
_size > UINT_MAX - len) { >>>>> + if (bio->bi_iter.bi_size > BIO_MAX_SIZE - len) { >>>>> *same_page = false; >>>>> return false; >>>>> } >>>

Re: [PATCH] bio: limit bio max size.

2021-01-12 Thread Damien Le Moal
| 2 +- >>>>>>> include/linux/bio.h | 3 ++- >>>>>>> 2 files changed, 3 insertions(+), 2 deletions(-) >>>>>>> >>>>>>> diff --git a/block/bio.c b/block/bio.c >>>>>>> index 1f2cc1fbe283..dbe14d6

Re: [PATCH] bio: limit bio max size.

2021-01-13 Thread Damien Le Moal
atency is about 17ms including merge time too. > > 19ms looks too big just for preparing one 32MB sized bio, which isn't > supposed to > take so long. Can you investigate where the 19ms is taken just for > preparing one > 32MB sized bio? Changheun mentioned that the device side IO latency is 16.7ms out of the 19ms total. So the BIO handling, submission+completion takes about 2.3ms, and Changheun points above to 2ms for the submission part. > > It might be iov_iter_get_pages() for handling page fault. If yes, one > suggestion > is to enable THP(Transparent HugePage Support) in your application. But if that was due to page faults, the same large-ish time would be taken for preparing the size-limited BIOs too, no? No matter how the BIOs are diced, all 32MB of pages of the user IO are referenced... > > -- Damien Le Moal Western Digital Research

Re: [PATCH] bio: limit bio max size.

2021-01-13 Thread Damien Le Moal
On 2021/01/13 19:25, Ming Lei wrote: > On Wed, Jan 13, 2021 at 09:28:02AM +0000, Damien Le Moal wrote: >> On 2021/01/13 18:19, Ming Lei wrote: >>> On Wed, Jan 13, 2021 at 12:09 PM Changheun Lee >>> wrote: >>>> >>>>> On 2021/01/12 21:14, Chang

Re: [PATCH] bio: limit bio max size.

2021-01-13 Thread Damien Le Moal
On 2021/01/13 20:48, Ming Lei wrote: > On Wed, Jan 13, 2021 at 11:16:11AM +0000, Damien Le Moal wrote: >> On 2021/01/13 19:25, Ming Lei wrote: >>> On Wed, Jan 13, 2021 at 09:28:02AM +0000, Damien Le Moal wrote: >>>> On 2021/01/13 18:19, Ming Lei wrote: >>

Re: [PATCH] bio: limit bio max size.

2021-01-13 Thread Damien Le Moal
On 2021/01/14 12:53, Ming Lei wrote: > On Wed, Jan 13, 2021 at 12:02:44PM +0000, Damien Le Moal wrote: >> On 2021/01/13 20:48, Ming Lei wrote: >>> On Wed, Jan 13, 2021 at 11:16:11AM +0000, Damien Le Moal wrote: >>>> On 2021/01/13 19:25, Ming Lei wrote: >>&g

Re: [PATCH] drivers: pinctrl: Remove duplicate include of io.h

2021-03-22 Thread Damien Le Moal
rs/pinctrl/pinctrl-k210.c b/drivers/pinctrl/pinctrl-k210.c > index 8a733cf77ba0..f831526d06ff 100644 > --- a/drivers/pinctrl/pinctrl-k210.c > +++ b/drivers/pinctrl/pinctrl-k210.c > @@ -15,7 +15,6 @@ > #include > #include > #include > -#include > > #include > &g

Re: [PATCH -next 2/5] block: add ioctl to read the disk sequence number

2021-03-15 Thread Damien Le Moal
off on the commits that added those ioctl > numbers without updating this comment. Perhaps one of them will figure > out how to stop this happening in future. > Indeed. Will be more careful :) And send a patch to fix this. Thanks ! -- Damien Le Moal Western Digital Research

Re: [PATCH] zonefs: fix to update .i_wr_refcnt correctly in zonefs_open_zone()

2021-03-16 Thread Damien Le Moal
unlock; > } > @@ -986,6 +984,8 @@ static int zonefs_open_zone(struct inode *inode) > } > } > > + zi->i_wr_refcnt++; > + > unlock: > mutex_unlock(&zi->i_truncate_mutex); > > Good catch ! Will apply this and check zonefs test suite as this bug went undetected. Thanks. -- Damien Le Moal Western Digital Research

Re: [PATCH] btrfs: Fix a typo

2021-03-25 Thread Damien Le Moal
sponsible to cleanup... -> We're responsible for cleaning up... > * that is before @logical. > * > * Return 0 if there is no csum for the range. > -- > 2.26.2 > > -- Damien Le Moal Western Digital Research

Re: [PATCH v3 1/2] binfmt_flat: allow not offsetting data start

2021-04-15 Thread Damien Le Moal
On 2021/04/15 23:04, Greg Ungerer wrote: > Hi Damien, > > On 15/4/21 4:15 pm, Damien Le Moal wrote: >> Commit 2217b9826246 ("binfmt_flat: revert "binfmt_flat: don't offset >> the data start"") restored offsetting the start of the data section by >&

Re: [PATCH v2 0/2] Fix binfmt_flat loader for RISC-V

2021-04-15 Thread Damien Le Moal
; > I'm reasonably familiar with binfmt_{elf,misc,script}; anything > else gets touched as part of larger series and only with sanity checks > from other folks, if the changes are not entirely trivial. Al, Thanks for the clarification. Would it make sense to have an entry in MA

Re: [PATCH v3 1/2] binfmt_flat: allow not offsetting data start

2021-04-16 Thread Damien Le Moal
On 2021/04/16 16:24, Greg Ungerer wrote: > > On 16/4/21 9:22 am, Damien Le Moal wrote: >> On 2021/04/15 23:04, Greg Ungerer wrote: >>> Hi Damien, >>> >>> On 15/4/21 4:15 pm, Damien Le Moal wrote: >>>> Commit 2217b9826246 ("binfmt_fla

[PATCH v4 1/2] binfmt_flat: allow not offsetting data start

2021-04-16 Thread Damien Le Moal
T disabled case) and to 0 when CONFIG_BINFMT_FLAT_NO_DATA_START_OFFSET is enabled. DATA_START_OFFSET_WORDS is used in load_flat_file() to calculate the data section length and start position. Signed-off-by: Damien Le Moal --- fs/Kconfig.binfmt | 3 +++ fs/binfmt_flat.c | 19 ++-

[PATCH v4 2/2] riscv: Disable data start offset in flat binaries

2021-04-16 Thread Damien Le Moal
CONFIG_MMU is disabled. Signed-off-by: Damien Le Moal Acked-by: Palmer Dabbelt --- arch/riscv/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 4515a10c5d22..add528eb9235 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -33,6

[PATCH v4 0/2] Fix binfmt_flat loader for RISC-V

2021-04-16 Thread Damien Le Moal
ATA_NO_GAP macro with CONFIG_BINFMT_FLAT_NO_TEXT_DATA_GAP config option (patch 1). * Remove the addition of riscv/include/asm/flat.h and set CONFIG_BINFMT_FLAT_NO_TEXT_DATA_GAP for RISCV and !MMU Damien Le Moal (2): binfmt_flat: allow not offsetting data start riscv: Disable data start offs

Re: [RFC PATCH v5 2/4] block: add simple copy support

2021-04-12 Thread Damien Le Moal
On 2021/04/12 23:35, Selva Jove wrote: > On Mon, Apr 12, 2021 at 5:55 AM Damien Le Moal wrote: >> >> On 2021/04/07 20:33, Selva Jove wrote: >>> Initially I started moving the dm-kcopyd interface to the block layer >>> as a generic interface. >>> Once I di

Re: [PATCH v2 0/2] Fix binfmt_flat loader for RISC-V

2021-04-14 Thread Damien Le Moal
On 2021/04/08 0:49, Damien Le Moal wrote: > RISC-V NOMMU flat binaries cannot tolerate a gap between the text and > data section as the toolchain fully resolves at compile time the PC > relative global pointer (__global_pointer$ value loaded in gp register). > Without a relocation en

[PATCH v3 0/2] Fix binfmt_flat loader for RISC-V

2021-04-14 Thread Damien Le Moal
CONFIG_BINFMT_FLAT_NO_TEXT_DATA_GAP config option (patch 1). * Remove the addition of riscv/include/asm/flat.h and set CONFIG_BINFMT_FLAT_NO_TEXT_DATA_GAP for RISCV and !MMU Damien Le Moal (2): binfmt_flat: allow not offsetting data start riscv: Disable text-data gap in flat binaries arch/riscv

[PATCH v3 1/2] binfmt_flat: allow not offsetting data start

2021-04-14 Thread Damien Le Moal
s the use of a single RAM region for loading (equivalent to FLAT_FLAG_RAM being set). Signed-off-by: Damien Le Moal Acked-by: Palmer Dabbelt --- fs/Kconfig.binfmt | 3 +++ fs/binfmt_flat.c | 21 +++-- 2 files changed, 18 insertions(+), 6 deletions(-) diff --git a/fs/Kconfig.b

[PATCH v3 2/2] riscv: Disable text-data gap in flat binaries

2021-04-14 Thread Damien Le Moal
incorrect and prevent correct execution of a flatbin executable. Avoid this problem by enabling CONFIG_BINFMT_FLAT_NO_TEXT_DATA_GAP automatically when CONFIG_RISCV is enabled and CONFIG_MMU disabled. Signed-off-by: Damien Le Moal Acked-by: Palmer Dabbelt --- arch/riscv/Kconfig | 1 + 1 file

Re: [PATCH v2 0/2] Fix binfmt_flat loader for RISC-V

2021-04-14 Thread Damien Le Moal
6PM -0700, Palmer Dabbelt wrote: >> On Wed, 14 Apr 2021 17:32:10 PDT (-0700), Damien Le Moal wrote: >>>> On 2021/04/08 0:49, Damien Le Moal wrote: >>>> RISC-V NOMMU flat binaries cannot tolerate a gap between the text and >>>> data section as the toolchain fully r

Re: [RESEND,v5,1/2] bio: limit bio max size

2021-04-07 Thread Damien Le Moal
3aeab9e7e97b 100644 >>>>>>>>>> --- a/include/linux/blkdev.h >>>>>>>>>> +++ b/include/linux/blkdev.h >>>>>>>>>> @@ -621,6 +621,7 @@ struct request_queue { >>>>>>>>>> #define QUEUE_FLAG_RQ_ALLOC_TIME 27 /* record rq->alloc_time_ns */ >>>>>>>>>> #define QUEUE_FLAG_HCTX_ACTIVE 28 /* at least one blk-mq >>>>>>>>>> hctx is active */ >>>>>>>>>> #define QUEUE_FLAG_NOWAIT 29 /* device supports NOWAIT */ >>>>>>>>>> +#define QUEUE_FLAG_LIMIT_BIO_SIZE 30/* limit bio size */ >>>>>>>>>> >>>>>>>>>> #define QUEUE_FLAG_MQ_DEFAULT ((1 << QUEUE_FLAG_IO_STAT) | >>>>>>>>>> \ >>>>>>>>>> (1 << QUEUE_FLAG_SAME_COMP) | >>>>>>>>>> \ >>>>>>>>>> @@ -667,6 +668,8 @@ bool blk_queue_flag_test_and_set(unsigned int >>>>>>>>>> flag, struct request_queue *q); >>>>>>>>>> #define blk_queue_fua(q)test_bit(QUEUE_FLAG_FUA, >>>>>>>>>> &(q)->queue_flags) >>>>>>>>>> #define blk_queue_registered(q) test_bit(QUEUE_FLAG_REGISTERED, >>>>>>>>>> &(q)->queue_flags) >>>>>>>>>> #define blk_queue_nowait(q) test_bit(QUEUE_FLAG_NOWAIT, >>>>>>>>>> &(q)->queue_flags) >>>>>>>>>> +#define blk_queue_limit_bio_size(q) \ >>>>>>>>>> +test_bit(QUEUE_FLAG_LIMIT_BIO_SIZE, &(q)->queue_flags) >>>>>>>>>> >>>>>>>>>> extern void blk_set_pm_only(struct request_queue *q); >>>>>>>>>> extern void blk_clear_pm_only(struct request_queue *q); >>>>>>>>>> -- >>>>>>>>>> 2.28.0 >>>>>>>>>> >>>>>>>>> >>>>>>>>> Please feedback to me if more modification is needed to apply. :) >>>>>>>> >>>>>>>> You are adding code that tests for a value to be set, yet you never set >>>>>>>> it in this code so why is it needed at all? >>>>>>> >>>>>>> This patch is a solution for some inefficient case of multipage bvec >>>>>>> like >>>>>>> as current DIO scenario. So it's not set as a default. >>>>>>> It will be set when bio size limitation is needed in runtime. >>>>>> >>>>>> Set where? >>>>> >>>>> In my environment, set it on init.rc file like as below. >>>>> "echo 1 > /sys/block/sda/queue/limit_bio_size" >>>> >>>> I do not see any sysfs file in this patch, and why would you ever want >>>> to be forced to manually do this? The hardware should know the limits >>>> itself, and should automatically tune things like this, do not force a >>>> user to do it as that's just not going to go well at all. >>> >>> Patch for sysfs is sent "[RESEND,v5,2/2] bio: add limit_bio_size sysfs". >>> Actually I just suggested constant - 1MB - value to limit bio size at first. >>> But I got a feedback that patch will be better if it's optional, and >>> getting meaningful value from device queue on patchwork. >>> There are some differences for each system environment I think. >>> >>> But there are inefficient logic obviously by applying of multipage bvec. >>> So it will be shown in several system environment. >>> Currently providing this patch as a option would be better to select >>> according to each system environment, and policy I think. >>> >>> Please, revisit applying this patch. >>> >>>> >>>> So if this patch series is forcing a new option to be configured by >>>> sysfs only, that's not acceptable, sorry. >>> >>> If it is not acceptable ever with current, may I progress review again >>> with default enabled? >> >> I am sorry, I can not parse this, can you rephrase this? >> >> thanks, >> >> greg k-h >> > > I'll prepare new patch as you recommand. It will be added setting of > limit_bio_size automatically when queue max sectors is determined. Please do that in the driver for the HW that benefits from it. Do not do this for all block devices. 
> > > Thanks, > > Changheun Lee > -- Damien Le Moal Western Digital Research
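For illustration, a rough sketch of what "do that in the driver" could look like with the flag proposed in this series; QUEUE_FLAG_LIMIT_BIO_SIZE is from the patch under review (not mainline) and the limit value is an assumption:

/*
 * Sketch only: a driver that actually benefits from early BIO splitting
 * opts its queue into the proposed QUEUE_FLAG_LIMIT_BIO_SIZE flag,
 * instead of userspace toggling it through sysfs for every disk.
 */
#include <linux/blkdev.h>

static void example_setup_queue_limits(struct request_queue *q)
{
	/* Device-specific transfer limit: 2048 sectors = 1 MiB (assumed). */
	blk_queue_max_hw_sectors(q, 2048);

	/* Ask the block layer to cap bio growth for this queue only. */
	blk_queue_flag_set(QUEUE_FLAG_LIMIT_BIO_SIZE, q);
}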

[PATCH 2/2] riscv: introduce asm/flat.h

2021-04-07 Thread Damien Le Moal
incorrect and prevent correct execution of a flatbin executable. Avoid this problem by introducing the file asm/flat.h and defining the macro FLAT_TEXT_DATA_NO_GAP to indicate that the text and data sections must be loaded at contiguous addresses. Signed-off-by: Damien Le Moal --- arch/riscv

[PATCH 0/2] Fix binfmt_flat loader for RISC-V

2021-04-07 Thread Damien Le Moal
request the gap suppression using the newly introduced macro FLAT_TEXT_DATA_NO_GAP. These patches do not change the binfmt_flat loader behavior for other architectures. Damien Le Moal (2): binfmt_flat: allow not offsetting data start riscv: introduce asm/flat.h arch/riscv/include/asm/Kbuild | 1

[PATCH 1/2] binfmt_flat: allow not offsetting data start

2021-04-07 Thread Damien Le Moal
_flat_file() to calculate the data section length and start position. The definition of FLAT_TEXT_DATA_NO_GAP by an architecture also prevents the use of the separate text/data load case (when FLAT_FLAG_RAM and FLAT_FLAG_GZIP are not set with NOMMU kernels). Signed-off-by: Damien Le Moal -

Re: [null_blk] de3510e52b: blktests.block.014.fail

2021-04-07 Thread Damien Le Moal
p split-job --compatible job.yaml > bin/lkp runcompatible-job.yaml > > > > --- > 0DAY/LKP+ Test Infrastructure Open Source Technology Center > https://lists.01.org/hyperkitty/list/l...@lists.01.org Intel Corporation > > Thanks, > Oliver Sang > -- Damien Le Moal Western Digital Research

Re: [PATCH v4 5/6] PCI: fu740: Add SiFive FU740 PCIe host controller driver

2021-04-01 Thread Damien Le Moal
gt;pcie_aux = devm_clk_get(dev, "pcie_aux"); > + if (IS_ERR(afp->pcie_aux)) > + return dev_err_probe(dev, PTR_ERR(afp->pcie_aux), > + "pcie_aux clock source missing or > invalid\n"); > + > + /* Fetch reset */ > + afp->rst = devm_reset_control_get_exclusive(dev, NULL); > + if (IS_ERR(afp->rst)) > + return dev_err_probe(dev, PTR_ERR(afp->rst), "unable to get > reset\n"); > + > + platform_set_drvdata(pdev, afp); > + > + ret = dw_pcie_host_init(&pci->pp); > + if (ret < 0) > + return ret; You can simplify this with a simple: return dw_pcie_host_init(&pci->pp); > + > + return 0; > +} > + > +static void fu740_pcie_shutdown(struct platform_device *pdev) > +{ > + struct fu740_pcie *afp = platform_get_drvdata(pdev); > + > + /* Bring down link, so bootloader gets clean state in case of reboot */ > + fu740_pcie_assert_reset(afp); > +} > + > +static const struct of_device_id fu740_pcie_of_match[] = { > + { .compatible = "sifive,fu740-pcie", }, > + {}, > +}; > + > +static struct platform_driver fu740_pcie_driver = { > + .driver = { > +.name = "fu740-pcie", > +.of_match_table = fu740_pcie_of_match, > +.suppress_bind_attrs = true, > + }, > + .probe = fu740_pcie_probe, > + .shutdown = fu740_pcie_shutdown, > +}; > + > +builtin_platform_driver(fu740_pcie_driver); > -- Damien Le Moal Western Digital Research

Re: [PATCH v4 1/2] binfmt_flat: allow not offsetting data start

2021-04-16 Thread Damien Le Moal
On 2021/04/17 13:52, Greg Ungerer wrote: > > On 17/4/21 11:10 am, Damien Le Moal wrote: >> Commit 2217b9826246 ("binfmt_flat: revert "binfmt_flat: don't offset >> the data start"") restored offsetting the start of the data section by >> a number

Re: [RESEND,v5,1/2] bio: limit bio max size

2021-04-11 Thread Damien Le Moal
On 2021/04/09 23:47, Bart Van Assche wrote: > On 4/7/21 3:27 AM, Damien Le Moal wrote: >> On 2021/04/07 18:46, Changheun Lee wrote: >>> I'll prepare new patch as you recommand. It will be added setting of >>> limit_bio_size automatically when queue max sectors is

Re: [RFC PATCH v5 2/4] block: add simple copy support

2021-04-11 Thread Damien Le Moal
/* simple copy not supported in stacked devices */ >>> + t->copy_offload = 0; >>> + t->max_copy_sectors = 0; >>> + t->max_copy_range_sectors = 0; >>> + t->max_copy_nr_ranges = 0; >> >> You do not need this. Limits not explicitely initialized are 0 already. >> But I do not see why you can't support copy on stacked devices. That should >> be >> feasible taking the min() for each of the above limit. >> > > Disabling stacked device support was feedback from v2. > > https://patchwork.kernel.org/project/linux-block/patch/20201204094659.12732-2-selvakuma...@samsung.com/ Right. But the initialization to 0 is still not needed. The fields are already initialized to 0. -- Damien Le Moal Western Digital Research

Re: [RFC PATCH v5 2/4] block: add simple copy support

2021-02-19 Thread Damien Le Moal
ndom(q) test_bit(QUEUE_FLAG_ADD_RANDOM, > &(q)->queue_flags) > #define blk_queue_discard(q) test_bit(QUEUE_FLAG_DISCARD, &(q)->queue_flags) > +#define blk_queue_copy(q) test_bit(QUEUE_FLAG_SIMPLE_COPY, > &(q)->queue_flags) > #define blk_queue_zone_resetall(q) \ > test_bit(QUEUE_FLAG_ZONE_RESETALL, &(q)->queue_flags) > #define blk_queue_secure_erase(q) \ > @@ -1069,6 +1075,9 @@ static inline unsigned int > blk_queue_get_max_sectors(struct request_queue *q, > return min(q->limits.max_discard_sectors, > UINT_MAX >> SECTOR_SHIFT); > > + if (unlikely(op == REQ_OP_COPY)) > + return q->limits.max_copy_sectors; > +

I would agree with this if a copy BIO was always a single range, but that is not the case. So I am not sure this makes sense at all.

> if (unlikely(op == REQ_OP_WRITE_SAME)) > return q->limits.max_write_same_sectors; > > @@ -1343,6 +1352,12 @@ extern int __blkdev_issue_discard(struct block_device > *bdev, sector_t sector, > sector_t nr_sects, gfp_t gfp_mask, int flags, > struct bio **biop); > > +#define BLKDEV_COPY_NOEMULATION (1 << 0) /* do not emulate if > copy offload not supported */ > + > +extern int blkdev_issue_copy(struct block_device *src_bdev, int nr_srcs, > + struct range_entry *src_rlist, struct block_device *dest_bdev, > + sector_t dest, gfp_t gfp_mask, int flags);

No need for extern.

> + > #define BLKDEV_ZERO_NOUNMAP (1 << 0) /* do not free blocks */ > #define BLKDEV_ZERO_NOFALLBACK (1 << 1) /* don't write explicit > zeroes */ > > diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h > index f44eb0a04afd..5cadb176317a 100644 > --- a/include/uapi/linux/fs.h > +++ b/include/uapi/linux/fs.h > @@ -64,6 +64,18 @@ struct fstrim_range { > __u64 minlen; > }; > > +struct range_entry { > + __u64 src; > + __u64 len; > +}; > + > +struct copy_range { > + __u64 dest; > + __u64 nr_range; > + __u64 range_list; > + __u64 rsvd; > +}; > + > /* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions > */ > #define FILE_DEDUPE_RANGE_SAME 0 > #define FILE_DEDUPE_RANGE_DIFFERS 1 > @@ -184,6 +196,7 @@ struct fsxattr { > #define BLKSECDISCARD _IO(0x12,125) > #define BLKROTATIONAL _IO(0x12,126) > #define BLKZEROOUT _IO(0x12,127) > +#define BLKCOPY _IOWR(0x12, 128, struct copy_range) > /* > * A jump here: 130-131 are reserved for zoned block devices > * (see uapi/linux/blkzoned.h) >

Please test your code more thoroughly. It is full of problems that you should have detected with better testing, including RO devices, partitions and error path coverage.

-- Damien Le Moal Western Digital Research
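As context for the UAPI being reviewed, a hedged userspace sketch of how the proposed BLKCOPY ioctl would be driven; the structures and ioctl number mirror the quoted patch, nothing here is a mainline interface, and the device path, offsets and units are assumptions:

/*
 * Userspace sketch of driving the BLKCOPY ioctl proposed in this series.
 * The structure layout and ioctl number mirror the quoted patch; the
 * unit of src/len/dest (assumed to be 512B sectors here) and the device
 * path are made up for illustration.
 */
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

struct range_entry {
	uint64_t src;
	uint64_t len;
};

struct copy_range {
	uint64_t dest;
	uint64_t nr_range;
	uint64_t range_list;	/* user pointer to a struct range_entry array */
	uint64_t rsvd;
};

#define BLKCOPY _IOWR(0x12, 128, struct copy_range)

int main(void)
{
	struct range_entry ranges[2] = {
		{ .src = 0,    .len = 8 },
		{ .src = 1024, .len = 8 },
	};
	struct copy_range cr = {
		.dest = 4096,
		.nr_range = 2,
		.range_list = (uint64_t)(uintptr_t)ranges,
	};
	int ret, fd = open("/dev/nvme0n1", O_RDWR);	/* hypothetical device */

	if (fd < 0)
		return 1;
	ret = ioctl(fd, BLKCOPY, &cr);
	close(fd);
	return ret ? 1 : 0;
}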

Re: [dm-devel] [RFC PATCH v2 1/2] block: add simple copy support

2020-12-08 Thread Damien Le Moal
uld be off maybe). Integrating nvme simple copy in such initial support would I think be quite simple and scsi xcopy can follow. From there, adding stack device support can be worked on with little, if any, impact on the existing users of the block copy API (mostly FSes such as f2fs and btrfs). -- Damien Le Moal Western Digital Research

Re: [RFC PATCH v2 0/2] add simple copy support

2020-12-04 Thread Damien Le Moal
+++ > drivers/nvme/host/core.c | 87 +++ > include/linux/bio.h | 1 + > include/linux/blk_types.h | 15 + > include/linux/blkdev.h| 15 + > include/linux/nvme.h | 43 - > include/uapi/linux/fs.h | 13 > 14 files changed, 461 insertions(+), 11 deletions(-) > -- Damien Le Moal Western Digital Research

Re: [RFC PATCH v2 0/2] add simple copy support

2020-12-07 Thread Damien Le Moal
On 2020/12/07 16:46, javier.g...@samsung.com wrote: > On 04.12.2020 23:40, Keith Busch wrote: >> On Fri, Dec 04, 2020 at 11:25:12AM +0000, Damien Le Moal wrote: >>> On 2020/12/04 20:02, SelvaKumar S wrote: >>>> This patchset tries to add support for

Re: [PATCH] riscv: defconfig: k210: Disable CONFIG_VT

2020-11-24 Thread Damien Le Moal
set > # CONFIG_SERIO is not set > +# CONFIG_VT is not set > # CONFIG_LEGACY_PTYS is not set > # CONFIG_LDISC_AUTOLOAD is not set > # CONFIG_HW_RANDOM is not set > @@ -60,7 +61,6 @@ CONFIG_GPIO_SIFIVE=y > CONFIG_POWER_RESET=y > CONFIG_POWER_RESET_SYSCON=y > # CONFIG_HWMON is n

Re: [PATCH] efi: EFI_EARLYCON should depend on EFI

2020-11-25 Thread Damien Le Moal
+++ b/drivers/firmware/efi/Kconfig > @@ -270,7 +270,7 @@ config EFI_DEV_PATH_PARSER >   > >  config EFI_EARLYCON >   def_bool y > - depends on SERIAL_EARLYCON && !ARM && !IA64 > + depends on EFI && SERIAL_EARLYCON && !ARM && !IA64 >   select FONT_SUPPORT >   select ARCH_USE_MEMREMAP_PROT >   > Looks good to me. Reviewed-by: Damien Le Moal -- Damien Le Moal Western Digital

Re: [PATCH] riscv: defconfig: k210: Disable CONFIG_VT

2020-11-25 Thread Damien Le Moal
On Wed, 2020-11-25 at 09:20 +0100, Geert Uytterhoeven wrote: > Hi Damien, > > On Wed, Nov 25, 2020 at 7:14 AM Damien Le Moal wrote: > > On 2020/11/25 3:57, Geert Uytterhoeven wrote: > > > There is no need to enable Virtual Terminal support in the Canaan > > >

Re: [PATCH] riscv: defconfig: k210: Disable CONFIG_VT

2020-11-25 Thread Damien Le Moal
On 2020/11/25 17:51, Geert Uytterhoeven wrote: > Hi Damien, > > On Wed, Nov 25, 2020 at 7:14 AM Damien Le Moal wrote: >> On 2020/11/25 3:57, Geert Uytterhoeven wrote: >>> There is no need to enable Virtual Terminal support in the Canaan >>> Kendryte K210 defco

Re: [PATCH] riscv: defconfig: k210: Disable CONFIG_VT

2020-11-25 Thread Damien Le Moal
On 2020/11/25 18:26, Geert Uytterhoeven wrote: > Hi Damien, > > On Wed, Nov 25, 2020 at 10:02 AM Damien Le Moal wrote: >> On 2020/11/25 17:51, Geert Uytterhoeven wrote: >>> On Wed, Nov 25, 2020 at 7:14 AM Damien Le Moal >>> wrote: >>>> On 2020/11/2

Re: [PATCH] riscv: defconfig: k210: Disable CONFIG_VT

2020-11-25 Thread Damien Le Moal
On 2020/11/25 20:00, Damien Le Moal wrote: > On 2020/11/25 18:26, Geert Uytterhoeven wrote: >> Hi Damien, >> >> On Wed, Nov 25, 2020 at 10:02 AM Damien Le Moal >> wrote: >>> On 2020/11/25 17:51, Geert Uytterhoeven wrote: >>>> On Wed, Nov 25, 2020

Re: [PATCH] riscv: defconfig: k210: Disable CONFIG_VT

2020-11-25 Thread Damien Le Moal
On Wed, 2020-11-25 at 13:47 +0100, Geert Uytterhoeven wrote: > Hi Damien, > > On Wed, Nov 25, 2020 at 12:00 PM Damien Le Moal wrote: > > On 2020/11/25 18:26, Geert Uytterhoeven wrote: > > > On Wed, Nov 25, 2020 at 10:02 AM Damien Le Moal > > > wrote: &

Re: [PATCH] block: fix possible bd_size_lock deadlock

2021-03-12 Thread Damien Le Moal
e warnings come from i_size_write() calling preempt_disable() rather than from set_capacity()'s use of spin_lock(&bdev->bd_size_lock). I wonder how it is possible to have brd being initialized so early. I am not sure how to fix that. It looks like arm arch code territory. For now, we could revert the revert, as I do not think that Yanfei's patch is enough since completions may come from hard IRQ context too, which is not covered by the spin_lock_bh() variants (cf. a similar problem we are facing with that in scsi completion [1]). I do not have any good idea how to proceed though. [1] https://lore.kernel.org/linux-scsi/ph0pr04mb7416c8330459e92d8aa21a889b...@ph0pr04mb7416.namprd04.prod.outlook.com/T/#t -- Damien Le Moal Western Digital Research

Re: [PATCH 1/1] scsi: sd: use max_xfer_blocks for set rw_max if max_xfer_blocks is available

2021-01-20 Thread Damien Le Moal
ith max_xfer_blocks. So if for your device max_sectors ends up being too small, it is likely because the device itself is reporting an opt_xfer_blocks value that is too small for its own good. The max_sectors limit can be manually increased with "echo xxx > /sys/block/sdX/queue/max_sectors_kb". A udev rule can be used to handle this automatically if needed. But to get a saner default for that device, I do not think that this patch is the right solution. Ideally, the device peculiarity should be handled with a quirk, but that is not used in scsi. So besides the udev rule trick, I am not sure what the right approach is here. >> q->limits.io_opt = logical_to_bytes(sdp, sdkp->opt_xfer_blocks); >> rw_max = logical_to_sectors(sdp, sdkp->opt_xfer_blocks); >> } else { >> -- >> 2.29.0 >> >> > -- Damien Le Moal Western Digital Research

Re: [PATCH v2] bio: limit bio max size.

2021-01-20 Thread Damien Le Moal
we can hold */ > > atomic_t __bi_cnt; /* pin count */

This modification comes at the cost of increasing the bio structure size to simply tell the block layer "do not delay BIO splitting"... I think there is a much simpler approach. What about:

1) Use a request queue flag to indicate "limit BIO size".
2) Modify __bio_try_merge_page() to look at that flag to disallow page merging if the bio size exceeds blk_queue_get_max_sectors(), or more ideally a version of it that takes into account the bio start sector.
3) Set the "limit bio size" queue flag in the driver of the device that benefits from this change. Eventually, that could also be controlled through sysfs.

With such a change, you will get the same result without having to increase the BIO structure size.

-- Damien Le Moal Western Digital Research

Re: [PATCH v2] bio: limit bio max size

2021-01-21 Thread Damien Le Moal
mit_bio_merge_size(q) \ test_bit(QUEUE_FLAG_LIMIT_MERGE, &(q)->queue_flags) static inline unsigned int bio_max_merge_size(struct bio *bio) { struct request_queue *q = bio->bi_disk->queue; if (blk_queue_limit_bio_merge_size(q)) return blk_queue_get_max_sectors(q, bio_op(bio)) << SECTOR_SHIFT; return UINT_MAX; } and use that helper in __bio_try_merge_page(), e.g.: if (bio->bi_iter.bi_size > bio_max_merge_size(bio) - len) { *same_page = false; return false; } No need to change the bio struct. If you measure performance with and without this change on nullblk, you can verify if it has any impact for regular devices. And for your use case, that should give you the same performance. > > bool __bio_try_merge_page(struct bio *bio, struct page *page, > unsigned int len, unsigned int off, bool *same_page) > { > ... > if (page_is_mergeable(bv, page, len, off, same_page)) { > - if (bio->bi_iter.bi_size > UINT_MAX - len) { > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) { > *same_page = false; > return false; > } > > bv->bv_len += len; > bio->bi_iter.bi_size += len; > return true; > } > ... > } > > > static inline bool bio_full(struct bio *bio, unsigned len) > { > ... > - if (bio->bi_iter.bi_size > UINT_MAX - len) > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > return true; > ... > } > > +void bio_set_dev(struct bio *bio, struct block_device *bdev) > +{ > + if (bio->bi_disk != bdev->bd_disk) > + bio_clear_flag(bio, BIO_THROTTLED); > + > + bio->bi_disk = bdev->bd_disk; > + bio->bi_partno = bdev->bd_partno; > + if (blk_queue_limit_bio_max_size(bio)) > + bio->bi_max_size = blk_queue_get_bio_max_size(bio); > + > + bio_associate_blkg(bio); > +} > +EXPORT_SYMBOL(bio_set_dev); > > > -- > > Damien Le Moal > > Western Digital Research > > --- > Changheun Lee > Samsung Electronics -- Damien Le Moal Western Digital

Re: [PATCH 1/1] scsi: sd: use max_xfer_blocks for set rw_max if max_xfer_blocks is available

2021-01-21 Thread Damien Le Moal
used by the block layer to limit command size if the value of >> opt_xfer_blocks is smaller than the limit initially set with max_xfer_blocks. >> >> So if for your device max_sectors end up being too small, it is likely >> because >> the device itself is reporting an o

Re: [PATCH v3 1/2] bio: limit bio max size

2021-01-26 Thread Damien Le Moal
> if (unlikely(ret)) { > /* >* We have to stop part way through an IO. We must fall > diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c > index bec47f2d074b..c95ac37f9305 100644 > --- a/fs/zonefs/super.c > +++ b/fs/zonefs/super.c > @@ -690,7 +690,7 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, > struct iov_iter *from) > if (iocb->ki_flags & IOCB_DSYNC) > bio->bi_opf |= REQ_FUA; > > - ret = bio_iov_iter_get_pages(bio, from); > + ret = bio_iov_iter_get_pages(bio, from, is_sync_kiocb(iocb)); > if (unlikely(ret)) > goto out_release; > > diff --git a/include/linux/bio.h b/include/linux/bio.h > index 676870b2c88d..fa3a503b955c 100644 > --- a/include/linux/bio.h > +++ b/include/linux/bio.h > @@ -472,7 +472,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page > *page, > unsigned int len, unsigned int off, bool *same_page); > void __bio_add_page(struct bio *bio, struct page *page, > unsigned int len, unsigned int off); > -int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter); > +int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter, bool > sync); > void bio_release_pages(struct bio *bio, bool mark_dirty); > extern void bio_set_pages_dirty(struct bio *bio); > extern void bio_check_pages_dirty(struct bio *bio); > > > Thanks, > Ming > > -- Damien Le Moal Western Digital Research

Re: [PATCH v3 1/2] bio: limit bio max size

2021-01-26 Thread Damien Le Moal
On 2021/01/26 15:07, Ming Lei wrote: > On Tue, Jan 26, 2021 at 04:06:06AM +0000, Damien Le Moal wrote: >> On 2021/01/26 12:58, Ming Lei wrote: >>> On Tue, Jan 26, 2021 at 10:32:34AM +0900, Changheun Lee wrote: >>>> bio size can grow up to 4GB when muli-page bvec is

Re: [PATCH v3 1/2] bio: limit bio max size

2021-01-26 Thread Damien Le Moal
68,8 @@ bool blk_queue_flag_test_and_set(unsigned int flag, > struct request_queue *q); > #define blk_queue_fua(q) test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags) > #define blk_queue_registered(q) test_bit(QUEUE_FLAG_REGISTERED, > &(q)->queue_flags) > #define blk_queue_nowait(q) test_bit(QUEUE_FLAG_NOWAIT, &(q)->queue_flags) > +#define blk_queue_limit_bio_size(q) \ > + test_bit(QUEUE_FLAG_LIMIT_BIO_SIZE, &(q)->queue_flags) > > extern void blk_set_pm_only(struct request_queue *q); > extern void blk_clear_pm_only(struct request_queue *q); > -- Damien Le Moal Western Digital Research

Re: [PATCH v3 1/2] bio: limit bio max size

2021-01-26 Thread Damien Le Moal
; >> No need for extern. > > It's just for compile warning in my test environment. > I'll remove it too. But I think compile warning could be in the other > .c file which includes bio.h. Is it OK? Hmmm... not having extern should not generate a compilation warning. There are tons of functions declared without extern in header files in the kernel. What compiler are you using ? -- Damien Le Moal Western Digital Research

Re: [PATCH] dm zoned: select CONFIG_CRC32

2021-01-03 Thread Damien Le Moal
g DM_ZONED > tristate "Drive-managed zoned block device target support" > depends on BLK_DEV_DM > depends on BLK_DEV_ZONED > + select CRC32 > help > This device-mapper target takes a host-managed or host-aware zoned > block d

Re: [PATCH] zonefs: select CONFIG_CRC32

2021-01-03 Thread Damien Le Moal
b/fs/zonefs/Kconfig > @@ -3,6 +3,7 @@ config ZONEFS_FS > depends on BLOCK > depends on BLK_DEV_ZONED > select FS_IOMAP > + select CRC32 > help > zonefs is a simple file system which exposes zones of a zoned block > device (e.g. host-managed

Re: [RFC PATCH v4 1/3] block: export bio_map_kern()

2021-01-04 Thread Damien Le Moal
> extern int blk_rq_map_user_iov(struct request_queue *, struct request *, > struct rq_map_data *, const struct iov_iter *, > -- Damien Le Moal Western Digital Research

Re: [RFC PATCH v4 2/3] block: add simple copy support

2021-01-04 Thread Damien Le Moal
gt; active */ > #define QUEUE_FLAG_NOWAIT 29 /* device supports NOWAIT */ > +#define QUEUE_FLAG_COPY 30 /* supports copy */ I think this should be called QUEUE_FLAG_SIMPLE_COPY to indicate more precisely the type of copy supported. SCSI XCOPY is more advanced... > > #define QUEUE_FLAG_MQ_DEFAULT((1 << QUEUE_FLAG_IO_STAT) | > \ >(1 << QUEUE_FLAG_SAME_COMP) | \ > @@ -647,6 +652,7 @@ bool blk_queue_flag_test_and_set(unsigned int flag, > struct request_queue *q); > #define blk_queue_io_stat(q) test_bit(QUEUE_FLAG_IO_STAT, &(q)->queue_flags) > #define blk_queue_add_random(q) test_bit(QUEUE_FLAG_ADD_RANDOM, > &(q)->queue_flags) > #define blk_queue_discard(q) test_bit(QUEUE_FLAG_DISCARD, &(q)->queue_flags) > +#define blk_queue_copy(q)test_bit(QUEUE_FLAG_COPY, &(q)->queue_flags) > #define blk_queue_zone_resetall(q) \ > test_bit(QUEUE_FLAG_ZONE_RESETALL, &(q)->queue_flags) > #define blk_queue_secure_erase(q) \ > @@ -1061,6 +1067,9 @@ static inline unsigned int > blk_queue_get_max_sectors(struct request_queue *q, > return min(q->limits.max_discard_sectors, > UINT_MAX >> SECTOR_SHIFT); > > + if (unlikely(op == REQ_OP_COPY)) > + return q->limits.max_copy_sectors; > + > if (unlikely(op == REQ_OP_WRITE_SAME)) > return q->limits.max_write_same_sectors; > > @@ -1335,6 +1344,10 @@ extern int __blkdev_issue_discard(struct block_device > *bdev, sector_t sector, > sector_t nr_sects, gfp_t gfp_mask, int flags, > struct bio **biop); > > +extern int blkdev_issue_copy(struct block_device *bdev, int nr_srcs, > + struct range_entry *src_rlist, struct block_device *dest_bdev, > + sector_t dest, gfp_t gfp_mask); > + > #define BLKDEV_ZERO_NOUNMAP (1 << 0) /* do not free blocks */ > #define BLKDEV_ZERO_NOFALLBACK (1 << 1) /* don't write explicit > zeroes */ > > diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h > index f44eb0a04afd..5cadb176317a 100644 > --- a/include/uapi/linux/fs.h > +++ b/include/uapi/linux/fs.h > @@ -64,6 +64,18 @@ struct fstrim_range { > __u64 minlen; > }; > > +struct range_entry { > + __u64 src; > + __u64 len; > +}; > + > +struct copy_range { > + __u64 dest; > + __u64 nr_range; > + __u64 range_list; > + __u64 rsvd; > +}; > + > /* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions > */ > #define FILE_DEDUPE_RANGE_SAME 0 > #define FILE_DEDUPE_RANGE_DIFFERS1 > @@ -184,6 +196,7 @@ struct fsxattr { > #define BLKSECDISCARD _IO(0x12,125) > #define BLKROTATIONAL _IO(0x12,126) > #define BLKZEROOUT _IO(0x12,127) > +#define BLKCOPY _IOWR(0x12, 128, struct copy_range) > /* > * A jump here: 130-131 are reserved for zoned block devices > * (see uapi/linux/blkzoned.h) > -- Damien Le Moal Western Digital Research

Re: [RFC PATCH 02/34] block: introduce and use bio_new

2021-01-27 Thread Damien Le Moal
So this should be checked. > + > + bio_set_dev(bio, bdev); > + bio->bi_iter.bi_sector = sector; > + bio_set_op_attrs(bio, op, op_flags); This function is obsolete. Open code this. > + > + return bio; > +} > > #endif /* __LINUX_BIO_H */ > -- Damien Le Moal Western Digital Research

Re: [RFC PATCH 28/34] zonefs: use bio_new

2021-01-27 Thread Damien Le Moal
bio->bi_opf = REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE; > if (iocb->ki_flags & IOCB_DSYNC) > bio->bi_opf |= REQ_FUA; > > -- Damien Le Moal Western Digital Research

Re: [RFC PATCH 02/34] block: introduce and use bio_new

2021-01-27 Thread Damien Le Moal
On 2021/01/28 16:21, Damien Le Moal wrote: > On 2021/01/28 16:12, Chaitanya Kulkarni wrote: >> Introduce bio_new() helper and use it in blk-lib.c to allocate and >> initialize various non-optional or semi-optional members of the bio >> along with bio allocation done with bio_

Re: [RFC PATCH v4 2/3] block: add simple copy support

2021-01-05 Thread Damien Le Moal
On 2021/01/05 21:24, Selva Jove wrote: > Thanks for the review, Damien. > > On Mon, Jan 4, 2021 at 6:17 PM Damien Le Moal wrote: >> >> On 2021/01/04 19:48, SelvaKumar S wrote: >>> Add new BLKCOPY ioctl that offloads copying of one or more sources >>>

Re: [PATCH] drivers: block: skd: remove skd_pci_info()

2020-12-13 Thread Damien Le Moal
i_info(skdev, pci_str); > - dev_info(&pdev->dev, "%s 64bit\n", pci_str); Replace these 2 lines with: pcie_print_link_status(pdev); And the link speed information will be printed. -- Damien Le Moal Western Digital Research

Re: [PATCH v1] drivers: block: skd: remove skd_pci_info()

2020-12-14 Thread Damien Le Moal
tr); > - dev_info(&pdev->dev, "%s 64bit\n", pci_str); > + pcie_print_link_status(pdev); > > pci_set_master(pdev); > rc = pci_enable_pcie_error_reporting(pdev); > Note: V1 of this patch was the one I commented on. This one should thus be V2. In any case, this looks OK to me. Acked-by: Damien Le Moal -- Damien Le Moal Western Digital Research

Re: [v1 PATCH 4/4] RISC-V: Support nr_cpus command line option.

2019-03-19 Thread Damien Le Moal
) { > + if (cpuid_to_hartid_map(cpuid) != INVALID_HARTID) > + set_cpu_possible(cpuid, true); > + } > } > > int __cpu_up(unsigned int cpu, struct task_struct *tidle) > -- Damien Le Moal Western Digital Research

Re: [v1 PATCH 4/4] RISC-V: Support nr_cpus command line option.

2019-03-19 Thread Damien Le Moal
On 2019/03/20 8:56, Damien Le Moal wrote: > On 2019/03/20 7:20, Atish Patra wrote: >> If nr_cpus command line option is set, maximum possible cpu should be >> set to that value. >> >> Signed-off-by: Atish Patra >> --- >> arch/riscv/kernel/smpboot.

Re: [PATCH] ext4: Fix deadlock on page reclaim

2019-07-26 Thread Damien Le Moal
hammer not needed) or lower than the top level (big hammer needed) ? One simple hack would be an fcntl() or mount option to tell the FS to use GFP_NOFS unconditionally, but avoiding the bug would mean making sure that the applications or system setup is correct. So not so safe. -- Damien Le Moal Western Digital Research
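For background on the "use GFP_NOFS unconditionally" idea, a sketch of the scoped-NOFS API the kernel already provides; this is context only and is not claimed to be what the posted ext4 patch does:

/*
 * Background sketch: the kernel's scoped mechanism for forcing GFP_NOFS
 * semantics over a region of code (include/linux/sched/mm.h). Shown as
 * context for the discussion above, not as the patch's implementation.
 */
#include <linux/sched/mm.h>
#include <linux/slab.h>

static void *example_alloc_without_fs_reclaim(size_t size)
{
	unsigned int nofs_flags = memalloc_nofs_save();
	void *p;

	/*
	 * Within this scope, allocations behave as if GFP_NOFS was passed,
	 * so direct reclaim will not recurse into filesystem writeback.
	 */
	p = kmalloc(size, GFP_KERNEL);

	memalloc_nofs_restore(nofs_flags);
	return p;
}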

Re: [PATCH] ext4: Fix deadlock on page reclaim

2019-07-28 Thread Damien Le Moal
On 2019/07/29 8:42, Dave Chinner wrote: > On Sat, Jul 27, 2019 at 02:59:59AM +0000, Damien Le Moal wrote: >> On 2019/07/27 7:55, Theodore Y. Ts'o wrote: >>> On Sat, Jul 27, 2019 at 08:44:23AM +1000, Dave Chinner wrote: >>>>> >>>>> This looks

[PATCH] ext4: Fix deadlock on page reclaim

2019-07-25 Thread Damien Le Moal
ps(). Reported-by: Masato Suzuki Signed-off-by: Damien Le Moal --- fs/ext4/inode.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 420fe3deed39..f882929037df 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@

Re: [PATCH] ext4: Fix deadlock on page reclaim

2019-07-25 Thread Damien Le Moal
On 2019/07/25 20:54, Christoph Hellwig wrote: > On Thu, Jul 25, 2019 at 06:33:58PM +0900, Damien Le Moal wrote: >> +gfp_t gfp_mask; >> + >> switch (ext4_inode_journal_mode(inode)) { >> case EXT4_INODE_ORDERED_DATA_MODE: >> case EXT4_INODE_W

Re: [PATCH] ext4: Fix deadlock on page reclaim

2019-07-29 Thread Damien Le Moal
Andreas, On 2019/07/30 3:40, Andreas Dilger wrote: > On Jul 26, 2019, at 8:59 PM, Damien Le Moal wrote: >> >> On 2019/07/27 7:55, Theodore Y. Ts'o wrote: >>> On Sat, Jul 27, 2019 at 08:44:23AM +1000, Dave Chinner wrote: >>>>> >>>>> This

Re: [PATCH] ext4: Fix deadlock on page reclaim

2019-07-30 Thread Damien Le Moal
Dave, On 2019/07/31 8:48, Dave Chinner wrote: > On Tue, Jul 30, 2019 at 02:06:33AM +0000, Damien Le Moal wrote: >> If we had a pread_nofs()/pwrite_nofs(), that would work. Or we could define a >> RWF_NORECLAIM flag for pwritev2()/preadv2(). This last one could actually be >

Re: lift the xfs writepage code into iomap v3

2019-07-30 Thread Damien Le Moal
fs on this branch and all tests passed, no compilation problems either. I will send a v2 of zonefs patch with all of Dave's comments addressed shortly. Thank you. Best regards. -- Damien Le Moal Western Digital Research

Re: [PATCH 2/4] null_blk: add zone open, close, and finish support

2019-06-21 Thread Damien Le Moal
case BLK_ZONE_COND_EXP_OPEN: if (zone->wp == zone->start) { zone->cond = BLK_ZONE_COND_EMPTY; break; } /* fallthrough */ default: zon

Re: [PATCH v2] nvme-pci: Support shared tags across queues for Apple 2018 controllers

2019-07-18 Thread Damien Le Moal
\n", mqes); > + result = -ENODEV; > + goto disable; > + } > + dev->q_depth = min_t(int, dev->q_depth, > + mqes - NVME_AQ_DEPTH + 1); > + } > + > /* >* Temporary fix for the Apple controller found in the MacBook8,1 and >* some MacBook7,1 to avoid controller resets and data loss. > @@ -3057,7 +3092,8 @@ static const struct pci_device_id nvme_id_table[] = { > { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) }, > { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005), > .driver_data = NVME_QUIRK_SINGLE_VECTOR | > - NVME_QUIRK_128_BYTES_SQES }, > + NVME_QUIRK_128_BYTES_SQES | > + NVME_QUIRK_SHARED_TAGS }, > { 0, } > }; > MODULE_DEVICE_TABLE(pci, nvme_id_table); > > > > ___ > Linux-nvme mailing list > linux-n...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-nvme > -- Damien Le Moal Western Digital Research

Re: [PATCH v2] nvme-pci: Support shared tags across queues for Apple 2018 controllers

2019-07-18 Thread Damien Le Moal
On 2019/07/19 13:49, Benjamin Herrenschmidt wrote: > On Fri, 2019-07-19 at 04:43 +0000, Damien Le Moal wrote: >> On 2019/07/19 13:37, Benjamin Herrenschmidt wrote: >>> Another issue with the Apple T2 based 2018 controllers seem to be >>> that they blow up (and shut the

Re: [PATCH 2/8] scsi: take the DMA max mapping size into account

2019-07-22 Thread Damien Le Moal
3/0x23 > [5.82] ? pick_next_task_fair+0x976/0xa3d > [5.82] ? mutex_lock+0x88/0xc4 > [5.82] scsi_scan_channel+0x76/0x9e > [5.82] scsi_scan_host_selected+0x131/0x176 > [5.82] ? scsi_scan_host+0x241/0x241 > [5.82] do_scan_async+0x27/0x219 >
