Re: PROBLEM: fanotify_mark EFAULT on x86

2020-11-26 Thread Jan Kara
On Tue 24-11-20 11:28:14, Borislav Petkov wrote: > On Tue, Nov 24, 2020 at 11:20:33AM +0100, Jan Kara wrote: > > On Tue 24-11-20 09:45:07, Borislav Petkov wrote: > > > On Mon, Nov 23, 2020 at 11:46:51PM +0100, Paweł Jasiak wrote: > > > > On 23/11/20, Jan Kara wr

Re: [PATCH] trace: fix potenial dangerous pointer

2020-11-25 Thread Jan Kara
points. > > Acked-by: Tejun Heo > > Andrew, can you please route this one? I'll queue it to my tree and push it to Linus on Friday since I sometimes handle writeback stuff myself anyway... Honza -- Jan Kara SUSE Labs, CR

Re: kernel BUG at fs/ext4/inode.c:LINE!

2020-11-25 Thread Jan Kara
free page reallocate page for something else we can even dirty & start to writeback 'page' wake_up_page(page) and we have a "spurious" wake up on 'page'. Honza -- Jan Kara SUSE Labs, CR

Re: PROBLEM: fanotify_mark EFAULT on x86

2020-11-24 Thread Jan Kara
On Tue 24-11-20 09:45:07, Borislav Petkov wrote: > On Mon, Nov 23, 2020 at 11:46:51PM +0100, Paweł Jasiak wrote: > > On 23/11/20, Jan Kara wrote: > > > OK, with a help of Boris Petkov I think I have a fix that looks correct > > > (attach). Can you please try wheth

Re: PROBLEM: fanotify_mark EFAULT on x86

2020-11-23 Thread Jan Kara
const char __user *, pathname) > { > return do_fanotify_mark(fanotify_fd, flags, mask, dfd, pathname); > } > +#endif > > #ifdef CONFIG_COMPAT > COMPAT_SYSCALL_DEFINE6(fanotify_mark, > > > -- > > Paweł Jasiak -- Jan Kara SUSE Labs, CR >From f

Re: [PATCH 030/141] ext2: Fix fall-through warnings for Clang

2020-11-23 Thread Jan Kara
ext2_free_branches(inode, &nr, &nr+1, 3); > } > + break; > case EXT2_TIND_BLOCK: > ; > } > -- > 2.27.0 > -- Jan Kara SUSE Labs, CR

Re: [mm/gup] 47e29d32af: phoronix-test-suite.npb.FT.A.total_mop_s -45.0% regression

2020-11-18 Thread Jan Kara
pinning users might be indeed rare and only those would show regressions in THP pinning performance... Honza -- Jan Kara SUSE Labs, CR

Re: linux-next: build failure after merge of the ext3 tree

2020-11-13 Thread Jan Kara
| ^~~ > > > Caused by commit > > 32559cea1f55 ("fs/ext2: Use ext2_put_page") > > Presumably some missing includes :-( > > I have used the ext3 tree from next-20201112 for today. Yeah, sorry for that. Should be fixed now. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH V2] fs/ext2: Use ext2_put_page

2020-11-13 Thread Jan Kara
> inode_dec_link_count(old_dir); > } > return 0; > > > out_dir: > - if (dir_de) { > - kunmap(dir_page); > - put_page(dir_page); > - } > + if (dir_de) > + ext2_put_page(dir_page); > out_old: > - kunmap(old_page); > - put_page(old_page); > + ext2_put_page(old_page); > out: > return err; > } > -- > 2.28.0.rc0.12.gb6a658bd00c9 > -- Jan Kara SUSE Labs, CR

Re: [PATCH 1/2] xfs: show the dax option in mount options.

2020-11-12 Thread Jan Kara
for the kernel
> to make the information about DAX availability accessible somewhere.
> >
> > And all this comes about because DAX is a property of the block
> > device, not the filesystem. Hence the only time a DAX capable
> > filesystem on a block device that is DAX capable will not be DAX
> > capable is if the dax=never is set...
> See, it is not property of the block device. It is property of the mount
> point. The availability on the device is one requirement but the
> filesystem options affect availability to the user in the end.

No, it is not really a property of the mountpoint either. If anything it is a property of the inode. Two different inodes on the very same filesystem, one may support DAX the other will not (think for example of XFS real-time volumes, or simply inodes with / without S_DAX flag set). And we are back at what Dave tries to get across. As inconvenient as it is, statx(STATX_ATTR_DAX) is the only way to tell.

> > Of course, this is just encoding how existing filesystems behave -
> > it's not a requirement for future filesystems so they may use other
> > mechanisms for enabling/disabling DAX. Which leaves you with the
> > only reliable mechanism of creating filesystem and checking
> > statx(STATX_ATTR_DAX)
> Or the kernel could just tell the user. But right, information is power,
> and keeping the user in the dark is much more entertaining.

I think it would be more productive if you actually answered Ted's question: Exactly which application got broken by the change? I know for a fact that one large DB vendor was parsing mount options in /proc/mounts to determine whether their DB can use DAX or not (and this was already a "cleaned up" method because before this they were parsing VMA flags in /proc/<pid>/smaps which is even worse). But in this case they also seemed OK to switch to statx() once it is available...

Honza
--
Jan Kara
SUSE Labs, CR

Re: [PATCH] fs/ext2: Use ext2_put_page

2020-11-12 Thread Jan Kara
e, new_dir, 0); > - else { > - kunmap(dir_page); > - put_page(dir_page); > - } > + else > + ext2_put_page(dir_page); > inode_dec_link_count(old_dir); > } > return 0; > > > out_dir: > - if (dir_de) { > - kunmap(dir_page); > - put_page(dir_page); > - } > + if (dir_de) > + ext2_put_page(dir_page); > out_old: > - kunmap(old_page); > - put_page(old_page); > + ext2_put_page(old_page); > out: > return err; > } > -- > 2.28.0.rc0.12.gb6a658bd00c9 > -- Jan Kara SUSE Labs, CR

Re: BUG: sleeping function called from invalid context in ext4_superblock_csum_set

2020-11-11 Thread Jan Kara
On Wed 04-11-20 14:12:35, Jan Kara wrote: > On Tue 03-11-20 09:16:19, Costa Sapuntzakis wrote: > > Jan, does this fixup from Hillf look ok to you? You originally argued for > > lock_buffer/unlock_buffer. > > > > I think the problem here is that the ext4 code assumes

Re: [PATCH v4] inotify: Increase default inotify.max_user_watches limit to 1048576

2020-11-09 Thread Jan Kara
76]. > > +*/ > > + watches_max = (((si.totalram - si.totalhigh) / 100) << PAGE_SHIFT) / > > + INOTIFY_WATCH_COST; > > + watches_max = clamp(watches_max, 8192UL, 1048576UL); > > + > > BUILD_BUG_ON(IN_ACCESS != FS_ACCESS); > > BUILD_BUG_ON(IN_MODIFY != FS_MODIFY); > > BUILD_BUG_ON(IN_ATTRIB != FS_ATTRIB); > > @@ -827,7 +848,7 @@ static int __init inotify_user_setup(void) > > > > inotify_max_queued_events = 16384; > > init_user_ns.ucount_max[UCOUNT_INOTIFY_INSTANCES] = 128; > > - init_user_ns.ucount_max[UCOUNT_INOTIFY_WATCHES] = 8192; > > + init_user_ns.ucount_max[UCOUNT_INOTIFY_WATCHES] = watches_max; > > > > return 0; > > } > > -- > > 2.18.1 > > -- Jan Kara SUSE Labs, CR

Re: [PATCH] docs: filesystems: Reduce ext2.rst to one top-level heading

2020-11-09 Thread Jan Kara
bbb162e20..c2fce22cfd035 100644 > --- a/Documentation/filesystems/ext2.rst > +++ b/Documentation/filesystems/ext2.rst > @@ -1,6 +1,7 @@ > .. SPDX-License-Identifier: GPL-2.0 > > > +== > The Second Extended Filesystem > == > > -- > 2.28.0 > -- Jan Kara SUSE Labs, CR

Re: [PATCH 1/5] mm: truncate_complete_page is not existed anymore

2020-11-06 Thread Jan Kara
> Signed-off-by: Yang Shi Thanks! Looks good to me. You can add: Reviewed-by: Jan Kara Honza > --- > mm/migrate.c | 2 +- > mm/vmscan.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > di

Re: kernel BUG at mm/page-writeback.c:2241 [ BUG_ON(PageWriteback(page); ]

2020-11-04 Thread Jan Kara
o far I have only been able to reproduce on this Intel platform: > > HPE DL560 gen10 > Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz > 131072 MB memory, 1000 GB disk space (smartpqi nvme) Did you try running with the debug patch Matthew sent? Any results?

Re: BUG: sleeping function called from invalid context in ext4_superblock_csum_set

2020-11-04 Thread Jan Kara
4_superblock_csum(sb, es); > > - unlock_buffer(EXT4_SB(sb)->s_sbh); > > + spin_unlock_irqrestore(>s_cs_lock, flags); > > } > > > > ext4_fsblk_t ext4_block_bitmap(struct super_block *sb, > > --- a/fs/ext4/mballoc.c > > +++ b/fs/ext4/mballoc.c > > @@ -2868,6 +2868,7 @@ int ext4_mb_init(struct super_block *sb) > > i++; > > } while (i <= sb->s_blocksize_bits + 1); > > > > + spin_lock_init(&sbi->s_cs_lock); > > spin_lock_init(&sbi->s_md_lock); > > spin_lock_init(&sbi->s_bal_lock); > > sbi->s_mb_free_pending = 0; > > --- a/fs/ext4/ext4.h > > +++ b/fs/ext4/ext4.h > > @@ -1439,6 +1439,7 @@ struct ext4_sb_info { > > loff_t s_bitmap_maxbytes; /* max bytes for bitmap files */ > > struct buffer_head * s_sbh; /* Buffer containing the super > > block */ > > struct ext4_super_block *s_es; /* Pointer to the super block in > > the buffer */ > > + spinlock_t s_cs_lock; /* SB checksum lock */ > > struct buffer_head * __rcu *s_group_desc; > > unsigned int s_mount_opt; > > unsigned int s_mount_opt2; > > -- Jan Kara SUSE Labs, CR

Re: PROBLEM: fanotify_mark EFAULT on x86

2020-11-04 Thread Jan Kara
ned int, flags, > __u64, mask, int, dfd, > const char __user *, pathname) > { > return do_fanotify_mark(fanotify_fd, flags, mask, dfd, pathname); > } > +#endif > > #ifdef CONFIG_COMPAT > COMPAT_SYSCALL_DEFINE6(fanotify_mark, > > > -- > > Paweł Jasiak -- Jan Kara SUSE Labs, CR

Re: possible lockdep regression introduced by 4d004099a668 ("lockdep: Fix lockdep recursion")

2020-11-03 Thread Jan Kara
eading the value > > of the old CPU, which is no longer 0. > > > > I already fixed a bunch of that in: > > > > baffd723e44d ("lockdep: Revert "lockdep: Use raw_cpu_*() for per-cpu > > variables"") > > > > but clearly this one got crossed. > > > > Still, that leaves me puzzled over you seeing this on x86 :/ > > Hi Peter, > > I still get the same issue with 5.10-rc2. > Is there any non-merged patch I should try, or anything I can help with? BTW, I've just hit the same deadlock issue with ext4 on generic/390 so I confirm this isn't btrfs specific issue (as we already knew from the analysis but still it's good to have that confirmed). Honza -- Jan Kara SUSE Labs, CR

Re: fix fs/quota/dquot.c oops error

2020-11-02 Thread Jan Kara
more sanity checking into quota code to verify quota file headers are not corrupted. Because these corrupted headers cause bogus return values from get_free_blk() and possibly other quota functions which then confuse __dquot_initialize(). Honza -- Jan Kara SUSE Labs, CR

Re: PROBLEM: fanotify_mark EFAULT on x86

2020-11-02 Thread Jan Kara
d? Brian, any idea whether your series could regress fanotify_mark(2) syscall? Do we have somewhere documented which syscalls need compat wrappers and how they should look like? Honza [1] https://lists.linux.it/pipermail/ltp/2020-June/017436.html [2] https://lore.kernel.org/lkml/20200313195144.164260-1-brge...@gmail.com/ -- Jan Kara SUSE Labs, CR

Re: [PATCH v2 2/2] mm: prevent gup_fast from racing with COW during fork

2020-11-02 Thread Jan Kara
On Fri 30-10-20 14:02:26, Jason Gunthorpe wrote: > On Fri, Oct 30, 2020 at 05:51:05PM +0100, Jan Kara wrote: > > > @@ -446,6 +447,12 @@ struct mm_struct { > > >*/ > > > atomic_t has_pinned; > > > > > > + /** > >

Re: [PATCH v2 2/2] mm: prevent gup_fast from racing with COW during fork

2020-10-30 Thread Jan Kara
_struct. > > Fixes: f3c64eda3e50 ("mm: avoid early COW write protect games during fork()") > Suggested-by: Linus Torvalds > Link: > https://lore.kernel.org/r/CAHk-=wi=icnycarbpgjkvju9eyyez13n64tzyldob8cp5q_...@mail.gmail.com > Reviewed-by: John Hubbard >

Re: [PATCH v2 1/2] mm: reorganize internal_get_user_pages_fast()

2020-10-30 Thread Jan Kara
ng its cast > > - The handling of ret and nr_pinned can be streamlined a bit > > No functional change. > > Signed-off-by: Jason Gunthorpe Looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH v3] inotify: Increase default inotify.max_user_watches limit to 1048576

2020-10-30 Thread Jan Kara
watches_max = (((si.totalram - si.totalhigh) / 100) << PAGE_SHIFT) / > + INOTIFY_WATCH_COST; ^^^ So for machines with > 1TB of memory watches_max would overflow. So you probably need to use ulong for that. > + watches_max = min(1048576U, max(watches_max, 8192U)); ^^^ use clamp() here? Honza -- Jan Kara SUSE Labs, CR
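To make the two review comments above concrete, here is a small user-space sketch of the same calculation done in a 64-bit type and bounded with a clamp helper. It is not the kernel code: SKETCH_PAGE_SHIFT, SKETCH_WATCH_COST (an assumed 1280 bytes per watch) and both function names are placeholders, whereas the kernel uses PAGE_SHIFT, a sizeof-based INOTIFY_WATCH_COST and its own clamp() macro.

```c
#include <assert.h>

/* Placeholders for illustration only (see the note above). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_WATCH_COST 1280ULL   /* assumed bytes consumed per watch */

/* Bound v to the inclusive range [lo, hi]. */
static unsigned long long sketch_clamp(unsigned long long v,
                                       unsigned long long lo,
                                       unsigned long long hi)
{
    assert(lo <= hi);
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/*
 * Allow watches to consume about 1% of low memory, expressed as a
 * watch count, and keep the result between 8192 and 1048576.
 * The wide type keeps the intermediate byte count from wrapping.
 */
unsigned long long sketch_default_max_watches(unsigned long long lowmem_pages)
{
    unsigned long long budget = (lowmem_pages / 100) << SKETCH_PAGE_SHIFT;

    return sketch_clamp(budget / SKETCH_WATCH_COST, 8192ULL, 1048576ULL);
}
```

With these assumed constants, a 512 MiB machine (131072 pages) ends up at the 8192 floor, an 8 GiB machine (2097152 pages) gets 67107 watches, and a 2 TiB machine (536870912 pages) hits the 1048576 ceiling.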

Re: [PATCH v2] inotify: Increase default inotify.max_user_watches limit to 1048576

2020-10-30 Thread Jan Kara
> > > > And this approximation can be pretty accurate at times. > > > > For example, on Ubuntu 18.04 kernel 5.4.0: > > > > inode_cache 608 > > > > nfs_inode_cache 1088 > > > > btrfs_inode 1168 > > > > xfs_inode 1024 > > > > ext4_inode_cache 1096 > > > Just to clarify, is your original 2 * sizeof(struct inode) figure > > > include the filesystem inode overhead or there is an additional inode > > > somewhere that I needs to go to 4 * sizeof(struct inode)? > > No additional inode. > > > > #define INOTIFY_WATCH_COST (sizeof(struct inotify_inode_mark) + \ > > 2 * sizeof(struct > > inode)) > > > > Not sure if the inotify_inode_mark part matters, but it doesn't hurt. > > Do note that Jan had a different proposal for fs inode size estimation (1K). > > I have no objection to this estimation if Jan insists. > > > > Thanks, > > Amir. > > > Thanks for the confirmation. 2*sizeof(struct inode) is more than 1k. Besides > with debugging turned on, the size will increase more. So that figure is > good enough. Yeah, the 2*sizeof(struct inode) is fine by me as well. Please don't forget to update the comment explaining INOTIFY_WATCH_COST. Thanks! Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] inotify: Increase default inotify.max_user_watches limit to 1048576

2020-10-27 Thread Jan Kara
@@ static int __init inotify_user_setup(void) > > inotify_max_queued_events = 16384; > init_user_ns.ucount_max[UCOUNT_INOTIFY_INSTANCES] = 128; > - init_user_ns.ucount_max[UCOUNT_INOTIFY_WATCHES] = 8192; > + init_user_ns.ucount_max[UCOUNT_INOTIFY_WATCHES] = watches_max; > > return 0; > } > -- > 2.18.1 > -- Jan Kara SUSE Labs, CR

Re: [PATCH 1/2] mm: reorganize internal_get_user_pages_fast()

2020-10-27 Thread Jan Kara
llers that care about partial success. See e.g. iov_iter_get_pages() usage in fs/direct_io.c:dio_refill_pages() or bio_iov_iter_get_pages(). These places handle partial success just fine and not allowing partial success from GUP could regress things... Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v3 04/12] mm/filemap: Add mapping_seek_hole_data

2020-10-26 Thread Jan Kara
On Mon 26-10-20 12:17:27, Matthew Wilcox wrote: > On Mon, Oct 26, 2020 at 11:48:06AM +0100, Jan Kara wrote: > > > +static inline loff_t page_seek_hole_data(struct page *page, > > > + loff_t start, loff_t end, bool seek_data) > > > +{ > > > + if

Re: possible lockdep regression introduced by 4d004099a668 ("lockdep: Fix lockdep recursion")

2020-10-26 Thread Jan Kara
hirez.programming.kicks-ass.net > > Make sure you have commit: > > f8e48a3dca06 ("lockdep: Fix preemption WARN for spurious IRQ-enable") > > (in Linus' tree by now) and do you have CONFIG_DEBUG_PREEMPT enabled? Hum, I am at 5.10-rc1 now and above mentioned commit doesn't appear to be there? Also googling for the title doesn't help... Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v3 11/12] mm/truncate,shmem: Handle truncates that split THPs

2020-10-26 Thread Jan Kara
h if a truncate or hole punch is entirely > within a single page. We can add some more complex logic to restore > the optimisation if it proves to be worthwhile. > > Signed-off-by: Matthew Wilcox (Oracle) > Reviewed-by: William Kucharski The patch looks good to me. You c

Re: [PATCH v3 04/12] mm/filemap: Add mapping_seek_hole_data

2020-10-26 Thread Jan Kara
that this loop forgets to release the page reference it has got when doing SEEK_HOLE. > + } > + rcu_read_unlock(); > + > + if (seek_data) > + return -ENXIO; > + goto out; > + > +unlock: > + rcu_read_unlock(); > + if (!xa_is_value(page)) > + put_page(page); > +out: > + if (start > end) > + return end; > + return start; > +} Honza -- Jan Kara SUSE Labs, CR

Re: kernel BUG at mm/page-writeback.c:2241 [ BUG_ON(PageWriteback(page); ]

2020-10-26 Thread Jan Kara
but so far I failed. It's good to know it isn't ext4 specific so we should be searching in the generic code ;). So far I was concentrating more on ext4 bits... Honza [1] https://lore.kernel.org/lkml/d3a33205add2f...@google.com/ -- Jan Kara SUSE Labs, CR

Re: [PATCH v2 12/12] mm/filemap: Return only head pages from find_get_entries

2020-10-26 Thread Jan Kara
On Sun 25-10-20 23:19:34, Matthew Wilcox wrote: > On Thu, Oct 01, 2020 at 09:17:28AM +0200, Jan Kara wrote: > > > I have a followup patch which isn't part of this series which fixes it: > > > > > > http://git.infradead.org/users/

Re: [PATCH v3 35/56] jbd2: fix kernel-doc markups

2020-10-23 Thread Jan Kara
fferent names between their > prototypes and the kernel-doc markup. > > Signed-off-by: Mauro Carvalho Chehab Thanks for the patch. It looks good. You can add: Reviewed-by: Jan Kara Honza >

Re: [PATCH] ext4: remove the null check of bio_vec page

2020-10-23 Thread Jan Kara
ll get to your patch in a week or two. Honza > > -Original Message----- > From: Jan Kara [mailto:j...@suse.cz] > Sent: Wednesday, October 21, 2020 6:25 PM > To: tianxianting (RD) > Cc: ty...@mit.edu; adilger.ker...

Re: [fsnotify] 9b93f33105: WARNING:missing_R10_value_at__fsnotify_parent/0x

2020-10-23 Thread Jan Kara
_rate to 3000 > [ 182.751847] perf: interrupt took too long (80901 > 79856), lowering > kernel.perf_event_max_sample_rate to 2000 > [ 188.527603] WARNING: missing R10 value at __fsnotify_parent+0x25/0x280 OK, that's an unwinder warning but we don't do anything special in __fsnotify_parent(). Let's CC x86 guys if they have idea what's going on. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set

2020-10-21 Thread Jan Kara
Hum, Al, did this patch get lost? Honza On Thu 24-09-20 16:58:56, Jan Kara wrote: > On Thu 24-09-20 13:59:58, Hao Li wrote: > > If DCACHE_REFERENCED is set, fast_dput() will return true, and then > > retain_dentry()

Re: [PATCH] ext4: remove the null check of bio_vec page

2020-10-21 Thread Jan Kara
patch. It looks good to me. You can add: Reviewed-by: Jan Kara Honza > --- > fs/ext4/page-io.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c > index defd2e10d.

[tip: perf/urgent] reiserfs: Initialize inode keys properly

2020-10-19 Thread tip-bot2 for Jan Kara
The following commit has been merged into the perf/urgent branch of tip: Commit-ID: d3bb68fa8d43bcd889ce86249f73a70e3ba221aa Gitweb: https://git.kernel.org/tip/d3bb68fa8d43bcd889ce86249f73a70e3ba221aa Author: Jan Kara AuthorDate: Mon, 21 Sep 2020 15:08:50 +02:00 Committer

[tip: perf/urgent] reiserfs: Fix oops during mount

2020-10-19 Thread tip-bot2 for Jan Kara
The following commit has been merged into the perf/urgent branch of tip: Commit-ID: 061fe185e17a1519a75eee89462f35a5360ece8b Gitweb: https://git.kernel.org/tip/061fe185e17a1519a75eee89462f35a5360ece8b Author: Jan Kara AuthorDate: Wed, 30 Sep 2020 17:08:20 +02:00 Committer

Re: [PATCH] ext2: Remove unnecessary blank

2020-10-19 Thread Jan Kara
count, > + sbi->s_group_desc = kmalloc_array(db_count, > sizeof(struct buffer_head *), > GFP_KERNEL); > if (sbi->s_group_desc == NULL) { > -- > 2.17.1 > -- Jan Kara SUSE Labs, CR

Re: [PATCH] fs/quota: update quota state flags scheme with project quota flags

2020-10-19 Thread Jan Kara
ST_DIRTY 0x0800 > */ > enum { > _DQUOT_USAGE_ENABLED = 0, /* Track disk usage for users */ > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR

Re: [mm/writeback] 8d92890bd6: will-it-scale.per_process_ops -15.3% regression

2020-10-15 Thread Jan Kara
On Thu 15-10-20 11:08:43, Jan Kara wrote: > On Thu 15-10-20 08:46:01, NeilBrown wrote: > > On Wed, Oct 14 2020, Jan Kara wrote: > > > > > On Wed 14-10-20 16:47:06, kernel test robot wrote: > > >> Greeting, > > >> > > >> FYI, we noti

Re: [mm/writeback] 8d92890bd6: will-it-scale.per_process_ops -15.3% regression

2020-10-15 Thread Jan Kara
On Thu 15-10-20 08:46:01, NeilBrown wrote: > On Wed, Oct 14 2020, Jan Kara wrote: > > > On Wed 14-10-20 16:47:06, kernel test robot wrote: > >> Greeting, > >> > >> FYI, we noticed a -15.3% regression of will-it-scale.per_process_op

Re: [mm/writeback] 8d92890bd6: will-it-scale.per_process_ops -15.3% regression

2020-10-14 Thread Jan Kara
o if there's any negative performance impact of these changes, they're likely due to code alignment changes or something like that... So I don't think there's much to do here since optimal code alignment is highly specific to a particular CPU etc. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] ext4/xfs: add page refcount helper

2020-10-08 Thread Jan Kara
ell > Reviewed-by: Christoph Hellwig > Acked-by: Darrick J. Wong > Acked-by: Theodore Ts'o # for fs/ext4/inode.c The patch looks good to me. Feel free to add: Reviewed-by: Jan Kara Honza > --- > > Changes in v

Re: [PATCH] ext4/xfs: add page refcount helper

2020-10-07 Thread Jan Kara
t_var_event(&(_page)->_refcount, \ > + dax_layout_is_idle_page(_page), \ > + TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode)) > + > #ifdef CONFIG_DEV_DAX_HMEM_DEVICES > void hmem_register_device(int target_nid, struct resource *r); > #else > -- > 2.20.1 > -- Jan Kara SUSE Labs, CR

Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-05 Thread Jan Kara
the loop (to out: label) anyway due to the loop termination condition and why not return the frames we already have? Furthermore find_vma_intersection() can return NULL which would oops in your check then. What am I missing? Honza > out: > if (locked) > -- > 2.28.0 > -- Jan Kara SUSE Labs, CR

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-05 Thread Jan Kara
correctly to fail when > upgrading to v5.9-rc2 or later. > > Fix this by defaulting block_validity to off when > EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS is set. > > Signed-off-by: Josh Triplett > Fixes: e7bfb5c9bb3

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-05 Thread Jan Kara
On Mon 05-10-20 03:16:41, Josh Triplett wrote: > On Mon, Oct 05, 2020 at 11:46:01AM +0200, Jan Kara wrote: > > On Mon 05-10-20 01:14:54, Josh Triplett wrote: > > > Ran into an ext4 regression when testing upgrades to 5.9-rc kernels: > > > > > > Commit e7bfb5c

Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-05 Thread Jan Kara
sed to work. Anyway, if you can make this go away, sure go ahead :) Honza -- Jan Kara SUSE Labs, CR

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-05 Thread Jan Kara
eature is up to you but I don't think that belongs to the upstream kernel since that is correct as is... Honza -- Jan Kara SUSE Labs, CR

Re: [Linux-kernel-mentees] [PATCH] fs: reiserfs: xattr: Fix null pointer derefernce in open_xa_root()

2020-10-01 Thread Jan Kara
m this check directly in reiserfs_xattr_get(). > + } There's no need for additional braces in this 'if'. > > inode_lock_nested(d_inode(privroot), I_MUTEX_XATTR); Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2 12/12] mm/filemap: Return only head pages from find_get_entries

2020-10-01 Thread Jan Kara
On Wed 30-09-20 18:23:21, Matthew Wilcox wrote: > On Wed, Sep 30, 2020 at 07:08:07PM +0200, Jan Kara wrote: > > On Wed 30-09-20 13:36:37, Matthew Wilcox wrote: > > > On Wed, Sep 30, 2020 at 02:15:12PM +0200, Jan Kara wrote: > > > > On Mon 14-09-20 14:00:42,

Re: [PATCH v2 12/12] mm/filemap: Return only head pages from find_get_entries

2020-09-30 Thread Jan Kara
On Wed 30-09-20 13:36:37, Matthew Wilcox wrote: > On Wed, Sep 30, 2020 at 02:15:12PM +0200, Jan Kara wrote: > > On Mon 14-09-20 14:00:42, Matthew Wilcox (Oracle) wrote: > > > All callers now expect head (and base) pages, and can handle multiple > > > head pages

Re: [PATCH v2 12/12] mm/filemap: Return only head pages from find_get_entries

2020-09-30 Thread Jan Kara
tead of open-coding how pvecs behave. This has the side-effect of > being able to append to a pagevec with existing contents, although we > don't make use of that functionality anywhere yet. > > Signed-off-by: Matthew Wilcox (Oracle) Looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH v2 11/12] mm/truncate,shmem: Handle truncates that split THPs

2020-09-30 Thread Jan Kara
invalidatepage(page, 0, > - partial_end); > - unlock_page(page); > - put_page(page); > - } > + > + if (index != -1) > + page = find_lock_head(mapping, index); Similarly to shmem the use of index is a bit confusing here but it at least gets used in this case so OK. But I'd still find something like: if (!tail_page_already_truncated) page = find_lock_head(mapping, lend >> PAGE_SHIFT); easier to grasp. > + if (page) { > + if (!truncate_inode_partial_page(page, lstart, lend)) > + end = page->index; > + unlock_page(page); > + put_page(page); > } Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2 05/12] mm: Add and use find_lock_entries

2020-09-30 Thread Jan Kara
On Tue 29-09-20 13:48:06, Matthew Wilcox wrote: > On Tue, Sep 29, 2020 at 10:58:55AM +0200, Jan Kara wrote: > > On Mon 14-09-20 14:00:35, Matthew Wilcox (Oracle) wrote: > > > We have three functions (shmem_undo_range(), truncate_inode_pages_range() > > > and invalidate_

Re: BUG: unable to handle kernel paging request in dqput

2020-09-29 Thread Jan Kara
de path. > It must be a different reason. Yeah, it seems the bisection got confused because it hit a different error during the bisection. Looking at the original oops, I think the actual reason of a crash is that quota file got corrupted in a particular way. Quota code is not very paranoid

Re: [PATCH v2 01/12] mm: Make pagecache tagged lookups return only head pages

2020-09-29 Thread Jan Kara
pages today are in-memory, so there are no tagged huge pages today. > > Signed-off-by: Matthew Wilcox (Oracle) Looks good to me. You can add: Reviewed-by: Jan Kara Honza > --- > mm/filemap.c | 10 +- > 1 file

Re: [PATCH v2 10/12] mm: Remove pagevec_lookup_entries

2020-09-29 Thread Jan Kara
On Mon 14-09-20 14:00:40, Matthew Wilcox (Oracle) wrote: > pagevec_lookup_entries() is now just a wrapper around find_get_entries() > so remove it and convert all its callers. > > Signed-off-by: Matthew Wilcox (Oracle) Looks good. You can add: Reviewed-

Re: [PATCH v2 09/12] mm: Pass pvec directly to find_get_entries

2020-09-29 Thread Jan Kara
On Mon 14-09-20 14:00:39, Matthew Wilcox (Oracle) wrote: > All callers of find_get_entries() use a pvec, so pass it directly > instead of manipulating it in the caller. > > Signed-off-by: Matthew Wilcox (Oracle) Looks good. You can add: Reviewed-

Re: [PATCH v2 08/12] mm: Remove nr_entries parameter from pagevec_lookup_entries

2020-09-29 Thread Jan Kara
On Mon 14-09-20 14:00:38, Matthew Wilcox (Oracle) wrote: > All callers want to fetch the full size of the pvec. > > Signed-off-by: Matthew Wilcox (Oracle) Looks good. You can add: Reviewed-by: Jan Kara Honza > --

Re: [PATCH v2 07/12] mm: Add an 'end' parameter to pagevec_lookup_entries

2020-09-29 Thread Jan Kara
by: Matthew Wilcox (Oracle) Looks good to me. You can add: Reviewed-by: Jan Kara Honza > --- > include/linux/pagevec.h | 5 ++--- > mm/swap.c | 8 >

Re: [PATCH v2 05/12] mm: Add and use find_lock_entries

2020-09-29 Thread Jan Kara
do_range() which will try again so what you did might make a difference with performance but not much else. But still it would be good to at least comment about this in the changelog... Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2 04/12] mm/filemap: Add mapping_seek_hole_data

2020-09-29 Thread Jan Kara
rcu_read_lock(); > + while ((page = xas_find_get_entry(&xas, max, XA_PRESENT))) { > + loff_t pos = xas.xa_index * PAGE_SIZE; OK, but for ordinary filesystems this could be problematic because of exceptional entries? Also for shmem you've dropped the PageUptodate check which

Re: [PATCH v2 02/12] mm/shmem: Use pagevec_lookup in shmem_unlock_mapping

2020-09-29 Thread Jan Kara
is a simpler function to use than find_get_pages(), so use it instead. > > Signed-off-by: Matthew Wilcox (Oracle) Looks good to me. BTW, I think I've already reviewed this... You can add: Reviewed-by: Jan Kara Honza > -

Re: [PATCH v2 03/12] mm/filemap: Add helper for finding pages

2020-09-29 Thread Jan Kara
e) Looks good. You can add: Reviewed-by: Jan Kara Honza > --- > mm/filemap.c | 98 +++- > 1 file changed, 43 insertions(+), 55 deletions(-) > > diff --git a/mm/fi

Re: KMSAN: uninit-value in udf_get_pblock_spar15

2020-09-25 Thread Jan Kara
se this could result from UDF image where sparing table is larger than a block. I've added check of the sparing table size to the mount path. Honza -- Jan Kara SUSE Labs, CR
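The kind of mount-time size check described above can be sketched as follows. This is a guess at the general shape only, not the actual fix: the struct, the function name and the idea of passing the header size as a parameter are all hypothetical, and the real UDF on-disk structures carry more fields than shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one sparing table entry (two 32-bit fields). */
struct sk_sparing_entry {
    uint32_t orig_location;
    uint32_t mapped_location;
};

/*
 * Hypothetical check: the table header plus all entries it claims to
 * contain must fit inside the single block that was read from disk.
 * Anything larger would make later lookups read uninitialized memory.
 */
bool sk_sparing_table_fits(size_t header_size, uint16_t entry_count,
                           size_t block_size)
{
    size_t need = header_size +
                  (size_t)entry_count * sizeof(struct sk_sparing_entry);

    assert(header_size <= block_size);
    return need <= block_size;
}
```

A mount path would refuse the image (or ignore the table) when such a check fails instead of trusting the on-disk entry count.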

Re: KMSAN: uninit-value in udf_evict_inode

2020-09-25 Thread Jan Kara
G: KMSAN: uninit-value in udf_evict_inode+0x382/0x7d0 fs/udf/inode.c:150 Yeah, easy enough. I'll send a fix. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] quota: clear padding in v2r1_mem2diskdqb()

2020-09-25 Thread Jan Kara
/open.c:1240 > __ia32_compat_sys_openat+0x56/0x70 fs/open.c:1240 > do_syscall_32_irqs_on arch/x86/entry/common.c:80 [inline] > __do_fast_syscall_32+0x129/0x180 arch/x86/entry/common.c:139 > do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:162 > do_SYSENTER_32+0x73/0x90 arch/x86/en

Re: [PATCH v2] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set

2020-09-24 Thread Jan Kara
will be > killed and the inode will be evicted. In this way, if we change per-file > DAX policy, it will take effects automatically after this file is closed > by all processes. > > I also add some comments to make the code more clear. > > Signed

Re: [PATCH 07/13] block: lift setting the readahead size into the block layer

2020-09-24 Thread Jan Kara
> Acked-by: Coly Li > Reviewed-by: Johannes Thumshirn The patch looks good to me now. You can add: Reviewed-by: Jan Kara Honza > --- > block/blk-settings.c | 18 -- > block/blk-sysfs.c

Re: [PATCH 1/5] mm: Introduce mm_struct.has_pinned

2020-09-24 Thread Jan Kara
On Thu 24-09-20 11:02:37, Jason Gunthorpe wrote: > On Thu, Sep 24, 2020 at 09:44:09AM +0200, Jan Kara wrote: > > > After the page is pinned it is prevented from being freed and > > > recycled. After GUP has the pin it must check that the PTE still > > > points at the

Re: [PATCH] ext4: fix leaking sysfs kobject after failed mount

2020-09-24 Thread Jan Kara
.kernel.org > Signed-off-by: Eric Biggers Looks good. You can add: Reviewed-by: Jan Kara Honza > --- > fs/ext4/super.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c

Re: [PATCH 12/13] udf: Tell the VFS that readpage was synchronous

2020-09-24 Thread Jan Kara
On Thu 17-09-20 16:10:49, Matthew Wilcox (Oracle) wrote: > The udf inline data readpage implementation was already synchronous, > so use AOP_UPDATED_PAGE to avoid cycling the page lock. > > Signed-off-by: Matthew Wilcox (Oracle) Looks good. You can add: Reviewed-

Re: [PATCH 1/5] mm: Introduce mm_struct.has_pinned

2020-09-24 Thread Jan Kara
On Wed 23-09-20 14:12:07, Jason Gunthorpe wrote: > On Wed, Sep 23, 2020 at 04:20:03PM +0200, Jan Kara wrote: > > > I'd hate to take spinlock in the GUP-fast path. Also I don't think this is > > quite correct because GUP-fast-only can be called from interrupt context > &

Re: [PATCH 01/14] block: move the NEED_PART_SCAN flag to struct gendisk

2020-09-24 Thread Jan Kara
On Thu 17-09-20 18:57:07, Christoph Hellwig wrote: > We can only scan for partitions on the whole disk, so move the flag > from struct block_device to struct gendisk. > > Signed-off-by: Christoph Hellwig Makes sense. You can add: Reviewed-

Re: [PATCH 1/5] mm: Introduce mm_struct.has_pinned

2020-09-23 Thread Jan Kara
r, unsigned long end, > struct dev_pagemap *pgmap = NULL; > int nr_start = *nr, ret = 0; > pte_t *ptep, *ptem; > + spinlock_t *ptl = NULL; > + > + /* > +* More strict with FOLL_PIN, otherwise it could race with fork(). > The > +* page table lock guarantees that fork() will capture all the pinned > +* pages when dup_mm() and do proper page copy on them. > +*/ > + if (flags & FOLL_PIN) { > + ptl = pte_lockptr(mm, pmd); > + if (!spin_trylock(ptl)) > + return 0; > + } I'd hate to take spinlock in the GUP-fast path. Also I don't think this is quite correct because GUP-fast-only can be called from interrupt context and page table locks are not interrupt safe. That being said I don't see what's wrong with the solution Jason proposed of first setting writeprotect and then checking page_may_be_dma_pinned() during fork(). That should work just fine AFAICT... BTW note that GUP-fast code is (and this is deliberate because e.g. DAX depends on this) first updating page->_refcount and then rechecking PTE didn't change and the page->_refcount update is actually done using atomic_add_unless() so that it cannot be reordered wrt the PTE check. So the fork() code only needs to add barriers to pair with this. Honza -- Jan Kara SUSE Labs, CR
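The ordering described in the last paragraph above (take the reference first, then recheck that the entry still points at the same page) can be illustrated in user space with C11 atomics. This is only a sketch of the pattern, not the GUP-fast code: struct sk_page, struct sk_slot and both functions are invented for the example, and the default sequentially consistent atomics stand in for the kernel's atomic_add_unless() and its barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct page and a page table entry. */
struct sk_page { atomic_int refcount; };
struct sk_slot { _Atomic(struct sk_page *) entry; };

/* Take a reference unless the count is already zero (page being freed). */
static int sk_get_unless_zero(struct sk_page *pg)
{
    int old = atomic_load(&pg->refcount);

    while (old != 0) {
        /* On failure 'old' is reloaded, so the zero check is repeated. */
        if (atomic_compare_exchange_weak(&pg->refcount, &old, old + 1))
            return 1;
    }
    return 0;
}

/*
 * Lockless lookup: read the entry, grab a reference, then recheck the
 * entry. If it changed in between, drop the reference and report failure
 * so the caller can fall back to a slower, locked path.
 */
struct sk_page *sk_get_fast(struct sk_slot *slot)
{
    struct sk_page *pg;

    assert(slot != NULL);
    pg = atomic_load(&slot->entry);
    if (pg == NULL || !sk_get_unless_zero(pg))
        return NULL;
    if (atomic_load(&slot->entry) != pg) {
        atomic_fetch_sub(&pg->refcount, 1);
        return NULL;
    }
    return pg;
}
```

The side that modifies the entry (fork() in the discussion) then has to order its own entry update and refcount check so that it pairs with this sequence; that half is not shown here.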

Re: [PATCH 5/5] mm/thp: Split huge pmds/puds if they're pinned when fork()

2020-09-23 Thread Jan Kara
On Wed 23-09-20 09:50:04, Peter Xu wrote: > On Wed, Sep 23, 2020 at 11:22:05AM +0200, Jan Kara wrote: > > On Tue 22-09-20 13:01:13, John Hubbard wrote: > > > On 9/22/20 3:33 AM, Jan Kara wrote: > > > > On Mon 21-09-20 23:41:16, John Hubbard wrote: > > >

Re: [PATCH] FIX the comment of struct jbd2_journal_handle

2020-09-23 Thread Jan Kara
On Wed 23-09-20 01:12:31, Hui Su wrote: > the struct name was modified long ago, but the comment still > use struct handle_s. > > Signed-off-by: Hui Su Thanks for the patch. It looks good to me. You can add: Reviewed-

Re: [Linux-kernel-mentees] [PATCH] udf: Fix memory leak in udf_process_sequence()

2020-09-23 Thread Jan Kara
sb, data.part_descs_loc[i].rec.block); > if (ret < 0) > - return ret; > + goto out; > } > > - return 0; > + ret = 0; > +out: > + kfree(data.part_descs_loc); > + return ret; > } > > /* > -- > 2.25.1 > -- Jan Kara SUSE Labs, CR
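
The shape of the fix quoted above (replace early returns with a jump to one exit label that frees the buffer) can be sketched in isolation. This is a hypothetical stand-alone example, not the udf code: `model_process_one`, `model_process_all`, and the `frees` counter are invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for handling one descriptor: negative input fails. */
static int model_process_one(int value)
{
    return value < 0 ? -1 : 0;
}

/*
 * Process every value using a temporary buffer. All paths after the
 * allocation leave through "out", so the buffer is freed exactly once.
 * *frees is incremented on each free so callers can observe the cleanup.
 */
static int model_process_all(const int *values, size_t count, int *frees)
{
    int ret;
    int *scratch;

    assert(values && frees);

    scratch = malloc((count ? count : 1) * sizeof(*scratch));
    if (!scratch)
        return -1;

    for (size_t i = 0; i < count; i++) {
        ret = model_process_one(values[i]);
        if (ret < 0)
            goto out;               /* was: return ret; (leaked scratch) */
        scratch[i] = values[i];
    }

    ret = 0;
out:
    free(scratch);
    (*frees)++;
    return ret;
}
```

The single exit label keeps the success and error paths symmetric, which is why the pattern is common in kernel code that allocates early and can fail in a loop.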

Re: NVFS XFS metadata (was: [PATCH] pmem: export the symbols __copy_user_flushcache and __copy_from_user_flushcache)

2020-09-23 Thread Jan Kara
rkloads but you have to have some way to recover from crashes so it's mostly used for scratch filesystems (e.g. in build systems, Google uses this feature a lot for some of their infrastructure as well). Honza -- Jan Kara SUSE Labs, CR

Re: A bug in ext4 with big directories (was: NVFS XFS metadata)

2020-09-23 Thread Jan Kara
ees with large_dir feature (mkfs.ext4 -O large_dir). Does that help? Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH 5/5] mm/thp: Split huge pmds/puds if they're pinned when fork()

2020-09-23 Thread Jan Kara
On Tue 22-09-20 13:01:13, John Hubbard wrote: > On 9/22/20 3:33 AM, Jan Kara wrote: > > On Mon 21-09-20 23:41:16, John Hubbard wrote: > > > On 9/21/20 2:20 PM, Peter Xu wrote: > > > ... > > > > + if (unlikely(READ_ONCE(src_mm->has_pinned) &&

Re: [PATCH 5/5] mm/thp: Split huge pmds/puds if they're pinned when fork()

2020-09-22 Thread Jan Kara
s. For file pages mm->has_pinned does not work because the page may be still pinned by completely unrelated process as Jann already properly pointed out earlier in the thread. So maybe anon_page_likely_pinned()? Possibly also assert PageAnon(page) in it if we want to be paranoid... Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] udf: Remove redundant initialization of variable ret

2020-09-22 Thread Jan Kara
t; struct timestamp *ts; > > outstr = kmalloc(128, GFP_NOFS); > -- > 2.17.1 > -- Jan Kara SUSE Labs, CR

Re: possible deadlock in blkdev_put

2020-09-22 Thread Jan Kara
00 00 00 00 0f 1f 44 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f > 83 fd 89 fb ff c3 66 2e 0f 1f 84 00 00 00 00 > RSP: 002b:7fff59216328 EFLAGS: 0246 ORIG_RAX: 00a6 > RAX: RBX: 00076035 RCX: 00460027 > RDX: 00403188 RSI: 0002 RDI: 7fff592163d0 > RBP: 0333 R08: R09: 000b > R10: 0005 R11: 0246 R12: 7fff59217460 > R13: 02df2a60 R14: R15: 7fff59217460 > > > --- > This report is generated by a bot. It may contain errors. > See https://goo.gl/tpsmEJ for more information about syzbot. > syzbot engineers can be reached at syzkal...@googlegroups.com. > > syzbot will keep track of this issue. See: > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. -- Jan Kara SUSE Labs, CR

Re: [PATCH 07/13] block: lift setting the readahead size into the block layer

2020-09-22 Thread Jan Kara
> device's (%lu -> %lu)\n", > - q->backing_dev_info->ra_pages, > - b->backing_dev_info->ra_pages); > - q->backing_dev_info->ra_pages = > - b->backing_dev_info->ra_pages; > - } > - } > fixup_discard_if_not_supported(q); > fixup_write_zeroes(device, q); > } -- Jan Kara SUSE Labs, CR

Re: [PATCH 05/13] bdi: initialize ->ra_pages and ->io_pages in bdi_init

2020-09-22 Thread Jan Kara
ood to me. You can add: Reviewed-by: Jan Kara I'd just prefer if the changelog explicitly mentioned that this patch results in enabling readahead for coda, ecryptfs, and orangefs... Just in case someone bisects some issue down to this patch :).

Re: [PATCH 04/13] aoe: set an optimal I/O size

2020-09-22 Thread Jan Kara
ig Looks good. You can add: Reviewed-by: Jan Kara Honza > --- > drivers/block/aoe/aoeblk.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/block/aoe/aoeblk.c b/driv

Re: [PATCH 03/13] bcache: inherit the optimal I/O size

2020-09-22 Thread Jan Kara
On Mon 21-09-20 10:07:24, Christoph Hellwig wrote: > Inherit the optimal I/O size setting just like the readahead window, > as any reason to do larger I/O does not apply to just readahead. > > Signed-off-by: Christoph Hellwig The patch looks good to me. You can add: Reviewed-

Re: More filesystem need this fix (xfs: use MMAPLOCK around filemap_map_pages())

2020-09-22 Thread Jan Kara
On Mon 21-09-20 18:59:43, Matthew Wilcox wrote: > On Mon, Sep 21, 2020 at 09:20:25AM -0700, Linus Torvalds wrote: > > On Mon, Sep 21, 2020 at 2:11 AM Jan Kara wrote: > > > > > > Except that on truncate, we have to unmap these > > > anonymous pages in private f

Re: [PATCH v2] dm: Call proper helper to determine dax support

2020-09-21 Thread Jan Kara
On Mon 21-09-20 11:23:07, Naresh Kamboju wrote: > On Fri, 18 Sep 2020 at 11:18, Dan Williams wrote: > > > > From: Jan Kara > > > > DM was calling generic_fsdax_supported() to determine whether a device > > referenced in the DM table supports DAX. However this i

Re: PROBLEM: 5.9.0-rc6 fails to compile due to 'redefinition of ‘dax_supported’'

2020-09-21 Thread Jan Kara
> all my local builds are breaking now too with this :( > > Was there a proposed patch anywhere for this? Attached patch should fix the build breakage. I'm sorry for that. Honza -- Jan Kara SUSE Labs, CR >From 8b8c7d6148b

Re: More filesystem need this fix (xfs: use MMAPLOCK around filemap_map_pages())

2020-09-21 Thread Jan Kara
ous page for that offset, copy the current contents of the corresponding file page to it, and from that moment on it behaves as an anonymous page. Except that on truncate, we have to unmap these anonymous pages in private file mappings as well... Honza -- Jan Kara SUSE Labs, CR
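
The copy-on-write behaviour of private file mappings described above is visible from user space. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the helper name `file_byte_after_private_write` and the temporary file template are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write 'A' to a temporary file, map it MAP_PRIVATE, store 'B' through the
 * mapping, then read the file again with pread(). Returns the byte found in
 * the file, or -1 on any setup failure.
 */
static int file_byte_after_private_write(void)
{
    char path[] = "/tmp/cow-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                   /* file lives on until fd is closed */

    if (write(fd, "A", 1) != 1) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    assert(map[0] == 'A');          /* read side still shows the file page */
    map[0] = 'B';                   /* write fault: a private copy is made */
    assert(map[0] == 'B');          /* the mapping sees its own copy */

    char in_file = 0;
    ssize_t n = pread(fd, &in_file, 1, 0);

    munmap(map, 1);
    close(fd);
    return n == 1 ? in_file : -1;
}
```

The store through the mapping lands in the process's private copy, so the file itself keeps its original byte and the function returns 'A'.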

Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

2020-09-21 Thread Jan Kara
f even ordinary threaded FOLL_PIN users would not have to be that careful about fork(2) and possible data loss due to COW - we certainly had reports of O_DIRECT IO losing data due to fork(2) and COW exactly because it is very subtle how it behaves... But as I wrote above this is not urgent since that problematic behavior has existed since the beginning of O_DIRECT IO in Linux. Honza -- Jan Kara SUSE Labs, CR

Re: [RFC PATCH] locking/percpu-rwsem: use this_cpu_{inc|dec}() for read_count

2020-09-18 Thread Jan Kara
ly to matter. The lock hold times there are long enough that it would be just lost in the noise. For other stuff using them like get_online_cpus() or get_online_mems() I'm not so sure... Honza -- Jan Kara SUSE Labs, CR

Re: the "read" syscall sees partial effects of the "write" syscall

2020-09-18 Thread Jan Kara
ge basis for buffered IO. Honza -- Jan Kara SUSE Labs, CR
