Re: Writeback bug causing writeback stalls

2020-05-25 Thread Jan Kara
76,14 @@ static int writeback_single_inode(struct >*/ > if (!(inode->i_state & I_DIRTY_ALL)) > inode_io_list_del_locked(inode, wb); > + else if (!(inode->i_state & I_DIRTY_TIME) && dt) { > + /* > + * We can correct inode's io

Re: Writeback bug causing writeback stalls

2020-05-25 Thread Jan Kara
use of performance. We don't want to block writes (or event reads in case of XFS) for the inode during writeback. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag

2020-05-25 Thread Jan Kara
On Sun 24-05-20 21:39:10, Ira Weiny wrote: > On Fri, May 22, 2020 at 01:48:48PM +0200, Jan Kara wrote: > > And then we should check conflicts with the journal flag as well, as I > > mentioned in reply to the first patch. There it is more complicated by the > > fact that we sh

Re: Writeback bug causing writeback stalls

2020-05-22 Thread Jan Kara
On Fri 22-05-20 17:23:30, Martijn Coenen wrote: > [ dropped android-storage-c...@google.com from CC: since that list > can't receive emails from outside google.com - sorry about that ] > > Hi Jan, > > On Fri, May 22, 2020 at 4:41 PM Jan Kara wrote: > > > The easie

Re: Writeback bug causing writeback stalls

2020-05-22 Thread Jan Kara
reasonably quickly. Probably I'd add an inode state flag telling that inode is queued for writeback by flush worker and we won't touch dirty lists in that case, otherwise we are safe to update current writeback list as needed. I'll work on fixing this as when I was reading the code I've noticed there are

Re: [PATCH V4 7/8] fs/ext4: Introduce DAX inode flag

2020-05-22 Thread Jan Kara
nd EXT4_JOURNAL_DATA_FL at the same time so the checks will be somewhat more complicated. Honza > + > /* Is it quota file? Do not allow user to mess with it */ > if (ext4_is_quota_file(inode)) > goto flags_out; -- Jan Kara SUSE Labs, CR

Re: [PATCH V4 4/8] fs/ext4: Update ext4_should_use_dax()

2020-05-22 Thread Jan Kara
ort prior to the over > riding mount option. > > While we are at it change the function to ext4_should_enable_dax() as > this better reflects the ask as well as matches xfs. > > Signed-off-by: Ira Weiny The patch looks good to

Re: [PATCH V4 1/8] fs/ext4: Narrow scope of DAX check in setflags

2020-05-22 Thread Jan Kara
On Thu 21-05-20 12:13:06, ira.we...@intel.com wrote: > From: Ira Weiny > > When preventing DAX and journaling on an inode. Use the effective DAX > check rather than the mount option. > > This will be required to support per inode DAX flags. > > Reviewed-by: Jan Ka

Re: [PATCH V3 4/8] fs/ext4: Update ext4_should_use_dax()

2020-05-21 Thread Jan Kara
On Wed 20-05-20 12:40:50, Ira Weiny wrote: > On Wed, May 20, 2020 at 03:37:28PM +0200, Jan Kara wrote: > > On Tue 19-05-20 22:57:49, ira.we...@intel.com wrote: > > > From: Ira Weiny > > > > > > S_DAX should only be enabled when the underlying block device sup

Re: [PATCH V3 7/8] fs/ext4: Introduce DAX inode flag

2020-05-20 Thread Jan Kara
heck and change this to IS_DAX(inode) && !(inode->i_flags & I_NEW) ? Because as I'm reading the code now, this should never trigger? > > + if (ext4_test_inode_flag(inode, EXT4_INODE_DAX)) > + return -EOPNOTSUPP; > + Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH V3 4/8] fs/ext4: Update ext4_should_use_dax()

2020-05-20 Thread Jan Kara
; While we are at it change the function to ext4_should_enable_dax() as > this better reflects the ask as well as matches xfs. > > Reviewed-by: Jan Kara > Signed-off-by: Ira Weiny ... > @@ -4412,7 +4410,13 @@ static bool ext4_should_use_dax(struct inode *inode) > return

Re: [PATCH 3/9] fs/ext4: Disallow encryption if inode is DAX

2020-05-20 Thread Jan Kara
ed but I'm not clear right now how to fix it. > > > If not, maybe you should change it to check > > S_NEW instead of i_size == 0 to make it clearer? > > The patch is completely unnecessary. > > It is much easier to make (EXT4_ENCRYPT_FL | EXT4_VERITY_FL) incompatible > with EXT4_DAX_FL when it is introduced later in the series. Furthermore > this mutual exclusion can be done on directories in the encrypt case. > Which I think will be nicer for the user if they get an error when trying > to set one when the other is set. Agreed. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH 1/2 V4] MM: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE

2020-05-15 Thread Jan Kara
e behaviour of nfsd and > loop tasks. I don't know what is wanted for realtime. > > Acked-by: Chuck Lever (for nfsd change) > Signed-off-by: NeilBrown The idea looks good to me. It will still have the "threads may hit hard wall" behavior when BDI freerun threshold is cro

Re: [PATCH 2/2 V4] MM: Discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead.

2020-05-15 Thread Jan Kara
trace points and warning printks which mentioned this > counter no longer report it. > > Reviewed-by: Christoph Hellwig > Acked-by: Trond Myklebust > Acked-by: Michal Hocko # for MM parts > Signed-off-by: NeilBrown The patch looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH v2 7/9] fs/ext4: Make DAX mount option a tri-state

2020-05-15 Thread Jan Kara
printing. > > Signed-off-by: Ira Weiny Thanks. The patch looks good to me now. You can add: Reviewed-by: Jan Kara Honza > > --- > Changes from V1: > Fix up mounting options to only show an optio

Re: [PATCH V1 7/9] fs/ext4: Make DAX mount option a tri-state

2020-05-14 Thread Jan Kara
s I wrote in my reply to previous version of this patch, I'd prefer if we handled this like e.g. 'data=' mount option. I don't think any unification in option parsing with XFS makes sence and I'd rather keep consistent how ext4 handles these 'enum' options. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH V1 9/9] Documentation/dax: Update DAX enablement for ext4

2020-05-14 Thread Jan Kara
On Wed 13-05-20 23:53:15, ira.we...@intel.com wrote: > From: Ira Weiny > > Update the document to reflect ext4 and xfs now behave the same. > > Signed-off-by: Ira Weiny Looks good to me. You can add: Review

Re: [PATCH V1 8/9] fs/ext4: Introduce DAX inode flag

2020-05-14 Thread Jan Kara
> Finally, on regular files, flag the inode to not be cached to facilitate > changing S_DAX on the next creation of the inode. > > Signed-off-by: Ira Weiny The patch looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH V1 3/9] fs/ext4: Disallow encryption if inode is DAX

2020-05-14 Thread Jan Kara
xclusive by returning an error if DAX was > set first. > > Furthermore, clarify the documentation of the exclusivity and how that > will work. > > Signed-off-by: Ira Weiny Looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH V1 2/9] fs/ext4: Disallow verity if inode is DAX

2020-05-14 Thread Jan Kara
by returning an error if DAX was > set first. > > (Setting DAX is already disabled if Verity is set first.) > > Signed-off-by: Ira Weiny Makes sence. You can add: Reviewed-by: Jan Kara Honza > > --- > Cha

Re: [PATCH 8/9] fs/ext4: Introduce DAX inode flag

2020-05-14 Thread Jan Kara
On Wed 13-05-20 14:41:55, Ira Weiny wrote: > On Wed, May 13, 2020 at 04:47:06PM +0200, Jan Kara wrote: > > > > So I think you'll have to check > > whether DAX flag is being changed, > > ext4_dax_dontcache() does check if the flag is being changed. Yes, but if you call

Re: [PATCH 8/9] fs/ext4: Introduce DAX inode flag

2020-05-13 Thread Jan Kara
gt; > inode_lock(inode); > + > + ext4_dax_dontcache(inode, flags); > + I don't think we should set dontcache flag when setting of DAX flag fails - it could event be a security issue). So I think you'll have to check whether DAX flag is being changed, call vfs_ioc_fssetxattr_check(), and only if it succeeded and DAX flags was changing call ext4_dax_dontcache(). Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH 7/9] fs/ext4: Make DAX mount option a tri-state

2020-05-13 Thread Jan Kara
@ static int ext4_remount(struct super_block *sb, int > *flags, char *data) > goto restore_opts; > } > > - if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) { > + if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS || > + (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) & > EXT4_MOUNT2_DAX_NEVER) { > ext4_msg(sb, KERN_WARNING, "warning: refusing change of " > - "dax flag with busy inodes while remounting"); > + "dax mount option with busy inodes while remounting"); > sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS; > + sbi->s_mount_opt2 ^= EXT4_MOUNT2_DAX_NEVER; > } > > if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED) > -- > 2.25.1 > -- Jan Kara SUSE Labs, CR

Re: [PATCH 6/9] fs/ext4: Only change S_DAX on inode load

2020-05-13 Thread Jan Kara
; > Add init bool to ext4_set_inode_flags() to indicate if the inode is > being newly initialized. > > Assert that S_DAX is not set on an inode which is just being loaded. > > Signed-off-by: Ira Weiny The patch looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH 5/9] fs/ext4: Update ext4_should_use_dax()

2020-05-13 Thread Jan Kara
; While we are at it change the function to ext4_should_enable_dax() as > this better reflects the ask as well as matches xfs. > > Signed-off-by: Ira Weiny The patch looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH 4/9] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS

2020-05-13 Thread Jan Kara
On Tue 12-05-20 22:43:19, ira.we...@intel.com wrote: > From: Ira Weiny > > In prep for the new tri-state mount option which then introduces > EXT4_MOUNT_DAX_NEVER. > > Signed-off-by: Ira Weiny Looks good to me. You can add: Rev

Re: [PATCH v2] lib/flex_proportions.c: cleanup __fprop_inc_percpu_max

2020-05-11 Thread Jan Kara
uicky return on pl->period == p->period > test, so it would not result to significant downside of performance. > > Thanks for Jan's guidance. > > Signed-off-by: Tan Hu Thanks for the patch. It looks good to me. Yo

Re: [PATCH] lib/flex_proportions.c: aging counts when fraction smaller than max_frac/FPROP_FRAC_BASE

2020-05-08 Thread Jan Kara
does no big harm as we'd just quickly bail on pl->period == p->period test. So I think the code is easier to understand the way you suggest without significant downside. But please update the changelog to explain that this is just code cleanup (with the above reasoning) and not a functional fix. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] fanotify: Replace zero-length array with flexible-array

2020-05-08 Thread Jan Kara
t; --- a/fs/notify/fanotify/fanotify.h > +++ b/fs/notify/fanotify/fanotify.h > @@ -89,7 +89,7 @@ struct fanotify_name_event { > __kernel_fsid_t fsid; > struct fanotify_fh dir_fh; > u8 name_len; > - char name[0]; > + char name[]; > }; > > static inline struct fanotify_name_event * > -- Jan Kara SUSE Labs, CR

Re: [RFC] mm/gup.c: Updated return value of {get|pin}_user_pages_fast()

2020-05-07 Thread Jan Kara
On Wed 06-05-20 21:38:40, Souptick Joarder wrote: > On Wed, May 6, 2020 at 6:29 PM Jan Kara wrote: > > > > On Wed 06-05-20 17:51:39, Souptick Joarder wrote: > > > On Wed, May 6, 2020 at 3:36 PM Jan Kara wrote: > > > > > > > > On Wed 06-05-20 02

Re: [RFC] mm/gup.c: Updated return value of {get|pin}_user_pages_fast()

2020-05-06 Thread Jan Kara
On Wed 06-05-20 17:51:39, Souptick Joarder wrote: > On Wed, May 6, 2020 at 3:36 PM Jan Kara wrote: > > > > On Wed 06-05-20 02:06:56, Souptick Joarder wrote: > > > On Wed, May 6, 2020 at 1:08 AM John Hubbard wrote: > > > > > > > > On 2020-05-05 12

Re: [RFC] mm/gup.c: Updated return value of {get|pin}_user_pages_fast()

2020-05-06 Thread Jan Kara
e behavior of < 0 return on error or > 0 return if we mapped some pages. Callers that can possibly ask to map 0 pages can get 0 pages back - kind of expected - and I don't see any benefit in trying to rewrite these callers to handle -EINVAL instead... Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH 09/15] udf: avoid gcc-10 zero-length-bounds warnings

2020-05-01 Thread Jan Kara
tions) > part) { > - accum = le32_to_cpu( > - lvid->freeSpaceTable[part]); > + accum = le32_to_cpup( > + (lvid->freeSpaceTable + part)); > if (accum == 0x) > accum = 0; > } > > > > This version could easily be backported to stable kernels to let them be > compiled with gcc-10, and then synchronizing with the udftools version of > the header needs additional changes on top, which do not need to be > backported. > >Arnd -- Jan Kara SUSE Labs, CR

Re: [PATCH v5 0/4] Charge loop device i/o to issuing cgroup

2020-04-29 Thread Jan Kara
On Wed 29-04-20 10:22:30, Tejun Heo wrote: > Hello, > > On Wed, Apr 29, 2020 at 12:25:40PM +0200, Jan Kara wrote: > > Yeah, I was thinking about the same when reading the patch series > > description. We already have some cgroup workarounds for btrfs kthreads if > > I

Re: [PATCH v5 0/4] Charge loop device i/o to issuing cgroup

2020-04-29 Thread Jan Kara
to the problem and also the number of kthreads gets multiplied by the number of cgroups. So I agree some generic solution how to approach IO throttling of kthreads / workers would be desirable. OTOH I don't have a great idea how the generic infrastructure should look like... Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH V11 10/11] fs: Introduce DCACHE_DONTCACHE

2020-04-28 Thread Jan Kara
being set I_DONTCACHE. > > This facilitates dropping dentry references to inodes sooner which > require eviction to swap S_DAX mode. > > Cc: Al Viro > Signed-off-by: Ira Weiny The patch looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH v2] cgroup, blkcg: prevent dirty inodes to pin dying memory cgroups

2019-10-22 Thread Jan Kara
On Mon 21-10-19 23:49:04, Roman Gushchin wrote: > On Wed, Oct 16, 2019 at 11:18:40AM +0200, Jan Kara wrote: > > On Tue 15-10-19 21:40:45, Roman Gushchin wrote: > > > On Tue, Oct 15, 2019 at 11:09:33AM +0200, Jan Kara wrote: > > > > On Thu 10-10-19

Re: [PATCH] fs/dax: Fix pmd vs pte conflict detection

2019-10-21 Thread Jan Kara
ck to lookup the correct order. > > Reported-by: Jeff Smits > Reported-by: Doug Nelson > Cc: > Fixes: 23c84eb78375 ("dax: Fix missed wakeup with PMD faults") > Cc: Jan Kara > Cc: Matthew Wilcox (Oracle) > Signed-off-by: Dan Williams Good catch! The patch looks g

Re: [PATCH] aio: Fix io_pgetevents() struct __compat_aio_sigset layout

2019-10-17 Thread Jan Kara
if (ret) > return ret; > > @@ -2239,7 +2239,7 @@ COMPAT_SYSCALL_DEFINE6(io_pgetevents_time64, > if (usig && copy_from_user(, usig, sizeof(ksig))) > return -EFAULT; > > - ret = set_compat_user_sigmask(ksig.sigmask, ksig.sigsetsize); > + ret = set_compat_user_sigmask(compat_ptr(ksig.sigmask), > ksig.sigsetsize); > if (ret) > return ret; > > -- > 2.23.0 > -- Jan Kara SUSE Labs, CR

Re: [PATCH] fsnotify/fdinfo: exportfs_encode_inode_fh() takes pointer as 4th argument

2019-10-17 Thread Jan Kara
: Using plain integer as NULL pointer > > Signed-off-by: Ben Dooks Thanks for the cleanup! Applied. Honza > --- > Cc: Jan Kara > Cc: Amir Goldstein > Cc: linux-fsde...@vger.kernel.org > Cc: linux-kernel@vger.

Re: [PATCH] fsnotify: move declaration of fsnotify_mark_connector_cachep to fsnotify.h

2019-10-17 Thread Jan Kara
t declared. Should it be static? > > Signed-off-by: Ben Dooks OK, fair enough. Applied. Thanks for the patch. Honza > --- > Cc: Jan Kara > Cc: Amir Goldstein > Cc: linux-fsde...@vger.kernel.org > Cc: linux-ke

Re: [PATCH] quota: minor code cleanup for v1_format_ops

2019-10-17 Thread Jan Kara
ormat_ops = { > .check_quota_file = v1_check_quota_file, > .read_file_info = v1_read_file_info, > .write_file_info= v1_write_file_info, > - .free_file_info = NULL, > .read_dqblk = v1_read_dqblk, > .commit_dqblk =

Re: [PATCH v2] cgroup, blkcg: prevent dirty inodes to pin dying memory cgroups

2019-10-16 Thread Jan Kara
On Tue 15-10-19 21:40:45, Roman Gushchin wrote: > On Tue, Oct 15, 2019 at 11:09:33AM +0200, Jan Kara wrote: > > On Thu 10-10-19 16:40:36, Roman Gushchin wrote: > > > > > @@ -426,7 +431,7 @@ static void inode_switch_wbs_work_fn(struct > > > work_struct *work) &

Re: [RFC] writeback: add elastic bdi in cgwb bdp

2019-10-15 Thread Jan Kara
description of what your solution is would be good... Honza > Cc: Roman Gushchin > Cc: Tejun Heo > Cc: Jan Kara > Cc: Johannes Weiner > Cc: Shakeel Butt > Cc: Minchan Kim > Cc: Mel Gorman > Signed-off-by: Hillf Danton > --- > > --- a/include/linux/back

Re: [PATCH v2] cgroup, blkcg: prevent dirty inodes to pin dying memory cgroups

2019-10-15 Thread Jan Kara
val we traverse > the whole list ignoring wbs with active io operations. That will > allow the majority of io operations to be finished after the > removal of the cgroup. > > Big thanks to Jan Kara and Dennis Zhou for their ideas and > contribution to this patch. > > Signed-off-by:

Re: [PATCH] fs: include internal.h for missing declarations

2019-10-15 Thread Jan Kara
hould it be static? > fs/buffer.c:2994:6: warning: symbol 'guard_bio_eod' was not declared. Should > it be static? > > Signed-off-by: Ben Dooks OK, makes sense to keep declarations in sync with real functions. Thanks for the patch a feel free

Re: WARNING: bad unlock balance in rcu_lock_release

2019-10-15 Thread Jan Kara
4e8 EFLAGS: 0246 ORIG_RAX: 00a5 > RAX: ffda RBX: 0003 RCX: 00441e99 > RDX: 2140 RSI: 2280 RDI: 24c0 > RBP: f000 R08: R09: 7fff8d717698 > R10: R11: 0246 R12: 006ccdc8 > R13: 006cd440 R14: R15: > -- Jan Kara SUSE Labs, CR

Re: [PATCH] quota: check quota type in early stage

2019-10-08 Thread Jan Kara
SK; > > + if (type >= MAXQUOTAS) > + return -EINVAL; > + > /* >* As a special case Q_SYNC can be called without a specific device. >* It will iterate all superblocks that have quota enabled and call > -- > 2.21.0 > > > > -- Jan Kara SUSE Labs, CR

Re: [PATCH] cgroup, blkcg: prevent dirty inodes to pin dying memory cgroups

2019-10-08 Thread Jan Kara
cation of already quite complex locking scheme. > And because inode_switch_wbs() can fail, we can't guarantee that a single > pass over such a list will be enough. That means the we need to schedule > scans periodically until all inodes will be switched. > > So I really don't know which option is better, but at the same time > doing nothing isn't the option too. Somehow the problem should be solved. I agree with Dave that scanning all inodes in the system can get really expensive quickly. So what I rather think we could do is create another 'IO list' (linked by inode->i_io_list) where we would put inodes that reference the wb but are not in any other IO list of the wb. And then we would switch inodes on this list when the wb is dying... One would have to be somewhat careful with properly draining this list since new inodes can be added to it while we work on it but otherwise I don't see any complication with this. Honza -- Jan Kara SUSE Labs, CR

Re: Lease semantic proposal

2019-10-07 Thread Jan Kara
eless for anything else. I agree we should not tailor the layout lease definition to just RDMA usecase. But let's talk about the semantics once our confusion about how pNFS currently uses layout leases is clear out. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] quota: avoid increasing DQST_LOOKUPS when iterating over dirty/inuse list

2019-10-07 Thread Jan Kara
dqstats_inc(DQST_LOOKUPS); > err = sb->dq_op->write_dquot(dquot); > if (err) { > /* > -- > 2.20.1 > > > > -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] quota: code cleanup for hash bits calculation

2019-10-07 Thread Jan Kara
ash_bits++; > - } while (nr_hash >> dq_hash_bits); > - dq_hash_bits--; > + dq_hash_bits = ilog2(nr_hash); > > nr_hash = 1UL << dq_hash_bits; > dq_hash_mask = nr_hash - 1; > -- > 2.21.0 > > > > -- Jan Kara SUSE Labs, CR

Re: System hangs if NVMe/SSD is removed during suspend

2019-10-07 Thread Jan Kara
the other reply, I don't think there's a good reason for that to be > the case. Well, cannot it happen that the flush worker will get stuck in D state because some subsystem is already suspended and thus hibernation fails (because AFAIK processes in uninterruptible sleep block hibernation)? I was also somewhat worried that the hibernation image could be inconsistent if flush workers do something while hibernation image is being taken but that does not seem to be a valid concern as all kernel processes get frozen before hibernation image is taken. Honza -- Jan Kara SUSE Labs, CR

Re: Lease semantic proposal

2019-10-03 Thread Jan Kara
had in your patches should do that, shouldn't they? Maybe they were not tied to the right structure... They really need to be tied to the existence of a lease. Honza -- Jan Kara SUSE Labs, CR

Re: Lease semantic proposal

2019-10-03 Thread Jan Kara
ot multiplex this over F_SETLEASE. An F_SETLAYOUT cmd > > > would probably be sufficient, and maybe just reuse > > > F_RDLCK/F_WRLCK/F_UNLCK for the iomode? > > > > > > For the byte ranges, the catch there is that extending the userland > > > interface for that later will be difficult. > > > > Why would it be difficult? > > > > Legacy userland code that wanted to use byte range enabled layouts would > have to be rebuilt to take advantage of them. If we require a range from > the get-go, then they will get the benefit of them once they're > available. I don't think this is true. Because current implementation of locking the whole file may hide implementation bugs in userspace. So the new range lock handling may break userspace and history shows such problems with APIs are actually rather common. So I think switching to range locking *must* be conscious decision of the application and as such having separate API for that is the most natural thing to do. > > > What I'd probably suggest > > > (and what would jive with the way pNFS works) would be to go ahead and > > > add an offset and length to the arguments and result (maybe also > > > whence?). > > > > Why not add new commands with range arguments later if it turns out to > > be necessary? > > We could do that. It'd be a little ugly, IMO, simply because then we'd > end up with two interfaces that do almost the exact same thing. > > Should byte-range layouts at that point conflict with non-byte range > layouts, or should they be in different "spaces" (a'la POSIX and flock > locks)? When it's all one interface, those sorts of questions sort of > answer themselves. When they aren't we'll have to document them clearly > and I think the result will be more confusing for userland programmers. > > If you felt strongly about leaving those out for now, you could just do > something similar to what Aleksa is planning for openat2 -- have a > struct pointer and length as arguments for this cmd, and only have a > single iomode member in there for now. > > The kernel would have to know how to deal with "legacy" and byte-range- > enabled variants if we ever extend it, but that's not too hard to > handle. Yeah, so we can discuss how to make possible future extension towards range locking the least confusing to userspace. E.g. we can just put the ranges in the API and require that start is always 0 and end is always ULONG_MAX or whatever. But switching to smaller ranges must be the decision in the application after the kernel supports it. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] udf: prevent memory leak in udf_new_inode

2019-09-26 Thread Jan Kara
ormal eviction pathway ought to free the damn thing. a bit> > > Mind explaining what's to stop ->evict_inode (== udf_evict_inode) from > hitting > kfree(iinfo->i_ext.i_data); > considering that this call of kfree() appears to be unconditional there? Exactly. udf

Re: [PATCH] quota: code cleanup for hash bits calculation

2019-09-23 Thread Jan Kara
= 1UL << dq_hash_bits; Why not just: dq_hash_bits = ilog2(nr_hash); nr_hash = 1UL << dq_hash_bits; That way we need to compute fls() only once... Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] quota: fix wrong condition in is_quota_modification()

2019-09-16 Thread Jan Kara
On Mon 16-09-19 10:53:08, Chao Yu wrote: > On 2019/9/12 18:06, Jan Kara wrote: > > On Wed 11-09-19 17:36:50, Chao Yu wrote: > >> diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h > >> index dc905a4ff8d7..bd30acad3a7f 100644 > >> --- a/include

Re: [PATCH] quota: fix wrong condition in is_quota_modification()

2019-09-12 Thread Jan Kara
& ATTR_UID && !uid_eq(ia->ia_uid, inode->i_uid)) > || > (ia->ia_valid & ATTR_GID && !gid_eq(ia->ia_gid, inode->i_gid)); > } OK, but your change makes i_size extension not to be quota modification which is IMO wrong. So I think the condit

Re: [PATCH] fs-udf: Delete an unnecessary check before brelse()

2019-09-04 Thread Jan Kara
= 0; i < nr_groups; i++) > - if (bitmap->s_block_bitmap[i]) > - brelse(bitmap->s_block_bitmap[i]); > + brelse(bitmap->s_block_bitmap[i]); > > kvfree(bitmap); > } > -- > 2.23.0 > > -- Jan Kara SUSE Labs, CR

Re: [PATCH] ext2: Delete an unnecessary check before brelse()

2019-09-04 Thread Jan Kara
or (i = 0; i < db_count; i++) > - if (sbi->s_group_desc[i]) > - brelse (sbi->s_group_desc[i]); > + brelse(sbi->s_group_desc[i]); > kfree(sbi->s_group_desc); > kfree(sbi->s_debts); > percpu_counter_destroy(>s_freeblocks_counter); > -- > 2.23.0 > > -- Jan Kara SUSE Labs, CR

Re: [PATCH block/for-next] writeback: add tracepoints for cgroup foreign writebacks

2019-08-30 Thread Jan Kara
= __trace_wb_assign_cgroup(wb); > + __entry->page_cgroup_ino = > page->mem_cgroup->css.cgroup->kn->id.ino; > + ), Are the page dereferences above safe? I suppose lock_page_memcg() protects the page->mem_cgroup->css.cgroup->kn->id dereference? But page->mapping does not seem to be protected by page lock? Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] fs/sync.c: Fix UBSAN Undefined behaviour in sync_file_range

2019-08-27 Thread Jan Kara
the calculate after compare. ensure the offset >= 0 && nbytes >= 0 && > no overflow may happen first. > > Signed-off-by: SunKe Thanks for the patch. The patch looks good to me. You can add: Reviewed-by: Jan Kara Al, care to pickup this fix?

Re: [PATCH v2 3/4] linux/fs.h: fix umask on NFS with CONFIG_FS_POSIX_ACL=n

2019-08-27 Thread Jan Kara
-off-by: Max Kellermann > Cc: sta...@vger.kernel.org Thanks for the patch. This patch definitely looks good to me so feel free to add: Reviewed-by: Jan Kara I just wonder, do you really need patches 1 and 2? Doesn't this patch alone fix the problem? Because AFAIU the problem, this patch sh

Re: [PATCH 02/19] dax: Pass dax_dev to dax_writeback_mapping_range()

2019-08-27 Thread Jan Kara
vice. So there is no need to take reference and drop reference to > dax_device on each call of this function. > > Suggested-by: Christoph Hellwig > Signed-off-by: Vivek Goyal Looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH v2] udf: augment UDF permissions on new inodes

2019-08-27 Thread Jan Kara
cpu_to_le32(udfperms); > > if (S_ISDIR(inode->i_mode) && inode->i_nlink > 0) > --- a/fs/udf/file.c 2019-08-26 21:38:12.138562583 -0500 > +++ b/fs/udf/file.c 2019-08-26 21:12:44.664536308 -0500 > @@ -280,6 +280,9 @@ static int udf_setattr(struct dentry *de > return error; > } > > + if (attr->ia_valid & ATTR_MODE) > + udf_update_extra_perms(inode, attr->ia_mode); > + > setattr_copy(inode, attr); > mark_inode_dirty(inode); > return 0; > -- Jan Kara SUSE Labs, CR

Re: [PATCH v3 4/5] writeback, memcg: Implement cgroup_writeback_by_id()

2019-08-26 Thread Jan Kara
rious wbs. > > v3: Interpret 0 @nr as 1.25 * nr_dirty to implement best-effort > flushing while avoding possible livelocks. > > Signed-off-by: Tejun Heo The patch looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH] udf: augment owner permissions on new inodes

2019-08-26 Thread Jan Kara
perms |= (le32_to_cpu(fe->permissions) & > - (FE_PERM_O_DELETE | FE_PERM_O_CHATTR | > - FE_PERM_G_DELETE | FE_PERM_G_CHATTR | > - FE_PERM_U_DELETE | FE_PERM_U_CHATTR)); > + udfperms |= iinfo->i_extraPerms; > fe->permissions = cpu_to_le32(udfperms); > > if (S_ISDIR(inode->i_mode) && inode->i_nlink > 0) > -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] udf: reduce leakage of blocks related to named streams

2019-08-26 Thread Jan Kara
On Mon 19-08-19 07:10:24, Steve Magnani wrote: > Jan - > > > On 8/15/19 7:42 AM, Jan Kara wrote: > > On Wed 14-08-19 07:50:02, Steven J. Magnani wrote: > > > Windows is capable of creating UDF files having named streams. > > > One example is the "Zone.

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ; -)

2019-08-19 Thread Jan Kara
On Fri 16-08-19 16:20:07, Ira Weiny wrote: > On Fri, Aug 16, 2019 at 12:05:28PM -0700, 'Ira Weiny' wrote: > > On Thu, Aug 15, 2019 at 03:05:58PM +0200, Jan Kara wrote: > > > On Wed 14-08-19 11:08:49, Ira Weiny wrote: > > > > On Wed, Aug 14, 2019 at 12:17:14PM +0200, J

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-19 Thread Jan Kara
On Sat 17-08-19 12:26:03, Dave Chinner wrote: > On Fri, Aug 16, 2019 at 12:05:28PM -0700, Ira Weiny wrote: > > On Thu, Aug 15, 2019 at 03:05:58PM +0200, Jan Kara wrote: > > > On Wed 14-08-19 11:08:49, Ira Weiny wrote: > > > > On Wed, Aug 14, 2019 at 12:17:14PM +0200,

Re: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()

2019-08-16 Thread Jan Kara
On Fri 16-08-19 11:52:20, Jerome Glisse wrote: > On Fri, Aug 16, 2019 at 05:44:04PM +0200, Jan Kara wrote: > > On Fri 16-08-19 10:47:21, Vlastimil Babka wrote: > > > On 8/15/19 3:35 PM, Jan Kara wrote: > > > >> > > > >> So when the GUP

Re: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()

2019-08-16 Thread Jan Kara
On Fri 16-08-19 10:47:21, Vlastimil Babka wrote: > On 8/15/19 3:35 PM, Jan Kara wrote: > >> > >> So when the GUP user uses MMU notifiers to stop writing to pages whenever > >> they are writeprotected with page_mkclean(), they don't really need page > >> pin

Re: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()

2019-08-16 Thread Jan Kara
On Thu 15-08-19 19:14:08, John Hubbard wrote: > On 8/15/19 10:41 AM, John Hubbard wrote: > > On 8/15/19 10:32 AM, Ira Weiny wrote: > >> On Thu, Aug 15, 2019 at 03:35:10PM +0200, Jan Kara wrote: > >>> On Thu 15-08-19 15:26:22, Jan Kara wrote: > >>>>

Re: [PATCH block 1/2] writeback, cgroup: Adjust WB_FRN_TIME_CUT_DIV to accelerate foreign inode switching

2019-08-15 Thread Jan Kara
gt; missed detection depending on the writeback pattern. > > Let's change the parameter to 8, so that it only ignores writeback > with are smaller than 12.5% of the current running average. > > Signed-off-by: Tejun Heo Makes sen

Re: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()

2019-08-15 Thread Jan Kara
On Thu 15-08-19 15:26:22, Jan Kara wrote: > On Wed 14-08-19 20:01:07, John Hubbard wrote: > > On 8/14/19 5:02 PM, John Hubbard wrote: > > > On 8/14/19 4:50 PM, Ira Weiny wrote: > > > > On Tue, Aug 13, 2019 at 05:56:31PM -0700, John Hubbard wrote: > > > >

Re: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()

2019-08-15 Thread Jan Kara
() => needs FOLL_PIN 2) RDMA case - GUP references to pages serving as DMA buffers needed for a long time, no special synchronization with page_mkclean() or munmap() => needs FOLL_PIN | FOLL_LONGTERM This case has also a special case when the pages are actually DAX. Then the caller additionally needs file lease and additional file_pin structure is used for tracking this usage. 3) ODP case - GUP references to pages serving as DMA buffers, MMU notifiers used to synchronize with page_mkclean() and munmap() => normal page references are fine. Honza -- Jan Kara SUSE Labs, CR

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-15 Thread Jan Kara
On Wed 14-08-19 11:08:49, Ira Weiny wrote: > On Wed, Aug 14, 2019 at 12:17:14PM +0200, Jan Kara wrote: > > Hello! > > > > On Fri 09-08-19 15:58:14, ira.we...@intel.com wrote: > > > Pre-requisites > > > == > > > Based on mmotm tre

Re: [PATCH v2] udf: reduce leakage of blocks related to named streams

2019-08-15 Thread Jan Kara
cStreamdir = > + lelb_to_cpu(efe->streamDirectoryICB.extLocation); > + iinfo->i_lenStreams = le64_to_cpu(efe->objectSize); > + if (iinfo->i_lenStreams >= inode->i_size) > + iinfo->i_lenStreams -= inode->i_size; > + else > + iinfo->i_lenStreams = 0; Hum, maybe you could just have i_objectSize instead of i_lenStreams? You use the field just to preserve objectSize anyway so there's no point in complicating it. Honza -- Jan Kara SUSE Labs, CR

Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

2019-08-14 Thread Jan Kara
ile" variant in your patch set, then I'd just implement that and leave "owning mm" variant for later if it proves to be necessary. The things are complex enough as is... > 9) Truncation of pages which are not actively pinned nor covered by a lease >will succeed. Otherwise I like the design. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] fanotify, inotify, dnotify, security: add security hook for fs notifications

2019-08-12 Thread Jan Kara
s CAP_SYS_ADMIN, this is insufficient as it gives implicit > trust to root, which we do not do, and does not support least privilege. > > Signed-off-by: Aaron Goidel > Acked-by: Casey Schaufler The fsnotify bits look good to me in this patch so feel free to add: Acked-by: Jan Kara

Re: [PATCH] fanotify, inotify, dnotify, security: add security hook for fs notifications

2019-08-12 Thread Jan Kara
ail - but I'm not going to wait forever and I'd like > to get this into -next soon so we can get some testing. Yeah, sorry for the delays. I'm aware of the patch but I was also on vacation and pretty busy at work so Amir always beat me in commenting on the patch and I didn't have much to add. Once Aar

Re: [PATCH 1/3] mm/mlock.c: convert put_page() to put_user_page*()

2019-08-09 Thread Jan Kara
inside mm/gup.c. But I would really like to discourage new GUP users taking just page reference as the most clueless users (drivers) usually need a pin in the sense John implements. So in terms of API I'd strongly prefer to deprecate GUP as an API, provide vaddr_pin_pages() for drivers to get their buffer pages pinned and then for those few users who really know what they are doing (and who are not interested in page contents) we can have APIs like follow_page() to get a page reference from a virtual address. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] udf: reduce leakage of blocks related to named streams

2019-08-09 Thread Jan Kara
ogicalVolIntegrityDescImpUse *lvidiu = udf_sb_lvidiu(sb); > + struct udf_inode_info *iinfo = UDF_I(inode); > > if (lvidiu) { > mutex_lock(>s_alloc_mutex); > @@ -42,7 +43,13 @@ void udf_free_inode(struct inode *inode) > mutex_unlock(>s_alloc_mutex); > } > > - udf_free_blocks(sb, NULL, _I(inode)->i_location, 0, 1); > + udf_free_blocks(sb, NULL, >i_location, 0, 1); > + if (iinfo->i_streamdir) { > + udf_free_blocks(sb, NULL, >i_locStreamdir, 0, 1); > + udf_warn(inode->i_sb, > + "Leaking unsupported stream blocks for inode %lu\n", > + inode->i_ino); > + } > } > > struct inode *udf_new_inode(struct inode *dir, umode_t mode) > -- Jan Kara SUSE Labs, CR

Re: [patch V2 6/7] fs/jbd2: Make state lock a spinlock

2019-08-02 Thread Jan Kara
On Thu 01-08-19 20:12:11, Thomas Gleixner wrote: > On Thu, 1 Aug 2019, Jan Kara wrote: > > On Thu 01-08-19 03:01:32, Thomas Gleixner wrote: > > > As almost all functions which use this lock have a journal head pointer > > > readily available, it makes more sense to rem

Re: [patch V2 6/7] fs/jbd2: Make state lock a spinlock

2019-08-02 Thread Jan Kara
> > instead? The difference is of course that lockdep_assert_held() requires > the current context to hold the lock, where assert_*_locked() merely > checks _someone_ holds it. Yeah, jbd2 doesn't play any weird locking tricks so lockdep_assert_held() is fine. I'll replace those when I'm updating the patch. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH 00/34] put_user_pages(): miscellaneous call sites

2019-08-02 Thread Jan Kara
put_page()) but I suppose it would be a high enough barrier for missed conversions... Thoughts? Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] ext4: use rb_entry_safe() instead of open-coding it

2019-08-02 Thread Jan Kara
On Thu 25-07-19 21:16:58, Weitao Hou wrote: > use rb_entry_safe() to make it clean > > Signed-off-by: Weitao Hou Thanks for the patch. It looks good. You can add: Reviewed-by: Jan Kara Honza > --- > fs/ext4/e

Re: UDF filesystem image with Write-Once UDF Access Type

2019-08-02 Thread Jan Kara
On Thu 01-08-19 10:57:55, Pali Rohár wrote: > On Thursday 01 August 2019 10:38:00 Jan Kara wrote: > > Hum, looks like a problem with mkudffs. Relevant debug messages look like: > > > > UDF-fs: fs/udf/super.c:671:udf_check_vsd: Starting at sector 16 (2048 byte > >

Re: [patch V2 6/7] fs/jbd2: Make state lock a spinlock

2019-08-01 Thread Jan Kara
his lock have a journal head pointer > readily available, it makes more sense to remove the lock helper inlines > and write out spin_*lock() at all call sites. > > Fixup all locking comments as well. > > Suggested-by: Jan Kara > Signed-off-by: Thomas Gleixner > Cc: &q

Re: [patch V2 7/7] fs/jbd2: Free journal head outside of locked region

2019-08-01 Thread Jan Kara
h function and handed into > the free function. Might be over-engineered, but better safe than sorry. > > This makes the journal head bit-spinlock usage RT compliant and also avoids > nested locking which is not covered by lockdep. > > Suggested-by: Jan Kara > Signed-off-by: Th

Re: [patch V2 5/7] fs/jbd2: Simplify journal_unmap_buffer()

2019-08-01 Thread Jan Kara
nlock. > > Signed-off-by: Thomas Gleixner > Cc: linux-e...@vger.kernel.org > Cc: "Theodore Ts'o" > Cc: Jan Kara Nice simplification. You can add: Reviewed-by: Jan Kara Honza > --- > V2: New patch >

Re: [patch V2 4/7] fs/jbd2: Remove jbd_trylock_bh_state()

2019-08-01 Thread Jan Kara
On Thu 01-08-19 03:01:30, Thomas Gleixner wrote: > No users. > > Signed-off-by: Thomas Gleixner > Cc: linux-e...@vger.kernel.org > Cc: "Theodore Ts'o" > Cc: Jan Kara Right. Reviewed-by: Jan Kara H

Re: [patch 4/4] fs: jbd/jbd2: Substitute BH locks for RT and lock debugging

2019-08-01 Thread Jan Kara
On Wed 31-07-19 21:40:42, Thomas Gleixner wrote: > On Wed, 31 Jul 2019, Jan Kara wrote: > > BH_State lock is definitely worth it. In fact, if you placed the spinlock > > inside struct journal_head (which is the structure whose members are in > > fact protected by it), I'd b

Re: UDF filesystem image with Write-Once UDF Access Type

2019-08-01 Thread Jan Kara
On Thu 01-08-19 09:35:30, Jan Kara wrote: > > If you want to play with Write-Once Access Type, use recent version of > > mkudffs and choose --media-type=cdr option, which generates UDF > > filesystem suitable for CD-R (Write-Once Access Type with VAT and other > > UDF

Re: UDF filesystem image with Write-Once UDF Access Type

2019-08-01 Thread Jan Kara
least for read-write mounts we should make sure it is valid. So that's something that needs fixing. Honza -- Jan Kara SUSE Labs, CR

Re: UDF filesystem image with Write-Once UDF Access Type

2019-08-01 Thread Jan Kara
oo. This is definitely handled properly and we mount such fs read-only as well. Honza -- Jan Kara SUSE Labs, CR

Re: [patch 4/4] fs: jbd/jbd2: Substitute BH locks for RT and lock debugging

2019-07-31 Thread Jan Kara
Steven Rostedt > Signed-off-by: Thomas Gleixner > Cc: linux-e...@vger.kernel.org > Cc: "Theodore Ts'o" > Cc: Matthew Wilcox > Cc: Jan Kara > -- > include/linux/buffer_head.h |8 > include/linux/jbd2.h| 36 &g

Re: [patch 3/4] fs/buffer: Substitute BH_Uptodate_Lock for RT and bit spinlock debugging

2019-07-31 Thread Jan Kara
place, they can be exposed via > CONFIG_DEBUG_BIT_SPINLOCKS. > > Signed-off-by: Thomas Gleixner > Cc: "Theodore Ts'o" > Cc: Alexander Viro > Cc: Matthew Wilcox > Cc: linux-fsde...@vger.kernel.org Looks good to me. You can add: Reviewed-by: Jan Kara

Re: [patch 2/4] fs/buffer: Move BH_Uptodate_Lock locking into wrapper functions

2019-07-31 Thread Jan Kara
Cc: linux-fsde...@vger.kernel.org Looks good to me. You can add: Reviewed-by: Jan Kara BTW, it should be possible to get rid of BH_Uptodate_Lock altogether using bio chaining (which was non-existent when this bh code was written) to make sure IO completion function gets called only once all bios used

<    1   2   3   4   5   6   7   8   9   10   >