76,14 @@ static int writeback_single_inode(struct
>*/
> if (!(inode->i_state & I_DIRTY_ALL))
> inode_io_list_del_locked(inode, wb);
> + else if (!(inode->i_state & I_DIRTY_TIME) && dt) {
> + /*
> + * We can correct inode's io
use of performance. We
don't want to block writes (or event reads in case of XFS) for the inode
during writeback.
Honza
--
Jan Kara
SUSE Labs, CR
On Sun 24-05-20 21:39:10, Ira Weiny wrote:
> On Fri, May 22, 2020 at 01:48:48PM +0200, Jan Kara wrote:
> > And then we should check conflicts with the journal flag as well, as I
> > mentioned in reply to the first patch. There it is more complicated by the
> > fact that we sh
On Fri 22-05-20 17:23:30, Martijn Coenen wrote:
> [ dropped android-storage-c...@google.com from CC: since that list
> can't receive emails from outside google.com - sorry about that ]
>
> Hi Jan,
>
> On Fri, May 22, 2020 at 4:41 PM Jan Kara wrote:
> > > The easie
reasonably quickly.
Probably I'd add an inode state flag telling that inode is queued for
writeback by flush worker and we won't touch dirty lists in that case,
otherwise we are safe to update current writeback list as needed. I'll work
on fixing this as when I was reading the code I've noticed there are
nd
EXT4_JOURNAL_DATA_FL at the same time so the checks will be somewhat more
complicated.
Honza
> +
> /* Is it quota file? Do not allow user to mess with it */
> if (ext4_is_quota_file(inode))
> goto flags_out;
--
Jan Kara
SUSE Labs, CR
ort prior to the over
> riding mount option.
>
> While we are at it change the function to ext4_should_enable_dax() as
> this better reflects the ask as well as matches xfs.
>
> Signed-off-by: Ira Weiny
The patch looks good to
On Thu 21-05-20 12:13:06, ira.we...@intel.com wrote:
> From: Ira Weiny
>
> When preventing DAX and journaling on an inode. Use the effective DAX
> check rather than the mount option.
>
> This will be required to support per inode DAX flags.
>
> Reviewed-by: Jan Ka
On Wed 20-05-20 12:40:50, Ira Weiny wrote:
> On Wed, May 20, 2020 at 03:37:28PM +0200, Jan Kara wrote:
> > On Tue 19-05-20 22:57:49, ira.we...@intel.com wrote:
> > > From: Ira Weiny
> > >
> > > S_DAX should only be enabled when the underlying block device sup
heck and change this to
IS_DAX(inode) && !(inode->i_flags & I_NEW)
? Because as I'm reading the code now, this should never trigger?
>
> + if (ext4_test_inode_flag(inode, EXT4_INODE_DAX))
> + return -EOPNOTSUPP;
> +
Honza
--
Jan Kara
SUSE Labs, CR
; While we are at it change the function to ext4_should_enable_dax() as
> this better reflects the ask as well as matches xfs.
>
> Reviewed-by: Jan Kara
> Signed-off-by: Ira Weiny
...
> @@ -4412,7 +4410,13 @@ static bool ext4_should_use_dax(struct inode *inode)
> return
ed but I'm not clear right now how to fix it.
>
> > If not, maybe you should change it to check
> > S_NEW instead of i_size == 0 to make it clearer?
>
> The patch is completely unnecessary.
>
> It is much easier to make (EXT4_ENCRYPT_FL | EXT4_VERITY_FL) incompatible
> with EXT4_DAX_FL when it is introduced later in the series. Furthermore
> this mutual exclusion can be done on directories in the encrypt case.
> Which I think will be nicer for the user if they get an error when trying
> to set one when the other is set.
Agreed.
Honza
--
Jan Kara
SUSE Labs, CR
e behaviour of nfsd and
> loop tasks. I don't know what is wanted for realtime.
>
> Acked-by: Chuck Lever (for nfsd change)
> Signed-off-by: NeilBrown
The idea looks good to me. It will still have the "threads may hit hard
wall" behavior when BDI freerun threshold is cro
trace points and warning printks which mentioned this
> counter no longer report it.
>
> Reviewed-by: Christoph Hellwig
> Acked-by: Trond Myklebust
> Acked-by: Michal Hocko # for MM parts
> Signed-off-by: NeilBrown
The patch looks good to me. You can add:
Reviewed-by: Jan Kara
printing.
>
> Signed-off-by: Ira Weiny
Thanks. The patch looks good to me now. You can add:
Reviewed-by: Jan Kara
Honza
>
> ---
> Changes from V1:
> Fix up mounting options to only show an optio
s I wrote in my reply to previous version of this patch, I'd prefer if we
handled this like e.g. 'data=' mount option. I don't think any unification
in option parsing with XFS makes sence and I'd rather keep consistent how
ext4 handles these 'enum' options.
Honza
--
Jan Kara
SUSE Labs, CR
On Wed 13-05-20 23:53:15, ira.we...@intel.com wrote:
> From: Ira Weiny
>
> Update the document to reflect ext4 and xfs now behave the same.
>
> Signed-off-by: Ira Weiny
Looks good to me. You can add:
Review
> Finally, on regular files, flag the inode to not be cached to facilitate
> changing S_DAX on the next creation of the inode.
>
> Signed-off-by: Ira Weiny
The patch looks good to me. You can add:
Reviewed-by: Jan Kara
xclusive by returning an error if DAX was
> set first.
>
> Furthermore, clarify the documentation of the exclusivity and how that
> will work.
>
> Signed-off-by: Ira Weiny
Looks good to me. You can add:
Reviewed-by: Jan Kara
by returning an error if DAX was
> set first.
>
> (Setting DAX is already disabled if Verity is set first.)
>
> Signed-off-by: Ira Weiny
Makes sence. You can add:
Reviewed-by: Jan Kara
Honza
>
> ---
> Cha
On Wed 13-05-20 14:41:55, Ira Weiny wrote:
> On Wed, May 13, 2020 at 04:47:06PM +0200, Jan Kara wrote:
> >
> > So I think you'll have to check
> > whether DAX flag is being changed,
>
> ext4_dax_dontcache() does check if the flag is being changed.
Yes, but if you call
gt;
> inode_lock(inode);
> +
> + ext4_dax_dontcache(inode, flags);
> +
I don't think we should set dontcache flag when setting of DAX flag fails -
it could event be a security issue). So I think you'll have to check
whether DAX flag is being changed, call vfs_ioc_fssetxattr_check(), and
only if it succeeded and DAX flags was changing call ext4_dax_dontcache().
Honza
--
Jan Kara
SUSE Labs, CR
@ static int ext4_remount(struct super_block *sb, int
> *flags, char *data)
> goto restore_opts;
> }
>
> - if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
> + if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS ||
> + (sbi->s_mount_opt2 ^ old_opts.s_mount_opt2) &
> EXT4_MOUNT2_DAX_NEVER) {
> ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
> - "dax flag with busy inodes while remounting");
> + "dax mount option with busy inodes while remounting");
> sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
> + sbi->s_mount_opt2 ^= EXT4_MOUNT2_DAX_NEVER;
> }
>
> if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
> --
> 2.25.1
>
--
Jan Kara
SUSE Labs, CR
;
> Add init bool to ext4_set_inode_flags() to indicate if the inode is
> being newly initialized.
>
> Assert that S_DAX is not set on an inode which is just being loaded.
>
> Signed-off-by: Ira Weiny
The patch looks good to me. You can add:
Reviewed-by: Jan Kara
; While we are at it change the function to ext4_should_enable_dax() as
> this better reflects the ask as well as matches xfs.
>
> Signed-off-by: Ira Weiny
The patch looks good to me. You can add:
Reviewed-by: Jan Kara
On Tue 12-05-20 22:43:19, ira.we...@intel.com wrote:
> From: Ira Weiny
>
> In prep for the new tri-state mount option which then introduces
> EXT4_MOUNT_DAX_NEVER.
>
> Signed-off-by: Ira Weiny
Looks good to me. You can add:
Rev
uicky return on pl->period == p->period
> test, so it would not result to significant downside of performance.
>
> Thanks for Jan's guidance.
>
> Signed-off-by: Tan Hu
Thanks for the patch. It looks good to me. Yo
does no big harm as we'd just quickly bail on
pl->period == p->period test. So I think the code is easier to understand
the way you suggest without significant downside. But please update the
changelog to explain that this is just code cleanup (with the above
reasoning) and not a functional fix.
Honza
--
Jan Kara
SUSE Labs, CR
t; --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -89,7 +89,7 @@ struct fanotify_name_event {
> __kernel_fsid_t fsid;
> struct fanotify_fh dir_fh;
> u8 name_len;
> - char name[0];
> + char name[];
> };
>
> static inline struct fanotify_name_event *
>
--
Jan Kara
SUSE Labs, CR
On Wed 06-05-20 21:38:40, Souptick Joarder wrote:
> On Wed, May 6, 2020 at 6:29 PM Jan Kara wrote:
> >
> > On Wed 06-05-20 17:51:39, Souptick Joarder wrote:
> > > On Wed, May 6, 2020 at 3:36 PM Jan Kara wrote:
> > > >
> > > > On Wed 06-05-20 02
On Wed 06-05-20 17:51:39, Souptick Joarder wrote:
> On Wed, May 6, 2020 at 3:36 PM Jan Kara wrote:
> >
> > On Wed 06-05-20 02:06:56, Souptick Joarder wrote:
> > > On Wed, May 6, 2020 at 1:08 AM John Hubbard wrote:
> > > >
> > > > On 2020-05-05 12
e behavior of < 0 return on
error or > 0 return if we mapped some pages. Callers that can possibly ask
to map 0 pages can get 0 pages back - kind of expected - and I don't see
any benefit in trying to rewrite these callers to handle -EINVAL instead...
Honza
--
Jan Kara
SUSE Labs, CR
tions) > part) {
> - accum = le32_to_cpu(
> - lvid->freeSpaceTable[part]);
> + accum = le32_to_cpup(
> + (lvid->freeSpaceTable + part));
> if (accum == 0x)
> accum = 0;
> }
>
>
>
> This version could easily be backported to stable kernels to let them be
> compiled with gcc-10, and then synchronizing with the udftools version of
> the header needs additional changes on top, which do not need to be
> backported.
>
>Arnd
--
Jan Kara
SUSE Labs, CR
On Wed 29-04-20 10:22:30, Tejun Heo wrote:
> Hello,
>
> On Wed, Apr 29, 2020 at 12:25:40PM +0200, Jan Kara wrote:
> > Yeah, I was thinking about the same when reading the patch series
> > description. We already have some cgroup workarounds for btrfs kthreads if
> > I
to the problem and also
the number of kthreads gets multiplied by the number of cgroups. So I
agree some generic solution how to approach IO throttling of kthreads /
workers would be desirable.
OTOH I don't have a great idea how the generic infrastructure should look
like...
Honza
--
Jan Kara
SUSE Labs, CR
being set I_DONTCACHE.
>
> This facilitates dropping dentry references to inodes sooner which
> require eviction to swap S_DAX mode.
>
> Cc: Al Viro
> Signed-off-by: Ira Weiny
The patch looks good to me. You can add:
Reviewed-by: Jan Kara
On Mon 21-10-19 23:49:04, Roman Gushchin wrote:
> On Wed, Oct 16, 2019 at 11:18:40AM +0200, Jan Kara wrote:
> > On Tue 15-10-19 21:40:45, Roman Gushchin wrote:
> > > On Tue, Oct 15, 2019 at 11:09:33AM +0200, Jan Kara wrote:
> > > > On Thu 10-10-19
ck to lookup the correct order.
>
> Reported-by: Jeff Smits
> Reported-by: Doug Nelson
> Cc:
> Fixes: 23c84eb78375 ("dax: Fix missed wakeup with PMD faults")
> Cc: Jan Kara
> Cc: Matthew Wilcox (Oracle)
> Signed-off-by: Dan Williams
Good catch! The patch looks g
if (ret)
> return ret;
>
> @@ -2239,7 +2239,7 @@ COMPAT_SYSCALL_DEFINE6(io_pgetevents_time64,
> if (usig && copy_from_user(, usig, sizeof(ksig)))
> return -EFAULT;
>
> - ret = set_compat_user_sigmask(ksig.sigmask, ksig.sigsetsize);
> + ret = set_compat_user_sigmask(compat_ptr(ksig.sigmask),
> ksig.sigsetsize);
> if (ret)
> return ret;
>
> --
> 2.23.0
>
--
Jan Kara
SUSE Labs, CR
: Using plain integer as NULL pointer
>
> Signed-off-by: Ben Dooks
Thanks for the cleanup! Applied.
Honza
> ---
> Cc: Jan Kara
> Cc: Amir Goldstein
> Cc: linux-fsde...@vger.kernel.org
> Cc: linux-kernel@vger.
t declared. Should it be static?
>
> Signed-off-by: Ben Dooks
OK, fair enough. Applied. Thanks for the patch.
Honza
> ---
> Cc: Jan Kara
> Cc: Amir Goldstein
> Cc: linux-fsde...@vger.kernel.org
> Cc: linux-ke
ormat_ops = {
> .check_quota_file = v1_check_quota_file,
> .read_file_info = v1_read_file_info,
> .write_file_info= v1_write_file_info,
> - .free_file_info = NULL,
> .read_dqblk = v1_read_dqblk,
> .commit_dqblk =
On Tue 15-10-19 21:40:45, Roman Gushchin wrote:
> On Tue, Oct 15, 2019 at 11:09:33AM +0200, Jan Kara wrote:
> > On Thu 10-10-19 16:40:36, Roman Gushchin wrote:
> >
> > > @@ -426,7 +431,7 @@ static void inode_switch_wbs_work_fn(struct
> > > work_struct *work)
&
description of what
your solution is would be good...
Honza
> Cc: Roman Gushchin
> Cc: Tejun Heo
> Cc: Jan Kara
> Cc: Johannes Weiner
> Cc: Shakeel Butt
> Cc: Minchan Kim
> Cc: Mel Gorman
> Signed-off-by: Hillf Danton
> ---
>
> --- a/include/linux/back
val we traverse
> the whole list ignoring wbs with active io operations. That will
> allow the majority of io operations to be finished after the
> removal of the cgroup.
>
> Big thanks to Jan Kara and Dennis Zhou for their ideas and
> contribution to this patch.
>
> Signed-off-by:
hould it be static?
> fs/buffer.c:2994:6: warning: symbol 'guard_bio_eod' was not declared. Should
> it be static?
>
> Signed-off-by: Ben Dooks
OK, makes sense to keep declarations in sync with real functions. Thanks
for the patch a feel free
4e8 EFLAGS: 0246 ORIG_RAX: 00a5
> RAX: ffda RBX: 0003 RCX: 00441e99
> RDX: 2140 RSI: 2280 RDI: 24c0
> RBP: f000 R08: R09: 7fff8d717698
> R10: R11: 0246 R12: 006ccdc8
> R13: 006cd440 R14: R15:
>
--
Jan Kara
SUSE Labs, CR
SK;
>
> + if (type >= MAXQUOTAS)
> + return -EINVAL;
> +
> /*
>* As a special case Q_SYNC can be called without a specific device.
>* It will iterate all superblocks that have quota enabled and call
> --
> 2.21.0
>
>
>
>
--
Jan Kara
SUSE Labs, CR
cation of already quite complex locking scheme.
> And because inode_switch_wbs() can fail, we can't guarantee that a single
> pass over such a list will be enough. That means the we need to schedule
> scans periodically until all inodes will be switched.
>
> So I really don't know which option is better, but at the same time
> doing nothing isn't the option too. Somehow the problem should be solved.
I agree with Dave that scanning all inodes in the system can get really
expensive quickly. So what I rather think we could do is create another 'IO
list' (linked by inode->i_io_list) where we would put inodes that reference
the wb but are not in any other IO list of the wb. And then we would
switch inodes on this list when the wb is dying... One would have to be
somewhat careful with properly draining this list since new inodes can be
added to it while we work on it but otherwise I don't see any complication
with this.
Honza
--
Jan Kara
SUSE Labs, CR
eless for anything else.
I agree we should not tailor the layout lease definition to just RDMA
usecase. But let's talk about the semantics once our confusion about how
pNFS currently uses layout leases is clear out.
Honza
--
Jan Kara
SUSE Labs, CR
dqstats_inc(DQST_LOOKUPS);
> err = sb->dq_op->write_dquot(dquot);
> if (err) {
> /*
> --
> 2.20.1
>
>
>
>
--
Jan Kara
SUSE Labs, CR
ash_bits++;
> - } while (nr_hash >> dq_hash_bits);
> - dq_hash_bits--;
> + dq_hash_bits = ilog2(nr_hash);
>
> nr_hash = 1UL << dq_hash_bits;
> dq_hash_mask = nr_hash - 1;
> --
> 2.21.0
>
>
>
>
--
Jan Kara
SUSE Labs, CR
the other reply, I don't think there's a good reason for that to be
> the case.
Well, cannot it happen that the flush worker will get stuck in D state
because some subsystem is already suspended and thus hibernation fails
(because AFAIK processes in uninterruptible sleep block hibernation)?
I was also somewhat worried that the hibernation image could be
inconsistent if flush workers do something while hibernation image is being
taken but that does not seem to be a valid concern as all kernel processes
get frozen before hibernation image is taken.
Honza
--
Jan Kara
SUSE Labs, CR
had in your
patches should do that, shouldn't they? Maybe they were not tied to the
right structure... They really need to be tied to the existence of a lease.
Honza
--
Jan Kara
SUSE Labs, CR
ot multiplex this over F_SETLEASE. An F_SETLAYOUT cmd
> > > would probably be sufficient, and maybe just reuse
> > > F_RDLCK/F_WRLCK/F_UNLCK for the iomode?
> > >
> > > For the byte ranges, the catch there is that extending the userland
> > > interface for that later will be difficult.
> >
> > Why would it be difficult?
> >
>
> Legacy userland code that wanted to use byte range enabled layouts would
> have to be rebuilt to take advantage of them. If we require a range from
> the get-go, then they will get the benefit of them once they're
> available.
I don't think this is true. Because current implementation of locking the
whole file may hide implementation bugs in userspace. So the new
range lock handling may break userspace and history shows such
problems with APIs are actually rather common. So I think switching to
range locking *must* be conscious decision of the application and as
such having separate API for that is the most natural thing to do.
> > > What I'd probably suggest
> > > (and what would jive with the way pNFS works) would be to go ahead and
> > > add an offset and length to the arguments and result (maybe also
> > > whence?).
> >
> > Why not add new commands with range arguments later if it turns out to
> > be necessary?
>
> We could do that. It'd be a little ugly, IMO, simply because then we'd
> end up with two interfaces that do almost the exact same thing.
>
> Should byte-range layouts at that point conflict with non-byte range
> layouts, or should they be in different "spaces" (a'la POSIX and flock
> locks)? When it's all one interface, those sorts of questions sort of
> answer themselves. When they aren't we'll have to document them clearly
> and I think the result will be more confusing for userland programmers.
>
> If you felt strongly about leaving those out for now, you could just do
> something similar to what Aleksa is planning for openat2 -- have a
> struct pointer and length as arguments for this cmd, and only have a
> single iomode member in there for now.
>
> The kernel would have to know how to deal with "legacy" and byte-range-
> enabled variants if we ever extend it, but that's not too hard to
> handle.
Yeah, so we can discuss how to make possible future extension towards
range locking the least confusing to userspace. E.g. we can just put the
ranges in the API and require that start is always 0 and end is always
ULONG_MAX or whatever. But switching to smaller ranges must be the decision
in the application after the kernel supports it.
Honza
--
Jan Kara
SUSE Labs, CR
ormal eviction pathway ought to free the damn thing. a bit>
>
> Mind explaining what's to stop ->evict_inode (== udf_evict_inode) from
> hitting
> kfree(iinfo->i_ext.i_data);
> considering that this call of kfree() appears to be unconditional there?
Exactly. udf
= 1UL << dq_hash_bits;
Why not just:
dq_hash_bits = ilog2(nr_hash);
nr_hash = 1UL << dq_hash_bits;
That way we need to compute fls() only once...
Honza
--
Jan Kara
SUSE Labs, CR
On Mon 16-09-19 10:53:08, Chao Yu wrote:
> On 2019/9/12 18:06, Jan Kara wrote:
> > On Wed 11-09-19 17:36:50, Chao Yu wrote:
> >> diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
> >> index dc905a4ff8d7..bd30acad3a7f 100644
> >> --- a/include
& ATTR_UID && !uid_eq(ia->ia_uid, inode->i_uid))
> ||
> (ia->ia_valid & ATTR_GID && !gid_eq(ia->ia_gid, inode->i_gid));
> }
OK, but your change makes i_size extension not to be quota modification
which is IMO wrong. So I think the condit
= 0; i < nr_groups; i++)
> - if (bitmap->s_block_bitmap[i])
> - brelse(bitmap->s_block_bitmap[i]);
> + brelse(bitmap->s_block_bitmap[i]);
>
> kvfree(bitmap);
> }
> --
> 2.23.0
>
>
--
Jan Kara
SUSE Labs, CR
or (i = 0; i < db_count; i++)
> - if (sbi->s_group_desc[i])
> - brelse (sbi->s_group_desc[i]);
> + brelse(sbi->s_group_desc[i]);
> kfree(sbi->s_group_desc);
> kfree(sbi->s_debts);
> percpu_counter_destroy(>s_freeblocks_counter);
> --
> 2.23.0
>
>
--
Jan Kara
SUSE Labs, CR
= __trace_wb_assign_cgroup(wb);
> + __entry->page_cgroup_ino =
> page->mem_cgroup->css.cgroup->kn->id.ino;
> + ),
Are the page dereferences above safe? I suppose lock_page_memcg() protects
the page->mem_cgroup->css.cgroup->kn->id dereference? But page->mapping
does not seem to be protected by page lock?
Honza
--
Jan Kara
SUSE Labs, CR
the calculate after compare. ensure the offset >= 0 && nbytes >= 0 &&
> no overflow may happen first.
>
> Signed-off-by: SunKe
Thanks for the patch. The patch looks good to me. You can add:
Reviewed-by: Jan Kara
Al, care to pickup this fix?
-off-by: Max Kellermann
> Cc: sta...@vger.kernel.org
Thanks for the patch. This patch definitely looks good to me so feel free
to add:
Reviewed-by: Jan Kara
I just wonder, do you really need patches 1 and 2? Doesn't this patch alone
fix the problem? Because AFAIU the problem, this patch sh
vice. So there is no need to take reference and drop reference to
> dax_device on each call of this function.
>
> Suggested-by: Christoph Hellwig
> Signed-off-by: Vivek Goyal
Looks good to me. You can add:
Reviewed-by: Jan Kara
cpu_to_le32(udfperms);
>
> if (S_ISDIR(inode->i_mode) && inode->i_nlink > 0)
> --- a/fs/udf/file.c 2019-08-26 21:38:12.138562583 -0500
> +++ b/fs/udf/file.c 2019-08-26 21:12:44.664536308 -0500
> @@ -280,6 +280,9 @@ static int udf_setattr(struct dentry *de
> return error;
> }
>
> + if (attr->ia_valid & ATTR_MODE)
> + udf_update_extra_perms(inode, attr->ia_mode);
> +
> setattr_copy(inode, attr);
> mark_inode_dirty(inode);
> return 0;
>
--
Jan Kara
SUSE Labs, CR
rious wbs.
>
> v3: Interpret 0 @nr as 1.25 * nr_dirty to implement best-effort
> flushing while avoding possible livelocks.
>
> Signed-off-by: Tejun Heo
The patch looks good to me. You can add:
Reviewed-by: Jan Kara
perms |= (le32_to_cpu(fe->permissions) &
> - (FE_PERM_O_DELETE | FE_PERM_O_CHATTR |
> - FE_PERM_G_DELETE | FE_PERM_G_CHATTR |
> - FE_PERM_U_DELETE | FE_PERM_U_CHATTR));
> + udfperms |= iinfo->i_extraPerms;
> fe->permissions = cpu_to_le32(udfperms);
>
> if (S_ISDIR(inode->i_mode) && inode->i_nlink > 0)
>
--
Jan Kara
SUSE Labs, CR
On Mon 19-08-19 07:10:24, Steve Magnani wrote:
> Jan -
>
>
> On 8/15/19 7:42 AM, Jan Kara wrote:
> > On Wed 14-08-19 07:50:02, Steven J. Magnani wrote:
> > > Windows is capable of creating UDF files having named streams.
> > > One example is the "Zone.
On Fri 16-08-19 16:20:07, Ira Weiny wrote:
> On Fri, Aug 16, 2019 at 12:05:28PM -0700, 'Ira Weiny' wrote:
> > On Thu, Aug 15, 2019 at 03:05:58PM +0200, Jan Kara wrote:
> > > On Wed 14-08-19 11:08:49, Ira Weiny wrote:
> > > > On Wed, Aug 14, 2019 at 12:17:14PM +0200, J
On Sat 17-08-19 12:26:03, Dave Chinner wrote:
> On Fri, Aug 16, 2019 at 12:05:28PM -0700, Ira Weiny wrote:
> > On Thu, Aug 15, 2019 at 03:05:58PM +0200, Jan Kara wrote:
> > > On Wed 14-08-19 11:08:49, Ira Weiny wrote:
> > > > On Wed, Aug 14, 2019 at 12:17:14PM +0200,
On Fri 16-08-19 11:52:20, Jerome Glisse wrote:
> On Fri, Aug 16, 2019 at 05:44:04PM +0200, Jan Kara wrote:
> > On Fri 16-08-19 10:47:21, Vlastimil Babka wrote:
> > > On 8/15/19 3:35 PM, Jan Kara wrote:
> > > >>
> > > >> So when the GUP
On Fri 16-08-19 10:47:21, Vlastimil Babka wrote:
> On 8/15/19 3:35 PM, Jan Kara wrote:
> >>
> >> So when the GUP user uses MMU notifiers to stop writing to pages whenever
> >> they are writeprotected with page_mkclean(), they don't really need page
> >> pin
On Thu 15-08-19 19:14:08, John Hubbard wrote:
> On 8/15/19 10:41 AM, John Hubbard wrote:
> > On 8/15/19 10:32 AM, Ira Weiny wrote:
> >> On Thu, Aug 15, 2019 at 03:35:10PM +0200, Jan Kara wrote:
> >>> On Thu 15-08-19 15:26:22, Jan Kara wrote:
> >>>>
gt; missed detection depending on the writeback pattern.
>
> Let's change the parameter to 8, so that it only ignores writeback
> with are smaller than 12.5% of the current running average.
>
> Signed-off-by: Tejun Heo
Makes sen
On Thu 15-08-19 15:26:22, Jan Kara wrote:
> On Wed 14-08-19 20:01:07, John Hubbard wrote:
> > On 8/14/19 5:02 PM, John Hubbard wrote:
> > > On 8/14/19 4:50 PM, Ira Weiny wrote:
> > > > On Tue, Aug 13, 2019 at 05:56:31PM -0700, John Hubbard wrote:
> > > >
() => needs FOLL_PIN
2) RDMA case - GUP references to pages serving as DMA buffers needed for a
long time, no special synchronization with page_mkclean() or munmap()
=> needs FOLL_PIN | FOLL_LONGTERM
This case has also a special case when the pages are actually DAX. Then
the caller additionally needs file lease and additional file_pin
structure is used for tracking this usage.
3) ODP case - GUP references to pages serving as DMA buffers, MMU notifiers
used to synchronize with page_mkclean() and munmap() => normal page
references are fine.
Honza
--
Jan Kara
SUSE Labs, CR
On Wed 14-08-19 11:08:49, Ira Weiny wrote:
> On Wed, Aug 14, 2019 at 12:17:14PM +0200, Jan Kara wrote:
> > Hello!
> >
> > On Fri 09-08-19 15:58:14, ira.we...@intel.com wrote:
> > > Pre-requisites
> > > ==
> > > Based on mmotm tre
cStreamdir =
> + lelb_to_cpu(efe->streamDirectoryICB.extLocation);
> + iinfo->i_lenStreams = le64_to_cpu(efe->objectSize);
> + if (iinfo->i_lenStreams >= inode->i_size)
> + iinfo->i_lenStreams -= inode->i_size;
> + else
> + iinfo->i_lenStreams = 0;
Hum, maybe you could just have i_objectSize instead of i_lenStreams? You
use the field just to preserve objectSize anyway so there's no point in
complicating it.
Honza
--
Jan Kara
SUSE Labs, CR
ile" variant in your patch set, then
I'd just implement that and leave "owning mm" variant for later if it
proves to be necessary. The things are complex enough as is...
> 9) Truncation of pages which are not actively pinned nor covered by a lease
>will succeed.
Otherwise I like the design.
Honza
--
Jan Kara
SUSE Labs, CR
s CAP_SYS_ADMIN, this is insufficient as it gives implicit
> trust to root, which we do not do, and does not support least privilege.
>
> Signed-off-by: Aaron Goidel
> Acked-by: Casey Schaufler
The fsnotify bits look good to me in this patch so feel free to add:
Acked-by: Jan Kara
ail - but I'm not going to wait forever and I'd like
> to get this into -next soon so we can get some testing.
Yeah, sorry for the delays. I'm aware of the patch but I was also on
vacation and pretty busy at work so Amir always beat me in commenting on
the patch and I didn't have much to add. Once Aar
inside mm/gup.c. But I would really like to
discourage new GUP users taking just page reference as the most clueless
users (drivers) usually need a pin in the sense John implements. So in
terms of API I'd strongly prefer to deprecate GUP as an API, provide
vaddr_pin_pages() for drivers to get their buffer pages pinned and then for
those few users who really know what they are doing (and who are not
interested in page contents) we can have APIs like follow_page() to get a
page reference from a virtual address.
Honza
--
Jan Kara
SUSE Labs, CR
ogicalVolIntegrityDescImpUse *lvidiu = udf_sb_lvidiu(sb);
> + struct udf_inode_info *iinfo = UDF_I(inode);
>
> if (lvidiu) {
> mutex_lock(>s_alloc_mutex);
> @@ -42,7 +43,13 @@ void udf_free_inode(struct inode *inode)
> mutex_unlock(>s_alloc_mutex);
> }
>
> - udf_free_blocks(sb, NULL, _I(inode)->i_location, 0, 1);
> + udf_free_blocks(sb, NULL, >i_location, 0, 1);
> + if (iinfo->i_streamdir) {
> + udf_free_blocks(sb, NULL, >i_locStreamdir, 0, 1);
> + udf_warn(inode->i_sb,
> + "Leaking unsupported stream blocks for inode %lu\n",
> + inode->i_ino);
> + }
> }
>
> struct inode *udf_new_inode(struct inode *dir, umode_t mode)
>
--
Jan Kara
SUSE Labs, CR
On Thu 01-08-19 20:12:11, Thomas Gleixner wrote:
> On Thu, 1 Aug 2019, Jan Kara wrote:
> > On Thu 01-08-19 03:01:32, Thomas Gleixner wrote:
> > > As almost all functions which use this lock have a journal head pointer
> > > readily available, it makes more sense to rem
>
> instead? The difference is of course that lockdep_assert_held() requires
> the current context to hold the lock, where assert_*_locked() merely
> checks _someone_ holds it.
Yeah, jbd2 doesn't play any weird locking tricks so lockdep_assert_held()
is fine. I'll replace those when I'm updating the patch.
Honza
--
Jan Kara
SUSE Labs, CR
put_page()) but I suppose it would be a high enough barrier for missed
conversions... Thoughts?
Honza
--
Jan Kara
SUSE Labs, CR
On Thu 25-07-19 21:16:58, Weitao Hou wrote:
> use rb_entry_safe() to make it clean
>
> Signed-off-by: Weitao Hou
Thanks for the patch. It looks good. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> fs/ext4/e
On Thu 01-08-19 10:57:55, Pali Rohár wrote:
> On Thursday 01 August 2019 10:38:00 Jan Kara wrote:
> > Hum, looks like a problem with mkudffs. Relevant debug messages look like:
> >
> > UDF-fs: fs/udf/super.c:671:udf_check_vsd: Starting at sector 16 (2048 byte
> >
his lock have a journal head pointer
> readily available, it makes more sense to remove the lock helper inlines
> and write out spin_*lock() at all call sites.
>
> Fixup all locking comments as well.
>
> Suggested-by: Jan Kara
> Signed-off-by: Thomas Gleixner
> Cc: &q
h function and handed into
> the free function. Might be over-engineered, but better safe than sorry.
>
> This makes the journal head bit-spinlock usage RT compliant and also avoids
> nested locking which is not covered by lockdep.
>
> Suggested-by: Jan Kara
> Signed-off-by: Th
nlock.
>
> Signed-off-by: Thomas Gleixner
> Cc: linux-e...@vger.kernel.org
> Cc: "Theodore Ts'o"
> Cc: Jan Kara
Nice simplification. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> V2: New patch
>
On Thu 01-08-19 03:01:30, Thomas Gleixner wrote:
> No users.
>
> Signed-off-by: Thomas Gleixner
> Cc: linux-e...@vger.kernel.org
> Cc: "Theodore Ts'o"
> Cc: Jan Kara
Right.
Reviewed-by: Jan Kara
H
On Wed 31-07-19 21:40:42, Thomas Gleixner wrote:
> On Wed, 31 Jul 2019, Jan Kara wrote:
> > BH_State lock is definitely worth it. In fact, if you placed the spinlock
> > inside struct journal_head (which is the structure whose members are in
> > fact protected by it), I'd b
On Thu 01-08-19 09:35:30, Jan Kara wrote:
> > If you want to play with Write-Once Access Type, use recent version of
> > mkudffs and choose --media-type=cdr option, which generates UDF
> > filesystem suitable for CD-R (Write-Once Access Type with VAT and other
> > UDF
least for read-write mounts we should make
sure it is valid. So that's something that needs fixing.
Honza
--
Jan Kara
SUSE Labs, CR
oo.
This is definitely handled properly and we mount such fs read-only as well.
Honza
--
Jan Kara
SUSE Labs, CR
Steven Rostedt
> Signed-off-by: Thomas Gleixner
> Cc: linux-e...@vger.kernel.org
> Cc: "Theodore Ts'o"
> Cc: Matthew Wilcox
> Cc: Jan Kara
> --
> include/linux/buffer_head.h |8
> include/linux/jbd2.h| 36
&g
place, they can be exposed via
> CONFIG_DEBUG_BIT_SPINLOCKS.
>
> Signed-off-by: Thomas Gleixner
> Cc: "Theodore Ts'o"
> Cc: Alexander Viro
> Cc: Matthew Wilcox
> Cc: linux-fsde...@vger.kernel.org
Looks good to me. You can add:
Reviewed-by: Jan Kara
Cc: linux-fsde...@vger.kernel.org
Looks good to me. You can add:
Reviewed-by: Jan Kara
BTW, it should be possible to get rid of BH_Uptodate_Lock altogether using
bio chaining (which was non-existent when this bh code was written) to make
sure IO completion function gets called only once all bios used
301 - 400 of 6478 matches
Mail list logo