version would be:
Update:
modify inode
inode_maybe_inc_iversion(inode)
Read:
my_version = inode_query_iversion(inode)
get inode data
And you need to make sure 'get inode data' does not get speculatively
evaluated before you actually sample i_version so that you are guaranteed
that if data changes, you will observe larger i_version in the future.
Also please add a comment to the smp_mb() in inode_maybe_inc_iversion() like:
/* This barrier pairs with the barrier in inode_query_iversion() */
and a similar comment to inode_query_iversion(), because memory barriers
make sense only in pairs (see SMP BARRIER PAIRING in
Documentation/memory-barriers.txt).
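The ordering described above can be sketched in userspace with C11 atomics. This is only an analogy, not the kernel code: struct demo_inode, demo_update() and demo_read() are hypothetical names, and the seq_cst fences stand in for the paired smp_mb() calls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode; not the kernel's struct inode. */
struct demo_inode {
	_Atomic uint64_t version;	/* plays the role of i_version */
	_Atomic uint64_t data;		/* plays the role of the inode data */
};

/* Update side: modify the data first, then bump the version. */
static void demo_update(struct demo_inode *inode, uint64_t new_data)
{
	atomic_store_explicit(&inode->data, new_data, memory_order_relaxed);
	/* This barrier pairs with the barrier in demo_read(). */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_add_explicit(&inode->version, 1, memory_order_relaxed);
}

/* Read side: sample the version first, then read the data. */
static uint64_t demo_read(struct demo_inode *inode, uint64_t *data)
{
	uint64_t v = atomic_load_explicit(&inode->version,
					  memory_order_relaxed);
	/* This barrier pairs with the barrier in demo_update(). */
	atomic_thread_fence(memory_order_seq_cst);
	*data = atomic_load_explicit(&inode->data, memory_order_relaxed);
	return v;
}
```

If a reader ends up with stale data, the version it sampled predates the increment, so a later query returns a larger value.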
Honza
--
Jan Kara
SUSE Labs, CR
+0x2b7/0x3b0
> ? iomap_dio_zero+0x110/0x110
> iomap_apply+0xa4/0x110
> iomap_dio_rw+0x29e/0x3b0
> ? iomap_dio_zero+0x110/0x110
> ? xfs_file_dio_aio_read+0x7c/0x1a0 [xfs]
> xfs_file_dio_aio_read+0x7c/0x1a0 [xfs]
> xfs_file_read_iter+0xa0/0xc0 [xfs]
> __vfs_read+0xf9/0x170
> vfs_read+0xa6/0x150
> SyS_pread64+0x93/0xb0
> entry_SYSCALL_64_fastpath+0x1f/0x96
Honza
--
Jan Kara
SUSE Labs, CR
On Wed 13-12-17 07:39:30, Tejun Heo wrote:
> Hello,
>
> On Wed, Dec 13, 2017 at 12:00:04PM +0100, Jan Kara wrote:
> > OK, but this effectively prevents writeback from sync_inodes_sb() to ever
> > make inode switch wbs. Cannot that be abused in some way like making sure
>
> + size = MAX_HANDLE_SZ >> 2;
> >>
> >> - ret = exportfs_encode_inode_fh(inode, (struct fid
> >> *)f.handle.f_handle, &size, 0);
> >> + ret = exportfs_encode_inode_fh(inode, fhbuf, &size, 0);
> >> if ((ret == FILEID_I
On Mon 18-12-17 12:22:20, Jeff Layton wrote:
> On Mon, 2017-12-18 at 17:34 +0100, Jan Kara wrote:
> > On Mon 18-12-17 10:11:56, Jeff Layton wrote:
> > > static inline bool
> > > inode_maybe_inc_iversion(struct inode *inode, bool force)
> > > {
> >
_VERSION_QUERIED;
> + old = atomic64_cmpxchg(&inode->i_version, cur, new);
> + if (old == cur)
> + break;
> + cur = old;
> + }
Why not just use atomic64_or() here?
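To illustrate the question, here is a userspace sketch with C11 atomics comparing a compare-and-swap retry loop with a single atomic OR for setting a flag bit. DEMO_QUERIED and both helper names are hypothetical, and atomic_fetch_or() is only a stand-in for the kernel primitive; any extra logic the quoted loop carries outside the visible lines is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag bit; the real I_VERSION_QUERIED value is not shown. */
#define DEMO_QUERIED	((uint64_t)1)

/* Set the flag with a compare-and-swap retry loop. */
static uint64_t demo_set_flag_cmpxchg(_Atomic uint64_t *v)
{
	uint64_t cur = atomic_load(v);

	/* On failure, cur is refreshed with the current value and we retry. */
	while (!atomic_compare_exchange_weak(v, &cur, cur | DEMO_QUERIED))
		;
	return cur;	/* value observed just before the flag was set */
}

/* Set the flag with a single atomic OR; returns the previous value. */
static uint64_t demo_set_flag_or(_Atomic uint64_t *v)
{
	return atomic_fetch_or(v, DEMO_QUERIED);
}
```

Both helpers leave the same value behind; the OR variant simply has no retry path.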
Honza
--
Jan Kara
SUSE Labs, CR
> + if (dirty)
> iflags |= I_DIRTY_SYNC;
> __mark_inode_dirty(inode, iflags);
> return 0;
> @@ -1863,7 +1871,7 @@ int file_update_time(struct file *file)
> if (!timespec_equal(&inode->i_ctime, &now))
> sync_it |= S_CTIME;
>
> - if (IS_I_VERSION(inode))
> + if (IS_I_VERSION(inode) && inode_iversion_need_inc(inode))
> sync_it |= S_VERSION;
>
> if (!sync_it)
> --
> 2.14.3
>
--
Jan Kara
SUSE Labs, CR
On Wed 13-12-17 09:20:10, Jeff Layton wrote:
> From: Jeff Layton
>
> Signed-off-by: Jeff Layton
Looks good to me. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> fs/ocfs2/dir.c | 14 +++
On Wed 13-12-17 09:20:06, Jeff Layton wrote:
> From: Jeff Layton
>
> Signed-off-by: Jeff Layton
Looks good. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> fs/ext2/dir.c | 8
> fs/ext2/super.c |
On Fri 15-12-17 09:17:42, Yan, Zheng wrote:
> On Fri, Dec 15, 2017 at 12:53 AM, Jan Kara wrote:
> >> >
> >> > In this particular case I'm not sure why does ceph pass 'filp' into
> >> > readpage() / readpages() handler when it already
On Thu 14-12-17 22:30:26, Yan, Zheng wrote:
> On Thu, Dec 14, 2017 at 9:43 PM, Jan Kara wrote:
> > On Thu 14-12-17 18:55:27, Yan, Zheng wrote:
> >> We recently got an Oops report:
> >>
> >> BUG: unable to handle kernel NULL pointer dereference at (null)
>
to read/set
> + * current->journal_info.
> + */
> + old_journal_info = current->journal_info;
> + current->journal_info = NULL;
> +
> if (unlikely(is_vm_hugetlb_page(vma)))
> ret = hugetlb_fault(vma->vm_mm, vma, address, flags);
> else
> ret = __handle_mm_fault(vma, address, flags);
>
> + current->journal_info = old_journal_info;
> +
> if (flags & FAULT_FLAG_USER) {
> mem_cgroup_oom_disable();
> /*
> --
> 2.13.6
>
--
Jan Kara
SUSE Labs, CR
CONFIG_CGROUP_WRITEBACK
> struct radix_tree_root cgwb_tree; /* radix tree of active cgroup wbs */
> struct rb_root cgwb_congested_tree; /* their congested states */
> + struct rw_semaphore wb_switch_rwsem; /* no cgwb switch while syncing */
> #else
> struct bdi_writeback_congested *wb_congested;
> #endif
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -706,6 +706,7 @@ static int cgwb_bdi_init(struct backing_
>
> INIT_RADIX_TREE(&bdi->cgwb_tree, GFP_ATOMIC);
> bdi->cgwb_congested_tree = RB_ROOT;
> + init_rwsem(&bdi->wb_switch_rwsem);
>
> ret = wb_init(&bdi->wb, bdi, 1, GFP_KERNEL);
> if (!ret) {
--
Jan Kara
SUSE Labs, CR
Hello Byungchul,
On Tue 05-12-17 13:58:09, Byungchul Park wrote:
> On 12/4/2017 5:33 PM, Jan Kara wrote:
> >adding Peter and Byungchul to CC since the lockdep report just looks
> >strange and cross-release seems to be involved. Guys, how did #5 get into
> >the loc
m-r5
(none):~# stat /usr/share/terminfo/x/xterm-r5
File: `/usr/share/terminfo/x/xterm-r5' -> `/lib/terminfo/x/xterm-r5'
Size: 24 Blocks: 8 IO Block: 4096 symbolic link
Device: 6200h/25088dInode: 98027 Links: 1
Access: (0777/lrwxrwxrwx) Uid: (0/root) Gid: (0/root)
Access: 2017-12-04 16:27:29.0 +
Modify: 2006-05-19 21:12:53.0 +
Change: 2006-05-19 21:12:53.0 +
Honza
--
Jan Kara
SUSE Labs, CR
On Thu 30-11-17 20:05:48, Luis R. Rodriguez wrote:
> On Thu, Nov 30, 2017 at 06:13:10PM +0100, Jan Kara wrote:
> > ... I dislike the _by_user() suffix as there may be different places that
> > call freeze_super() (e.g. device mapper does this during some operations).
> >
On Wed 29-11-17 13:38:26, Chris Mason wrote:
> On 11/29/2017 12:05 PM, Tejun Heo wrote:
> >On Wed, Nov 29, 2017 at 09:03:30AM -0800, Tejun Heo wrote:
> >>Hello,
> >>
> >>On Wed, Nov 29, 2017 at 05:56:08PM +0100, Jan Kara wrote:
> >>>What has happene
le cleanup if freezing of all superblocks fails in the middle.
So I'm not 100% sure this works out nicely in the end. But it's certainly worth
a consideration.
Honza
--
Jan Kara
SUSE Labs, CR
active count management.
>
> This change has no functional changes.
>
> Suggested-by: Dave Chinner
> Signed-off-by: Luis R. Rodriguez
Looks good to me. You can add:
Reviewed-by: Jan Kara
Honz
g and active count management.
>
> This change has no functional changes.
>
> Suggested-by: Dave Chinner
> Signed-off-by: Luis R. Rodriguez
Looks good to me. You can add:
Reviewed-by: Jan Kara
Honza
his but also also captures any errors encountered.
>
> Signed-off-by: Luis R. Rodriguez
The patch looks good to me. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> fs/super.c | 91
>
k);
>
> if (tout)
> - freezable_schedule_timeout(msecs_to_jiffies(tout));
> + schedule_timeout(msecs_to_jiffies(tout));
>
> __set_current_state(TASK_RUNNING);
>
> - try_to_freeze();
> -
> tout = xfsaild_push(ailp);
> }
>
> --
> 2.15.0
>
--
Jan Kara
SUSE Labs, CR
> fs/ext2/inode.c |3 ++-
> fs/ext2/super.c |1 -
> fs/ext4/inode.c |5 -
> fs/ext4/super.c |2 --
> include/linux/backing-dev.h |2 +-
> include/linux/buffer_head.h |3 ++
> Signed-off-by: Tetsuo Handa
>
> Fixes: 1d3d4437eae1 ("vmscan: per-node deferred work")
>
> > Cc: Jan Kara
> > Cc: Michal Hocko
>
> From my very limited understanding of the code this looks
successfully, we can make sure all inodes disk usage
> can be accounted, which will be more reasonable.
>
> Suggested-by: Jan Kara
> Signed-off-by: Chao Yu
Thanks. Added to my tree.
Honza
> ---
>
l can
> coordinate revoking DMA access when the filesystem needs to truncate
> mappings.
>
> Reported-by: Jan Kara
> Cc: Mauro Carvalho Chehab
> Cc: linux-me...@vger.kernel.org
> Cc:
> Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Cc: Inki Dae
> Cc: Seung-Woo Kim
> Cc: Joonyoung Shim
> Cc: Kyungmin Park
> Cc: Mauro Carvalho Chehab
> Cc: linux-me...@vger.kernel.org
> Cc: Jan Kara
> Cc: Mel Gorman
> Cc: Vlastimil Babka
> Cc: Andrew Morton
> Cc:
> Fixes: 3565fce3a659 ("mm, x86:
to just pass
'flags' here. Other than that the patch looks good.
Honza
--
Jan Kara
SUSE Labs, CR
memory 4 cpus":
> >
> >make clean
> >echo 3 > drop_caches
> >time make -j4
>
> Maybe FS people will help you find a more representative workload. E.g.
> linear cache cold file read should be good as well. Maybe there are some
> tests in fstests
y to trigger in the production because small
> allocations do not fail usually.
>
> Debugged-by: Tetsuo Handa
> Signed-off-by: Michal Hocko
Looks good to me now. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> fs/
if (err) {
> spin_unlock(&sb_lock);
> + unregister_shrinker(&s->s_shrink);
> destroy_unused_super(s);
> return ERR_PTR(err);
> }
> @@ -518,7 +525,6 @@ struct super_block *sget_userns(struct file_system_type
> *type,
> hlist_add_head(&s->s_instances, &type->fs_supers);
> spin_unlock(&sb_lock);
> get_filesystem(type);
> - register_shrinker(&s->s_shrink);
> return s;
> }
>
> --
> 2.15.0
>
--
Jan Kara
SUSE Labs, CR
>files[type] = NULL;
> iput(inode);
This bail out path is not correct. You have to go through full quota off at
this point (dquot_disable() function) as some inodes had already quotas
initialized and can be using them...
Honza
--
Jan Kara
SUSE Labs, CR
> the crash.
> >
> > Programs can be found here: https://pastebin.com/RYGtNn3z
> >
> > Stack trace here: https://pastebin.com/SaJXWMg3
> >
> > We don't have a C reproducer but we will send one if we have it.
> >
> > Regards,
> > Shankara
>
--
Jan Kara
SUSE Labs, CR
>i_version++;
> inode->i_mtime = inode->i_ctime = current_time(inode);
> mark_inode_dirty(inode);
> return len - towrite;
> --
> 2.13.6
>
--
Jan Kara
SUSE Labs, CR
On Wed 15-11-17 01:32:16, Yang Shi wrote:
>
>
> On 11/14/17 1:39 AM, Michal Hocko wrote:
> >On Tue 14-11-17 03:10:22, Yang Shi wrote:
> >>
> >>
> >>On 11/9/17 5:54 AM, Michal Hocko wrote:
> >>>[Sorry for the late reply]
> >>>
>
On Tue 14-11-17 11:43:49, Chao Yu wrote:
> On 2017/11/13 17:18, Jan Kara wrote:
> > On Mon 13-11-17 11:31:48, Chao Yu wrote:
> >> Commit 6184fc0b8dd7 ("quota: Propagate error from ->acquire_dquot()")
> >> missed to handle error from dquot_initialize in dquot
hat tree?
Sorry, I forgot you were still fetching it. No, there's no need to fetch
that branch anymore (it was a branch for one time work and I've deleted it
now since it was untouched for a year or so). Thanks!
Honza
--
Jan Kara
SUSE Labs, CR
+ error = dquot_initialize(inode);
> return error;
> }
> EXPORT_SYMBOL(dquot_file_open);
> --
> 2.15.0.55.gc2ece9dc4de6
>
>
--
Jan Kara
SUSE Labs, CR
ception
>
> Attached the full dmesg and kconfig.
OK, I assume this is still valid even though the previous KASAN report need not
be? I'm not sure whether this could be inotify related though... Possibly a
double-free could trigger this in SLOB, but then we should see issues with
SLAB or SLUB as well.
Honza
--
Jan Kara
SUSE Labs, CR
s good to me. You can add:
Reviewed-by: Jan Kara
Honza Kara
> ---
> Changes from v3:
> - s/waiters/used_lists, more doc around the counter.
> - fixed racy scenario when the list empty/non-empty
> condition ch
ou'd need to have a completely separate set of locking classes for
each filesystem to avoid false positives like these. And that would
increase number of classes lockdep has to handle significantly. So I'm not
sure it's really worth it...
Honza
--
Jan Kara
SUSE Labs, CR
On Fri 03-11-17 01:04:45, Wang Long wrote:
> The parameter `struct bdi_writeback *wb` is not used in the function
> body, so we just remove it.
>
> Signed-off-by: Wang Long
Looks good. You can add:
Reviewed-
sole_lock owner.
> + */
> + mutex_release(&console_lock_dep_map, 1, _THIS_IP_);
> + printk_safe_exit_irqrestore(flags);
> + /* Note, if waiter is set, logbuf_lock is not held */
> + return;
> + }
> +
> console_locked = 0;
>
> /* Release the exclusive_console once it is used */
--
Jan Kara
SUSE Labs, CR
On Wed 01-11-17 00:44:18, Yang Shi wrote:
> On 10/31/17 3:12 AM, Jan Kara wrote:
> >On Tue 31-10-17 00:39:58, Yang Shi wrote:
> >>On 10/30/17 5:43 AM, Jan Kara wrote:
> >>>On Sat 28-10-17 02:22:18, Yang Shi wrote:
> >>>>If some process generates e
struct dlock_list_node *node,
> {
> struct dlock_list_head *head = &dlist->heads[this_cpu_read(cpu2idx)];
>
> - /*
> - * There is no need to disable preemption
> - */
> - spin_lock(&head->lock);
> - node->head = head;
> - list_add(&node->list, &head->list);
> - spin_unlock(&head->lock);
> + dlock_list_add(node, head);
> }
> EXPORT_SYMBOL(dlock_lists_add);
>
> --
> 1.8.3.1
>
>
--
Jan Kara
SUSE Labs, CR
t on performance. It
> also improves dlock list iteration performance as fewer lists need
> to be iterated.
>
> Signed-off-by: Waiman Long
The patch looks good to me. You can add:
Reviewed-by: Jan Kara
char length [ISODCL (1, 1)]; /* 711 */
> - char ext_attr_length[ISODCL (2, 2)]; /* 711 */
> - char extent [ISODCL (3, 10)]; /* 733 */
> - char size [ISODCL (11, 18)]; /* 733 */
> - char date [ISODCL (19, 25)]; /* 7 by 711 */
> - char flags [ISODCL (26, 26)];
> - char file_unit_size [ISODCL (27, 27)]; /* 711 */
> - char interleave [ISODCL (28, 28)]; /* 711 */
> - char volume_sequence_number [ISODCL (29, 32)]; /* 723 */
> - unsigned char name_len [ISODCL (33, 33)]; /* 711 */
> + __u8 length [ISODCL (1, 1)]; /* 711 */
> + __u8 ext_attr_length[ISODCL (2, 2)]; /* 711 */
> + __u8 extent [ISODCL (3, 10)]; /* 733 */
> + __u8 size [ISODCL (11, 18)]; /* 733 */
> + __u8 date [ISODCL (19, 25)]; /* 7 by 711 */
> + __u8 flags [ISODCL (26, 26)];
> + __u8 file_unit_size [ISODCL (27, 27)]; /* 711 */
> + __u8 interleave [ISODCL (28, 28)]; /* 711 */
> + __u8 volume_sequence_number [ISODCL (29, 32)]; /* 723 */
> + __u8 name_len [ISODCL (33, 33)]; /* 711 */
> char name [0];
> } __attribute__((packed));
>
> --
> 2.9.0
>
--
Jan Kara
SUSE Labs, CR
On Thu 19-10-17 17:29:12, Arnd Bergmann wrote:
> On Thu, Oct 19, 2017 at 5:17 PM, Jan Kara wrote:
> > On Thu 19-10-17 16:47:48, Arnd Bergmann wrote:
> >> isofs uses a 'char' variable to load the number of years since
> >> 1900 for an inode timestamp. On arch
On Tue 31-10-17 13:51:40, Amir Goldstein wrote:
> On Tue, Oct 31, 2017 at 12:50 PM, Jan Kara wrote:
> > On Sun 22-10-17 11:24:17, Amir Goldstein wrote:
> >> But I think there is another problem, not introduced by your change, but
> >> could
> >> be amplified b
On Tue 31-10-17 13:02:21, Amir Goldstein wrote:
> On Tue, Oct 31, 2017 at 11:54 AM, Jan Kara wrote:
> > On Mon 30-10-17 21:18:09, Miklos Szeredi wrote:
> >> On Mon, Oct 30, 2017 at 6:27 PM, Jan Kara wrote:
> >> > On Fri 27-10-17 13:53:20, Jan Kara wrote:
> >>
that found and may be using this mark. */
> - atomic_t refcnt;
> + refcount_t refcnt;
> /* Group this mark is for. Set on mark creation, stable until last ref
>* is dropped */
> struct fsnotify_group *group;
> diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> index 011d46e..45ec960 100644
> --- a/kernel/audit_tree.c
> +++ b/kernel/audit_tree.c
> @@ -1007,7 +1007,7 @@ static void audit_tree_freeing_mark(struct
> fsnotify_mark *entry, struct fsnotify
>* We are guaranteed to have at least one reference to the mark from
>* either the inode or the caller of fsnotify_destroy_mark().
>*/
> - BUG_ON(atomic_read(&entry->refcnt) < 1);
> + BUG_ON(refcount_read(&entry->refcnt) < 1);
> }
>
> static const struct fsnotify_ops audit_tree_ops = {
> --
> 2.7.4
>
--
Jan Kara
SUSE Labs, CR
inotify_init() and the refcnt will hit 0 only when that fd has been
>* closed.
>*/
> - atomic_t refcnt;/* things with interest in this group */
> + refcount_t refcnt; /* things with interest in this group */
>
> const struct fsnotify_ops *ops; /* how this group handles things */
>
> --
> 2.7.4
>
--
Jan Kara
SUSE Labs, CR
n.c
> +++ b/fs/notify/notification.c
> @@ -111,7 +111,8 @@ int fsnotify_add_event(struct fsnotify_group *group,
> return 2;
> }
>
> - if (group->q_len >= group->max_events) {
> + if (group->q_len >= group->max_events ||
> + event == group->overflow_event) {
> ret = 2;
> /* Queue overflow event only if it isn't already queued */
> if (!list_empty(&group->overflow_event->list)) {
> --
> 2.7.4
>
--
Jan Kara
SUSE Labs, CR
On Tue 31-10-17 00:39:58, Yang Shi wrote:
> On 10/30/17 5:43 AM, Jan Kara wrote:
> >On Sat 28-10-17 02:22:18, Yang Shi wrote:
> >>If some process generates events into a huge or unlimit event queue, but no
> >>listener read them, they may consume significant amount of me
On Mon 30-10-17 21:18:09, Miklos Szeredi wrote:
> On Mon, Oct 30, 2017 at 6:27 PM, Jan Kara wrote:
> > On Fri 27-10-17 13:53:20, Jan Kara wrote:
> >> On Wed 25-10-17 16:31:39, Miklos Szeredi wrote:
> >> > On Wed, Oct 25, 2017 at 10:41 AM, Miklos Szeredi
> >>
On Fri 27-10-17 13:53:20, Jan Kara wrote:
> On Wed 25-10-17 16:31:39, Miklos Szeredi wrote:
> > On Wed, Oct 25, 2017 at 10:41 AM, Miklos Szeredi
> > wrote:
> > > We discovered some problems in the latest fsnotify/fanotify codebase with
> > > the help of a stre
ess_list);
> -#endif
> + if (IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS)) {
> + init_waitqueue_head(&group->fanotify_data.access_waitq);
> + INIT_LIST_HEAD(&group->fanotify_data.access_list);
> + }
When having space for these allocated, just initialize them properly.
Otherwise it's asking for trouble.
Honza
--
Jan Kara
SUSE Labs, CR
On Mon 30-10-17 14:42:11, Miklos Szeredi wrote:
> On Mon, Oct 30, 2017 at 2:34 PM, Jan Kara wrote:
> > On Wed 25-10-17 10:41:34, Miklos Szeredi wrote:
> >> We may fail to pin one of the marks in fsnotify_prepare_user_wait() when
> >> dropping the srcu read lock, resulti
= srcu_dereference(inode_node->next,
> &fsnotify_mark_srcu);
> +skip_vfsmount:
> if (vfsmount_group)
> vfsmount_node = srcu_dereference(vfsmount_node->next,
>&fsnotify_mark_srcu);
> --
> 2.5.5
>
--
Jan Kara
SUSE Labs, CR
}
>
> - iter_info.inode_mark = inode_mark;
> - iter_info.vfsmount_mark = vfsmount_mark;
> -
> ret = send_to_group(to_tell, inode_mark, vfsmount_mark, mask,
> data, data_is, cookie, file_name,
> &iter_info);
> --
> 2.5.5
>
--
Jan Kara
SUSE Labs, CR
know what it is doing.
So maybe we could come up with some better way to control amount of
resources consumed by notification events but for that we lack more
information about your use case. And I maintain that the solution should
account events to the consumer, not the producer...
g
Looks good to me. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> include/linux/dlock-list.h | 28 +---
> 1 file changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/d
implement mechanism to block truncate while there are
short term references pending (and for that retry loops would be IMHO
acceptable). And then we can work on a mechanism to notify userspace that
it needs to drop references to blocks that are going to be truncated so
that we can re-enable taking of long term references.
Honza
[1]
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1522887.html
--
Jan Kara
SUSE Labs, CR
ve a close
look. I'll try to check it early next week and pick it up to my tree.
Also thanks Amir for reviewing Miklos' patches!
Honza
--
Jan Kara
SUSE Labs, CR
ks either. So we are
back at a situation where we need to detach blocks from the inode and then
wait for page refs to be dropped - so some form of busy extents. Am I
missing something?
Honza
--
Jan Kara
SUSE Labs, CR
nel.org
> Signed-off-by: Arnd Bergmann
...
> -int iso_date(char * p, int flag)
> +int iso_date(u8 *p, int flag)
> {
> int year, month, day, hour, minute, second, tz;
> int crtime;
>
> - year = p[0];
> + year = (int)(u8)p[0];
The cast seems unnecessa
e_node and as it no longer requires a mapping, the private
> field is removed.
>
> Signed-off-by: Mel Gorman
> Acked-by: Johannes Weiner
The patch looks good to me. You can add:
Reviewed-by: Jan Kara
Honza
@@ -1650,8 +1650,8 @@ int wait_on_node_pages_writeback(struct f2fs_sb_info
> *sbi, nid_t ino)
>
> pagevec_init(&pvec, 0);
>
> - while (nr_pages = pagevec_lookup_tag(&pvec, NODE_MAPPING(sbi), &index,
> - PAGECACHE_TAG_WRITEBACK)) {
> + while ((nr_pages = pagevec_lookup_tag(&pvec, NODE_MAPPING(sbi), &index,
> + PAGECACHE_TAG_WRITEBACK))) {
> int i;
>
> for (i = 0; i < nr_pages; i++) {
> --
> 2.9.0
>
--
Jan Kara
SUSE Labs, CR
On Mon 16-10-17 15:59:13, Kees Cook wrote:
> In preparation for unconditionally passing the struct timer_list pointer to
> all timer callbacks, switch to using the new timer_setup() and from_timer()
> to pass the timer pointer explicitly.
>
> Cc: Andrew Morton
> Cc: Jan Ka
s' command failing with EIO.
>
> * FIBMAP on a file block located above 0x7FFF can return a negative
> value. The low 32 bits are correct, but applications that don't mask the
> high 32 bits of the result can perform incorrectly.
>
> Per suggestion by Jan Kara,
> Cc: Andreas Dilger
> Cc: linux-e...@vger.kernel.org
> Cc: Thomas Gleixner
> Signed-off-by: Kees Cook
The patch looks good. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> This requires commit 686fef928bba ("
se cases.
But it would be rather large overhaul of the code so it may be a bit out of
scope for these improvements...
> @@ -409,8 +445,8 @@ void truncate_inode_pages_range(struct address_space
> *mapping,
> }
>
> if (radix_tree_exceptional_entry(page)) {
> - truncate_exceptional_entry(mapping, index,
> -page);
> + if (ei != PAGEVEC_SIZE)
> + ei = i;
This should be ei == PAGEVEC_SIZE I think.
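A small standalone sketch of why the comparison matters, assuming (this is an inference from the quoted hunk, not something the patch text states) that ei is meant to remember the index of the first exceptional entry and that PAGEVEC_SIZE acts as the "nothing recorded yet" sentinel. DEMO_VEC_SIZE and demo_first_exceptional() are hypothetical names.

```c
#include <assert.h>

#define DEMO_VEC_SIZE 15	/* hypothetical stand-in for PAGEVEC_SIZE */

/*
 * Return the index of the first exceptional slot, or DEMO_VEC_SIZE if
 * there is none. is_exceptional[i] != 0 marks slot i as exceptional.
 */
static int demo_first_exceptional(const int *is_exceptional, int nr)
{
	int ei = DEMO_VEC_SIZE;	/* sentinel: nothing recorded yet */
	int i;

	assert(nr <= DEMO_VEC_SIZE);
	for (i = 0; i < nr; i++) {
		if (!is_exceptional[i])
			continue;
		/* Record only the first hit; "!=" would never record one. */
		if (ei == DEMO_VEC_SIZE)
			ei = i;
	}
	return ei;
}
```

With "!=" the sentinel would never be replaced, so the first exceptional index would be lost.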
Otherwise the patch looks good to me so feel free to add:
Reviewed-by: Jan Kara
Honza
--
Jan Kara
SUSE Labs, CR
ax_mapping(mapping) || shmem_mapping(mapping))
> - return;
> -
Hum, we don't need to pass 'mapping' from call sites then? Either pass NULL
or just remove the argument completely since nobody needs it anymore...
Otherwise the patch looks good.
Honza
--
Jan Kara
SUSE Labs, CR
On Tue 10-10-17 22:30:30, Steve Magnani wrote:
> Jan -
>
> On 10/10/2017 02:33 AM, Jan Kara wrote:
> >On Mon 09-10-17 10:04:52, Steve Magnani wrote:
> >
> >...the patch seems to be mixing two changes into one which I'd prefer to be
> > separate patches:
>
On Tue 10-10-17 08:54:40, Tejun Heo wrote:
> Implement submit_bh_blkcg_css() which will be used to override cgroup
> membership on specific buffer_heads.
>
> v2: Reimplemented using create_bh_bio() as suggested by Jan.
>
> Signed-off-by: Tejun Heo
> Cc: Jan Kara
> Cc:
As bio can now be manipulated before submitted, we can move out @wbc
> handling into submit_bh_wbc() and similarly this will make adding more
> submit_bh variants straight-forward.
>
> This patch is pure refactoring and doesn't cause any functional
> changes.
>
> Signed
On Tue 10-10-17 08:54:37, Tejun Heo wrote:
> Export blkcg_root_css so that filesystem modules can use it.
>
> Signed-off-by: Tejun Heo
Looks good. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> block/blk-cgro
On Wed 04-10-17 17:52:50, Kees Cook wrote:
> In preparation for unconditionally passing the struct timer_list pointer to
> all timer callbacks, switch to using the new timer_setup() and from_timer()
> to pass the timer pointer explicitly.
>
> Cc: Petr Vandrovec
> Cc: Jan Kara
On Wed 04-10-17 17:48:46, Kees Cook wrote:
> In preparation for unconditionally passing the struct timer_list pointer to
> all timer callbacks, switch to using the new timer_setup() and from_timer()
> to pass the timer pointer explicitly.
>
> Cc: "Theodore Ts'o"
On Tue 10-10-17 17:14:48, Yafang Shao wrote:
> 2017-10-10 16:48 GMT+08:00 Jan Kara :
> > On Tue 10-10-17 16:00:29, Yafang Shao wrote:
> >> 2017-10-10 6:42 GMT+08:00 Andrew Morton :
> >> > On Sat, 7 Oct 2017 06:58:04 +0800 Yafang Shao
> >> > wrote:
>
ut() to kick off
>* regular writeback instead of writing things out itself.
>*/
> - if (wbc->wb)
> - bio_associate_blkcg(bio, wbc->wb->blkcg_css);
> + if (wbc->blkcg_css)
> + bio_associate_blkcg(bio, wbc->blkcg_css);
> }
>
> #else/* CONFIG_CGROUP_WRITEBACK */
> --
> 2.9.5
>
--
Jan Kara
SUSE Labs, CR
>
> * btrfs sets the new flag in btrfs_update_iflags() function. Note
> that this automatically excludes btree_inode which doesn't use
> btrfs_update_iflags() during initialization. This is an intended
> behavior change.
>
> Signed-off-by: Tejun Heo
> Cc: Jan Kara
Maybe we'd better call wb_wakeup_delayed(wb) here to bypass the
> bdi_has_dirty_io() check ?
Well, wb_wakeup_delayed() would be more appropriate but you'd then have to
iterate over all bdis and wbs to be able to call it which IMO isn't worth
the pain for a special case like this. But the decision is worth mentioning
in the comment. Also wakeup_flusher_threads() does in principle what you
need - see my reply to Andrew for details.
Honza
--
Jan Kara
SUSE Labs, CR
a strange thing to do).
I guess to prevent busylooping? But I'm not sure...
> (and what happens if the interval was set to 1 hour and the user
> rewrites that to 1 second? Does that change take 1 hour to take
> effect?)
That's a good point I didn't think about. So probably we should do the
wakeup whenever dirty_writeback_interval changes.
Honza
--
Jan Kara
SUSE Labs, CR
> - udf_debug("bit %ld already set\n", bit + i);
> + udf_debug("bit %lu already set\n", bit + i);
This change looks wrong: bit and i are signed. However, they are ints, not
longs, so the format specifier does indeed need fixing.
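As a userspace illustration of the point (using snprintf() rather than udf_debug(), and a hypothetical helper name), the conversion that matches an int expression is "%d":

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the message with the specifier that matches an int argument. */
static int demo_format_bit(char *buf, size_t len, int bit, int i)
{
	/* bit + i has type int, so "%d" is the matching conversion. */
	return snprintf(buf, len, "bit %d already set\n", bit + i);
}
```

Passing an int where "%ld" or "%lu" is expected is a type mismatch that compilers with format checking warn about.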
Honza
--
Jan Kara
SUSE Labs, CR
if (list_empty(&iter->head[iter->index].list))
Why do these two not need treatment similar to alloc_dlist_heads()?
Honza
--
Jan Kara
SUSE Labs, CR
nto some object and applied it only after the last
> possible failure exit. The entire "restore the original state" logics
> would go away...
Well, it's not like the restore logic would be that difficult for ext2. But
I agree that running the whole parsing logic under a spinlock is
unnecessary and accumulating all the changes in one structure and then
applying them looks like a cleaner way to go. I'll look into that.
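The "accumulate, then apply" approach can be sketched roughly as below. This is not ext2's parser: struct demo_opts, the helper names and the token set are illustrative assumptions only.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical option set; not ext2's real option structure. */
struct demo_opts {
	bool grpid;
	bool debug;
};

/* Parse one token into *opts; return false for an unknown token. */
static bool demo_parse_one(struct demo_opts *opts, const char *tok)
{
	if (strcmp(tok, "grpid") == 0)
		opts->grpid = true;
	else if (strcmp(tok, "nogrpid") == 0)
		opts->grpid = false;
	else if (strcmp(tok, "debug") == 0)
		opts->debug = true;
	else
		return false;
	return true;
}

/*
 * Parse all tokens into a scratch copy and commit it only if every
 * token parsed. On failure *live is untouched, so no restore is needed.
 */
static bool demo_remount(struct demo_opts *live,
			 const char *const *toks, int n)
{
	struct demo_opts scratch = *live;
	int i;

	for (i = 0; i < n; i++)
		if (!demo_parse_one(&scratch, toks[i]))
			return false;
	/* Only this final assignment would need to run under the lock. */
	*live = scratch;
	return true;
}
```

Because parsing works on a private copy, a failure part-way through leaves the live state as it was, and only the commit step needs any locking.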
Honza
--
Jan Kara
SUSE Labs, CR
nodes_sb() which is the
> only caller. Also change return type of try_to_writeback_inodes_sb to
> void as the only user ext4 doesn't care.
>
> Signed-off-by: Rakesh Pandit
Looks good. You can add:
Reviewed-by: Jan Kara
On Mon 09-10-17 18:44:23, Yafang Shao wrote:
> 2017-10-09 17:56 GMT+08:00 Jan Kara :
> > On Sat 07-10-17 06:58:04, Yafang Shao wrote:
> >> After disable periodic writeback by writing 0 to
> >> dirty_writeback_centisecs, the handler wb_workfn() will not be
> >
s has some
changes queued in linux-block tree in this area so your change won't apply.
So please base your changes on his tree.
Honza
--
Jan Kara
SUSE Labs, CR
user is still using the disk.
> - */
> -void laptop_io_completion(struct backing_dev_info *info)
> -{
> - mod_timer(&info->laptop_mode_wb_timer, jiffies + laptop_mode);
> -}
> -
> -/*
> - * We're in laptop mode and we've just synced. The sync's writes will have
> - * caused another writeback to be scheduled by laptop_io_completion.
> - * Nothing needs to be written back anymore, so we unschedule the writeback.
> - */
> -void laptop_sync_completion(void)
> -{
> - struct backing_dev_info *bdi;
> -
> - rcu_read_lock();
> -
> - list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
> - del_timer(&bdi->laptop_mode_wb_timer);
> -
> - rcu_read_unlock();
> -}
> -#endif
> -
> /*
> * If ratelimit_pages is too high then we can get into dirty-data overload
> * if a large number of processes all perform writes at the same time.
> --
> 2.14.1
>
--
Jan Kara
SUSE Labs, CR
On Thu 05-10-17 10:57:07, Waiman Long wrote:
> On 10/05/2017 04:59 AM, Jan Kara wrote:
> > On Wed 04-10-17 17:20:05, Waiman Long wrote:
> >> int alloc_dlock_list_heads(struct dlock_list_heads *dlist)
> >> {
> >> - int idx;
> >> + int idx, cnt =
in include/linux/list_bl.h. Sure it's a tradeoff between bitlock /
spinlock but is there a user where it matters?
Honza
--
Jan Kara
SUSE Labs, CR
ock_lists is initialized? But how can the dlist be used later
when it has larger number of lists and you don't know how many?
Honza
--
Jan Kara
SUSE Labs, CR
ust for this purpose.
>
> After this change, we truly only ever have one of them running at
> any point in time. We mark the need to start all flushes, and the
> writeback thread will clear it once it has processed the request.
>
> Signed-off-by: Jens Axboe
Just one nit below. You c
file ABI obsolete notice, and
> the sysfs file.
>
> Signed-off-by: Jens Axboe
Agreed. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> Documentation/ABI/obsolete/proc-sys-vm-nr_pdflush_thr
On Tue 03-10-17 08:36:16, Jens Axboe wrote:
> On 10/03/2017 06:25 AM, Jan Kara wrote:
> > On Tue 03-10-17 14:10:49, Jan Kara wrote:
> >> On Wed 27-09-17 14:13:49, Jens Axboe wrote:
> >>> We currently it it for find_or_create_page(), which means that it
> >>
o the last user of free_more_memory(), kill
> it off completely.
>
> Signed-off-by: Jens Axboe
Looks good. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> fs/buffer.c | 23 ---
> 1 file change
y cleaning" writeback, I agree that
range_cyclic probably makes more sense. You can add:
Reviewed-by: Jan Kara
Honza
> ---
> fs/fs-writeback.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>
On Tue 03-10-17 14:10:49, Jan Kara wrote:
> On Wed 27-09-17 14:13:49, Jens Axboe wrote:
> > We currently it it for find_or_create_page(), which means that it
> > cannot fail. Ensure we also pass in 'retry == true' to
> > alloc_page_buffers(), which also ensure that