Re: bad rss-counter message in 3.14rc5

2014-03-20 Thread Hugh Dickins
On Thu, 20 Mar 2014, Sasha Levin wrote: > On 03/20/2014 09:51 AM, Dave Jones wrote: > > On Wed, Mar 19, 2014 at 10:00:29PM -0700, Hugh Dickins wrote: > > > > > > This might be collateral damage from the swapops thing, I guess we > > won't know until > > > > that gets fixed, but I thought I'd m

Re: bad rss-counter message in 3.14rc5

2014-03-20 Thread Sasha Levin
On 03/20/2014 09:51 AM, Dave Jones wrote: On Wed, Mar 19, 2014 at 10:00:29PM -0700, Hugh Dickins wrote: > > This might be collateral damage from the swapops thing, I guess we won't know until > > that gets fixed, but I thought I'd mention that we might still have a problem here. > > Ye

Re: bad rss-counter message in 3.14rc5

2014-03-20 Thread Dave Jones
On Wed, Mar 19, 2014 at 10:00:29PM -0700, Hugh Dickins wrote: > > This might be collateral damage from the swapops thing, I guess we won't > > know until > > that gets fixed, but I thought I'd mention that we might still have a > > problem here. > > Yes, those Bad rss-counters could well

Re: bad rss-counter message in 3.14rc5

2014-03-19 Thread Hugh Dickins
On Wed, 19 Mar 2014, Dave Jones wrote: > On Tue, Mar 18, 2014 at 07:19:09PM -0700, Hugh Dickins wrote: > > > Another positive on the rss counters, great, thanks Dave. > > That encourages me to think again on the swapops BUG, but no promises. > > So while I slept I ran a test kernel with that sw

Re: bad rss-counter message in 3.14rc5

2014-03-19 Thread Dave Jones
On Tue, Mar 18, 2014 at 07:19:09PM -0700, Hugh Dickins wrote: > Another positive on the rss counters, great, thanks Dave. > That encourages me to think again on the swapops BUG, but no promises. So while I slept I ran a test kernel with that swapops BUG replaced with a printk. I'm not sure of

Re: bad rss-counter message in 3.14rc5

2014-03-19 Thread Cyrill Gorcunov
On Tue, Mar 18, 2014 at 05:38:38PM -0700, Hugh Dickins wrote: > > (Cyrill, entirely unrelated, but in preparing this patch I noticed > your soft_dirty work in install_file_pte(): which looked good at > first, until I realized that it's propagating the soft_dirty of a > pte it's about to zap comple

Re: bad rss-counter message in 3.14rc5

2014-03-19 Thread Jan Kara
On Tue 18-03-14 19:37:01, Hugh Dickins wrote: > On Tue, 18 Mar 2014, Linus Torvalds wrote: > > On Tue, Mar 18, 2014 at 7:06 PM, Hugh Dickins wrote: > > > > > > I'd love that, if we can get away with it now: depends very > > > much on whether we then turn out to break userspace or not. > > > > Rig

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Linus Torvalds
On Tue, Mar 18, 2014 at 7:37 PM, Hugh Dickins wrote: > > For 3.15, and probably 3.16 too, we should keep in place whatever > partial accommodations we have for the case (such as allowing for > anon and swap in fremap's zap_pte), in case we do need to revert; > but clean those away later on. (Not

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Sasha Levin
On 03/18/2014 10:12 PM, Hugh Dickins wrote: On Tue, 18 Mar 2014, Sasha Levin wrote: On 03/18/2014 08:38 PM, Hugh Dickins wrote: On Tue, 11 Mar 2014, Dave Jones wrote: On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote: > On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrot

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Hugh Dickins
On Tue, 18 Mar 2014, Linus Torvalds wrote: > On Tue, Mar 18, 2014 at 7:06 PM, Hugh Dickins wrote: > > > > I'd love that, if we can get away with it now: depends very > > much on whether we then turn out to break userspace or not. > > Right. I suspect we can, though, but it's one of those "we can

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Linus Torvalds
On Tue, Mar 18, 2014 at 7:06 PM, Hugh Dickins wrote: > > I'd love that, if we can get away with it now: depends very > much on whether we then turn out to break userspace or not. Right. I suspect we can, though, but it's one of those "we can try it and see". Remind me early in the 3.15 merge wind

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Hugh Dickins
On Tue, 18 Mar 2014, Dave Jones wrote: > On Tue, Mar 18, 2014 at 10:06:02PM -0400, Dave Jones wrote: > > On Tue, Mar 18, 2014 at 09:32:36PM -0400, Sasha Levin wrote: > > > > > > Untested patch below: I can't quite say Reported-by, because it may > > > > not even be one that you and Sasha ha

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Hugh Dickins
On Tue, 18 Mar 2014, Sasha Levin wrote: > On 03/18/2014 08:38 PM, Hugh Dickins wrote: > > On Tue, 11 Mar 2014, Dave Jones wrote: > > > On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote: > > > > On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote: > > > > > > > > > > > >

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Dave Jones
On Tue, Mar 18, 2014 at 10:06:02PM -0400, Dave Jones wrote: > On Tue, Mar 18, 2014 at 09:32:36PM -0400, Sasha Levin wrote: > > > > Untested patch below: I can't quite say Reported-by, because it may > > > not even be one that you and Sasha have been seeing; but I'm hopeful, > > > remap_fi

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Hugh Dickins
On Tue, 18 Mar 2014, Linus Torvalds wrote: > On Tue, Mar 18, 2014 at 5:38 PM, Hugh Dickins wrote: > > > > And yes, it is possible (though very unusual) to find an anon page or > > swap entry in a VM_SHARED nonlinear mapping: coming from that horrid > > get_user_pages(write, force) case which COWs

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Dave Jones
On Tue, Mar 18, 2014 at 09:32:36PM -0400, Sasha Levin wrote: > > Untested patch below: I can't quite say Reported-by, because it may > > not even be one that you and Sasha have been seeing; but I'm hopeful, > > remap_file_pages is in the list. > > > > Please give this a try, preferably on 3.

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Sasha Levin
On 03/18/2014 08:38 PM, Hugh Dickins wrote: On Tue, 11 Mar 2014, Dave Jones wrote: On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote: > On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote: > > > > > > Dave, iirc trinity can write log file pointing which exactly sysca

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Linus Torvalds
On Tue, Mar 18, 2014 at 5:38 PM, Hugh Dickins wrote: > > And yes, it is possible (though very unusual) to find an anon page or > swap entry in a VM_SHARED nonlinear mapping: coming from that horrid > get_user_pages(write, force) case which COWs even in a shared mapping. Hmm. Maybe we could just d

Re: bad rss-counter message in 3.14rc5

2014-03-18 Thread Hugh Dickins
On Tue, 11 Mar 2014, Dave Jones wrote: > On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote: > > On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote: > > > > > > > > Dave, iirc trinity can write log file pointing which exactly syscall > sequence > > > > was passed, righ

Re: bad rss-counter message in 3.14rc5

2014-03-14 Thread Cyrill Gorcunov
On Tue, Mar 11, 2014 at 01:39:17PM -0400, Dave Jones wrote: > > > > Sasha already gave a link to the syscalls sequence, so no rush. > > It'd be nice to get a more concise reproducer, his list had a little of > everything in there. Dave, could you please send me your config privately so I woul

Re: bad rss-counter message in 3.14rc5

2014-03-11 Thread Dave Jones
On Tue, Mar 11, 2014 at 09:36:03PM +0400, Cyrill Gorcunov wrote: > On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote: > > > > > > Dave, iirc trinity can write log file pointing which exactly syscall > > sequence > > > was passed, right? Share it too please. > > > > Hm, I may h

Re: bad rss-counter message in 3.14rc5

2014-03-11 Thread Cyrill Gorcunov
On Tue, Mar 11, 2014 at 01:10:45PM -0400, Dave Jones wrote: > > > > Dave, iirc trinity can write log file pointing which exactly syscall > sequence > > was passed, right? Share it too please. > > Hm, I may have been mistaken, and the damage was done by a previous run. > I went from being able

Re: bad rss-counter message in 3.14rc5

2014-03-11 Thread Dave Jones
On Tue, Mar 11, 2014 at 06:37:50PM +0400, Cyrill Gorcunov wrote: > > > After reading some more, I suppose the idea I had is wrong, > > investigating. > > > Will ping if I find something. > > > > I can rule it out anyway, I can reproduce this by telling trinity to do > > nothing > > ot

Re: bad rss-counter message in 3.14rc5

2014-03-11 Thread Sasha Levin
On 03/11/2014 10:37 AM, Cyrill Gorcunov wrote: On Tue, Mar 11, 2014 at 10:28:17AM -0400, Dave Jones wrote: On Tue, Mar 11, 2014 at 05:41:58PM +0400, Cyrill Gorcunov wrote: > On Tue, Mar 11, 2014 at 09:23:05AM -0400, Sasha Levin wrote: > > >> > > >>Ok, with move_pages excluded it still oops

Re: bad rss-counter message in 3.14rc5

2014-03-11 Thread Cyrill Gorcunov
On Tue, Mar 11, 2014 at 10:28:17AM -0400, Dave Jones wrote: > On Tue, Mar 11, 2014 at 05:41:58PM +0400, Cyrill Gorcunov wrote: > > On Tue, Mar 11, 2014 at 09:23:05AM -0400, Sasha Levin wrote: > > > >> > > > >>Ok, with move_pages excluded it still oopses. > > > > > > > >Dave, is it possible to

Re: bad rss-counter message in 3.14rc5

2014-03-11 Thread Dave Jones
On Tue, Mar 11, 2014 at 05:41:58PM +0400, Cyrill Gorcunov wrote: > On Tue, Mar 11, 2014 at 09:23:05AM -0400, Sasha Levin wrote: > > >> > > >>Ok, with move_pages excluded it still oopses. > > > > > >Dave, is it possible to somehow figure out was someone reading pagemap > > >file > > >at mome

Re: bad rss-counter message in 3.14rc5

2014-03-11 Thread Cyrill Gorcunov
On Tue, Mar 11, 2014 at 09:23:05AM -0400, Sasha Levin wrote: > >> > >>Ok, with move_pages excluded it still oopses. > > > >Dave, is it possible to somehow figure out was someone reading pagemap file > >at moment of the bug triggering? > > We can sprinkle printk()s wherever might be useful, might n

Re: bad rss-counter message in 3.14rc5

2014-03-11 Thread Sasha Levin
On 03/11/2014 09:20 AM, Cyrill Gorcunov wrote: On Tue, Mar 11, 2014 at 01:30:17AM -0400, Dave Jones wrote: > > > > > > I don't see any holes in regular migration. Do you know if this is > > > reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n? > > > > CONFIG_NUMA_BALANCING

Re: bad rss-counter message in 3.14rc5

2014-03-11 Thread Cyrill Gorcunov
On Tue, Mar 11, 2014 at 01:30:17AM -0400, Dave Jones wrote: > > > > > > > > I don't see any holes in regular migration. Do you know if this is > > > > reproducible with CONFIG_NUMA_BALANCING=n or CONFIG_NUMA=n? > > > > > > CONFIG_NUMA_BALANCING was n already btw, so I'll do a NUMA=n run.

Re: bad rss-counter message in 3.14rc5

2014-03-11 Thread Sasha Levin
On 03/11/2014 01:30 AM, Dave Jones wrote: On Mon, Mar 10, 2014 at 10:01:58PM -0700, Andrew Morton wrote: > On Tue, 11 Mar 2014 00:51:09 -0400 Dave Jones wrote: > > > On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote: > > > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton w

Re: bad rss-counter message in 3.14rc5

2014-03-10 Thread Dave Jones
On Mon, Mar 10, 2014 at 10:01:58PM -0700, Andrew Morton wrote: > On Tue, 11 Mar 2014 00:51:09 -0400 Dave Jones wrote: > > > On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote: > > > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton > > wrote: > > > > > > > > Anyone ? I'm hi

Re: bad rss-counter message in 3.14rc5

2014-03-10 Thread Dave Jones
On Mon, Mar 10, 2014 at 10:01:58PM -0700, Andrew Morton wrote: > On Tue, 11 Mar 2014 00:51:09 -0400 Dave Jones wrote: > > > On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote: > > > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton > > wrote: > > > > > > > > Anyone ? I'm hi

Re: bad rss-counter message in 3.14rc5

2014-03-10 Thread Andrew Morton
On Tue, 11 Mar 2014 00:51:09 -0400 Dave Jones wrote: > On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote: > > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton > wrote: > > > > > > Anyone ? I'm hitting this trace on an almost daily basis, which is a > pain > > > > while trying

Re: bad rss-counter message in 3.14rc5

2014-03-10 Thread Dave Jones
On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote: > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton > wrote: > > > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain > > > while trying to reproduce a different bug.. > > > > Damn, I thought we'd fixe

Re: bad rss-counter message in 3.14rc5

2014-03-10 Thread Dave Jones
On Mon, Mar 10, 2014 at 09:46:12PM -0700, Andrew Morton wrote: > On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton > wrote: > > > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain > > > while trying to reproduce a different bug.. > > > > Damn, I thought we'd fixe

Re: bad rss-counter message in 3.14rc5

2014-03-10 Thread Andrew Morton
On Mon, 10 Mar 2014 20:13:40 -0700 Andrew Morton wrote: > > Anyone ? I'm hitting this trace on an almost daily basis, which is a pain > > while trying to reproduce a different bug.. > > Damn, I thought we'd fixed that but it seems not. Cc's added. > > Guys, what stops the migration target pag

Re: bad rss-counter message in 3.14rc5

2014-03-10 Thread Andrew Morton
On Mon, 10 Mar 2014 22:49:06 -0400 Dave Jones wrote: > ... > > > > 124 static inline struct page *migration_entry_to_page(swp_entry_t > entry) > > > 125 { > > > 126 struct page *p = pfn_to_page(swp_offset(entry)); > > > 127 /* > > > 128 * Any use of migration e

Re: bad rss-counter message in 3.14rc5

2014-03-10 Thread Dave Jones
On Thu, Mar 06, 2014 at 07:22:10PM -0500, Dave Jones wrote: > On Wed, Mar 05, 2014 at 12:57:25PM -0500, Dave Jones wrote: > > On Wed, Mar 05, 2014 at 12:45:03PM -0500, Dave Jones wrote: > > > I just saw this on my box that's been running trinity.. > > > > > > [48825.517189] BUG: Bad rs

Re: bad rss-counter message in 3.14rc5

2014-03-06 Thread Dave Jones
On Wed, Mar 05, 2014 at 12:57:25PM -0500, Dave Jones wrote: > On Wed, Mar 05, 2014 at 12:45:03PM -0500, Dave Jones wrote: > > I just saw this on my box that's been running trinity.. > > > > [48825.517189] BUG: Bad rss-counter state mm:880177921d40 idx:0 val:1 > (Not tainted) > > >

Re: bad rss-counter message in 3.14rc5

2014-03-05 Thread Dave Jones
On Wed, Mar 05, 2014 at 12:45:03PM -0500, Dave Jones wrote: > I just saw this on my box that's been running trinity.. > > [48825.517189] BUG: Bad rss-counter state mm:880177921d40 idx:0 val:1 > (Not tainted) > > There's nothing else, no trace, nothing. Any ideas where to begin with th

bad rss-counter message in 3.14rc5

2014-03-05 Thread Dave Jones
I just saw this on my box that's been running trinity.. [48825.517189] BUG: Bad rss-counter state mm:880177921d40 idx:0 val:1 (Not tainted) There's nothing else, no trace, nothing. Any ideas where to begin with this? Dave
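
For anyone collecting these from logs: a minimal sketch of pulling the fields out of the "Bad rss-counter state" line quoted above. The helper name parse_bad_rss and the idx-to-name table are illustrative assumptions, not something posted in this thread. The table follows the counter enum as I recall it from 3.14-era include/linux/mm_types.h, so check it against your own tree before relying on it.

```python
import re

# Assumed mapping of idx to counter name (3.14-era enum order, from memory);
# verify against include/linux/mm_types.h in the kernel you are running.
RSS_COUNTER_NAMES = {0: "MM_FILEPAGES", 1: "MM_ANONPAGES", 2: "MM_SWAPENTS"}

# Matches the message format seen in the report above.
BAD_RSS_RE = re.compile(
    r"BUG: Bad rss-counter state "
    r"mm:(?P<mm>[0-9a-fA-F]+) idx:(?P<idx>\d+) val:(?P<val>-?\d+)"
)


def parse_bad_rss(line):
    """Return the mm, idx, counter name and val from one log line, or None."""
    match = BAD_RSS_RE.search(line)
    if match is None:
        return None
    idx = int(match.group("idx"))
    return {
        "mm": match.group("mm"),
        "idx": idx,
        "counter": RSS_COUNTER_NAMES.get(idx, "unknown"),
        "val": int(match.group("val")),
    }


if __name__ == "__main__":
    sample = ("[48825.517189] BUG: Bad rss-counter state "
              "mm:880177921d40 idx:0 val:1 (Not tainted)")
    print(parse_bad_rss(sample))
```

If the assumed mapping holds, idx:0 with val:1 would mean one file-backed page was still accounted to the mm when it was torn down.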