Re: crash in 3.12.51 (likely in 3.12.52 as well) in timer code

2016-02-04 Thread Nikolay Borisov
On 02/04/2016 02:17 PM, Mike Galbraith wrote: > On Thu, 2016-02-04 at 13:51 +0200, Nikolay Borisov wrote: >> >> On 02/04/2016 01:32 PM, Mike Galbraith wrote: >>> On Wed, 2016-02-03 at 12:58 +0200, Nikolay Borisov wrote: >>>> >>>> So in this case

Re: crash in 3.12.51 (likely in 3.12.52 as well) in timer code

2016-02-04 Thread Nikolay Borisov
On 02/04/2016 01:32 PM, Mike Galbraith wrote: > On Wed, 2016-02-03 at 12:58 +0200, Nikolay Borisov wrote: >> >> So in this case the prev/next entries do not look like corrupted, whereas >> when manipulating the list inside detach_timer they do. This is really >> od

crash in 3.12.51 (likely in 3.12.52 as well) in timer code

2016-02-03 Thread Nikolay Borisov
Hello, I've observed the following crash on a machine running 3.12.51: [2711471.041886] Modules linked in: xt_length xt_state xt_pkttype xt_dscp xt_multiport xt_set(O) ip_set_list_set(O) ip_set_hash_ip(O) ip_set(O) act_police cls_basic sch_ingress veth dm_snapshot netconsole openvswitch gre v

[RESEND PATCH 1/9] ipv4: Namespaceify tcp syn retries sysctl knob

2016-02-02 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov --- include/net/netns/ipv4.h | 2 ++ include/net/tcp.h | 1 - net/ipv4/sysctl_net_ipv4.c | 18 +- net/ipv4/tcp.c | 3 ++- net/ipv4/tcp_ipv4.c| 2 ++ net/ipv4/tcp_timer.c | 4 ++-- 6 files changed, 17

[RESEND PATCH 2/9] ipv4: Namespaceify tcp synack retries sysctl knob

2016-02-02 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov --- include/net/netns/ipv4.h| 1 + include/net/tcp.h | 1 - net/ipv4/inet_connection_sock.c | 7 ++- net/ipv4/sysctl_net_ipv4.c | 14 +++--- net/ipv4/tcp_ipv4.c | 1 + net/ipv4/tcp_timer.c| 3

[RESEND PATCH 5/9] ipv4: Namespaceify tcp_retries1 sysctl knob

2016-02-02 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov --- include/net/netns/ipv4.h | 1 + include/net/tcp.h | 1 - net/ipv4/sysctl_net_ipv4.c | 16 net/ipv4/tcp_ipv4.c| 1 + net/ipv4/tcp_timer.c | 8 5 files changed, 14 insertions(+), 13 deletions(-) diff --git a

[RESEND PATCH 0/9] Namespaceify more of the tcp sysctl knobs

2016-02-02 Thread Nikolay Borisov
it is required to tune the tcp settings for each independently of the host node. I've split the patches to be per-sysctl but after the review if the outcome is positive I'm happy to either send it in one big blob or just. Nikolay Borisov (9): ipv4: Namespaceify tcp syn retries sys

[RESEND PATCH 3/9] ipv4: Namespaceify tcp syncookies sysctl knob

2016-02-02 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov --- include/net/netns/ipv4.h | 2 ++ include/net/tcp.h | 1 - net/ipv4/syncookies.c | 4 +--- net/ipv4/sysctl_net_ipv4.c | 18 +- net/ipv4/tcp_input.c | 10 ++ net/ipv4/tcp_ipv4.c| 3 ++- net/ipv4

[RESEND PATCH 9/9] ipv4: Namespaceify tcp_notsent_lowat sysctl knob

2016-02-02 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov --- include/net/netns/ipv4.h | 1 + include/net/tcp.h | 4 ++-- net/ipv4/sysctl_net_ipv4.c | 14 +++--- net/ipv4/tcp_ipv4.c| 1 + net/ipv4/tcp_output.c | 3 --- 5 files changed, 11 insertions(+), 12 deletions(-) diff --git a

[RESEND PATCH 6/9] ipv4: Namespaceify tcp_retries2 sysctl knob

2016-02-02 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov --- include/net/netns/ipv4.h | 1 + include/net/tcp.h | 1 - net/ipv4/sysctl_net_ipv4.c | 14 +++--- net/ipv4/tcp_ipv4.c| 1 + net/ipv4/tcp_output.c | 3 ++- net/ipv4/tcp_timer.c | 5 ++--- 6 files changed, 13 insertions

[RESEND PATCH 7/9] ipv4: Namespaceify tcp_orphan_retries sysctl knob

2016-02-02 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov --- include/net/netns/ipv4.h | 1 + include/net/tcp.h | 1 - net/ipv4/sysctl_net_ipv4.c | 14 +++--- net/ipv4/tcp_ipv4.c| 1 + net/ipv4/tcp_timer.c | 3 +-- 5 files changed, 10 insertions(+), 10 deletions(-) diff --git a

[RESEND PATCH 4/9] ipv4: Namespaceify tcp reordering sysctl knob

2016-02-02 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov --- include/net/netns/ipv4.h | 2 +- include/net/tcp.h | 4 +++- net/ipv4/sysctl_net_ipv4.c | 14 +++--- net/ipv4/tcp.c | 2 +- net/ipv4/tcp_input.c | 12 ++-- net/ipv4/tcp_ipv4.c| 2 +- net/ipv4

[RESEND PATCH 8/9] ipv4: Namespaceify tcp_fin_timeout sysctl knob

2016-02-02 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov --- include/net/netns/ipv4.h | 1 + include/net/tcp.h | 3 +-- net/ipv4/sysctl_net_ipv4.c | 14 +++--- net/ipv4/tcp.c | 7 +++ net/ipv4/tcp_ipv4.c| 1 + 5 files changed, 13 insertions(+), 13 deletions(-) diff --git a

Re: [PATCH 3.12 00/91] 3.12.52-stable review

2016-01-05 Thread Nikolay Borisov
Hello Jiri, On 01/05/2016 07:46 PM, Jiri Slaby wrote: > This is the start of the stable review cycle for the 3.12.52 release. > There are 91 patches in this series, all will be posted as a response > to this one. If anyone has any issues with these being applied, please > let me know. Can you pl

[PATCH] dm-thin: Fix race condition when destroying thin pool

2015-12-17 Thread Nikolay Borisov
_work item which guarantees that on return the workitem is not running anymore. Fixes: 905e51b39a555 ("dm thin: commit outstanding data every second") Fixes: 85ad643b7e7e52 ("dm thin: add timeout to stop out-of-data-space mode holding IO forever") Signed-off-by: Nikolay Borisov C

Re: corruption causing crash in __queue_work

2015-12-17 Thread Nikolay Borisov
On 12/17/2015 05:33 PM, Tejun Heo wrote: > Hello, Nikolay. > > On Thu, Dec 17, 2015 at 12:46:10PM +0200, Nikolay Borisov wrote: >> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c >> index 493c38e08bd2..ccbbf7823cf3 100644 >> --- a/drivers/md/dm-thin.c

Re: corruption causing crash in __queue_work

2015-12-17 Thread Nikolay Borisov
On 12/14/2015 10:31 PM, Mike Snitzer wrote: > On Mon, Dec 14 2015 at 3:11pm -0500, > Nikolay Borisov wrote: > >> On Mon, Dec 14, 2015 at 5:31 PM, Mike Snitzer wrote: >>> On Mon, Dec 14 2015 at 3:41P -0500, >>> Nikolay Borisov wrote: >>> >

Re: corruption causing crash in __queue_work

2015-12-14 Thread Nikolay Borisov
On Mon, Dec 14, 2015 at 5:31 PM, Mike Snitzer wrote: > On Mon, Dec 14 2015 at 3:41P -0500, > Nikolay Borisov wrote: > >> Had another poke at the backtrace that is produced and here is what the >> delayed_work looks like: >> >> crash> struct delayed_work fff

umount saying that a mounted directory is not mounted

2015-12-14 Thread Nikolay Borisov
Hello, I'm using the attached script to perform some tests. However from time to time I get the following results: [root@kernighan lvm-race]# bash -x ./init_vg.sh + set -e ++ mktemp -u --tmpdir=. vgfile. + file=./vgfile.lCdz ++ mktemp -u testgrp- + group=testgrp-OgAz ++ mktemp -u thingrp-

Re: corruption causing crash in __queue_work

2015-12-14 Thread Nikolay Borisov
On 12/11/2015 07:08 PM, Tejun Heo wrote: > Hello, Nikolay. > > On Fri, Dec 11, 2015 at 05:57:22PM +0200, Nikolay Borisov wrote: >> So I had a server with the patch just crash on me: >> >> Here is how the queue looks like: >> crash> struct workqueue

Re: corruption causing crash in __queue_work

2015-12-12 Thread Nikolay Borisov
On 12/11/2015 09:14 PM, Mike Snitzer wrote: > On Fri, Dec 11 2015 at 1:00pm -0500, > Nikolay Borisov wrote: > >> On Fri, Dec 11, 2015 at 7:08 PM, Tejun Heo wrote: >>> >>> Hmmm... No idea why it didn't show up in the debug log but the only >>>

Re: [RFC] REHL 7.1: soft lockup when flush tlb

2015-12-12 Thread Nikolay Borisov
On 12/12/2015 10:52 AM, Xishi Qiu wrote: > [60050.458309] kjournald starting. Commit interval 5 seconds > [60076.821224] EXT3-fs (sda1): using internal journal > [60098.811865] EXT3-fs (sda1): mounted filesystem with ordered data mode > [60138.687054] kjournald starting. Commit interval 5 secon

Re: corruption causing crash in __queue_work

2015-12-11 Thread Nikolay Borisov
On Fri, Dec 11, 2015 at 7:08 PM, Tejun Heo wrote: > Hello, Nikolay. > > On Fri, Dec 11, 2015 at 05:57:22PM +0200, Nikolay Borisov wrote: >> So I had a server with the patch just crash on me: >> >> Here is how the queue looks like: >> crash> struct workqueue

Re: corruption causing crash in __queue_work

2015-12-11 Thread Nikolay Borisov
On 12/10/2015 05:29 PM, Tejun Heo wrote: > On Thu, Dec 10, 2015 at 11:28:02AM +0200, Nikolay Borisov wrote: >> On 12/09/2015 06:27 PM, Tejun Heo wrote: >>> Hello, >>> >>> On Wed, Dec 09, 2015 at 06:23:15PM +0200, Nikolay Borisov wrote: >>>> I think

Re: corruption causing crash in __queue_work

2015-12-10 Thread Nikolay Borisov
On 12/09/2015 06:27 PM, Tejun Heo wrote: > Hello, > > On Wed, Dec 09, 2015 at 06:23:15PM +0200, Nikolay Borisov wrote: >> I think we are seeing this at least daily on at least 1 server (we have >> multiple servers like that). So adding printk's would likely be the

Re: corruption causing crash in __queue_work

2015-12-09 Thread Nikolay Borisov
On 12/09/2015 06:08 PM, Tejun Heo wrote: > Hello, Nikolay. > > On Wed, Dec 09, 2015 at 02:08:56PM +0200, Nikolay Borisov wrote: >> 73309.529940] BUG: unable to handle kernel NULL pointer dereference at >> (null) >> [73309.530238] IP: [] __queue_work+0xb3/0

corruption causing crash in __queue_work

2015-12-09 Thread Nikolay Borisov
Hello Tejun, I've been observing the following crashes on kernel 4.2.6 : 73309.529940] BUG: unable to handle kernel NULL pointer dereference at (null) [73309.530238] IP: [] __queue_work+0xb3/0x390 [73309.530466] PGD 0 [73309.530681] Oops: [#1] SMP [73309.530947] Modules linked

Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Nikolay Borisov
On 11/18/2015 04:58 PM, Al Viro wrote: > On Wed, Nov 18, 2015 at 08:22:38AM -0600, Seth Forshee wrote: > >> But it still requires the admin set it up that way, no? And aren't >> privileges required to set up those devices in the first place? >> >> I'm not saying that it wouldn't be a good idea t

Re: [PATCH 3.12 000/123] 3.12.50-stable review

2015-11-02 Thread Nikolay Borisov
Hello Jiri, I think you should also add this patch: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2871c69e025e8bc507651d5a9cf81a8a7da9d24b I hit this on 3.12.47 originally - http://www.spinics.net/lists/dm-devel/msg24531.html On 10/28/2015 03:51 PM, Jiri Slaby wrote

Re: [RFC PATCH 1/2] ext4: Fix possible deadlock with local interrupts disabled and page-draining IPI

2015-10-16 Thread Nikolay Borisov
On 10/13/2015 04:14 PM, Jan Kara wrote: > On Tue 13-10-15 13:37:16, Nikolay Borisov wrote: >> >> >> On 10/13/2015 11:15 AM, Jan Kara wrote: >>> On Mon 12-10-15 17:51:07, Nikolay Borisov wrote: >>>> Hello and thanks for the reply, >>>> >

Re: [RFC PATCH 1/2] ext4: Fix possible deadlock with local interrupts disabled and page-draining IPI

2015-10-14 Thread Nikolay Borisov
On 10/13/2015 04:14 PM, Jan Kara wrote: > On Tue 13-10-15 13:37:16, Nikolay Borisov wrote: >> >> >> On 10/13/2015 11:15 AM, Jan Kara wrote: >>> On Mon 12-10-15 17:51:07, Nikolay Borisov wrote: >>>> Hello and thanks for the reply, >>>> >

Re: [RFC PATCH 1/2] ext4: Fix possible deadlock with local interrupts disabled and page-draining IPI

2015-10-13 Thread Nikolay Borisov
On 10/13/2015 11:15 AM, Jan Kara wrote: > On Mon 12-10-15 17:51:07, Nikolay Borisov wrote: >> Hello and thanks for the reply, >> >> On 10/12/2015 04:40 PM, Jan Kara wrote: >>> On Fri 09-10-15 11:03:30, Nikolay Borisov wrote: >>>> On 10/09/2015 10:37

Re: [RFC PATCH 1/2] ext4: Fix possible deadlock with local interrupts disabled and page-draining IPI

2015-10-12 Thread Nikolay Borisov
Hello and thanks for the reply, On 10/12/2015 04:40 PM, Jan Kara wrote: > On Fri 09-10-15 11:03:30, Nikolay Borisov wrote: >> On 10/09/2015 10:37 AM, Hillf Danton wrote: >>>>>> @@ -109,8 +109,8 @@ static void ext4_finish_bio(struct bio *bio) >>>>>

Re: [RFC PATCH 1/2] ext4: Fix possible deadlock with local interrupts disabled and page-draining IPI

2015-10-09 Thread Nikolay Borisov
On 10/09/2015 11:41 AM, Gilad Ben-Yossef wrote: > On Oct 8, 2015 18:31, "Nikolay Borisov" wrote: >> >> Currently when bios are being finished in ext4_finish_bio this is done by >> first disabling interrupts and then acquiring a bit_spin_lock. > ... >>

Re: Hard lockup in ext4_finish_bio

2015-10-09 Thread Nikolay Borisov
On 10/08/2015 09:56 PM, John Stoffel wrote: >>>>>> "Nikolay" == Nikolay Borisov writes: > > Nikolay> On 10/08/2015 05:34 PM, John Stoffel wrote: >>> Great bug report, but you're missing the info on which kernel >>> you're >

Re: [RFC PATCH 1/2] ext4: Fix possible deadlock with local interrupts disabled and page-draining IPI

2015-10-09 Thread Nikolay Borisov
On 10/09/2015 10:37 AM, Hillf Danton wrote: @@ -109,8 +109,8 @@ static void ext4_finish_bio(struct bio *bio) if (bio->bi_error) buffer_io_error(bh); } while ((bh = bh->b_this_page) != head); - bit_spin_unlock

Re: [RFC PATCH 1/2] ext4: Fix possible deadlock with local interrupts disabled and page-draining IPI

2015-10-09 Thread Nikolay Borisov
On 10/09/2015 10:19 AM, Hillf Danton wrote: >> @@ -109,8 +109,8 @@ static void ext4_finish_bio(struct bio *bio) >> if (bio->bi_error) >> buffer_io_error(bh); >> } while ((bh = bh->b_this_page) != head); >> -bit_spin_unlock

[RFC PATCH 2/2] fs: Disable interrupts after acquiring bit_spin_lock

2015-10-08 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov --- fs/buffer.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 82283ab..7109d6a 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -305,8 +305,8 @@ static void end_buffer_async_read(struct buffer_head *bh

[RFC PATCH 1/2] ext4: Fix possible deadlock with local interrupts disabled and page-draining IPI

2015-10-08 Thread Nikolay Borisov
ng on the bitlock it will have its interrupts enabled, thus being able to respond to IPIs. This eventually would allow memory allocation requiring draining of the per cpu pages to succeed. Signed-off-by: Nikolay Borisov --- fs/ext4/page-io.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions

Re: Hard lockup in ext4_finish_bio

2015-10-08 Thread Nikolay Borisov
On 10/08/2015 05:34 PM, John Stoffel wrote: > Great bug report, but you're missing the info on which kernel you're This is on 3.12.47 (self compiled). It was evident in my initial post, but I did forget to mention that in the reply. Also, I suspect even current kernels are susceptible to this sin

Re: Hard lockup in ext4_finish_bio

2015-10-08 Thread Nikolay Borisov
meaning "wait until the ipi handler has finished", which of course will never happen in the described situation. Regards, Nikolay On 10/08/2015 02:46 PM, Nikolay Borisov wrote: > Hello, > > I've hit a rather strange hard lock up on one of my servers from the >

Hard lockup in ext4_finish_bio

2015-10-08 Thread Nikolay Borisov
Hello, I've hit a rather strange hard lock up on one of my servers from the page writeback path, the actual backtrace is: [427149.717151] [ cut here ] [427149.717553] WARNING: CPU: 23 PID: 4611 at kernel/watchdog.c:245 watchdog_overflow_callback+0x98/0xc0() [427149.718

Re: [PATCH v2] jbd2: gate checksum calculations on crc driver presence, not sb flags

2015-10-04 Thread Nikolay Borisov
On 10/05/2015 09:21 AM, Darrick J. Wong wrote: > On Mon, Oct 05, 2015 at 08:43:34AM +0300, Nikolay Borisov wrote: >> Is it just me, or am I not seeing this feature test helper in this patch? >> >> On 10/03/2015 06:46 AM, Darrick J. Wong wrote: >>> Change the journa

Re: [PATCH v2] jbd2: gate checksum calculations on crc driver presence, not sb flags

2015-10-04 Thread Nikolay Borisov
journal crash if someone loads a journal in no-csum > mode and then randomizes the superblock, thus flipping on the feature > bits. > > v2: Create a feature-test helper, use it everywhere. > > Tested-By: Nikolay Borisov > Reported-by: Nikolay Borisov > Signed-off

Re: [PATCH] jbd2: gate checksum calculations on crc driver presence, not sb flags

2015-10-01 Thread Nikolay Borisov
en randomizes the superblock, thus flipping on the feature > bits. > > Reported-by: Nikolay Borisov > Signed-off-by: Darrick J. Wong > --- > fs/jbd2/journal.c| 12 +--- > include/linux/jbd2.h | 10 ++ > 2 files changed, 15 insertions(+), 7 deletions(-)

Re: Crash in jbd2_chksum due to null journal->j_chksum_driver

2015-09-30 Thread Nikolay Borisov
rt to make filesystems mountable in non-init user namespace and an arbitrary user could potentially cause instability on the system? Regards, Nikolay On Wed, Sep 30, 2015 at 8:12 PM, Darrick J. Wong wrote: > On Wed, Sep 30, 2015 at 04:35:49PM +0300, Nikolay Borisov wrote: >> Hello,

Crash in jbd2_chksum due to null journal->j_chksum_driver

2015-09-30 Thread Nikolay Borisov
Hello, Today a colleague was testing something and while doing so he observed the following crash: jbd2_journal_bmap: journal block not found at offset 67 on dm-26-8 Aborting journal on device dm-26-8. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] jbd2_superb

Re: PIDs Controller Limit

2015-09-28 Thread Nikolay Borisov
On 09/26/2015 02:11 AM, Aleksa Sarai wrote: >> On Thu, Sep 24, 2015 at 09:42:38AM +1000, Aleksa Sarai wrote: >>> Does it make sense for the PIDs controller to allow a user to set a >>> limit of 0? Since we don't cancel attaches, a limit of 0 doesn't >>> affect anything (nothing stops attaches, an

Stall in serial8250_console_putchar hangs the system

2015-09-28 Thread Nikolay Borisov
Hello, I'm running the stable 3.12.47 kernel and today one of the servers started reporting soft lockups and rcu_sched stalls. I believe the two things are related, with the culprit being the following: INFO: rcu_sched detected stalls on CPUs/tasks: { 9} (detected by 6, t=10020747 jiffies, g=532488

Re: [PATCH 3.12 00/33] 3.12.48-stable review

2015-09-15 Thread Nikolay Borisov
Hi Jiri, Maybe you would want to consider this: https://patchwork.ozlabs.org/patch/459088/ It has already found its way into other stable kernels, despite not being cc'ed to stable. Regards, Nikolay On 09/15/2015 05:22 PM, Jiri Slaby wrote: > This is the start of the stable review cycle for the

Re: Kernel 4.1.6 Panic due to slab corruption

2015-09-09 Thread Nikolay Borisov
On 09/09/2015 05:01 PM, Christoph Lameter wrote: > On Wed, 9 Sep 2015, Nikolay Borisov wrote: > >> [root@kernighan vm]# ./slabinfo -da kmalloc-32 >> Cannot write to dma-kmalloc-32/sanity >> [root@kernighan vm]# ./slabinfo -dF kmalloc-32 >> Cannot write to

Re: Kernel 4.1.6 Panic due to slab corruption

2015-09-09 Thread Nikolay Borisov
On 09/08/2015 06:15 PM, Christoph Lameter wrote: > On Tue, 8 Sep 2015, Nikolay Borisov wrote: > >>> You have read https://www.kernel.org/doc/Documentation/vm/slub.txt? >> >> I've read that I'm also following the merge/nomerge thread on the DM >> mail

Re: Kernel 4.1.6 Panic due to slab corruption

2015-09-08 Thread Nikolay Borisov
On 09/08/2015 05:27 PM, Christoph Lameter wrote: > On Tue, 8 Sep 2015, Nikolay Borisov wrote: > >> Unfortunately I haven't found a way to reproduce it so the only option >> would be to do this on a live server. However, the performance impact I >> believe is

Re: Kernel 4.1.6 Panic due to slab corruption

2015-09-08 Thread Nikolay Borisov
On 09/08/2015 04:58 PM, Christoph Lameter wrote: > On Mon, 7 Sep 2015, Nikolay Borisov wrote: > >> Did a bit more investigation and it turns out the >> corruption is happening in slab_alloc_node, in the >> 'else' branch when get_freepointer is being called:

[RFC PATCH 1/2] userns: Implement per-userns nproc infrastructure

2015-09-08 Thread Nikolay Borisov
From: Nikolay Borisov This patch adds a simple hashtable to the user_namespace structure and the necessary functions to work with it. The idea is to keep uid->nproc counts per namespace. Signed-off-by: Nikolay Borisov --- include/linux/user_namespace.h | 15 +- kernel/use

[RFC PATCH 2/2] userns/nproc: Add hooks for userns nproc management

2015-09-08 Thread Nikolay Borisov
From: Nikolay Borisov This patch introduces the usage of the userns_nproc_* functions where necessary to have correct accounting of the processes. Signed-off-by: Nikolay Borisov --- kernel/cred.c | 36 ++-- kernel/exit.c | 9 + kernel/fork.c | 33

[RFC PATCH 0/2] Containerise nproc count

2015-09-08 Thread Nikolay Borisov
From: Nikolay Borisov Hello, This is an initial try to have nproc count apply per-userns, rather than per the global user struct. The implementation is really simple - a hashtable holding uid->nproc mapping for each id inside the respective namespace. In its current form I have also left
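The per-namespace uid->nproc table described in this cover letter can be pictured with a small model. What follows is only a sketch in Python, not the patch's code: the class `UserNamespace` and the methods `charge` and `uncharge` are hypothetical names chosen for illustration, and a plain dictionary stands in for the kernel hashtable.

```python
class UserNamespace:
    """Hypothetical stand-in for a user namespace with its own uid -> nproc table."""

    def __init__(self, name):
        self.name = name
        self.nproc = {}  # uid -> number of live processes in this namespace

    def charge(self, uid, limit):
        """Account one new process for uid; refuse if the limit is already reached."""
        if self.nproc.get(uid, 0) >= limit:
            return False
        self.nproc[uid] = self.nproc.get(uid, 0) + 1
        return True

    def uncharge(self, uid):
        """Drop one process for uid and forget the entry once it reaches zero."""
        count = self.nproc.get(uid, 0)
        if count == 0:
            return
        if count == 1:
            del self.nproc[uid]
        else:
            self.nproc[uid] = count - 1


# The same uid is counted separately in each namespace.
host = UserNamespace("host")
container = UserNamespace("container")
host.charge(1000, limit=2)
container.charge(1000, limit=2)
```

Because each namespace owns its table, a uid that has exhausted its limit in one namespace does not affect the same uid in another, which is the behaviour the cover letter is aiming for.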

Re: Kernel 4.1.6 Panic due to slab corruption

2015-09-07 Thread Nikolay Borisov
Mon, 07 Sep 2015 11:41:17 +0300, Nikolay Borisov wrote: > >> Hello, >> >> On one of our servers I've observed a kernel panic >> happening with the following backtrace: >> >> [654405.527070] BUG: unable to handle kernel paging reques

Re: Kernel 4.1.6 Panic due to slab corruption

2015-09-07 Thread Nikolay Borisov
rrupted. Doing addr2line on the other paging request failures also show that the issue is in the same function - get_freepointer: addr2line -f -e vmlinux-4.1.6-clouder1 811824e5 get_freepointer /home/projects/linux-stable/mm/slub.c:247 Regards, Nikolay On 09/07/2015 11:41 AM, Nikolay Boris

Kernel 4.1.6 Panic due to slab corruption

2015-09-07 Thread Nikolay Borisov
Hello, On one of our servers I've observed a kernel panic happening with the following backtrace: [654405.527070] BUG: unable to handle kernel paging request at 00028001 [654405.527076] IP: [] kmem_cache_alloc_node+0x99/0x1e0 [654405.527085] PGD 14bef58067 PUD 2ab358067 PMD 0 [654

[PATCH v2] x86/asm/entry/64: Minor cleanup of conditional compilation

2015-09-05 Thread Nikolay Borisov
the value of CONFIG_X86_X32_ABI Signed-off-by: Nikolay Borisov --- Sending v2 as I had forgotten to add my signed-off-by line. arch/x86/entry/entry_64.S | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 8cb3e43

[PATCH] x86/asm/entry/64: Minor cleanup of conditional compilation

2015-09-05 Thread Nikolay Borisov
The entry_SYSCALL_64_fastpath was checking the value of __SYSCALL_MASK, which in turn was being set in arch/x86/include/asm/unistd.h depending on whether CONFIG_X86_X32_ABI was set or not. This made the intention a bit cryptic. Juggle the code around so that the conditional compilation depends on

Re: [PATCH] mm, vmscan: Do not wait for page writeback for GFP_NOFS allocations

2015-08-07 Thread Nikolay Borisov
oduce the premature OOM killer issue > which was originally addressed by the heuristic. > > As per David Chinner the xfs is doing similar thing since 2.6.15 already > so ext4 is not the only affected filesystem. Moreover he notes: > : For example: IO completion might requir

HARD LOCKUP: Strange hard lock up on spin_lock(&sighand->siglock);

2015-07-27 Thread Nikolay Borisov
Hello, I have a machine running kernel 3.13.3, in fact I have 3 such machines which are connected via corosync in a cluster. On one of the machines I observed the following lock-up: Jul 26 06:01:14 shrek kernel: [ cut here ] Jul 26 06:01:14 shrek kernel: WARNING: CPU:

Re: [RFC PATCH] thread_local_abi system call: caching current CPU number (x86)

2015-07-17 Thread Nikolay Borisov
On 07/16/2015 11:00 PM, Mathieu Desnoyers wrote: > Expose a new system call allowing threads to register a userspace memory > area where to store the current CPU number. Scheduler migration sets the > TIF_NOTIFY_RESUME flag on the current thread. Upon return to user-space, > a notify-resume handl

Re: [PATCH 4/7] fs: Treat foreign mounts as nosuid

2015-07-16 Thread Nikolay Borisov
On 07/15/2015 10:46 PM, Seth Forshee wrote: > From: Andy Lutomirski > > If a process gets access to a mount from a different namespace user > namespace, that process should not be able to take advantage of > setuid files or selinux entrypoints from that filesystem. > Technically, trusting mount

Re: [PATCH 0/3] Remove ext3 filesystem driver

2015-07-15 Thread Nikolay Borisov
On 07/15/2015 01:26 PM, Jan Kara wrote: > Hello, > > so I have created this patch set which removes ext3 driver (and some > related support > code) from the kernel. See changelog of patch 2/3 for more details. If no one > objects, > I will queue the series in my tree for the next merge wind

Re: [PATCH 2/2] ext4: make use of sb_getblk_gfp

2015-07-01 Thread Nikolay Borisov
On 07/02/2015 09:14 AM, Theodore Ts'o wrote: > On Tue, Jun 30, 2015 at 09:26:49AM +0300, Nikolay Borisov wrote: >> Switch ext4 to using sb_getblk_gfp with GFP_NOFS added, this fixes >> possible deadlocks in the page writeback path. >> >> Signed-off-by: Nikolay Bori

[PATCH 1/2] bufferhead: Add _gfp version for sb_getblk()

2015-06-29 Thread Nikolay Borisov
ments a sb_getblk_gfp with the only difference it can accept user-provided GFP flags. Signed-off-by: Nikolay Borisov --- As per the discussion in this thread (http://marc.info/?l=linux-ext4&m=143563347324528&w=2) here are the patches which hopefully implement Ted's suggestion

[PATCH 2/2] ext4: make use of sb_getblk_gfp

2015-06-29 Thread Nikolay Borisov
Switch ext4 to using sb_getblk_gfp with GFP_NOFS added, this fixes possible deadlocks in the page writeback path. Signed-off-by: Nikolay Borisov --- fs/ext4/extents.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index

Lockup in wait_transaction_locked under memory pressure

2015-06-25 Thread Nikolay Borisov
Hello, On a fairly busy server, running LXC I'm observing that sometimes the processes for a particular container lock up by going into D (uninterruptible sleep) state. Obtaining backtraces for those processes one thing which stands out is that they are all blocked in wait_transaction_locked (part

LOCKDEP warning around ext4_iget

2015-06-18 Thread Nikolay Borisov
Hello, During a debugging session of my local code I encountered the following lockdep splat but my machine did not deadlock, on subsequent repeats of the same operations that led to this splat (enqueuing my rcu callback) I couldn't reproduce it: = [ INFO: inconsist

Re: Kernel panic with user namespaces

2015-05-28 Thread Nikolay Borisov
Hi, I've run the attached test case on a clean 4.0 and on 4.0 plus the patch you referenced, but in neither case could I crash the kernel. All that happened was that a new namespace was created and bash executed in it, and after some seconds I got logged out of the machine. But then I can log-in b

[Ext4][Bug] Deadlock in ext4 with memcg enabled.

2015-05-18 Thread Nikolay Borisov
Hello, On one of our servers we are observing deadlocks when fsync is running. The kernel version in question is: 3.12.28 We've managed to acquire a backtrace from one of the hanging processes: PID: 21575 TASK: 883f482ac200 CPU: 24 COMMAND: "exim" #0 [8824be1ab0e8] __schedule at fff

[tip:sched/core] sched: Remove redundant #ifdef

2015-05-14 Thread tip-bot for Nikolay Borisov
Commit-ID: 8c8a457a60050d5922676f81913d87e4af6fd97b Gitweb: http://git.kernel.org/tip/8c8a457a60050d5922676f81913d87e4af6fd97b Author: Nikolay Borisov AuthorDate: Thu, 14 May 2015 14:31:01 +0300 Committer: Ingo Molnar CommitDate: Thu, 14 May 2015 20:04:43 +0200 sched: Remove redundant

Re: softirq under reporting with CONFIG_NO_HZ_* and under syn flood ?

2015-05-11 Thread Nikolay Borisov
Hello, I can confirm that I'm also observing the same issue when running turbostat. I've attached the respective turbostat logs with and without the hping syn flood running. I've also attached my kernel config. It's evident that when the CPU is in c0 state (processing data) due to the syn pac

RE: softirq under reporting with CONFIG_NO_HZ_* and under syn flood ?

2015-05-11 Thread Nikolay Borisov
Hello, I can confirm that I'm also observing the same issue. Here is the output of turbostat: without hping running (Most of the time is spent in idle C6 state, which is expected): pk cor CPU %c0 GHz TSC SMI %c1 %c3 %c6 %c7 CTMP PTMP %pc2 %pc3 %pc6 %pc7 Pkg_W Cor

Repercussions of overflow in get_next_ino()

2015-05-07 Thread Nikolay Borisov
Hello, get_next_ino would allocate a number between 0...2^32 - 1 to be used as an inode number. The implementation of this mechanism relies on an unsigned int which is 32 bits. On one server I'm observing that every couple of months grsec complains that the percpu variable last_ino overflows
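The wraparound this message asks about can be shown with plain arithmetic. The sketch below is an assumption-laden model in Python, not the kernel's get_next_ino(): it keeps a single counter and ignores the per-CPU last_ino variable mentioned in the message. The helper names `next_ino` and `wraps` are hypothetical.

```python
UINT32_MASK = 0xFFFFFFFF  # an unsigned int is 32 bits wide, as the message states


def next_ino(last_ino):
    """Return the next number, wrapping modulo 2**32 like C unsigned arithmetic."""
    return (last_ino + 1) & UINT32_MASK


def wraps(last_ino):
    """True when taking the next number would wrap back around to zero."""
    return next_ino(last_ino) < last_ino


# One step before the limit is still fine; the step after it restarts at 0.
almost_full = UINT32_MASK - 1
at_limit = next_ino(almost_full)
after_wrap = next_ino(at_limit)
```

In this model the counter silently restarts at 0 after 2^32 - 1, so numbers handed out after the wrap can repeat numbers that are still in use, which is the repercussion the subject line raises.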
