Re: [PATCH] Add VDSO time function support for x86 32-bit kernel

2012-12-13 Thread Andy Lutomirski
On Thu, Dec 13, 2012 at 4:09 PM, H. Peter Anvin wrote: > On 12/13/2012 11:32 AM, Andy Lutomirski wrote: >> >> >> x32's vdso cheats -- x32 code can see high addresses just fine. The >> toolchain just makes it difficult. >> >> Your best bet is probabl

Re: [PATCH] Add VDSO time function support for x86 32-bit kernel

2012-12-13 Thread Andy Lutomirski
On Thu, Dec 13, 2012 at 5:32 PM, H. Peter Anvin wrote: > On 12/13/2012 04:20 PM, Andy Lutomirski wrote: >> >> >> What you could do is probably arrange (using some linker script magic) >> for a symbol to exist that points at the page *before* the vdso >> starts.

Re: [PATCH 4/4] block: Optionally snapshot page contents to provide stable pages during write

2012-12-13 Thread Andy Lutomirski
On 12/13/2012 12:08 AM, Darrick J. Wong wrote: > Several complaints have been received regarding long file write latencies when > memory pages must be held stable during writeback. Since it might not be > acceptable to stall programs for the entire duration of a page write (which > may > take man

Re: [PATCH] Add VDSO time function support for x86 32-bit kernel

2012-12-13 Thread Andy Lutomirski
On Thu, Dec 13, 2012 at 5:49 PM, H. Peter Anvin wrote: > On 12/13/2012 05:42 PM, Andy Lutomirski wrote: >> >> The 64-bit/x32 case is currently very simple and fast because it uses >> absolute addressing. Admittedly, pcrel references are free, so >> changing this woul

Re: [PATCH] Add VDSO time function support for x86 32-bit kernel

2012-12-13 Thread Andy Lutomirski
'll still break ABI. I'm not sure that criu is stable enough yet that we should care. Criu people? (In brief summary: how annoying would it be if the vdso was no longer just a bunch of constant bytes that lived somewhere?) --Andy > > Andy Lutomirski wrote: > >>On Thu, Dec 1

Re: [RFC][PATCH] Fix cap_capable to only allow owners in the parent user namespace to have caps.

2012-12-13 Thread Andy Lutomirski
On Thu, Dec 13, 2012 at 6:33 PM, Eric W. Biederman wrote: > > Andy thank you for your review. > > Andy Lutomirski writes: >> This is confusing enough that I can't immediately tell whether it's >> correct. I think it's close but out of order. > >

Re: [GIT PULL] user namespace and namespace infrastructure changes for 3.8

2012-12-13 Thread Andy Lutomirski
On Thu, Dec 13, 2012 at 8:11 PM, Eric W. Biederman wrote: > Andy Lutomirski writes: > >> One more issue: the requirement that both upper and lower uids (etc.) >> in the maps are in order is rather limiting. I have no objection if >> you only require upper ids to be

percpu allocation failures in kvm

2012-12-13 Thread Andy Lutomirski
On 3.7.0 + irrelevant patches, I get this on boot. I've seen it on and off on earlier kernels, I think (although I'm not currently getting it on 3.5). [ 10.230054] PERCPU: allocation failed, size=304 align=32, alloc from reserved chunk failed [ 10.230059] Pid: 1026, comm: modprobe Tainted: G

[PATCH] mm: Downgrade mmap_sem before locking or populating on mmap

2012-12-13 Thread Andy Lutomirski
This is a serious cause of mmap_sem contention. MAP_POPULATE and MCL_FUTURE, in particular, are disastrous in multithreaded programs. Signed-off-by: Andy Lutomirski --- Sensible people use anonymous mappings. I write kernel patches :) I'm not entirely thrilled by the aesthetics of this
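For readers who have not used the flags named in this patch, here is a minimal userspace sketch of a mapping that asks the kernel to fault its pages in at mmap() time. It is not taken from the patch, and the helper name `map_populated` is made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `len` bytes of anonymous memory and ask the
 * kernel to fault the pages in before mmap() returns (MAP_POPULATE).
 * Returns NULL on failure. */
void *map_populated(size_t len)
{
    void *p;

    assert(len > 0);    /* mmap() rejects zero-length mappings */

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The populate work happens inside the mmap() call itself, so the locking done during that call is what the contention complaint above is about.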

Re: [PATCH] mm: Downgrade mmap_sem before locking or populating on mmap

2012-12-14 Thread Andy Lutomirski
On Thu, Dec 13, 2012 at 11:27 PM, Al Viro wrote: > On Thu, Dec 13, 2012 at 09:49:43PM -0800, Andy Lutomirski wrote: >> This is a serious cause of mmap_sem contention. MAP_POPULATE >> and MCL_FUTURE, in particular, are disastrous in multithreaded programs. >> >> Sig

Re: [PATCH] mm: Downgrade mmap_sem before locking or populating on mmap

2012-12-14 Thread Andy Lutomirski
On Fri, Dec 14, 2012 at 6:49 AM, Al Viro wrote: > On Fri, Dec 14, 2012 at 03:14:50AM -0800, Andy Lutomirski wrote: > >> > Wait a minute. get_user_pages() relies on ->mmap_sem being held. Unless >> > I'm seriously misreading your patch it removes that protect

Re: [CRIU] [PATCH] Add VDSO time function support for x86 32-bit kernel

2012-12-14 Thread Andy Lutomirski
On Fri, Dec 14, 2012 at 10:35 AM, H. Peter Anvin wrote: > On 12/14/2012 12:34 AM, Pavel Emelyanov wrote: >> On 12/14/2012 06:20 AM, Andy Lutomirski wrote: >>> On Thu, Dec 13, 2012 at 6:18 PM, H. Peter Anvin wrote: >>>> Wouldn't the vdso get mapped alre

Re: [RFC][PATCH] Fix cap_capable to only allow owners in the parent user namespace to have caps.

2012-12-14 Thread Andy Lutomirski
On Fri, Dec 14, 2012 at 10:43 AM, Linus Torvalds wrote: > On Fri, Dec 14, 2012 at 10:12 AM, Eric W. Biederman > wrote: >> >> That said Serge I think I have lost track of the point of your question. > > .. and I'm a bit unsure what I should do about this all. Including > pulling the pull request t

Re: [CRIU] [PATCH] Add VDSO time function support for x86 32-bit kernel

2012-12-14 Thread Andy Lutomirski
On Fri, Dec 14, 2012 at 1:08 PM, H. Peter Anvin wrote: > On 12/14/2012 12:12 PM, Cyrill Gorcunov wrote: >>> The real issue is that happens if the process is checkpointed while >>> inside the vdso and now eip/rip or a stack frame points into the vdso. >>> This is not impossible or even unlikel

Re: [PATCH 4/4] block: Optionally snapshot page contents to provide stable pages during write

2012-12-14 Thread Andy Lutomirski
On Thu, Dec 13, 2012 at 6:10 PM, Darrick J. Wong wrote: > On Thu, Dec 13, 2012 at 05:48:06PM -0800, Andy Lutomirski wrote: >> On 12/13/2012 12:08 AM, Darrick J. Wong wrote: >> > Several complaints have been received regarding long file write latencies >> > when >

Re: percpu allocation failures in kvm

2012-12-14 Thread Andy Lutomirski
On Fri, Dec 14, 2012 at 5:03 PM, Marcelo Tosatti wrote: > On Thu, Dec 13, 2012 at 09:43:23PM -0800, Andy Lutomirski wrote: >> On 3.7.0 + irrelevant patches, I get this on boot. I've seen it on >> and off on earlier kernels, I think (although I'm not curre

Re: [PATCH 4/4] block: Optionally snapshot page contents to provide stable pages during write

2012-12-14 Thread Andy Lutomirski
On Fri, Dec 14, 2012 at 6:01 PM, Darrick J. Wong wrote: > On Fri, Dec 14, 2012 at 05:12:37PM -0800, Andy Lutomirski wrote: >> It survived. I hit at least one mm bug, but I really don't think it's >> a problem with your code. (I have not tried this workload on Linux

[PATCH v2] mm: Downgrade mmap_sem before locking or populating on mmap

2012-12-14 Thread Andy Lutomirski
This is a serious cause of mmap_sem contention. MAP_POPULATE and MCL_FUTURE, in particular, are disastrous in multithreaded programs. Signed-off-by: Andy Lutomirski --- Changes from v1: The non-unlocking versions of do_mmap_pgoff and mmap_region are still available for aio_setup_ring

Re: [PATCH v2] mm: Downgrade mmap_sem before locking or populating on mmap

2012-12-16 Thread Andy Lutomirski
On Sun, Dec 16, 2012 at 1:00 AM, Ingo Molnar wrote: > > * Andy Lutomirski wrote: > >> This is a serious cause of mmap_sem contention. MAP_POPULATE >> and MCL_FUTURE, in particular, are disastrous in multithreaded programs. >> >> Signed-off-by: Andy Lutomir

Re: [PATCH v2] mm: Downgrade mmap_sem before locking or populating on mmap

2012-12-16 Thread Andy Lutomirski
On Sun, Dec 16, 2012 at 4:39 AM, Michel Lespinasse wrote: > On Fri, Dec 14, 2012 at 6:17 PM, Andy Lutomirski wrote: >> This is a serious cause of mmap_sem contention. MAP_POPULATE >> and MCL_FUTURE, in particular, are disastrous in multithreaded programs. >> >> Sig

Re: [PATCH 0/4] user namespace fixes

2012-12-17 Thread Andy Lutomirski
reason, I didn't get the actual email. Assuming this is 5e4a08476b50fa39210fca82e03325cc46b9c235: Acked-by: Andy Lutomirski However, this comment: /* * The owner of the user namespace in the parent of the * user namespace has all caps.

Re: [PATCH 2/4] userns: Require CAP_SYS_ADMIN for most uses of setns.

2012-12-17 Thread Andy Lutomirski
On Fri, Dec 14, 2012 at 2:03 PM, Eric W. Biederman wrote: > > Andy Lutomirski found a nasty little bug in > the permissions of setns. With unprivileged user namespaces it > became possible to create new namespaces without privilege. > > However the setns calls were relax

Re: [PATCH 3/4] userns: Add a more complete capability subset test to commit_creds

2012-12-17 Thread Andy Lutomirski
eq(old->egid, new->egid) || > !uid_eq(old->fsuid, new->fsuid) || > !gid_eq(old->fsgid, new->fsgid) || > - !cap_issubset(new->cap_permitted, old->cap_permitted)) { > + !cred_cap_issubset(old, new)) { >

Re: [PATCH 4/4] userns: Fix typo in description of the limitation of userns_install

2012-12-17 Thread Andy Lutomirski
/* Threaded many not enter a different user namespace */ > + /* Threaded processes may not enter a different user namespace */ > if (atomic_read(&current->mm->mm_users) > 1) > return -EINVAL; > > -- > 1.7.5.4 > Acked-by: Andy Lutomirski -- To unsubscr

Re: [PATCH 3.5 0/2] seccomp and vsyscall fixes

2012-09-27 Thread Andy Lutomirski
st from my tree. FWIW, the same patch applies cleanly to -next. --Andy On Thu, Sep 27, 2012 at 10:36 AM, Greg KH wrote: > On Tue, Jul 17, 2012 at 04:19:18PM -0700, Andy Lutomirski wrote: >> Apologies for the lateness of this stuff. I was at a conference last >> week when the Ch

[PATCH v2 resend] seccomp: Make syscall skipping and nr changes more consistent

2012-10-01 Thread Andy Lutomirski
x86-64 due to the way the system call entry works.) - On x86-64 with vsyscall=emulate, skipped vsyscalls were buggy. This updates the documentation accordingly. Signed-off-by: Andy Lutomirski Acked-by: Will Drewry --- This causes 9/10 of the tests in 'vsyscall' at https://

Re: [SATA] status reports updated

2005-04-15 Thread Andy Lutomirski
Jeff Garzik wrote: My Linux SATA software/hardware status reports have just been updated. To see where libata (SATA) support stands for a particular piece of hardware, or a particular feature, go to http://linux.yyz.us/sata/ What's the timeline on getting sata-promise's PATA support into ma

Re: reiserfs + quotas in kernel 2.6.11.12

2005-07-07 Thread Andy Lutomirski
Nigel Kukard wrote: Hi Guys, How stable is reiserfs quotas in 2.6.11.12? It's been stable for me for months (using various 2.6.11.y kernels), over RAID-5 even. --Andy

Re: Why is the kfree() argument const?

2008-01-18 Thread Andy Lutomirski
Giacomo Catenazzi wrote: And to demonstrate that Linus is not the only person with this view, I copy some paragraphs from C99 rationale (you can find standard, rationale and other documents in http://clc-wiki.net/wiki/C_standardisation:ISO ) Page 75 of C99 rationale: Type qualifiers were introdu

Re: /dev/urandom uses uninit bytes, leaks user data

2007-12-17 Thread Andy Lutomirski
Theodore Tso wrote: On Mon, Dec 17, 2007 at 08:30:05AM -0800, John Reiser wrote: [You have yet to show that...] There is a path that goes from user data into the pool. Note particularly that the path includes data from other users. Under the current implementation, anyone who accesses /dev/uran

Re: Something is broken with SATA RAID ? [and PATA raid and reiserfs?]

2005-03-02 Thread Andy Lutomirski
Jeff Garzik wrote: On Thu, Mar 03, 2005 at 12:39:41AM +, J.A. Magallon wrote: Hi... I posted this in other mail, but now I can confirm this. I have a box with a SATA RAID-5, and with 2.6.11-rc3-mm2+libata-dev1 works like a charm as a samba server, I dropped it 12Gb from an osx client, and peopl

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Andy Lutomirski
Paul Jackson wrote: Ok - that flies, or at least walks. It took 53 seconds to compute this cost matrix. Not that I really know what I'm talking about here, but this sounds highly parallelizable. It seems like you could do N/2 measurements at a time, so this should be O(N) to compute the matrix
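The "N/2 measurements at a time" idea can be sketched with the classic round-robin (circle method) schedule: for an even number of CPUs N, each of the N-1 rounds pairs every CPU with exactly one partner, and every pair is measured exactly once overall. The function name and array layout below are assumptions for illustration. Whether concurrent measurements would disturb one another on real hardware is not addressed here.

```c
#include <assert.h>

/* Hypothetical helper: fill pairs[k][0..1] with the n/2 disjoint CPU pairs
 * to measure in `round` (0 <= round < n - 1), using the round-robin
 * circle method. Requires n even and n >= 2. */
void round_pairs(int n, int round, int pairs[][2])
{
    int m = n - 1;      /* CPUs 0..m-1 rotate; CPU m stays fixed */

    assert(n >= 2 && n % 2 == 0);
    assert(round >= 0 && round < m);

    /* The fixed CPU meets a different rotating CPU each round. */
    pairs[0][0] = m;
    pairs[0][1] = round;

    /* The remaining CPUs pair up symmetrically around `round`. */
    for (int k = 1; k < n / 2; k++) {
        pairs[k][0] = (round + k) % m;
        pairs[k][1] = (round - k + m) % m;
    }
}
```

With this schedule the number of sequential rounds grows linearly in N, while the total number of pair measurements stays at N(N-1)/2.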

Re: [BUG] 2.6.21-rc1,2,3 regressions on my system that I found so far

2007-03-20 Thread Andy Lutomirski
Arjan van de Ven wrote: On Tue, 2007-03-20 at 01:36 -0400, Eric St-Laurent wrote: On Tue, 2007-20-03 at 01:04 -0400, Lee Revell wrote: I think CONFIG_TRY_TO_DISABLE_SMI would be excellent for debugging, not to mention people trying to spec out hardware for RT applications... There is a SMI di

Re: [PATCH/RFC] Re: linux-next: build failure after merge of the luto-misc tree

2016-07-19 Thread Andy Lutomirski
# define __BITS_PER_LONG 32 >> #endif > >> and got this: > >> /home/sfr/next/next/tools/arch/x86/include/uapi/asm/bitsperlong.h:8:9: note: >> #pragma message: __x86_64__ is not defined >> #pragma message "__x86_64__ is not defined" > > Humm

[PATCH] x86/ebda: If the EBDA is in lowmem, reserve only 4k for the EBDA

2016-07-20 Thread Andy Lutomirski
only breaks boot in practice when some other firmware or GRUB oddity that I don't fully understand kicks in causing the memory below 0x2c000 to be unusable. Signed-off-by: Andy Lutomirski --- This is intentionally not tagged for -stable. I think it's -stable material *event

Re: [PATCH] x86/boot: Reorganize and clean up the BIOS area reservation code

2016-07-21 Thread Andy Lutomirski
On Jul 21, 2016 1:14 AM, "Ingo Molnar" wrote: > > > * Andy Lutomirski wrote: > > > Under some conditions, my Dell XPS 13 9350 puts the EBDA at 0x2c000 > > but reports the lowmem cutoff as 0. The old code reserves > > everything above 0x2c000 and I can

Re: [PATCH] x86/boot: Reorganize and clean up the BIOS area reservation code

2016-07-21 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 9:18 AM, Ingo Molnar wrote: > > * Andy Lutomirski wrote: > >> It would be very easy to implement this if we could handle overlapping >> memblocks >> precisely or set a lower limit on the memblock allocator. Then we could block >> off

[PATCH 1/2] x86/boot: Clarify what x86_legacy_features.reserve_bios_regions does

2016-07-21 Thread Andy Lutomirski
It doesn't just control probing for the EBDA -- it controls whether we detect and reserve the <1MB BIOS regions in general. Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/x86_init.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/x8

[PATCH 2/2] x86/boot: Simplify EBDA-vs-BIOS reservation logic

2016-07-21 Thread Andy Lutomirski
effect under any circumstances. Signed-off-by: Andy Lutomirski --- arch/x86/kernel/ebda.c | 34 +++--- 1 file changed, 11 insertions(+), 23 deletions(-) diff --git a/arch/x86/kernel/ebda.c b/arch/x86/kernel/ebda.c index 6219eef20e2e..4312f8ae71b7 100644 --- a/arch/x86/ker

[PATCH 0/2] x86/boot: Further reserve_boot_regions() cleanups

2016-07-21 Thread Andy Lutomirski
This follows up on Ingo's cleanup. The first patch fixes the docs and the second makes the code even more comprehensible. Andy Lutomirski (2): x86/boot: Clarify what x86_legacy_features.reserve_bios_regions does x86/boot: Simplify EBDA-vs-BIOS reservation logic arch/x86/includ

Re: Minor PKRU bug?

2016-07-21 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 2:35 PM, Dave Hansen wrote: > On 07/12/2016 03:59 PM, Andy Lutomirski wrote: >> On Tue, Jul 12, 2016 at 3:55 PM, H. Peter Anvin wrote: >>> On 07/12/16 08:32, Dave Hansen wrote: >>>> On 07/09/2016 02:27 PM, Andy Lutomirski wrote: >>>

Re: [PATCH] x86/boot: Reorganize and clean up the BIOS area reservation code

2016-07-21 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 2:28 PM, H. Peter Anvin wrote: > On July 21, 2016 2:08:12 PM PDT, Andy Lutomirski wrote: >>On Thu, Jul 21, 2016 at 9:18 AM, Ingo Molnar wrote: >>> >>> * Andy Lutomirski wrote: >>> >>>> It would be very easy to implement

Re: [PATCH 03/19] x86/dumpstack: remove unnecessary stack pointer arguments

2016-07-21 Thread Andy Lutomirski
e way it is -- the actual return stack pointer given a pt_regs is quite well defined -- if regs->cs & 3 != 0, then it's regs->sp, else it's &regs->sp.) That being said, this isn't a big deal, so: Reviewed-by: Andy Lutomirski If you want to make this all a bit more reliable on x86_32, you could fix kernel_stack_pointer().
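The rule in the message above (use regs->sp when the low two bits of cs are nonzero, otherwise use the address of the sp slot itself) can be sketched in userspace as follows. The struct is a two-field mock, not the kernel's pt_regs, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the two fields used here; NOT the kernel's struct pt_regs. */
struct mock_regs {
    uintptr_t cs;
    uintptr_t sp;
};

/* Hypothetical helper following the rule quoted above. */
uintptr_t return_stack_pointer(const struct mock_regs *regs)
{
    assert(regs != NULL);

    /* Parenthesized on purpose: in C, `cs & 3 != 0` parses as
     * `cs & (3 != 0)`, which tests only the lowest bit. */
    if ((regs->cs & 3) != 0)
        return regs->sp;            /* entry from user mode: use saved sp */

    return (uintptr_t)&regs->sp;    /* entry from kernel mode: use slot address */
}
```

The shorthand in the message is fine as prose, but written literally as C it would need the parentheses shown in the sketch.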

Re: [PATCH 01/19] x86/dumpstack: remove show_trace()

2016-07-21 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf wrote: > There are a bewildering array of options for dumping the stack. > Simplify things a little by removing show_trace(), which is unused. Reviewed-by: Andy Lutomirski

Re: [PATCH 02/19] x86/dumpstack: add get_stack_pointer() and get_frame_pointer()

2016-07-21 Thread Andy Lutomirski
> Reviewed-by: Andy Lutomirski

Re: [PATCH 07/19] x86/dumpstack: add IRQ_USABLE_STACK_SIZE define

2016-07-21 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf wrote: > For reasons unknown, the x86_64 irq stack starts at an offset 64 bytes > from the end of the page. At least make that explicit. This is a change in behavior -- see below. Please mention this in the changelog. > > FIXME: Can we just remov

Re: [PATCH 09/19] x86/dumpstack: simplify in_exception_stack()

2016-07-21 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf wrote: > in_exception_stack() does some bad, bad things just so the unwinder can > print different values for different areas of the debug exception stack. > > There's no need to clarify where exactly on the stack it is. Just print > "#DB" and be do

Re: [PATCH 17/19] x86/entry/dumpstack: encode pt_regs pointer in frame pointer

2016-07-21 Thread Andy Lutomirski
he pt_regs pointer in the frame > pointer on entry from an interrupt or an exception. The frame pointer > unwinder is also updated to decode it. > > Suggested-by: Andy Lutomirski > Signed-off-by: Josh Poimboeuf > > +/* > + * This determines if the frame p

Re: [PATCH 19/19] x86/dumpstack: print any pt_regs found on the stack

2016-07-21 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf wrote: > Now that we can find pt_regs registers in the middle of the stack due to > an interrupt or exception, we can print them. Here's what it looks > like: > >... >[] do_async_page_fault+0x2c/0xa0 >[] async_page_fault+0x28/0x30 > RI

Re: [PATCH] x86/boot: Reorganize and clean up the BIOS area reservation code

2016-07-21 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 2:48 PM, Andy Lutomirski wrote: > On Thu, Jul 21, 2016 at 2:28 PM, H. Peter Anvin wrote: >> On July 21, 2016 2:08:12 PM PDT, Andy Lutomirski wrote: >>>On Thu, Jul 21, 2016 at 9:18 AM, Ingo Molnar wrote: >>>> >>>> * Andy Luto

Re: [PATCH 03/19] x86/dumpstack: remove unnecessary stack pointer arguments

2016-07-21 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 6:41 PM, Josh Poimboeuf wrote: > On Thu, Jul 21, 2016 at 02:56:52PM -0700, Andy Lutomirski wrote: >> On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf wrote: >> > When calling show_stack_log_lvl() or dump_trace() with a regs argument, >> > providi

Re: [PATCH 19/19] x86/dumpstack: print any pt_regs found on the stack

2016-07-21 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 8:30 PM, Josh Poimboeuf wrote: > On Thu, Jul 21, 2016 at 03:32:32PM -0700, Andy Lutomirski wrote: >> On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf wrote: >> > Now that we can find pt_regs registers in the middle of the stack due to >> > an int

Re: [kernel-hardening] [PATCH v5 03/32] x86/cpa: In populate_pgd, don't set the pgd entry until it's populated

2016-07-21 Thread Andy Lutomirski
On 07/21/2016 09:43 PM, valdis.kletni...@vt.edu wrote: On Mon, 11 Jul 2016 13:53:36 -0700, Andy Lutomirski said: This avoids pointless races in which another CPU or task might see a partially populated global pgd entry. These races should normally be harmless, but, if another CPU propagates

Re: [PATCH 2/3] x86: add some better documentation for probe_kernel_address()

2016-07-22 Thread Andy Lutomirski
On Jul 22, 2016 11:03 AM, "Dave Hansen" wrote: > > > From: Dave Hansen > > probe_kernel_address() has an unfortunate name since it is used > to probe kernel *and* userspace addresses. Add a comment > explaining some of the situation to help the next developer who > might make the silly assumptio

Re: [kernel-hardening] [PATCH v5 03/32] x86/cpa: In populate_pgd, don't set the pgd entry until it's populated

2016-07-22 Thread Andy Lutomirski
On Fri, Jul 22, 2016 at 3:21 AM, Ingo Molnar wrote: > > * Andy Lutomirski wrote: > >> On 07/21/2016 09:43 PM, valdis.kletni...@vt.edu wrote: >> >On Mon, 11 Jul 2016 13:53:36 -0700, Andy Lutomirski said: >> >>This avoids pointless races in which another CP

Re: [PATCH 2/3] x86: add some better documentation for probe_kernel_address()

2016-07-22 Thread Andy Lutomirski
On Fri, Jul 22, 2016 at 11:18 AM, Dave Hansen wrote: > On 07/22/2016 11:10 AM, Andy Lutomirski wrote: >> On Jul 22, 2016 11:03 AM, "Dave Hansen" wrote: >>> From: Dave Hansen >>> >>> probe_kernel_address() has an unfortunate name since it is used

Re: [kernel-hardening] [PATCH v5 03/32] x86/cpa: In populate_pgd, don't set the pgd entry until it's populated

2016-07-22 Thread Andy Lutomirski
On Fri, Jul 22, 2016 at 11:21 AM, Andy Lutomirski wrote: > On Fri, Jul 22, 2016 at 3:21 AM, Ingo Molnar wrote: >> >> * Andy Lutomirski wrote: >> >>> On 07/21/2016 09:43 PM, valdis.kletni...@vt.edu wrote: >>> >On Mon, 11 Jul 2016 13:53:36 -0700, Andy Lu

Re: [PATCH 19/19] x86/dumpstack: print any pt_regs found on the stack

2016-07-22 Thread Andy Lutomirski
On Fri, Jul 22, 2016 at 8:57 AM, Josh Poimboeuf wrote: > On Thu, Jul 21, 2016 at 10:13:03PM -0700, Andy Lutomirski wrote: >> On Thu, Jul 21, 2016 at 8:30 PM, Josh Poimboeuf wrote: >> > On Thu, Jul 21, 2016 at 03:32:32PM -0700, Andy Lutomirski wrote: >> >> On Thu,

Re: [PATCH 19/19] x86/dumpstack: print any pt_regs found on the stack

2016-07-22 Thread Andy Lutomirski
On Fri, Jul 22, 2016 at 3:20 PM, Josh Poimboeuf wrote: > On Fri, Jul 22, 2016 at 02:46:10PM -0700, Andy Lutomirski wrote: >> On Fri, Jul 22, 2016 at 8:57 AM, Josh Poimboeuf wrote: >> > On Thu, Jul 21, 2016 at 10:13:03PM -0700, Andy Lutomirski wrote: >> >> On Thu,

Re: [PATCH 10/19] x86/dumpstack: add get_stack_info() interface

2016-07-22 Thread Andy Lutomirski
On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf wrote: > valid_stack_ptr() is buggy: it assumes that all stacks are of size > THREAD_SIZE, which is not true for exception stacks. So the > walk_stack() callbacks will need to know the location of the beginning > of the stack as well as the end. > >
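The point that a walker needs both ends of a stack, rather than one assumed size, can be illustrated with a small bounds check. This is a sketch only: `struct stack_bounds` and `within_stack` are hypothetical names, not the interface the patch adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: each stack carries its own bounds instead of
 * relying on a single global size. */
struct stack_bounds {
    uintptr_t begin;    /* lowest valid address */
    uintptr_t end;      /* one past the highest valid address */
};

/* True if [addr, addr + len) lies entirely inside the stack.
 * Written so that addr + len is never computed and cannot wrap. */
bool within_stack(const struct stack_bounds *s, uintptr_t addr, size_t len)
{
    assert(s->begin <= s->end);

    return addr >= s->begin && addr <= s->end && len <= s->end - addr;
}
```

A check of this shape works unchanged for stacks of any size, which is the property the message says the old THREAD_SIZE assumption lacked.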

Re: [PATCH 19/19] x86/dumpstack: print any pt_regs found on the stack

2016-07-22 Thread Andy Lutomirski
On Fri, Jul 22, 2016 at 4:30 PM, Josh Poimboeuf wrote: > On Fri, Jul 22, 2016 at 04:18:04PM -0700, Andy Lutomirski wrote: >> On Fri, Jul 22, 2016 at 3:20 PM, Josh Poimboeuf wrote: >> > On Fri, Jul 22, 2016 at 02:46:10PM -0700, Andy Lutomirski wrote: >> >> On Fri,

Re: [PATCH 10/19] x86/dumpstack: add get_stack_info() interface

2016-07-22 Thread Andy Lutomirski
On Fri, Jul 22, 2016 at 4:26 PM, Andy Lutomirski wrote: > On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf wrote: >> valid_stack_ptr() is buggy: it assumes that all stacks are of size >> THREAD_SIZE, which is not true for exception stacks. So the >> walk_stack() callbacks

Re: [PATCH 10/19] x86/dumpstack: add get_stack_info() interface

2016-07-22 Thread Andy Lutomirski
On Fri, Jul 22, 2016 at 4:54 PM, Josh Poimboeuf wrote: >> > +static bool in_hardirq_stack(unsigned long *stack, struct stack_info >> > *info, >> > +unsigned long *visit_mask) >> > +{ >> > + unsigned long *begin = (unsigned long >> > *)this_cpu_read(hardirq_stack

Re: [PATCH 00/19] x86/dumpstack: rewrite x86 stack dump code

2016-07-22 Thread Andy Lutomirski
On Fri, Jul 22, 2016 at 5:22 PM, Linus Torvalds wrote: > > So without having yet looked at the code, I want people to understand > that to a very real degree, the stack tracer that the *oopsing* code > (ie what all the usual kernel fault handlers use) is very very special > code and needs to be ha

[PATCH] x86/mm/cpa: Unbreak populate_pgd(): stop trying to deallocate failed PUDs

2016-07-22 Thread Andy Lutomirski
ectively open-codes what the now-deleted unmap_pgd_range() function used to do except that unmap_pgd_range() used to try to free the page as well. Cc: Mike Krinkin Reported-by: Valdis Kletnieks Signed-off-by: Andy Lutomirski --- arch/x86/mm/pageattr.c | 7 ++- 1 file changed, 2 insertions(+

[PATCH] x86/mm/cpa: Add missing comment in populate_pdg()

2016-07-23 Thread Andy Lutomirski
In commit 21cbc2822aa1 ("x86/mm/cpa: Unbreak populate_pgd(): stop trying to deallocate failed PUDs"), I intended to add this comment, but I failed at using git. Signed-off-by: Andy Lutomirski --- arch/x86/mm/pageattr.c | 5 + 1 file changed, 5 insertions(+) diff --git a/a

Re: [RFC 0/3] Put vdso in ramfs-like filesystem (vdsofs)

2016-09-20 Thread Andy Lutomirski
On Tue, Sep 20, 2016 at 5:32 PM, H. Peter Anvin wrote: > On 09/20/16 17:22, H. Peter Anvin wrote: >> The more I'm thinking about this, why don't we simply have these (the >> various possible vdsos as well as vvar) as actual files in sysfs instead >> of introducing a new filesystem? I don't believ

Re: [PATCH 1/9] x86/entry/head/32: use local labels

2016-09-20 Thread Andy Lutomirski
On Sep 20, 2016 10:03 AM, "Josh Poimboeuf" wrote: > > Add the local label prefix to all non-function named labels in head_32.S > and entry_32.S. In addition to decluttering the symbol table, it also > will help stack traces to be more sensible. For example, the last > reported function in the id

Re: [RFC 0/3] Put vdso in ramfs-like filesystem (vdsofs)

2016-09-20 Thread Andy Lutomirski
> On 09/20/16 18:07, H. Peter Anvin wrote: > > > >> - vvar is highly magical. IMO letting it get mapped with VM_MAYWRITE > >> is asking for trouble, as anything that writes it will COW it, leading > >> to strange malfunctions. > >> > > The vvar page obviously needs to be mapped MAP_SHARED, and th

Re: [PATCH 3/9] x86/entry/32: fix the end of the stack for newly forked tasks

2016-09-20 Thread Andy Lutomirski
On Sep 20, 2016 5:25 PM, "Josh Poimboeuf" wrote: > > On Tue, Sep 20, 2016 at 09:10:55PM -0400, Brian Gerst wrote: > > Dropping asmlinkage from schedule_tail() would be a better option if > > possible. > > My understanding is that it's still needed for ia64. AFAICT, ia64 > relies on schedule_tail

Re: [PATCH v4 0/3] nvme power saving

2016-09-21 Thread Andy Lutomirski
On Fri, Sep 16, 2016 at 11:16 AM, Andy Lutomirski wrote: > Hi all- > > Here's v4 of the APST patch set. The biggest bikesheddable thing (I > think) is the scaling factor. I currently have it hardcoded so that > we wait 50x the total latency before entering a power saving st
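As a rough illustration of the hardcoded scaling factor mentioned above, the idle time before a transition could be computed like this. The sketch assumes that "total latency" means entry latency plus exit latency, and that the unit is microseconds. Neither is stated in the excerpt, and the names are invented.

```c
#include <assert.h>
#include <stdint.h>

#define APST_LATENCY_FACTOR 50  /* the 50x factor from the cover letter */

/* Hypothetical helper: idle time to wait before entering a state.
 * ASSUMPTION: "total latency" = entry latency + exit latency. */
uint64_t idle_before_transition_us(uint64_t entry_lat_us, uint64_t exit_lat_us)
{
    uint64_t total = entry_lat_us + exit_lat_us;

    /* Guard against wraparound in the sum and in the multiplication. */
    assert(total >= entry_lat_us);
    assert(total <= UINT64_MAX / APST_LATENCY_FACTOR);

    return APST_LATENCY_FACTOR * total;
}
```

Under these assumptions, a state with 100 us entry and 400 us exit latency would be entered after 25000 us of idleness.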

Re: [PATCH 00/13] Virtually mapped stacks with guard pages (x86, core)

2016-06-19 Thread Andy Lutomirski
On Sun, Jun 19, 2016 at 10:58 PM, Heiko Carstens wrote: > On Fri, Jun 17, 2016 at 10:38:24AM -0700, Andy Lutomirski wrote: >> > A disassembly looks like this (r15 is the stackpointer): >> > >> > 0670 : >> > 670: eb 6f f0

Re: [PATCH] x86/ptrace: Remove questionable TS_COMPAT usage in ptrace

2016-06-19 Thread Andy Lutomirski
now I have nothing to add, but > > On 06/18, Andy Lutomirski wrote: >> >> @@ -922,16 +922,7 @@ static int putreg32(struct task_struct *child, unsigned >> regno, u32 value) >> R32(esp, sp); >> >> case offsetof(struct user32, regs.orig_eax): >

Re: [PATCH v2 05/13] mm: Move memcg stack accounting to account_kernel_stack

2016-06-20 Thread Andy Lutomirski
On Jun 20, 2016 6:02 AM, "Michal Hocko" wrote: > > On Fri 17-06-16 13:00:41, Andy Lutomirski wrote: > > We should account for stacks regardless of stack size. Move it into > > account_kernel_stack. > > > > Fixes: 12580e4b54ba8 ("mm: mem

Re: [PATCH v2 06/13] fork: Add generic vmalloced stack support

2016-06-20 Thread Andy Lutomirski
On Mon, Jun 20, 2016 at 6:36 AM, Michal Hocko wrote: > On Fri 17-06-16 13:00:42, Andy Lutomirski wrote: >> If CONFIG_VMAP_STACK is selected, kernel stacks are allocated with >> vmalloc_node. > > I like this! It also reduces demand for higher order (order-2) pages > consi

Re: [PATCH] x86/ptrace: Remove questionable TS_COMPAT usage in ptrace

2016-06-20 Thread Andy Lutomirski
On Mon, Jun 20, 2016 at 8:24 AM, Oleg Nesterov wrote: > On 06/19, Andy Lutomirski wrote: >> >> On Sat, Jun 18, 2016 at 10:02 AM, Andy Lutomirski >> wrote: >> Step 1: for 4.7 and for -stable, introduce TS_I386_REGS_POKED. Set it >> in putre

[PATCH v2] x86/ptrace: Stop setting TS_COMPAT in ptrace code

2016-06-20 Thread Andy Lutomirski
a new flag TS_I386_REGS_POKED that handles the ptrace special case. Cc: Pedro Alves Cc: Oleg Nesterov Cc: Kees Cook Signed-off-by: Andy Lutomirski --- Hi Ingo and Kees- I'm still rather nervous about leaving this code as is when Kees' seccomp-vs-ptrace code goes in. This patch is intended to be a sa

Re: [PATCH] x86/ptrace: Remove questionable TS_COMPAT usage in ptrace

2016-06-20 Thread Andy Lutomirski
On Mon, Jun 20, 2016 at 9:14 AM, Oleg Nesterov wrote: > On 06/20, Andy Lutomirski wrote: >> >> On Mon, Jun 20, 2016 at 8:24 AM, Oleg Nesterov wrote: >> > >> > How about the simple change below for now? IIRC 32-bit task can't use >> > "sysc

Re: [PATCH] perf: add 'perf bench syscall'

2016-06-20 Thread Andy Lutomirski
On Mon, Jun 20, 2016 at 11:00 AM, Josh Poimboeuf wrote: > > From: Josh Poimboeuf > Subject: [PATCH] perf: add 'perf bench syscall' > > Add a basic 'perf bench syscall' benchmark which does a getppid() system > call in a tight loop. > My one suggestion would be to use a different syscall than ge
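The benchmark described in the quoted patch, a getppid() system call in a tight loop, can be approximated in a few lines of userspace C. This is a standalone sketch, not the perf code, and `time_getppid_loop` is a made-up name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: issue `iters` getppid() system calls back to back
 * and return the elapsed monotonic time in nanoseconds. syscall(2) is
 * used so that each iteration is a real kernel entry. */
uint64_t time_getppid_loop(unsigned long iters)
{
    struct timespec t0, t1;
    int rc;

    rc = clock_gettime(CLOCK_MONOTONIC, &t0);
    assert(rc == 0);

    for (unsigned long i = 0; i < iters; i++)
        syscall(SYS_getppid);

    rc = clock_gettime(CLOCK_MONOTONIC, &t1);
    assert(rc == 0);

    return (uint64_t)(t1.tv_sec - t0.tv_sec) * 1000000000ull
         + (uint64_t)t1.tv_nsec - (uint64_t)t0.tv_nsec;
}
```

Dividing the result by `iters` gives an approximate per-call cost. Swapping in a different system call, as the reply suggests, only changes the line inside the loop.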

Re: [PATCH v2] x86/ptrace: Stop setting TS_COMPAT in ptrace code

2016-06-20 Thread Andy Lutomirski
On Mon, Jun 20, 2016 at 9:29 AM, Andy Lutomirski wrote: > Setting TS_COMPAT in ptrace is wrong: if we happen to do it during > syscall entry, then we'll confuse seccomp and audit. (The former > isn't a security problem: seccomp is currently entirely insecure if a > malicio

Re: [PATCH 1/2] x86/entry: Avoid interrupt flag save and restore

2016-06-20 Thread Andy Lutomirski
On Mon, Jun 20, 2016 at 7:58 AM, Paolo Bonzini wrote: > Thanks to all the work that was done by Andy Lutomirski and others, > enter_from_user_mode and prepare_exit_to_usermode are now called only with > interrupts disabled. Let's provide them a version of user_enter/user_exit >

[PATCH v3 02/13] x86/cpa: In populate_pgd, don't set the pgd entry until it's populated

2016-06-20 Thread Andy Lutomirski
a use-after-free of the pgd entry. Signed-off-by: Andy Lutomirski --- arch/x86/mm/pageattr.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c index 7a1f7bbf4105..6a8026918bf6 100644 --- a/arch/x86/mm/pageattr.c +++ b/arch/x86

[PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core)

2016-06-20 Thread Andy Lutomirski
- Fix sub-page stack accounting better (Josh) Changes from v1: - Fix rewind_stack_and_do_exit (Josh) - Fix deadlock under load - Clean up generic stack vmalloc code - Many other minor fixes Andy Lutomirski (12): x86/cpa: In populate_pgd, don't set the pgd entry until it

[PATCH v3 12/13] x86/mm/64: Enable vmapped stacks

2016-06-20 Thread Andy Lutomirski
pecifically, we'll get #PF and make it to no_context and an oops without triggering a double-fault, and no_context doesn't know about stack overflows. The next patch will improve that case. Signed-off-by: Andy Lutomirski --- arch/x86/Kconfig | 1

[PATCH v3 13/13] x86/mm: Improve stack-overflow #PF handling

2016-06-20 Thread Andy Lutomirski
: Andy Lutomirski --- arch/x86/include/asm/traps.h | 6 ++ arch/x86/kernel/traps.c | 6 +++--- arch/x86/mm/fault.c | 39 +++ 3 files changed, 48 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm

[PATCH v3 09/13] x86/dumpstack: When dumping stack bytes due to OOPS, start with regs->sp

2016-06-20 Thread Andy Lutomirski
The comment suggests that show_stack(NULL, NULL) should backtrace the current context, but the code doesn't match the comment. If regs are given, start the "Stack:" hexdump at regs->sp. Signed-off-by: Andy Lutomirski --- arch/x86/kernel/dumpstack_32.c | 4 +++- arch/x86/ker

[PATCH v3 11/13] x86/dumpstack/64: Handle faults when printing the "Stack:" part of an OOPS

2016-06-20 Thread Andy Lutomirski
If we overflow the stack into a guard page, we'll recursively fault when trying to dump the contents of the guard page. Use probe_kernel_address so we can recover if this happens. Signed-off-by: Andy Lutomirski --- arch/x86/kernel/dumpstack_64.c | 12 ++-- 1 file changed, 10 inser

[PATCH v3 01/13] x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()

2016-06-20 Thread Andy Lutomirski
512 GB of memory hotplugged. Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Oleg Nesterov Cc: Peter Zijlstra Cc: Rik van Riel Cc: Thomas Gleixner Cc: Waiman Long Cc: linux...@kvack.org Signed-off-by: I

[PATCH v3 04/13] mm: Track NR_KERNEL_STACK in KiB instead of number of stacks

2016-06-20 Thread Andy Lutomirski
nit that divides both THREAD_SIZE and PAGE_SIZE on all architectures. Keep it simple and use KiB. Cc: Vladimir Davydov Cc: Johannes Weiner Cc: Michal Hocko Cc: linux...@kvack.org Signed-off-by: Andy Lutomirski --- drivers/base/node.c| 3 +-- fs/proc/meminfo.c | 2 +- include/linux/mmzon

[PATCH v3 07/13] x86/die: Don't try to recover from an OOPS on a non-default stack

2016-06-20 Thread Andy Lutomirski
It's not going to work, because the scheduler will explode if we try to schedule when running on an IST stack or similar. This will matter when we let kernel stack overflows (which are #DF) call die(). Signed-off-by: Andy Lutomirski --- arch/x86/kernel/dumpstack.c | 3 +++ 1 file chang

[PATCH v3 10/13] x86/dumpstack: Try harder to get a call trace on stack overflow

2016-06-20 Thread Andy Lutomirski
If we overflow the stack, print_context_stack will abort. Detect this case and rewind back into the valid part of the stack so that we can trace it. Signed-off-by: Andy Lutomirski --- arch/x86/kernel/dumpstack.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch

[PATCH v3 08/13] x86/dumpstack: When OOPSing, rewind the stack before do_exit

2016-06-20 Thread Andy Lutomirski
our logs. I intentionally separated this from the preceding patch that disables do_exit-on-OOPS on IST stacks. This way, if we need to revert this patch, we still end up in an acceptable state wrt stack overflow handling. Signed-off-by: Andy Lutomirski --- arch/x86/entry/entry_32.S | 11

[PATCH v3 06/13] fork: Add generic vmalloced stack support

2016-06-20 Thread Andy Lutomirski
If CONFIG_VMAP_STACK is selected, kernel stacks are allocated with vmalloc_node. Signed-off-by: Andy Lutomirski --- arch/Kconfig| 29 + arch/ia64/include/asm/thread_info.h | 2 +- include/linux/sched.h | 15 +++ kernel/fork.c

[PATCH v3 05/13] mm: Fix memcg stack accounting for sub-page stacks

2016-06-20 Thread Andy Lutomirski
tat") Cc: Vladimir Davydov Cc: Johannes Weiner Cc: Michal Hocko Cc: linux...@kvack.org Signed-off-by: Andy Lutomirski --- include/linux/memcontrol.h | 2 +- kernel/fork.c | 15 ++- mm/memcontrol.c| 2 +- 3 files changed, 8 insertions(+), 11 deletions(-)

[PATCH v3 03/13] x86/mm: Remove kernel_unmap_pages_in_pgd() and efi_cleanup_page_tables()

2016-06-20 Thread Andy Lutomirski
This leaves a couple of other helpers unused, so delete them, too. Cc: Matt Fleming Cc: linux-...@vger.kernel.org Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/efi.h | 1 - arch/x86/include/asm/pgtable_types.h | 2 -- arch/x86/mm/pageattr.c

[PATCH v3 0/3] ptrace-vs-syscall-restart fixes, v3

2016-06-20 Thread Andy Lutomirski
tion. It could also make sense for -stable -- I think it fixes a bug that could be exploited to confuse the syscall audit logs. Patch 2 and 3 are intended for 4.8. Pedro, can you try to test this series a bit? I'm having trouble getting ptrace-tests to pass even on an unmodified kernel. And

[PATCH v3 2/3] x86/signal: Rewire the restart_block() syscall to have a constant nr

2016-06-20 Thread Andy Lutomirski
nr==380 to refer to sys_restart_block() in all cases. Cc: Pedro Alves Cc: Oleg Nesterov Cc: Kees Cook Signed-off-by: Andy Lutomirski --- arch/x86/entry/syscalls/syscall_32.tbl | 2 ++ arch/x86/entry/syscalls/syscall_64.tbl | 3 +++ arch/x86/kernel/signal.c | 34 ---

[PATCH v3 1/3] x86/ptrace: Stop setting TS_COMPAT in ptrace code

2016-06-20 Thread Andy Lutomirski
a new flag TS_I386_REGS_POKED that handles the ptrace special case. Cc: Pedro Alves Cc: Oleg Nesterov Cc: Kees Cook Signed-off-by: Andy Lutomirski --- arch/x86/entry/common.c| 6 +- arch/x86/include/asm/syscall.h | 5 + arch/x86/include/asm/thread_info.h | 3 +++ arch/x86/kerne

[PATCH v3 3/3] x86/ptrace, x86/signal: Remove TS_I386_REGS_POKED

2016-06-20 Thread Andy Lutomirski
not worrying about ptrace in the signal handling code. Cc: Pedro Alves Cc: Oleg Nesterov Cc: Kees Cook Signed-off-by: Andy Lutomirski --- arch/x86/entry/common.c| 6 +- arch/x86/include/asm/syscall.h | 2 +- arch/x86/include/asm/thread_info.h | 3 --- arch/x86/ke
