Re: [PATCH V2] mm: add a new vector based madvise syscall

2015-12-01 Thread Andrew Morton
On Mon, 9 Nov 2015 11:44:54 -0800 Shaohua Li wrote: > In jemalloc, a free(3) doesn't immediately free the memory to OS even > the memory is page aligned/size, and hope the memory can be reused soon. > Later the virtual address becomes fragmented, and more and more free > memory are

Re: [PATCH] ptrace: use fsuid, fsgid, effective creds for fs access checks

2015-11-09 Thread Andrew Morton
On Mon, 9 Nov 2015 22:12:09 +0100 Jann Horn wrote: > > > Can we do > > > > #define PTRACE_foo (PTRACE_MODE_READ|PTRACE_MODE_FSCREDS) > > > > to avoid all that? > > Hm. All combinations of the PTRACE_MODE_*CREDS flags with > PTRACE_MODE_{READ,ATTACH} plus optionally

Re: [PATCH] ptrace: use fsuid, fsgid, effective creds for fs access checks

2015-11-09 Thread Andrew Morton
On Sun, 8 Nov 2015 13:08:36 +0100 Jann Horn wrote: > By checking the effective credentials instead of the real UID / > permitted capabilities, ensure that the calling process actually > intended to use its credentials. > > To ensure that all ptrace checks use the correct caller

Re: [PATCH v4 3/4] mm, shmem: Add shmem resident memory accounting

2015-10-02 Thread Andrew Morton
On Fri, 2 Oct 2015 15:35:50 +0200 Vlastimil Babka wrote: > From: Jerome Marchand Changelog is a bit weird. > Currently looking at /proc/<pid>/status or statm, there is no way to > distinguish shmem pages from pages mapped to a regular file (shmem > pages are

Re: [PATCH v4 2/4] mm, proc: account for shmem swap in /proc/pid/smaps

2015-10-02 Thread Andrew Morton
On Fri, 2 Oct 2015 15:35:49 +0200 Vlastimil Babka wrote: > Currently, /proc/pid/smaps will always show "Swap: 0 kB" for shmem-backed > mappings, even if the mapped portion does contain pages that were swapped out. > This is because unlike private anonymous mappings, shmem does

Re: [PATCH v4 4/4] mm, procfs: Display VmAnon, VmFile and VmShm in /proc/pid/status

2015-10-02 Thread Andrew Morton
On Fri, 2 Oct 2015 15:35:51 +0200 Vlastimil Babka wrote: > From: Jerome Marchand > > It's currently inconvenient to retrieve MM_ANONPAGES value from status > and statm files and there is no way to separate MM_FILEPAGES and > MM_SHMEMPAGES. Add RssAnon,

Re: [PATCH -mm v9 0/8] idle memory tracking

2015-07-29 Thread Andrew Morton
On Wed, 29 Jul 2015 19:29:08 +0300 Vladimir Davydov vdavy...@parallels.com wrote: /proc/kpageidle should probably live somewhere in /sys/kernel/mm, but I added it where similar files are located (kpagecount, kpageflags) to keep things consistent. I think these files should be moved

Re: [PATCH -mm v9 0/8] idle memory tracking

2015-07-27 Thread Andrew Morton
On Mon, 27 Jul 2015 12:18:57 -0700 Kees Cook keesc...@chromium.org wrote: Why were these put in /proc anyway? Rather than under /sys/fs/cgroup somewhere? Presumably because /proc/kpageidle is useful in non-memcg setups. Do we need a /proc/vm/ for holding these kinds of things? We're

Re: [PATCH -mm v9 7/8] proc: export idle flag via kpageflags

2015-07-22 Thread Andrew Morton
On Wed, 22 Jul 2015 19:25:28 +0300 Vladimir Davydov vdavy...@parallels.com wrote: On Tue, Jul 21, 2015 at 04:35:00PM -0700, Andrew Morton wrote: On Sun, 19 Jul 2015 15:31:16 +0300 Vladimir Davydov vdavy...@parallels.com wrote: As noted by Minchan, a benefit of reading idle flag from

Re: [PATCH v4 00/10] hugetlbfs: add fallocate support

2015-07-22 Thread Andrew Morton
On Tue, 21 Jul 2015 11:09:34 -0700 Mike Kravetz mike.krav...@oracle.com wrote: Changes in this revision address the minor comment and function name issues brought up by Naoya Horiguchi. Patch set is also rebased on current mmotm/since-4.1. This revision does not introduce any functional

Re: [PATCH v4 01/10] mm/hugetlb: add cache of descriptors to resv_map for region_add

2015-07-22 Thread Andrew Morton
On Tue, 21 Jul 2015 11:09:35 -0700 Mike Kravetz mike.krav...@oracle.com wrote: fallocate hole punch will want to remove a specific range of pages. When pages are removed, their associated entries in the region/reserve map will also be removed. This will break an assumption in the

Re: [PATCH v4 09/10] hugetlbfs: add hugetlbfs_fallocate()

2015-07-22 Thread Andrew Morton
On Wed, 22 Jul 2015 15:23:42 -0700 Mike Kravetz mike.krav...@oracle.com wrote: On 07/22/2015 03:03 PM, Andrew Morton wrote: On Tue, 21 Jul 2015 11:09:43 -0700 Mike Kravetz mike.krav...@oracle.com wrote: ... + + if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE
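
For readers skimming the index: the quoted line reads as the usual "reject any flag bit we do not support" mask test. Below is a minimal sketch of that idiom in Python, not the patch itself. The two flag values are the ones from `<linux/falloc.h>`; the helper name `has_unsupported_mode_bits` is made up for illustration, and what the kernel code does when the test fires is not visible in the truncated snippet.

```python
# Flag values as defined in <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

# Every mode bit this (hypothetical) implementation is willing to accept.
SUPPORTED_MODES = FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE


def has_unsupported_mode_bits(mode: int) -> bool:
    """Return True if `mode` has any bit set outside SUPPORTED_MODES.

    Mirrors the C idiom `mode & ~(A | B)`: clear the known bits and
    see whether anything is left over.
    """
    return (mode & ~SUPPORTED_MODES) != 0
```

Writing the check as a mask over the accepted set, rather than listing rejected flags, means flags added to the API later are refused by default.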

Re: [PATCH v4 00/10] hugetlbfs: add fallocate support

2015-07-22 Thread Andrew Morton
On Wed, 22 Jul 2015 15:34:34 -0700 Davidlohr Bueso d...@stgolabs.net wrote: On Wed, 2015-07-22 at 15:30 -0700, Andrew Morton wrote: selftests is a pretty scrappy place. It's partly a dumping ground for things so useful test code doesn't just get lost and bitrotted. Partly a framework so

Re: [PATCH v4 00/10] hugetlbfs: add fallocate support

2015-07-22 Thread Andrew Morton
On Tue, 21 Jul 2015 11:09:34 -0700 Mike Kravetz mike.krav...@oracle.com wrote: As suggested during the RFC process, tests have been proposed to libhugetlbfs as described at: http://librelist.com/browser//libhugetlbfs/2015/6/25/patch-tests-add-tests-for-fallocate-system-call/ I didn't know

Re: [PATCH v4 00/10] hugetlbfs: add fallocate support

2015-07-22 Thread Andrew Morton
On Wed, 22 Jul 2015 15:19:54 -0700 Davidlohr Bueso d...@stgolabs.net wrote: I didn't know that libhugetlbfs has tests. I wonder if that makes tools/testing/selftests/vm's hugetlbfstest harmful? Why harmful? Redundant, maybe(?). The presence of the in-kernel tests will cause people to

Re: [PATCH V4 2/6] mm: mlock: Add new mlock, munlock, and munlockall system calls

2015-07-21 Thread Andrew Morton
emun...@akamai.com Cc: Stephen Rothwell s...@canb.auug.org.au Signed-off-by: Andrew Morton a...@linux-foundation.org --- arch/s390/kernel/syscalls.S |3 --- 1 file changed, 3 deletions(-) diff -puN arch/s390/kernel/syscalls.S~mm-mlock-add-new-mlock-munlock-and-munlockall-system-calls-fix-2

Re: [PATCH -mm v9 4/8] proc: add kpagecgroup file

2015-07-21 Thread Andrew Morton
On Sun, 19 Jul 2015 15:31:13 +0300 Vladimir Davydov vdavy...@parallels.com wrote: /proc/kpagecgroup contains a 64-bit inode number of the memory cgroup each page is charged to, indexed by PFN. Having this information is useful for estimating a cgroup working set size. The file is present
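
As a rough illustration of the layout described in this snippet (one 64-bit value per page, indexed by PFN), a reader could look something like the sketch below. It assumes native byte order and 8-byte entries with no header; the function name `read_cgroup_inodes` is hypothetical, and the code works on any seekable binary file object so it can be tried without the proc file being present.

```python
import struct

ENTRY_SIZE = 8  # assumption: one unsigned 64-bit value per page frame


def read_cgroup_inodes(f, first_pfn, count):
    """Read up to `count` 64-bit entries starting at page frame `first_pfn`.

    `f` is any seekable binary file object laid out as an array of
    64-bit values indexed by PFN. Returns a list of ints, which may be
    shorter than `count` if the file ends early.
    """
    f.seek(first_pfn * ENTRY_SIZE)  # entry for PFN n lives at byte n * 8
    data = f.read(count * ENTRY_SIZE)
    n = len(data) // ENTRY_SIZE  # ignore a trailing partial entry
    return list(struct.unpack(f"={n}Q", data[: n * ENTRY_SIZE]))
```

On a kernel that carries the patch, the file object would presumably come from something like `open("/proc/kpagecgroup", "rb")`, most likely with root privileges; that part is untested here.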

Re: [PATCH -mm v9 0/8] idle memory tracking

2015-07-21 Thread Andrew Morton
On Sun, 19 Jul 2015 15:31:09 +0300 Vladimir Davydov vdavy...@parallels.com wrote: Hi, This patch set introduces a new user API for tracking user memory pages that have not been used for a given period of time. The purpose of this is to provide the userspace with the means of tracking a

Re: [PATCH -mm v9 7/8] proc: export idle flag via kpageflags

2015-07-21 Thread Andrew Morton
On Sun, 19 Jul 2015 15:31:16 +0300 Vladimir Davydov vdavy...@parallels.com wrote: As noted by Minchan, a benefit of reading idle flag from /proc/kpageflags is that one can easily filter dirty and/or unevictable pages while estimating the size of unused memory. Note that idle flag read from
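
The filtering mentioned in this snippet can be sketched as a bit test over the 64-bit flag words. The bit numbers for DIRTY (4) and UNEVICTABLE (18) are the long-standing ones from the kernel's pagemap documentation; IDLE as bit 25 is an assumption about what this series exports. The function names are illustrative only.

```python
KPF_DIRTY = 4
KPF_UNEVICTABLE = 18
KPF_IDLE = 25  # assumption: bit number used for the idle flag


def _bit(flags, n):
    """Return True if bit `n` is set in the 64-bit word `flags`."""
    return (flags >> n) & 1 == 1


def is_idle_candidate(flags):
    """True for pages flagged idle that are neither dirty nor unevictable."""
    return (
        _bit(flags, KPF_IDLE)
        and not _bit(flags, KPF_DIRTY)
        and not _bit(flags, KPF_UNEVICTABLE)
    )


def count_idle_candidates(entries):
    """Count flag words in `entries` that pass is_idle_candidate()."""
    return sum(1 for flags in entries if is_idle_candidate(flags))
```

Multiplying the count by the page size would give a rough estimate of unused memory that excludes dirty and unevictable pages, which is the use case the snippet describes.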

Re: [PATCH -mm v9 2/8] hwpoison: use page_cgroup_ino for filtering by memcg

2015-07-21 Thread Andrew Morton
On Sun, 19 Jul 2015 15:31:11 +0300 Vladimir Davydov vdavy...@parallels.com wrote: Hwpoison allows to filter pages by memory cgroup ino. Currently, it calls try_get_mem_cgroup_from_page to obtain the cgroup from a page and then its ino using cgroup_ino, but now we have an apter method for

Re: [PATCH -mm v9 1/8] memcg: add page_cgroup_ino helper

2015-07-21 Thread Andrew Morton
On Sun, 19 Jul 2015 15:31:10 +0300 Vladimir Davydov vdavy...@parallels.com wrote: This function returns the inode number of the closest online ancestor of the memory cgroup a page is charged to. It is required for exporting information about which page is charged to which cgroup to userspace,

Re: [PATCH -mm v9 6/8] proc: add kpageidle file

2015-07-21 Thread Andrew Morton
On Sun, 19 Jul 2015 15:31:15 +0300 Vladimir Davydov vdavy...@parallels.com wrote: Knowing the portion of memory that is not used by a certain application or memory cgroup (idle memory) can be useful for partitioning the system efficiently, e.g. by setting memory cgroup limits appropriately.

Re: [PATCH v4 1/2] capabilities: Ambient capabilities

2015-07-17 Thread Andrew Morton
On Fri, 17 Jul 2015 15:40:57 -0500 (CDT) Christoph Lameter c...@linux.com wrote: Here is a test program that can be used to verify the functionality. I slurped that into the changelog. Adding a selftest into tools/testing/selftests/ would be appropriate. -- To unsubscribe from this list:

Re: [PATCH V3 0/5] Allow user to request memory to be locked on page fault

2015-07-08 Thread Andrew Morton
On Wed, 8 Jul 2015 09:23:02 -0400 Eric B Munson emun...@akamai.com wrote: I don't know whether these syscalls should be documented via new manpages, or if we should instead add them to the existing mlock/munlock/mlockall manpages. Michael, could you please advise? Thanks for adding

Re: [PATCH V3 0/5] Allow user to request memory to be locked on page fault

2015-07-07 Thread Andrew Morton
On Tue, 7 Jul 2015 13:03:38 -0400 Eric B Munson emun...@akamai.com wrote: mlock() allows a user to control page out of program memory, but this comes at the cost of faulting in the entire mapping when it is allocated. For large mappings where the entire area is not necessary this is not

Re: [PATCH V3 5/5] selftests: vm: Add tests for lock on fault

2015-07-07 Thread Andrew Morton
On Tue, 7 Jul 2015 13:03:43 -0400 Eric B Munson emun...@akamai.com wrote: Test the mmap() flag, and the mlockall() flag. These tests ensure that pages are not faulted in until they are accessed, that the pages are unevictable once faulted in, and that VMA splitting and merging works with

Re: [RFCv3 0/5] enable migration of driver pages

2015-07-07 Thread Andrew Morton
On Tue, 7 Jul 2015 13:36:20 +0900 Gioh Kim gioh@lge.com wrote: From: Gioh Kim guru...@hanmail.net Hello, This series try to enable migration of non-LRU pages, such as driver's page. My ARM-based platform occured severe fragmentation problem after long-term (several days) test.

Re: [RFCv3 0/5] enable migration of driver pages

2015-07-07 Thread Andrew Morton
On Wed, 08 Jul 2015 09:02:59 +0900 Gioh Kim gioh@lge.com wrote: On 2015-07-08 at 7:37, Andrew Morton wrote: On Tue, 7 Jul 2015 13:36:20 +0900 Gioh Kim gioh@lge.com wrote: From: Gioh Kim guru...@hanmail.net Hello, This series try to enable migration of non

Re: [PATCH v4] pagemap: switch to the new format and do some cleanup

2015-06-16 Thread Andrew Morton
On Mon, 15 Jun 2015 08:56:49 +0300 Konstantin Khlebnikov koc...@gmail.com wrote: This patch removes page-shift bits (scheduled to remove since 3.11) and completes migration to the new bit layout. Also it cleans messy macro. hm, I can't find any kernel version to which this patch comes close

Re: [PATCH v5 0/4] idle memory tracking

2015-06-08 Thread Andrew Morton
On Sun, 7 Jun 2015 11:41:15 +0530 Raghavendra KT raghavendra...@linux.vnet.ibm.com wrote: On Tue, May 12, 2015 at 7:04 PM, Vladimir Davydov vdavy...@parallels.com wrote: Hi, This patch set introduces a new user API for tracking user memory pages that have not been used for a given

Re: [RESEND PATCH 0/3] Allow user to request memory to be locked on page fault

2015-06-01 Thread Andrew Morton
On Fri, 29 May 2015 10:13:25 -0400 Eric B Munson emun...@akamai.com wrote: mlock() allows a user to control page out of program memory, but this comes at the cost of faulting in the entire mapping when it is allocated. For large mappings where the entire area is not necessary this is not

Re: [PATCH for v4.2 v18 1/3] sys_membarrier(): system-wide memory barrier (generic, x86)

2015-05-29 Thread Andrew Morton
On Sat, 16 May 2015 19:48:18 -0400 Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: Here is an implementation of a new system call, sys_membarrier(), which executes a memory barrier on all threads running on the system. It is implemented by calling synchronize_sched(). It can be used

Re: [PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

2015-05-22 Thread Andrew Morton
On Thu, 14 May 2015 19:31:19 +0200 Andrea Arcangeli aarca...@redhat.com wrote: If the rwsem starves writers it wasn't strictly a bug but lockdep doesn't like it and this avoids depending on lowlevel implementation details of the lock. ... @@ -229,13 +246,33 @@ static __always_inline

Re: [PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

2015-05-22 Thread Andrew Morton
There's a more serious failure with i386 allmodconfig: fs/userfaultfd.c:145:2: note: in expansion of macro 'BUILD_BUG_ON' BUILD_BUG_ON(sizeof(struct uffd_msg) != 32); I'm surprised the feature is even reachable on i386 builds? -- To unsubscribe from this list: send the line unsubscribe

Re: [PATCH] [RFC] fs, proc: don't guard /proc/pid/task/tid/children on CONFIG_CHECKPOINT_RESTORE

2015-05-21 Thread Andrew Morton
On Thu, 21 May 2015 12:30:21 +0200 Alban Crequy alban.cre...@gmail.com wrote: commit 818411616baf (fs, proc: introduce /proc/pid/task/tid/children entry) introduced the children entry for checkpoint restore and the file is only available on kernels configured with CONFIG_EXPERT and

Re: [PATCH 0/3] Allow user to request memory to be locked on page fault

2015-05-08 Thread Andrew Morton
On Fri, 8 May 2015 15:33:43 -0400 Eric B Munson emun...@akamai.com wrote: mlock() allows a user to control page out of program memory, but this comes at the cost of faulting in the entire mapping when it is allocated. For large mappings where the entire area is not necessary this is not

Re: [PATCH 0/6] support dataplane mode for nohz_full

2015-05-08 Thread Andrew Morton
On Fri, 8 May 2015 19:11:10 -0400 Chris Metcalf cmetc...@ezchip.com wrote: On 5/8/2015 5:22 PM, Steven Rostedt wrote: On Fri, 8 May 2015 14:18:24 -0700 Andrew Morton a...@linux-foundation.org wrote: On Fri, 8 May 2015 13:58:41 -0400 Chris Metcalf cmetc...@ezchip.com wrote: A prctl

Re: [RFC 1/3] mm: mmap make MAP_LOCKED really mlock semantic

2015-04-28 Thread Andrew Morton
On Tue, 28 Apr 2015 14:11:49 +0200 Michal Hocko mho...@suse.cz wrote: The man page however says MAP_LOCKED (since Linux 2.5.37) Lock the pages of the mapped region into memory in the manner of mlock(2). This flag is ignored in older kernels. I'm trying to remember why we

Re: [PATCH] Test compaction of mlocked memory

2015-04-22 Thread Andrew Morton
On Wed, 22 Apr 2015 17:01:20 -0400 Sri Jayaramappa sjaya...@akamai.com wrote: Commit commit 5bbe3547aa3b (mm: allow compaction of unevictable pages) introduced a sysctl that allows userspace to enable scanning of locked pages for compaction. This patch introduces a new test which fragments

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-04-03 Thread Andrew Morton
On Mon, 30 Mar 2015 13:26:25 -0700 Andrew Morton a...@linux-foundation.org wrote: d) fincore() is more expensive Actually, I kinda take that back. fincore() will be faster than preadv2() in the case of a pagecache miss, and slower in the case of a pagecache hit. The breakpoint appears

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-30 Thread Andrew Morton
On Mon, 30 Mar 2015 13:32:27 -0700 Jeremy Allison j...@samba.org wrote: On Mon, Mar 30, 2015 at 01:26:25PM -0700, Andrew Morton wrote: cons: d) fincore() is more expensive e) fincore() will very occasionally block The above is the killer for Samba. If fincore returns true

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-30 Thread Andrew Morton
On Mon, 30 Mar 2015 00:40:20 -0700 Christoph Hellwig h...@infradead.org wrote: On Fri, Mar 27, 2015 at 10:04:11AM -0700, Andrew Morton wrote: mm... I don't think we should be adding placeholders to the kernel API to support code which hasn't been written, tested, reviewed, merged, etc

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-30 Thread Andrew Morton
On Mon, 30 Mar 2015 00:36:04 -0700 Christoph Hellwig h...@infradead.org wrote: On Fri, Mar 27, 2015 at 08:58:54AM -0700, Jeremy Allison wrote: The problem with the above is that we can't tell the difference between pread2() returning a short read because the pages are not in cache, or

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-30 Thread Andrew Morton
On Mon, 30 Mar 2015 18:40:16 -0400 Milosz Tanski mil...@adfin.com wrote: On Mon, Mar 30, 2015 at 2:54 PM, Andrew Morton a...@linux-foundation.org wrote: On Mon, 30 Mar 2015 00:40:20 -0700 Christoph Hellwig h...@infradead.org wrote: On Fri, Mar 27, 2015 at 10:04:11AM -0700, Andrew

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-30 Thread Andrew Morton
On Mon, 30 Mar 2015 18:49:06 -0400 Milosz Tanski mil...@adfin.com wrote: A fincore+pread solution that blocks is simply unsafe to use for us. We'll have to stay with the threadpool :-(. We're getting data from a network filesystem Ceph in our case, but it could be pNFS. In many cases

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-30 Thread Andrew Morton
On Mon, 30 Mar 2015 13:49:37 -0700 Jeremy Allison j...@samba.org wrote: This implies that the samba main thread also has to avoid any memory allocations both direct and within syscall and pagefault - those will occasionally exhibit similar worse-case latency. Is this done now? We don't

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-27 Thread Andrew Morton
On Fri, 27 Mar 2015 06:41:25 +0100 Volker Lendecke volker.lende...@sernet.de wrote: On Thu, Mar 26, 2015 at 08:28:24PM -0700, Andrew Morton wrote: A thing which bugs me about pread2() is that it is specifically tailored to applications which are able to use a partial read result. ie

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-27 Thread Andrew Morton
On Fri, 27 Mar 2015 01:18:22 -0700 Christoph Hellwig h...@infradead.org wrote: On Thu, Mar 26, 2015 at 08:28:24PM -0700, Andrew Morton wrote: I still don't understand why pwritev() exists. We discussed this last time but it seems nothing has changed. I'm not seeing here an adequate

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-27 Thread Andrew Morton
On Fri, 27 Mar 2015 01:48:33 -0700 Christoph Hellwig h...@infradead.org wrote: On Fri, Mar 27, 2015 at 01:35:16AM -0700, Andrew Morton wrote: fincore() doesn't have to be ugly. Please address the design issues I raised. How is pread2() useful to the class of applications which cannot

Re: [PATCH] Add preadv2/pwritev2 documentation.

2015-03-27 Thread Andrew Morton
On Mon, 16 Mar 2015 14:32:26 -0400 Milosz Tanski mil...@adfin.com wrote: +.BR pwritev2 () can also fail for the same reasons as .BR lseek (2). -Additionally, the following error is defined: +Additionally, the following errors are defined: +.TP +.B EAGAIN +The operation would block. This

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-27 Thread Andrew Morton
On Fri, 27 Mar 2015 08:58:54 -0700 Jeremy Allison j...@samba.org wrote: On Fri, Mar 27, 2015 at 02:01:59AM -0700, Andrew Morton wrote: On Fri, 27 Mar 2015 01:48:33 -0700 Christoph Hellwig h...@infradead.org wrote: On Fri, Mar 27, 2015 at 01:35:16AM -0700, Andrew Morton wrote

Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)

2015-03-27 Thread Andrew Morton
On Fri, 27 Mar 2015 09:30:46 -0700 Andrew Morton a...@linux-foundation.org wrote: I expect that this situation (first part in cache, latter part not in cache) is rare - for reasonably small requests the common cases will be all cached and nothing cached. So perhaps the best approach here

Re: [PATCH V7] Allow compaction of unevictable pages

2015-03-20 Thread Andrew Morton
On Fri, 20 Mar 2015 09:49:50 -0400 Eric B Munson emun...@akamai.com wrote: Currently, pages which are marked as unevictable are protected from compaction, but not from other types of migration. The POSIX real time extension explicitly states that mlock() will prevent a major page fault, but

Re: [PATCH V7] Allow compaction of unevictable pages

2015-03-20 Thread Andrew Morton
On Fri, 20 Mar 2015 09:49:50 -0400 Eric B Munson emun...@akamai.com wrote: Documentation/sysctl/vm.txt | 11 +++ include/linux/compaction.h |1 + kernel/sysctl.c |9 + mm/compaction.c |7 +++ Documentation/vm/unevictable-lru.txt

Re: [PATCH] mremap: add MREMAP_NOHOLE flag --resend

2015-03-18 Thread Andrew Morton
On Tue, 17 Mar 2015 14:09:39 -0700 Shaohua Li s...@fb.com wrote: There was a similar patch posted before, but it doesn't get merged. I'd like to try again if there are more discussions. http://marc.info/?l=linux-mm&m=141230769431688&w=2 mremap can be used to accelerate realloc. The problem is

Re: [PATCH] mremap: add MREMAP_NOHOLE flag --resend

2015-03-18 Thread Andrew Morton
On Wed, 18 Mar 2015 22:08:26 -0700 Shaohua Li s...@fb.com wrote: Daniel also had microbenchmark testing results for glibc and jemalloc. Can you please do this? I run Daniel's microbenchmark too, and not surprise the result is similar: glibc: 32.82 jemalloc: 70.35 jemalloc+mremap:

Re: [PATCH 0/2] Move away from non-failing small allocations

2015-03-16 Thread Andrew Morton
On Wed, 11 Mar 2015 16:54:52 -0400 Michal Hocko mho...@suse.cz wrote: as per discussion at LSF/MM summit few days back it seems there is a general agreement on moving away from small allocations do not fail concept. Such a change affects basically every part of the kernel and every kernel

Re: [PATCH v3 0/3] epoll: introduce round robin wakeup mode

2015-02-27 Thread Andrew Morton
On Wed, 25 Feb 2015 11:27:04 -0500 Jason Baron jba...@akamai.com wrote: Libenzi inactive eventpoll appears to be without a dedicated maintainer since 2011 or so. Is there anyone who knows the code and its usages in detail and does final ABI decisions on eventpoll - Andrew, Al or Linus?

Re: [PATCH v3 0/3] epoll: introduce round robin wakeup mode

2015-02-27 Thread Andrew Morton
On Fri, 27 Feb 2015 17:01:32 -0500 Jason Baron jba...@akamai.com wrote: I don't really understand the need for rotation/round-robin. We can solve the thundering herd via exclusive wakeups, but what is the point in choosing to wake the task which has been sleeping for the longest

Re: [RFC][PATCH v2] procfs: Always expose /proc/pid/map_files/ and make it readable

2015-01-26 Thread Andrew Morton
On Tue, 27 Jan 2015 09:46:47 +0300 Cyrill Gorcunov gorcu...@gmail.com wrote: There's one other problem here: we're assuming that the map_files implementation doesn't have bugs. If it does have bugs then relaxing permissions like this will create new vulnerabilities. And the map_files

Re: [PATCH] headers_check: don't warn about kexec.h

2015-01-13 Thread Andrew Morton
On Tue, 13 Jan 2015 22:05:11 +0100 Paul Bolle pebo...@tiscali.nl wrote: [Dragging Andrew, Linus, and Maximilian into this thread.] On Tue, 2015-01-13 at 21:27 +0100, Arnd Bergmann wrote: On Tuesday 13 January 2015 18:13:32 Paul Bolle wrote: The last time that Geoff has been trying to get

Re: [PATCH v6 0/7] vfs: Non-blockling buffered fs read (page cache only)

2014-12-04 Thread Andrew Morton
On Wed, 3 Dec 2014 11:48:28 -0500 Milosz Tanski mil...@adfin.com wrote: On Tue, Dec 2, 2014 at 5:42 PM, Andrew Morton a...@linux-foundation.org wrote: On Tue, 2 Dec 2014 17:17:42 -0500 Milosz Tanski mil...@adfin.com wrote: There have been several incomplete attempts to implement

Re: [PATCH v6 0/7] vfs: Non-blockling buffered fs read (page cache only)

2014-12-02 Thread Andrew Morton
On Tue, 2 Dec 2014 17:17:42 -0500 Milosz Tanski mil...@adfin.com wrote: There have been several incomplete attempts to implement fincore(). If we were to complete those attempts, preadv2() could be implemented using fincore()+pread(). Plus we get fincore(), which is useful for other

Re: [PATCH v6 0/7] vfs: Non-blockling buffered fs read (page cache only)

2014-11-25 Thread Andrew Morton
On Mon, 10 Nov 2014 11:40:23 -0500 Milosz Tanski mil...@adfin.com wrote: This patcheset introduces an ability to perform a non-blocking read from regular files in buffered IO mode. This works by only for those filesystems that have data in the page cache. It does this by introducing new

Re: [PATCH v17 0/7] MADV_FREE support

2014-11-13 Thread Andrew Morton
On Fri, 14 Nov 2014 07:58:09 +0900 Minchan Kim minc...@kernel.org wrote: It seems I have waited your review for a long time. What should I do to take your time slot? I'm being terrible, sorry. I'll merge the patches into -mm next week so at least they get some external testing while I get my

Re: [PATCHv7 0/3] syscalls,x86: Add execveat() system call

2014-11-12 Thread Andrew Morton
On Fri, 7 Nov 2014 17:01:01 +0000 David Drysdale drysd...@google.com wrote: This patch set adds execveat(2) for x86, and is derived from Meredydd Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528). The primary aim of adding an execveat syscall is to allow an implementation

Re: [PATCHv7 0/3] syscalls,x86: Add execveat() system call

2014-11-12 Thread Andrew Morton
On Fri, 7 Nov 2014 17:01:01 +0000 David Drysdale drysd...@google.com wrote: This patch set adds execveat(2) for x86 I grabbed these. If someone else was planning to do so, feel free to shout at me. I haven't been following the discussion closely so some reviewed-by's and tested-by's would be

Re: [PATCH V3] kernel, add bug_on_warn

2014-10-23 Thread Andrew Morton
On Thu, 23 Oct 2014 08:53:14 -0400 Prarit Bhargava pra...@redhat.com wrote: There have been several times where I have had to rebuild a kernel to cause a panic when hitting a WARN() in the code in order to get a crash dump from a system. Sometimes this is easy to do, other times (such as in

Re: [PATCHv8.1] fanotify: enable close-on-exec on events' fd when requested in fanotify_init()

2014-10-02 Thread Andrew Morton
On Thu, 2 Oct 2014 12:44:10 +0200 Jan Kara j...@suse.cz wrote: On Wed 01-10-14 15:36:21, Andrew Morton wrote: On Mon, 29 Sep 2014 10:49:15 +0200 Yann Droneaud ydrone...@opteya.com wrote: According to commit 80af258867648 ('fanotify: groups can specify their f_flags for new fd

Re: [PATCHv8.1] fanotify: enable close-on-exec on events' fd when requested in fanotify_init()

2014-10-01 Thread Andrew Morton
On Mon, 29 Sep 2014 10:49:15 +0200 Yann Droneaud ydrone...@opteya.com wrote: According to commit 80af258867648 ('fanotify: groups can specify their f_flags for new fd'), file descriptors created as part of file access notification events inherit flags from the event_f_flags argument passed to

Re: [PATCH 1/5] Add __designated_init, wrapping __attribute__((designated_init))

2014-08-01 Thread Andrew Morton
On Thu, 31 Jul 2014 16:47:23 -0700 Josh Triplett j...@joshtriplett.org wrote: GCC 4.10 and newer, and Sparse, supports __attribute__((designated_init)), which marks a structure as requiring a designated initializer rather than a positional one. This helps reduce churn and errors when used

Re: [PATCH] ptrace: add ability to retrieve signals without removing from a queue (v2)

2013-02-25 Thread Andrew Morton
On Mon, 25 Feb 2013 14:06:53 +0400 Andrey Vagin ava...@openvz.org wrote: This patch adds a new ptrace request PTRACE_PEEKSIGINFO. This request is used to retrieve information about signals starting with the specified sequence number. Siginfo_t structures are copied from the child into the

Re: [RFC PATCH v8 0/5] IPC: checkpoint/restore in userspace enhancements

2012-12-20 Thread Andrew Morton
On Thu, 20 Dec 2012 08:06:32 +0400 Stanislav Kinsbursky skinsbur...@parallels.com wrote: 19.12.2012 00:36, Andrew Morton wrote: On Wed, 24 Oct 2012 19:34:51 +0400 Stanislav Kinsbursky skinsbur...@parallels.com wrote: This respin of the patch set was significantly reworked. Most part

Re: [PATCH v8 4/5] ipc: message queue copy feature introduced

2012-10-24 Thread Andrew Morton
On Wed, 24 Oct 2012 19:35:20 +0400 Stanislav Kinsbursky skinsbur...@parallels.com wrote: This patch is required for checkpoint/restore in userspace. IOW, c/r requires some way to get all pending IPC messages without deleting them from the queue (checkpoint can fail and in this case tasks will

Re: [PATCH v8 2/5] ipc: add sysctl to specify desired next object id

2012-10-24 Thread Andrew Morton
On Wed, 24 Oct 2012 19:35:09 +0400 Stanislav Kinsbursky skinsbur...@parallels.com wrote: This patch adds 3 new variables and sysctls to tune them (by one next_id variable for messages, semaphores and shared memory respectively). This variable can be used to set desired id for next allocated

Re: [RFC PATCH v8 0/5] IPC: checkpoint/restore in userspace enhancements

2012-10-24 Thread Andrew Morton
On Wed, 24 Oct 2012 19:34:51 +0400 Stanislav Kinsbursky skinsbur...@parallels.com wrote: This respin of the patch set was significantly reworked. Most part of new API was replaced by sysctls (by one per messages, semaphores and shared memory), allowing to preset desired id for next new IPC

Re: [PATCH, RFC] Remove fasync() BKL usage, take 3325

2012-08-23 Thread Andrew Morton
On Fri, 23 Jan 2009 05:56:46 +0100 Andi Kleen a...@firstfloor.org wrote: On Thu, Jan 22, 2009 at 03:32:49PM -0500, Christoph Hellwig wrote: On Thu, Jan 22, 2009 at 06:51:04AM -0800, Andrew Morton wrote: OK, replacing a lock_kernel() with a spin_lock(global_lock) is pretty straightforwad

Re: [PATCH, v10 3/3] cgroups: introduce timer slack controller

2011-10-14 Thread Andrew Morton
On Tue, 11 Oct 2011 19:15:29 +0300 Kirill A. Shutemov kir...@shutemov.name wrote: Every task_struct has timer_slack_ns value. This value uses to round up poll() and select() timeout values. This feature can be useful in mobile environment where combined wakeups are desired. Originally,

Re: [PATCH v2] fadvise: introduce POSIX_FADV_DONTNEED_FS

2011-05-04 Thread Andrew Morton
On Wed, 27 Apr 2011 20:13:47 +0200 Andrea Righi and...@betterlinux.com wrote: Introduce a new fadvise flag to drop page cache pages of a single filesystem. I'm going to object to this on general principle. We shouldn't toss new features into the kernel API just because we can. Each feature

Re: [PATCH v3] introduce sys_syncfs to sync a single file system

2011-03-15 Thread Andrew Morton
On Tue, 15 Mar 2011 09:56:08 -0600 Andreas Dilger adil...@dilger.ca wrote: Should there be a wait argument or flag that allows an app to start the syncfs(), do something, and then call again to wait for completion? I don't think so. If userspace wants to do that then fork(). Perhaps we

Re: [PATCH v3] introduce sys_syncfs to sync a single file system

2011-03-14 Thread Andrew Morton
On Mon, 14 Mar 2011 02:56:52 +0100 (CET) Indan Zupancic in...@nul.nu wrote: On Sat, March 12, 2011 18:32, Greg KH wrote: On Fri, Mar 11, 2011 at 08:10:01PM -0600, Jonathan Nieder wrote: Indan Zupancic wrote: I'm not pushing for any official convention, just what seems good taste. In

Re: [PATCH, v9 3/3] cgroups: introduce timer slack controller

2011-03-14 Thread Andrew Morton
On Mon, 14 Mar 2011 16:05:24 +0200 Kirill A. Shutemov kir...@shutemov.name wrote: +Overview + + +Every task_struct has timer_slack_ns value. This value uses to round up +poll() and select() timeout values. This feature can be useful in +mobile environment where combined wakeups are

Re: [RFCv4] timerfd: add TFD_NOTIFY_CLOCK_SET to watch for clock changes

2011-03-10 Thread Andrew Morton
On Wed, 9 Mar 2011 18:01:09 -0800 Scott James Remnant sc...@netsplit.com wrote: It would be helpful to know if the identified users of this feature actually find it useful and adequate. I guess the most common application is the 1,001 desktop clock widgets. Do you have any feedback

Re: [RFCv4] timerfd: add TFD_NOTIFY_CLOCK_SET to watch for clock changes

2011-03-09 Thread Andrew Morton
On Wed, 9 Mar 2011 16:36:51 +0200 Alexander Shishkin virtu...@slind.org wrote: Changes since v3: - changed timerfd_settime() semantics (see below) Changes since v2: - replaced sysfs interface with a syscall - added sysctl/procfs handle to set a limit to the number of users - fixed

Re: [PATCH, v8 3/3] cgroups: introduce timer slack controller

2011-03-07 Thread Andrew Morton
On Thu, 3 Mar 2011 16:19:07 +0200 Kirill A. Shutsemov kir...@shutemov.name wrote: --- /dev/null +++ b/Documentation/cgroups/timer_slack.txt @@ -0,0 +1,64 @@ +Timer Slack Controller += + +Overview + + +Every task_struct has timer_slack_ns value. This value

Re: [PATCH v3 1/5] add metadata_incore ioctl in vfs

2011-01-19 Thread Andrew Morton
On Wed, 19 Jan 2011 09:15:18 +0800 Shaohua Li shaohua...@intel.com wrote: Subject: add metadata_incore ioctl in vfs Add an ioctl to dump filesystem's metadata in memory in vfs. Userspace collects such info and uses it to do metadata readahead. Filesystem can hook to

Re: [PATCH v3 1/5] add metadata_incore ioctl in vfs

2011-01-19 Thread Andrew Morton
On Thu, 20 Jan 2011 10:30:47 +0800 Shaohua Li shaohua...@intel.com wrote: I don't know if this is worth addressing. Perhaps require that the filp refers to the root of the fs? I didn't see why this is needed, but I can limit the filp to the root of the fs. I don't think it matters much

Re: [PATCH v3 1/5] add metadata_incore ioctl in vfs

2011-01-19 Thread Andrew Morton
On Thu, 20 Jan 2011 10:48:33 +0800 Shaohua Li shaohua...@intel.com wrote: On Thu, 2011-01-20 at 10:42 +0800, Andrew Morton wrote: On Thu, 20 Jan 2011 10:30:47 +0800 Shaohua Li shaohua...@intel.com wrote: I don't know if this is worth addressing. Perhaps require that the filp refers

Re: [PATCH v3 1/5] add metadata_incore ioctl in vfs

2011-01-19 Thread Andrew Morton
On Thu, 20 Jan 2011 11:21:49 +0800 Shaohua Li shaohua...@intel.com wrote: It seems to return a single offset/length tuple which refers to the btrfs metadata file, with the intent that this tuple later be fed into a btrfs-specific readahead ioctl. I can see how this might be used with

Re: [PATCH v3 1/5] add metadata_incore ioctl in vfs

2011-01-19 Thread Andrew Morton
On Thu, 20 Jan 2011 13:38:18 +0800 Shaohua Li shaohua...@intel.com wrote: ext2, minix and probably others create an address_space for each directory. Heaven knows what xfs does (for example). yes, this is for one directory, but the all files's metadata are in block_dev address_space. I

Re: [PATCH v3 1/5] add metadata_incore ioctl in vfs

2011-01-19 Thread Andrew Morton
On Thu, 20 Jan 2011 14:19:50 +0800 Wu Fengguang fengguang...@intel.com wrote: On Thu, Jan 20, 2011 at 02:12:33PM +0800, Li, Shaohua wrote: On Thu, 2011-01-20 at 13:55 +0800, Andrew Morton wrote: On Thu, 20 Jan 2011 13:38:18 +0800 Shaohua Li shaohua...@intel.com wrote: ext2

Re: [resend][PATCH] Added PR_SET_PROCTITLE_AREA option for prctl()

2009-10-10 Thread Andrew Morton
On Sat, 10 Oct 2009 15:32:35 +0900 KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com wrote: The solution is to use the seqlock to detect this, and prevent the secret information from ever making it back to process B's userspace. Note that it's not enough to just recheck arg_start, as process

Re: [resend][PATCH] Added PR_SET_PROCTITLE_AREA option for prctl()

2009-10-09 Thread Andrew Morton
On Fri, 9 Oct 2009 22:22:10 -0400 Bryan Donlan bdon...@gmail.com wrote: On Fri, Oct 9, 2009 at 8:13 PM, Andrew Morton a...@linux-foundation.org wrote: + res = access_process_vm(task, mm->arg_start, buffer, len, 0); + + if (mm->arg_end != mm

Re: [PATCH 00/80] Kernel based checkpoint/restart [v18]

2009-09-24 Thread Andrew Morton
On Wed, 23 Sep 2009 19:50:40 -0400 Oren Laadan or...@librato.com wrote: Q: How useful is this code as it stands in real-world usage? A: The application can be single- or multi-processes and threads. It handles open files (regular files/directories on most file systems, pipes, fifos,

Re: [PATCH] [try 3] vt: add ioctl commands to /dev/vcsaN to get/put the current palette of the given tty

2009-05-14 Thread Andrew Morton
On Thu, 14 May 2009 09:20:17 +0200 (CEST) Cedric Roux s...@free.fr wrote: I sent this patch a month ago and didn't get any reply. Should I consider you dropped it and should I forget about it or what? I dropped it in response to Alan's off-list comments and then lost track of it. -- To

Re: [patch 143/166] preadv/pwritev: Add preadv and pwritev system calls.

2009-04-04 Thread Andrew Morton
On Fri, 3 Apr 2009 08:03:22 -0700 (PDT) Linus Torvalds torva...@linux-foundation.org wrote: +static inline loff_t pos_from_hilo(unsigned long high, unsigned long low) +{ +#define HALF_LONG_BITS (BITS_PER_LONG / 2) + return ((high << HALF_LONG_BITS) << HALF_LONG_BITS) | low; +} Does C
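
The helper quoted above combines two unsigned long halves into one file offset by shifting twice by half the word size. A userspace sketch of the same idea follows, assuming int64_t in place of the kernel's loff_t and a locally defined BITS_PER_LONG. Unlike the quoted snippet, it widens `high` to 64 bits before shifting, so that the upper half survives when long is 32 bits wide; that widening is this sketch's choice, not something taken from the message.

```c
#include <limits.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel macros (assumptions of this sketch). */
#define BITS_PER_LONG  ((int)(sizeof(unsigned long) * CHAR_BIT))
#define HALF_LONG_BITS (BITS_PER_LONG / 2)

/*
 * Build a 64-bit offset from two unsigned long halves.
 *
 * A single shift by BITS_PER_LONG would be undefined behaviour in C when
 * long is 64 bits wide, because the shift count would equal the width of
 * the 64-bit operand. Two shifts by half that amount are each well defined:
 *   - 32-bit long: high ends up in bits 32..63 of the result;
 *   - 64-bit long: high is shifted out entirely and only low remains.
 */
static inline int64_t pos_from_hilo(unsigned long high, unsigned long low)
{
	uint64_t pos = high;

	pos = (pos << HALF_LONG_BITS) << HALF_LONG_BITS;
	return (int64_t)(pos | low);
}
```

On a 64-bit build the `high` argument is effectively ignored, so the full offset has to arrive in `low`; on a 32-bit build the two arguments carry the upper and lower 32 bits respectively.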

Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-02-14 Thread Andrew Morton
On Sun, 15 Feb 2009 00:08:02 +0100 Ingo Molnar mi...@elte.hu wrote: * Andrew Morton a...@linux-foundation.org wrote: Similar to the way in which perfectly correct and normal kernel sometimes has to be changed because it unexpectedly upsets the -rt patch. Actually, regarding -rt, we

Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-02-13 Thread Andrew Morton
On Thu, 12 Feb 2009 10:11:22 -0800 Dave Hansen d...@linux.vnet.ibm.com wrote: ... - In bullet-point form, what features are missing, and should be added? * support for more architectures than i386 * file descriptors: * sockets (network, AF_UNIX, etc...) * devices files *

Re: [PATCH 2/4] Convert epoll to a bitlock

2009-02-03 Thread Andrew Morton
On Mon, 2 Feb 2009 11:20:09 -0700 Jonathan Corbet cor...@lwn.net wrote: Matt Mackall suggested converting epoll's ep_lock to a bitlock as a way of saving space in struct file. This patch makes that change. hrm. bit_spin_lock() makes people upset (large penguiny people). iirc it doesn't

Re: [PATCH, RFC] Remove fasync() BKL usage, take 3325

2009-01-22 Thread Andrew Morton
On Thu, 22 Jan 2009 22:15:00 -0700 Jonathan Corbet cor...@lwn.net wrote: On Thu, 22 Jan 2009 06:51:04 -0800 Andrew Morton a...@linux-foundation.org wrote: OK, replacing a lock_kernel() with a spin_lock(global_lock) is pretty straightforward. But it's really really sad. It basically
