Re: [PATCH] mm: add node physical memory range to sysfs

2012-12-13 Thread Dave Hansen
On 12/13/2012 03:15 PM, Davidlohr Bueso wrote: On Wed, 2012-12-12 at 20:49 -0800, Dave Hansen wrote: How is that possible? If NUMA nodes are defined by distances from CPUs to memory, how could a DIMM have more than a single distance to any given CPU? Can't this occur when interleaving

[RFC][PATCH 3/7] order memory debugging Kconfig options

2012-12-14 Thread Dave Hansen
having an arch/foo/Kconfig.debug-memory might be taking things a bit too far Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- linux-2.6.git-dave/arch/x86/Kconfig.debug |2 linux-2.6.git-dave/lib/Kconfig.debug | 702 +++--- 2 files changed, 357 insertions

[RFC][PATCH 4/7] consolidate RCU debugging Kconfig options

2012-12-14 Thread Dave Hansen
These were in two different places, and taking up too much of my valuable screen real-estate. Banish them to their own menu. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- linux-2.6.git-dave/lib/Kconfig.debug | 160 +-- 1 file changed, 82 insertions

[RFC][PATCH 2/7] consolidate per-arch stack overflow debugging options

2012-12-14 Thread Dave Hansen
the actual menu option. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- linux-2.6.git-dave/arch/blackfin/Kconfig |1 + linux-2.6.git-dave/arch/blackfin/Kconfig.debug |7 --- linux-2.6.git-dave/arch/frv/Kconfig|1 + linux-2.6.git-dave/arch/frv/Kconfig.debug

[RFC][PATCH 5/7] consolidate runtime testing configs

2012-12-14 Thread Dave Hansen
. This menu should only be used for tests that do not have a more appropriate home. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- linux-2.6.git-dave/lib/Kconfig.debug | 151 ++- 1 file changed, 78 insertions(+), 73 deletions(-) diff -puN lib/Kconfig.debug

[RFC][PATCH 1/7] move debugfs to filesystems menu (fs/Kconfig)

2012-12-14 Thread Dave Hansen
, configfs, or /proc. Also, Debug filesystem sounds like a debugging option _for_ filesystems code, not a filesystem for debugging. We also never call it the debug filesystem. We always say debugfs, so reflect the fact that we _call_ it debugfs in the menu text. Signed-off-by: Dave Hansen d

[RFC][PATCH 6/7] consolidate compilation option configs

2012-12-14 Thread Dave Hansen
even though I'm actually moving the options on either side of it. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- linux-2.6.git-dave/lib/Kconfig.debug | 156 +-- 1 file changed, 80 insertions(+), 76 deletions(-) diff -puN lib/Kconfig.debug~consolidate

[RFC][PATCH 0/7] Put Kernel hacking Kconfig menu on a diet

2012-12-14 Thread Dave Hansen
I think the Kernel Hacking menu has gotten a bit out of hand. It is over 120 lines long on my system with everything enabled and options are scattered around it haphazardly. http://sr71.net/~dave/linux/kconfig-horror.png Let's try to introduce some sanity. -- To unsubscribe from this

[RFC][PATCH 7/7] group locking debugging options

2012-12-14 Thread Dave Hansen
There are quite a few of these, and we want to make sure that there is one-stop-shopping for lock debugging. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- linux-2.6.git-dave/lib/Kconfig.debug | 120 ++- 1 file changed, 62 insertions(+), 58 deletions

Re: [PATCH v3 15/44] metag: Huge TLB

2013-01-10 Thread Dave Hansen
On 01/10/2013 07:30 AM, James Hogan wrote: +pte_t *huge_pte_alloc(struct mm_struct *mm, + unsigned long addr, unsigned long sz) +{ + pgd_t *pgd; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + + pgd = pgd_offset(mm, addr); + pud = pud_offset(pgd,

Re: [RFC] Reproducible OOM with partial workaround

2013-01-10 Thread Dave Hansen
On 01/10/2013 01:58 PM, paul.sz...@sydney.edu.au wrote: I developed a workaround patch for this particular OOM demo, dropping filesystem caches when about to exhaust lowmem. However, subsequently I observed OOM when running many processes (as yet I do not have an easy-to-reproduce demo of

Re: [RFC] Reproducible OOM with partial workaround

2013-01-10 Thread Dave Hansen
On 01/10/2013 04:46 PM, paul.sz...@sydney.edu.au wrote: Your configuration has never worked. This isn't a regression ... ... does not mean that we expect it to work. Do you mean that CONFIG_HIGHMEM64G is deprecated, should not be used; that all development is for 64-bit only? My last 4GB

Re: [RFC] Reproducible OOM with partial workaround

2013-01-11 Thread Dave Hansen
On 01/10/2013 05:46 PM, paul.sz...@sydney.edu.au wrote: ... I don't believe 64GB of RAM has _ever_ been booted on a 32-bit kernel without either violating the ABI (3GB/1GB split) or doing something that never got merged upstream ... Sorry to be so contradictory: psz@como:~$ uname -a

[PATCH] consolidate per-arch stack overflow debugging options

2013-01-11 Thread Dave Hansen
a bunch of duplication and adds consistency across arches. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com Cc: Mike Frysinger vap...@gentoo.org Cc: David Howells dhowe...@redhat.com Cc: Hirokazu Takata tak...@linux-m32r.org Cc: Ralf Baechle r...@linux-mips.org Cc: Koichi Yasutake yasutake.koi

Re: [RFC] Reproducible OOM with just a few sleeps

2013-01-14 Thread Dave Hansen
On 01/11/2013 07:31 PM, paul.sz...@sydney.edu.au wrote: Seems that any i386 PAE machine will go OOM just by running a few processes. To reproduce: sh -c 'n=0; while [ $n -lt 1 ]; do sleep 600 & ((n=n+1)); done' My machine has 64GB RAM. With previous OOM episodes, it seemed that running

Re: [PATCH] consolidate per-arch stack overflow debugging options

2013-01-14 Thread Dave Hansen
On 01/11/2013 05:00 PM, Stephen Rothwell wrote: On Fri, 11 Jan 2013 09:00:43 -0800 Dave Hansen d...@linux.vnet.ibm.com wrote: I'm looking for some Acked-bys on this from the various arch maintainers that it affects. I'd like to send it up to Linus in the next merge window. This is part

Re: [RFC] Reproducible OOM with just a few sleeps

2013-01-14 Thread Dave Hansen
On 01/14/2013 12:36 PM, paul.sz...@sydney.edu.au wrote: I understand that more RAM leaves less lowmem. What is unacceptable is that PAE crashes or freezes with OOM: it should gracefully handle the issue. Noting that (for a machine with 4GB or under) PAE fails where the HIGHMEM4G kernel

Re: [PATCHv2 8/9] zswap: add to mm/

2013-01-08 Thread Dave Hansen
On 01/07/2013 12:24 PM, Seth Jennings wrote: +struct zswap_tree { + struct rb_root rbroot; + struct list_head lru; + spinlock_t lock; + struct zs_pool *pool; +}; BTW, I spent some time trying to get this lock contended. You thought the anon_vma locks would dominate and this

[RFCv3][PATCH 2/3] fix kvm's use of __pa() on percpu areas

2013-01-09 Thread Dave Hansen
for the page fault (it was injected by the host), assumed that the kernel had taken a _real_ page fault, and panic()'d. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- linux-2.6.git-dave/arch/x86/kernel/kvm.c |9 + linux-2.6.git-dave/arch/x86/kernel/kvmclock.c |4 ++-- 2 files

[RFCv3][PATCH 1/3] create slow_virt_to_phys()

2013-01-09 Thread Dave Hansen
(), which walks the kernel page tables on x86 and should do precisely the same logical thing as __pa(), but actually work on a wider range of memory. It should work on the normal linear mapping, vmalloc(), kmap(), etc... Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- linux-2.6.git-dave/arch

[RFCv3][PATCH 3/3] make DEBUG_VIRTUAL work earlier in boot

2013-01-09 Thread Dave Hansen
The KVM code has some repeated bugs in it around use of __pa() on per-cpu data. Those data are not in an area on which __pa() is valid. However, they are also called early enough in boot that __vmalloc_start_set is not set, and thus the CONFIG_DEBUG_VIRTUAL debugging does not catch them. This

Re: [PATCH 0/16] Pid namespaces

2007-07-06 Thread Dave Hansen
On Fri, 2007-07-06 at 12:01 +0400, Pavel Emelianov wrote: This is submition for inclusion of hierarchical, not kconfig configurable, zero overheaded ;) pid namespaces. Pavel, I'm a bit disappointed that you went ahead and sent this. I thought that, perhaps, you might have brought up how

Re: [-mm PATCH 1/8] Memory controller resource counters (v2)

2007-07-06 Thread Dave Hansen
On Thu, 2007-07-05 at 22:20 -0700, Balbir Singh wrote: +/* + * the core object. the container that wishes to account for some + * resource may include this counter into its structures and use + * the helpers described beyond + */ I'm going to nitpick a bit here. Nothing major, I promise. ;)

Re: [-mm PATCH 2/8] Memory controller containers setup (v2)

2007-07-06 Thread Dave Hansen
On Thu, 2007-07-05 at 22:21 -0700, Balbir Singh wrote: +struct mem_container { + struct container_subsys_state css; + /* + * the counter to account for memory usage + */ + struct res_counter res; +}; How about we call it memory_usage? That would kill two birds with

Re: [-mm PATCH 1/8] Memory controller resource counters (v2)

2007-07-06 Thread Dave Hansen
On Fri, 2007-07-06 at 14:03 -0700, Balbir Singh wrote: +ssize_t res_counter_read(struct res_counter *cnt, int member, +const char __user *userbuf, size_t nbytes, loff_t *pos) +{ +unsigned long *val; +char buf[64], *s; + +s = buf; +val =

Re: RFC: CONFIG_PAGE_SHIFT (aka software PAGE_SIZE)

2007-07-06 Thread Dave Hansen
On Sat, 2007-07-07 at 00:26 +0200, Andrea Arcangeli wrote: for the hack week at opensuse (see http://idea.opensuse.org/) I've been working on a new feature called CONFIG_PAGE_SHIFT. ... If you want to help/look here the patch:

Re: [-mm PATCH 1/8] Memory controller resource counters (v2)

2007-07-09 Thread Dave Hansen
On Mon, 2007-07-09 at 11:16 +0400, Pavel Emelianov wrote: Dave Hansen wrote: On Thu, 2007-07-05 at 22:20 -0700, Balbir Singh wrote: +/* + * the core object. the container that wishes to account for some + * resource may include this counter into its structures and use + * the helpers

Re: [PATCH 0/16] Pid namespaces

2007-07-09 Thread Dave Hansen
On Mon, 2007-07-09 at 09:58 +0400, Pavel Emelianov wrote: Dave Hansen wrote: On Fri, 2007-07-06 at 12:01 +0400, Pavel Emelianov wrote: This is submition for inclusion of hierarchical, not kconfig configurable, zero overheaded ;) pid namespaces. Pavel, I'm a bit disappointed that you

[PATCH 00/23] Mount writer count API (read-only bind mounts prep)

2007-07-11 Thread Dave Hansen
operations on the three directories, including ones that are expected to fail, like creating a file on the r/o mount. Signed-off-by: Dave Hansen [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info

[PATCH 01/23] rearrange may_open() to be r/o friendly

2007-07-11 Thread Dave Hansen
may_open() calls vfs_permission() before it does checks for IS_RDONLY(inode). It checks _again_ inside of vfs_permission(). The check inside of vfs_permission() is going away eventually. With the mnt_want/drop_write() functions, all of the r/o checks (except for this one) are consistently done

[PATCH 02/23] create cleanup helper svc_msnfs()

2007-07-11 Thread Dave Hansen
I'm going to be modifying nfsd_rename() shortly to support read-only bind mounts. This #ifdef is around the area I'm patching, and it starts to get really ugly if I just try to add my new code by itself. Using this little helper makes things a lot cleaner to use. Signed-off-by: Dave Hansen

[PATCH 04/23] r/o bind mounts: stub functions

2007-07-11 Thread Dave Hansen
. When that is complete, we can actually introduce code that will safely check the counts before allowing r/w-r/o transitions to occur. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/namespace.c| 46 + lxc-dave/include/linux/mount.h
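The idea in this series (take a write reference around every operation that writes, then refuse an r/w to r/o transition while any reference is held) can be sketched in user space. This is only a toy model under stated assumptions: struct toy_mount and the toy_* functions are hypothetical names for illustration, not the kernel's vfsmount or the actual mnt_want/drop_write() code, and it ignores locking and per-cpu counters entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a mount; NOT the kernel's struct vfsmount. */
struct toy_mount {
	bool readonly;
	int writers;	/* write operations currently in flight */
};

/* Take a write reference. Fails if the mount is already read-only. */
int toy_want_write(struct toy_mount *m)
{
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

/* Drop a write reference. Every successful want must be balanced. */
void toy_drop_write(struct toy_mount *m)
{
	assert(m->writers > 0);
	m->writers--;
}

/* An r/w to r/o transition is only allowed once no writers remain. */
int toy_remount_ro(struct toy_mount *m)
{
	if (m->writers > 0)
		return -EBUSY;
	m->readonly = true;
	return 0;
}
```

A caller would bracket each write with toy_want_write()/toy_drop_write(); a remount attempted in between gets -EBUSY, and any write attempted after a successful remount gets -EROFS.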

[PATCH 09/23] elevate mnt writers for callers of vfs_mkdir()

2007-07-11 Thread Dave Hansen
Pretty self-explanatory. Fits in with the rest of the series. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/namei.c|5 + lxc-dave/fs/nfsd/nfs4recover.c |4 2 files changed, 9 insertions(+) diff -puN fs/namei.c~elevate-mnt-writers-for-callers-of-vfs

[PATCH 12/23] elevate mount count for extended attributes

2007-07-11 Thread Dave Hansen
This basically audits the callers of xattr_permission(), which calls permission() and can perform writes to the filesystem. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/nfsd/nfs4proc.c |7 ++- lxc-dave/fs/xattr.c | 16 ++-- 2 files changed, 20

[PATCH 11/23] elevate write count for link and symlink calls

2007-07-11 Thread Dave Hansen
Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/namei.c | 10 ++ 1 file changed, 10 insertions(+) diff -puN fs/namei.c~elevate-write-count-for-link-and-symlink-calls fs/namei.c --- lxc/fs/namei.c~elevate-write-count-for-link-and-symlink-calls 2007-07-10 12:46

[PATCH 13/23] elevate write count for file_update_time()

2007-07-11 Thread Dave Hansen
Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/inode.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff -puN fs/inode.c~elevate-write-count-for-file_update_time fs/inode.c --- lxc/fs/inode.c~elevate-write-count-for-file_update_time 2007-07-10 12:46

[PATCH 15/23] unix_find_other() elevate write count for touch_atime()

2007-07-11 Thread Dave Hansen
Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/net/unix/af_unix.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff -puN net/unix/af_unix.c~unix-find-other-elevate-write-count-for-touch-atime net/unix/af_unix.c --- lxc/net/unix/af_unix.c~unix-find

[PATCH 14/23] mount_is_safe(): add comment

2007-07-11 Thread Dave Hansen
This area of code is currently #ifdef'd out, so add a comment for the time when it is actually used. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/namespace.c |4 1 file changed, 4 insertions(+) diff -puN fs/namespace.c~mount-is-safe-add-comment fs/namespace.c --- lxc

[PATCH 16/23] elevate write count over calls to vfs_rename()

2007-07-11 Thread Dave Hansen
This also uses the little helper in the NFS code to make an if() a little bit less ugly. We introduced the helper at the beginning of the series. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/namei.c|4 lxc-dave/fs/nfsd/vfs.c | 15 +++ 2 files changed

[PATCH 21/23] sys_mknodat(): elevate write count for vfs_mknod/create()

2007-07-11 Thread Dave Hansen
outside of the switch. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/namei.c | 32 +--- lxc-dave/fs/nfsd/vfs.c |4 lxc-dave/net/unix/af_unix.c |4 3 files changed, 29 insertions(+), 11 deletions(-) diff -puN fs/namei.c

[PATCH 23/23] do_rmdir(): elevate write count

2007-07-11 Thread Dave Hansen
Elevate the write count during the vfs_rmdir() call. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/namei.c |5 + 1 file changed, 5 insertions(+) diff -puN fs/namei.c~do-rmdir-elevate-write-count fs/namei.c --- lxc/fs/namei.c~do-rmdir-elevate-write-count 2007-07-10 12:46

[PATCH 20/23] elevate write count for do_sys_utime() and touch_atime()

2007-07-11 Thread Dave Hansen
Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/inode.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff -puN fs/inode.c~elevate-write-count-for-do-sys-utime-and-touch-atime fs/inode.c --- lxc/fs/inode.c~elevate-write-count-for-do-sys-utime

[PATCH 18/23] elevate writer count for do_sys_truncate()

2007-07-11 Thread Dave Hansen
Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/open.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff -puN fs/open.c~elevate-writer-count-for-do-sys-truncate fs/open.c --- lxc/fs/open.c~elevate-writer-count-for-do-sys-truncate 2007-07-10 12:46

[PATCH 22/23] elevate mnt writers for vfs_unlink() callers

2007-07-11 Thread Dave Hansen
Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/namei.c |4 lxc-dave/ipc/mqueue.c |5 - 2 files changed, 8 insertions(+), 1 deletion(-) diff -puN fs/namei.c~elevate-mnt-writers-for-vfs-unlink-callers fs/namei.c --- lxc/fs/namei.c~elevate-mnt-writers-for-vfs

[PATCH 19/23] elevate write count for do_utimes()

2007-07-11 Thread Dave Hansen
Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/utimes.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff -puN fs/utimes.c~elevate-write-count-for-do-utimes fs/utimes.c --- lxc/fs/utimes.c~elevate-write-count-for-do-utimes 2007-07-10 12:46

[PATCH 17/23] nfs: check mnt instead of superblock directly

2007-07-11 Thread Dave Hansen
two are probably unnecessary and duplicate existing checks in the VFS. This won't make them better checks than before, but it will make them detect r/o mounts. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/nfs/dir.c |3 ++- lxc-dave/fs/nfsd/vfs.c |4 ++-- 2 files changed, 4

[PATCH 10/23] elevate write count during entire ncp_ioctl()

2007-07-11 Thread Dave Hansen
Some ioctls need write access, but others don't. Make a helper function to decide when write access is needed, and take it. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/ncpfs/ioctl.c | 55 +- 1 file changed, 54 insertions(+), 1

[PATCH 08/23] make access() use mnt check

2007-07-11 Thread Dave Hansen
It is OK to let access() go without using a mnt_want/drop_write() pair because it doesn't actually do writes to the filesystem, and it is inherently racy anyway. This is a rare case when it is OK to use __mnt_is_readonly() directly. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs

[PATCH 07/23] elevate writer count for chown and friends

2007-07-11 Thread Dave Hansen
chown/chmod,etc... don't call permission in the same way that the normal open for write calls do. They still write to the filesystem, so bump the write count during these operations. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/open.c | 39

[PATCH 06/23] r/o bind mounts: elevate write count for some ioctls

2007-07-11 Thread Dave Hansen
-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/ext2/ioctl.c | 46 +- lxc-dave/fs/ext3/ioctl.c | 100 +--- lxc-dave/fs/ext4/ioctl.c | 105 +- lxc-dave/fs/fat/file.c

[PATCH 03/23] filesystem helpers for custom 'struct file's

2007-07-11 Thread Dave Hansen
a unified place which the r/o bind mount code may patch. Also, rename an existing, static-scope init_file() to a less generic name. Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- lxc-dave/fs/configfs/dir.c|5 +++-- lxc-dave/fs/file_table.c | 34

[PATCH 05/23] elevate write count open()'d files

2007-07-11 Thread Dave Hansen
file, while the vfsmount is ro. That is bad. Some filesystems forego the use of normal vfs calls to create struct files. Make sure that these users elevate the mnt writer count because they will get __fput(), and we need to make sure they're balanced. Signed-off-by: Dave Hansen [EMAIL PROTECTED

Re: RFC: CONFIG_PAGE_SHIFT (aka software PAGE_SIZE)

2007-07-12 Thread Dave Hansen
On Thu, 2007-07-12 at 18:31 +0200, Andrea Arcangeli wrote: On Fri, Jul 13, 2007 at 12:44:49AM +1000, David Chinner wrote: That's crap. Just because a machine has lots of memory does not make it OK to waste lots of memory. It's not just wasted, it lowers overhead all over the place. Yes,

Re: [RFC/PATCH] Documentation of kernel messages

2007-06-13 Thread Dave Hansen
On Wed, 2007-06-13 at 17:06 +0200, holzheu wrote: The operation of a Linux system sometimes requires to decode the meaning of a specific kernel message, e.g. an error message of a driver. Especially on our mainframe zSeries platform system administrators want to have descriptions for Linux

Re: [RFC/PATCH] Documentation of kernel messages

2007-06-13 Thread Dave Hansen
On Wed, 2007-06-13 at 11:32 -0700, Greg KH wrote: dev_printk() and friends are great, since they already define something like KMSG_COMPONENT: The driver name. They provide way more than that, they also provide the explicit device that is being discussed, as well as other things

Re: [PATCH] update checkpatch.pl to version 0.06

2007-06-22 Thread Dave Hansen
Andy and Joel, very cool that you got this in-tree! I have a patch touching a bunch of fs ioctl functions. Things like ext2_ioctl() look like this: foo_ioctl() { switch(ioctl) { case FOO: lots of code error:

Re: [PATCH] update checkpatch.pl to version 0.06

2007-06-22 Thread Dave Hansen
On Fri, 2007-06-22 at 12:54 -0500, Joel Schopp wrote: If it is kinda like a mini function why not make it actually a mini function and call it? Several of our on-disk filesystems have an ioctl function that already has indented goto labels. I don't think it's quite worth churning all of these

Re: checkpointing and restoring processes

2007-06-06 Thread Dave Hansen
On Wed, 2007-06-06 at 13:37 +0200, Mark Pflueger wrote: hi everyone! i'm not subscribed to the list, so if you care to flame because of my noob question, just do it to the list, otherwise please cc me. i'm trying to write a checkpoint/restore module for processes and so have a basic

Re: [RFC] [Patch 1/4] statistics: no include hell for users

2007-06-06 Thread Dave Hansen
On Wed, 2007-06-06 at 23:33 +0200, Martin Peschke wrote: struct statistic_interface { /* private: */ struct list_head list; - struct dentry *debugfs_dir; - struct dentry *data_file; - struct dentry *def_file; + void

Re: [patch] i386, numaq: enable TSCs again

2007-05-25 Thread Dave Hansen
On Fri, 2007-05-25 at 10:41 +0200, Ingo Molnar wrote: * William Lee Irwin III [EMAIL PROTECTED] wrote: yes, that's what i meant under 'slightly async'. Some AMD CPUs are like that too and sched_clock() now handles that fine. So we should try my patch. Sorry, then. I took slight to

Re: Containers: css_put() dilemma

2007-07-17 Thread Dave Hansen
On Tue, 2007-07-17 at 08:49 -0700, Paul (宝瑠) Menage wrote: Because as soon as you do the atomic_dec_and_test() on css->refcnt and the refcnt hits zero, then theoretically some other thread (that already holds container_mutex) could check that the refcount is zero and free the container

Re: Read-only bind mount patches

2007-05-14 Thread Dave Hansen
On Mon, 2007-05-14 at 21:22 +0800, Ian Kent wrote: I've sent mail to Dave but he must be to busy, on leave or not keen on getting these patches included any more as I've not had a response. I'm just now catching up on email. I was out for a couple of weeks. I'm planning on refreshing the

Re: Read-only bind mount patches

2007-05-14 Thread Dave Hansen
On Mon, 2007-05-14 at 23:55 +0800, Ian Kent wrote: Anything I can do to help? If so maybe we could reduce the time to posting a bit. Probably nothing to actually speed up the development, but I'd really appreciate some testing once I do post them. How about I send you an advance copy, or cc

Re: [PATCH resend] drop_caches: add some documentation and info message

2013-08-02 Thread Dave Hansen
On 08/02/2013 09:04 AM, Rob Landley wrote: I'd be surprised if anybody who does this sees the printk and thinks hey, I'll dig into the VM's balancing logic and come up to speed on the tradeoffs sufficient to contribute to kernel development because of something in dmesg. Anybody actually

Re: [RFC PATCH 3/4] mm: add zbud flag to page flags

2013-08-06 Thread Dave Hansen
On 08/05/2013 11:42 PM, Krzysztof Kozlowski wrote: +#ifdef CONFIG_ZBUD + /* Allocated by zbud. Flag is necessary to find zbud pages to unuse + * during migration/compaction. + */ + PG_zbud, +#endif Do you _really_ need an absolutely new, unshared page flag? The zbud code

Re: [PATCH 19/23] truncate: support huge pages

2013-08-06 Thread Dave Hansen
On 08/03/2013 07:17 PM, Kirill A. Shutemov wrote: + if (PageTransTailCache(page)) { + /* part of already handled huge page */ + if (!page->mapping) + continue; +

Re: [PATCH 19/23] truncate: support huge pages

2013-08-06 Thread Dave Hansen
On 08/03/2013 07:17 PM, Kirill A. Shutemov wrote: If a huge page is only partly in the range we zero out the part, exactly like we do for partial small pages. What's the logic behind this behaviour? Seems like the kind of place that we would really want to be splitting pages. + if

Re: [RFC 0/3] Add madvise(..., MADV_WILLWRITE)

2013-08-07 Thread Dave Hansen
On 08/07/2013 06:40 AM, Jan Kara wrote: One question before I look at the patches: Why don't you use fallocate() in your application? The functionality you require seems to be pretty similar to it - writing to an already allocated block is usually quick. One problem I've seen is that it

Re: [PATCH 2/2] Drivers: hv: balloon: Online the hot-added memory in context

2013-07-24 Thread Dave Hansen
On 07/24/2013 02:29 PM, K. Y. Srinivasan wrote: /* - * Wait for the memory block to be onlined. - * Since the hot add has succeeded, it is ok to - * proceed even if the pages in the hot added region - * have not been onlined

Re: [PATCH 1/1] Drivers: base: memory: Export symbols for onlining memory blocks

2013-07-25 Thread Dave Hansen
On 07/25/2013 04:14 AM, KY Srinivasan wrote: As promised, I have sent out the patches for (a) an implementation of an in-kernel API for onlining and a consumer for this API. While I don't know the exact reason why the user mode code is delayed (under some low memory conditions), what is

Re: [PATCH 1/1] Drivers: base: memory: Export symbols for onlining memory blocks

2013-07-25 Thread Dave Hansen
On 07/25/2013 08:15 AM, Kay Sievers wrote: Complexity, well, it's just a bit of code which belongs in the kernel. The mentioned unconditional hotplug loop through userspace is absolutely pointless. Such defaults never belong in userspace tools if they do not involve data that is only available

Re: [RFC PATCH 1/2] vmsplice unmap gifted pages for recipient

2013-07-25 Thread Dave Hansen
On 07/25/2013 10:21 AM, Robert Jennings wrote: +static void zap_buf_page(unsigned long useraddr) +{ + struct vm_area_struct *vma; + + down_read(&current->mm->mmap_sem); + vma = find_vma_intersection(current->mm, useraddr, + useraddr + PAGE_SIZE); + if

SATA hotplug not detecting new disks

2013-07-25 Thread Dave Hansen
I've got a relatively new system that doesn't seem to be able to hotplug SATA disks. I see the same behavior on 3.10, 3.11-rc2, and Ubuntu's 3.8.0-25-generic. The disks are detected right away on reboots, but even after poking the /sys/class/scsi_host/host*/scan files, new disks are never

Re: [RFC PATCH 1/2] vmsplice unmap gifted pages for recipient

2013-07-26 Thread Dave Hansen
On 07/26/2013 08:16 AM, Robert Jennings wrote: + if ((spd->flags & SPLICE_F_MOVE) && + !buf->offset && (buf->len == PAGE_SIZE)) + /* Can move page aligned buf */ +

Re: SATA hotplug not detecting new disks

2013-07-26 Thread Dave Hansen
On 07/25/2013 06:51 PM, Aaron Lu wrote: On 07/26/2013 07:15 AM, Dave Hansen wrote: I've got a relatively new system that doesn't seem to be able to hotplug SATA disks. I see the same behavior on 3.10, 3.11-rc2, and Ubuntu's 3.8.0-25-generic. The disks are detected right away on reboots

Re: SATA hotplug not detecting new disks

2013-07-26 Thread Dave Hansen
On 07/25/2013 06:51 PM, Aaron Lu wrote: On 07/26/2013 07:15 AM, Dave Hansen wrote: I've got a relatively new system that doesn't seem to be able to hotplug SATA disks. I see the same behavior on 3.10, 3.11-rc2, and Ubuntu's 3.8.0-25-generic. The disks are detected right away on reboots

[PATCH] checkpatch: enforce sane perl version

2013-07-29 Thread Dave Hansen
From: Dave Hansen dave.han...@linux.intel.com I got a bug report from a couple of users who said checkpatch.pl was broken for them. It was erroring out on fairly random lines most commonly with messages like: Nested quantifiers in regex; marked by <--HERE in m/(\((?:[^\(\)]++ <-- HERE

Re: [PATCH 2/2] mm: add overcommit_kbytes sysctl variable

2013-08-19 Thread Dave Hansen
On 08/19/2013 08:17 AM, Jerome Marchand wrote: Some applications that run on HPC clusters are designed around the availability of RAM and the overcommit ratio is fine tuned to get the maximum usage of memory without swapping. With growing memory, the 1%-of-all-RAM grain provided by

Re: [BUG REPORT]kernel panic with kmemcheck config

2013-08-20 Thread Dave Hansen
On 08/19/2013 07:44 PM, Libin wrote: When kmemcheck kernel support configured, we encountered random kernel panic (sometimes can be booted) during system boot process in our environment. I have tested the mainline kernel version from v3.0 to v3.11-rc6, they also have this problem. And the

Re: [Ksummit-2013-discuss] [ATTEND] oops.kernel.org prospect

2013-08-20 Thread Dave Hansen
On 08/19/2013 02:25 PM, Dave Jones wrote: * This bug last seen: 2013-08-17 Also useful here would be something like: Seen on: 3.2-rc2, 3.10-rc10 (You can probably just list earliest/latest rather than every single kernel it's been seen on, unless you want a 'show all' button) Once

Re: [BUG REPORT]kernel panic with kmemcheck config

2013-08-20 Thread Dave Hansen
On 08/20/2013 07:45 PM, Libin wrote: [3.158023] [ cut here ] [3.162626] WARNING: CPU: 0 PID: 1 at arch/x86/mm/kmemcheck/kmemcheck.c:634 kmemcheck_fault+0xb1/0xc0() ... [3.314877] [81046aa7] ? kmemcheck_trap+0x17/0x30 [3.320507] EOE #DB

Re: [PATCH 2/2] mm: add overcommit_kbytes sysctl variable

2013-08-21 Thread Dave Hansen
On 08/21/2013 08:22 AM, Jerome Marchand wrote: Instead of introducing yet another tunable, why don't we just make the ratio that comes in from the user more fine-grained? sysctl overcommit_ratio=0.2 We change the internal 'sysctl_overcommit_ratio' to store tenths or hundreths of
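To make the granularity argument concrete, here is a minimal sketch of a ratio stored in hundredths of a percent rather than whole percent. Everything in it is an assumption for illustration: toy_commit_limit_kb() and TOY_RATIO_SCALE are hypothetical names, and the real limit calculation also accounts for swap and hugetlb pages, which this ignores.

```c
#include <stdint.h>

/* Hypothetical scale: the ratio is in hundredths of a percent (10000 == 100%). */
#define TOY_RATIO_SCALE 10000ULL

/*
 * RAM-only share of the commit limit, in kB. The division is split so the
 * intermediate product cannot overflow for realistic memory sizes.
 */
uint64_t toy_commit_limit_kb(uint64_t ram_kb, uint64_t ratio_hundredths)
{
	uint64_t whole = (ram_kb / TOY_RATIO_SCALE) * ratio_hundredths;
	uint64_t rest = (ram_kb % TOY_RATIO_SCALE) * ratio_hundredths / TOY_RATIO_SCALE;

	return whole + rest;
}
```

On a 1 TiB machine (1073741824 kB), one whole-percent step (ratio 100) is 10737418 kB, roughly 10 GiB, while one step of the finer unit (ratio 1) is 107374 kB, roughly 105 MiB.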

Re: [BUG REPORT]kernel panic with kmemcheck config

2013-08-22 Thread Dave Hansen
On 08/21/2013 08:58 PM, Libin wrote: I test it on IBM System x3850 X5 platform, and also trigger oops in boot process. But if don't config the kmemcheck, it can boot up normally. Hardware information and oops information as following: [0.205976] BUG: unable to handle kernel paging

Re: [3.10-rc1 PATCH] devtmpfs: Fix kmemcheck warning.

2013-08-22 Thread Dave Hansen
cc'ing Al and Kay who have the most commits in devtmpfs... On 05/14/2013 05:02 AM, Tetsuo Handa wrote: I got below warning. WARNING: kmemcheck: Caught 8-bit read from uninitialized memory (88007ae384d8) d884e37a0088006f665f64657669 i i i i i

page fault scalability (ext3, ext4, xfs)

2013-08-14 Thread Dave Hansen
We talked a little about this issue in this thread: http://marc.info/?l=linux-mm&m=137573185419275&w=2 but I figured I'd follow up with a full comparison. ext4 is about 20% slower in handling write page faults than ext3. xfs is about 30% slower than ext3. I'm running on an 8-socket /

Re: [RFC][PATCH] drivers: base: dynamic memory block creation

2013-08-14 Thread Dave Hansen
On 08/14/2013 12:43 PM, Greg Kroah-Hartman wrote: On Wed, Aug 14, 2013 at 02:31:45PM -0500, Seth Jennings wrote: ppc64 has a normal memory block size of 256M (however sometimes as low as 16M depending on the system LMB size), and (I think) x86 is 128M. With 1TB of RAM and a 256M block size,
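For scale, the number of memory blocks implied by the sizes quoted above is plain division. A minimal sketch, where toy_memory_block_count() is a hypothetical helper and not anything in drivers/base/memory.c:

```c
#include <stdint.h>

/* Number of fixed-size memory blocks needed to cover ram_bytes (rounds up). */
uint64_t toy_memory_block_count(uint64_t ram_bytes, uint64_t block_bytes)
{
	if (block_bytes == 0)
		return 0;	/* no valid block size, nothing to count */
	return (ram_bytes + block_bytes - 1) / block_bytes;
}
```

With 1 TiB of RAM this gives 4096 blocks at 256 MiB, 8192 at 128 MiB, and 65536 at 16 MiB.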

Re: [RFC][PATCH] drivers: base: dynamic memory block creation

2013-08-14 Thread Dave Hansen
On 08/14/2013 12:31 PM, Seth Jennings wrote: There was a significant amount of refactoring to allow for this but IMHO, the code is much easier to understand now. ... drivers/base/memory.c | 248 + include/linux/memory.h | 1 - 2 files

Re: page fault scalability (ext3, ext4, xfs)

2013-08-14 Thread Dave Hansen
On 08/14/2013 12:43 PM, Theodore Ts'o wrote: Thanks dave for doing this comparison. Is there any chance you can check whether lockstats shows anything interesting? Test case is this: https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault3.c One interesting

Re: [RFC][PATCH] drivers: base: dynamic memory block creation

2013-08-14 Thread Dave Hansen
On 08/14/2013 02:14 PM, Seth Jennings wrote: On Wed, Aug 14, 2013 at 01:47:27PM -0700, Dave Hansen wrote: On 08/14/2013 12:31 PM, Seth Jennings wrote: +static unsigned long *memblock_present; +static bool largememory_enable __read_mostly; How would you see this getting used in practice

Re: [RFC][PATCH] drivers: base: dynamic memory block creation

2013-08-14 Thread Dave Hansen
On 08/14/2013 02:37 PM, Cody P Schafer wrote: Also, I'd expect userspace tools might use readdir() to find out what memory blocks a system has (unless they just stat(memory0), stat(memory1)...). I don't think filesystem tricks (at least within sysfs) are going to let this magically be solved

Re: page fault scalability (ext3, ext4, xfs)

2013-08-15 Thread Dave Hansen
On 08/14/2013 05:24 PM, Dave Chinner wrote: On Wed, Aug 14, 2013 at 10:10:07AM -0700, Dave Hansen wrote: We talked a little about this issue in this thread: http://marc.info/?l=linux-mm&m=137573185419275&w=2 but I figured I'd follow up with a full comparison. ext4 is about 20% slower

Re: page fault scalability (ext3, ext4, xfs)

2013-08-15 Thread Dave Hansen
On 08/14/2013 06:11 PM, Theodore Ts'o wrote: The point is that if the goal is to measure page fault scalability, we shouldn't have this other stuff happening as the same time as the page fault workload. will-it-scale does several different tests probing at different parts of the fault path:

Re: page fault scalability (ext3, ext4, xfs)

2013-08-15 Thread Dave Hansen
On 08/14/2013 09:29 PM, Dave Chinner wrote: On Wed, Aug 14, 2013 at 07:24:01PM -0700, Andi Kleen wrote: And FWIW, it's no secret that XFS has more per-operation overhead than ext4 through the write path when it comes to allocation, so it's no surprise that on a workload that is highly

Re: page fault scalability (ext3, ext4, xfs)

2013-08-15 Thread Dave Hansen
On 08/15/2013 08:05 AM, Theodore Ts'o wrote: IOW, if it really is about write page fault handling, the simplest test to do is to mmap /dev/zero and then start dirtying pages. At that point we will be measuring the VM level write page fault code. As I mentioned in some of the other replies,
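The test suggested in the quoted text (mmap /dev/zero, then dirty pages) might look roughly like the sketch below. It is not the will-it-scale test case and does no timing; toy_dirty_zero_pages() is a hypothetical name, and the choice of a MAP_PRIVATE mapping is an assumption, since the quoted text does not say which mapping type was meant.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map npages of /dev/zero and write one byte to each page, so that every
 * page is written to for the first time exactly once.
 * Returns the number of pages dirtied, or -1 on error.
 */
long toy_dirty_zero_pages(size_t npages)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile char *p;
	void *map;
	size_t len, i;
	int fd;

	if (page <= 0 || npages == 0)
		return -1;
	len = npages * (size_t)page;

	fd = open("/dev/zero", O_RDWR);
	if (fd < 0)
		return -1;
	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping stays valid after the descriptor is closed */
	if (map == MAP_FAILED)
		return -1;

	p = map;
	for (i = 0; i < npages; i++)
		p[i * (size_t)page] = 1;	/* first write to this page */

	munmap(map, len);
	return (long)npages;
}
```

Wrapping the loop in a timer and running one copy per CPU would turn this into a crude scalability probe, but that part is left out here.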

Re: [PATCH 1/4] mm/pgtable: Fix continue to preallocate pmds even if failure occurrence

2013-08-15 Thread Dave Hansen
) { - free_pmds(pmds); - return -ENOMEM; - } - return 0; +err: + free_pmds(pmds); + return -ENOMEM; } I don't have a problem with what you have, though. It's better than what was there, so: Reviewed-by: Dave Hansen dave.han...@linux.intel.com

Re: [PATCH 2/4] mm/sparse: introduce alloc_usemap_and_memmap

2013-08-15 Thread Dave Hansen
On 08/14/2013 05:31 PM, Wanpeng Li wrote: After commit 9bdac91424075(sparsemem: Put mem map for one node together.), vmemmap for one node will be allocated together, its logic is similiar as memory allocation for pageblock flags. This patch introduce alloc_usemap_and_memmap to extract the

Re: [PATCH 4/4] mm/vmalloc: use wrapper function get_vm_area_size to caculate size of vm area

2013-08-15 Thread Dave Hansen
On 08/14/2013 05:31 PM, Wanpeng Li wrote: diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 93d3182..553368c 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -1553,7 +1553,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, unsigned int nr_pages, array_size, i;

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-16 Thread Dave Hansen
Hey Nathan, Could you post your boot timing patches? My machines are much smaller than yours, but I'm curious how things behave here as well. I did some very imprecise timings (strace -t on a telnet attached to the serial console). The 'struct page' initializations take about a minute of boot

Re: [PATCH 1/8] THP: Use real address for NUMA policy

2013-08-16 Thread Dave Hansen
On 08/16/2013 07:33 AM, Alex Thorlton wrote: --- mm/huge_memory.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index a92012a..55ec681 100644 Could you add some actual descriptions to these patches that say why you are doing

Re: [PATCH 5/8] mm: make clear_huge_page cache clear only around the fault address

2013-08-16 Thread Dave Hansen
On 08/16/2013 07:34 AM, Alex Thorlton wrote: +#if ARCH_HAS_USER_NOCACHE == 0 +#define clear_user_highpage_nocache clear_user_highpage +#endif ... cond_resched(); - clear_user_highpage(p, addr + i * PAGE_SIZE); + vaddr = haddr + i*PAGE_SIZE; +
