from:"Pekka Enberg"

Re: GFS

2005-08-09 Thread Pekka Enberg

On Mon, 2005-08-08 at 11:32 -0700, Zach Brown wrote:
  Sorry if this is an obvious question but what prevents another thread
  from doing mmap() before we do the second walk and messing up num_gh?
 
 Nothing, I suspect.  OCFS2 has a problem like this, too.  It wants a way
 for a file system to serialize mmap/munmap/mremap during file IO.  Well,
 more specifically, it wants to make sure that the locks it acquired at
 the start of the IO really cover the buf regions that might fault during
 the IO.. mapping activity during the IO can wreck that.

In addition, the vma walk will become an unmaintainable mess as soon as
someone introduces another mmap() capable fs that needs similar locking.

I am not an expert so could someone please explain why this cannot be
done with a_ops->prepare_write and friends?

Pekka

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.19 file content corruption on ext3

2006-12-19 Thread Pekka Enberg


On 12/19/06, Andrew Morton [EMAIL PROTECTED] wrote:

Wow.  I didn't expect that, because Mark Haber reported that ext3's
data=writeback fixed it.   Maybe he didn't run it for long enough?


I don't think it did fix it for Mark:

http://marc.theaimsgroup.com/?l=linux-kernel&m=116625777306843&w=2

Re: Fix IPMI watchdog set_param_str() using kstrdup

2006-12-20 Thread Pekka Enberg


On 12/20/06, Sébastien Dugué [EMAIL PROTECTED] wrote:

  set_param_str() cannot use kstrdup() to duplicate the parameter. That's
fine when the driver is compiled as a module but it sure is not when
built into the kernel as the kernel parameters are parsed before the
kmalloc slabs are setup.


Aah. I wonder though, if we could defer parsing driver parameters
until kmalloc caches are up...

Re: mprotect abuse in slim

2007-01-10 Thread Pekka Enberg


On 1/10/07, Serge E. Hallyn [EMAIL PROTECTED] wrote:

But since it looks like you just munmap the region now, shouldn't a
subsequent munmap by the app just return -EINVAL?  that seems appropriate
to me.


Applications don't know about revoke and neither should they.
Therefore close(2) and munmap(2) must work the same way they would for
non-revoked inodes so that applications can release resources
properly.

Pekka

Re: [PATCH] bonding: Replace kmalloc() + memset() pairs with the appropriate kzalloc() calls

2007-01-11 Thread Pekka Enberg


Hi Joe,

On 1/12/07, joe jin [EMAIL PROTECTED] wrote:

@@ -788,7 +786,7 @@ static int rlb_initialize(struct bonding

spin_lock_init(&(bond_info->rx_hashtbl_lock));

-   new_hashtbl = kmalloc(size, GFP_KERNEL);
+   new_hashtbl = kzalloc(size, GFP_KERNEL);
if (!new_hashtbl) {
printk(KERN_ERR DRV_NAME
   ": %s: Error: Failed to allocate RLB hash table\n",


You forgot to remove the memset here.
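
For illustration, a minimal user-space sketch of why the memset() becomes redundant after such a conversion. It uses calloc() as a stand-in for kzalloc(); the helper names table_alloc_old() and table_alloc_new() are hypothetical and not taken from the bonding driver.

```c
#include <stdlib.h>
#include <string.h>

/* Before: allocate, then clear by hand (two steps that must stay in sync). */
static void *table_alloc_old(size_t size)
{
    void *p = malloc(size);

    if (!p)
        return NULL;
    memset(p, 0, size);
    return p;
}

/* After: the allocator hands back zeroed memory, so no memset() is needed. */
static void *table_alloc_new(size_t size)
{
    return calloc(1, size);
}
```

Both helpers return the same zero-filled buffer; keeping the memset() after switching allocators would only clear the memory a second time.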

Re: mprotect abuse in slim

2007-01-11 Thread Pekka Enberg


On 1/11/07, Serge E. Hallyn [EMAIL PROTECTED] wrote:

Right, but is returning -EINVAL to userspace on munmap a problem?


Yes, because an application has no way of reusing the revoked mapping
range. The current patch should get this right, though.

On 1/11/07, Serge E. Hallyn [EMAIL PROTECTED] wrote:

Thanks for the two other patches - I'll give them a shot and check
out current munmap behavior just as soon as I get a chance.


I hacked the remaining open issues yesterday so please use this instead:

http://www.cs.helsinki.fi/u/penberg/linux/revoke/revoke-2.6.20-rc4

The one at kernel.org will be updated as well when mirroring catches up.

Re: [PATCH] slab: cache_grow cleanup

2007-01-12 Thread Pekka Enberg


Hi Andrew,

On 1/9/07, Pekka J Enberg [EMAIL PROTECTED] wrote:

From: Pekka Enberg [EMAIL PROTECTED]

The current implementation of cache_grow() has to either (1) use pre-allocated
memory for the slab or (2) allocate the memory itself which makes the error
paths messy. Move __GFP_NO_GROW and __GFP_WAIT processing to kmem_getpages()
and introduce a new __cache_grow() variant that expects the memory for a new
slab to always be handed over in the 'objp' parameter.

Cc: Hugh Dickins [EMAIL PROTECTED]
Cc: Christoph Lameter [EMAIL PROTECTED]
Signed-off-by: Pekka Enberg [EMAIL PROTECTED]


Can we get this into -mm, please?

Re: mprotect abuse in slim

2007-01-12 Thread Pekka Enberg


On 1/10/07, Serge E. Hallyn [EMAIL PROTECTED] wrote:

Now, what slim needs isn't revoke all files for this inode,
but revoke this task's write access to this fd.  So two functions
which could be useful are

int fd_revoke_write(struct task_struct *tsk, int fd)
int fd_revoke_write_iter(struct task_struct *tsk,
(int *)need_revoke(struct task_struct *tsk, int fd))


This gets interesting. We probably need revokefs (which we use
internally as a substitute for revoke inodes) to be stacked on top of
the actual fs so that you can still read from the fd. But most of the
revocation is still the same, we need to watch out for fork(2) and
dup(2) and take down shared mappings.

Re: [PATCH 2/5] eCryptfs: convert kmap() to kmap_atomic()

2007-01-19 Thread Pekka Enberg


On 1/18/07, Michael Halcrow [EMAIL PROTECTED] wrote:

+   page_data = (char *)kmap_atomic(page, KM_USER0);
+   lower_page_data = (char *)kmap_atomic(lower_page, KM_USER1);


Drop 'em redundant casts, please.
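
For illustration, a small user-space sketch of why the casts are redundant: in C a void * converts to any object pointer type implicitly. The function map_buffer() is a hypothetical stand-in for a mapping call that returns void *, as kmap_atomic() does.

```c
/* Hypothetical stand-in for a function that returns void *. */
static void *map_buffer(char *backing)
{
    return backing;
}

static char first_byte(char *backing)
{
    /* No cast needed: void * converts to char * implicitly in C. */
    char *data = map_buffer(backing);

    return data[0];
}
```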

Re: can someone explain inline once and for all?

2007-01-19 Thread Pekka Enberg


On 1/19/07, Robert P. J. Day [EMAIL PROTECTED] wrote:

is there a simple explanation for how to *properly* define inline
routines in the kernel?  and maybe this can be added to the
CodingStyle guide (he mused, wistfully).


AFAIK __always_inline is the only reliable way to force inlining where
it matters for correctness (for example, when playing tricks with
__builtin_return_address like we do in the slab).

Anything else is just a hint to the compiler that might be ignored if
the optimizer thinks it knows better.
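
A minimal sketch of the difference, assuming GCC or Clang. The macro name my_always_inline is hypothetical; it wraps the compiler's always_inline attribute in the same way the kernel's __always_inline does.

```c
/* Assumes GCC or Clang; my_always_inline is a hypothetical local name. */
#define my_always_inline inline __attribute__((always_inline))

/* Plain inline: only a hint, the compiler may still emit a real call. */
static inline int add_hint(int a, int b)
{
    return a + b;
}

/* Forced: asks the compiler to inline regardless of its own heuristics. */
static my_always_inline int add_forced(int a, int b)
{
    return a + b;
}

static int sum3(int a, int b, int c)
{
    return add_forced(add_hint(a, b), c);
}
```

Both helpers compute the same result; the attribute only matters when correctness depends on the body really being expanded in the caller.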

Re: 2.6.19: slight performance optimization for lib/string.c's strstrip()

2006-12-08 Thread Pekka Enberg


On 12/8/06, Ulrich Windl [EMAIL PROTECTED] wrote:

my apologies for disobeying all the rules for submitting patches, but I'll
suggest a performance optimization for strstrip() in lib/string.c:


Makes sense. Please submit a patch.

Re: why are some of my patches being credited to other authors?

2006-12-09 Thread Pekka Enberg


On 12/9/06, Robert P. J. Day [EMAIL PROTECTED] wrote:

  perhaps i'm just being clueless about the authorship protocol here,
but i'm a bit hacked off by noticing that at least one submitted patch
of mine was apparently re-submitted (albeit slightly modified) a few
days later by another poster and applied under that poster's name.

  on sun, dec 3, i submitted to the list:

http://marc.theaimsgroup.com/?l=linux-kernel&m=116516635728664&w=2


It really seems to be Burman Yan's patch from November 22. Notice how
your patch still has the redundant cast whereas the applied one
doesn't.

Re: [PATCH] kcalloc: Re-order the first two out-of-order args to kcalloc().

2006-12-09 Thread Pekka Enberg


On 12/9/06, Robert P. J. Day [EMAIL PROTECTED] wrote:

@@ -705,7 +705,7 @@ static int uss720_probe(struct usb_inter
/*
 * Allocate parport interface
 */
-   if (!(priv = kcalloc(sizeof(struct parport_uss720_private), 1, GFP_KERNEL))) {
+   if (!(priv = kcalloc(1, sizeof(struct parport_uss720_private), GFP_KERNEL))) {


This one should be kzalloc

You really ought to send these cleanups to [EMAIL PROTECTED] with LKML
cc'd to get them merged.

Re: Re: [PATCH] kcalloc: Re-order the first two out-of-order args to kcalloc().

2006-12-09 Thread Pekka Enberg


On 12/9/06, Pekka Enberg [EMAIL PROTECTED] wrote:

You really ought to send these cleanups to [EMAIL PROTECTED] with LKML
cc'd to get them merged.


...or even better, the relevant maintainer.

Re: [RFC] Cleanup slab headers / API to allow easy addition of new slab allocators

2006-12-09 Thread Pekka Enberg


Hi Christoph,

On 12/8/06, Christoph Lameter [EMAIL PROTECTED] wrote:

+#define SLAB_POISON 0x0800UL /* DEBUG: Poison objects */
+#define SLAB_HWCACHE_ALIGN 0x2000UL /* Align objs on cache lines */
+#define SLAB_CACHE_DMA 0x4000UL /* Use GFP_DMA memory */
+#define SLAB_MUST_HWCACHE_ALIGN 0x8000UL /* Force alignment even if debuggin is active */


Please fix formatting while you're at it.


+#ifdef CONFIG_SLAB
+#include <linux/slab_def.h>
+#else
+
+/*
+ * Fallback definitions for an allocator not wanting to provide
+ * its own optimized kmalloc definitions (like SLOB).
+ */
+
+#if defined(CONFIG_NUMA) || defined(CONFIG_DEBUG_SLAB)
+#error SLAB fallback definitions not usable for NUMA or Slab debug


Do we need this? Shouldn't we just make sure no one can enable
CONFIG_NUMA and CONFIG_DEBUG_SLAB for non-compatible allocators?


-static inline void *kmalloc(size_t size, gfp_t flags)
+void *kmalloc(size_t size, gfp_t flags)


static inline?


+void *kzalloc(size_t size, gfp_t flags)
+{
+   return __kzalloc(size, flags);
+}


same here.

Re: Re: [PATCH] kcalloc: Re-order the first two out-of-order args to kcalloc().

2006-12-09 Thread Pekka Enberg


On 12/9/06, Robert P. J. Day [EMAIL PROTECTED] wrote:

argh, in that i've already mentioned that, following the guidelines in
SubmittingPatches, i prefer that each of my patches have one logical
purpose.  once the order of the kcalloc() args is corrected, that
would be followed by a single subsequent patch that did the
kcalloc-kzalloc transformation all at once.


...and what would that buy us? Nothing. It *really* wants to use
kzalloc and the transformation is equivalent, so please make it one
patch.

Re: Re: Re: [PATCH] kcalloc: Re-order the first two out-of-order args to kcalloc().

2006-12-09 Thread Pekka Enberg


On 12/9/06, Robert P. J. Day [EMAIL PROTECTED] wrote:

normally what i would do but, in the case of that patch, there are
five files affected, *all* of which are in totally different
subsystems (macintosh, net, scsi, usb, sunrpc).  are you suggesting
that up to 5 different people be CC'ed?

and what about source-wide aesthetic changes that might touch dozens
or hundreds of files?


Well, it depends. There are no fixed rules in the art of patch
feeding. FWIW, I probably would send this patch just to akpm too.

Re: Re: Re: [PATCH] kcalloc: Re-order the first two out-of-order args to kcalloc().

2006-12-09 Thread Pekka Enberg


On 12/9/06, Robert P. J. Day [EMAIL PROTECTED] wrote:

no.  those two submissions represent two logically different fixes
and i have no intention of combining them.


Like I said, fixing the order of kcalloc parameters with a follow-up
patch to use kzalloc is just plain stupid. You can ignore my review
comments all you want, but don't expect that bit to be merged. So, for
the record: NAK for that bit of the patch, it should be converted to
kzalloc instead. Thanks.

 Pekka

Re: [PATCH 2.6.19] e1000: replace kmalloc with kzalloc

2006-12-12 Thread Pekka Enberg


On 12/12/06, Yan Burman [EMAIL PROTECTED] wrote:

size = txdr->count * sizeof(struct e1000_buffer);
-   if (!(txdr->buffer_info = kmalloc(size, GFP_KERNEL))) {
+   if (!(txdr->buffer_info = kzalloc(size, GFP_KERNEL))) {
ret_val = 1;
goto err_nomem;
}
-   memset(txdr->buffer_info, 0, size);


No one seems to be using size elsewhere so why not convert to
kcalloc() and get rid of it? (Seems to apply to other places as well.)
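
A minimal user-space sketch of the suggestion, with calloc() standing in for kcalloc(). The names array_zalloc(), ring_alloc() and struct buffer_info are hypothetical and only illustrate the shape of the change: the count and the element size are passed separately, so the intermediate size variable and the memset() both disappear.

```c
#include <stdint.h>
#include <stdlib.h>

struct buffer_info {    /* hypothetical element type */
    void *data;
    unsigned long length;
};

/* Zeroed array allocation that refuses counts whose byte size would overflow. */
static void *array_zalloc(size_t n, size_t elem_size)
{
    if (elem_size != 0 && n > SIZE_MAX / elem_size)
        return NULL;
    return calloc(n, elem_size);
}

static struct buffer_info *ring_alloc(size_t count)
{
    /* No separate "size" variable and no memset(). */
    return array_zalloc(count, sizeof(struct buffer_info));
}
```

A side benefit of the array form is that the multiplication is checked in one place instead of being open-coded at every call site.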

Re: 2.6, devfs, SMP, SATA

2005-09-05 Thread Pekka Enberg

On 9/5/05, Howard Chu [EMAIL PROTECTED] wrote:
 So, any guesses why with otherwise identical config options, a kernel
 with SMP enabled doesn't boot up with all of the device nodes that it
 should? (Both drives are on the same controller. I haven't checked to
 see if any other device files are missing.)

Devfs is disabled in 2.6.13 as it most likely will be going away soon.
See http://marc.theaimsgroup.com/?l=linux-kernel&m=111939455921877&w=2.

Pekka

Re: [-mm patch 2/5] SharpSL: Add cxx00 support to the Corgi LCD driver

2005-09-06 Thread Pekka Enberg

On 9/6/05, Richard Purdie [EMAIL PROTECTED] wrote:
 +/*
 + * Corgi/Spitz Touchscreen to LCD interface
 + */
 +unsigned long inline corgi_get_hsync_len(void)
 +{
 +   if (machine_is_corgi() || machine_is_shepherd() || machine_is_husky()) {
 +#ifdef CONFIG_PXA_SHARP_C7xx
 +   return w100fb_get_hsynclen(corgifb_device.dev);
 +#endif
 +   } else if (machine_is_spitz() || machine_is_akita() || machine_is_borzoi()) {
 +#ifdef CONFIG_PXA_SHARP_Cxx00
 +   return pxafb_get_hsync_time(pxafb_device.dev);
 +#endif
 +   }
 +   return 0;
 +}

Please consider making two versions of corgi_get_hsync_len() instead,
one for each config option. The above is hard to read.

   Pekka

Re: [-mm patch 2/5] SharpSL: Add cxx00 support to the Corgi LCD driver

2005-09-06 Thread Pekka Enberg

On 9/6/05, Richard Purdie [EMAIL PROTECTED] wrote:
  +/*
  + * Corgi/Spitz Touchscreen to LCD interface
  + */
  +unsigned long inline corgi_get_hsync_len(void)
  +{
  +   if (machine_is_corgi() || machine_is_shepherd() || machine_is_husky()) {
  +#ifdef CONFIG_PXA_SHARP_C7xx
  +   return w100fb_get_hsynclen(corgifb_device.dev);
  +#endif
  +   } else if (machine_is_spitz() || machine_is_akita() || machine_is_borzoi()) {
  +#ifdef CONFIG_PXA_SHARP_Cxx00
  +   return pxafb_get_hsync_time(pxafb_device.dev);
  +#endif
  +   }
  +   return 0;
  +}

On 9/6/05, Pekka Enberg [EMAIL PROTECTED] wrote:
 Please consider making two versions of corgi_get_hsync_len() instead,
 one for each config option. The above is hard to read.

Uhm, forget it. I didn't realize both config options can be enabled at
the same time. Sorry for the noise.
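
For illustration, a reduced sketch of why one function with runtime checks is needed when both options are enabled in the same build. Everything here is hypothetical: the CONFIG_PANEL_A and CONFIG_PANEL_B switches, the machine enum and the returned values are made up and do not come from the Corgi driver.

```c
/* Hypothetical config switches; in this sketch both are enabled at once. */
#define CONFIG_PANEL_A 1
#define CONFIG_PANEL_B 1

enum machine { MACHINE_A, MACHINE_B, MACHINE_OTHER };

static unsigned long panel_a_hsync(void) { return 10; }  /* made-up value */
static unsigned long panel_b_hsync(void) { return 20; }  /* made-up value */

/* One function, runtime dispatch; each branch is compiled only if configured. */
static unsigned long get_hsync_len(enum machine m)
{
    if (m == MACHINE_A) {
#ifdef CONFIG_PANEL_A
        return panel_a_hsync();
#endif
    } else if (m == MACHINE_B) {
#ifdef CONFIG_PANEL_B
        return panel_b_hsync();
#endif
    }
    return 0;
}
```

Two separate per-option definitions of the same function would collide in a build that enables both options, which is the case the retraction above is about.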

  Pekka

Re: [PATCH] fat: Remove duplicate directory scanning code

2005-09-07 Thread Pekka Enberg

On 9/5/05, OGAWA Hirofumi [EMAIL PROTECTED] wrote:
 But sorry, I have no time for reviewing all of patch now. I'll send it
 to Andrew after review.

Ok. Please do the appropriate renaming also. Thanks.

Pekka

Re: [PATCH] Wistron laptop button driver

2005-09-08 Thread Pekka Enberg

On 9/7/05, Miloslav Trmac [EMAIL PROTECTED] wrote:
 +static int __init map_bios(void)
 +{
 + static const unsigned char __initdata signature[]
 + = { 0x42, 0x21, 0x55, 0x30 };
 +
 + void __iomem *base;
 + size_t offset;
 + uint32_t entry_point;
 +
 + base = ioremap(0xF, 0x1); /* Can't fail */

How come? ioremap can return NULL if, for example, we run out of memory.
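
A minimal user-space sketch of the check being asked for. The mapping call is modeled as a hypothetical function pointer (map_fn) because ioremap() is not available outside the kernel; probe_bios(), map_fail() and map_ok() are illustrative names only.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical mapping hook standing in for ioremap(); may return NULL. */
typedef void *(*map_fn)(unsigned long phys, unsigned long len);

static int probe_bios(map_fn map, unsigned long phys, unsigned long len)
{
    void *base = map(phys, len);

    if (!base)
        return -ENOMEM;    /* report the failure instead of using NULL */

    /* ... a signature scan over [base, base + len) would go here ... */
    return 0;
}

/* Two toy mappers for illustration: one always fails, one always succeeds. */
static void *map_fail(unsigned long phys, unsigned long len)
{
    (void)phys;
    (void)len;
    return NULL;
}

static char fake_window[16];

static void *map_ok(unsigned long phys, unsigned long len)
{
    (void)phys;
    (void)len;
    return fake_window;
}
```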

 + for (offset = 0; offset < 0x1; offset += 0x10) {
 + if (check_signature(base + offset, signature,
 + sizeof(signature)) != 0)
 + goto found;
 + }

  Pekka

Re: [PATCH 2/25] NTFS: Allow highmem kmalloc() in ntfs_malloc_nofs() and add _nofail() version.

2005-09-09 Thread Pekka Enberg

On 9/9/05, Anton Altaparmakov [EMAIL PROTECTED] wrote:
 -static inline void *ntfs_malloc_nofs(unsigned long size)
 +static inline void *__ntfs_malloc(unsigned long size,
 +   unsigned int __nocast gfp_mask)
  {
 if (likely(size <= PAGE_SIZE)) {
 BUG_ON(!size);
 /* kmalloc() has per-CPU caches so is faster for now. */
 -   return kmalloc(PAGE_SIZE, GFP_NOFS);
 -   /* return (void *)__get_free_page(GFP_NOFS | __GFP_HIGHMEM); */
 +   return kmalloc(PAGE_SIZE, gfp_mask);
 +   /* return (void *)__get_free_page(gfp_mask); */
 }
 if (likely(size >> PAGE_SHIFT < num_physpages))
 -   return __vmalloc(size, GFP_NOFS | __GFP_HIGHMEM, PAGE_KERNEL);
 +   return __vmalloc(size, gfp_mask, PAGE_KERNEL);

Unrelated to this patch, but why do you have this wrapper instead of
using kmalloc() where you can and __vmalloc() where you really have to?

 Pekka

Re: [PATCH -v3 14/14] x86, mm: Map ISA area with connected ram range at the same time

2012-09-06 Thread Pekka Enberg

On Wed, Sep 5, 2012 at 1:02 AM, Pekka Enberg penb...@kernel.org wrote:
  How significant is the speed gain? The isa_done flag makes code flow
  more difficult to follow.

On Wed, 5 Sep 2012, Yinghai Lu wrote:
 Not really much.
 
 when booting system:
 memmap=16m$128m memmap=16m$512m memmap=16m$256m memmap=16m$768m memmap=16m$1024m
 
 with the patch
 [0.00] init_memory_mapping: [mem 0x-0x07ff]
 [0.00]  [mem 0x-0x07ff] page 2M
 [0.00] init_memory_mapping: [mem 0x0900-0x0fff]
 [0.00]  [mem 0x0900-0x0fff] page 2M
 [0.00] init_memory_mapping: [mem 0x1100-0x1fff]
 [0.00]  [mem 0x1100-0x1fff] page 2M
 [0.00] init_memory_mapping: [mem 0x2100-0x2fff]
 [0.00]  [mem 0x2100-0x2fff] page 2M
 [0.00] init_memory_mapping: [mem 0x3100-0x3fff]
 [0.00]  [mem 0x3100-0x3fff] page 2M
 [0.00] init_memory_mapping: [mem 0x4100-0x7fffdfff]
 [0.00]  [mem 0x4100-0x7fdf] page 2M
 [0.00]  [mem 0x7fe0-0x7fffdfff] page 4k
 
 otherwise will have
 
 [0.00] init_memory_mapping: [mem 0x-0x000f]
 [0.00]  [mem 0x-0x000f] page 4k
 [0.00] init_memory_mapping: [mem 0x0010-0x07ff]
 [0.00]  [mem 0x0010-0x001f] page 4k
 [0.00]  [mem 0x0020-0x07ff] page 2M
 [0.00] init_memory_mapping: [mem 0x0900-0x0fff]
 [0.00]  [mem 0x0900-0x0fff] page 2M
 [0.00] init_memory_mapping: [mem 0x1100-0x1fff]
 [0.00]  [mem 0x1100-0x1fff] page 2M
 [0.00] init_memory_mapping: [mem 0x2100-0x2fff]
 [0.00]  [mem 0x2100-0x2fff] page 2M
 [0.00] init_memory_mapping: [mem 0x3100-0x3fff]
 [0.00]  [mem 0x3100-0x3fff] page 2M
 [0.00] init_memory_mapping: [mem 0x4100-0x7fffdfff]
 [0.00]  [mem 0x4100-0x7fdf] page 2M
 [0.00]  [mem 0x7fe0-0x7fffdfff] page 4k

OK. Is there any other reason than performance to do this?

Pekka

Re: [PATCH v3] perf bench: fix assert when NDEBUG is defined

2012-09-08 Thread Pekka Enberg

On Sat, Sep 8, 2012 at 8:35 AM, Irina Tirdea irina.tir...@gmail.com wrote:
 From: Irina Tirdea irina.tir...@intel.com

 When NDEBUG is defined, the assert macro will be expanded to nothing.
 Some assert calls used in perf are also including some functionality
 (e.g. system calls), not only validity checks. Therefore, if NDEBUG is
 defined, this functionality will be removed along with the assert.
 Perf also defines BUG_ON based on assert, so it has the same problem.

 Define BUG_ON so that the condition will be executed when NDEBUG is defined.
 Replace the assert statements that have these side effects with BUG_ON.

 For defining BUG_ON, use if (cond) {} instead of if (cond) ; because in
 the latter case build fails with error: suggest braces around empty body in
 an ‘if’ statement [-Werror=empty-body]

 Suggested-by: Peter Zijlstra a.p.zijls...@chello.nl
 Signed-off-by: Irina Tirdea irina.tir...@intel.com

Reviewed-by: Pekka Enberg penb...@kernel.org
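
For illustration, a minimal sketch of the approach described in the quoted changelog; it is not the exact perf definition. The NDEBUG branch still evaluates the condition, so any call inside it keeps running. The functions do_work() and run_once() are hypothetical.

```c
#include <assert.h>

#ifdef NDEBUG
/* Still evaluates cond, so side effects in the expression are kept. */
#define BUG_ON(cond) do { if (cond) {} } while (0)
#else
#define BUG_ON(cond) assert(!(cond))
#endif

static int calls;

/* Hypothetical operation with a side effect; returns 0 on success. */
static int do_work(void)
{
    calls++;
    return 0;
}

static int run_once(void)
{
    /* With a plain assert() and NDEBUG defined, do_work() would never run. */
    BUG_ON(do_work() != 0);
    return calls;
}
```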

Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-08 Thread Pekka Enberg

On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney
paul...@linux.vnet.ibm.com wrote:
 On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
 On 09/05/2012 09:55 PM, Christoph Lameter wrote:
  On Wed, 5 Sep 2012, Michael Wang wrote:
 
  Since the cachep and cachep->slabp_cache's l3 alien are in the same lock class,
  fake report generated.
 
  Ahh... That is a key insight into why this occurs.
 
  This should not happen since we already have init_lock_keys() which will
  reassign the lock class for both l3 list and l3 alien.
 
  Right. I was wondering why we still get intermittent reports on this.
 
  This patch will invoke init_lock_keys() after we done enable_cpucache()
  instead of before to avoid the fake DEADLOCK report.
 
  Acked-by: Christoph Lameter c...@linux.com

 Thanks for your review.

 And add Paul to the cc list(my skills on mailing is really poor...).

 Tested-by: Paul E. McKenney paul...@linux.vnet.ibm.com

I'd also like to tag this for the stable tree to avoid bogus lockdep
reports. How far back in release history should we queue this?

Re: [PATCH v2 03/12] perf tools: include __WORDSIZE definition

2012-09-08 Thread Pekka Enberg

On Sat, Sep 8, 2012 at 3:43 AM, Irina Tirdea irina.tir...@gmail.com wrote:
 From: Irina Tirdea irina.tir...@intel.com

 __WORDSIZE is GLibC-specific and is not defined on all systems or glibc
 versions (e.g. Android's bionic does not define it).

 In file included from util/include/linux/bitmap.h:5:0,
  from util/header.h:10,
  from util/session.h:6,
  from util/build-id.h:4,
  from util/annotate.c:11:
 util/include/linux/bitops.h: In function 'set_bit':
 util/include/linux/bitops.h:25:12: error:
 '__WORDSIZE' undeclared (first use in this function)
 util/include/linux/bitops.h:25:12: note:
 each undeclared identifier is reported only once for each function it appears in
 util/include/linux/bitops.h:23:51: error:
 parameter 'addr' set but not used [-Werror=unused-but-set-parameter]
 util/include/linux/bitops.h: In function 'clear_bit':
 util/include/linux/bitops.h:30:12: error:
 '__WORDSIZE' undeclared (first use in this function)
 util/include/linux/bitops.h:28:53: error:
 parameter 'addr' set but not used [-Werror=unused-but-set-parameter]
 In file included from util/header.h:10:0,
  from util/session.h:6,
  from util/build-id.h:4,
  from util/annotate.c:11:
 util/include/linux/bitmap.h: In function 'bitmap_zero':
 util/include/linux/bitmap.h:22:6: error:
 '__WORDSIZE' undeclared (first use in this function)

 Defining __WORDSIZE in perf's headers if it is not already defined.

 Signed-off-by: Irina Tirdea irina.tir...@intel.com
 ---
  tools/perf/util/include/linux/bitops.h |9 +
  1 file changed, 9 insertions(+)

 diff --git a/tools/perf/util/include/linux/bitops.h b/tools/perf/util/include/linux/bitops.h
 index 587a230..91779ec 100644
 --- a/tools/perf/util/include/linux/bitops.h
 +++ b/tools/perf/util/include/linux/bitops.h
 @@ -5,6 +5,15 @@
  #include <linux/compiler.h>
  #include <asm/hweight.h>

 +#ifndef __WORDSIZE
 +#if defined(__x86_64__)
 +# define __WORDSIZE 64
 +#endif
 +#if defined(__i386__) || defined(__arm__)
 +# define __WORDSIZE 32
 +#endif
 +#endif

Why not use sizeof(unsigned long) * 8 ?
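
A minimal sketch of that alternative, assuming the value is only needed in ordinary expressions. The names WORD_BITS, bitmap_set_bit() and bitmap_test_bit() are hypothetical and are not the perf helpers.

```c
#include <limits.h>

/* Hypothetical name; computes the word size without relying on __WORDSIZE. */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Set bit nr in a bitmap stored as an array of unsigned long. */
static void bitmap_set_bit(unsigned long *map, unsigned int nr)
{
    map[nr / WORD_BITS] |= 1UL << (nr % WORD_BITS);
}

/* Return 1 if bit nr is set, 0 otherwise. */
static int bitmap_test_bit(const unsigned long *map, unsigned int nr)
{
    return (int)((map[nr / WORD_BITS] >> (nr % WORD_BITS)) & 1UL);
}
```

One caveat: sizeof cannot be evaluated by the preprocessor, so this form does not help wherever __WORDSIZE is tested in an #if directive.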

Re: [PATCH v2 05/12] perf tools: include basename for non-glibc systems

2012-09-08 Thread Pekka Enberg

On Sat, Sep 8, 2012 at 3:43 AM, Irina Tirdea irina.tir...@gmail.com wrote:
 From: Irina Tirdea irina.tir...@intel.com

 perf uses the glibc version of basename(), by defining _GNU_SOURCE, including
 string.h and not including libgen.h. The glibc version of basename is better
 than the POSIX version since it does not modify its argument.

 Android has only one version of basename which is defined in libgen.h.
 This version is the same as the glibc version.

 Error on Android:
 util/annotate.c: In function 'symbol__annotate_printf':
 util/annotate.c:503:3: error: implicit declaration of function 'basename'
 [-Werror=implicit-function-declaration]
 util/annotate.c:503:3: error: nested extern declaration of 'basename'
 [-Werror=nested-externs]
 util/annotate.c:503:14: error: assignment makes pointer from integer without
 a cast [-Werror]

 On Android libgen.h should be included to define basename.

 Signed-off-by: Irina Tirdea irina.tir...@intel.com
 ---
  tools/perf/util/symbol.h |3 +++
  1 file changed, 3 insertions(+)

 diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
 index fc4b1e6..d3b330c 100644
 --- a/tools/perf/util/symbol.h
 +++ b/tools/perf/util/symbol.h
 @@ -10,6 +10,9 @@
  #include <linux/rbtree.h>
  #include <stdio.h>
  #include <byteswap.h>
 +#if defined(__BIONIC__)
 +#include <libgen.h>
 +#endif

It's safe to include libgen.h on glibc Linux systems as well, no? So
there's no need to check for __BIONIC__.
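
One thing to keep in mind when including libgen.h unconditionally is that callers then get the POSIX basename(), which is allowed to modify its argument. A minimal sketch of a wrapper that hands it a scratch copy; path_basename() and its 4096-byte limit are hypothetical and not part of perf.

```c
#include <libgen.h>
#include <string.h>

/*
 * Copy the final component of path into out (at most outlen bytes).
 * POSIX basename() may modify its argument, so it gets a scratch copy.
 * Returns 0 on success, -1 if path or the result does not fit.
 */
static int path_basename(const char *path, char *out, size_t outlen)
{
    char scratch[4096];    /* arbitrary limit for this sketch */
    const char *base;

    if (strlen(path) >= sizeof(scratch))
        return -1;
    strcpy(scratch, path);

    base = basename(scratch);
    if (strlen(base) >= outlen)
        return -1;
    strcpy(out, base);
    return 0;
}
```

With a wrapper of this kind the caller's string can stay const whichever basename() variant the C library provides.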

Re: [PATCH v2 01/12] perf tools: include wrapper for magic.h

2012-09-08 Thread Pekka Enberg

Hi Irina,

I commented on some of the patches but overall, I think the series makes sense:

Acked-by: Pekka Enberg penb...@kernel.org

Pekka

Re: [PATCH 0/3] perf tool: basename cleanups

2012-09-08 Thread Pekka Enberg

On Sat, Sep 8, 2012 at 6:06 PM, David Ahern dsah...@gmail.com wrote:
 basename can modify strings passed to it, so the strings should not
 be marked const. Clean up the offenders and remove the BIONIC wrapper
 for libgen.h

 Irina: Would you verify perf still compiles cleanly for Android. Thanks.

To all three patches:

Acked-by: Pekka Enberg penb...@kernel.org

Re: linux-next: build warning after merge of the final tree (slab tree related)

2012-09-19 Thread Pekka Enberg

On Tue, 11 Sep 2012, Christoph Lameter wrote:

 On Tue, 11 Sep 2012, Stephen Rothwell wrote:
 
  After merging the final tree, today's linux-next build (sparc64 defconfig)
  produced this warning:
 
  mm/slab.c:808:13: warning: '__slab_error' defined but not used [-Wunused-function]
 
  Introduced by commit 945cf2b6199b (mm/sl[aou]b: Extract a common function
  for kmem_cache_destroy).  All uses of slab_error() are now guarded by DEBUG.
 
 
 Subject: Slab: Only define slab_error for DEBUG
 
 There is no use case left for slab builds without DEBUG.
 
 Signed-off-by: Christoph Lameter c...@linux.com
 
 Index: linux/mm/slab.c
 ===
 --- linux.orig/mm/slab.c  2012-09-11 14:44:56.304015235 -0500
 +++ linux/mm/slab.c   2012-09-11 14:48:46.988948440 -0500
 @@ -803,6 +803,7 @@ static void cache_estimate(unsigned long
   *left_over = slab_size - nr_objs*buffer_size - mgmt_size;
  }
 
 +#if DEBUG
  #define slab_error(cachep, msg) __slab_error(__func__, cachep, msg)
 
  static void __slab_error(const char *function, struct kmem_cache *cachep,
 @@ -812,6 +813,7 @@ static void __slab_error(const char *fun
  function, cachep->name, msg);
   dump_stack();
  }
 +#endif
 
  /*
   * By default on NUMA we use alien caches to stage the freeing of
 

Applied, thanks.
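
The shape of the fix, reduced to a standalone sketch: a static helper whose
only callers sit under a debug macro has to sit under the same macro, or
-Wunused-function fires in non-debug builds. This assumes DEBUG is a 0/1
macro, as the "#if DEBUG" in the patch suggests; report_error, check_error
and validate are invented names, not mm/slab.c code.

```c
#include <stdio.h>

#ifndef DEBUG
#define DEBUG 0	/* assumption: DEBUG is always defined as 0 or 1 */
#endif

#if DEBUG
/* Only compiled when DEBUG is set, so non-debug builds see no unused function. */
static void report_error(const char *function, const char *msg)
{
	fprintf(stderr, "error in %s: %s\n", function, msg);
}
#define check_error(msg) report_error(__func__, msg)
#endif

/* Hypothetical caller: the only use of the helper is itself under DEBUG. */
static int validate(int value)
{
	if (value < 0) {
#if DEBUG
		check_error("negative value");
#endif
		return -1;
	}
	return 0;
}
```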

P.S. Guys, please use penb...@kernel.org email address. I missed this 
patch because I don't read this mailbox.

Pekka

Re: Taint kernel when we detect a corrupted slab.

2012-09-19 Thread Pekka Enberg

On Tue, 18 Sep 2012, Dave Jones wrote:
 It doesn't seem worth adding a new taint flag for this, so just re-use
 the one from 'bad page'

 Signed-off-by: Dave Jones da...@redhat.com

On Tue, Sep 18, 2012 at 11:19 PM, David Rientjes rient...@google.com wrote:
 Acked-by: David Rientjes rient...@google.com

Applied, thanks!

Re: [PATCH 1/4] perf tools: configure tmp path at build time

2012-09-21 Thread Pekka Enberg

Hi Irina,

On Fri, Sep 21, 2012 at 1:13 AM, Irina Tirdea irina.tir...@gmail.com wrote:
 diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
 index b442ee4..c5b4632 100644
 --- a/tools/perf/util/map.c
 +++ b/tools/perf/util/map.c
 @@ -59,7 +59,8 @@ struct map *map__new(struct list_head *dsos__list, u64 
 start, u64 len,
 no_dso = is_no_dso_memory(filename);

 if (anon) {
 -   snprintf(newfilename, sizeof(newfilename), 
 "/tmp/perf-%d.map", pid);
 +   snprintf(newfilename, sizeof(newfilename),
 +PERF_TMP_DIR "/perf-%d.map", pid);
 filename = newfilename;
 }

[snip]

 diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
 index e2e8c69..eb671d5 100644
 --- a/tools/perf/util/symbol.c
 +++ b/tools/perf/util/symbol.c
 @@ -1051,7 +1051,7 @@ int dso__load(struct dso *dso, struct map *map, 
 symbol_filter_t filter)

 dso->adjust_symbols = 0;

 -   if (strncmp(dso->name, "/tmp/perf-", 10) == 0) {
 +   if (strncmp(dso->name, PERF_TMP_DIR "/perf-", 10) == 0) {
 struct stat st;

 if (lstat(dso->name, &st) < 0)

Just to point out: these two path names are actually part of a
JIT/perf integration ABI. I'm OK with using PERF_TMP_DIR here but you
really ought to update tools/perf/Documentation/jit-interface.txt.
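
A side note on the second hunk quoted above: the length argument is still
the literal 10, which is the length of "/tmp/perf-". If PERF_TMP_DIR is
ever configured to something longer, that literal no longer covers the
whole prefix. One possible way around it, sketched here with a made-up
PERF_TMP_DIR value and invented names (PERF_MAP_PREFIX, is_perf_map_name),
is to derive the length from the string itself.

```c
#include <string.h>

/* Assumed value for illustration only; the real one comes from the build. */
#define PERF_TMP_DIR "/data/local/tmp"
#define PERF_MAP_PREFIX PERF_TMP_DIR "/perf-"

/* Return 1 if name starts with the JIT map prefix under PERF_TMP_DIR. */
static int is_perf_map_name(const char *name)
{
	/* sizeof counts the trailing NUL, hence the - 1 */
	return strncmp(name, PERF_MAP_PREFIX, sizeof(PERF_MAP_PREFIX) - 1) == 0;
}
```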

Re: [PATCH v3 08/16] slab: allow enable_cpu_cache to use preset values for its tunables

2012-09-21 Thread Pekka Enberg

On Tue, Sep 18, 2012 at 5:12 PM, Glauber Costa glom...@parallels.com wrote:
 diff --git a/mm/slab.c b/mm/slab.c
 index e2cf984..f2d760c 100644
 --- a/mm/slab.c
 +++ b/mm/slab.c
 @@ -4141,8 +4141,19 @@ static int do_tune_cpucache(struct kmem_cache *cachep, 
 int limit,
  static int enable_cpucache(struct kmem_cache *cachep, gfp_t gfp)
  {
 int err;
 -   int limit, shared;
 -
 +   int limit = 0;
 +   int shared = 0;
 +   int batchcount = 0;
 +
 +#ifdef CONFIG_MEMCG_KMEM
 +   if (cachep->memcg_params.parent) {
 +   limit = cachep->memcg_params.parent->limit;
 +   shared = cachep->memcg_params.parent->shared;
 +   batchcount = cachep->memcg_params.parent->batchcount;

Style nit: please introduce a variable for
cachep->memcg_params.parent to make this human-readable.
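
Roughly what that could look like, as a standalone sketch. The struct
definitions are stand-ins that model only the fields named in the quoted
hunk, and inherit_tunables is an invented name; the real code lives inside
enable_cpucache().

```c
/* Stand-in types: only the fields named in the quoted hunk are modelled. */
struct kmem_cache;

struct memcg_cache_params {
	struct kmem_cache *parent;
};

struct kmem_cache {
	int limit;
	int shared;
	int batchcount;
	struct memcg_cache_params memcg_params;
};

/* Copy the tunables from the parent cache, if there is one. */
static void inherit_tunables(struct kmem_cache *cachep,
			     int *limit, int *shared, int *batchcount)
{
	/* One local variable replaces three repeated member chains. */
	struct kmem_cache *parent = cachep->memcg_params.parent;

	if (parent) {
		*limit = parent->limit;
		*shared = parent->shared;
		*batchcount = parent->batchcount;
	}
}
```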

Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Pekka Enberg

On Wed, Sep 19, 2012 at 10:42 AM, Glauber Costa glom...@parallels.com wrote:
 index f2d760c..18de3f6 100644
 --- a/mm/slab.c
 +++ b/mm/slab.c
 @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
   * Free an object which was previously allocated from this
   * cache.
   */
 -void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 +void kmem_cache_free(struct kmem_cache *s, void *objp)
  {
  unsigned long flags;
 +struct kmem_cache *cachep = virt_to_cache(objp);
 +
 +VM_BUG_ON(!slab_equal_or_parent(cachep, s));

 This is an extremely hot path of the kernel and you are adding significant
 processing. Check how the benchmarks are influenced by this change.
 virt_to_cache can be a bit expensive.

 Would it be enough for you to have a separate code path for
 !CONFIG_MEMCG_KMEM?

 I don't really see another way to do it, aside from deriving the cache
 from the object in our case. I am open to suggestions if you do.

We should assume that most distributions enable CONFIG_MEMCG_KMEM,
right? Therefore, any performance impact should depend on whether
kmem memcg is *enabled* at runtime.

Can we use the static key thingy introduced by tracing folks for this?

Re: [PATCH v3 00/16] slab accounting for memcg

2012-09-21 Thread Pekka Enberg

Hi Glauber,

On Tue, Sep 18, 2012 at 5:11 PM, Glauber Costa glom...@parallels.com wrote:
 This is a followup to the previous kmem series. I divided them logically
 so it gets easier for reviewers. But I believe they are ready to be merged
 together (although we can do a two-pass merge if people would prefer)

 Throwaway git tree found at:

 git://git.kernel.org/pub/scm/linux/kernel/git/glommer/memcg.git 
 kmemcg-slab

 There are mostly bugfixes since last submission.

Overall, I like this series a lot. However, I don't really see this as
v3.7 material because we already have largeish pending updates to the
slab allocators. I also haven't seen any performance numbers for this,
which is a problem.

So what I'd really like to see is this series being merged early in the
v3.8 development cycle to maximize the number of people eyeballing the
code and looking at performance impact.

Does this sound reasonable to you, Glauber?

Pekka

Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Pekka Enberg

On Fri, 21 Sep 2012, Glauber Costa wrote:
  We should assume that most distributions enable CONFIG_MEMCG_KMEM,
  right? Therefore, any performance impact should be dependent on whether
  or not kmem memcg is *enabled* at runtime or not.
  
  Can we use the static key thingy introduced by tracing folks for this?

 Yes.
 
 I am already using static keys extensively in this patchset, and that is
 how I intend to handle this particular case.

Cool.

The key point here is that !CONFIG_MEMCG_KMEM should have exactly *zero* 
performance impact and CONFIG_MEMCG_KMEM disabled at runtime should have 
absolutely minimal impact.
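
To make the two cases concrete, here is a userspace analogy only.
MEMCG_KMEM_SKETCH stands in for CONFIG_MEMCG_KMEM and a plain bool stands
in for the static key; the kernel's static keys patch the branch at
runtime, which this sketch does not attempt to reproduce. All names are
invented for illustration.

```c
#include <stdbool.h>

#ifdef MEMCG_KMEM_SKETCH
/* Feature compiled in: flag starts out false, i.e. disabled at runtime. */
static bool kmem_accounting_enabled;

static inline int account_free(int nr_objects)
{
	if (!kmem_accounting_enabled)	/* cheap check when disabled at runtime */
		return 0;
	return nr_objects;		/* pretend to uncharge nr_objects */
}
#else
/* Feature compiled out: the stub is a constant, so callers pay nothing. */
static inline int account_free(int nr_objects)
{
	(void)nr_objects;
	return 0;
}
#endif

/* Hypothetical hot path that calls the hook on every free. */
static int free_objects(int nr_objects)
{
	return account_free(nr_objects);
}
```

With the macro undefined the hook compiles down to a constant, which is
the "exactly zero" case; with it defined but the flag false, the cost is
one predictable branch.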

Pekka

Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Pekka Enberg

On Fri, Sep 21, 2012 at 11:07 PM, Tejun Heo t...@kernel.org wrote:
 Not necessarily disagreeing, but I don't think it's helpful to set the
 bar impossibly high.  Even static_key doesn't have exactly *zero*
 impact.  Let's stick to as minimal as possible when not in use and
 reasonable in use.

For !CONFIG_MEMCG_KMEM, it should be exactly zero. No need to play
games with static_key.

[GIT PULL] SLAB fixes for v3.6-rc5

2012-09-11 Thread Pekka Enberg

Hi Linus,

Please pull the latest SLAB tree from:

  git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/urgent

It has a PFMEMALLOC reserve page fixlet for slab from David Rientjes.

Pekka

--
The following changes since commit 318e15101993c0fdc3f23f24ac61fc7769d27e68:

  Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
(2012-08-29 11:36:22 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/urgent

David Rientjes (1):
  mm, slab: lock the correct nodelist after reenabling irqs

 mm/slab.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-11 Thread Pekka Enberg

On Tue, Sep 11, 2012 at 5:50 AM, Michael Wang
wang...@linux.vnet.ibm.com wrote:
 On 09/08/2012 04:39 PM, Pekka Enberg wrote:
 On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney
 paul...@linux.vnet.ibm.com wrote:
 On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
 On 09/05/2012 09:55 PM, Christoph Lameter wrote:
 On Wed, 5 Sep 2012, Michael Wang wrote:

 Since the cachep and cachep->slabp_cache's l3 alien are in the same lock
 class, a fake report is generated.

 Ahh... That is a key insight into why this occurs.

 This should not happen since we already have init_lock_keys() which will
 reassign the lock class for both l3 list and l3 alien.

 Right. I was wondering why we still get intermittent reports on this.

 This patch will invoke init_lock_keys() after we done enable_cpucache()
 instead of before to avoid the fake DEADLOCK report.

 Acked-by: Christoph Lameter c...@linux.com

 Thanks for your review.

 And add Paul to the cc list(my skills on mailing is really poor...).

 Tested-by: Paul E. McKenney paul...@linux.vnet.ibm.com

 I'd also like to tag this for the stable tree to avoid bogus lockdep
 reports. How far back in release history should we queue this?
 Hi, Pekka

 Sorry for the delayed reply. I tried to find out the reason for commit
 30765b92 but have not figured it out yet, so I added Peter to the cc list.

 The patch below, from release 3.0.0, is the one that causes the bogus report.

 commit 30765b92ada267c5395fc788623cb15233276f5c
 Author: Peter Zijlstra pet...@infradead.org
 Date:   Thu Jul 28 23:22:56 2011 +0200

 slab, lockdep: Annotate the locks before using them

 Fernando found we hit the regular OFF_SLAB 'recursion' before we
 annotate the locks, cure this.

 The relevant portion of the stack-trace:

  [0.00]  [c085e24f] rt_spin_lock+0x50/0x56
  [0.00]  [c04fb406] __cache_free+0x43/0xc3
  [0.00]  [c04fb23f] kmem_cache_free+0x6c/0xdc
  [0.00]  [c04fb2fe] slab_destroy+0x4f/0x53
  [0.00]  [c04fb396] free_block+0x94/0xc1
  [0.00]  [c04fc551] do_tune_cpucache+0x10b/0x2bb
  [0.00]  [c04fc8dc] enable_cpucache+0x7b/0xa7
  [0.00]  [c0bd9d3c] kmem_cache_init_late+0x1f/0x61
  [0.00]  [c0bba687] start_kernel+0x24c/0x363
  [0.00]  [c0bba0ba] i386_start_kernel+0xa9/0xaf

 Reported-by: Fernando Lopez-Lezcano na...@ccrma.stanford.edu
 Acked-by: Pekka Enberg penb...@kernel.org
 Signed-off-by: Peter Zijlstra a.p.zijls...@chello.nl
 Link: http://lkml.kernel.org/r/1311888176.2617.379.camel@laptop
 Signed-off-by: Ingo Molnar mi...@elte.hu

 It moved init_lock_keys() before we build up the alien, so we failed to
 reclass it.

I've queued the patch for v3.7. Thanks!

Re: [PATCH TRIVIAL] mm: Fix build warning in kmem_cache_create()

2012-07-30 Thread Pekka Enberg

On Sat, Jul 14, 2012 at 2:12 AM, Shuah Khan shuah.k...@hp.com wrote:
 The label oops is used in CONFIG_DEBUG_VM ifdef block and is defined
 outside ifdef CONFIG_DEBUG_VM block. This results in the following
 build warning when built with CONFIG_DEBUG_VM disabled. Fix to move
 label oops definition to inside a CONFIG_DEBUG_VM block.

 mm/slab_common.c: In function ‘kmem_cache_create’:
 mm/slab_common.c:101:1: warning: label ‘oops’ defined but not used
 [-Wunused-label]

 Signed-off-by: Shuah Khan shuah.k...@hp.com

I merged this as an obvious and safe fix for the current merge window. We
need to clean this up properly for v3.7.

 ---
  mm/slab_common.c |2 ++
  1 file changed, 2 insertions(+)

 diff --git a/mm/slab_common.c b/mm/slab_common.c
 index 12637ce..aa3ca5b 100644
 --- a/mm/slab_common.c
 +++ b/mm/slab_common.c
 @@ -98,7 +98,9 @@ struct kmem_cache *kmem_cache_create(const char *name, 
 size_t size, size_t align

 s = __kmem_cache_create(name, size, align, flags, ctor);

 +#ifdef CONFIG_DEBUG_VM
  oops:
 +#endif
 mutex_unlock(&slab_mutex);
 put_online_cpus();

 --
 1.7.9.5




[GIT PULL] SLAB changes for v3.6-rc0

2012-07-30 Thread Pekka Enberg

Hi Linus,

Please pull the latest SLAB tree from:

  git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/next

Most of the changes included are from Christoph Lameter's common slab
patch series that unifies common parts of SLUB, SLAB, and SLOB
allocators. The unification is needed for Glauber Costa's kmem memcg
work that will hopefully appear for v3.7.

Rest of the changes are fixes and speedups by various people.

Pekka

--
The following changes since commit f7da9cdf45cbbad5029d4858dcbc0134e06084ed:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2012-07-28 
06:00:39 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/next

Andi Kleen (1):
  slab/mempolicy: always use local policy from interrupt context

Christoph Lameter (21):
  slub: Use freelist instead of object in __slab_alloc
  slub: Add frozen check in __slab_alloc
  slub: Acquire_slab() avoid loop
  slub: Simplify control flow in __slab_alloc()
  slub: new_slab_objects() can also get objects from partial list
  slub: Get rid of the node field
  slub: Separate out kmem_cache_cpu processing from deactivate_slab
  slub: Use page variable instead of c->page.
  slub: pass page to node_match() instead of kmem_cache_cpu structure
  slob: Define page struct fields used in mm_types.h
  slob: No need to zero mapping since it is no longer in use
  slob: Remove various small accessors
  slab: Use page struct fields instead of casting
  slab: Remove some accessors
  mm, sl[aou]b: Extract common fields from struct kmem_cache
  slab: Get rid of obj_size macro
  mm, sl[aou]b: Extract common code for kmem_cache_create()
  mm, sl[aou]b: Common definition for boot state of the slab allocators
  mm, sl[aou]b: Use a common mutex definition
  mm, sl[aou]b: Move kmem_cache_create mutex handling to common code
  slob: Fix early boot kernel crash

David Rientjes (1):
  mm, slub: ensure irqs are enabled for kmemcheck

Feng Tang (1):
  slab: Fix a typo in commit 8c138b slab: Get rid of obj_size macro

Glauber Costa (2):
  slab: rename gfpflags to allocflags
  slab: move FULL state transition to an initcall

Joonsoo Kim (2):
  slub: use __cmpxchg_double_slab() at interrupt disabled place
  slub: refactoring unfreeze_partials()

Julia Lawall (1):
  slub: remove invalid reference to list iterator variable

Pekka Enberg (1):
  Merge branch 'slub/cleanups' into slab/next

Shuah Khan (1):
  mm: Fix build warning in kmem_cache_create()

Thierry Reding (1):
  mm, slab: Build fix for recent kmem_cache changes

majianpeng (1):
  mm: Fix signal SIGFPE in slabinfo.c.

 include/linux/mempolicy.h |2 +-
 include/linux/mm_types.h  |   11 +-
 include/linux/slab.h  |   24 +++
 include/linux/slab_def.h  |   12 +-
 include/linux/slub_def.h  |3 +-
 mm/Makefile   |3 +-
 mm/mempolicy.c|8 +-
 mm/slab.c |  406 --
 mm/slab.h |   33 
 mm/slab_common.c  |  120 +
 mm/slob.c |  152 +---
 mm/slub.c |  436 +
 tools/vm/slabinfo.c   |   14 +-
 13 files changed, 608 insertions(+), 616 deletions(-)
 create mode 100644 mm/slab.h
 create mode 100644 mm/slab_common.c

Re: [PATCH TRIVIAL] mm: Fix build warning in kmem_cache_create()

2012-07-30 Thread Pekka Enberg

On Mon, Jul 30, 2012 at 10:56 PM, David Rientjes rient...@google.com wrote:
 -Wunused-label is overridden in gcc for a label that is conditionally
 referenced by using __maybe_unused in the kernel.  I'm not sure what's so
 obscure about

 out: __maybe_unused

 Are label attributes really that obscure?

I think they are.

The real problem, however, is that label attributes would just paper
over the badly thought out control flow in the function and not make the
code any better or easier to read.
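
As a rough illustration of what a cleaner control flow could look like
(not the actual mm/slab_common.c code, and every name here is
hypothetical): if the debug-only checks live in a helper with a no-op
stub, the caller needs neither a goto nor a conditionally used label.
SKETCH_DEBUG_VM stands in for CONFIG_DEBUG_VM and lock_depth stands in for
the mutex held around creation.

```c
#include <stddef.h>

#ifdef SKETCH_DEBUG_VM
/* Debug build: reject obviously bad arguments. */
static int sanity_check(const char *name, size_t size)
{
	if (!name || size == 0)
		return -1;
	return 0;
}
#else
/* Non-debug build: the check disappears, the call site stays the same. */
static inline int sanity_check(const char *name, size_t size)
{
	(void)name;
	(void)size;
	return 0;
}
#endif

static int lock_depth;	/* stands in for the mutex held around creation */

/* Returns 0 on success, -1 if the debug checks reject the arguments. */
static int create_cache(const char *name, size_t size)
{
	int ret;

	lock_depth++;				/* "lock" */
	ret = sanity_check(name, size);		/* no goto, so no label */
	if (ret == 0) {
		/* ... the real creation work would go here ... */
	}
	lock_depth--;				/* "unlock" on every path */
	return ret;
}
```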

Pekka

Re: [PATCH TRIVIAL] mm: Fix build warning in kmem_cache_create()

2012-07-31 Thread Pekka Enberg

On Mon, 30 Jul 2012, David Rientjes wrote:
 So much for compromise, I thought we had agreed that at least some of the 
 checks for !name, in_interrupt() or bad size values should be moved out 
 from under the #ifdef CONFIG_DEBUG_VM, but this wasn't done.  This 
 discussion would be irrelevant if we actually did what we talked about.

I didn't want to change the checks at the last minute and invalidate 
testing in linux-next but I'm more than happy to merge such a patch when 
the merge window closes.

Pekka

[PATCH] sched: Document schedule() entry points

2012-07-31 Thread Pekka Enberg

This patch adds a comment on top of the schedule() function to explain
to scheduler newbies how the main scheduler function is entered.

Explained-by: Ingo Molnar mi...@kernel.org
Explained-by: Peter Zijlstra a.p.zijls...@chello.nl
Signed-off-by: Pekka Enberg penb...@kernel.org
---
 kernel/sched/core.c |   34 ++
 1 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 468bdd4..9f31bbd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3361,6 +3361,40 @@ pick_next_task(struct rq *rq)
 
 /*
  * __schedule() is the main scheduler function.
+ *
+ * The main means of driving the scheduler and thus entering this function are:
+ *
+ *   1. Explicit blocking: mutex, semaphore, waitqueue, etc.
+ *
+ *   2. TIF_NEED_RESCHED flag is checked on interrupt and userspace return
+ *  paths. For example, see arch/x86/entry_64.S.
+ *
+ *  To drive preemption between tasks, the scheduler sets the flag in
+ *  the timer interrupt handler scheduler_tick().
+ *
+ *   3. Wakeups don't really cause entry into schedule(). They add a
+ *  task to the run-queue and that's it.
+ *
+ *  Now, if the new task added to the run-queue preempts the current
+ *  task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets
+ *  called on the nearest possible occasion:
+ *
+ *   - If the kernel is preemptible (CONFIG_PREEMPT=y):
+ *
+ * - in syscall or exception context, at the next outmost
+ *   preempt_enable(). (this might be as soon as the wake_up()'s
+ *   spin_unlock()!)
+ *
+ * - in IRQ context, return from interrupt-handler to
+ *   preemptible context
+ *
+ *   - If the kernel is not preemptible (CONFIG_PREEMPT is not set)
+ * then at the next:
+ *
+ *  - cond_resched() call
+ *  - explicit schedule() call
+ *  - return from syscall or exception to user-space
+ *  - return from interrupt-handler to user-space
  */
 static void __sched __schedule(void)
 {
-- 
1.7.7.6


Re: [RFC/PATCH] Use kernel supplied MMU info for kvm tool

2012-07-31 Thread Pekka Enberg

On Wed, 18 Jul 2012, Michael Ellerman wrote:
 It occurred to me overnight that I forgot to mention that in order to
 build the new code you need the headers from a 3.5-rc1 era kernel (for
 the ioctl & KVM_CAP definitions).
 
 The easiest way to do that is to merge linus' tree into kvmtool.
 
 Are you planning on doing that in the master kvmtool tree anytime soon?
 It's still based on 3.4-rc1 it seems.

Done. Sorry for the delay!

Re: [patch v3.6] mm, slab: lock the correct nodelist after reenabling irqs

2012-08-30 Thread Pekka Enberg

On Wed, Aug 29, 2012 at 2:41 PM, Haggai Eran hagg...@mellanox.com wrote:
 Looks like a problem in 072bb0aa5e0 (mm: sl[au]b: add knowledge of
 PFMEMALLOC reserve pages).  cache_grow() can reenable irqs which allows
 this to be scheduled on a different cpu, possibly with a different node.
 So it turns out that we lock the wrong node's list_lock because we don't
 check the new node id when irqs are disabled again.

 I doubt you can reliably reproduce this, but the following should fix the
 issue.
 Your patch did solve the issue. Thanks!

Applied, thanks!

Re: [PATCH] perf/x86: Disable uncore on virtualized CPU.

2012-08-31 Thread Pekka Enberg

On Tue, Aug 21, 2012 at 12:08 PM, Yan, Zheng zheng.z@intel.com wrote:
 From: Yan, Zheng zheng.z@intel.com

 Initializing uncore PMU on virtualized CPU may hang the kernel.
 This is because kvm does not emulate the entire hardware. There
 are lots of uncore related MSRs, making kvm enumerate them all
 is a non-trivial task. So just disable uncore on virtualized CPU.

 Signed-off-by: Yan, Zheng zheng.z@intel.com
 ---
  arch/x86/kernel/cpu/perf_event_intel_uncore.c | 3 +++
  1 file changed, 3 insertions(+)

 diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c 
 b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
 index 0a55710..2f005ba 100644
 --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
 +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
 @@ -2898,6 +2898,9 @@ static int __init intel_uncore_init(void)
 if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
 return -ENODEV;

 +   if (cpu_has_hypervisor)
 +   return -ENODEV;
 +
 ret = uncore_pci_init();
 if (ret)
 goto fail;

On Tue, Aug 21, 2012 at 3:55 PM, Pekka Enberg penb...@kernel.org wrote:
 Tested-by: Pekka Enberg penb...@kernel.org

Ping? I have not seen a tip bot email for this. Is the patch queued?

Re: [PATCH] perf bench: fix assert when NDEBUG is defined

2012-09-02 Thread Pekka Enberg

On Mon, Sep 3, 2012 at 4:45 AM, Namhyung Kim namhy...@kernel.org wrote:
 On Mon, 3 Sep 2012 03:04:32 +0300, Irina Tirdea wrote:
 From: Irina Tirdea irina.tir...@intel.com

 When NDEBUG is defined, the assert macro will be expanded to nothing.
 Some assert calls used in perf are also including some functionality
 (e.g. system calls), not only validity checks. Therefore, if NDEBUG is
 defined, this functionality will be removed along with the assert.

 The functionality of the program needs to be separated from the assert 
 checks.
 In perf, BUG_ON is also defined on assert, so we need to fix these statements
 too.

 Signed-off-by: Irina Tirdea irina.tir...@intel.com
 ---
  tools/perf/bench/mem-memcpy.c |8 +---
  tools/perf/bench/mem-memset.c |8 +---
  tools/perf/bench/sched-pipe.c |6 --
  3 files changed, 14 insertions(+), 8 deletions(-)

 diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
 index 02dad5d..bccb783 100644
 --- a/tools/perf/bench/mem-memcpy.c
 +++ b/tools/perf/bench/mem-memcpy.c
 @@ -144,17 +144,19 @@ static double do_memcpy_gettimeofday(memcpy_t
 fn, size_t len, bool prefault)
  {
   struct timeval tv_start, tv_end, tv_diff;
   void *src = NULL, *dst = NULL;
 - int i;
 + int i, ret;

   alloc_mem(&src, &dst, len);

   if (prefault)
   fn(dst, src, len);

 - BUG_ON(gettimeofday(&tv_start, NULL));
 + ret = gettimeofday(&tv_start, NULL);
 + BUG_ON(ret);

 I think one good thing about assert is that it outputs the exact failure
 condition when it fails.  So with the patch, it will convert

   Assertion `gettimeofday(&tv_start, NULL)' failed.

 into

   Assertion `ret' failed.

 which is not so informative.

 So I'd rather suggest using more descriptive names like ret_gtod ?

No, please don't do that. That'll make the code ugly and it's really
just papering over the fact that the assertions should be converted to
proper error handling.
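
For the record, a minimal sketch of what "proper error handling" could
mean for the gettimeofday() call in the quoted hunk. The helper name
read_start_time is invented; how perf bench would propagate the failure
is left open here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

/*
 * Hypothetical helper: take the start timestamp and report failure to the
 * caller instead of wrapping the call in assert()/BUG_ON(). The call is
 * made unconditionally, so it does not vanish when NDEBUG is defined.
 */
static int read_start_time(struct timeval *tv)
{
	if (gettimeofday(tv, NULL)) {
		fprintf(stderr, "gettimeofday failed: %s\n", strerror(errno));
		return -1;
	}
	return 0;
}
```

The error message names the failing call, so no specially named temporary
variable is needed.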

Re: [PATCH -v2 01/13] x86, mm: Add global page_size_mask

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 detect if need to use 1G or 2M and store them in page_size_mask.

 Only probe them one time.

 Suggested-by: Ingo Molnar mi...@elte.hu
 Signed-off-by: Yinghai Lu ying...@kernel.org

Reviewed-by: Pekka Enberg penb...@kernel.org

Re: [PATCH -v2 02/13] x86, mm: Split out split_mem_range

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 from init_memory_mapping, so make init_memory_mapping readable.

 Suggested-by: Ingo Molnar mi...@elte.hu
 Signed-off-by: Yinghai Lu ying...@kernel.org

Reviewed-by: Pekka Enberg penb...@kernel.org

Re: [PATCH -v2 03/13] x86, mm: Moving init_memory_mapping calling

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 from setup.c to mm/init.c

 So could update all related calling together later

 Signed-off-by: Yinghai Lu ying...@kernel.org

Reviewed-by: Pekka Enberg penb...@kernel.org

Re: [PATCH -v2 04/13] x86, mm: Revert back good_end setting for 64bit

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 So we could put page table high again for 64bit.

 Signed-off-by: Yinghai Lu ying...@kernel.org

The changelog for this is too terse for me to actually understand why
this is needed.

 ---
  arch/x86/mm/init.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
 index 15a6a38..cca9b7d 100644
 --- a/arch/x86/mm/init.c
 +++ b/arch/x86/mm/init.c
 @@ -76,8 +76,8 @@ static void __init find_early_table_space(struct map_range 
 *mr,
  #ifdef CONFIG_X86_32
 /* for fixmap */
 tables += roundup(__end_of_fixed_addresses * sizeof(pte_t), 
 PAGE_SIZE);
 -#endif
 good_end = max_pfn_mapped << PAGE_SHIFT;
 +#endif

 base = memblock_find_in_range(start, good_end, tables, PAGE_SIZE);
 if (!base)
 --
 1.7.7


Re: [PATCH -v2 05/13] x86, mm: Find early page table only one time

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 Should not do that in every calling of init_memory_mapping.
 Actually in early time, only need do once.

 Also move down early_memtest.

 Signed-off-by: Yinghai Lu ying...@kernel.org

The changelog is too terse for my liking. I think it could use some
more context on what the code is actually doing now and why the change
makes it better.

 ---
  arch/x86/mm/init.c |   72 ++-
  1 files changed, 37 insertions(+), 35 deletions(-)

 diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
 index cca9b7d..0ada295 100644
 --- a/arch/x86/mm/init.c
 +++ b/arch/x86/mm/init.c
 @@ -37,7 +37,7 @@ struct map_range {

  static int page_size_mask;

 -static void __init find_early_table_space(struct map_range *mr,
 +static void __init find_early_table_space(unsigned long begin,
   unsigned long end)
  {
 unsigned long puds, pmds, ptes, tables, start = 0, good_end = end;
 @@ -64,8 +64,8 @@ static void __init find_early_table_space(struct map_range 
 *mr,
 extra += PMD_SIZE;
  #endif
 /* The first 2/4M doesn't use large pages. */
 -   if (mr->start < PMD_SIZE)
 -   extra += mr->end - mr->start;
 +   if (begin < PMD_SIZE)
 +   extra += (PMD_SIZE - begin) >> PAGE_SHIFT;

 ptes = (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
 } else
 @@ -265,16 +265,6 @@ unsigned long __init_refok init_memory_mapping(unsigned 
 long start,
 nr_range = 0;
 nr_range = split_mem_range(mr, nr_range, start, end);

 -   /*
 -* Find space for the kernel direct mapping tables.
 -*
 -* Later we should allocate these tables in the local node of the
 -* memory mapped. Unfortunately this is done currently before the
 -* nodes are discovered.
 -*/
 -   if (!after_bootmem)
 -   find_early_table_space(&mr[0], end);
 -
 for (i = 0; i < nr_range; i++)
 ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,
mr[i].page_size_mask);
 @@ -287,6 +277,36 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,

 __flush_tlb_all();

 +   return ret >> PAGE_SHIFT;
 +}
 +
 +void __init init_mem_mapping(void)
 +{
 +   probe_page_size_mask();
 +
 +   /*
 +* Find space for the kernel direct mapping tables.
 +*
 +* Later we should allocate these tables in the local node of the
 +* memory mapped. Unfortunately this is done currently before the
 +* nodes are discovered.
 +*/
 +#ifdef CONFIG_X86_64
 +   find_early_table_space(0, max_pfn<<PAGE_SHIFT);
 +#else
 +   find_early_table_space(0, max_low_pfn<<PAGE_SHIFT);
 +#endif
 +   max_low_pfn_mapped = init_memory_mapping(0, max_low_pfn<<PAGE_SHIFT);
 +   max_pfn_mapped = max_low_pfn_mapped;
 +
 +#ifdef CONFIG_X86_64
 +   if (max_pfn > max_low_pfn) {
 +   max_pfn_mapped = init_memory_mapping(1UL<<32,
 +max_pfn<<PAGE_SHIFT);
 +   /* can we preseve max_low_pfn ?*/
 +   max_low_pfn = max_pfn;
 +   }
 +#endif
 /*
  * Reserve the kernel pagetable pages we used (pgt_buf_start -
  * pgt_buf_end) and free the other ones (pgt_buf_end - pgt_buf_top)
 @@ -302,32 +322,14 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
  * RO all the pagetable pages, including the ones that are beyond
  * pgt_buf_end at that time.
  */
 -   if (!after_bootmem && pgt_buf_end > pgt_buf_start)
 +   if (pgt_buf_end > pgt_buf_start)
 x86_init.mapping.pagetable_reserve(PFN_PHYS(pgt_buf_start),
 PFN_PHYS(pgt_buf_end));

 -   if (!after_bootmem)
 -   early_memtest(start, end);
 +   /* stop the wrong using */
 +   pgt_buf_top = 0;

 -   return ret >> PAGE_SHIFT;
 -}
 -
 -void __init init_mem_mapping(void)
 -{
 -   probe_page_size_mask();
 -
 -   /* max_pfn_mapped is updated here */
 -   max_low_pfn_mapped = init_memory_mapping(0, max_low_pfn<<PAGE_SHIFT);
 -   max_pfn_mapped = max_low_pfn_mapped;
 -
 -#ifdef CONFIG_X86_64
 -   if (max_pfn > max_low_pfn) {
 -   max_pfn_mapped = init_memory_mapping(1UL<<32,
 -max_pfn<<PAGE_SHIFT);
 -   /* can we preseve max_low_pfn ?*/
 -   max_low_pfn = max_pfn;
 -   }
 -#endif
 +   early_memtest(0, max_pfn_mapped << PAGE_SHIFT);
  }

  /*
 --
 1.7.7
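
For readers following the arithmetic behind find_early_table_space(), here is a rough userspace sketch of how much room the direct-mapping tables need. It assumes x86-64 sizes and 4K pages only; the real function also accounts for large pages and 32-bit layouts, and table_space_4k() is a hypothetical name, not a kernel function.

```c
#include <stdint.h>

/* Assumed x86-64 constants; the kernel takes these from its own headers. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define PMD_SHIFT  21
#define PMD_SIZE   (1ULL << PMD_SHIFT)
#define PUD_SHIFT  30
#define PUD_SIZE   (1ULL << PUD_SHIFT)
#define ENTRY_SIZE 8ULL /* bytes per page-table entry */

/* Round x up to the next multiple of a. */
static uint64_t round_up_to(uint64_t x, uint64_t a)
{
    return (x + a - 1) / a * a;
}

/*
 * Hypothetical helper: bytes of page tables needed to map [0, end)
 * with 4K pages only. Each level is rounded up to whole pages.
 */
static uint64_t table_space_4k(uint64_t end)
{
    uint64_t puds = (end + PUD_SIZE - 1) >> PUD_SHIFT;
    uint64_t pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT;
    uint64_t ptes = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;

    return round_up_to(puds * ENTRY_SIZE, PAGE_SIZE) +
           round_up_to(pmds * ENTRY_SIZE, PAGE_SIZE) +
           round_up_to(ptes * ENTRY_SIZE, PAGE_SIZE);
}
```

Mapping 1 GiB this way needs one page of PUD entries, one page of PMD entries and 512 pages of PTEs, which is why the estimate only has to be done once for the whole range rather than on every init_memory_mapping() call.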


Re: [PATCH -v2 06/13] x86, mm: Separate out calculate_table_space_size()

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 it should take physical address range that will need to be mapped.
 and find_early_table_space should take range that pgt buff should be in.
 Separate those two to reduce confusion.

 Signed-off-by: Yinghai Lu ying...@kernel.org

Reviewed-by: Pekka Enberg penb...@kernel.org

Re: [PATCH -v2 07/13] x86, mm: Move down two calculate_table_space_size down.

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 So later could make it call split_mem_range...

 Signed-off-by: Yinghai Lu ying...@kernel.org

The commit title is utterly confusing. And it has a trailing dot (.).

As for the actual change:

Reviewed-by: Pekka Enberg penb...@kernel.org

Re: [PATCH -v2 08/13] x86: if kernel .text .data .bss are not marked as E820_RAM, complain and fix

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 From: Jacob Shin jacob.s...@amd.com

 There could be cases where user supplied memmap=exactmap memory
 mappings do not mark the region where the kernel .text .data and
 .bss reside as E820_RAM, as reported here:

 https://lkml.org/lkml/2012/8/14/86

 Handle it by complaining, and adding the range back into the e820.

 Signed-off-by: Jacob Shin jacob.s...@amd.com

This should have Yinghai's sign-off and the warning could be less cryptic.

As for the fix itself:

Reviewed-by: Pekka Enberg penb...@kernel.org

Re: [PATCH -v2 09/13] x86: Fixup code testing if a pfn is direct mapped

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 From: Jacob Shin jacob.s...@amd.com

 Update code that previously assumed pfns [ 0 - max_low_pfn_mapped ) and
 [ 4GB - max_pfn_mapped ) were always direct mapped, to now look up
 pfn_mapped ranges instead.

What problem does this fix? How did you find out about it?

 -v2: change applying sequence to keep git bisecting working.
  so add dummy pfn_range_is_mapped(). - Yinghai Lu

 Signed-off-by: Jacob Shin jacob.s...@amd.com

Yinghai's sign-off is missing.

Re: [PATCH -v2 10/13] x86: Only direct map addresses that are marked as E820_RAM

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 From: Jacob Shin jacob.s...@amd.com

 Currently direct mappings are created for [ 0 to max_low_pfn<<PAGE_SHIFT )
 and [ 4GB to max_pfn<<PAGE_SHIFT ), which may include regions that are not
 backed by actual DRAM. This is fine for holes under 4GB which are covered
 by fixed and variable range MTRRs to be UC. However, we run into trouble
 on higher memory addresses which cannot be covered by MTRRs.

 Our system with 1TB of RAM has an e820 that looks like this:

  BIOS-e820: [mem 0x0000000000000000-0x00000000000983ff] usable
  BIOS-e820: [mem 0x0000000000098400-0x000000000009ffff] reserved
  BIOS-e820: [mem 0x00000000000d0000-0x00000000000fffff] reserved
  BIOS-e820: [mem 0x0000000000100000-0x00000000c7ebffff] usable
  BIOS-e820: [mem 0x00000000c7ec0000-0x00000000c7ed7fff] ACPI data
  BIOS-e820: [mem 0x00000000c7ed8000-0x00000000c7ed9fff] ACPI NVS
  BIOS-e820: [mem 0x00000000c7eda000-0x00000000c7ffffff] reserved
  BIOS-e820: [mem 0x00000000fec00000-0x00000000fec0ffff] reserved
  BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
  BIOS-e820: [mem 0x00000000fff00000-0x00000000ffffffff] reserved
  BIOS-e820: [mem 0x0000000100000000-0x000000e037ffffff] usable
  BIOS-e820: [mem 0x000000e038000000-0x000000fcffffffff] reserved
  BIOS-e820: [mem 0x0000010000000000-0x0000011ffeffffff] usable

 and so direct mappings are created for huge memory hole between
 0x000000e038000000 to 0x0000010000000000. Even though the kernel never
 generates memory accesses in that region, since the page tables mark
 them incorrectly as being WB, our (AMD) processor ends up causing a MCE
 while doing some memory bookkeeping/optimizations around that area.

 This patch iterates through e820 and only direct maps ranges that are
 marked as E820_RAM, and keeps track of those pfn ranges. Depending on
 the alignment of E820 ranges, this may possibly result in using smaller
 size (i.e. 4K instead of 2M or 1G) page tables.

 -v2: move changes from setup.c to mm/init.c, also use for_each_mem_pfn_range
 instead.  - Yinghai Lu
 -v3: add calculate_all_table_space_size() to get correct needed page table
 size. - Yinghai Lu

 Signed-off-by: Jacob Shin jacob.s...@amd.com

Yinghai's sign-off is missing.

Reviewed-by: Pekka Enberg penb...@kernel.org
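
The approach described in the changelog above (walk the firmware map, direct map only RAM, remember the resulting pfn ranges) can be sketched in userspace roughly as follows. This is an illustration under stated assumptions, not the patch itself: the struct layouts, REGION_* constants, MAX_RANGES cap and function names are all hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumed 4K pages */
#define MAX_RANGES 16 /* arbitrary cap for this sketch */

enum region_type { REGION_RAM, REGION_RESERVED }; /* stand-ins for E820_* types */

struct region {
    uint64_t start, end; /* byte addresses, [start, end) */
    enum region_type type;
};

struct pfn_range {
    uint64_t start_pfn, end_pfn; /* page frame numbers, [start, end) */
};

static struct pfn_range mapped[MAX_RANGES];
static size_t nr_mapped;

/* Record a "direct mapping" for every RAM entry and skip everything else. */
static void direct_map_ram(const struct region *tbl, size_t n)
{
    nr_mapped = 0;
    for (size_t i = 0; i < n && nr_mapped < MAX_RANGES; i++) {
        if (tbl[i].type != REGION_RAM)
            continue; /* holes and reserved areas get no mapping */
        mapped[nr_mapped].start_pfn = tbl[i].start >> PAGE_SHIFT;
        mapped[nr_mapped].end_pfn = tbl[i].end >> PAGE_SHIFT;
        nr_mapped++;
    }
}

/* True only if [start_pfn, end_pfn) lies entirely inside one recorded range. */
static bool range_is_mapped(uint64_t start_pfn, uint64_t end_pfn)
{
    for (size_t i = 0; i < nr_mapped; i++)
        if (start_pfn >= mapped[i].start_pfn && end_pfn <= mapped[i].end_pfn)
            return true;
    return false;
}
```

With a table that has RAM on both sides of a reserved hole, a lookup for a pfn inside the hole returns false, which is the property the earlier "assume everything below max_pfn_mapped is mapped" checks could not express.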

Re: [PATCH -v2 11/13] x86/mm: calculate_table_space_size based on memory ranges that are being mapped

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 From: Jacob Shin jacob.s...@amd.com

 Current logic finds enough space for direct mapping page tables from 0
 to end. Instead, we only need to find enough space to cover mr[0].start
 to mr[nr_range].end -- the range that is actually being mapped by
 init_memory_mapping()

 This patch also reportedly fixes suspend/resume issue reported in:

 https://lkml.org/lkml/2012/8/11/83

 -v2: update with calculate_table_space_size()
  clear max_pfn_mapped before init_all_memory_mapping to get right value
   -Yinghai Lu

 Signed-off-by: Jacob Shin jacob.s...@amd.com

Yinghai's sign-off is missing.

Reviewed-by: Pekka Enberg penb...@kernel.org

Re: [PATCH -v2 12/13] x86, mm: Use func pointer to table size calculation and mapping.

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 +static void __init with_all_ram_ranges(
 +   void (*work_fn)(unsigned long, unsigned long, void *),
 +   void *data)

 +static void __init size_work_fn(unsigned long start, unsigned long end, void *data)

 +static void __init mapping_work_fn(unsigned long start, unsigned long end,
 +void *data)

So I passionately hate the naming convention. How about something
similar to mm/pagewalk.c:

  s/with_all_ram_ranges/walk_ram_ranges/g

  s/size_work_fn/table_space_size/g

  s/mapping_work_fn/map_memory/g
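
To show how the suggested names read at a call site, here is a small userspace sketch of a range walker driven by a callback. It is only an illustration: the array-based signature, struct ram_range and the callback bodies are assumptions made for the example, whereas the function in the patch iterates memblock ranges itself.

```c
#include <stddef.h>

struct ram_range {
    unsigned long start, end; /* [start, end) */
};

typedef void (*ram_range_fn)(unsigned long start, unsigned long end, void *data);

/* Call fn once per range, in order, passing data through untouched. */
static void walk_ram_ranges(const struct ram_range *r, size_t n,
                            ram_range_fn fn, void *data)
{
    for (size_t i = 0; i < n; i++)
        fn(r[i].start, r[i].end, data);
}

/* First pass: accumulate how many bytes will need page tables. */
static void table_space_size(unsigned long start, unsigned long end, void *data)
{
    *(unsigned long *)data += end - start;
}

/* Second pass: stand-in for the real mapping work; just counts calls. */
static void map_memory(unsigned long start, unsigned long end, void *data)
{
    (void)start;
    (void)end;
    ++*(unsigned int *)data;
}
```

A caller then writes walk_ram_ranges(r, n, table_space_size, &bytes) followed by walk_ram_ranges(r, n, map_memory, &count), so both passes visit the ranges in the same sequence and each callback name says what it does.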

Re: [PATCH -v2 13/13] x86, 64bit: Map first 1M ram early before memblock_x86_fill()

2012-09-02 Thread Pekka Enberg

On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 This one intend to fix bugs:
 when efi booting have too many memmap entries, will need to double memblock
 memory array or reserved array.

Okay, why do we need to do that?

 +RESERVE_BRK(early_pgt_alloc, 65536);

What is this needed for?

 +void  __init early_init_mem_mapping(void)
 +{
 +   unsigned long tables;
 +   phys_addr_t base;
 +   unsigned long start = 0, end = ISA_END_ADDRESS;
 +
 +   probe_page_size_mask();
 +
 +   if (max_pfn_mapped)
 +   return;

I find this confusing - what is this protecting against? Why is
'max_pfn_mapped' set when someone calls early_init_mem_mapping()?

Side note: we have multiple pfn_mapped globals and it's not at all
obvious to me what the semantics for them are. Maybe adding a comment
or two in arch/x86/include/asm/page_types.h would help.

 +
 +   tables = calculate_table_space_size(start, end);
 +   base = __pa(extend_brk(tables, PAGE_SIZE));
 +

Re: [PATCH -v2 13/13] x86, 64bit: Map first 1M ram early before memblock_x86_fill()

2012-09-03 Thread Pekka Enberg

On Mon, Sep 3, 2012 at 9:17 AM, Yinghai Lu ying...@kernel.org wrote:
 On Sun, Sep 2, 2012 at 10:50 PM, Pekka Enberg penb...@kernel.org wrote:
 On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu ying...@kernel.org wrote:
 This one intend to fix bugs:
 when efi booting have too many memmap entries, will need to double memblock
 memory array or reserved array.

 Okay, why do we need to do that?

 memblock initial memory only have 128 entry, and some EFI system could
 have more entries than that.

 So during memblock_x86_fill need to double that array.

 and efi_reserve_boot_services() could make thing more worse. aka need
 more entries in memblock.memory.regions.

Aah. Care to put that information in the changelog?

 +void  __init early_init_mem_mapping(void)
 +{
 +   unsigned long tables;
 +   phys_addr_t base;
 +   unsigned long start = 0, end = ISA_END_ADDRESS;
 +
 +   probe_page_size_mask();
 +
 +   if (max_pfn_mapped)
 +   return;

 I find this confusing - what is this protecting against? Why is
 'max_pfn_mapped' set when someone calls early_init_mem_mapping()?

 for 32 bit, it will non zero max_pfn_mapped set in head_32.S

OK, that's why my grep missed it. A comment would be nice.

 Side note: we have multiple pfn_mapped globals and it's not at all
 obvious to me what the semantics for them are. Maybe adding a comment
 or two in arch/x86/include/asm/page_types.h would help.

 move the comments  from arch/x86/kernel/setup.c to that header file ?

Yup, or move the globals together with the comment to arch/x86/mm/init.c.

That said, max_pfn_high_mapped really ought to be kept together with
the other pfn_mapped globals and the comment should be updated.
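
The array-doubling behaviour discussed above can be pictured with a small userspace sketch. The 128-entry starting size is the figure quoted in the reply; everything else (struct names, region_array_add(), the use of malloc()) is an assumption for illustration. In the kernel the new array has to come from memory that is already usable, which is presumably why the series wants the first 1M mapped before memblock_x86_fill() runs.

```c
#include <stdlib.h>
#include <string.h>

#define INIT_REGIONS 128 /* initial capacity, as quoted in the thread */

struct region {
    unsigned long base, size;
};

struct region_array {
    struct region *regions;            /* points at init[] until the first grow */
    size_t cnt, max;
    struct region init[INIT_REGIONS];  /* static initial storage */
};

static void region_array_init(struct region_array *a)
{
    a->regions = a->init;
    a->cnt = 0;
    a->max = INIT_REGIONS;
}

/* Append one region, doubling the backing array when it is full. */
static int region_array_add(struct region_array *a,
                            unsigned long base, unsigned long size)
{
    if (a->cnt == a->max) {
        size_t new_max = a->max * 2;
        struct region *n = malloc(new_max * sizeof(*n)); /* stand-in allocator */

        if (!n)
            return -1;
        memcpy(n, a->regions, a->cnt * sizeof(*n));
        if (a->regions != a->init)
            free(a->regions);
        a->regions = n;
        a->max = new_max;
    }
    a->regions[a->cnt].base = base;
    a->regions[a->cnt].size = size;
    a->cnt++;
    return 0;
}
```

Adding a 129th entry is the point where the sketch has to allocate, mirroring the case of a firmware memory map with more entries than the initial array can hold.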

Re: [PATCH -v3 14/14] x86, mm: Map ISA area with connected ram range at the same time

2012-09-05 Thread Pekka Enberg

On Wed, Sep 5, 2012 at 8:46 AM, Yinghai Lu ying...@kernel.org wrote:
 so could reduce one loop.

 Signed-off-by: Yinghai Lu ying...@kernel.org

How significant is the speed gain? The isa_done flag makes code flow
more difficult to follow.

 ---
  arch/x86/mm/init.c |   21 ++---
  1 files changed, 14 insertions(+), 7 deletions(-)

 diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
 index 6663f61..e69f832 100644
 --- a/arch/x86/mm/init.c
 +++ b/arch/x86/mm/init.c
 @@ -248,20 +248,27 @@ static void __init walk_ram_ranges(
 void *data)
  {
 unsigned long start_pfn, end_pfn;
 +   bool isa_done = false;
 int i;

 -   /* the ISA range is always mapped regardless of memory holes */
 -   work_fn(0, ISA_END_ADDRESS, data);
 -
 for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
 u64 start = start_pfn << PAGE_SHIFT;
 u64 end = end_pfn << PAGE_SHIFT;

 -   if (end <= ISA_END_ADDRESS)
 -   continue;
 +   if (!isa_done && start > ISA_END_ADDRESS) {
 +   work_fn(0, ISA_END_ADDRESS, data);
 +   isa_done = true;
 +   } else {
 +   if (end < ISA_END_ADDRESS)
 +   continue;
 +
 +   if (start <= ISA_END_ADDRESS &&
 +   end >= ISA_END_ADDRESS) {
 +   start = 0;
 +   isa_done = true;
 +   }
 +   }

 -   if (start < ISA_END_ADDRESS)
 -   start = ISA_END_ADDRESS;
  #ifdef CONFIG_X86_32
 /* on 32 bit, we only map up to max_low_pfn */
 if ((start >> PAGE_SHIFT) >= max_low_pfn)
 --
 1.7.7


Re: [PATCH -v3 13/14] x86, mm: Use func pointer to table size calculation and mapping

2012-09-05 Thread Pekka Enberg

On Wed, Sep 5, 2012 at 8:46 AM, Yinghai Lu ying...@kernel.org wrote:
 They all need to go over ram range in same sequence. So add shared function
 to reduce duplicated code.

 -v2: Change to walk_ram_ranges() according to Pekka Enberg.

 Signed-off-by: Yinghai Lu ying...@kernel.org

Reviewed-by: Pekka Enberg penb...@kernel.org

Re: [PATCH 7/7] perf gtk/browser: Use hist_period_print functions

2012-08-15 Thread Pekka Enberg

On Mon, 6 Aug 2012, Namhyung Kim wrote:
 Now we can support color using pango markup with this change.
 
 Cc: Pekka Enberg penb...@kernel.org
 Signed-off-by: Namhyung Kim namhy...@kernel.org

Awesome!

Acked-by: Pekka Enberg penb...@kernel.org

Re: [PATCH v3] mm: Restructure kmem_cache_create() to move debug cache integrity checks into a new function

2012-08-16 Thread Pekka Enberg

On Thu, Aug 16, 2012 at 2:53 AM, Andrew Morton
a...@linux-foundation.org wrote:
 On Sun, 12 Aug 2012 10:40:18 -0600
 Shuah Khan shuah.k...@hp.com wrote:

 kmem_cache_create() does cache integrity checks when CONFIG_DEBUG_VM
 is defined. These checks interspersed with the regular code path has
 lead to compile time warnings when compiled without CONFIG_DEBUG_VM
 defined. Restructuring the code to move the integrity checks in to a new
 function would eliminate the current compile warning problem and also
 will allow for future changes to the debug only code to evolve without
 introducing new warnings in the regular path. This restructuring work
 is based on the discussion in the following thread:

 Your patch appears to be against some ancient old kernel, such as 3.5.
 I did this:

 --- a/mm/slab_common.c~mm-slab_commonc-restructure-kmem_cache_create-to-move-debug-cache-integrity-checks-into-a-new-function-fix
 +++ a/mm/slab_common.c
 @@ -101,15 +101,8 @@ struct kmem_cache *kmem_cache_create(con

 get_online_cpus();
 mutex_lock(&slab_mutex);
 -
 -   if (kmem_cache_sanity_check(name, size))
 -   goto oops;
 -
 -   s = __kmem_cache_create(name, size, align, flags, ctor);
 -
 -#ifdef CONFIG_DEBUG_VM
 -oops:
 -#endif
 +   if (kmem_cache_sanity_check(name, size) == 0)
 +   s = __kmem_cache_create(name, size, align, flags, ctor);
 mutex_unlock(&slab_mutex);
 put_online_cpus();

Yup. Shuah, care to spin another version against slab/next?
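
The restructuring under discussion follows a common pattern: keep the debug-only checks in their own function and provide a stub that always succeeds when the debug option is off, so the main path needs no #ifdef or goto label. A minimal userspace sketch, assuming a DEBUG_CHECKS macro as a stand-in for CONFIG_DEBUG_VM and hypothetical function names:

```c
#include <stddef.h>
#include <string.h>

#ifdef DEBUG_CHECKS
/* Debug build: reject obviously bad arguments. */
static int cache_sanity_check(const char *name, size_t size)
{
    if (!name || strlen(name) == 0 || size == 0)
        return -1;
    return 0;
}
#else
/* Non-debug build: the check compiles away to "always fine". */
static inline int cache_sanity_check(const char *name, size_t size)
{
    (void)name;
    (void)size;
    return 0;
}
#endif

/* Main path: identical in both configurations, no labels or #ifdefs. */
static int cache_create(const char *name, size_t size)
{
    int created = 0;

    /* lock would be taken here */
    if (cache_sanity_check(name, size) == 0)
        created = 1; /* stands in for the real creation call */
    /* lock would be released here */
    return created;
}
```

Because the stub has the same signature as the real check, the caller never refers to a label that exists only in one configuration, which is the kind of construct that produced the compile-time warnings mentioned in the changelog.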

Re: [PATCH 2/3] slub: reduce failure of this_cpu_cmpxchg in put_cpu_partial() after unfreezing

2012-08-16 Thread Pekka Enberg

On Sat, 23 Jun 2012, Joonsoo Kim wrote:
 In current implementation, after unfreezing, we doesn't touch oldpage,
 so it remain 'NOT NULL'. When we call this_cpu_cmpxchg()
 with this old oldpage, this_cpu_cmpxchg() is mostly be failed.
 
 We can change value of oldpage to NULL after unfreezing,
 because unfreeze_partial() ensure that all the cpu partial slabs is removed
 from cpu partial list. In this time, we could expect that
 this_cpu_cmpxchg is mostly succeed.
 
 Signed-off-by: Joonsoo Kim js1...@gmail.com
 
 diff --git a/mm/slub.c b/mm/slub.c
 index 92f1c0e..531d8ed 100644
 --- a/mm/slub.c
 +++ b/mm/slub.c
 @@ -1968,6 +1968,7 @@ int put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
   local_irq_save(flags);
   unfreeze_partials(s);
   local_irq_restore(flags);
 + oldpage = NULL;
   pobjects = 0;
   pages = 0;
   stat(s, CPU_PARTIAL_DRAIN);

Applied, thanks!
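
The effect described in the changelog can be reproduced with a compare-and-swap loop in plain C11. This is a single-threaded model under stated assumptions: a global atomic pointer stands in for the per-cpu partial list, the drain condition is reduced to a flag, and put_partial() is a hypothetical name that returns how many attempts the loop needed.

```c
#include <stdatomic.h>
#include <stddef.h>

struct page {
    struct page *next;
};

/* Stand-in for the per-cpu partial list head. */
static _Atomic(struct page *) partial;

/* Pretend to move every partial slab elsewhere; the list ends up empty. */
static void unfreeze_partials(void)
{
    atomic_store(&partial, NULL);
}

/*
 * Push page onto the list. If drain is set and the list is non-empty,
 * the list is emptied first. reset_oldpage selects whether the expected
 * value is updated to NULL after draining. Returns the number of
 * compare-and-swap attempts.
 */
static int put_partial(struct page *page, int drain, int reset_oldpage)
{
    struct page *oldpage;
    int attempts = 0;

    do {
        oldpage = atomic_load(&partial);
        if (oldpage && drain) {
            unfreeze_partials();
            if (reset_oldpage)
                oldpage = NULL; /* list is known to be empty now */
        }
        page->next = oldpage;
        attempts++;
    } while (!atomic_compare_exchange_strong(&partial, &oldpage, page));

    return attempts;
}
```

Without the reset, the first exchange compares the now-empty list against the stale non-NULL pointer and fails, so the loop runs twice; with the reset it succeeds on the first attempt.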

Re: Drop support for x86-32

2012-08-23 Thread Pekka Enberg

Dear wbrana,

On Thu, Aug 23, 2012 at 9:17 PM, wbrana wbr...@gmail.com wrote:
 I'm user space developer. User space software also needs more time if
 more ABIs are supported.

I feel your pain.

As much as I appreciate your contribution here on LKML, I can't help
thinking that this discussion would be best continued on the
linux-visionaries mailing list.

Your pal,

Pekka

Re: [PATCHv2 1/1] perf: Port to Android

2012-08-23 Thread Pekka Enberg

Hi Bernhard,

(You didn't CC perf maintainers.)

On Thu, Aug 23, 2012 at 6:01 PM, Bernhard Rosenkraenzer
bernhard.rosenkran...@linaro.org wrote:
 commit 4dc79eed16e3bb03b3cf92fcc6127e107e7537aa
 Author: Bernhard Rosenkraenzer bernhard.rosenkran...@linaro.org
 Date:   Sat Jun 23 06:18:05 2012 +0200

 perf: Port to Android

 Adapt perf to deal with some missing functions in Bionic etc.

 Change-Id: I0cda2aad3edba26e1be3aebc9475a229ea9e8356
 Signed-off-by: Bernhard Rosenkraenzer bernhard.rosenkran...@linaro.org


Your patch changelog format is pretty funky.

 diff --git a/tools/perf/Makefile b/tools/perf/Makefile
 index 92271d3..d15cdae 100644
 --- a/tools/perf/Makefile
 +++ b/tools/perf/Makefile

Whoa! I had no idea Android userspace was f*cked up...

 @@ -117,7 +117,7 @@ ifndef PERF_DEBUG
  endif

  CFLAGS = -fno-omit-frame-pointer -ggdb3 -Wall -Wextra -std=gnu99 $(CFLAGS_WERROR) $(CFLAGS_OPTIMIZE) -D_FORTIFY_SOURCE=2 $(EXTRA_WARNINGS) $(EXTRA_CFLAGS)
 -EXTLIBS = -lpthread -lrt -lelf -lm
 +EXTLIBS = -lpthread -lelf -lm
  ALL_CFLAGS = $(CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE
  ALL_LDFLAGS = $(LDFLAGS)
  STRIP ?= strip
 @@ -474,12 +474,23 @@ FLAGS_LIBELF=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS)
  ifneq ($(call try-cc,$(SOURCE_LIBELF),$(FLAGS_LIBELF)),y)
 FLAGS_GLIBC=$(ALL_CFLAGS) $(ALL_LDFLAGS)
 ifneq ($(call try-cc,$(SOURCE_GLIBC),$(FLAGS_GLIBC)),y)
 -   msg := $(error No gnu/libc-version.h found, please install glibc-dev[el]/glibc-static);
 +   ifeq ($(call try-cc,$(SOURCE_BIONIC),$(FLAGS_GLIBC)),y)
 +   # Found Bionic instead of glibc...
 +   # That works too, but needs a bit of special treatment
 +   BASIC_CFLAGS += -DANDROID -include compat-android.h
 +   ANDROID := 1
 +   else
 +   msg := $(error No gnu/libc-version.h found, please install glibc-dev[el]/glibc-static);
 +   endif
 else
 msg := $(error No libelf.h/libelf found, please install libelf-dev/elfutils-libelf-devel);
 endif
  endif

 +ifneq ($(ANDROID),1)
 +EXTLIBS += -lrt
 +endif
 +
  ifneq ($(call try-cc,$(SOURCE_ELF_MMAP),$(FLAGS_COMMON)),y)
 BASIC_CFLAGS += -DLIBELF_NO_MMAP
  endif
 diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
 index be4e1ee..3a1d0cc 100644
 --- a/tools/perf/builtin-record.c
 +++ b/tools/perf/builtin-record.c
 @@ -27,10 +27,43 @@
  #include "util/cpumap.h"
  #include "util/thread_map.h"

 +#include <stdlib.h>
  #include <unistd.h>
  #include <sched.h>
  #include <sys/mman.h>

 +#ifdef ANDROID
 +/* While stdlib.h has a prototype for it,
 +   Bionic doesn't actually implement on_exit() */
 +#ifndef ATEXIT_MAX
 +#define ATEXIT_MAX 32
 +#endif
 +static int __on_exit_count = 0;
 +typedef void (*on_exit_func_t)(int, void*);
 +static on_exit_func_t __on_exit_funcs[ATEXIT_MAX];
 +static void *__on_exit_args[ATEXIT_MAX];
 +static int __exitcode = 0;
 +static void __handle_on_exit_funcs();
 +static int on_exit(on_exit_func_t function, void *arg);
 +#define exit(x) (exit)(__exitcode = (x))
 +
 +static int on_exit(on_exit_func_t function, void *arg) {
 +   if(__on_exit_count == ATEXIT_MAX)
 +   return ENOMEM;
 +   else if(__on_exit_count == 0)
 +   atexit(__handle_on_exit_funcs);
 +   __on_exit_funcs[__on_exit_count] = function;
 +   __on_exit_args[__on_exit_count++] = arg;
 +   return 0;
 +}
 +
 +static void __handle_on_exit_funcs() {
 +   for(int i=0; i<__on_exit_count; i++) {
 +   __on_exit_funcs[i](__exitcode, __on_exit_args[i]);
 +   }
 +}
 +#endif
 +
  enum write_mode_t {
 WRITE_FORCE,
 WRITE_APPEND
 diff --git a/tools/perf/builtin-test.c b/tools/perf/builtin-test.c
 index 223ffdc..6dbd2ee 100644
 --- a/tools/perf/builtin-test.c
 +++ b/tools/perf/builtin-test.c
 @@ -469,10 +469,17 @@ static int test__basic_mmap(void)
 .watermark  = 0,
 };
 cpu_set_t cpu_set;
 +#ifndef ANDROID
 const char *syscall_names[] = { "getsid", "getppid", "getpgrp",
 "getpgid", };
 pid_t (*syscalls[])(void) = { (void *)getsid, getppid, getpgrp,
   (void*)getpgid };
 +#else
 +   /* No getsid() on Android */
 +   const char *syscall_names[] = { "getppid", "getpgrp",
 +   "getpgid", };
 +   pid_t (*syscalls[])(void) = { getppid, getpgrp, (void*)getpgid };
 +#endif
  #define nsyscalls ARRAY_SIZE(syscall_names)
 int ids[nsyscalls];
 unsigned int nr_events[nsyscalls],
 diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
 index b382bd5..1521a275 100644
 --- a/tools/perf/builtin.h
 +++ b/tools/perf/builtin.h
 @@ -1,6 +1,7 @@
  #ifndef BUILTIN_H
  #define BUILTIN_H

 +#include "compat-android.h"
  #include "util/util.h"
  #include

Re: Drop support for x86-32

2012-08-25 Thread Pekka Enberg

Dearest wbrana,

On Sat, Aug 25, 2012 at 8:27 PM, wbrana wbr...@gmail.com wrote:
 Why did you send this irrelevant e-mail?

So despite my humble suggestion, you've filled up my inbox with
pointless rambling. Would it be at all possible you just got the f*ck
off LKML? I know it's difficult to hear this but nobody gives a shit
about your ideas.

Still your pal,

Pekka

Re: [RFC/PATCH] zcache/ramster rewrite and promotion

2012-07-31 Thread Pekka Enberg

On Tue, Jul 31, 2012 at 11:18 PM, Dan Magenheimer
dan.magenhei...@oracle.com wrote:
 diffstat vs 3.5:
  drivers/staging/ramster/Kconfig   |2
  drivers/staging/ramster/Makefile  |2
  drivers/staging/zcache/Kconfig|2
  drivers/staging/zcache/Makefile   |2
  mm/Kconfig|2
  mm/Makefile   |4
  mm/tmem/Kconfig   |   33
  mm/tmem/Makefile  |5
  mm/tmem/tmem.c|  894 +
  mm/tmem/tmem.h|  259 +++
  mm/tmem/zbud.c| 1060 +++
  mm/tmem/zbud.h|   33
  mm/tmem/zcache-main.c | 1686 +
  mm/tmem/zcache.h  |   53
  mm/tmem/ramster.h |   59
  mm/tmem/ramster/heartbeat.c   |  462 ++
  mm/tmem/ramster/heartbeat.h   |   87 +
  mm/tmem/ramster/masklog.c |  155 ++
  mm/tmem/ramster/masklog.h |  220 +++
  mm/tmem/ramster/nodemanager.c |  995 +++
  mm/tmem/ramster/nodemanager.h |   88 +
  mm/tmem/ramster/r2net.c   |  414 ++
  mm/tmem/ramster/ramster.c |  985 ++
  mm/tmem/ramster/ramster.h |  161 ++
  mm/tmem/ramster/ramster_nodemanager.h |   39
  mm/tmem/ramster/tcp.c | 2253 ++
  mm/tmem/ramster/tcp.h |  159 ++
  mm/tmem/ramster/tcp_internal.h|  248 +++
 28 files changed, 10358 insertions(+), 4 deletions(-)

So it's basically this commit, right?

https://oss.oracle.com/git/djm/tmem.git/?p=djm/tmem.git;a=commitdiff;h=22844fe3f52d912247212408294be330a867937c

Why on earth would you want to move that under the mm directory?

Pekka

[PATCH v2] sched: Document schedule() entry points

2012-08-04 Thread Pekka Enberg

This patch adds a comment on top of the schedule() function to explain
to scheduler newbies how the main scheduler function is entered.

Cc: Randy Dunlap rdun...@xenotime.net
Explained-by: Ingo Molnar mi...@kernel.org
Explained-by: Peter Zijlstra a.p.zijls...@chello.nl
Signed-off-by: Pekka Enberg penb...@kernel.org
---
V1 - V2: Fix funky grammar pointed out by Peter and Randy.

 kernel/sched/core.c |   34 ++
 1 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 468bdd4..7dc75df 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3361,6 +3361,40 @@ pick_next_task(struct rq *rq)
 
 /*
  * __schedule() is the main scheduler function.
+ *
+ * The main means of driving the scheduler and thus entering this function are:
+ *
+ *   1. Explicit blocking: mutex, semaphore, waitqueue, etc.
+ *
+ *   2. TIF_NEED_RESCHED flag is checked on interrupt and userspace return
+ *  paths. For example, see arch/x86/entry_64.S.
+ *
+ *  To drive preemption between tasks, the scheduler sets the flag in timer
+ *  interrupt handler scheduler_tick().
+ *
+ *   3. Wakeups don't really cause entry into schedule(). They add a
+ *  task to the run-queue and that's it.
+ *
+ *  Now, if the new task added to the run-queue preempts the current
+ *  task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets
+ *  called on the nearest possible occasion:
+ *
+ *   - If the kernel is preemptible (CONFIG_PREEMPT=y):
+ *
+ * - in syscall or exception context, at the next outmost
+ *   preempt_enable(). (this might be as soon as the wake_up()'s
+ *   spin_unlock()!)
+ *
+ * - in IRQ context, return from interrupt-handler to
+ *   preemptible context
+ *
+ *   - If the kernel is not preemptible (CONFIG_PREEMPT is not set)
+ * then at the next:
+ *
+ *  - cond_resched() call
+ *  - explicit schedule() call
+ *  - return from syscall or exception to user-space
+ *  - return from interrupt-handler to user-space
  */
 static void __sched __schedule(void)
 {
-- 
1.7.7.6
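
As a companion to the comment added by this patch, the toy model below illustrates point 3: a wakeup only enqueues the task and sets a flag, and the switch happens later when a return path checks that flag. It is a userspace sketch, not scheduler code. The names, the one-slot "run-queue", and the convention that a lower prio value means higher priority are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

struct task {
    int prio; /* assumption: lower value means higher priority */
};

static struct task *current_task;
static struct task *runqueue_best; /* toy run-queue: remembers one waiter */
static bool need_resched;          /* stands in for TIF_NEED_RESCHED */
static int nr_switches;

/* Pick the waiting task if it beats the current one, and clear the flag. */
static void toy_schedule(void)
{
    need_resched = false;
    if (runqueue_best &&
        (!current_task || runqueue_best->prio < current_task->prio)) {
        struct task *prev = current_task;

        current_task = runqueue_best;
        runqueue_best = prev;
        nr_switches++;
    }
}

/* Wakeup: enqueue and maybe set the flag; never calls toy_schedule(). */
static void toy_wake_up(struct task *t)
{
    if (!runqueue_best || t->prio < runqueue_best->prio)
        runqueue_best = t;
    if (current_task && t->prio < current_task->prio)
        need_resched = true;
}

/* Return-to-userspace path: this is where the flag gets checked. */
static void toy_return_to_user(void)
{
    if (need_resched)
        toy_schedule();
}
```

After toy_wake_up() of a higher-priority task, the current task is unchanged and only the flag is set; the switch is observed once toy_return_to_user() runs, matching the "nearest possible occasion" wording in the comment.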


Re: [RFC/PATCH] zcache/ramster rewrite and promotion

2012-08-04 Thread Pekka Enberg

Hi Konrad,

 On Tue, Jul 31, 2012 at 11:53:57PM +0300, Pekka Enberg wrote:
 Why on earth would you want to move that under the mm directory?

On Wed, Aug 1, 2012 at 12:04 AM, Konrad Rzeszutek Wilk
konrad.w...@oracle.com wrote:
 If you take aside that problem that it is one big patch instead
 of being split up in more reasonable pieces - would you recommend
 that it reside in a different directory?

 Or is that it does not make sense b/c it has other components in it - such
 as tcp/nodemaneger/hearbeat/etc so it should go under the refactor knife?

 And if you rip out the ramster from this and just concentrate on zcache -
 should that go in drivers/mm or mm/tmem/zcache?

I definitely think mm/zcache.c makes sense. I hate the fact that it's
now riddled with references to tmem and ramster but that's probably
fixable. I also hate the fact that you've now gone and rewritten
everything so we lose all the change history zcache has had under
staging.

As for ramster, it might make sense to have its core in mm/ramster.c and
move the TCP weirdness somewhere else. The exact location depends on
what kind of userspace ABIs you expose, I suppose. I mean, surely you
need to configure the thing somehow?

Pekka

Re: [RFC/PATCH] zcache/ramster rewrite and promotion

2012-08-04 Thread Pekka Enberg

Hi Dan,

On Wed, Aug 1, 2012 at 12:13 AM, Dan Magenheimer
dan.magenhei...@oracle.com wrote:
 Ramster does the same thing but manages it peer-to-peer across
 multiple systems using kernel sockets.  One could argue that
 the dependency on sockets makes it more of a driver than mm
 but ramster is memory management too, just a bit more exotic.

How do you configure it? Can we move parts of the network protocol under
net/ramster or something?

Pekka

Re: [RFC/PATCH] zcache/ramster rewrite and promotion

2012-08-06 Thread Pekka Enberg

On Mon, Aug 6, 2012 at 5:07 PM, Dan Magenheimer
dan.magenhei...@oracle.com wrote:
 I'm OK with placing it wherever kernel developers want to put
 it, as long as the reason is not NIMBY-ness. [1]  My preference
 is to keep all the parts together, at least for the review phase,
 but if there is a consensus that it belongs someplace else,
 I will be happy to move it.

I'd go for core code in mm/zcache.c and mm/ramster.c, and move the
clustering code under net/ramster or drivers/ramster.

Re: [PATCH 0/4] promote zcache from staging

2012-08-06 Thread Pekka Enberg

On Mon, Aug 6, 2012 at 6:24 PM, Dan Magenheimer
dan.magenhei...@oracle.com wrote:
 IMHO, the fastest way to get the best zcache into the kernel and
 to distros and users is to throw away the demo version, move forward
 to a new solid well-designed zcache code base, and work together to
 build on it.  There's still a lot to do so I hope we can work together.

I'm not convinced it's the _fastest way_. You're effectively
invalidating all the work done under drivers/staging so you might end up
in review limbo with your shiny new code...

AFAICT, your best bet is to first clean up zcache under drivers/staging
and get that promoted under mm/zcache.c. You can then move on to the
more controversial ramster and figure out where to put the clustering
code, etc.

Pekka

Re: [RFC/PATCH] zcache/ramster rewrite and promotion

2012-08-06 Thread Pekka Enberg

On Mon, Aug 6, 2012 at 7:10 PM, Dan Magenheimer
dan.magenhei...@oracle.com wrote:
 Hmmm.. there's also zbud.c and tmem.c which are critical components
 of both zcache and ramster.  And there are header files as well which
 will need to either be in mm/ or somewhere in include/linux/

 Is there a reason or rule that mm/ can't have subdirectories?

 Since zcache has at least three .c files plus ramster.c, and
 since mm/frontswap.c and mm/cleancache.c are the foundation on
 which all of these are built, I was thinking grouping all six
 (plus headers) in the same mm/tmem/ subdirectory was a good
 way to keep mm/ from continuing to get more cluttered... not counting
 new zcache and ramster files, there are now 74 .c files in mm/!
 (Personally, I think a directory has too many files in it if
 ls doesn't fit in a 25x80 window.)

 Thoughts?

There's no reason we can't have subdirectories. That said, I really
don't see the point of having a separate directory called 'tmem'. It
might make sense to have mm/zcache and/or mm/ramster but I suspect
you can just fold the core code in mm/zcache.c and mm/ramster.c by
slimming down the weird Solaris-like 'tmem' abstractions.

Pekka

[PROBLEM] thinkpad_acpi: unhandled HKEY event 0x6040

2012-08-17 Thread Pekka Enberg

Hello,

I'm seeing this when I dock a Thinkpad X220 laptop:

[ 3129.616279] thinkpad_acpi: unknown possible thermal alarm or keyboard event received
[ 3129.616297] thinkpad_acpi: unhandled HKEY event 0x6040
[ 3129.616298] thinkpad_acpi: please report the conditions when this event happened to ibm-acpi-de...@lists.sourceforge.net
[ 3129.616949] thinkpad_acpi: undocked from hotplug port replicator
[ 3129.617065] ACPI: \_SB_.GDCK - undocking
[ 3144.184916] ACPI: \_SB_.GDCK - docking
[ 3144.185204] ACPI: Unable to dock!

and the monitor doesn't come up.

I'm currently running a stock Fedora kernel:

[penberg@tux ~]$ uname -a
Linux tux 3.3.8-1.fc16.x86_64 #1 SMP Mon Jun 4 20:49:02 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

but looking at drivers/platform/x86/thinkpad_acpi.c in Linus' tree, the event
is not handled there either:

TP_HKEY_EV_UNK_6040 = 0x6040, /* Related to AC change?
 some sort of APM hint,
 W520 */

Help!

Pekka

Re: [PATCH v2 3/5] perf ui/gtk: Implement helpline_fns

2012-08-17 Thread Pekka Enberg

On Thu, 16 Aug 2012, Namhyung Kim wrote:
 Add helpline API implementation to GTK front-end.

For all three:

Acked-by: Pekka Enberg penb...@kernel.org

Re: [PATCH] perf ui/gtk: Ensure not to call gtk_main_quit() twice

2012-08-18 Thread Pekka Enberg

On Fri, Aug 17, 2012 at 7:56 PM, Namhyung Kim namhy...@kernel.org wrote:
 Currently the gtk_main_quit() is called twice when perf exits so the
 following warning is emitted:

   [penberg@tux perf]$ ./perf report --gtk
   ^Cperf: Interrupt

   (perf:4048): Gtk-CRITICAL **: IA__gtk_main_quit: assertion `main_loops != NULL' failed

 Fix it by not calling it unnecessarily.

 Reported-by: Pekka Enberg penb...@kernel.org
 Signed-off-by: Namhyung Kim namhy...@kernel.org
 ---
  tools/perf/ui/gtk/setup.c |2 ++
  1 file changed, 2 insertions(+)

 diff --git a/tools/perf/ui/gtk/setup.c b/tools/perf/ui/gtk/setup.c
 index ad40b3626fdb..ec1ee26b485a 100644
 --- a/tools/perf/ui/gtk/setup.c
 +++ b/tools/perf/ui/gtk/setup.c
 @@ -13,6 +13,8 @@ int perf_gtk__init(void)

  void perf_gtk__exit(bool wait_for_ok __used)
  {
 +   if (!perf_gtk__is_active_context(pgctx))
 +   return;
 perf_error__unregister(perf_gtk_eops);
 gtk_main_quit();
  }

Wouldn't it be nicer to rearrange the callers so that perf_gtk__exit()
is not called twice?
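
For comparison, the guard approach taken by the quoted patch can be sketched in plain C as below. This is only an illustration under stated assumptions: `ui_init()`, `ui_exit()`, `ui_quit_calls()` and `toolkit_main_quit()` are hypothetical stand-ins, not the perf or GTK API, and a single flag takes the place of the `perf_gtk__is_active_context()` check.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in perf or GTK. */
static bool ui_active;   /* set by ui_init(), cleared by the first ui_exit() */
static int quit_calls;   /* how many times the "main loop quit" actually ran */

/* Stand-in for the toolkit call that must not run twice. */
static void toolkit_main_quit(void)
{
	quit_calls++;
}

void ui_init(void)
{
	ui_active = true;
}

/* Safe to call more than once: only the first call after ui_init() tears down. */
void ui_exit(void)
{
	if (!ui_active)
		return;
	ui_active = false;
	toolkit_main_quit();
}

int ui_quit_calls(void)
{
	return quit_calls;
}
```

The guard makes the exit path idempotent; rearranging the callers, as asked above, would remove the need for the flag altogether.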

Re: [PATCH] perf: silence GTK2 probing errors

2012-08-19 Thread Pekka Enberg

On Sun, Aug 19, 2012 at 6:46 PM, David Ahern dsah...@gmail.com wrote:
 If GTK2 development packages are not installed, make is rather noisy:

[snip]

 Signed-off-by: David Ahern dsah...@gmail.com
 Cc: Arnaldo Carvalho de Melo a...@ghostprotocols.net
 Cc: Pekka Enberg penb...@kernel.org
 Cc: Namhyung Kim namhy...@kernel.org

Acked-by: Pekka Enberg penb...@kernel.org

Re: [PATCH] perf ui/gtk: Ensure not to call gtk_main_quit() twice

2012-08-19 Thread Pekka Enberg

On Mon, Aug 20, 2012 at 4:59 AM, Namhyung Kim namhy...@kernel.org wrote:
 Forgot to add the #ifdefery to the below code also. :-/  Anyway, it needs
 to expose gtk specifics to general code with the #ifdef's. So I'd still
 prefer the original patch.

Fair enough.

Acked-by: Pekka Enberg penb...@kernel.org

Re: [PATCH] perf/x86: Disable uncore on virtualized CPU.

2012-08-21 Thread Pekka Enberg

On Tue, Aug 21, 2012 at 12:08 PM, Yan, Zheng zheng.z@intel.com wrote:
 From: Yan, Zheng zheng.z@intel.com

 Initializing uncore PMU on virtualized CPU may hang the kernel.
 This is because kvm does not emulate the entire hardware. There
 are lots of uncore related MSRs, making kvm enumerate them all
 is a non-trivial task. So just disable uncore on virtualized CPU.

 Signed-off-by: Yan, Zheng zheng.z@intel.com
 ---
  arch/x86/kernel/cpu/perf_event_intel_uncore.c | 3 +++
  1 file changed, 3 insertions(+)

 diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
 index 0a55710..2f005ba 100644
 --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
 +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
 @@ -2898,6 +2898,9 @@ static int __init intel_uncore_init(void)
 if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
 return -ENODEV;

 +   if (cpu_has_hypervisor)
 +   return -ENODEV;
 +
 ret = uncore_pci_init();
 if (ret)
 goto fail;

Tested-by: Pekka Enberg penb...@kernel.org

Re: [PATCH] perf/x86: Disable uncore on virtualized CPU.

2012-08-21 Thread Pekka Enberg

On Tue, Aug 21, 2012 at 05:08:37PM +0800, Yan, Zheng wrote:
 From: Yan, Zheng zheng.z@intel.com

 Initializing uncore PMU on virtualized CPU may hang the kernel.
 This is because kvm does not emulate the entire hardware. There
 are lots of uncore related MSRs, making kvm enumerate them all
 is a non-trivial task. So just disable uncore on virtualized CPU.

On Tue, Aug 21, 2012 at 5:31 PM, Andi Kleen a...@firstfloor.org wrote:
 I'm not sure cpu_has_hypervisor is reliable enough for this.
 Better find out why it hangs and fix that.

 Early rdmsrls should be rdmsrl_safe()
 Early writes in the driver should check if anything was written,
 and if not then disable itself.

It's unfortunate that the changelog does not include any reference to
the actual thread this was discussed in:

http://www.mail-archive.com/kvm@vger.kernel.org/msg77524.html
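
The "check if anything was written" idea quoted above can be sketched in user-space C as follows. This is a rough illustration, not the uncore driver: `struct reg_ops`, `reg_write_sticks()` and both fake back ends are hypothetical names invented for the example, and the accessors are assumed to return nonzero on a fault, in the spirit of `rdmsrl_safe()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register accessors; each returns 0 on success, nonzero on a fault. */
struct reg_ops {
	int (*read)(uint32_t reg, uint64_t *val);
	int (*write)(uint32_t reg, uint64_t val);
};

/*
 * Write a nonzero probe value and read it back. If either access faults or
 * the value did not stick, the caller should disable itself instead of
 * assuming the hardware is really there.
 */
bool reg_write_sticks(const struct reg_ops *ops, uint32_t reg, uint64_t val)
{
	uint64_t readback = 0;

	if (ops->write(reg, val))
		return false;
	if (ops->read(reg, &readback))
		return false;
	return readback == val;
}

/* Fake back end that behaves like real hardware: writes are stored. */
static uint64_t real_reg;

static int real_read(uint32_t reg, uint64_t *val)
{
	(void)reg;
	*val = real_reg;
	return 0;
}

static int real_write(uint32_t reg, uint64_t val)
{
	(void)reg;
	real_reg = val;
	return 0;
}

/* Fake back end that silently drops writes and always reads back zero. */
static int dropped_read(uint32_t reg, uint64_t *val)
{
	(void)reg;
	*val = 0;
	return 0;
}

static int dropped_write(uint32_t reg, uint64_t val)
{
	(void)reg;
	(void)val;
	return 0;
}

const struct reg_ops real_ops = { real_read, real_write };
const struct reg_ops dropped_ops = { dropped_read, dropped_write };
```

A probe like this detects missing emulation at init time without relying on a hypervisor flag; whether it is safe for any particular register is a separate question.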

Re: [rfc] direct IO submission and completion scalability issues

2008-02-03 Thread Pekka Enberg

Hi Nick,

On Feb 3, 2008 11:52 AM, Nick Piggin [EMAIL PROTECTED] wrote:
 +asmlinkage void smp_call_function_fast_interrupt(void)
 +{

[snip]

 +   while (!list_empty(&list)) {
 +           struct call_single_data *data;
 +
 +           data = list_entry(list.next, struct call_single_data, list);
 +           list_del(&data->list);
 +
 +           data->func(data->info);
 +           if (data->wait) {
 +                   smp_mb();
 +                   data->wait = 0;

Why do we need smp_mb() here (maybe add a comment to keep
Andrew/checkpatch happy)?

Pekka

Re: [PATCH] USB: mark USB drivers as being GPL only

2008-02-03 Thread Pekka Enberg

Hi Christer,

On Feb 3, 2008 1:48 PM, Christer Weinigel [EMAIL PROTECTED] wrote:
 Saying use BSD instead isn't a good answer for me since I don't know
 BSD well enough.  And personally, I want to see Linux everywhere; I
 think it's a lot better to have Linux + a proprietary driver in an
 embedded system than BSD or Windows CE.

Why are we discussing this again? The Linux kernel is distributed
under the GPLv2 and even though there are some legal gray areas
regarding derived work (think nvidia and ati binary blobs here), the
license is not friendly towards proprietary drivers at all.
Furthermore, many of the _kernel developers_ do not support
proprietary drivers, so why do you insist on using Linux for that
purpose?

Seriously, you really really want to look at the BSDs or proprietary
operating systems because they support your needs much better.

Pekka

Re: [PATCH] USB: mark USB drivers as being GPL only

2008-02-03 Thread Pekka Enberg

Hi David,

On Feb 3, 2008 5:12 PM, David Newall [EMAIL PROTECTED] wrote:
 By the way, I'm almost certain that the COPYING file is the first, last
 and only document specifying licence conditions, and nothing in that
 prevents a proprietary driver from including a patch that, for example,
 globally replaces ALL GPL-only symbols by the less restrictive ones.

So I am going to assume you're not trolling here (although some of
your snarky remarks make that bit hard).

A vendor is, of course, allowed to distribute a patch (under the GPLv2
proper) that removes the license checks, no doubt, but that doesn't
change whether the actual driver they're distributing (under
a proprietary license) is derived work or not (one way or another).
And, _if_ you're distributing a derived work that is not under the
GPLv2, you're breaking the law. I think we can agree on this?

As there is some controversy over the definition of derived work
(think Linus' comments on porting a driver or a filesystem from
another operating system here), we use the EXPORT_SYMBOL_GPL
annotations as a big warning sign that what you're doing is likely to
be considered as a derived work. If the USB developers want to
annotate their code with EXPORT_SYMBOL_GPL, why the hell do you want
to argue about it? They know the code better than you and as copyright
holders they can actually sue those parties distributing proprietary
code they think is derived work.

Bringing up Linux world domination or Microsoft market share in this
kind of discussion is totally pointless. The license is what it is
(GPLv2) and it seems unlikely to change at this point. If you want to
develop for Linux, you're most certainly better off always
distributing your code under the GPLv2; otherwise you really really
want to consult an IP lawyer to be sure. But what I don't understand
is why people insist on using the Linux kernel for something it clearly
can never really properly support (proprietary code)?

Re: [PATCH] USB: mark USB drivers as being GPL only

2008-02-03 Thread Pekka Enberg

Hi David,

On Feb 3, 2008 6:06 PM, David Newall [EMAIL PROTECTED] wrote:
  But what I don't understand is why people insist using the Linux kernel
  for something it clearly can never really properly support (proprietary
  code)?

 That's defeatist.  Of course the Linux kernel can properly support
 (run) proprietary code.  It would be a miserable excuse for an
 operating system if it couldn't.

I think you're missing my point: as long as the license stays the way
it is now, you can never distribute proprietary code unless you've
consulted a lawyer and even then you run the risk of being sued for
infringement if the copyright holder thinks what you have is derived
work. The GPLv2 and thus Linux was never designed to allow proprietary
code and arguing that is pointless, isn't it? There are much better
alternatives available and people interested in proprietary code
should be looking there.

 Pekka

Re: [PATCH] USB: mark USB drivers as being GPL only

2008-02-05 Thread Pekka Enberg

Hi David,

Marcel Holtmann writes:
  Your driver was meant to be running as Linux kernel module and thus it is
  derivative work.

On Feb 5, 2008 1:39 PM, David Newall [EMAIL PROTECTED] wrote:
 It is precisely the fact that it is a loadable module, and does not form
 part of the kernel, that removes the requirement to distribute it under GPL.

What makes you qualified to make that statement (without giving any
evidence)? Are you an expert on international copyright law?

Re: SLUB: Support for statistics to help analyze allocator behavior

2008-02-05 Thread Pekka Enberg


Christoph Lameter wrote:

On Tue, 5 Feb 2008, Pekka J Enberg wrote:


Hi Christoph,

On Mon, 4 Feb 2008, Christoph Lameter wrote:

The statistics provided here allow the monitoring of allocator behavior
at the cost of some (minimal) loss of performance. Counters are placed in
SLUB's per cpu data structure that is already written to by other code.

Looks good but I am wondering if we want to make the statistics per-CPU so 
that we can see the kmalloc/kfree ping-pong of, for example, hackbench 


We could do that. Any idea how to display that kind of information 
in a meaningful way? Parameter conventions for slabinfo?


We could just print out one total summary and one summary for each CPU 
(and maybe show % of total allocations/frees). That way you can 
immediately spot if some CPUs are doing more allocations/freeing than 
others.
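
A minimal sketch of that output format in plain C, assuming the counters are already available as one array entry per CPU. The helper names `stat_total()`, `stat_percent()` and `stat_print()` are made up for the example and are not part of slabinfo or SLUB.

```c
#include <stdio.h>

/* Sum one counter over all CPUs. */
unsigned long stat_total(const unsigned long *per_cpu, int nr_cpus)
{
	unsigned long sum = 0;

	for (int i = 0; i < nr_cpus; i++)
		sum += per_cpu[i];
	return sum;
}

/*
 * Integer share of the total, rounded down. Returns 0 when the total is 0.
 * Note that part * 100 can overflow for extremely large counts.
 */
unsigned int stat_percent(unsigned long part, unsigned long total)
{
	return total ? (unsigned int)(part * 100 / total) : 0;
}

/* Print one total line followed by one line per CPU with its share. */
void stat_print(const char *name, const unsigned long *per_cpu, int nr_cpus)
{
	unsigned long total = stat_total(per_cpu, nr_cpus);

	printf("%s total %lu\n", name, total);
	for (int i = 0; i < nr_cpus; i++)
		printf("  cpu%d %lu (%u%%)\n", i, per_cpu[i],
		       stat_percent(per_cpu[i], total));
}
```

With a layout like this, a CPU that only frees what another CPU allocates shows up as a lopsided pair of percentages.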


Pekka

Re: Bug: 2.6.24-smp: Eeek! page_mapcount(page) went negative! (-1)

2008-02-06 Thread Pekka Enberg

Hi,

On Feb 6, 2008 10:21 PM,  [EMAIL PROTECTED] wrote:
 From reboot after last hang on 02/03/08, I found this...

 Feb  5 23:26:26 sc-software kernel: [ cut here ]
 Feb  5 23:26:26 sc-software kernel: kernel BUG at mm/slab.c:591!
 Feb  5 23:26:26 sc-software kernel: invalid opcode:  [#1] PREEMPT SMP
 Feb  5 23:26:26 sc-software kernel: Modules linked in: shpchp pci_hotplug 
 ohci1394 ieee1394
 Feb  5 23:26:26 sc-software kernel:
 Feb  5 23:26:26 sc-software kernel: Pid: 6040, comm: modprobe Not tainted 
 (2.6.24 #1)
 Feb  5 23:26:26 sc-software kernel: EIP: 0060:[c017960f] EFLAGS: 00010046 
 CPU: 1
 Feb  5 23:26:26 sc-software kernel: EIP is at kfree+0x8f/0xa0
 Feb  5 23:26:26 sc-software kernel: EAX: 4000 EBX: f5915380 ECX: c1bb4498 
 EDX: c1bb4498
 Feb  5 23:26:26 sc-software kernel: ESI: eff354a0 EDI: f5815280 EBP: f0067e78 
 ESP: f0067e68
 Feb  5 23:26:26 sc-software kernel:  DS: 007b ES: 007b FS: 00d8 GS:  SS: 
 0068
 Feb  5 23:26:26 sc-software kernel: Process modprobe (pid: 6040, ti=f0066000 
 task=f00b5550 task.ti=f0066000)
 Feb  5 23:26:26 sc-software kernel: Stack: 0282 f5915380 eff354a0 
 b7fb87b0 f0067f1c c01ac9ac 0003 1812
 Feb  5 23:26:26 sc-software kernel:f0067e98 0805ba64 0418 
 0805ba64 f0067eb4 0805ba64 1812 
 Feb  5 23:26:26 sc-software kernel:f5815280 0001  
 0805be7c 0805ba64 0805aa64 08048000 
 Feb  5 23:26:26 sc-software kernel: Call Trace:
 Feb  5 23:26:26 sc-software kernel:  [c0103e5a] show_trace_log_lvl+0x1a/0x30
 Feb  5 23:26:26 sc-software kernel:  [c0103f2a] show_stack_log_lvl+0x9a/0xc0
 Feb  5 23:26:26 sc-software kernel:  [c01040d7] show_registers+0xc7/0x250
 Feb  5 23:26:26 sc-software kernel:  [c010441f] die+0x11f/0x220
 Feb  5 23:26:26 sc-software kernel:  [c01045b1] do_trap+0x91/0xd0
 Feb  5 23:26:26 sc-software kernel:  [c0104859] do_invalid_op+0x89/0xa0
 Feb  5 23:26:26 sc-software kernel:  [c0657c22] error_code+0x72/0x78
 Feb  5 23:26:26 sc-software kernel:  [c01ac9ac] load_elf_binary+0x8cc/0xcf0
 Feb  5 23:26:26 sc-software kernel:  [c0181864] 
 search_binary_handler+0xc4/0x250
 Feb  5 23:26:26 sc-software kernel:  [c0181b35] do_execve+0x145/0x190
 Feb  5 23:26:26 sc-software kernel:  [c0101c82] sys_execve+0x32/0xa0
 Feb  5 23:26:26 sc-software kernel:  [c0103102] syscall_call+0x7/0xb

This means we pass a non-slab pointer from load_elf_binary() to
kfree() which doesn't seem likely to be a software bug reading the
code. As Hugh suggested, please run memtest86+.

Re: [PATCH] USB: mark USB drivers as being GPL only

2008-02-07 Thread Pekka Enberg

Hi David,

On Feb 7, 2008 3:31 PM, David Newall [EMAIL PROTECTED] wrote:
 This *is* real work.  You have blinded yourself to the fact that this
 discussion is preliminary to a proposed change.

 Or put another way, if you want to kill the discussion then the answer
 to shall we is no.

Ok. I am not interested in continuing this discussion so please remove
me from the cc. Thanks.

Pekka
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/2] kmemcheck v3

2008-02-07 Thread Pekka Enberg


Hi Christoph,

Christoph Lameter wrote:

SLABs can be excepted from tracking?


Yes. __GFP_NOTRACK can be used to suppress tracking of objects (but we 
still take the page fault for each access). That is required for things 
like DMA filled pages that are never initialized by the CPU. 
SLAB_NOTRACK is for not tracking a whole *cache* so that we _don't_ take 
the page fault. This is needed for kmemcheck implementation (to avoid 
recursive page faults for memory accessed by the page fault handler).


So it breaks recursion. But this adds a new cache that is rarely 
used. There will be only about 50-100 kmem_cache objects in the system. I 
thought you could control the tracking on a per-object level? Would not a 
kmalloc with __GFP_NOTRACK work?


No. We need to not track the whole page to avoid recursive faults. So 
for kmemcheck we absolutely do need cache_cache but we can, of course, 
hide that under an alloc_cache() function that only uses the extra cache 
when CONFIG_KMEMCHECK is enabled?


Re: [PATCH 1/2] kmemcheck v3

2008-02-07 Thread Pekka Enberg


Pekka Enberg wrote:
No. We need to not track the whole page to avoid recursive faults. So 
for kmemcheck we absolutely do need cache_cache but we can, of course, 
hide that under an alloc_cache() function that only uses the extra cache 
when CONFIG_KMEMCHECK is enabled?


Btw, one option is to have a new _page flag_ so that we no longer need 
to look inside struct kmem_cache in the page fault handler.


Pekka



Re: [PATCH 1/2] kmemcheck v3

2008-02-07 Thread Pekka Enberg

On Feb 8, 2008 1:32 AM, Christoph Lameter [EMAIL PROTECTED] wrote:
 But the slab layer allocates pages  PAGE_SIZE. You need to take a fault
 right? So each object would need its own page?

No. We allocate a shadow page for each data page which we then use as
a per-byte bitmap. For every tracked _page_ we take the page fault
always.
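
The shadow-page idea can be illustrated with a small user-space model, sketched below. This is not kmemcheck code: `struct tracked_page` and the `tracked_*()` helpers are hypothetical, the shadow is kept next to the data for simplicity, and nothing here models the page fault that the real implementation takes on each access.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096

enum { SHADOW_UNINIT = 0, SHADOW_INIT = 1 };

/* One data page plus a shadow holding one status byte per data byte. */
struct tracked_page {
	unsigned char data[DEMO_PAGE_SIZE];
	unsigned char shadow[DEMO_PAGE_SIZE];
};

/* Mark the whole page as uninitialized, as after a fresh allocation. */
void tracked_page_reset(struct tracked_page *p)
{
	memset(p->shadow, SHADOW_UNINIT, sizeof(p->shadow));
}

/* Store bytes and mark them initialized. Caller keeps off + len in bounds. */
void tracked_write(struct tracked_page *p, size_t off, const void *src, size_t len)
{
	memcpy(p->data + off, src, len);
	memset(p->shadow + off, SHADOW_INIT, len);
}

/* True only if every byte in the range was written before. */
bool tracked_read_ok(const struct tracked_page *p, size_t off, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (p->shadow[off + i] != SHADOW_INIT)
			return false;
	return true;
}
```

Because the status is per byte, objects smaller than a page can share one tracked page and still be checked individually.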

Re: [PATCH 1/2] kmemcheck v3

2008-02-08 Thread Pekka Enberg

On Feb 8, 2008 2:15 PM, Andi Kleen [EMAIL PROTECTED] wrote:
  need to worry about it just yet. In case it's from kmalloc() you can
  pass __GFP_NOTRACK to annotate those call sites where the memory is

 Ok you should add that then to skbuff.c.

Indeed. If you look at the second patch, I think Ingo is already doing
that with __GFP_ZERO which accomplishes the same thing. But yeah,
we're probably missing a lot of callsites atm.

Re: [PATCH 1/2] kmemcheck v3

2008-02-08 Thread Pekka Enberg

Hi Andi,

On Feb 8, 2008 1:55 PM, Andi Kleen [EMAIL PROTECTED] wrote:
 Also I'm not sure how you handle initializedness of DMAed data
 (like network buffers). Wouldn't you need hooks into pci_dma_*
 for this?

If the DMA'd memory is allocated from the page allocator, we don't
need to worry about it just yet. In case it's from kmalloc() you can
pass __GFP_NOTRACK to annotate those call sites where the memory is
filled by DMA (memory that is read needs to be initialized by the
caller obviously). There was some discussion with Ingo of a
__GFP_DMAFILL annotation to tag those call sites instead of
__GFP_NOTRACK which would work the same way.
