from:"Heiko . Carstens"

[PATCH 2/2] drivers core: remove assert_held_device_hotplug()

2017-03-09 Thread Heiko Carstens

The last caller of assert_held_device_hotplug() is gone, so remove it again.

Cc: Dan Williams 
Cc: Michal Hocko 
Cc: "Rafael J. Wysocki" 
Cc: Vladimir Davydov 
Cc: Ben Hutchings 
Cc: Gerald Schaefer 
Cc: Martin Schwidefsky 
Cc: Sebastian Ott 
Signed-off-by: Heiko Carstens 
---
 drivers/base/core.c| 5 -
 include/linux/device.h | 1 -
 2 files changed, 6 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 684bda4d14a1..6bb60fb6a30b 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -639,11 +639,6 @@ int lock_device_hotplug_sysfs(void)
return restart_syscall();
 }
 
-void assert_held_device_hotplug(void)
-{
-   lockdep_assert_held(_hotplug_lock);
-}
-
 #ifdef CONFIG_BLOCK
 static inline int device_is_not_partition(struct device *dev)
 {
diff --git a/include/linux/device.h b/include/linux/device.h
index 30c4570e928d..9ef518af5515 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1140,7 +1140,6 @@ static inline bool device_supports_offline(struct device 
*dev)
 extern void lock_device_hotplug(void);
 extern void unlock_device_hotplug(void);
 extern int lock_device_hotplug_sysfs(void);
-void assert_held_device_hotplug(void);
 extern int device_offline(struct device *dev);
 extern int device_online(struct device *dev);
 extern void set_primary_fwnode(struct device *dev, struct fwnode_handle 
*fwnode);
-- 
2.8.4

[PATCH 0/2] mm: add private lock to serialize memory hotplug operations

2017-03-09 Thread Heiko Carstens

These two patches are supposed to hopefully fix a memory hotplug
problem reported by Sebastian Ott.

Heiko Carstens (2):
  mm: add private lock to serialize memory hotplug operations
  drivers core: remove assert_held_device_hotplug()

 drivers/base/core.c| 5 -
 include/linux/device.h | 1 -
 kernel/memremap.c  | 4 
 mm/memory_hotplug.c| 6 +-
 4 files changed, 5 insertions(+), 11 deletions(-)

-- 
2.8.4

[PATCH 0/2] mm: add private lock to serialize memory hotplug operations

2017-03-09 Thread Heiko Carstens

These two patches are supposed to hopefully fix a memory hotplug
problem reported by Sebastian Ott.

Heiko Carstens (2):
  mm: add private lock to serialize memory hotplug operations
  drivers core: remove assert_held_device_hotplug()

 drivers/base/core.c| 5 -
 include/linux/device.h | 1 -
 kernel/memremap.c  | 4 
 mm/memory_hotplug.c| 6 +-
 4 files changed, 5 insertions(+), 11 deletions(-)

-- 
2.8.4

Re: [PATCH 1/4] s390: get rid of superfluous __GFP_REPEAT

2017-03-09 Thread Heiko Carstens

On Wed, Mar 08, 2017 at 03:11:10PM +0100, Michal Hocko wrote:
> On Wed 08-03-17 09:23:40, Heiko Carstens wrote:
> > On Tue, Mar 07, 2017 at 04:48:40PM +0100, Michal Hocko wrote:
> > > From: Michal Hocko <mho...@suse.com>
> > > 
> > > __GFP_REPEAT has a rather weak semantic but since it has been introduced
> > > around 2.6.12 it has been ignored for low order allocations.
> > > 
> > > page_table_alloc then uses the flag for a single page allocation. This
> > > means that this flag has never been actually useful here because it has
> > > always been used only for PAGE_ALLOC_COSTLY requests.
> > > 
> > > An earlier attempt to remove the flag 10d58bf297e2 ("s390: get rid of
> > > superfluous __GFP_REPEAT") has missed this one but the situation is very
> > > same here.
> > > 
> > > Cc: Heiko Carstens <heiko.carst...@de.ibm.com>
> > > Signed-off-by: Michal Hocko <mho...@suse.com>
> > > ---
> > >  arch/s390/mm/pgalloc.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > FWIW:
> > Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>
> 
> Thanks
> 
> > If you want, this can be routed via the s390 tree, whatever you prefer.
> 
> Yes, that would be great. I suspect the rest will take longer to get
> merged or land to a conclusion.

Ok, applied. Thanks! :)

Re: [PATCH 1/4] s390: get rid of superfluous __GFP_REPEAT

2017-03-09 Thread Heiko Carstens

On Wed, Mar 08, 2017 at 03:11:10PM +0100, Michal Hocko wrote:
> On Wed 08-03-17 09:23:40, Heiko Carstens wrote:
> > On Tue, Mar 07, 2017 at 04:48:40PM +0100, Michal Hocko wrote:
> > > From: Michal Hocko 
> > > 
> > > __GFP_REPEAT has a rather weak semantic but since it has been introduced
> > > around 2.6.12 it has been ignored for low order allocations.
> > > 
> > > page_table_alloc then uses the flag for a single page allocation. This
> > > means that this flag has never been actually useful here because it has
> > > always been used only for PAGE_ALLOC_COSTLY requests.
> > > 
> > > An earlier attempt to remove the flag 10d58bf297e2 ("s390: get rid of
> > > superfluous __GFP_REPEAT") has missed this one but the situation is very
> > > same here.
> > > 
> > > Cc: Heiko Carstens 
> > > Signed-off-by: Michal Hocko 
> > > ---
> > >  arch/s390/mm/pgalloc.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > FWIW:
> > Acked-by: Heiko Carstens 
> 
> Thanks
> 
> > If you want, this can be routed via the s390 tree, whatever you prefer.
> 
> Yes, that would be great. I suspect the rest will take longer to get
> merged or land to a conclusion.

Ok, applied. Thanks! :)

Re: [PATCH 1/4] s390: get rid of superfluous __GFP_REPEAT

2017-03-08 Thread Heiko Carstens

On Tue, Mar 07, 2017 at 04:48:40PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mho...@suse.com>
> 
> __GFP_REPEAT has a rather weak semantic but since it has been introduced
> around 2.6.12 it has been ignored for low order allocations.
> 
> page_table_alloc then uses the flag for a single page allocation. This
> means that this flag has never been actually useful here because it has
> always been used only for PAGE_ALLOC_COSTLY requests.
> 
> An earlier attempt to remove the flag 10d58bf297e2 ("s390: get rid of
> superfluous __GFP_REPEAT") has missed this one but the situation is very
> same here.
> 
> Cc: Heiko Carstens <heiko.carst...@de.ibm.com>
> Signed-off-by: Michal Hocko <mho...@suse.com>
> ---
>  arch/s390/mm/pgalloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

FWIW:
Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

If you want, this can be routed via the s390 tree, whatever you prefer.

Re: [PATCH 1/4] s390: get rid of superfluous __GFP_REPEAT

2017-03-08 Thread Heiko Carstens

On Tue, Mar 07, 2017 at 04:48:40PM +0100, Michal Hocko wrote:
> From: Michal Hocko 
> 
> __GFP_REPEAT has a rather weak semantic but since it has been introduced
> around 2.6.12 it has been ignored for low order allocations.
> 
> page_table_alloc then uses the flag for a single page allocation. This
> means that this flag has never been actually useful here because it has
> always been used only for PAGE_ALLOC_COSTLY requests.
> 
> An earlier attempt to remove the flag 10d58bf297e2 ("s390: get rid of
> superfluous __GFP_REPEAT") has missed this one but the situation is very
> same here.
> 
> Cc: Heiko Carstens 
> Signed-off-by: Michal Hocko 
> ---
>  arch/s390/mm/pgalloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

FWIW:
Acked-by: Heiko Carstens 

If you want, this can be routed via the s390 tree, whatever you prefer.

Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, done}

2017-03-06 Thread Heiko Carstens

Hello Dan,

> > If you look at commit 5e33bc4165f3 ("driver core / ACPI: Avoid device hot
> > remove locking issues") then lock_device_hotplug_sysfs() was introduced to
> > avoid a different subtle deadlock, but it also sleeps uninterruptible, but
> > not for more than 5ms ;)
> >
> > However I'm not sure if the device hotplug lock should also be used to fix
> > an unrelated bug that was introduced with the get_online_mems() /
> > put_online_mems() interface. Should it?
> 
> No, I don't think it should.
> 
> I like your proposed direction of creating a new lock internal to
> mem_hotplug_begin() to protect active_writer, and stop relying on
> lock_device_hotplug to serve this purpose.
> 
> > If so, we need to sprinkle around a couple of lock_device_hotplug() calls
> > near mem_hotplug_begin() calls, like Sebastian already started, and give it
> > additional semantics (protecting mem_hotplug.active_writer), and hope it
> > doesn't lead to deadlocks anywhere.
> 
> I'll put your proposed patch through some testing.

On s390 it _seems_ to work. Did it pass your testing too?
If so I would send a patch with proper patch description for inclusion.

Thanks,
Heiko

Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, done}

2017-03-06 Thread Heiko Carstens

Hello Dan,

> > If you look at commit 5e33bc4165f3 ("driver core / ACPI: Avoid device hot
> > remove locking issues") then lock_device_hotplug_sysfs() was introduced to
> > avoid a different subtle deadlock, but it also sleeps uninterruptible, but
> > not for more than 5ms ;)
> >
> > However I'm not sure if the device hotplug lock should also be used to fix
> > an unrelated bug that was introduced with the get_online_mems() /
> > put_online_mems() interface. Should it?
> 
> No, I don't think it should.
> 
> I like your proposed direction of creating a new lock internal to
> mem_hotplug_begin() to protect active_writer, and stop relying on
> lock_device_hotplug to serve this purpose.
> 
> > If so, we need to sprinkle around a couple of lock_device_hotplug() calls
> > near mem_hotplug_begin() calls, like Sebastian already started, and give it
> > additional semantics (protecting mem_hotplug.active_writer), and hope it
> > doesn't lead to deadlocks anywhere.
> 
> I'll put your proposed patch through some testing.

On s390 it _seems_ to work. Did it pass your testing too?
If so I would send a patch with proper patch description for inclusion.

Thanks,
Heiko

Re: [PATCH 1/3] futex: remove duplicated code

2017-03-03 Thread Heiko Carstens

On Fri, Mar 03, 2017 at 01:27:10PM +0100, Jiri Slaby wrote:
> There is code duplicated over all architecture's headers for
> futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
> and comparison of the result.
> 
> Remove this duplication and leave up to the arches only the needed
> assembly which is now in arch_futex_atomic_op_inuser.
> 
> Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
> remove pointless access_ok() checks") as access_ok there returns true.
> We introduce it back to the helper for the sake of simplicity (it gets
> optimized away anyway).
> 
> Signed-off-by: Jiri Slaby <jsl...@suse.cz>
> ---
>  arch/s390/include/asm/futex.h   | 23 -
>  include/asm-generic/futex.h | 50 
> +++--
>  kernel/futex.c  | 36 ++

Looks good to me and still boots on s390. Therefore for the s390 bits:
Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Thanks!

Re: [PATCH 1/3] futex: remove duplicated code

2017-03-03 Thread Heiko Carstens

On Fri, Mar 03, 2017 at 01:27:10PM +0100, Jiri Slaby wrote:
> There is code duplicated over all architecture's headers for
> futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
> and comparison of the result.
> 
> Remove this duplication and leave up to the arches only the needed
> assembly which is now in arch_futex_atomic_op_inuser.
> 
> Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
> remove pointless access_ok() checks") as access_ok there returns true.
> We introduce it back to the helper for the sake of simplicity (it gets
> optimized away anyway).
> 
> Signed-off-by: Jiri Slaby 
> ---
>  arch/s390/include/asm/futex.h   | 23 -
>  include/asm-generic/futex.h | 50 
> +++--
>  kernel/futex.c  | 36 ++

Looks good to me and still boots on s390. Therefore for the s390 bits:
Acked-by: Heiko Carstens 

Thanks!

[BUG 4.9/4.10] crash in __d_lookup() due to corrupted dentry_hashtable

2017-03-03 Thread Heiko Carstens

Hello Al,

Gustavo reported the crash below within __d_lookup() on s390. I'm wondering
if you can make any sense of it:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: f000 TEID: f803
Fault in home space mode while using kernel ASCE.
AS:00ec8007 R3:0003dc5f4007 S:0020 
Oops: 0038 ilc:3 [#1] SMP 
Modules linked in: dm_mirror dm_region_hash dm_log raid1 dm_raid raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx dm_service_time 
paes_s390 pkey zcrypt rng_core xt_CHECKSUM iptable_mangle ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp 
bridge stp llc ip6table_filter ip6_tables iptable_filter aes_s390 des_s390 
des_generic sha512_s390 sha256_s390 vmur sha1_s390 sha_common vhost_net nfsd 
tun vhost macvtap auth_rpcgss macvlan nfs_acl lockd kvm sch_fq_codel grace 
sunrpc dm_multipath dm_mod ip_tables x_tables autofs4
CPU: 1 PID: 31708 Comm: fio Tainted: GW   
4.10.0-20170228.0.1e014e7.d661408.fc24.s390xperformance #1
Hardware name: IBM  2964 N96  704  (z/VM)
task: f7e55d00 task.stack: 00023124c000
Krnl PSW : 0704e0018000 0033539e (__d_lookup+0x6e/0x168)
   R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 002f 03e0806248f8 82313080 00d2abe0
   00023124fc00 00023124fbfc 003b08a8 0003dafac220
   82313080  00023124fdc8 00026248fcce
   fffe 0094b420 00023124faf0 00023124faa0
Krnl Code: 00335392: b9040082lgr %r8,%r2
   00335396: a798lhi %r9,0
  #0033539a: a7f40008brc 15,3353aa
  >0033539e: e3c0c004lg  %r12,0(%r12)
   003353a4: ecc8001d007ccgij%r12,0,8,3353de
   003353aa: 59b0c01cc   %r11,28(%r12)
   003353ae: a774fff8brc 7,33539e
   003353b2: 4120c050la  %r2,80(%r12)
Call Trace:
([<82313080>] 0x82313080)
 [<00327026>] lookup_fast+0x19e/0x340 
 [<00327530>] walk_component+0x48/0x358 
 [<00327980>] link_path_walk+0x140/0x508 
 [<00328806>] path_openat+0xae/0x1320 
 [<0032af26>] do_filp_open+0x86/0x108 
 [<00315b5c>] do_sys_open+0x174/0x250 
 [<00922d5c>] system_call+0xc4/0x264 
Last Breaking-Event-Address:
 [<003353ae>] __d_lookup+0x7e/0x168
 
Kernel panic - not syncing: Fatal exception: panic_on_oops

Looking at the relevant part of __d_lookup:

struct dentry *__d_lookup(const struct dentry *parent, const struct qstr *name)
{
unsigned int hash = name->hash;
struct hlist_bl_head *b = d_hash(hash);  <--- points to corrupted entry
struct hlist_bl_node *node;
struct dentry *found = NULL;
struct dentry *dentry;

rcu_read_lock();

hlist_bl_for_each_entry_rcu(dentry, node, b, d_hash) {

if (dentry->d_name.hash != hash)
continue;
...

The contents of *b within the dump is:

> struct hlist_bl_head 03e0806248f8
struct hlist_bl_head {
first = 0x
}

Note that 0x03e0806248f8 is a valid address within the
dentry_hashtable. In addition all other entries look ok, as far as I can
tell. This is the only entry that contains a -1UL value.

We also have a second dump with a similar crash with a 4.9 kernel. In that
case there are in total three entries spread within the dentry_hashtable
with a -1UL value, while all other entries seem to look ok. So there seems
to be a pattern.

Note: these kernels do contain addon patches that are not mainline, but I
don't believe that any of those can explain these corruptions.

[BUG 4.9/4.10] crash in __d_lookup() due to corrupted dentry_hashtable

2017-03-03 Thread Heiko Carstens

Hello Al,

Gustavo reported the crash below within __d_lookup() on s390. I'm wondering
if you can make any sense of it:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: f000 TEID: f803
Fault in home space mode while using kernel ASCE.
AS:00ec8007 R3:0003dc5f4007 S:0020 
Oops: 0038 ilc:3 [#1] SMP 
Modules linked in: dm_mirror dm_region_hash dm_log raid1 dm_raid raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx dm_service_time 
paes_s390 pkey zcrypt rng_core xt_CHECKSUM iptable_mangle ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp 
bridge stp llc ip6table_filter ip6_tables iptable_filter aes_s390 des_s390 
des_generic sha512_s390 sha256_s390 vmur sha1_s390 sha_common vhost_net nfsd 
tun vhost macvtap auth_rpcgss macvlan nfs_acl lockd kvm sch_fq_codel grace 
sunrpc dm_multipath dm_mod ip_tables x_tables autofs4
CPU: 1 PID: 31708 Comm: fio Tainted: GW   
4.10.0-20170228.0.1e014e7.d661408.fc24.s390xperformance #1
Hardware name: IBM  2964 N96  704  (z/VM)
task: f7e55d00 task.stack: 00023124c000
Krnl PSW : 0704e0018000 0033539e (__d_lookup+0x6e/0x168)
   R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 002f 03e0806248f8 82313080 00d2abe0
   00023124fc00 00023124fbfc 003b08a8 0003dafac220
   82313080  00023124fdc8 00026248fcce
   fffe 0094b420 00023124faf0 00023124faa0
Krnl Code: 00335392: b9040082lgr %r8,%r2
   00335396: a798lhi %r9,0
  #0033539a: a7f40008brc 15,3353aa
  >0033539e: e3c0c004lg  %r12,0(%r12)
   003353a4: ecc8001d007ccgij%r12,0,8,3353de
   003353aa: 59b0c01cc   %r11,28(%r12)
   003353ae: a774fff8brc 7,33539e
   003353b2: 4120c050la  %r2,80(%r12)
Call Trace:
([<82313080>] 0x82313080)
 [<00327026>] lookup_fast+0x19e/0x340 
 [<00327530>] walk_component+0x48/0x358 
 [<00327980>] link_path_walk+0x140/0x508 
 [<00328806>] path_openat+0xae/0x1320 
 [<0032af26>] do_filp_open+0x86/0x108 
 [<00315b5c>] do_sys_open+0x174/0x250 
 [<00922d5c>] system_call+0xc4/0x264 
Last Breaking-Event-Address:
 [<003353ae>] __d_lookup+0x7e/0x168
 
Kernel panic - not syncing: Fatal exception: panic_on_oops

Looking at the relevant part of __d_lookup:

struct dentry *__d_lookup(const struct dentry *parent, const struct qstr *name)
{
unsigned int hash = name->hash;
struct hlist_bl_head *b = d_hash(hash);  <--- points to corrupted entry
struct hlist_bl_node *node;
struct dentry *found = NULL;
struct dentry *dentry;

rcu_read_lock();

hlist_bl_for_each_entry_rcu(dentry, node, b, d_hash) {

if (dentry->d_name.hash != hash)
continue;
...

The contents of *b within the dump is:

> struct hlist_bl_head 03e0806248f8
struct hlist_bl_head {
first = 0x
}

Note that 0x03e0806248f8 is a valid address within the
dentry_hashtable. In addition all other entries look ok, as far as I can
tell. This is the only entry that contains a -1UL value.

We also have a second dump with a similar crash with a 4.9 kernel. In that
case there are in total three entries spread within the dentry_hashtable
with a -1UL value, while all other entries seem to look ok. So there seems
to be a pattern.

Note: these kernels do contain addon patches that are not mainline, but I
don't believe that any of those can explain these corruptions.

Re: [PATCHv2 00/14] set_memory_* functions header refactor

2017-03-02 Thread Heiko Carstens

On Wed, Mar 01, 2017 at 04:14:52PM -0800, Laura Abbott wrote:
> Hi,
> 
> This is v2 of my proposal to move set_memory_* function prototypes out of
> cacheflush.h and into their own header file. This came out of a comment
> Russell made while reviewing RODATA test cases
> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-January/480855.html
> While the final result of that series was the rodata code was refactored into
> its own header file, the set_memory_* APIs are still out of place.
> 
> This version refactored the common set_memory_* functions into an asm-generic
> header. s390x added some features so it can't just use the asm-generic header.
> I debated how much more to try and shove into the asm-generic version (e.g.
> stub prototypes for the ARM nommu case, set_kernel_text_*) but decided to 
> stick
> with just the basics for this version.
> 
> I split out the cacheflush.h -> set_memory.h conversions into separate patches
> to hopefully make merging easier. Worst case, the final patch to completely
> separate the two can be delayed if there are more problems found. I'd like
> for this to eventually go through the -mm tree so I'd like Acks where
> appropriate.
> 
> As always, feedback appreciated.

For the s390 bits:
Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [PATCHv2 00/14] set_memory_* functions header refactor

2017-03-02 Thread Heiko Carstens

On Wed, Mar 01, 2017 at 04:14:52PM -0800, Laura Abbott wrote:
> Hi,
> 
> This is v2 of my proposal to move set_memory_* function prototypes out of
> cacheflush.h and into their own header file. This came out of a comment
> Russell made while reviewing RODATA test cases
> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-January/480855.html
> While the final result of that series was the rodata code was refactored into
> its own header file, the set_memory_* APIs are still out of place.
> 
> This version refactored the common set_memory_* functions into an asm-generic
> header. s390x added some features so it can't just use the asm-generic header.
> I debated how much more to try and shove into the asm-generic version (e.g.
> stub prototypes for the ARM nommu case, set_kernel_text_*) but decided to 
> stick
> with just the basics for this version.
> 
> I split out the cacheflush.h -> set_memory.h conversions into separate patches
> to hopefully make merging easier. Worst case, the final patch to completely
> separate the two can be delayed if there are more problems found. I'd like
> for this to eventually go through the -mm tree so I'd like Acks where
> appropriate.
> 
> As always, feedback appreciated.

For the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, done}

2017-03-01 Thread Heiko Carstens

On Wed, Mar 01, 2017 at 07:52:18AM -0800, Dan Williams wrote:
> On Wed, Mar 1, 2017 at 4:51 AM, Heiko Carstens
> <heiko.carst...@de.ibm.com> wrote:
> > Since it is anything but obvious why Dan wrote in changelog of b5d24fda9c3d
> > ("mm, devm_memremap_pages: hold device_hotplug lock over
> > mem_hotplug_{begin, done}") that write accesses to
> > mem_hotplug.active_writer are coordinated via lock_device_hotplug() I'd
> > rather propose a new private memory_add_remove_lock which has similar
> > semantics like the cpu_add_remove_lock for cpu hotplug (see patch below).
> >
> > However instead of sprinkling locking/unlocking of that new lock around all
> > calls of mem_hotplug_begin() and mem_hotplug_end() simply include locking
> > and unlocking into these two functions.
> >
> > This still allows get_online_mems() and put_online_mems() to work, while at
> > the same time preventing mem_hotplug.active_writer corruption.
> >
> > Any opinions?
> 
> Sorry, yes, I didn't make it clear that I derived that locking
> requirement from store_mem_state() and its usage of
> lock_device_hotplug_sysfs().
> 
> That routine is trying very hard not trip the soft-lockup detector. It
> seems like that wants to be an interruptible wait.

If you look at commit 5e33bc4165f3 ("driver core / ACPI: Avoid device hot
remove locking issues") then lock_device_hotplug_sysfs() was introduced to
avoid a different subtle deadlock, but it also sleeps uninterruptible, but
not for more than 5ms ;)

However I'm not sure if the device hotplug lock should also be used to fix
an unrelated bug that was introduced with the get_online_mems() /
put_online_mems() interface. Should it?

If so, we need to sprinkle around a couple of lock_device_hotplug() calls
near mem_hotplug_begin() calls, like Sebastian already started, and give it
additional semantics (protecting mem_hotplug.active_writer), and hope it
doesn't lead to deadlocks anywhere.

Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, done}

2017-03-01 Thread Heiko Carstens

On Wed, Mar 01, 2017 at 07:52:18AM -0800, Dan Williams wrote:
> On Wed, Mar 1, 2017 at 4:51 AM, Heiko Carstens
>  wrote:
> > Since it is anything but obvious why Dan wrote in changelog of b5d24fda9c3d
> > ("mm, devm_memremap_pages: hold device_hotplug lock over
> > mem_hotplug_{begin, done}") that write accesses to
> > mem_hotplug.active_writer are coordinated via lock_device_hotplug() I'd
> > rather propose a new private memory_add_remove_lock which has similar
> > semantics like the cpu_add_remove_lock for cpu hotplug (see patch below).
> >
> > However instead of sprinkling locking/unlocking of that new lock around all
> > calls of mem_hotplug_begin() and mem_hotplug_end() simply include locking
> > and unlocking into these two functions.
> >
> > This still allows get_online_mems() and put_online_mems() to work, while at
> > the same time preventing mem_hotplug.active_writer corruption.
> >
> > Any opinions?
> 
> Sorry, yes, I didn't make it clear that I derived that locking
> requirement from store_mem_state() and its usage of
> lock_device_hotplug_sysfs().
> 
> That routine is trying very hard not trip the soft-lockup detector. It
> seems like that wants to be an interruptible wait.

If you look at commit 5e33bc4165f3 ("driver core / ACPI: Avoid device hot
remove locking issues") then lock_device_hotplug_sysfs() was introduced to
avoid a different subtle deadlock, but it also sleeps uninterruptible, but
not for more than 5ms ;)

However I'm not sure if the device hotplug lock should also be used to fix
an unrelated bug that was introduced with the get_online_mems() /
put_online_mems() interface. Should it?

If so, we need to sprinkle around a couple of lock_device_hotplug() calls
near mem_hotplug_begin() calls, like Sebastian already started, and give it
additional semantics (protecting mem_hotplug.active_writer), and hope it
doesn't lead to deadlocks anywhere.

Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, done}

2017-03-01 Thread Heiko Carstens

On Tue, Feb 28, 2017 at 12:57:29PM +0100, Heiko Carstens wrote:
> On Mon, Feb 27, 2017 at 05:20:31PM +0100, Michal Hocko wrote:
> > [CC Rafael]
> > 
> > I've got lost in the acpi indirection (again). I can see
> > acpi_device_hotplug calling lock_device_hotplug() but i cannot find a
> > path down to add_memory() which might call add_memory_resource. But the
> > patch below sounds suspicious to me. Is it possible that this could lead
> > to a deadlock. I would suspect that it is the s390 code which needs to
> > do the locking. But I would have to double check - it is really easy to
> > get lost there.
> 
> To me it rather looks like bfc8c90139eb ("mem-hotplug: implement
> get/put_online_mems") introduced quite subtle and probably wrong locking
> rules.
> 
> The patch introduced mem_hotplug_begin() in order to have something like
> cpu_hotplug_begin() for memory. Note that for cpu hotplug all
> cpu_hotplug_begin() calls are serialized by cpu_maps_update_begin().
> 
> Especially this makes sure that active_writer can only be changed by one
> process. (See also Dan's commit which introduced the lock_device_hotplug()
> calls: https://marc.info/?l=linux-kernel=148693912419972=2 )
> 
> If you look at the above commit bfc8c90139eb: there is nothing like
> cpu_maps_update_begin() for memory. And therefore it's possible to have
> concurrent writers to active_writer.
> 
> It looks like now lock_device_hotplug() is supposed to be the new
> cpu_maps_update_begin() for memory. But.. this looks like a mess, unless I
> read the code completely wrong ;)

[Full quote since I now hopefully use a non-bouncing email address from
Vladimir]

Since it is anything but obvious why Dan wrote in changelog of b5d24fda9c3d
("mm, devm_memremap_pages: hold device_hotplug lock over
mem_hotplug_{begin, done}") that write accesses to
mem_hotplug.active_writer are coordinated via lock_device_hotplug() I'd
rather propose a new private memory_add_remove_lock which has similar
semantics like the cpu_add_remove_lock for cpu hotplug (see patch below).

However instead of sprinkling locking/unlocking of that new lock around all
calls of mem_hotplug_begin() and mem_hotplug_end() simply include locking
and unlocking into these two functions.

This still allows get_online_mems() and put_online_mems() to work, while at
the same time preventing mem_hotplug.active_writer corruption.

Any opinions?

---
 kernel/memremap.c   | 4 
 mm/memory_hotplug.c | 6 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/memremap.c b/kernel/memremap.c
index 06123234f118..07e85e5229da 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -247,11 +247,9 @@ static void devm_memremap_pages_release(struct device 
*dev, void *data)
align_start = res->start & ~(SECTION_SIZE - 1);
align_size = ALIGN(resource_size(res), SECTION_SIZE);
 
-   lock_device_hotplug();
mem_hotplug_begin();
arch_remove_memory(align_start, align_size);
mem_hotplug_done();
-   unlock_device_hotplug();
 
untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
pgmap_radix_release(res);
@@ -364,11 +362,9 @@ void *devm_memremap_pages(struct device *dev, struct 
resource *res,
if (error)
goto err_pfn_remap;
 
-   lock_device_hotplug();
mem_hotplug_begin();
error = arch_add_memory(nid, align_start, align_size, true);
mem_hotplug_done();
-   unlock_device_hotplug();
if (error)
goto err_add_memory;
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1d3ed58f92ab..6ee6e6a17310 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -124,9 +124,12 @@ void put_online_mems(void)
 
 }
 
+/* Needed to serialize write accesses to mem_hotplug.active_writer. */
+static DEFINE_MUTEX(memory_add_remove_lock);
+
 void mem_hotplug_begin(void)
 {
-   assert_held_device_hotplug();
+   mutex_lock(_add_remove_lock);
 
mem_hotplug.active_writer = current;
 
@@ -146,6 +149,7 @@ void mem_hotplug_done(void)
mem_hotplug.active_writer = NULL;
mutex_unlock(_hotplug.lock);
memhp_lock_release();
+   mutex_unlock(_add_remove_lock);
 }
 
 /* add this memory to iomem resource */
-- 
2.8.4

> > On Sun 26-02-17 12:42:44, Sebastian Ott wrote:
> > > With 4.10.0-10265-gc4f3f22 the following warning is triggered on s390:
> > > 
> > > WARNING: CPU: 6 PID: 1 at drivers/base/core.c:643 
> > > assert_held_device_hotplug+0x4a/0x58
> > > [5.731214] Call Trace:
> > > [5.731219] ([<0067b8b0>] assert_held_device_hotplug+0x40/0x58)
> > > [5.731225]  [<00337914>] mem_hotplug_begin+0x34/0xc8
> > > [5.731231]  [<008b897e>] add_memory_resour

Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, done}

2017-03-01 Thread Heiko Carstens

On Tue, Feb 28, 2017 at 12:57:29PM +0100, Heiko Carstens wrote:
> On Mon, Feb 27, 2017 at 05:20:31PM +0100, Michal Hocko wrote:
> > [CC Rafael]
> > 
> > I've got lost in the acpi indirection (again). I can see
> > acpi_device_hotplug calling lock_device_hotplug() but i cannot find a
> > path down to add_memory() which might call add_memory_resource. But the
> > patch below sounds suspicious to me. Is it possible that this could lead
> > to a deadlock. I would suspect that it is the s390 code which needs to
> > do the locking. But I would have to double check - it is really easy to
> > get lost there.
> 
> To me it rather looks like bfc8c90139eb ("mem-hotplug: implement
> get/put_online_mems") introduced quite subtle and probably wrong locking
> rules.
> 
> The patch introduced mem_hotplug_begin() in order to have something like
> cpu_hotplug_begin() for memory. Note that for cpu hotplug all
> cpu_hotplug_begin() calls are serialized by cpu_maps_update_begin().
> 
> Especially this makes sure that active_writer can only be changed by one
> process. (See also Dan's commit which introduced the lock_device_hotplug()
> calls: https://marc.info/?l=linux-kernel=148693912419972=2 )
> 
> If you look at the above commit bfc8c90139eb: there is nothing like
> cpu_maps_update_begin() for memory. And therefore it's possible to have
> concurrent writers to active_writer.
> 
> It looks like now lock_device_hotplug() is supposed to be the new
> cpu_maps_update_begin() for memory. But.. this looks like a mess, unless I
> read the code completely wrong ;)

[Full quote since I now hopefully use a non-bouncing email address from
Vladimir]

Since it is anything but obvious why Dan wrote in changelog of b5d24fda9c3d
("mm, devm_memremap_pages: hold device_hotplug lock over
mem_hotplug_{begin, done}") that write accesses to
mem_hotplug.active_writer are coordinated via lock_device_hotplug() I'd
rather propose a new private memory_add_remove_lock which has similar
semantics like the cpu_add_remove_lock for cpu hotplug (see patch below).

However instead of sprinkling locking/unlocking of that new lock around all
calls of mem_hotplug_begin() and mem_hotplug_end() simply include locking
and unlocking into these two functions.

This still allows get_online_mems() and put_online_mems() to work, while at
the same time preventing mem_hotplug.active_writer corruption.

Any opinions?

---
 kernel/memremap.c   | 4 
 mm/memory_hotplug.c | 6 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/memremap.c b/kernel/memremap.c
index 06123234f118..07e85e5229da 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -247,11 +247,9 @@ static void devm_memremap_pages_release(struct device 
*dev, void *data)
align_start = res->start & ~(SECTION_SIZE - 1);
align_size = ALIGN(resource_size(res), SECTION_SIZE);
 
-   lock_device_hotplug();
mem_hotplug_begin();
arch_remove_memory(align_start, align_size);
mem_hotplug_done();
-   unlock_device_hotplug();
 
untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
pgmap_radix_release(res);
@@ -364,11 +362,9 @@ void *devm_memremap_pages(struct device *dev, struct 
resource *res,
if (error)
goto err_pfn_remap;
 
-   lock_device_hotplug();
mem_hotplug_begin();
error = arch_add_memory(nid, align_start, align_size, true);
mem_hotplug_done();
-   unlock_device_hotplug();
if (error)
goto err_add_memory;
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1d3ed58f92ab..6ee6e6a17310 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -124,9 +124,12 @@ void put_online_mems(void)
 
 }
 
+/* Needed to serialize write accesses to mem_hotplug.active_writer. */
+static DEFINE_MUTEX(memory_add_remove_lock);
+
 void mem_hotplug_begin(void)
 {
-   assert_held_device_hotplug();
+   mutex_lock(_add_remove_lock);
 
mem_hotplug.active_writer = current;
 
@@ -146,6 +149,7 @@ void mem_hotplug_done(void)
mem_hotplug.active_writer = NULL;
mutex_unlock(_hotplug.lock);
memhp_lock_release();
+   mutex_unlock(_add_remove_lock);
 }
 
 /* add this memory to iomem resource */
-- 
2.8.4

> > On Sun 26-02-17 12:42:44, Sebastian Ott wrote:
> > > With 4.10.0-10265-gc4f3f22 the following warning is triggered on s390:
> > > 
> > > WARNING: CPU: 6 PID: 1 at drivers/base/core.c:643 
> > > assert_held_device_hotplug+0x4a/0x58
> > > [5.731214] Call Trace:
> > > [5.731219] ([<0067b8b0>] assert_held_device_hotplug+0x40/0x58)
> > > [5.731225]  [<00337914>] mem_hotplug_begin+0x34/0xc8
> > > [5.731231]  [<008b897e>] add_memory_resour

Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, done}

2017-02-28 Thread Heiko Carstens

On Mon, Feb 27, 2017 at 05:20:31PM +0100, Michal Hocko wrote:
> [CC Rafael]
> 
> I've got lost in the acpi indirection (again). I can see
> acpi_device_hotplug calling lock_device_hotplug() but i cannot find a
> path down to add_memory() which might call add_memory_resource. But the
> patch below sounds suspicious to me. Is it possible that this could lead
> to a deadlock. I would suspect that it is the s390 code which needs to
> do the locking. But I would have to double check - it is really easy to
> get lost there.

To me it rather looks like bfc8c90139eb ("mem-hotplug: implement
get/put_online_mems") introduced quite subtle and probably wrong locking
rules.

The patch introduced mem_hotplug_begin() in order to have something like
cpu_hotplug_begin() for memory. Note that for cpu hotplug all
cpu_hotplug_begin() calls are serialized by cpu_maps_update_begin().

Especially this makes sure that active_writer can only be changed by one
process. (See also Dan's commit which introduced the lock_device_hotplug()
calls: https://marc.info/?l=linux-kernel=148693912419972=2 )

If you look at the above commit bfc8c90139eb: there is nothing like
cpu_maps_update_begin() for memory. And therefore it's possible to have
concurrent writers to active_writer.

It looks like now lock_device_hotplug() is supposed to be the new
cpu_maps_update_begin() for memory. But.. this looks like a mess, unless I
read the code completely wrong ;)

> On Sun 26-02-17 12:42:44, Sebastian Ott wrote:
> > With 4.10.0-10265-gc4f3f22 the following warning is triggered on s390:
> > 
> > WARNING: CPU: 6 PID: 1 at drivers/base/core.c:643 
> > assert_held_device_hotplug+0x4a/0x58
> > [5.731214] Call Trace:
> > [5.731219] ([<0067b8b0>] assert_held_device_hotplug+0x40/0x58)
> > [5.731225]  [<00337914>] mem_hotplug_begin+0x34/0xc8
> > [5.731231]  [<008b897e>] add_memory_resource+0x7e/0x1f8
> > [5.731236]  [<008b8bd2>] add_memory+0xda/0x130
> > [5.731243]  [<00d7f0dc>] add_memory_merged+0x15c/0x178
> > [5.731247]  [<00d7f3a6>] sclp_detect_standby_memory+0x2ae/0x2f8
> > [5.731252]  [<001002ba>] do_one_initcall+0xa2/0x150
> > [5.731258]  [<00d3adc0>] kernel_init_freeable+0x228/0x2d8
> > [5.731263]  [<008b6572>] kernel_init+0x2a/0x140
> > [5.731267]  [<008c3972>] kernel_thread_starter+0x6/0xc
> > [5.731272]  [<008c396c>] kernel_thread_starter+0x0/0xc
> > [5.731276] no locks held by swapper/0/1.
> > [5.731280] Last Breaking-Event-Address:
> > [5.731285]  [<0067b8b6>] assert_held_device_hotplug+0x46/0x58
> > [5.731292] ---[ end trace 46480df21194c96a ]---
> 
> such an informtion belongs to the changelog
> 
> > ->8
> > mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, 
> > done}
> > 
> > With commit 3fc219241 ("mm: validate device_hotplug is held for memory 
> > hotplug")
> > a lock assertion was added to mem_hotplug_begin() which led to a warning
> > when add_memory() is called. Fix this by acquiring device_hotplug_lock in
> > add_memory_resource().
> > 
> > Signed-off-by: Sebastian Ott 
> > ---
> >  mm/memory_hotplug.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 1d3ed58..c633bbc 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1361,6 +1361,7 @@ int __ref add_memory_resource(int nid, struct 
> > resource *res, bool online)
> > new_pgdat = !p;
> > }
> >  
> > +   lock_device_hotplug();
> > mem_hotplug_begin();
> >  
> > /*
> > @@ -1416,6 +1417,7 @@ int __ref add_memory_resource(int nid, struct 
> > resource *res, bool online)
> >  
> >  out:
> > mem_hotplug_done();
> > +   unlock_device_hotplug();
> > return ret;
> >  }
> >  EXPORT_SYMBOL_GPL(add_memory_resource);
> > -- 
> > 2.3.0
> > 
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majord...@kvack.org.  For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
> 
> -- 
> Michal Hocko
> SUSE Labs
>

Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, done}

2017-02-28 Thread Heiko Carstens

On Mon, Feb 27, 2017 at 05:20:31PM +0100, Michal Hocko wrote:
> [CC Rafael]
> 
> I've got lost in the acpi indirection (again). I can see
> acpi_device_hotplug calling lock_device_hotplug() but i cannot find a
> path down to add_memory() which might call add_memory_resource. But the
> patch below sounds suspicious to me. Is it possible that this could lead
> to a deadlock. I would suspect that it is the s390 code which needs to
> do the locking. But I would have to double check - it is really easy to
> get lost there.

To me it rather looks like bfc8c90139eb ("mem-hotplug: implement
get/put_online_mems") introduced quite subtle and probably wrong locking
rules.

The patch introduced mem_hotplug_begin() in order to have something like
cpu_hotplug_begin() for memory. Note that for cpu hotplug all
cpu_hotplug_begin() calls are serialized by cpu_maps_update_begin().

Especially this makes sure that active_writer can only be changed by one
process. (See also Dan's commit which introduced the lock_device_hotplug()
calls: https://marc.info/?l=linux-kernel=148693912419972=2 )

If you look at the above commit bfc8c90139eb: there is nothing like
cpu_maps_update_begin() for memory. And therefore it's possible to have
concurrent writers to active_writer.

It looks like now lock_device_hotplug() is supposed to be the new
cpu_maps_update_begin() for memory. But.. this looks like a mess, unless I
read the code completely wrong ;)

> On Sun 26-02-17 12:42:44, Sebastian Ott wrote:
> > With 4.10.0-10265-gc4f3f22 the following warning is triggered on s390:
> > 
> > WARNING: CPU: 6 PID: 1 at drivers/base/core.c:643 
> > assert_held_device_hotplug+0x4a/0x58
> > [5.731214] Call Trace:
> > [5.731219] ([<0067b8b0>] assert_held_device_hotplug+0x40/0x58)
> > [5.731225]  [<00337914>] mem_hotplug_begin+0x34/0xc8
> > [5.731231]  [<008b897e>] add_memory_resource+0x7e/0x1f8
> > [5.731236]  [<008b8bd2>] add_memory+0xda/0x130
> > [5.731243]  [<00d7f0dc>] add_memory_merged+0x15c/0x178
> > [5.731247]  [<00d7f3a6>] sclp_detect_standby_memory+0x2ae/0x2f8
> > [5.731252]  [<001002ba>] do_one_initcall+0xa2/0x150
> > [5.731258]  [<00d3adc0>] kernel_init_freeable+0x228/0x2d8
> > [5.731263]  [<008b6572>] kernel_init+0x2a/0x140
> > [5.731267]  [<008c3972>] kernel_thread_starter+0x6/0xc
> > [5.731272]  [<008c396c>] kernel_thread_starter+0x0/0xc
> > [5.731276] no locks held by swapper/0/1.
> > [5.731280] Last Breaking-Event-Address:
> > [5.731285]  [<0067b8b6>] assert_held_device_hotplug+0x46/0x58
> > [5.731292] ---[ end trace 46480df21194c96a ]---
> 
> such an informtion belongs to the changelog
> 
> > ->8
> > mm, add_memory_resource: hold device_hotplug lock over mem_hotplug_{begin, 
> > done}
> > 
> > With commit 3fc219241 ("mm: validate device_hotplug is held for memory 
> > hotplug")
> > a lock assertion was added to mem_hotplug_begin() which led to a warning
> > when add_memory() is called. Fix this by acquiring device_hotplug_lock in
> > add_memory_resource().
> > 
> > Signed-off-by: Sebastian Ott 
> > ---
> >  mm/memory_hotplug.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 1d3ed58..c633bbc 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1361,6 +1361,7 @@ int __ref add_memory_resource(int nid, struct 
> > resource *res, bool online)
> > new_pgdat = !p;
> > }
> >  
> > +   lock_device_hotplug();
> > mem_hotplug_begin();
> >  
> > /*
> > @@ -1416,6 +1417,7 @@ int __ref add_memory_resource(int nid, struct 
> > resource *res, bool online)
> >  
> >  out:
> > mem_hotplug_done();
> > +   unlock_device_hotplug();
> > return ret;
> >  }
> >  EXPORT_SYMBOL_GPL(add_memory_resource);
> > -- 
> > 2.3.0
> > 
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majord...@kvack.org.  For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
> 
> -- 
> Michal Hocko
> SUSE Labs
>

Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks

2017-02-28 Thread Heiko Carstens

On Mon, Feb 27, 2017 at 04:43:04PM +0100, Michal Hocko wrote:
> On Mon 27-02-17 12:25:10, Heiko Carstens wrote:
> > On Mon, Feb 27, 2017 at 11:02:09AM +0100, Vitaly Kuznetsov wrote:
> > > A couple of other thoughts:
> > > 1) Having all newly added memory online ASAP is probably what people
> > > want for all virtual machines.
> > 
> > This is not true for s390. On s390 we have "standby" memory that a guest
> > sees and potentially may use if it sets it online. Every guest that sets
> > memory offline contributes to the hypervisor's standby memory pool, while
> > onlining standby memory takes memory away from the standby pool.
> > 
> > The use-case is that a system administrator in advance knows the maximum
> > size a guest will ever have and also defines how much memory should be used
> > at boot time. The difference is standby memory.
> > 
> > Auto-onlining of standby memory is the last thing we want.
> > 
> > > Unfortunately, we have additional complexity with memory zones
> > > (ZONE_NORMAL, ZONE_MOVABLE) and in some cases manual intervention is
> > > required. Especially, when further unplug is expected.
> > 
> > This also is a reason why auto-onlining doesn't seem be the best way.
> 
> Can you imagine any situation when somebody actually might want to have
> this knob enabled? From what I understand it doesn't seem to be the
> case.

I can only speak for s390, and at least here I think auto-online is always
wrong, especially if you consider the added complexity that you may want to
online memory sometimes to ZONE_NORMAL and sometimes to ZONE_MOVABLE.

Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks

2017-02-28 Thread Heiko Carstens

On Mon, Feb 27, 2017 at 04:43:04PM +0100, Michal Hocko wrote:
> On Mon 27-02-17 12:25:10, Heiko Carstens wrote:
> > On Mon, Feb 27, 2017 at 11:02:09AM +0100, Vitaly Kuznetsov wrote:
> > > A couple of other thoughts:
> > > 1) Having all newly added memory online ASAP is probably what people
> > > want for all virtual machines.
> > 
> > This is not true for s390. On s390 we have "standby" memory that a guest
> > sees and potentially may use if it sets it online. Every guest that sets
> > memory offline contributes to the hypervisor's standby memory pool, while
> > onlining standby memory takes memory away from the standby pool.
> > 
> > The use-case is that a system administrator in advance knows the maximum
> > size a guest will ever have and also defines how much memory should be used
> > at boot time. The difference is standby memory.
> > 
> > Auto-onlining of standby memory is the last thing we want.
> > 
> > > Unfortunately, we have additional complexity with memory zones
> > > (ZONE_NORMAL, ZONE_MOVABLE) and in some cases manual intervention is
> > > required. Especially, when further unplug is expected.
> > 
> > This also is a reason why auto-onlining doesn't seem be the best way.
> 
> Can you imagine any situation when somebody actually might want to have
> this knob enabled? From what I understand it doesn't seem to be the
> case.

I can only speak for s390, and at least here I think auto-online is always
wrong, especially if you consider the added complexity that you may want to
online memory sometimes to ZONE_NORMAL and sometimes to ZONE_MOVABLE.

Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks

2017-02-27 Thread Heiko Carstens

On Mon, Feb 27, 2017 at 11:02:09AM +0100, Vitaly Kuznetsov wrote:
> A couple of other thoughts:
> 1) Having all newly added memory online ASAP is probably what people
> want for all virtual machines.

This is not true for s390. On s390 we have "standby" memory that a guest
sees and potentially may use if it sets it online. Every guest that sets
memory offline contributes to the hypervisor's standby memory pool, while
onlining standby memory takes memory away from the standby pool.

The use-case is that a system administrator in advance knows the maximum
size a guest will ever have and also defines how much memory should be used
at boot time. The difference is standby memory.

Auto-onlining of standby memory is the last thing we want.

> Unfortunately, we have additional complexity with memory zones
> (ZONE_NORMAL, ZONE_MOVABLE) and in some cases manual intervention is
> required. Especially, when further unplug is expected.

This also is a reason why auto-onlining doesn't seem be the best way.

Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks

2017-02-27 Thread Heiko Carstens

On Mon, Feb 27, 2017 at 11:02:09AM +0100, Vitaly Kuznetsov wrote:
> A couple of other thoughts:
> 1) Having all newly added memory online ASAP is probably what people
> want for all virtual machines.

This is not true for s390. On s390 we have "standby" memory that a guest
sees and potentially may use if it sets it online. Every guest that sets
memory offline contributes to the hypervisor's standby memory pool, while
onlining standby memory takes memory away from the standby pool.

The use-case is that a system administrator in advance knows the maximum
size a guest will ever have and also defines how much memory should be used
at boot time. The difference is standby memory.

Auto-onlining of standby memory is the last thing we want.

> Unfortunately, we have additional complexity with memory zones
> (ZONE_NORMAL, ZONE_MOVABLE) and in some cases manual intervention is
> required. Especially, when further unplug is expected.

This also is a reason why auto-onlining doesn't seem be the best way.

Re: [PATCH 1/4] s390: convert debug_info.ref_count from atomic_t to refcount_t

2017-02-20 Thread Heiko Carstens

On Mon, Feb 20, 2017 at 02:24:24PM +0100, Heiko Carstens wrote:
> > @@ -361,7 +361,7 @@ debug_info_create(const char *name, int pages_per_area, 
> > int nr_areas,
> >  debug_area_last = rc;
> >  rc->next = NULL;
> >  
> > -   debug_info_get(rc);
> > +   refcount_set(>ref_count, 1);
> 
> This is not wrong, but I will remove this hunk before applying your patch,
> since this doesn't look like an obvious correct change at first glance.

Actually your version is needed - just looked at refcount_inc().
Sorry for the confusion in my side now.

Re: [PATCH 1/4] s390: convert debug_info.ref_count from atomic_t to refcount_t

2017-02-20 Thread Heiko Carstens

On Mon, Feb 20, 2017 at 02:24:24PM +0100, Heiko Carstens wrote:
> > @@ -361,7 +361,7 @@ debug_info_create(const char *name, int pages_per_area, 
> > int nr_areas,
> >  debug_area_last = rc;
> >  rc->next = NULL;
> >  
> > -   debug_info_get(rc);
> > +   refcount_set(>ref_count, 1);
> 
> This is not wrong, but I will remove this hunk before applying your patch,
> since this doesn't look like an obvious correct change at first glance.

Actually your version is needed - just looked at refcount_inc().
Sorry for the confusion in my side now.

Re: [PATCH 1/4] s390: convert debug_info.ref_count from atomic_t to refcount_t

2017-02-20 Thread Heiko Carstens

On Mon, Feb 20, 2017 at 01:06:18PM +0200, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
> 
> Signed-off-by: Elena Reshetova 
> Signed-off-by: Hans Liljestrand 
> Signed-off-by: Kees Cook 
> Signed-off-by: David Windsor 
> ---
>  arch/s390/include/asm/debug.h | 3 ++-
>  arch/s390/kernel/debug.c  | 8 
>  2 files changed, 6 insertions(+), 5 deletions(-)

I can only see a pull request from Ingo a couple of hours ago for Peter's
refcount code. So the refcount code is not merged yet. It would have been
good if you would have waited until it is really merged to avoid confusion.

> @@ -361,7 +361,7 @@ debug_info_create(const char *name, int pages_per_area, 
> int nr_areas,
>  debug_area_last = rc;
>  rc->next = NULL;
>  
> - debug_info_get(rc);
> + refcount_set(>ref_count, 1);

This is not wrong, but I will remove this hunk before applying your patch,
since this doesn't look like an obvious correct change at first glance.

Thanks,
Heiko

Re: [PATCH 1/4] s390: convert debug_info.ref_count from atomic_t to refcount_t

2017-02-20 Thread Heiko Carstens

On Mon, Feb 20, 2017 at 01:06:18PM +0200, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
> 
> Signed-off-by: Elena Reshetova 
> Signed-off-by: Hans Liljestrand 
> Signed-off-by: Kees Cook 
> Signed-off-by: David Windsor 
> ---
>  arch/s390/include/asm/debug.h | 3 ++-
>  arch/s390/kernel/debug.c  | 8 
>  2 files changed, 6 insertions(+), 5 deletions(-)

I can only see a pull request from Ingo a couple of hours ago for Peter's
refcount code. So the refcount code is not merged yet. It would have been
good if you would have waited until it is really merged to avoid confusion.

> @@ -361,7 +361,7 @@ debug_info_create(const char *name, int pages_per_area, 
> int nr_areas,
>  debug_area_last = rc;
>  rc->next = NULL;
>  
> - debug_info_get(rc);
> + refcount_set(>ref_count, 1);

This is not wrong, but I will remove this hunk before applying your patch,
since this doesn't look like an obvious correct change at first glance.

Thanks,
Heiko

Re: [PATCH 0/3] s390: audit and remove needless module.h includes

2017-02-12 Thread Heiko Carstens

On Thu, Feb 09, 2017 at 03:20:22PM -0500, Paul Gortmaker wrote:
> Paul Gortmaker (3):
>   s390: kernel: Audit and remove any unnecessary uses of module.h
>   s390: mm: Audit and remove any unnecessary uses of module.h
>   s390: Audit and remove any remaining unnecessary uses of module.h

Applied, thanks!

Re: [PATCH 0/3] s390: audit and remove needless module.h includes

2017-02-12 Thread Heiko Carstens

On Thu, Feb 09, 2017 at 03:20:22PM -0500, Paul Gortmaker wrote:
> Paul Gortmaker (3):
>   s390: kernel: Audit and remove any unnecessary uses of module.h
>   s390: mm: Audit and remove any unnecessary uses of module.h
>   s390: Audit and remove any remaining unnecessary uses of module.h

Applied, thanks!

Re: [RFC][PATCH] treewide: Move set_memory_* functions away from cacheflush.h

2017-02-09 Thread Heiko Carstens

On Tue, Feb 07, 2017 at 01:10:16PM -0800, Laura Abbott wrote:
> The set_memory_* APIs came out of a desire to have a better way to
> change memory attributes. Many of these attributes were linked to cache
> functionality so the prototypes were put in cacheflush.h. These days,
> the APIs have grown and have a much wider use than just cache APIs. To
> support this growth, split off set_memory_* and friends into a separate
> header file to avoid growing cacheflush.h for APIs that have nothing to
> do with caches.
> 
> Signed-off-by: Laura Abbott <labb...@redhat.com>
> ---
> This came out of a comment Russell made while reviewing RODATA test cases
> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-January/480855.html
> While the final result of that series was the rodata code was refactored into
> its own header file, the set_memory_* APIs are still out of place.
> 
> This is a simple attempt at moving all the API stubs to their own file.
> Another idea I had was throwing set_memory_{x,nx,ro,rw} in an asm-generic
> file since those are commonly used for module setting across all arches.
> 
> This is an RFC to see if this is actually beneficial. The diffstat is not
> negative unfortunately due to header guards in newer files.
> I have patches to convert call sites over to set_memory.h instead of
> cacheflush.h if there is sufficient interest.
> ---
>  arch/arm/include/asm/cacheflush.h   |  21 +---
>  arch/arm/include/asm/set_memory.h   |  32 
>  arch/arm64/include/asm/cacheflush.h |   6 +--
>  arch/arm64/include/asm/set_memory.h |  26 ++
>  arch/s390/include/asm/cacheflush.h  |   6 +--
>  arch/s390/include/asm/set_memory.h  |   9 
>  arch/x86/include/asm/cacheflush.h   |  96 +-
>  arch/x86/include/asm/set_memory.h   | 100 
> 
>  8 files changed, 171 insertions(+), 125 deletions(-)
>  create mode 100644 arch/arm/include/asm/set_memory.h
>  create mode 100644 arch/arm64/include/asm/set_memory.h
>  create mode 100644 arch/s390/include/asm/set_memory.h
>  create mode 100644 arch/x86/include/asm/set_memory.h

For s390:
Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [RFC][PATCH] treewide: Move set_memory_* functions away from cacheflush.h

2017-02-09 Thread Heiko Carstens

On Tue, Feb 07, 2017 at 01:10:16PM -0800, Laura Abbott wrote:
> The set_memory_* APIs came out of a desire to have a better way to
> change memory attributes. Many of these attributes were linked to cache
> functionality so the prototypes were put in cacheflush.h. These days,
> the APIs have grown and have a much wider use than just cache APIs. To
> support this growth, split off set_memory_* and friends into a separate
> header file to avoid growing cacheflush.h for APIs that have nothing to
> do with caches.
> 
> Signed-off-by: Laura Abbott 
> ---
> This came out of a comment Russell made while reviewing RODATA test cases
> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-January/480855.html
> While the final result of that series was the rodata code was refactored into
> its own header file, the set_memory_* APIs are still out of place.
> 
> This is a simple attempt at moving all the API stubs to their own file.
> Another idea I had was throwing set_memory_{x,nx,ro,rw} in an asm-generic
> file since those are commonly used for module setting across all arches.
> 
> This is an RFC to see if this is actually beneficial. The diffstat is not
> negative unfortunately due to header guards in newer files.
> I have patches to convert call sites over to set_memory.h instead of
> cacheflush.h if there is sufficient interest.
> ---
>  arch/arm/include/asm/cacheflush.h   |  21 +---
>  arch/arm/include/asm/set_memory.h   |  32 
>  arch/arm64/include/asm/cacheflush.h |   6 +--
>  arch/arm64/include/asm/set_memory.h |  26 ++
>  arch/s390/include/asm/cacheflush.h  |   6 +--
>  arch/s390/include/asm/set_memory.h  |   9 
>  arch/x86/include/asm/cacheflush.h   |  96 +-
>  arch/x86/include/asm/set_memory.h   | 100 
> 
>  8 files changed, 171 insertions(+), 125 deletions(-)
>  create mode 100644 arch/arm/include/asm/set_memory.h
>  create mode 100644 arch/arm64/include/asm/set_memory.h
>  create mode 100644 arch/s390/include/asm/set_memory.h
>  create mode 100644 arch/x86/include/asm/set_memory.h

For s390:
Acked-by: Heiko Carstens

Re: [PATCHv3 1/2] arch: Move CONFIG_DEBUG_RODATA and CONFIG_SET_MODULE_RONX to be common

2017-02-07 Thread Heiko Carstens

On Mon, Feb 06, 2017 at 04:31:57PM -0800, Laura Abbott wrote:
> 
> There are multiple architectures that support CONFIG_DEBUG_RODATA and
> CONFIG_SET_MODULE_RONX. These options also now have the ability to be
> turned off at runtime. Move these to an architecture independent
> location and make these options def_bool y for almost all of those
> arches.
> 
> Signed-off-by: Laura Abbott <labb...@redhat.com>
> ---
> v3: Make these configs selectable for arm. Include some documentation about
> how the setup of the optional Kconfigs work as well. Stop spelling 'kenrel'
> in prompt text.
> ---
>  Documentation/security/self-protection.txt |  6 ++
>  arch/Kconfig   | 34 
> ++
>  arch/arm/Kconfig   |  4 
>  arch/arm/Kconfig.debug | 11 --
>  arch/arm/mm/Kconfig| 12 ---
>  arch/arm64/Kconfig |  5 ++---
>  arch/arm64/Kconfig.debug   | 11 --
>  arch/parisc/Kconfig|  1 +
>  arch/parisc/Kconfig.debug  | 11 --
>  arch/s390/Kconfig  |  5 ++---
>  arch/s390/Kconfig.debug|  3 ---
>  arch/x86/Kconfig   |  5 ++---
>  arch/x86/Kconfig.debug | 11 --
>  13 files changed, 51 insertions(+), 68 deletions(-)

For the s390 bits:

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [PATCHv3 1/2] arch: Move CONFIG_DEBUG_RODATA and CONFIG_SET_MODULE_RONX to be common

2017-02-07 Thread Heiko Carstens

On Mon, Feb 06, 2017 at 04:31:57PM -0800, Laura Abbott wrote:
> 
> There are multiple architectures that support CONFIG_DEBUG_RODATA and
> CONFIG_SET_MODULE_RONX. These options also now have the ability to be
> turned off at runtime. Move these to an architecture independent
> location and make these options def_bool y for almost all of those
> arches.
> 
> Signed-off-by: Laura Abbott 
> ---
> v3: Make these configs selectable for arm. Include some documentation about
> how the setup of the optional Kconfigs work as well. Stop spelling 'kenrel'
> in prompt text.
> ---
>  Documentation/security/self-protection.txt |  6 ++
>  arch/Kconfig   | 34 
> ++
>  arch/arm/Kconfig   |  4 
>  arch/arm/Kconfig.debug | 11 --
>  arch/arm/mm/Kconfig| 12 ---
>  arch/arm64/Kconfig |  5 ++---
>  arch/arm64/Kconfig.debug   | 11 --
>  arch/parisc/Kconfig|  1 +
>  arch/parisc/Kconfig.debug  | 11 --
>  arch/s390/Kconfig  |  5 ++---
>  arch/s390/Kconfig.debug|  3 ---
>  arch/x86/Kconfig   |  5 ++---
>  arch/x86/Kconfig.debug | 11 --
>  13 files changed, 51 insertions(+), 68 deletions(-)

For the s390 bits:

Acked-by: Heiko Carstens

[PATCH 1/3] memblock: let memblock_type_name know about physmem type

2017-01-20 Thread Heiko Carstens

Since commit 70210ed950b5 ("mm/memblock: add physical memory list")
the memblock structure knows about a physical memory list.

memblock_type_name() should return "physmem" instead of "unknown" if
the name of the physmem memblock_type is being asked for.

Signed-off-by: Heiko Carstens <heiko.carst...@de.ibm.com>
---
 mm/memblock.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/mm/memblock.c b/mm/memblock.c
index 7608bc305936..acbfa1d2 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -72,6 +72,10 @@ memblock_type_name(struct memblock_type *type)
return "memory";
else if (type == )
return "reserved";
+#ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
+   else if (type == )
+   return "physmem";
+#endif
else
return "unknown";
 }
-- 
2.8.4

[PATCH 1/3] memblock: let memblock_type_name know about physmem type

2017-01-20 Thread Heiko Carstens

Since commit 70210ed950b5 ("mm/memblock: add physical memory list")
the memblock structure knows about a physical memory list.

memblock_type_name() should return "physmem" instead of "unknown" if
the name of the physmem memblock_type is being asked for.

Signed-off-by: Heiko Carstens 
---
 mm/memblock.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/mm/memblock.c b/mm/memblock.c
index 7608bc305936..acbfa1d2 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -72,6 +72,10 @@ memblock_type_name(struct memblock_type *type)
return "memory";
else if (type == )
return "reserved";
+#ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
+   else if (type == )
+   return "physmem";
+#endif
else
return "unknown";
 }
-- 
2.8.4

[PATCH 3/3] memblock: embed memblock type name within struct memblock_type

2017-01-20 Thread Heiko Carstens

Provide the name of each memblock type with struct memblock_type. This
allows to get rid of the function memblock_type_name() and duplicating
the type names in __memblock_dump_all().

The only memblock_type usage out of mm/memblock.c seems to be
arch/s390/kernel/crash_dump.c. While at it, give it a name.

Signed-off-by: Heiko Carstens <heiko.carst...@de.ibm.com>
---
 arch/s390/kernel/crash_dump.c |  1 +
 include/linux/memblock.h  |  1 +
 mm/memblock.c | 35 +++
 3 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c
index f9293bfefb7f..9c9440dc253a 100644
--- a/arch/s390/kernel/crash_dump.c
+++ b/arch/s390/kernel/crash_dump.c
@@ -31,6 +31,7 @@ static struct memblock_type oldmem_type = {
.max = 1,
.total_size = 0,
.regions = _region,
+   .name = "oldmem",
 };
 
 struct save_area {
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 5b759c9acf97..8dee5ec80adf 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -42,6 +42,7 @@ struct memblock_type {
unsigned long max;  /* size of the allocated array */
phys_addr_t total_size; /* size of all regions */
struct memblock_region *regions;
+   char *name;
 };
 
 struct memblock {
diff --git a/mm/memblock.c b/mm/memblock.c
index fbaaf713827c..82d21e598e8c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -35,15 +35,18 @@ struct memblock memblock __initdata_memblock = {
.memory.regions = memblock_memory_init_regions,
.memory.cnt = 1,/* empty dummy entry */
.memory.max = INIT_MEMBLOCK_REGIONS,
+   .memory.name= "memory",
 
.reserved.regions   = memblock_reserved_init_regions,
.reserved.cnt   = 1,/* empty dummy entry */
.reserved.max   = INIT_MEMBLOCK_REGIONS,
+   .reserved.name  = "reserved",
 
 #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
.physmem.regions= memblock_physmem_init_regions,
.physmem.cnt= 1,/* empty dummy entry */
.physmem.max= INIT_PHYSMEM_REGIONS,
+   .physmem.name   = "physmem",
 #endif
 
.bottom_up  = false,
@@ -64,22 +67,6 @@ ulong __init_memblock choose_memblock_flags(void)
return system_has_some_mirror ? MEMBLOCK_MIRROR : MEMBLOCK_NONE;
 }
 
-/* inline so we don't get a warning when pr_debug is compiled out */
-static __init_memblock const char *
-memblock_type_name(struct memblock_type *type)
-{
-   if (type == )
-   return "memory";
-   else if (type == )
-   return "reserved";
-#ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
-   else if (type == )
-   return "physmem";
-#endif
-   else
-   return "unknown";
-}
-
 /* adjust *@size so that (@base + *@size) doesn't overflow, return new size */
 static inline phys_addr_t memblock_cap_size(phys_addr_t base, phys_addr_t 
*size)
 {
@@ -406,12 +393,12 @@ static int __init_memblock memblock_double_array(struct 
memblock_type *type,
}
if (!addr) {
pr_err("memblock: Failed to double %s array from %ld to %ld 
entries !\n",
-  memblock_type_name(type), type->max, type->max * 2);
+  type->name, type->max, type->max * 2);
return -1;
}
 
memblock_dbg("memblock: %s is doubled to %ld at [%#010llx-%#010llx]",
-   memblock_type_name(type), type->max * 2, (u64)addr,
+   type->name, type->max * 2, (u64)addr,
(u64)addr + new_size - 1);
 
/*
@@ -1675,14 +1662,14 @@ phys_addr_t __init_memblock 
memblock_get_current_limit(void)
return memblock.current_limit;
 }
 
-static void __init_memblock memblock_dump(struct memblock_type *type, char 
*name)
+static void __init_memblock memblock_dump(struct memblock_type *type)
 {
unsigned long long base, size;
unsigned long flags;
int idx;
struct memblock_region *rgn;
 
-   pr_info(" %s.cnt  = 0x%lx\n", name, type->cnt);
+   pr_info(" %s.cnt  = 0x%lx\n", type->name, type->cnt);
 
for_each_memblock_type(type, rgn) {
char nid_buf[32] = "";
@@ -1696,7 +1683,7 @@ static void __init_memblock memblock_dump(struct 
memblock_type *type, char *name
 memblock_get_region_node(rgn));
 #endif
pr_info(" %s[%#x]\t[%#016llx-%#016llx], %#llx bytes%s flags: 
%#lx\n",
-   name, idx, base, base + size - 1, size, nid_buf, flags);
+   type->name, idx, base, base + size - 1, size, nid_buf, 
flags);

[PATCH 2/3] memblock: also dump physmem list within __memblock_dump_all

2017-01-20 Thread Heiko Carstens

Since commit 70210ed950b5 ("mm/memblock: add physical memory list")
the memblock structure knows about a physical memory list.

The physical memory list should also be dumped if memblock_dump_all()
is called in case memblock_debug is switched on. This makes debugging
a bit easier.

Signed-off-by: Heiko Carstens <heiko.carst...@de.ibm.com>
---
 mm/memblock.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/memblock.c b/mm/memblock.c
index acbfa1d2..fbaaf713827c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1709,6 +1709,9 @@ void __init_memblock __memblock_dump_all(void)
 
memblock_dump(, "memory");
memblock_dump(, "reserved");
+#ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
+   memblock_dump(, "physmem");
+#endif
 }
 
 void __init memblock_allow_resize(void)
-- 
2.8.4

[PATCH 0/3] memblock: physical memory list cleanups

2017-01-20 Thread Heiko Carstens

Just a couple of trivial memblock patches, which could also be merged
into one patch; whatever is preferred.

Since commit 70210ed950b5 ("mm/memblock: add physical memory list")
the memblock structure knows about a physical memory list.

The memblock code should also print a sane name instead of "unknown"
if it calls memblock_type_name() to get a name for the physmem
memblock type.
In addition the physmem list should also be dumped, if present, and
memblock_dump_all is called to improve debugability.

The last patch embeds the memblock type name into the memblock type
structure in order to hopefully make the code a bit more easier and to
get rid of a bit of code duplication.

Thanks,
Heiko

Heiko Carstens (3):
  memblock: let memblock_type_name know about physmem type
  memblock: also dump physmem list within __memblock_dump_all
  memblock: embed memblock type name within struct memblock_type

 arch/s390/kernel/crash_dump.c |  1 +
 include/linux/memblock.h  |  1 +
 mm/memblock.c | 32 +---
 3 files changed, 15 insertions(+), 19 deletions(-)

-- 
2.8.4

[PATCH 2/3] memblock: also dump physmem list within __memblock_dump_all

2017-01-20 Thread Heiko Carstens

Since commit 70210ed950b5 ("mm/memblock: add physical memory list")
the memblock structure knows about a physical memory list.

The physical memory list should also be dumped if memblock_dump_all()
is called in case memblock_debug is switched on. This makes debugging
a bit easier.

Signed-off-by: Heiko Carstens 
---
 mm/memblock.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/memblock.c b/mm/memblock.c
index acbfa1d2..fbaaf713827c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1709,6 +1709,9 @@ void __init_memblock __memblock_dump_all(void)
 
memblock_dump(, "memory");
memblock_dump(, "reserved");
+#ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
+   memblock_dump(, "physmem");
+#endif
 }
 
 void __init memblock_allow_resize(void)
-- 
2.8.4

[PATCH 0/3] memblock: physical memory list cleanups

2017-01-20 Thread Heiko Carstens

Just a couple of trivial memblock patches, which could also be merged
into one patch; whatever is preferred.

Since commit 70210ed950b5 ("mm/memblock: add physical memory list")
the memblock structure knows about a physical memory list.

The memblock code should also print a sane name instead of "unknown"
if it calls memblock_type_name() to get a name for the physmem
memblock type.
In addition the physmem list should also be dumped, if present, and
memblock_dump_all is called to improve debugability.

The last patch embeds the memblock type name into the memblock type
structure in order to hopefully make the code a bit more easier and to
get rid of a bit of code duplication.

Thanks,
Heiko

Heiko Carstens (3):
  memblock: let memblock_type_name know about physmem type
  memblock: also dump physmem list within __memblock_dump_all
  memblock: embed memblock type name within struct memblock_type

 arch/s390/kernel/crash_dump.c |  1 +
 include/linux/memblock.h  |  1 +
 mm/memblock.c | 32 +---
 3 files changed, 15 insertions(+), 19 deletions(-)

-- 
2.8.4

[PATCH 3/3] memblock: embed memblock type name within struct memblock_type

2017-01-20 Thread Heiko Carstens

Provide the name of each memblock type with struct memblock_type. This
allows to get rid of the function memblock_type_name() and duplicating
the type names in __memblock_dump_all().

The only memblock_type usage out of mm/memblock.c seems to be
arch/s390/kernel/crash_dump.c. While at it, give it a name.

Signed-off-by: Heiko Carstens 
---
 arch/s390/kernel/crash_dump.c |  1 +
 include/linux/memblock.h  |  1 +
 mm/memblock.c | 35 +++
 3 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c
index f9293bfefb7f..9c9440dc253a 100644
--- a/arch/s390/kernel/crash_dump.c
+++ b/arch/s390/kernel/crash_dump.c
@@ -31,6 +31,7 @@ static struct memblock_type oldmem_type = {
.max = 1,
.total_size = 0,
.regions = _region,
+   .name = "oldmem",
 };
 
 struct save_area {
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 5b759c9acf97..8dee5ec80adf 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -42,6 +42,7 @@ struct memblock_type {
unsigned long max;  /* size of the allocated array */
phys_addr_t total_size; /* size of all regions */
struct memblock_region *regions;
+   char *name;
 };
 
 struct memblock {
diff --git a/mm/memblock.c b/mm/memblock.c
index fbaaf713827c..82d21e598e8c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -35,15 +35,18 @@ struct memblock memblock __initdata_memblock = {
.memory.regions = memblock_memory_init_regions,
.memory.cnt = 1,/* empty dummy entry */
.memory.max = INIT_MEMBLOCK_REGIONS,
+   .memory.name= "memory",
 
.reserved.regions   = memblock_reserved_init_regions,
.reserved.cnt   = 1,/* empty dummy entry */
.reserved.max   = INIT_MEMBLOCK_REGIONS,
+   .reserved.name  = "reserved",
 
 #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
.physmem.regions= memblock_physmem_init_regions,
.physmem.cnt= 1,/* empty dummy entry */
.physmem.max= INIT_PHYSMEM_REGIONS,
+   .physmem.name   = "physmem",
 #endif
 
.bottom_up  = false,
@@ -64,22 +67,6 @@ ulong __init_memblock choose_memblock_flags(void)
return system_has_some_mirror ? MEMBLOCK_MIRROR : MEMBLOCK_NONE;
 }
 
-/* inline so we don't get a warning when pr_debug is compiled out */
-static __init_memblock const char *
-memblock_type_name(struct memblock_type *type)
-{
-   if (type == )
-   return "memory";
-   else if (type == )
-   return "reserved";
-#ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
-   else if (type == )
-   return "physmem";
-#endif
-   else
-   return "unknown";
-}
-
 /* adjust *@size so that (@base + *@size) doesn't overflow, return new size */
 static inline phys_addr_t memblock_cap_size(phys_addr_t base, phys_addr_t 
*size)
 {
@@ -406,12 +393,12 @@ static int __init_memblock memblock_double_array(struct 
memblock_type *type,
}
if (!addr) {
pr_err("memblock: Failed to double %s array from %ld to %ld 
entries !\n",
-  memblock_type_name(type), type->max, type->max * 2);
+  type->name, type->max, type->max * 2);
return -1;
}
 
memblock_dbg("memblock: %s is doubled to %ld at [%#010llx-%#010llx]",
-   memblock_type_name(type), type->max * 2, (u64)addr,
+   type->name, type->max * 2, (u64)addr,
(u64)addr + new_size - 1);
 
/*
@@ -1675,14 +1662,14 @@ phys_addr_t __init_memblock 
memblock_get_current_limit(void)
return memblock.current_limit;
 }
 
-static void __init_memblock memblock_dump(struct memblock_type *type, char 
*name)
+static void __init_memblock memblock_dump(struct memblock_type *type)
 {
unsigned long long base, size;
unsigned long flags;
int idx;
struct memblock_region *rgn;
 
-   pr_info(" %s.cnt  = 0x%lx\n", name, type->cnt);
+   pr_info(" %s.cnt  = 0x%lx\n", type->name, type->cnt);
 
for_each_memblock_type(type, rgn) {
char nid_buf[32] = "";
@@ -1696,7 +1683,7 @@ static void __init_memblock memblock_dump(struct 
memblock_type *type, char *name
 memblock_get_region_node(rgn));
 #endif
pr_info(" %s[%#x]\t[%#016llx-%#016llx], %#llx bytes%s flags: 
%#lx\n",
-   name, idx, base, base + size - 1, size, nid_buf, flags);
+   type->name, idx, base, base + size - 1, size, nid_buf, 
flags);
}
 }
 
@@ -1707,10 +1694,10 @@

Re: [PATCH 1/2] security: Change name of CONFIG_DEBUG_RODATA

2017-01-19 Thread Heiko Carstens

On Thu, Jan 19, 2017 at 10:56:46AM +, Mark Rutland wrote:
> > +config HARDENED_PAGE_MAPPINGS
> > +   bool "Mark kernel mappings with stricter permissions (RO/W^X)"
> > +   default y
> > +   depends on ARCH_HAS_HARDENED_MAPPINGS
> > +   help
> > +  If this is set, kernel text and rodata memory will be made 
> > read-only,
> > + and non-text memory will be made non-executable. This provides
> > + protection against certain security attacks (e.g. executing the heap
> > + or modifying text).
> > +
> > + Unless your system has known restrictions or performance issues, it
> > + is recommended to say Y here.
> 
> It's somewhat unfortunate that this means the feature is no longer
> mandatory on arm64 (and s390+x86). We have a boot-time switch to turn
> the protections off, so I was hoping we could make this mandatory on all
> architectures with support.
> 
> It would be good to see if we could make this mandatory for arm and
> parisc, or if it really needs to be optional for either of those.

Looks like the config option is a no-op on parisc just like it is on
s390. Irrelavant of the config option at least on s390 the page tables for
kernel text and rodata will be read-only anyway.

This works since ages and I don't see a reason why this should be
changed. Also trying to disable this with the "rodata=" command line option
does not work at least on s390, and I guess this is true for parisc as
well.

The only thing implemented with CONFIG_DEBUG_RODATA on both architectures
is to emit a message that states memory has been protected
(mark_rodata_ro).
This just avoids a wrong "Kernel memory protection disabled." message.

So yes, I'd really like to keep this option mandatory.

Re: [PATCH 1/2] security: Change name of CONFIG_DEBUG_RODATA

2017-01-19 Thread Heiko Carstens

On Thu, Jan 19, 2017 at 10:56:46AM +, Mark Rutland wrote:
> > +config HARDENED_PAGE_MAPPINGS
> > +   bool "Mark kernel mappings with stricter permissions (RO/W^X)"
> > +   default y
> > +   depends on ARCH_HAS_HARDENED_MAPPINGS
> > +   help
> > +  If this is set, kernel text and rodata memory will be made 
> > read-only,
> > + and non-text memory will be made non-executable. This provides
> > + protection against certain security attacks (e.g. executing the heap
> > + or modifying text).
> > +
> > + Unless your system has known restrictions or performance issues, it
> > + is recommended to say Y here.
> 
> It's somewhat unfortunate that this means the feature is no longer
> mandatory on arm64 (and s390+x86). We have a boot-time switch to turn
> the protections off, so I was hoping we could make this mandatory on all
> architectures with support.
> 
> It would be good to see if we could make this mandatory for arm and
> parisc, or if it really needs to be optional for either of those.

Looks like the config option is a no-op on parisc just like it is on
s390. Irrelavant of the config option at least on s390 the page tables for
kernel text and rodata will be read-only anyway.

This works since ages and I don't see a reason why this should be
changed. Also trying to disable this with the "rodata=" command line option
does not work at least on s390, and I guess this is true for parisc as
well.

The only thing implemented with CONFIG_DEBUG_RODATA on both architectures
is to emit a message that states memory has been protected
(mark_rodata_ro).
This just avoids a wrong "Kernel memory protection disabled." message.

So yes, I'd really like to keep this option mandatory.

Re: [PATCH 2/2] security: Change name of CONFIG_DEBUG_SET_MODULE_RONX

2017-01-19 Thread Heiko Carstens

On Thu, Jan 19, 2017 at 11:11:18AM +, Mark Rutland wrote:
> > +config HARDENED_MODULE_MAPPINGS
> > +   bool "Mark module mappings with stricter permissions (RO/W^X)"
> > +   default y
> > +   depends on ARCH_HAS_HARDENED_MODULE_MAPPINGS
> > +   help
> > + If this is set, module text and rodata memory will be made read-only,
> > + and non-text memory will be made non-executable. This provides
> > + protection against certain security vulnerabilities (e.g. modifying
> > + code)
> > +
> > + Unless your system has known restrictions or performance issues, it
> > + is recommended to say Y here.
> > +
> 
> I was hoping that we'd make this mandatory, as we'd already done for
> DEBUG_RODATA.

Same for s390: would be good to make this mandatory.

Re: [PATCH 2/2] security: Change name of CONFIG_DEBUG_SET_MODULE_RONX

2017-01-19 Thread Heiko Carstens

On Thu, Jan 19, 2017 at 11:11:18AM +, Mark Rutland wrote:
> > +config HARDENED_MODULE_MAPPINGS
> > +   bool "Mark module mappings with stricter permissions (RO/W^X)"
> > +   default y
> > +   depends on ARCH_HAS_HARDENED_MODULE_MAPPINGS
> > +   help
> > + If this is set, module text and rodata memory will be made read-only,
> > + and non-text memory will be made non-executable. This provides
> > + protection against certain security vulnerabilities (e.g. modifying
> > + code)
> > +
> > + Unless your system has known restrictions or performance issues, it
> > + is recommended to say Y here.
> > +
> 
> I was hoping that we'd make this mandatory, as we'd already done for
> DEBUG_RODATA.

Same for s390: would be good to make this mandatory.

Re: [RFC 1/4] mm: remove unused TASK_SIZE_OF()

2017-01-01 Thread Heiko Carstens

On Fri, Dec 30, 2016 at 06:56:31PM +0300, Dmitry Safonov wrote:
> All users of TASK_SIZE_OF(tsk) have migrated to mm->task_size or
> TASK_SIZE_MAX since:
> commit d696ca016d57 ("x86/fsgsbase/64: Use TASK_SIZE_MAX for
> FSBASE/GSBASE upper limits"),
> commit a06db751c321 ("pagemap: check permissions and capabilities at
> open time"),
> 
> Signed-off-by: Dmitry Safonov <dsafo...@virtuozzo.com>
> ---
...

> diff --git a/arch/s390/include/asm/processor.h 
> b/arch/s390/include/asm/processor.h
> index 6bca916a5ba0..c53e8e2a51ac 100644
> --- a/arch/s390/include/asm/processor.h
> +++ b/arch/s390/include/asm/processor.h
> @@ -89,10 +89,9 @@ extern void execve_tail(void);
>   * User space process size: 2GB for 31 bit, 4TB or 8PT for 64 bit.
>   */
> 
> -#define TASK_SIZE_OF(tsk)((tsk)->mm->context.asce_limit)
>  #define TASK_UNMAPPED_BASE   (test_thread_flag(TIF_31BIT) ? \
>   (1UL << 30) : (1UL << 41))
> -#define TASK_SIZETASK_SIZE_OF(current)
> +#define TASK_SIZE(current->mm->context.asce_limit)
>  #define TASK_MAX_SIZE(1UL << 53)
> 
>  #define STACK_TOP(1UL << (test_thread_flag(TIF_31BIT) ? 31:42))

FWIW, for the s390 part:

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [RFC 1/4] mm: remove unused TASK_SIZE_OF()

2017-01-01 Thread Heiko Carstens

On Fri, Dec 30, 2016 at 06:56:31PM +0300, Dmitry Safonov wrote:
> All users of TASK_SIZE_OF(tsk) have migrated to mm->task_size or
> TASK_SIZE_MAX since:
> commit d696ca016d57 ("x86/fsgsbase/64: Use TASK_SIZE_MAX for
> FSBASE/GSBASE upper limits"),
> commit a06db751c321 ("pagemap: check permissions and capabilities at
> open time"),
> 
> Signed-off-by: Dmitry Safonov 
> ---
...

> diff --git a/arch/s390/include/asm/processor.h 
> b/arch/s390/include/asm/processor.h
> index 6bca916a5ba0..c53e8e2a51ac 100644
> --- a/arch/s390/include/asm/processor.h
> +++ b/arch/s390/include/asm/processor.h
> @@ -89,10 +89,9 @@ extern void execve_tail(void);
>   * User space process size: 2GB for 31 bit, 4TB or 8PT for 64 bit.
>   */
> 
> -#define TASK_SIZE_OF(tsk)((tsk)->mm->context.asce_limit)
>  #define TASK_UNMAPPED_BASE   (test_thread_flag(TIF_31BIT) ? \
>   (1UL << 30) : (1UL << 41))
> -#define TASK_SIZETASK_SIZE_OF(current)
> +#define TASK_SIZE(current->mm->context.asce_limit)
>  #define TASK_MAX_SIZE(1UL << 53)
> 
>  #define STACK_TOP(1UL << (test_thread_flag(TIF_31BIT) ? 31:42))

FWIW, for the s390 part:

Acked-by: Heiko Carstens

Re: [RFC] minimum gcc version for kernel: raise to gcc-4.3 or 4.6?

2016-12-20 Thread Heiko Carstens

On Fri, Dec 16, 2016 at 11:00:27PM +0100, Arnd Bergmann wrote:
> On Friday, December 16, 2016 6:00:43 PM CET Sebastian Andrzej Siewior wrote:
> > On 2016-12-16 11:56:21 [+0100], Arnd Bergmann wrote:
> > > The original gcc-4.3 release was in early 2008. If we decide to still
> > > support that, we probably want the first 10 quirks in this series,
> > > while gcc-4.6 (released in 2011) requires none of them.
> > 
> > It this min gcc thingy ARM only?
> 
> This is part of the question that I'm trying to figure out myself.
> 
> Clearly having the same minimum version across all architectures simplifies
> things a lot, because many of the bugs in old versions are architecture
> independent. Then again, some architectures implicitly require a new version
> because an old one never existed (e.g. arm64 or risc-v), while some other
> architectures may require an old version.

FWIW, s390 requires gcc 4.3 or newer since two years already. For older
compilers we enforce a compile error (see arch/s390/kernel/asm-offsets.c).

Re: [RFC] minimum gcc version for kernel: raise to gcc-4.3 or 4.6?

2016-12-20 Thread Heiko Carstens

On Fri, Dec 16, 2016 at 11:00:27PM +0100, Arnd Bergmann wrote:
> On Friday, December 16, 2016 6:00:43 PM CET Sebastian Andrzej Siewior wrote:
> > On 2016-12-16 11:56:21 [+0100], Arnd Bergmann wrote:
> > > The original gcc-4.3 release was in early 2008. If we decide to still
> > > support that, we probably want the first 10 quirks in this series,
> > > while gcc-4.6 (released in 2011) requires none of them.
> > 
> > It this min gcc thingy ARM only?
> 
> This is part of the question that I'm trying to figure out myself.
> 
> Clearly having the same minimum version across all architectures simplifies
> things a lot, because many of the bugs in old versions are architecture
> independent. Then again, some architectures implicitly require a new version
> because an old one never existed (e.g. arm64 or risc-v), while some other
> architectures may require an old version.

FWIW, s390 requires gcc 4.3 or newer since two years already. For older
compilers we enforce a compile error (see arch/s390/kernel/asm-offsets.c).

Re: [PATCH 1/2] s390/zcore: Remove unneeded linux/miscdevice.h include

2016-12-16 Thread Heiko Carstens

On Thu, Dec 15, 2016 at 03:18:23PM +0100, Corentin Labbe wrote:
> drivers/s390/char/zcore.c does not contain any miscdevice so the
> inclusion of linux/miscdevice.h is uncessary.
> 
> Signed-off-by: Corentin Labbe 
> ---
>  drivers/s390/char/zcore.c | 1 -
>  1 file changed, 1 deletion(-)

Applied, thanks.

Re: [PATCH 1/2] s390/zcore: Remove unneeded linux/miscdevice.h include

2016-12-16 Thread Heiko Carstens

On Thu, Dec 15, 2016 at 03:18:23PM +0100, Corentin Labbe wrote:
> drivers/s390/char/zcore.c does not contain any miscdevice so the
> inclusion of linux/miscdevice.h is uncessary.
> 
> Signed-off-by: Corentin Labbe 
> ---
>  drivers/s390/char/zcore.c | 1 -
>  1 file changed, 1 deletion(-)

Applied, thanks.

Re: linux-next: Tree for Dec 14

2016-12-14 Thread Heiko Carstens

On Wed, Dec 14, 2016 at 11:25:49AM -0500, Paul Gortmaker wrote:
> On Tue, Dec 13, 2016 at 11:22 PM, Stephen Rothwell <s...@canb.auug.org.au> 
> wrote:
> > Hi all,
> >
> > Please do not add any material for v4.11 to your linux-next included
> > branches until after v4.10-rc1 has been released.
> >
> > Changes since 20161213:
> >
> > The vfs-miklos tree gained a conflict against the ubifs tree.
> >
> > The akpm-current tree gained a conflict against the jc_docs tree.
> >
> > Non-merge commits (relative to Linus' tree): 4498
> >  4320 files changed, 189429 insertions(+), 94239 deletions(-)
> 
> New s390 build failure today:
> 
> http://kisskb.ellerman.id.au/kisskb/buildresult/12883922/
> 
> I reproduced it locally using the older korg s390 toolchain.
> 
> Reverting this and the file compiles fine:
> 
> commit d543a106f96d6f15e4507cf349128912d44356d9
> Author: Heiko Carstens <heiko.carst...@de.ibm.com>
> Date:   Tue Dec 6 15:52:10 2016 +0100
> 
> s390: fix initrd corruptions with gcov/kcov instrumented kernels
> 
> Appears old toolchains don't like this commit.
> 
> Paul.

Yes, a fix is ready to be merged soon:

https://git.kernel.org/cgit/linux/kernel/git/s390/linux.git/commit/?h=features=75a357341e7c9d3893405ea6f9d722036012dd1f

Thanks,
Heiko

Re: linux-next: Tree for Dec 14

2016-12-14 Thread Heiko Carstens

On Wed, Dec 14, 2016 at 11:25:49AM -0500, Paul Gortmaker wrote:
> On Tue, Dec 13, 2016 at 11:22 PM, Stephen Rothwell  
> wrote:
> > Hi all,
> >
> > Please do not add any material for v4.11 to your linux-next included
> > branches until after v4.10-rc1 has been released.
> >
> > Changes since 20161213:
> >
> > The vfs-miklos tree gained a conflict against the ubifs tree.
> >
> > The akpm-current tree gained a conflict against the jc_docs tree.
> >
> > Non-merge commits (relative to Linus' tree): 4498
> >  4320 files changed, 189429 insertions(+), 94239 deletions(-)
> 
> New s390 build failure today:
> 
> http://kisskb.ellerman.id.au/kisskb/buildresult/12883922/
> 
> I reproduced it locally using the older korg s390 toolchain.
> 
> Reverting this and the file compiles fine:
> 
> commit d543a106f96d6f15e4507cf349128912d44356d9
> Author: Heiko Carstens 
> Date:   Tue Dec 6 15:52:10 2016 +0100
> 
> s390: fix initrd corruptions with gcov/kcov instrumented kernels
> 
> Appears old toolchains don't like this commit.
> 
> Paul.

Yes, a fix is ready to be merged soon:

https://git.kernel.org/cgit/linux/kernel/git/s390/linux.git/commit/?h=features=75a357341e7c9d3893405ea6f9d722036012dd1f

Thanks,
Heiko

Re: linux-next: manual merge of the kvms390 tree with the s390 tree

2016-11-23 Thread Heiko Carstens

On Wed, Nov 23, 2016 at 09:24:17AM +0100, Christian Borntraeger wrote:
> On 11/23/2016 04:45 AM, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Today's linux-next merge of the kvms390 tree got a conflict in:
> > 
> >   arch/s390/include/asm/facilities_src.h
> > 
> > between commit:
> > 
> >   d1f7e8f85b51 ("s390: squash facilities_src.h into gen_facilities.c")
> > 
> > from the s390 tree and commits:
> > 
> >   09ff894457e9 ("KVM: s390: gaccess: add ESOP2 handling")
> >   88abf0b54f51 ("KVM: s390: instruction-execution-protection support")
> > 
> > from the kvms390 tree.
> > 
> > I fixed it up (I deleted the file and then added the following merge fix
> > patch) and can carry the fix as necessary. This is now fixed as far as
> > linux-next is concerned, but any non trivial conflicts should be mentioned
> > to your upstream maintainer when your tree is submitted for merging.
> > You may also want to consider cooperating with the maintainer of the
> > conflicting tree to minimise any particularly complex conflicts.
> 
> Thanks, looks fine.
> 
> Hmmm, Martin, Heiko, KVM is going to touch that file from time to time.
> Can we either have 
> a: a topic branch for this tool (like I did for the 4.9 merge window)
> with just the patches for this tool. This topic branch can then be merged
> into kvm/next and s390/features.

That's up to Martin.

> b: split out the kvm defines into arch/s390/tools/kvm_facilities.h
> b would like something like this (cut/paste so whitespace damaged)
> 
> From: Christian Borntraeger 
> Date: Wed, 23 Nov 2016 09:18:42 +0100
> Subject: [PATCH 1/1] s390:gen_facilites: Move kvm facilities into a separate
>  file
> 
> Signed-off-by: Christian Borntraeger 
> ---
>  MAINTAINERS  |  1 +
>  arch/s390/tools/gen_facilities.c | 25 +
>  arch/s390/tools/kvm_facilities.c | 26 ++
>  3 files changed, 28 insertions(+), 24 deletions(-)
>  create mode 100644 arch/s390/tools/kvm_facilities.c

No, please don't split this file again. Only kvm will ever touch the kvm
facilities, so we won't any conflicts here.
This time it's only because the file moved, otherwise there wouldn't be any
problem at all.

Re: linux-next: manual merge of the kvms390 tree with the s390 tree

2016-11-23 Thread Heiko Carstens

On Wed, Nov 23, 2016 at 09:24:17AM +0100, Christian Borntraeger wrote:
> On 11/23/2016 04:45 AM, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Today's linux-next merge of the kvms390 tree got a conflict in:
> > 
> >   arch/s390/include/asm/facilities_src.h
> > 
> > between commit:
> > 
> >   d1f7e8f85b51 ("s390: squash facilities_src.h into gen_facilities.c")
> > 
> > from the s390 tree and commits:
> > 
> >   09ff894457e9 ("KVM: s390: gaccess: add ESOP2 handling")
> >   88abf0b54f51 ("KVM: s390: instruction-execution-protection support")
> > 
> > from the kvms390 tree.
> > 
> > I fixed it up (I deleted the file and then added the following merge fix
> > patch) and can carry the fix as necessary. This is now fixed as far as
> > linux-next is concerned, but any non trivial conflicts should be mentioned
> > to your upstream maintainer when your tree is submitted for merging.
> > You may also want to consider cooperating with the maintainer of the
> > conflicting tree to minimise any particularly complex conflicts.
> 
> Thanks, looks fine.
> 
> Hmmm, Martin, Heiko, KVM is going to touch that file from time to time.
> Can we either have 
> a: a topic branch for this tool (like I did for the 4.9 merge window)
> with just the patches for this tool. This topic branch can then be merged
> into kvm/next and s390/features.

That's up to Martin.

> b: split out the kvm defines into arch/s390/tools/kvm_facilities.h
> b would like something like this (cut/paste so whitespace damaged)
> 
> From: Christian Borntraeger 
> Date: Wed, 23 Nov 2016 09:18:42 +0100
> Subject: [PATCH 1/1] s390:gen_facilites: Move kvm facilities into a separate
>  file
> 
> Signed-off-by: Christian Borntraeger 
> ---
>  MAINTAINERS  |  1 +
>  arch/s390/tools/gen_facilities.c | 25 +
>  arch/s390/tools/kvm_facilities.c | 26 ++
>  3 files changed, 28 insertions(+), 24 deletions(-)
>  create mode 100644 arch/s390/tools/kvm_facilities.c

No, please don't split this file again. Only kvm will ever touch the kvm
facilities, so we won't any conflicts here.
This time it's only because the file moved, otherwise there wouldn't be any
problem at all.

Re: [PATCH v2] tile: handle __ro_after_init like parisc does

2016-11-14 Thread Heiko Carstens

On Mon, Nov 14, 2016 at 01:12:05PM -0800, Kees Cook wrote:
> At some point here, I want to collect all the arch maintainers and
> discuss the options for correctly reflecting the three data
> memory-protection needs we have:
> 
> - always read-only
> - read-only after init
> - read-only except during rare updates
> 
> (The latter one doesn't exist all yet...)
> 
> x86, arm, and arm64 use mark_rodata_ro() after init finishes, so they
> don't technically implement "always read-only". parisc, tile, powerpc,
> others have "always read-only", but disable read-only-after-init since
> they don't use mark_rodata_ro(). I think s390 has recently implemented
> both, but I have to double-check...

Yes, s390 has both: an early always read-only support, which is effective
as soon as paging_init() has set up and enabled page tables.
Our mark_rodata_ro() implementation only makes the ro_after_init section
read-only.

Re: [PATCH v2] tile: handle __ro_after_init like parisc does

2016-11-14 Thread Heiko Carstens

On Mon, Nov 14, 2016 at 01:12:05PM -0800, Kees Cook wrote:
> At some point here, I want to collect all the arch maintainers and
> discuss the options for correctly reflecting the three data
> memory-protection needs we have:
> 
> - always read-only
> - read-only after init
> - read-only except during rare updates
> 
> (The latter one doesn't exist all yet...)
> 
> x86, arm, and arm64 use mark_rodata_ro() after init finishes, so they
> don't technically implement "always read-only". parisc, tile, powerpc,
> others have "always read-only", but disable read-only-after-init since
> they don't use mark_rodata_ro(). I think s390 has recently implemented
> both, but I have to double-check...

Yes, s390 has both: an early always read-only support, which is effective
as soon as paging_init() has set up and enabled page tables.
Our mark_rodata_ro() implementation only makes the ro_after_init section
read-only.

[PATCH] mm/pkeys: generate pkey system call code only if ARCH_HAS_PKEYS is selected

2016-11-14 Thread Heiko Carstens

Having code for the pkey_mprotect, pkey_alloc and pkey_free system
calls makes only sense if ARCH_HAS_PKEYS is selected. If not selected
these system calls will always return -ENOSPC or -EINVAL.

To simplify things and have less code generate the pkey system call
code only if ARCH_HAS_PKEYS is selected.

For architectures which have already wired up the system calls, but do
not select ARCH_HAS_PKEYS this will result in less generated code and
a different return code: the three system calls will now always return
-ENOSYS, using the cond_syscall mechanism.

For architectures which have not wired up the system calls less
unreachable code will be generated.

Signed-off-by: Heiko Carstens <heiko.carst...@de.ibm.com>
---
 mm/mprotect.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 11936526b08b..a06e91c4de29 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -484,6 +484,8 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
return do_mprotect_pkey(start, len, prot, -1);
 }
 
+#ifdef CONFIG_ARCH_HAS_PKEYS
+
 SYSCALL_DEFINE4(pkey_mprotect, unsigned long, start, size_t, len,
unsigned long, prot, int, pkey)
 {
@@ -534,3 +536,5 @@ SYSCALL_DEFINE1(pkey_free, int, pkey)
 */
return ret;
 }
+
+#endif /* CONFIG_ARCH_HAS_PKEYS */
-- 
2.8.4

[PATCH] mm/pkeys: generate pkey system call code only if ARCH_HAS_PKEYS is selected

2016-11-14 Thread Heiko Carstens

Having code for the pkey_mprotect, pkey_alloc and pkey_free system
calls makes only sense if ARCH_HAS_PKEYS is selected. If not selected
these system calls will always return -ENOSPC or -EINVAL.

To simplify things and have less code generate the pkey system call
code only if ARCH_HAS_PKEYS is selected.

For architectures which have already wired up the system calls, but do
not select ARCH_HAS_PKEYS this will result in less generated code and
a different return code: the three system calls will now always return
-ENOSYS, using the cond_syscall mechanism.

For architectures which have not wired up the system calls less
unreachable code will be generated.

Signed-off-by: Heiko Carstens 
---
 mm/mprotect.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 11936526b08b..a06e91c4de29 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -484,6 +484,8 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
return do_mprotect_pkey(start, len, prot, -1);
 }
 
+#ifdef CONFIG_ARCH_HAS_PKEYS
+
 SYSCALL_DEFINE4(pkey_mprotect, unsigned long, start, size_t, len,
unsigned long, prot, int, pkey)
 {
@@ -534,3 +536,5 @@ SYSCALL_DEFINE1(pkey_free, int, pkey)
 */
return ret;
 }
+
+#endif /* CONFIG_ARCH_HAS_PKEYS */
-- 
2.8.4

Re: [PATCHv2 0/2] THREAD_INFO_IN_TASK prep work for arm64+s390

2016-11-11 Thread Heiko Carstens

Hi Mark,

> On Wed, Nov 02, 2016 at 03:56:26PM +, Mark Rutland wrote:
> > On Wed, Oct 26, 2016 at 06:43:05PM +0100, Mark Rutland wrote:
> > > Heiko and I have been working on THREAD_INFO_IN_TASK for s390 and arm64
> > > respectively, and we're both targetting v4.10.
> > > 
> > > These are the common core changes which we both require. I've put 
> > > together a
> > > branch [1,2] based on v4.9-rc2. I intend to tag this at some point next 
> > > week so
> > > that Heiko and I have a stable base.
> > 
> > As a heads-up, I've collated the tags I've received in the last few days, 
> > along
> > with the comment Andy requested in the v1 thread, and tagged this as
> > core-ti-stack-split [1].
> > 
> > Thanks,
> > Mark.
> > 
> > [1] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git 
> > core-ti-stack-split
> 
> I believe that Catalin's hoping to take my arm64 patches for v4.10,
> using the above tag as a base.
> 
> What's the plan for s390? Are you happy to do the same?

We intend to merge the s390 variant also for v4.10. Our approach has
changed however: we came to the conclusion that having a per-cpu preempt
count will generate better code on s390; so Martin implemented that.

And while being at it he also moved the rest of the s390 specific stuff out
of our struct thread_info. The corresponding patches will be pushed to

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features

branch soon.

In result we will have a struct thread_info that only contains the flags
field like x86 does.
So we don't depend an any other patches anymore...

Re: [PATCHv2 0/2] THREAD_INFO_IN_TASK prep work for arm64+s390

2016-11-11 Thread Heiko Carstens

Hi Mark,

> On Wed, Nov 02, 2016 at 03:56:26PM +, Mark Rutland wrote:
> > On Wed, Oct 26, 2016 at 06:43:05PM +0100, Mark Rutland wrote:
> > > Heiko and I have been working on THREAD_INFO_IN_TASK for s390 and arm64
> > > respectively, and we're both targetting v4.10.
> > > 
> > > These are the common core changes which we both require. I've put 
> > > together a
> > > branch [1,2] based on v4.9-rc2. I intend to tag this at some point next 
> > > week so
> > > that Heiko and I have a stable base.
> > 
> > As a heads-up, I've collated the tags I've received in the last few days, 
> > along
> > with the comment Andy requested in the v1 thread, and tagged this as
> > core-ti-stack-split [1].
> > 
> > Thanks,
> > Mark.
> > 
> > [1] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git 
> > core-ti-stack-split
> 
> I believe that Catalin's hoping to take my arm64 patches for v4.10,
> using the above tag as a base.
> 
> What's the plan for s390? Are you happy to do the same?

We intend to merge the s390 variant also for v4.10. Our approach has
changed however: we came to the conclusion that having a per-cpu preempt
count will generate better code on s390; so Martin implemented that.

And while being at it he also moved the rest of the s390 specific stuff out
of our struct thread_info. The corresponding patches will be pushed to

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features

branch soon.

In result we will have a struct thread_info that only contains the flags
field like x86 does.
So we don't depend an any other patches anymore...

Re: [PATCH] mm: only enable sys_pkey* when ARCH_HAS_PKEYS

2016-11-08 Thread Heiko Carstens

On Tue, Nov 08, 2016 at 12:39:28PM +0100, Arnd Bergmann wrote:
> On Tuesday, November 8, 2016 10:30:42 AM CET Heiko Carstens wrote:
> > Three architectures (parisc, powerpc, s390) decided to ignore the system
> > calls completely, but still have the pkey code linked into the kernel
> > image.
> 
> Wouldn't it actually make sense to hook this up to the storage keys
> in the s390 page tables?

We have storage keys per _physical_ page. Not per page within the the table
entries. So this doesn't work unfortunately.

Re: [PATCH] mm: only enable sys_pkey* when ARCH_HAS_PKEYS

2016-11-08 Thread Heiko Carstens

On Tue, Nov 08, 2016 at 12:39:28PM +0100, Arnd Bergmann wrote:
> On Tuesday, November 8, 2016 10:30:42 AM CET Heiko Carstens wrote:
> > Three architectures (parisc, powerpc, s390) decided to ignore the system
> > calls completely, but still have the pkey code linked into the kernel
> > image.
> 
> Wouldn't it actually make sense to hook this up to the storage keys
> in the s390 page tables?

We have storage keys per _physical_ page. Not per page within the the table
entries. So this doesn't work unfortunately.

Re: [PATCH] mm: only enable sys_pkey* when ARCH_HAS_PKEYS

2016-11-08 Thread Heiko Carstens

On Tue, Nov 08, 2016 at 10:41:12AM +, Russell King - ARM Linux wrote:
> On Tue, Nov 08, 2016 at 10:30:42AM +0100, Heiko Carstens wrote:
> > Two architectures (arm, mips) have wired them up and thus allocated system
> > call numbers, even though they don't have ARCH_HAS_PKEYS set. Which seems a
> > bit pointless.
> 
> I don't think it's pointless at all.  First, read the LWN article for
> the userspace side of the interface: https://lwn.net/Articles/689395/
> 
> From reading this, it seems (at least to me) that these pkey syscalls
> are going to be the application level API - which means applications
> are probably going to want to make these calls.
> 
> Sure, they'll have to go through glibc, and glibc can provide stubs,
> but the problem with that is if we do get hardware pkey support (eg,
> due to pressure to increase security) then we're going to end up
> needing both kernel changes and glibc changes to add the calls.
> 
> Since one of the design goals of pkeys is to allow them to work when
> there is no underlying hardware support, I see no reason not to wire
> them up in architecture syscall tables today, so that we have a cross-
> architecture kernel version where the pkey syscalls become available.
> glibc (and other libcs) don't then have to mess around with per-
> architecture recording of which kernel version the pkey syscalls were
> added.
> 
> Not wiring up the syscalls doesn't really gain anything: the code
> present when !ARCH_HAS_PKEYS will still be part of the kernel image,
> it just won't be callable.

That can be easily solved (see below).

> So, on balance, I've decided to wire them up on ARM, even though the
> hardware doesn't support them, to avoid unnecessary pain in userspace
> from the ARM side of things.
> 
> Obviously what other architectures do is their own business.

It would make sense if this would be handled the same across architectures.

We could simply ifdef the small pkey block in mprotect.c and rely on the
pkey cond_syscalls added with e2753293ac4b. The result would actually be
the same what Mark proposed, except with less generated code.

Something like this:

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 11936526b08b..9fb86b107e49 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -484,6 +484,8 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
return do_mprotect_pkey(start, len, prot, -1);
 }
 
+#ifdef CONFIG_ARCH_HAS_PKEYS
+
 SYSCALL_DEFINE4(pkey_mprotect, unsigned long, start, size_t, len,
unsigned long, prot, int, pkey)
 {
@@ -534,3 +536,4 @@ SYSCALL_DEFINE1(pkey_free, int, pkey)
 */
return ret;
 }
+#endif /* CONFIG_ARCH_HAS_PKEYS */

Re: [PATCH] mm: only enable sys_pkey* when ARCH_HAS_PKEYS

2016-11-08 Thread Heiko Carstens

On Tue, Nov 08, 2016 at 10:41:12AM +, Russell King - ARM Linux wrote:
> On Tue, Nov 08, 2016 at 10:30:42AM +0100, Heiko Carstens wrote:
> > Two architectures (arm, mips) have wired them up and thus allocated system
> > call numbers, even though they don't have ARCH_HAS_PKEYS set. Which seems a
> > bit pointless.
> 
> I don't think it's pointless at all.  First, read the LWN article for
> the userspace side of the interface: https://lwn.net/Articles/689395/
> 
> From reading this, it seems (at least to me) that these pkey syscalls
> are going to be the application level API - which means applications
> are probably going to want to make these calls.
> 
> Sure, they'll have to go through glibc, and glibc can provide stubs,
> but the problem with that is if we do get hardware pkey support (eg,
> due to pressure to increase security) then we're going to end up
> needing both kernel changes and glibc changes to add the calls.
> 
> Since one of the design goals of pkeys is to allow them to work when
> there is no underlying hardware support, I see no reason not to wire
> them up in architecture syscall tables today, so that we have a cross-
> architecture kernel version where the pkey syscalls become available.
> glibc (and other libcs) don't then have to mess around with per-
> architecture recording of which kernel version the pkey syscalls were
> added.
> 
> Not wiring up the syscalls doesn't really gain anything: the code
> present when !ARCH_HAS_PKEYS will still be part of the kernel image,
> it just won't be callable.

That can be easily solved (see below).

> So, on balance, I've decided to wire them up on ARM, even though the
> hardware doesn't support them, to avoid unnecessary pain in userspace
> from the ARM side of things.
> 
> Obviously what other architectures do is their own business.

It would make sense if this would be handled the same across architectures.

We could simply ifdef the small pkey block in mprotect.c and rely on the
pkey cond_syscalls added with e2753293ac4b. The result would actually be
the same what Mark proposed, except with less generated code.

Something like this:

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 11936526b08b..9fb86b107e49 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -484,6 +484,8 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
return do_mprotect_pkey(start, len, prot, -1);
 }
 
+#ifdef CONFIG_ARCH_HAS_PKEYS
+
 SYSCALL_DEFINE4(pkey_mprotect, unsigned long, start, size_t, len,
unsigned long, prot, int, pkey)
 {
@@ -534,3 +536,4 @@ SYSCALL_DEFINE1(pkey_free, int, pkey)
 */
return ret;
 }
+#endif /* CONFIG_ARCH_HAS_PKEYS */

Re: [PATCH] mm: only enable sys_pkey* when ARCH_HAS_PKEYS

2016-11-08 Thread Heiko Carstens

On Fri, Nov 04, 2016 at 11:44:59PM +, Mark Rutland wrote:
> On Wed, Nov 02, 2016 at 12:15:50PM -0700, Dave Hansen wrote:
> > On 10/31/2016 05:08 PM, Mark Rutland wrote:
> > > When an architecture does not select CONFIG_ARCH_HAS_PKEYS, the pkey_alloc
> > > syscall will return -ENOSPC for all (otherwise well-formed) requests, as 
> > > the
> > > generic implementation of mm_pkey_alloc() returns -1. The other pkey 
> > > syscalls
> > > perform some work before always failing, in a similar fashion.
> > > 
> > > This implies the absence of keys, but otherwise functional pkey support. 
> > > This
> > > is odd, since the architecture provides no such support. Instead, it 
> > > would be
> > > preferable to indicate that the syscall is not implemented, since this is
> > > effectively the case.
> > 
> > This makes the behavior of an x86 cpu without pkeys and an arm cpu
> > without pkeys differ.  Is that what we want?
> 
> My rationale was that we have no idea whether architectures will have pkey
> support in future, and if/when they do, we may have to apply additional checks
> anyhow. i.e. in cases we'd return -ENOSPC today, we might want to return
> another error code.
> 
> Returning -ENOSYS retains the current behaviour, and allows us to handle that
> ABI issue when we know what architecture support looks like.
> 
> Other architectures not using the generic syscalls seem to handle this with
> -ENOSYS, e.g. parisc with commit 18088db042dd9ae2, so there's differing
> behaviour regardless of arm specifically.

The three system calls won't return -ENOSYS on architectures which decided
to ignore them (like with with above mentioned commit), since they haven't
allocated a system call number at all.

Right now we have one architecture where these three system calls work if
the cpu supports the feature (x86).

Two architectures (arm, mips) have wired them up and thus allocated system
call numbers, even though they don't have ARCH_HAS_PKEYS set. Which seems a
bit pointless.

Three architectures (parisc, powerpc, s390) decided to ignore the system
calls completely, but still have the pkey code linked into the kernel
image.

imho the generic pkey code should be ifdef'ed with CONFIG_ARCH_HAS_PKEYS.
Otherwise only dead code will be linked and increase the kernel image size
for no good reason.

Re: [PATCH] mm: only enable sys_pkey* when ARCH_HAS_PKEYS

2016-11-08 Thread Heiko Carstens

On Fri, Nov 04, 2016 at 11:44:59PM +, Mark Rutland wrote:
> On Wed, Nov 02, 2016 at 12:15:50PM -0700, Dave Hansen wrote:
> > On 10/31/2016 05:08 PM, Mark Rutland wrote:
> > > When an architecture does not select CONFIG_ARCH_HAS_PKEYS, the pkey_alloc
> > > syscall will return -ENOSPC for all (otherwise well-formed) requests, as 
> > > the
> > > generic implementation of mm_pkey_alloc() returns -1. The other pkey 
> > > syscalls
> > > perform some work before always failing, in a similar fashion.
> > > 
> > > This implies the absence of keys, but otherwise functional pkey support. 
> > > This
> > > is odd, since the architecture provides no such support. Instead, it 
> > > would be
> > > preferable to indicate that the syscall is not implemented, since this is
> > > effectively the case.
> > 
> > This makes the behavior of an x86 cpu without pkeys and an arm cpu
> > without pkeys differ.  Is that what we want?
> 
> My rationale was that we have no idea whether architectures will have pkey
> support in future, and if/when they do, we may have to apply additional checks
> anyhow. i.e. in cases we'd return -ENOSPC today, we might want to return
> another error code.
> 
> Returning -ENOSYS retains the current behaviour, and allows us to handle that
> ABI issue when we know what architecture support looks like.
> 
> Other architectures not using the generic syscalls seem to handle this with
> -ENOSYS, e.g. parisc with commit 18088db042dd9ae2, so there's differing
> behaviour regardless of arm specifically.

The three system calls won't return -ENOSYS on architectures which decided
to ignore them (like with with above mentioned commit), since they haven't
allocated a system call number at all.

Right now we have one architecture where these three system calls work if
the cpu supports the feature (x86).

Two architectures (arm, mips) have wired them up and thus allocated system
call numbers, even though they don't have ARCH_HAS_PKEYS set. Which seems a
bit pointless.

Three architectures (parisc, powerpc, s390) decided to ignore the system
calls completely, but still have the pkey code linked into the kernel
image.

imho the generic pkey code should be ifdef'ed with CONFIG_ARCH_HAS_PKEYS.
Otherwise only dead code will be linked and increase the kernel image size
for no good reason.

Re: [PATCH] s390: remove unneeded dependency for gen_facilities

2016-11-07 Thread Heiko Carstens

On Tue, Nov 08, 2016 at 10:55:47AM +0900, Masahiro Yamada wrote:
> The dependency between the object and the source is handled by
> scripts/Makefile.host, so only "hostprogs-y += gen_facilities"
> is fine.
> 
> Signed-off-by: Masahiro Yamada <yamada.masah...@socionext.com>
> ---
> 
>  arch/s390/tools/Makefile | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/arch/s390/tools/Makefile b/arch/s390/tools/Makefile
> index 6d9814c..4b5e1e4 100644
> --- a/arch/s390/tools/Makefile
> +++ b/arch/s390/tools/Makefile
> @@ -9,7 +9,5 @@ define filechk_facilities.h
>   $(obj)/gen_facilities
>  endef
>  
> -$(obj)/gen_facilities.o: $(srctree)/arch/s390/tools/gen_facilities.c
> -
>  include/generated/facilities.h: $(obj)/gen_facilities FORCE
>   $(call filechk,facilities.h)

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Martin, can you pick this one up too, please?

Re: [PATCH] s390: remove unneeded dependency for gen_facilities

2016-11-07 Thread Heiko Carstens

On Tue, Nov 08, 2016 at 10:55:47AM +0900, Masahiro Yamada wrote:
> The dependency between the object and the source is handled by
> scripts/Makefile.host, so only "hostprogs-y += gen_facilities"
> is fine.
> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
>  arch/s390/tools/Makefile | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/arch/s390/tools/Makefile b/arch/s390/tools/Makefile
> index 6d9814c..4b5e1e4 100644
> --- a/arch/s390/tools/Makefile
> +++ b/arch/s390/tools/Makefile
> @@ -9,7 +9,5 @@ define filechk_facilities.h
>   $(obj)/gen_facilities
>  endef
>  
> -$(obj)/gen_facilities.o: $(srctree)/arch/s390/tools/gen_facilities.c
> -
>  include/generated/facilities.h: $(obj)/gen_facilities FORCE
>   $(call filechk,facilities.h)

Acked-by: Heiko Carstens 

Martin, can you pick this one up too, please?

Re: [PATCH] tile: handle RO_AFTER_INIT_DATA

2016-11-07 Thread Heiko Carstens

On Mon, Nov 07, 2016 at 02:50:27PM -0500, Chris Metcalf wrote:
> This is the minimal change to handle RO_AFTER_INIT_DATA.
> The tile architecture already marks RO_DATA as read-only in
> the kernel, so grouping RO_AFTER_INIT_DATA with RO_DATA, as is
> done by default, means the kernel faults in init when it tries
> to write to RO_AFTER_INIT_DATA.  For now, just move it past the
> end of the RODATA section so it is not specially treated.
> ---
> This is just to fix 4.9; I will post a more complete fix shortly
> targeting 4.10.
> 
>  arch/tile/kernel/vmlinux.lds.S | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/tile/kernel/vmlinux.lds.S b/arch/tile/kernel/vmlinux.lds.S
> index e1baf094fba4..dcd7445c31a2 100644
> --- a/arch/tile/kernel/vmlinux.lds.S
> +++ b/arch/tile/kernel/vmlinux.lds.S
> @@ -1,3 +1,6 @@
> +/* Handle ro_after_init data on our own. */
> +#define RO_AFTER_INIT_DATA
> +
>  #include 
>  #include 
>  #include 
> @@ -87,6 +90,7 @@ SECTIONS
> 
>_sdata = .;   /* Start of data section */
>RO_DATA_SECTION(PAGE_SIZE)
> +  RO_AFTER_INIT_DATA
>RW_DATA_SECTION(L2_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
>_edata = .;

imho, the minimal fix would be to just

#define __ro_after_init __read_mostly

within arch/tile/include/asm/cache.h. That's also what parisc currently
has.

Re: [PATCH] tile: handle RO_AFTER_INIT_DATA

2016-11-07 Thread Heiko Carstens

On Mon, Nov 07, 2016 at 02:50:27PM -0500, Chris Metcalf wrote:
> This is the minimal change to handle RO_AFTER_INIT_DATA.
> The tile architecture already marks RO_DATA as read-only in
> the kernel, so grouping RO_AFTER_INIT_DATA with RO_DATA, as is
> done by default, means the kernel faults in init when it tries
> to write to RO_AFTER_INIT_DATA.  For now, just move it past the
> end of the RODATA section so it is not specially treated.
> ---
> This is just to fix 4.9; I will post a more complete fix shortly
> targeting 4.10.
> 
>  arch/tile/kernel/vmlinux.lds.S | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/tile/kernel/vmlinux.lds.S b/arch/tile/kernel/vmlinux.lds.S
> index e1baf094fba4..dcd7445c31a2 100644
> --- a/arch/tile/kernel/vmlinux.lds.S
> +++ b/arch/tile/kernel/vmlinux.lds.S
> @@ -1,3 +1,6 @@
> +/* Handle ro_after_init data on our own. */
> +#define RO_AFTER_INIT_DATA
> +
>  #include 
>  #include 
>  #include 
> @@ -87,6 +90,7 @@ SECTIONS
> 
>_sdata = .;   /* Start of data section */
>RO_DATA_SECTION(PAGE_SIZE)
> +  RO_AFTER_INIT_DATA
>RW_DATA_SECTION(L2_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
>_edata = .;

imho, the minimal fix would be to just

#define __ro_after_init __read_mostly

within arch/tile/include/asm/cache.h. That's also what parisc currently
has.

Re: [PATCH 2/2] s390: squash facilities_src.h into gen_facilities.c

2016-11-07 Thread Heiko Carstens

On Mon, Nov 07, 2016 at 02:13:06PM +0100, Paul Bolle wrote:
> It took me some time to figure out that gen_facilities is only used to
> generate a small header file (generated/facilities.h). And that header's only
> goal is to define FACILITIES_ALS and FACILITIES_KVM.
> 
> Pasted below is an attempt to use asm/facilities.h instead of
> generated/facilities.h. That allows to drop arch/s390/tools/ entirely. I don't
> actually have an s390 machine at hand to test this, but this does build and
> the preprocessed code this generates looks sane.
> 
> (Yes, asm/facilities.h might need another level of preprocessor defines to
> become actually readable.)

The whole point of the tool is that we can use the bit numbers that are
specified within the architecture document, with the additional odd IBM bit
numbering scheme. Where the most significant bit has bit number 0(!).

> diff --git a/arch/s390/include/asm/facilities.h 
> b/arch/s390/include/asm/facilities.h
> new file mode 100644
> index ..c87f18d29217
> --- /dev/null
> +++ b/arch/s390/include/asm/facilities.h
> @@ -0,0 +1,43 @@
> +#ifndef __ASM_FACILITIES_H
> +#define __ASM_FACILITIES_H
> +
> +#define FACILITIES_ALS \
> + _AC(IS_BUILTIN(CONFIG_HAVE_MARCH_Z900_FEATURES), UL) << 0 | /* N3 
> instructions */ \

So, this is wrong. It should be " << 63" to match the existing code.

> +#define FACILITIES_KVM \
> + _BITUL(0)  |/* N3 instructions */ \
> + _BITUL(1)  |/* z/Arch mode installed */ \
> + _BITUL(2)  |/* z/Arch mode active */ \
> + _BITUL(3)  |/* DAT-enhancement */ \
> + _BITUL(4)  |/* idte segment table */ \
> + _BITUL(5)  |/* idte region table */ \
> + _BITUL(6)  |/* ASN-and-LX reuse */ \
> + _BITUL(7)  |/* stfle */ \
> + _BITUL(8)  |/* enhanced-DAT 1 */ \
> + _BITUL(9)  |/* sense-running-status */ \
> + _BITUL(10) |/* conditional sske */ \
> + _BITUL(13) |/* ipte-range */ \
> + _BITUL(14)  /* nonquiescing key-setting */ \
> + , \
> + _BITUL(9)  |/* transactional execution */ \
> + _BITUL(11) |/* access-exception-fetch/store indication */ \

And this is exactly what I want to avoid: start counting from zero again if
we cross a double word. It _must_ read 73, 75, otherwise this becomes the
unmaintainable and error prone mess we had before.

I just want a list with bit numbers and the rest must be created
automatically.

Re: [PATCH 2/2] s390: squash facilities_src.h into gen_facilities.c

2016-11-07 Thread Heiko Carstens

On Mon, Nov 07, 2016 at 02:13:06PM +0100, Paul Bolle wrote:
> It took me some time to figure out that gen_facilities is only used to
> generate a small header file (generated/facilities.h). And that header's only
> goal is to define FACILITIES_ALS and FACILITIES_KVM.
> 
> Pasted below is an attempt to use asm/facilities.h instead of
> generated/facilities.h. That allows to drop arch/s390/tools/ entirely. I don't
> actually have an s390 machine at hand to test this, but this does build and
> the preprocessed code this generates looks sane.
> 
> (Yes, asm/facilities.h might need another level of preprocessor defines to
> become actually readable.)

The whole point of the tool is that we can use the bit numbers that are
specified within the architecture document, with the additional odd IBM bit
numbering scheme. Where the most significant bit has bit number 0(!).

> diff --git a/arch/s390/include/asm/facilities.h 
> b/arch/s390/include/asm/facilities.h
> new file mode 100644
> index ..c87f18d29217
> --- /dev/null
> +++ b/arch/s390/include/asm/facilities.h
> @@ -0,0 +1,43 @@
> +#ifndef __ASM_FACILITIES_H
> +#define __ASM_FACILITIES_H
> +
> +#define FACILITIES_ALS \
> + _AC(IS_BUILTIN(CONFIG_HAVE_MARCH_Z900_FEATURES), UL) << 0 | /* N3 
> instructions */ \

So, this is wrong. It should be " << 63" to match the existing code.

> +#define FACILITIES_KVM \
> + _BITUL(0)  |/* N3 instructions */ \
> + _BITUL(1)  |/* z/Arch mode installed */ \
> + _BITUL(2)  |/* z/Arch mode active */ \
> + _BITUL(3)  |/* DAT-enhancement */ \
> + _BITUL(4)  |/* idte segment table */ \
> + _BITUL(5)  |/* idte region table */ \
> + _BITUL(6)  |/* ASN-and-LX reuse */ \
> + _BITUL(7)  |/* stfle */ \
> + _BITUL(8)  |/* enhanced-DAT 1 */ \
> + _BITUL(9)  |/* sense-running-status */ \
> + _BITUL(10) |/* conditional sske */ \
> + _BITUL(13) |/* ipte-range */ \
> + _BITUL(14)  /* nonquiescing key-setting */ \
> + , \
> + _BITUL(9)  |/* transactional execution */ \
> + _BITUL(11) |/* access-exception-fetch/store indication */ \

And this is exactly what I want to avoid: start counting from zero again if
we cross a double word. It _must_ read 73, 75, otherwise this becomes the
unmaintainable and error prone mess we had before.

I just want a list with bit numbers and the rest must be created
automatically.

Re: [PATCH 2/2] s390: squash facilities_src.h into gen_facilities.c

2016-11-06 Thread Heiko Carstens

On Sun, Nov 06, 2016 at 12:45:28PM +0900, Masahiro Yamada wrote:
> We generally expect headers in arch/$(ARCH)/include/asm directory
> are included from kernel sources, but facilities_src.h is not;
> it is included from the arch/s390/tools/gen_facilities.c tool.
> 
> There is no reason to expose this header to the public include path.
> Furthermore, facilities_src.h makes sure to be included only from
> gen_facilities.c by the following:
> 
>   #ifndef S390_GEN_FACILITIES_C
>   #error "This file can only be included by gen_facilities.c"
>   #endif
> 
> This check can be removed by merging the two files.
> 
> Signed-off-by: Masahiro Yamada <yamada.masah...@socionext.com>
> ---

Both patches:
Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Martin, can you please pick them up?

Re: [PATCH 2/2] s390: squash facilities_src.h into gen_facilities.c

2016-11-06 Thread Heiko Carstens

On Sun, Nov 06, 2016 at 12:45:28PM +0900, Masahiro Yamada wrote:
> We generally expect headers in arch/$(ARCH)/include/asm directory
> are included from kernel sources, but facilities_src.h is not;
> it is included from the arch/s390/tools/gen_facilities.c tool.
> 
> There is no reason to expose this header to the public include path.
> Furthermore, facilities_src.h makes sure to be included only from
> gen_facilities.c by the following:
> 
>   #ifndef S390_GEN_FACILITIES_C
>   #error "This file can only be included by gen_facilities.c"
>   #endif
> 
> This check can be removed by merging the two files.
> 
> Signed-off-by: Masahiro Yamada 
> ---

Both patches:
Acked-by: Heiko Carstens 

Martin, can you please pick them up?

Re: [PATCH 11/25] s390/smp: Convert to hotplug state machine

2016-11-04 Thread Heiko Carstens

On Thu, Nov 03, 2016 at 03:50:07PM +0100, Sebastian Andrzej Siewior wrote:
> @@ -1084,20 +1079,13 @@ static int smp_add_present_cpu(int cpu)
>   rc = sysfs_create_group(>kobj, _common_attr_group);
>   if (rc)
>   goto out_cpu;
> - if (cpu_online(cpu)) {
> - rc = sysfs_create_group(>kobj, _online_attr_group);
> - if (rc)
> - goto out_online;
> - }
> +
>   rc = topology_cpu_init(c);

Please don't add whitespace :)

If you fix the minor nits both s390 related patches are
Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Thanks!

Re: [PATCH 11/25] s390/smp: Convert to hotplug state machine

2016-11-04 Thread Heiko Carstens

On Thu, Nov 03, 2016 at 03:50:07PM +0100, Sebastian Andrzej Siewior wrote:
> @@ -1084,20 +1079,13 @@ static int smp_add_present_cpu(int cpu)
>   rc = sysfs_create_group(>kobj, _common_attr_group);
>   if (rc)
>   goto out_cpu;
> - if (cpu_online(cpu)) {
> - rc = sysfs_create_group(>kobj, _online_attr_group);
> - if (rc)
> - goto out_online;
> - }
> +
>   rc = topology_cpu_init(c);

Please don't add whitespace :)

If you fix the minor nits both s390 related patches are
Acked-by: Heiko Carstens 

Thanks!

Re: [PATCH 10/25] s390/smp: Make cpu notifier symetric

2016-11-04 Thread Heiko Carstens

On Thu, Nov 03, 2016 at 03:50:06PM +0100, Sebastian Andrzej Siewior wrote:
> From: Thomas Gleixner <t...@linutronix.de>
> 
> There is no reason to remove the sysfs cpu files when the CPU is dead, they
> can be removed when the cpu is prepared to go down. Doing it at
> DOWN_PREPARE allows us to convert it to a symetric hotplug state in the
> next step.
> 
> Cc: Martin Schwidefsky <schwidef...@de.ibm.com>
> Cc: Heiko Carstens <heiko.carst...@de.ibm.com>
> Cc: linux-s...@vger.kernel.org
> Signed-off-by: Thomas Gleixner <t...@linutronix.de>
> Signed-off-by: Sebastian Andrzej Siewior <bige...@linutronix.de>
> ---
>  arch/s390/kernel/smp.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
> index 35531fe1c5ea..1a21e66c484a 100644
> --- a/arch/s390/kernel/smp.c
> +++ b/arch/s390/kernel/smp.c
> @@ -1056,9 +1056,10 @@ static int smp_cpu_notify(struct notifier_block *self, 
> unsigned long action,
>  
>   switch (action & ~CPU_TASKS_FROZEN) {
>   case CPU_ONLINE:
> + case CPU_DOWN_FAILED:
>   err = sysfs_create_group(>kobj, _online_attr_group);
>   break;
> - case CPU_DEAD:
> + case CPU_DOWN_PREPARE::

This won't compile... even though it will be removed with the following
patch it would be good to fix this to keep this bisectable.

Re: [PATCH 10/25] s390/smp: Make cpu notifier symetric

2016-11-04 Thread Heiko Carstens

On Thu, Nov 03, 2016 at 03:50:06PM +0100, Sebastian Andrzej Siewior wrote:
> From: Thomas Gleixner 
> 
> There is no reason to remove the sysfs cpu files when the CPU is dead, they
> can be removed when the cpu is prepared to go down. Doing it at
> DOWN_PREPARE allows us to convert it to a symetric hotplug state in the
> next step.
> 
> Cc: Martin Schwidefsky 
> Cc: Heiko Carstens 
> Cc: linux-s...@vger.kernel.org
> Signed-off-by: Thomas Gleixner 
> Signed-off-by: Sebastian Andrzej Siewior 
> ---
>  arch/s390/kernel/smp.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
> index 35531fe1c5ea..1a21e66c484a 100644
> --- a/arch/s390/kernel/smp.c
> +++ b/arch/s390/kernel/smp.c
> @@ -1056,9 +1056,10 @@ static int smp_cpu_notify(struct notifier_block *self, 
> unsigned long action,
>  
>   switch (action & ~CPU_TASKS_FROZEN) {
>   case CPU_ONLINE:
> + case CPU_DOWN_FAILED:
>   err = sysfs_create_group(>kobj, _online_attr_group);
>   break;
> - case CPU_DEAD:
> + case CPU_DOWN_PREPARE::

This won't compile... even though it will be removed with the following
patch it would be good to fix this to keep this bisectable.

[PATCH] percpu: use notrace variant of preempt_disable/preempt_enable

2016-11-03 Thread Heiko Carstens

Commit 345ddcc882d8 ("ftrace: Have set_ftrace_pid use the bitmap like
events do") added a couple of this_cpu_read calls to the ftrace code.

On x86 this is not a problem, since it has single instructions to read
percpu data. Other architectures which use the generic variant now
have additional preempt_disable and preempt_enable calls in the core
ftrace code. This may lead to recursive calls and in result to a dead
machine, e.g. if preemption and debugging options are enabled.

To fix this use the notrace variant of preempt_disable and
preempt_enable within the generic percpu code.

Reported-and-bisected-by: Sebastian Ott <seb...@linux.vnet.ibm.com>
Tested-by: Sebastian Ott <seb...@linux.vnet.ibm.com>
Fixes: 345ddcc882d8 ("ftrace: Have set_ftrace_pid use the bitmap like events 
do")
Signed-off-by: Heiko Carstens <heiko.carst...@de.ibm.com>
---
 include/asm-generic/percpu.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/percpu.h b/include/asm-generic/percpu.h
index 40e887068da2..0504ef8f3aa3 100644
--- a/include/asm-generic/percpu.h
+++ b/include/asm-generic/percpu.h
@@ -118,9 +118,9 @@ do {
\
 #define this_cpu_generic_read(pcp) \
 ({ \
typeof(pcp) __ret;  \
-   preempt_disable();  \
+   preempt_disable_notrace();  \
__ret = raw_cpu_generic_read(pcp);  \
-   preempt_enable();   \
+   preempt_enable_notrace();   \
__ret;  \
 })
 
-- 
2.8.4

[PATCH] percpu: use notrace variant of preempt_disable/preempt_enable

2016-11-03 Thread Heiko Carstens

Commit 345ddcc882d8 ("ftrace: Have set_ftrace_pid use the bitmap like
events do") added a couple of this_cpu_read calls to the ftrace code.

On x86 this is not a problem, since it has single instructions to read
percpu data. Other architectures which use the generic variant now
have additional preempt_disable and preempt_enable calls in the core
ftrace code. This may lead to recursive calls and in result to a dead
machine, e.g. if preemption and debugging options are enabled.

To fix this use the notrace variant of preempt_disable and
preempt_enable within the generic percpu code.

Reported-and-bisected-by: Sebastian Ott 
Tested-by: Sebastian Ott 
Fixes: 345ddcc882d8 ("ftrace: Have set_ftrace_pid use the bitmap like events 
do")
Signed-off-by: Heiko Carstens 
---
 include/asm-generic/percpu.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/percpu.h b/include/asm-generic/percpu.h
index 40e887068da2..0504ef8f3aa3 100644
--- a/include/asm-generic/percpu.h
+++ b/include/asm-generic/percpu.h
@@ -118,9 +118,9 @@ do {
\
 #define this_cpu_generic_read(pcp) \
 ({ \
typeof(pcp) __ret;  \
-   preempt_disable();  \
+   preempt_disable_notrace();  \
__ret = raw_cpu_generic_read(pcp);  \
-   preempt_enable();   \
+   preempt_enable_notrace();   \
__ret;  \
 })
 
-- 
2.8.4

Re: [PATCH 0/9] s390: remove modular usage from non-modular code

2016-10-31 Thread Heiko Carstens

On Sun, Oct 30, 2016 at 04:37:23PM -0400, Paul Gortmaker wrote:
> My ongoing audit looking for non-modular code that needlessly uses
> modular macros (vs. built-in equivalents) and/or has dead code
> relating to module unloading that can never be executed led to the
> creation of these s390 related commits.
> 
> For anyone new to the underlying goal of this cleanup, we are trying to
> not use module support for code that can never be built as a module since:
> 
>  (1) it is easy to accidentally write unused module_exit and remove code
>  (2) it can be misleading when reading the source, thinking it can be
>  modular when the Makefile and/or Kconfig prohibit it
>  (3) it requires the include of the module.h header file which in turn
>  includes nearly everything else, thus adding to CPP overhead.
>  (4) it gets copied/replicated into other code and spreads like weeds.
> 
> Build tested on current linux-next (allyes/allno/allmod) to ensure no
> silly typos or implicit include issues that would break compilation
> crept in.
> 
> Paul Gortmaker (9):
>   s390: cio: make it explicitly non-modular
>   s390: char: make zcore explicitly non-modular
>   s390: char: make con3215 explicitly non-modular
>   s390: char: make sclp_tty explicitly non-modular
>   s390: char: make slcp_quiesce explicitly non-modular
>   s390: hotplug: make pci_hpc explicitly non-modular
>   s390: hypfs: make inode explicitly non-modular
>   s390: kernel: make lgr explicitly non-modular
>   s390: virtio: make ccw explicitly non-modular

Whole series applied. Thanks!

Re: [PATCH 0/9] s390: remove modular usage from non-modular code

2016-10-31 Thread Heiko Carstens

On Sun, Oct 30, 2016 at 04:37:23PM -0400, Paul Gortmaker wrote:
> My ongoing audit looking for non-modular code that needlessly uses
> modular macros (vs. built-in equivalents) and/or has dead code
> relating to module unloading that can never be executed led to the
> creation of these s390 related commits.
> 
> For anyone new to the underlying goal of this cleanup, we are trying to
> not use module support for code that can never be built as a module since:
> 
>  (1) it is easy to accidentally write unused module_exit and remove code
>  (2) it can be misleading when reading the source, thinking it can be
>  modular when the Makefile and/or Kconfig prohibit it
>  (3) it requires the include of the module.h header file which in turn
>  includes nearly everything else, thus adding to CPP overhead.
>  (4) it gets copied/replicated into other code and spreads like weeds.
> 
> Build tested on current linux-next (allyes/allno/allmod) to ensure no
> silly typos or implicit include issues that would break compilation
> crept in.
> 
> Paul Gortmaker (9):
>   s390: cio: make it explicitly non-modular
>   s390: char: make zcore explicitly non-modular
>   s390: char: make con3215 explicitly non-modular
>   s390: char: make sclp_tty explicitly non-modular
>   s390: char: make slcp_quiesce explicitly non-modular
>   s390: hotplug: make pci_hpc explicitly non-modular
>   s390: hypfs: make inode explicitly non-modular
>   s390: kernel: make lgr explicitly non-modular
>   s390: virtio: make ccw explicitly non-modular

Whole series applied. Thanks!

[tip:x86/urgent] sched/core, x86: Make struct thread_info arch specific again

2016-10-20 Thread tip-bot for Heiko Carstens

Commit-ID:  c8061485a0d7569a865a3cc3c63347b0f42b3765
Gitweb: http://git.kernel.org/tip/c8061485a0d7569a865a3cc3c63347b0f42b3765
Author: Heiko Carstens <heiko.carst...@de.ibm.com>
AuthorDate: Wed, 19 Oct 2016 19:28:11 +0100
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Thu, 20 Oct 2016 13:27:47 +0200

sched/core, x86: Make struct thread_info arch specific again

The following commit:

  c65eacbe290b ("sched/core: Allow putting thread_info into task_struct")

... made 'struct thread_info' a generic struct with only a
single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
selected.

This change however seems to be quite x86 centric, since at least the
generic preemption code (asm-generic/preempt.h) assumes that struct
thread_info also has a preempt_count member, which apparently was not
true for x86.

We could add a bit more #ifdefs to solve this problem too, but it seems
to be much simpler to make struct thread_info arch specific
again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
bit easier for architectures that have a couple of arch specific stuff
in their thread_info definition.

The arch specific stuff _could_ be moved to thread_struct. However
keeping them in thread_info makes it easier: accessing thread_info
members is simple, since it is at the beginning of the task_struct,
while the thread_struct is at the end. At least on s390 the offsets
needed to access members of the thread_struct (with task_struct as
base) are too large for various asm instructions.  This is not a
problem when keeping these members within thread_info.

Signed-off-by: Heiko Carstens <heiko.carst...@de.ibm.com>
Signed-off-by: Mark Rutland <mark.rutl...@arm.com>
Acked-by: Thomas Gleixner <t...@linutronix.de>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: keesc...@chromium.org
Cc: linux-a...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutl...@arm.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/include/asm/thread_info.h |  9 +
 include/linux/thread_info.h| 11 ---
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h 
b/arch/x86/include/asm/thread_info.h
index 2aaca53..ad6f5eb0 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -52,6 +52,15 @@ struct task_struct;
 #include 
 #include 
 
+struct thread_info {
+   unsigned long   flags;  /* low level flags */
+};
+
+#define INIT_THREAD_INFO(tsk)  \
+{  \
+   .flags  = 0,\
+}
+
 #define init_stack (init_thread_union.stack)
 
 #else /* !__ASSEMBLY__ */
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 45f004e..2873baf 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -14,17 +14,6 @@ struct timespec;
 struct compat_timespec;
 
 #ifdef CONFIG_THREAD_INFO_IN_TASK
-struct thread_info {
-   unsigned long   flags;  /* low level flags */
-};
-
-#define INIT_THREAD_INFO(tsk)  \
-{  \
-   .flags  = 0,\
-}
-#endif
-
-#ifdef CONFIG_THREAD_INFO_IN_TASK
 #define current_thread_info() ((struct thread_info *)current)
 #endif

[tip:x86/urgent] sched/core, x86: Make struct thread_info arch specific again

2016-10-20 Thread tip-bot for Heiko Carstens

Commit-ID:  c8061485a0d7569a865a3cc3c63347b0f42b3765
Gitweb: http://git.kernel.org/tip/c8061485a0d7569a865a3cc3c63347b0f42b3765
Author: Heiko Carstens 
AuthorDate: Wed, 19 Oct 2016 19:28:11 +0100
Committer:  Ingo Molnar 
CommitDate: Thu, 20 Oct 2016 13:27:47 +0200

sched/core, x86: Make struct thread_info arch specific again

The following commit:

  c65eacbe290b ("sched/core: Allow putting thread_info into task_struct")

... made 'struct thread_info' a generic struct with only a
single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
selected.

This change however seems to be quite x86 centric, since at least the
generic preemption code (asm-generic/preempt.h) assumes that struct
thread_info also has a preempt_count member, which apparently was not
true for x86.

We could add a bit more #ifdefs to solve this problem too, but it seems
to be much simpler to make struct thread_info arch specific
again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
bit easier for architectures that have a couple of arch specific stuff
in their thread_info definition.

The arch specific stuff _could_ be moved to thread_struct. However
keeping them in thread_info makes it easier: accessing thread_info
members is simple, since it is at the beginning of the task_struct,
while the thread_struct is at the end. At least on s390 the offsets
needed to access members of the thread_struct (with task_struct as
base) are too large for various asm instructions.  This is not a
problem when keeping these members within thread_info.

Signed-off-by: Heiko Carstens 
Signed-off-by: Mark Rutland 
Acked-by: Thomas Gleixner 
Cc: Andrew Morton 
Cc: Andy Lutomirski 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: keesc...@chromium.org
Cc: linux-a...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutl...@arm.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/thread_info.h |  9 +
 include/linux/thread_info.h| 11 ---
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h 
b/arch/x86/include/asm/thread_info.h
index 2aaca53..ad6f5eb0 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -52,6 +52,15 @@ struct task_struct;
 #include 
 #include 
 
+struct thread_info {
+   unsigned long   flags;  /* low level flags */
+};
+
+#define INIT_THREAD_INFO(tsk)  \
+{  \
+   .flags  = 0,\
+}
+
 #define init_stack (init_thread_union.stack)
 
 #else /* !__ASSEMBLY__ */
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 45f004e..2873baf 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -14,17 +14,6 @@ struct timespec;
 struct compat_timespec;
 
 #ifdef CONFIG_THREAD_INFO_IN_TASK
-struct thread_info {
-   unsigned long   flags;  /* low level flags */
-};
-
-#define INIT_THREAD_INFO(tsk)  \
-{  \
-   .flags  = 0,\
-}
-#endif
-
-#ifdef CONFIG_THREAD_INFO_IN_TASK
 #define current_thread_info() ((struct thread_info *)current)
 #endif

Re: [PATCH 3/3] thread_info: include for THREAD_INFO_IN_TASK

2016-10-20 Thread Heiko Carstens

On Wed, Oct 19, 2016 at 07:28:13PM +0100, Mark Rutland wrote:
> When CONFIG_THREAD_INFO_IN_TASK is selected, the current_thread_info()
> macro relies on current having been defined prior to its use. However,
> not all users of current_thread_info() include , and thus
> current is not guaranteed to be defined.
> 
> When CONFIG_THREAD_INFO_IN_TASK is not selected, it's possible that
> get_current() / current are based upon current_thread_info(), and
>  includes . Thus always including
>  would result in circular dependences on some platforms.
> 
> To ensure both cases work, this patch includes , but only
> when CONFIG_THREAD_INFO_IN_TASK is selected.
> 
> Signed-off-by: Mark Rutland <mark.rutl...@arm.com>
> Cc: Andrew Morton <a...@linux-foundation.org>
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Heiko Carstens <heiko.carst...@de.ibm.com>
> Cc: Kees Cook <keesc...@chromium.org> 
> ---
>  include/linux/thread_info.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
> index c75c6ab..ef1f4b0 100644
> --- a/include/linux/thread_info.h
> +++ b/include/linux/thread_info.h
> @@ -12,6 +12,7 @@
>  #include 
>  
>  #ifdef CONFIG_THREAD_INFO_IN_TASK
> +#include 
>  #define current_thread_info() ((struct thread_info *)current)
>  #endif

FWIW:
Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [PATCH 3/3] thread_info: include for THREAD_INFO_IN_TASK

2016-10-20 Thread Heiko Carstens

On Wed, Oct 19, 2016 at 07:28:13PM +0100, Mark Rutland wrote:
> When CONFIG_THREAD_INFO_IN_TASK is selected, the current_thread_info()
> macro relies on current having been defined prior to its use. However,
> not all users of current_thread_info() include , and thus
> current is not guaranteed to be defined.
> 
> When CONFIG_THREAD_INFO_IN_TASK is not selected, it's possible that
> get_current() / current are based upon current_thread_info(), and
>  includes . Thus always including
>  would result in circular dependences on some platforms.
> 
> To ensure both cases work, this patch includes , but only
> when CONFIG_THREAD_INFO_IN_TASK is selected.
> 
> Signed-off-by: Mark Rutland 
> Cc: Andrew Morton 
> Cc: Andy Lutomirski 
> Cc: Heiko Carstens 
> Cc: Kees Cook  
> ---
>  include/linux/thread_info.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
> index c75c6ab..ef1f4b0 100644
> --- a/include/linux/thread_info.h
> +++ b/include/linux/thread_info.h
> @@ -12,6 +12,7 @@
>  #include 
>  
>  #ifdef CONFIG_THREAD_INFO_IN_TASK
> +#include 
>  #define current_thread_info() ((struct thread_info *)current)
>  #endif

FWIW:
Acked-by: Heiko Carstens

Re: [PATCH v2 1/1] s390/spinlock: Provide vcpu_is_preempted

2016-10-19 Thread Heiko Carstens

On Wed, Oct 19, 2016 at 08:56:36AM +0200, Christian Borntraeger wrote:
> On 09/29/2016 05:51 PM, Christian Borntraeger wrote:
> > this implements the s390 backend for commit
> > "kernel/sched: introduce vcpu preempted check interface"
> > by reworking the existing smp_vcpu_scheduled into
> > arch_vcpu_is_preempted. We can then also get rid of the
> > local cpu_is_preempted function by moving the
> > CIF_ENABLED_WAIT test into arch_vcpu_is_preempted.
> > 
> > Signed-off-by: Christian Borntraeger 
> 
> 
> Martin, Peter,
> 
> I think we could go with the patch as is. In other words not providing
> arch_vcpu_is_preempted for !CONFIG_SMP.
> 
> This will result in compile errors if code does spinning or yielding for
> non-SMP kernels - which does not make sense to me, so this might actually
> be a nice indicator.
> If you prefer the !CONFIG_SMP implementation let me know and I will respin.

...but I do prefer an implementation for !CONFIG_SMP. I'm tired of fixing
silly compile errors that only happen on s390.

Re: [PATCH v2 1/1] s390/spinlock: Provide vcpu_is_preempted

2016-10-19 Thread Heiko Carstens

On Wed, Oct 19, 2016 at 08:56:36AM +0200, Christian Borntraeger wrote:
> On 09/29/2016 05:51 PM, Christian Borntraeger wrote:
> > this implements the s390 backend for commit
> > "kernel/sched: introduce vcpu preempted check interface"
> > by reworking the existing smp_vcpu_scheduled into
> > arch_vcpu_is_preempted. We can then also get rid of the
> > local cpu_is_preempted function by moving the
> > CIF_ENABLED_WAIT test into arch_vcpu_is_preempted.
> > 
> > Signed-off-by: Christian Borntraeger 
> 
> 
> Martin, Peter,
> 
> I think we could go with the patch as is. In other words not providing
> arch_vcpu_is_preempted for !CONFIG_SMP.
> 
> This will result in compile errors if code does spinning or yielding for
> non-SMP kernels - which does not make sense to me, so this might actually
> be a nice indicator.
> If you prefer the !CONFIG_SMP implementation let me know and I will respin.

...but I do prefer an implementation for !CONFIG_SMP. I'm tired of fixing
silly compile errors that only happen on s390.

Re: [PATCH v3] s390/spinlock: Provide vcpu_is_preempted

2016-10-19 Thread Heiko Carstens

On Wed, Oct 19, 2016 at 10:42:04AM +0200, Christian Borntraeger wrote:
> this implements the s390 backend for commit
> "kernel/sched: introduce vcpu preempted check interface"
> by reworking the existing smp_vcpu_scheduled into
> arch_vcpu_is_preempted. We can then also get rid of the
> local cpu_is_preempted function by moving the
> CIF_ENABLED_WAIT test into arch_vcpu_is_preempted.
> 
> Signed-off-by: Christian Borntraeger <borntrae...@de.ibm.com>
> ---
>  arch/s390/include/asm/spinlock.h |  8 
>  arch/s390/kernel/smp.c   |  9 +++--
>  arch/s390/lib/spinlock.c | 25 -
>  3 files changed, 23 insertions(+), 19 deletions(-)

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [PATCH v3] s390/spinlock: Provide vcpu_is_preempted

2016-10-19 Thread Heiko Carstens

On Wed, Oct 19, 2016 at 10:42:04AM +0200, Christian Borntraeger wrote:
> this implements the s390 backend for commit
> "kernel/sched: introduce vcpu preempted check interface"
> by reworking the existing smp_vcpu_scheduled into
> arch_vcpu_is_preempted. We can then also get rid of the
> local cpu_is_preempted function by moving the
> CIF_ENABLED_WAIT test into arch_vcpu_is_preempted.
> 
> Signed-off-by: Christian Borntraeger 
> ---
>  arch/s390/include/asm/spinlock.h |  8 
>  arch/s390/kernel/smp.c   |  9 +++--
>  arch/s390/lib/spinlock.c | 25 -
>  3 files changed, 23 insertions(+), 19 deletions(-)

Acked-by: Heiko Carstens

Re: [PATCH] thread_info: include for THREAD_INFO_IN_TASK (WAS: [PATCH 2/3] sched/preempt: include asm/current.h)

2016-10-18 Thread Heiko Carstens

On Mon, Oct 17, 2016 at 06:33:15PM +0100, Mark Rutland wrote:
> On Mon, Oct 17, 2016 at 03:48:13PM +0100, Mark Rutland wrote:
> > On Fri, Oct 14, 2016 at 11:42:25AM +0100, Mark Rutland wrote:
> > > When CONFIG_THREAD_INFO_IN_TASK is selected, the current_thread_info()
> > > macro relies on current having been defined prior to its use. However,
> > > not all users of current_thread_info() include , and thus
> > > current is not guaranteed to be defined.
> > > 
> > > When CONFIG_THREAD_INFO_IN_TASK is not selected, it's possible that
> > > get_current() / current are based upon current_thread_info(), and
> > >  includes . Thus always including
> > >  would result in circular dependences on some platforms.
> > > 
> > > To ensure both cases work, this patch includes , but only
> > > when CONFIG_THREAD_INFO_IN_TASK is selected.
> > > 
> > > Signed-off-by: Mark Rutland 
> > > ---
> > >  include/linux/thread_info.h | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > As discussed, it would be great if this could go in along with the patch 
> > > to
> > > make thread_info arch-specific again, to make merging the arch-specific 
> > > parts
> > > easier (for arm64 and s390).
> > 
> > Urrgh; I've just recalled that this patch alone is insufficient. The
> > h8300 arch code has an unfixed bug [1], and relies on some implicit
> > definition ordering that will be broken by this patch.
> > 
> > I have a workaround/cleanup for core code that I'll send with an updated
> > version of my arm64 series shortly.
> 
> Looks like I spoke too soon. I have another circular include issue with
> raw_smp_processor_id() and task_struct::cpu to solve -- it looks like
> s390 doesn't suffer from that per my reading of your headers.

That's correct.

> In the mean time, I've pushed out a branch [2] with the common patches,
> atop of v4.9-rc1.

I just verified that your branch works fine for s390 (with and without the
THREAD_INFO_IN_TASK conversion).

Thanks,
Heiko

Re: [PATCH] thread_info: include for THREAD_INFO_IN_TASK (WAS: [PATCH 2/3] sched/preempt: include asm/current.h)

2016-10-18 Thread Heiko Carstens

On Mon, Oct 17, 2016 at 06:33:15PM +0100, Mark Rutland wrote:
> On Mon, Oct 17, 2016 at 03:48:13PM +0100, Mark Rutland wrote:
> > On Fri, Oct 14, 2016 at 11:42:25AM +0100, Mark Rutland wrote:
> > > When CONFIG_THREAD_INFO_IN_TASK is selected, the current_thread_info()
> > > macro relies on current having been defined prior to its use. However,
> > > not all users of current_thread_info() include , and thus
> > > current is not guaranteed to be defined.
> > > 
> > > When CONFIG_THREAD_INFO_IN_TASK is not selected, it's possible that
> > > get_current() / current are based upon current_thread_info(), and
> > >  includes . Thus always including
> > >  would result in circular dependences on some platforms.
> > > 
> > > To ensure both cases work, this patch includes , but only
> > > when CONFIG_THREAD_INFO_IN_TASK is selected.
> > > 
> > > Signed-off-by: Mark Rutland 
> > > ---
> > >  include/linux/thread_info.h | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > As discussed, it would be great if this could go in along with the patch 
> > > to
> > > make thread_info arch-specific again, to make merging the arch-specific 
> > > parts
> > > easier (for arm64 and s390).
> > 
> > Urrgh; I've just recalled that this patch alone is insufficient. The
> > h8300 arch code has an unfixed bug [1], and relies on some implicit
> > definition ordering that will be broken by this patch.
> > 
> > I have a workaround/cleanup for core code that I'll send with an updated
> > version of my arm64 series shortly.
> 
> Looks like I spoke too soon. I have another circular include issue with
> raw_smp_processor_id() and task_struct::cpu to solve -- it looks like
> s390 doesn't suffer from that per my reading of your headers.

That's correct.

> In the mean time, I've pushed out a branch [2] with the common patches,
> atop of v4.9-rc1.

I just verified that your branch works fine for s390 (with and without the
THREAD_INFO_IN_TASK conversion).

Thanks,
Heiko

Re: [PATCH 2/3] sched/preempt: include asm/current.h

2016-10-14 Thread Heiko Carstens

On Fri, Oct 14, 2016 at 12:25:56AM +0100, Mark Rutland wrote:
> Hi,
> 
> On Thu, Oct 13, 2016 at 01:57:11PM +0200, Heiko Carstens wrote:
> > The generic preempt code needs to include . Otherwise
> > compilation fails if THREAD_INFO_IN_TASK is selected and the generic
> > preempt code is used:
> > 
> > ./include/linux/thread_info.h:17:54: error: 'current' undeclared (first use 
> > in this function)
> >  #define current_thread_info() ((struct thread_info *)current)
> 
> I don't think this is the right fix. Users of current_thread_info() should 
> only
> have to include , as  already 
> does.
> 
> I have a patch [1] which has  include  the
> THREAD_INFO_IN_TASK case (while avoiding circular includes over 
> 
> and  in the !THREAD_INFO_IN_TASK case).

I added that include initially to  too, but was afraid
of include dependency hell. So I tried the minimal version, and it worked.

However with the ifdef within your patch you make sure that nothing breaks
by accident; so I think your version is better.

> I was planning on posting an updated series with that come -rc1.

That could/should also go into 4.9, so architectures could independently
convert to THREAD_INFO_IN_TASK for the next merge window.

Re: [PATCH 2/3] sched/preempt: include asm/current.h

2016-10-14 Thread Heiko Carstens

On Fri, Oct 14, 2016 at 12:25:56AM +0100, Mark Rutland wrote:
> Hi,
> 
> On Thu, Oct 13, 2016 at 01:57:11PM +0200, Heiko Carstens wrote:
> > The generic preempt code needs to include . Otherwise
> > compilation fails if THREAD_INFO_IN_TASK is selected and the generic
> > preempt code is used:
> > 
> > ./include/linux/thread_info.h:17:54: error: 'current' undeclared (first use 
> > in this function)
> >  #define current_thread_info() ((struct thread_info *)current)
> 
> I don't think this is the right fix. Users of current_thread_info() should 
> only
> have to include , as  already 
> does.
> 
> I have a patch [1] which has  include  the
> THREAD_INFO_IN_TASK case (while avoiding circular includes over 
> 
> and  in the !THREAD_INFO_IN_TASK case).

I added that include initially to  too, but was afraid
of include dependency hell. So I tried the minimal version, and it worked.

However with the ifdef within your patch you make sure that nothing breaks
by accident; so I think your version is better.

> I was planning on posting an updated series with that come -rc1.

That could/should also go into 4.9, so architectures could independently
convert to THREAD_INFO_IN_TASK for the next merge window.

[PATCH 3/3] s390: move thread_info into task_struct

2016-10-13 Thread Heiko Carstens

This is the s390 variant of commit 15f4eae70d36 ("x86: Move
thread_info into task_struct").

Signed-off-by: Heiko Carstens <heiko.carst...@de.ibm.com>
---
 arch/s390/Kconfig   |  1 +
 arch/s390/include/asm/lowcore.h |  2 +-
 arch/s390/include/asm/thread_info.h | 11 
 arch/s390/kernel/asm-offsets.c  | 17 +
 arch/s390/kernel/entry.S| 51 ++---
 arch/s390/kernel/head64.S   |  5 ++--
 arch/s390/kernel/setup.c|  3 +--
 arch/s390/kernel/smp.c  |  1 -
 8 files changed, 37 insertions(+), 54 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 426481d4cc86..a159dfc27b76 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -169,6 +169,7 @@ config S390
select OLD_SIGSUSPEND3
select SPARSE_IRQ
select SYSCTL_EXCEPTION_TRACE
+   select THREAD_INFO_IN_TASK
select TTY
select VIRT_CPU_ACCOUNTING
select VIRT_TO_BUS
diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h
index 7b93b78f423c..27402de81047 100644
--- a/arch/s390/include/asm/lowcore.h
+++ b/arch/s390/include/asm/lowcore.h
@@ -95,7 +95,7 @@ struct lowcore {
 
/* Current process. */
__u64   current_task;   /* 0x0310 */
-   __u64   thread_info;/* 0x0318 */
+   __u8pad_0x318[0x320-0x318]; /* 0x0318 */
__u64   kernel_stack;   /* 0x0320 */
 
/* Interrupt, panic and restart stack. */
diff --git a/arch/s390/include/asm/thread_info.h 
b/arch/s390/include/asm/thread_info.h
index f15c0398c363..f5fc4efa9bec 100644
--- a/arch/s390/include/asm/thread_info.h
+++ b/arch/s390/include/asm/thread_info.h
@@ -30,10 +30,8 @@
  * - if the contents of this structure are changed, the assembly constants 
must also be changed
  */
 struct thread_info {
-   struct task_struct  *task;  /* main task structure */
unsigned long   flags;  /* low level flags */
unsigned long   sys_call_table; /* System call table address */
-   unsigned intcpu;/* current CPU */
int preempt_count;  /* 0 => preemptable, <0 => BUG 
*/
unsigned intsystem_call;
__u64   user_timer;
@@ -46,21 +44,12 @@ struct thread_info {
  */
 #define INIT_THREAD_INFO(tsk)  \
 {  \
-   .task   = , \
.flags  = 0,\
-   .cpu= 0,\
.preempt_count  = INIT_PREEMPT_COUNT,   \
 }
 
-#define init_thread_info   (init_thread_union.thread_info)
 #define init_stack (init_thread_union.stack)
 
-/* how to get the thread information struct from C */
-static inline struct thread_info *current_thread_info(void)
-{
-   return (struct thread_info *) S390_lowcore.thread_info;
-}
-
 void arch_release_task_struct(struct task_struct *tsk);
 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
 
diff --git a/arch/s390/kernel/asm-offsets.c b/arch/s390/kernel/asm-offsets.c
index f3df9e0a5dec..e6969ebd0d44 100644
--- a/arch/s390/kernel/asm-offsets.c
+++ b/arch/s390/kernel/asm-offsets.c
@@ -25,7 +25,7 @@
 int main(void)
 {
/* task struct offsets */
-   OFFSET(__TASK_thread_info, task_struct, stack);
+   OFFSET(__TASK_stack, task_struct, stack);
OFFSET(__TASK_thread, task_struct, thread);
OFFSET(__TASK_pid, task_struct, pid);
BLANK();
@@ -39,14 +39,12 @@ int main(void)
OFFSET(__THREAD_trap_tdb, thread_struct, trap_tdb);
BLANK();
/* thread info offsets */
-   OFFSET(__TI_task, thread_info, task);
-   OFFSET(__TI_flags, thread_info, flags);
-   OFFSET(__TI_sysc_table, thread_info, sys_call_table);
-   OFFSET(__TI_cpu, thread_info, cpu);
-   OFFSET(__TI_precount, thread_info, preempt_count);
-   OFFSET(__TI_user_timer, thread_info, user_timer);
-   OFFSET(__TI_system_timer, thread_info, system_timer);
-   OFFSET(__TI_last_break, thread_info, last_break);
+   OFFSET(__TASK_TI_flags, task_struct, thread_info.flags);
+   OFFSET(__TASK_TI_sysc_table,  task_struct, thread_info.sys_call_table);
+   OFFSET(__TASK_TI_precount, task_struct, thread_info.preempt_count);
+   OFFSET(__TASK_TI_user_timer, task_struct, thread_info.user_timer);
+   OFFSET(__TASK_TI_system_timer, task_struct, thread_info.system_timer);
+   OFFSET(__TASK_TI_last_break, task_struct, thread_info.last_break);
BLANK();
/* pt_regs offsets */
OFFSET(__PT_ARGS, pt_regs, args);
@@ -159,7 +157,6 @@ int main(void)
OFFSET(__LC_INT_CLOCK, lowcore, int_clock);
OFFSET(__LC_MCCK_CLOCK, lowcore, mcck_clock);
OFFSET(__LC_CURRENT, lowcore, curr

[PATCH 3/3] s390: move thread_info into task_struct

2016-10-13 Thread Heiko Carstens

This is the s390 variant of commit 15f4eae70d36 ("x86: Move
thread_info into task_struct").

Signed-off-by: Heiko Carstens 
---
 arch/s390/Kconfig   |  1 +
 arch/s390/include/asm/lowcore.h |  2 +-
 arch/s390/include/asm/thread_info.h | 11 
 arch/s390/kernel/asm-offsets.c  | 17 +
 arch/s390/kernel/entry.S| 51 ++---
 arch/s390/kernel/head64.S   |  5 ++--
 arch/s390/kernel/setup.c|  3 +--
 arch/s390/kernel/smp.c  |  1 -
 8 files changed, 37 insertions(+), 54 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 426481d4cc86..a159dfc27b76 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -169,6 +169,7 @@ config S390
select OLD_SIGSUSPEND3
select SPARSE_IRQ
select SYSCTL_EXCEPTION_TRACE
+   select THREAD_INFO_IN_TASK
select TTY
select VIRT_CPU_ACCOUNTING
select VIRT_TO_BUS
diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h
index 7b93b78f423c..27402de81047 100644
--- a/arch/s390/include/asm/lowcore.h
+++ b/arch/s390/include/asm/lowcore.h
@@ -95,7 +95,7 @@ struct lowcore {
 
/* Current process. */
__u64   current_task;   /* 0x0310 */
-   __u64   thread_info;/* 0x0318 */
+   __u8pad_0x318[0x320-0x318]; /* 0x0318 */
__u64   kernel_stack;   /* 0x0320 */
 
/* Interrupt, panic and restart stack. */
diff --git a/arch/s390/include/asm/thread_info.h 
b/arch/s390/include/asm/thread_info.h
index f15c0398c363..f5fc4efa9bec 100644
--- a/arch/s390/include/asm/thread_info.h
+++ b/arch/s390/include/asm/thread_info.h
@@ -30,10 +30,8 @@
  * - if the contents of this structure are changed, the assembly constants 
must also be changed
  */
 struct thread_info {
-   struct task_struct  *task;  /* main task structure */
unsigned long   flags;  /* low level flags */
unsigned long   sys_call_table; /* System call table address */
-   unsigned intcpu;/* current CPU */
int preempt_count;  /* 0 => preemptable, <0 => BUG 
*/
unsigned intsystem_call;
__u64   user_timer;
@@ -46,21 +44,12 @@ struct thread_info {
  */
 #define INIT_THREAD_INFO(tsk)  \
 {  \
-   .task   = , \
.flags  = 0,\
-   .cpu= 0,\
.preempt_count  = INIT_PREEMPT_COUNT,   \
 }
 
-#define init_thread_info   (init_thread_union.thread_info)
 #define init_stack (init_thread_union.stack)
 
-/* how to get the thread information struct from C */
-static inline struct thread_info *current_thread_info(void)
-{
-   return (struct thread_info *) S390_lowcore.thread_info;
-}
-
 void arch_release_task_struct(struct task_struct *tsk);
 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
 
diff --git a/arch/s390/kernel/asm-offsets.c b/arch/s390/kernel/asm-offsets.c
index f3df9e0a5dec..e6969ebd0d44 100644
--- a/arch/s390/kernel/asm-offsets.c
+++ b/arch/s390/kernel/asm-offsets.c
@@ -25,7 +25,7 @@
 int main(void)
 {
/* task struct offsets */
-   OFFSET(__TASK_thread_info, task_struct, stack);
+   OFFSET(__TASK_stack, task_struct, stack);
OFFSET(__TASK_thread, task_struct, thread);
OFFSET(__TASK_pid, task_struct, pid);
BLANK();
@@ -39,14 +39,12 @@ int main(void)
OFFSET(__THREAD_trap_tdb, thread_struct, trap_tdb);
BLANK();
/* thread info offsets */
-   OFFSET(__TI_task, thread_info, task);
-   OFFSET(__TI_flags, thread_info, flags);
-   OFFSET(__TI_sysc_table, thread_info, sys_call_table);
-   OFFSET(__TI_cpu, thread_info, cpu);
-   OFFSET(__TI_precount, thread_info, preempt_count);
-   OFFSET(__TI_user_timer, thread_info, user_timer);
-   OFFSET(__TI_system_timer, thread_info, system_timer);
-   OFFSET(__TI_last_break, thread_info, last_break);
+   OFFSET(__TASK_TI_flags, task_struct, thread_info.flags);
+   OFFSET(__TASK_TI_sysc_table,  task_struct, thread_info.sys_call_table);
+   OFFSET(__TASK_TI_precount, task_struct, thread_info.preempt_count);
+   OFFSET(__TASK_TI_user_timer, task_struct, thread_info.user_timer);
+   OFFSET(__TASK_TI_system_timer, task_struct, thread_info.system_timer);
+   OFFSET(__TASK_TI_last_break, task_struct, thread_info.last_break);
BLANK();
/* pt_regs offsets */
OFFSET(__PT_ARGS, pt_regs, args);
@@ -159,7 +157,6 @@ int main(void)
OFFSET(__LC_INT_CLOCK, lowcore, int_clock);
OFFSET(__LC_MCCK_CLOCK, lowcore, mcck_clock);
OFFSET(__LC_CURRENT, lowcore, current_task);
-   OFFSET(__LC_T

[PATCH 2/3] sched/preempt: include asm/current.h

2016-10-13 Thread Heiko Carstens

The generic preempt code needs to include . Otherwise
compilation fails if THREAD_INFO_IN_TASK is selected and the generic
preempt code is used:

./include/linux/thread_info.h:17:54: error: 'current' undeclared (first use in 
this function)
 #define current_thread_info() ((struct thread_info *)current)

Signed-off-by: Heiko Carstens <heiko.carst...@de.ibm.com>
---
 include/asm-generic/preempt.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
index c1cde3577551..66fcd6cd7fc6 100644
--- a/include/asm-generic/preempt.h
+++ b/include/asm-generic/preempt.h
@@ -2,6 +2,7 @@
 #define __ASM_PREEMPT_H
 
 #include 
+#include 
 
 #define PREEMPT_ENABLED(0)
 
-- 
2.8.4

< 1 2 3 4 5 6 7 8 9 10 >

501 - 600 of 1909 matches

Mail list logo