Doc.../sound/soc/platform.rst: snd_soc_platform_driver out of sync

2017-11-30 Thread Andrey Utkin
Hi maintainers of Sound and/or Docs.

Documentation/sound/soc/platform.rst quotes an outdated definition of
"struct snd_soc_platform_driver". I spotted this because the quoted
definition still shows suspend/resume function pointers, which the
struct no longer has.

I don't have a good idea how to update the doc; I am just sure that
merely updating the struct definition is not the way to go.

Hope this report helps.
Thanks for your work!
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v13 0/7] cgroup-aware OOM killer

2017-11-30 Thread Andrew Morton
On Thu, 30 Nov 2017 15:28:17 + Roman Gushchin  wrote:

> This patchset makes the OOM killer cgroup-aware.

Thanks, I'll grab these.

There has been controversy over this patchset, to say the least.  I
can't say that I followed it closely!  Could those who still have
reservations please summarise their concerns and hopefully suggest a
way forward?


Re: [PATCH v3] doc: add maintainer book

2017-11-30 Thread Tobin C. Harding
On Thu, Nov 30, 2017 at 09:06:21AM -0200, Mauro Carvalho Chehab wrote:
> Em Thu, 30 Nov 2017 21:47:44 +1100
> "Tobin C. Harding"  escreveu:
> 
> > On Thu, Nov 30, 2017 at 07:01:19AM -0200, Mauro Carvalho Chehab wrote:
> > > Em Thu, 30 Nov 2017 12:55:07 +1100
> > > "Tobin C. Harding"  escreveu:
> 
> 
> > > > +So, by way of an example, Greg gives; a pull request with 
> > > > miscellaneous  
> > > 
> > > Nitpick: there's an extra ";" character above:
> > >   gives; -> gives  
> > 
> > Ha ha, I thought for ages how to word this bit. The irony of grammar
> > corrections from a non-native speaker is not lost on me :)
> 
> :-)
> 
> Well, if that serves as a consolation, I had to go to a dictionary to
> understand "By way of":
> 
>   https://www.collinsdictionary.com/dictionary/english/by-way-of
> 
> As that was a new expression for me :-)
> 
> Anyway, AFAIKT, English and Portuguese (and probably Spanish) have 
> similar rules with regards to commas and semicolons.

Oh cool, I didn't know that. Eu fallo um pouco Portugese muito no
esrever bem. I never learned to spell in Portuguese.

> > Perhaps:
> > 
> >  By way of an example Greg gives, a pull request with miscellaneous
> > 
> > I'll take any nitpicks you have Mauro, striving for perfection here. Thanks.
> 
> Yeah, that looks a way better on my eyes.

thanks,
Tobin.


[PATCH v13 4/7] mm, oom: introduce memory.oom_group

2017-11-30 Thread Roman Gushchin
The cgroup-aware OOM killer treats leaf memory cgroups as memory
consumption entities and performs the victim selection by comparing
them based on their memory footprint. Then it kills the biggest task
inside the selected memory cgroup.

However, some workloads do not tolerate such behavior.
Killing a random task may leave the workload in a broken state.

To solve this problem, the memory.oom_group knob is introduced.
It defines whether a memory cgroup should be treated as an
indivisible memory consumer, compared by total memory consumption
with other memory consumers (leaf memory cgroups and other memory
cgroups with memory.oom_group set), and whether all tasks belonging
to it should be killed if the cgroup is selected.

If set on memcg A, it means that in case of a system-wide OOM or a
memcg-wide OOM scoped to A or any ancestor cgroup, all tasks
belonging to the sub-tree of A will be killed. If the OOM event is
scoped to a descendant cgroup (A/B, for example), only tasks in
that cgroup can be affected. The OOM killer will never touch any tasks
outside of the scope of the OOM event.

Also, tasks with oom_score_adj set to -1000 will not be killed because
this has been a long established way to protect a particular process
from seeing an unexpected SIGKILL from the OOM killer. Ignoring this
user defined configuration might lead to data corruptions or other
misbehavior.

The default value is 0.
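
The scoping rule described above can be sketched as a small user-space
model. This is only an illustration of the commit message, not the
kernel implementation: struct memcg and both helper functions below are
hypothetical names, and the model assumes that cgroup A has already been
selected as the victim.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's struct mem_cgroup. */
struct memcg {
    const struct memcg *parent;   /* NULL for the root cgroup */
    bool oom_group;               /* models the memory.oom_group knob */
};

/* True if 'anc' is 'cg' itself or one of its ancestors. */
static bool is_self_or_ancestor(const struct memcg *anc, const struct memcg *cg)
{
    for (; cg; cg = cg->parent)
        if (cg == anc)
            return true;
    return false;
}

/*
 * Decide whether an OOM event kills the whole sub-tree of 'a'.
 * 'scope' is the cgroup whose limit was hit, or NULL for a
 * system-wide OOM.
 */
static bool kills_whole_subtree(const struct memcg *a, const struct memcg *scope)
{
    if (!a->oom_group)
        return false;
    /* System-wide OOM, or OOM scoped to 'a' or any ancestor of 'a'. */
    return scope == NULL || is_self_or_ancestor(scope, a);
}
```

With a tree root/A/B and oom_group set only on A, the model reports a
group kill for a system-wide OOM and for OOMs scoped to A or to the
root, but not for an OOM scoped to A/B.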

Signed-off-by: Roman Gushchin 
Acked-by: Michal Hocko 
Acked-by: Johannes Weiner 
Cc: Vladimir Davydov 
Cc: Tetsuo Handa 
Cc: David Rientjes 
Cc: Andrew Morton 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
Cc: cgro...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux...@kvack.org
---
 include/linux/memcontrol.h | 17 +++
 mm/memcontrol.c| 75 +++---
 mm/oom_kill.c  | 47 +++--
 3 files changed, 126 insertions(+), 13 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index cb4db659a8b5..7b8bcdf6571d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -203,6 +203,13 @@ struct mem_cgroup {
/* OOM-Killer disable */
int oom_kill_disable;
 
+   /*
+* Treat the sub-tree as an indivisible memory consumer,
+* kill all belonging tasks if the memory cgroup selected
+* as OOM victim.
+*/
+   bool oom_group;
+
/* handle for "memory.events" */
struct cgroup_file events_file;
 
@@ -490,6 +497,11 @@ bool mem_cgroup_oom_synchronize(bool wait);
 
 bool mem_cgroup_select_oom_victim(struct oom_control *oc);
 
+static inline bool mem_cgroup_oom_group(struct mem_cgroup *memcg)
+{
+   return memcg->oom_group;
+}
+
 #ifdef CONFIG_MEMCG_SWAP
 extern int do_swap_account;
 #endif
@@ -990,6 +1002,11 @@ static inline bool mem_cgroup_select_oom_victim(struct oom_control *oc)
 {
return false;
 }
+
+static inline bool mem_cgroup_oom_group(struct mem_cgroup *memcg)
+{
+   return false;
+}
 #endif /* CONFIG_MEMCG */
 
 /* idx can be of type enum memcg_stat_item or node_stat_item */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 592ffb1c98a7..5d27a4bbd478 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2779,19 +2779,51 @@ static long oom_evaluate_memcg(struct mem_cgroup *memcg,
 
 static void select_victim_memcg(struct mem_cgroup *root, struct oom_control *oc)
 {
-   struct mem_cgroup *iter;
+   struct mem_cgroup *iter, *group = NULL;
+   long group_score = 0;
 
oc->chosen_memcg = NULL;
oc->chosen_points = 0;
 
+   /*
+* If OOM is memcg-wide, and the memcg has the oom_group flag set,
+* all tasks belonging to the memcg should be killed.
+* So, we mark the memcg as a victim.
+*/
+   if (oc->memcg && mem_cgroup_oom_group(oc->memcg)) {
+   oc->chosen_memcg = oc->memcg;
+   css_get(&oc->chosen_memcg->css);
+   return;
+   }
+
/*
 * The oom_score is calculated for leaf memory cgroups (including
 * the root memcg).
+* Non-leaf oom_group cgroups accumulating score of descendant
+* leaf memory cgroups.
 */
rcu_read_lock();
for_each_mem_cgroup_tree(iter, root) {
long score;
 
+   /*
+* We don't consider non-leaf non-oom_group memory cgroups
+* as OOM victims.
+*/
+   if (memcg_has_children(iter) && iter != root_mem_cgroup &&
+   !mem_cgroup_oom_group(iter))
+   continue;
+
+   /*
+* If group is not set or we've ran out of the group's sub-tree,
+* we should set group and 

[PATCH v13 6/7] mm, oom, docs: describe the cgroup-aware OOM killer

2017-11-30 Thread Roman Gushchin
Document the cgroup-aware OOM killer.

Signed-off-by: Roman Gushchin 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Tetsuo Handa 
Cc: Andrew Morton 
Cc: David Rientjes 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
Cc: cgro...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux...@kvack.org
---
 Documentation/cgroup-v2.txt | 58 +
 1 file changed, 58 insertions(+)

diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 779211fbb69f..c80a147f94b7 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -48,6 +48,7 @@ v1 is available under Documentation/cgroup-v1/.
5-2-1. Memory Interface Files
5-2-2. Usage Guidelines
5-2-3. Memory Ownership
+   5-2-4. OOM Killer
  5-3. IO
5-3-1. IO Interface Files
5-3-2. Writeback
@@ -1026,6 +1027,28 @@ PAGE_SIZE multiple when read back.
high limit is used and monitored properly, this limit's
utility is limited to providing the final safety net.
 
+  memory.oom_group
+
+   A read-write single value file which exists on non-root
+   cgroups.  The default is "0".
+
+   If set, OOM killer will consider the memory cgroup as an
+   indivisible memory consumers and compare it with other memory
+   consumers by it's memory footprint.
+   If such memory cgroup is selected as an OOM victim, all
+   processes belonging to it or it's descendants will be killed.
+
+   This applies to system-wide OOM conditions and reaching
+   the hard memory limit of the cgroup and their ancestor.
+   If OOM condition happens in a descendant cgroup with it's own
+   memory limit, the memory cgroup can't be considered
+   as an OOM victim, and OOM killer will not kill all belonging
+   tasks.
+
+   Also, OOM killer respects the /proc/pid/oom_score_adj value -1000,
+   and will never kill the unkillable task, even if memory.oom_group
+   is set.
+
   memory.events
A read-only flat-keyed file which exists on non-root cgroups.
The following entries are defined.  Unless specified
@@ -1229,6 +1252,41 @@ to be accessed repeatedly by other cgroups, it may make sense to use
 POSIX_FADV_DONTNEED to relinquish the ownership of memory areas
 belonging to the affected files to ensure correct memory ownership.
 
+OOM Killer
+~~~~~~~~~~
+
+Cgroup v2 memory controller implements a cgroup-aware OOM killer.
+It means that it treats cgroups as first class OOM entities.
+
+Under OOM conditions the memory controller tries to make the best
+choice of a victim, looking for a memory cgroup with the largest
+memory footprint, considering leaf cgroups and cgroups with the
+memory.oom_group option set, which are considered to be an indivisible
+memory consumers.
+
+By default, OOM killer will kill the biggest task in the selected
+memory cgroup. A user can change this behavior by enabling
+the per-cgroup memory.oom_group option. If set, it causes
+the OOM killer to kill all processes attached to the cgroup,
+except processes with oom_score_adj set to -1000.
+
+This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
+the memory controller considers only cgroups belonging to the sub-tree
+of the OOM'ing cgroup.
+
+The root cgroup is treated as a leaf memory cgroup, so it's compared
+with other leaf memory cgroups and cgroups with oom_group option set.
+
+If there are no cgroups with the enabled memory controller,
+the OOM killer is using the "traditional" process-based approach.
+
+Please, note that memory charges are not migrating if tasks
+are moved between different memory cgroups. Moving tasks with
+significant memory footprint may affect OOM victim selection logic.
+If it's a case, please, consider creating a common ancestor for
+the source and destination memory cgroups and enabling oom_group
+on ancestor layer.
+
 
 IO
 --
-- 
2.14.3



[PATCH v13 7/7] cgroup: list groupoom in cgroup features

2017-11-30 Thread Roman Gushchin
List groupoom in the cgroup features list (exported via
/sys/kernel/cgroup/features), which userspace applications
(most likely systemd) can use to find out which cgroup features
the kernel supports.
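
As a rough illustration of how a userspace consumer might test for the
new entry, here is a minimal sketch that looks for an exact line match
in a newline-separated features buffer. has_feature() is a hypothetical
helper, and reading the buffer from /sys/kernel/cgroup/features is left
to the caller.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if 'features' (one feature name per line, as emitted by
 * features_show() in the patch) contains a line exactly equal to 'name'.
 */
static bool has_feature(const char *features, const char *name)
{
    size_t len = strlen(name);
    const char *line = features;

    while (*line) {
        const char *end = strchr(line, '\n');
        size_t n = end ? (size_t)(end - line) : strlen(line);

        /* Compare whole lines so "groupoomx" does not match "groupoom". */
        if (n == len && memcmp(line, name, len) == 0)
            return true;
        if (!end)
            break;
        line = end + 1;
    }
    return false;
}
```

Matching whole lines rather than using a substring search keeps the
check robust if a later feature name happens to contain "groupoom".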

Signed-off-by: Roman Gushchin 
Cc: Tejun Heo 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Johannes Weiner 
Cc: Tetsuo Handa 
Cc: David Rientjes 
Cc: Andrew Morton 
Cc: kernel-t...@fb.com
Cc: cgro...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux...@kvack.org
---
 kernel/cgroup/cgroup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 7338e12979e1..693443282fc1 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5922,7 +5922,8 @@ static struct kobj_attribute cgroup_delegate_attr = __ATTR_RO(delegate);
 static ssize_t features_show(struct kobject *kobj, struct kobj_attribute *attr,
 char *buf)
 {
-   return snprintf(buf, PAGE_SIZE, "nsdelegate\n");
+   return snprintf(buf, PAGE_SIZE, "nsdelegate\n"
+   "groupoom\n");
 }
 static struct kobj_attribute cgroup_features_attr = __ATTR_RO(features);
 
-- 
2.14.3



[PATCH v13 5/7] mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer

2017-11-30 Thread Roman Gushchin
Add a "groupoom" cgroup v2 mount option to enable the cgroup-aware
OOM killer. If not set, the OOM selection is performed in
a "traditional" per-process way.

The behavior can be changed dynamically by remounting the cgroupfs.
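
The option handling can be modelled in user space roughly as follows.
This is a sketch, not the kernel code: parse_root_flags() and the
MODEL_* names are hypothetical, MODEL_GROUP_OOM mirrors the (1 << 5)
value that this patch gives CGRP_GROUP_OOM, and the MODEL_NS_DELEGATE
value is only illustrative.

```c
#include <string.h>

enum {
    MODEL_NS_DELEGATE = 1 << 3,   /* illustrative value */
    MODEL_GROUP_OOM   = 1 << 5,   /* same bit as CGRP_GROUP_OOM in the patch */
};

/*
 * Parse a comma-separated mount option string into flag bits.
 * Returns 0 on success and -1 on an unknown option.
 * The input buffer is modified: commas are replaced with NUL bytes.
 */
static int parse_root_flags(char *data, unsigned int *root_flags)
{
    *root_flags = 0;

    while (data && *data) {
        char *token = data;
        char *comma = strchr(data, ',');

        if (comma) {
            *comma = '\0';
            data = comma + 1;
        } else {
            data = NULL;
        }

        if (!strcmp(token, "nsdelegate"))
            *root_flags |= MODEL_NS_DELEGATE;
        else if (!strcmp(token, "groupoom"))
            *root_flags |= MODEL_GROUP_OOM;
        else
            return -1;
    }
    return 0;
}
```

Because the flags are recomputed from scratch on every parse, a remount
without "groupoom" clears the bit again, which matches the dynamic
behavior described above.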

Signed-off-by: Roman Gushchin 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Johannes Weiner 
Cc: Tetsuo Handa 
Cc: David Rientjes 
Cc: Andrew Morton 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
Cc: cgro...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux...@kvack.org
---
 include/linux/cgroup-defs.h |  5 +
 kernel/cgroup/cgroup.c  | 10 ++
 mm/memcontrol.c |  3 +++
 3 files changed, 18 insertions(+)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 8b7fd8eeccee..9fb99e25d654 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -81,6 +81,11 @@ enum {
 * Enable cpuset controller in v1 cgroup to use v2 behavior.
 */
CGRP_ROOT_CPUSET_V2_MODE = (1 << 4),
+
+   /*
+* Enable cgroup-aware OOM killer.
+*/
+   CGRP_GROUP_OOM = (1 << 5),
 };
 
 /* cftype->flags */
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 0b1ffe147f24..7338e12979e1 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1731,6 +1731,9 @@ static int parse_cgroup_root_flags(char *data, unsigned int *root_flags)
if (!strcmp(token, "nsdelegate")) {
*root_flags |= CGRP_ROOT_NS_DELEGATE;
continue;
+   } else if (!strcmp(token, "groupoom")) {
+   *root_flags |= CGRP_GROUP_OOM;
+   continue;
}
 
pr_err("cgroup2: unknown option \"%s\"\n", token);
@@ -1747,6 +1750,11 @@ static void apply_cgroup_root_flags(unsigned int root_flags)
cgrp_dfl_root.flags |= CGRP_ROOT_NS_DELEGATE;
else
cgrp_dfl_root.flags &= ~CGRP_ROOT_NS_DELEGATE;
+
+   if (root_flags & CGRP_GROUP_OOM)
+   cgrp_dfl_root.flags |= CGRP_GROUP_OOM;
+   else
+   cgrp_dfl_root.flags &= ~CGRP_GROUP_OOM;
}
 }
 
@@ -1754,6 +1762,8 @@ static int cgroup_show_options(struct seq_file *seq, struct kernfs_root *kf_root
 {
if (cgrp_dfl_root.flags & CGRP_ROOT_NS_DELEGATE)
seq_puts(seq, ",nsdelegate");
+   if (cgrp_dfl_root.flags & CGRP_GROUP_OOM)
+   seq_puts(seq, ",groupoom");
return 0;
 }
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5d27a4bbd478..c76d5fb55c5c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2869,6 +2869,9 @@ bool mem_cgroup_select_oom_victim(struct oom_control *oc)
if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
return false;
 
+   if (!(cgrp_dfl_root.flags & CGRP_GROUP_OOM))
+   return false;
+
if (oc->memcg)
root = oc->memcg;
else
-- 
2.14.3



[PATCH v13 2/7] mm: implement mem_cgroup_scan_tasks() for the root memory cgroup

2017-11-30 Thread Roman Gushchin
Implement mem_cgroup_scan_tasks() functionality for the root
memory cgroup, so that the cgroup-aware OOM killer can use this
function to look for an OOM victim task in the root memory cgroup.

The root memory cgroup is treated as a leaf cgroup, so only tasks
which belong directly to the root cgroup are iterated over.

This patch doesn't introduce any functional change as
mem_cgroup_scan_tasks() is never called for the root memcg.
This is preparatory work for the cgroup-aware OOM killer,
which will use this function to iterate over tasks belonging
to the root memcg.
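
The iteration rule can be pictured with a small user-space model. It is
a sketch under stated assumptions: struct model_cgroup, scan_tasks() and
count_task() are hypothetical names, tasks are plain integer ids, and
the walk is a simple recursive pre-order traversal rather than the
kernel's iterator.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cgroup with directly attached tasks. */
struct model_cgroup {
    const int *tasks;                            /* ids of attached tasks */
    size_t ntasks;
    const struct model_cgroup *const *children;
    size_t nchildren;
};

/*
 * Call fn() for every task in the sub-tree of 'cg'. Iteration stops as
 * soon as fn() returns non-zero, and that value is returned. If
 * 'is_root' is set, only tasks attached directly to 'cg' are visited.
 */
static int scan_tasks(const struct model_cgroup *cg, bool is_root,
                      int (*fn)(int task, void *arg), void *arg)
{
    int ret = 0;
    size_t i;

    for (i = 0; i < cg->ntasks && !ret; i++)
        ret = fn(cg->tasks[i], arg);

    /* Stop on a non-zero result, or after the root's own tasks. */
    if (ret || is_root)
        return ret;

    for (i = 0; i < cg->nchildren && !ret; i++)
        ret = scan_tasks(cg->children[i], false, fn, arg);
    return ret;
}

/* Example callback: count visited tasks, never stop early. */
static int count_task(int task, void *arg)
{
    (void)task;
    ++*(int *)arg;
    return 0;
}
```

Passing the same cgroup once with is_root cleared and once with it set
shows the difference: the first call visits the whole sub-tree, the
second only the tasks attached directly to that cgroup.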

Signed-off-by: Roman Gushchin 
Acked-by: Michal Hocko 
Acked-by: Johannes Weiner 
Acked-by: David Rientjes 
Cc: Vladimir Davydov 
Cc: Tetsuo Handa 
Cc: Andrew Morton 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
Cc: cgro...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux...@kvack.org
---
 mm/memcontrol.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4aac306ebe3..55fbda60cef6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -888,7 +888,8 @@ static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
  * value, the function breaks the iteration loop and returns the value.
  * Otherwise, it will iterate over all tasks and return 0.
  *
- * This function must not be called for the root memory cgroup.
+ * If memcg is the root memory cgroup, this function will iterate only
+ * over tasks belonging directly to the root memory cgroup.
  */
 int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
  int (*fn)(struct task_struct *, void *), void *arg)
@@ -896,8 +897,6 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
struct mem_cgroup *iter;
int ret = 0;
 
-   BUG_ON(memcg == root_mem_cgroup);
-
for_each_mem_cgroup_tree(iter, memcg) {
struct css_task_iter it;
struct task_struct *task;
@@ -906,7 +905,7 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
while (!ret && (task = css_task_iter_next(&it)))
ret = fn(task, arg);
css_task_iter_end(&it);
-   if (ret) {
+   if (ret || memcg == root_mem_cgroup) {
mem_cgroup_iter_break(memcg, iter);
break;
}
-- 
2.14.3



[PATCH v13 0/7] cgroup-aware OOM killer

2017-11-30 Thread Roman Gushchin
This patchset makes the OOM killer cgroup-aware.

v13:
  - Reverted fallback to per-process OOM as in v11 (asked by Michal)
  - Added entry in cgroup features list
  - Added a note about charge migration
  - Rebase

v12:
  - Root memory cgroup is evaluated based on sum of the oom scores
of belonging tasks
  - Do not fall back to the per-process behavior if
it wasn't possible to kill a memcg victim
  - Rebase on top of mm tree

v11:
  - Fixed an issue with skipping the root mem cgroup
(discovered by Shakeel Butt)
  - Moved a check in __oom_kill_process() to the memory.oom_group
patch, added corresponding comments
  - Added a note about ignoring tasks with oom_score_adj -1000
(proposed by Michal Hocko)
  - Rebase on top of mm tree

v10:
  - Separate oom_group introduction into a standalone patch
  - Stop propagating oom_group
  - Make oom_group delegatable
  - Do not try to kill the biggest task in the first order,
if the whole cgroup is going to be killed
  - Stop caching oom_score on struct memcg, optimize victim
memcg selection
  - Drop dmesg printing (for further refining)
  - Small refactorings and comments added here and there
  - Rebase on top of mm tree

v9:
  - Change siblings-to-siblings comparison to the tree-wide search,
make related refactorings
  - Make oom_group implicitly propagated down by the tree
  - Fix an issue with task selection in root cgroup

v8:
  - Do not kill tasks with OOM_SCORE_ADJ -1000
  - Make the whole thing opt-in with cgroup mount option control
  - Drop oom_priority for further discussions
  - Kill the whole cgroup if oom_group is set and it's
memory.max is reached
  - Update docs and commit messages

v7:
  - __oom_kill_process() drops reference to the victim task
  - oom_score_adj -1000 is always respected
  - Renamed oom_kill_all to oom_group
  - Dropped oom_prio range, converted from short to int
  - Added a cgroup v2 mount option to disable cgroup-aware OOM killer
  - Docs updated
  - Rebased on top of mmotm

v6:
  - Renamed oom_control.chosen to oom_control.chosen_task
  - Renamed oom_kill_all_tasks to oom_kill_all
  - Per-node NR_SLAB_UNRECLAIMABLE accounting
  - Several minor fixes and cleanups
  - Docs updated

v5:
  - Rebased on top of Michal Hocko's patches, which have changed the
way OOM victims gain access to the memory reserves.
Dropped the corresponding part of this patchset
  - Separated the oom_kill_process() splitting into a standalone commit
  - Added debug output (suggested by David Rientjes)
  - Some minor fixes

v4:
  - Reworked per-cgroup oom_score_adj into oom_priority
(based on ideas by David Rientjes)
  - Tasks with oom_score_adj -1000 are never selected if
oom_kill_all_tasks is not set
  - Memcg victim selection code is reworked, and
synchronization is based on finding tasks with OOM victim marker,
rather than on a global counter
  - Debug output is dropped
  - Refactored TIF_MEMDIE usage

v3:
  - Merged commits 1-4 into 6
  - Separated oom_score_adj logic and debug output into separate commits
  - Fixed swap accounting

v2:
  - Reworked victim selection based on feedback
from Michal Hocko, Vladimir Davydov and Johannes Weiner
  - "Kill all tasks" is now an opt-in option, by default
only one process will be killed
  - Added per-cgroup oom_score_adj
  - Refined oom score calculations, suggested by Vladimir Davydov
  - Converted to a patchset

v1:
  https://lkml.org/lkml/2017/5/18/969


Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Johannes Weiner 
Cc: Tetsuo Handa 
Cc: David Rientjes 
Cc: Andrew Morton 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
Cc: cgro...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org

Roman Gushchin (7):
  mm, oom: refactor the oom_kill_process() function
  mm: implement mem_cgroup_scan_tasks() for the root memory cgroup
  mm, oom: cgroup-aware OOM killer
  mm, oom: introduce memory.oom_group
  mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer
  mm, oom, docs: describe the cgroup-aware OOM killer
  cgroup: list groupoom in cgroup features

 Documentation/cgroup-v2.txt |  58 ++
 include/linux/cgroup-defs.h |   5 +
 include/linux/memcontrol.h  |  34 ++
 include/linux/oom.h |  12 ++-
 kernel/cgroup/cgroup.c  |  13 ++-
 mm/memcontrol.c | 258 +++-
 mm/oom_kill.c   | 224 +-
 7 files changed, 525 insertions(+), 79 deletions(-)

-- 
2.14.3



[PATCH v13 1/7] mm, oom: refactor the oom_kill_process() function

2017-11-30 Thread Roman Gushchin
The oom_kill_process() function consists of two logical parts:
the first one is responsible for considering task's children as
a potential victim and printing the debug information.
The second half is responsible for sending SIGKILL to all
tasks sharing the mm struct with the given victim.

This commit splits the oom_kill_process() function with
the intention of re-using the second half: __oom_kill_process().

The cgroup-aware OOM killer will kill multiple tasks
belonging to the victim cgroup. We don't need to print
the debug information for each task, nor to play
with task selection (considering the task's children),
so we can't use the existing oom_kill_process().

Signed-off-by: Roman Gushchin 
Acked-by: Michal Hocko 
Acked-by: Johannes Weiner 
Acked-by: David Rientjes 
Cc: Vladimir Davydov 
Cc: Tetsuo Handa 
Cc: David Rientjes 
Cc: Andrew Morton 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
Cc: cgro...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux...@kvack.org
---
 mm/oom_kill.c | 123 +++---
 1 file changed, 65 insertions(+), 58 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 3b0d0fed8480..f041534d77d3 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -814,68 +814,12 @@ static bool task_will_free_mem(struct task_struct *task)
return ret;
 }
 
-static void oom_kill_process(struct oom_control *oc, const char *message)
+static void __oom_kill_process(struct task_struct *victim)
 {
-   struct task_struct *p = oc->chosen;
-   unsigned int points = oc->chosen_points;
-   struct task_struct *victim = p;
-   struct task_struct *child;
-   struct task_struct *t;
+   struct task_struct *p;
struct mm_struct *mm;
-   unsigned int victim_points = 0;
-   static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
- DEFAULT_RATELIMIT_BURST);
bool can_oom_reap = true;
 
-   /*
-* If the task is already exiting, don't alarm the sysadmin or kill
-* its children or threads, just give it access to memory reserves
-* so it can die quickly
-*/
-   task_lock(p);
-   if (task_will_free_mem(p)) {
-   mark_oom_victim(p);
-   wake_oom_reaper(p);
-   task_unlock(p);
-   put_task_struct(p);
-   return;
-   }
-   task_unlock(p);
-
-   if (__ratelimit(&oom_rs))
-   dump_header(oc, p);
-
-   pr_err("%s: Kill process %d (%s) score %u or sacrifice child\n",
-   message, task_pid_nr(p), p->comm, points);
-
-   /*
-* If any of p's children has a different mm and is eligible for kill,
-* the one with the highest oom_badness() score is sacrificed for its
-* parent.  This attempts to lose the minimal amount of work done while
-* still freeing memory.
-*/
-   read_lock(&tasklist_lock);
-   for_each_thread(p, t) {
-   list_for_each_entry(child, &t->children, sibling) {
-   unsigned int child_points;
-
-   if (process_shares_mm(child, p->mm))
-   continue;
-   /*
-* oom_badness() returns 0 if the thread is unkillable
-*/
-   child_points = oom_badness(child,
-   oc->memcg, oc->nodemask, oc->totalpages);
-   if (child_points > victim_points) {
-   put_task_struct(victim);
-   victim = child;
-   victim_points = child_points;
-   get_task_struct(victim);
-   }
-   }
-   }
-   read_unlock(&tasklist_lock);
-
p = find_lock_task_mm(victim);
if (!p) {
put_task_struct(victim);
@@ -949,6 +893,69 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 }
 #undef K
 
+static void oom_kill_process(struct oom_control *oc, const char *message)
+{
+   struct task_struct *p = oc->chosen;
+   unsigned int points = oc->chosen_points;
+   struct task_struct *victim = p;
+   struct task_struct *child;
+   struct task_struct *t;
+   unsigned int victim_points = 0;
+   static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
+ DEFAULT_RATELIMIT_BURST);
+
+   /*
+* If the task is already exiting, don't alarm the sysadmin or kill
+* its children or threads, just give it access to memory reserves
+* so it can die quickly
+*/
+   task_lock(p);
+   if 

[PATCH] kernel-doc: parse DECLARE_KFIFO_PTR()

2017-11-30 Thread Mauro Carvalho Chehab
On media, we now have a struct declared with:

struct lirc_fh {
        struct list_head list;
        struct rc_dev *rc;
        int carrier_low;
        bool send_timeout_reports;
        DECLARE_KFIFO_PTR(rawir, unsigned int);
        DECLARE_KFIFO_PTR(scancodes, struct lirc_scancode);
        wait_queue_head_t wait_poll;
        u8 send_mode;
        u8 rec_mode;
};

Currently, kernel-doc produces the following warnings for it:

    ./include/media/rc-core.h:96: warning: No description found for parameter 'int'
    ./include/media/rc-core.h:96: warning: No description found for parameter 'lirc_scancode'
    ./include/media/rc-core.h:96: warning: Excess struct member 'rawir' description in 'lirc_fh'
    ./include/media/rc-core.h:96: warning: Excess struct member 'scancodes' description in 'lirc_fh'

So, teach kernel-doc how to parse DECLARE_KFIFO_PTR().

While here, relax the patterns for the existing DECLARE_foo() macros
to accept any amount of whitespace after the comma.
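
To illustrate the intended rewrite, here is a minimal C sketch of what
the new substitution does to a single macro invocation. It is not the
kernel-doc code (that is a Perl regex applied globally to the struct
body): rewrite_kfifo_ptr() is a hypothetical helper that handles only
one invocation at the start of the input and is stricter than the regex
in some corner cases.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite "DECLARE_KFIFO_PTR(name, type)" as "type *name" into 'out'.
 * Returns 0 on success, -1 if 'in' does not match or 'out' is too small.
 */
static int rewrite_kfifo_ptr(const char *in, char *out, size_t outsz)
{
    static const char macro[] = "DECLARE_KFIFO_PTR";
    const char *name, *comma, *type, *close;
    int n;

    if (strncmp(in, macro, sizeof(macro) - 1) != 0)
        return -1;
    name = in + sizeof(macro) - 1;
    while (isspace((unsigned char)*name))   /* optional space before '(' */
        name++;
    if (*name++ != '(')
        return -1;

    /* First argument: non-empty, no ')' before the comma. */
    comma = strchr(name, ',');
    if (!comma || comma == name || memchr(name, ')', (size_t)(comma - name)))
        return -1;

    /* Any amount of whitespace is accepted after the comma. */
    type = comma + 1;
    while (isspace((unsigned char)*type))
        type++;

    /* Second argument: non-empty, no further ',' before ')'. */
    close = strchr(type, ')');
    if (!close || close == type || memchr(type, ',', (size_t)(close - type)))
        return -1;

    n = snprintf(out, outsz, "%.*s *%.*s",
                 (int)(close - type), type, (int)(comma - name), name);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

For the two members of struct lirc_fh this yields "unsigned int *rawir"
and "struct lirc_scancode *scancodes", which kernel-doc can then parse
as ordinary pointer members.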

Signed-off-by: Mauro Carvalho Chehab 
---
 scripts/kernel-doc | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index bd29a92b4b48..5c12208f8c89 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -2208,10 +2208,11 @@ sub dump_struct($$) {
$members =~ s/__aligned\s*\([^;]*\)//gos;
$members =~ s/\s*CRYPTO_MINALIGN_ATTR//gos;
# replace DECLARE_BITMAP
-   $members =~ s/DECLARE_BITMAP\s*\(([^,)]+), ([^,)]+)\)/unsigned long $1\[BITS_TO_LONGS($2)\]/gos;
+   $members =~ s/DECLARE_BITMAP\s*\(([^,)]+),\s*([^,)]+)\)/unsigned long $1\[BITS_TO_LONGS($2)\]/gos;
# replace DECLARE_HASHTABLE
-   $members =~ s/DECLARE_HASHTABLE\s*\(([^,)]+), ([^,)]+)\)/unsigned long $1\[1 << (($2) - 1)\]/gos;
-
+   $members =~ s/DECLARE_HASHTABLE\s*\(([^,)]+),\s*([^,)]+)\)/unsigned long $1\[1 << (($2) - 1)\]/gos;
+   # replace DECLARE_KFIFO_PTR(fifo, type)
+   $members =~ s/DECLARE_KFIFO_PTR\s*\(([^,)]+),\s*([^,)]+)\)/$2 \*$1/gos;
create_parameterlist($members, ';', $file);
check_sections($file, $declaration_name, $decl_type, $sectcheck, $struct_actual, $nested);
 
-- 
2.14.3




RE: [PATCH v2] MIPS: Add nonxstack=on|off kernel parameter

2017-11-30 Thread Miodrag Dinic
Hi James,

> > We do have PT_GNU_STACK flags set correctly, this feature is required to
> > workaround CPU revisions which do not have RIXI support.
> 
> RIXI support can be discovered programatically from CP0_Config3.RXI
> (cpu_has_rixi in asm/cpu-features.h), so I don't follow why CPUs without
> RIXI would require a kernel parameter.

The following patch changed the behavior with regard to
stack & heap executability:
commit 1a770b85c1f1c1ee37afd7cef5237ffc4c970f04
Author: Paul Burton 
Date:   Fri Jul 8 11:06:20 2016 +0100

MIPS: non-exec stack & heap when non-exec PT_GNU_STACK is present

The stack and heap have both been executable by default on MIPS until
now. This patch changes the default to be non-executable, but only for
ELF binaries with a non-executable PT_GNU_STACK header present. This
does apply to both the heap & the stack, despite the name PT_GNU_STACK,
and this matches the behaviour of other architectures like ARM & x86.

Current MIPS toolchains do not produce the PT_GNU_STACK header, which
means that we can rely upon this patch not changing the behaviour of
existing binaries. The new default will only take effect for newly
compiled binaries once toolchains are updated to support PT_GNU_STACK,
and since those binaries are newly compiled they can be compiled
expecting the change in default behaviour. Again this matches the way in
which the ARM & x86 architectures handled their implementations of
non-executable memory.

Signed-off-by: Paul Burton 
Cc: Leonid Yegoshin 
Cc: Maciej Rozycki 
Cc: Faraz Shahbazker 
Cc: Raghu Gandham 
Cc: Matthew Fortune 
Cc: linux-m...@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13765/
Signed-off-by: Ralf Baechle 

...

When the kernel determines which type of mapping it should apply:

fs/binfmt_elf.c:
..
if (elf_read_implies_exec(loc->elf_ex, executable_stack))
current->personality |= READ_IMPLIES_EXEC;
..

this effectively calls mips_elf_read_implies_exec(), which performs this check:
..
if (!cpu_has_rixi) {
/* The CPU doesn't support non-executable memory */
return 1;
}

return 0;
}

This in turn makes the stack & heap executable on processors without RIXI,
which are practically all processors with MIPS ISA R < 6.

We would like to have an option to override this and force non-executable
mappings on such systems.

Kind regards,
Miodrag

From: James Hogan
Sent: Thursday, November 30, 2017 11:09 AM
To: Miodrag Dinic
Cc: David Daney; Aleksandar Markovic; linux-m...@linux-mips.org; Aleksandar 
Markovic; Andrew Morton; DengCheng Zhu; Ding Tianhong; Douglas Leung; Frederic 
Weisbecker; Goran Ferenc; Ingo Molnar; James Cowgill; Jonathan Corbet; 
linux-doc@vger.kernel.org; linux-ker...@vger.kernel.org; Marc Zyngier; Matt 
Redfearn; Mimi Zohar; Paul Burton; Paul E. McKenney; Petar Jovanovic; Raghu 
Gandham; Ralf Baechle; Thomas Gleixner; Tom Saeger
Subject: Re: [PATCH v2] MIPS: Add nonxstack=on|off kernel parameter

On Thu, Nov 30, 2017 at 09:34:15AM +, Miodrag Dinic wrote:
> Hi David,
>
> Sorry for a late response, please find answers in-lined:
>
> > > If this parameter is omitted, kernel behavior remains the same as it
> > > was before this patch is applied.
> >
> > Do other architectures have a similar hack?
> >
> > If arm{,64} and x86 don't need this, what would make MIPS so special
> > that we have to carry this around?
>
> Yes, there are similar workarounds. Just a couple lines above
> nonxstack description in the documentation there are :
>   noexec  [IA-64]
>
>   noexec  [X86]
>   On X86-32 available only on PAE configured kernels.
>   noexec=on: enable non-executable mappings (default)
>   noexec=off: disable non-executable mappings
> ...
>
>   noexec32[X86-64]
>   This affects only 32-bit executables.
>   noexec32=on: enable non-executable mappings (default)
>   read doesn't imply executable mappings
>   noexec32=off: disable non-executable mappings
>   read implies executable mappings
>
> > >
> > > This functionality is convenient during debugging and is especially
> > > useful for Android development where non-exec stack is required.
> >
> > Why not just set the PT_GNU_STACK flags correctly in the first place?
>
> We do have PT_GNU_STACK flags set correctly, this feature is required to
> workaround CPU revisions which do not have RIXI support.

RIXI support can be discovered programmatically from 

Re: [PATCH v3] doc: add maintainer book

2017-11-30 Thread Mauro Carvalho Chehab
Em Thu, 30 Nov 2017 21:47:44 +1100
"Tobin C. Harding"  escreveu:

> On Thu, Nov 30, 2017 at 07:01:19AM -0200, Mauro Carvalho Chehab wrote:
> > Em Thu, 30 Nov 2017 12:55:07 +1100
> > "Tobin C. Harding"  escreveu:


> > > +So, by way of an example, Greg gives; a pull request with miscellaneous  
> > 
> > Nitpick: there's an extra ";" character above:
> > gives; -> gives  
> 
> Ha ha, I thought for ages how to word this bit. The irony of grammar
> corrections from a non-native speaker is not lost on me :)

:-)

Well, if that serves as a consolation, I had to go to a dictionary to
understand "By way of":

https://www.collinsdictionary.com/dictionary/english/by-way-of

As that was a new expression for me :-)

Anyway, AFAICT, English and Portuguese (and probably Spanish) have
similar rules with regard to commas and semicolons.

> 
> Perhaps:
> 
>  By way of an example Greg gives, a pull request with miscellaneous
> 
> I'll take any nitpicks you have Mauro, striving for perfection here. Thanks.

Yeah, that looks way better to my eyes.

Thanks,
Mauro


Re: [PATCH v3] doc: add maintainer book

2017-11-30 Thread Tobin C. Harding
On Thu, Nov 30, 2017 at 07:01:19AM -0200, Mauro Carvalho Chehab wrote:
> Em Thu, 30 Nov 2017 12:55:07 +1100
> "Tobin C. Harding"  escreveu:
> 
> > There is currently very little documentation in the kernel on maintainer
> > level tasks. In particular there are no documents on creating pull
> > requests to submit to Linus.
> > 
> > Quoting Greg Kroah-Hartman on LKML:
> > 
> > Anyway, this actually came up at the kernel summit / maintainer
> > meeting a few weeks ago, in that "how do I make a
> > good pull request to Linus" is something we need to document.
> > 
> > Here's what I do, and it seems to work well, so maybe we should turn
> > it into the start of the documentation for how to do it.
> > 
> > (quote references: kernel summit, Europe 2017)
> > 
> > Create a new kernel documentation book 'how to be a maintainer'
> > (suggested by Jonathan Corbet). Add chapters on 'configuring git' and
> > 'creating a pull request'.
> > 
> > Most of the content was written by Linus Torvalds and Greg Kroah-Hartman
> > in discussion on LKML. This is stated at the start of one of the
> > chapters and the original email thread is referenced in
> > 'pull-requests.rst'.
> > 
> > Signed-off-by: Tobin C. Harding 
> > Reviewed-by: Greg Kroah-Hartman 
> > 
> > ---
> > 
> > v3:
> >  - Modified details for branch and tag naming, suggested by Mauro
> >Carvalho Chehab.
> >  - Added example email subject line for submitting pull requests.
> >  - Re-added Greg's reviewed-by tag from version 1.
> > 
> > v2:
> >  - Change title of book, suggested by Dan Williams.
> > 
> > ---
> >  Documentation/index.rst|   1 +
> >  Documentation/maintainer/conf.py   |  10 ++
> >  Documentation/maintainer/configure-git.rst |  34 ++
> >  Documentation/maintainer/index.rst |  10 ++
> > >  Documentation/maintainer/pull-requests.rst | 182 +
> >  5 files changed, 237 insertions(+)
> >  create mode 100644 Documentation/maintainer/conf.py
> >  create mode 100644 Documentation/maintainer/configure-git.rst
> >  create mode 100644 Documentation/maintainer/index.rst
> >  create mode 100644 Documentation/maintainer/pull-requests.rst
> > 
> > diff --git a/Documentation/index.rst b/Documentation/index.rst
> > index cb7f1ba5b3b1..a4fb34dddcf3 100644
> > --- a/Documentation/index.rst
> > +++ b/Documentation/index.rst
> > @@ -52,6 +52,7 @@ merged much easier.
> > dev-tools/index
> > doc-guide/index
> > kernel-hacking/index
> > +   maintainer/index
> >  
> >  Kernel API documentation
> >  
> > diff --git a/Documentation/maintainer/conf.py 
> > b/Documentation/maintainer/conf.py
> > new file mode 100644
> > index ..81e9eb7a7884
> > --- /dev/null
> > +++ b/Documentation/maintainer/conf.py
> > @@ -0,0 +1,10 @@
> > +# -*- coding: utf-8; mode: python -*-
> > +
> > +project = 'Linux Kernel Development Documentation'
> > +
> > +tags.add("subproject")
> > +
> > +latex_documents = [
> > +('index', 'maintainer.tex', 'Linux Kernel Development Documentation',
> > + 'The kernel development community', 'manual'),
> > +]
> > diff --git a/Documentation/maintainer/configure-git.rst 
> > b/Documentation/maintainer/configure-git.rst
> > new file mode 100644
> > index ..780d2c84
> > --- /dev/null
> > +++ b/Documentation/maintainer/configure-git.rst
> > @@ -0,0 +1,34 @@
> > +.. _configuregit:
> > +
> > +Configure Git
> > +=
> > +
> > +This chapter describes maintainer level git configuration.
> > +
> > +Tagged branches used in :ref:`Documentation/maintainer/pull-requests.rst
> > > +<pullrequests>` should be signed with the developers public GPG key. Signed
> > +tags can be created by passing the ``-u`` flag to ``git tag``. However,
> > +since you would *usually* use the same key for the same project, you can
> > +set it once with
> > +::
> > +
> > +   git config user.signingkey "keyname"
> > +
> > +Alternatively, edit your ``.git/config`` or ``~/.gitconfig`` file by hand:
> > +::
> > +
> > +   [user]
> > +   name = Jane Developer
> > +   email = j...@domain.org
> > +   signingkey = j...@domain.org
> > +
> > +You may need to tell ``git`` to use ``gpg2``
> > +::
> > +
> > +   [gpg]
> > +   program = /path/to/gpg2
> > +
> > +You may also like to tell ``gpg`` which ``tty`` to use (add to your shell 
> > rc file)
> > +::
> > +
> > +   export GPG_TTY=$(tty)
> > diff --git a/Documentation/maintainer/index.rst 
> > b/Documentation/maintainer/index.rst
> > new file mode 100644
> > index ..fa84ac9cae39
> > --- /dev/null
> > +++ b/Documentation/maintainer/index.rst
> > @@ -0,0 +1,10 @@
> > +==
> > +Kernel Maintainer Handbook
> > +==
> > +
> > +.. toctree::
> > +   :maxdepth: 2
> > +
> > +   configure-git
> > +   pull-requests
> > +
> > diff --git a/Documentation/maintainer/pull-requests.rst 
> 

Re: [PATCH v2] MIPS: Add nonxstack=on|off kernel parameter

2017-11-30 Thread James Hogan
On Thu, Nov 30, 2017 at 09:34:15AM +, Miodrag Dinic wrote:
> Hi David,
> 
> Sorry for a late response, please find answers in-lined:
> 
> > > If this parameter is omitted, kernel behavior remains the same as it
> > > was before this patch is applied.
> > 
> > Do other architectures have a similar hack?
> > 
> > If arm{,64} and x86 don't need this, what would make MIPS so special
> > that we have to carry this around?
> 
> Yes, there are similar workarounds. Just a couple lines above
> nonxstack description in the documentation there are :
>   noexec  [IA-64]
> 
>   noexec  [X86]
>   On X86-32 available only on PAE configured kernels.
>   noexec=on: enable non-executable mappings (default)
>   noexec=off: disable non-executable mappings
> ...
> 
>   noexec32[X86-64]
>   This affects only 32-bit executables.
>   noexec32=on: enable non-executable mappings (default)
>   read doesn't imply executable mappings
>   noexec32=off: disable non-executable mappings
>   read implies executable mappings
> 
> > > 
> > > This functionality is convenient during debugging and is especially
> > > useful for Android development where non-exec stack is required.
> > 
> > Why not just set the PT_GNU_STACK flags correctly in the first place?
> 
> We do have PT_GNU_STACK flags set correctly, this feature is required to
> workaround CPU revisions which do not have RIXI support.

RIXI support can be discovered programmatically from CP0_Config3.RXI
(cpu_has_rixi in asm/cpu-features.h), so I don't follow why CPUs without
RIXI would require a kernel parameter.
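
As a rough illustration of that point, the test reduces to one bit in the
Config3 register value. The sketch below assumes RXI is bit 12 of CP0 Config3
and takes the register value as a plain argument; CONFIG3_RXI and
config3_has_rixi() are hypothetical names, not the kernel's own helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the RXI field in CP0 Config3 (bit 12). */
#define CONFIG3_RXI (UINT32_C(1) << 12)

/*
 * Returns true when the given Config3 value advertises RIXI support.
 * Reading the register itself is privileged and is left out of this sketch.
 */
bool config3_has_rixi(uint32_t config3)
{
	return (config3 & CONFIG3_RXI) != 0;
}
```

Since the answer is available from the hardware, the default behaviour needs no
user input; a parameter only matters for overriding that answer.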

Cheers
James

> 
> Kind regards,
> Miodrag
> 
> From: David Daney [dda...@caviumnetworks.com]
> Sent: Tuesday, November 21, 2017 9:53 PM
> To: Aleksandar Markovic; linux-m...@linux-mips.org
> Cc: Miodrag Dinic; Aleksandar Markovic; Andrew Morton; DengCheng Zhu; Ding 
> Tianhong; Douglas Leung; Frederic Weisbecker; Goran Ferenc; Ingo Molnar; 
> James Cowgill; James Hogan; Jonathan Corbet; linux-doc@vger.kernel.org; 
> linux-ker...@vger.kernel.org; Marc Zyngier; Matt Redfearn; Mimi Zohar; Paul 
> Burton; Paul E. McKenney; Petar Jovanovic; Raghu Gandham; Ralf Baechle; 
> Thomas Gleixner; Tom Saeger
> Subject: Re: [PATCH v2] MIPS: Add nonxstack=on|off kernel parameter
> 
> On 11/21/2017 05:56 AM, Aleksandar Markovic wrote:
> > From: Miodrag Dinic 
> >
> > Add a new kernel parameter to override the default behavior related
> > to the decision whether to set up stack as non-executable in function
> > mips_elf_read_implies_exec().
> >
> > The new parameter is used to control non executable stack and heap,
> > regardless of PT_GNU_STACK entry. This does apply to both stack and
> > heap, despite the name.
> >
> > Allowed values:
> >
> > nonxstack=on  Force non-exec stack & heap
> > nonxstack=off Force executable stack & heap
> >
> > If this parameter is omitted, kernel behavior remains the same as it
> > was before this patch is applied.
> 
> Do other architectures have a similar hack?
> 
> If arm{,64} and x86 don't need this, what would make MIPS so special
> that we have to carry this around?
> 
> 
> >
> > This functionality is convenient during debugging and is especially
> > useful for Android development where non-exec stack is required.
> 
> Why not just set the PT_GNU_STACK flags correctly in the first place?
> 
> >
> > Signed-off-by: Miodrag Dinic 
> > Signed-off-by: Aleksandar Markovic 
> > ---
> >   Documentation/admin-guide/kernel-parameters.txt | 11 +++
> > >   arch/mips/kernel/elf.c  | 39 +
> >   2 files changed, 50 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> > b/Documentation/admin-guide/kernel-parameters.txt
> > index b74e133..99464ee 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -2614,6 +2614,17 @@
> >   noexec32=off: disable non-executable mappings
> >   read implies executable mappings
> >
> > + nonxstack   [MIPS]
> > + Force setting up stack and heap as non-executable or
> > + executable regardless of PT_GNU_STACK entry. Both
> > + stack and heap are affected, despite the name. Valid
> > + arguments: on, off.
> > + nonxstack=on:   Force non-executable stack and heap
> > + nonxstack=off:  Force executable stack and heap
> > + If ommited, stack and heap will or will not be set
> > + up as non-executable depending on PT_GNU_STACK
> > +   

RE: [PATCH v2] MIPS: Add nonxstack=on|off kernel parameter

2017-11-30 Thread Miodrag Dinic
Hi David,

Sorry for a late response, please find answers in-lined:

> > If this parameter is omitted, kernel behavior remains the same as it
> > was before this patch is applied.
> 
> Do other architectures have a similar hack?
> 
> If arm{,64} and x86 don't need this, what would make MIPS so special
> that we have to carry this around?

Yes, there are similar workarounds. Just a couple lines above
nonxstack description in the documentation there are :
noexec  [IA-64]

noexec  [X86]
On X86-32 available only on PAE configured kernels.
noexec=on: enable non-executable mappings (default)
noexec=off: disable non-executable mappings
..

noexec32[X86-64]
This affects only 32-bit executables.
noexec32=on: enable non-executable mappings (default)
read doesn't imply executable mappings
noexec32=off: disable non-executable mappings
read implies executable mappings

> > 
> > This functionality is convenient during debugging and is especially
> > useful for Android development where non-exec stack is required.
> 
> Why not just set the PT_GNU_STACK flags correctly in the first place?

We do have PT_GNU_STACK flags set correctly, this feature is required to
workaround CPU revisions which do not have RIXI support.

Kind regards,
Miodrag

From: David Daney [dda...@caviumnetworks.com]
Sent: Tuesday, November 21, 2017 9:53 PM
To: Aleksandar Markovic; linux-m...@linux-mips.org
Cc: Miodrag Dinic; Aleksandar Markovic; Andrew Morton; DengCheng Zhu; Ding 
Tianhong; Douglas Leung; Frederic Weisbecker; Goran Ferenc; Ingo Molnar; James 
Cowgill; James Hogan; Jonathan Corbet; linux-doc@vger.kernel.org; 
linux-ker...@vger.kernel.org; Marc Zyngier; Matt Redfearn; Mimi Zohar; Paul 
Burton; Paul E. McKenney; Petar Jovanovic; Raghu Gandham; Ralf Baechle; Thomas 
Gleixner; Tom Saeger
Subject: Re: [PATCH v2] MIPS: Add nonxstack=on|off kernel parameter

On 11/21/2017 05:56 AM, Aleksandar Markovic wrote:
> From: Miodrag Dinic 
>
> Add a new kernel parameter to override the default behavior related
> to the decision whether to set up stack as non-executable in function
> mips_elf_read_implies_exec().
>
> The new parameter is used to control non executable stack and heap,
> regardless of PT_GNU_STACK entry. This does apply to both stack and
> heap, despite the name.
>
> Allowed values:
>
> nonxstack=on  Force non-exec stack & heap
> nonxstack=off Force executable stack & heap
>
> If this parameter is omitted, kernel behavior remains the same as it
> was before this patch is applied.

Do other architectures have a similar hack?

If arm{,64} and x86 don't need this, what would make MIPS so special
that we have to carry this around?


>
> This functionality is convenient during debugging and is especially
> useful for Android development where non-exec stack is required.

Why not just set the PT_GNU_STACK flags correctly in the first place?

>
> Signed-off-by: Miodrag Dinic 
> Signed-off-by: Aleksandar Markovic 
> ---
>   Documentation/admin-guide/kernel-parameters.txt | 11 +++
>   arch/mips/kernel/elf.c  | 39 +
>   2 files changed, 50 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index b74e133..99464ee 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2614,6 +2614,17 @@
>   noexec32=off: disable non-executable mappings
>   read implies executable mappings
>
> + nonxstack   [MIPS]
> + Force setting up stack and heap as non-executable or
> + executable regardless of PT_GNU_STACK entry. Both
> + stack and heap are affected, despite the name. Valid
> + arguments: on, off.
> + nonxstack=on:   Force non-executable stack and heap
> + nonxstack=off:  Force executable stack and heap
> + If ommited, stack and heap will or will not be set
> + up as non-executable depending on PT_GNU_STACK
> + entry and possibly other factors.
> +
>   nofpu   [MIPS,SH] Disable hardware FPU at boot time.
>
>   nofxsr  [BUGS=X86-32] Disables x86 floating point extended
> diff --git a/arch/mips/kernel/elf.c b/arch/mips/kernel/elf.c
> index 731325a..28ef7f3 100644
> --- a/arch/mips/kernel/elf.c
> +++ b/arch/mips/kernel/elf.c
> @@ -326,8 +326,47 @@ void mips_set_personality_nan(struct arch_elf_state 
> *state)
>   }
>   }
>
> +static 

Re: [PATCH v3] doc: add maintainer book

2017-11-30 Thread Mauro Carvalho Chehab
Em Thu, 30 Nov 2017 12:55:07 +1100
"Tobin C. Harding"  escreveu:

> There is currently very little documentation in the kernel on maintainer
> level tasks. In particular there are no documents on creating pull
> requests to submit to Linus.
> 
> Quoting Greg Kroah-Hartman on LKML:
> 
> Anyway, this actually came up at the kernel summit / maintainer
> meeting a few weeks ago, in that "how do I make a
> good pull request to Linus" is something we need to document.
> 
> Here's what I do, and it seems to work well, so maybe we should turn
> it into the start of the documentation for how to do it.
> 
> (quote references: kernel summit, Europe 2017)
> 
> Create a new kernel documentation book 'how to be a maintainer'
> (suggested by Jonathan Corbet). Add chapters on 'configuring git' and
> 'creating a pull request'.
> 
> Most of the content was written by Linus Torvalds and Greg Kroah-Hartman
> in discussion on LKML. This is stated at the start of one of the
> chapters and the original email thread is referenced in
> 'pull-requests.rst'.
> 
> Signed-off-by: Tobin C. Harding 
> Reviewed-by: Greg Kroah-Hartman 
> 
> ---
> 
> v3:
>  - Modified details for branch and tag naming, suggested by Mauro
>Carvalho Chehab.
>  - Added example email subject line for submitting pull requests.
>  - Re-added Greg's reviewed-by tag from version 1.
> 
> v2:
>  - Change title of book, suggested by Dan Williams.
> 
> ---
>  Documentation/index.rst|   1 +
>  Documentation/maintainer/conf.py   |  10 ++
>  Documentation/maintainer/configure-git.rst |  34 ++
>  Documentation/maintainer/index.rst |  10 ++
>  Documentation/maintainer/pull-requests.rst | 182 +
>  5 files changed, 237 insertions(+)
>  create mode 100644 Documentation/maintainer/conf.py
>  create mode 100644 Documentation/maintainer/configure-git.rst
>  create mode 100644 Documentation/maintainer/index.rst
>  create mode 100644 Documentation/maintainer/pull-requests.rst
> 
> diff --git a/Documentation/index.rst b/Documentation/index.rst
> index cb7f1ba5b3b1..a4fb34dddcf3 100644
> --- a/Documentation/index.rst
> +++ b/Documentation/index.rst
> @@ -52,6 +52,7 @@ merged much easier.
> dev-tools/index
> doc-guide/index
> kernel-hacking/index
> +   maintainer/index
>  
>  Kernel API documentation
>  
> diff --git a/Documentation/maintainer/conf.py 
> b/Documentation/maintainer/conf.py
> new file mode 100644
> index ..81e9eb7a7884
> --- /dev/null
> +++ b/Documentation/maintainer/conf.py
> @@ -0,0 +1,10 @@
> +# -*- coding: utf-8; mode: python -*-
> +
> +project = 'Linux Kernel Development Documentation'
> +
> +tags.add("subproject")
> +
> +latex_documents = [
> +('index', 'maintainer.tex', 'Linux Kernel Development Documentation',
> + 'The kernel development community', 'manual'),
> +]
> diff --git a/Documentation/maintainer/configure-git.rst 
> b/Documentation/maintainer/configure-git.rst
> new file mode 100644
> index ..780d2c84
> --- /dev/null
> +++ b/Documentation/maintainer/configure-git.rst
> @@ -0,0 +1,34 @@
> +.. _configuregit:
> +
> +Configure Git
> +=============
> +
> +This chapter describes maintainer level git configuration.
> +
> +Tagged branches used in :ref:`Documentation/maintainer/pull-requests.rst
> +<pullrequests>` should be signed with the developers public GPG key. Signed
> +tags can be created by passing the ``-u`` flag to ``git tag``. However,
> +since you would *usually* use the same key for the same project, you can
> +set it once with
> +::
> +
> + git config user.signingkey "keyname"
> +
> +Alternatively, edit your ``.git/config`` or ``~/.gitconfig`` file by hand:
> +::
> +
> + [user]
> + name = Jane Developer
> + email = j...@domain.org
> + signingkey = j...@domain.org
> +
> +You may need to tell ``git`` to use ``gpg2``
> +::
> +
> + [gpg]
> + program = /path/to/gpg2
> +
> +You may also like to tell ``gpg`` which ``tty`` to use (add to your shell rc 
> file)
> +::
> +
> + export GPG_TTY=$(tty)
> diff --git a/Documentation/maintainer/index.rst 
> b/Documentation/maintainer/index.rst
> new file mode 100644
> index ..fa84ac9cae39
> --- /dev/null
> +++ b/Documentation/maintainer/index.rst
> @@ -0,0 +1,10 @@
> +==========================
> +Kernel Maintainer Handbook
> +==========================
> +
> +.. toctree::
> +   :maxdepth: 2
> +
> +   configure-git
> +   pull-requests
> +
> diff --git a/Documentation/maintainer/pull-requests.rst 
> b/Documentation/maintainer/pull-requests.rst
> new file mode 100644
> index ..a25e1002a5b9
> --- /dev/null
> +++ b/Documentation/maintainer/pull-requests.rst
> @@ -0,0 +1,182 @@
> +.. _pullrequests:
> +
> +Creating Pull Requests
> +======================
> +
> +This chapter describes how maintainers can create and submit pull