date:20160620

Re: [PATCH] devpts: remove DEVPTS_MULTIPLE_INSTANCES from all configs

2016-06-20 Thread Vineet Gupta

On Monday 20 June 2016 02:44 PM, Alexandru Moise wrote:
> As each mount of devpts is now an independent filesystem,
> the DEVPTS_MULTIPLE_INSTANCES config option no longer exists.
> So remove it.
>
> Signed-off-by: Alexandru Moise <00moses.alexande...@gmail.com>

For arch/arc

Acked-by: Vineet Gupta 
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [v6, 1/2] cxl: Add mechanism for delivering AFU driver specific events

2016-06-20 Thread Ian Munsie

Excerpts from Vaibhav Jain's message of 2016-06-20 14:20:16 +0530:
> > +int cxl_unset_driver_ops(struct cxl_context *ctx)
> > +{
> > +if (atomic_read(&ctx->afu_driver_events))
> > +return -EBUSY;
> > +
> > +ctx->afu_driver_ops = NULL;
> Need a write memory barrier so that afu_driver_ops isn't possibly called
> after this store.

What situation do you think this will help? I haven't looked closely at
the last few iterations of this patch set, but if you're in a situation
where you might be racing with some code doing e.g.

if (ctx->afu_driver_ops)
ctx->afu_driver_ops->something();

You have a race with or without a memory barrier. Ideally you would just
have the caller guarantee that it will only call cxl_unset_driver_ops if
no further calls to afu_driver_ops are possible, otherwise you may need
locking here, which would be far from ideal.
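
To make the window concrete, here is a minimal single-threaded model that
replays the bad interleaving step by step. Everything in it (ctx_model,
reader_check, writer_unset, reader_call) is a hypothetical stand-in rather
than the cxl code, and the model returns -1 at the point where the real
check-then-call sequence would dereference NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the cxl structures; not the real definitions. */
struct afu_driver_ops {
	int (*something)(void);
};

struct ctx_model {
	const struct afu_driver_ops *afu_driver_ops;
};

static int do_something(void)
{
	return 42;
}

static const struct afu_driver_ops model_ops = { .something = do_something };

/* Reader, step 1: the NULL check. */
bool reader_check(const struct ctx_model *ctx)
{
	assert(ctx != NULL);
	return ctx->afu_driver_ops != NULL;
}

/* Writer: what an "unset" does to the pointer. */
void writer_unset(struct ctx_model *ctx)
{
	assert(ctx != NULL);
	ctx->afu_driver_ops = NULL;
}

/*
 * Reader, step 2: the call. Real check-then-call code would dereference
 * the pointer unconditionally here; the model reports -1 instead.
 */
int reader_call(const struct ctx_model *ctx)
{
	assert(ctx != NULL);
	if (!ctx->afu_driver_ops)
		return -1;	/* would have been a NULL dereference */
	return ctx->afu_driver_ops->something();
}

/* Replay the problematic ordering: check, unset, call. */
int replay_bad_interleaving(void)
{
	struct ctx_model ctx = { .afu_driver_ops = &model_ops };

	if (!reader_check(&ctx))
		return 0;
	writer_unset(&ctx);		/* lands between check and call */
	return reader_call(&ctx);	/* the earlier check is now stale */
}
```

A barrier after the store only orders the store against other memory
accesses; it does not stop the store from landing between the reader's two
steps, which is why the model fails regardless.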


What exactly is the use case for this API? I'd vote to drop it if we can
do without it.

-Ian


Re: powerpc/powernv: Exclude MSI region in extended bridge window

2016-06-20 Thread Michael Ellerman

On Tue, 2016-21-06 at 02:41:05 UTC, Gavin Shan wrote:
> The windows of root port and bridge behind that are extended to
> the PHB's windows to accommodate the PCI hotplug happening in
> future. The PHB's 64KB 32-bits MSI region is included in bridge's
> M32 windows (in hardware) though it's excluded in the corresponding
> resource, as the bridge's M32 windows have 1MB as their minimal
> alignment. We observed EEH error during system boot when the MSI
> region is included in bridge's M32 window.
> 
> This excludes the top 1MB region (including the 64KB 32-bit MSI region)
> from the bridge's M32 windows when extending them.

AFAICS you added that code in "powerpc/powernv: Extend PCI bridge resources", so
I'll squash it into that. That way there is no window of breakage.

cheers

Re: [PATCH] powerpc/mm: Prevent unlikely crash in copro_calculate_slb()

2016-06-20 Thread Ian Munsie

Acked-by: Ian Munsie 


Re: unrecoverable exception on G5 with CONFIG_PPC_EARLY_DEBUG enabled

2016-06-20 Thread Denis Kirjanov

> How about this? Denis does this work?
>
> cheers
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S
> b/arch/powerpc/kernel/exceptions-64s.S
> index 4c9440629128..8bcc1b457115 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1399,11 +1399,12 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX)
>   lwz r9,PACA_EXSLB+EX_CCR(r13)   /* get saved CR */
>
>   mtlr    r10
> -BEGIN_MMU_FTR_SECTION
> - b   2f
> -END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX)
>   andi.   r10,r12,MSR_RI  /* check for unrecoverable exception */
> +BEGIN_MMU_FTR_SECTION
>   beq-    2f
> +FTR_SECTION_ELSE
> + b   2f
> +ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_RADIX)
>
>  .machine push
>  .machine "power4"
>

Yeah, it works. Thanks

>

[PATCH v20 20/20] Allow period= in perf stat CPU event descriptions.

2016-06-20 Thread Sukadev Bhattiprolu

This avoids the JSON PMU events parser having to know whether its aliases
are for perf stat or perf record.

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
---
 tools/perf/util/parse-events.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index c599077..e1b393d 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -920,6 +920,7 @@ config_term_avail(int term_type, struct parse_events_error 
*err)
case PARSE_EVENTS__TERM_TYPE_CONFIG1:
case PARSE_EVENTS__TERM_TYPE_CONFIG2:
case PARSE_EVENTS__TERM_TYPE_NAME:
+   case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
return true;
default:
if (!err)
-- 
2.5.3


[PATCH v20 19/20] perf, tools, pmu-events: Add Skylake frontend MSR support

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

Add support for the "frontend" extra MSR on Skylake in the JSON
conversion.

Signed-off-by: Andi Kleen 
---
 tools/perf/pmu-events/jevents.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index c8d8e4a..0e43dce 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -126,6 +126,7 @@ static struct msrmap {
{ "0x3F6", "ldlat=" },
{ "0x1A6", "offcore_rsp=" },
{ "0x1A7", "offcore_rsp=" },
+   { "0x3F7", "frontend=" },
{ NULL, NULL }
 };
 
-- 
2.5.3
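
As an illustration of what the table is for, the sketch below maps an
MSRIndex string to the perf term prefix it should produce. Only the four
entries visible in the hunk are included, the field name `num` and the
helper msr_term() are assumptions made for this example, and the real
jevents lookup works on JSON tokens rather than C strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct msrmap {
	const char *num;	/* MSRIndex value as written in the JSON file */
	const char *pname;	/* perf term prefix to emit */
};

/* Only the entries visible in the hunk above; the real table may have more. */
static const struct msrmap msrmap[] = {
	{ "0x3F6", "ldlat=" },
	{ "0x1A6", "offcore_rsp=" },
	{ "0x1A7", "offcore_rsp=" },
	{ "0x3F7", "frontend=" },
	{ NULL, NULL }
};

/* Hypothetical helper: return the term prefix for an MSRIndex, or NULL. */
const char *msr_term(const char *msr_index)
{
	assert(msr_index != NULL);
	for (size_t i = 0; msrmap[i].num; i++)
		if (!strcmp(msr_index, msrmap[i].num))
			return msrmap[i].pname;
	return NULL;
}
```

With this table, msr_term("0x3F7") yields "frontend=", to which the
MSRValue from the JSON entry would then be appended.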


[PATCH v20 18/20] perf, tools, pmu-events: Fix fixed counters on Intel

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

The JSON event lists use a different encoding for fixed counters
than perf for instructions and cycles (ref-cycles is ok).

This led to some common events like inst_retired.any
or cpu_clk_unhalted.thread not counting, when specified with their
JSON name.

Special case these events in the jevents conversion process.
I prefer to not touch the JSON files for this, as it's intended
that standard JSON files can be just dropped into the perf
build without changes.

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
[Fix minor compile error]
---
 tools/perf/pmu-events/jevents.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index b701d77..c8d8e4a 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -237,6 +237,29 @@ static void print_events_table_suffix(FILE *outfp)
fprintf(outfp, "};\n");
 }
 
+static struct fixed {
+   const char *name;
+   const char *event;
+} fixed[] = {
+   { "inst_retired.any", "event=0xc0" },
+   { "cpu_clk_unhalted.thread", "event=0x3c" },
+   { "cpu_clk_unhalted.thread_any", "event=0x3c,any=1" },
+   { NULL, NULL},
+};
+
+/*
+ * Handle different fixed counter encodings between JSON and perf.
+ */
+static char *real_event(const char *name, char *event)
+{
+   int i;
+
+   for (i = 0; fixed[i].name; i++)
+   if (!strcasecmp(name, fixed[i].name))
+   return (char *)fixed[i].event;
+   return event;
+}
+
 /* Call func with each event in the json file */
 int json_events(const char *fn,
  int (*func)(void *data, char *name, char *event, char *desc,
@@ -325,8 +348,8 @@ int json_events(const char *fn,
if (msr != NULL)
addfield(map, &event, ",", msr->pname, msrval);
fixname(name);
-
-   err = func(data, name, event, desc, long_desc, topic);
+   err = func(data, name, real_event(name, event), desc,
+  long_desc, topic);
free(event);
free(desc);
free(name);
-- 
2.5.3
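
The override can be tried in isolation with the sketch below. It copies the
table and lookup from the patch, with const-correct types swapped in so it
compiles standalone; the placeholder string passed as the JSON encoding in
the usage note is illustrative only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strcasecmp */

struct fixed {
	const char *name;
	const char *event;
};

static const struct fixed fixed[] = {
	{ "inst_retired.any", "event=0xc0" },
	{ "cpu_clk_unhalted.thread", "event=0x3c" },
	{ "cpu_clk_unhalted.thread_any", "event=0x3c,any=1" },
	{ NULL, NULL },
};

/*
 * Return the perf encoding for a fixed-counter event, or the encoding
 * derived from the JSON file when the name is not in the table.
 */
const char *real_event(const char *name, const char *event)
{
	assert(name != NULL);
	for (size_t i = 0; fixed[i].name; i++)
		if (!strcasecmp(name, fixed[i].name))
			return fixed[i].event;
	return event;
}
```

For example, real_event("INST_RETIRED.ANY", "json-encoding") returns
"event=0xc0", while an unlisted name returns the second argument unchanged.
Because the override is a pointer into a static table, the caller keeps
freeing its own `event` buffer, as the hunk in json_events() does.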


[PATCH v20 10/20] perf, tools, jevents: Add support for long descriptions

2016-06-20 Thread Sukadev Bhattiprolu

Implement support in jevents to parse long descriptions for events
that may have them in the JSON files. A follow-on patch will make these
long descriptions available to users through the 'perf list' command.

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

Changelog[v14]
- [Jiri Olsa] Break up independent parts of the patch into
  separate patches.
---
 tools/perf/pmu-events/jevents.c| 32 
 tools/perf/pmu-events/jevents.h|  3 ++-
 tools/perf/pmu-events/pmu-events.h |  1 +
 3 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index aeafa82..51b864a 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -203,7 +203,7 @@ static void print_events_table_prefix(FILE *fp, const char 
*tblname)
 }
 
 static int print_events_table_entry(void *data, char *name, char *event,
-   char *desc)
+   char *desc, char *long_desc)
 {
FILE *outfp = data;
/*
@@ -215,6 +215,8 @@ static int print_events_table_entry(void *data, char *name, 
char *event,
fprintf(outfp, "\t.name = \"%s\",\n", name);
fprintf(outfp, "\t.event = \"%s\",\n", event);
fprintf(outfp, "\t.desc = \"%s\",\n", desc);
+   if (long_desc && long_desc[0])
+   fprintf(outfp, "\t.long_desc = \"%s\",\n", long_desc);
 
fprintf(outfp, "},\n");
 
@@ -235,7 +237,8 @@ static void print_events_table_suffix(FILE *outfp)
 
 /* Call func with each event in the json file */
 int json_events(const char *fn,
- int (*func)(void *data, char *name, char *event, char *desc),
+ int (*func)(void *data, char *name, char *event, char *desc,
+ char *long_desc),
  void *data)
 {
int err = -EIO;
@@ -254,6 +257,8 @@ int json_events(const char *fn,
tok = tokens + 1;
for (i = 0; i < tokens->size; i++) {
char *event = NULL, *desc = NULL, *name = NULL;
+   char *long_desc = NULL;
+   char *extra_desc = NULL;
struct msrmap *msr = NULL;
jsmntok_t *msrval = NULL;
jsmntok_t *precise = NULL;
@@ -279,6 +284,10 @@ int json_events(const char *fn,
} else if (json_streq(map, field, "BriefDescription")) {
addfield(map, &desc, "", "", val);
fixdesc(desc);
+   } else if (json_streq(map, field,
+"PublicDescription")) {
+   addfield(map, &long_desc, "", "", val);
+   fixdesc(long_desc);
} else if (json_streq(map, field, "PEBS") && nz) {
precise = val;
} else if (json_streq(map, field, "MSRIndex") && nz) {
@@ -287,10 +296,10 @@ int json_events(const char *fn,
msrval = val;
} else if (json_streq(map, field, "Errata") &&
   !json_streq(map, val, "null")) {
-   addfield(map, &desc, ". ",
+   addfield(map, &extra_desc, ". ",
" Spec update: ", val);
} else if (json_streq(map, field, "Data_LA") && nz) {
-   addfield(map, &desc, ". ",
+   addfield(map, &extra_desc, ". ",
" Supports address when precise",
NULL);
}
@@ -298,19 +307,26 @@ int json_events(const char *fn,
}
if (precise && !strstr(desc, "(Precise Event)")) {
if (json_streq(map, precise, "2"))
-   addfield(map, &desc, " ", "(Must be precise)",
-   NULL);
+   addfield(map, &extra_desc, " ",
+   "(Must be precise)", NULL);
else
-   addfield(map, &desc, " ",
+   addfield(map, &extra_desc, " ",
"(Precise event)", NULL);
}
+   if (desc && extra_desc)
+   addfield(map, &desc, " ", extra_desc, NULL);
+   if (long_desc && extra_desc)
+   addfield(map, &long_desc, " ", extra_desc, NULL);
if (msr != NULL)
addfield(map, &event, ",", msr->pname, msrval);
fixname(name);
-   err = func(data, name, event, desc);
+
+   err = func(data, name, event, desc, long_desc);

[PATCH v20 16/20] perf, tools: Add README for info on parsing JSON/map files

2016-06-20 Thread Sukadev Bhattiprolu

Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---
 tools/perf/pmu-events/README | 122 +++
 1 file changed, 122 insertions(+)
 create mode 100644 tools/perf/pmu-events/README

diff --git a/tools/perf/pmu-events/README b/tools/perf/pmu-events/README
new file mode 100644
index 000..da57cb5
--- /dev/null
+++ b/tools/perf/pmu-events/README
@@ -0,0 +1,122 @@
+
+The contents of this directory allow users to specify PMU events in
+their CPUs by their symbolic names rather than raw event codes (see
+example below).
+
+The main program in this directory is 'jevents', which is built and
+executed _before_ the perf binary itself is built.
+
+The 'jevents' program tries to locate and process JSON files in the directory
+tree tools/perf/pmu-events/arch/foo.
+
+   - Regular files with '.json' extension in the name are assumed to be
+ JSON files, each of which describes a set of PMU events.
+
+   - Regular files with basename starting with 'mapfile.csv' are assumed
+ to be a CSV file that maps a specific CPU to its set of PMU events.
+ (see below for mapfile format)
+
+   - Directories are traversed, but all other files are ignored.
+
+Using the JSON files and the mapfile, 'jevents' generates the C source file,
+'pmu-events.c', which encodes the two sets of tables:
+
+   - Set of 'PMU events tables' for all known CPUs in the architecture,
+ (one table like the following, per JSON file; table name 'pme_power8'
+ is derived from JSON file name, 'power8.json').
+
+   struct pmu_event pme_power8[] = {
+
+   ...
+
+   {
+   .name = "pm_1plus_ppc_cmpl",
+   .event = "event=0x100f2",
+   .desc = "1 or more ppc insts finished,",
+   },
+
+   ...
+   }
+
+   - A 'mapping table' that maps each CPU of the architecture, to its
+ 'PMU events table'
+
+   struct pmu_events_map pmu_events_map[] = {
+   {
+   .cpuid = "004b",
+   .version = "1",
+   .type = "core",
+   .table = pme_power8
+   },
+   ...
+
+   };
+
+After the 'pmu-events.c' is generated, it is compiled and the resulting
+'pmu-events.o' is added to 'libperf.a' which is then used to build perf.
+
+NOTES:
+   1. Several CPUs can support same set of events and hence use a common
+  JSON file. Hence several entries in the pmu_events_map[] could map
+  to a single 'PMU events table'.
+
+   2. The 'pmu-events.h' has an extern declaration for the mapping table
+  and the generated 'pmu-events.c' defines this table.
+
+   3. _All_ known CPU tables for architecture are included in the perf
+  binary.
+
+At run time, perf determines the actual CPU it is running on, finds the
+matching events table and builds aliases for those events. This allows
+users to specify events by their name:
+
+   $ perf stat -e pm_1plus_ppc_cmpl sleep 1
+
+where 'pm_1plus_ppc_cmpl' is a Power8 PMU event.
+
+In case of errors when processing files in the tools/perf/pmu-events/arch
+directory, 'jevents' tries to create an empty mapping file to allow the perf
+build to succeed even if the PMU event aliases cannot be used.
+
+However some errors in processing may cause the perf build to fail.
+
+Mapfile format
+===
+
+The mapfile.csv format is expected to be:
+
+   Header line
+   CPUID,Version,File/path/name.json,Type
+
+where:
+
+   Comma:
+   is the required field delimiter (i.e other fields cannot
+   have commas within them).
+
+   Comments:
+   Lines in which the first character is either '\n' or '#'
+   are ignored.
+
+   Header line
+   The header line is the first line in the file, which is
+   _IGNORED_. It can be a comment (begin with '#') or empty.
+
+   CPUID:
+   CPUID is an arch-specific char string, that can be used
+   to identify CPU (and associate it with a set of PMU events
+   it supports). Multiple CPUIDS can point to the same
+   File/path/name.json.
+
+   Example:
+   CPUID == 'GenuineIntel-6-2E' (on x86).
+   CPUID == '004b0100' (PVR value in Powerpc)
+   Version:
+   is the Version of the mapfile.
+
+   File/path/name.json:
+   is the pathname for the JSON file, relative to the directory
+   containing the mapfile.csv
+
+   Type:
+   indicates whether the events are "core" or "uncore" events.
-- 
2.5.3
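
The mapfile format described in the README can be exercised with a small
parser such as the one below. It is only a sketch under the README's stated
rules (comma-delimited fields, lines starting with '#' or a newline
ignored); struct map_entry, next_field() and parse_map_line() are
hypothetical names and this is not the jevents implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct map_entry {
	char *cpuid;
	char *version;
	char *fname;
	char *type;
};

/* Cut the next field out of *cursor in place; NULL when nothing is left. */
static char *next_field(char **cursor, const char *delims)
{
	char *start = *cursor;
	size_t len;

	if (!start)
		return NULL;
	len = strcspn(start, delims);
	if (start[len]) {
		start[len] = '\0';
		*cursor = start + len + 1;
	} else {
		*cursor = NULL;
	}
	return start;
}

/*
 * Parse "CPUID,Version,File/path/name.json,Type" in place.
 * Returns 1 on success, 0 for a comment or empty line, -1 if malformed.
 */
int parse_map_line(char *line, struct map_entry *e)
{
	char *cur = line;

	assert(line != NULL && e != NULL);
	if (line[0] == '\n' || line[0] == '#' || line[0] == '\0')
		return 0;

	e->cpuid = next_field(&cur, ",");
	e->version = next_field(&cur, ",");
	e->fname = next_field(&cur, ",");
	e->type = next_field(&cur, "\n");
	if (!e->cpuid || !e->version || !e->fname || !e->type)
		return -1;
	return 1;
}
```

Parsing happens in place, so the caller must pass a writable buffer; the
field pointers remain valid only as long as that buffer does.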


[PATCH v20 17/20] perf, tools: Make alias matching case-insensitive

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

Make alias matching in the events parser case-insensitive. This is useful
with the JSON events. perf uses lower case events, but the CPU manuals
generally use upper case event names. The JSON files use lower
case by default too. But if we search case insensitively then
users can cut-n-paste the upper case event names.

So the following works:

% perf stat -e BR_INST_EXEC.TAKEN_INDIRECT_NEAR_CALL true

 Performance counter stats for 'true':

   305  BR_INST_EXEC.TAKEN_INDIRECT_NEAR_CALL

   0.000492799 seconds time elapsed

Signed-off-by: Andi Kleen 
---
 tools/perf/util/parse-events.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 77bb7ac..c599077 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1436,7 +1436,7 @@ comp_pmu(const void *p1, const void *p2)
struct perf_pmu_event_symbol *pmu1 = (struct perf_pmu_event_symbol *) 
p1;
struct perf_pmu_event_symbol *pmu2 = (struct perf_pmu_event_symbol *) 
p2;
 
-   return strcmp(pmu1->symbol, pmu2->symbol);
+   return strcasecmp(pmu1->symbol, pmu2->symbol);
 }
 
 static void perf_pmu__parse_cleanup(void)
-- 
2.5.3
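
The effect of the one-line change can be seen with a sorted table and a
binary search that share the same comparator. The sketch below assumes that
usage pattern; struct sym, sort_syms() and find_sym() are hypothetical names
for illustration, not perf's own.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <strings.h>	/* strcasecmp */

struct sym {
	const char *symbol;
};

/* Same shape as comp_pmu() after the patch: compare ignoring case. */
static int comp_sym(const void *p1, const void *p2)
{
	const struct sym *a = p1;
	const struct sym *b = p2;

	return strcasecmp(a->symbol, b->symbol);
}

/* Sort the table so that find_sym() can binary-search it. */
void sort_syms(struct sym *syms, size_t n)
{
	assert(syms != NULL || n == 0);
	qsort(syms, n, sizeof(*syms), comp_sym);
}

/* Look up a name regardless of case; NULL if it is not in the table. */
const struct sym *find_sym(const struct sym *syms, size_t n, const char *name)
{
	struct sym key = { .symbol = name };

	assert(name != NULL);
	return bsearch(&key, syms, n, sizeof(*syms), comp_sym);
}
```

The sort and the search must use the same comparator; sorting with strcmp
and searching with strcasecmp could miss entries in a mixed-case table.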


[PATCH v20 14/20] perf, tools: Add support for event list topics

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

Add support to group the output of perf list by the Topic field
in the JSON file.

Example output:

% perf list
...
Cache:
  l1d.replacement
   [L1D data line replacements]
  l1d_pend_miss.pending
   [L1D miss oustandings duration in cycles]
  l1d_pend_miss.pending_cycles
   [Cycles with L1D load Misses outstanding]
  l2_l1d_wb_rqsts.all
   [Not rejected writebacks from L1D to L2 cache lines in any state]
  l2_l1d_wb_rqsts.hit_e
   [Not rejected writebacks from L1D to L2 cache lines in E state]
  l2_l1d_wb_rqsts.hit_m
   [Not rejected writebacks from L1D to L2 cache lines in M state]

...
Pipeline:
  arith.fpu_div
   [Divide operations executed]
  arith.fpu_div_active
   [Cycles when divider is busy executing divide operations]
  baclears.any
   [Counts the total number when the front end is resteered, mainly
   when the BPU cannot provide a correct prediction and this is
   corrected by other branch handling mechanisms at the front end]
  br_inst_exec.all_branches
   [Speculative and retired branches]
  br_inst_exec.all_conditional
   [Speculative and retired macro-conditional branches]
  br_inst_exec.all_direct_jmp
   [Speculative and retired macro-unconditional branches excluding
   calls and indirects]
  br_inst_exec.all_direct_near_call
   [Speculative and retired direct near calls]
  br_inst_exec.all_indirect_jump_non_call_ret

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

Changelog[v14]
- [Jiri Olsa] Move jevents support for Topic to a separate patch.
---
 tools/perf/util/pmu.c | 37 +++--
 tools/perf/util/pmu.h |  1 +
 2 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index c606f6a..2d526ad 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -223,7 +223,8 @@ static int perf_pmu__parse_snapshot(struct perf_pmu_alias 
*alias,
 }
 
 static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
-char *desc, char *val, char *long_desc)
+char *desc, char *val, char *long_desc,
+char *topic)
 {
struct perf_pmu_alias *alias;
int ret;
@@ -259,6 +260,7 @@ static int __perf_pmu__new_alias(struct list_head *list, 
char *dir, char *name,
alias->desc = desc ? strdup(desc) : NULL;
alias->long_desc = long_desc ? strdup(long_desc) :
desc ? strdup(desc) : NULL;
+   alias->topic = topic ? strdup(topic) : NULL;
 
list_add_tail(&alias->list, list);
 
@@ -276,7 +278,7 @@ static int perf_pmu__new_alias(struct list_head *list, char 
*dir, char *name, FI
 
buf[ret] = 0;
 
-   return __perf_pmu__new_alias(list, dir, name, NULL, buf, NULL);
+   return __perf_pmu__new_alias(list, dir, name, NULL, buf, NULL, NULL);
 }
 
 static inline bool pmu_alias_info_file(char *name)
@@ -526,7 +528,7 @@ static int pmu_add_cpu_aliases(struct list_head *head)
/* need type casts to override 'const' */
__perf_pmu__new_alias(head, NULL, (char *)pe->name,
(char *)pe->desc, (char *)pe->event,
-   (char *)pe->long_desc);
+   (char *)pe->long_desc, (char *)pe->topic);
}
 
 out:
@@ -1047,19 +1049,26 @@ static char *format_alias_or(char *buf, int len, struct 
perf_pmu *pmu,
return buf;
 }
 
-struct pair {
+struct sevent {
char *name;
char *desc;
+   char *topic;
 };
 
-static int cmp_pair(const void *a, const void *b)
+static int cmp_sevent(const void *a, const void *b)
 {
-   const struct pair *as = a;
-   const struct pair *bs = b;
+   const struct sevent *as = a;
+   const struct sevent *bs = b;
 
/* Put extra events last */
if (!!as->desc != !!bs->desc)
return !!as->desc - !!bs->desc;
+   if (as->topic && bs->topic) {
+   int n = strcmp(as->topic, bs->topic);
+
+   if (n)
+   return n;
+   }
return strcmp(as->name, bs->name);
 }
 
@@ -1093,9 +1102,10 @@ void print_pmu_events(const char *event_glob, bool 
name_only, bool quiet_flag,
char buf[1024];
int printed = 0;
int len, j;
-   struct pair *aliases;
+   struct sevent *aliases;
int numdesc = 0;
int columns = pager_get_columns();
+   char *topic = NULL;
 
pmu = NULL;
len = 0;
@@ -1105,7 +1115,7 @@ void print_pmu_events(const char *event_glob, bool 
name_only, bool quiet_flag,
if (pmu->selectable)
len++;
}
-   aliases = zalloc(sizeof(struct pair) * len);
+   aliases = zalloc(sizeof(struct sevent)
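
The ordering that cmp_sevent() produces can be checked standalone. The
sketch below reuses the comparator from the hunk above with const-qualified
fields; sort_events() is a hypothetical wrapper added for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct sevent {
	const char *name;
	const char *desc;
	const char *topic;
};

static int cmp_sevent(const void *a, const void *b)
{
	const struct sevent *as = a;
	const struct sevent *bs = b;

	/* Put extra events (those with a description) last */
	if (!!as->desc != !!bs->desc)
		return !!as->desc - !!bs->desc;
	/* Group by topic when both events have one */
	if (as->topic && bs->topic) {
		int n = strcmp(as->topic, bs->topic);

		if (n)
			return n;
	}
	return strcmp(as->name, bs->name);
}

/* Hypothetical wrapper: sort an array of events for printing. */
void sort_events(struct sevent *ev, size_t n)
{
	assert(ev != NULL || n == 0);
	qsort(ev, n, sizeof(*ev), cmp_sevent);
}
```

Events without a description come first, then described events grouped by
topic and sorted by name within each topic. If only some described events
carry a topic, the name fallback decides their relative order, so grouping
is only clean when topics are set consistently.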

[PATCH v20 15/20] perf, tools: Handle header line in mapfile

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

To work with existing mapfiles, assume that the first line in
'mapfile.csv' is a header line and skip over it.

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

Changelog[v2]
Not all architectures use "Family" to identify the CPU. So,
assume the first line is a header.
---
 tools/perf/pmu-events/jevents.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index a4f117c..b701d77 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -463,7 +463,12 @@ static int process_mapfile(FILE *outfp, char *fpath)
 
print_mapping_table_prefix(outfp);
 
-   line_num = 0;
+   /* Skip first line (header) */
+   p = fgets(line, n, mapfp);
+   if (!p)
+   goto out;
+
+   line_num = 1;
while (1) {
char *cpuid, *version, *type, *fname;
 
@@ -507,8 +512,8 @@ static int process_mapfile(FILE *outfp, char *fpath)
fprintf(outfp, "},\n");
}
 
+out:
print_mapping_table_suffix(outfp);
-
return 0;
 }
 
-- 
2.5.3
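
The header handling can be modelled outside jevents as follows. This is a
sketch only: count_map_entries() and count_map_entries_str() are
hypothetical helpers, the 4096-byte line buffer is an arbitrary choice, and
the comment and empty-line rule is taken from the README patch in this
series rather than from this hunk.

```c
#define _POSIX_C_SOURCE 200809L	/* for fmemopen() */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count data lines in a mapfile, skipping the header and comments. */
int count_map_entries(FILE *mapfp)
{
	char line[4096];
	int entries = 0;

	assert(mapfp != NULL);

	/* Skip first line (header); an empty file has no entries */
	if (!fgets(line, sizeof(line), mapfp))
		return 0;

	while (fgets(line, sizeof(line), mapfp)) {
		if (line[0] == '#' || line[0] == '\n')
			continue;
		entries++;
	}
	return entries;
}

/* Convenience wrapper for in-memory input; returns -1 if it cannot open. */
int count_map_entries_str(const char *text)
{
	FILE *fp;
	int n;

	assert(text != NULL);
	fp = fmemopen((void *)text, strlen(text), "r");
	if (!fp)
		return -1;
	n = count_map_entries(fp);
	fclose(fp);
	return n;
}
```

Whatever the first line contains, it is discarded, so a mapfile whose first
line is a real entry would silently lose that entry.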


[PATCH v20 13/20] perf, tools, jevents: Add support for event topics

2016-06-20 Thread Sukadev Bhattiprolu

Allow assigning categories (the "Topic" field) to the PMU events, i.e.
process the topic field from the JSON file and add a corresponding
topic field to the generated C events tables.

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

Changelog[v14]
[Jiri Olsa] Move this independent code off into a separate patch.
---
 tools/perf/pmu-events/jevents.c| 12 +---
 tools/perf/pmu-events/jevents.h|  2 +-
 tools/perf/pmu-events/pmu-events.h |  1 +
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index 51b864a..a4f117c 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -203,7 +203,7 @@ static void print_events_table_prefix(FILE *fp, const char 
*tblname)
 }
 
 static int print_events_table_entry(void *data, char *name, char *event,
-   char *desc, char *long_desc)
+   char *desc, char *long_desc, char *topic)
 {
FILE *outfp = data;
/*
@@ -217,6 +217,8 @@ static int print_events_table_entry(void *data, char *name, 
char *event,
fprintf(outfp, "\t.desc = \"%s\",\n", desc);
if (long_desc && long_desc[0])
fprintf(outfp, "\t.long_desc = \"%s\",\n", long_desc);
+   if (topic)
+   fprintf(outfp, "\t.topic = \"%s\",\n", topic);
 
fprintf(outfp, "},\n");
 
@@ -238,7 +240,7 @@ static void print_events_table_suffix(FILE *outfp)
 /* Call func with each event in the json file */
 int json_events(const char *fn,
  int (*func)(void *data, char *name, char *event, char *desc,
- char *long_desc),
+ char *long_desc, char *topic),
  void *data)
 {
int err = -EIO;
@@ -259,6 +261,7 @@ int json_events(const char *fn,
char *event = NULL, *desc = NULL, *name = NULL;
char *long_desc = NULL;
char *extra_desc = NULL;
+   char *topic = NULL;
struct msrmap *msr = NULL;
jsmntok_t *msrval = NULL;
jsmntok_t *precise = NULL;
@@ -298,6 +301,8 @@ int json_events(const char *fn,
   !json_streq(map, val, "null")) {
addfield(map, &extra_desc, ". ",
" Spec update: ", val);
+   } else if (json_streq(map, field, "Topic")) {
+   addfield(map, &topic, "", "", val);
} else if (json_streq(map, field, "Data_LA") && nz) {
addfield(map, &extra_desc, ". ",
" Supports address when precise",
@@ -321,12 +326,13 @@ int json_events(const char *fn,
addfield(map, &event, ",", msr->pname, msrval);
fixname(name);
 
-   err = func(data, name, event, desc, long_desc);
+   err = func(data, name, event, desc, long_desc, topic);
free(event);
free(desc);
free(name);
free(long_desc);
free(extra_desc);
+   free(topic);
if (err)
break;
tok += j;
diff --git a/tools/perf/pmu-events/jevents.h b/tools/perf/pmu-events/jevents.h
index b0eb274..9ffcb89 100644
--- a/tools/perf/pmu-events/jevents.h
+++ b/tools/perf/pmu-events/jevents.h
@@ -3,7 +3,7 @@
 
 int json_events(const char *fn,
int (*func)(void *data, char *name, char *event, char *desc,
-   char *long_desc),
+   char *long_desc, char *topic),
void *data);
 char *get_cpu_str(void);
 
diff --git a/tools/perf/pmu-events/pmu-events.h 
b/tools/perf/pmu-events/pmu-events.h
index 711f049..6b69f4b 100644
--- a/tools/perf/pmu-events/pmu-events.h
+++ b/tools/perf/pmu-events/pmu-events.h
@@ -9,6 +9,7 @@ struct pmu_event {
const char *event;
const char *desc;
const char *long_desc;
+   const char *topic;
 };
 
 /*
-- 
2.5.3


[PATCH v20 12/20] perf, tools: Support long descriptions with perf list

2016-06-20 Thread Sukadev Bhattiprolu

Previously we were dropping the useful longer descriptions that some
events have in the event list completely. This patch makes them appear with
perf list.

Old perf list:

baclears:
  baclears.all
   [Counts the number of baclears]

vs new:

perf list -v:
...
baclears:
  baclears.all
   [The BACLEARS event counts the number of times the front end is
resteered, mainly when the Branch Prediction Unit cannot provide
a correct prediction and this is corrected by the Branch Address
Calculator at the front end. The BACLEARS.ANY event counts the
number of baclears for any type of branch]

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

Changelog[v15]
- [Jiri Olsa, Andi Kleen] Fix usage strings; update man page.

Changelog[v14]
- [Jiri Olsa] Break up independent parts of the patch into
  separate patches.

Changelog[v18]:
- Fix minor conflict in tools/perf/builtin-list.c; add long_desc_flag
  parameter to new print_pmu_events() call site.
---
 tools/perf/Documentation/perf-list.txt |  6 +-
 tools/perf/builtin-list.c  | 16 +++-
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-list.txt 
b/tools/perf/Documentation/perf-list.txt
index 72209bc..41857cc 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -8,7 +8,7 @@ perf-list - List all symbolic event types
 SYNOPSIS
 
 [verse]
-'perf list' [--no-desc] [hw|sw|cache|tracepoint|pmu|event_glob]
+'perf list' [--no-desc] [--long-desc] [hw|sw|cache|tracepoint|pmu|event_glob]
 
 DESCRIPTION
 ---
@@ -20,6 +20,10 @@ OPTIONS
 --no-desc::
 Don't print descriptions.
 
+-v::
+--long-desc::
+Print longer event descriptions.
+
 
 [[EVENT_MODIFIERS]]
 EVENT MODIFIERS
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 6be195b..1fdf770 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -22,14 +22,17 @@ int cmd_list(int argc, const char **argv, const char 
*prefix __maybe_unused)
 {
int i;
bool raw_dump = false;
+   bool long_desc_flag = false;
struct option list_options[] = {
OPT_BOOLEAN(0, "raw-dump", &raw_dump, "Dump raw events"),
OPT_BOOLEAN('d', "desc", &desc_flag,
"Print extra event descriptions. --no-desc to not 
print."),
+   OPT_BOOLEAN('v', "long-desc", &long_desc_flag,
+   "Print longer event descriptions."),
OPT_END()
};
const char * const list_usage[] = {
-   "perf list [--no-desc] [hw|sw|cache|tracepoint|pmu|event_glob]",
+   "perf list [<options>] [hw|sw|cache|tracepoint|pmu|event_glob]",
NULL
};
 
@@ -44,7 +47,7 @@ int cmd_list(int argc, const char **argv, const char *prefix 
__maybe_unused)
printf("\nList of pre-defined events (to be used in -e):\n\n");
 
if (argc == 0) {
-   print_events(NULL, raw_dump, !desc_flag);
+   print_events(NULL, raw_dump, !desc_flag, long_desc_flag);
return 0;
}
 
@@ -65,12 +68,14 @@ int cmd_list(int argc, const char **argv, const char 
*prefix __maybe_unused)
 strcmp(argv[i], "hwcache") == 0)
print_hwcache_events(NULL, raw_dump);
else if (strcmp(argv[i], "pmu") == 0)
-   print_pmu_events(NULL, raw_dump, !desc_flag);
+   print_pmu_events(NULL, raw_dump, !desc_flag,
+   long_desc_flag);
else if ((sep = strchr(argv[i], ':')) != NULL) {
int sep_idx;
 
if (sep == NULL) {
-   print_events(argv[i], raw_dump, !desc_flag);
+   print_events(argv[i], raw_dump, !desc_flag,
+   long_desc_flag);
continue;
}
sep_idx = sep - argv[i];
@@ -91,7 +96,8 @@ int cmd_list(int argc, const char **argv, const char *prefix 
__maybe_unused)
print_symbol_events(s, PERF_TYPE_SOFTWARE,
event_symbols_sw, 
PERF_COUNT_SW_MAX, raw_dump);
print_hwcache_events(s, raw_dump);
-   print_pmu_events(s, raw_dump, !desc_flag);
+   print_pmu_events(s, raw_dump, !desc_flag,
+   long_desc_flag);
print_tracepoint_events(NULL, s, raw_dump);
free(s);
}
-- 
2.5.3


[PATCH v20 11/20] perf, tools: Add alias support for long descriptions

2016-06-20 Thread Sukadev Bhattiprolu

Previously we were completely dropping the useful longer descriptions
that some events have in the event list. Now that jevents provides
support for longer descriptions (see previous patch), add support for
parsing the long descriptions.

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

Changelog[v14]
- [Jiri Olsa] Break up independent parts of the patch into
  separate patches.
---
 tools/perf/util/parse-events.c |  5 +++--
 tools/perf/util/parse-events.h |  3 ++-
 tools/perf/util/pmu.c  | 15 ++-
 tools/perf/util/pmu.h  |  4 +++-
 4 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 72803e3..77bb7ac 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2133,7 +2133,8 @@ out_enomem:
 /*
  * Print the help text for the event symbols:
  */
-void print_events(const char *event_glob, bool name_only, bool quiet_flag)
+void print_events(const char *event_glob, bool name_only, bool quiet_flag,
+   bool long_desc)
 {
print_symbol_events(event_glob, PERF_TYPE_HARDWARE,
event_symbols_hw, PERF_COUNT_HW_MAX, name_only);
@@ -2143,7 +2144,7 @@ void print_events(const char *event_glob, bool name_only, 
bool quiet_flag)
 
print_hwcache_events(event_glob, name_only);
 
-   print_pmu_events(event_glob, name_only, quiet_flag);
+   print_pmu_events(event_glob, name_only, quiet_flag, long_desc);
 
if (event_glob != NULL)
return;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index dee5022..68794e6 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -169,7 +169,8 @@ void parse_events_update_lists(struct list_head *list_event,
 void parse_events_evlist_error(struct parse_events_evlist *data,
   int idx, const char *str);
 
-void print_events(const char *event_glob, bool name_only, bool quiet);
+void print_events(const char *event_glob, bool name_only, bool quiet,
+ bool long_desc);
 
 struct event_symbol {
const char  *symbol;
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index b94712a..c606f6a 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -223,7 +223,7 @@ static int perf_pmu__parse_snapshot(struct perf_pmu_alias 
*alias,
 }
 
 static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
-char *desc, char *val)
+char *desc, char *val, char *long_desc)
 {
struct perf_pmu_alias *alias;
int ret;
@@ -257,6 +257,8 @@ static int __perf_pmu__new_alias(struct list_head *list, 
char *dir, char *name,
}
 
alias->desc = desc ? strdup(desc) : NULL;
+   alias->long_desc = long_desc ? strdup(long_desc) :
+   desc ? strdup(desc) : NULL;
 
list_add_tail(&alias->list, list);
 
@@ -274,7 +276,7 @@ static int perf_pmu__new_alias(struct list_head *list, char 
*dir, char *name, FI
 
buf[ret] = 0;
 
-   return __perf_pmu__new_alias(list, dir, name, NULL, buf);
+   return __perf_pmu__new_alias(list, dir, name, NULL, buf, NULL);
 }
 
 static inline bool pmu_alias_info_file(char *name)
@@ -523,7 +525,8 @@ static int pmu_add_cpu_aliases(struct list_head *head)
 
/* need type casts to override 'const' */
__perf_pmu__new_alias(head, NULL, (char *)pe->name,
-   (char *)pe->desc, (char *)pe->event);
+   (char *)pe->desc, (char *)pe->event,
+   (char *)pe->long_desc);
}
 
 out:
@@ -1082,7 +1085,8 @@ static void wordwrap(char *s, int start, int max, int 
corr)
}
 }
 
-void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag)
+void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
+   bool long_desc)
 {
struct perf_pmu *pmu;
struct perf_pmu_alias *alias;
@@ -1130,7 +1134,8 @@ void print_pmu_events(const char *event_glob, bool 
name_only, bool quiet_flag)
if (!aliases[j].name)
goto out_enomem;
 
-   aliases[j].desc = alias->desc;
+   aliases[j].desc = long_desc ? alias->long_desc :
+   alias->desc;
j++;
}
if (pmu->selectable &&
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 42999c7..1aa614e 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -39,6 +39,7 @@ struct perf_pmu_info {
 struct perf_pmu_alias {
char *name;
char *desc;
+   char *long_desc;

[PATCH v20 09/20] perf, tools: Add override support for event list CPUID

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

Add a PERF_CPUID variable to override the CPUID of the current CPU (within
the current architecture). This is useful for testing, so that all event
lists can be tested on a single system.
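
A minimal sketch of the override logic, assuming the detected id is passed in as a plain string. cpuid_with_override() is a hypothetical helper for illustration; only the PERF_CPUID variable name comes from this patch.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() */
#include <stdlib.h>
#include <string.h>

/*
 * Return a heap-allocated CPU id string: the value of the PERF_CPUID
 * environment variable when it is set, otherwise a copy of "detected"
 * (which stands in for whatever the architecture code reports).
 * Returns NULL when neither is available. The caller frees the result.
 */
static char *cpuid_with_override(const char *detected)
{
	const char *env = getenv("PERF_CPUID");

	if (env)
		return strdup(env);
	return detected ? strdup(detected) : NULL;
}
```

Copying the environment value gives the caller a single ownership rule: whatever comes back is freed with free(), regardless of where it came from.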

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

v2: Fix double free in earlier version.
Print actual CPUID being used with verbose option.
---
 tools/perf/util/pmu.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index fe1dec3..b94712a 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -492,10 +492,16 @@ static int pmu_add_cpu_aliases(struct list_head *head)
struct pmu_event *pe;
char *cpuid;
 
-   cpuid = get_cpuid_str();
+   cpuid = getenv("PERF_CPUID");
+   if (cpuid)
+   cpuid = strdup(cpuid);
+   if (!cpuid)
+   cpuid = get_cpuid_str();
if (!cpuid)
return 0;
 
+   pr_debug("Using CPUID %s\n", cpuid);
+
i = 0;
while (1) {
map = &pmu_events_map[i++];
-- 
2.5.3


[PATCH v20 08/20] perf, tools: Add a --no-desc flag to perf list

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

Add a --no-desc flag to perf list to not print the event descriptions
that were earlier added for JSON events. This may be useful to
get a less crowded listing.

Printing descriptions remains the default, as that is the more useful
behaviour for most users.

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

v2: Rename --quiet to --no-desc. Add option to man page.

v18: Fix minor conflict in tools/perf/builtin-list.c; Add !desc_flag
to the newly introduced print_pmu_events() call site.
---
 tools/perf/Documentation/perf-list.txt |  8 +++-
 tools/perf/builtin-list.c  | 14 +-
 tools/perf/util/parse-events.c |  4 ++--
 tools/perf/util/parse-events.h |  2 +-
 tools/perf/util/pmu.c  |  4 ++--
 tools/perf/util/pmu.h  |  2 +-
 6 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/tools/perf/Documentation/perf-list.txt 
b/tools/perf/Documentation/perf-list.txt
index a126e97..72209bc 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -8,13 +8,19 @@ perf-list - List all symbolic event types
 SYNOPSIS
 
 [verse]
-'perf list' [hw|sw|cache|tracepoint|pmu|event_glob]
+'perf list' [--no-desc] [hw|sw|cache|tracepoint|pmu|event_glob]
 
 DESCRIPTION
 ---
 This command displays the symbolic event types which can be selected in the
 various perf commands with the -e option.
 
+OPTIONS
+---
+--no-desc::
+Don't print descriptions.
+
+
 [[EVENT_MODIFIERS]]
 EVENT MODIFIERS
 ---
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 5e22db4..6be195b 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -16,16 +16,20 @@
 #include "util/pmu.h"
 #include 
 
+static bool desc_flag = true;
+
 int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
 {
int i;
bool raw_dump = false;
struct option list_options[] = {
OPT_BOOLEAN(0, "raw-dump", &raw_dump, "Dump raw events"),
+   OPT_BOOLEAN('d', "desc", &desc_flag,
+   "Print extra event descriptions. --no-desc to not 
print."),
OPT_END()
};
const char * const list_usage[] = {
-   "perf list [hw|sw|cache|tracepoint|pmu|event_glob]",
+   "perf list [--no-desc] [hw|sw|cache|tracepoint|pmu|event_glob]",
NULL
};
 
@@ -40,7 +44,7 @@ int cmd_list(int argc, const char **argv, const char *prefix 
__maybe_unused)
printf("\nList of pre-defined events (to be used in -e):\n\n");
 
if (argc == 0) {
-   print_events(NULL, raw_dump);
+   print_events(NULL, raw_dump, !desc_flag);
return 0;
}
 
@@ -61,12 +65,12 @@ int cmd_list(int argc, const char **argv, const char 
*prefix __maybe_unused)
 strcmp(argv[i], "hwcache") == 0)
print_hwcache_events(NULL, raw_dump);
else if (strcmp(argv[i], "pmu") == 0)
-   print_pmu_events(NULL, raw_dump);
+   print_pmu_events(NULL, raw_dump, !desc_flag);
else if ((sep = strchr(argv[i], ':')) != NULL) {
int sep_idx;
 
if (sep == NULL) {
-   print_events(argv[i], raw_dump);
+   print_events(argv[i], raw_dump, !desc_flag);
continue;
}
sep_idx = sep - argv[i];
@@ -87,7 +91,7 @@ int cmd_list(int argc, const char **argv, const char *prefix 
__maybe_unused)
print_symbol_events(s, PERF_TYPE_SOFTWARE,
event_symbols_sw, 
PERF_COUNT_SW_MAX, raw_dump);
print_hwcache_events(s, raw_dump);
-   print_pmu_events(s, raw_dump);
+   print_pmu_events(s, raw_dump, !desc_flag);
print_tracepoint_events(NULL, s, raw_dump);
free(s);
}
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index d15e335..72803e3 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2133,7 +2133,7 @@ out_enomem:
 /*
  * Print the help text for the event symbols:
  */
-void print_events(const char *event_glob, bool name_only)
+void print_events(const char *event_glob, bool name_only, bool quiet_flag)
 {
print_symbol_events(event_glob, PERF_TYPE_HARDWARE,
event_symbols_hw, PERF_COUNT_HW_MAX, name_only);
@@ -2143,7 +2143,7 @@ void print_events(const char *event_glob, bool name_only)
 
print_hwcache_events(event_glob, name_only);
 
-

[PATCH v20 07/20] perf, tools: Query terminal width and use in perf list

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

Automatically adapt the now wider and word wrapped perf list
output to wider terminals. This requires querying the terminal
before the auto pager takes over, and exporting this
information from the pager subsystem.
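
For illustration, the width lookup can be sketched as a standalone function. output_columns() is a hypothetical name, and unlike the patch this sketch queries the terminal at call time, so it does not handle the case where a pager has already taken over stdout.

```c
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Usable output width: COLUMNS wins when set to a positive number,
 * then the terminal size reported for stdout, then 80 columns.
 * Two columns are kept free as a margin in the detected cases.
 */
static int output_columns(void)
{
	const char *s = getenv("COLUMNS");
	struct winsize sz;
	int cols = 80;

	if (s && atoi(s) > 0)
		return atoi(s);
	if (isatty(STDOUT_FILENO) &&
	    ioctl(STDOUT_FILENO, TIOCGWINSZ, &sz) == 0 && sz.ws_col > 0)
		cols = sz.ws_col;
	return cols - 2;
}
```

Once stdout is a pipe into the pager, isatty() fails and the ioctl is never tried, which is why the patch records the width in setup_pager() before the pager is spawned.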

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Namhyung Kim 
Acked-by: Jiri Olsa 
---

Changelog[v20]
- Minor reorg since helpers like setup_pager() are now in
  tools/lib/subcmd/pager.c
---
 tools/lib/subcmd/pager.c | 15 +++
 tools/lib/subcmd/pager.h |  1 +
 tools/perf/util/pmu.c|  3 ++-
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/lib/subcmd/pager.c b/tools/lib/subcmd/pager.c
index d50f3b5..409b582 100644
--- a/tools/lib/subcmd/pager.c
+++ b/tools/lib/subcmd/pager.c
@@ -3,6 +3,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "pager.h"
 #include "run-command.h"
 #include "sigchain.h"
@@ -14,6 +15,7 @@
  */
 
 static int spawned_pager;
+static int pager_columns;
 
 void pager_init(const char *pager_env)
 {
@@ -58,9 +60,12 @@ static void wait_for_pager_signal(int signo)
 void setup_pager(void)
 {
const char *pager = getenv(subcmd_config.pager_env);
+   struct winsize sz;
 
if (!isatty(1))
return;
+   if (ioctl(1, TIOCGWINSZ, &sz) == 0)
+   pager_columns = sz.ws_col;
if (!pager)
pager = getenv("PAGER");
if (!(pager || access("/usr/bin/pager", X_OK)))
@@ -98,3 +103,13 @@ int pager_in_use(void)
 {
return spawned_pager;
 }
+
+int pager_get_columns(void)
+{
+   char *s;
+
+   s = getenv("COLUMNS");
+   if (s)
+   return atoi(s);
+   return (pager_columns ? pager_columns : 80) - 2;
+}
diff --git a/tools/lib/subcmd/pager.h b/tools/lib/subcmd/pager.h
index 8b83714..623f554 100644
--- a/tools/lib/subcmd/pager.h
+++ b/tools/lib/subcmd/pager.h
@@ -5,5 +5,6 @@ extern void pager_init(const char *pager_env);
 
 extern void setup_pager(void);
 extern int pager_in_use(void);
+extern int pager_get_columns(void);
 
 #endif /* __SUBCMD_PAGER_H */
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 496c02e..10a95e7 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -14,6 +14,7 @@
 #include "cpumap.h"
 #include "header.h"
 #include "pmu-events/pmu-events.h"
+#include "cache.h"
 
 struct perf_pmu_format {
char *name;
@@ -1084,7 +1085,7 @@ void print_pmu_events(const char *event_glob, bool 
name_only)
int len, j;
struct pair *aliases;
int numdesc = 0;
-   int columns = 78;
+   int columns = pager_get_columns();
 
pmu = NULL;
len = 0;
-- 
2.5.3


[PATCH v20 06/20] perf, tools: Support alias descriptions

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

Add support to print alias descriptions in perf list, which
are taken from the generated event files.

The sorting code is changed to put the events with descriptions
at the end. The descriptions are printed as possibly multiple word
wrapped lines.

Example output:

% perf list
...
  arith.fpu_div
   [Divide operations executed]
  arith.fpu_div_active
   [Cycles when divider is busy executing divide operations]
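
The ordering described above (events with descriptions sorted after those without) can be sketched as a qsort() comparator. struct name_desc and cmp_name_desc() are illustrative names and only approximate the struct pair and cmp_pair() in the diff below.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical pairing of an event name with its optional description. */
struct name_desc {
	const char *name;
	const char *desc;	/* NULL when the event has no description */
};

/*
 * qsort() comparator: entries without a description sort first,
 * entries with one sort last; ties are broken by name.
 */
static int cmp_name_desc(const void *a, const void *b)
{
	const struct name_desc *x = a;
	const struct name_desc *y = b;
	int xd = x->desc != NULL;
	int yd = y->desc != NULL;

	if (xd != yd)
		return xd - yd;
	return strcmp(x->name, y->name);
}
```

It would be called as qsort(array, count, sizeof(array[0]), cmp_name_desc), after which the short one-line entries form a compact block ahead of the wrapped descriptions.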

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

Changelog
- Delete a redundant free()

Changelog[v14]
- [Jiri Olsa] Fail, rather than continue if strdup() returns NULL;
  remove unnecessary __maybe_unused.
---
 tools/perf/util/pmu.c | 83 +--
 tools/perf/util/pmu.h |  1 +
 2 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index eae0de1..496c02e 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -222,7 +222,7 @@ static int perf_pmu__parse_snapshot(struct perf_pmu_alias 
*alias,
 }
 
 static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
-char *desc __maybe_unused, char *val)
+char *desc, char *val)
 {
struct perf_pmu_alias *alias;
int ret;
@@ -255,6 +255,8 @@ static int __perf_pmu__new_alias(struct list_head *list, 
char *dir, char *name,
perf_pmu__parse_snapshot(alias, dir, name);
}
 
+   alias->desc = desc ? strdup(desc) : NULL;
+
list_add_tail(&alias->list, list);
 
return 0;
@@ -1035,11 +1037,42 @@ static char *format_alias_or(char *buf, int len, struct 
perf_pmu *pmu,
return buf;
 }
 
-static int cmp_string(const void *a, const void *b)
+struct pair {
+   char *name;
+   char *desc;
+};
+
+static int cmp_pair(const void *a, const void *b)
+{
+   const struct pair *as = a;
+   const struct pair *bs = b;
+
+   /* Put extra events last */
+   if (!!as->desc != !!bs->desc)
+   return !!as->desc - !!bs->desc;
+   return strcmp(as->name, bs->name);
+}
+
+static void wordwrap(char *s, int start, int max, int corr)
 {
-   const char * const *as = a;
-   const char * const *bs = b;
-   return strcmp(*as, *bs);
+   int column = start;
+   int n;
+
+   while (*s) {
+   int wlen = strcspn(s, " \t");
+
+   if (column + wlen >= max && column > start) {
+   printf("\n%*s", start, "");
+   column = start + corr;
+   }
+   n = printf("%s%.*s", column > start ? " " : "", wlen, s);
+   if (n <= 0)
+   break;
+   s += wlen;
+   column += n;
+   while (isspace(*s))
+   s++;
+   }
 }
 
 void print_pmu_events(const char *event_glob, bool name_only)
@@ -1049,7 +1082,9 @@ void print_pmu_events(const char *event_glob, bool 
name_only)
char buf[1024];
int printed = 0;
int len, j;
-   char **aliases;
+   struct pair *aliases;
+   int numdesc = 0;
+   int columns = 78;
 
pmu = NULL;
len = 0;
@@ -1059,14 +1094,15 @@ void print_pmu_events(const char *event_glob, bool 
name_only)
if (pmu->selectable)
len++;
}
-   aliases = zalloc(sizeof(char *) * len);
+   aliases = zalloc(sizeof(struct pair) * len);
if (!aliases)
goto out_enomem;
pmu = NULL;
j = 0;
while ((pmu = perf_pmu__scan(pmu)) != NULL) {
list_for_each_entry(alias, &pmu->aliases, list) {
-   char *name = format_alias(buf, sizeof(buf), pmu, alias);
+   char *name = alias->desc ? alias->name :
+   format_alias(buf, sizeof(buf), pmu, alias);
bool is_cpu = !strcmp(pmu->name, "cpu");
 
if (event_glob != NULL &&
@@ -1075,12 +,19 @@ void print_pmu_events(const char *event_glob, bool 
name_only)
   event_glob
continue;
 
-   if (is_cpu && !name_only)
+   if (is_cpu && !name_only && !alias->desc)
name = format_alias_or(buf, sizeof(buf), pmu, 
alias);
 
-   aliases[j] = strdup(name);
-   if (aliases[j] == NULL)
+   aliases[j].name = name;
+   if (is_cpu && !name_only && !alias->desc)
+   aliases[j].name = format_alias_or(buf,
+ sizeof(buf),
+

[PATCH v20 05/20] perf, tools: Support CPU id matching for x86 v2

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

Implement the code to match CPU types to mapfile types for x86
based on CPUID. This extends an existing similar function,
but changes it to use the x86 mapfile cpu description.
This allows resolving event lists generated by jevents.
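
As a rough sketch of the id format involved, assuming vendor, family and model have already been read from CPUID: format_cpuid() below is a hypothetical helper, and the "%s-%u-%X" layout follows the format string used in the diff and the mapfile example "GenuineIntel-6-1E".

```c
#include <stdio.h>

/*
 * Format a CPU id as "vendor-family-MODEL", with the model in upper-case
 * hex, which is the shape of the first mapfile column (for example
 * "GenuineIntel-6-1E"). Returns 0 on success, -1 if buf is too small.
 */
static int format_cpuid(char *buf, size_t sz, const char *vendor,
			unsigned int family, unsigned int model)
{
	int n = snprintf(buf, sz, "%s-%u-%X", vendor, family, model);

	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}
```

The sketch checks the snprintf() return value for truncation, where the patch instead looks for a trailing '$' end marker in the buffer.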

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

v2: Update to new get_cpuid_str() interface
---
 tools/perf/arch/x86/util/header.c | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/tools/perf/arch/x86/util/header.c 
b/tools/perf/arch/x86/util/header.c
index 146d12a..a74a48d 100644
--- a/tools/perf/arch/x86/util/header.c
+++ b/tools/perf/arch/x86/util/header.c
@@ -19,8 +19,8 @@ cpuid(unsigned int op, unsigned int *a, unsigned int *b, 
unsigned int *c,
: "a" (op));
 }
 
-int
-get_cpuid(char *buffer, size_t sz)
+static int
+__get_cpuid(char *buffer, size_t sz, const char *fmt)
 {
unsigned int a, b, c, d, lvl;
int family = -1, model = -1, step = -1;
@@ -48,7 +48,7 @@ get_cpuid(char *buffer, size_t sz)
if (family >= 0x6)
model += ((a >> 16) & 0xf) << 4;
}
-   nb = scnprintf(buffer, sz, "%s,%u,%u,%u$", vendor, family, model, step);
+   nb = scnprintf(buffer, sz, fmt, vendor, family, model, step);
 
/* look for end marker to ensure the entire data fit */
if (strchr(buffer, '$')) {
@@ -57,3 +57,21 @@ get_cpuid(char *buffer, size_t sz)
}
return -1;
 }
+
+int
+get_cpuid(char *buffer, size_t sz)
+{
+   return __get_cpuid(buffer, sz, "%s,%u,%u,%u$");
+}
+
+char *
+get_cpuid_str(void)
+{
+   char *buf = malloc(128);
+
+   if (__get_cpuid(buf, 128, "%s-%u-%X$") < 0) {
+   free(buf);
+   return NULL;
+   }
+   return buf;
+}
-- 
2.5.3


[PATCH v20 04/20] perf, tools: Support CPU ID matching for Powerpc

2016-06-20 Thread Sukadev Bhattiprolu

Implement code that returns the generic CPU ID string for Powerpc.
This will be used to identify the specific table of PMU events to
parse/compare user specified events against.

Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

Changelog[v14]
- [Jiri Olsa] Move this independent code off into a separate patch.
---
 tools/perf/arch/powerpc/util/header.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/header.c 
b/tools/perf/arch/powerpc/util/header.c
index f8ccee1..9aaa6f5 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -32,3 +32,14 @@ get_cpuid(char *buffer, size_t sz)
}
return -1;
 }
+
+char *
+get_cpuid_str(void)
+{
+   char *bufp;
+
+   if (asprintf(&bufp, "%.8lx", mfspr(SPRN_PVR)) < 0)
+   bufp = NULL;
+
+   return bufp;
+}
-- 
2.5.3


[PATCH v20 03/20] perf, tools: Use pmu_events table to create aliases

2016-06-20 Thread Sukadev Bhattiprolu

At run time (when 'perf' is starting up), locate the specific table
of PMU events that corresponds to the current CPU. Using that table,
create aliases for each of the PMU events in the CPU. Then use
these aliases to parse the user specified perf event.

In short this would allow the user to specify events using their
aliases rather than raw event codes.
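
The two-step lookup (CPU id to events table, then alias to event encoding) can be sketched as follows. The types and helpers here (struct event_sketch, struct map_sketch, find_table(), resolve_alias()) are simplified, hypothetical stand-ins and not the generated pmu-events.h definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified versions of the generated table types. */
struct event_sketch {
	const char *name;	/* alias the user types */
	const char *event;	/* encoding the alias expands to */
};

struct map_sketch {
	const char *cpuid;
	const struct event_sketch *table;	/* NULL terminates the map */
};

/* Find the events table whose cpuid matches, or NULL if none does. */
static const struct event_sketch *find_table(const struct map_sketch *map,
					      const char *cpuid)
{
	for (; map->table; map++)
		if (!strcmp(map->cpuid, cpuid))
			return map->table;
	return NULL;
}

/* Resolve an alias within one table; a NULL name ends the table. */
static const char *resolve_alias(const struct event_sketch *table,
				 const char *name)
{
	for (; table && table->name; table++)
		if (!strcmp(table->name, name))
			return table->event;
	return NULL;
}
```

Both arrays end in a sentinel entry, so no element count has to be generated or passed around; the patch terminates its tables the same way.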

Based on input and some earlier patches from Andi Kleen, Jiri Olsa.

Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---

Changelog[v4]
- Split off unrelated code into separate patches.
Changelog[v3]
- [Jiri Olsa] Fix memory leak in cpuid
Changelog[v2]
- [Andi Kleen] Replace pmu_events_map->vfm with a generic "cpuid".
---
 tools/perf/util/header.h |  1 +
 tools/perf/util/pmu.c| 61 
 2 files changed, 62 insertions(+)

diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index d306ca1..d30109b 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -151,4 +151,5 @@ int write_padded(int fd, const void *bf, size_t count, 
size_t count_aligned);
  */
 int get_cpuid(char *buffer, size_t sz);
 
+char *get_cpuid_str(void);
 #endif /* __PERF_HEADER_H */
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ddb0261..eae0de1 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -12,6 +12,8 @@
 #include "pmu.h"
 #include "parse-events.h"
 #include "cpumap.h"
+#include "header.h"
+#include "pmu-events/pmu-events.h"
 
 struct perf_pmu_format {
char *name;
@@ -464,6 +466,62 @@ static struct cpu_map *pmu_cpumask(const char *name)
return cpus;
 }
 
+/*
+ * Return the CPU id as a raw string.
+ *
+ * Each architecture should provide a more precise id string that
+ * can be use to match the architecture's "mapfile".
+ */
+char * __weak get_cpuid_str(void)
+{
+   return NULL;
+}
+
+/*
+ * From the pmu_events_map, find the table of PMU events that corresponds
+ * to the current running CPU. Then, add all PMU events from that table
+ * as aliases.
+ */
+static int pmu_add_cpu_aliases(struct list_head *head)
+{
+   int i;
+   struct pmu_events_map *map;
+   struct pmu_event *pe;
+   char *cpuid;
+
+   cpuid = get_cpuid_str();
+   if (!cpuid)
+   return 0;
+
+   i = 0;
+   while (1) {
+   map = &pmu_events_map[i++];
+   if (!map->table)
+   goto out;
+
+   if (!strcmp(map->cpuid, cpuid))
+   break;
+   }
+
+   /*
+* Found a matching PMU events table. Create aliases
+*/
+   i = 0;
+   while (1) {
+   pe = &map->table[i++];
+   if (!pe->name)
+   break;
+
+   /* need type casts to override 'const' */
+   __perf_pmu__new_alias(head, NULL, (char *)pe->name,
+   (char *)pe->desc, (char *)pe->event);
+   }
+
+out:
+   free(cpuid);
+   return 0;
+}
+
 struct perf_event_attr * __weak
 perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
 {
@@ -488,6 +546,9 @@ static struct perf_pmu *pmu_lookup(const char *name)
if (pmu_aliases(name, ))
return NULL;
 
+   if (!strcmp(name, "cpu"))
+   (void)pmu_add_cpu_aliases();
+
if (pmu_type(name, ))
return NULL;
 
-- 
2.5.3


[PATCH v20 02/20] perf, tools, jevents: Program to convert JSON file to C style file

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

This is a modified version of an earlier patch by Andi Kleen.

We expect architectures to describe the performance monitoring events
for each CPU in a corresponding JSON file, which look like:

[
{
"EventCode": "0x00",
"UMask": "0x01",
"EventName": "INST_RETIRED.ANY",
"BriefDescription": "Instructions retired from execution.",
"PublicDescription": "Instructions retired from execution.",
"Counter": "Fixed counter 1",
"CounterHTOff": "Fixed counter 1",
"SampleAfterValue": "203",
"SampleAfterValue": "203",
"MSRIndex": "0",
"MSRValue": "0",
"TakenAlone": "0",
"CounterMask": "0",
"Invert": "0",
"AnyThread": "0",
"EdgeDetect": "0",
"PEBS": "0",
"PRECISE_STORE": "0",
"Errata": "null",
"Offcore": "0"
}
]

We also expect the architectures to provide a mapping between individual
CPUs to their JSON files. Eg:

GenuineIntel-6-1E,V1,/NHM-EP/NehalemEP_core_V1.json,core

which maps each CPU, identified by [vendor, family, model, version, type]
to a JSON file.
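
Splitting such a mapfile line into its four fields might look like the sketch below. This is not the jevents implementation: struct mapfile_entry and parse_mapfile_line() are hypothetical, and the sketch assumes exactly four comma-separated fields with no quoting.

```c
#include <string.h>

/* Fields of one mapfile line; the pointers refer into the caller's buffer. */
struct mapfile_entry {
	char *cpuid;
	char *version;
	char *json_file;
	char *type;
};

/*
 * Split "cpuid,version,file,type" in place. The line is modified:
 * commas and a trailing newline are replaced by NUL bytes.
 * Returns 0 on success, -1 if the line does not have four fields.
 */
static int parse_mapfile_line(char *line, struct mapfile_entry *e)
{
	char **fields[] = { &e->cpuid, &e->version, &e->json_file, &e->type };
	size_t i;

	line[strcspn(line, "\r\n")] = '\0';
	for (i = 0; i < 4; i++) {
		char *comma = strchr(line, ',');

		*fields[i] = line;
		if (i == 3)
			return comma ? -1 : 0;
		if (!comma)
			return -1;
		*comma = '\0';
		line = comma + 1;
	}
	return -1;
}
```

Because the line is split in place, the caller must pass a writable buffer (for example one filled by fgets()), not a string literal.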

Given these files, the jevents program:
- locates all JSON files for the architecture,
- parses each JSON file and generates a C-style "PMU-events table"
  (pmu-events.c)
- locates a mapfile for the architecture
- builds a global table, mapping each model of CPU to the
  corresponding PMU-events table.

The 'pmu-events.c' is generated when building perf and added to libperf.a.
The global pmu_events_map[] table in this pmu-events.c will be used
in perf in a follow-on patch.

If the architecture does not have any JSON files or there is an error in
processing them, an empty mapping file is created. This would allow the
build of perf to proceed even if we are not able to provide aliases for
events.

The parser for JSON files allows parsing Intel style JSON event files. This
allows using an Intel event list directly with perf. The Intel event lists
can be quite large and are too big to store in unswappable kernel memory.

The conversion from JSON to C-style is straightforward. The parser knows
(very little) Intel specific information, and can be easily extended to
handle fields for other CPUs.

The parser code is partially shared with an independent parsing library,
which is 2-clause BSD licenced. To avoid any conflicts I marked those
files as BSD licenced too. As part of perf they become GPLv2.

Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---
v2: Address review feedback. Rename option to --event-files
v3: Add JSON example
v4: Update manpages.
v5: Don't remove dot in fixname. Fix compile error. Add include
protection. Comment realloc.
v6: Include debug/util.h
v7: (Sukadev Bhattiprolu)
Rebase to 4.0 and fix some conflicts.
v8: (Sukadev Bhattiprolu)
Move jevents.[hc] to tools/perf/pmu-events/
Rewrite to locate and process arch specific JSON and "map" files;
and generate a C file.
(Removed acked-by Namhyung Kim due to modest changes to patch)
Compile the generated pmu-events.c and add the pmu-events.o to
libperf.a
v9: [Sukadev Bhattiprolu/Andi Kleen] Rename ->vfm to ->cpuid and use
that field to encode the PVR in Power.
Allow blank lines in mapfile.
[Jiri Olsa] Pass ARCH as a parameter to jevents so we don't have
to detect it.
[Jiri Olsa] Use the infrastructure to build pmu-events/perf
(Makefile changes from Jiri included in this patch).
[Jiri Olsa, Andi Kleen] Detect changes to JSON files and rebuild
pmu-events.o only if necessary.

v11:- [Andi Kleen] Add mapfile, jevents dependency on pmu-events.c
- [Jiri Olsa] Be silent if arch doesn't have JSON files
- Also silence 'jevents' when parsing JSON files unless V=1 is
  specified during build. Cleanup error messages.

v14:- [Jiri Olsa] Fix compile error with DEBUG=1; drop unlink() and
  use "w" mode with fopen(); simplify file_name_to_table_name()

v15:- Fix minor conflict in tools/perf/Makefile.perf when rebasing
  to recent perf/core.

v16:- Rebase to upstream; fix conflicts in tools/perf/Makefile.perf

v18:- Rebase to upstream; fix conflicts in tools/perf/Makefile.perf

v20: Rebase to upstream; rename a local variable to 'ldirname' to avoid
collision with the dirname().
---
 tools/perf/Makefile.perf   |  26 +-
 tools/perf/pmu-events/Build|  11 +
 tools/perf/pmu-events/jevents.c| 686 +
 tools/perf/pmu-events/jevents.h|  17 +
 tools/perf/pmu-events/json.h   |   3 +
 tools/perf/pmu-events/pmu-events.h |  35 ++
 6 files changed, 775 insertions(+), 3

[PATCH v20 00/20] perf, tools: Add support for PMU events in JSON format

2016-06-20 Thread Sukadev Bhattiprolu

CPUs support a large number of performance monitoring events (PMU events)
and often these events are very specific to an architecture/model of the
CPU. To use most of these PMU events with perf, we currently have to identify
them by their raw codes:

perf stat -e r100f2 sleep 1

This patchset allows architectures to specify these PMU events in JSON
files located in 'tools/perf/pmu-events/arch/' of the mainline tree.
The events from the JSON files for the architecture are then built into
the perf binary.

At run time, perf identifies the specific set of events for the CPU and
creates "event aliases". These aliases allow users to specify events by
"name" as:

perf stat -e pm_1plus_ppc_cmpl sleep 1

The file, 'tools/perf/pmu-events/README' in [PATCH 16/16] gives more
details.

Note:
- All known events tables for the architecture are included in the
  perf binary.

- For architectures that don't have any JSON files, an empty mapping
  table is created and they should continue to build.

Thanks to input from Andi Kleen, Jiri Olsa, Namhyung Kim and Ingo Molnar.

These patches are available from:

https://github.com/sukadev/linux.git 

Branch  Description
--
json-code-v20   Source Code only 
json-data-v20   x86 and Powerpc datafiles only
json-code+data-v20  Both code and data (for build/test)

NOTE:   Only "source code" patches (i.e those in json-code-v20) are being
emailed.  Please pull the "data files" from the json-data-v20 branch.

Changelog[v20]
- Rebase to recent perf/core
- Add Patch 20/20 to allow perf-stat to work with the period= field

Changelog[v19]
Rebase to recent perf/core; fix couple lines >80 chars.

Changelog[v18]
Rebase to recent perf/core; fix minor merge conflicts.

Changelog[v17]
Rebase to recent perf/core; couple of small fixes to processing Intel
JSON files; allow case-insensitive PMU event names.

Changelog[v16]
Rebase to recent perf/core; fix minor merge conflicts; drop 3 patches
that were merged into perf/core.

Changelog[v15]
Code changes:
- Fix 'perf list' usage string and update man page.
- Remove a redundant __maybe_unused tag.
- Rebase to recent perf/core branch.

Data files updates: json-files-5 branch
- Rebase to perf/intel-json-files-5 from Andi Kleen
- Add patch from Madhavan Srinivasan for couple more Powerpc models

Changelog[v14]
Comments from Jiri Olsa:
- Change parameter name/type for pmu_add_cpu_aliases (from void *data
  to list_head *head)
- Use asprintf() in file_name_to_tablename() and simplify/reorg code.
- Use __weak definition from 
- Use fopen() with mode "w" and eliminate unlink()
- Remove minor TODO.
- Add error check for return value from strdup() in print_pmu_events().
- Move independent changes from patches 3,11,12 .. to separate patches
  for easier review/backport.
- Clarify mapfile's "header line support" in patch description.
- Fix build failure with DEBUG=1

Comment from Andi Kleen:
- In tools/perf/pmu-events/Build, check for 'mapfile.csv' rather than
  'mapfile*'

Misc:
- Minor changes/clarifications to tools/perf/pmu-events/README.


Changelog[v13]
Version: Individual patches have their own history :-) that I am
preserving. Patchset version (v13) is for overall patchset and is
somewhat arbitrary.

- Added support for "categories" of events to perf
- Add mapfile, jevents build dependency on pmu-events.c
- Silence jevents when parsing JSON files unless V=1 is specified
- Cleanup error messages
- Fix memory leak with ->cpuid
- Rebase to Arnaldo's tree
- Allow overriding CPUID via environment variable
- Support long descriptions for events
- Handle header line in mapfile.csv
- Cleanup JSON files (trim PublicDescription if identical to/prefix of
  BriefDescription field)


Andi Kleen (12):
  perf, tools: Add jsmn `jasmine' JSON parser
  perf, tools, jevents: Program to convert JSON file to C style file
  perf, tools: Support CPU id matching for x86 v2
  perf, tools: Support alias descriptions
  perf, tools: Query terminal width and use in perf list
  perf, tools: Add a --no-desc flag to perf list
  perf, tools: Add override support for event list CPUID
  perf, tools: Add support for event list topics
  perf, tools: Handle header line in mapfile
  perf, tools: Make alias matching case-insensitive
  perf, tools, pmu-events: Fix fixed counters on Intel
  perf, tools, pmu-events: Add Skylake frontend MSR support

Sukadev Bhattiprolu (8):
  perf, tools: Use pmu_events table to create aliases

[PATCH v20 01/20] perf, tools: Add jsmn `jasmine' JSON parser

2016-06-20 Thread Sukadev Bhattiprolu

From: Andi Kleen 

I need a JSON parser. This adds the simplest JSON
parser I could find -- Serge Zaitsev's jsmn `jasmine' --
to the perf library. I merely converted it to (mostly)
Linux style and added support for non 0 terminated input.

The parser is quite straightforward and does not
copy any data, just returns tokens with offsets
into the input buffer. So it's relatively efficient
and simple to use.
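
To illustrate what working with offset-based tokens looks like, here is a small sketch. struct tok_sketch and tok_equals() are hypothetical and deliberately not the jsmn types; the sketch assumes a token is a half-open [start, end) byte range into the input.

```c
#include <string.h>

/* Hypothetical token: a half-open [start, end) range into the input. */
struct tok_sketch {
	int start;
	int end;
};

/*
 * Compare the text a token refers to with a C string, without copying it
 * and without requiring the input buffer to be NUL terminated.
 */
static int tok_equals(const char *input, const struct tok_sketch *t,
		      const char *s)
{
	size_t len = (size_t)(t->end - t->start);

	return strlen(s) == len && !memcmp(input + t->start, s, len);
}
```

Since only offsets are stored, the input can be an mmap()ed file that is never modified or duplicated.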

The code is not fully checkpatch clean, but I didn't
want to completely fork the upstream code.

Original source: http://zserge.bitbucket.org/jsmn.html

In addition I added a simple wrapper that mmaps a json
file and provides some straightforward access functions.

Used in follow-on patches to parse event files.

Acked-by: Namhyung Kim 
Acked-by: Jiri Olsa 
Signed-off-by: Andi Kleen 
Signed-off-by: Sukadev Bhattiprolu 
---
v2: Address review feedback.
v3: Minor checkpatch fixes.
v4 (by Sukadev Bhattiprolu)
- Rebase to 4.0 and fix minor conflicts in tools/perf/Makefile.perf
- Report error if specified events file is invalid.
v5 (Sukadev Bhattiprolu)
- Move files to tools/perf/pmu-events/ since parsing of JSON file
now occurs when _building_ rather than running perf.
---
 tools/perf/pmu-events/jsmn.c | 313 +++
 tools/perf/pmu-events/jsmn.h |  67 +
 tools/perf/pmu-events/json.c | 162 ++
 tools/perf/pmu-events/json.h |  36 +
 4 files changed, 578 insertions(+)
 create mode 100644 tools/perf/pmu-events/jsmn.c
 create mode 100644 tools/perf/pmu-events/jsmn.h
 create mode 100644 tools/perf/pmu-events/json.c
 create mode 100644 tools/perf/pmu-events/json.h

diff --git a/tools/perf/pmu-events/jsmn.c b/tools/perf/pmu-events/jsmn.c
new file mode 100644
index 000..11d1fa1
--- /dev/null
+++ b/tools/perf/pmu-events/jsmn.c
@@ -0,0 +1,313 @@
+/*
+ * Copyright (c) 2010 Serge A. Zaitsev
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ * Slightly modified by AK to not assume 0 terminated input.
+ */
+
+#include <stdlib.h>
+#include "jsmn.h"
+
+/*
+ * Allocates a fresh unused token from the token pool.
+ */
+static jsmntok_t *jsmn_alloc_token(jsmn_parser *parser,
+  jsmntok_t *tokens, size_t num_tokens)
+{
+   jsmntok_t *tok;
+
+   if ((unsigned)parser->toknext >= num_tokens)
+   return NULL;
+   tok = &tokens[parser->toknext++];
+   tok->start = tok->end = -1;
+   tok->size = 0;
+   return tok;
+}
+
+/*
+ * Fills token type and boundaries.
+ */
+static void jsmn_fill_token(jsmntok_t *token, jsmntype_t type,
+   int start, int end)
+{
+   token->type = type;
+   token->start = start;
+   token->end = end;
+   token->size = 0;
+}
+
+/*
+ * Fills next available token with JSON primitive.
+ */
+static jsmnerr_t jsmn_parse_primitive(jsmn_parser *parser, const char *js,
+ size_t len,
+ jsmntok_t *tokens, size_t num_tokens)
+{
+   jsmntok_t *token;
+   int start;
+
+   start = parser->pos;
+
+   for (; parser->pos < len; parser->pos++) {
+   switch (js[parser->pos]) {
+#ifndef JSMN_STRICT
+   /*
+* In strict mode primitive must be followed by ","
+* or "}" or "]"
+*/
+   case ':':
+#endif
+   case '\t':
+   case '\r':
+   case '\n':
+   case ' ':
+   case ',':
+   case ']':
+   case '}':
+   goto found;
+   default:
+   break;
+   }
+   if (js[parser->pos] < 32 || js[parser->pos] >= 127) {
+   parser->pos = start;
+   return

Re: [RESEND PATCH v2 3/4] PCI: Add a new option for resource_alignment to reassign alignment

2016-06-20 Thread Yongji Xie


On 2016/6/21 9:57, Bjorn Helgaas wrote:


On Thu, Jun 02, 2016 at 01:46:50PM +0800, Yongji Xie wrote:

When using the resource_alignment kernel parameter, the current
implementation reassigns the alignment by changing resources' size
which can potentially break some drivers. For example, the driver
uses the size to locate some register whose length is related
to the size.

This patch adds a new option "noresize" for the parameter to
solve this problem.

Why do we ever want to change the resource's size?  I understand that
you want to change the resource's *alignment*, and that part makes
sense.  But why change the *size*?  Changing the resource size doesn't
change the hardware BAR size; it just means the struct resource will
describe a region larger than what the BAR actually claims.  That
unnecessarily wastes space after the BAR.

This was a problem with the code even before your patch; I'm
suggesting that if you have a way to change the alignment without
changing the resource size, maybe we should do that all the time.
Then you wouldn't need to add the "noresize" option.


Yes, changing resource's size seems not a good idea. But
would it break some existing systems which are using this
kernel parameter if we remove the "noresize" option and change
the alignment all the time?

Thanks,
Yongji


Signed-off-by: Yongji Xie 
---
  Documentation/kernel-parameters.txt |5 -
  drivers/pci/pci.c   |   35 +--
  2 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 82b42c9..c4802f5 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2997,13 +2997,16 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
window. The default value is 64 megabytes.
resource_alignment=
Format:
-   [<order of align>@][<domain>:]<bus>:<slot>.<func>[; ...]
+   [<order of align>@][<domain>:]<bus>:<slot>.<func>
+   [:noresize][; ...]
Specifies alignment and device to reassign
aligned memory resources.
If <order of align> is not specified,
PAGE_SIZE is used as alignment.
PCI-PCI bridge can be specified, if resource
windows need to be expanded.
+   noresize: Don't change the resources' sizes when
+   reassigning alignment.
ecrc=   Enable/disable PCIe ECRC (transaction layer
end-to-end CRC checking).
bios: Use BIOS/firmware settings. This is the
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a259394..3ee13e5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4748,11 +4748,13 @@ static DEFINE_SPINLOCK(resource_alignment_lock);
  /**
   * pci_specified_resource_alignment - get resource alignment specified by user.
   * @dev: the PCI device to get
+ * @resize: whether or not to change resources' size when reassigning alignment
   *
   * RETURNS: Resource alignment if it is specified.
   *  Zero if it is not specified.
   */
-static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
+static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
+   bool *resize)
  {
int seg, bus, slot, func, align_order, count;
resource_size_t align = 0;
@@ -4786,6 +4788,11 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
}
}
p += count;
+   if (!strncmp(p, ":noresize", 9)) {
+   *resize = false;
+   p += 9;
+   } else
+   *resize = true;
if (seg == pci_domain_nr(dev->bus) &&
bus == dev->bus->number &&
slot == PCI_SLOT(dev->devfn) &&
@@ -4818,11 +4825,12 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
  {
int i;
struct resource *r;
+   bool resize = true;
resource_size_t align, size;
u16 command;
  
/* check if specified PCI is target device to reassign */
-   align = pci_specified_resource_alignment(dev);
+   align = pci_specified_resource_alignment(dev, &resize);
if (!align)
return;
  
@@ -4844,15 +4852,22 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)

if (!(r->flags & IORESOURCE_MEM))
continue;
size = resource_size(r);
-   if (size < align) {
-   size = align;
-   dev_info(&dev->dev,
-   "Rounding

Re: [RFC PATCH 1/2] KVM: PPC: divide the ics lock into smaller ones for each irq

2016-06-20 Thread Li Zhong

> On Jun 20, 2016, at 13:25, Paul Mackerras  wrote:
> 
> On Mon, May 16, 2016 at 02:58:18PM +0800, Li Zhong wrote:
>> This patch tries to use smaller locks for each irq in the ics, instead
>> of a lock at the ics level, to provide better scalability.
> 
> This looks like a worth-while thing to do.  Do you have any
> performance measurements to justify the change?  This will increase
> the size of struct kvmppc_ics by 4kB, so it would be useful to show
> the performance increase that justifies it.

Actually, I saw some “improvement” only because the vcpus were not bound, and
the io jobs and irqs on the guest were not bound either. After I fixed those
random factors, the result became stable, but I couldn’t see any obvious
improvement from the patches... Maybe I need to find some other test cases
that could support this change.

> Also, when you resend the patch, please make the patch description
> more definite - say "With this patch, we use" rather than "this patch
> tries to use", for instance.

OK, I will change that when doing a resend, if I can find some workload that
could benefit from this change.

Thanks, Zhong

> Regards,
> Paul.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 10/12] lguest: Only descend into lguest directory when CONFIG_LGUEST is set

2016-06-20 Thread Rusty Russell

"Andrew F. Davis"  writes:
> When CONFIG_LGUEST is not set make will still descend into the lguest
> directory but nothing will be built. This produces unneeded build
> artifacts and messages in addition to slowing the build. Fix this here.
>
> Signed-off-by: Andrew F. Davis 
> ---
>  drivers/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied,

Thanks!
Rusty.

>
> diff --git a/drivers/Makefile b/drivers/Makefile
> index 19305e0..b178e2f 100644
> --- a/drivers/Makefile
> +++ b/drivers/Makefile
> @@ -122,7 +122,7 @@ obj-$(CONFIG_ACCESSIBILITY)   += accessibility/
>  obj-$(CONFIG_ISDN)   += isdn/
>  obj-$(CONFIG_EDAC)   += edac/
>  obj-$(CONFIG_EISA)   += eisa/
> -obj-y+= lguest/
> +obj-$(CONFIG_LGUEST) += lguest/
>  obj-$(CONFIG_CPU_FREQ)   += cpufreq/
>  obj-$(CONFIG_CPU_IDLE)   += cpuidle/
>  obj-y+= mmc/
> -- 
> 2.8.3
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RESEND PATCH v2 2/4] PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources

2016-06-20 Thread Yongji Xie


On 2016/6/21 9:50, Bjorn Helgaas wrote:


On Thu, Jun 02, 2016 at 01:46:49PM +0800, Yongji Xie wrote:

Now we use the IORESOURCE_STARTALIGN to identify bridge resources
in __assign_resources_sorted(). That's quite fragile. We can't
make sure that the PCI devices' resources will not use
IORESOURCE_STARTALIGN any more.

Can you explain this a little more?  I don't quite understand the
problem.  Maybe you can give an example of the problem?


Now there are two kinds of additional resources in the list: bridge
resource and SR-IOV resource. Here we just want to fix the
additional alignment for bridge. But if SR-IOV resource get the
flag IORESOURCE_STARTALIGN which is not only for bridge, the
current check seems to be wrong. And kernel parameter
"resource_alignment" could set IORESOURCE_STARTALIGN for
SR-IOV resource with my next patch applied.

Thanks,
Yongji
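
To make the index-based check concrete, here is a minimal user-space sketch of the idea described above: classify a resource by its position in the device's resource array rather than by a flag that other resources may also carry. All names (demo_dev, demo_res, DEMO_BRIDGE_FIRST, DEMO_BRIDGE_LAST) are hypothetical stand-ins, and the slot numbers are assumptions for illustration, not the kernel's actual values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout: low slots are device BARs, slots 7..10 are bridge windows. */
enum {
	DEMO_BRIDGE_FIRST = 7,
	DEMO_BRIDGE_LAST = 10,
	DEMO_NUM_RESOURCES = 11,
};

struct demo_res {
	unsigned long flags;
};

struct demo_dev {
	struct demo_res resource[DEMO_NUM_RESOURCES];
};

/* True if res is one of dev's bridge-window slots, judged by array index only. */
static bool demo_is_bridge_window(const struct demo_dev *dev,
				  const struct demo_res *res)
{
	ptrdiff_t index = res - dev->resource;

	/* res must point into dev->resource[] for the subtraction to be valid. */
	assert(index >= 0 && index < DEMO_NUM_RESOURCES);
	return index >= DEMO_BRIDGE_FIRST && index <= DEMO_BRIDGE_LAST;
}
```

Because the answer depends only on where the resource lives, setting an alignment flag on an SR-IOV or ordinary BAR resource cannot make it look like a bridge window.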


In this patch, we try to use a more robust way to identify
bridge resources.

Signed-off-by: Yongji Xie 
---
  drivers/pci/setup-bus.c |9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 55641a3..216ddbc 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -390,6 +390,7 @@ static void __assign_resources_sorted(struct list_head *head,
struct pci_dev_resource *dev_res, *tmp_res, *dev_res2;
unsigned long fail_type;
resource_size_t add_align, align;
+   int index;
  
  	/* Check if optional add_size is there */

if (!realloc_head || list_empty(realloc_head))
@@ -410,11 +411,13 @@ static void __assign_resources_sorted(struct list_head *head,
  
  		/*

 * There are two kinds of additional resources in the list:
-* 1. bridge resource  -- IORESOURCE_STARTALIGN
-* 2. SR-IOV resource   -- IORESOURCE_SIZEALIGN
+* 1. bridge resource
+* 2. SR-IOV resource
 * Here just fix the additional alignment for bridge
 */
-   if (!(dev_res->res->flags & IORESOURCE_STARTALIGN))
+   index = dev_res->res - dev_res->dev->resource;
+   if (index < PCI_BRIDGE_RESOURCES ||
+   index > PCI_BRIDGE_RESOURCE_END)

I think the code looks OK; at least, it seems to match the comment.
I'd just like to understand the problem a little better.


continue;
  
  		add_align = get_res_add_align(realloc_head, dev_res->res);

--
1.7.9.5


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev



Re: [RFC PATCH 2/2] KVM: PPC: Don't take lock when check irq's resend flag

2016-06-20 Thread Li Zhong


> On Jun 20, 2016, at 13:27, Paul Mackerras  wrote:
> 
> On Mon, May 16, 2016 at 03:02:13PM +0800, Li Zhong wrote:
>> It seems that we don't need to take the lock before evaluating irq's
>> resend flag. What needed is to make sure when we clear the ics's bit
>> in the icp's resend_map, we don't miss the resend flag of the irqs
>> that set the bit.
>> 
>> And seems this could be ordered through the barrier in test_and_clear_bit(),
>> and an newly added wmb when setting irq's resend flag, and icp's resend_map.
> 
> This looks fine to me.  Is there a measurable performance improvement
> from this?  I understand it could be hard to measure.
> 
> Also, you could make the patch description more definite - just say
> that we don't need to take the lock, there's no need for "seems”.

OK :)

However, we may need to ignore this one for now. To implement the P/Q stuff, we
probably need to make sure that irqs flagged for resend are resent only once.
It’s easier to guarantee that with the lock here, and the resend flag can be
cleared inside the lock.

Thanks, Zhong
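
For readers following the ordering argument in the quoted patch description, the sketch below models it in user space with C11 atomics. It is an illustration under stated assumptions, not the KVM code: demo_irq, demo_icp, demo_mark_resend and demo_take_resend are hypothetical names, the release fence stands in for the added wmb, and the default sequentially consistent fetch-and stands in for the barrier implied by test_and_clear_bit(). Clearing the flag with an exchange on the consumer side is a choice made for this sketch only.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

struct demo_irq {
	atomic_bool resend;		/* per-irq resend flag */
};

struct demo_icp {
	atomic_ulong resend_map;	/* one bit per ics */
};

/* Producer: make the per-irq flag visible before the per-ics bit. */
static void demo_mark_resend(struct demo_icp *icp, struct demo_irq *irq,
			     unsigned int ics_bit)
{
	assert(ics_bit < sizeof(unsigned long) * CHAR_BIT);
	atomic_store_explicit(&irq->resend, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* stand-in for wmb */
	atomic_fetch_or_explicit(&icp->resend_map, 1UL << ics_bit,
				 memory_order_relaxed);
}

/*
 * Consumer: clear the bit with a fully ordered read-modify-write, then
 * read the flag without taking a lock. If the bit was seen set, the
 * flag store that preceded it is guaranteed to be visible.
 */
static bool demo_take_resend(struct demo_icp *icp, struct demo_irq *irq,
			     unsigned int ics_bit)
{
	unsigned long mask;
	unsigned long old;

	assert(ics_bit < sizeof(unsigned long) * CHAR_BIT);
	mask = 1UL << ics_bit;
	old = atomic_fetch_and(&icp->resend_map, ~mask);
	if (!(old & mask))
		return false;
	return atomic_exchange(&irq->resend, false);
}
```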
 
> 
> Paul.
> ___
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] powerpc/powernv: Exclude MSI region in extended bridge window

2016-06-20 Thread Gavin Shan

On Tue, Jun 21, 2016 at 12:41:05PM +1000, Gavin Shan wrote:
>The windows of root port and bridge behind that are extended to
>the PHB's windows to accommodate the PCI hotplug happening in
>future. The PHB's 64KB 32-bits MSI region is included in bridge's
>M32 windows (in hardware) though it's excluded in the corresponding
>resource, as the bridge's M32 windows have 1MB as their minimal
>alignment. We observed EEH error during system boot when the MSI
>region is included in bridge's M32 window.
>
>This excludes top 1MB (including 64KB 32-bits MSI region) region
>from bridge's M32 windows when extending them.
>
>Signed-off-by: Gavin Shan 
>---
> arch/powerpc/platforms/powernv/pci-ioda.c | 17 -
> 1 file changed, 16 insertions(+), 1 deletion(-)
>

Michael, I saw the PCI hotplug patches have been merged to your "test"
branch. This one is the fix for EEH error found on Garrison platform.
Please apply it on top of that series.

Thanks,
Gavin

>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
>b/arch/powerpc/platforms/powernv/pci-ioda.c
>index bde7f76..e0a8a92 100644
>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>@@ -3239,6 +3239,7 @@ static void pnv_pci_fixup_bridge_resources(struct 
>pci_bus *bus,
>   struct pnv_phb *phb = hose->private_data;
>   struct pci_dev *bridge = bus->self;
>   struct resource *r, *w;
>+  bool msi_region = false;
>   int i;
>
>   /* Check if we need apply fixup to the bridge's windows */
>@@ -3259,11 +3260,25 @@ static void pnv_pci_fixup_bridge_resources(struct 
>pci_bus *bus,
>(type & IORESOURCE_PREFETCH) &&
>phb->ioda.m64_segsize)
>   w = &hose->mem_resources[1];
>-  else if (r->flags & type & IORESOURCE_MEM)
>+  else if (r->flags & type & IORESOURCE_MEM) {
>   w = &hose->mem_resources[0];
>+  msi_region = true;
>+  }
>
>   r->start = w->start;
>   r->end = w->end;
>+
>+  /* The 64KB 32-bits MSI region shouldn't be included in
>+   * the 32-bits bridge window. Otherwise, we can see strange
>+   * issues. One of them is EEH error observed on Garrison.
>+   *
>+   * Exclude top 1MB region which is the minimal alignment of
>+   * 32-bits bridge window.
>+   */
>+  if (msi_region) {
>+  r->end += 0x10000;
>+  r->end -= 0x100000;
>+  }
>   }
> }
>
>-- 
>2.1.0
>

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] powerpc/powernv: Exclude MSI region in extended bridge window

2016-06-20 Thread Gavin Shan

The windows of root port and bridge behind that are extended to
the PHB's windows to accommodate the PCI hotplug happening in
future. The PHB's 64KB 32-bits MSI region is included in bridge's
M32 windows (in hardware) though it's excluded in the corresponding
resource, as the bridge's M32 windows have 1MB as their minimal
alignment. We observed EEH error during system boot when the MSI
region is included in bridge's M32 window.

This excludes top 1MB (including 64KB 32-bits MSI region) region
from bridge's M32 windows when extending them.

Signed-off-by: Gavin Shan 
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index bde7f76..e0a8a92 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3239,6 +3239,7 @@ static void pnv_pci_fixup_bridge_resources(struct pci_bus *bus,
struct pnv_phb *phb = hose->private_data;
struct pci_dev *bridge = bus->self;
struct resource *r, *w;
+   bool msi_region = false;
int i;
 
/* Check if we need apply fixup to the bridge's windows */
@@ -3259,11 +3260,25 @@ static void pnv_pci_fixup_bridge_resources(struct pci_bus *bus,
 (type & IORESOURCE_PREFETCH) &&
 phb->ioda.m64_segsize)
w = &hose->mem_resources[1];
-   else if (r->flags & type & IORESOURCE_MEM)
+   else if (r->flags & type & IORESOURCE_MEM) {
w = &hose->mem_resources[0];
+   msi_region = true;
+   }
 
r->start = w->start;
r->end = w->end;
+
+   /* The 64KB 32-bits MSI region shouldn't be included in
+* the 32-bits bridge window. Otherwise, we can see strange
+* issues. One of them is EEH error observed on Garrison.
+*
+* Exclude top 1MB region which is the minimal alignment of
+* 32-bits bridge window.
+*/
+   if (msi_region) {
+   r->end += 0x10000;
+   r->end -= 0x100000;
+   }
}
 }
 
-- 
2.1.0
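
As a worked example of the arithmetic the commit message describes, the sketch below computes the new end of a bridge M32 window from the end of the PHB's M32 resource. The sizes (64KB MSI region, 1MB minimal alignment) come from the commit message; the function name, the macro names and the sample address are hypothetical, and the sketch assumes the hardware window ends on a 1MB boundary.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MSI_REGION_SIZE	0x10000ULL	/* 64KB 32-bit MSI region */
#define DEMO_M32_ALIGN		0x100000ULL	/* 1MB bridge window alignment */

/*
 * phb_end is the inclusive end of the PHB's M32 resource, which already
 * stops just below the MSI region. Returns the inclusive end to use for
 * the bridge window so that the top 1MB (MSI region included) is left out.
 */
static uint64_t demo_bridge_m32_end(uint64_t phb_end)
{
	uint64_t end = phb_end;

	end += DEMO_MSI_REGION_SIZE;	/* back up to the top of the hardware window */
	end -= DEMO_M32_ALIGN;		/* then drop the whole top 1MB */

	/* The result must still end on a 1MB boundary. */
	assert(((end + 1) & (DEMO_M32_ALIGN - 1)) == 0);
	return end;
}
```

For a PHB resource ending at 0xfffeffff, the intermediate value is 0xffffffff and the bridge window ends at 0xffefffff, so the window keeps its 1MB granularity and no longer overlaps the MSI region.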

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] powerpc/powernv: Print correct PHB type names

2016-06-20 Thread Gavin Shan

We're initializing "IODA1" and "IODA2" PHBs though they are IODA2
and NPU PHBs as below kernel log indicates.

   Initializing IODA1 OPAL PHB /pciex@3fffe4070
   Initializing IODA2 OPAL PHB /pciex@3fff00040

This fixes the PHB names. After it's applied, we get:

   Initializing IODA2 PHB (/pciex@3fffe4070)
   Initializing NPU PHB (/pciex@3fff00040)

Signed-off-by: Gavin Shan 
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index e0a8a92..7f952a6 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -55,6 +55,7 @@
 #define POWERNV_IOMMU_DEFAULT_LEVELS   1
 #define POWERNV_IOMMU_MAX_LEVELS   5
 
+static const char * const pnv_phb_names[] = { "IODA1", "IODA2", "NPU" };
 static void pnv_pci_ioda2_table_free_pages(struct iommu_table *tbl);
 
 void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
@@ -3626,7 +3627,8 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
void *aux;
long rc;
 
-   pr_info("Initializing IODA%d OPAL PHB %s\n", ioda_type, np->full_name);
+   pr_info("Initializing %s PHB (%s)\n",
+   pnv_phb_names[ioda_type], of_node_full_name(np));
 
prop64 = of_get_property(np, "ibm,opal-phbid", NULL);
if (!prop64) {
-- 
2.1.0
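
The naming fix above boils down to indexing a string table with the PHB type instead of printing the raw number. A minimal stand-alone sketch of that pattern follows; the enum and function names are hypothetical, the numbering is assumed to follow the table order shown in the patch, and the bounds check and compile-time size check are additions for this sketch rather than part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed numbering: same order as the name table in the patch. */
enum demo_phb_type {
	DEMO_PHB_IODA1,
	DEMO_PHB_IODA2,
	DEMO_PHB_NPU,
	DEMO_PHB_COUNT,
};

static const char * const demo_phb_names[] = { "IODA1", "IODA2", "NPU" };

/* Keep the table and the enum in step at compile time. */
static_assert(sizeof(demo_phb_names) / sizeof(demo_phb_names[0]) == DEMO_PHB_COUNT,
	      "name table must cover every PHB type");

/* Map a PHB type to its printable name, with a fallback for bad values. */
static const char *demo_phb_name(unsigned int type)
{
	return type < DEMO_PHB_COUNT ? demo_phb_names[type] : "unknown";
}
```

With this numbering, the old "IODA%d" format printed type 1 as "IODA1" and type 2 as "IODA2", which matches the misleading log lines quoted in the commit message.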

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RESEND PATCH v2 4/4] PCI: Add support for enforcing all MMIO BARs to be page aligned

2016-06-20 Thread Bjorn Helgaas

On Thu, Jun 02, 2016 at 01:46:51PM +0800, Yongji Xie wrote:
> When vfio passthrough a PCI device of which MMIO BARs are
> smaller than PAGE_SIZE, guest will not handle the mmio
> accesses to the BARs which leads to mmio emulations in host.
> 
> This is because vfio will not allow to passthrough one BAR's
> mmio page which may be shared with other BARs. Otherwise,
> there will be a backdoor that guest can use to access BARs
> of other guest.
> 
> To solve this issue, this patch modifies resource_alignment
> to support syntax where multiple devices get the same
> alignment. So we can use something like
> "pci=resource_alignment=*:*:*.*:noresize" to enforce the
> alignment of all MMIO BARs to be at least PAGE_SIZE so that
> one BAR's mmio page would not be shared with other BARs.
> 
> And we also define a macro PCIBIOS_MIN_ALIGNMENT to enable this
> automatically on PPC64 platform which can easily hit this issue
> because its PAGE_SIZE is 64KB.
> 
> Note that this would not be applied to VFs whose BARs are always
> page aligned and should be never reassigned according to SRIOV
> spec.

I see that SR-IOV spec r1.1, sec 3.3.13 requires that all VF BAR
resources be aligned on System Page Size, and must be sized to consume
an integral number of pages.

Where does it say VF BARs can't be reassigned?  I thought they *could*
be reassigned, as long as VFs are disabled when you do it.
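
A small sketch of the property under discussion may help: a BAR can be passed through safely only if it starts on a page boundary and covers whole pages, so that no page is shared with another BAR. The helper name below is hypothetical and the 64KB page size in the usage note is just the PPC64 case mentioned in the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if a BAR at [start, start + size) begins on a page boundary and
 * spans an integral number of pages. page_size must be a power of two.
 */
static bool demo_bar_owns_whole_pages(uint64_t start, uint64_t size,
				      uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return size != 0 &&
	       (start & (page_size - 1)) == 0 &&
	       (size & (page_size - 1)) == 0;
}
```

With 64KB pages, a 4KB BAR fails this check even when its start is aligned, because the rest of its page could be claimed by another BAR.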

> Signed-off-by: Yongji Xie 
> ---
>  Documentation/kernel-parameters.txt |2 ++
>  arch/powerpc/include/asm/pci.h  |2 ++
>  drivers/pci/pci.c   |   68 
> +--
>  3 files changed, 61 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index c4802f5..cb09503 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -3003,6 +3003,8 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   aligned memory resources.
>   If <order of align> is not specified,
>   PAGE_SIZE is used as alignment.
> + <domain>, <bus>, <slot> and <func> can be set to
> + "*" which means match all values.
>   PCI-PCI bridge can be specified, if resource
>   windows need to be expanded.
>   noresize: Don't change the resources' sizes when
> diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
> index a6f3ac0..742fd34 100644
> --- a/arch/powerpc/include/asm/pci.h
> +++ b/arch/powerpc/include/asm/pci.h
> @@ -28,6 +28,8 @@
>  #define PCIBIOS_MIN_IO   0x1000
>  #define PCIBIOS_MIN_MEM  0x1000
>  
> +#define PCIBIOS_MIN_ALIGNMENT  PAGE_SIZE
> +
>  struct pci_dev;
>  
>  /* Values for the `which' argument to sys_pciconfig_iobase syscall.  */
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 3ee13e5..664f295 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4759,7 +4759,12 @@ static resource_size_t 
> pci_specified_resource_alignment(struct pci_dev *dev,
>   int seg, bus, slot, func, align_order, count;
>   resource_size_t align = 0;
>   char *p;
> + bool invalid = false;
>  
> +#ifdef PCIBIOS_MIN_ALIGNMENT
> + align = PCIBIOS_MIN_ALIGNMENT;
> + *resize = false;
> +#endif

This PCIBIOS_MIN_ALIGNMENT part should be a separate patch by itself.

If you have PCIBIOS_MIN_ALIGNMENT enabled automatically for powerpc,
do you still need the command-line argument?

>   spin_lock(&resource_alignment_lock);
>   p = resource_alignment_param;
>   while (*p) {
> @@ -4776,16 +4781,49 @@ static resource_size_t 
> pci_specified_resource_alignment(struct pci_dev *dev,
>   } else {
>   align_order = -1;
>   }
> - if (sscanf(p, "%x:%x:%x.%x%n",
> - &seg, &bus, &slot, &func, &count) != 4) {
> + if (p[0] == '*' && p[1] == ':') {
> + seg = -1;
> + count = 1;
> + } else if (sscanf(p, "%x%n", &seg, &count) != 1 ||
> + p[count] != ':') {
> + invalid = true;
> + break;
> + }
> + p += count + 1;
> + if (*p == '*') {
> + bus = -1;
> + count = 1;
> + } else if (sscanf(p, "%x%n", &bus, &count) != 1) {
> + invalid = true;
> + break;
> + }
> + p += count;
> + if (*p == '.') {
> + slot = bus;
> + bus = seg;
>   seg = 0;
> - if (sscanf(p, "%x:%x.%x%n",
> - &bus, &slot, &func, &count) != 3) {
> - /* Invalid format */
> -

Re: [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set

2016-06-20 Thread Yongji Xie


On 2016/6/21 9:43, Bjorn Helgaas wrote:


On Thu, Jun 02, 2016 at 01:46:48PM +0800, Yongji Xie wrote:

The resource_alignment parameter releases memory resources allocated
by firmware so that the kernel can reassign new resources later on.
But this means no resources can be allocated by the kernel if
PCI_PROBE_ONLY was set, e.g. on the pSeries platform, because
PCI_PROBE_ONLY forces the kernel to use the firmware setup and not
to reassign any resources.

To solve this problem, this patch ignores resource_alignment
if PCI_PROBE_ONLY was set.

Signed-off-by: Yongji Xie 
---
  drivers/pci/pci.c |6 ++
  1 file changed, 6 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c8b4dbd..a259394 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4761,6 +4761,12 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
spin_lock(&resource_alignment_lock);
p = resource_alignment_param;
while (*p) {
+   if (pci_has_flag(PCI_PROBE_ONLY)) {
+   printk(KERN_ERR "PCI: Ignore resource_alignment parameter: %s with PCI_PROBE_ONLY set\n",
+   p);
+   *p = 0;
+   break;

Wouldn't it be simpler to make pci_set_resource_alignment_param() fail
if PCI_PROBE_ONLY is set?


I add the check here because I want to print some logs so that users
could know the reason why resource_alignment doesn't work when
they add this parameter.

Thanks,
Yongji


+   }
count = 0;
if (sscanf(p, "%d%n", &align_order, &count) == 1 &&
p[count] == '@') {
--
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev



Re: [PATCH] powerpc/pseries: Auto online hotplugged memory

2016-06-20 Thread Nathan Fontenot

On 06/20/2016 07:57 PM, Michael Ellerman wrote:
> On Mon, 2016-06-20 at 08:51 -0500, Nathan Fontenot wrote:
> 
>> Auto online hotplugged memory
>>
>> A recent update (commit id 31bc3858ea3) to the core mm hotplug code
>> introduced the memhp_auto_online variable to allow for automatically
>> onlining memory that is added.
>>
>> This patch updates the pseries memory hotplug code to enable this so that
>> any memory DLPAR added to the system is automatically onlined. The code
>> to add the memory block for memory added from add_memory() is removed as
>> this is not needed, the memory_add code does this.
> 
> Is this a bug fix, or just a cleanup?
> 

Hmmm.. some cleanup and some new feature. The removal of the memblock_add()
call is a cleanup and the setting of the memhp_auto_online variable is
taking advantage of a feature I was not previously aware of.

None of this is a bug fix.

-Nathan

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RESEND PATCH v2 3/4] PCI: Add a new option for resource_alignment to reassign alignment

2016-06-20 Thread Bjorn Helgaas

On Thu, Jun 02, 2016 at 01:46:50PM +0800, Yongji Xie wrote:
> When using the resource_alignment kernel parameter, the current
> implementation reassigns the alignment by changing resources' size
> which can potentially break some drivers. For example, the driver
> uses the size to locate some register whose length is related
> to the size.
> 
> This patch adds a new option "noresize" for the parameter to
> solve this problem.

Why do we ever want to change the resource's size?  I understand that
you want to change the resource's *alignment*, and that part makes
sense.  But why change the *size*?  Changing the resource size doesn't
change the hardware BAR size; it just means the struct resource will
describe a region larger than what the BAR actually claims.  That
unnecessarily wastes space after the BAR.

This was a problem with the code even before your patch; I'm
suggesting that if you have a way to change the alignment without
changing the resource size, maybe we should do that all the time.
Then you wouldn't need to add the "noresize" option.
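
The distinction between changing a resource's alignment and changing its size can be shown with a short user-space sketch. The struct and helpers below are hypothetical stand-ins (a plain inclusive start/end range, not the kernel's struct resource), and the code only illustrates the arithmetic: move the start up to the requested boundary and carry the original size along unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a resource: inclusive [start, end] range. */
struct demo_span {
	uint64_t start;
	uint64_t end;
};

/* Size of an inclusive range. */
static uint64_t demo_span_size(const struct demo_span *s)
{
	return s->end - s->start + 1;
}

/*
 * Move the span up to the next multiple of align while keeping its size.
 * align must be a non-zero power of two.
 */
static void demo_span_realign(struct demo_span *s, uint64_t align)
{
	uint64_t size = demo_span_size(s);

	assert(align != 0 && (align & (align - 1)) == 0);
	s->start = (s->start + align - 1) & ~(align - 1);
	s->end = s->start + size - 1;
}
```

A 4KB span at 0x1000 realigned to 64KB ends up at 0x10000..0x10fff: the start moves, but the range still describes exactly what the BAR claims.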

> Signed-off-by: Yongji Xie 
> ---
>  Documentation/kernel-parameters.txt |5 -
>  drivers/pci/pci.c   |   35 
> +--
>  2 files changed, 29 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index 82b42c9..c4802f5 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2997,13 +2997,16 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   window. The default value is 64 megabytes.
>   resource_alignment=
>   Format:
> - [<order of align>@][<domain>:]<bus>:<slot>.<func>[; ...]
> + [<order of align>@][<domain>:]<bus>:<slot>.<func>
> + [:noresize][; ...]
>   Specifies alignment and device to reassign
>   aligned memory resources.
>   If <order of align> is not specified,
>   PAGE_SIZE is used as alignment.
>   PCI-PCI bridge can be specified, if resource
>   windows need to be expanded.
> + noresize: Don't change the resources' sizes when
> + reassigning alignment.
>   ecrc=   Enable/disable PCIe ECRC (transaction layer
>   end-to-end CRC checking).
>   bios: Use BIOS/firmware settings. This is the
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index a259394..3ee13e5 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4748,11 +4748,13 @@ static DEFINE_SPINLOCK(resource_alignment_lock);
>  /**
>   * pci_specified_resource_alignment - get resource alignment specified by 
> user.
>   * @dev: the PCI device to get
> + * @resize: whether or not to change resources' size when reassigning 
> alignment
>   *
>   * RETURNS: Resource alignment if it is specified.
>   *  Zero if it is not specified.
>   */
> -static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
> +static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
> + bool *resize)
>  {
>   int seg, bus, slot, func, align_order, count;
>   resource_size_t align = 0;
> @@ -4786,6 +4788,11 @@ static resource_size_t 
> pci_specified_resource_alignment(struct pci_dev *dev)
>   }
>   }
>   p += count;
> + if (!strncmp(p, ":noresize", 9)) {
> + *resize = false;
> + p += 9;
> + } else
> + *resize = true;
>   if (seg == pci_domain_nr(dev->bus) &&
>   bus == dev->bus->number &&
>   slot == PCI_SLOT(dev->devfn) &&
> @@ -4818,11 +4825,12 @@ void pci_reassigndev_resource_alignment(struct 
> pci_dev *dev)
>  {
>   int i;
>   struct resource *r;
> + bool resize = true;
>   resource_size_t align, size;
>   u16 command;
>  
>   /* check if specified PCI is target device to reassign */
> - align = pci_specified_resource_alignment(dev);
> + align = pci_specified_resource_alignment(dev, &resize);
>   if (!align)
>   return;
>  
> @@ -4844,15 +4852,22 @@ void pci_reassigndev_resource_alignment(struct 
> pci_dev *dev)
>   if (!(r->flags & IORESOURCE_MEM))
>   continue;
>   size = resource_size(r);
> - if (size < align) {
> - size = align;
> - dev_info(&dev->dev,
> - "Rounding up size of resource #%d to %#llx.\n",
> - i, (unsigned long long)size);
> + if (resize) {
> + if (size <

Re: [RESEND PATCH v2 2/4] PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources

2016-06-20 Thread Bjorn Helgaas

On Thu, Jun 02, 2016 at 01:46:49PM +0800, Yongji Xie wrote:
> Now we use the IORESOURCE_STARTALIGN to identify bridge resources
> in __assign_resources_sorted(). That's quite fragile. We can't
> make sure that the PCI devices' resources will not use
> IORESOURCE_STARTALIGN any more.

Can you explain this a little more?  I don't quite understand the
problem.  Maybe you can give an example of the problem?

> In this patch, we try to use a more robust way to identify
> bridge resources.
> 
> Signed-off-by: Yongji Xie 
> ---
>  drivers/pci/setup-bus.c |9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 55641a3..216ddbc 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -390,6 +390,7 @@ static void __assign_resources_sorted(struct list_head 
> *head,
>   struct pci_dev_resource *dev_res, *tmp_res, *dev_res2;
>   unsigned long fail_type;
>   resource_size_t add_align, align;
> + int index;
>  
>   /* Check if optional add_size is there */
>   if (!realloc_head || list_empty(realloc_head))
> @@ -410,11 +411,13 @@ static void __assign_resources_sorted(struct list_head 
> *head,
>  
>   /*
>* There are two kinds of additional resources in the list:
> -  * 1. bridge resource  -- IORESOURCE_STARTALIGN
> -  * 2. SR-IOV resource   -- IORESOURCE_SIZEALIGN
> +  * 1. bridge resource
> +  * 2. SR-IOV resource
>* Here just fix the additional alignment for bridge
>*/
> - if (!(dev_res->res->flags & IORESOURCE_STARTALIGN))
> + index = dev_res->res - dev_res->dev->resource;
> + if (index < PCI_BRIDGE_RESOURCES ||
> + index > PCI_BRIDGE_RESOURCE_END)

I think the code looks OK; at least, it seems to match the comment.
I'd just like to understand the problem a little better.

>   continue;
>  
>   add_align = get_res_add_align(realloc_head, dev_res->res);
> -- 
> 1.7.9.5
> 
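
For readers following along, the check in the hunk above can be sketched outside the kernel. This is only an illustration: the demo_* names and the slot numbers are made up for the example and are not the kernel's PCI_BRIDGE_RESOURCES values. The point is that a resource's position in the device's resource array identifies it as a bridge window, independent of its flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot layout, chosen only for this example. */
enum {
    DEMO_BRIDGE_RESOURCES    = 4,  /* first bridge window slot (assumed) */
    DEMO_BRIDGE_RESOURCE_END = 7,  /* last bridge window slot (assumed) */
    DEMO_NUM_RESOURCES       = 9
};

struct demo_resource {
    unsigned long start, end, flags;
};

struct demo_dev {
    struct demo_resource resource[DEMO_NUM_RESOURCES];
};

/* True if res is one of dev's bridge window slots, judged by array position. */
static bool demo_is_bridge_resource(const struct demo_dev *dev,
                                    const struct demo_resource *res)
{
    /* Pointer subtraction yields the index of res within dev->resource. */
    ptrdiff_t index = res - dev->resource;

    /* res must really belong to this device's array. */
    assert(index >= 0 && index < DEMO_NUM_RESOURCES);

    return index >= DEMO_BRIDGE_RESOURCES &&
           index <= DEMO_BRIDGE_RESOURCE_END;
}
```

The flags field is deliberately never read here, which is the property the patch is after.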

Re: [V3, 2/2] powerpc/drivers: Add driver for operator panel on FSP machines

2016-06-20 Thread Suraj Jitindar Singh

On Thu, 16 Jun 2016 20:22:39 +1000 (AEST)
Michael Ellerman  wrote:

> On Thu, 2016-28-04 at 07:02:38 UTC, Suraj Jitindar Singh wrote:
> > Implement new character device driver to allow access from user
> > space to the 2x16 character operator panel display present on IBM
> > Power Systems machines with FSPs.  
> 
> I looked at this previously and somehow convinced myself it depended
> on skiboot changes, but it seems it doesn't.
> 
> Some comments below ...
> 
> > This will allow status information to be presented on the display
> > which is visible to a user.
> > 
> > The driver implements a 32 character buffer which a user can
> > read/write  
> 
> It looks like "32" is actually just one possible size, it comes from
> the device tree no?
Correct, although it is kind of hard coded into skiboot at the moment. I
will change the commit message to omit this.
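
To make the size point concrete, here is a small user-space sketch (not taken from the driver; the demo_* helpers are hypothetical) of treating the display as a flat buffer of num_lines * line_length characters, with both values passed in rather than hard-coded to 2 and 16:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Total number of characters the panel can show. */
static size_t demo_panel_size(size_t num_lines, size_t line_length)
{
    return num_lines * line_length;
}

/*
 * Copy line number `line` out of a flat display buffer into out and
 * NUL-terminate it. out must have room for line_length + 1 bytes.
 */
static void demo_get_line(const char *buf, size_t num_lines,
                          size_t line_length, size_t line, char *out)
{
    assert(line < num_lines);
    memcpy(out, buf + line * line_length, line_length);
    out[line_length] = '\0';
}
```

With num_lines = 2 and line_length = 16 this gives the 32 character buffer mentioned in the commit message, but nothing in the helpers depends on those numbers.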
> 
> > by accessing the device (/dev/oppanel). This buffer is then
> > displayed on  
> 
> Are we sure "op_panel" wouldn't be better?
Seems like that will cause less confusion, will change it.
> 
> > diff --git a/arch/powerpc/configs/powernv_defconfig
> > b/arch/powerpc/configs/powernv_defconfig index 0450310..8f9f4ce
> > 100644 --- a/arch/powerpc/configs/powernv_defconfig
> > +++ b/arch/powerpc/configs/powernv_defconfig
> > @@ -181,6 +181,7 @@ CONFIG_SERIAL_8250=y
> >  CONFIG_SERIAL_8250_CONSOLE=y
> >  CONFIG_SERIAL_JSM=m
> >  CONFIG_VIRTIO_CONSOLE=m
> > +CONFIG_IBM_OP_PANEL=m  
> 
> I think CONFIG_POWERNV_OP_PANEL would be a better name.
I agree.
> 
> > diff --git a/arch/powerpc/include/asm/opal.h
> > b/arch/powerpc/include/asm/opal.h index 9d86c66..b33e349 100644
> > --- a/arch/powerpc/include/asm/opal.h
> > +++ b/arch/powerpc/include/asm/opal.h
> > @@ -178,6 +178,8 @@ int64_t opal_dump_ack(uint32_t dump_id);
> >  int64_t opal_dump_resend_notification(void);
> >  
> >  int64_t opal_get_msg(uint64_t buffer, uint64_t size);
> > +int64_t opal_write_oppanel_async(uint64_t token, oppanel_line_t
> > *lines,
> > +   uint64_t num_lines);  
> 
> I realise you're just following the skiboot code which uses
> oppanel_line_t, but please don't do that in the kernel. Just use
> struct oppanel_line directly.
struct oppanel_line is typedef'd to oppanel_line_t in opal-api.h,
so should this be oppanel_line_t or struct oppanel_line?
> 
> > diff --git a/drivers/char/Makefile b/drivers/char/Makefile
> > index d8a7579..a02c61b 100644
> > --- a/drivers/char/Makefile
> > +++ b/drivers/char/Makefile
> > @@ -60,3 +60,4 @@ js-rtc-y = rtc.o
> >  
> >  obj-$(CONFIG_TILE_SROM)+= tile-srom.o
> >  obj-$(CONFIG_XILLYBUS) += xillybus/
> > +obj-$(CONFIG_IBM_OP_PANEL) += op-panel-powernv.o  
> 
> I'd prefer powernv-op-panel.c, but up to you.
This will align to the name of the config option, so will change
to your recommendation
> 
> > diff --git a/drivers/char/op-panel-powernv.c
> > b/drivers/char/op-panel-powernv.c new file mode 100644
> > index 000..90b74b7
> > --- /dev/null
> > +++ b/drivers/char/op-panel-powernv.c
> > @@ -0,0 +1,247 @@
> > +/*
> > + * OPAL Operator Panel Display Driver
> > + *
> > + * (C) Copyright IBM Corp. 2016
> > + *
> > + * Author: Suraj Jitindar Singh   
> 
> I'm not a fan of email addresses in C files, they just bit rot.
> 
> The preferred format is:
> 
>  * Copyright 2016, Suraj Jitindar Singh, IBM Corporation.
> 
> > + *
> > + * This program is free software; you can redistribute it and/or
> > modify
> > + * it under the terms of the GNU General Public License as
> > published by
> > + * the Free Software Foundation; either version 2 of the License,
> > or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.  
> 
> We don't need that paragraph in every file.
> 
Will update and remove these sections.
> > + */
> > +
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include 
> > +#include   
> 
> opal-api.h is sort of an implementation detail, you should just
> include opal.h
> 
> > +/*
> > + * This driver creates a character device (/dev/oppanel) which
> > exposes the
> > + * operator panel (2x16 character LCD display) on IBM Power
> > Systems machines
> > + * with FSPs.
> > + * A 32 character buffer written to the device will be displayed
> > on the
> > + * operator panel.
> > + */
> > +
> > +static DEFINE_MUTEX(oppanel_mutex);
> > +
> > +static oppanel_line_t  *oppanel_lines;
> > +static char*oppanel_data;
> > +static u32 line_length, num_lines;  
> 
> You calculate (num_lines *

[PATCH 6/6] IMA: Demonstration code for kexec buffer passing.

2016-06-20 Thread Thiago Jung Bauermann

This shows how kernel code can use the kexec buffer passing mechanism
to pass information to the next kernel.

This patch is not intended to be committed.

Signed-off-by: Thiago Jung Bauermann 
---
 include/linux/ima.h   | 11 +
 kernel/kexec_file.c   |  4 ++
 security/integrity/ima/ima.h  |  5 +++
 security/integrity/ima/ima_init.c | 26 
 security/integrity/ima/ima_template.c | 79 +++
 5 files changed, 125 insertions(+)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index 0eb7c2e7f0d6..96528d007139 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -11,6 +11,7 @@
 #define _LINUX_IMA_H
 
 #include 
+#include 
 struct linux_binprm;
 
 #ifdef CONFIG_IMA
@@ -23,6 +24,10 @@ extern int ima_post_read_file(struct file *file, void *buf, 
loff_t size,
  enum kernel_read_file_id id);
 extern void ima_post_path_mknod(struct dentry *dentry);
 
+#ifdef CONFIG_KEXEC_FILE
+extern void ima_add_kexec_buffer(struct kimage *image);
+#endif
+
 #else
 static inline int ima_bprm_check(struct linux_binprm *bprm)
 {
@@ -60,6 +65,12 @@ static inline void ima_post_path_mknod(struct dentry *dentry)
return;
 }
 
+#ifdef CONFIG_KEXEC_FILE
+static inline void ima_add_kexec_buffer(struct kimage *image)
+{
+}
+#endif
+
 #endif /* CONFIG_IMA */
 
 #ifdef CONFIG_IMA_APPRAISE
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 79d09a7784d8..143c70d2ef1c 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -261,6 +262,9 @@ kimage_file_prepare_segments(struct kimage *image, int 
kernel_fd, int initrd_fd,
}
}
 
+   /* IMA needs to pass the measurement list to the next kernel. */
+   ima_add_kexec_buffer(image);
+
/* Call arch image load handlers */
ldata = arch_kexec_kernel_image_load(image);
 
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index d3a939bf2781..940f68f3ccc9 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -101,6 +101,11 @@ struct ima_queue_entry {
 };
 extern struct list_head ima_measurements;  /* list of all measurements */
 
+#ifdef CONFIG_KEXEC_FILE
+extern void *kexec_buffer;
+extern size_t kexec_buffer_size;
+#endif
+
 /* Internal IMA function definitions */
 int ima_init(void);
 int ima_fs_init(void);
diff --git a/security/integrity/ima/ima_init.c 
b/security/integrity/ima/ima_init.c
index 5d679a685616..aaa2fc536ca4 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "ima.h"
 
@@ -103,6 +104,29 @@ void __init ima_load_x509(void)
 }
 #endif
 
+#ifdef CONFIG_KEXEC_FILE
+static void ima_load_kexec_buffer(void)
+{
+   int rc;
+
+   /* Fetch the buffer from the previous kernel, if any. */
+   rc = kexec_get_handover_buffer(&kexec_buffer, &kexec_buffer_size);
+   if (rc == 0) {
+   /* Demonstrate that buffer handover works. */
+   pr_err("kexec buffer contents: %s\n", (char *) kexec_buffer);
+   pr_err("kexec buffer contents after update: %s\n",
+  (char *) kexec_buffer + 4 * PAGE_SIZE + 10);
+
+   kexec_free_handover_buffer();
+   } else if (rc == -ENOENT)
+   pr_debug("No kexec buffer from the previous kernel.\n");
+   else
+   pr_debug("Error restoring kexec buffer: %d\n", rc);
+}
+#else
+static void ima_load_kexec_buffer(void) { }
+#endif
+
 int __init ima_init(void)
 {
u8 pcr_i[TPM_DIGEST_SIZE];
@@ -133,5 +157,7 @@ int __init ima_init(void)
 
ima_init_policy();
 
+   ima_load_kexec_buffer();
+
return ima_fs_init();
 }
diff --git a/security/integrity/ima/ima_template.c 
b/security/integrity/ima/ima_template.c
index febd12ed9b55..c5e81af8cb9c 100644
--- a/security/integrity/ima/ima_template.c
+++ b/security/integrity/ima/ima_template.c
@@ -15,6 +15,8 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include 
+#include 
 #include "ima.h"
 #include "ima_template_lib.h"
 
@@ -182,6 +184,83 @@ static int template_desc_init_fields(const char 
*template_fmt,
return 0;
 }
 
+#ifdef CONFIG_KEXEC_FILE
+void *kexec_buffer = NULL;
+size_t kexec_buffer_size = 0;
+
+/* Physical address of the measurement buffer in the next kernel. */
+unsigned long kexec_buffer_load_addr = 0;
+
+/*
+ * Called during reboot. IMA can add here new events that were generated after
+ * the kexec image was loaded.
+ */
+static int ima_update_kexec_buffer(struct notifier_block *self,
+  unsigned long action, void *data)
+{
+   int ret;
+
+   if (!kexec_in_progress)
+   return NOTIFY_OK;
+
+   /*
+* Add content deep in the buffer to show that we can update
+

[PATCH 5/6] kexec: Share logic to copy segment page contents.

2016-06-20 Thread Thiago Jung Bauermann

Make kimage_load_normal_segment and kexec_update_segment share code
which they currently duplicate.

Signed-off-by: Thiago Jung Bauermann 
---
 kernel/kexec_core.c | 159 +++-
 1 file changed, 95 insertions(+), 64 deletions(-)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 8781d3e4479d..281d8b961fb4 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -700,6 +700,65 @@ static struct page *kimage_alloc_page(struct kimage *image,
return page;
 }
 
+struct kimage_update_buffer_state {
+   /* Destination memory address currently being copied to. */
+   unsigned long maddr;
+
+   /* Bytes in buffer still left to copy. */
+   size_t ubytes;
+
+   /* Bytes in memory still left to copy. */
+   size_t mbytes;
+
+   /* If true, copy from kbuf. */
+   bool from_kernel;
+
+   /* Clear pages before copying? */
+   bool clear_pages;
+
+   /* Buffer position to continue copying from. */
+   const unsigned char *kbuf;
+   const unsigned char __user *buf;
+};
+
+static int kimage_update_page(struct page *page,
+ struct kimage_update_buffer_state *state)
+{
+   char *ptr;
+   int result = 0;
+   size_t uchunk, mchunk;
+
+   ptr = kmap(page);
+
+   /* Start with a clear page */
+   if (state->clear_pages)
+   clear_page(ptr);
+
+   ptr += state->maddr & ~PAGE_MASK;
+   mchunk = min_t(size_t, state->mbytes,
+  PAGE_SIZE - (state->maddr & ~PAGE_MASK));
+   uchunk = min(state->ubytes, mchunk);
+
+   if (state->from_kernel)
+   memcpy(ptr, state->kbuf, uchunk);
+   else
+   result = copy_from_user(ptr, state->buf, uchunk);
+
+   kunmap(page);
+   if (result)
+   return -EFAULT;
+
+   state->ubytes -= uchunk;
+   state->maddr += mchunk;
+   if (state->from_kernel)
+   state->kbuf += mchunk;
+   else
+   state->buf += mchunk;
+   state->mbytes -= mchunk;
+
+   return 0;
+}
+
 /**
  * kexec_update_segment - update the contents of a kimage segment
  * @buffer:New contents of the segment.
@@ -718,6 +777,7 @@ int kexec_update_segment(const char *buffer, unsigned long 
bufsz,
unsigned long entry;
unsigned long *ptr = NULL;
void *dest = NULL;
+   struct kimage_update_buffer_state state;
 
for (i = 0; i < kexec_image->nr_segments; i++)
/* We only support updating whole segments. */
@@ -736,8 +796,15 @@ int kexec_update_segment(const char *buffer, unsigned long 
bufsz,
return -EINVAL;
}
 
-   for (entry = kexec_image->head; !(entry & IND_DONE) && memsz;
-entry = *ptr++) {
+   state.maddr = load_addr;
+   state.ubytes = bufsz;
+   state.mbytes = memsz;
+   state.kbuf = buffer;
+   state.from_kernel = true;
+   state.clear_pages = false;
+
+   for (entry = kexec_image->head; !(entry & IND_DONE) &&
+   state.mbytes; entry = *ptr++) {
void *addr = (void *) (entry & PAGE_MASK);
 
switch (entry & IND_FLAGS) {
@@ -754,26 +821,13 @@ int kexec_update_segment(const char *buffer, unsigned 
long bufsz,
return -EINVAL;
}
 
-   if (dest == (void *) load_addr) {
-   struct page *page;
-   char *ptr;
-   size_t uchunk, mchunk;
-
-   page = kmap_to_page(addr);
-
-   ptr = kmap(page);
-   ptr += load_addr & ~PAGE_MASK;
-   mchunk = min_t(size_t, memsz,
-  PAGE_SIZE - (load_addr & 
~PAGE_MASK));
-   uchunk = min(bufsz, mchunk);
-   memcpy(ptr, buffer, uchunk);
-
-   kunmap(page);
+   if (dest == (void *) state.maddr) {
+   int ret;
 
-   bufsz -= uchunk;
-   load_addr += mchunk;
-   buffer += mchunk;
-   memsz -= mchunk;
+   ret = kimage_update_page(kmap_to_page(addr),
+);
+   if (ret)
+   return ret;
}
dest += PAGE_SIZE;
}
@@ -791,31 +845,30 @@ int kexec_update_segment(const char *buffer, unsigned 
long bufsz,
 static int kimage_load_normal_segment(struct kimage *image,
 struct kexec_segment *segment)
 {
-   unsigned

[PATCH 4/6] kexec_file: Add mechanism to update kexec segments.

2016-06-20 Thread Thiago Jung Bauermann

kexec_update_segment allows a given segment in kexec_image to have
its contents updated. This is useful if the current kernel wants to
send information to the next kernel that is up-to-date at the time of
reboot.
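
The page-by-page copy this function performs can be modelled in user space. The sketch below is an assumption-laden illustration, not the kernel code: it uses a flat array in place of the kimage entry list, a tiny made-up page size, and demo_* names that do not exist in the kernel. It keeps the same mchunk/uchunk arithmetic as the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 16u  /* tiny page size so the example is easy to trace */

static size_t demo_min(size_t a, size_t b)
{
    return a < b ? a : b;
}

/*
 * Copy bufsz bytes from buffer into mem, starting at offset load_addr and
 * covering memsz bytes of destination, one page-sized chunk at a time.
 * Returns the number of pages visited.
 */
static unsigned demo_update_segment(unsigned char *mem,
                                    const unsigned char *buffer, size_t bufsz,
                                    size_t load_addr, size_t memsz)
{
    unsigned pages = 0;

    assert(mem && buffer);
    while (memsz) {
        size_t off = load_addr % DEMO_PAGE_SIZE;
        /* Bytes of destination left in this page. */
        size_t mchunk = demo_min(memsz, DEMO_PAGE_SIZE - off);
        /* Bytes of source still available for this page. */
        size_t uchunk = demo_min(bufsz, mchunk);

        memcpy(mem + load_addr, buffer, uchunk);

        bufsz -= uchunk;
        load_addr += mchunk;
        /*
         * The patch advances the source by mchunk; advancing by uchunk
         * copies the same bytes and keeps the pointer inside the source
         * array once it has been fully consumed.
         */
        buffer += uchunk;
        memsz -= mchunk;
        pages++;
    }
    return pages;
}
```

For a 20-byte source written at offset 8 into a 32-byte segment, the loop visits three pages: 8 bytes, then 12 bytes, then a final page where nothing is left to copy.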

Signed-off-by: Thiago Jung Bauermann 
---
 include/linux/kexec.h |  2 ++
 kernel/kexec_core.c   | 88 +++
 kernel/kexec_file.c   |  1 +
 3 files changed, 91 insertions(+)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 131b1fc7820e..14d4ac070a8c 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -222,6 +222,8 @@ extern int kexec_add_buffer(struct kimage *image, char 
*buffer,
unsigned long buf_align, unsigned long buf_min,
unsigned long buf_max, bool top_down, bool checksum,
unsigned long *load_addr);
+int kexec_update_segment(const char *buffer, unsigned long bufsz,
+unsigned long load_addr, unsigned long memsz);
 extern struct page *kimage_alloc_control_pages(struct kimage *image,
unsigned int order);
 extern int kexec_load_purgatory(struct kimage *image, unsigned long min,
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 56b3ed0927b0..8781d3e4479d 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -700,6 +700,94 @@ static struct page *kimage_alloc_page(struct kimage *image,
return page;
 }
 
+/**
+ * kexec_update_segment - update the contents of a kimage segment
+ * @buffer:New contents of the segment.
+ * @bufsz: @buffer size.
+ * @load_addr: Segment's physical address in the next kernel.
+ * @memsz: Segment size.
+ *
+ * This function assumes kexec_mutex is held.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int kexec_update_segment(const char *buffer, unsigned long bufsz,
+unsigned long load_addr, unsigned long memsz)
+{
+   int i;
+   unsigned long entry;
+   unsigned long *ptr = NULL;
+   void *dest = NULL;
+
+   for (i = 0; i < kexec_image->nr_segments; i++)
+   /* We only support updating whole segments. */
+   if (load_addr == kexec_image->segment[i].mem &&
+   memsz == kexec_image->segment[i].memsz) {
+   if (kexec_image->segment[i].do_checksum) {
+   pr_err("Trying to update non-modifiable 
segment.\n");
+   return -EINVAL;
+   }
+
+   break;
+   }
+   if (i == kexec_image->nr_segments) {
+   pr_err("Couldn't find segment to update: 0x%lx, size 0x%lx\n",
+  load_addr, memsz);
+   return -EINVAL;
+   }
+
+   for (entry = kexec_image->head; !(entry & IND_DONE) && memsz;
+entry = *ptr++) {
+   void *addr = (void *) (entry & PAGE_MASK);
+
+   switch (entry & IND_FLAGS) {
+   case IND_DESTINATION:
+   dest = addr;
+   break;
+   case IND_INDIRECTION:
+   ptr = __va(addr);
+   break;
+   case IND_SOURCE:
+   /* Shouldn't happen, but verify just to be safe. */
+   if (dest == NULL) {
+   pr_err("Invalid kexec entries list.");
+   return -EINVAL;
+   }
+
+   if (dest == (void *) load_addr) {
+   struct page *page;
+   char *ptr;
+   size_t uchunk, mchunk;
+
+   page = kmap_to_page(addr);
+
+   ptr = kmap(page);
+   ptr += load_addr & ~PAGE_MASK;
+   mchunk = min_t(size_t, memsz,
+  PAGE_SIZE - (load_addr & 
~PAGE_MASK));
+   uchunk = min(bufsz, mchunk);
+   memcpy(ptr, buffer, uchunk);
+
+   kunmap(page);
+
+   bufsz -= uchunk;
+   load_addr += mchunk;
+   buffer += mchunk;
+   memsz -= mchunk;
+   }
+   dest += PAGE_SIZE;
+   }
+
+   /* Shouldn't happen, but verify just to be safe. */
+   if (ptr == NULL) {
+   pr_err("Invalid kexec entries list.");
+   return -EINVAL;
+   }
+   }
+
+   return 0;
+}
+
 static int kimage_load_normal_segment(struct kimage *image,
 struct kexec_segment *segment)
 {
diff --git a/kernel/kexec_file.c

[PATCH 3/6] kexec_file: Allow skipping checksum calculation for some segments.

2016-06-20 Thread Thiago Jung Bauermann

Adds checksum argument to kexec_add_buffer specifying whether the given
segment should be part of the checksum calculation.

The next patch will add a way to update segments after a kimage is loaded.
Segments that will be updated in this way should not be checksummed,
otherwise they will cause the purgatory checksum verification to fail
when the machine is rebooted.
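
A rough user-space sketch of why the flag matters follows. It is illustrative only: the demo_* names are hypothetical, and a toy polynomial checksum stands in for the real digest that the purgatory verifies. A segment excluded from the checksum can change after load without invalidating the stored value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_segment {
    const unsigned char *buf;
    size_t len;
    bool do_checksum;  /* mirrors the per-segment flag added by this patch */
};

/* Toy checksum over the segments that opted in; not the kernel's digest. */
static uint32_t demo_checksum(const struct demo_segment *segs, size_t n)
{
    uint32_t sum = 0;

    assert(segs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!segs[i].do_checksum)
            continue;  /* updatable segments are skipped */
        for (size_t j = 0; j < segs[i].len; j++)
            sum = sum * 31 + segs[i].buf[j];
    }
    return sum;
}
```

Modifying a segment with do_checksum set to false leaves the result unchanged, while modifying a checksummed segment does not.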

As a bonus, we don't need to special-case the purgatory segment anymore
to avoid checksumming it.

Adjust call sites for the new argument.

Signed-off-by: Thiago Jung Bauermann 
---
 arch/powerpc/kernel/kexec_elf_64.c |  6 +++---
 arch/x86/kernel/crash.c|  4 ++--
 arch/x86/kernel/kexec-bzimage64.c  |  6 +++---
 include/linux/kexec.h  |  7 +--
 kernel/kexec_file.c| 22 +++---
 5 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kernel/kexec_elf_64.c 
b/arch/powerpc/kernel/kexec_elf_64.c
index 5d2b7036fee7..abbad484d7b2 100644
--- a/arch/powerpc/kernel/kexec_elf_64.c
+++ b/arch/powerpc/kernel/kexec_elf_64.c
@@ -311,7 +311,7 @@ static int elf_exec_load(struct kimage *image, struct 
elfhdr *ehdr,
   (char *) elf_info->buffer + 
phdr->p_offset,
   size, phdr->p_memsz, phdr->p_align,
   phdr->p_paddr + base, ppc64_rma_size,
-  false, _addr);
+  false, true, _addr);
if (ret)
goto out;
 
@@ -487,7 +487,7 @@ void *elf64_load(struct kimage *image, char *kernel_buf,
if (initrd != NULL) {
ret = kexec_add_buffer(image, initrd, initrd_len, initrd_len,
   PAGE_SIZE, 0, ppc64_rma_size, false,
-  _load_addr);
+  true, _load_addr);
if (ret)
goto out;
 
@@ -564,7 +564,7 @@ void *elf64_load(struct kimage *image, char *kernel_buf,
fdt_pack(fdt);
 
ret = kexec_add_buffer(image, fdt, fdt_size, fdt_size, PAGE_SIZE, 0,
-  ppc64_rma_size, true, _load_addr);
+  ppc64_rma_size, true, true, _load_addr);
if (ret)
goto out;
 
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 9ef978d69c22..c8b16f2ca321 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -643,7 +643,7 @@ int crash_load_segments(struct kimage *image)
 */
ret = kexec_add_buffer(image, (char *)_zero_bytes,
   sizeof(crash_zero_bytes), src_sz,
-  PAGE_SIZE, 0, -1, 0,
+  PAGE_SIZE, 0, -1, false, true,
   >arch.backup_load_addr);
if (ret)
return ret;
@@ -660,7 +660,7 @@ int crash_load_segments(struct kimage *image)
image->arch.elf_headers_sz = elf_sz;
 
ret = kexec_add_buffer(image, (char *)elf_addr, elf_sz, elf_sz,
-   ELF_CORE_HEADER_ALIGN, 0, -1, 0,
+   ELF_CORE_HEADER_ALIGN, 0, -1, false, true,
>arch.elf_load_addr);
if (ret) {
vfree((void *)image->arch.elf_headers);
diff --git a/arch/x86/kernel/kexec-bzimage64.c 
b/arch/x86/kernel/kexec-bzimage64.c
index f2356bda2b05..f9016be44da6 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -420,7 +420,7 @@ static void *bzImage64_load(struct kimage *image, char 
*kernel,
 
ret = kexec_add_buffer(image, (char *)params, params_misc_sz,
   params_misc_sz, 16, MIN_BOOTPARAM_ADDR,
-  ULONG_MAX, 1, _load_addr);
+  ULONG_MAX, true, true, _load_addr);
if (ret)
goto out_free_params;
pr_debug("Loaded boot_param, command line and misc at 0x%lx bufsz=0x%lx 
memsz=0x%lx\n",
@@ -434,7 +434,7 @@ static void *bzImage64_load(struct kimage *image, char 
*kernel,
 
ret = kexec_add_buffer(image, kernel_buf,
   kernel_bufsz, kernel_memsz, kernel_align,
-  MIN_KERNEL_LOAD_ADDR, ULONG_MAX, 1,
+  MIN_KERNEL_LOAD_ADDR, ULONG_MAX, true, true,
   _load_addr);
if (ret)
goto out_free_params;
@@ -446,7 +446,7 @@ static void *bzImage64_load(struct kimage *image, char 
*kernel,
if (initrd) {
ret = kexec_add_buffer(image, initrd, initrd_len, initrd_len,
   PAGE_SIZE, MIN_INITRD_LOAD_ADDR,
-  ULONG_MAX, 1, _load_addr);
+  ULONG_MAX,

[PATCH 2/6] powerpc: kexec_file: Add buffer hand-over support for the next kernel

2016-06-20 Thread Thiago Jung Bauermann

The buffer hand-over mechanism allows the currently running kernel to pass
data to the kernel that will be kexec'd via a kexec segment. The second kernel
can check whether the previous kernel sent data and retrieve it.

This is the architecture-specific part.

Signed-off-by: Thiago Jung Bauermann 
---
 arch/powerpc/include/asm/kexec.h   |  9 +
 arch/powerpc/kernel/kexec_elf_64.c | 44 +++
 arch/powerpc/kernel/machine_kexec_64.c | 64 ++
 3 files changed, 117 insertions(+)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index a46f5f45570c..9b1ff59bc188 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -55,6 +55,15 @@ typedef void (*crash_shutdown_t)(void);
 
 #ifdef CONFIG_KEXEC
 
+#ifdef CONFIG_KEXEC_FILE
+#define ARCH_HAS_KIMAGE_ARCH
+
+struct kimage_arch {
+   phys_addr_t handover_buffer_addr;
+   unsigned long handover_buffer_size;
+};
+#endif
+
 /*
  * This function is responsible for capturing register states if coming
  * via panic or invoking dump using sysrq-trigger.
diff --git a/arch/powerpc/kernel/kexec_elf_64.c 
b/arch/powerpc/kernel/kexec_elf_64.c
index 4e71595300ed..5d2b7036fee7 100644
--- a/arch/powerpc/kernel/kexec_elf_64.c
+++ b/arch/powerpc/kernel/kexec_elf_64.c
@@ -96,6 +96,46 @@ static int elf64_probe(const char *buf, unsigned long len)
return elf_check_arch()? 0 : -ENOEXEC;
 }
 
+static int setup_handover_buffer(struct kimage *image, void *fdt,
+int chosen_node)
+{
+   int ret;
+
+   if (image->arch.handover_buffer_addr) {
+   ret = fdt_setprop_u64(fdt, chosen_node,
+ "linux,kexec-handover-buffer-start",
+ image->arch.handover_buffer_addr);
+   if (ret < 0) {
+   pr_err("Error setting up the new device tree.\n");
+   return -EINVAL;
+   }
+
+   /* -end is the first address after the buffer. */
+   ret = fdt_setprop_u64(fdt, chosen_node,
+ "linux,kexec-handover-buffer-end",
+ image->arch.handover_buffer_addr +
+ image->arch.handover_buffer_size);
+   if (ret < 0) {
+   pr_err("Error setting up the new device tree.\n");
+   return -EINVAL;
+   }
+
+   ret = fdt_add_mem_rsv(fdt, image->arch.handover_buffer_addr,
+ image->arch.handover_buffer_size);
+   if (ret) {
+   pr_err("Error reserving kexec handover buffer: %s\n",
+  fdt_strerror(ret));
+   return -EINVAL;
+   }
+
+   pr_debug("kexec handover buffer at 0x%llx, size = 0x%lx\n",
+image->arch.handover_buffer_addr,
+image->arch.handover_buffer_size);
+   }
+
+   return 0;
+}
+
 static bool find_debug_console(void *fdt, int chosen_node)
 {
int len;
@@ -494,6 +534,10 @@ void *elf64_load(struct kimage *image, char *kernel_buf,
}
}
 
+   ret = setup_handover_buffer(image, fdt, chosen_node);
+   if (ret)
+   goto out;
+
ret = fdt_setprop(fdt, chosen_node, "linux,booted-from-kexec", NULL, 0);
if (ret) {
pr_err("Error setting up the new device tree.\n");
diff --git a/arch/powerpc/kernel/machine_kexec_64.c 
b/arch/powerpc/kernel/machine_kexec_64.c
index 43e8185ab6f7..c582abf726f5 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -481,6 +482,69 @@ int arch_kimage_file_post_load_cleanup(struct kimage 
*image)
return image->fops->cleanup(image->image_loader_data);
 }
 
+bool kexec_can_hand_over_buffer(void)
+{
+   return true;
+}
+
+int arch_kexec_add_handover_buffer(struct kimage *image,
+  unsigned long load_addr, unsigned long size)
+{
+   image->arch.handover_buffer_addr = load_addr;
+   image->arch.handover_buffer_size = size;
+
+   return 0;
+}
+
+int kexec_get_handover_buffer(void **addr, unsigned long *size)
+{
+   int chosen_node;
+   int startsz, endsz;
+   const void *startp, *endp;
+   unsigned long start_addr, end_addr;
+
+   chosen_node = fdt_path_offset(initial_boot_params, "/chosen");
+   if (chosen_node < 0) {
+   pr_err("Malformed device tree: /chosen not found.\n");
+   return -EINVAL;
+   }
+
+   startp = of_get_flat_dt_prop(chosen_node,
+"linux,kexec-handover-buffer-start",
+);

[PATCH 1/6] kexec_file: Add buffer hand-over support for the next kernel

2016-06-20 Thread Thiago Jung Bauermann

The buffer hand-over mechanism allows the currently running kernel to pass
data to the kernel that will be kexec'd via a kexec segment. The second kernel
can check whether the previous kernel sent data and retrieve it.

This is the architecture-independent part of the feature.

Signed-off-by: Thiago Jung Bauermann 
---
 include/linux/kexec.h | 40 ++
 kernel/kexec_file.c   | 79 +++
 2 files changed, 119 insertions(+)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index a08cd986b5a1..72db95c623b3 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -325,6 +325,46 @@ int __weak arch_kexec_walk_mem(unsigned int image_type, 
unsigned long start,
 void arch_kexec_protect_crashkres(void);
 void arch_kexec_unprotect_crashkres(void);
 
+#ifdef CONFIG_KEXEC_FILE
+bool __weak kexec_can_hand_over_buffer(void);
+int __weak arch_kexec_add_handover_buffer(struct kimage *image,
+ unsigned long load_addr,
+ unsigned long size);
+int kexec_add_handover_buffer(struct kimage *image, void *buffer,
+ unsigned long bufsz, unsigned long memsz,
+ unsigned long buf_align, unsigned long buf_min,
+ unsigned long buf_max, bool top_down,
+ unsigned long *load_addr);
+int __weak kexec_get_handover_buffer(void **addr, unsigned long *size);
+int __weak kexec_free_handover_buffer(void);
+#else
+static inline bool kexec_can_hand_over_buffer(void)
+{
+   return false;
+}
+
+static inline int kexec_add_handover_buffer(struct kimage *image, void *buffer,
+   unsigned long bufsz,
+   unsigned long memsz,
+   unsigned long buf_align,
+   unsigned long buf_min,
+   unsigned long buf_max,
+   bool top_down, bool checksum,
+   unsigned long *load_addr)
+{
+   return -ENOTSUPP;
+}
+
+static inline int kexec_get_handover_buffer(void **addr, unsigned long *size)
+{
+   return -ENOTSUPP;
+}
+
+static inline int kexec_free_handover_buffer(void)
+{
+   return -ENOTSUPP;
+}
+#endif /* CONFIG_KEXEC_FILE */
 #else /* !CONFIG_KEXEC_CORE */
 struct pt_regs;
 struct task_struct;
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 3e494261d32a..d6ba702654f5 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -113,6 +113,85 @@ void kimage_file_post_load_cleanup(struct kimage *image)
image->image_loader_data = NULL;
 }
 
+/**
+ * kexec_can_hand_over_buffer - can we pass data to the kexec'd kernel?
+ */
+bool __weak kexec_can_hand_over_buffer(void)
+{
+   return false;
+}
+
+/**
+ * arch_kexec_add_handover_buffer - do arch-specific steps to handover buffer
+ *
+ * Architectures should use this function to pass on the handover buffer
+ * information to the next kernel.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int __weak arch_kexec_add_handover_buffer(struct kimage *image,
+ unsigned long load_addr,
+ unsigned long size)
+{
+   return -ENOTSUPP;
+}
+
+/**
+ * kexec_add_handover_buffer - add buffer to be used by the next kernel
+ * @image: kexec image to add buffer to.
+ * @buffer:Contents of the handover buffer.
+ * @bufsz: @buffer size.
+ * @memsz: Handover buffer size in memory.
+ * @buf_align: Buffer alignment restriction.
+ * @buf_min:   Minimum address where buffer can be placed.
+ * @buf_max:   Maximum address where buffer can be placed.
+ * @top_down:  Find the highest available memory position for the buffer?
+ * @load_addr: On successful return, set to the physical memory address of the
+ * buffer in the next kernel.
+ *
+ * This function assumes that kexec_mutex is held.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int kexec_add_handover_buffer(struct kimage *image, void *buffer,
+ unsigned long bufsz, unsigned long memsz,
+ unsigned long buf_align, unsigned long buf_min,
+ unsigned long buf_max, bool top_down,
+ unsigned long *load_addr)
+{
+   int ret;
+
+   if (!kexec_can_hand_over_buffer())
+   return -ENOTSUPP;
+
+   ret = kexec_add_buffer(image, buffer, bufsz, memsz, buf_align, buf_min,
+  buf_max, top_down, load_addr);
+   if (ret)
+   return ret;
+
+   return arch_kexec_add_handover_buffer(image, *load_addr, memsz);
+}
+
+/**
+ * kexec_get_handover_buffer - get the handover buffer from the

[PATCH 0/6] kexec_file: Add buffer hand-over for the next kernel

2016-06-20 Thread Thiago Jung Bauermann

Hello,

This patch series implements a mechanism which allows the kernel to pass on
a buffer to the kernel that will be kexec'd. This buffer is passed as a
segment which is added to the kimage when it is being prepared by
kexec_file_load.

How the second kernel is informed of this buffer is architecture-specific.
On PowerPC, this is done via the device tree, by checking the properties
/chosen/linux,kexec-handover-buffer-start and
/chosen/linux,kexec-handover-buffer-end, which is analogous to how the
kernel finds the initrd.
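
As a rough illustration of the receiving side (a sketch only, with hypothetical demo_* helpers rather than the functions in this series), the two properties hold big-endian 64-bit values, and the buffer size is the difference between them because -end is the first address after the buffer:

```c
#include <assert.h>
#include <stdint.h>

/* Decode a big-endian 64-bit value, the byte order used for device tree cells. */
static uint64_t demo_be64(const unsigned char p[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/*
 * Given the raw -start and -end property values, compute the buffer
 * address and size. Returns 0 on success, -1 if the range is empty or
 * inverted.
 */
static int demo_handover_range(const unsigned char start_prop[8],
                               const unsigned char end_prop[8],
                               uint64_t *addr, uint64_t *size)
{
    uint64_t start = demo_be64(start_prop);
    uint64_t end = demo_be64(end_prop);

    assert(addr && size);
    if (end <= start)
        return -1;

    *addr = start;
    *size = end - start;  /* -end is the first address after the buffer */
    return 0;
}
```

The range check is an addition for the example; whether the real code validates the properties this way is not something this cover letter states.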

This feature was implemented because the Integrity Measurement Architecture
subsystem needs to preserve its measurement list across the kexec reboot.
This is so that IMA can implement trusted boot support on the OpenPower
platform, because on such systems an intermediary Linux instance running as
part of the firmware is used to boot the target operating system via kexec.
Using this mechanism, IMA on this intermediary instance can hand over to the
target OS the measurements of the components that were used to boot it.

Because there could be additional measurement events between the
kexec_file_load call and the actual reboot, IMA needs a way to update the
buffer with those additional events before rebooting. One can minimize
the interval between the kexec_file_load and the reboot syscalls, but as
small as it can be, there is always the possibility that the measurement
list will be out of date at the time of reboot.

To address this issue, this patch series also introduces kexec_update_segment,
which allows a reboot notifier to change the contents of the image segment
during the reboot process.

There's one patch which makes kimage_load_normal_segment and
kexec_update_segment share code. It's not much code that they can share
though, so I'm not sure if it's worth including this patch.

The last patch is not intended to be merged, it just demonstrates how this
feature can be used.

This series applies on top of v2 of the "kexec_file_load implementation
for PowerPC" patch series at:

http://lists.infradead.org/pipermail/kexec/2016-June/016078.html

Thiago Jung Bauermann (6):
  kexec_file: Add buffer hand-over support for the next kernel
  powerpc: kexec_file: Add buffer hand-over support for the next kernel
  kexec_file: Allow skipping checksum calculation for some segments.
  kexec_file: Add mechanism to update kexec segments.
  kexec: Share logic to copy segment page contents.
  IMA: Demonstration code for kexec buffer passing.

 arch/powerpc/include/asm/kexec.h       |   9 ++
 arch/powerpc/kernel/kexec_elf_64.c     |  50 +++-
 arch/powerpc/kernel/machine_kexec_64.c |  64 ++
 arch/x86/kernel/crash.c                |   4 +-
 arch/x86/kernel/kexec-bzimage64.c      |   6 +-
 include/linux/ima.h                    |  11 ++
 include/linux/kexec.h                  |  47 +++-
 kernel/kexec_core.c                    | 205 ++---
 kernel/kexec_file.c                    | 102 ++--
 security/integrity/ima/ima.h           |   5 +
 security/integrity/ima/ima_init.c      |  26 +
 security/integrity/ima/ima_template.c  |  79 +
 12 files changed, 547 insertions(+), 61 deletions(-)

-- 
1.9.1

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/3] cxlflash: Shutdown notification and reset patch

2016-06-20 Thread Martin K. Petersen

> "Uma" == Uma Krishnan  writes:

Uma> This patch set contains support to notify CXL Flash devices of an
Uma> impending shutdown and a fix to drain operations prior to a reset.

Uma> This series is intended for 4.8 and is bisectable.

Applied to 4.8/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] powerpc/align: Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE

2016-06-20 Thread Daniel Axtens

Hi Arnd,

> Something like the (untested) patch below, similar to how we
> already handle the word size and how some other architectures
> handle setting __BIG_ENDIAN__.

I tested this by reverting Michael's patch and applying yours.

Not only does it successfully fix the errors that patch fixes, it
manages to clean up the following four errors as well:

+/scratch/dja/linux/arch/powerpc/lib/sstep.c:371:32: error: cast from unknown type
+/scratch/dja/linux/arch/powerpc/lib/sstep.c:371:59: error: using member 'word' in incomplete struct
+/scratch/dja/linux/arch/powerpc/lib/sstep.c:411:32: error: cast from unknown type
+/scratch/dja/linux/arch/powerpc/lib/sstep.c:411:59: error: using member 'word' in incomplete struct

So:
Tested-by: Daniel Axtens 

(I think the patch also needs your sign-off.)

Regards,
Daniel
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: unrecoverable exception on G5 with CONFIG_PPC_EARLY_DEBUG enabled

2016-06-20 Thread Michael Ellerman

On Mon, 2016-06-20 at 15:51 +0530, Aneesh Kumar K.V wrote:
> Michael Ellerman  writes:
> > diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> > index 4c9440629128..8bcc1b457115 100644
> > --- a/arch/powerpc/kernel/exceptions-64s.S
> > +++ b/arch/powerpc/kernel/exceptions-64s.S
> > @@ -1399,11 +1399,12 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX)
> >     lwz     r9,PACA_EXSLB+EX_CCR(r13)   /* get saved CR */
> > 
> >     mtlr    r10
> > -BEGIN_MMU_FTR_SECTION
> > -   b       2f
> > -END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX)
> >     andi.   r10,r12,MSR_RI  /* check for unrecoverable exception */
> > +BEGIN_MMU_FTR_SECTION
> >     beq-    2f
> > +FTR_SECTION_ELSE
> > +   b       2f
> > +ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_RADIX)
> > 
> >  .machine   push
> >  .machine   "power4"
> 
> I sent a patch which should get this problem fixed.
> 
> http://mid.gmane.org/1466274479-5650-1-git-send-email-aneesh.ku...@linux.vnet.ibm.com

Well s/fixed/avoided/.

I'd rather we fixed the root cause, which is that the SLB miss handler is broken
until code patching happens. When possible we should write feature sections so
that the unpatched code is functional, to avoid problems like this.

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] powerpc/pseries: Auto online hotplugged memory

2016-06-20 Thread Michael Ellerman

On Mon, 2016-06-20 at 08:51 -0500, Nathan Fontenot wrote:

> Auto online hotplugged memory
> 
> A recent update (commit id 31bc3858ea3) to the core mm hotplug code
> introduced the memhp_auto_online variable to allow for automatically
> onlining memory that is added.
> 
> This patch updates the pseries memory hotplug code to enable this so that
> any memory DLPAR added to the system is automatically onlined. The code
> to add the memory block for memory added from add_memory() is removed, as
> it is not needed: the memory_add code does this.

Is this a bug fix, or just a cleanup?

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] powerpc/align: Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE

2016-06-20 Thread Michael Ellerman

On Fri, 2016-06-17 at 12:46 +0200, Arnd Bergmann wrote:
> On Friday, June 17, 2016 1:35:35 PM CEST Daniel Axtens wrote:
> > > It would be better to fix the sparse compilation so the same endianness
> > > is set that you get when calling gcc.
> > 
> > I will definitely work on a patch to sparse! I'd still like this or
> > something like it to go in though, so we can keep working on reducing
> > the sparse warning count while the sparse patch is in the works.
> 
> I think you just need to fix the Makefile so it sets the right
> arguments when calling sparse.
> 
> Something like the (untested) patch below, similar to how we
> already handle the word size and how some other architectures
> handle setting __BIG_ENDIAN__.

Yep that's clearly better. I didn't know we had separate CHECKER_FLAGS.

Daniel can you test that?

Arnd we'll add Suggested-by: you, or send a SOB if you like?

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: powerpc/asm: Remove unused symbols in asm-offsets.c

2016-06-20 Thread Michael Ellerman

On Wed, 2016-01-06 at 22:56:47 UTC, Rashmica Gupta wrote:
> THREAD_DSCR: Added in commit efcac6589a27 ("powerpc: Per process DSCR +
...
> 
> Signed-off-by: Rashmica Gupta 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/aac6a91fea93e6bdd7ac20365d

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: powerpc/align: Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE

2016-06-20 Thread Michael Ellerman

On Thu, 2016-16-06 at 12:33:41 UTC, Michael Ellerman wrote:
> From: Daniel Axtens 
> 
> Sparse complains that it doesn't know what REG_BYTE is:
> 
>   arch/powerpc/kernel/align.c:313:29: error: undefined identifier 'REG_BYTE'
> 
> REG_BYTE is defined differently based on whether we're compiling for
> LE, BE32 or BE64. Sparse apparently doesn't provide __BIG_ENDIAN__ or
> __LITTLE_ENDIAN__, which means we get no definition.
> 
> Rather than check for __BIG_ENDIAN__ and then separately for
> __LITTLE_ENDIAN__, just switch the #ifdef to check for __BIG_ENDIAN__
> and then #else we define the little endian version. Technically that's
> dicey because PDP_ENDIAN is also a possibility, but we already do it in
> a lot of places so one more hardly matters.
> 
> Signed-off-by: Daniel Axtens 
> Signed-off-by: Michael Ellerman 

Applied to powerpc next.

https://git.kernel.org/powerpc/c/a9650e9bc53239c30c39f77d9d

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [v4] powerpc: spinlock: Fix spin_unlock_wait()

2016-06-20 Thread Michael Ellerman

On Fri, 2016-10-06 at 03:51:28 UTC, Boqun Feng wrote:
> There is an ordering issue with spin_unlock_wait() on powerpc, because
> the spin_lock primitive is an ACQUIRE and an ACQUIRE is only ordering
> the load part of the operation with memory operations following it.
...
> 
> Suggested-by: "Paul E. McKenney" 
> Signed-off-by: Boqun Feng 
> Reviewed-by: "Paul E. McKenney" 
> [mpe: Inline the "nop" ll/sc loop and set EH=0, munge change log]
> Signed-off-by: Michael Ellerman 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/6262db7c088bbfc26480d10144

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: powerpc: Add array bounds checking to crash_shutdown_handlers

2016-06-20 Thread Michael Ellerman

On Wed, 2016-11-05 at 00:57:32 UTC, Suraj Jitindar Singh wrote:
> The array crash_shutdown_handles is an array of size CRASH_HANDLER_MAX+1
> containing up to CRASH_HANDLER_MAX shutdown_handlers. It is assumed to
> be NULL terminated, which it is under normal circumstances. Array
> accesses in the functions crash_shutdown_unregister() and
> default_machine_crash_shutdown() rely on this NULL termination property
> when traversing this list and don't protect against out of bounds accesses.
> If the NULL terminator were somehow overwritten these functions could
> potentially access out of the bounds of the array.
> 
> Shrink the array to size CRASH_HANDLER_MAX and implement explicit array
> bounds checking when accessing the elements of the
> crash_shutdown_handles[] array in crash_shutdown_unregister() and
> default_machine_crash_shutdown().
> 
> Signed-off-by: Suraj Jitindar Singh 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/1d1451655bad9a6a5fd7a42de6

cheers
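
For readers following along, the approach described in the changelog above (an array of exactly CRASH_HANDLER_MAX slots, walked with an explicit index bound rather than trusting a NULL terminator) can be sketched in userspace roughly as below. This is only an illustration, not the code from the patch: the demo_* names, the demo handlers, the return values and the value 3 for CRASH_HANDLER_MAX are all assumptions made for the example.

```c
#include <stddef.h>

#define CRASH_HANDLER_MAX 3   /* value chosen for the demo only */

typedef void (*crash_shutdown_t)(void);

/* Exactly CRASH_HANDLER_MAX slots, no spare NULL terminator. */
static crash_shutdown_t demo_handlers[CRASH_HANDLER_MAX];
static int demo_calls;

/* Hypothetical handlers that only count how often they run. */
static void demo_a(void) { demo_calls++; }
static void demo_b(void) { demo_calls++; }
static void demo_c(void) { demo_calls++; }

/* Returns 0 on success, 1 if every slot is taken. */
static int demo_register(crash_shutdown_t h)
{
    for (size_t i = 0; i < CRASH_HANDLER_MAX; i++) {
        if (!demo_handlers[i]) {
            demo_handlers[i] = h;
            return 0;
        }
    }
    return 1;
}

/* Returns 0 on success, 1 if the handler was not registered. */
static int demo_unregister(crash_shutdown_t h)
{
    size_t i;

    for (i = 0; i < CRASH_HANDLER_MAX; i++)
        if (demo_handlers[i] == h)
            break;
    if (i == CRASH_HANDLER_MAX)
        return 1;

    /* Close the gap; the index never goes past the last slot. */
    for (; i + 1 < CRASH_HANDLER_MAX; i++)
        demo_handlers[i] = demo_handlers[i + 1];
    demo_handlers[CRASH_HANDLER_MAX - 1] = NULL;
    return 0;
}

/* Calls each registered handler in order; returns how many ran. */
static int demo_run_all(void)
{
    int ran = 0;

    for (size_t i = 0; i < CRASH_HANDLER_MAX && demo_handlers[i]; i++) {
        demo_handlers[i]();
        ran++;
    }
    return ran;
}
```

Every loop carries its own `i < CRASH_HANDLER_MAX` bound, so a full table is walked safely even though no terminating NULL slot exists.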
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [v2] powerpc/mm: Ensure "special" zones are empty

2016-06-20 Thread Michael Ellerman

On Wed, 2016-11-05 at 09:22:18 UTC, Oliver O'Halloran wrote:
> The mm zone mechanism was traditionally used by arch specific code to
> partition memory into allocation zones. However there are several zones
> that are managed by the mm subsystem rather than the architecture. Most
> architectures set the max PFN of these special zones to zero, however on
> powerpc we set them to ~0ul. This, in conjunction with a bug in
> free_area_init_nodes() results in all of system memory being placed in
> ZONE_DEVICE when enabled. Device memory cannot be used for regular kernel
> memory allocations so this will cause a kernel panic at boot. Given the
> planned addition of more mm managed zones (ZONE_CMA) we should aim to be
> consistent with every other architecture and set the max PFN for these
> zones to zero.
> 
> Signed-off-by: Oliver O'Halloran 
> Reviewed-by: Balbir Singh 
> Cc: linux...@kvack.org

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/3079abe11031e2ba5d1e21

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: cxl: Update process element after allocating interrupts

2016-06-20 Thread Michael Ellerman

On Mon, 2016-23-05 at 16:14:05 UTC, Ian Munsie wrote:
> From: Ian Munsie 
> 
> In the kernel API, it is possible to attempt to allocate AFU interrupts
> after already starting a context. Since the process element structure
> used by the hardware is only filled out at the time the context is
> started, it will not be updated with the interrupt numbers that have
> just been allocated and therefore AFU interrupts will not work unless
> they were allocated prior to starting the context.
> 
> This can present some difficulties as each CAPI enabled PCI device in
> the kernel API has a default context, which may need to be started very
> early to enable translations, potentially before interrupts can easily
> be set up.
> 
> This patch makes the API more flexible to allow interrupts to be
> allocated after a context has already been started and takes care of
> updating the PE structure used by the hardware and notifying it to
> discard any cached copy it may have.
> 
> The update is currently performed via a terminate/remove/add sequence.
> This is necessary on some hardware such as the XSL that does not
> properly support the update LLCMD.
> 
> Note that this is only supported on powernv at present - attempting to
> perform this ordering on PowerVM will raise a warning.
> 
> Signed-off-by: Ian Munsie 
> Reviewed-by: Frederic Barrat 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/292841b09648ce7aee5df16ab7

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: cxl: static-ify variables to fix sparse warnings

2016-06-20 Thread Michael Ellerman

On Mon, 2016-18-04 at 05:03:50 UTC, Andrew Donnellan wrote:
> Make a couple more variables static. Found by sparse.
> 
> Signed-off-by: Andrew Donnellan 
> Reviewed-by: fbar...@linux.vnet.ibm.com
> Reviewed-by: Matthew R. Ochs 
> Acked-by: Ian Munsie 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/64417a398973d964139306c0b1

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: cxl: Make vPHB device node match adapter's

2016-06-20 Thread Michael Ellerman

On Wed, 2016-15-06 at 14:42:16 UTC, Frederic Barrat wrote:
> Tested by cxlflash on bare-metal and powerVM.
> 
> Signed-off-by: Frederic Barrat 
> Reviewed-by: Matthew R. Ochs 
> Acked-by: Ian Munsie 
> Signed-off-by: Frederic Barrat 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/a430739009384ba2c4804f3a42

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC] cxl: Add support for CAPP DMA mode

2016-06-20 Thread Michael Ellerman

On Wed, 2016-08-06 at 05:09:54 UTC, Ian Munsie wrote:
> From: Ian Munsie 
> 
> This adds support for using CAPP DMA mode, which is required for XSL
> based cards such as the Mellanox CX4 to function.
> 
> This is currently an RFC as it depends on the corresponding support to
> be merged into skiboot first, which was submitted here:
> http://patchwork.ozlabs.org/patch/625582/
> 
> In the event that the skiboot on the system does not have the above
> support, it will indicate as such in the kernel log and abort the init
> process.
> 
> Signed-off-by: Ian Munsie 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/b385c9e971468eb8816b267424

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [3/4] powerpc/sparse: Include headers containing prototypes

2016-06-20 Thread Michael Ellerman

On Wed, 2016-18-05 at 01:16:51 UTC, Daniel Axtens wrote:
> Sometimes headers that provide prototypes for functions are
> accidentally omitted from the files that define the functions.
> 
> Fix a couple of places where that occurs.
> 
> Signed-off-by: Daniel Axtens 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/665e87ffe1c400c525c3a4cd6f

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: cxl: Abstract the differences between the PSL and XSL

2016-06-20 Thread Michael Ellerman

On Mon, 2016-23-05 at 17:39:18 UTC, Ian Munsie wrote:
> From: Frederic Barrat 
> 
> The XSL (Translation Service Layer) is a stripped down version of the
> PSL (Power Service Layer) used in some cards such as the Mellanox CX4.
> 
> Like the PSL, it implements the CAIA architecture, but has a number of
> differences, mostly in its implementation-dependent registers. This
> adds an ops structure to abstract these differences to bring initial
> support for XSL CAPI devices.
> 
> The XSL does not implement the optional architected SERR register,
> however while it treats it as a reserved register and should work with
> no special treatment, attempting to access it will cause the XSL_FEC
> (First Error Capture) register to be filled out, preventing it from
> capturing any subsequent errors. Therefore, this patch also prevents the
> kernel from trying to set up the SERR register so that the FEC register
> may still be useful, and to save one interrupt.
> 
> The XSL also uses a special DMA cxl mode, which uses a slightly
> different init sequence for the CAPP and PHB. The kernel support for
> this will be in a future patch once the corresponding support has been
> merged into skiboot.
> 
> Co-authored-by: Ian Munsie 
> Signed-off-by: Ian Munsie 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/6d382616ac2283ed65c7a6a52d

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [2/4] powerpc: Introduce asm-prototypes.h

2016-06-20 Thread Michael Ellerman

On Wed, 2016-18-05 at 01:16:50 UTC, Daniel Axtens wrote:
> Sparse picked up a number of functions that are implemented in C and
> then only referred to in asm code.
> 
> This introduces asm-prototypes.h, which provides a place for
> prototypes of these functions.
> 
> This silences some sparse warnings.
> 
> Signed-off-by: Daniel Axtens 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/42f5b4cacd783faf05e3ff8bf8

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [FIX,v2,2/2] powerpc,numa: Fix memory_hotplug_max()

2016-06-20 Thread Michael Ellerman

On Thu, 2016-12-05 at 13:34:15 UTC, Bharata B Rao wrote:
> memory_hotplug_max() uses hot_add_drconf_memory_max() to get maximum
> addressable memory by referring to ibm,dynamic-memory property. There
> are three problems with the current approach:
> 
> 1 hot_add_drconf_memory_max() assumes that ibm,dynamic-memory includes
>   all the LMBs of the guest, but that is not true for PowerKVM which
>   populates only DR LMBs (LMBs that can be hotplugged/removed) in that
>   property.
> 2 hot_add_drconf_memory_max() multiplies lmb-size with lmb-count to arrive
>   at the max possible address. Since ibm,dynamic-memory doesn't include
>   RMA LMBs, the address thus obtained will be less than the actual max
>   address. For example, if max possible memory size is 32G, with lmb-size
>   of 256MB there can be 127 LMBs in ibm,dynamic-memory (1 LMB for RMA
>   which won't be present here).  hot_add_drconf_memory_max() would then
>   return the max addressable memory as 127 * 256MB = 31.75GB, the max
>   address should have been 32G which is what ibm,lrdr-capacity shows.
> 3 In PowerKVM, there can be a gap between the end of boot time RAM and
>   beginning of hotplug RAM area. So just multiplying lmb-count with
>   lmb-size will not provide the correct max possible address for PowerKVM.
> 
> This patch fixes 1 by using ibm,lrdr-capacity property to return the max
> addressable memory whenever the property is present. Then it fixes 2 & 3
> by fetching the address of the last LMB in ibm,dynamic-memory property.
> 
> Signed-off-by: Bharata B Rao 
> Reviewed-by: David Gibson 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/45b64ee64970dee9392229302e

cheers
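
The arithmetic in point 2 of the changelog above can be checked with a small sketch. It assumes the layout from the example (32G maximum, 256MB LMBs, one RMA LMB at address 0 followed by 127 contiguous DR LMBs, with no gap); the function names and DEMO_MIB are made up for the illustration and are not the kernel's helpers.

```c
#include <stdint.h>

#define DEMO_MIB (1024ULL * 1024ULL)

/* Old estimate: number of LMBs listed in the property times the LMB size. */
static uint64_t demo_max_by_count(uint64_t lmb_count, uint64_t lmb_size)
{
    return lmb_count * lmb_size;
}

/* Fixed estimate: base address of the last listed LMB plus one LMB size. */
static uint64_t demo_max_by_last_lmb(uint64_t last_lmb_base, uint64_t lmb_size)
{
    return last_lmb_base + lmb_size;
}
```

With 127 LMBs of 256MB, `demo_max_by_count()` gives 32512MB (31.75GB), while the last LMB in this assumed layout starts at 127 * 256MB, so `demo_max_by_last_lmb()` gives 32768MB (32GB), matching the figures quoted in the changelog.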
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [1/4] powerpc/sparse: make some things static

2016-06-20 Thread Michael Ellerman

On Wed, 2016-18-05 at 01:16:49 UTC, Daniel Axtens wrote:
> This is just a smattering of things picked up by sparse that should
> be made static.
> 
> Signed-off-by: Daniel Axtens 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/34852ed5511ec5d07897f22d56

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [FIX, v2, 1/2] powerpc, numa: Fix whitespace in hot_add_drconf_memory_max()

2016-06-20 Thread Michael Ellerman

On Thu, 2016-12-05 at 13:34:14 UTC, Bharata B Rao wrote:
> Signed-off-by: Bharata B Rao 
> Reviewed-by: David Gibson 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e70bd3ae914ec40d8505ed842d

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [Patch v2 1/2] powerpc: Send SIGBUS on unaligned copy and paste

2016-06-20 Thread Chris Smart


On Thu, Jun 16, 2016 at 11:04:12PM -0500, Segher Boessenkool wrote:
> On Fri, Jun 17, 2016 at 09:33:45AM +1000, Chris Smart wrote:
> > +#define PPC_INST_COPY             0x7c00060c
> > +#define PPC_INST_COPY_FIRST       0x7c20060c
> >
> > +#define PPC_INST_PASTE            0x7c00070c
> > +#define PPC_INST_PASTE_LAST       0x7c20070d
>
> That's not quite right I think.



Hi Segher,

Thanks for checking that for me, it's good to make sure it's correct.

Just to be sure, I've gone back and compared them all with the ISA. I
think that the only one that differs is the paste_last. Am I missing
something?


> copy is   7c00060c mask fc2007fe (or ffe007fe)

COPY = copy RA,RB,L (L=0)

  31    ///   L   RA     RB       774      /
011111  0000  0  00000  00000  1100000110  0 = 0x7c00060c (instruction)
111111  0000  1  00000  00000  1111111111  0 = 0xfc2007fe (specific mask)


> copy_first is 7c20060c mask fc2007fe

COPY_FIRST = copy RA,RB,L (L=1)
If L=1, the instruction identifies the beginning of a move group.

  31    ///   L   RA     RB       774      /
011111  0000  1  00000  00000  1100000110  0 = 0x7c20060c (instruction)
111111  0000  1  00000  00000  1111111111  0 = 0xfc2007fe (specific mask)


> paste is  7c00070c mask fc2007fe

PASTE = paste RA,RB,L (L=0 Rc=0)

  31    ///   L   RA     RB       902      Rc
011111  0000  0  00000  00000  1110000110  0 = 0x7c00070c (instruction)
111111  0000  1  00000  00000  1111111111  1 = 0xfc2007ff (specific mask)


> paste_last is 7c20070c mask fc2007fe

PASTE_LAST = paste. RA,RB,L (L=1 Rc=1)
If L=1, the instruction identifies the end of a move group.
If L≠Rc, the instruction form is invalid.

  31    ///   L   RA     RB       902      Rc
011111  0000  1  00000  00000  1110000110  1 = 0x7c20070d (instruction)
111111  0000  1  00000  00000  1111111111  1 = 0xfc2007ff (specific mask)

> (this includes record form for paste; the low bit).



To make the test simple I use a combined copy, copy_first, paste and
paste_last mask to compare just against copy. So that excluded:
- L
- bit 24 of 32
- Rc
111111  0000  0  00000  00000  1101111111  0 = 0xfc0006fe

Would it be better and more clear to check each instruction with its
mask? Something like:

#define PPC_INST_COPY_MASK      0xfc2007fe
#define PPC_INST_PASTE_MASK     0xfc2007ff

if (cpu_has_feature(CPU_FTR_ARCH_300)) {
        unsigned int masked_instruction = instruction & PPC_INST_COPY_MASK;

        if (masked_instruction == PPC_INST_COPY ||
            masked_instruction == PPC_INST_COPY_FIRST)
                return -EIO;

        masked_instruction = instruction & PPC_INST_PASTE_MASK;

        if (masked_instruction == PPC_INST_PASTE ||
            masked_instruction == PPC_INST_PASTE_LAST)
                return -EIO;
}

Thanks!
-c
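
The per-instruction mask check proposed above can be tried out in userspace with a sketch like the following. The opcode and mask values are the ones quoted in this message; the helper name demo_is_copy_paste() is hypothetical, and the sketch returns a bool instead of -EIO and leaves out the cpu_has_feature() test, so it is not the kernel code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Encodings and masks as quoted in the message above. */
#define PPC_INST_COPY        0x7c00060cu
#define PPC_INST_COPY_FIRST  0x7c20060cu
#define PPC_INST_PASTE       0x7c00070cu
#define PPC_INST_PASTE_LAST  0x7c20070du
#define PPC_INST_COPY_MASK   0xfc2007feu
#define PPC_INST_PASTE_MASK  0xfc2007ffu

/*
 * Hypothetical helper: true if the 32-bit word encodes copy, copy_first,
 * paste or paste_last, whatever the RA and RB register fields hold.
 */
static bool demo_is_copy_paste(uint32_t instruction)
{
    /* The copy mask keeps the opcode, L and the extended opcode. */
    uint32_t masked = instruction & PPC_INST_COPY_MASK;

    if (masked == PPC_INST_COPY || masked == PPC_INST_COPY_FIRST)
        return true;

    /* The paste mask additionally keeps the Rc bit. */
    masked = instruction & PPC_INST_PASTE_MASK;

    return masked == PPC_INST_PASTE || masked == PPC_INST_PASTE_LAST;
}
```

Because the paste mask keeps Rc, a word with L=1 and Rc=0 (0x7c20070c) matches neither paste form, which is consistent with the "L≠Rc is invalid" note above.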
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [6/6] ppc: ebpf/jit: Implement JIT compiler for extended BPF

2016-06-20 Thread Michael Ellerman

On Sun, 2016-06-19 at 23:06 +0530, Naveen N. Rao wrote:
> On 2016/06/17 10:53PM, Michael Ellerman wrote:
> > On Tue, 2016-07-06 at 13:32:23 UTC, "Naveen N. Rao" wrote:
> > > diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> > > new file mode 100644
> > > index 0000000..954ff53
> > > --- /dev/null
> > > +++ b/arch/powerpc/net/bpf_jit_comp64.c
> > > @@ -0,0 +1,956 @@
> > ...

> > > +
> > > +static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
> > > +{
> > > + int *p = area;
> > > +
> > > + /* Fill whole space with trap instructions */
> > > + while (p < (int *)((char *)area + size))
> > > + *p++ = BREAKPOINT_INSTRUCTION;
> > > +}
> > 
> > This breaks the build for some configs, presumably you're missing a header:
> > 
> >   arch/powerpc/net/bpf_jit_comp64.c:30:10: error: 'BREAKPOINT_INSTRUCTION' undeclared (first use in this function)
> > 
> > http://kisskb.ellerman.id.au/kisskb/buildresult/12720611/
> 
> Oops. Yes, I should have caught that. I need to add:
> 
> #include 
> 
> in bpf_jit_comp64.c
> 
> Can you please check if it resolves the build error?

Can you? :D

cheers
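
As an aside, the fill loop quoted above is easy to try in userspace. The sketch below is an approximation under stated assumptions: DEMO_TRAP_INSN stands in for BREAKPOINT_INSTRUCTION and uses 0x7fe00008, the encoding of the powerpc `trap` instruction (tw 31,0,0); the function name is made up; and, unlike the quoted function, it rounds the size down to whole 32-bit words so it never writes past the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for BREAKPOINT_INSTRUCTION: powerpc "trap". */
#define DEMO_TRAP_INSN 0x7fe00008u

/* Fill every complete 32-bit word of the area with the trap encoding. */
static void demo_fill_trap_insns(void *area, size_t size)
{
    uint32_t *p = area;
    size_t words = size / sizeof(uint32_t);   /* ignore a trailing partial word */

    for (size_t i = 0; i < words; i++)
        p[i] = DEMO_TRAP_INSN;
}
```

Filling unused JIT space with traps means a stray jump into it faults immediately instead of executing leftover bytes.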

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] leds: Add no-op gpio_led_register_device when LED subsystem is disabled

2016-06-20 Thread Andrew F. Davis

Some systems use 'gpio_led_register_device' to make an in-memory copy of
their LED device table so the original can be removed as .init.rodata.
When the LED subsystem is not enabled, the source in the led directory is
not built, so this function may be undefined. Fix this by providing a
no-op stub.

Signed-off-by: Andrew F. Davis 
---
 include/linux/leds.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/linux/leds.h b/include/linux/leds.h
index d2b1306..a4a3da6 100644
--- a/include/linux/leds.h
+++ b/include/linux/leds.h
@@ -386,8 +386,16 @@ struct gpio_led_platform_data {
unsigned long *delay_off);
 };

+#ifdef CONFIG_NEW_LEDS
 struct platform_device *gpio_led_register_device(
int id, const struct gpio_led_platform_data *pdata);
+#else
+static inline struct platform_device *gpio_led_register_device(
+   int id, const struct gpio_led_platform_data *pdata)
+{
+   return 0;
+}
+#endif

 enum cpu_led_event {
CPU_LED_IDLE_START, /* CPU enters idle */
-- 
2.9.0
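
For anyone unfamiliar with the pattern used in this patch, here is a minimal standalone sketch of a config-guarded stub. All names in it (CONFIG_DEMO_FEATURE, struct demo_device, demo_register_device) are invented for the illustration, and it spells the null return as NULL where the patch above writes 0.

```c
#include <stddef.h>

struct demo_device;   /* opaque type, defined only by the real implementation */

#ifdef CONFIG_DEMO_FEATURE
/* Real implementation, built only when the feature is enabled. */
struct demo_device *demo_register_device(int id);
#else
/* No-op stub so that callers still compile and link without the feature. */
static inline struct demo_device *demo_register_device(int id)
{
    (void)id;   /* unused in the stub */
    return NULL;
}
#endif
```

Callers then need no #ifdef of their own; with the feature disabled the call collapses to a null return that they can test for.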
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] ppc: Fix BPF JIT for ABIv2

2016-06-20 Thread Thadeu Lima de Souza Cascardo

On Sun, Jun 19, 2016 at 11:19:14PM +0530, Naveen N. Rao wrote:
> On 2016/06/17 10:00AM, Thadeu Lima de Souza Cascardo wrote:
> > On Fri, Jun 17, 2016 at 10:53:21PM +1000, Michael Ellerman wrote:
> > > On Tue, 2016-07-06 at 13:32:23 UTC, "Naveen N. Rao" wrote:
> > > > diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> > > > new file mode 100644
> > > > index 0000000..954ff53
> > > > --- /dev/null
> > > > +++ b/arch/powerpc/net/bpf_jit_comp64.c
> > > > @@ -0,0 +1,956 @@
> > > ...
> > > > +
> > > > +static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
> > > > +{
> > > > +   int *p = area;
> > > > +
> > > > +   /* Fill whole space with trap instructions */
> > > > +   while (p < (int *)((char *)area + size))
> > > > +   *p++ = BREAKPOINT_INSTRUCTION;
> > > > +}
> > > 
> > > This breaks the build for some configs, presumably you're missing a header:
> > > 
> > >   arch/powerpc/net/bpf_jit_comp64.c:30:10: error: 'BREAKPOINT_INSTRUCTION' undeclared (first use in this function)
> > > 
> > > http://kisskb.ellerman.id.au/kisskb/buildresult/12720611/
> > > 
> > > cheers
> > 
> > Hi, Michael and Naveen.
> > 
> > I noticed independently that there is a problem with BPF JIT and ABIv2, and
> > worked out the patch below before I noticed Naveen's patchset and the latest
> > changes in ppc tree for a better way to check for ABI versions.
> > 
> > However, since the issue described below affects mainline and stable kernels,
> > would you consider applying it before merging your two patchsets, so that we
> > can more easily backport the fix?
> 
> Hi Cascardo,
> Given that this has been broken on ABIv2 since forever, I didn't bother 
> fixing it. But, I can see why this would be a good thing to have for 
> -stable and existing distros. However, while your patch below may fix 
> the crash you're seeing on ppc64le, it is not sufficient -- you'll need 
> changes in bpf_jit_asm.S as well.

Hi, Naveen.

Any tips on how to exercise possible issues there? Or what changes you think
would be sufficient?

I will see what I can find by myself, but would appreciate any help.

Regards.
Cascardo.

> 
> Regards,
> Naveen
> 
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 3/3] cxlflash: Shutdown notify support for CXL Flash cards

2016-06-20 Thread Matthew R. Ochs

> On Jun 15, 2016, at 6:49 PM, Uma Krishnan  wrote:
> 
> Some CXL Flash cards need notification of device shutdown
> in order to flush pending I/Os.
> 
> A PCI notification hook for shutdown has been added where
> the driver notifies the card and returns. When the device
> is removed in the PCI remove path, notification code will
> wait for shutdown processing to complete.
> 
> Signed-off-by: Uma Krishnan 

Acked-by: Matthew R. Ochs 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 2/3] cxlflash: Add device dependent flags

2016-06-20 Thread Matthew R. Ochs

> On Jun 15, 2016, at 6:49 PM, Uma Krishnan  wrote:
> 
> Device dependent flags are needed to support functions that are
> specific to a particular device.
> 
> One such case is - some CXL Flash cards need to be notified of
> device shutdown. For other CXL devices, this feature does not prove
> to be useful yet. Such distinct features need to be identified in
> the driver to bypass or invoke specific functionality.
> 
> In this patch, a member 'flags' has been added to device dependent
> values. These flags will be used and expanded in the future to
> support various device specific functions.
> 
> Signed-off-by: Uma Krishnan 

Acked-by: Matthew R. Ochs 


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2 2/9] kexec_file: Generalize kexec_add_buffer.

2016-06-20 Thread Thiago Jung Bauermann

On Monday, 20 June 2016, 10:26:05, Dave Young wrote:
> kexec_buf should go within #ifdef for kexec file like struct
> purgatory_info
> 
> Other than that it looks good.

Great! Here it is.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


kexec_file: Generalize kexec_add_buffer.

Allow architectures to specify different memory walking functions for
kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
but PowerPC uses the memblock subsystem.

Signed-off-by: Thiago Jung Bauermann 
Cc: Eric Biederman 
Cc: Dave Young 
Cc: ke...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index e8acb2b43dd9..3d91bcfc180d 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -146,7 +146,24 @@ struct kexec_file_ops {
kexec_verify_sig_t *verify_sig;
 #endif
 };
-#endif
+
+/*
+ * Keeps track of buffer parameters as provided by caller for requesting
+ * memory placement of buffer.
+ */
+struct kexec_buf {
+   struct kimage *image;
+   unsigned long mem;
+   unsigned long memsz;
+   unsigned long buf_align;
+   unsigned long buf_min;
+   unsigned long buf_max;
+   bool top_down;  /* allocate from top of memory hole */
+};
+
+int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
+  int (*func)(u64, u64, void *));
+#endif /* CONFIG_KEXEC_FILE */
 
 struct kimage {
kimage_entry_t head;
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index b6eec7527e9f..b1f1f6402518 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -428,6 +428,27 @@ static int locate_mem_hole_callback(u64 start, u64 end, void *arg)
return locate_mem_hole_bottom_up(start, end, kbuf);
 }
 
+/**
+ * arch_kexec_walk_mem - call func(data) on free memory regions
+ * @kbuf:  Context info for the search. Also passed to @func.
+ * @func:  Function to call for each memory region.
+ *
+ * Return: The memory walk will stop when func returns a non-zero value
+ * and that value will be returned. If all free regions are visited without
+ * func returning non-zero, then zero will be returned.
+ */
+int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
+  int (*func)(u64, u64, void *))
+{
+   if (kbuf->image->type == KEXEC_TYPE_CRASH)
+   return walk_iomem_res_desc(crashk_res.desc,
+  IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+  crashk_res.start, crashk_res.end,
+  kbuf, func);
+   else
+   return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
+}
+
 /*
  * Helper function for placing a buffer in a kexec segment. This assumes
  * that kexec_mutex is held.
@@ -472,14 +493,7 @@ int kexec_add_buffer(struct kimage *image, char *buffer, unsigned long bufsz,
kbuf->top_down = top_down;
 
/* Walk the RAM ranges and allocate a suitable range for the buffer */
-   if (image->type == KEXEC_TYPE_CRASH)
-   ret = walk_iomem_res_desc(crashk_res.desc,
-   IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
-   crashk_res.start, crashk_res.end, kbuf,
-   locate_mem_hole_callback);
-   else
-   ret = walk_system_ram_res(0, -1, kbuf,
- locate_mem_hole_callback);
+   ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
if (ret != 1) {
/* A suitable memory range could not be found for buffer */
return -EADDRNOTAVAIL;
diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h
index eefd5bf960c2..4cef7e4706b0 100644
--- a/kernel/kexec_internal.h
+++ b/kernel/kexec_internal.h
@@ -20,20 +20,6 @@ struct kexec_sha_region {
unsigned long len;
 };
 
-/*
- * Keeps track of buffer parameters as provided by caller for requesting
- * memory placement of buffer.
- */
-struct kexec_buf {
-   struct kimage *image;
-   unsigned long mem;
-   unsigned long memsz;
-   unsigned long buf_align;
-   unsigned long buf_min;
-   unsigned long buf_max;
-   bool top_down;  /* allocate from top of memory hole */
-};
-
 void kimage_file_post_load_cleanup(struct kimage *image);
 #else /* CONFIG_KEXEC_FILE */
 static inline void kimage_file_post_load_cleanup(struct kimage *image) { }
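
The walk semantics described in the arch_kexec_walk_mem() kernel-doc above
can be sketched in user space as follows. This is only an illustration:
struct region, walk_regions(), struct hole_search and find_hole() are
hypothetical names, not kernel code. The callback returns 1 on success
because kexec_add_buffer() checks for "ret != 1".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a free memory region; not a kernel type. */
struct region {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Same shape as the callback passed to arch_kexec_walk_mem() in the patch. */
typedef int (*walk_fn)(uint64_t start, uint64_t end, void *arg);

/*
 * Call func on each region in order. Stop at the first non-zero return
 * and hand that value back; return 0 if every region was visited.
 */
int walk_regions(const struct region *r, size_t n, walk_fn func, void *arg)
{
	assert(func != NULL);

	for (size_t i = 0; i < n; i++) {
		int ret = func(r[i].start, r[i].end, arg);

		if (ret)
			return ret;
	}
	return 0;
}

/* Hypothetical search context: find the first region holding 'size' bytes. */
struct hole_search {
	uint64_t size;
	uint64_t found_at;
};

/* Returns 1 when a region fits, which stops the walk; 0 keeps walking. */
int find_hole(uint64_t start, uint64_t end, void *arg)
{
	struct hole_search *s = arg;

	if (end - start + 1 < s->size)
		return 0;
	s->found_at = start;
	return 1;
}
```

A caller would pass find_hole and a struct hole_search to walk_regions()
and treat any return value other than 1 as "no suitable range found".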


Re: [PATCH] ibmvnic: fix to use list_for_each_safe() when delete items

2016-06-20 Thread Thomas Falcon

On 06/17/2016 09:53 PM, weiyj...@163.com wrote:
> From: Wei Yongjun 
>
> Since we will remove items off the list using list_del() we need
> to use a safe version of the list_for_each() macro aptly named
> list_for_each_safe().
>
> Signed-off-by: Wei Yongjun 
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
> b/drivers/net/ethernet/ibm/ibmvnic.c
> index 864cb21..0b6a922 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -3141,14 +3141,14 @@ static void handle_request_ras_comp_num_rsp(union 
> ibmvnic_crq *crq,
>  
>  static void ibmvnic_free_inflight(struct ibmvnic_adapter *adapter)
>  {
> - struct ibmvnic_inflight_cmd *inflight_cmd;
> + struct ibmvnic_inflight_cmd *inflight_cmd, *tmp1;
>   struct device *dev = &adapter->vdev->dev;
> - struct ibmvnic_error_buff *error_buff;
> + struct ibmvnic_error_buff *error_buff, *tmp2;
>   unsigned long flags;
>   unsigned long flags2;
>  
>   spin_lock_irqsave(&adapter->inflight_lock, flags);
> - list_for_each_entry(inflight_cmd, &adapter->inflight, list) {
> + list_for_each_entry_safe(inflight_cmd, tmp1, &adapter->inflight, list) {
>   switch (inflight_cmd->crq.generic.cmd) {
>   case LOGIN:
>   dma_unmap_single(dev, adapter->login_buf_token,
> @@ -3165,8 +3165,8 @@ static void ibmvnic_free_inflight(struct 
> ibmvnic_adapter *adapter)
>   break;
>   case REQUEST_ERROR_INFO:
>   spin_lock_irqsave(&adapter->error_list_lock, flags2);
> - list_for_each_entry(error_buff, &adapter->errors,
> - list) {
> + list_for_each_entry_safe(error_buff, tmp2,
> +  &adapter->errors, list) {
>   dma_unmap_single(dev, error_buff->dma,
>error_buff->len,
>DMA_FROM_DEVICE);
>
Thanks!

Acked-by: Thomas Falcon 
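
For readers wondering why the _safe variant is needed, here is a minimal
user-space sketch of the same idea. It uses a hand-rolled singly linked
list rather than the kernel's struct list_head, and struct node, push()
and free_all() are hypothetical names. The point is only that the
successor must be saved before the current entry is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type; the driver uses struct list_head instead. */
struct node {
	int cmd;
	struct node *next;
};

/* Push a new node on the front of the list; returns the new head. */
struct node *push(struct node *head, int cmd)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->cmd = cmd;
	n->next = head;
	return n;
}

/*
 * Free every node and return how many were freed. The successor is
 * saved in 'tmp' before the current node is freed, which is the job the
 * extra cursor does in list_for_each_entry_safe(). Reading cur->next
 * after free(cur) would be a use-after-free.
 */
int free_all(struct node **head)
{
	struct node *cur, *tmp;
	int freed = 0;

	assert(head != NULL);

	for (cur = *head; cur; cur = tmp) {
		tmp = cur->next;	/* grab the next pointer first */
		free(cur);
		freed++;
	}
	*head = NULL;
	return freed;
}
```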


Re: [PATCH 3/4] kvm/stats: Add provisioning for 64-bit vcpu statistics

2016-06-20 Thread Paolo Bonzini



On 20/06/2016 02:08, Paul Mackerras wrote:
> Paolo,
> 
> Can I have an ack for Suraj's patch below?  If it's OK with you,
> I'll take his series through my tree.

Yes, please do.

Paolo

> Thanks,
> Paul.
> 
> On Wed, Jun 15, 2016 at 07:21:07PM +1000, Suraj Jitindar Singh wrote:
>> vcpus have statistics associated with them which can be viewed within the
>> debugfs. Currently it is assumed within the vcpu_stat_get() and
>> vcpu_stat_get_per_vm() functions that all of these statistics are
>> represented as 32-bit numbers. The next patch adds some 64-bit statistics,
>> so add provisioning for the display of 64-bit vcpu statistics.
>>
>> Signed-off-by: Suraj Jitindar Singh 
>> ---
>>  arch/powerpc/kvm/book3s.c |  1 +
>>  include/linux/kvm_host.h  |  1 +
>>  virt/kvm/kvm_main.c   | 60 
>> +++
>>  3 files changed, 58 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
>> index 47018fc..ed9132b 100644
>> --- a/arch/powerpc/kvm/book3s.c
>> +++ b/arch/powerpc/kvm/book3s.c
>> @@ -40,6 +40,7 @@
>>  #include "trace.h"
>>  
>>  #define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
>> +#define VCPU_STAT_U64(x) offsetof(struct kvm_vcpu, stat.x), 
>> KVM_STAT_VCPU_U64
>>  
>>  /* #define EXIT_DEBUG */
>>  
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 1c9c973..667b30e 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -991,6 +991,7 @@ static inline bool kvm_is_error_gpa(struct kvm *kvm, 
>> gpa_t gpa)
>>  enum kvm_stat_kind {
>>  KVM_STAT_VM,
>>  KVM_STAT_VCPU,
>> +KVM_STAT_VCPU_U64,
>>  };
>>  
>>  struct kvm_stat_data {
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 02e98f3..ac47ffb 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -3566,6 +3566,20 @@ static int vcpu_stat_get_per_vm(void *data, u64 *val)
>>  return 0;
>>  }
>>  
>> +static int vcpu_stat_u64_get_per_vm(void *data, u64 *val)
>> +{
>> +int i;
>> +struct kvm_stat_data *stat_data = (struct kvm_stat_data *)data;
>> +struct kvm_vcpu *vcpu;
>> +
>> +*val = 0;
>> +
>> +kvm_for_each_vcpu(i, vcpu, stat_data->kvm)
>> +*val += *(u64 *)((void *)vcpu + stat_data->offset);
>> +
>> +return 0;
>> +}
>> +
>>  static int vcpu_stat_get_per_vm_open(struct inode *inode, struct file *file)
>>  {
>>  __simple_attr_check_format("%llu\n", 0ull);
>> @@ -3573,6 +3587,13 @@ static int vcpu_stat_get_per_vm_open(struct inode 
>> *inode, struct file *file)
>>   NULL, "%llu\n");
>>  }
>>  
>> +static int vcpu_stat_u64_get_per_vm_open(struct inode *inode, struct file 
>> *file)
>> +{
>> +__simple_attr_check_format("%llu\n", 0ull);
>> +return kvm_debugfs_open(inode, file, vcpu_stat_u64_get_per_vm,
>> + NULL, "%llu\n");
>> +}
>> +
>>  static const struct file_operations vcpu_stat_get_per_vm_fops = {
>>  .owner   = THIS_MODULE,
>>  .open= vcpu_stat_get_per_vm_open,
>> @@ -3582,9 +3603,19 @@ static const struct file_operations 
>> vcpu_stat_get_per_vm_fops = {
>>  .llseek  = generic_file_llseek,
>>  };
>>  
>> +static const struct file_operations vcpu_stat_u64_get_per_vm_fops = {
>> +.owner   = THIS_MODULE,
>> +.open= vcpu_stat_u64_get_per_vm_open,
>> +.release = kvm_debugfs_release,
>> +.read= simple_attr_read,
>> +.write   = simple_attr_write,
>> +.llseek  = generic_file_llseek,
>> +};
>> +
>>  static const struct file_operations *stat_fops_per_vm[] = {
>> -[KVM_STAT_VCPU] = &vcpu_stat_get_per_vm_fops,
>> -[KVM_STAT_VM]   = &vm_stat_get_per_vm_fops,
>> +[KVM_STAT_VCPU] = &vcpu_stat_get_per_vm_fops,
>> +[KVM_STAT_VCPU_U64] = &vcpu_stat_u64_get_per_vm_fops,
>> +[KVM_STAT_VM]   = &vm_stat_get_per_vm_fops,
>>  };
>>  
>>  static int vm_stat_get(void *_offset, u64 *val)
>> @@ -3627,9 +3658,30 @@ static int vcpu_stat_get(void *_offset, u64 *val)
>>  
>>  DEFINE_SIMPLE_ATTRIBUTE(vcpu_stat_fops, vcpu_stat_get, NULL, "%llu\n");
>>  
>> +static int vcpu_stat_u64_get(void *_offset, u64 *val)
>> +{
>> +unsigned offset = (long)_offset;
>> +struct kvm *kvm;
>> +struct kvm_stat_data stat_tmp = {.offset = offset};
>> +u64 tmp_val;
>> +
>> +*val = 0;
>> +spin_lock(&kvm_lock);
>> +list_for_each_entry(kvm, &vm_list, vm_list) {
>> +stat_tmp.kvm = kvm;
>> +vcpu_stat_u64_get_per_vm((void *)&stat_tmp, &tmp_val);
>> +*val += tmp_val;
>> +}
>> +spin_unlock(&kvm_lock);
>> +return 0;
>> +}
>> +
>> +DEFINE_SIMPLE_ATTRIBUTE(vcpu_stat_u64_fops, vcpu_stat_u64_get, NULL, 
>> "%llu\n");
>> +
>>  static const struct file_operations *stat_fops[] = {
>> -[KVM_STAT_VCPU] = &vcpu_stat_fops,
>> -[KVM_STAT_VM]   = &vm_stat_fops,
>> +[KVM_STAT_VCPU] = &vcpu_stat_fops,
>> +[KVM_STAT_VCPU_U64] = &vcpu_stat_u64_fops,
>> +[KVM_STAT_VM]
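
The offset-based accessors in the quoted patch can be illustrated with a
small user-space sketch. It assumes hypothetical struct vcpu and
struct vcpu_stats layouts (the field names exits and wait_ns are made up
for the example) and shows why a counter has to be read at its real width:
a 64-bit field needs a 64-bit load at its offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vcpu statistics; field names are illustrative only. */
struct vcpu_stats {
	uint32_t exits;
	uint64_t wait_ns;
};

struct vcpu {
	int id;
	struct vcpu_stats stat;
};

/* Sum a 32-bit counter located 'offset' bytes into each vcpu. */
uint64_t sum_stat_u32(const struct vcpu *vcpus, size_t n, size_t offset)
{
	uint64_t total = 0;

	assert(offset + sizeof(uint32_t) <= sizeof(struct vcpu));
	for (size_t i = 0; i < n; i++)
		total += *(const uint32_t *)((const char *)&vcpus[i] + offset);
	return total;
}

/* Same walk, but the counter at 'offset' is read as a 64-bit value. */
uint64_t sum_stat_u64(const struct vcpu *vcpus, size_t n, size_t offset)
{
	uint64_t total = 0;

	assert(offset + sizeof(uint64_t) <= sizeof(struct vcpu));
	for (size_t i = 0; i < n; i++)
		total += *(const uint64_t *)((const char *)&vcpus[i] + offset);
	return total;
}
```

The offsets would come from offsetof(struct vcpu, stat.wait_ns) and
offsetof(struct vcpu, stat.exits), in the same spirit as the VCPU_STAT()
and VCPU_STAT_U64() macros in the patch.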

[PATCH 2/2] powerpc/pseries: Dynamic add entries to associativity lookup array

2016-06-20 Thread Nathan Fontenot

Dynamically add entries to the associativity lookup array

The ibm,associativity-lookup-arrays property may only contain
associativity arrays for LMBs present at boot time. When hotplug
adding an LMB, its associativity array may not be in the associativity
lookup array. This patch adds the ability to add new entries to the
associativity lookup array.

Signed-off-by: Nathan Fontenot 
---
 arch/powerpc/platforms/pseries/hotplug-memory.c |   93 ---
 1 file changed, 66 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index b10f2ef..f62eef652 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -191,14 +191,72 @@ static int dlpar_update_device_tree_lmb(struct 
of_drconf_cell *lmb)
return 0;
 }
 
+static u32 find_aa_index(struct device_node *dr_node,
+struct property *ala_prop, const u32 *lmb_assoc)
+{
+   u32 *assoc_arrays;
+   u32 aa_index;
+   int aa_arrays, aa_array_entries, aa_array_sz;
+   int i, index;
+
+   /* The ibm,associativity-lookup-arrays property is defined to be
+* a 32-bit value specifying the number of associativity arrays
+* followed by a 32-bit value specifying the number of entries per
+* array, followed by the associativity arrays.
+*/
+   assoc_arrays = ala_prop->value;
+
+   aa_arrays = be32_to_cpu(assoc_arrays[0]);
+   aa_array_entries = be32_to_cpu(assoc_arrays[1]);
+   aa_array_sz = aa_array_entries * sizeof(u32);
+
+   aa_index = -1;
+   for (i = 0; i < aa_arrays; i++) {
+   index = (i * aa_array_entries) + 2;
+
+   if (memcmp(&assoc_arrays[index], &lmb_assoc[1], aa_array_sz))
+   continue;
+
+   aa_index = i;
+   break;
+   }
+
+   if (aa_index == -1) {
+   struct property *new_prop;
+   u32 new_prop_size;
+
+   new_prop_size = ala_prop->length + aa_array_sz;
+   new_prop = dlpar_clone_property(ala_prop, new_prop_size);
+   if (!new_prop)
+   return -1;
+
+   assoc_arrays = new_prop->value;
+
+   /* increment the number of entries in the lookup array */
+   assoc_arrays[0] = cpu_to_be32(aa_arrays + 1);
+
+   /* copy the new associativity into the lookup array */
+   index = aa_arrays * aa_array_entries + 2;
+   memcpy(&assoc_arrays[index], &lmb_assoc[1], aa_array_sz);
+
+   of_update_property(dr_node, new_prop);
+
+   /* The associativity lookup array index for this lmb is
+* number of entries - 1 since we added its associativity
+* to the end of the lookup array.
+*/
+   aa_index = be32_to_cpu(assoc_arrays[0]) - 1;
+   }
+
+   return aa_index;
+}
+
 static u32 lookup_lmb_associativity_index(struct of_drconf_cell *lmb)
 {
struct device_node *parent, *lmb_node, *dr_node;
+   struct property *ala_prop;
const u32 *lmb_assoc;
-   const u32 *assoc_arrays;
u32 aa_index;
-   int aa_arrays, aa_array_entries, aa_array_sz;
-   int i;
 
parent = of_find_node_by_path("/");
if (!parent)
@@ -222,34 +280,15 @@ static u32 lookup_lmb_associativity_index(struct 
of_drconf_cell *lmb)
return -ENODEV;
}
 
-   assoc_arrays = of_get_property(dr_node,
-  "ibm,associativity-lookup-arrays",
-  NULL);
-   of_node_put(dr_node);
-   if (!assoc_arrays) {
+   ala_prop = of_find_property(dr_node, "ibm,associativity-lookup-arrays",
+   NULL);
+   if (!ala_prop) {
+   of_node_put(dr_node);
dlpar_free_cc_nodes(lmb_node);
return -ENODEV;
}
 
-   /* The ibm,associativity-lookup-arrays property is defined to be
-* a 32-bit value specifying the number of associativity arrays
-* followed by a 32-bitvalue specifying the number of entries per
-* array, followed by the associativity arrays.
-*/
-   aa_arrays = be32_to_cpu(assoc_arrays[0]);
-   aa_array_entries = be32_to_cpu(assoc_arrays[1]);
-   aa_array_sz = aa_array_entries * sizeof(u32);
-
-   aa_index = -1;
-   for (i = 0; i < aa_arrays; i++) {
-   int indx = (i * aa_array_entries) + 2;
-
-   if (memcmp(&assoc_arrays[indx], &lmb_assoc[1], aa_array_sz))
-   continue;
-
-   aa_index = i;
-   break;
-   }
+   aa_index = find_aa_index(dr_node, ala_prop, lmb_assoc);
 
dlpar_free_cc_nodes(lmb_node);
return aa_index;
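
The lookup-or-append logic of find_aa_index() can be sketched outside the
kernel roughly as below. Everything here is an assumption for
illustration: the buffer is a plain malloc'd array of big-endian cells
laid out as described in the comment in the patch (array count, entries
per array, then the arrays), *len is assumed to match that layout exactly,
and be32_to_host(), host_to_be32() and find_or_add_aa_index() are
hypothetical helpers, not device-tree APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Portable big-endian to host conversion; stands in for be32_to_cpu(). */
uint32_t be32_to_host(uint32_t be)
{
	const unsigned char *p = (const unsigned char *)&be;

	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Host to big-endian conversion; stands in for cpu_to_be32(). */
uint32_t host_to_be32(uint32_t v)
{
	uint32_t be;
	unsigned char *p = (unsigned char *)&be;

	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
	return be;
}

/*
 * Return the index of 'assoc' in the lookup array, appending it if it is
 * absent. *prop is a malloc'd buffer of *len bytes and may be moved by
 * realloc(). Returns -1 if the buffer cannot be grown.
 */
int find_or_add_aa_index(uint32_t **prop, size_t *len, const uint32_t *assoc)
{
	uint32_t *cells, arrays, entries;
	size_t array_sz;

	assert(prop && *prop && len && assoc);

	cells = *prop;
	arrays = be32_to_host(cells[0]);
	entries = be32_to_host(cells[1]);
	array_sz = entries * sizeof(uint32_t);

	for (uint32_t i = 0; i < arrays; i++) {
		if (!memcmp(&cells[2 + i * entries], assoc, array_sz))
			return (int)i;
	}

	/* Not found: grow by one array and append the new associativity. */
	cells = realloc(cells, *len + array_sz);
	if (!cells)
		return -1;
	memcpy(&cells[2 + arrays * entries], assoc, array_sz);
	cells[0] = host_to_be32(arrays + 1);
	*prop = cells;
	*len += array_sz;
	return (int)arrays;	/* index of the array just appended */
}
```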


[PATCH 1/2] powerpc/pseries: Move property cloning into its own routine

2016-06-20 Thread Nathan Fontenot

Move property cloning code into its own routine

Split the pieces of dlpar_clone_drconf_property() that create a copy of
the property struct into its own routine. This allows for creating
clones of more than just the ibm,dynamic-memory property used in memory
hotplug.

Signed-off-by: Nathan Fontenot 
---
 arch/powerpc/platforms/pseries/hotplug-memory.c |   38 ---
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 03f6169..b10f2ef 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -69,13 +69,36 @@ unsigned long pseries_memory_block_size(void)
return memblock_size;
 }
 
-static void dlpar_free_drconf_property(struct property *prop)
+static void dlpar_free_property(struct property *prop)
 {
kfree(prop->name);
kfree(prop->value);
kfree(prop);
 }
 
+static struct property *dlpar_clone_property(struct property *prop,
+u32 prop_size)
+{
+   struct property *new_prop;
+
+   new_prop = kzalloc(sizeof(*new_prop), GFP_KERNEL);
+   if (!new_prop)
+   return NULL;
+
+   new_prop->name = kstrdup(prop->name, GFP_KERNEL);
+   new_prop->value = kzalloc(prop_size, GFP_KERNEL);
+   if (!new_prop->name || !new_prop->value) {
+   dlpar_free_property(new_prop);
+   return NULL;
+   }
+
+   memcpy(new_prop->value, prop->value, prop->length);
+   new_prop->length = prop_size;
+
+   of_property_set_flag(new_prop, OF_DYNAMIC);
+   return new_prop;
+}
+
 static struct property *dlpar_clone_drconf_property(struct device_node *dn)
 {
struct property *prop, *new_prop;
@@ -87,19 +110,10 @@ static struct property *dlpar_clone_drconf_property(struct 
device_node *dn)
if (!prop)
return NULL;
 
-   new_prop = kzalloc(sizeof(*new_prop), GFP_KERNEL);
+   new_prop = dlpar_clone_property(prop, prop->length);
if (!new_prop)
return NULL;
 
-   new_prop->name = kstrdup(prop->name, GFP_KERNEL);
-   new_prop->value = kmemdup(prop->value, prop->length, GFP_KERNEL);
-   if (!new_prop->name || !new_prop->value) {
-   dlpar_free_drconf_property(new_prop);
-   return NULL;
-   }
-
-   new_prop->length = prop->length;
-
/* Convert the property to cpu endian-ness */
p = new_prop->value;
*p = be32_to_cpu(*p);
@@ -718,7 +732,7 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
break;
}
 
-   dlpar_free_drconf_property(prop);
+   dlpar_free_property(prop);
 
 dlpar_memory_out:
of_node_put(dn);
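
The clone-with-a-new-size idea behind dlpar_clone_property() can be
sketched in user space as follows. struct prop, prop_free() and
prop_clone() are hypothetical stand-ins for the kernel types and helpers.
The check that the new size is not smaller than the original length is an
addition of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for the kernel's struct property. */
struct prop {
	char *name;
	size_t length;
	void *value;
};

/* Release a property and the buffers it owns; NULL is accepted. */
void prop_free(struct prop *p)
{
	if (!p)
		return;
	free(p->name);
	free(p->value);
	free(p);
}

/*
 * Copy 'src' into a new property whose value buffer is 'new_len' bytes.
 * Bytes past the original length are zero. Returns NULL on allocation
 * failure or if new_len is smaller than the original value.
 */
struct prop *prop_clone(const struct prop *src, size_t new_len)
{
	struct prop *p;

	assert(src && src->name);

	if (new_len < src->length)
		return NULL;

	p = calloc(1, sizeof(*p));
	if (!p)
		return NULL;

	p->name = malloc(strlen(src->name) + 1);
	p->value = calloc(1, new_len ? new_len : 1);
	if (!p->name || !p->value) {
		prop_free(p);
		return NULL;
	}

	strcpy(p->name, src->name);
	memcpy(p->value, src->value, src->length);
	p->length = new_len;
	return p;
}
```

Passing the original length gives a plain copy. Passing a larger size
leaves zeroed room at the end, which is how the second patch in this
series appends one more associativity array.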


[PATCH 0/2] powerpc/pseries: Dynamic associativity-lookup-arrays updating

2016-06-20 Thread Nathan Fontenot

The ibm,dynamic-reconfiguration-memory/ibm,associativity-lookup-arrays
property used to track the associativity for LMBs assigned to a system
may not contain all of the possible associativity arrays for the system
at boot time. When a LMB is added to the system and its associativity
array is not present in the lookup array we need to update the lookup
array to contain the new associativity array.

The first patch splits the code that creates a clone of a property into
its own routine so this can be used for cloning any of the properties
used during memory hotplug.

The second patch updates the associativity lookup code to dynamically
add new associativity arrays to the lookup array if they are not
present.

-Nathan

 hotplug-memory.c |  131 ++-
 1 file changed, 92 insertions(+), 39 deletions(-)
 


Re: [PATCH] tracing: Expose CPU physical addresses (resource values) for PCI devices

2016-06-20 Thread Steven Rostedt

On Sat, 18 Jun 2016 08:23:23 +1000
Benjamin Herrenschmidt  wrote:

> On Fri, 2016-06-17 at 17:59 -0400, Steven Rostedt wrote:
> > Sorry for the late reply, this patch got pushed down in my INBOX.
> > 
> > Could I get someone from PPC to review this patch, just to be safe?  
> 
> The patch makes sense, I can try getting somebody onto porting
> mmiotrace one of these days.
> 

OK, thanks!

I'll pull this into my for-next repo then.

-- Steve

[PATCH] powerpc/pseries: Auto online hotplugged memory

2016-06-20 Thread Nathan Fontenot

Auto online hotplugged memory

A recent update (commit id 31bc3858ea3) to the core mm hotplug code
introduced the memhp_auto_online variable to allow for automatically
onlining memory that is added.

This patch updates the pseries memory hotplug code to enable this so that
any memory DLPAR added to the system is automatically onlined. The code
to add the memory block for memory added from add_memory() is removed
because it is not needed; the memory_add code does this.

Signed-off-by: Nathan Fontenot 
---
 arch/powerpc/platforms/pseries/hotplug-memory.c |   52 +--
 1 file changed, 11 insertions(+), 41 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 2ce1385..03f6169 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -533,50 +533,11 @@ static int dlpar_memory_remove_by_index(u32 drc_index, 
struct property *prop)
 
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
-static int dlpar_add_lmb_memory(struct of_drconf_cell *lmb)
+static int dlpar_add_lmb(struct of_drconf_cell *lmb)
 {
-   struct memory_block *mem_block;
unsigned long block_sz;
int nid, rc;
 
-   block_sz = memory_block_size_bytes();
-
-   /* Find the node id for this address */
-   nid = memory_add_physaddr_to_nid(lmb->base_addr);
-
-   /* Add the memory */
-   rc = add_memory(nid, lmb->base_addr, block_sz);
-   if (rc)
-   return rc;
-
-   /* Register this block of memory */
-   rc = memblock_add(lmb->base_addr, block_sz);
-   if (rc) {
-   remove_memory(nid, lmb->base_addr, block_sz);
-   return rc;
-   }
-
-   mem_block = lmb_to_memblock(lmb);
-   if (!mem_block) {
-   remove_memory(nid, lmb->base_addr, block_sz);
-   return -EINVAL;
-   }
-
-   rc = device_online(&mem_block->dev);
-   put_device(&mem_block->dev);
-   if (rc) {
-   remove_memory(nid, lmb->base_addr, block_sz);
-   return rc;
-   }
-
-   lmb->flags |= DRCONF_MEM_ASSIGNED;
-   return 0;
-}
-
-static int dlpar_add_lmb(struct of_drconf_cell *lmb)
-{
-   int rc;
-
if (lmb->flags & DRCONF_MEM_ASSIGNED)
return -EINVAL;
 
@@ -592,10 +553,19 @@ static int dlpar_add_lmb(struct of_drconf_cell *lmb)
return rc;
}
 
-   rc = dlpar_add_lmb_memory(lmb);
+   block_sz = memory_block_size_bytes();
+
+   /* Find the node id for this address */
+   nid = memory_add_physaddr_to_nid(lmb->base_addr);
+
+   /* Add and online memory */
+   memhp_auto_online = true;
+   rc = add_memory(nid, lmb->base_addr, block_sz);
if (rc) {
dlpar_remove_device_tree_lmb(lmb);
dlpar_release_drc(lmb->drc_index);
+   } else {
+   lmb->flags |= DRCONF_MEM_ASSIGNED;
}
 
return rc;


Re: [PATCH 1/3] cxlflash: Fix to drain operations from previous reset

2016-06-20 Thread Matthew R. Ochs

> On Jun 15, 2016, at 6:49 PM, Uma Krishnan  wrote:
> 
> From: "Manoj N. Kumar" 
> 
> While running 'sg_reset -H' in a loop with a user-space application active,
> hit the following exception:
> 
> cpu 0x2: Vector: 300 (Data Access)
>pc: : afu_attach+0x50/0x240 [cxlflash]
>lr: : cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash]
>pid   = 20365, comm = run_block_fvt
> 
> Linux version 4.5.0-491-26f710d+
> 
> cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash]
> cxlflash_ioctl+0x5a8/0x6f0 [cxlflash]
> scsi_ioctl+0x3b0/0x4c0
> sd_ioctl+0x110/0x190
> blkdev_ioctl+0x28c/0xc20
> block_ioctl+0xa4/0xd0
> do_vfs_ioctl+0xd8/0x8c0
> SyS_ioctl+0xd4/0xf0
> system_call+0x38/0xb4
> 
> The problem here is that the problem space area is unmapped while the
> application issues the DK_CXLFLASH_RECOVER_AFU ioctl.
> 
> This is the order I observe:
> 
> proc1                    proc2
> 1) sg_reset
>                          2) ioctl(DK_CXLFLASH_RECOVER_AFU)
> 3) sg_reset again
>    causing a PSA unmap
>                          4) continues RECOVER_AFU processing
> 
> The resolution to this problem is to have the reset handler drain all
> outstanding user space initiated ioctls before proceeding.  It is safe
> to drain after the state has been changed to STATE_RESET. Also since
> drain_ioctls() was static, it had to be moved up a bit to be before
> cxlflash_eh_host_reset_handler().
> 
> Signed-off-by: Manoj N. Kumar 

Acked-by: Matthew R. Ochs 


Re: unrecoverable exception on G5 with CONFIG_PPC_EARLY_DEBUG enabled

2016-06-20 Thread Aneesh Kumar K.V

Michael Ellerman  writes:

> On Sat, 2016-06-18 at 22:47 +0530, Aneesh Kumar K.V wrote:
>> Michael Ellerman  writes:
>> > On Fri, 2016-06-17 at 08:33 +1000, Benjamin Herrenschmidt wrote:
>> > > On Thu, 2016-06-16 at 22:49 +0300, Denis Kirjanov wrote:
>> > > > -
>> > > > +BEGIN_MMU_FTR_SECTION
>> > > > +   b   2f
>> > > > +END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX)
>> > > > andi.   r10,r12,MSR_RI  /* check for unrecoverable exception
>> > > > */
>> > > > beq-2f
>> > > 
>> > > Are we taking an SLB miss before we do the fixup maybe ?
>> > 
>> > Yeah that's the only explanation that makes any sense.
>> > 
>> > I think instead of patching down this low we should instead be redirecting 
>> > SLB
>> > misses to unknown_exception() when radix is enabled. Aneesh?
>> 
>> The 2f branch ends up doing unrecoverable exception. Or are you
>> suggesting something else ?
>
> I meant more like diverting to unknown_exception() higher up the call stack, 
> but
> that's complicated.
>
> How about this? Denis does this work?
>
> cheers
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index 4c9440629128..8bcc1b457115 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1399,11 +1399,12 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX)
>   lwz r9,PACA_EXSLB+EX_CCR(r13)   /* get saved CR */
>
>   mtlr    r10
> -BEGIN_MMU_FTR_SECTION
> - b   2f
> -END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX)
>   andi.   r10,r12,MSR_RI  /* check for unrecoverable exception */
> +BEGIN_MMU_FTR_SECTION
>   beq-2f
> +FTR_SECTION_ELSE
> + b   2f
> +ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_RADIX)
>
>  .machine push
>  .machine "power4"

I sent a patch which should get this problem fixed.

http://mid.gmane.org/1466274479-5650-1-git-send-email-aneesh.ku...@linux.vnet.ibm.com

-aneesh


Re: [PATCH v2] tools/perf: Fix the mask in regs_dump__printf and

2016-06-20 Thread Madhavan Srinivasan



On Monday 20 June 2016 03:10 PM, Jiri Olsa wrote:
> On Mon, Jun 20, 2016 at 05:27:25PM +0800, Wangnan (F) wrote:
>>
>> On 2016/6/20 17:18, Jiri Olsa wrote:
>>> On Mon, Jun 20, 2016 at 02:14:01PM +0530, Madhavan Srinivasan wrote:
>>>> When decoding the perf_regs mask in regs_dump__printf(),
>>>> we loop through the mask using find_first_bit and find_next_bit functions.
>>>> "mask" is of type "u64", but sent as a "unsigned long *" to
>>>> lib functions along with sizeof(). While the existing code works fine in
>>>> most cases, the logic is broken when using a 32bit perf on a
>>>> 64bit kernel (Big Endian). We end up reading the wrong word of the u64
>>>> first in the lib functions.
>>> hum, I still don't see why this happens.. why do we read the
>>> wrong word in this case?
>> If you read a u64 using (u32 *)()[0] and (u32 *)()[1]
>> you can get wrong value. This is what _find_next_bit() is doing.

Also in find_first_bit().

>>
>> In a big endian environment where 'unsigned long' is 32 bits
>> long, "(u32 *)()[0]" gets the upper 32 bits, but without this patch
>> perf assumes it gets the lower 32 bits. The root cause is wrongly
>> converting a u64 value to a bitmap.
> i see, could you please put this into comment in the code?
>
> also we could have common function for that, to keep it on
> one place only, like bitmap_from_u64 or so

Sure will do.

>
> thanks,
> jirka
>


Re: [PATCH v2] tools/perf: Fix the mask in regs_dump__printf and

2016-06-20 Thread Jiri Olsa

On Mon, Jun 20, 2016 at 05:27:25PM +0800, Wangnan (F) wrote:
> 
> 
> On 2016/6/20 17:18, Jiri Olsa wrote:
> > On Mon, Jun 20, 2016 at 02:14:01PM +0530, Madhavan Srinivasan wrote:
> > > When decoding the perf_regs mask in regs_dump__printf(),
> > > we loop through the mask using find_first_bit and find_next_bit functions.
> > > "mask" is of type "u64", but sent as a "unsigned long *" to
> > > lib functions along with sizeof(). While the existing code works fine in
> > > most cases, the logic is broken when using a 32bit perf on a
> > > 64bit kernel (Big Endian). We end up reading the wrong word of the u64
> > > first in the lib functions.
> > hum, I still don't see why this happens.. why do we read the
> > wrong word in this case?
> 
> If you read a u64 using (u32 *)()[0] and (u32 *)()[1]
> you can get wrong value. This is what _find_next_bit() is doing.
> 
> In a big endian environment where 'unsigned long' is 32 bits
> long, "(u32 *)()[0]" gets the upper 32 bits, but without this patch
> perf assumes it gets the lower 32 bits. The root cause is wrongly
> converting a u64 value to a bitmap.

i see, could you please put this into comment in the code?

also we could have common function for that, to keep it on
one place only, like bitmap_from_u64 or so

thanks,
jirka
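
A common helper along the lines suggested above might look like the
sketch below. It is not the perf implementation: uint32_t is used to
stand in for a 32-bit 'unsigned long' so the behaviour can be checked on
any host, and bitmap32_from_u64() and bitmap32_test_bit() are
hypothetical names. The conversion uses shifts and masks, so bit N of the
u64 always lands in word N / 32 regardless of byte order, which is what
the bitmap helpers expect.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Split a 64-bit mask into two 32-bit bitmap words, lowest bits first.
 * uint32_t stands in for a 32-bit 'unsigned long'.
 */
void bitmap32_from_u64(uint32_t dst[2], uint64_t mask)
{
	dst[0] = (uint32_t)(mask & 0xffffffffu);
	dst[1] = (uint32_t)(mask >> 32);
}

/* Test bit 'nr' the way the bitmap helpers index it: word nr / 32. */
int bitmap32_test_bit(const uint32_t *map, unsigned int nr)
{
	assert(nr < 64);
	return (int)((map[nr / 32] >> (nr % 32)) & 1u);
}
```

Casting the address of the u64 to a word pointer instead would give
word 0 the upper half on a big-endian 32-bit build, which is the failure
described in this thread.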

Re: [PATCH v2] tools/perf: Fix the mask in regs_dump__printf and

2016-06-20 Thread Wangnan (F)




On 2016/6/20 17:18, Jiri Olsa wrote:
> On Mon, Jun 20, 2016 at 02:14:01PM +0530, Madhavan Srinivasan wrote:
>> When decoding the perf_regs mask in regs_dump__printf(),
>> we loop through the mask using find_first_bit and find_next_bit functions.
>> "mask" is of type "u64", but sent as a "unsigned long *" to
>> lib functions along with sizeof(). While the existing code works fine in
>> most cases, the logic is broken when using a 32bit perf on a
>> 64bit kernel (Big Endian). We end up reading the wrong word of the u64
>> first in the lib functions.
> hum, I still don't see why this happens.. why do we read the
> wrong word in this case?


If you read a u64 using (u32 *)()[0] and (u32 *)()[1]
you can get wrong value. This is what _find_next_bit() is doing.

In a big endian environment where 'unsigned long' is 32 bits
long, "(u32 *)()[0]" gets the upper 32 bits, but without this patch
perf assumes it gets the lower 32 bits. The root cause is wrongly
converting a u64 value to a bitmap.


Thank you.


Re: unrecoverable exception on G5 with CONFIG_PPC_EARLY_DEBUG enabled

2016-06-20 Thread Michael Ellerman

On Sat, 2016-06-18 at 22:47 +0530, Aneesh Kumar K.V wrote:
> Michael Ellerman  writes:
> > On Fri, 2016-06-17 at 08:33 +1000, Benjamin Herrenschmidt wrote:
> > > On Thu, 2016-06-16 at 22:49 +0300, Denis Kirjanov wrote:
> > > > -
> > > > +BEGIN_MMU_FTR_SECTION
> > > > +   b   2f
> > > > +END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX)
> > > > andi.   r10,r12,MSR_RI  /* check for unrecoverable exception
> > > > */
> > > > beq-2f
> > > 
> > > Are we taking an SLB miss before we do the fixup maybe ?
> > 
> > Yeah that's the only explanation that makes any sense.
> > 
> > I think instead of patching down this low we should instead be redirecting 
> > SLB
> > misses to unknown_exception() when radix is enabled. Aneesh?
> 
> The 2f branch ends up doing unrecoverable exception. Or are you
> suggesting something else ?

I meant more like diverting to unknown_exception() higher up the call stack, but
that's complicated.

How about this? Denis does this work?

cheers

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 4c9440629128..8bcc1b457115 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1399,11 +1399,12 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX)
lwz r9,PACA_EXSLB+EX_CCR(r13)   /* get saved CR */
 
mtlr    r10
-BEGIN_MMU_FTR_SECTION
-   b   2f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX)
andi.   r10,r12,MSR_RI  /* check for unrecoverable exception */
+BEGIN_MMU_FTR_SECTION
beq-2f
+FTR_SECTION_ELSE
+   b   2f
+ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_RADIX)
 
 .machine   push
 .machine   "power4"


Re: [PATCH v2] tools/perf: Fix the mask in regs_dump__printf and

2016-06-20 Thread Jiri Olsa

On Mon, Jun 20, 2016 at 02:14:01PM +0530, Madhavan Srinivasan wrote:

SNIP

> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index e3ce2f34d3ad..76d5006ebcc3 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -412,11 +412,16 @@ static void print_sample_iregs(struct perf_sample 
> *sample,
>   struct regs_dump *regs = &sample->intr_regs;
>   uint64_t mask = attr->sample_regs_intr;
>   unsigned i = 0, r;
> + unsigned long _mask[sizeof(mask)/sizeof(unsigned long)];
>  
>   if (!regs)
>   return;
>  
> - for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
> + _mask[0] = mask & ULONG_MAX;
> + if (sizeof(mask) > sizeof(unsigned long))
> + _mask[1] = mask >> 32;
> +
> + for_each_set_bit(r, _mask, sizeof(mask) * 8) {
>   u64 val = regs->regs[i++];
>   printf("%5s:0x%"PRIx64" ", perf_reg_name(r), val);
>   }
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 5214974e841a..2eaa42a4832a 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -940,8 +940,13 @@ static void branch_stack__printf(struct perf_sample 
> *sample)
>  static void regs_dump__printf(u64 mask, u64 *regs)
>  {
>   unsigned rid, i = 0;
> + unsigned long _mask[sizeof(mask)/sizeof(unsigned long)];
>  
> - for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) {
> + _mask[0] = mask & ULONG_MAX;
> + if (sizeof(mask) > sizeof(unsigned long))
> + _mask[1] = mask >> 32;
> +
> + for_each_set_bit(rid, _mask, sizeof(mask) * 8) {
>   u64 val = regs[i++];

could you please move the common code into the function?

thanks,
jirka

Re: [PATCH v2] tools/perf: Fix the mask in regs_dump__printf and

2016-06-20 Thread Jiri Olsa

On Mon, Jun 20, 2016 at 02:14:01PM +0530, Madhavan Srinivasan wrote:
> When decoding the perf_regs mask in regs_dump__printf(),
> we loop through the mask using find_first_bit and find_next_bit functions.
> "mask" is of type "u64", but sent as a "unsigned long *" to
> lib functions along with sizeof(). While the existing code works fine in
> most cases, the logic is broken when using a 32bit perf on a
> 64bit kernel (Big Endian). We end up reading the wrong word of the u64
> first in the lib functions.

hum, I still don't see why this happens.. why do we read the
wrong word in this case?

thanks,
jirka

> first in the lib functions. Proposed fix is to swap the words of the
> u64 to handle this case. This is not an endianness swap.
> 

SNIP



[PATCH] devpts: remove DEVPTS_MULTIPLE_INSTANCES from all configs

2016-06-20 Thread Alexandru Moise

As each mount of devpts is now an independent filesystem,
the DEVPTS_MULTIPLE_INSTANCES config option no longer exists.
So remove it.

Signed-off-by: Alexandru Moise <00moses.alexande...@gmail.com>
---
 arch/arc/configs/tb10x_defconfig| 1 -
 arch/arm/configs/mxs_defconfig  | 1 -
 arch/mips/configs/ip27_defconfig| 1 -
 arch/mips/configs/nlm_xlp_defconfig | 1 -
 arch/mips/configs/nlm_xlr_defconfig | 1 -
 arch/parisc/configs/generic-64bit_defconfig | 1 -
 arch/powerpc/configs/powernv_defconfig  | 1 -
 arch/powerpc/configs/pseries_defconfig  | 1 -
 arch/s390/configs/default_defconfig | 1 -
 arch/s390/configs/gcov_defconfig| 1 -
 arch/s390/configs/performance_defconfig | 1 -
 arch/sh/configs/sh7785lcr_32bit_defconfig   | 1 -
 arch/xtensa/configs/iss_defconfig   | 1 -
 tools/testing/selftests/mount/config| 1 -
 14 files changed, 14 deletions(-)

diff --git a/arch/arc/configs/tb10x_defconfig b/arch/arc/configs/tb10x_defconfig
index 4c51183..be0b4fb 100644
--- a/arch/arc/configs/tb10x_defconfig
+++ b/arch/arc/configs/tb10x_defconfig
@@ -58,7 +58,6 @@ CONFIG_STMMAC_ETH=y
 # CONFIG_INPUT is not set
 # CONFIG_SERIO is not set
 # CONFIG_VT is not set
-CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
 # CONFIG_LEGACY_PTYS is not set
 # CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_8250=y
diff --git a/arch/arm/configs/mxs_defconfig b/arch/arm/configs/mxs_defconfig
index 6e0f751..65a84b4 100644
--- a/arch/arm/configs/mxs_defconfig
+++ b/arch/arm/configs/mxs_defconfig
@@ -77,7 +77,6 @@ CONFIG_INPUT_EVDEV=y
 CONFIG_INPUT_TOUCHSCREEN=y
 CONFIG_TOUCHSCREEN_TSC2007=m
 # CONFIG_SERIO is not set
-CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
 # CONFIG_LEGACY_PTYS is not set
 # CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_AMBA_PL011=y
diff --git a/arch/mips/configs/ip27_defconfig b/arch/mips/configs/ip27_defconfig
index 2b74aee..df11563 100644
--- a/arch/mips/configs/ip27_defconfig
+++ b/arch/mips/configs/ip27_defconfig
@@ -266,7 +266,6 @@ CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_8250_EXTENDED=y
 CONFIG_SERIAL_8250_MANY_PORTS=y
 CONFIG_SERIAL_8250_SHARE_IRQ=y
-CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
 CONFIG_HW_RANDOM_TIMERIOMEM=m
 CONFIG_I2C_CHARDEV=m
 CONFIG_I2C_ALI1535=m
diff --git a/arch/mips/configs/nlm_xlp_defconfig b/arch/mips/configs/nlm_xlp_defconfig
index b496c25..5c40b48 100644
--- a/arch/mips/configs/nlm_xlp_defconfig
+++ b/arch/mips/configs/nlm_xlp_defconfig
@@ -409,7 +409,6 @@ CONFIG_SERIO_SERPORT=m
 CONFIG_SERIO_LIBPS2=y
 CONFIG_SERIO_RAW=m
 CONFIG_VT_HW_CONSOLE_BINDING=y
-CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
 CONFIG_LEGACY_PTY_COUNT=0
 CONFIG_SERIAL_NONSTANDARD=y
 CONFIG_N_HDLC=m
diff --git a/arch/mips/configs/nlm_xlr_defconfig b/arch/mips/configs/nlm_xlr_defconfig
index 8e99ad8..47a2756 100644
--- a/arch/mips/configs/nlm_xlr_defconfig
+++ b/arch/mips/configs/nlm_xlr_defconfig
@@ -347,7 +347,6 @@ CONFIG_SERIO_SERPORT=m
 CONFIG_SERIO_LIBPS2=y
 CONFIG_SERIO_RAW=m
 CONFIG_VT_HW_CONSOLE_BINDING=y
-CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
 CONFIG_LEGACY_PTY_COUNT=0
 CONFIG_SERIAL_NONSTANDARD=y
 CONFIG_N_HDLC=m
diff --git a/arch/parisc/configs/generic-64bit_defconfig b/arch/parisc/configs/generic-64bit_defconfig
index e945c08..69aa66c 100644
--- a/arch/parisc/configs/generic-64bit_defconfig
+++ b/arch/parisc/configs/generic-64bit_defconfig
@@ -166,7 +166,6 @@ CONFIG_INPUT_MISC=y
 CONFIG_SERIO_SERPORT=m
 # CONFIG_HP_SDC is not set
 CONFIG_SERIO_RAW=m
-CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
 # CONFIG_LEGACY_PTYS is not set
 CONFIG_NOZOMI=m
 # CONFIG_DEVKMEM is not set
diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig
index 0450310..2fd6bbe 100644
--- a/arch/powerpc/configs/powernv_defconfig
+++ b/arch/powerpc/configs/powernv_defconfig
@@ -176,7 +176,6 @@ CONFIG_PPP_SYNC_TTY=m
 CONFIG_INPUT_EVDEV=m
 CONFIG_INPUT_MISC=y
 # CONFIG_SERIO_SERPORT is not set
-CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_JSM=m
diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
index 36871a4..3c325ba 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -180,7 +180,6 @@ CONFIG_INPUT_EVDEV=m
 CONFIG_INPUT_MISC=y
 CONFIG_INPUT_PCSPKR=m
 # CONFIG_SERIO_SERPORT is not set
-CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_ICOM=m
diff --git a/arch/s390/configs/default_defconfig b/arch/s390/configs/default_defconfig
index d5ec71b..501e93a 100644
--- a/arch/s390/configs/default_defconfig
+++ b/arch/s390/configs/default_defconfig
@@ -453,7 +453,6 @@ CONFIG_PPP_SYNC_TTY=m
 # CONFIG_INPUT_KEYBOARD is not set
 # CONFIG_INPUT_MOUSE is not set
 # CONFIG_SERIO is not set
-CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
 CONFIG_LEGACY_PTY_COUNT=0
 CONFIG_HW_RANDOM_VIRTIO=m
 CONFIG_RAW_DRIVER=m
diff --git a/arch/s390/configs/gcov_defconfig b/arch/s390/configs/gcov_defconfig
index

Re: [v6, 1/2] cxl: Add mechanism for delivering AFU driver specific events

2016-06-20 Thread Vaibhav Jain

Hi Philippe,

A few points:


Philippe Bergheaud  writes:

> +int cxl_unset_driver_ops(struct cxl_context *ctx)
> +{
> + if (atomic_read(&ctx->afu_driver_events))
> + return -EBUSY;
> +
> + ctx->afu_driver_ops = NULL;
This needs a write memory barrier so that afu_driver_ops can't possibly be
called after this store.

>  
> -static inline int ctx_event_pending(struct cxl_context *ctx)
> +static ssize_t afu_driver_event_copy(struct cxl_context *ctx,
> +  char __user *buf,
> +  struct cxl_event *event,
> +  struct cxl_event_afu_driver_reserved *pl)
>  {
> - return (ctx->pending_irq || ctx->pending_fault ||
> - ctx->pending_afu_err || (ctx->status == CLOSED));
> + /* Check data size */
> + if (!pl || (pl->data_size > CXL_MAX_EVENT_DATA_SIZE)) {
> + ctx->afu_driver_ops->event_delivered(ctx, pl, -EINVAL);
> + return -EFAULT;
> + }
This should also check against the length of the user buffer (count) provided
in the read call. Ideally this check should be moved to the read call, where
you have access to the count variable.

Right now libcxl is using a hardcoded value of CXL_READ_MIN_SIZE to
issue the read call, and in kernel code we have a check to ensure that the
read buffer is at least CXL_READ_MIN_SIZE in size.

But it might be a good idea to decouple the driver from
CXL_MAX_EVENT_DATA_SIZE. Ideally the maximum event size that we can
support should depend on the size of the user buffer we receive in the
read call. That way a future libcxl can support larger event_data without
needing a change to cxl.h.
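
As a rough illustration of the count-based check suggested above, here is a
minimal userspace sketch. It is not the cxl driver code: struct event_header
and event_copy_size() are hypothetical names made up for this example, and
the header layout is only an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a fixed-size event header (not the real uapi struct). */
struct event_header {
	unsigned short type;
	unsigned short size;
	unsigned short process_element;
	unsigned short reserved;
};

/*
 * Return the number of bytes to copy for an event carrying data_size
 * bytes of payload, or -EINVAL if header plus payload does not fit in
 * the count bytes supplied by the caller of read().
 */
long event_copy_size(size_t data_size, size_t count)
{
	size_t hdr = sizeof(struct event_header);

	/* Compare against count - hdr so the sum cannot overflow. */
	if (count < hdr || data_size > count - hdr)
		return -EINVAL;

	assert(hdr + data_size <= count);
	return (long)(hdr + data_size);
}
```

With a check of this shape the limit follows the buffer the reader supplies,
so no fixed maximum needs to be baked into the header file.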


> diff --git a/include/uapi/misc/cxl.h b/include/uapi/misc/cxl.h
> index 8cd334f..4fa36e4 100644
> --- a/include/uapi/misc/cxl.h
> +++ b/include/uapi/misc/cxl.h
> @@ -93,6 +93,7 @@ enum cxl_event_type {
>   CXL_EVENT_AFU_INTERRUPT = 1,
>   CXL_EVENT_DATA_STORAGE  = 2,
>   CXL_EVENT_AFU_ERROR = 3,
> + CXL_EVENT_AFU_DRIVER= 4,
>  };
>  
>  struct cxl_event_header {
> @@ -124,12 +125,32 @@ struct cxl_event_afu_error {
>   __u64 error;
>  };
>  
> +#define CXL_MAX_EVENT_DATA_SIZE 128
> +

Agree with Matt's earlier comments. 128 is very small and I would prefer
a limit of at least a page size (4K/64K).

Cheers,
~ Vaibhav


[PATCH v2] tools/perf: Fix the mask in regs_dump__printf and

2016-06-20 Thread Madhavan Srinivasan

When decoding the perf_regs mask in regs_dump__printf(),
we loop through the mask using the find_first_bit and find_next_bit functions.
"mask" is of type "u64", but it is passed as an "unsigned long *" to the
lib functions along with sizeof(). While the existing code works fine in
most cases, the logic is broken when using a 32-bit perf on a
64-bit kernel (Big Endian). We end up reading the wrong word of the u64
first in the lib functions. The proposed fix is to swap the words of the
u64 to handle this case. This is not an endianness swap.

Suggested-by: Yury Norov 
Cc: Yury Norov 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Jiri Olsa 
Cc: Adrian Hunter 
Cc: Kan Liang 
Cc: Wang Nan 
Cc: Michael Ellerman 
Signed-off-by: Madhavan Srinivasan 
---
Changelog v1:
1) Updated commit message and patch subject
2) Added the fix to print_sample_iregs() in builtin-script.c

 tools/perf/builtin-script.c | 7 ++-
 tools/perf/util/session.c   | 7 ++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index e3ce2f34d3ad..76d5006ebcc3 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -412,11 +412,16 @@ static void print_sample_iregs(struct perf_sample *sample,
 	struct regs_dump *regs = &sample->intr_regs;
 	uint64_t mask = attr->sample_regs_intr;
 	unsigned i = 0, r;
+	unsigned long _mask[sizeof(mask)/sizeof(unsigned long)];
 
 	if (!regs)
 		return;
 
-	for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
+	_mask[0] = mask & ULONG_MAX;
+	if (sizeof(mask) > sizeof(unsigned long))
+		_mask[1] = mask >> 32;
+
+	for_each_set_bit(r, _mask, sizeof(mask) * 8) {
 		u64 val = regs->regs[i++];
 		printf("%5s:0x%"PRIx64" ", perf_reg_name(r), val);
 	}
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 5214974e841a..2eaa42a4832a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -940,8 +940,13 @@ static void branch_stack__printf(struct perf_sample *sample)
 static void regs_dump__printf(u64 mask, u64 *regs)
 {
 	unsigned rid, i = 0;
+	unsigned long _mask[sizeof(mask)/sizeof(unsigned long)];
 
-	for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) {
+	_mask[0] = mask & ULONG_MAX;
+	if (sizeof(mask) > sizeof(unsigned long))
+		_mask[1] = mask >> 32;
+
+	for_each_set_bit(rid, _mask, sizeof(mask) * 8) {
 		u64 val = regs[i++];
 
 		printf(" %-5s 0x%" PRIx64 "\n",
-- 
1.9.1
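
For readers wondering why the wrong word is read: on a 32-bit big-endian
build, a u64 is stored with its high 32 bits at the lower address, so a cast
to "unsigned long *" makes word 0 of the bitmap the high half of the mask.
The sketch below shows the word split from the patch as a standalone helper.
The names mask_to_words() and word_test_bit() are hypothetical and exist only
for this illustration; word_test_bit() is a simple stand-in for the kernel
bit helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Number of unsigned long words needed to hold a 64-bit mask. */
#define MASK_WORDS (sizeof(uint64_t) / sizeof(unsigned long))

/*
 * Copy a 64-bit mask into an array of unsigned long so that words[0]
 * always holds the lowest-numbered bits, regardless of byte order.
 */
void mask_to_words(uint64_t mask, unsigned long *words, size_t nwords)
{
	assert(nwords == MASK_WORDS);

	words[0] = (unsigned long)(mask & ULONG_MAX);
	if (nwords > 1)			/* only when long is 32 bits wide */
		words[1] = (unsigned long)(mask >> 32);
}

/* Return 1 if bit nr is set in the word array, 0 otherwise. */
int word_test_bit(const unsigned long *words, unsigned int nr)
{
	const unsigned int bits_per_long = sizeof(unsigned long) * CHAR_BIT;

	return (int)((words[nr / bits_per_long] >> (nr % bits_per_long)) & 1UL);
}
```

Because the words are filled by value rather than by reinterpreting memory,
bit N of the array is bit N of the mask on both 32-bit and 64-bit builds.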


Re: [PATCH 12/12] leds: Only descend into leds directory when CONFIG_NEW_LEDS is set

2016-06-20 Thread Jacek Anaszewski


On 06/18/2016 12:46 AM, Andrew F. Davis wrote:
> On 06/15/2016 01:48 AM, Jacek Anaszewski wrote:
>> Hi Andrew,
>>
>> Thanks for the patch.
>>
>> Please address the issue [1] raised by test bot and resubmit.
>>
>> Thanks,
>> Jacek Anaszewski
>>
>> [1] https://lkml.org/lkml/2016/6/13/1091
>>
>
> It looks like some systems use 'gpio_led_register_device' to make an
> in-memory copy of their LED device table so the original can be removed
> as .init.rodata. This doesn't necessarily depend on the LED subsystem
> but it kind of seems useless when the rest of the subsystem is disabled.
>
> One solution could be to use a dummy 'gpio_led_register_device' when the
> subsystem is not enabled.

It sounds good. Please add a no-op version of gpio_led_register_device()
to include/leds.h, in a separate patch.

Thanks,
Jacek Anaszewski

> Another is just to remove the five or so uses
> of 'gpio_led_register_device' and have those systems register LED device
> tables like other systems do.
>
> If neither of these is acceptable then this patch can be dropped from
> this series for now.
>
> Thanks,
> Andrew
>
>> On 06/13/2016 10:02 PM, Andrew F. Davis wrote:
>>> When CONFIG_NEW_LEDS is not set make will still descend into the leds
>>> directory but nothing will be built. This produces unneeded build
>>> artifacts and messages in addition to slowing the build. Fix this here.
>>>
>>> Signed-off-by: Andrew F. Davis
>>> ---
>>>  drivers/Makefile | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/Makefile b/drivers/Makefile
>>> index 567e32c..fa514d5 100644
>>> --- a/drivers/Makefile
>>> +++ b/drivers/Makefile
>>> @@ -127,7 +127,7 @@ obj-$(CONFIG_CPU_FREQ) += cpufreq/
>>>  obj-$(CONFIG_CPU_IDLE) += cpuidle/
>>>  obj-$(CONFIG_MMC) += mmc/
>>>  obj-$(CONFIG_MEMSTICK) += memstick/
>>> -obj-y += leds/
>>> +obj-$(CONFIG_NEW_LEDS) += leds/
>>>  obj-$(CONFIG_INFINIBAND) += infiniband/
>>>  obj-$(CONFIG_SGI_SN) += sn/
>>>  obj-y += firmware/
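
The no-op gpio_led_register_device() discussed in this thread could follow
the usual header stub pattern. The sketch below is only an illustration that
compiles outside the kernel tree: EXAMPLE_LEDS_ENABLED is a hypothetical
macro standing in for whichever Kconfig symbol ends up gating the real
function, and the two structs are left opaque on purpose.

```c
#include <stddef.h>

/* Opaque stand-ins so the sketch compiles outside the kernel tree. */
struct platform_device;
struct gpio_led_platform_data;

/* EXAMPLE_LEDS_ENABLED is a hypothetical stand-in for the Kconfig symbol. */
#ifdef EXAMPLE_LEDS_ENABLED
/* Real implementation lives elsewhere when the subsystem is built. */
struct platform_device *gpio_led_register_device(
		int id, const struct gpio_led_platform_data *pdata);
#else
/* No-op version: nothing is registered and callers get NULL back. */
static inline struct platform_device *gpio_led_register_device(
		int id, const struct gpio_led_platform_data *pdata)
{
	(void)id;
	(void)pdata;
	return NULL;
}
#endif
```

Board code that calls the function keeps compiling unchanged when the
subsystem is disabled, which is what allows the leds directory to be skipped.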










Re: [PATCH V2] powerpc/mm: Don't do debug print before we do feature fixup

2016-06-20 Thread Denis Kirjanov

On 6/19/16, Benjamin Herrenschmidt  wrote:
> On Sat, 2016-06-18 at 23:57 +0530, Aneesh Kumar K.V wrote:
>> We use feature fixup in segment and page fault path and hence we should
>> not call any function that can cause either of these before we finish
>> feature fixup.
>>
>> Calling into early debug routine can result in segment fault as updated
>> in
>> https://lkml.kernel.org/r/CAOJe8K2L8D1M_yZPmyiZ11JvJ+HAS0aFHOnsqaY=xlonqta...@mail.gmail.com
>>
>> Avoid the segment fault by removing the debug print. Also remove the
>> matching closing debug print at the end of the function.
>
> Please add a comment at the beginning of setup_system() explaining
> the situation. IE, that until the fixups are done, not even debug
> printfs can be used.
>
Right, otherwise we'll hit the same issue in the future.

> It used to work though, the fact that we used 256M segment in that
> case wasn't an issue, in part because btext only really existed on
> machines that didn't have 1T segments. It was only ever going to be
> a segment fault, the rest was bolted (early ioremap).
>
> Ben.
>
>
