Re: [PATCH 1/2] Move the pt_regs_offset struct definition from arch to common include file

2015-06-23 Thread David Long

On 06/22/15 23:32, Michael Ellerman wrote:

On Fri, 2015-06-19 at 10:12 -0400, David Long wrote:

On 06/19/15 00:19, Michael Ellerman wrote:

On Mon, 2015-06-15 at 12:42 -0400, David Long wrote:

From: David A. Long dave.l...@linaro.org

The pt_regs_offset structure is used for the HAVE_REGS_AND_STACK_ACCESS_API
feature and has identical definitions in four different arch ptrace.h
include files. It seems unlikely that the definition would ever need to
change regardless of architecture, so let's move it into
include/linux/ptrace.h.

Signed-off-by: David A. Long dave.l...@linaro.org
---
   arch/powerpc/kernel/ptrace.c | 5 -


Built and booted on powerpc, but is there an easy way to actually test the code
paths in question?



There is an easy way to smoke test it on all architectures that also
implement kprobes (which powerpc does).  If I'm understanding the
powerpc code correctly (WRT register naming conventions) just do the
following:

cd /sys/kernel/debug/tracing
echo 'p do_fork %gpr0' > kprobe_events
echo 1 > events/kprobes/enable
ls
cat trace
echo 0 > events/kprobes/enable

Every fork() call done on the system between those two echo commands
(hence the ls) should append a line to the trace file.  For a more
exhaustive test one could repeat this sequence for every register in the
architecture.
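
For the exhaustive variant, the probe definition can be generated instead of typed by hand. The following is a minimal sketch in C, assuming the powerpc register names used elsewhere in this thread (gpr0 to gpr31 plus nip, msr, ctr, link, xer, ccr, softe, trap, dar, dsisr). build_probe_def() is a hypothetical helper, not part of any kernel tool; its output would still have to be written to kprobe_events.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Special-purpose register names as they appear in this thread. */
static const char *const special_regs[] = {
    "nip", "msr", "ctr", "link", "xer", "ccr",
    "softe", "trap", "dar", "dsisr",
};

/*
 * Hypothetical helper: write "p <func> %gpr0 ... %gpr31 %nip ..." into buf.
 * Returns the length of the string, or -1 if buf is too small.
 */
static int build_probe_def(char *buf, size_t size, const char *func)
{
    size_t used;
    int n;

    assert(buf != NULL && func != NULL && strlen(func) > 0);

    n = snprintf(buf, size, "p %s", func);
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    /* General-purpose registers gpr0..gpr31. */
    for (int i = 0; i < 32; i++) {
        n = snprintf(buf + used, size - used, " %%gpr%d", i);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }

    /* Special-purpose registers, in table order. */
    for (size_t i = 0; i < sizeof(special_regs) / sizeof(special_regs[0]); i++) {
        n = snprintf(buf + used, size - used, " %%%s", special_regs[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Calling build_probe_def(buf, sizeof(buf), "do_fork") with a 512-byte buffer yields a single line suitable for kprobe_events.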


OK, so I went the whole hog and did:

$ echo 'p do_fork %gpr0 %gpr1 %gpr2 %gpr3 %gpr4 %gpr5 %gpr6 %gpr7 %gpr8 %gpr9 
%gpr10 %gpr11 %gpr12 %gpr13 %gpr14 %gpr15 %gpr16 %gpr17 %gpr18 %gpr19 %gpr20 
%gpr21 %gpr22 %gpr23 %gpr24 %gpr25 %gpr26 %gpr27 %gpr28 %gpr29 %gpr30 %gpr31 %nip 
%msr %ctr %link %xer %ccr %softe %trap %dar %dsisr' > kprobe_events

And I get:

 bash-2057  [001] d...   535.433941: p_do_fork_0: 
(do_fork+0x8/0x490) arg1=0xc00094d0 arg2=0xc001fbe9be30 
arg3=0xc1133bb8 arg4=0x1200011 arg5=0x0 arg6=0x0 arg7=0x0 
arg8=0x3fff7c885940 arg9=0x1 arg10=0xc001fbe9bea0 arg11=0x0 arg12=0xc01 
arg13=0xc00094c8 arg14=0xcfdc0480 arg15=0x0 arg16=0x2200 
arg17=0x1016d6e8 arg18=0x0 arg19=0x4400 arg20=0x0 arg21=0x10037c82208 
arg22=0x1017b008 arg23=0x10143d18 arg24=0x10178854 arg25=0x10144f90 
arg26=0x10037c821e8 arg27=0x0 arg28=0x0 arg29=0x0 arg30=0x0 arg31=0x809 
arg32=0x3788c010 arg33=0xc00a7fe8 arg34=0x80029033 
arg35=0xc00094c8 arg36=0xc00094d0 arg37=0x0 arg38=0x4844 
arg39=0x1 arg40=0x700 arg41=0xc001fbe9bd50 arg42=0xc001fbe9bd30

Which is ugly as hell, but appears unchanged since before your patch.



Excellent.  Many thanks.


I take it it's expected that the names are not decoded in the output?



Yes.


Also I wonder why we choose gpr when r is the more usual prefix on powerpc.
I guess we can add new aliases to the table.



Yeah I can't answer that, this is just what the preexisting code is 
written to do. I believe you could add aliases to the table, perhaps as 
a step in migrating to supporting only the more common naming.  The 
reverse translation would have to return one or the other though.


-dl


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2 1/1] net: fs_enet: Fix NETIF_F_SG feature for Freescale MPC5121

2015-06-23 Thread David Miller
From: Alexander Popov alex.po...@linux.com
Date: Sun, 21 Jun 2015 01:32:46 +0300

 Commit 4fc9b87bae25 (net: fs_enet: Implement NETIF_F_SG feature)
 causes trouble on Freescale MPC512x: a kernel oops happens when
 sending a non-linear sk_buff whose .data is not 4-byte aligned.
 
 Log quotation:
 ...
 The reason:
 
 The MPC5121 FEC requires 4-byte alignment for the TX data buffer and calls
 tx_skb_align_workaround() to copy an sk_buff with unaligned .data into a new
 sk_buff with aligned .data. But tx_skb_align_workaround() uses
 skb_copy_from_linear_data(), which doesn't work for a non-linear sk_buff:
 the new sk_buff ends up with non-zero nr_frags and zero .data_len.
 
 So improve the condition for calling tx_skb_align_workaround() and use
 skb_linearize() in it.
 
 Signed-off-by: Alexander Popov alex.po...@linux.com

Applied, thanks.

Re: [RESEND PATCH V2 1/3] Add mmap flag to request pages are locked after page fault

2015-06-23 Thread Vlastimil Babka

On 06/22/2015 04:18 PM, Eric B Munson wrote:

On Mon, 22 Jun 2015, Michal Hocko wrote:


On Fri 19-06-15 12:43:33, Eric B Munson wrote:

On Fri, 19 Jun 2015, Michal Hocko wrote:


On Thu 18-06-15 16:30:48, Eric B Munson wrote:

On Thu, 18 Jun 2015, Michal Hocko wrote:

[...]

Wouldn't it be much more reasonable and straightforward to have
MAP_FAULTPOPULATE as a counterpart for MAP_POPULATE which would
explicitly disallow any form of pre-faulting? It would be usable for
other usecases than with MAP_LOCKED combination.


I don't see a clear case for it being more reasonable, it is one
possible way to solve the problem.


MAP_FAULTPOPULATE would be usable for other cases as well. E.g. fault
around is all or nothing feature. Either all mappings (which support
this) fault around or none. There is no way to tell the kernel that
this particular mapping shouldn't fault around. I haven't seen such a
request yet but we have seen requests to have a way to opt out from
a global policy in the past (e.g. per-process opt out from THP). So
I can imagine somebody will come with a request to opt out from any
speculative operations on the mapped area in the future.


That sounds like something where a new madvise() flag would make more 
sense than a new mmap flag, and conflating it with locking behavior 
would lead to all kinds of weird corner cases as Eric mentioned.





But I think it leaves us in an even
more awkward state WRT VMA flags.  As you noted in your fix for the
mmap() man page, one can get into a state where a VMA is VM_LOCKED, but
not present.  Having VM_LOCKONFAULT states that this was intentional, if
we go to using MAP_FAULTPOPULATE instead of MAP_LOCKONFAULT, we no
longer set VM_LOCKONFAULT (unless we want to start mapping it to the
presence of two MAP_ flags).  This can make detecting the MAP_LOCKED +
populate failure state harder.


I am not sure I understand your point here. Could you be more specific
how would you check for that and what for?


My thought on detecting was that someone might want to know if they had
a VMA that was VM_LOCKED but had not been made present because of a
failure in mmap.  We don't have a way today, but adding VM_LOCKONFAULT
is at least explicit about what is happening which would make detecting
the VM_LOCKED but not present state easier.


One could use /proc/pid/pagemap to query the residency.


I think that's far too complex a scenario for little gain. If 
someone knows that mmap(MAP_LOCKED|MAP_POPULATE) is not perfect, he 
should either mlock() separately from mmap(), or fault the range 
manually with a for loop. Why try to detect if the corner case was hit?
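
A minimal sketch of the "for loop" option, assuming a private, writable anonymous mapping. prefault_range() and map_prefaulted() are hypothetical helper names. Touching the pages only populates them; it does not lock them, so a caller wanting locked memory would still call mlock() on the range and check its return value.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Touch one byte in every page of [addr, addr + len) so that each page
 * is faulted in. The byte is rewritten with its own value, so contents
 * are preserved. Returns the number of pages touched.
 */
static size_t prefault_range(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = addr;
    size_t touched = 0;

    assert(addr != NULL && page > 0);
    for (size_t off = 0; off < len; off += (size_t)page) {
        p[off] = p[off];    /* write fault on a writable mapping */
        touched++;
    }
    return touched;
}

/* Map len bytes of anonymous memory and fault all of it in up front. */
static void *map_prefaulted(size_t len)
{
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (addr == MAP_FAILED)
        return NULL;
    prefault_range(addr, len);
    return addr;
}
```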





This assumes that
MAP_FAULTPOPULATE does not translate to a VMA flag, but it sounds like
it would have to.


Yes, it would have to have a VM flag for the vma.


So with your approach, VM_LOCKED flag is enough, right? The new MAP_ / 
MLOCK_ flags just cause setting VM_LOCKED to not fault the whole vma, 
but otherwise nothing changes.


If that's true, I think it's better than a new vma flag.




 From my understanding MAP_LOCKONFAULT is essentially
MAP_FAULTPOPULATE|MAP_LOCKED with a quite obvious semantic (unlike
single MAP_LOCKED unfortunately). I would love to also have
MAP_LOCKED|MAP_POPULATE (aka full mlock semantic) but I am really
skeptical considering how my previous attempt to make MAP_POPULATE
reasonable went.


Are you objecting to the addition of the VMA flag VM_LOCKONFAULT, or the
new MAP_LOCKONFAULT flag (or both)?


I thought the MAP_FAULTPOPULATE (or any other better name) would
directly translate into VM_FAULTPOPULATE and wouldn't be tied to the
locked semantic. We already have VM_LOCKED for that. The direct effect
of the flag would be to prevent population other than by the direct
page fault - including any speculative actions like fault around or
read-ahead.


I like the ability to control other speculative population, but I am not
sure about overloading it with the VM_LOCKONFAULT case.  Here is my
concern.  If we are using VM_FAULTPOPULATE | VM_LOCKED to denote
LOCKONFAULT, how can we tell the difference between someone that wants
to avoid read-ahead and wants to use mlock()?  This might lead to some
interesting states with mlock() and munlock() that take flags.  For
instance, using VM_LOCKONFAULT mlock(MLOCK_ONFAULT) followed by
munlock(MLOCK_LOCKED) leaves the VMAs in the same state with
VM_LOCKONFAULT set.  If we use VM_FAULTPOPULATE, the same pair of calls
would clear VM_LOCKED, but leave VM_FAULTPOPULATE.  It may not matter in
the end, but I am concerned about the subtleties here.
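
The two outcomes described above can be modelled as a toy state machine. This is a sketch only: the bit values are illustrative rather than the kernel's, and the way munlock() clears bits in the VM_FAULTPOPULATE design is an assumption made to match the description in this mail.

```c
#include <assert.h>

/* Illustrative bit values only; these are not the kernel's definitions. */
#define VM_LOCKED        0x1u
#define VM_LOCKONFAULT   0x2u
#define VM_FAULTPOPULATE 0x4u

#define MLOCK_LOCKED     0x1u
#define MLOCK_ONFAULT    0x2u

/* Design A: a dedicated VM_LOCKONFAULT bit, one VMA bit per MLOCK_ flag. */
static unsigned int bits_a(unsigned int mlock_flags)
{
    unsigned int bits = 0;

    assert((mlock_flags & ~(MLOCK_LOCKED | MLOCK_ONFAULT)) == 0);
    if (mlock_flags & MLOCK_LOCKED)
        bits |= VM_LOCKED;
    if (mlock_flags & MLOCK_ONFAULT)
        bits |= VM_LOCKONFAULT;
    return bits;
}

static unsigned int mlock_a(unsigned int vm_flags, unsigned int mlock_flags)
{
    return vm_flags | bits_a(mlock_flags);
}

static unsigned int munlock_a(unsigned int vm_flags, unsigned int mlock_flags)
{
    return vm_flags & ~bits_a(mlock_flags);
}

/* Design B: lock-on-fault is expressed as VM_LOCKED | VM_FAULTPOPULATE. */
static unsigned int mlock_b(unsigned int vm_flags, unsigned int mlock_flags)
{
    assert((mlock_flags & ~(MLOCK_LOCKED | MLOCK_ONFAULT)) == 0);
    if (mlock_flags & MLOCK_LOCKED)
        vm_flags |= VM_LOCKED;
    if (mlock_flags & MLOCK_ONFAULT)
        vm_flags |= VM_LOCKED | VM_FAULTPOPULATE;
    return vm_flags;
}

static unsigned int munlock_b(unsigned int vm_flags, unsigned int mlock_flags)
{
    assert((mlock_flags & ~(MLOCK_LOCKED | MLOCK_ONFAULT)) == 0);
    if (mlock_flags & MLOCK_LOCKED)
        vm_flags &= ~VM_LOCKED;
    if (mlock_flags & MLOCK_ONFAULT)
        vm_flags &= ~(VM_LOCKED | VM_FAULTPOPULATE);
    return vm_flags;
}
```

In this model, mlock(MLOCK_ONFAULT) followed by munlock(MLOCK_LOCKED) leaves VM_LOCKONFAULT set under design A, and leaves only VM_FAULTPOPULATE set under design B.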


Right.




If you prefer that MAP_LOCKED |
MAP_FAULTPOPULATE means that VM_LOCKONFAULT is set, I am fine with that
instead of introducing MAP_LOCKONFAULT.  I went with the new flag
because to date, we have a one to one mapping of MAP_* to VM_* flags.




If this is the preferred path for mmap(), I am fine with that.



However,
I would like to see the new system calls that Andrew mentioned (and that
I am 

Re: [RESEND PATCH V2 0/3] Allow user to request memory to be locked on page fault

2015-06-23 Thread Vlastimil Babka

On 06/15/2015 04:43 PM, Eric B Munson wrote:

Note that the semantic of MAP_LOCKED can be subtly surprising:

mlock(2) fails if the memory range cannot get populated to guarantee
that no future major faults will happen on the range.
mmap(MAP_LOCKED) on the other hand silently succeeds even if the
range was populated only partially.

( from http://marc.info/?l=linux-mm&m=143152790412727&w=2 )

So MAP_LOCKED can silently behave like MAP_LOCKONFAULT. While
MAP_LOCKONFAULT doesn't suffer from such problem, I wonder if that's
sufficient reason not to extend mmap by new mlock() flags that can
be instead applied to the VMA after mmapping, using the proposed
mlock2() with flags. So I think instead we could deprecate
MAP_LOCKED more prominently. I doubt the overhead of calling the
extra syscall matters here?


We could talk about retiring the MAP_LOCKED flag but I suspect that
would get significantly more pushback than adding a new mmap flag.


Oh no we can't retire as in remove the flag, ever. Just not continue 
the way of mmap() flags related to mlock().



Likely that the overhead does not matter in most cases, but presumably
there are cases where it does (as we have a MAP_LOCKED flag today).
Even with the proposed new system calls I think we should have the
MAP_LOCKONFAULT for parity with MAP_LOCKED.


I'm not convinced, but it's not a major issue.




- mlock() takes a `flags' argument.  Presently that's
   MLOCK_LOCKED|MLOCK_LOCKONFAULT.

- munlock() takes a `flags' argument.  MLOCK_LOCKED|MLOCK_LOCKONFAULT
   to specify which flags are being cleared.

- mlockall() and munlockall() ditto.


IOW, LOCKED and LOCKEDONFAULT are treated identically and independently.

Now, that's how we would have designed all this on day one.  And I
think we can do this now, by adding new mlock2() and munlock2()
syscalls.  And we may as well deprecate the old mlock() and munlock(),
not that this matters much.

*should* we do this?  I'm thinking yes - it's all pretty simple
boilerplate and wrappers and such, and it gets the interface correct,
and extensible.


If the new LOCKONFAULT functionality is indeed desired (I still
haven't decided myself) then I agree that would be the cleanest way.


Do you disagree with the use cases I have listed or do you think there
is a better way of addressing those cases?


I'm somewhat sceptical about the security one. Are security-sensitive 
buffers really large enough to matter? The performance one is more convincing and 
I don't see a better way, so OK.







What do others think?



[PATCH 05/13] perf pmu: Use __weak definition from linux/compiler.h

2015-06-23 Thread Arnaldo Carvalho de Melo
From: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com

Jiri Olsa pointed out that linux/compiler.h defines the attribute
'__weak'. We might as well use that.
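
For readers unfamiliar with the attribute, here is a minimal user-space sketch of a weak default definition. It assumes a GCC-compatible compiler and defines __weak locally instead of including linux/compiler.h. The struct and function names are stand-ins, not the perf ones.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the definition provided by linux/compiler.h. */
#ifndef __weak
#define __weak __attribute__((weak))
#endif

struct attr_stub {
    int type;
};

/*
 * Weak default: used only if no other object file provides a strong
 * definition of the same symbol (for example, an arch-specific one).
 */
struct attr_stub * __weak get_default_config(void)
{
    return NULL;
}

/* Caller side: works whether or not an override was linked in. */
static int default_config_type(void)
{
    const struct attr_stub *attr = get_default_config();

    return attr ? attr->type : -1;
}
```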

Signed-off-by: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Acked-by: Jiri Olsa jo...@redhat.com
Cc: Andi Kleen a...@linux.intel.com
Cc: Madhavan Srinivasan ma...@linux.vnet.ibm.com
Cc: Michael Ellerman m...@ellerman.id.au
Cc: Namhyung Kim namhy...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/1433921123-25327-4-git-send-email-suka...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com
---
 tools/perf/util/pmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 0fcc624eb767..c6b16b1db6d0 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1,4 +1,5 @@
 #include <linux/list.h>
+#include <linux/compiler.h>
 #include <sys/types.h>
 #include <unistd.h>
 #include <stdio.h>
@@ -436,7 +437,7 @@ static struct cpu_map *pmu_cpumask(const char *name)
return cpus;
 }
 
-struct perf_event_attr *__attribute__((weak))
+struct perf_event_attr * __weak
 perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
 {
return NULL;
-- 
2.1.0


[PATCH 07/13] perf tools: Allow events with dot

2015-06-23 Thread Arnaldo Carvalho de Melo
From: Andi Kleen a...@linux.intel.com

The Intel events use a dot to separate event name and unit mask.  Allow
dot in names in the scanner, and remove special handling of dot as EOF.
Also remove the hack in jevents to replace dot with underscore. This way
dotted events can be specified directly by the user.

I'm not fully sure this change to the scanner is correct (what was the
dot special case good for?), but I haven't found anything that breaks
with it so far at least.
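
The effect of the character-class change can be sketched outside of flex. The helper below is a whole-string approximation of the 'name' pattern before and after the change (a lexer matches the longest prefix instead). matches_name() is a hypothetical function, and the event name in the usage note is just an example of a dotted name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* First character: [a-zA-Z_*?] in both the old and the new pattern. */
static bool is_name_start(char c)
{
    return isalpha((unsigned char)c) || c == '_' || c == '*' || c == '?';
}

/* Following characters: [a-zA-Z0-9_*?], plus '.' when allow_dot is set. */
static bool is_name_rest(char c, bool allow_dot)
{
    return isalnum((unsigned char)c) || c == '_' || c == '*' || c == '?' ||
           (allow_dot && c == '.');
}

/* True when the whole string s fits the name pattern. */
static bool matches_name(const char *s, bool allow_dot)
{
    assert(s != NULL);
    if (!is_name_start(s[0]))
        return false;
    for (size_t i = 1; s[i] != '\0'; i++) {
        if (!is_name_rest(s[i], allow_dot))
            return false;
    }
    return true;
}
```

With allow_dot set, a name such as "inst_retired.any" is accepted as one token; without it, the match stops being valid at the dot.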

Signed-off-by: Andi Kleen a...@linux.intel.com
Acked-by: Jiri Olsa jo...@redhat.com
Acked-by: Namhyung Kim namhy...@kernel.org
Cc: Madhavan Srinivasan ma...@linux.vnet.ibm.com
Cc: Michael Ellerman m...@ellerman.id.au
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/1433921123-25327-8-git-send-email-suka...@linux.vnet.ibm.com
Signed-off-by: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com
---
 tools/perf/util/parse-events.l | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 09e738fe9ea2..13cef3c65565 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -119,8 +119,8 @@ event   [^,{}/]+
 num_dec   [0-9]+
 num_hex   0x[a-fA-F0-9]+
 num_raw_hex   [a-fA-F0-9]+
-name   [a-zA-Z_*?][a-zA-Z0-9_*?]*
-name_minus [a-zA-Z_*?][a-zA-Z0-9\-_*?]*
+name   [a-zA-Z_*?][a-zA-Z0-9_*?.]*
+name_minus [a-zA-Z_*?][a-zA-Z0-9\-_*?.]*
 /* If you add a modifier you need to update check_modifier() */
 modifier_event [ukhpGHSDI]+
 modifier_bp [rwx]{1,3}
@@ -165,7 +165,6 @@ modifier_bp [rwx]{1,3}
return PE_EVENT_NAME;
}
 
-.  |
 <<EOF>> {
BEGIN(INITIAL);
REWIND(0);
-- 
2.1.0


[GIT PULL 00/13] perf/core improvements and fixes

2015-06-23 Thread Arnaldo Carvalho de Melo
Hi Ingo,

Please consider pulling,

- Arnaldo

The following changes since commit a9a3cd900fbbcbf837d65653105e7bfc583ced09:

  Merge tag 'perf-core-for-mingo' of 
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core 
(2015-06-20 01:11:11 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git 
tags/perf-core-for-mingo

for you to fetch changes up to 83b2ea257eb1d43e52f76d756722aeb899a2852c:

  perf tools: Allow auxtrace data alignment (2015-06-23 18:28:37 -0300)


perf/core improvements and fixes:

User visible:

- Move toggling event logic from 'perf top' and into hists browser, allowing
  freeze/unfreeze with event lists with more than one entry (Namhyung Kim)

- Add missing newlines when dumping PERF_RECORD_FINISHED_ROUND and
  showing the Aggregated stats in 'perf report -D' (Adrian Hunter)

Infrastructure:

- Allow auxtrace data alignment (Adrian Hunter)

- Allow events with dot (Andi Kleen)

- Fix failure to 'perf probe' events on arm (He Kuang)

- Add testing for Makefile.perf (Jiri Olsa)

- Add test for make install with prefix (Jiri Olsa)

- Fix single target build dependency check (Jiri Olsa)

- Access thread_map entries via accessors, prep patch to hold more info per
  entry, for ongoing 'perf stat --per-thread' work (Jiri Olsa)

- Use __weak definition from compiler.h (Sukadev Bhattiprolu)

- Split perf_pmu__new_alias() (Sukadev Bhattiprolu)

Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com


Adrian Hunter (3):
  perf session: Print a newline when dumping PERF_RECORD_FINISHED_ROUND
  perf tools: Print a newline before dumping Aggregated stats
  perf tools: Allow auxtrace data alignment

Andi Kleen (1):
  perf tools: Allow events with dot

He Kuang (1):
  perf probe: Fix failure to probe events on arm

Jiri Olsa (5):
  perf tests: Add testing for Makefile.perf
  perf tests: Add test for make install with prefix
  perf build: Fix single target build dependency check
  perf thread_map: Don't access the array entries directly
  perf thread_map: Change map entries into a struct

Namhyung Kim (1):
  perf top: Move toggling event logic into hists browser

Sukadev Bhattiprolu (2):
  perf pmu: Use __weak definition from linux/compiler.h
  perf pmu: Split perf_pmu__new_alias()

 tools/perf/Makefile |  4 +--
 tools/perf/builtin-top.c| 24 ++-
 tools/perf/builtin-trace.c  |  4 +--
 tools/perf/tests/make   | 31 ++--
 tools/perf/tests/openat-syscall-tp-fields.c |  2 +-
 tools/perf/ui/browsers/hists.c  | 19 ++--
 tools/perf/util/auxtrace.c  | 11 +--
 tools/perf/util/auxtrace.h  |  1 +
 tools/perf/util/event.c |  6 ++--
 tools/perf/util/evlist.c|  4 +--
 tools/perf/util/evsel.c |  2 +-
 tools/perf/util/parse-events.l  |  5 ++--
 tools/perf/util/pmu.c   | 45 +++--
 tools/perf/util/probe-event.c   |  6 +++-
 tools/perf/util/session.c   |  4 ++-
 tools/perf/util/thread_map.c| 24 ---
 tools/perf/util/thread_map.h| 16 +-
 17 files changed, 136 insertions(+), 72 deletions(-)

[PATCH 06/13] perf pmu: Split perf_pmu__new_alias()

2015-06-23 Thread Arnaldo Carvalho de Melo
From: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com

Separate the event parsing code in perf_pmu__new_alias() out into a
separate function __perf_pmu__new_alias() so that code can be called
independently.

This is based on an earlier patch from Andi Kleen.
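
The shape of the split, a string-based core plus a thin file-reading wrapper, can be sketched in user space as follows. This is an illustration under simplifying assumptions: struct alias and the function names are stand-ins, and the wrapper reads at most sizeof(buf) - 1 bytes so the terminator always fits.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the real alias structure. */
struct alias {
    char *name;
    char *val;
};

/* Core: build an alias from a value string that is already in memory. */
static int alias_from_string(struct alias *out, const char *name,
                             const char *val)
{
    assert(out != NULL && name != NULL && val != NULL);

    out->name = strdup(name);
    out->val = strdup(val);
    if (!out->name || !out->val) {
        free(out->name);
        free(out->val);
        return -ENOMEM;
    }
    return 0;
}

/* Wrapper: read the value from a file, then hand off to the core. */
static int alias_from_file(struct alias *out, const char *name, FILE *file)
{
    char buf[256];
    size_t n = fread(buf, 1, sizeof(buf) - 1, file);

    if (n == 0)
        return -EINVAL;
    buf[n] = '\0';

    return alias_from_string(out, name, buf);
}

static void alias_free(struct alias *a)
{
    free(a->name);
    free(a->val);
}
```

Callers that already hold the value as a string use the core directly, while the file-based path keeps its existing signature.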

Signed-off-by: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Acked-by: Jiri Olsa jo...@redhat.com
Cc: Andi Kleen a...@linux.intel.com
Cc: Madhavan Srinivasan ma...@linux.vnet.ibm.com
Cc: Michael Ellerman m...@ellerman.id.au
Cc: Namhyung Kim namhy...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/1433921123-25327-5-git-send-email-suka...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com
---
 tools/perf/util/pmu.c | 42 +++---
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index c6b16b1db6d0..7bcb8c315615 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -206,17 +206,12 @@ static int perf_pmu__parse_snapshot(struct perf_pmu_alias 
*alias,
return 0;
 }
 
-static int perf_pmu__new_alias(struct list_head *list, char *dir, char *name, 
FILE *file)
+static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
+char *desc __maybe_unused, char *val)
 {
struct perf_pmu_alias *alias;
-   char buf[256];
int ret;
 
-   ret = fread(buf, 1, sizeof(buf), file);
-   if (ret == 0)
-   return -EINVAL;
-   buf[ret] = 0;
-
alias = malloc(sizeof(*alias));
if (!alias)
return -ENOMEM;
@@ -226,26 +221,43 @@ static int perf_pmu__new_alias(struct list_head *list, 
char *dir, char *name, FI
alias->unit[0] = '\0';
alias->per_pkg = false;
 
-   ret = parse_events_terms(&alias->terms, buf);
+   ret = parse_events_terms(&alias->terms, val);
if (ret) {
+   pr_err("Cannot parse alias %s: %d\n", val, ret);
free(alias);
return ret;
}
 
alias->name = strdup(name);
-   /*
-* load unit name and scale if available
-*/
-   perf_pmu__parse_unit(alias, dir, name);
-   perf_pmu__parse_scale(alias, dir, name);
-   perf_pmu__parse_per_pkg(alias, dir, name);
-   perf_pmu__parse_snapshot(alias, dir, name);
+   if (dir) {
+   /*
+* load unit name and scale if available
+*/
+   perf_pmu__parse_unit(alias, dir, name);
+   perf_pmu__parse_scale(alias, dir, name);
+   perf_pmu__parse_per_pkg(alias, dir, name);
+   perf_pmu__parse_snapshot(alias, dir, name);
+   }
 
list_add_tail(&alias->list, list);
 
return 0;
 }
 
+static int perf_pmu__new_alias(struct list_head *list, char *dir, char *name, 
FILE *file)
+{
+   char buf[256];
+   int ret;
+
+   ret = fread(buf, 1, sizeof(buf), file);
+   if (ret == 0)
+   return -EINVAL;
+
+   buf[ret] = 0;
+
+   return __perf_pmu__new_alias(list, dir, name, NULL, buf);
+}
+
 static inline bool pmu_alias_info_file(char *name)
 {
size_t len;
-- 
2.1.0


[PATCH 1/3] powerpc/iommu: Remove dma_data union

2015-06-23 Thread Benjamin Herrenschmidt
To support hybrid DMA ops in a subsequent patch, we will need both
a direct DMA offset and an iommu pointer. Those are currently exclusive
(a union), so change them to be separate fields.

While there, also type iommu_table_base properly and make it exist only
on CONFIG_PPC64 since it's not referenced on 32-bit at all.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 arch/powerpc/include/asm/device.h  | 15 +--
 arch/powerpc/include/asm/dma-mapping.h |  4 ++--
 arch/powerpc/include/asm/iommu.h   | 31 +--
 3 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/device.h 
b/arch/powerpc/include/asm/device.h
index e9bdda8..406c2b1 100644
--- a/arch/powerpc/include/asm/device.h
+++ b/arch/powerpc/include/asm/device.h
@@ -10,6 +10,7 @@ struct dma_map_ops;
 struct device_node;
 #ifdef CONFIG_PPC64
 struct pci_dn;
+struct iommu_table;
 #endif
 
 /*
@@ -23,13 +24,15 @@ struct dev_archdata {
struct dma_map_ops  *dma_ops;
 
/*
-* When an iommu is in use, dma_data is used as a ptr to the base of the
-* iommu_table.  Otherwise, it is a simple numerical offset.
+* These two used to be a union. However, with the hybrid ops we need
+* both so here we store both a DMA offset for direct mappings and
+* an iommu_table for remapped DMA.
 */
-   union {
-   dma_addr_t  dma_offset;
-   void*iommu_table_base;
-   } dma_data;
+   dma_addr_t  dma_offset;
+
+#ifdef CONFIG_PPC64
+   struct iommu_table  *iommu_table_base;
+#endif
 
 #ifdef CONFIG_IOMMU_API
void*iommu_domain;
diff --git a/arch/powerpc/include/asm/dma-mapping.h 
b/arch/powerpc/include/asm/dma-mapping.h
index 9103687..9cbbc9e 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -106,7 +106,7 @@ static inline void set_dma_ops(struct device *dev, struct 
dma_map_ops *ops)
 static inline dma_addr_t get_dma_offset(struct device *dev)
 {
if (dev)
-   return dev->archdata.dma_data.dma_offset;
+   return dev->archdata.dma_offset;
 
return PCI_DRAM_OFFSET;
 }
@@ -114,7 +114,7 @@ static inline dma_addr_t get_dma_offset(struct device *dev)
 static inline void set_dma_offset(struct device *dev, dma_addr_t off)
 {
if (dev)
-   dev->archdata.dma_data.dma_offset = off;
+   dev->archdata.dma_offset = off;
 }
 
 /* this will be removed soon */
diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index ca18cff..7b87bab 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -2,17 +2,17 @@
  * Copyright (C) 2001 Mike Corrigan & Dave Engebretsen, IBM Corporation
  * Rewrite, cleanup:
  * Copyright (C) 2004 Olof Johansson o...@lixom.net, IBM Corporation
- * 
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
  * (at your option) any later version.
- * 
+ *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
- * 
+ *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
@@ -131,16 +131,21 @@ int get_iommu_order(unsigned long size, struct 
iommu_table *tbl)
 
 struct scatterlist;
 
-static inline void set_iommu_table_base(struct device *dev, void *base)
+#ifdef CONFIG_PPC64
+
+static inline void set_iommu_table_base(struct device *dev,
+   struct iommu_table *base)
 {
-   dev->archdata.dma_data.iommu_table_base = base;
+   dev->archdata.iommu_table_base = base;
 }
 
 static inline void *get_iommu_table_base(struct device *dev)
 {
-   return dev->archdata.dma_data.iommu_table_base;
+   return dev->archdata.iommu_table_base;
 }
 
+extern int dma_iommu_dma_supported(struct device *dev, u64 mask);
+
 /* Frees table for an individual device node */
 extern void iommu_free_table(struct iommu_table *tbl, const char *node_name);
 
@@ -225,6 +230,20 @@ static inline int __init tce_iommu_bus_notifier_init(void)
 }
 #endif /* !CONFIG_IOMMU_API */
 
+#else
+
+static inline void *get_iommu_table_base(struct device *dev)
+{
+   return NULL;
+}
+
+static inline int dma_iommu_dma_supported(struct device *dev, u64 mask)
+{
+   return 0;
+}
+
+#endif /* CONFIG_PPC64 */
+
 extern int ppc_iommu_map_sg(struct device *dev, struct iommu_table *tbl,
struct scatterlist *sglist, int 

[PATCH 3/3] powerpc/iommu: Support hybrid iommu/direct DMA ops for coherent_mask < dma_mask

2015-06-23 Thread Benjamin Herrenschmidt
This patch adds the ability for the DMA direct ops to fall back to the IOMMU
ops for coherent alloc/free if the coherent mask of the device isn't
suitable for accessing the direct DMA space and the device also happens
to have an active IOMMU table.
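
The decision described above can be sketched as a pure function. This is a user-space illustration, not the patch's code: the limit computation mirrors the "offset plus end of DRAM minus one" form used by the direct ops, and failing the allocation when neither path applies is an assumption of this sketch. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum coherent_path { USE_DIRECT, USE_IOMMU, FAIL_ALLOC };

/* Highest bus address the direct mapping can produce. */
static uint64_t direct_limit(uint64_t dma_offset, uint64_t end_of_dram)
{
    assert(end_of_dram > 0);
    return dma_offset + (end_of_dram - 1);
}

/*
 * Pick the path for a coherent allocation:
 *  - direct ops if the coherent mask covers the whole direct window,
 *  - otherwise the IOMMU ops if the device has an IOMMU table,
 *  - otherwise fail (assumption of this sketch).
 */
static enum coherent_path pick_coherent_path(uint64_t coherent_mask,
                                             uint64_t limit,
                                             bool has_iommu_table)
{
    if (coherent_mask >= limit)
        return USE_DIRECT;
    if (has_iommu_table)
        return USE_IOMMU;
    return FAIL_ALLOC;
}
```

For example, a device with a 32-bit coherent mask on a machine with 16 GB of DRAM would take the IOMMU path if it has a table, while a device with a 64-bit coherent mask stays on the direct path.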

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 arch/powerpc/Kconfig   |   4 ++
 arch/powerpc/include/asm/dma-mapping.h |  10 +--
 arch/powerpc/kernel/dma-iommu.c|   2 +-
 arch/powerpc/kernel/dma-swiotlb.c  |   4 +-
 arch/powerpc/kernel/dma.c  | 111 +++--
 5 files changed, 105 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 190cc48..72302fa 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -82,6 +82,9 @@ config GENERIC_HWEIGHT
bool
default y
 
+config ARCH_HAS_DMA_SET_COHERENT_MASK
+bool
+
 config PPC
bool
default y
@@ -153,6 +156,7 @@ config PPC
select NO_BOOTMEM
select HAVE_GENERIC_RCU_GUP
select HAVE_PERF_EVENTS_NMI if PPC64
+   select ARCH_HAS_DMA_SET_COHERENT_MASK
 
 config GENERIC_CSUM
def_bool CPU_LITTLE_ENDIAN
diff --git a/arch/powerpc/include/asm/dma-mapping.h 
b/arch/powerpc/include/asm/dma-mapping.h
index 9cbbc9e..710f60e 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -21,12 +21,12 @@
 #define DMA_ERROR_CODE (~(dma_addr_t)0x0)
 
 /* Some dma direct funcs must be visible for use in other dma_ops */
-extern void *dma_direct_alloc_coherent(struct device *dev, size_t size,
-  dma_addr_t *dma_handle, gfp_t flag,
+extern void *__dma_direct_alloc_coherent(struct device *dev, size_t size,
+dma_addr_t *dma_handle, gfp_t flag,
+struct dma_attrs *attrs);
+extern void __dma_direct_free_coherent(struct device *dev, size_t size,
+  void *vaddr, dma_addr_t dma_handle,
   struct dma_attrs *attrs);
-extern void dma_direct_free_coherent(struct device *dev, size_t size,
-void *vaddr, dma_addr_t dma_handle,
-struct dma_attrs *attrs);
 extern int dma_direct_mmap_coherent(struct device *dev,
struct vm_area_struct *vma,
void *cpu_addr, dma_addr_t handle,
diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index 4c68bfe..41a7d9d 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -73,7 +73,7 @@ static void dma_iommu_unmap_sg(struct device *dev, struct 
scatterlist *sglist,
 }
 
 /* We support DMA to/from any memory page via the iommu */
-static int dma_iommu_dma_supported(struct device *dev, u64 mask)
+int dma_iommu_dma_supported(struct device *dev, u64 mask)
 {
struct iommu_table *tbl = get_iommu_table_base(dev);
 
diff --git a/arch/powerpc/kernel/dma-swiotlb.c 
b/arch/powerpc/kernel/dma-swiotlb.c
index 6e8d764..c6689f6 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -47,8 +47,8 @@ static u64 swiotlb_powerpc_get_required(struct device *dev)
  * for everything else.
  */
 struct dma_map_ops swiotlb_dma_ops = {
-   .alloc = dma_direct_alloc_coherent,
-   .free = dma_direct_free_coherent,
+   .alloc = __dma_direct_alloc_coherent,
+   .free = __dma_direct_free_coherent,
.mmap = dma_direct_mmap_coherent,
.map_sg = swiotlb_map_sg_attrs,
.unmap_sg = swiotlb_unmap_sg_attrs,
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 35e4dcc..1558f81 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -16,6 +16,7 @@
 #include <asm/bug.h>
 #include <asm/machdep.h>
 #include <asm/swiotlb.h>
+#include <asm/iommu.h>
 
 /*
  * Generic direct DMA implementation
@@ -39,9 +40,31 @@ static u64 __maybe_unused get_pfn_limit(struct device *dev)
return pfn;
 }
 
-void *dma_direct_alloc_coherent(struct device *dev, size_t size,
-   dma_addr_t *dma_handle, gfp_t flag,
-   struct dma_attrs *attrs)
+static int dma_direct_dma_supported(struct device *dev, u64 mask)
+{
+#ifdef CONFIG_PPC64
+   u64 limit = get_dma_offset(dev) + (memblock_end_of_DRAM() - 1);
+
+   /* Limit fits in the mask, we are good */
+   if (mask >= limit)
+   return 1;
+
+#ifdef CONFIG_FSL_SOC
+   /* Freescale gets another chance via ZONE_DMA/ZONE_DMA32, however
+* that will have to be refined if/when they support iommus
+*/
+   return 1;
+#endif
+   /* Sorry ... */
+   return 0;
+#else
+   return 1;
+#endif
+}
+
+void *__dma_direct_alloc_coherent(struct device *dev, size_t size,
+ dma_addr_t 

[PATCH 2/3] powerpc/iommu: Cleanup setting of DMA base/offset

2015-06-23 Thread Benjamin Herrenschmidt
Now that the table and the offset can co-exist, we no longer need
to flip/flop, we can just establish both once at boot time.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 arch/powerpc/platforms/powernv/pci-ioda.c |  3 +--
 arch/powerpc/platforms/pseries/iommu.c|  3 +--
 arch/powerpc/sysdev/dart_iommu.c  | 16 +++-
 3 files changed, 5 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 5738d31..2c286b57 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1620,11 +1620,9 @@ static int pnv_pci_ioda_dma_set_mask(struct pci_dev 
*pdev, u64 dma_mask)
if (bypass) {
dev_info(&pdev->dev, "Using 64-bit DMA iommu bypass\n");
set_dma_ops(&pdev->dev, &dma_direct_ops);
-   set_dma_offset(&pdev->dev, pe->tce_bypass_base);
} else {
dev_info(&pdev->dev, "Using 32-bit DMA via iommu\n");
set_dma_ops(&pdev->dev, &dma_iommu_ops);
-   set_iommu_table_base(&pdev->dev, pe->table_group.tables[0]);
}
*pdev->dev.dma_mask = dma_mask;
return 0;
@@ -1659,6 +1657,7 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe,
 
list_for_each_entry(dev, &bus->devices, bus_list) {
set_iommu_table_base(&dev->dev, pe->table_group.tables[0]);
+   set_dma_offset(&dev->dev, pe->tce_bypass_base);
iommu_add_device(&dev->dev);
 
if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
diff --git a/arch/powerpc/platforms/pseries/iommu.c 
b/arch/powerpc/platforms/pseries/iommu.c
index 10510de..0946b98 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1253,11 +1253,10 @@ static int dma_set_mask_pSeriesLP(struct device *dev, 
u64 dma_mask)
}
}
 
-   /* fall back on iommu ops, restore table pointer with ops */
+   /* fall back on iommu ops */
if (!ddw_enabled && get_dma_ops(dev) != &dma_iommu_ops) {
dev_info(dev, "Restoring 32-bit DMA via iommu\n");
set_dma_ops(dev, &dma_iommu_ops);
-   pci_dma_dev_setup_pSeriesLP(pdev);
}
 
 check_mask:
diff --git a/arch/powerpc/sysdev/dart_iommu.c b/arch/powerpc/sysdev/dart_iommu.c
index 90bcdfe..b734863 100644
--- a/arch/powerpc/sysdev/dart_iommu.c
+++ b/arch/powerpc/sysdev/dart_iommu.c
@@ -313,20 +313,11 @@ static void iommu_table_dart_setup(void)
set_bit(iommu_table_dart.it_size - 1, iommu_table_dart.it_map);
 }
 
-static void dma_dev_setup_dart(struct device *dev)
-{
-   /* We only have one iommu table on the mac for now, which makes
-* things simple. Setup all PCI devices to point to this table
-*/
-   if (get_dma_ops(dev) == &dma_direct_ops)
-   set_dma_offset(dev, DART_U4_BYPASS_BASE);
-   else
-   set_iommu_table_base(dev, &iommu_table_dart);
-}
-
 static void pci_dma_dev_setup_dart(struct pci_dev *dev)
 {
-   dma_dev_setup_dart(&dev->dev);
+   if (dart_is_u4)
+   set_dma_offset(&dev->dev, DART_U4_BYPASS_BASE);
+   set_iommu_table_base(&dev->dev, &iommu_table_dart);
 }
 
 static void pci_dma_bus_setup_dart(struct pci_bus *bus)
@@ -370,7 +361,6 @@ static int dart_dma_set_mask(struct device *dev, u64 
dma_mask)
dev_info(dev, "Using 32-bit DMA via iommu\n");
set_dma_ops(dev, &dma_iommu_ops);
}
-   dma_dev_setup_dart(dev);
 
*dev->dma_mask = dma_mask;
return 0;


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH SLOF 4/5] disk-label: add support for booting from GPT FAT partition

2015-06-23 Thread Nikunj A Dadhania
Segher Boessenkool seg...@kernel.crashing.org writes:

 On Tue, Jun 23, 2015 at 09:34:44AM +0200, Thomas Huth wrote:
  +: load-from-gpt-partition ( [ addr ] -- size | TRUE )
 
 What do you mean with addr in square brackets? Is it optional?

 And size | TRUE?  The code even returns false instead, which
 usually is a valid size (0).  Just always return a flag?  Or maybe
 you mean something like ( -- false | size true ) .  Not going to
 read the code, I cannot keep track of the stack, bringing us to...


 Hmm, I wonder whether we need a proper coding conventions spec for
 writing Forth code ... (at least about the indentation depths ...) ;-)

 Write readable code.  That means in part, do not write long definitions
 (longer than a few lines).

I ended up here by combining two similar-looking words, as they were
doing a lot of the same work.

But I guess it ended up being pretty big. I will break it up into
smaller units and resend this patch.


 There, all coding conventions you'll ever need :-)


 Almost all short definitions (with good names!) are easily readable
 (with a little effort if the subject matter is tricky).  No longer
 definitions are ever readable (well, there are exceptions; not many).

 Don't get hung up on how many spaces should I indent...  Since your
 words are short, you won't have more than two levels of indent anyway :-)

 Adding extra spacing to group things is also very helpful.

 Minor things...  Most words want a stack comment.  If you need stack
 comments inside a definition, it is too complex.  If there is any
 significant amount of stack juggling, the word is too complex.  If
 the word would be too complex, you need to factor it.  If you cannot
 easily split off factors, your solution is too complex.  If it is
 hard to think of good names for the factors, that is simply because
 naming things is the hardest part of programming (but see also the
 previous point).

 You also want short words that do one little thing because you _do_
 test your code.

Regards
Nikunj


Re: [PATCH] powerpc: Set the correct kernel taint on machine check errors.

2015-06-23 Thread Daniel Axtens
On Wed, 2015-06-24 at 15:00 +1000, Michael Ellerman wrote:
 On Mon, 2015-06-15 at 13:25 +1000, Daniel Axtens wrote:
  This means the 'M' flag will work properly when the kernel prints a 
  backtrace.
  
  Signed-off-by: Daniel Axtens d...@axtens.net
  ---
   arch/powerpc/kernel/traps.c | 2 ++
   1 file changed, 2 insertions(+)
  
  diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
  index 6530f1b..37de90f 100644
  --- a/arch/powerpc/kernel/traps.c
  +++ b/arch/powerpc/kernel/traps.c
  @@ -297,6 +297,8 @@ long machine_check_early(struct pt_regs *regs)
   
  __this_cpu_inc(irq_stat.mce_exceptions);
   
  +   add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
 
 I'm not sure about the lockdep bit.
 
 I guess it's safer to just declare it fubar.
 
I'm also not sure; I was matching x86 behaviour.

 Does this fix a bug, or just nice to have?
 
If you consider having the taint bits set incorrectly to be a bug, then
it fixes a bug. Otherwise, it's just nice debug info to have: when we
get a traceback with M set that isn't obviously related to a MCE we are
forewarned.
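
To make the 'M' flag discussion concrete, here is a minimal userspace sketch of how a taint bit can be recorded and later turned into the letter shown in a backtrace. It is a simplified model, not the kernel's implementation: every `demo_` name is hypothetical, and the bit number 4 for the machine check taint is an assumption to check against the kernel headers.

```c
#include <stdio.h>

/* Hypothetical stand-in for TAINT_MACHINE_CHECK (bit number assumed). */
enum { DEMO_TAINT_MACHINE_CHECK = 4 };

/* One bit per taint reason, as in a simple bitmask model. */
unsigned long demo_tainted_mask;

/* Record a taint reason; it is never cleared afterwards. */
void demo_add_taint(unsigned int flag)
{
    demo_tainted_mask |= 1UL << flag;
}

/* Return the letter for a flag if it is set, or a blank otherwise. */
char demo_taint_char(unsigned int flag, char letter)
{
    return (demo_tainted_mask & (1UL << flag)) ? letter : ' ';
}

/* Print the single flag this sketch knows about. */
void demo_print_tainted(void)
{
    printf("Tainted: %c\n",
           demo_taint_char(DEMO_TAINT_MACHINE_CHECK, 'M'));
}
```

Calling demo_add_taint(DEMO_TAINT_MACHINE_CHECK) from the machine check path is the analogue of the add_taint() call in the patch: any later backtrace then carries the 'M'.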

 cheers
 
 

-- 
Regards,
Daniel



Re: [PATCH SLOF 5/5] disk-label: make gpt detection code more robust

2015-06-23 Thread Nikunj A Dadhania
Thomas Huth th...@redhat.com writes:

 On Mon, 22 Jun 2015 13:29:47 +0530
 Nikunj A Dadhania nik...@linux.vnet.ibm.com wrote:

 * Check for Protective MBR Magic
 * Check for valid GPT Signature
 * Boundary check for allocated block size before reading into the
   buffer
 
 Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
 ---
  slof/fs/packages/disk-label.fs | 21 +
  1 file changed, 17 insertions(+), 4 deletions(-)
 
 diff --git a/slof/fs/packages/disk-label.fs b/slof/fs/packages/disk-label.fs
 index 821e959..d9c3a8d 100644
 --- a/slof/fs/packages/disk-label.fs
 +++ b/slof/fs/packages/disk-label.fs
 @@ -20,6 +20,7 @@ false VALUE debug-disk-label?
  \ If we ever want to put a large kernel with initramfs from a PREP partition
  \ we might need to increase this value. The default value is 65536 blocks 
 (32MB)
  d# 65536 value max-prep-partition-blocks
 +d# 4096 value block-array-size
  
  s disk-label device-name
  
 @@ -152,8 +153,8 @@ CONSTANT /gpt-part-entry
  : init-block ( -- )
 s block-size ['] $call-parent CATCH IF ABORT parent has no 
 block-size. THEN
 to block-size
 -   d# 4096 alloc-mem
 -   dup d# 4096 erase
 +   block-array-size alloc-mem
 +   dup block-array-size erase
 to block
 debug-disk-label? IF
. init-block: block-size= block-size .d . block=0x block u. cr
 @@ -175,10 +176,18 @@ CONSTANT /gpt-part-entry
    block mbr>magic w@-le aa55 <>
  ;
  
 +\
 +\ GPT Signature
 +\ (EFI PART, 45h 46h 49h 20h 50h 41h 52h 54h)
 +\
 +4546492050415254 CONSTANT GPT-SIGNATURE
 +
  \ This word returns true if the currently loaded block has _NO_ GPT 
 partition id
  : no-gpt? ( -- true|false )
 0 read-sector
 -   1 partition>part-entry part-entry>id c@ ee <>
 +   1 partition>part-entry part-entry>id c@ ee <> IF TRUE EXIT THEN
 +   block mbr>magic w@-le aa55 <> IF TRUE EXIT THEN
 +   1 read-sector block gpt>signature x@ GPT-SIGNATURE <>

 The comment above the function talks about the currently loaded
 block, so I'd maybe avoid to load another sector here.
 Maybe move this gptsignature check to load-from-gpt-partition where
 this block gets loaded anyway?

Sure.
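
For readers who do not speak Forth, the three checks in the patch can be sketched in C roughly as follows. This is only an illustration of the on-disk layout (partition type byte 0xEE in the first MBR entry, boot signature 0xAA55, "EFI PART" at the start of LBA 1); the `demo_` functions are hypothetical and are split per sector, along the lines of the suggestion to keep the signature check where sector 1 gets loaded anyway.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Checks that only need sector 0 (the protective MBR). */
bool demo_has_protective_mbr(const uint8_t *sector0)
{
    /* The first partition entry starts at byte 446; its type is at +4. */
    if (sector0[446 + 4] != 0xee)
        return false;
    /* Boot signature 0xAA55, stored little-endian in bytes 510..511. */
    return sector0[510] == 0x55 && sector0[511] == 0xaa;
}

/* Check that needs sector 1 (the GPT header). */
bool demo_has_gpt_signature(const uint8_t *sector1)
{
    /* 45h 46h 49h 20h 50h 41h 52h 54h is ASCII "EFI PART". */
    return memcmp(sector1, "EFI PART", 8) == 0;
}
```

Keeping the two functions separate means the first one never has to read a second sector, which matches the "currently loaded block" wording of the comment.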


  ;
  
  : pc-extended-partition? ( part-entry-addr -- true|false )
 @@ -411,6 +420,10 @@ B9E5CONSTANT GPT-BASIC-DATA-PARTITION-2
 1 read-sector block gptpart-entry-lba x@-le
 block-size * to seek-pos
 block gptpart-entry-size l@-le to gpt-part-size
 +   gpt-part-size block-array-size  IF
 +   cr . GPT part size exceeds buffer allocated  cr

 Isn't there this addr parameter on the stack which you might need to
 drop here?

Will check


 +   FALSE EXIT
 +   THEN
 block gptnum-part-entry l@-le dup 0= IF FALSE EXIT THEN
 1+ 1 ?DO
seek-pos 0 seek drop
 @@ -646,7 +659,7 @@ B9E5CONSTANT GPT-BASIC-DATA-PARTITION-2
  
  : close ( -- )
 debug-disk-label? IF . Closing disk-label: block=0x block u. . 
 block-size= block-size .d cr THEN
 -   block d# 4096 free-mem
 +   block block-array-size free-mem
  ;

  Thomas


Re: [RFC PATCH] powerpc/numa: initialize distance lookup table from drconf path

2015-06-23 Thread Nikunj A Dadhania

Hi Anton,

Anton Blanchard an...@samba.org writes:
 Hi Nikunj,

 From: Nikunj A Dadhania nik...@linux.vnet.ibm.com
 
 powerpc/numa: initialize distance lookup table from drconf path
 
 In some situations, a NUMA guest that supports
 ibm,dynamic-memory-reconfiguration node will end up having flat NUMA
 distances between nodes. This is because of two problems in the
 current code.

 Thanks for the patch. Have we tested that this doesn't regress the
 non dynamic representation?

Yes, that is tested. And works as expected.

Regards
Nikunj


Re: [PATCH SLOF 1/5] disk-label: simplify gpt-prep-partition? routine

2015-06-23 Thread Thomas Huth
On Mon, 22 Jun 2015 13:29:43 +0530
Nikunj A Dadhania nik...@linux.vnet.ibm.com wrote:

 Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
 ---
  slof/fs/packages/disk-label.fs | 36 +---
  1 file changed, 13 insertions(+), 23 deletions(-)
 
 diff --git a/slof/fs/packages/disk-label.fs b/slof/fs/packages/disk-label.fs
 index fe1c25e..2305eee 100644
 --- a/slof/fs/packages/disk-label.fs
 +++ b/slof/fs/packages/disk-label.fs
 @@ -352,31 +352,21 @@ CONSTANT /gpt-part-entry
 drop 0
  ;
  
 -\ Check for GPT PReP partition GUID
 -9E1A2D38 CONSTANT GPT-PREP-PARTITION-1
 -C612 CONSTANT GPT-PREP-PARTITION-2
 -4316 CONSTANT GPT-PREP-PARTITION-3
 -AA26 CONSTANT GPT-PREP-PARTITION-4
 -8B49521E5A8B CONSTANT GPT-PREP-PARTITION-5
 +\ Check for GPT PReP partition GUID. Only first 3 blocks are
 +\ byte-swapped treating last two blocks as contiguous for simplifying
 +\ comparison
 +9E1A2D38CONSTANT GPT-PREP-PARTITION-1
 +C612CONSTANT GPT-PREP-PARTITION-2
 +4316CONSTANT GPT-PREP-PARTITION-3
 +AA268B49521E5A8BCONSTANT GPT-PREP-PARTITION-4
  
  : gpt-prep-partition? ( -- true|false )
 -   block gpt-part-entrypart-type-guid l@-le GPT-PREP-PARTITION-1 = IF
 -  block gpt-part-entrypart-type-guid 4 + w@-le
 -  GPT-PREP-PARTITION-2 = IF
 - block gpt-part-entrypart-type-guid 6 + w@-le
 - GPT-PREP-PARTITION-3 = IF
 -block gpt-part-entrypart-type-guid 8 + w@
 -GPT-PREP-PARTITION-4 = IF
 -   block gpt-part-entrypart-type-guid a + w@
 -   block gpt-part-entrypart-type-guid c + l@ swap lxjoin
 -   GPT-PREP-PARTITION-5 = IF
 -   TRUE EXIT
 -   THEN
 -THEN
 - THEN
 -  THEN
 -   THEN
 -   FALSE
 +   block gpt-part-entrypart-type-guid
 +   dup l@-le GPT-PREP-PARTITION-1  IF DROP FALSE EXIT THEN
 +   dup 4 + w@-le GPT-PREP-PARTITION-2  IF DROP FALSE EXIT THEN
 +   dup 6 + w@-le GPT-PREP-PARTITION-3  IF DROP FALSE EXIT THEN
 +   8 + x@GPT-PREP-PARTITION-4  IF FALSE EXIT THEN
 +   TRUE
  ;
  
  : load-from-gpt-prep-partition ( addr -- size )

Also change DROP, FALSE and TRUE to lowercase, as Segher
suggested with patch 3? Apart from that, looks fine to me.

Reviewed-by: Thomas Huth th...@redhat.com
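
As an aside, the comparison the simplified word performs can be sketched in C. The first three GUID fields are stored little-endian on disk, while the last eight bytes are stored in order, so they can be compared as one big-endian 64-bit value. The helper names are hypothetical; the GUID is the PReP type GUID 9E1A2D38-C612-4316-AA26-8B49521E5A8B from the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* PReP partition type GUID as it is laid out on disk. */
const uint8_t demo_prep_guid[16] = {
    0x38, 0x2d, 0x1a, 0x9e,   /* 9E1A2D38, little-endian */
    0x12, 0xc6,               /* C612, little-endian */
    0x16, 0x43,               /* 4316, little-endian */
    0xaa, 0x26,               /* AA26, stored in order */
    0x8b, 0x49, 0x52, 0x1e, 0x5a, 0x8b
};

uint32_t demo_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

uint16_t demo_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint64_t demo_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Four comparisons instead of five nested ones, with early exit. */
bool demo_is_prep_guid(const uint8_t *guid)
{
    if (demo_le32(guid) != 0x9e1a2d38u)
        return false;
    if (demo_le16(guid + 4) != 0xc612)
        return false;
    if (demo_le16(guid + 6) != 0x4316)
        return false;
    return demo_be64(guid + 8) == 0xaa268b49521e5a8bull;
}
```

On a big-endian host such as SLOF's, a plain 64-bit fetch of the last eight bytes already yields the value demo_be64() computes here, which is why the patch can use x@ without a byte flip.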

Re: [PATCH SLOF 2/5] introduce 8-byte LE helpers

2015-06-23 Thread Thomas Huth
On Mon, 22 Jun 2015 13:29:44 +0530
Nikunj A Dadhania nik...@linux.vnet.ibm.com wrote:

 Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
 ---
  slof/fs/little-endian.fs   | 6 ++
  slof/fs/packages/disk-label.fs | 4 ++--
  2 files changed, 8 insertions(+), 2 deletions(-)
 
 diff --git a/slof/fs/little-endian.fs b/slof/fs/little-endian.fs
 index f2e4e8d..6b4779e 100644
 --- a/slof/fs/little-endian.fs
 +++ b/slof/fs/little-endian.fs
 @@ -17,6 +17,9 @@ here c@ ef = CONSTANT ?littleendian
  
  ?bigendian [IF]
  
 +: x!-le >r xbflip r> x! ;
 +: x@-le x@ xbflip ;
 +
  : l!-le  >r lbflip r> l! ;
  : l@-le  l@ lbflip ;
  
 @@ -47,6 +50,9 @@ here c@ ef = CONSTANT ?littleendian
  
  [ELSE]
  
 +: x!-le x! ;
 +: x@-le x@ ;
 +
  : l!-le  l! ;
  : l@-le  l@ ;
  
 diff --git a/slof/fs/packages/disk-label.fs b/slof/fs/packages/disk-label.fs
 index 2305eee..2cf1b85 100644
 --- a/slof/fs/packages/disk-label.fs
 +++ b/slof/fs/packages/disk-label.fs
 @@ -384,8 +384,8 @@ AA268B49521E5A8BCONSTANT GPT-PREP-PARTITION-4
   debug-disk-label? IF
  . GPT PReP partition found  cr
   THEN
 - block gpt-part-entryfirst-lba x@ xbflip
 - block gpt-part-entrylast-lba x@ xbflip
 + block gpt-part-entryfirst-lba x@-le
 + block gpt-part-entrylast-lba x@-le
   over - 1+ ( addr offset len )
   swap  ( addr len offset )
   block-size * to part-offset

Reviewed-by: Thomas Huth th...@redhat.com
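
For comparison, a C sketch of what the 8-byte helpers provide: reading and writing a 64-bit little-endian value independently of the host byte order. The function names are hypothetical. Unlike the Forth version, which picks a byte-flipping or a plain variant at build time, this sketch assembles the value byte by byte so a single definition works on either kind of host.

```c
#include <stdint.h>

/* Read 8 bytes stored least-significant byte first. */
uint64_t demo_load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Write a value as 8 bytes, least-significant byte first. */
void demo_store_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}
```

A GPT first-LBA field holding the bytes 22 00 00 00 00 00 00 00 reads back as 0x22 with demo_load_le64().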

Re: [PATCH SLOF 3/5] disk-label: introduce helper to check fat filesystem

2015-06-23 Thread Nikunj A Dadhania

Hi Segher,

Segher Boessenkool seg...@kernel.crashing.org writes:
 On Mon, Jun 22, 2015 at 01:29:45PM +0530, Nikunj A Dadhania wrote:
 +: has-fat-filesystem ( block -- true | false )
 +   \ block 0 byte 0-2 is a jump instruction in all FAT
 +   \ filesystems.

 block there is not a block number, just a host address.  So it's not
 a good name.  Maybe do a better name for this word as well, something
 saying it looks at a disk block.

Sure.


 +   \ e9 and eb are jump instructions in x86 assembler.
 +   dup c@ e9 <> IF
 +  dup c@ eb <> swap
 +  2+  c@ 90 <> or
 +  IF false EXIT THEN
 +   ELSE DROP THEN
 +   TRUE
 +;

 Don't write DROP and TRUE in caps please.  The purpose of having the
 structure words in caps is to make them stand out more, to make things
 more readable; putting other things in caps as well destroys that.

Sure, will take care.

 Since you factored this, it becomes more readable if you invert the
 conditions:

Sure.

 : fat-bootblock? ( addr -- flag )
\ byte 0-2 of the bootblock is a jump instruction in
\ all FAT filesystems.
\ e9 and eb are jump instructions in x86 assembler.
dup c@ e9 = IF drop true EXIT THEN
dup c@ eb = swap 2+ c@ 90 = and ;

 (not tested, etc.)
Will test.
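
For reference, the same check written in C, as a sketch only. The function name is hypothetical and the logic simply mirrors the inverted-condition version quoted above: byte 0 is e9, or byte 0 is eb and byte 2 is 90.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bytes 0-2 of a FAT boot block hold an x86 jump:
 * either e9 xx xx, or eb xx 90.
 */
bool demo_fat_bootblock(const uint8_t *block)
{
    if (block[0] == 0xe9)
        return true;
    return block[0] == 0xeb && block[2] == 0x90;
}
```

A block starting with eb 3c 90 passes, while one starting with eb 3c 00 does not.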

Regards,
Nikunj


[PATCH] powerpc/powernv: Unfreeze VF PE on releasing it

2015-06-23 Thread Gavin Shan
When releasing PE for SRIOV VF, the PE is forced to be frozen
wrongly. When the same PE is picked for another VF, it won't
work anyhow. The patch fixes the issue by unfreezing, not
freezing the VF PE when releasing it.

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index b1248ca..88c00ff 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -701,7 +701,7 @@ static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, 
struct pnv_ioda_pe *pe)
parent = parent->bus->self;
}
 
-   opal_pci_eeh_freeze_set(phb->opal_id, pe->pe_number,
+   opal_pci_eeh_freeze_clear(phb->opal_id, pe->pe_number,
  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
 
/* Disassociate PE in PELT */
-- 
2.1.0


[PATCH v4] powerpc/rcpm: add RCPM driver

2015-06-23 Thread Yuantian.Tang
From: Tang Yuantian yuantian.t...@freescale.com

There is a RCPM (Run Control/Power Management) in Freescale QorIQ
series processors. The device performs tasks associated with device
run control and power management.

The driver implements some features: mask/unmask irq, enter/exit low
power states, freeze time base, etc.

Signed-off-by: Chenhui Zhao chenhui.z...@freescale.com
Signed-off-by: Tang Yuantian yuantian.t...@freescale.com
---
v4:
- refine bindings document
v3:
- added static and __init modifier to fsl_rcpm_init
v2:
- fix code style issues
- refine compatible string match part

 Documentation/devicetree/bindings/soc/fsl/rcpm.txt |  42 +++
 arch/powerpc/include/asm/fsl_guts.h| 105 +++
 arch/powerpc/include/asm/fsl_pm.h  |  48 +++
 arch/powerpc/platforms/85xx/Kconfig|   1 +
 arch/powerpc/sysdev/Kconfig|   5 +
 arch/powerpc/sysdev/Makefile   |   1 +
 arch/powerpc/sysdev/fsl_rcpm.c | 338 +
 7 files changed, 540 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/fsl/rcpm.txt
 create mode 100644 arch/powerpc/include/asm/fsl_pm.h
 create mode 100644 arch/powerpc/sysdev/fsl_rcpm.c

diff --git a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt 
b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
new file mode 100644
index 000..1f58018
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
@@ -0,0 +1,42 @@
+* Run Control and Power Management
+
+The RCPM performs all device-level tasks associated with device run control
+and power management.
+
+Required properties:
+  - reg : Offset and length of the register set of RCPM block.
+  - compatible : Should contain a chip-specific RCPM block compatible string
+   and (if applicable) may contain a chassis-version RCPM compatible string.
+   Chip-specific strings are of the form "fsl,<chip>-rcpm", such as:
+   * fsl,p2041-rcpm
+   * fsl,p3041-rcpm
+   * fsl,p4080-rcpm
+   * fsl,p5020-rcpm
+   * fsl,p5040-rcpm
+   * fsl,t4240-rcpm
+   * fsl,b4420-rcpm
+   * fsl,b4860-rcpm
+
+   Chassis-version RCPM strings include:
+   * fsl,qoriq-rcpm-1.0: for chassis 1.0 rcpm
+   * fsl,qoriq-rcpm-2.0: for chassis 2.0 rcpm
+
+All references to 1.0 and 2.0 refer to the QorIQ chassis version to
+which the chip complies.
+Chassis Version    Example Chips
+---------------    ---------------------------------
+1.0                p4080, p5020, p5040, p2041, p3041
+2.0                t4240, b4860, t1040, b4420
+
+Example:
+The RCPM node for T4240:
+   rcpm: global-utilities@e2000 {
+   compatible = "fsl,t4240-rcpm", "fsl,qoriq-rcpm-2.0";
+   reg = <0xe2000 0x1000>;
+   };
+
+The RCPM node for P4080:
+   rcpm: global-utilities@e2000 {
+   compatible = "fsl,qoriq-rcpm-1.0";
+   reg = <0xe2000 0x1000>;
+   };
diff --git a/arch/powerpc/include/asm/fsl_guts.h 
b/arch/powerpc/include/asm/fsl_guts.h
index 43b6bb1..a67413c 100644
--- a/arch/powerpc/include/asm/fsl_guts.h
+++ b/arch/powerpc/include/asm/fsl_guts.h
@@ -188,5 +188,110 @@ static inline void guts_set_pmuxcr_dma(struct ccsr_guts 
__iomem *guts,
 
 #endif
 
+struct ccsr_rcpm_v1 {
+   u8  res[4];
+   __be32  cdozsr; /* 0x0004 Core Doze Status Register */
+   u8  res0008[4];
+   __be32  cdozcr; /* 0x000c Core Doze Control Register */
+   u8  res0010[4];
+   __be32  cnapsr; /* 0x0014 Core Nap Status Register */
+   u8  res0018[4];
+   __be32  cnapcr; /* 0x001c Core Nap Control Register */
+   u8  res0020[4];
+   __be32  cdozpsr;/* 0x0024 Core Doze Previous Status Register */
+   u8  res0028[4];
+   __be32  cnappsr;/* 0x002c Core Nap Previous Status Register */
+   u8  res0030[4];
+   __be32  cwaitsr;/* 0x0034 Core Wait Status Register */
+   u8  res0038[4];
+   __be32  cwdtdsr;/* 0x003c Core Watchdog Detect Status Register */
+   __be32  powmgtcsr;  /* 0x0040 PM Control&Status Register */
+#define RCPM_POWMGTCSR_SLP 0x0002
+   u8  res0044[12];
+   __be32  ippdexpcr;  /* 0x0050 IP Powerdown Exception Control Register */
+   u8  res0054[16];
+   __be32  cpmimr; /* 0x0064 Core PM IRQ Mask Register */
+   u8  res0068[4];
+   __be32  cpmcimr;/* 0x006c Core PM Critical IRQ Mask Register */
+   u8  res0070[4];
+   __be32  cpmmcmr;/* 0x0074 Core PM Machine Check Mask Register */
+   u8  res0078[4];
+   __be32  cpmnmimr;   /* 0x007c Core PM NMI Mask Register */
+   u8  res0080[4];
+   __be32  ctbenr; /* 0x0084 Core Time Base Enable Register */
+   u8  res0088[4];
+   __be32  ctbckselr;  /* 0x008c Core Time Base Clock Select Register */
+   

Re: [PATCH SLOF 4/5] disk-label: add support for booting from GPT FAT partition

2015-06-23 Thread Thomas Huth
On Mon, 22 Jun 2015 13:29:46 +0530
Nikunj A Dadhania nik...@linux.vnet.ibm.com wrote:

 For a GPT+LVM combination disk, older bootloader that does not support
 LVM, cannot load kernel from LVM.
 
 The patch add support to read from BASIC_DATA UUID
 partition. Installer has installed CHRP-BOOT config on a FAT file
 system.

Maybe better: The patch adds support to read from BASIC_DATA UUID
partitions for the case that the OS installer has installed
the CHRP-BOOT config on a FAT file system.

?

 Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
 ---
  slof/fs/packages/disk-label.fs | 54 
 +-
  1 file changed, 48 insertions(+), 6 deletions(-)
 
 diff --git a/slof/fs/packages/disk-label.fs b/slof/fs/packages/disk-label.fs
 index e317e93..821e959 100644
 --- a/slof/fs/packages/disk-label.fs
 +++ b/slof/fs/packages/disk-label.fs
 @@ -266,7 +266,10 @@ CONSTANT /gpt-part-entry
  
  : try-dos-partition ( -- okay? )
 \ Read partition table and check magic.
 -   no-mbr? IF cr . No DOS disk-label found. cr false EXIT THEN
 +   no-mbr? IF
 +   debug-disk-label? IF cr . No DOS disk-label found. cr THEN
 +   false EXIT
 +   THEN
  
 count-dos-logical-partitions TO dos-logical-partitions
  
 @@ -381,18 +384,38 @@ AA268B49521E5A8BCONSTANT GPT-PREP-PARTITION-4
 TRUE
  ;
  
 -: load-from-gpt-prep-partition ( addr -- size )
 +\ Check for GPT MSFT BASIC DATA GUID - fat based
 +EBD0A0A2CONSTANT GPT-BASIC-DATA-PARTITION-1
 +B9E5CONSTANT GPT-BASIC-DATA-PARTITION-2
 +4433CONSTANT GPT-BASIC-DATA-PARTITION-3
 +87C068B6B72699C7CONSTANT GPT-BASIC-DATA-PARTITION-4
 +
 +: gpt-basic-data-partition? ( -- true|false )
 +   block gpt-part-entrypart-type-guid
 +   dup l@-le GPT-BASIC-DATA-PARTITION-1  IF DROP FALSE EXIT THEN
 +   dup 4 + w@-le GPT-BASIC-DATA-PARTITION-2  IF DROP FALSE EXIT THEN
 +   dup 6 + w@-le GPT-BASIC-DATA-PARTITION-3  IF DROP FALSE EXIT THEN
 +   8 + x@GPT-BASIC-DATA-PARTITION-4  IF FALSE EXIT THEN
 +   TRUE
 +;
 +
 +\ Can be called from two path
 +\ 1) load-from-boot-partition for GPT PReP partition
 +\ 2) try-partitions for gpt basic data partition having fat chrp-boot

What did you want to achieve with the above comment? Caller locations
can change with later patches, but it's unlikely that everybody
remembers to update such comments in that case. So unless you've got
a good reason for the above comment (in which case it should maybe be
written in a different way), I'd suggest dropping it.

 +: load-from-gpt-partition ( [ addr ] -- size | TRUE )

What do you mean with addr in square brackets? Is it optional?

 no-gpt? IF drop FALSE EXIT THEN
 debug-disk-label? IF
cr . GPT partition found  cr
 THEN
 -   1 read-sector block gptpart-entry-lba l@-le
 +   1 read-sector block gptpart-entry-lba x@-le
 block-size * to seek-pos
 block gptpart-entry-size l@-le to gpt-part-size
 block gptnum-part-entry l@-le dup 0= IF FALSE EXIT THEN
 1+ 1 ?DO
seek-pos 0 seek drop
 -  block gpt-part-size read drop gpt-prep-partition? IF
 +   block gpt-part-size read drop
 +   gpt-prep-partition? IF

You've changed the level of indentation here. Please try to avoid that
(unless you've got a good reason, e.g. because the previous indentation
was obviously wrong)

   debug-disk-label? IF
  . GPT PReP partition found  cr
   THEN
 @@ -404,6 +427,24 @@ AA268B49521E5A8BCONSTANT GPT-PREP-PARTITION-4
   0 0 seek drop ( addr len )
   block-size * read ( size )
   UNLOOP EXIT
 + THEN
 + gpt-basic-data-partition? IF

Hmm, I wonder whether we need a proper coding conventions spec for
writing Forth code ... (at least about the indentation depths ...) ;-)

 + debug-disk-label? IF
 +. GPT LINUX DATA partition found  cr
 + THEN
 + block gpt-part-entryfirst-lba x@ xbflip
 + dup to part-start
 + block gpt-part-entrylast-lba x@ xbflip
 + over - 1+ ( addr offset len )
 + dup block-size * to part-size
 + swap  ( addr len offset )
 + block-size * to part-offset
 + 0 0 seek
 + block block-size read
 + 3drop
 + block has-fat-filesystem 0= IF false UNLOOP EXIT THEN
 + TRUE
 + UNLOOP EXIT
THEN
seek-pos gpt-part-size i * + to seek-pos
 LOOP
 @@ -495,11 +536,11 @@ AA268B49521E5A8BCONSTANT GPT-PREP-PARTITION-4
  
 debug-disk-label? IF . Trying CHRP boot  .s cr THEN
 1 disk-chrp-boot !
 -   dup load-chrp-boot-file ?dup 0  IF .s cr nip EXIT THEN
 +   dup load-chrp-boot-file ?dup 0  IF nip EXIT THEN
 0 disk-chrp-boot !
  
 debug-disk-label? IF . Trying GPT boot  .s cr THEN
 -   load-from-gpt-prep-partition
 +   load-from-gpt-partition

So here the function is called with an addr parameter on the stack 

Re: [PATCH SLOF 5/5] disk-label: make gpt detection code more robust

2015-06-23 Thread Thomas Huth
On Mon, 22 Jun 2015 13:29:47 +0530
Nikunj A Dadhania nik...@linux.vnet.ibm.com wrote:

 * Check for Protective MBR Magic
 * Check for valid GPT Signature
 * Boundary check for allocated block size before reading into the
   buffer
 
 Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
 ---
  slof/fs/packages/disk-label.fs | 21 +
  1 file changed, 17 insertions(+), 4 deletions(-)
 
 diff --git a/slof/fs/packages/disk-label.fs b/slof/fs/packages/disk-label.fs
 index 821e959..d9c3a8d 100644
 --- a/slof/fs/packages/disk-label.fs
 +++ b/slof/fs/packages/disk-label.fs
 @@ -20,6 +20,7 @@ false VALUE debug-disk-label?
  \ If we ever want to put a large kernel with initramfs from a PREP partition
  \ we might need to increase this value. The default value is 65536 blocks 
 (32MB)
  d# 65536 value max-prep-partition-blocks
 +d# 4096 value block-array-size
  
  s disk-label device-name
  
 @@ -152,8 +153,8 @@ CONSTANT /gpt-part-entry
  : init-block ( -- )
 s block-size ['] $call-parent CATCH IF ABORT parent has no 
 block-size. THEN
 to block-size
 -   d# 4096 alloc-mem
 -   dup d# 4096 erase
 +   block-array-size alloc-mem
 +   dup block-array-size erase
 to block
 debug-disk-label? IF
. init-block: block-size= block-size .d . block=0x block u. cr
 @@ -175,10 +176,18 @@ CONSTANT /gpt-part-entry
    block mbr>magic w@-le aa55 <>
  ;
  
 +\
 +\ GPT Signature
 +\ (EFI PART, 45h 46h 49h 20h 50h 41h 52h 54h)
 +\
 +4546492050415254 CONSTANT GPT-SIGNATURE
 +
  \ This word returns true if the currently loaded block has _NO_ GPT 
 partition id
  : no-gpt? ( -- true|false )
 0 read-sector
 -   1 partition>part-entry part-entry>id c@ ee <>
 +   1 partition>part-entry part-entry>id c@ ee <> IF TRUE EXIT THEN
 +   block mbr>magic w@-le aa55 <> IF TRUE EXIT THEN
 +   1 read-sector block gpt>signature x@ GPT-SIGNATURE <>

The comment above the function talks about the currently loaded
block, so I'd maybe avoid to load another sector here.
Maybe move this gptsignature check to load-from-gpt-partition where
this block gets loaded anyway?

  ;
  
  : pc-extended-partition? ( part-entry-addr -- true|false )
 @@ -411,6 +420,10 @@ B9E5CONSTANT GPT-BASIC-DATA-PARTITION-2
 1 read-sector block gptpart-entry-lba x@-le
 block-size * to seek-pos
 block gptpart-entry-size l@-le to gpt-part-size
 +   gpt-part-size block-array-size  IF
 +   cr . GPT part size exceeds buffer allocated  cr

Isn't there this addr parameter on the stack which you might need to
drop here?

 +   FALSE EXIT
 +   THEN
 block gptnum-part-entry l@-le dup 0= IF FALSE EXIT THEN
 1+ 1 ?DO
seek-pos 0 seek drop
 @@ -646,7 +659,7 @@ B9E5CONSTANT GPT-BASIC-DATA-PARTITION-2
  
  : close ( -- )
 debug-disk-label? IF . Closing disk-label: block=0x block u. . 
 block-size= block-size .d cr THEN
 -   block d# 4096 free-mem
 +   block block-array-size free-mem
  ;

 Thomas


[BACKPORT PATCH 1/5] perf probe ppc: Fix symbol fixup issues due to ELF type

2015-06-23 Thread Naveen N. Rao
If using the symbol table, symbol addresses are not being fixed up
properly, resulting in probes being placed at wrong addresses:

  # perf probe do_fork
  Added new event:
probe:do_fork(on do_fork)

  You can now use it in all perf tools, such as:

  perf record -e probe:do_fork -aR sleep 1

  # cat /sys/kernel/debug/tracing/kprobe_events
  p:probe/do_fork _text+635952
  # printf "%x" 635952
  9b430
  # grep do_fork /boot/System.map
  c00ab430 T .do_fork

Fix by checking for ELF type ET_DYN used by ppc64 kernels.

Signed-off-by: Naveen N. Rao naveen.n@linux.vnet.ibm.com
Reviewed-by: Srikar Dronamraju sri...@linux.vnet.ibm.com
Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com
Cc: Masami Hiramatsu masami.hiramatsu...@hitachi.com
Cc: Michael Ellerman m...@ellerman.id.au
Cc: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/41392bb856ef62d929995e0b61967689b7915207.1430217967.git.naveen.n@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com
---
 tools/perf/arch/powerpc/Makefile|  1 +
 tools/perf/arch/powerpc/util/sym-handling.c | 19 +++
 tools/perf/util/symbol-elf.c|  8 ++--
 tools/perf/util/symbol.h|  4 
 4 files changed, 30 insertions(+), 2 deletions(-)
 create mode 100644 tools/perf/arch/powerpc/util/sym-handling.c

diff --git a/tools/perf/arch/powerpc/Makefile b/tools/perf/arch/powerpc/Makefile
index 6f7782b..85d2306 100644
--- a/tools/perf/arch/powerpc/Makefile
+++ b/tools/perf/arch/powerpc/Makefile
@@ -4,3 +4,4 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/skip-callchain-idx.o
 endif
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/header.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/sym-handling.o
diff --git a/tools/perf/arch/powerpc/util/sym-handling.c 
b/tools/perf/arch/powerpc/util/sym-handling.c
new file mode 100644
index 000..c9de001
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -0,0 +1,19 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Copyright (C) 2015 Naveen N. Rao, IBM Corporation
+ */
+
+#include "debug.h"
+#include "symbol.h"
+
+#ifdef HAVE_LIBELF_SUPPORT
+bool elf__needs_adjust_symbols(GElf_Ehdr ehdr)
+{
+   return ehdr.e_type == ET_EXEC ||
+  ehdr.e_type == ET_REL ||
+  ehdr.e_type == ET_DYN;
+}
+#endif
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 1e23a5b..ddb300b 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -563,6 +563,11 @@ void symsrc__destroy(struct symsrc *ss)
close(ss->fd);
 }
 
+bool __weak elf__needs_adjust_symbols(GElf_Ehdr ehdr)
+{
+   return ehdr.e_type == ET_EXEC || ehdr.e_type == ET_REL;
+}
+
 int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
 enum dso_binary_type type)
 {
@@ -628,8 +633,7 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const 
char *name,
 ".gnu.prelink_undo",
 NULL) != NULL);
} else {
-   ss->adjust_symbols = ehdr.e_type == ET_EXEC ||
-ehdr.e_type == ET_REL;
+   ss->adjust_symbols = elf__needs_adjust_symbols(ehdr);
}
 
ss->name   = strdup(name);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index eb2c19b..7335790 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -313,4 +313,8 @@ int compare_proc_modules(const char *from, const char *to);
 int setup_list(struct strlist **list, const char *list_str,
   const char *list_name);
 
+#ifdef HAVE_LIBELF_SUPPORT
+bool elf__needs_adjust_symbols(GElf_Ehdr ehdr);
+#endif
+
 #endif /* __PERF_SYMBOL */
-- 
2.4.0
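
The override pattern in this patch (a generic predicate that powerpc replaces) can be sketched outside of perf as two plain functions. This is an illustration only: the `demo_` names are hypothetical, and the e_type values are the ones given in the ELF specification rather than taken from perf's headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* ELF e_type values as defined by the ELF specification. */
enum { DEMO_ET_REL = 1, DEMO_ET_EXEC = 2, DEMO_ET_DYN = 3 };

/* Generic rule: only executables and relocatable objects are adjusted. */
bool demo_needs_adjust_generic(uint16_t e_type)
{
    return e_type == DEMO_ET_EXEC || e_type == DEMO_ET_REL;
}

/* powerpc rule: ET_DYN is adjusted too, as the commit message explains. */
bool demo_needs_adjust_powerpc(uint16_t e_type)
{
    return demo_needs_adjust_generic(e_type) || e_type == DEMO_ET_DYN;
}
```

In the transcript above, 635952 is 0x9b430, while System.map ends in ab430. The probe therefore landed 0x10000 below the real do_fork, which is the kind of error a skipped adjustment produces.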


[BACKPORT PATCH 2/5] perf probe ppc64le: Fix ppc64 ABIv2 symbol decoding

2015-06-23 Thread Naveen N. Rao
From: Ananth N Mavinakayanahalli ana...@in.ibm.com

ppc64 ELF ABIv2 has a Global Entry Point (GEP) and a Local Entry Point
(LEP). For purposes of probing, we need the LEP - the offset to which is
encoded in st_other.

Signed-off-by: Ananth N Mavinakayanahalli ana...@in.ibm.com
Reviewed-by: Srikar Dronamraju sri...@linux.vnet.ibm.com
Cc: Masami Hiramatsu masami.hiramatsu...@hitachi.com
Cc: Michael Ellerman m...@ellerman.id.au
Cc: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/ab9cc5e2b9de4cbaaf50f6ef2346a6a81100bad1.1430217967.git.naveen.n@linux.vnet.ibm.com
Signed-off-by: Naveen N. Rao naveen.n@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com
---
 tools/perf/arch/powerpc/util/sym-handling.c | 7 +++
 tools/perf/util/symbol-elf.c| 4 
 tools/perf/util/symbol.h| 1 +
 3 files changed, 12 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/sym-handling.c 
b/tools/perf/arch/powerpc/util/sym-handling.c
index c9de001..fd11157 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -16,4 +16,11 @@ bool elf__needs_adjust_symbols(GElf_Ehdr ehdr)
   ehdr.e_type == ET_REL ||
   ehdr.e_type == ET_DYN;
 }
+
+#if defined(_CALL_ELF) && _CALL_ELF == 2
+void arch__elf_sym_adjust(GElf_Sym *sym)
+{
+   sym->st_value += PPC64_LOCAL_ENTRY_OFFSET(sym->st_other);
+}
+#endif
 #endif
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index ddb300b..1ffd44f 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -690,6 +690,8 @@ static bool want_demangle(bool is_kernel_sym)
return is_kernel_sym ? symbol_conf.demangle_kernel : 
symbol_conf.demangle;
 }
 
+void __weak arch__elf_sym_adjust(GElf_Sym *sym __maybe_unused) { }
+
 int dso__load_sym(struct dso *dso, struct map *map,
  struct symsrc *syms_ss, struct symsrc *runtime_ss,
  symbol_filter_t filter, int kmodule)
@@ -851,6 +853,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
    (sym.st_value & 1))
    --sym.st_value;
 
+   arch__elf_sym_adjust(&sym);
+
    if (dso->kernel || kmodule) {
char dso_name[PATH_MAX];
 
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 7335790..2ef8119 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -315,6 +315,7 @@ int setup_list(struct strlist **list, const char *list_str,
 
 #ifdef HAVE_LIBELF_SUPPORT
 bool elf__needs_adjust_symbols(GElf_Ehdr ehdr);
+void arch__elf_sym_adjust(GElf_Sym *sym);
 #endif
 
 #endif /* __PERF_SYMBOL */
-- 
2.4.0
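
The commit message above says the offset to the Local Entry Point is encoded in st_other. The sketch below shows how that decoding works as I understand the ELFv2 ABI: a 3-bit field in bits 5-7 of st_other selects the offset. The sketch_* names are hypothetical; the arithmetic is meant to mirror the PPC64_LOCAL_ENTRY_OFFSET macro from elf.h rather than quote it.

```c
#include <stdint.h>

/* Bits 5-7 of st_other hold the local-entry encoding (assumed per ELFv2). */
#define SKETCH_STO_PPC64_LOCAL_BIT  5
#define SKETCH_STO_PPC64_LOCAL_MASK (7u << SKETCH_STO_PPC64_LOCAL_BIT)

/* Byte distance from the global entry point to the local entry point. */
static inline unsigned int sketch_local_entry_offset(uint8_t st_other)
{
	unsigned int field = (st_other & SKETCH_STO_PPC64_LOCAL_MASK)
				>> SKETCH_STO_PPC64_LOCAL_BIT;

	/* 1 << field, rounded down to a multiple of 4: fields 0 and 1 give 0. */
	return ((1u << field) >> 2) << 2;
}

/* Address to probe: symbol value (global entry) plus the decoded offset. */
static inline uint64_t sketch_local_entry(uint64_t st_value, uint8_t st_other)
{
	return st_value + sketch_local_entry_offset(st_other);
}
```

A field value of 3 (st_other bits 0x60) yields an offset of 8 bytes, that is, two instructions past the global entry point.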


[git pull] Please pull mpe/linux.git powerpc-4.2-1 tag

2015-06-23 Thread Michael Ellerman
Hi Linus,

Please pull powerpc updates for 4.2:

The following changes since commit 030bbdbf4c833bc69f502eae58498bc5572db736:

  Linux 4.1-rc3 (2015-05-10 15:12:29 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux.git tags/powerpc-4.2-1

for you to fetch changes up to 6096f884515466f400864ad23d16f20b731a7ce7:

  Merge branch 'next' of 
git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next 
(2015-06-19 17:23:48 +1000)


powerpc updates for 4.2

 - Disable the 32-bit vdso when building LE, so we can build with a 64-bit only
   toolchain.
 - EEH fixes from Gavin & Richard.
 - Enable the sys_kcmp syscall from Laurent.
 - Sysfs control for fastsleep workaround from Shreyas.
 - Expose OPAL events as an irq chip by Alistair.
 - MSI ops moved to pci_controller_ops by Daniel.
 - Fix for kernel to userspace backtraces for perf from Anton.
 - Merge pseries and pseries_le defconfigs from Cyril.
 - CXL in-kernel API from Mikey.
 - OPAL prd driver from Jeremy.
 - Fix for DSCR handling & tests from Anshuman.
 - Powernv flash mtd driver from Cyril.
 - Dynamic DMA Window support on powernv from Alexey.
 - LLVM clang fixes & workarounds from Anton.
 - Reworked version of the patch to abort syscalls when transactional.
 - Fix the swap encoding to support 4TB, from Aneesh.
 - Various fixes as usual.
 - Freescale updates from Scott: Highlights include more 8xx optimizations, an
   e6500 hugetlb optimization, QMan device tree nodes, t1024/t1023 support, and
   various fixes and cleanup.


Alexey Kardashevskiy (36):
  powerpc/eeh/ioda2: Use device::iommu_group to check IOMMU group
  powerpc/iommu/powernv: Get rid of set_iommu_table_base_and_group
  powerpc/powernv/ioda: Clean up IOMMU group registration
  powerpc/iommu: Put IOMMU group explicitly
  powerpc/iommu: Always release iommu_table in iommu_free_table()
  vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU driver
  vfio: powerpc/spapr: Check that IOMMU page is fully contained by system 
page
  vfio: powerpc/spapr: Use it_page_size
  vfio: powerpc/spapr: Move locked_vm accounting to helpers
  vfio: powerpc/spapr: Disable DMA mappings on disabled container
  vfio: powerpc/spapr: Moving pinning/unpinning to helpers
  vfio: powerpc/spapr: Rework groups attaching
  powerpc/powernv: Do not set read flag if direction==DMA_NONE
  powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table
  powerpc/powernv/ioda/ioda2: Rework TCE invalidation in 
tce_build()/tce_free()
  powerpc/spapr: vfio: Replace iommu_table with iommu_table_group
  powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group
  vfio: powerpc/spapr/iommu/powernv/ioda2: Rework IOMMU ownership control
  powerpc/iommu: Fix IOMMU ownership control functions
  powerpc/powernv/ioda2: Move TCE kill register address to PE
  powerpc/powernv/ioda2: Add TCE invalidation for all attached groups
  powerpc/powernv: Implement accessor to TCE entry
  powerpc/iommu/powernv: Release replaced TCE
  powerpc/powernv/ioda2: Rework iommu_table creation
  powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages
  powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window
  powerpc/powernv: Implement multilevel TCE tables
  vfio: powerpc/spapr: powerpc/powernv/ioda: Define and implement DMA 
windows API
  powerpc/powernv/ioda2: Use new helpers to do proper cleanup on PE release
  powerpc/iommu/ioda2: Add get_table_size() to calculate the size of future 
table
  vfio: powerpc/spapr: powerpc/powernv/ioda2: Use DMA windows API in 
ownership control
  powerpc/mmu: Add userspace-to-physical addresses translation cache
  vfio: powerpc/spapr: Register memory and define IOMMU v2
  vfio: powerpc/spapr: Support Dynamic DMA windows
  powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=off
  powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma()

Alistair Popple (10):
  powerpc/powernv: Reorder OPAL subsystem initialisation
  powerpc/powernv: Add a virtual irqchip for opal events
  ipmi/powernv: Convert to irq event interface
  hvc: Convert to using interrupts instead of opal events
  powernv/eeh: Update the EEH code to use the opal irq domain
  powernv/opal: Convert opal message events to opal irq domain
  powernv/elog: Convert elog to opal irq domain
  powernv/opal-dump: Convert to irq domain
  opal: Remove events notifier
  powerpc/powernv: Increase opal-irqchip initcall priority

Aneesh Kumar K.V (3):
  powerpc/mm: Add trace point for tracking hash pte fault
  powerpc/mm: PTE_RPN_MAX is not used, remove the same
  powerpc/mm: Change the swap encoding in pte.

Anshuman Khandual (12):
  powerpc: 

[PATCH] ASoC: fsl: Add dedicated DMA buffer size for each cpu dai

2015-06-23 Thread Shengjiu Wang
The SSI is not the only CPU DAI: there are also ESAI, SPDIF and SAI,
and imx_pcm_dma can be used by all of them. ESAI in particular needs
a larger DMA buffer, so add a dedicated DMA buffer size for each CPU
DAI.

Signed-off-by: Shengjiu Wang shengjiu.w...@freescale.com
---
 sound/soc/fsl/fsl_esai.c|2 +-
 sound/soc/fsl/fsl_sai.c |2 +-
 sound/soc/fsl/fsl_spdif.c   |2 +-
 sound/soc/fsl/fsl_ssi.c |2 +-
 sound/soc/fsl/imx-pcm-dma.c |   25 +
 sound/soc/fsl/imx-pcm.h |9 +++--
 sound/soc/fsl/imx-ssi.c |2 +-
 7 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index 5c75971..8c2ddc1 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -839,7 +839,7 @@ static int fsl_esai_probe(struct platform_device *pdev)
return ret;
}
 
-   ret = imx_pcm_dma_init(pdev);
+   ret = imx_pcm_dma_init(pdev, IMX_ESAI_DMABUF_SIZE);
if (ret)
    dev_err(&pdev->dev, "failed to init imx pcm dma: %d\n", ret);
 
diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
index 5c73bea..a18fd92 100644
--- a/sound/soc/fsl/fsl_sai.c
+++ b/sound/soc/fsl/fsl_sai.c
@@ -791,7 +791,7 @@ static int fsl_sai_probe(struct platform_device *pdev)
return ret;
 
    if (sai->sai_on_imx)
-       return imx_pcm_dma_init(pdev);
+       return imx_pcm_dma_init(pdev, IMX_SAI_DMABUF_SIZE);
    else
        return devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0);
 }
diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c
index 8e93221..d1e9be7 100644
--- a/sound/soc/fsl/fsl_spdif.c
+++ b/sound/soc/fsl/fsl_spdif.c
@@ -1255,7 +1255,7 @@ static int fsl_spdif_probe(struct platform_device *pdev)
return ret;
}
 
-   ret = imx_pcm_dma_init(pdev);
+   ret = imx_pcm_dma_init(pdev, IMX_SPDIF_DMABUF_SIZE);
if (ret)
    dev_err(&pdev->dev, "imx_pcm_dma_init failed: %d\n", ret);
 
diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index c7647e0..e122dab 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -1257,7 +1257,7 @@ static int fsl_ssi_imx_probe(struct platform_device *pdev,
if (ret)
goto error_pcm;
} else {
-   ret = imx_pcm_dma_init(pdev);
+   ret = imx_pcm_dma_init(pdev, IMX_SSI_DMABUF_SIZE);
if (ret)
goto error_pcm;
}
diff --git a/sound/soc/fsl/imx-pcm-dma.c b/sound/soc/fsl/imx-pcm-dma.c
index 0db94f49..1fc01ed 100644
--- a/sound/soc/fsl/imx-pcm-dma.c
+++ b/sound/soc/fsl/imx-pcm-dma.c
@@ -40,7 +40,7 @@ static const struct snd_pcm_hardware imx_pcm_hardware = {
SNDRV_PCM_INFO_MMAP_VALID |
SNDRV_PCM_INFO_PAUSE |
SNDRV_PCM_INFO_RESUME,
-   .buffer_bytes_max = IMX_SSI_DMABUF_SIZE,
+   .buffer_bytes_max = IMX_DEFAULT_DMABUF_SIZE,
.period_bytes_min = 128,
.period_bytes_max = 65535, /* Limited by SDMA engine */
.periods_min = 2,
@@ -52,13 +52,30 @@ static const struct snd_dmaengine_pcm_config 
imx_dmaengine_pcm_config = {
    .pcm_hardware = &imx_pcm_hardware,
.prepare_slave_config = snd_dmaengine_pcm_prepare_slave_config,
.compat_filter_fn = filter,
-   .prealloc_buffer_size = IMX_SSI_DMABUF_SIZE,
+   .prealloc_buffer_size = IMX_DEFAULT_DMABUF_SIZE,
 };
 
-int imx_pcm_dma_init(struct platform_device *pdev)
+int imx_pcm_dma_init(struct platform_device *pdev, size_t size)
 {
+   struct snd_dmaengine_pcm_config *config;
+   struct snd_pcm_hardware *pcm_hardware;
+
+   config = devm_kzalloc(&pdev->dev,
+           sizeof(struct snd_dmaengine_pcm_config), GFP_KERNEL);
+   *config = imx_dmaengine_pcm_config;
+   if (size)
+       config->prealloc_buffer_size = size;
+
+   pcm_hardware = devm_kzalloc(&pdev->dev,
+           sizeof(struct snd_pcm_hardware), GFP_KERNEL);
+   *pcm_hardware = imx_pcm_hardware;
+   if (size)
+       pcm_hardware->buffer_bytes_max = size;
+
+   config->pcm_hardware = pcm_hardware;
+
    return devm_snd_dmaengine_pcm_register(&pdev->dev,
-           &imx_dmaengine_pcm_config,
+           config,
            SND_DMAENGINE_PCM_FLAG_COMPAT);
 }
 EXPORT_SYMBOL_GPL(imx_pcm_dma_init);
diff --git a/sound/soc/fsl/imx-pcm.h b/sound/soc/fsl/imx-pcm.h
index c79cb27..133c4470a 100644
--- a/sound/soc/fsl/imx-pcm.h
+++ b/sound/soc/fsl/imx-pcm.h
@@ -20,6 +20,11 @@
  */
 #define IMX_SSI_DMABUF_SIZE (64 * 1024)
 
+#define IMX_DEFAULT_DMABUF_SIZE (64 * 1024)
+#define IMX_SAI_DMABUF_SIZE (64 * 1024)
+#define IMX_SPDIF_DMABUF_SIZE  (64 * 1024)
+#define IMX_ESAI_DMABUF_SIZE   (256 * 1024)
+
 static inline void
 imx_pcm_dma_params_init_data(struct imx_dma_data *dma_data,
int dma, enum sdma_peripheral_type peripheral_type)
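
The imx_pcm_dma_init() hunk above copies a shared default configuration and overrides the buffer sizes per CPU DAI. Below is a minimal userspace sketch of that copy-then-override pattern. Everything named sketch_* is hypothetical, malloc() stands in for devm_kzalloc(), and the two size fields are folded into one struct for brevity. The sketch also checks the allocation result, which the posted hunk does not do before dereferencing the pointer.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_DEFAULT_DMABUF_SIZE (64 * 1024)

/* Stand-in for the two kernel structures that carry the buffer sizes. */
struct sketch_pcm_config {
	size_t prealloc_buffer_size;
	size_t buffer_bytes_max;
};

static const struct sketch_pcm_config sketch_default_config = {
	.prealloc_buffer_size = SKETCH_DEFAULT_DMABUF_SIZE,
	.buffer_bytes_max = SKETCH_DEFAULT_DMABUF_SIZE,
};

/*
 * Return a private copy of the defaults, with both sizes overridden when
 * size is non-zero. Returns NULL on allocation failure; the caller frees.
 */
static inline struct sketch_pcm_config *sketch_config_for(size_t size)
{
	struct sketch_pcm_config *config = malloc(sizeof(*config));

	if (!config)
		return NULL;

	*config = sketch_default_config;
	if (size) {
		config->prealloc_buffer_size = size;
		config->buffer_bytes_max = size;
	}
	return config;
}
```

Passing 0 keeps the 64 KiB default, while an ESAI-like caller would pass 256 * 1024.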

[BACKPORT PATCH 3/5] perf probe ppc64le: Prefer symbol table lookup over DWARF

2015-06-23 Thread Naveen N. Rao
Use symbol table lookups by default if DWARF is not necessary, since
powerpc ABIv2 encodes local entry points in the symbol table and the
function entry address in DWARF may not be appropriate for kprobes, as
described here:

https://sourceware.org/bugzilla/show_bug.cgi?id=17638

The DWARF address ranges deliberately include the *whole* function,
both global and local entry points.
...
If you want to set probes on a local entry point, you should look up
the symbol in the main symbol table (not DWARF), and check the st_other
bits; they will indicate whether the function has a local entry point,
and what its offset from the global entry point is.  Note that GDB does
the same when setting a breakpoint on a function entry.

Signed-off-by: Naveen N. Rao naveen.n@linux.vnet.ibm.com
Reviewed-by: Srikar Dronamraju sri...@linux.vnet.ibm.com
Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com
Cc: Masami Hiramatsu masami.hiramatsu...@hitachi.com
Cc: Michael Ellerman m...@ellerman.id.au
Cc: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/88a10e22f4aaba2aef812824ca4b10d7beeea012.1430217967.git.naveen.n@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com
---
 tools/perf/arch/powerpc/util/sym-handling.c | 8 
 tools/perf/util/probe-event.c   | 8 
 tools/perf/util/probe-event.h   | 1 +
 3 files changed, 17 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/sym-handling.c 
b/tools/perf/arch/powerpc/util/sym-handling.c
index fd11157..e8f95e5 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -8,6 +8,7 @@
 
 #include "debug.h"
 #include "symbol.h"
+#include "probe-event.h"
 
 #ifdef HAVE_LIBELF_SUPPORT
 bool elf__needs_adjust_symbols(GElf_Ehdr ehdr)
@@ -24,3 +25,10 @@ void arch__elf_sym_adjust(GElf_Sym *sym)
 }
 #endif
 #endif
+
+#if defined(_CALL_ELF) && _CALL_ELF == 2
+bool arch__prefers_symtab(void)
+{
+   return true;
+}
+#endif
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index c150ca4..7cc4e47 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2337,6 +2337,8 @@ err_out:
goto out;
 }
 
+bool __weak arch__prefers_symtab(void) { return false; }
+
 static int convert_to_probe_trace_events(struct perf_probe_event *pev,
  struct probe_trace_event **tevs,
  int max_tevs, const char *target)
@@ -2352,6 +2354,12 @@ static int convert_to_probe_trace_events(struct 
perf_probe_event *pev,
}
}
 
+   if (arch__prefers_symtab() && !perf_probe_event_need_dwarf(pev)) {
+       ret = find_probe_trace_events_from_map(pev, tevs, max_tevs, 
target);
+       if (ret > 0)
+           return ret; /* Found in symbol table */
+   }
+
/* Convert perf_probe_event with debuginfo */
ret = try_to_find_probe_trace_events(pev, tevs, max_tevs, target);
if (ret != 0)
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index e01e994..bb65a7b 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -135,6 +135,7 @@ extern int show_available_vars(struct perf_probe_event 
*pevs, int npevs,
   struct strfilter *filter, bool externs);
 extern int show_available_funcs(const char *module, struct strfilter *filter,
bool user);
+bool arch__prefers_symtab(void);
 
 /* Maximum index number of event-name postfix */
 #define MAX_EVENT_INDEX 1024
-- 
2.4.0
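
The lookup order this patch introduces can be summarised in a small decision function. This is only a sketch under simplifying assumptions: the sketch_* names are hypothetical, each lookup is reduced to a hit count, and error returns and any later fallback in convert_to_probe_trace_events() are not modelled.

```c
#include <stdbool.h>

enum sketch_source { SKETCH_NONE, SKETCH_SYMTAB, SKETCH_DEBUGINFO };

/*
 * Pick the source that resolves a probe.
 * prefers_symtab: what arch__prefers_symtab() would return.
 * needs_dwarf:    the probe uses features only debuginfo can satisfy.
 * *_hits:         number of events each lookup would find.
 */
static inline enum sketch_source sketch_pick_source(bool prefers_symtab,
						     bool needs_dwarf,
						     int symtab_hits,
						     int debuginfo_hits)
{
	/* New fast path: symbol table first, when DWARF is not required. */
	if (prefers_symtab && !needs_dwarf && symtab_hits > 0)
		return SKETCH_SYMTAB;

	/* Otherwise fall through to the debuginfo lookup. */
	if (debuginfo_hits > 0)
		return SKETCH_DEBUGINFO;

	return SKETCH_NONE;
}
```

On an architecture where the weak default returns false, the first branch is never taken and the behaviour stays as it was before the patch.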


[BACKPORT PATCH 4/5] perf probe ppc64le: Fixup function entry if using kallsyms lookup

2015-06-23 Thread Naveen N. Rao
On powerpc ABIv2, if no debug-info is found and we use kallsyms, we need
to fixup the function entry to point to the local entry point. Use
offset of 8 since current toolchains always generate 2 instructions (8
bytes).

Signed-off-by: Naveen N. Rao naveen.n@linux.vnet.ibm.com
Reviewed-by: Srikar Dronamraju sri...@linux.vnet.ibm.com
Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com
Cc: Masami Hiramatsu masami.hiramatsu...@hitachi.com
Cc: Michael Ellerman m...@ellerman.id.au
Cc: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: 
http://lkml.kernel.org/r/92253021e77a104b23b615c8c23bf9501dfe60bf.1430217967.git.naveen.n@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com
---
 tools/perf/arch/powerpc/util/sym-handling.c | 16 
 tools/perf/util/probe-event.c   |  5 +
 tools/perf/util/probe-event.h   |  2 ++
 3 files changed, 23 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/sym-handling.c 
b/tools/perf/arch/powerpc/util/sym-handling.c
index e8f95e5..b28b5ec 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -8,6 +8,7 @@
 
 #include "debug.h"
 #include "symbol.h"
+#include "map.h"
 #include "probe-event.h"
 
 #ifdef HAVE_LIBELF_SUPPORT
@@ -31,4 +32,19 @@ bool arch__prefers_symtab(void)
 {
return true;
 }
+
+#define PPC64LE_LEP_OFFSET 8
+
+void arch__fix_tev_from_maps(struct perf_probe_event *pev,
+struct probe_trace_event *tev, struct map *map)
+{
+   /*
+* ppc64 ABIv2 local entry point is currently always 2 instructions
+* (8 bytes) after the global entry point.
+*/
+   if (!pev->uprobes && map->dso->symtab_type == 
DSO_BINARY_TYPE__KALLSYMS) {
+       tev->point.address += PPC64LE_LEP_OFFSET;
+       tev->point.offset += PPC64LE_LEP_OFFSET;
+   }
+}
 #endif
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 7cc4e47..963efb0 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2206,6 +2206,10 @@ static int probe_function_filter(struct map *map 
__maybe_unused,
 #define strdup_or_goto(str, label) \
({ char *__p = strdup(str); if (!__p) goto label; __p; })
 
+void __weak arch__fix_tev_from_maps(struct perf_probe_event *pev 
__maybe_unused,
+   struct probe_trace_event *tev __maybe_unused,
+   struct map *map __maybe_unused) { }
+
 /*
  * Find probe function addresses from map.
  * Return an error or the number of found probe_trace_event
@@ -2319,6 +2323,7 @@ static int find_probe_trace_events_from_map(struct 
perf_probe_event *pev,
    strdup_or_goto(pev->args[i].type,
nomem_out);
}
+   arch__fix_tev_from_maps(pev, tev, map);
}
 
 out:
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index bb65a7b..006adba 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -136,6 +136,8 @@ extern int show_available_vars(struct perf_probe_event 
*pevs, int npevs,
 extern int show_available_funcs(const char *module, struct strfilter *filter,
bool user);
 bool arch__prefers_symtab(void);
+void arch__fix_tev_from_maps(struct perf_probe_event *pev,
+struct probe_trace_event *tev, struct map *map);
 
 /* Maximum index number of event-name postfix */
 #define MAX_EVENT_INDEX 1024
-- 
2.4.0
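
The fixup in arch__fix_tev_from_maps() amounts to a fixed 8-byte bump applied only to kernel probes resolved through kallsyms. A minimal sketch of that rule follows; the sketch_* names are hypothetical and the two booleans stand in for the pev->uprobes and symtab_type checks in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two 4-byte instructions separate the global and local entry points. */
#define SKETCH_LEP_OFFSET 8

struct sketch_probe_point {
	uint64_t address;
	uint64_t offset;
};

/* Move a kallsyms-derived kernel probe to the local entry point. */
static inline void sketch_fix_point(struct sketch_probe_point *pt,
				    bool is_uprobe, bool from_kallsyms)
{
	if (!is_uprobe && from_kallsyms) {
		pt->address += SKETCH_LEP_OFFSET;
		pt->offset += SKETCH_LEP_OFFSET;
	}
}
```

Symbols read from an ELF symbol table do not need this bump in the series above, because their offset is decoded from st_other instead of assumed.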


Re: [PATCH v2 6/7]powerpc/powernv: generic nest pmu event functions

2015-06-23 Thread Madhavan Srinivasan


On Tuesday 23 June 2015 07:19 AM, Sukadev Bhattiprolu wrote:
 Madhavan Srinivasan [ma...@linux.vnet.ibm.com] wrote:
 | From: Madhavan Srinivasan ma...@linux.vnet.ibm.com
 | Subject: [PATCH v2 6/7]powerpc/powernv: generic nest pmu event functions
 | 
 | Add generic format attribute and set of generic nest pmu related
 | event functions to be used by each nest pmu. Add code to register nest pmus.
 | 
 | Cc: Michael Ellerman m...@ellerman.id.au
 | Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
 | Cc: Paul Mackerras pau...@samba.org
 | Cc: Anton Blanchard an...@samba.org
 | Cc: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
 | Cc: Anshuman Khandual khand...@linux.vnet.ibm.com
 | Cc: Stephane Eranian eran...@google.com
 | Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com
 | ---
 |  arch/powerpc/perf/nest-pmu.c | 109 
 +++
 |  1 file changed, 109 insertions(+)
 | 
 | diff --git a/arch/powerpc/perf/nest-pmu.c b/arch/powerpc/perf/nest-pmu.c
 | index 8fad2d9..a662c14 100644
 | --- a/arch/powerpc/perf/nest-pmu.c
 | +++ b/arch/powerpc/perf/nest-pmu.c
 | @@ -13,6 +13,108 @@
 |  static struct perchip_nest_info p8_perchip_nest_info[P8_MAX_CHIP];
 |  static struct nest_pmu *per_nest_pmu_arr[P8_MAX_NEST_PMUS];
 | 
 | +PMU_FORMAT_ATTR(event, "config:0-20");
 | +struct attribute *p8_nest_format_attrs[] = {
 | +   &format_attr_event.attr,
 | +   NULL,
 | +};
 | +
 | +struct attribute_group p8_nest_format_group = {
 | +   .name = "format",
 | +   .attrs = p8_nest_format_attrs,
 | +};

 Could this be included in previous/separate patch? That way,
 this patch could focus on just registering the nest-pmu.

Yes. Will move it.

 | +
 | +static int p8_nest_event_init(struct perf_event *event)
 | +{
 | +   int chip_id;
 | +
 | +   if (event->attr.type != event->pmu->type)
 | +       return -ENOENT;
 | +
 | +   /* Sampling not supported yet */
 | +   if (event->hw.sample_period)
 | +       return -EINVAL;
 | +
 | +   /* unsupported modes and filters */
 | +   if (event->attr.exclude_user   ||
 | +       event->attr.exclude_kernel ||
 | +       event->attr.exclude_hv     ||
 | +       event->attr.exclude_idle   ||
 | +       event->attr.exclude_host   ||
 | +       event->attr.exclude_guest)
 | +       return -EINVAL;
 | +
 | +   if (event->cpu < 0)
 | +       return -EINVAL;
 | +
 | +   chip_id = topology_physical_package_id(event->cpu);
 | +   event->hw.event_base = event->attr.config +
 | +       p8_perchip_nest_info[chip_id].vbase;
 | +
 | +   return 0;
 | +}
 | +
 | +static void p8_nest_read_counter(struct perf_event *event)
 | +{
 | +   u64 *addr;
 | 

 Define as uint64_t so we can eliminate one cast below? Would also
 be consistent with p8_nest_perf_event_update().
Yes, makes sense.

 | 
 | +   u64 data = 0;
 | +
 | +   addr = (u64 *)event->hw.event_base;
 | +   data = __be64_to_cpu((uint64_t)*addr);
 | +   local64_set(&event->hw.prev_count, data);
 | +}
 | +
 | +static void p8_nest_perf_event_update(struct perf_event *event)
 | +{
 | +   u64 counter_prev, counter_new, final_count;
 | +   uint64_t *addr;
 | +
 | +   addr = (u64 *)event->hw.event_base;

 uint64_t *?
My bad. Will change it.

 | +   counter_prev = local64_read(&event->hw.prev_count);
 | +   counter_new = __be64_to_cpu((uint64_t)*addr);

 Redundant cast? addr is already uint64_t *?

Nice catch. Will remove it.
 | +   final_count = counter_new - counter_prev;
 | +
 | +   local64_set(&event->hw.prev_count, counter_new);
 | +   local64_add(final_count, &event->count);
 | +}
 | +
 | +static void p8_nest_event_start(struct perf_event *event, int flags)
 | +{

 Check PERF_EF_RELOAD before reloading?

 | +   event->hw.state = 0;
 | +   p8_nest_read_counter(event);
 | +}
 | +
 | +static void p8_nest_event_stop(struct perf_event *event, int flags)
 | +{

 Check PERF_EF_UPDATE when stopping?

 | +   p8_nest_perf_event_update(event);
 | +}
 | +
 | +static int p8_nest_event_add(struct perf_event *event, int flags)
 | +{

 Check PERF_EF_START flags before starting the counter on an -add()?
Will add the flags.

 | +   p8_nest_event_start(event, flags);
 | +   return 0;
 | +}
 | +
 | +/*
 | + * Populate pmu ops in the structure
 | + */
 | +static int update_pmu_ops(struct nest_pmu *pmu)
 | +{
 | +   if (!pmu)
 | +   return -EINVAL;
 | +
 | +   pmu->pmu.task_ctx_nr = perf_invalid_context;
 | +   pmu->pmu.event_init = p8_nest_event_init;
 | +   pmu->pmu.add = p8_nest_event_add;
 | +   pmu->pmu.del = p8_nest_event_stop;
 | +   pmu->pmu.start = p8_nest_event_start;
 | +   pmu->pmu.stop = p8_nest_event_stop;
 | +   pmu->pmu.read = p8_nest_perf_event_update;
 | +   pmu->pmu.attr_groups = pmu->attr_groups;
 | +
 | +   return 0;
 | +}
 | +
 |  /*
 |   * Populate event name and string in attribute
 |   */
 | @@ -106,6 +208,7 @@ static int nest_pmu_create(struct device_node *dev, int 
 pmu_index)
 | /* Save the name to register the PMU with it */
 | sprintf(buf, "Nest_%s", (char *)pp->value);
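
The start/update pair discussed in this review follows the usual free-running counter pattern: latch the current big-endian counter value on start, then accumulate the delta on every update. The sketch below illustrates that pattern in plain C. The sketch_* names are hypothetical, a byte array stands in for the memory-mapped counter, and the big-endian read is done by hand instead of with __be64_to_cpu().

```c
#include <stdint.h>

/* Decode a 64-bit big-endian value from 8 raw bytes. */
static inline uint64_t sketch_read_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

struct sketch_event {
	uint64_t prev_count;	/* last raw counter value seen */
	uint64_t count;		/* accumulated delta */
};

/* Start: remember the current raw value as the baseline. */
static inline void sketch_event_start(struct sketch_event *ev,
				      const unsigned char *counter)
{
	ev->prev_count = sketch_read_be64(counter);
}

/* Update: add the change since the last read, then move the baseline. */
static inline void sketch_event_update(struct sketch_event *ev,
				       const unsigned char *counter)
{
	uint64_t now = sketch_read_be64(counter);

	ev->count += now - ev->prev_count;
	ev->prev_count = now;
}
```

Because the subtraction is unsigned, a counter that wraps around between two reads still produces the correct delta.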
 

Re: powerpc,numa: Memory hotplug to memory-less nodes ?

2015-06-23 Thread Bharata B Rao
So would it be correct to say that memory hotplug to a memory-less node
isn't supported by the PowerPC kernel? Should I enforce the same in QEMU
for PowerKVM?

On Mon, Jun 22, 2015 at 10:18 AM, Bharata B Rao bharata@gmail.com wrote:
 Hi,

 While developing memory hotplug support in QEMU for PowerKVM, I
 realized that the guest kernel has specific checks to prevent hot addition
 of memory to a memory-less node.

 I am referring to arch/powerpc/mm/numa.c:hot_add_scn_to_nid() which
 has explicit checks to ensure that it returns a nid that has some
 memory (NODE_DATA(nid)->node_spanned_pages) even when the user wants to
 hotplug to a node that currently has zero memory.

 Is this limitation by design ?

 Regards,
 Bharata.
 --
 http://raobharata.wordpress.com/



-- 
http://raobharata.wordpress.com/

Re: [PATCH 1/2] Move the pt_regs_offset struct definition from arch to common include file

2015-06-23 Thread Michael Ellerman
On Tue, 2015-06-23 at 09:48 -0400, David Long wrote:
 On 06/22/15 23:32, Michael Ellerman wrote:
  On Fri, 2015-06-19 at 10:12 -0400, David Long wrote:
  On 06/19/15 00:19, Michael Ellerman wrote:
  On Mon, 2015-06-15 at 12:42 -0400, David Long wrote:
  From: David A. Long dave.l...@linaro.org
 
  The pt_regs_offset structure is used for HAVE_REGS_AND_STACK_ACCESS_API
 feature and has identical definitions in four different arch ptrace.h
  include files. It seems unlikely that definition would ever need to be
  changed regardless of architecture so lets move it into
  include/linux/ptrace.h.
 
  Signed-off-by: David A. Long dave.l...@linaro.org
  ---
 arch/powerpc/kernel/ptrace.c | 5 -
 
  Built and booted on powerpc, but is there an easy way to actually test 
  the code
  paths in question?
 
  There is an easy way to smoke test it on all architectures that also
  implement kprobes (which powerpc does).  If I'm understanding the
  powerpc code correctly (WRT register naming conventions) just do the
  following:
 
  cd /sys/kernel/debug/tracing
  echo 'p do_fork %gpr0' > kprobe_events
  echo 1 > events/kprobes/enable
  ls
  cat trace
  echo 0 > events/kprobes/enable
 
  Every fork() call done on the system between those two echo commands
  (hence the ls) should append a line to the trace file.  For a more
  exhaustive test one could repeat this sequence for every register in the
  architecture.
 
  OK, so I went the whole hog and did:
 
  $ echo 'p do_fork %gpr0 %gpr1 %gpr2 %gpr3 %gpr4 %gpr5 %gpr6 %gpr7 %gpr8 
  %gpr9 %gpr10 %gpr11 %gpr12 %gpr13 %gpr14 %gpr15 %gpr16 %gpr17 %gpr18 %gpr19 
  %gpr20 %gpr21 %gpr22 %gpr23 %gpr24 %gpr25 %gpr26 %gpr27 %gpr28 %gpr29 
  %gpr30 %gpr31 %nip %msr %ctr %link %xer %ccr %softe %trap %dar %dsisr' > 
  kprobe_events
 
  And I get:
 
   bash-2057  [001] d...   535.433941: p_do_fork_0: 
  (do_fork+0x8/0x490) arg1=0xc00094d0 arg2=0xc001fbe9be30 
  arg3=0xc1133bb8 arg4=0x1200011 arg5=0x0 arg6=0x0 arg7=0x0 
  arg8=0x3fff7c885940 arg9=0x1 arg10=0xc001fbe9bea0 arg11=0x0 arg12=0xc01 
  arg13=0xc00094c8 arg14=0xcfdc0480 arg15=0x0 
  arg16=0x2200 arg17=0x1016d6e8 arg18=0x0 arg19=0x4400 arg20=0x0 
  arg21=0x10037c82208 arg22=0x1017b008 arg23=0x10143d18 arg24=0x10178854 
  arg25=0x10144f90 arg26=0x10037c821e8 arg27=0x0 arg28=0x0 arg29=0x0 
  arg30=0x0 arg31=0x809 arg32=0x3788c010 arg33=0xc00a7fe8 
  arg34=0x80029033 arg35=0xc00094c8 arg36=0xc00094d0 
  arg37=0x0 arg38=0x4844 arg39=0x1 arg40=0x700 arg41=0xc001fbe9bd50 
  arg42=0xc001fbe9bd30
 
  Which is ugly as hell, but appears unchanged since before your patch.
 
 
 Excellent.  Many thanks.

No worries.

Did I already send you an ack? Have another one in case:

Acked-by: Michael Ellerman m...@ellerman.id.au


  I take it it's expected that the names are not decoded in the output?

 Yes.

In fact I don't see anywhere that uses the reverse decoding, ie.
regs_query_register_name().

cheers



Re: [PATCH] powerpc: Set the correct kernel taint on machine check errors.

2015-06-23 Thread Michael Ellerman
On Mon, 2015-06-15 at 13:25 +1000, Daniel Axtens wrote:
 This means the 'M' flag will work properly when the kernel prints a backtrace.
 
 Signed-off-by: Daniel Axtens d...@axtens.net
 ---
  arch/powerpc/kernel/traps.c | 2 ++
  1 file changed, 2 insertions(+)
 
 diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
 index 6530f1b..37de90f 100644
 --- a/arch/powerpc/kernel/traps.c
 +++ b/arch/powerpc/kernel/traps.c
 @@ -297,6 +297,8 @@ long machine_check_early(struct pt_regs *regs)
  
   __this_cpu_inc(irq_stat.mce_exceptions);
  
 + add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);

I'm not sure about the lockdep bit.

I guess it's safer to just declare it fubar.

Does this fix a bug, or just nice to have?

cheers



Re: [PATCH SLOF 4/5] disk-label: add support for booting from GPT FAT partition

2015-06-23 Thread Segher Boessenkool
On Tue, Jun 23, 2015 at 09:34:44AM +0200, Thomas Huth wrote:
  +: load-from-gpt-partition ( [ addr ] -- size | TRUE )
 
 What do you mean with "addr" in square brackets? Is it optional?

And "size | TRUE"?  The code even returns false instead, which
usually is a valid size (0).  Just always return a flag?  Or maybe
you mean something like ( -- false | size true ) .  Not going to
read the code, I cannot keep track of the stack, bringing us to...


 Hmm, I wonder whether we need a proper coding conventions spec for
 writing Forth code ... (at least about the indentation depths ...) ;-)

Write readable code.  That means in part, do not write long definitions
(longer than a few lines).

There, all coding conventions you'll ever need :-)


Almost all short definitions (with good names!) are easily readable
(with a little effort if the subject matter is tricky).  No longer
definitions are ever readable (well, there are exceptions; not many).

Don't get hung up on "how many spaces should I indent"...  Since your
words are short, you won't have more than two levels of indent anyway :-)

Adding extra spacing to group things is also very helpful.

Minor things...  Most words want a stack comment.  If you need stack
comments inside a definition, it is too complex.  If there is any
significant amount of stack juggling, the word is too complex.  If
the word would be too complex, you need to factor it.  If you cannot
easily split off factors, your solution is too complex.  If it is
hard to think of good names for the factors, that is simply because
naming things is the hardest part of programming (but see also the
previous point).

You also want short words that do one little thing because you _do_
test your code.


Segher

Re: [2/3] powerpc/iommu: Cleanup setting of DMA base/offset

2015-06-23 Thread Michael Ellerman
On Mon, 2015-22-06 at 04:32:05 UTC, Benjamin Herrenschmidt wrote:
 Now that the table and the offset can co-exist, we no longer need
 to flip/flop, we can just establish both once at boot time.
 
 Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
 ---
  arch/powerpc/platforms/powernv/pci-ioda.c | 50 
 +++
  arch/powerpc/platforms/pseries/iommu.c|  3 +-
  arch/powerpc/sysdev/dart_iommu.c  | 16 ++


This doesn't apply to my next, since the VFIO changes went in:

  Applying patch #487103 to current directory
  Description: [2/3] powerpc/iommu: Cleanup setting of DMA base/offset
  patching file arch/powerpc/platforms/powernv/pci-ioda.c
  Hunk #1 FAILED at 1621.
  Hunk #2 FAILED at 1653.
  Hunk #3 FAILED at 1841.
  Hunk #4 FAILED at 1882.
  Hunk #5 FAILED at 1977.
  5 out of 5 hunks FAILED -- saving rejects to file 
arch/powerpc/platforms/powernv/pci-ioda.c.rej
  patching file arch/powerpc/platforms/pseries/iommu.c
  Hunk #1 succeeded at 1253 (offset 92 lines).
  patching file arch/powerpc/sysdev/dart_iommu.c
  Hunk #1 succeeded at 313 (offset 7 lines).
  Hunk #2 succeeded at 361 (offset 7 lines).


I could fix the conflicts but then the result wouldn't have been tested.

cheers