Re: [GIT PULL] EFI urgent fixes

2014-09-26 Thread Ingo Molnar

* Matt Fleming  wrote:

> On Fri, 26 Sep, at 01:35:10PM, Ingo Molnar wrote:
> > 
> > A delta fix would be nice at this point, I've got the x86 side 
> > tested and besides this build bug it's ready to go to Linus.
> 
> Given that the typo doesn't result in an actual build failure, do you
> still want the following delta fix?

Ok, this one can indeed wait for the merge window - I'll send the 
other bits to Linus in a minute.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 1/2] Return a value from printk_ratelimited

2014-09-26 Thread Joe Perches
On Fri, 2014-09-26 at 22:36 -0700, Omar Sandoval wrote:
> printk returns an integer; there's no reason for printk_ratelimited to swallow
> it.
> 
> Signed-off-by: Omar Sandoval 
> Acked-by: Paul E. McKenney 
> ---

I'd prefer to keep it the way it is actually.

I've submitted several patches to convert the int return
to void for printk derived functions recently.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 0/2] Move BTRFS RCU string to common library

2014-09-26 Thread Omar Sandoval
On Fri, Sep 26, 2014 at 10:36:47PM -0700, Omar Sandoval wrote:
> This patch series makes the generic RCU string library used internally by 
> BTRFS
> accessible by anyone.
> 
> The first patch makes printk_ratelimited pass on the return value from printk.
> Version 3 gives the temporary return variable a unique name to avoid name
> clashes.
> 
> The second patch moves the RCU string header and updates the BTRFS code that
> depends on it. Version 3 adds the __rcu annotation to the relevant functions.
> 
> Version 3 also adds Paul's ack and Josh's review.
> 
> Omar Sandoval (2):
>   Return a value from printk_ratelimited
>   Move BTRFS RCU string to common library
> 
>  fs/btrfs/check-integrity.c |  6 +--
>  fs/btrfs/dev-replace.c | 19 +-
>  fs/btrfs/disk-io.c |  6 +--
>  fs/btrfs/extent_io.c   |  4 +-
>  fs/btrfs/ioctl.c   |  4 +-
>  fs/btrfs/raid56.c  |  2 +-
>  fs/btrfs/rcu-string.h  | 56 
>  fs/btrfs/scrub.c   | 15 
>  fs/btrfs/super.c   |  2 +-
>  fs/btrfs/volumes.c | 14 +++
>  include/linux/printk.h |  4 +-
>  include/linux/rcustring.h  | 92 
> ++
>  12 files changed, 132 insertions(+), 92 deletions(-)
>  delete mode 100644 fs/btrfs/rcu-string.h
>  create mode 100644 include/linux/rcustring.h
> 

I forgot to mention: this patch series is against 3.17-rc6.

-- 
Omar
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 2/2] Move BTRFS RCU string to common library

2014-09-26 Thread Omar Sandoval
The RCU-friendly string API used internally by BTRFS is generic enough for
common use. This doesn't add any new functionality, but instead just moves the
code and documents the existing API.

Signed-off-by: Omar Sandoval 
Reviewed-by: Josh Triplett 
Acked-by: Paul E. McKenney 
---
 fs/btrfs/check-integrity.c |  6 +--
 fs/btrfs/dev-replace.c | 19 +-
 fs/btrfs/disk-io.c |  6 +--
 fs/btrfs/extent_io.c   |  4 +-
 fs/btrfs/ioctl.c   |  4 +-
 fs/btrfs/raid56.c  |  2 +-
 fs/btrfs/rcu-string.h  | 56 
 fs/btrfs/scrub.c   | 15 
 fs/btrfs/super.c   |  2 +-
 fs/btrfs/volumes.c | 14 +++
 include/linux/rcustring.h  | 92 ++
 11 files changed, 129 insertions(+), 91 deletions(-)
 delete mode 100644 fs/btrfs/rcu-string.h
 create mode 100644 include/linux/rcustring.h

diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index ce92ae3..4ccd7da 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -94,6 +94,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "ctree.h"
 #include "disk-io.h"
 #include "hash.h"
@@ -103,7 +104,6 @@
 #include "print-tree.h"
 #include "locking.h"
 #include "check-integrity.h"
-#include "rcu-string.h"
 
 #define BTRFSIC_BLOCK_HASHTABLE_SIZE 0x1
 #define BTRFSIC_BLOCK_LINK_HASHTABLE_SIZE 0x1
@@ -851,8 +851,8 @@ static int btrfsic_process_superblock_dev_mirror(
printk_in_rcu(KERN_INFO "New initial S-block (bdev %p, 
%s)"
 " @%llu (%s/%llu/%d)\n",
 superblock_bdev,
-rcu_str_deref(device->name), dev_bytenr,
-dev_state->name, dev_bytenr,
+rcu_string_dereference(device->name),
+dev_bytenr, dev_state->name, dev_bytenr,
 superblock_mirror_num);
list_add(_tmp->all_blocks_node,
 >all_blocks_list);
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index eea26e1..87d10cc 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "ctree.h"
 #include "extent_map.h"
@@ -34,7 +35,6 @@
 #include "volumes.h"
 #include "async-thread.h"
 #include "check-integrity.h"
-#include "rcu-string.h"
 #include "dev-replace.h"
 #include "sysfs.h"
 
@@ -376,9 +376,9 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
printk_in_rcu(KERN_INFO
  "BTRFS: dev_replace from %s (devid %llu) to %s started\n",
  src_device->missing ? "" :
-   rcu_str_deref(src_device->name),
+ rcu_string_dereference(src_device->name),
  src_device->devid,
- rcu_str_deref(tgt_device->name));
+ rcu_string_dereference(tgt_device->name));
 
tgt_device->total_bytes = src_device->total_bytes;
tgt_device->disk_total_bytes = src_device->disk_total_bytes;
@@ -528,9 +528,10 @@ static int btrfs_dev_replace_finishing(struct 
btrfs_fs_info *fs_info,
printk_in_rcu(KERN_ERR
  "BTRFS: btrfs_scrub_dev(%s, %llu, %s) failed 
%d\n",
  src_device->missing ? "" :
-   rcu_str_deref(src_device->name),
+ rcu_string_dereference(src_device->name),
  src_device->devid,
- rcu_str_deref(tgt_device->name), scrub_ret);
+ rcu_string_dereference(tgt_device->name),
+ scrub_ret);
btrfs_dev_replace_unlock(dev_replace);
mutex_unlock(>fs_info->fs_devices->device_list_mutex);
mutex_unlock(>fs_info->chunk_mutex);
@@ -544,9 +545,9 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info 
*fs_info,
printk_in_rcu(KERN_INFO
  "BTRFS: dev_replace from %s (devid %llu) to %s) 
finished\n",
  src_device->missing ? "" :
-   rcu_str_deref(src_device->name),
+ rcu_string_dereference(src_device->name),
  src_device->devid,
- rcu_str_deref(tgt_device->name));
+ rcu_string_dereference(tgt_device->name));
tgt_device->is_tgtdev_for_dev_replace = 0;
tgt_device->devid = src_device->devid;
src_device->devid = BTRFS_DEV_REPLACE_DEVID;
@@ -805,10 +806,10 @@ static int btrfs_dev_replace_kthread(void *data)
printk_in_rcu(KERN_INFO
"BTRFS: continuing dev_replace from %s (devid %llu) to 
%s @%u%%\n",
dev_replace->srcdev->missing 

[PATCH v3 0/2] Move BTRFS RCU string to common library

2014-09-26 Thread Omar Sandoval
This patch series makes the generic RCU string library used internally by BTRFS
accessible by anyone.

The first patch makes printk_ratelimited pass on the return value from printk.
Version 3 gives the temporary return variable a unique name to avoid name
clashes.

The second patch moves the RCU string header and updates the BTRFS code that
depends on it. Version 3 adds the __rcu annotation to the relevant functions.

Version 3 also adds Paul's ack and Josh's review.

Omar Sandoval (2):
  Return a value from printk_ratelimited
  Move BTRFS RCU string to common library

 fs/btrfs/check-integrity.c |  6 +--
 fs/btrfs/dev-replace.c | 19 +-
 fs/btrfs/disk-io.c |  6 +--
 fs/btrfs/extent_io.c   |  4 +-
 fs/btrfs/ioctl.c   |  4 +-
 fs/btrfs/raid56.c  |  2 +-
 fs/btrfs/rcu-string.h  | 56 
 fs/btrfs/scrub.c   | 15 
 fs/btrfs/super.c   |  2 +-
 fs/btrfs/volumes.c | 14 +++
 include/linux/printk.h |  4 +-
 include/linux/rcustring.h  | 92 ++
 12 files changed, 132 insertions(+), 92 deletions(-)
 delete mode 100644 fs/btrfs/rcu-string.h
 create mode 100644 include/linux/rcustring.h

-- 
2.1.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 1/2] Return a value from printk_ratelimited

2014-09-26 Thread Omar Sandoval
printk returns an integer; there's no reason for printk_ratelimited to swallow
it.

Signed-off-by: Omar Sandoval 
Acked-by: Paul E. McKenney 
---
 include/linux/printk.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index d78125f..89bb7ab 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -343,12 +343,14 @@ extern asmlinkage void dump_stack(void) __cold;
 #ifdef CONFIG_PRINTK
 #define printk_ratelimited(fmt, ...)   \
 ({ \
+   int __ret_printk_ratelimited = 0;   \
static DEFINE_RATELIMIT_STATE(_rs,  \
  DEFAULT_RATELIMIT_INTERVAL,   \
  DEFAULT_RATELIMIT_BURST); \
\
if (__ratelimit(&_rs))  \
-   printk(fmt, ##__VA_ARGS__); \
+   __ret_printk_ratelimited = printk(fmt, ##__VA_ARGS__);  \
+   __ret_printk_ratelimited;   \
 })
 #else
 #define printk_ratelimited(fmt, ...)   \
-- 
2.1.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RESEND PATCH] acpi-cpufreq: get the cur_freq from acpi_processor_performance states

2014-09-26 Thread Wang Weidong
On 2014/9/27 7:21, Rafael J. Wysocki wrote:
> On Thursday, August 21, 2014 01:55:15 PM Wang Weidong wrote:
>> As the initialized freq_tables maybe different from the p-states
>> values, so the array index is different as well.
>>
>> p-states value: [2400 2400 2000 ...], while the freq_tables:
>> [2400 2000 ... CPUFREQ_TABLE_END]. After setted the freqs 2000,
>> the perf->state is 3 while the freqs_table's index should be 2.
>> So when call the get_cur_freq_on_cpu, the freqs value we get
>> is 2400.
>>
>> So, fix the problem with the correct tables.
> 
> What you're saying is basically that freq_table and perf->states
> diverge at one point.  Shouldn't we re-generate freq_table in that
> case instead of fixing up get_cur_freq_on_cpu() only in a quite
> indirect way? 
> 
Hi Rafael,

Thanks for your reply.

You mean that we should re-generate the freq_table in that case?
Could we fix the table init like this:

--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -779,7 +779,7 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy 
*policy)

/* table init */
for (i = 0; i < perf->state_count; i++) {
-   if (i > 0 && perf->states[i].core_frequency >=
+   if (i > 0 && perf->states[i].core_frequency >
data->freq_table[valid_states-1].frequency / 1000)
continue;

when the value is same, we just keep the value into the freq_table.

Regards,
Wang

>> Signed-off-by: Wang Weidong 
>> ---
>>  drivers/cpufreq/acpi-cpufreq.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
>> index b0c18ed..ac93885 100644
>> --- a/drivers/cpufreq/acpi-cpufreq.c
>> +++ b/drivers/cpufreq/acpi-cpufreq.c
>> @@ -365,6 +365,7 @@ static u32 get_cur_val(const struct cpumask *mask)
>>  static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
>>  {
>>  struct acpi_cpufreq_data *data = per_cpu(acfreq_data, cpu);
>> +struct acpi_processor_performance *perf;
>>  unsigned int freq;
>>  unsigned int cached_freq;
>>  
>> @@ -375,7 +376,8 @@ static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
>>  return 0;
>>  }
>>  
>> -cached_freq = data->freq_table[data->acpi_data->state].frequency;
>> +perf = data->acpi_data;
>> +cached_freq = perf->states[perf->state].core_frequency * 1000;
>>  freq = extract_freq(get_cur_val(cpumask_of(cpu)), data);
>>  if (freq != cached_freq) {
>>  /*
>>
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[3.10.49 stable kernel] crash in process_one_work

2014-09-26 Thread Arun KS
Hello Tejun,

I m seen the following crash in 3.10 kernel workqueue.

[ 1133.893817] Unable to handle kernel NULL pointer dereference at
virtual address 0004
[ 1133.893821] pgd = c0004000
[ 1133.893827] [0004] *pgd=
[ 1133.893834] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 1133.893841] Modules linked in:
[ 1133.893849] CPU: 2 PID: 5359 Comm: kworker/u8:20 Not tainted
3.10.28-g99b6153-6-gc32dab7 #1
[ 1133.893859] task: d8c2aa00 ti: e79a4000 task.ti: e79a4000
[ 1133.893873] PC is at process_one_work+0x18/0x448
[ 1133.893878] LR is at process_one_work+0x14/0x448
[ 1133.893887] pc : []lr : []psr: 400f0093
   sp : e79a5ef8  ip : daf7f100  fp : 0089
[ 1133.893891] r10: daf7f118  r9 : ee80e820  r8 : ee80e800
[ 1133.893897] r7 : c111872e  r6 : ee80e800  r5 : ed7cf150  r4 : daf7f100
[ 1133.893902] r3 : ffe0  r2 : 0081  r1 : ed7cf150  r0 : 
[ 1133.893908] Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM
Segment kernel
[ 1133.893914] Control: 10c5383d  Table: a7dbc06a  DAC: 0015


struct pool_workqueue *pwq = get_work_pwq(work);
get_work_pwq returned NULL because WORK_STRUCT_PWQ was not set on
work_struct->data.

The work_struct looks likes this,

crash> struct work_struct ed7cf150
struct work_struct {
  data = {
counter = 0xffe0
  },
  entry = {
next = 0xed7cf154,
prev = 0xed7cf154
  },
  func = 0xc0140ac4 
}


The value of data is 0xffe0. I can think of only two reason for
this value in data.
1) driver calls INIT_WORK on same worker again after queuing.
2) workqueue subsytem called clear_work_data(work);

>From the above details of the work_struct shows that the work is
queued from kernel/asyc.c.
async_schedule dynamically allocates the work_struct and possibility
of calling INIT_WORK is not there.

I m suspecting the second reason.

Your inputs are really appreciated.
Please let me know if you want any more information from the crashed system.

Thanks,
Arun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [OPW kernel] checkpatch: Does anyone care that comments are freeform aligned?

2014-09-26 Thread Joe Perches
On Fri, 2014-09-26 at 16:03 -0700, j...@joshtriplett.org wrote:
> On Wed, Sep 24, 2014 at 11:41:14AM -0700, Joe Perches wrote:
> > On Wed, 2014-09-24 at 07:47 +0200, Julia Lawall wrote:
> > > In the following patch extract, one line is indented with spaces rather 
> > > than tabs.  Is it intentional that checkpatch doesn't complain about 
> > > this, 
> > > I guess due to the line being a comment?
> > []
> > >  struct bcm_hdr_suppression_contextinfo {
> > > - UCHAR ucaHdrSuppressionInBuf[MAX_PHS_LENGTHS]; /* Intermediate buffer 
> > > to accumulate pkt Header for PHS */
> > > - UCHAR ucaHdrSuppressionOutBuf[MAX_PHS_LENGTHS + PHSI_LEN]; /* 
> > > Intermediate buffer containing pkt Header after PHS */
> > > +/* Intermediate buffer to accumulate pkt Header for PHS */
> > > + UCHAR ucaHdrSuppressionInBuf[MAX_PHS_LENGTHS];
> > > + /* Intermediate buffer containing pkt Header after PHS */
> > > + UCHAR ucaHdrSuppressionOutBuf[MAX_PHS_LENGTHS + PHSI_LEN];
> > >  };
> > 
> > checkpatch does not care when comments start in any
> > particular position or ensure comments have tabs
> > preceding them.
> > 
> > Does anyone care?
> 
> This should have already been caught by other whitespace checks that
> check for indentations of 8 or more spaces.  Similarly, mixed tab/space
> indentations would get caught by those same checks.

Yes and no.

Leading whitespace over 8 chars would (ie: "^ {8,}" is a warning)
but a statement like:

int foo;/* some comment */
[tab]   [20 spaces] [tab]

would not.

> I don't think checkpatch needs to check those.

For the most part, I agree.

On Fri, 2014-09-26 at 16:06 -0700, j...@joshtriplett.org wrote:
> Lines starting with at least 8 spaces before a comment: 
> $ git grep -E '^ {8,}/\*' | wc -l
> 1544
> 
> I think just about all of those are bugs.

Maybe.

Most of those are from Linus' first git commit.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] vfs: Don't exchange "short" filenames unconditionally.

2014-09-26 Thread Al Viro
On Fri, Sep 26, 2014 at 05:44:42PM +0100, Al Viro wrote:

> FWIW, the longer I'm looking at that, the more it seems that __d_move() and
> __d_materialise_dentry() ought to be merged.  The thing is, both callers of
> the latter are followed by the same sequence:
> write_sequnlock(_lock);
>   _d_rehash(anon);
>   spin_unlock(>d_lock);
> (in one case - verbatim, in another - with goto thrown in before _d_rehash()).
> Now, there's no good reason for doing that write_sequnlock() first; both
> _d_rehash() and spin_unlock() are fast enough to make contention on
> rename_lock a non-issue.  And if we pull those before write_sequnlock() and
> into __d_materialise_dentry(dentry, anon), we get rid not only of code
> duplication, but of the "it returns with anon->d_lock held" caution as well,
> *and* __d_materialise_dentry() becomes very similar to __d_move().
> 
>   Note that __d_move(x, y) depends on !IS_ROOT(y), while
> __d_materialise_dentry(x, y) assumes (and is obviously guaranteed by
> callers) IS_ROOT(y).  IOW, even leaving aside the "is it an exchange" argument
> fed to __d_move(), we could distinguish between these guys by IS_ROOT(y)
> alone.  And there's a fuckload of duplication between them even now, let alone
> after pulling the rehash in.

It's actually even better.  __d_materialise_dentry() has its arguments in
the wrong order.  Flip them around and it's very nearly an exact copy of
what we do in __d_move() when the first argument is IS_ROOT().  Actually,
that case in __d_move() had been added back in 2002 exactly for the needs
of d_splice_alias().  And it's been unreachable since this Bruce has
switched d_splice_alias() to use of __d_materialise_dentry().  In fact,
both d_splice_alias() and d_materialise_unique() should've been switched
to __d_move() instead.

I've done that massage and removal of __d_materialise_dentry().  The things
look a lot saner now that all dentry moving is done in one place, IMO.  Not to
mention that it trims ~50 lines off fs/dcache.c, driving it down to the 3rd
place in fs/*.c ;-)

Anyway, I've put the branch with fixes + that stuff in vfs.git#for-linus;
I have *NOT* put the "keep names on overwriting rename" horror in there,
but I believe that we will need something of that sort.

Linus, it's a real userland regression.  Yes, it affects only shitty scripts,
and they had never been reliable for long names anyway, but we have broken
real-world userland code by that.  What happened there is that old behaviour
of removal-by-rename wrt d_path() used to be similar to that on unlink.
cd /tmp; touch foo; touch bar
exec 42 /tmp/bar (deleted)
Now it gives "/tmp/foo (deleted)".  The trouble being, the same goes for
/proc/*/exe.  If you upgraded a package and that had replaced a running binary,
you used to be able to find process still running that binary.  After the
upgrade.  And yes, it was an unreliable kludge and the value of that was
in the best case dubious, but there really are people who used to do it.
And it broke after we'd added RENAME_EXCHANGE support.  The old behaviour
of switch_names() in short names case used to be "copy the name/length
of target to source dentry, swap hash keys and piss on hash key of target
not matching target's name anymore - the sucker is unhashed anyway".  For
RENAME_EXCHANGE we certainly needed to swap the names in all cases -
precisely because both suckers are hashed in the end.  And since we had
done that in exchange case, it was simpler to do it in all cases.  Especially
since when either name had been a long one we'd always swapped them, so
anything relying on old behaviour had been unreliable anyway.

Except that it turns out that old behaviour (keep the last component of
victim on normal rename) was relied upon...  That ucking fugly patch
reverts the non-swapping renames to the old behaviour.  I don't like it
any more than you do - it *is* ugly as hell and I still can't swear that
we don't have a bogus codepath leading to d_rehash() of rename() victim
(which would break things very badly with the old behaviour).  Still,
it is a real-world userland regression.

Up to you, but I'm afraid that it's on the "we get to keep supporting that"
side of things.

Anyway, what I've got in vfs.git#for-linus right now seems to survive the
obvious tests so far, still running through full xfstests; please, take
a look at that pile.  The last one should go with you as author - it was
just faster to do the change manually than deal with git-am failure when
applying your patch.  I'll take the headers from your mail and update that
commit before sending the pull request.

Shortlog:
Al Viro (8):
  ufs: deal with nfsd/iget races
  pull rehashing and unlocking the target dentry into 
__d_materialise_dentry()
  don't open-code d_rehash() in d_materialise_unique()
  __d_move(): fold manipulations with ->d_child/->d_subdirs
  __d_materialise_dentry(): flip the order of arguments

Re: [PATCH v2 0/3] fuse: Add support for mounts from pid/user namespaces

2014-09-26 Thread Seth Forshee
On Fri, Sep 26, 2014 at 06:41:33PM -0700, Eric W. Biederman wrote:
> >> I am on the fence about what to do when a uid from the filesystem server
> >> or for other filesystems the on-disk data structures does not map, but
> >> make_bad_inode is simpler in conception.  So make_bad_inode seems like
> >> a good place to start.   For fuse especially this isn't hard because
> >> the make_bad_inode calls are already there to handle a change in i_mode.
> >
> > I agree that if we're unsure then make_bad_inode is a more logical place
> > to start, since it's easier to loosen the restrictions later than to
> > tighten them. I've got an initiail implementation that I'm in the
> > process of testing. If you want to take a look I've pushed it to:
> >
> >   git://kernel.ubuntu.com/sforshee/linux.git fuse-userns 
> 
> Thanks.  If I can scrape together enough focus I will look at it.
> 
> As a second best thing here are my prototype from when I was looking at
> performing this fuse conversion last.  Given that you had missed
> checking the from_kuid permission checks, it might be worth comparing
> and seeing if there is something else in these patches that would be
> worth including.

I think all of the from_kuid checks have been there for at least the
last two rounds of patches, but let me know if you see one I missed.

Looking at your patches, I see a lot that looks familiar. Seems like a
good sign :-)

I do see a few things I missed though. I've pointed those out below, and
in those cases I'm updating my patches. There are also some other
differences which I don't believe require any changes on my part, and
I've provided explanations for those. I also asked a few questions.

Miklos, are you satisfied with Eric's explanations about using a single
namespace and the permissions checks on uids? If so I should be able to
send updated patches early next week, otherwise let's continue trying to
resolve the points of disagreement.

> -static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
> +static int fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
> struct kstat *stat)
>  {
>   unsigned int blkbits;
> @@ -905,8 +905,12 @@ static void fuse_fillattr(struct inode *inode, struct 
> fuse_attr *attr,
>   stat->ino = attr->ino;
>   stat->mode = (inode->i_mode & S_IFMT) | (attr->mode & 0);
>   stat->nlink = attr->nlink;
> - stat->uid = make_kuid(_user_ns, attr->uid);
> - stat->gid = make_kgid(_user_ns, attr->gid);
> + stat->uid = make_kuid(fc->user_ns, attr->uid);
> + if (!uid_valid(stat->uid))
> + return -EOVERFLOW;
> + stat->gid = make_kgid(fc->user_ns, attr->gid);
> + if (!gid_valid(stat->gid))
> + return -EOVERFLOW;

If we were to go with the invalid id approach these checks shouldn't be
necessary, and they'll get converted to the overflow ids for userspace.
In my make_bad_inode patches I'm doing this check when we update the
inode and returning -EINVAL if it doesn't map. Then I changed
fuse_fillattr to just copy them from the inode, since fuse always
updates the inode right before calling fuse_fillattr anyway and that
makes it clear why we don't need to check the results of the conversion.

>  static int fuse_do_getattr(struct inode *inode, struct kstat *stat,
> @@ -973,7 +978,8 @@ static int fuse_do_getattr(struct inode *inode, struct 
> kstat *stat,
>  attr_timeout(),
>  attr_version);
>   if (stat)
> - fuse_fillattr(inode, , stat);
> + err = fuse_fillattr(inode, ,
> + stat);
>   }
>   }
>   return err;

So right before calling fuse_fillattr here fuse_change_attributes is
called to update the inode, so it appears your patches would have
allowed setting inode->i_iuid to INVALID_UID?

> @@ -624,6 +626,8 @@ void fuse_conn_put(struct fuse_conn *fc)
>   if (atomic_dec_and_test(>count)) {
>   if (fc->destroy_req)
>   fuse_request_free(fc->destroy_req);
> + put_user_ns(fc->user_ns);
> + fc->user_ns = NULL;
>   fc->release(fc);
>   }
>  }

I did this in fuse_free_con, but now looking again this is a bug for
cuse since it uses a different release method. I've updated my tree to
do it here instead.

> @@ -1033,6 +1037,7 @@ static int fuse_fill_super(struct super_block *sb, void 
> *data, int silent)
>   sb->s_maxbytes = MAX_LFS_FILESIZE;
>   sb->s_time_gran = 1;
>   sb->s_export_op = _export_operations;
> + sb->s_user_ns = get_user_ns(current_user_ns());
>  
>   file = fget(d.fd);
>   err = -EINVAL;
> @@ -1149,6 +1154,7 @@ static void fuse_kill_sb_anon(struct super_block *sb)
>   }
>  
>   kill_anon_super(sb);
> + put_user_ns(sb->s_user_ns);
>  }

What's the point of s_user_ns? It 

[PATCH] regmap: fix NULL pointer dereference in _regmap_write/read

2014-09-26 Thread Pankaj Dubey
If LOG_DEVICE is defined and map->dev is NULL it will lead to NULL
pointer dereference. This patch fixes this issue by adding check for
dev->NULL in all such places in regmap.c

Signed-off-by: Pankaj Dubey 
---
 drivers/base/regmap/regmap.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index 455a877..9980044 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -1465,7 +1465,7 @@ int _regmap_write(struct regmap *map, unsigned int reg,
}
 
 #ifdef LOG_DEVICE
-   if (strcmp(dev_name(map->dev), LOG_DEVICE) == 0)
+   if (map->dev && strcmp(dev_name(map->dev), LOG_DEVICE) == 0)
dev_info(map->dev, "%x <= %x\n", reg, val);
 #endif
 
@@ -2115,7 +2115,7 @@ static int _regmap_read(struct regmap *map, unsigned int 
reg,
ret = map->reg_read(context, reg, val);
if (ret == 0) {
 #ifdef LOG_DEVICE
-   if (strcmp(dev_name(map->dev), LOG_DEVICE) == 0)
+   if (map->dev && strcmp(dev_name(map->dev), LOG_DEVICE) == 0)
dev_info(map->dev, "%x => %x\n", reg, *val);
 #endif
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] next: openrisc: Fix build

2014-09-26 Thread Stefan Kristiansson
On Fri, Sep 26, 2014 at 07:51:49PM +0200, Jonas Bonn wrote:
> On 09/26/2014 06:05 PM, Guenter Roeck wrote:
> > openrisc:defconfig fails to build in next-20140926 with the following error.
> >
> > In file included from arch/openrisc/kernel/signal.c:31:0:
> > ./arch/openrisc/include/asm/syscall.h: In function 'syscall_get_arch':
> > ./arch/openrisc/include/asm/syscall.h:77:9: error: 'EM_OPENRISC' undeclared
> >
> > Fix by moving EM_OPENRISC to include/uapi/linux/elf-em.h.
> >
> > Fixes: ce5d112827e5 ("ARCH: AUDIT: implement syscall_get_arch for all 
> > arches")
> > Cc: Eric Paris 
> > Cc: Stefan Kristiansson 
> > Cc: Geert Uytterhoeven 
> > Cc: Stephen Rothwell 
> > Signed-off-by: Guenter Roeck 
> > ---
> > v2: Only move EM_OPENRISC.
> >
> > Another possible solution for the problem would be to include asm/elf.h
> > in arch/openrisc/kernel/signal.c. I had actually submitted a patch with
> > that fix back in August (maybe that is where I remembered the problem from).
> > Wonder what happened with that patch.
> >
> > Would it make sense to drop EM_OR32 and replace it with EM_OPENRISC where
> > it is used ? binutils seems to suggest that EM_OPENRISC is the "official"
> > definition.
> 
> Do we even use EM_OR32?  Will the kernel build with the old toolchain if
> we drop it?  If yes, drop it altogether... I don't recall the details as
> to why we kept that define around at all.  And really, why bother
> supporting the old toolchain at all... it's been at least two or three
> years since EM_OPENRISC was added, hopefully people have moved on.  If
> users want to upgrade their kernel, they can update the toolchain, too,
> at this point.
> 

EM_OPENRISC was added about ten years ago, and when the OR32 things in binutils
was removed in favour for OR1K earlier this year,
EM_OR1K was added as an alias to EM_OPENRISC.

With that said, I'm putting in a vote for removing EM_OR32 as well.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4] power: reset: use restart_notifier mechanism for msm-poweroff

2014-09-26 Thread Pramod Gurav
On 27-09-2014 01:55 AM, Sebastian Reichel wrote:
> Hi,
> 
> On Thu, Sep 25, 2014 at 05:03:51PM +0530, Pramod Gurav wrote:
>> This change replaces use of arm_pm_restart with recently introduced
>> reset mechanism in Linux kernel called restart_notifier.
>>
>> Choosing priority 128, which is default priority, as according to 
>> documentation, this mechanism is sufficient to restart the entire system.
> 
> http://git.infradead.org/battery-2.6.git/commit/18a702e0de9879d5c0225a09f494443f0b91a0cc

Thanks Sebastian.
> 
> -- Sebastian
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mmc: atmel-mci: fix mismatched section on atmci_cleanup_slot

2014-09-26 Thread Pramod Gurav
Hi Arnd,

On Sat, Sep 27, 2014 at 1:04 AM, Arnd Bergmann  wrote:
> As of 528bc7808f4e ("mmc: atmel-mci: Release mmc resources on failure in 
> probe"),
> the atmci_probe() function calls atmci_cleanup_slot in the failure path.
>
> This causes a new warning whenever the driver is built:
>
> WARNING: drivers/mmc/host/built-in.o(.init.text+0xa04): Section mismatch in 
> reference from the function atmci_probe() to the function 
> .exit.text:atmci_cleanup_slot()
> The function __init atmci_probe() references
> a function __exit atmci_cleanup_slot().
Thanks for this though I am not owner of the driver but the last
commit was mine.
but how come I did not see this warning? Any flag with compiler that
warned about this?
>
> Gcc correctly warns about this function getting dropped in the link stage
> for the built-in case, which would cause undefined behavior when this error
> path is hit. The solution is to simply drop the __exit annotation.
>
> Signed-off-by: Arnd Bergmann 
> Fixes: 528bc7808f4e ("mmc: atmel-mci: Release mmc resources on failure in 
> probe")
>
> diff --git a/drivers/mmc/host/atmel-mci.c b/drivers/mmc/host/atmel-mci.c
> index 772ef5b0e4d5..974626087732 100644
> --- a/drivers/mmc/host/atmel-mci.c
> +++ b/drivers/mmc/host/atmel-mci.c
> @@ -2244,7 +2244,7 @@ static int __init atmci_init_slot(struct atmel_mci 
> *host,
> return 0;
>  }
>
> -static void __exit atmci_cleanup_slot(struct atmel_mci_slot *slot,
> +static void atmci_cleanup_slot(struct atmel_mci_slot *slot,
> unsigned int id)
>  {
> /* Debugfs stuff is cleaned up by mmc core */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Thanks and Regards
Pramod
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] net: llc: check for device before dereferencing

2014-09-26 Thread Sasha Levin
llc_ui_sendmsg would not make sure that a device indeed exists before
dereferencing it. This caused a user triggerable NULL ptr deref:

[  430.542391] BUG: unable to handle kernel NULL pointer dereference at 
021e
[  430.551939] IP: llc_ui_sendmsg (net/llc/af_llc.c:912)
[  430.551939] PGD 5edcd067 PUD 5edce067 PMD 0
[  430.551939] Oops:  [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[  430.551939] Dumping ftrace buffer:
[  430.551939](ftrace buffer empty)
[  430.551939] Modules linked in:
[  430.551939] CPU: 2 PID: 9395 Comm: trinity-c261 Not tainted 
3.17.0-rc6-next-20140926-sasha-00050-g625a54d-dirty #1239
[  430.551939] task: 88005edc ti: 88005edc8000 task.ti: 
88005edc8000
[  430.551939] RIP: llc_ui_sendmsg (net/llc/af_llc.c:912)
[  430.551939] RSP: 0018:88005edcbcd8  EFLAGS: 00010282
[  430.551939] RAX:  RBX: 880239191148 RCX: 
[  430.551939] RDX: dfffe900 RSI: 814471e1 RDI: 85f7a77f
[  430.551939] RBP: 88005edcbd18 R08: dfffe901 R09: 
[  430.551939] R10:  R11:  R12: 
[  430.551939] R13: 88023716 R14: 88005edcbeb0 R15: 
[  430.551939] FS:  7fbcf9f6b700() GS:880111c0() 
knlGS:
[  430.551939] CS:  0010 DS:  ES:  CR0: 8005003b
[  430.551939] CR2: 021e CR3: 5edcc000 CR4: 06a0
[  430.551939] DR0: 006ef000 DR1:  DR2: 
[  430.551939] DR3:  DR6: 0ff0 DR7: 00010602
[  430.551939] Stack:
[  430.551939]   0100 88005edcbd18 
88005edcbd30
[  430.551939]  88023716 88005edcbe78 0100 
86963ec0
[  430.551939]  88005edcbe38 85680290 88005edcbd68 

[  430.551939] Call Trace:
[  430.551939] sock_sendmsg (net/socket.c:663)
[  430.551939] ? might_fault (mm/memory.c:3733)
[  430.551939] ? might_fault (./arch/x86/include/asm/current.h:14 
mm/memory.c:3732)
[  430.551939] ? __fdget (fs/file.c:698)
[  430.551939] SYSC_sendto (net/socket.c:1818)
[  430.551939] ? do_audit_syscall_entry (include/linux/audit.h:153 
arch/x86/kernel/ptrace.c:1448)
[  430.551939] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1598 
(discriminator 2))
[  430.551939] SyS_sendto (net/socket.c:1783)
[  430.551939] tracesys_phase2 (arch/x86/kernel/entry_64.S:529)
[ 430.551939] Code: 0f 85 b6 00 00 00 48 8d bb 38 05 00 00 e8 f2 01 d5 fb 4c 8b 
bb 38 05 00 00 49 8d bf 1e 02 00 00 e8 2f 01 d5 fb 66 41 83 7e 04 00 <45> 0f b7 
af 1e 02 00 00 75 1a 48 8d bb d0 02 00 00 e8 53 00 d5

All code

   0:   0f 85 b6 00 00 00   jne0xbc
   6:   48 8d bb 38 05 00 00lea0x538(%rbx),%rdi
   d:   e8 f2 01 d5 fb  callq  0xfbd50204
  12:   4c 8b bb 38 05 00 00mov0x538(%rbx),%r15
  19:   49 8d bf 1e 02 00 00lea0x21e(%r15),%rdi
  20:   e8 2f 01 d5 fb  callq  0xfbd50154
  25:   66 41 83 7e 04 00   cmpw   $0x0,0x4(%r14)
  2b:*  45 0f b7 af 1e 02 00movzwl 0x21e(%r15),%r13d<-- 
trapping instruction
  32:   00
  33:   75 1a   jne0x4f
  35:   48 8d bb d0 02 00 00lea0x2d0(%rbx),%rdi
  3c:   e8 53 00 d5 00  callq  0xd50094

Code starting with the faulting instruction
===
   0:   45 0f b7 af 1e 02 00movzwl 0x21e(%r15),%r13d
   7:   00
   8:   75 1a   jne0x24
   a:   48 8d bb d0 02 00 00lea0x2d0(%rbx),%rdi
  11:   e8 53 00 d5 00  callq  0xd50069
[  430.551939] RIP llc_ui_sendmsg (net/llc/af_llc.c:912)
[  430.551939]  RSP 
[  430.551939] CR2: 021e

Signed-off-by: Sasha Levin 
---
 net/llc/af_llc.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index c776ffb..c16e01a 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -909,6 +909,9 @@ static int llc_ui_sendmsg(struct kiocb *iocb, struct socket 
*sock,
if (rc)
goto release;
}
+   rc = -ENODEV;
+   if (!llc->dev)
+   goto release;
hdrlen = llc->dev->hard_header_len + llc_ui_header_len(sk, addr);
size = hdrlen + len;
if (size > llc->dev->mtu)
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 0/3] fuse: Add support for mounts from pid/user namespaces

2014-09-26 Thread Eric W. Biederman
Seth Forshee  writes:

> On Thu, Sep 25, 2014 at 12:14:01PM -0700, Eric W. Biederman wrote:

>> Sorry iattr_to_setattr look for from_kuid and from_kgid.
>> 
>> The call path is
>> fuse_setattr
>>fuse_do_setattr
>>   iattr_to_fattr.
>
> Bah. Sorry, I misread that originally and thought you were talking about
> something outside of fuse. And I was looking at a tree with my fuse
> changes, so of course I wouldn't have found it.
>
> Actually in 3.17-rc6 I still don't see that iattr_to_fattr (I assume
> this is what you meant) checks the results of the conversion (not that
> it really needs to since it uses init_user_ns), nor any use of EOVERFLOW
> in fuse. Anyway, it's not really important.

True.  I also goofed up in that I was looking at the wrong tree.  My
tree had all of my preliminary fuse patches I worked up a while ago
applied so I did have the error handling in there.

> Well, unless you say otherwise I guess I'll leave it -EINVAL to be
> consistent with chown_common().

That sounds like a good plan.

>> I am on the fence about what to do when a uid from the filesystem server
>> or for other filesystems the on-disk data structures does not map, but
>> make_bad_inode is simpler in conception.  So make_bad_inode seems like
>> a good place to start.   For fuse especially this isn't hard because
>> the make_bad_inode calls are already there to handle a change in i_mode.
>
> I agree that if we're unsure then make_bad_inode is a more logical place
> to start, since it's easier to loosen the restrictions later than to
> tighten them. I've got an initiail implementation that I'm in the
> process of testing. If you want to take a look I've pushed it to:
>
>   git://kernel.ubuntu.com/sforshee/linux.git fuse-userns 

Thanks.  If I can scrape together enough focus I will look at it.

As a second best thing here are my prototype from when I was looking at
performing this fuse conversion last.  Given that you had missed
checking the from_kuid permission checks, it might be worth comparing
and seeing if there is something else in these patches that would be
worth including.

Eric

>From 94195ce5b06915846d14eaa35d9274c0315b46a0 Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" 
Date: Thu, 4 Oct 2012 13:34:45 -0700
Subject: [PATCH 1/4] userns: Allow for fuse filesystems outside the initial user
 namespace

Signed-off-by: "Eric W. Biederman" 
---
 fs/fuse/dev.c| 10 +-
 fs/fuse/dir.c| 41 ++---
 fs/fuse/fuse_i.h |  3 +++
 fs/fuse/inode.c  | 10 --
 4 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index ca887314aba9..e01f30c51b3c 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -124,10 +124,10 @@ static void __fuse_put_request(struct fuse_req *req)
 	atomic_dec(>count);
 }
 
-static void fuse_req_init_context(struct fuse_req *req)
+static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req)
 {
-	req->in.h.uid = from_kuid_munged(_user_ns, current_fsuid());
-	req->in.h.gid = from_kgid_munged(_user_ns, current_fsgid());
+	req->in.h.uid = from_kuid_munged(fc->user_ns, current_fsuid());
+	req->in.h.gid = from_kgid_munged(fc->user_ns, current_fsgid());
 	req->in.h.pid = current->pid;
 }
 
@@ -168,7 +168,7 @@ static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages,
 		goto out;
 	}
 
-	fuse_req_init_context(req);
+	fuse_req_init_context(fc, req);
 	req->waiting = 1;
 	req->background = for_background;
 	return req;
@@ -257,7 +257,7 @@ struct fuse_req *fuse_get_req_nofail_nopages(struct fuse_conn *fc,
 	if (!req)
 		req = get_reserved_req(fc, file);
 
-	fuse_req_init_context(req);
+	fuse_req_init_context(fc, req);
 	req->waiting = 1;
 	req->background = 0;
 	return req;
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index de1d84af9f7c..d74c75a057cd 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -886,7 +886,7 @@ static int fuse_link(struct dentry *entry, struct inode *newdir,
 	return err;
 }
 
-static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
+static int fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
 			  struct kstat *stat)
 {
 	unsigned int blkbits;
@@ -905,8 +905,12 @@ static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
 	stat->ino = attr->ino;
 	stat->mode = (inode->i_mode & S_IFMT) | (attr->mode & 0);
 	stat->nlink = attr->nlink;
-	stat->uid = make_kuid(_user_ns, attr->uid);
-	stat->gid = make_kgid(_user_ns, attr->gid);
+	stat->uid = make_kuid(fc->user_ns, attr->uid);
+	if (!uid_valid(stat->uid))
+		return -EOVERFLOW;
+	stat->gid = make_kgid(fc->user_ns, attr->gid);
+	if (!gid_valid(stat->gid))
+		return -EOVERFLOW;
 	stat->rdev = inode->i_rdev;
 	stat->atime.tv_sec = attr->atime;
 	stat->atime.tv_nsec = attr->atimensec;
@@ -923,6 +927,7 @@ static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
 		blkbits = inode->i_sb->s_blocksize_bits;
 
 	stat->blksize = 1 << blkbits;
+	return 0;
 }
 
 static int 

Re: [PATCH 0/5] HID: wacom: introduce generic HID handling

2014-09-26 Thread Jason Gerecke
On Fri, Sep 26, 2014 at 4:03 AM, Jiri Kosina  wrote:
> On Tue, 23 Sep 2014, Benjamin Tissoires wrote:
>
>> Hi guys,
>>
>> So, this patch series aims at supporting natively any future HID compliant 
>> wacom
>> tablet. Those found on the various laptops (ISDv4/5) already are HID 
>> compliant
>> and they should work in the future without any modification of the kernel.
>>
>> Few things to note here:
>> - I used the PID 0x00E6 found on the Lenovo X230 for the tests
>> - I did not removed its entry in the list because there is a slightly 
>> different
>>   behavior while using this patch set: when more than two fingers are on the
>>   screen, the tablet stops sending finger events, while with the wacom
>>   proprietary bits, it continues to send them. Besides that, the events 
>> emitted
>>   before and after the patch series are the same (at least on the E6)
>> - I can not rely on hid-input directly because wacom wants to be in control 
>> of
>>   the input devices it creates. This might be solved in the future, but for 
>> now
>>   it is just easier to rewrite the few mapping/events handling than trying to
>>   fit in the hid-input model.
>> - there will still be more specific use cases to handle later (see the joke 
>> of
>>   the MS surface 3 pen[1] for instance), but this should give a roughly 
>> working
>>   pen/touch support for those future devices.
>>
>> Jason, I would be very glad if you could conduct a few tests for this on the
>> ISDv4/5 sensors you have.
>
> I am waiting with this one for testing results from Jason, but if you guys
> want this to go in the next merge window, I'd like to ask you not to
> postpone the testing too much.
>
> Thanks.
>

I'm in the process of capturing traces from hid-record for multiple
tablet PCs both pre- and post-patch to locate any obvious issues. My
initial results are promising on the pen side (the only things I've
noticed so far is the 2nd barrel switch not being supported -- as
might be suspected) but less so on the touch side (single-finger
devices seem to work fine, but the new code doesn't seem to handle any
of the multitouch devices I've tried).

Expect more news next week :)


PS: I'm pretty sure ISDv5 shouldn't be relied on for HID data :| The
descriptor for the Cintiq Companion Hybrid, at least, is like our
branded sensors with useful data only in the touchscreen descriptor
(the pen descriptor is a bunch of opaque vendor-defined reports). Not
sure if the Cintiq Companion is the same though since it runs
Windows...

Jason
---
Now instead of four in the eights place /
you’ve got three, ‘Cause you added one  /
(That is to say, eight) to the two, /
But you can’t take seven from three,/
So you look at the sixty-fours

>>
>> Cheers,
>> Benjamin
>>
>> [1] 
>> http://who-t.blogspot.com/2014/09/stylus-behaviour-on-microsoft-surface-3.html
>>
>> Benjamin Tissoires (5):
>>   HID: wacom: rename failN with some meaningful information
>>   HID: wacom: split out input allocation and registration
>>   HID: wacom: move allocation of inputs earlier
>>   HID: wacom: implement generic HID handling for pen generic devices
>>   HID: wacom: implement the finger part of the HID generic handling
>>
>>  drivers/hid/hid-core.c  |   3 +
>>  drivers/hid/wacom.h |   6 +
>>  drivers/hid/wacom_sys.c | 184 +
>>  drivers/hid/wacom_wac.c | 299 
>> 
>>  drivers/hid/wacom_wac.h |  17 +++
>>  include/linux/hid.h |   2 +
>>  6 files changed, 462 insertions(+), 49 deletions(-)
>>
>> --
>> 2.1.0
>>
>
> --
> Jiri Kosina
> SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] media, platform, LLVMLinux: Remove nested function from ti-vpe

2014-09-26 Thread Behan Webster
Replace the use of nested functions where a normal function will suffice.

Nested functions are not liked by upstream kernel developers in general. Their
use breaks the use of clang as a compiler, and doesn't make the code any
better.

This code now works for both gcc and clang.

Signed-off-by: Behan Webster 
Suggested-by: Arnd Bergmann 
Cc: Arnd Bergmann 
---
 drivers/media/platform/ti-vpe/csc.c | 8 ++--
 drivers/media/platform/ti-vpe/sc.c  | 8 ++--
 2 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/media/platform/ti-vpe/csc.c 
b/drivers/media/platform/ti-vpe/csc.c
index 940df40..44fbf41 100644
--- a/drivers/media/platform/ti-vpe/csc.c
+++ b/drivers/media/platform/ti-vpe/csc.c
@@ -93,12 +93,8 @@ void csc_dump_regs(struct csc_data *csc)
 {
struct device *dev = >pdev->dev;
 
-   u32 read_reg(struct csc_data *csc, int offset)
-   {
-   return ioread32(csc->base + offset);
-   }
-
-#define DUMPREG(r) dev_dbg(dev, "%-35s %08x\n", #r, read_reg(csc, CSC_##r))
+#define DUMPREG(r) dev_dbg(dev, "%-35s %08x\n", #r, \
+   ioread32(csc->base + CSC_##r))
 
DUMPREG(CSC00);
DUMPREG(CSC01);
diff --git a/drivers/media/platform/ti-vpe/sc.c 
b/drivers/media/platform/ti-vpe/sc.c
index 6314171..1088381 100644
--- a/drivers/media/platform/ti-vpe/sc.c
+++ b/drivers/media/platform/ti-vpe/sc.c
@@ -24,12 +24,8 @@ void sc_dump_regs(struct sc_data *sc)
 {
struct device *dev = >pdev->dev;
 
-   u32 read_reg(struct sc_data *sc, int offset)
-   {
-   return ioread32(sc->base + offset);
-   }
-
-#define DUMPREG(r) dev_dbg(dev, "%-35s %08x\n", #r, read_reg(sc, CFG_##r))
+#define DUMPREG(r) dev_dbg(dev, "%-35s %08x\n", #r, \
+   ioread32(sc->base + CFG_##r))
 
DUMPREG(SC0);
DUMPREG(SC1);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] arm, fbdev, omap2, LLVMLinux: Remove nested function from omapfb

2014-09-26 Thread Behan Webster
Replace the use of nested functions where a normal function will suffice.

Nested functions are not liked by upstream kernel developers in general. Their
use breaks the use of clang as a compiler, and doesn't make the code any
better.

This code now works for both gcc and clang.

Signed-off-by: Behan Webster 
Suggested-by: Arnd Bergmann 
Cc: Arnd Bergmann 
---
 drivers/video/fbdev/omap2/omapfb/omapfb-main.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/video/fbdev/omap2/omapfb/omapfb-main.c 
b/drivers/video/fbdev/omap2/omapfb/omapfb-main.c
index ec2d132..1587243 100644
--- a/drivers/video/fbdev/omap2/omapfb/omapfb-main.c
+++ b/drivers/video/fbdev/omap2/omapfb/omapfb-main.c
@@ -273,16 +273,16 @@ static struct omapfb_colormode omapfb_colormodes[] = {
},
 };
 
+static bool cmp_component(struct fb_bitfield *f1, struct fb_bitfield *f2)
+{
+   return f1->length == f2->length &&
+   f1->offset == f2->offset &&
+   f1->msb_right == f2->msb_right;
+}
+
 static bool cmp_var_to_colormode(struct fb_var_screeninfo *var,
struct omapfb_colormode *color)
 {
-   bool cmp_component(struct fb_bitfield *f1, struct fb_bitfield *f2)
-   {
-   return f1->length == f2->length &&
-   f1->offset == f2->offset &&
-   f1->msb_right == f2->msb_right;
-   }
-
if (var->bits_per_pixel == 0 ||
var->red.length == 0 ||
var->blue.length == 0 ||
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/2] LLVMLinux: Patches to enable the kernel to be compiled with clang/LLVM

2014-09-26 Thread Behan Webster
Replace the use of nested functions where a normal function will suffice.

Nested functions are not liked by upstream kernel developers in general. Their
use breaks the use of clang as a compiler, and doesn't make the code any
better.

This code now works for both gcc and clang.

The LLVMLinux project aims to fully build the Linux kernel using both gcc and
clang (the C front end for the LLVM compiler infrastructure project). 

Behan Webster (2):
  arm, fbdev, omap2, LLVMLinux: Remove nested function from omap2 dss
  arm, fbdev, omap2, LLVMLinux: Remove nested function from omapfb

 drivers/video/fbdev/omap2/dss/dispc-compat.c   |  9 +
 drivers/video/fbdev/omap2/dss/manager-sysfs.c  | 16 +---
 drivers/video/fbdev/omap2/omapfb/omapfb-main.c | 14 +++---
 3 files changed, 21 insertions(+), 18 deletions(-)

-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] arm, fbdev, omap2, LLVMLinux: Remove nested function from omap2 dss

2014-09-26 Thread Behan Webster
Replace the use of nested functions where a normal function will suffice.

Nested functions are not liked by upstream kernel developers in general. Their
use breaks the use of clang as a compiler, and doesn't make the code any
better.

This code now works for both gcc and clang.

Signed-off-by: Behan Webster 
Suggested-by: Arnd Bergmann 
Cc: Arnd Bergmann 
---
 drivers/video/fbdev/omap2/dss/dispc-compat.c  |  9 +
 drivers/video/fbdev/omap2/dss/manager-sysfs.c | 16 +---
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/video/fbdev/omap2/dss/dispc-compat.c 
b/drivers/video/fbdev/omap2/dss/dispc-compat.c
index 83779c2..633c461 100644
--- a/drivers/video/fbdev/omap2/dss/dispc-compat.c
+++ b/drivers/video/fbdev/omap2/dss/dispc-compat.c
@@ -634,13 +634,14 @@ void dispc_mgr_disable_sync(enum omap_channel channel)
WARN_ON(1);
 }
 
+static inline void dispc_irq_wait_handler(void *data, u32 mask)
+{
+   complete((struct completion *)data);
+}
+
 int omap_dispc_wait_for_irq_interruptible_timeout(u32 irqmask,
unsigned long timeout)
 {
-   void dispc_irq_wait_handler(void *data, u32 mask)
-   {
-   complete((struct completion *)data);
-   }
 
int r;
DECLARE_COMPLETION_ONSTACK(completion);
diff --git a/drivers/video/fbdev/omap2/dss/manager-sysfs.c 
b/drivers/video/fbdev/omap2/dss/manager-sysfs.c
index 37b59fe..a7414fb 100644
--- a/drivers/video/fbdev/omap2/dss/manager-sysfs.c
+++ b/drivers/video/fbdev/omap2/dss/manager-sysfs.c
@@ -44,6 +44,13 @@ static ssize_t manager_display_show(struct 
omap_overlay_manager *mgr, char *buf)
dssdev->name : "");
 }
 
+static int manager_display_match(struct omap_dss_device *dssdev, void *data)
+{
+   const char *str = data;
+
+   return sysfs_streq(dssdev->name, str);
+}
+
 static ssize_t manager_display_store(struct omap_overlay_manager *mgr,
const char *buf, size_t size)
 {
@@ -52,17 +59,12 @@ static ssize_t manager_display_store(struct 
omap_overlay_manager *mgr,
struct omap_dss_device *dssdev = NULL;
struct omap_dss_device *old_dssdev;
 
-   int match(struct omap_dss_device *dssdev, void *data)
-   {
-   const char *str = data;
-   return sysfs_streq(dssdev->name, str);
-   }
-
if (buf[size-1] == '\n')
--len;
 
if (len > 0)
-   dssdev = omap_dss_find_device((void *)buf, match);
+   dssdev = omap_dss_find_device((void *)buf,
+   manager_display_match);
 
if (len > 0 && dssdev == NULL)
return -EINVAL;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] md, sysfs, LLVMLinux: Remove nested function from bcache sysfs

2014-09-26 Thread Behan Webster
Replace the use of nested functions where a normal function will suffice.

Nested functions are not liked by upstream kernel developers in general. Their
use breaks the use of clang as a compiler, and doesn't make the code any
better.

This code now works for both gcc and clang.

Signed-off-by: Behan Webster 
Suggested-by: Arnd Bergmann 
Cc: Arnd Bergmann 
---
 drivers/md/bcache/sysfs.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index b3ff57d..53d8baa 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -731,6 +731,11 @@ static struct attribute *bch_cache_set_internal_files[] = {
 };
 KTYPE(bch_cache_set_internal);
 
+static int __bch_cache_cmp(const void *l, const void *r)
+{
+   return *((uint16_t *) r) - *((uint16_t *) l);
+}
+
 SHOW(__bch_cache)
 {
struct cache *ca = container_of(kobj, struct cache, kobj);
@@ -755,9 +760,6 @@ SHOW(__bch_cache)
   CACHE_REPLACEMENT(>sb));
 
if (attr == _priority_stats) {
-   int cmp(const void *l, const void *r)
-   {   return *((uint16_t *) r) - *((uint16_t *) l); }
-
struct bucket *b;
size_t n = ca->sb.nbuckets, i;
size_t unused = 0, available = 0, dirty = 0, meta = 0;
@@ -786,7 +788,7 @@ SHOW(__bch_cache)
p[i] = ca->buckets[i].prio;
mutex_unlock(>set->bucket_lock);
 
-   sort(p, n, sizeof(uint16_t), cmp, NULL);
+   sort(p, n, sizeof(uint16_t), __bch_cache_cmp, NULL);
 
while (n &&
   !cached[n - 1])
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 02/10] power/reset: xgene: Return -ENOMEM if out of memory

2014-09-26 Thread Guenter Roeck
It is customary to return an error code of -ENOMEM if the system
is out of memory. Also, in that case, the infrastructure will report
an error, so it is unnecessary to report it again.

Cc: Loc Ho 
Signed-off-by: Guenter Roeck 
---
 drivers/power/reset/xgene-reboot.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/power/reset/xgene-reboot.c 
b/drivers/power/reset/xgene-reboot.c
index ecd55f8..20cf7c3 100644
--- a/drivers/power/reset/xgene-reboot.c
+++ b/drivers/power/reset/xgene-reboot.c
@@ -61,10 +61,8 @@ static int xgene_reboot_probe(struct platform_device *pdev)
struct xgene_reboot_context *ctx;
 
ctx = devm_kzalloc(>dev, sizeof(*ctx), GFP_KERNEL);
-   if (!ctx) {
-   dev_err(>dev, "out of memory for context\n");
-   return -ENODEV;
-   }
+   if (!ctx)
+   return -ENOMEM;
 
ctx->csr = of_iomap(pdev->dev.of_node, 0);
if (!ctx->csr) {
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] DT: add binding for mxs regulator

2014-09-26 Thread Stefan Wahren
This patch adds the Device tree bindings for the Freescale MXS
on-chip regulators.

Signed-off-by: Stefan Wahren 
---
 .../bindings/regulator/mxs-regulator.txt   |   36 
 1 file changed, 36 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/regulator/mxs-regulator.txt

diff --git a/Documentation/devicetree/bindings/regulator/mxs-regulator.txt 
b/Documentation/devicetree/bindings/regulator/mxs-regulator.txt
new file mode 100644
index 000..e3133a4
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/mxs-regulator.txt
@@ -0,0 +1,36 @@
+MXS regulators
+
+Required node properties:
+- compatible: Should be "simple-bus"
+- #address-cells: Number of cells required to define regulator register,
+  must be 1
+- #size-cells: Number of cells required to define register size, must be 1
+- reg: Absolute physical address and size of the register set for the device
+
+Required regulator properties:
+- compatible: Must be "fsl,mxs-regulator"
+- reg: Absolute physical address of the register set for the regulator
+
+Any regulator property defined as part of the core regulator
+binding, defined in regulator.txt, can also be used.
+
+Example:
+
+   power: power@80044000 {
+   compatible = "simple-bus";
+   #address-cells = <1>;
+   #size-cells = <1>;
+   reg = <0x80044000 0x2000>;
+   ranges;
+
+   reg_vddd: regulator@80044040 {
+   reg = <0x80044040 0x10>;
+   compatible = "fsl,mxs-regulator";
+   regulator-name = "vddd";
+   regulator-min-microvolt = <80>;
+   regulator-max-microvolt = <1575000>;
+   regulator-boot-on;
+   vddd-supply = <_vdda>;
+   };
+   };
+
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/2] regulator: add support for mxs regulator

2014-09-26 Thread Stefan Wahren
This patch series adds support for Freescale i.MX23, i.xM28
on-chip regulators: vddd, vdda, vddio

This driver based on the Freescale high level [1] and low level 
driver [2], but contains the following changes:

* devicetree support
* fix for regulator modes
* drop support for overall_current and brown out
* replace udelay() with schedule()
* code cleanup

The code has been tested on a I2SE Duckbill.

[1] - 
http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/tree/drivers/regulator/mxs-regulator.c?h=imx_2.6.35_maintain
[2] - 
http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/tree/arch/arm/mach-mx28/power.c?h=imx_2.6.35_maintain

Stefan Wahren (2):
  DT: add binding for mxs regulator
  regulator: add mxs regulator driver

 .../bindings/regulator/mxs-regulator.txt   |   36 ++
 drivers/regulator/Kconfig  |   11 +
 drivers/regulator/Makefile |1 +
 drivers/regulator/mxs-regulator.c  |  395 
 4 files changed, 443 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/regulator/mxs-regulator.txt
 create mode 100644 drivers/regulator/mxs-regulator.c

-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] regulator: add mxs regulator driver

2014-09-26 Thread Stefan Wahren
This patch adds driver support for Freescale i.MX23, i.MX28
on-chip regulators. The driver supports the following regulators:
vddd, vdda and vddio.

Signed-off-by: Stefan Wahren 
---
 drivers/regulator/Kconfig |   11 ++
 drivers/regulator/Makefile|1 +
 drivers/regulator/mxs-regulator.c |  395 +
 3 files changed, 407 insertions(+)
 create mode 100644 drivers/regulator/mxs-regulator.c

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index a353011..f353a2b 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -433,6 +433,17 @@ config REGULATOR_MC13892
  Say y here to support the regulators found on the Freescale MC13892
  PMIC.
 
+config REGULATOR_MXS
+   tristate "Freescale MXS on-chip regulators"
+   depends on ARCH_MXS
+   default y if ARCH_MXS
+   help
+ Say y here to support Freescale MXS on-chip regulators.
+
+ The driver currently support the following voltage regulators:
+  vddd, vdda and vddio. It is recommended that this option be
+ enabled on i.MX23, i.MX28 platform.
+
 config REGULATOR_PALMAS
tristate "TI Palmas PMIC Regulators"
depends on MFD_PALMAS
diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile
index 25271f8..0f5b66c 100644
--- a/drivers/regulator/Makefile
+++ b/drivers/regulator/Makefile
@@ -58,6 +58,7 @@ obj-$(CONFIG_REGULATOR_MAX77802) += max77802.o
 obj-$(CONFIG_REGULATOR_MC13783) += mc13783-regulator.o
 obj-$(CONFIG_REGULATOR_MC13892) += mc13892-regulator.o
 obj-$(CONFIG_REGULATOR_MC13XXX_CORE) +=  mc13xxx-regulator-core.o
+obj-$(CONFIG_REGULATOR_MXS) += mxs-regulator.o
 obj-$(CONFIG_REGULATOR_PALMAS) += palmas-regulator.o
 obj-$(CONFIG_REGULATOR_PFUZE100) += pfuze100-regulator.o
 obj-$(CONFIG_REGULATOR_TPS51632) += tps51632-regulator.o
diff --git a/drivers/regulator/mxs-regulator.c 
b/drivers/regulator/mxs-regulator.c
new file mode 100644
index 000..63c09cc
--- /dev/null
+++ b/drivers/regulator/mxs-regulator.c
@@ -0,0 +1,395 @@
+/*
+ * Freescale STMP378X voltage regulators
+ *
+ * Embedded Alley Solutions, Inc 
+ *
+ * Copyright (C) 2014 Stefan Wahren
+ * Copyright (C) 2010 Freescale Semiconductor, Inc.
+ * Copyright 2008 Embedded Alley Solutions, Inc All Rights Reserved.
+ */
+
+/*
+ * The code contained herein is licensed under the GNU General Public
+ * License. You may obtain a copy of the GNU General Public License
+ * Version 2 or later at the following locations:
+ *
+ * http://www.opensource.org/licenses/gpl-license.html
+ * http://www.gnu.org/copyleft/gpl.html
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define HW_POWER_STS   (0x00c0)
+
+#define BM_POWER_STS_DC_OK BIT(9)
+
+#define MXS_VDDIO  1
+#define MXS_VDDA   2
+#define MXS_VDDD   3
+
+struct mxs_regulator {
+   struct regulator_desc rdesc;
+   struct regulator_init_data *initdata;
+
+   const char *name;
+   void __iomem *base_addr;
+   void __iomem *power_addr;
+   unsigned int mode_mask;
+};
+
+static int mxs_set_voltage(struct regulator_dev *reg, int min_uV, int max_uV,
+  unsigned *selector)
+{
+   struct mxs_regulator *sreg = rdev_get_drvdata(reg);
+   struct regulation_constraints *con = >initdata->constraints;
+   void __iomem *power_sts = sreg->power_addr + HW_POWER_STS;
+   unsigned long start;
+   u32 val, regs, i;
+
+   pr_debug("%s: min_uV %d, max_uV %d, min %d, max %d\n", __func__,
+min_uV, max_uV, con->min_uV, con->max_uV);
+
+   if (max_uV < con->min_uV || max_uV > con->max_uV)
+   return -EINVAL;
+
+   val = (max_uV - con->min_uV) * sreg->rdesc.n_voltages /
+   (con->max_uV - con->min_uV);
+
+   regs = (readl(sreg->base_addr) & ~sreg->rdesc.vsel_mask);
+
+   pr_debug("%s: %s calculated val %d\n", __func__, sreg->name, val);
+
+   writel(val | regs, sreg->base_addr);
+   for (i = 20; i; i--) {
+   /* delay for fast mode */
+   if (readl(power_sts) & BM_POWER_STS_DC_OK)
+   return 0;
+
+   udelay(1);
+   }
+
+   writel(val | regs, sreg->base_addr);
+   start = jiffies;
+   while (1) {
+   /* delay for normal mode */
+   if (readl(power_sts) & BM_POWER_STS_DC_OK)
+   return 0;
+
+   if (time_after(jiffies, start + msecs_to_jiffies(80)))
+   break;
+
+   schedule();
+   }
+
+   return -ETIMEDOUT;
+}
+
+
+static int mxs_get_voltage(struct regulator_dev *reg)
+{
+   struct mxs_regulator *sreg = rdev_get_drvdata(reg);
+   struct regulation_constraints *con = >initdata->constraints;
+   int uV;
+   u32 val = readl(sreg->base_addr) & 

[PATCH 08/10] power/reset: keystone: Register with kernel restart handler

2014-09-26 Thread Guenter Roeck
Register with kernel restart handler instead of setting arm_pm_restart directly.

Move notifier registration to the end of the probe function to avoid having to
implement error handling.

Cc: Ivan Khoronzhuk 
Cc: Santosh Shilimkar 
Signed-off-by: Guenter Roeck 
---
 drivers/power/reset/keystone-reset.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/power/reset/keystone-reset.c 
b/drivers/power/reset/keystone-reset.c
index 408a18f..9d46586 100644
--- a/drivers/power/reset/keystone-reset.c
+++ b/drivers/power/reset/keystone-reset.c
@@ -12,9 +12,9 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -52,7 +52,8 @@ static inline int rsctrl_enable_rspll_write(void)
  RSCTRL_KEY_MASK, RSCTRL_KEY);
 }
 
-static void rsctrl_restart(enum reboot_mode mode, const char *cmd)
+static int rsctrl_restart_handler(struct notifier_block *this,
+ unsigned long mode, void *cmd)
 {
/* enable write access to RSTCTRL */
rsctrl_enable_rspll_write();
@@ -60,8 +61,15 @@ static void rsctrl_restart(enum reboot_mode mode, const char 
*cmd)
/* reset the SOC */
regmap_update_bits(pllctrl_regs, rspll_offset + RSCTRL_RG,
   RSCTRL_RESET_MASK, 0);
+
+   return NOTIFY_DONE;
 }
 
+static struct notifier_block rsctrl_restart_nb = {
+   .notifier_call = rsctrl_restart_handler,
+   .priority = 128,
+};
+
 static struct of_device_id rsctrl_of_match[] = {
{.compatible = "ti,keystone-reset", },
{},
@@ -114,8 +122,6 @@ static int rsctrl_probe(struct platform_device *pdev)
if (ret)
return ret;
 
-   arm_pm_restart = rsctrl_restart;
-
/* disable a reset isolation for all module clocks */
ret = regmap_write(pllctrl_regs, rspll_offset + RSISO_RG, 0);
if (ret)
@@ -147,6 +153,10 @@ static int rsctrl_probe(struct platform_device *pdev)
return ret;
}
 
+   ret = register_restart_handler(_restart_nb);
+   if (ret)
+   dev_err(dev, "cannot register restart handler (err=%d)\n", ret);
+
return 0;
 }
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 04/10] power/reset: xgene: Use local variable dev instead of pdev->dev

2014-09-26 Thread Guenter Roeck
Using a local variable dev to point to the device is simpler then repeatedly
dereferencing pdev->dev.

Cc: Loc Ho 
Signed-off-by: Guenter Roeck 
---
 drivers/power/reset/xgene-reboot.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/power/reset/xgene-reboot.c 
b/drivers/power/reset/xgene-reboot.c
index aac0287..1bab003 100644
--- a/drivers/power/reset/xgene-reboot.c
+++ b/drivers/power/reset/xgene-reboot.c
@@ -33,7 +33,7 @@
 #include 
 
 struct xgene_reboot_context {
-   struct platform_device *pdev;
+   struct device *dev;
void *csr;
u32 mask;
 };
@@ -53,27 +53,28 @@ static void xgene_restart(char str, const char *cmd)
while (time_before(jiffies, timeout))
cpu_relax();
 
-   dev_emerg(>pdev->dev, "Unable to restart system\n");
+   dev_emerg(ctx->dev, "Unable to restart system\n");
 }
 
 static int xgene_reboot_probe(struct platform_device *pdev)
 {
struct xgene_reboot_context *ctx;
+   struct device *dev = >dev;
 
-   ctx = devm_kzalloc(>dev, sizeof(*ctx), GFP_KERNEL);
+   ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
if (!ctx)
return -ENOMEM;
 
-   ctx->csr = of_iomap(pdev->dev.of_node, 0);
+   ctx->csr = of_iomap(dev->of_node, 0);
if (!ctx->csr) {
-   dev_err(>dev, "can not map resource\n");
+   dev_err(dev, "can not map resource\n");
return -ENODEV;
}
 
-   if (of_property_read_u32(pdev->dev.of_node, "mask", >mask))
+   if (of_property_read_u32(dev->of_node, "mask", >mask))
ctx->mask = 0x;
 
-   ctx->pdev = pdev;
+   ctx->dev = dev;
arm_pm_restart = xgene_restart;
xgene_restart_ctx = ctx;
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 05/10] power/reset: xgene: Use mdelay instead of jiffies based timeout

2014-09-26 Thread Guenter Roeck
jiffies are not running at this stage of system shutdown, meaning an
error in the reset function would never be reported. Replace with mdelay().

Cc: Loc Ho 
Signed-off-by: Guenter Roeck 
---
 drivers/power/reset/xgene-reboot.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/power/reset/xgene-reboot.c 
b/drivers/power/reset/xgene-reboot.c
index 1bab003..ae470b8 100644
--- a/drivers/power/reset/xgene-reboot.c
+++ b/drivers/power/reset/xgene-reboot.c
@@ -24,6 +24,7 @@
  * For system shutdown, this is board specify. If a board designer
  * implements GPIO shutdown, use the gpio-poweroff.c driver.
  */
+#include 
 #include 
 #include 
 #include 
@@ -43,15 +44,12 @@ static struct xgene_reboot_context *xgene_restart_ctx;
 static void xgene_restart(char str, const char *cmd)
 {
struct xgene_reboot_context *ctx = xgene_restart_ctx;
-   unsigned long timeout;
 
/* Issue the reboot */
if (ctx)
writel(ctx->mask, ctx->csr);
 
-   timeout = jiffies + HZ;
-   while (time_before(jiffies, timeout))
-   cpu_relax();
+   mdelay(1000);
 
dev_emerg(ctx->dev, "Unable to restart system\n");
 }
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 06/10] power/reset: xgene: Register with kernel restart handler

2014-09-26 Thread Guenter Roeck
Register with kernel restart handler instead of setting arm_pm_restart directly.

This patch also addresses the following compile warning.

drivers/power/reset/xgene-reboot.c: In function 'xgene_reboot_probe':
drivers/power/reset/xgene-reboot.c:77:17: warning:
assignment from incompatible pointer type [enabled by default]

The warning was due to a mismatch between the type of arm_pm_restart
and the restart function.

Cc: Loc Ho 
Signed-off-by: Guenter Roeck 
---
 drivers/power/reset/xgene-reboot.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/power/reset/xgene-reboot.c 
b/drivers/power/reset/xgene-reboot.c
index ae470b8..1e7f5f5 100644
--- a/drivers/power/reset/xgene-reboot.c
+++ b/drivers/power/reset/xgene-reboot.c
@@ -26,32 +26,36 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
-#include 
 
 struct xgene_reboot_context {
struct device *dev;
void *csr;
u32 mask;
+   struct notifier_block restart_handler;
 };
 
-static struct xgene_reboot_context *xgene_restart_ctx;
-
-static void xgene_restart(char str, const char *cmd)
+static int xgene_restart_handler(struct notifier_block *this,
+unsigned long mode, void *cmd)
 {
-   struct xgene_reboot_context *ctx = xgene_restart_ctx;
+   struct xgene_reboot_context *ctx =
+   container_of(this, struct xgene_reboot_context,
+restart_handler);
 
/* Issue the reboot */
-   if (ctx)
-   writel(ctx->mask, ctx->csr);
+   writel(ctx->mask, ctx->csr);
 
mdelay(1000);
 
dev_emerg(ctx->dev, "Unable to restart system\n");
+
+   return NOTIFY_DONE;
 }
 
 static int xgene_reboot_probe(struct platform_device *pdev)
@@ -73,8 +77,11 @@ static int xgene_reboot_probe(struct platform_device *pdev)
ctx->mask = 0x;
 
ctx->dev = dev;
-   arm_pm_restart = xgene_restart;
-   xgene_restart_ctx = ctx;
+   ctx->restart_handler.notifier_call = xgene_restart_handler;
+   ctx->restart_handler.priority = 128;
+   err = register_restart_handler(>restart_handler);
+   if (err)
+   dev_err(dev, "cannot register restart handler (err=%d)\n", err);
 
return 0;
 }
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 07/10] power/reset: axxia: Register with kernel restart handler

2014-09-26 Thread Guenter Roeck
Register with kernel restart handler instead of setting arm_pm_restart
directly.

Cc: Anders Berg 
Signed-off-by: Guenter Roeck 
---
 drivers/power/reset/axxia-reset.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/power/reset/axxia-reset.c 
b/drivers/power/reset/axxia-reset.c
index 3b1f8d6..2a772d9 100644
--- a/drivers/power/reset/axxia-reset.c
+++ b/drivers/power/reset/axxia-reset.c
@@ -19,14 +19,12 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 
-#include 
-
-
 #define SC_CRIT_WRITE_KEY  0x1000
 #define SC_LATCH_ON_RESET  0x1004
 #define SC_RESET_CONTROL   0x1008
@@ -39,7 +37,8 @@
 
 static struct regmap *syscon;
 
-static void do_axxia_restart(enum reboot_mode reboot_mode, const char *cmd)
+static int axxia_restart_handler(struct notifier_block *this,
+unsigned long mode, void *cmd)
 {
/* Access Key (0xab) */
regmap_write(syscon, SC_CRIT_WRITE_KEY, 0xab);
@@ -50,11 +49,19 @@ static void do_axxia_restart(enum reboot_mode reboot_mode, 
const char *cmd)
/* Assert chip reset */
regmap_update_bits(syscon, SC_RESET_CONTROL,
   RSTCTL_RST_CHIP, RSTCTL_RST_CHIP);
+
+   return NOTIFY_DONE;
 }
 
+static struct notifier_block axxia_restart_nb = {
+   .notifier_call = axxia_restart_handler,
+   .priority = 128,
+};
+
 static int axxia_reset_probe(struct platform_device *pdev)
 {
struct device *dev = >dev;
+   int err;
 
syscon = syscon_regmap_lookup_by_phandle(dev->of_node, "syscon");
if (IS_ERR(syscon)) {
@@ -62,7 +69,9 @@ static int axxia_reset_probe(struct platform_device *pdev)
return PTR_ERR(syscon);
}
 
-   arm_pm_restart = do_axxia_restart;
+   err = register_restart_handler(_restart_nb);
+   if (err)
+   dev_err(dev, "cannot register restart handler (err=%d)\n", err);
 
return 0;
 }
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clk, ti, LLVMLinux: Move __init outside of type definition

2014-09-26 Thread Behan Webster

On 09/26/14 17:55, Felipe Balbi wrote:

On Fri, Sep 26, 2014 at 05:31:48PM -0700, Behan Webster wrote:

As written, the __init for ti_clk_get_div_table is in the middle of the return
type.

The gcc documentation indicates that section attributes should be added to the
end of the function declaration:

   extern void foobar (void) __attribute__ ((section ("bar")));

However gcc seems to be very permissive with where attributes can be placed.
clang on the other hand isn't so permissive, and fails if you put the section
definition in the middle of the return type:

drivers/clk/ti/divider.c:298:28: error: expected ';' after struct
static struct clk_div_table
^
;
drivers/clk/ti/divider.c:298:1: warning: 'static' ignored on this
   declaration [-Wmissing-declarations]
static struct clk_div_table
^
drivers/clk/ti/divider.c:299:9: error: type specifier missing,
   defaults to 'int' [-Werror,-Wimplicit-int]
__init *ti_clk_get_div_table(struct device_node *node)
~~  ^
drivers/clk/ti/divider.c:345:9: warning: incompatible pointer types
   returning 'struct clk_div_table *' from a function with result type 'int 
*' [-Wincompatible-pointer-types]
 return table;
^
drivers/clk/ti/divider.c:419:9: warning: incompatible pointer types
   assigning to 'const struct clk_div_table *' from 'int *' 
[-Wincompatible-pointer-types]
 *table = ti_clk_get_div_table(node);
^ ~~
3 warnings and 2 errors generated.

By convention, most of the kernel code puts section attributes between the
return type and function name. In the case where the return type is a pointer,
it's important to place the '*' on left of the __init.

This updated code works for both gcc and clang.

Signed-off-by: Behan Webster 
Reviewed-by: Mark Charlebois 

makes sense to me:

Reviewed-by: Felipe Balbi 

Thank you.


I wonder if we should add this a Sparse or Coccinelle rule.

+1

I'm hoping it can be added to checkpatch as well.

Behan

--
Behan Webster
beh...@converseincode.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 10/10] power/reset: brcmstb: Register with kernel restart handler

2014-09-26 Thread Guenter Roeck
Register with kernel restart handler instead of setting arm_pm_restart directly.

Cc: Marc Carino 
Cc: Brian Norris 
Signed-off-by: Guenter Roeck 
---
 drivers/power/reset/brcmstb-reboot.c | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/power/reset/brcmstb-reboot.c 
b/drivers/power/reset/brcmstb-reboot.c
index 3f23692..3306241 100644
--- a/drivers/power/reset/brcmstb-reboot.c
+++ b/drivers/power/reset/brcmstb-reboot.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -26,8 +27,6 @@
 #include 
 #include 
 
-#include 
-
 #define RESET_SOURCE_ENABLE_REG 1
 #define SW_MASTER_RESET_REG 2
 
@@ -35,7 +34,8 @@ static struct regmap *regmap;
 static u32 rst_src_en;
 static u32 sw_mstr_rst;
 
-static void brcmstb_reboot(enum reboot_mode mode, const char *cmd)
+static int brcmstb_restart_handler(struct notifier_block *this,
+  unsigned long mode, void *cmd)
 {
int rc;
u32 tmp;
@@ -43,31 +43,38 @@ static void brcmstb_reboot(enum reboot_mode mode, const 
char *cmd)
rc = regmap_write(regmap, rst_src_en, 1);
if (rc) {
pr_err("failed to write rst_src_en (%d)\n", rc);
-   return;
+   return NOTIFY_DONE;
}
 
rc = regmap_read(regmap, rst_src_en, );
if (rc) {
pr_err("failed to read rst_src_en (%d)\n", rc);
-   return;
+   return NOTIFY_DONE;
}
 
rc = regmap_write(regmap, sw_mstr_rst, 1);
if (rc) {
pr_err("failed to write sw_mstr_rst (%d)\n", rc);
-   return;
+   return NOTIFY_DONE;
}
 
rc = regmap_read(regmap, sw_mstr_rst, );
if (rc) {
pr_err("failed to read sw_mstr_rst (%d)\n", rc);
-   return;
+   return NOTIFY_DONE;
}
 
while (1)
;
+
+   return NOTIFY_DONE;
 }
 
+static struct notifier_block brcmstb_restart_nb = {
+   .notifier_call = brcmstb_restart_handler,
+   .priority = 128,
+};
+
 static int brcmstb_reboot_probe(struct platform_device *pdev)
 {
int rc;
@@ -93,7 +100,10 @@ static int brcmstb_reboot_probe(struct platform_device 
*pdev)
return -EINVAL;
}
 
-   arm_pm_restart = brcmstb_reboot;
+   rc = register_restart_handler(_restart_nb);
+   if (rc)
+   dev_err(>dev,
+   "cannot register restart handler (err=%d)\n", rc);
 
return 0;
 }
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 09/10] power/reset: hisi: Register with kernel restart handler

2014-09-26 Thread Guenter Roeck
Register with kernel restart handler instead of setting arm_pm_restart directly.

Cc: Haojian Zhuang 
Signed-off-by: Guenter Roeck 
---
 drivers/power/reset/hisi-reboot.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/power/reset/hisi-reboot.c 
b/drivers/power/reset/hisi-reboot.c
index 0c91d02..e9774b0 100644
--- a/drivers/power/reset/hisi-reboot.c
+++ b/drivers/power/reset/hisi-reboot.c
@@ -14,27 +14,36 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 
 #include 
-#include 
 
 static void __iomem *base;
 static u32 reboot_offset;
 
-static void hisi_restart(enum reboot_mode mode, const char *cmd)
+static int hisi_restart_handler(struct notifier_block *this,
+   unsigned long mode, void *cmd)
 {
writel_relaxed(0xdeadbeef, base + reboot_offset);
 
while (1)
cpu_do_idle();
+
+   return NOTIFY_DONE;
 }
 
+static struct notifier_block hisi_restart_nb = {
+   .notifier_call = hisi_restart_handler,
+   .priority = 128,
+};
+
 static int hisi_reboot_probe(struct platform_device *pdev)
 {
struct device_node *np = pdev->dev.of_node;
+   int err;
 
base = of_iomap(np, 0);
if (!base) {
@@ -47,7 +56,10 @@ static int hisi_reboot_probe(struct platform_device *pdev)
return -EINVAL;
}
 
-   arm_pm_restart = hisi_restart;
+   err = register_restart_handler(_restart_nb);
+   if (err)
+   dev_err(>dev, "cannot register restart handler 
(err=%d)\n",
+   err);
 
return 0;
 }
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 03/10] power/reset: xgene: Drop devm_kfree

2014-09-26 Thread Guenter Roeck
Calling devm_kfree is unnecessary. Drop it.

Cc: Loc Ho 
Signed-off-by: Guenter Roeck 
---
 drivers/power/reset/xgene-reboot.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/power/reset/xgene-reboot.c 
b/drivers/power/reset/xgene-reboot.c
index 20cf7c3..aac0287 100644
--- a/drivers/power/reset/xgene-reboot.c
+++ b/drivers/power/reset/xgene-reboot.c
@@ -66,7 +66,6 @@ static int xgene_reboot_probe(struct platform_device *pdev)
 
ctx->csr = of_iomap(pdev->dev.of_node, 0);
if (!ctx->csr) {
-   devm_kfree(>dev, ctx);
dev_err(>dev, "can not map resource\n");
return -ENODEV;
}
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 01/10] power/reset: vexpress: Register with kernel restart handler

2014-09-26 Thread Guenter Roeck
Use the kernel restart handler instead of setting arm_pm_restart directly.
This allows for more than one restart handler in the system.

Signed-off-by: Guenter Roeck 
---
 drivers/power/reset/vexpress-poweroff.c | 33 -
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/power/reset/vexpress-poweroff.c 
b/drivers/power/reset/vexpress-poweroff.c
index 4dc102e2..03959ba 100644
--- a/drivers/power/reset/vexpress-poweroff.c
+++ b/drivers/power/reset/vexpress-poweroff.c
@@ -12,14 +12,14 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
-#include 
-
 static void vexpress_reset_do(struct device *dev, const char *what)
 {
int err = -ENOENT;
@@ -43,11 +43,19 @@ static void vexpress_power_off(void)
 
 static struct device *vexpress_restart_device;
 
-static void vexpress_restart(enum reboot_mode reboot_mode, const char *cmd)
+static int vexpress_restart(struct notifier_block *this, unsigned long mode,
+void *cmd)
 {
vexpress_reset_do(vexpress_restart_device, "restart");
+
+   return NOTIFY_DONE;
 }
 
+static struct notifier_block vexpress_restart_nb = {
+   .notifier_call = vexpress_restart,
+   .priority = 128,
+};
+
 static ssize_t vexpress_reset_active_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
@@ -86,6 +94,17 @@ static struct of_device_id vexpress_reset_of_match[] = {
{}
 };
 
+static void _vexpress_register_restart_handler(struct device *dev)
+{
+   int err;
+
+   vexpress_restart_device = dev;
+   err = register_restart_handler(_restart_nb);
+   if (err)
+   dev_err(dev, "cannot register restart handler (err=%d)\n", err);
+   device_create_file(dev, _attr_active);
+}
+
 static int vexpress_reset_probe(struct platform_device *pdev)
 {
enum vexpress_reset_func func;
@@ -110,14 +129,10 @@ static int vexpress_reset_probe(struct platform_device 
*pdev)
break;
case FUNC_RESET:
if (!vexpress_restart_device)
-   vexpress_restart_device = >dev;
-   arm_pm_restart = vexpress_restart;
-   device_create_file(>dev, _attr_active);
+   _vexpress_register_restart_handler(>dev);
break;
case FUNC_REBOOT:
-   vexpress_restart_device = >dev;
-   arm_pm_restart = vexpress_restart;
-   device_create_file(>dev, _attr_active);
+   _vexpress_register_restart_handler(>dev);
break;
};
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 00/10] power/reset: Register drivers with restart handler

2014-09-26 Thread Guenter Roeck
Convert drivers to use the kernel restart handler instead of setting
arm_pm_restart directly.

This patch set depends on the kernel restart handler patchset submitted
earlier.

Patch 01/10 was tested with qemu. All other patches were compile tested only.

Some of the restart handlers loop forever after the reset instruction was
executed. It might makes sense to use mdelay() instead and return if resetting
the system failed. I did not implement that since I do not know what reasonable
delays would be.

The series does not include drivers to be introduced in the next commit window
(at91, versatile). Those can be converted later. I also did not convert the
sun6i driver, in the assumption that it will be removed since the sunxi
watchdog driver will register a restart handler for the architecture.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clk, ti, LLVMLinux: Move __init outside of type definition

2014-09-26 Thread Felipe Balbi
On Fri, Sep 26, 2014 at 05:31:48PM -0700, Behan Webster wrote:
> As written, the __init for ti_clk_get_div_table is in the middle of the return
> type.
> 
> The gcc documentation indicates that section attributes should be added to the
> end of the function declaration:
> 
>   extern void foobar (void) __attribute__ ((section ("bar")));
> 
> However gcc seems to be very permissive with where attributes can be placed.
> clang on the other hand isn't so permissive, and fails if you put the section
> definition in the middle of the return type:
> 
> drivers/clk/ti/divider.c:298:28: error: expected ';' after struct
> static struct clk_div_table
>^
>;
> drivers/clk/ti/divider.c:298:1: warning: 'static' ignored on this
>   declaration [-Wmissing-declarations]
> static struct clk_div_table
> ^
> drivers/clk/ti/divider.c:299:9: error: type specifier missing,
>   defaults to 'int' [-Werror,-Wimplicit-int]
> __init *ti_clk_get_div_table(struct device_node *node)
> ~~  ^
> drivers/clk/ti/divider.c:345:9: warning: incompatible pointer types
>   returning 'struct clk_div_table *' from a function with result type 
> 'int *' [-Wincompatible-pointer-types]
> return table;
>^
> drivers/clk/ti/divider.c:419:9: warning: incompatible pointer types
>   assigning to 'const struct clk_div_table *' from 'int *' 
> [-Wincompatible-pointer-types]
> *table = ti_clk_get_div_table(node);
>^ ~~
> 3 warnings and 2 errors generated.
> 
> By convention, most of the kernel code puts section attributes between the
> return type and function name. In the case where the return type is a pointer,
> it's important to place the '*' on left of the __init.
> 
> This updated code works for both gcc and clang.
> 
> Signed-off-by: Behan Webster 
> Reviewed-by: Mark Charlebois 

makes sense to me:

Reviewed-by: Felipe Balbi 

I wonder if we should add this a Sparse or Coccinelle rule.

-- 
balbi


signature.asc
Description: Digital signature


Re: linux-next: Tree for Sep 26 (media/pci/pt3)

2014-09-26 Thread Mauro Carvalho Chehab
Em Fri, 26 Sep 2014 10:01:47 -0700
Randy Dunlap  escreveu:

> On 09/26/14 04:10, Stephen Rothwell wrote:
> > Hi all,
> > 
> > There will be no linux-next release on Monday.
> > 
> > This has not been a good day :-(
> > 
> > Changes since 20140925:
> 
> 
> on x86_64:
> when CONFIG_MODULES is not enabled:
> 
> ../drivers/media/pci/pt3/pt3.c: In function 'pt3_attach_fe':
> ../drivers/media/pci/pt3/pt3.c:433:6: error: implicit declaration of function 
> 'module_is_live' [-Werror=implicit-function-declaration]


:(

I didn't notice that weird I2C attach code on this driver.

Akihiro, could you please fix it? The entire I2C attach code at
pt3_attach looks weird. We should soon add support for the I2C
tuner attach code for all DVB drivers on a common place, just like
what we do for V4L drivers. That's why I didn't spend much time on
that piece of the code.

Yet, while we don't have it, please take a look at 
drivers/media/v4l2-core/v4l2-device.c and do (almost) the same on
your driver, if possible, putting the I2C load module on a function.
That will make easier for us to replace such function when we'll add 
core support for it. The function at v4l2-device for you to take
a look is: v4l2_device_register_subdev().

Thank you,
Mauro
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] clk, ti, LLVMLinux: Move __init outside of type definition

2014-09-26 Thread Behan Webster
As written, the __init for ti_clk_get_div_table is in the middle of the return
type.

The gcc documentation indicates that section attributes should be added to the
end of the function declaration:

  extern void foobar (void) __attribute__ ((section ("bar")));

However gcc seems to be very permissive with where attributes can be placed.
clang on the other hand isn't so permissive, and fails if you put the section
definition in the middle of the return type:

drivers/clk/ti/divider.c:298:28: error: expected ';' after struct
static struct clk_div_table
   ^
   ;
drivers/clk/ti/divider.c:298:1: warning: 'static' ignored on this
  declaration [-Wmissing-declarations]
static struct clk_div_table
^
drivers/clk/ti/divider.c:299:9: error: type specifier missing,
  defaults to 'int' [-Werror,-Wimplicit-int]
__init *ti_clk_get_div_table(struct device_node *node)
~~  ^
drivers/clk/ti/divider.c:345:9: warning: incompatible pointer types
  returning 'struct clk_div_table *' from a function with result type 'int 
*' [-Wincompatible-pointer-types]
return table;
   ^
drivers/clk/ti/divider.c:419:9: warning: incompatible pointer types
  assigning to 'const struct clk_div_table *' from 'int *' 
[-Wincompatible-pointer-types]
*table = ti_clk_get_div_table(node);
   ^ ~~
3 warnings and 2 errors generated.

By convention, most of the kernel code puts section attributes between the
return type and function name. In the case where the return type is a pointer,
it's important to place the '*' on left of the __init.

This updated code works for both gcc and clang.

Signed-off-by: Behan Webster 
Reviewed-by: Mark Charlebois 
---
 drivers/clk/ti/divider.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/clk/ti/divider.c b/drivers/clk/ti/divider.c
index a837f70..bff2b5b 100644
--- a/drivers/clk/ti/divider.c
+++ b/drivers/clk/ti/divider.c
@@ -300,8 +300,8 @@ static struct clk *_register_divider(struct device *dev, 
const char *name,
return clk;
 }
 
-static struct clk_div_table
-__init *ti_clk_get_div_table(struct device_node *node)
+static struct clk_div_table *
+__init ti_clk_get_div_table(struct device_node *node)
 {
struct clk_div_table *table;
const __be32 *divspec;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v13 0/9] Per-user clock constraints

2014-09-26 Thread Stephen Boyd
On 09/26/14 16:20, Mike Turquette wrote:
> Quoting Tomeu Vizoso (2014-09-26 01:09:20)
>> On 09/26/2014 03:29 AM, Stephen Boyd wrote:
>>
>>> We already have the consumer/provider split in the struct clk_hw and
>>> struct clk separation. Why don't we just use struct clk_hw throughout
>>> the provider APIs? The only op that isn't doing this is determine_rate()
>>> which might be able to accept a flag day. Otherwise we rename it to
>>> something else and migrate everyone over to a different named function
>>> that doesn't take a struct clk **. Then we introduce new APIs for the
>>> providers to use that are struct clk_hw focused instead of struct clk
>>> focused and migrate them too. The benefit being that we get proper
>>> review of this stuff because the patches are small. We can let
>>> coccinelle do it too.
> I have been opposed to mucking with clk_hw before, but that was because
> the goal was never clear. Now that we know that we're trying to split
> the API then it might be reasonable to use it.
>
> Stephen, does your above proposal still allow for unique struct clk
> cookies for each user of a clock?
>
>

Yes. The clkdev code would know that it's getting clk_hw pointers back
from the provider instead of struct clk pointers because it uses a
different callback:

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index bacc06ff939b..76c356b779d1 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -2337,6 +2337,7 @@ struct of_clk_provider {
 
struct device_node *node;
struct clk *(*get)(struct of_phandle_args *clkspec, void *data);
+   struct clk_hw *(*get_hw)(struct of_phandle_args *clkspec, void *data);
void *data;
 };
 

 

It would test for the presence of the get_hw callback and then fall back
to the get callback. Similarly we would have a clk_hw pointer in the
clkdev lookup structure. Once clkdev has the hw pointer it can ask the
ccf to generate a struct clk cookie. This would happen either in
__of_clk_get_from_provider() or clk_get_sys() depending on if we're
using DT or not.

The clk_hw structure will need to be updated to have a pointer to
clk_core. During the transition we would have struct clk and struct
clk_core next to each other in struct clk_hw. Eventually once we convert
all drivers we can remove struct clk from clk_hw. This would require
getting rid of the DEFINE_CLK macro in clk-private.h though.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clk: add gpio gated clock

2014-09-26 Thread Mike Turquette
Quoting Jyri Sarha (2014-09-11 01:44:24)
> On 09/10/2014 01:14 AM, Mike Turquette wrote:
> > Quoting Jyri Sarha (2014-09-05 05:21:34)
> >> The added gpio-gate-clock is a basic clock that can be enabled and
> >> disabled trough a gpio output. The DT binding document for the clock
> >> is also added. For EPROBE_DEFER handling the registering of the clock
> >> has to be delayed until of_clk_get() call time.
> >>
> >> Signed-off-by: Jyri Sarha 
> >> ---
> >>
> >> This is my final attempt to get this generic gpio controlled basic
> >> clock into mainline. Of course I gladly fix any issues that the patch
> >> may have. However, if there is no response, I give up and move it to TI
> >> specific clocks.
> >>
> >
> > I searched through my archives and found a post from January. You Cc'd
> > me as "". Note that the address is wrapped in
> > chevrons but there is no name string (e.g. "Mike Turquette").
> >
> > My mailer doesn't parse this well it was not flagged as to:me in my
> > filters. Maybe other mailers handle this better? If you leave out the
> > name string in the future then it would probably be best to drop the
> > chevrons.
> >
> 
> Then git send-email adds the chevrons, but in the future I'll put the 
> name string there too.
> 
> >> I've been sending this patch as a part of Beaglebone-Black HDMI audio
> >> patch series since last autumn. Since the previous version I have done
> >> some minor cleanups and changed the clock's compatible property from
> >> "gpio-clock" to "gpio-gate-clock". All the file names, comments,
> >> etc. have also been changed accordingly.
> >
> > Is your platform the only one to take advantage of this clock type so
> > far? I feel that it is esoteric enough that it shouldn't be made
> > generic.
> >
> > The main reason is that all of the generic clock types needs to be
> > overhauled at some point. E.g. the clk-gate should have its
> > machine-specific logic separated from its machine-independent logic. If
> > the gate clock were to populate .enable and .disable callbacks and then
> > leave the actual register banging, or regmap'ing, or gpio'ing up to your
> > backend driver then that would be a big improvement and would avoid the
> > need to create this new clock type outright.
> >
> > So that's on my todo list, but it's not done yet. For your patch I think
> > that putting this code into drivers/clk/ti would probably be best,
> > unless other folks could use it as-is. Even if others could use it today
> > I would want to remove it eventually for the reasons stated in the
> > paragraph above.
> >
> 
> Ok, I see. I do not know of anybody else needing a gpio gate clock at 
> the moment. I'll put the driver under drivers/clk/ti unless someone 
> comes forward soon.

Well nobody came forward but after thinking about it I've seen this
design elsewhere, so it should probably be generic. And the underlying
machine-specific ops are less relevant to this type since most of that
is abstracted away behind the GPIO api.

Applied to clk-next. Let me know if that causes a problem for you if you
have merged this into the TI clk stuff.

Regards,
Mike

> 
> Thanks,
> Jyri
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/8] perf, tools: Support handling complete branch stacks as histograms

2014-09-26 Thread Andi Kleen
From: Andi Kleen 

Currently branch stacks can be only shown as edge histograms for
individual branches. I never found this display particularly useful.

This implements an alternative mode that creates histograms over complete
branch traces, instead of individual branches, similar to how normal
callgraphs are handled. This is done by putting it in
front of the normal callgraph and then using the normal callgraph
histogram infrastructure to unify them.

This way in complex functions we can understand the control flow
that lead to a particular sample, and may even see some control
flow in the caller for short functions.

Example (simplified, of course for such simple code this
is usually not needed):

tcall.c:

volatile a = 1, b = 10, c;

__attribute__((noinline)) f2()
{
c = a / b;
}

__attribute__((noinline)) f1()
{
f2();
f2();
}
main()
{
int i;
for (i = 0; i < 100; i++)
f1();
}

% perf record -b -g ./tsrc/tcall
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
% perf report --branch-history
...
54.91%  tcall.c:6  [.] f2  tcall
|
|--65.53%-- f2 tcall.c:5
|  |
|  |--70.83%-- f1 tcall.c:11
|  |  f1 tcall.c:10
|  |  main tcall.c:18
|  |  main tcall.c:18
|  |  main tcall.c:17
|  |  main tcall.c:17
|  |  f1 tcall.c:13
|  |  f1 tcall.c:13
|  |  f2 tcall.c:7
|  |  f2 tcall.c:5
|  |  f1 tcall.c:12
|  |  f1 tcall.c:12
|  |  f2 tcall.c:7
|  |  f2 tcall.c:5
|  |  f1 tcall.c:11
|  |
|   --29.17%-- f1 tcall.c:12
| f1 tcall.c:12
| f2 tcall.c:7
| f2 tcall.c:5
| f1 tcall.c:11
| f1 tcall.c:10
| main tcall.c:18
| main tcall.c:18
| main tcall.c:17
| main tcall.c:17
| f1 tcall.c:13
| f1 tcall.c:13
| f2 tcall.c:7
| f2 tcall.c:5
| f1 tcall.c:12

The default output is unchanged.

This is only implemented in perf report, no change to record
or anywhere else.

This adds the basic code to report:
- add a new "branch" option to the -g option parser to enable this mode
- when the flag is set include the LBR into the callstack in machine.c.
The rest of the history code is unchanged and doesn't know the difference
between LBR entry and normal call entry.
- detect overlaps with the callchain
- remove small loop duplicates in the LBR

Current limitations:
- The LBR flags (mispredict etc.) are not shown in the history
and LBR entries have no special marker.
- It would be nice if annotate marked the LBR entries somehow
(e.g. with arrows)

v2: Various fixes.
v3: Merge further patches into this one. Fix white space.
v4: Improve manpage. Address review feedback.
v5: Rename functions. Better error message without -g. Fix crash without
-b.
v6: Rebase
v7: Rebase. Use NO_ENTRY in memset.
Signed-off-by: Andi Kleen 
---
 tools/perf/Documentation/perf-report.txt |   7 +-
 tools/perf/builtin-report.c  |   4 +-
 tools/perf/util/callchain.c  |   2 +
 tools/perf/util/callchain.h  |   1 +
 tools/perf/util/machine.c| 162 +++
 tools/perf/util/symbol.h |   3 +-
 6 files changed, 154 insertions(+), 25 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index 0927bf4..22706be 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -159,7 +159,7 @@ OPTIONS
 --dump-raw-trace::
 Dump raw trace in ASCII.
 
--g [type,min[,limit],order[,key]]::
+-g [type,min[,limit],order[,key][,branch]]::
 --call-graph::
 Display call chains using type, min percent threshold, optional print
limit and order.
@@ -177,6 +177,11 @@ OPTIONS
- function: compare on functions
- address: compare on individual code addresses
 
+   branch can be:
+   - branch: include last branch information in callgraph
+   when available. Usually more convenient to use --branch-history
+   for this.
+
Default: fractal,0.5,callee,function.
 
 --children::

[PATCH 2/8] perf, tools: Add --branch-history option to report

2014-09-26 Thread Andi Kleen
From: Andi Kleen 

Add a --branch-history option to perf report that changes all
the settings necessary for using the branches in callstacks.

This is just a short cut to make this nicer to use, it does
not enable any functionality by itself.

v2: Change sort order. Rename option to --branch-history to
be less confusing.
v3: Updates
v4: Fix conflict with newer perf base
Signed-off-by: Andi Kleen 
---
 tools/perf/Documentation/perf-report.txt |  5 +
 tools/perf/builtin-report.c  | 35 +++-
 2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index 22706be..dd7cccd 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -271,6 +271,11 @@ OPTIONS
branch stacks and it will automatically switch to the branch view mode,
unless --no-branch-stack is used.
 
+--branch-history::
+   Add the addresses of sampled taken branches to the callstack.
+   This allows to examine the path the program took to each sample.
+   The data collection must have used -b (or -j) and -g.
+
 --objdump=::
 Path to objdump binary.
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6b53d02..cd87a40 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -226,8 +226,9 @@ static int report__setup_sample_type(struct report *rep)
return -EINVAL;
}
if (symbol_conf.use_callchain) {
-   ui__error("Selected -g but no callchain data. Did "
-   "you call 'perf record' without -g?\n");
+   ui__error("Selected -g or --branch-history but no "
+ "callchain data. Did\n"
+ "you call 'perf record' without -g?\n");
return -1;
}
} else if (!rep->dont_use_callchains &&
@@ -550,6 +551,16 @@ parse_branch_mode(const struct option *opt __maybe_unused,
 }
 
 static int
+parse_branch_call_mode(const struct option *opt __maybe_unused,
+ const char *str __maybe_unused, int unset)
+{
+   int *branch_mode = opt->value;
+
+   *branch_mode = !unset;
+   return 0;
+}
+
+static int
 parse_percent_limit(const struct option *opt, const char *str,
int unset __maybe_unused)
 {
@@ -564,7 +575,7 @@ int cmd_report(int argc, const char **argv, const char 
*prefix __maybe_unused)
struct perf_session *session;
struct stat st;
bool has_br_stack = false;
-   int branch_mode = -1;
+   int branch_mode = -1, branch_call_mode = -1;
int ret = -1;
char callchain_default_opt[] = "fractal,0.5,callee";
const char * const report_usage[] = {
@@ -675,7 +686,11 @@ int cmd_report(int argc, const char **argv, const char 
*prefix __maybe_unused)
OPT_BOOLEAN(0, "group", _conf.event_group,
"Show event group information together"),
OPT_CALLBACK_NOOPT('b', "branch-stack", _mode, "",
-   "use branch records for histogram filling", 
parse_branch_mode),
+   "use branch records for per branch histogram filling",
+   parse_branch_mode),
+   OPT_CALLBACK_NOOPT(0, "branch-history", _call_mode, "",
+   "add last branch records to call history",
+   parse_branch_call_mode),
OPT_STRING(0, "objdump", _path, "path",
   "objdump binary to use for disassembly and annotations"),
OPT_BOOLEAN(0, "demangle", _conf.demangle,
@@ -732,10 +747,20 @@ repeat:
has_br_stack = perf_header__has_feat(>header,
 HEADER_BRANCH_STACK);
 
-   if ((branch_mode == -1 && has_br_stack) || branch_mode == 1) {
+   if (((branch_mode == -1 && has_br_stack) || branch_mode == 1) &&
+   branch_call_mode == -1) {
sort__mode = SORT_MODE__BRANCH;
symbol_conf.cumulate_callchain = false;
}
+   if (branch_call_mode != -1) {
+   callchain_param.branch_callstack = 1;
+   callchain_param.key = CCKEY_ADDRESS;
+   symbol_conf.use_callchain = true;
+   callchain_register_param(_param);
+   if (sort_order == default_sort_order)
+   sort_order = "srcline,symbol,dso";
+   branch_mode = 0;
+   }
 
if (report.mem_mode) {
if (sort__mode == SORT_MODE__BRANCH) {
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 8/8] tools, perf: Add asprintf replacement

2014-09-26 Thread Andi Kleen
From: Andi Kleen 

asprintf corrupts memory on some older glibc versions.
Provide a replacement. This fixes various segfaults
with --branch-history on older Fedoras.

Signed-off-by: Andi Kleen 
---
 tools/perf/Makefile.perf|  1 +
 tools/perf/builtin-report.c |  3 ++-
 tools/perf/util/asprintf.c  | 28 
 3 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/util/asprintf.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 171f4e6..5bdb960 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -377,6 +377,7 @@ LIB_OBJS += $(OUTPUT)util/vdso.o
 LIB_OBJS += $(OUTPUT)util/stat.o
 LIB_OBJS += $(OUTPUT)util/record.o
 LIB_OBJS += $(OUTPUT)util/srcline.o
+LIB_OBJS += $(OUTPUT)util/asprintf.o
 LIB_OBJS += $(OUTPUT)util/data.o
 LIB_OBJS += $(OUTPUT)util/tsc.o
 LIB_OBJS += $(OUTPUT)util/cloexec.o
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index cd87a40..96bc166 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -756,7 +756,8 @@ repeat:
callchain_param.branch_callstack = 1;
callchain_param.key = CCKEY_ADDRESS;
symbol_conf.use_callchain = true;
-   callchain_register_param(_param);
+   if (callchain_register_param(_param) < 0)
+   pr_err("Cannot register callchain parameters");
if (sort_order == default_sort_order)
sort_order = "srcline,symbol,dso";
branch_mode = 0;
diff --git a/tools/perf/util/asprintf.c b/tools/perf/util/asprintf.c
new file mode 100644
index 000..9aafaca
--- /dev/null
+++ b/tools/perf/util/asprintf.c
@@ -0,0 +1,28 @@
+/* Replacement for asprintf as it's buggy in older glibc versions */
+#include 
+#include 
+#include 
+#include 
+
+int vasprintf(char **str, const char *fmt, va_list ap)
+{
+   char buf[1024];
+   int len = vsnprintf(buf, sizeof buf, fmt, ap);
+
+   *str = malloc(len + 1);
+   if (!*str)
+   return -1;
+   strcpy(*str, buf);
+   return len;
+}
+
+int asprintf(char **str, const char *fmt, ...)
+{
+   va_list ap;
+   int ret;
+
+   va_start(ap, fmt);
+   ret = vasprintf(str, fmt, ap);
+   va_end(ap);
+   return ret;
+}
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 7/8] tools, perf: Make srcline output address with -v

2014-09-26 Thread Andi Kleen
From: Andi Kleen 

When -v is specified always print the hex address for the srcline.

Signed-off-by: Andi Kleen 
---
 tools/perf/util/srcline.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 36a7aff..a22be7c 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -258,6 +258,12 @@ char *get_srcline(struct dso *dso, unsigned long addr, 
struct symbol *sym,
unsigned line = 0;
char *srcline;
const char *dso_name;
+   char astr[50];
+
+   if (verbose)
+   snprintf(astr, sizeof astr, " %#lx", addr);
+   else
+   astr[0] = 0;
 
if (!dso->has_srcline)
goto out;
@@ -276,7 +282,12 @@ char *get_srcline(struct dso *dso, unsigned long addr, 
struct symbol *sym,
if (!addr2line(dso_name, addr, , , dso))
goto out;
 
-   if (asprintf(, "%s:%u", basename(file), line) < 0) {
+   if (line == 0) {
+   free(file);
+   goto fallback;
+   }
+
+   if (asprintf(, "%s:%u%s", basename(file), line, astr) < 0) {
free(file);
goto out;
}
@@ -291,9 +302,10 @@ out:
dso->has_srcline = 0;
dso__free_a2l(dso);
}
+fallback:
if (sym) {
-   if (asprintf(, "%s+%ld", show_sym ? sym->name : "",
-   addr - sym->start) < 0)
+   if (asprintf(, "%s+%ld%s", show_sym ? sym->name : "",
+   addr - sym->start, astr) < 0)
return SRCLINE_UNKNOWN;
} else if (asprintf(, "%s[%lx]", dso->short_name, addr) < 0)
return SRCLINE_UNKNOWN;
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 6/8] tools, perf: Make get_srcline fall back to sym+offset

2014-09-26 Thread Andi Kleen
From: Andi Kleen 

When the source line is not found fall back to sym + offset.
This is generally much more useful than a raw address.
For this we need to pass in the symbol from the caller.
For some callers it's awkward to compute, so we stay
at the old behaviour.

Signed-off-by: Andi Kleen 
---
 tools/perf/util/annotate.c  |  2 +-
 tools/perf/util/callchain.c |  3 ++-
 tools/perf/util/map.c   |  2 +-
 tools/perf/util/sort.c  |  6 --
 tools/perf/util/srcline.c   | 12 +---
 tools/perf/util/util.h  |  4 +++-
 6 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index f49f9a5..ccd6f3a 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1190,7 +1190,7 @@ static int symbol__get_source_line(struct symbol *sym, 
struct map *map,
goto next;
 
offset = start + i;
-   src_line->path = get_srcline(map->dso, offset);
+   src_line->path = get_srcline(map->dso, offset, NULL, false);
insert_source_line(_root, src_line);
 
next:
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index b7ee210..8c0cea0 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -666,7 +666,8 @@ char *callchain_list__sym_name(struct callchain_list *cl,
cl->ms.map && !cl->srcline)
cl->srcline = get_srcline(cl->ms.map->dso,
  map__rip_2objdump(cl->ms.map,
-   cl->ip));
+   cl->ip),
+ cl->ms.sym, false);
if (cl->srcline)
printed = scnprintf(bf, bfsize, "%s %s",
cl->ms.sym->name, cl->srcline);
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index b709059..4f3cdbc 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -360,7 +360,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const 
char *prefix,
 
if (map && map->dso) {
srcline = get_srcline(map->dso,
- map__rip_2objdump(map, addr));
+ map__rip_2objdump(map, addr), NULL, true);
if (srcline != SRCLINE_UNKNOWN)
ret = fprintf(fp, "%s%s", prefix, srcline);
free_srcline(srcline);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 289df9d..505700e 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -291,7 +291,8 @@ sort__srcline_cmp(struct hist_entry *left, struct 
hist_entry *right)
else {
struct map *map = left->ms.map;
left->srcline = get_srcline(map->dso,
-   map__rip_2objdump(map, left->ip));
+  map__rip_2objdump(map, left->ip),
+   left->ms.sym, true);
}
}
if (!right->srcline) {
@@ -300,7 +301,8 @@ sort__srcline_cmp(struct hist_entry *left, struct 
hist_entry *right)
else {
struct map *map = right->ms.map;
right->srcline = get_srcline(map->dso,
-   map__rip_2objdump(map, right->ip));
+map__rip_2objdump(map, right->ip),
+right->ms.sym, true);
}
}
return strcmp(right->srcline, left->srcline);
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index ac877f9..36a7aff 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -8,12 +8,13 @@
 #include "util/util.h"
 #include "util/debug.h"
 
+#include "symbol.h"
+
 #ifdef HAVE_LIBBFD_SUPPORT
 
 /*
  * Implement addr2line using libbfd.
  */
-#define PACKAGE "perf"
 #include 
 
 struct a2l_data {
@@ -250,7 +251,8 @@ void dso__free_a2l(struct dso *dso __maybe_unused)
  */
 #define A2L_FAIL_LIMIT 123
 
-char *get_srcline(struct dso *dso, unsigned long addr)
+char *get_srcline(struct dso *dso, unsigned long addr, struct symbol *sym,
+ bool show_sym)
 {
char *file = NULL;
unsigned line = 0;
@@ -289,7 +291,11 @@ out:
dso->has_srcline = 0;
dso__free_a2l(dso);
}
-   if (asprintf(, "%s[%lx]", dso->short_name, addr) < 0)
+   if (sym) {
+   if (asprintf(, "%s+%ld", show_sym ? sym->name : "",
+   addr - sym->start) < 0)
+   return SRCLINE_UNKNOWN;
+   } else if (asprintf(, "%s[%lx]", dso->short_name, addr) < 0)
return 

Implement lbr-as-callgraph v10

2014-09-26 Thread Andi Kleen
This has been reviewed before. Any change to merge it now?

[Rebase. Fix one conflict with an earlier tip change]
[Just a repost after a rebase]
[Even more review feedback and some bugs addressed.]
[Only port to changes in perf/core. No other changes.]
[Rebase to latest perf/core]
[Another rebase. No changes]

This patchkit implements lbr-as-callgraphs in per freport,
as an alternative way to present LBR information.

Current perf report does a histogram over the branch edges,
which is useful to look at basic blocks, but doesn't tell
you anything about the larger control flow behaviour.

This patchkit adds a new option --branch-history that
adds the branch paths to the callgraph history instead.

This allows to reason about individual branch paths leading
to specific samples.

Also available at
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/lbr-callgraph7

v2:
- rebased on perf/core
- fix various issues
- rename the option to --branch-history
- various fixes to display the information more concise
v3:
- White space changes
- Consolidate some patches
- Update some descriptions
v4:
- Fix various display problems
- Unknown srcline is now printed as symbol+offset
- Refactor some code to address review feedback
- Merge with latest tip
- Fix missing srcline display in stdio hist output.
v5:
- Rename functions
- Fix gtk build problem
- Fix crash without -g
- Improve error messages
- Improve srcline display in various ways
v6:
- Port to latest perf/core
v7:
- Really port to latest perf/core
v8:
- Rebased on 3.16-rc1
v9:
- Rebase on 3.17-rc* tip/perf/core
v10:
- Rebase to tip/perf/core.
- Add missing check to make --branch-history work again
with latest tip.


Example output:

% perf record -b -g ./tsrc/tcall
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
% perf report --branch-history
...
54.91%  tcall.c:6  [.] f2  tcall
|
|--65.53%-- f2 tcall.c:5
|  |
|  |--70.83%-- f1 tcall.c:11
|  |  f1 tcall.c:10
|  |  main tcall.c:18
|  |  main tcall.c:18
|  |  main tcall.c:17
|  |  main tcall.c:17
|  |  f1 tcall.c:13
|  |  f1 tcall.c:13
|  |  f2 tcall.c:7
|  |  f2 tcall.c:5
|  |  f1 tcall.c:12
|  |  f1 tcall.c:12
|  |  f2 tcall.c:7
|  |  f2 tcall.c:5
|  |  f1 tcall.c:11


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3/8] perf, tools: Enable printing the srcline in the history

2014-09-26 Thread Andi Kleen
From: Andi Kleen 

For lbr-as-callgraph we need to see the line number in the history,
because many LBR entries can be in a single function, and just
showing the same function name many times is not useful.

When the history code is configured to sort by address, also try to
resolve the address to a file:srcline and display this in the browser.
If that doesn't work still display the address.

This can be also useful without LBRs for understanding which call in a large
function (or in which inlined function) called something else.

Contains fixes from Namhyung Kim

v2: Refactor code into common function
v3: Fix GTK build
v4: Rebase
Signed-off-by: Andi Kleen 
---
 tools/perf/ui/browsers/hists.c | 17 -
 tools/perf/ui/gtk/hists.c  | 11 +--
 tools/perf/ui/stdio/hist.c | 23 +--
 tools/perf/util/callchain.c| 29 +
 tools/perf/util/callchain.h|  4 
 tools/perf/util/machine.c  |  2 +-
 tools/perf/util/srcline.c  |  6 --
 7 files changed, 48 insertions(+), 44 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index d4cef68..3144752 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -460,23 +460,6 @@ out:
return key;
 }
 
-static char *callchain_list__sym_name(struct callchain_list *cl,
- char *bf, size_t bfsize, bool show_dso)
-{
-   int printed;
-
-   if (cl->ms.sym)
-   printed = scnprintf(bf, bfsize, "%s", cl->ms.sym->name);
-   else
-   printed = scnprintf(bf, bfsize, "%#" PRIx64, cl->ip);
-
-   if (show_dso)
-   scnprintf(bf + printed, bfsize - printed, " %s",
- cl->ms.map ? cl->ms.map->dso->short_name : "unknown");
-
-   return bf;
-}
-
 struct callchain_print_arg {
/* for hists browser */
off_t   row_offset;
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index f3fa425..b19f96d 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -89,15 +89,6 @@ void perf_gtk__init_hpp(void)
perf_gtk__hpp_color_overhead_acc;
 }
 
-static void callchain_list__sym_name(struct callchain_list *cl,
-char *bf, size_t bfsize)
-{
-   if (cl->ms.sym)
-   scnprintf(bf, bfsize, "%s", cl->ms.sym->name);
-   else
-   scnprintf(bf, bfsize, "%#" PRIx64, cl->ip);
-}
-
 static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
GtkTreeIter *parent, int col, u64 total)
 {
@@ -128,7 +119,7 @@ static void perf_gtk__add_callchain(struct rb_root *root, 
GtkTreeStore *store,
scnprintf(buf, sizeof(buf), "%5.2f%%", percent);
gtk_tree_store_set(store, , 0, buf, -1);
 
-   callchain_list__sym_name(chain, buf, sizeof(buf));
+   callchain_list__sym_name(chain, buf, sizeof(buf), 
false);
gtk_tree_store_set(store, , col, buf, -1);
 
if (need_new_parent) {
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 15b451a..dfcbc90 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -41,6 +41,7 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct 
callchain_list *chain,
 {
int i;
size_t ret = 0;
+   char bf[1024];
 
ret += callchain__fprintf_left_margin(fp, left_margin);
for (i = 0; i < depth; i++) {
@@ -56,11 +57,8 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct 
callchain_list *chain,
} else
ret += fprintf(fp, "%s", "  ");
}
-   if (chain->ms.sym)
-   ret += fprintf(fp, "%s\n", chain->ms.sym->name);
-   else
-   ret += fprintf(fp, "0x%0" PRIx64 "\n", chain->ip);
-
+   fputs(callchain_list__sym_name(chain, bf, sizeof(bf), false), fp);
+   fputc('\n', fp);
return ret;
 }
 
@@ -168,6 +166,7 @@ static size_t callchain__fprintf_graph(FILE *fp, struct 
rb_root *root,
struct rb_node *node;
int i = 0;
int ret = 0;
+   char bf[1024];
 
/*
 * If have one single callchain root, don't bother printing
@@ -196,10 +195,8 @@ static size_t callchain__fprintf_graph(FILE *fp, struct 
rb_root *root,
} else
ret += callchain__fprintf_left_margin(fp, 
left_margin);
 
-   if (chain->ms.sym)
-   ret += fprintf(fp, " %s\n", 
chain->ms.sym->name);
-   else
-   ret += fprintf(fp, " %p\n", (void 
*)(long)chain->ip);
+   ret += fprintf(fp, "%s\n", 
callchain_list__sym_name(chain, bf, sizeof(bf),
+ 

[PATCH 5/8] perf, tools: Support source line numbers in annotate

2014-09-26 Thread Andi Kleen
From: Andi Kleen 

With srcline key/sort'ing it's useful to have line numbers
in the annotate window. This patch implements this.

Use objdump -l to request the line numbers and
save them in the line structure. Then the browser
displays them for source lines.

The line numbers are not displayed by default, but can be
toggled on with 'k'

There is one unfortunate problem with this setup. For
lines not containing source and which are outside functions
objdump -l reports line numbers off by a few: it always reports
the first line number in the next function even for lines
that are outside the function.
I haven't found a nice way to detect/correct this. Probably objdump
has to be fixed.
See https://sourceware.org/bugzilla/show_bug.cgi?id=16433

The line numbers are still useful even with these problems,
as most are correct and the ones which are not are nearby.

Signed-off-by: Andi Kleen 
---
 tools/perf/ui/browsers/annotate.c | 13 -
 tools/perf/util/annotate.c| 30 +-
 tools/perf/util/annotate.h|  1 +
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c 
b/tools/perf/ui/browsers/annotate.c
index f0697a3..8df6787 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -27,6 +27,7 @@ static struct annotate_browser_opt {
bool hide_src_code,
 use_offset,
 jump_arrows,
+show_linenr,
 show_nr_jumps;
 } annotate_browser__opts = {
.use_offset = true,
@@ -128,7 +129,11 @@ static void annotate_browser__write(struct ui_browser 
*browser, void *entry, int
if (!*dl->line)
slsmg_write_nstring(" ", width - pcnt_width);
else if (dl->offset == -1) {
-   printed = scnprintf(bf, sizeof(bf), "%*s  ",
+   if (dl->line_nr && annotate_browser__opts.show_linenr)
+   printed = scnprintf(bf, sizeof(bf), "%*s %-5d ",
+   ab->addr_width, " ", dl->line_nr);
+   else
+   printed = scnprintf(bf, sizeof(bf), "%*s  ",
ab->addr_width, " ");
slsmg_write_nstring(bf, printed);
slsmg_write_nstring(dl->line, width - printed - pcnt_width + 1);
@@ -733,6 +738,7 @@ static int annotate_browser__run(struct annotate_browser 
*browser,
"o Toggle disassembler output/simplified view\n"
"s Toggle source code view\n"
"/ Search string\n"
+   "k Toggle line numbers\n"
"r Run available scripts\n"
"? Search string backwards\n");
continue;
@@ -741,6 +747,10 @@ static int annotate_browser__run(struct annotate_browser 
*browser,
script_browse(NULL);
continue;
}
+   case 'k':
+   annotate_browser__opts.show_linenr =
+   !annotate_browser__opts.show_linenr;
+   break;
case 'H':
nd = browser->curr_hot;
break;
@@ -984,6 +994,7 @@ static struct annotate_config {
 } annotate__configs[] = {
ANNOTATE_CFG(hide_src_code),
ANNOTATE_CFG(jump_arrows),
+   ANNOTATE_CFG(show_linenr),
ANNOTATE_CFG(show_nr_jumps),
ANNOTATE_CFG(use_offset),
 };
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 3643752..f49f9a5 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -17,11 +17,13 @@
 #include "debug.h"
 #include "annotate.h"
 #include "evsel.h"
+#include 
 #include 
 #include 
 
 const char *disassembler_style;
 const char *objdump_path;
+static regex_t  file_lineno;
 
 static struct ins *ins__find(const char *name);
 static int disasm_line__parse(char *line, char **namep, char **rawp);
@@ -570,13 +572,15 @@ out_free_name:
return -1;
 }
 
-static struct disasm_line *disasm_line__new(s64 offset, char *line, size_t 
privsize)
+static struct disasm_line *disasm_line__new(s64 offset, char *line,
+   size_t privsize, int line_nr)
 {
struct disasm_line *dl = zalloc(sizeof(*dl) + privsize);
 
if (dl != NULL) {
dl->offset = offset;
dl->line = strdup(line);
+   dl->line_nr = line_nr;
if (dl->line == NULL)
goto out_delete;
 
@@ -788,13 +792,15 @@ static int disasm_line__print(struct disasm_line *dl, 
struct symbol *sym, u64 st
  * The ops.raw part will be parsed further according to type of the 
instruction.
  */
 static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
- FILE *file, size_t 

[PATCH 4/8] perf, tools: Only print base source file for srcline

2014-09-26 Thread Andi Kleen
From: Andi Kleen 

For perf report with --sort srcline only print the base source file
name. This makes the results generally fit much better to the
screen. The path is usually not that useful anyways because it is
often from different systems.

Signed-off-by: Andi Kleen 
---
 tools/perf/util/srcline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index c6a7cdc..ac877f9 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -274,7 +274,7 @@ char *get_srcline(struct dso *dso, unsigned long addr)
if (!addr2line(dso_name, addr, , , dso))
goto out;
 
-   if (asprintf(, "%s:%u", file, line) < 0) {
+   if (asprintf(, "%s:%u", basename(file), line) < 0) {
free(file);
goto out;
}
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] Use faster check for modules in backtrace on 64bit

2014-09-26 Thread Andi Kleen
From: Andi Kleen 

On my workstation which has a lot of modules loaded:

$ lsmod | wc -l
80

backtrace from the NMI for perf record -g can take a quite long time.

This leads to frequent messages like:
perf interrupt took too long (7852 > 7812), lowering 
kernel.perf_event_max_sample_rate to 16000

One larger part of the PMI cost is each text address check during
the backtrace taking upto to 3us, like this:

  1)   |  print_context_stack_bp() {
  1)   |__kernel_text_address() {
  1)   |  is_module_text_address() {
  1)   |__module_text_address() {
  1)   1.611 us|  __module_address();
  1)   1.939 us|}
  1)   2.296 us|  }
  1)   2.659 us|}
  1)   |__kernel_text_address() {
  1)   |  is_module_text_address() {
  1)   |__module_text_address() {
  1)   0.724 us|  __module_address();
  1)   1.064 us|}
  1)   1.430 us|  }
  1)   1.798 us|}
  1)   |__kernel_text_address() {
  1)   |  is_module_text_address() {
  1)   |__module_text_address() {
  1)   0.656 us|  __module_address();
  1)   1.012 us|}
  1)   1.356 us|  }
  1)   1.761 us|}

So just with a reasonably sized backtrace easily 10-20us can be spent
on just checking the frame pointer IPs.

The main cost is simply walking this long list of modules and checking it.

On 64bit kernels we can do a short cut. All modules are in a special reserved
virtual address space area. So only check for that range, which is much cheaper.

This has the (small) potential to get a false positive on a pointer to a
data segment in a module.  However since we also use the frame pointer
chain as initial sanity check I think the danger of this is very low.

Signed-off-by: Andi Kleen 
---
 arch/x86/kernel/dumpstack.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index b74ebc7..b7cbae3 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -130,8 +130,20 @@ print_context_stack_bp(struct thread_info *tinfo,
while (valid_stack_ptr(tinfo, ret_addr, sizeof(*ret_addr), end)) {
unsigned long addr = *ret_addr;
 
+#ifdef CONFIG_64BIT
+   /*
+* On 64 bit the modules are in a special reserved
+* area, so we can just check the range.
+* It is not as exact as a full lookup, but together
+* with the frame pointer it is good enough.
+*/
+   if (!core_kernel_text(addr) &&
+   !(addr >= MODULES_VADDR && addr < MODULES_END))
+   break;
+#else
if (!__kernel_text_address(addr))
break;
+#endif
 
ops->address(data, addr, 1);
frame = frame->next_frame;
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86 : Ensure X86_FLAGS_NT is cleared on syscall entry

2014-09-26 Thread Linus Torvalds
On Fri, Sep 26, 2014 at 3:00 PM, Chuck Ebbert  wrote:
>
> I don't get it. Why isn't this patch acceptable, at least on x86-64
> where NT is never valid?

Why do you think it's not acceptable? Why do you raise a stink *one*
day after the patch - that seems to not be very important - is sent
out?

Why would SIGSEGV even be that bad? The program does crap things, why
accept it silently?

I don't think the patch is necessarily wrong, but I don't see why it
would be critical, and I *definitely* don't see why the f*ck you are
making a big deal of it.

Go away, stop bothering people.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Mount propagation issue

2014-09-26 Thread Ross Lagerwall
Hi,

I have encountered a strange bug (on 3.14, 3.16.3 and master) regarding
mount namespaces and a particular directory.  Somehow, mounts are being
mounted successfully but disappearing when the process dies, but only
for /mnt/puppy, not for any other directory in the system:

# grep puppy /proc/mounts# nothing mounted yet
# mount -t tmpfs tmpfs /mnt/puppy
# grep puppy /proc/mounts# no output

# mount -t tmpfs tmpfs /mnt/puppy2
# grep puppy /proc/mounts
tmpfs /mnt/puppy2 tmpfs rw,relatime 0 0

Furthermore, I wrote a program to mount /mnt/puppy and immediately
read /proc/mounts:
# grep puppy /proc/mounts# no output
# ./mounter
tmpfs /mnt/puppy tmpfs rw,relatime 0 0
# grep puppy /proc/mounts# no output

Finally, I note that the problem seems to occur if the mount needs to be
propagated to the default namespace (note that systemd causes namespaces
to be shared by default):
# unshare -m

subshell # mount -t tmpfs tmpfs /mnt/puppy
subshell # grep puppy /proc/mounts   # no output
subshell # mount --make-rprivate /
subshell # mount -t tmpfs tmpfs /mnt/puppy
subshell # grep puppy /proc/mounts   # success!
tmpfs /mnt/puppy tmpfs rw,relatime 0 0
subshell # umount /mnt/puppy
subshell # exit

# mount --make-rprivate /# back in the default namespace
# mount -t tmpfs tmpfs /mnt/puppy
# grep puppy /proc/mounts# no output

I hope someone has an idea of what's going on or how to debug this
because I've run out of ideas...

Thanks!
-- 
Ross Lagerwall
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Optimize backtrace code for perf PMI handler.

2014-09-26 Thread Andi Kleen
While doing perf record -g I regularly exceed the time quote for the NMI:

This leads to perf shortening the period, which causes various problems.

This patchkit optimizes two sources of longer latencies in the NMI backtrace
code. There's probably more work needed to fix other sources of latencies
too.

-Andi

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] x86: Only do a single page fault for copy_from_user_nmi

2014-09-26 Thread Andi Kleen
From: Andi Kleen 

When copy_from_user_nmi faults the copy_user_tail code ends
up "replaying" the page faults to compute the exact tail bytes,
(added with 112958).

So we do an expensive page fault. And then we do it *again*.

This ends up being very expensive in the PMI handler for any
page fault on a stack access, and is one the more common
causes for the NMI handler exceeding its runtime limit.

  1)   0.109 us|copy_from_user_nmi();
  1)   |copy_from_user_nmi() {
  1)   |  __do_page_fault() {
  1)   |bad_area_nosemaphore() {
  1)   |  __bad_area_nosemaphore() {
  1)   |no_context() {
  1)   |  fixup_exception() {
  1)   |search_exception_tables() {
  1)   0.079 us|  search_extable();
  1)   0.409 us|}
  1)   0.757 us|  }
  1)   1.106 us|}
  1)   1.466 us|  }
  1)   1.793 us|}
  1)   2.233 us|  }
  1)   |  copy_user_handle_tail() {
  1)   |__do_page_fault() {
  1)   |  bad_area_nosemaphore() {
  1)   |__bad_area_nosemaphore() {
  1)   |  no_context() {
  1)   |fixup_exception() {
  1)   |  search_exception_tables() {
  1)   0.060 us|search_extable();
  1)   0.412 us|  }
  1)   0.764 us|}
  1)   1.074 us|  }
  1)   1.389 us|}
  1)   1.665 us|  }
  1)   2.002 us|}
  1)   2.784 us|  }
  1)   6.230 us|}

The NMI code actually doesn't care about the exact tail value. It only
needs to know if a fault happened (!= 0)

So check for in_nmi() in copy_user_tail and don't bother with the exact
tail check. This way we save the extra ~2.7us.

In theory we could also duplicate the whole copy_*_ path for cases
where the caller doesn't care about the exact bytes. But that
seems overkill for just this issue, and I'm not sure anyone
else cares about how fast this is. The simpler check works
as well for now.

Cc: torva...@linux-foundation.org
Cc: Vitaly Mayatskikh 
Signed-off-by: Andi Kleen 
---
 arch/x86/lib/usercopy_64.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
index c905e89..1b581e7 100644
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -67,6 +67,9 @@ EXPORT_SYMBOL(copy_in_user);
  * Try to copy last bytes and clear the rest if needed.
  * Since protection fault in copy_from/to_user is not a normal situation,
  * it is not necessary to optimize tail handling.
+ *
+ * It can be somewhat common in NMI handlers doing backtraces.
+ * So don't bother here with returning the exact tail.
  */
 __visible unsigned long
 copy_user_handle_tail(char *to, char *from, unsigned len, unsigned zerorest)
@@ -74,6 +77,11 @@ copy_user_handle_tail(char *to, char *from, unsigned len, 
unsigned zerorest)
char c;
unsigned zero_len;
 
+   if (in_nmi()) {
+   len = 1; /* users only care about != 0 */
+   goto out;
+   }
+
for (; len; --len, to++) {
if (__get_user_nocheck(c, from++, sizeof(char)))
break;
@@ -84,6 +92,7 @@ copy_user_handle_tail(char *to, char *from, unsigned len, 
unsigned zerorest)
for (c = 0, zero_len = len; zerorest && zero_len; --zero_len)
if (__put_user_nocheck(c, to++, sizeof(char)))
break;
+out:
clac();
return len;
 }
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 2/2] Move BTRFS RCU string to common library

2014-09-26 Thread josh
On Fri, Sep 19, 2014 at 02:01:30AM -0700, Omar Sandoval wrote:
> The RCU-friendy string API used internally by BTRFS is generic enough for
> common use. This doesn't add any new functionality, but instead just moves the
> code and documents the existing API.
> 
> Signed-off-by: Omar Sandoval 

One nit: shouldn't the returned rcu_string pointer have an __rcu
address space annotation?

With that fixed:
Reviewed-by: Josh Triplett 

>  fs/btrfs/check-integrity.c |  6 +--
>  fs/btrfs/dev-replace.c | 19 +-
>  fs/btrfs/disk-io.c |  6 +--
>  fs/btrfs/extent_io.c   |  4 +-
>  fs/btrfs/ioctl.c   |  4 +-
>  fs/btrfs/raid56.c  |  2 +-
>  fs/btrfs/rcu-string.h  | 56 
>  fs/btrfs/scrub.c   | 15 
>  fs/btrfs/super.c   |  2 +-
>  fs/btrfs/volumes.c | 14 +++
>  include/linux/rcustring.h  | 91 
> ++
>  11 files changed, 128 insertions(+), 91 deletions(-)
>  delete mode 100644 fs/btrfs/rcu-string.h
>  create mode 100644 include/linux/rcustring.h
> 
> diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
> index ce92ae3..4ccd7da 100644
> --- a/fs/btrfs/check-integrity.c
> +++ b/fs/btrfs/check-integrity.c
> @@ -94,6 +94,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "ctree.h"
>  #include "disk-io.h"
>  #include "hash.h"
> @@ -103,7 +104,6 @@
>  #include "print-tree.h"
>  #include "locking.h"
>  #include "check-integrity.h"
> -#include "rcu-string.h"
>  
>  #define BTRFSIC_BLOCK_HASHTABLE_SIZE 0x1
>  #define BTRFSIC_BLOCK_LINK_HASHTABLE_SIZE 0x1
> @@ -851,8 +851,8 @@ static int btrfsic_process_superblock_dev_mirror(
>   printk_in_rcu(KERN_INFO "New initial S-block (bdev %p, 
> %s)"
>" @%llu (%s/%llu/%d)\n",
>superblock_bdev,
> -  rcu_str_deref(device->name), dev_bytenr,
> -  dev_state->name, dev_bytenr,
> +  rcu_string_dereference(device->name),
> +  dev_bytenr, dev_state->name, dev_bytenr,
>superblock_mirror_num);
>   list_add(_tmp->all_blocks_node,
>>all_blocks_list);
> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
> index eea26e1..87d10cc 100644
> --- a/fs/btrfs/dev-replace.c
> +++ b/fs/btrfs/dev-replace.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include "ctree.h"
>  #include "extent_map.h"
> @@ -34,7 +35,6 @@
>  #include "volumes.h"
>  #include "async-thread.h"
>  #include "check-integrity.h"
> -#include "rcu-string.h"
>  #include "dev-replace.h"
>  #include "sysfs.h"
>  
> @@ -376,9 +376,9 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
>   printk_in_rcu(KERN_INFO
> "BTRFS: dev_replace from %s (devid %llu) to %s started\n",
> src_device->missing ? "" :
> - rcu_str_deref(src_device->name),
> +   rcu_string_dereference(src_device->name),
> src_device->devid,
> -   rcu_str_deref(tgt_device->name));
> +   rcu_string_dereference(tgt_device->name));
>  
>   tgt_device->total_bytes = src_device->total_bytes;
>   tgt_device->disk_total_bytes = src_device->disk_total_bytes;
> @@ -528,9 +528,10 @@ static int btrfs_dev_replace_finishing(struct 
> btrfs_fs_info *fs_info,
>   printk_in_rcu(KERN_ERR
> "BTRFS: btrfs_scrub_dev(%s, %llu, %s) failed 
> %d\n",
> src_device->missing ? "" :
> - rcu_str_deref(src_device->name),
> +   rcu_string_dereference(src_device->name),
> src_device->devid,
> -   rcu_str_deref(tgt_device->name), scrub_ret);
> +   rcu_string_dereference(tgt_device->name),
> +   scrub_ret);
>   btrfs_dev_replace_unlock(dev_replace);
>   mutex_unlock(>fs_info->fs_devices->device_list_mutex);
>   mutex_unlock(>fs_info->chunk_mutex);
> @@ -544,9 +545,9 @@ static int btrfs_dev_replace_finishing(struct 
> btrfs_fs_info *fs_info,
>   printk_in_rcu(KERN_INFO
> "BTRFS: dev_replace from %s (devid %llu) to %s) 
> finished\n",
> src_device->missing ? "" :
> - rcu_str_deref(src_device->name),
> +   rcu_string_dereference(src_device->name),
> src_device->devid,
> -   rcu_str_deref(tgt_device->name));
> +   rcu_string_dereference(tgt_device->name));
>   tgt_device->is_tgtdev_for_dev_replace = 0;
>   tgt_device->devid = src_device->devid;
>   src_device->devid 

Re: [PATCH] clk: prevent erronous parsing of children during rate change

2014-09-26 Thread Mike Turquette
Quoting Tero Kristo (2014-09-26 00:18:55)
> On 09/26/2014 04:35 AM, Stephen Boyd wrote:
> > On 09/23/14 06:38, Tero Kristo wrote:
> >> On 09/22/2014 10:18 PM, Stephen Boyd wrote:
> >>> On 08/21, Tero Kristo wrote:
> /* Skip children who will be reparented to another clock */
> if (child->new_parent && child->new_parent != clk)
> continue;
> >>>
> >>> Are we not hitting the new_parent check here? I don't understand
> >>> how we can be changing parents here unless the check is being
> >>> avoided, in which case I wonder why determine_rate isn't being
> >>> used.
> >>>
> >>
> >> It depends how the clock underneath handles the situation. The error I
> >> am seeing actually happens with a SoC specific compound clock (DPLL)
> >> which integrates set_rate + mux functionality into a single clock
> >> node. A call to the clk_set_rate changes the parent of this clock
> >> (from bypass clock to reference clock), in addition to changing the
> >> rate (tune the mul+div.) I looked at using the determine rate call
> >> with this type but it breaks everything up... the parent gets changed
> >> but not the clock rate, in addition to some other issues.
> >
> > Ok. Is this omap3_noncore_dpll_set_rate()?
> 
> Yes.
> 
>  > Can we use determine_rate +
> > clk_set_parent_and_rate()? At least clk_set_parent_and_rate() would
> > allow us to do the mult+div and the parent in the same op call, although
> > I don't understand why setting the parent and then setting the rate is
> > not going to work.
> 
> Well, setting parent first, then rate later causes problems with the 
> DPLL ending up running with illegal (non-specified) rate, the M+N values 
> are most likely wrong if you just switch from bypass clock to reference 
> clock first without programming the M+N first.

I took a quick look and it still seems to me that the OMAP DPLLs are
still not modeled properly as mux clocks. Is this correct?

This issue has been lingering for a long time and we can't use
determine_rate unless that clock has multiple parents. Simply hacking
knowledge of the parent bypass clock into the .set_rate callback is not
enough.

Regards,
Mike

> 
>   I'm interested in the other issues that you mentioned
> > too.
> 
> Mostly these were side-effects from the illegal DPLL setup I guess, like 
> boot hang, failed drivers etc. I didn't really investigate this that 
> much as it is much more simpler just to use safe list iteration here.
> 
> -Tero
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] partitions: efi.c: fix formatting and function readability

2014-09-26 Thread Davidlohr Bueso
On Fri, 2014-09-26 at 16:53 +0100, Matt Fleming wrote:
> (No response from Davidlohr, try another address)
> 
> On Wed, 03 Sep, at 11:07:20PM, Joseph Poirier wrote:
> > From: jpoirier 
> > 
> > Fixed coding style issues, refactored the protective MBR check
> > function for readability, and removed an explicit return from a
> > function returning void.

Apologies for the late reply, I had not seen this.

This patch does too much, mind sending a series instead?

Thanks,
Davidlohr

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v13 0/9] Per-user clock constraints

2014-09-26 Thread Mike Turquette
Quoting Tomeu Vizoso (2014-09-26 01:09:20)
> On 09/26/2014 03:29 AM, Stephen Boyd wrote:
> > On 09/24/14 01:27, Tomeu Vizoso wrote:
> >> On 09/23/2014 10:59 PM, Stephen Boyd wrote:
> >>>
> >>> Any thoughts on my comments on patch set #10[1]? It seems like we can
> >>> avoid having a flag day to support this.
> >> I cannot say that I fully understand your proposal, but IMO the most
> >> valuable thing in this patchset is precisely the API split (and thus,
> >> the flag day is inherent to it).
> >>
> >> I see a lot of value in clk consumers to use a defined set of functions
> >> that all take and/or return struct clk, and for providers to use the
> >> functions that take and/or return struct clk_core. Makes the API clearer
> >> and allows it to have a more scalable growth in the future.
> > 
> > I see a lot of non-value. It's hugely invasive needing every driver to
> > change.
> 
> There's obviously a trade-off to be made by the maintainer, and I don't
> think that Mike has taken this lightly. He has already stated what his
> opinion is and I have acted accordingly.

I respect Stephen's opinion on this. You both have the goal of getting
clock-based constraints merged in, which is good.

> 
> > Invariably we're going to break something.
> 
> Given all the build tests that these series have passed and the nature
> of the changes (you can also check the script that does the
> refactoring), I don't expect that much breakage.
> 
> > Back porting things
> > across this is a real pain (good luck stable trees!). Reviewing the
> > patches aren't feasible given their size. There's a reason why we don't
> > do flag days. They suck.
> 
> Flag days are sometimes needed and is up to each maintainer to decide
> what is better for the subsystem since they are the ones dealing with
> the issues that may arise anyway.
> 
> > We already have the consumer/provider split in the struct clk_hw and
> > struct clk separation. Why don't we just use struct clk_hw throughout
> > the provider APIs? The only op that isn't doing this is determine_rate()
> > which might be able to accept a flag day. Otherwise we rename it to
> > something else and migrate everyone over to a different named function
> > that doesn't take a struct clk **. Then we introduce new APIs for the
> > providers to use that are struct clk_hw focused instead of struct clk
> > focused and migrate them too. The benefit being that we get proper
> > review of this stuff because the patches are small. We can let
> > coccinelle do it too.

I have been opposed to mucking with clk_hw before, but that was because
the goal was never clear. Now that we know that we're trying to split
the API then it might be reasonable to use it.

Stephen, does your above proposal still allow for unique struct clk
cookies for each user of a clock?

Regards,
Mike

> 
> Really, the bulk of it is just a plain function rename that is in its
> own commit. And we have a simple script that can be reviewed. And it has
> passed the kbuild tests. Given that, and if we don't focus on just the
> line count of that patch, I don't see your proposal as being
> significantly better.
> 
> >> A less important feature of the patchset are per-user clocks, which (if
> >> I understand correctly) your proposal would address without requiring a
> >> flag day.
> > 
> > The subject of this thread seems to imply that it's the most important
> > part. I'm confused.
> 
> Sorry about that, I agree that it could be improved.
> 
> >> And then we have clock constraints, which is probably the least
> >> important feature in the grand scheme of things, but it's actually what
> >> I personally care about.
> >>
> >> If we wanted to add a way for clk users to specify clock constraints
> >> without any refactoring, we could easily do so by reusing the request
> >> pattern that pm_qos uses:
> >>
> >> void clk_add_constraint(struct clk_request *req,
> >> int constraint_type,
> >> unsigned long value);
> >>
> >> void clk_update_constraint(struct clk_request *req,
> >>unsigned long new_value);
> >>
> >> void clk_remove_constraint(struct clk_request *req);
> >>
> >> It wouldn't be that bad IMO, but the API refactoring was something that
> >> was long desired and this was seen as a good opportunity to tackle it
> >> before it gets worst.
> >>
> > 
> > Sure. Maybe we should just do that so we don't break things.
> 
> That hasn't been what Mike has stated in the past, but I actually have
> such a patch somewhere that I can dust off and send for review.
> 
> Cheers,
> 
> Tomeu
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] jffs2: fix sparse warning: unexpected unlock

2014-09-26 Thread josh
On Mon, Sep 22, 2014 at 11:12:50AM -0700, Brian Norris wrote:
> + linux-sparse
> 
> On Thu, Sep 18, 2014 at 08:46:16PM +0200, Fabian Frederick wrote:
> > fs/jffs2/summary.c:846:5: warning: context imbalance in 
> > 'jffs2_sum_write_sumnode' - unexpected unlock
> > 
> > Signed-off-by: Fabian Frederick 
> > ---
> >  fs/jffs2/summary.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/fs/jffs2/summary.c b/fs/jffs2/summary.c
> > index c522d09..a0bac7b 100644
> > --- a/fs/jffs2/summary.c
> > +++ b/fs/jffs2/summary.c
> > @@ -844,6 +844,8 @@ static int jffs2_sum_write_data(struct jffs2_sb_info 
> > *c, struct jffs2_eraseblock
> >  /* Write out summary information - called from jffs2_do_reserve_space */
> >  
> >  int jffs2_sum_write_sumnode(struct jffs2_sb_info *c)
> > +   __releases(>erase_completion_lock)
> > +   __acquires(>erase_completion_lock)
> 
> I'm not too familiar with sparse notations, but Documentation/sparse.txt
> suggests the above is wrong, and the following is more accurate:
> 
>   __must_hold(>erase_completion_lock)
> 
> But it looks like there are several other examples which do this.
> Anyway, here's the relevant doc text, in case someone wants to clarify
> it for me, or else tell me the documentation is wrong:
> 
> __must_hold - The specified lock is held on function entry and exit.
> 
> __acquires - The specified lock is held on function exit, but not entry.
> 
> __releases - The specified lock is held on function entry, but not exit.
> 
> So __acquires and __releases look mutually exclusive, but it's not clear
> if __must_hold will actually cover what we want. (I haven't tested it.)

__must_hold is indeed the correct annotation.  (There isn't currently
anything enforcing that, though.)

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: eBPF verifier thoughts (Re: [PATCH v15 net-next 00/11] eBPF syscall, verifier, testsuite)

2014-09-26 Thread Alexei Starovoitov
On Fri, Sep 26, 2014 at 3:41 PM, Andy Lutomirski  wrote:
> On Fri, Sep 26, 2014 at 3:26 PM, Alexei Starovoitov  wrote:
>> On Fri, Sep 26, 2014 at 3:07 PM, Andy Lutomirski  wrote:
>>> On Fri, Sep 26, 2014 at 3:03 PM, Alexei Starovoitov  
>>> wrote:
 On Fri, Sep 26, 2014 at 2:47 PM, Andy Lutomirski  
 wrote:
>
> Can't you just disallow the 1-byte write to the stack?

 of course not.
 That would be extremely limiting to users.
 Can you actually see yourself living with stack that only
 allows 8-byte writes/reads?
 The stack usage will increase a lot, since all char/short
 stack variables will become 8-byte...
>>>
>>> How about requiring that sub-8-byte stack accesses only be to integer slots?
>>
>> you mean to reject the sub-8-byte write early if it's going
>> into space where pointers were stored?
>> That will limit stack reuse.
>> gcc/llvm generate code where the same stack location
>> is used by different variables during life of the function.
>> So if I reject the write early, it will break otherwise valid
>> programs.
>
> I think that a sub-8-byte write to an integer slot should leave it as
> an integer and a sub-8-byte write to a non-integer slot should turn
> that slot into an integer (if conversions to integer are permitted) or
> be rejected otherwise.  gcc/llvm could emit an 8-byte write first, as
> needed, to make this valid.

I don't think the above will work.

> Alternatively, an integer stack slot could have a bitmask indicating
> which bytes are valid.

but this one might. Let me think about it more.
Note verifier, as it stands, is using 12Kbyte of temporary
memory to track stack with byte granularity.
(it's freed as soon as verifier finishes)
This bitmask optimization, best case, will reduce it to 1.5Kbyte
at the cost of extra complexity. I'll play with this idea to
see whether this trade off is actually worth doing.
Also note that there are indirect stack reads
(see check_func_arg() -> check_stack_boundary())
used to verify that N bytes were initialized in stack
before calling into helper function.
Also in the future I was planning to allow
indirect stack write, so that helper function can
store stuff into program stack. This indirect accesses
are crossing 8-byte boundaries, so would need special
care with bitmask optimization.
We need to carefully weight all pros and cons.
imo this memory usage during verification is not a big deal,
but of course we should not waste it. I'll see what can be done.

In any case, we're talking about incremental improvements
on top of current stuff, right?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] PCI: Intel 10G NIC ACS quirks

2014-09-26 Thread Alex Williamson
Intel has verified there is no peer-to-peer between functions for the
below selection of 82598, 82599, and X520 10G NICs.  These NICs lack
an ACS capability, so we're not able to determine this isolation
without the help of quirks.  Generalize the Solarflare quirk and add
these.

Signed-off-by: Alex Williamson 
Cc: John Ronciak 
---

 drivers/pci/quirks.c |   34 --
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 95239e0..b6c6500 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3635,14 +3635,16 @@ static int pci_quirk_intel_pch_acs(struct pci_dev *dev, 
u16 acs_flags)
return acs_flags & ~flags ? 0 : 1;
 }
 
-static int pci_quirk_solarflare_acs(struct pci_dev *dev, u16 acs_flags)
+static int pci_quirk_mf_endpoint_acs(struct pci_dev *dev, u16 acs_flags)
 {
/*
 * SV, TB, and UF are not relevant to multifunction endpoints.
 *
-* Solarflare indicates that peer-to-peer between functions is not
-* possible, therefore RR, CR, and DT are not implemented.  Mask
-* these out as if they were clear in the ACS capabilities register.
+* Multifunction devices are only required to implement RR, CR, and DT
+* in their ACS capability if they support peer-to-peer transactions.
+* Devices matching this quirk have been verified by the vendor to not
+* perform peer-to-peer with other functions, allowing us to mask out
+* these bits as if they were unimplemented in the ACS capability.
 */
acs_flags &= ~(PCI_ACS_SV | PCI_ACS_TB | PCI_ACS_RR |
   PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_DT);
@@ -3661,8 +3663,28 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_ATI, 0x439d, pci_quirk_amd_sb_acs },
{ PCI_VENDOR_ID_ATI, 0x4384, pci_quirk_amd_sb_acs },
{ PCI_VENDOR_ID_ATI, 0x4399, pci_quirk_amd_sb_acs },
-   { PCI_VENDOR_ID_SOLARFLARE, 0x0903, pci_quirk_solarflare_acs },
-   { PCI_VENDOR_ID_SOLARFLARE, 0x0923, pci_quirk_solarflare_acs },
+   { PCI_VENDOR_ID_SOLARFLARE, 0x0903, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_SOLARFLARE, 0x0923, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10C6, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10DB, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10DD, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10E1, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10F1, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10F7, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10F8, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10F9, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10FA, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10FB, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x10FC, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x1507, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x1514, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x151C, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x1529, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x152A, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x154D, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x154F, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x1551, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x1558, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_intel_pch_acs },
{ 0 }
 };

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [OPW kernel] checkpatch: Does anyone care that comments are freeform aligned?

2014-09-26 Thread josh
On Wed, Sep 24, 2014 at 11:41:14AM -0700, Joe Perches wrote:
> On Wed, 2014-09-24 at 07:47 +0200, Julia Lawall wrote:
> > In the following patch extract, one line is indented with spaces rather 
> > than tabs.  Is it intentional that checkpatch doesn't complain about this, 
> > I guess due to the line being a comment?
> []
> >  struct bcm_hdr_suppression_contextinfo {
> > -   UCHAR ucaHdrSuppressionInBuf[MAX_PHS_LENGTHS]; /* Intermediate buffer 
> > to accumulate pkt Header for PHS */
> > -   UCHAR ucaHdrSuppressionOutBuf[MAX_PHS_LENGTHS + PHSI_LEN]; /* 
> > Intermediate buffer containing pkt Header after PHS */
> > +/* Intermediate buffer to accumulate pkt Header for PHS */
> > +   UCHAR ucaHdrSuppressionInBuf[MAX_PHS_LENGTHS];
> > +   /* Intermediate buffer containing pkt Header after PHS */
> > +   UCHAR ucaHdrSuppressionOutBuf[MAX_PHS_LENGTHS + PHSI_LEN];
> >  };
> 
> checkpatch does not care when comments start in any
> particular position or ensure comments have tabs
> preceding them.
> 
> Does anyone care?

This should have already been caught by other whitespace checks that
check for indentations of 8 or more spaces.  Similarly, mixed tab/space
indentations would get caught by those same checks.  I don't think
checkpatch needs to check those.

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RESEND PATCH] acpi-cpufreq: get the cur_freq from acpi_processor_performance states

2014-09-26 Thread Rafael J. Wysocki
On Thursday, August 21, 2014 01:55:15 PM Wang Weidong wrote:
> As the initialized freq_tables maybe different from the p-states
> values, so the array index is different as well.
> 
> p-states value: [2400 2400 2000 ...], while the freq_tables:
> [2400 2000 ... CPUFREQ_TABLE_END]. After setted the freqs 2000,
> the perf->state is 3 while the freqs_table's index should be 2.
> So when call the get_cur_freq_on_cpu, the freqs value we get
> is 2400.
> 
> So, fix the problem with the correct tables.

What you're saying is basically that freq_table and perf->states
diverge at one point.  Shouldn't we re-generate freq_table in that
case instead of fixing up get_cur_freq_on_cpu() only in a quite
indirect way? 

> Signed-off-by: Wang Weidong 
> ---
>  drivers/cpufreq/acpi-cpufreq.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> index b0c18ed..ac93885 100644
> --- a/drivers/cpufreq/acpi-cpufreq.c
> +++ b/drivers/cpufreq/acpi-cpufreq.c
> @@ -365,6 +365,7 @@ static u32 get_cur_val(const struct cpumask *mask)
>  static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
>  {
>   struct acpi_cpufreq_data *data = per_cpu(acfreq_data, cpu);
> + struct acpi_processor_performance *perf;
>   unsigned int freq;
>   unsigned int cached_freq;
>  
> @@ -375,7 +376,8 @@ static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
>   return 0;
>   }
>  
> - cached_freq = data->freq_table[data->acpi_data->state].frequency;
> + perf = data->acpi_data;
> + cached_freq = perf->states[perf->state].core_frequency * 1000;
>   freq = extract_freq(get_cur_val(cpumask_of(cpu)), data);
>   if (freq != cached_freq) {
>   /*
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v12 08/12] PCI: OF: Introduce helper function for retrieving PCI domain numbers

2014-09-26 Thread Liviu Dudau
On Fri, Sep 26, 2014 at 10:53:26PM +0100, Rob Herring wrote:
> On Fri, Sep 26, 2014 at 4:20 PM,   wrote:
> > On Fri, Sep 26, 2014 at 01:20:39PM -0500, Rob Herring wrote:
> >> On Tue, Sep 23, 2014 at 2:01 PM, Liviu Dudau  wrote:
> >> > Add pci_get_new_domain_nr() to allocate a new domain number
> >> > and of_get_pci_domain_nr() to retrieve the PCI domain number
> >> > of a given device from DT. Host bridge drivers or architecture
> >> > specific code can choose to implement their PCI domain number
> >> > policy using these two functions.
> >> >
> >> > Using of_get_pci_domain_nr() guarantees a stable PCI domain
> >> > number on every boot provided that all host bridge controllers
> >> > are assigned a number in the device tree using "linux,pci-domain"
> >> > property. Mix use of pci_get_new_domain_nr() and of_get_pci_domain_nr()
> >> > is not recommended as it can lead to potentially conflicting
> >> > domain numbers being assigned to root busses behind different
> >> > host bridges.
> >> >
> >> > Cc: Bjorn Helgaas 
> >> > Cc: Arnd Bergmann 
> >> > Cc: Grant Likely 
> >> > Cc: Rob Herring 
> >> > Cc: Catalin Marinas 
> >> > Signed-off-by: Liviu Dudau 
> >>
> >> Looks pretty good, but a couple of comments that can be addressed as
> >> follow-up patches.
> >>
> >> > ---
> >> >  drivers/of/of_pci.c| 25 +
> >> >  drivers/pci/pci.c  |  9 +
> >> >  include/linux/of_pci.h |  7 +++
> >> >  include/linux/pci.h|  3 +++
> >> >  4 files changed, 44 insertions(+)
> >> >
> >> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> >> > index 8481996..82d172f 100644
> >> > --- a/drivers/of/of_pci.c
> >> > +++ b/drivers/of/of_pci.c
> >> > @@ -89,6 +89,31 @@ int of_pci_parse_bus_range(struct device_node *node, 
> >> > struct resource *res)
> >> >  }
> >> >  EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
> >> >
> >> > +/**
> >> > + * This function will try to obtain the host bridge domain number by
> >> > + * finding a property called "linux,pci-domain" of the given device 
> >> > node.
> >> > + *
> >> > + * @node: device tree node with the domain information
> >> > + *
> >> > + * Returns the associated domain number from DT in the range 
> >> > [0-0x], or
> >> > + * a negative value if the required property is not found.
> >> > + */
> >> > +int of_get_pci_domain_nr(struct device_node *node)
> >> > +{
> >> > +   const __be32 *value;
> >> > +   int len;
> >> > +   u16 domain;
> >> > +
> >> > +   value = of_get_property(node, "linux,pci-domain", );
> >>
> >> This needs to be documented.
> >
> > I would say needs agreed on *and* documented.
> 
> I thought Bjorn already applied the series. Who's agreement are you
> waiting for specifically.

Looks like he only applied about 60% of the series.

I'm not waiting on anyones specifig agreement, just general agreement.
Jason Gunthrope have initially suggested using pci-domain alias which
you don't like, now I'm using "linux,pci-domain" property which you've
ACKed, I wonder if I need to give people more time to catch up to the
fact?

> 
> >>
> >> > +   if (!value || len < sizeof(*value))
> >> > +   return -EINVAL;
> >> > +
> >> > +   domain = (u16)be32_to_cpup(value);
> >>
> >> of_property_read_u32 would not work here?
> >
> > It should, I guess, but I want to limit the domain number range to 16 bits.
> 
> Okay, but it would still be fewer lines to use of_property_read_u32
> and then cast it. You could use the u16 variant and make the property
> be defined to be 16-bit in the documentation.

True. Will do.

Best regards,
Liviu

> 
> Rob
> 

-- 

| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---
¯\_(ツ)_/¯

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: eBPF verifier thoughts (Re: [PATCH v15 net-next 00/11] eBPF syscall, verifier, testsuite)

2014-09-26 Thread Andy Lutomirski
On Fri, Sep 26, 2014 at 3:26 PM, Alexei Starovoitov  wrote:
> On Fri, Sep 26, 2014 at 3:07 PM, Andy Lutomirski  wrote:
>> On Fri, Sep 26, 2014 at 3:03 PM, Alexei Starovoitov  
>> wrote:
>>> On Fri, Sep 26, 2014 at 2:47 PM, Andy Lutomirski  
>>> wrote:

 Can't you just disallow the 1-byte write to the stack?
>>>
>>> of course not.
>>> That would be extremely limiting to users.
>>> Can you actually see yourself living with stack that only
>>> allows 8-byte writes/reads?
>>> The stack usage will increase a lot, since all char/short
>>> stack variables will become 8-byte...
>>
>> How about requiring that sub-8-byte stack accesses only be to integer slots?
>
> you mean to reject the sub-8-byte write early if it's going
> into space where pointers were stored?
> That will limit stack reuse.
> gcc/llvm generate code where the same stack location
> is used by different variables during life of the function.
> So if I reject the write early, it will break otherwise valid
> programs.

I think that a sub-8-byte write to an integer slot should leave it as
an integer and a sub-8-byte write to a non-integer slot should turn
that slot into an integer (if conversions to integer are permitted) or
be rejected otherwise.  gcc/llvm could emit an 8-byte write first, as
needed, to make this valid.

Alternatively, an integer stack slot could have a bitmask indicating
which bytes are valid.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: eBPF verifier thoughts (Re: [PATCH v15 net-next 00/11] eBPF syscall, verifier, testsuite)

2014-09-26 Thread Alexei Starovoitov
On Fri, Sep 26, 2014 at 3:07 PM, Andy Lutomirski  wrote:
> On Fri, Sep 26, 2014 at 3:03 PM, Alexei Starovoitov  wrote:
>> On Fri, Sep 26, 2014 at 2:47 PM, Andy Lutomirski  wrote:
>>>
>>> Can't you just disallow the 1-byte write to the stack?
>>
>> of course not.
>> That would be extremely limiting to users.
>> Can you actually see yourself living with stack that only
>> allows 8-byte writes/reads?
>> The stack usage will increase a lot, since all char/short
>> stack variables will become 8-byte...
>
> How about requiring that sub-8-byte stack accesses only be to integer slots?

you mean to reject the sub-8-byte write early if it's going
into space where pointers were stored?
That will limit stack reuse.
gcc/llvm generate code where the same stack location
is used by different variables during life of the function.
So if I reject the write early, it will break otherwise valid
programs.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: sleeping function called from invalid context at drivers/cpufreq/cpufreq.c:370

2014-09-26 Thread Rafael J. Wysocki
On Friday, September 26, 2014 09:54:00 AM Mike Galbraith wrote:
> On Fri, 2014-09-26 at 08:24 +0200, Mike Galbraith wrote: 
> > (the hazards of multitasking.. post escaped early, and went to mostly
> > the wrong folks)
> > 
> > 
> > While testing some scheduler patches, the below pcc-cpufreq
> > might_sleep() gripe fell out.
> 
> Because the bits below from 8fec051e didn't make the lock go away first.
> Reverting only pcc-cpufreq back to notifier.. works.
> 
> - drivers/cpufreq/integrator-cpufreq.c 
> -

Are you sure this is the right file?

Shouldn't that be pcc-cpufreq.c rather?

Also moving the spin_lock(_lock) after the cpufreq_freq_transition_begin()
should fix the problem too (like the below).  Have you tried that?


---
 drivers/cpufreq/pcc-cpufreq.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-pm/drivers/cpufreq/pcc-cpufreq.c
===
--- linux-pm.orig/drivers/cpufreq/pcc-cpufreq.c
+++ linux-pm/drivers/cpufreq/pcc-cpufreq.c
@@ -204,7 +204,6 @@ static int pcc_cpufreq_target(struct cpu
u32 input_buffer;
int cpu;
 
-   spin_lock(_lock);
cpu = policy->cpu;
pcc_cpu_data = per_cpu_ptr(pcc_cpu_info, cpu);
 
@@ -216,6 +215,7 @@ static int pcc_cpufreq_target(struct cpu
freqs.old = policy->cur;
freqs.new = target_freq;
cpufreq_freq_transition_begin(policy, );
+   spin_lock(_lock);
 
input_buffer = 0x1 | (((target_freq * 100)
   / (ioread32(_hdr->nominal) * 1000)) << 8);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] x86 : Ensure X86_FLAGS_NT is cleared on syscall entry

2014-09-26 Thread Anish Bhatt
> -Original Message-
> From: Chuck Ebbert [mailto:cebbert.l...@gmail.com]
> Sent: Friday, September 26, 2014 3:01 PM
> To: Anish Bhatt
> Cc: linux-kernel@vger.kernel.org; x...@kernel.org; t...@linutronix.de;
> mi...@redhat.com; h...@zytor.com; sebast...@fds-team.de; Linus
> Torvalds
> Subject: Re: [PATCH] x86 : Ensure X86_FLAGS_NT is cleared on syscall entry
> 
> On Thu, 25 Sep 2014 12:42:51 -0700
> Anish Bhatt  wrote:
> 
> > The MSR_SYSCALL_MASK, which is responsible for clearing specific
> > EFLAGS on  syscall entry, should also clear the nested task (NT) flag
> > to be safe from  userspace injection. Without this fix the application
> > segmentation  faults on syscall return because of the changed meaning
> > of the IRET  instruction.
> >
> > Further details can be seen here
> > https://bugs.winehq.org/show_bug.cgi?id=33275
> >
> > Signed-off-by: Anish Bhatt 
> > Signed-off-by: Sebastian Lackner 
> > ---
> >  arch/x86/kernel/cpu/common.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/cpu/common.c
> > b/arch/x86/kernel/cpu/common.c index e4ab2b4..3126558 100644
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -1184,7 +1184,7 @@ void syscall_init(void)
> > /* Flags to clear on syscall */
> > wrmsrl(MSR_SYSCALL_MASK,
> >X86_EFLAGS_TF|X86_EFLAGS_DF|X86_EFLAGS_IF|
> > -  X86_EFLAGS_IOPL|X86_EFLAGS_AC);
> > +  X86_EFLAGS_IOPL|X86_EFLAGS_AC|X86_EFLAGS_NT);
> >  }
> >
> >  /*
> 
> I don't get it. Why isn't this patch acceptable, at least on x86-64 where NT 
> is
> never valid?
> 
> Bueller?

Chuck, I don't think anyone has gotten around to reviewing, accepting or 
 rejecting this patch yet.
-Anish
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/1] [RFC] kconfig: warn if an unknown symbol is selected

2014-09-26 Thread Paul Bolle
A select of an unknown symbol is basically treated as a nop and is
silently skipped. This is annoying if the selected symbol contains a
typo. It can also hide the fact that a treewide symbol cleanup was only
done partially.

There are also a few cases were this might have been done on purpose.
But that anti-pattern should be discouraged. Almost all select
statements point to a known and reachable symbol. So people will likely
assume that any selected symbol is actually set. Violating that
assumption might lead to (subtle) bugs.

So let's warn when we notice a select of a symbol that is not known in
the configuration we're creating.

Rfced-by: Paul Bolle 
---
 scripts/kconfig/conf.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/scripts/kconfig/conf.c b/scripts/kconfig/conf.c
index fef75fc756f4..de8406287531 100644
--- a/scripts/kconfig/conf.c
+++ b/scripts/kconfig/conf.c
@@ -446,6 +446,45 @@ static void check_conf(struct menu *menu)
check_conf(child);
 }
 
+static void check_selects(struct menu *menu)
+{
+   struct symbol *sym, *sel;
+   struct property *prop;
+
+   while (menu) {
+   sym = menu->sym;
+
+   if (sym && !sym_is_choice(sym) &&
+   sym->type != S_UNKNOWN &&
+   sym->flags & SYMBOL_WRITE) {
+   for_all_properties(sym, prop, P_SELECT) {
+   sel = prop->expr->left.sym;
+   if (sel->type == S_UNKNOWN &&
+   expr_calc_value(prop->visible.expr) != no) {
+   fprintf(stderr, "%s:%d:warning: ",
+   prop->file->name,
+   prop->lineno);
+   fprintf(stderr,
+   "'%s' selects unknown symbol 
'%s'\n",
+   sym->name,
+   sel->name);
+   }
+   }
+   }
+
+   if (menu->list) {
+   menu = menu->list;
+   } else if (menu->next) {
+   menu = menu->next;
+   } else while ((menu = menu->parent)) {
+   if (menu->next) {
+   menu = menu->next;
+   break;
+   }
+   }
+   }
+}
+
 static struct option long_opts[] = {
{"oldaskconfig",no_argument,   NULL, oldaskconfig},
{"oldconfig",   no_argument,   NULL, oldconfig},
@@ -681,6 +720,8 @@ int main(int ac, char **av)
break;
}
 
+   check_selects(rootmenu.list);
+
if (sync_kconfig) {
/* silentoldconfig is used during the build so we shall update 
autoconf.
 * All other commands are only used to generate a config.
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/1] [RFC] kconfig: warn if an unknown symbol is selected

2014-09-26 Thread Paul Bolle
This is an attempt to make a recurring kconfig issue more visible:
select statements for unknown, or unreachable, Kconfig symbols. See,
I've submitted quite a few patches that remove statements like
select BOGUS

One of the annoying properties of kconfig is that select statements like
that are silently treated as a nop. That makes them surprisingly easy to
stay unnoticed. Some of them are caught by my local "800 line perl
monster", but not all. But perhaps silently skipping these statements is
actually a feature. I think it is not.

Anyhow, I've tweaked kconfig to warn about them. Then I ran kconfig on
all (500+) in tree defconfig files as of next-20140926. These are the
select statements warns about when doing that.

1) arch/arm/Kconfig:429:warning: 'ARCH_EFM32' selects unknown symbol 'NO_DMA'

Unknown on arm. Patch submitted.

2) arch/arm/mach-meson/Kconfig:11:warning: 'MACH_MESON6' selects unknown symbol 
'MESON6_TIMER'

Clearly bogus select. But Kconfig symbol MESON6_TIMER is apparently
queued.

3) arch/arm/mach-mvebu/Kconfig:40:warning: 'MACH_ARMADA_375' selects unknown 
symbol 'ARM_ERRATA_753970'
   arch/arm/mach-mvebu/Kconfig:55:warning: 'MACH_ARMADA_38X' selects unknown 
symbol 'ARM_ERRATA_753970'

Typo. Patch is being discussed for some time now.

4) arch/arm/mach-shmobile/Kconfig:39:warning: 'ARCH_SHMOBILE_MULTI' selects 
unknown symbol 'ARCH_HAS_OPP'

Unfinished Kconfig cleanup. Patches submitted and/or queued.

5) arch/hexagon/Kconfig:22:warning: 'HEXAGON' selects unknown symbol 
'NO_IOPORT_MAP'

Unknown on hexagon. Patch submitted.

6) arch/powerpc/platforms/44x/Kconfig:217:warning: 'AKEBONO' selects unknown 
symbol 'IBM_EMAC_RGMII_WOL'
   arch/powerpc/platforms/44x/Kconfig:223:warning: 'AKEBONO' selects unknown 
symbol 'MMC_SDHCI_OF_476GTR'

Clearly bogus selects. One symbol might be queued. Not sure why the
other select hasn't been removed.

7) arch/score/Kconfig:24:warning: 'ARCH_SCORE7' selects unknown symbol 
'SYS_SUPPORTS_32BIT_KERNEL'
   arch/score/Kconfig:28:warning: 'MACH_SPCT6600' selects unknown symbol 
'SYS_SUPPORTS_32BIT_KERNEL'
   arch/score/Kconfig:32:warning: 'SCORE_SIM' selects unknown symbol 
'SYS_SUPPORTS_32BIT_KERNEL'

Bogus. Submitted a patch long ago. Nothing happened.

8) drivers/acpi/Kconfig:165:warning: 'ACPI_PROCESSOR' selects unknown symbol 
'CPU_IDLE'

Unreachable for eg ia64. Perhaps fixed by:
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -162,7 +162,7 @@ config ACPI_DOCK
 config ACPI_PROCESSOR
tristate "Processor"
select THERMAL
-   select CPU_IDLE
+   select CPU_IDLE if (ARM || ARM64 || MIPS || PPC || X86 || SUPERH)
default y
help
  This driver installs ACPI as the idle handler for Linux and uses
 
9) drivers/irqchip/Kconfig:16:warning: 'ARM_GIC_V3' selects unknown symbol 
'MULTI_IRQ_HANDLER'
   drivers/irqchip/Kconfig:8:warning: 'ARM_GIC' selects unknown symbol 
'MULTI_IRQ_HANDLER'

MULTI_IRQ_HANDLER is arm specific. So arm64 builds trigger this.
Suggested fix:
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -5,7 +5,7 @@ config IRQCHIP
 config ARM_GIC
bool
select IRQ_DOMAIN
-   select MULTI_IRQ_HANDLER
+   select MULTI_IRQ_HANDLER if ARM
 
 config GIC_NON_BANKED
bool
@@ -13,7 +13,6 @@ config GIC_NON_BANKED
 config ARM_GIC_V3
bool
select IRQ_DOMAIN
-   select MULTI_IRQ_HANDLER
 
 config ARM_NVIC
bool

10) lib/Kconfig.debug:1008:warning: 'DEBUG_ATOMIC_SLEEP' selects unknown symbol 
'PREEMPT_COUNT'

Not all architectures have PREEMPT_COUNT. Maybe fixed by:
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1005,7 +1005,7 @@ config DEBUG_LOCKDEP
 
 config DEBUG_ATOMIC_SLEEP
bool "Sleep inside atomic section checking"
-   select PREEMPT_COUNT
+   select PREEMPT_COUNT if !(ALPHA || FRV || HEXAGON || UML)
depends on DEBUG_KERNEL
help
  If you say Y here, various routines which may sleep will become very

Not sure, however.

11) sound/pci/Kconfig:158:warning: 'SND_AZT3328' selects unknown symbol 
'ZONE_DMA'
sound/pci/Kconfig:28:warning: 'SND_ALS300' selects unknown symbol 'ZONE_DMA'
sound/pci/Kconfig:466:warning: 'SND_EMU10K1' selects unknown symbol 
'ZONE_DMA'
sound/pci/Kconfig:482:warning: 'SND_EMU10K1X' selects unknown symbol 
'ZONE_DMA'
sound/pci/Kconfig:516:warning: 'SND_ES1938' selects unknown symbol 
'ZONE_DMA'
sound/pci/Kconfig:528:warning: 'SND_ES1968' selects unknown symbol 
'ZONE_DMA'
sound/pci/Kconfig:53:warning: 'SND_ALI5451' selects unknown symbol 
'ZONE_DMA'
sound/pci/Kconfig:615:warning: 'SND_ICE1712' selects unknown symbol 
'ZONE_DMA'
sound/pci/Kconfig:703:warning: 'SND_MAESTRO3' selects unknown symbol 
'ZONE_DMA'
sound/pci/Kconfig:812:warning: 'SND_SONICVIBES' selects unknown symbol 
'ZONE_DMA'
sound/pci/Kconfig:824:warning: 'SND_TRIDENT' selects unknown symbol 
'ZONE_DMA'

Tricky. See commit 80ab8eae70e5 

Re: eBPF verifier thoughts (Re: [PATCH v15 net-next 00/11] eBPF syscall, verifier, testsuite)

2014-09-26 Thread Andy Lutomirski
On Fri, Sep 26, 2014 at 3:03 PM, Alexei Starovoitov  wrote:
> On Fri, Sep 26, 2014 at 2:47 PM, Andy Lutomirski  wrote:
>>
>> Can't you just disallow the 1-byte write to the stack?
>
> of course not.
> That would be extremely limiting to users.
> Can you actually see yourself living with stack that only
> allows 8-byte writes/reads?
> The stack usage will increase a lot, since all char/short
> stack variables will become 8-byte...

How about requiring that sub-8-byte stack accesses only be to integer slots?

>
>> I'll try it in Python.  I bet I can get it to be shorter than the current 
>> code.
>
> Awesome challenge! :)
> I'll buy you a drink of your choice if you can achieve that.
> Also I'll send you our C programs that we use for
> testing to make sure, your python verifier can analyze them.
> If it passes, I'll be glad to rip mine out. Seriously.
> Deal?

Sure.

>
>> Yes, but does it work reliably?
>
> I'm not saying current verifier is perfect. It can get better.
> So far it was enough to let people code freely on top of it.



-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] libata-sff: Fix controllers with no ctl port

2014-09-26 Thread Ondrej Zary
Currently, ata_sff_softreset is skipped for controllers with no ctl port.
But that also skips ata_sff_dev_classify required for device detection.
This means that libata is currently broken on controllers with no ctl port.

No device connected:
[1.872480] pata_isapnp 01:01.02: activated
[1.889823] scsi2 : pata_isapnp
[1.890109] ata3: PATA max PIO0 cmd 0x1e8 ctl 0x0 irq 11
[6.888110] ata3.01: qc timeout (cmd 0xec)
[6.888179] ata3.01: failed to IDENTIFY (I/O error, err_mask=0x5)
[   16.888085] ata3.01: qc timeout (cmd 0xec)
[   16.888147] ata3.01: failed to IDENTIFY (I/O error, err_mask=0x5)
[   46.888086] ata3.01: qc timeout (cmd 0xec)
[   46.888148] ata3.01: failed to IDENTIFY (I/O error, err_mask=0x5)
[   51.888100] ata3.00: qc timeout (cmd 0xec)
[   51.888160] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[   61.888079] ata3.00: qc timeout (cmd 0xec)
[   61.888141] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[   91.888089] ata3.00: qc timeout (cmd 0xec)
[   91.888152] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5)

ATAPI device connected:
[1.882061] pata_isapnp 01:01.02: activated
[1.893430] scsi2 : pata_isapnp
[1.893719] ata3: PATA max PIO0 cmd 0x1e8 ctl 0x0 irq 11
[6.892107] ata3.01: qc timeout (cmd 0xec)
[6.892171] ata3.01: failed to IDENTIFY (I/O error, err_mask=0x5)
[   16.892079] ata3.01: qc timeout (cmd 0xec)
[   16.892138] ata3.01: failed to IDENTIFY (I/O error, err_mask=0x5)
[   46.892079] ata3.01: qc timeout (cmd 0xec)
[   46.892138] ata3.01: failed to IDENTIFY (I/O error, err_mask=0x5)
[   46.908586] ata3.00: ATAPI: ACER CD-767E/O, V1.5X, max PIO2, CDB intr
[   46.924570] ata3.00: configured for PIO0 (device error ignored)
[   46.926295] scsi 2:0:0:0: CD-ROMACER CD-767E/O1.5X 
PQ: 0 ANSI: 5
[   46.984519] sr0: scsi3-mmc drive: 6x/6x xa/form2 tray
[   46.984592] cdrom: Uniform CD-ROM driver Revision: 3.20

So don't skip ata_sff_softreset, just skip the reset part of ata_bus_softreset
if the ctl port is not available.

This makes IDE port on ES968 behave correctly:

No device connected:
[4.670888] pata_isapnp 01:01.02: activated
[4.673207] scsi host2: pata_isapnp
[4.673675] ata3: PATA max PIO0 cmd 0x1e8 ctl 0x0 irq 11
[7.081840] Adding 2541652k swap on /dev/sda2.  Priority:-1 extents:1 
across:2541652k

ATAPI device connected:
[4.704362] pata_isapnp 01:01.02: activated
[4.706620] scsi host2: pata_isapnp
[4.706877] ata3: PATA max PIO0 cmd 0x1e8 ctl 0x0 irq 11
[4.872782] ata3.00: ATAPI: ACER CD-767E/O, V1.5X, max PIO2, CDB intr
[4.888673] ata3.00: configured for PIO0 (device error ignored)
[4.893984] scsi 2:0:0:0: CD-ROMACER CD-767E/O1.5X 
PQ: 0 ANSI: 5
[7.015578] Adding 2541652k swap on /dev/sda2.  Priority:-1 extents:1 
across:2541652k

Signed-off-by: Ondrej Zary 
---
 drivers/ata/libata-sff.c |   20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c
index 1121153..db90aa3 100644
--- a/drivers/ata/libata-sff.c
+++ b/drivers/ata/libata-sff.c
@@ -2008,13 +2008,15 @@ static int ata_bus_softreset(struct ata_port *ap, 
unsigned int devmask,
 
DPRINTK("ata%u: bus reset via SRST\n", ap->print_id);
 
-   /* software reset.  causes dev0 to be selected */
-   iowrite8(ap->ctl, ioaddr->ctl_addr);
-   udelay(20); /* FIXME: flush */
-   iowrite8(ap->ctl | ATA_SRST, ioaddr->ctl_addr);
-   udelay(20); /* FIXME: flush */
-   iowrite8(ap->ctl, ioaddr->ctl_addr);
-   ap->last_ctl = ap->ctl;
+   if (ap->ioaddr.ctl_addr) {
+   /* software reset.  causes dev0 to be selected */
+   iowrite8(ap->ctl, ioaddr->ctl_addr);
+   udelay(20); /* FIXME: flush */
+   iowrite8(ap->ctl | ATA_SRST, ioaddr->ctl_addr);
+   udelay(20); /* FIXME: flush */
+   iowrite8(ap->ctl, ioaddr->ctl_addr);
+   ap->last_ctl = ap->ctl;
+   }
 
/* wait the port to become ready */
return ata_sff_wait_after_reset(>link, devmask, deadline);
@@ -2215,10 +2217,6 @@ void ata_sff_error_handler(struct ata_port *ap)
 
spin_unlock_irqrestore(ap->lock, flags);
 
-   /* ignore ata_sff_softreset if ctl isn't accessible */
-   if (softreset == ata_sff_softreset && !ap->ioaddr.ctl_addr)
-   softreset = NULL;
-
/* ignore built-in hardresets if SCR access is not available */
if ((hardreset == sata_std_hardreset ||
 hardreset == sata_sff_hardreset) && !sata_scr_valid(>link))
-- 
Ondrej Zary

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: eBPF verifier thoughts (Re: [PATCH v15 net-next 00/11] eBPF syscall, verifier, testsuite)

2014-09-26 Thread Alexei Starovoitov
On Fri, Sep 26, 2014 at 2:47 PM, Andy Lutomirski  wrote:
>
> Can't you just disallow the 1-byte write to the stack?

of course not.
That would be extremely limiting to users.
Can you actually see yourself living with stack that only
allows 8-byte writes/reads?
The stack usage will increase a lot, since all char/short
stack variables will become 8-byte...

> I'll try it in Python.  I bet I can get it to be shorter than the current 
> code.

Awesome challenge! :)
I'll buy you a drink of your choice if you can achieve that.
Also I'll send you our C programs that we use for
testing to make sure, your python verifier can analyze them.
If it passes, I'll be glad to rip mine out. Seriously.
Deal?

> Yes, but does it work reliably?

I'm not saying current verifier is perfect. It can get better.
So far it was enough to let people code freely on top of it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86 : Ensure X86_FLAGS_NT is cleared on syscall entry

2014-09-26 Thread Chuck Ebbert
On Thu, 25 Sep 2014 12:42:51 -0700
Anish Bhatt  wrote:

> The MSR_SYSCALL_MASK, which is responsible for clearing specific EFLAGS on
>  syscall entry, should also clear the nested task (NT) flag to be safe from
>  userspace injection. Without this fix the application segmentation
>  faults on syscall return because of the changed meaning of the IRET
>  instruction.
> 
> Further details can be seen here https://bugs.winehq.org/show_bug.cgi?id=33275
> 
> Signed-off-by: Anish Bhatt 
> Signed-off-by: Sebastian Lackner 
> ---
>  arch/x86/kernel/cpu/common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index e4ab2b4..3126558 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1184,7 +1184,7 @@ void syscall_init(void)
>   /* Flags to clear on syscall */
>   wrmsrl(MSR_SYSCALL_MASK,
>  X86_EFLAGS_TF|X86_EFLAGS_DF|X86_EFLAGS_IF|
> -X86_EFLAGS_IOPL|X86_EFLAGS_AC);
> +X86_EFLAGS_IOPL|X86_EFLAGS_AC|X86_EFLAGS_NT);
>  }
>  
>  /*

I don't get it. Why isn't this patch acceptable, at least on x86-64
where NT is never valid?

Bueller?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 2/5] driver-core: enable drivers to opt-out of async probe

2014-09-26 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

We'll soon add generic support for asynchronous probe, before
that gets merged lets let drivers annotate if they should never
probe asynchronously.

Cc: Tejun Heo 
Cc: Arjan van de Ven 
Cc: Greg Kroah-Hartman 
Cc: Doug Thompson 
Cc: Borislav Petkov 
Cc: Mauro Carvalho Chehab 
Cc: linux-e...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 include/linux/device.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/linux/device.h b/include/linux/device.h
index 43d183a..4de6328 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -200,6 +200,10 @@ extern struct klist *bus_get_device_klist(struct bus_type 
*bus);
  * @owner: The module owner.
  * @mod_name:  Used for built-in modules.
  * @suppress_bind_attrs: Disables bind/unbind via sysfs.
+ * @sync_probe: requests probe to be run always be run synchronously even
+ * if userspace asked us to run asynchronously. Some devices drivers
+ * may be known to not work well with async probe, use this to annotate
+ * your driver if you know it needs synchronous probe.
  * @of_match_table: The open firmware table.
  * @acpi_match_table: The ACPI match table.
  * @probe: Called to query the existence of a specific device,
@@ -233,6 +237,7 @@ struct device_driver {
const char  *mod_name;  /* used for built-in modules */
 
bool suppress_bind_attrs;   /* disables bind/unbind via sysfs */
+   bool sync_probe;
 
const struct of_device_id   *of_match_table;
const struct acpi_device_id *acpi_match_table;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 3/5] amd64_edac: enforce synchronous probe

2014-09-26 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

While testing asynchronous PCI probe on this driver I noticed
it failed so enforce just synchronouse probe for now. Asynchronous
probe is not used by default and requires userepace intervention.
Patches for its support will be merged later.

[   14.411083] AMD64 EDAC driver v3.4.0
[   14.411156] bus: 'pci': probe for driver amd64_edac is run asynchronously
[   14.411308] really_probe: driver_sysfs_add(:00:18.2) failed
[   14.411354] [ cut here ]
[   14.411398] WARNING: CPU: 4 PID: 224 at fs/kernfs/dir.c:1220 
kernfs_remove_by_name_ns+0x83/0x90()
[   14.411446] kernfs: can not remove ':00:18.2', no directory
[   14.411484] Modules linked in: glue_helper(+) amd64_edac_mod(-) 
[   14.413654] CPU: 4 PID: 224 Comm: kworker/u16:3 Not tainted 3.17.0-rc6+ #1
[   14.413695] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./M5A97, BIOS 1605 10/25/2012
[   14.413809] Workqueue: events_unbound driver_attach_workfn
[   14.413909]  0009 814d2cf9 8803f0513d20 
81061972
[   14.414145]  
[   14.414254]  8803f0513d70
[   14.414383]  a032c068 0038
[   14.414507]   810619d7 81723238 
88030020
[   14.414701] Call Trace:
[   14.414746]  [] ? dump_stack+0x41/0x51
[   14.414790]  [] ? warn_slowpath_common+0x72/0x90
[   14.414834]  [] ? warn_slowpath_fmt+0x47/0x50
[   14.414880]  [] ? printk+0x4f/0x51
[   14.414921]  [] ? kernfs_remove_by_name_ns+0x83/0x90
[   14.415000]  [] ? driver_sysfs_remove+0x1d/0x40
[   14.415046]  [] ? driver_probe_device+0x1d5/0x250
[   14.415099]  [] ? __driver_attach+0x7b/0x80
[   14.415149]  [] ? __device_attach+0x40/0x40
[   14.415204]  [] ? bus_for_each_dev+0x53/0x90
[   14.415254]  [] ? driver_attach_workfn+0x13/0x80
[   14.415298]  [] ? process_one_work+0x143/0x3c0
[   14.415342]  [] ? worker_thread+0x114/0x480
[   14.415384]  [] ? rescuer_thread+0x2b0/0x2b0
[   14.415427]  [] ? kthread+0xc1/0xe0
[   14.415468]  [] ? kthread_create_on_node+0x170/0x170
[   14.415511]  [] ? ret_from_fork+0x7c/0xb0
[   14.415554]  [] ? kthread_create_on_node+0x170/0x170
[   14.415595] ---[ end trace dd11ab53a6e1ec0d ]---
[   14.415643] amd64_edac: probe of :00:18.2 failed with error 0

Cc: Tejun Heo 
Cc: Arjan van de Ven 
Cc: Greg Kroah-Hartman 
Cc: Doug Thompson 
Cc: Borislav Petkov 
Cc: Mauro Carvalho Chehab 
Cc: linux-e...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 drivers/edac/amd64_edac.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index f8bf000..dc997ae 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2872,6 +2872,7 @@ static struct pci_driver amd64_pci_driver = {
.probe  = probe_one_instance,
.remove = remove_one_instance,
.id_table   = amd64_pci_table,
+   .driver.sync_probe = true,
 };
 
 static void setup_pci_device(void)
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 4/5] driver-core: generalize freeing driver private member

2014-09-26 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This will be used later.

Cc: Tejun Heo 
Cc: Arjan van de Ven 
Cc: Greg Kroah-Hartman 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 drivers/base/bus.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 83e910a..a5f41e4 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -657,6 +657,15 @@ static ssize_t uevent_store(struct device_driver *drv, 
const char *buf,
 }
 static DRIVER_ATTR_WO(uevent);
 
+static void remove_driver_private(struct device_driver *drv)
+{
+   struct driver_private *priv = drv->p;
+
+   kobject_put(>kobj);
+   kfree(priv);
+   drv->p = NULL;
+}
+
 /**
  * bus_add_driver - Add a driver to the bus.
  * @drv: driver.
@@ -719,9 +728,7 @@ int bus_add_driver(struct device_driver *drv)
return 0;
 
 out_unregister:
-   kobject_put(>kobj);
-   kfree(drv->p);
-   drv->p = NULL;
+   remove_driver_private(drv);
 out_put_bus:
bus_put(bus);
return error;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 5/5] driver-core: add driver asynchronous probe support

2014-09-26 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Some init systems may wish to express the desire to have
device drivers run their device driver's bus probe() run
asynchronously. This implements support for this and
allows userspace to request async probe as a preference
through a generic shared device driver module parameter,
async_probe. Implemention for async probe is supported
through a module parameter given that since synchronous
probe has been prevalent for years some userspace might
exist which relies on the fact that the device driver will
probe synchronously and the assumption that devices it
provides will be immediately available after this.

Some device driver might not be able to run async probe
so we enable device drivers to annotate this to prevent
this module parameter from having any effect on them.

This implementation uses queue_work(system_unbound_wq)
to queue async probes, this should enable probe to run
slightly *faster* if the driver's probe path did not
have much interaction with other workqueues otherwise
it may run _slightly_ slower. Tests were done with cxgb4,
which is known to take long on probe, both without
having to run request_firmware() [0] and then by
requiring it to use request_firmware() [1]. The
difference in run time are only measurable in microseconds:

=|
strategyfw (usec)   no-fw (usec) |
-|
synchronous 244725691307563  |
kthread 25066415.5  1309868.5|
queue_work(system_unbound_wq)   24913661.5  1307631  |
-|

In practice, in seconds, the difference is barely noticeable:

=|
strategyfw (s)  no-fw (s)|
-|
synchronous 24.47   1.31 |
kthread 25.07   1.31 |
queue_work(system_unbound_wq)   24.91   1.31 |
-|

[0] 
http://ftp.suse.com/pub/people/mcgrof/async-probe/probe-cgxb4-no-firmware.png
[1] http://ftp.suse.com/pub/people/mcgrof/async-probe/probe-cgxb4-firmware.png

The rest of the commit log documents why this feature was implemented
primarily first for systemd and things it should consider next.

Systemd has a general timeout for all workers currently set to 180
seconds after which it will send a sigkill signal. Systemd now has a
warning which is issued once it reaches 1/3 of the timeout. The original
motivation for the systemd timeout was to help track device drivers
which do not use asynch firmware loading on init() and the timeout was
originally set to 30 seconds.

Since systemd + kernel are heavily tied in for the purposes of this
patch it is assumed you have merged on systemd the following
commits:

671174136525ddf208cdbe75d6d6bd159afa961fudev: timeout - warn after a 
third of the timeout before killing
b5338a19864ac3f5632aee48069a669479621dcaudev: timeout - increase timeout
2e92633dbae52f5ac9b7b2e068935990d475d2cdudev: bump event timeout to 60 
seconds
be2ea723b1d023b3d385d3b791ee4607cbfb20caudev: remove userspace firmware 
loading support
9f20a8a376f924c8eb5423cfc1f98644fc1e2d1audev: fixup commit
dd5eddd28a74a49607a8fffcaf960040dba98479udev: unify event timeout 
handling
9719859c07aa13539ed2cd4b31972cd30f678543udevd: add --event-timeout 
commandline option

Since we bundle together serially driver init() and probe()
on module initialiation systemd's imposed timeout  put a limit on the
amount of time a driver init() and probe routines can take. There's a
few overlooked issues with this and the timeout in general:

0) Not all drivers are killed, the signal is just sent and
   the kill will only be acted upoon if the driver you loaded
   happens to have some code path that either uses kthreads (which
   as of 786235ee are now killable), or uses some code which checks for
   fatal_signal_pending() on the kernel somewhere -- i.e: pci_read_vpd().

1) Since systemd is the only one logging the sigkill debugging that
   drivers are not loaded or in the worst case *failed to boot* because
   of a sigkill has proven hard to debug.

2) When and if the signal is received by the driver somehow
   the driver may fail at different points in its initialization
   and unless all error paths on the driver are implemented
   perfectly this could mean leaving a device in a half
   initialized state.

3) The timeout is penalizing device drivers that take long on
   probe(), this wasn't the original motivation. Systemd seems
   to have been under assumption 

[PATCH v1 0/5] driver-core: async probe support

2014-09-26 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

What started off as a witch hunt for a solution to the systemd
kernel-module-loading-timeout-and-it-killing-drivers-issue (TM)
ended up in our last discussions [0] agreement for us to add to the
kernel what systemd expected us to be doing: asynchronous probing.

This implements support for that, but please note that upon further
inspection of what the timeout is doing and implicating (documented on
patch number 5 in this series) its also recommended that systemd
revisit its timeout and perhaps consider simply not killing kmod
workers. Even if the killing goes away this framework can still be
desirable, and its expected this will be expanded to support
built-in drivers as well later.

This goes build tested with allmodconfig, and run time tested
with and without bus.safe_mod_async_probe=1 on 4 different systems.
As noted on patch #5 systemd should not use bus.safe_mod_async_probe=1
but instead issue a request per module so that in other cases where
its not advisable to use async probe (also documented on patch #5)
it can avoid it. We also enable a flag to let drivers specify if
they never want to enable async probe for whatever reason and an
example is provided on how to do this.

This should have 0 impact unless userspace is issuing the module
parameter for modules, or if any of the new module parameters are
used to help test things.

[0] https://lkml.org/lkml/2014/9/12/625

Luis R. Rodriguez (5):
  module: add extra argument for parse_params() callback
  driver-core: enable drivers to opt-out of async probe
  amd64_edac: enforce synchronous probe
  driver-core: generalize freeing driver private member
  driver-core: add driver asynchronous probe support

 arch/powerpc/mm/hugetlbpage.c |   4 +-
 drivers/base/base.h   |   6 ++
 drivers/base/bus.c| 150 --
 drivers/base/dd.c |   7 ++
 drivers/edac/amd64_edac.c |   1 +
 include/linux/device.h|   5 ++
 include/linux/module.h|   2 +
 include/linux/moduleparam.h   |   3 +-
 init/main.c   |  25 ---
 kernel/module.c   |  16 -
 kernel/params.c   |  11 ++--
 lib/dynamic_debug.c   |   4 +-
 12 files changed, 206 insertions(+), 28 deletions(-)

-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v1 1/5] module: add extra argument for parse_params() callback

2014-09-26 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This adds an extra argument onto parse_params() to be used
as a way to make the unused callback a bit more useful and
generic by allowing the caller to pass on a data structure
of its choice. An example use case is to allow us to easily
make module parameters for every module which we will do
next.

@ parse @
identifier name, args, params, num, level_min, level_max;
identifier unknown, param, val, doing;
type s16;
@@
 extern char *parse_args(const char *name,
 char *args,
 const struct kernel_param *params,
 unsigned num,
 s16 level_min,
 s16 level_max,
+void *arg,
 int (*unknown)(char *param, char *val,
const char *doing
+   , void *arg,
));

@ parse_mod @
identifier name, args, params, num, level_min, level_max;
identifier unknown, param, val, doing;
type s16;
@@
 char *parse_args(const char *name,
 char *args,
 const struct kernel_param *params,
 unsigned num,
 s16 level_min,
 s16 level_max,
+void *arg,
 int (*unknown)(char *param, char *val,
const char *doing
+   , void *arg,
))
{
...
}

@ parse_args_found @
expression R, E1, E2, E3, E4, E5, E6;
identifier func;
@@

(
R =
parse_args(E1, E2, E3, E4, E5, E6,
+  NULL,
   func);
|
R =
parse_args(E1, E2, E3, E4, E5, E6,
+  NULL,
   );
|
R =
parse_args(E1, E2, E3, E4, E5, E6,
+  NULL,
   NULL);
|
parse_args(E1, E2, E3, E4, E5, E6,
+  NULL,
   func);
|
parse_args(E1, E2, E3, E4, E5, E6,
+  NULL,
   );
|
parse_args(E1, E2, E3, E4, E5, E6,
+  NULL,
   NULL);
)

@ parse_args_unused depends on parse_args_found @
identifier parse_args_found.func;
@@

int func(char *param, char *val, const char *unused
+, void *arg
 )
{
...
}

@ mod_unused depends on parse_args_found @
identifier parse_args_found.func;
expression A1, A2, A3;
@@

-   func(A1, A2, A3);
+   func(A1, A2, A3, NULL);

Generated-by: Coccinelle SmPL
Cc: co...@systeme.lip6.fr
Cc: Tejun Heo 
Cc: Arjan van de Ven 
Cc: Greg Kroah-Hartman 
Cc: Rusty Russell 
Cc: Christoph Hellwig 
Cc: Felipe Contreras 
Cc: Ewan Milne 
Cc: Jean Delvare 
Cc: Hannes Reinecke 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 arch/powerpc/mm/hugetlbpage.c |  4 ++--
 include/linux/moduleparam.h   |  3 ++-
 init/main.c   | 25 +++--
 kernel/module.c   |  6 --
 kernel/params.c   | 11 +++
 lib/dynamic_debug.c   |  4 ++--
 6 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 7e70ae9..f3f836e 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -333,7 +333,7 @@ int alloc_bootmem_huge_page(struct hstate *hstate)
 unsigned long gpage_npages[MMU_PAGE_COUNT];
 
 static int __init do_gpage_early_setup(char *param, char *val,
-  const char *unused)
+  const char *unused, void *arg)
 {
static phys_addr_t size;
unsigned long npages;
@@ -375,7 +375,7 @@ void __init reserve_hugetlb_gpages(void)
 
strlcpy(cmdline, boot_command_line, COMMAND_LINE_SIZE);
parse_args("hugetlb gpages", cmdline, NULL, 0, 0, 0,
-   _gpage_early_setup);
+   NULL, _gpage_early_setup);
 
/*
 * Walk gpage list in reverse, allocating larger page sizes first.
diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index 494f99e..e975628 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -327,8 +327,9 @@ extern char *parse_args(const char *name,
  unsigned num,
  s16 level_min,
  s16 level_max,
+ void *arg,
  int (*unknown)(char *param, char *val,
- const char *doing));
+const char *doing, void *arg));
 
 /* Called by module remove. */
 #ifdef CONFIG_SYSFS
diff --git a/init/main.c b/init/main.c
index bb1aed9..32ae416 100644
--- a/init/main.c
+++ b/init/main.c
@@ -236,7 +236,8 @@ static int __init loglevel(char 

Re: [PATCH v12 08/12] PCI: OF: Introduce helper function for retrieving PCI domain numbers

2014-09-26 Thread Rob Herring
On Fri, Sep 26, 2014 at 4:20 PM,   wrote:
> On Fri, Sep 26, 2014 at 01:20:39PM -0500, Rob Herring wrote:
>> On Tue, Sep 23, 2014 at 2:01 PM, Liviu Dudau  wrote:
>> > Add pci_get_new_domain_nr() to allocate a new domain number
>> > and of_get_pci_domain_nr() to retrieve the PCI domain number
>> > of a given device from DT. Host bridge drivers or architecture
>> > specific code can choose to implement their PCI domain number
>> > policy using these two functions.
>> >
>> > Using of_get_pci_domain_nr() guarantees a stable PCI domain
>> > number on every boot provided that all host bridge controllers
>> > are assigned a number in the device tree using "linux,pci-domain"
>> > property. Mix use of pci_get_new_domain_nr() and of_get_pci_domain_nr()
>> > is not recommended as it can lead to potentially conflicting
>> > domain numbers being assigned to root busses behind different
>> > host bridges.
>> >
>> > Cc: Bjorn Helgaas 
>> > Cc: Arnd Bergmann 
>> > Cc: Grant Likely 
>> > Cc: Rob Herring 
>> > Cc: Catalin Marinas 
>> > Signed-off-by: Liviu Dudau 
>>
>> Looks pretty good, but a couple of comments that can be addressed as
>> follow-up patches.
>>
>> > ---
>> >  drivers/of/of_pci.c| 25 +
>> >  drivers/pci/pci.c  |  9 +
>> >  include/linux/of_pci.h |  7 +++
>> >  include/linux/pci.h|  3 +++
>> >  4 files changed, 44 insertions(+)
>> >
>> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
>> > index 8481996..82d172f 100644
>> > --- a/drivers/of/of_pci.c
>> > +++ b/drivers/of/of_pci.c
>> > @@ -89,6 +89,31 @@ int of_pci_parse_bus_range(struct device_node *node, 
>> > struct resource *res)
>> >  }
>> >  EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
>> >
>> > +/**
>> > + * This function will try to obtain the host bridge domain number by
>> > + * finding a property called "linux,pci-domain" of the given device node.
>> > + *
>> > + * @node: device tree node with the domain information
>> > + *
>> > + * Returns the associated domain number from DT in the range [0-0x], 
>> > or
>> > + * a negative value if the required property is not found.
>> > + */
>> > +int of_get_pci_domain_nr(struct device_node *node)
>> > +{
>> > +   const __be32 *value;
>> > +   int len;
>> > +   u16 domain;
>> > +
>> > +   value = of_get_property(node, "linux,pci-domain", );
>>
>> This needs to be documented.
>
> I would say needs agreed on *and* documented.

I thought Bjorn already applied the series. Who's agreement are you
waiting for specifically.

>>
>> > +   if (!value || len < sizeof(*value))
>> > +   return -EINVAL;
>> > +
>> > +   domain = (u16)be32_to_cpup(value);
>>
>> of_property_read_u32 would not work here?
>
> It should, I guess, but I want to limit the domain number range to 16 bits.

Okay, but it would still be fewer lines to use of_property_read_u32
and then cast it. You could use the u16 variant and make the property
be defined to be 16-bit in the documentation.

Rob
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] vfs fixes

2014-09-26 Thread Joachim Eastwood
On 26 September 2014 23:28, Joachim  Eastwood  wrote:
> On 26 September 2014 22:58, Al Viro  wrote:
>> On Fri, Sep 26, 2014 at 10:46:14PM +0200, Joachim Eastwood wrote:
>>> On 14 September 2014 21:47, Al Viro  wrote:
>>> > double iput() on failure exit in lustre, racy removal of spliced dentries
>>> > from ->s_anon in __d_materialise_dentry() plus a bunch of assorted RCU 
>>> > pathwalk
>>> > fixes.  Please, pull from the usual place -
>>> > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus
>>> >
>>> > Shortlog:
>>> > Al Viro (5):
>>> >   [fix] lustre: d_make_root() does iput() on dentry allocation failure
>>> >   move the call of __d_drop(anon) into __d_materialise_unique(dentry, 
>>> > anon)
>>> >   fix bogus read_seqretry() checks introduced in b37199e
>>> >   don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu()
>>> >   be careful with nd->inode in path_init() and follow_dotdot_rcu()
>>>
>>> Hi,
>>>
>>> Commit 4023bfc9f351a7994 "be careful with nd->inode in path_init() and
>>> follow_dotdot_rcu(), seem to hang my ARM no-MMU platform when mounting
>>> the ramdisk.
>>>
>>> 3.17-rc4 - works
>>> 3.17-rc5 - works with 4023bfc9f351a7994 reverted.
>>>
>>> Boot log with from rc5:
>>> [ 5.81] TCP: cubic registered
>>> [ 5.82] NET: Registered protocol family 17
>>> [ 5.86] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware 
>>> clock
>>> [ 5.91] mmc_host mmc0: Bus speed (slot 0) = 1200Hz (slot req
>>> 2500Hz, actual 1200HZ div = 0)
>>> [ 5.93] mmc0: new SDHC card at address 0007
>>> [ 5.95] mmcblk0: mmc0:0007 SD08G 7.42 GiB
>>> [ 6.15] clk: Not disabling unused clocks
>>> [ 81.24] random: nonblocking pool is initialized
>>>
>>> And there it just hangs it seems.
>>>
>>>
>>> With patch reverted
>>> [ 5.81] TCP: cubic registered
>>> [ 5.82] NET: Registered protocol family 17
>>> [ 5.85] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware 
>>> clock
>>> [ 6.10] clk: Not disabling unused clocks
>>> [ 6.11] RAMDISK: gzip image found at block 0
>>> [ 9.59] VFS: Mounted root (ext2 filesystem) readonly on device 1:0.
>>> [ 9.60] devtmpfs: mounted
>>> [ 9.61] Freeing unused kernel memory: 68K (281e5000 - 281f6000)
>>>
>>> And then user space starts.
>>
>> *blink*  What happens to mmc-related messages on successful boot?  And what
>> in that commit could've possibly lead to those not being produced?
>
> Now I am puzzled too. I can not longer reproduce that hang.
>
> I am guessing it was probably related to the mmc card being flaky or
> something random like that.
>
> Sorry for noise!

Just to confirm what was happening:

I managed to trigger it again and removing the mmc driver makes the
problem go away. So this is an mmc issue and _not_ vfs.

But what is strange is that reverting 4023bfc9f351a also makes it boot
again... Which is what fooled me.

Here are the boot logs just in case you want to have a look.
Boot log with hang: http://slexy.org/raw/s2eRhKN3aG
Boot ok (4023bfc9f351a reverted): http://slexy.org/raw/s21NgzZkCA

mmc is behaving differently.

Sorry for prematurely blaming vfs.

regards,
Joachim Eastwood
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: eBPF verifier thoughts (Re: [PATCH v15 net-next 00/11] eBPF syscall, verifier, testsuite)

2014-09-26 Thread Andy Lutomirski
On Fri, Sep 26, 2014 at 2:46 PM, Alexei Starovoitov  wrote:
> On Fri, Sep 26, 2014 at 1:42 PM, Andy Lutomirski  wrote:
>> On Fri, Sep 26, 2014 at 1:09 PM, Alexei Starovoitov  
>> wrote:
>>> On Fri, Sep 26, 2014 at 12:51 PM, Andy Lutomirski  
>>> wrote:
 On Fri, Sep 26, 2014 at 12:34 PM, Andy Lutomirski  
 wrote:

 To add one more point:

 With the current verifier design, it's impossible to write a userspace
 tool that can take an eBPF program and check it.  The verification is
 far too context-dependent for that to be possible.  I won't go so far
 as to say that a userspace tool needs to *exist*, but I strongly
 object to exposing a verification algorithm that *precludes* writing
 such a tool.
>>>
>>> that's just not true.
>>> why is it not possible?
>>
>> Because the types of referenced objects aren't encoded in the blob
>> that a user program loads, unless I'm missing something.
>
> patch #8 'handle pseudo BPF_LD_IMM64 insn' of this set
> handles first type == map. Other types will be added in the future.
> The same verification can be done in user space.
> It's pretty much copy paste for everything from the kernel.
> I don't understand yet why you really must do it in in userspace
> in addition to doing it in kernel. It's definitely doable.
> Instead of asking kernel to create a map, user space
> can just remember map attributes (key_size, value_size)
> and continues verification in userspace.
>
>> But the eBPF binary doesn't encode this information. In fact, the
>> caller of an ebpf syscall may not even have access to this
>> information.
>
> I don't follow. What info are you talking about?
> are you saying that program only that references maps via fds
> is not verifiable unless one knows what this fds refer to?
> yeah, but we're talking user space verification here.
> user knows what maps it creates with what attributes.
> Also we can add a command to this syscall to fetch map
> attributes. That would be trivial _incremental_ addition, right?

That would also work, I suppose.

>
>> I think this is addressable as a smallish change on top of your code.
>> Rather than looking up a map when you need to learn its key and value
>> size, I think that all you need to do is to look in a program section
>> for the key and value size (and the fact that it's a map) and confirm
>> that the referenced map *matches* the stored values.
>
> we can add extra info to the program that will encode
> program assumptions about maps. Sure. Though I think
> it's extra info that kernel doesn't really need, since it can
> only check that program assumptions match to what
> kernel already knows. Kernel cannot rely on them.
> So I'm not sure what this extra check really buys.
>
> Anyway, if you think it's a smallish change, we can do it
> incrementally on top of existing stuff, right?
> Why this arguing then?
> Sounds like you want to help with the development?
> This is great! I'm all for it :)



-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: eBPF verifier thoughts (Re: [PATCH v15 net-next 00/11] eBPF syscall, verifier, testsuite)

2014-09-26 Thread Andy Lutomirski
On Fri, Sep 26, 2014 at 2:25 PM, Alexei Starovoitov  wrote:
> On Fri, Sep 26, 2014 at 1:39 PM, Andy Lutomirski  wrote:
>>> not quite. there is a distinction between key and value.
>>> They both come from map definition and correspond to key_size
>>> and value_size, so they have to have two different corresponding
>>> _internal_ types 'ptr_to_map_key' and 'ptr_to_map_value'
>>> This distinction is needed to properly describe function
>>> arguments constraints.
>>
>> But they're still just pointers to buffers of some size known to the
>> verifier, right?  By calling them "pointer to map key" and "pointer to
>> map value" you're tying them to map objects in a way that makes little
>> sense to me.
>
> 'pointer_to_map_key' is internal argument constraint of the
> in-kernel helper function. It tells verifier how to check the values
> passed into function.
> Just pointer + size abstraction is not enough here.
> verifier has to know the type of what it's checking.

Ignore "pointer_to_map_key" -- that was an error on my part.

I still think that "pointer to map value" should be "pointer to bytes".

>
>> So what's "spill part"?  Unless I misunderstood the stack tracking
>> code, you're tracking each byte separately.
>>
>> You're also tracking the type for each stack slot separately for each
>> instruction.  That looks like it'll account for the considerable
>> majority of total memory usage.
>
> verifier has to track each byte separately, because
> malicious program may write a pointer into stack with 8-byte
> write, then modify single byte with 1-byte write and then
> try to read 8-byte back. Verifier has to catch that and
> that's why it's tracking every byte-sized slot independently.
>

Can't you just disallow the 1-byte write to the stack?

>> I don't like the fact that the function proto comes from the
>> environment instead of from the program.
>
> that's must have.
> in-kernel function argument constraints must come from
> kernel. where else?
> User program says I want to call function foo() and here
> is my code that invokes it. Kernel sees prototype of this
> foo() and checks arguments.
> There is no point for user space program to also
> pass foo() constraints. The only thing kernel can do
> with this extra info is to check that it matches what
> kernel already knows.

User says "I'm calling a function called foo that has this signature".
Kernel checks (a) that the signature is right and (b) that the call is
compliant.

>
>>> nope. breadth-first just doesn't work at all.
>>
>> Sorry, I didn't actually mean BFS.  I meant to order the search such
>> that all incoming control flow edges to an insn are visited before any
>> of the outgoing edges are visited.
>
> hmm. I'm not sure how exactly you plan on achieving that.
> I don't think we want to see real control/data flow graph
> analysis in the kernel the way compilers do things.
> It will be tens of thousands lines of code.
> The algorithm you see in this verifier is straight forward and
> tiny. I guess when time passes by when may get enough
> courage to attempt something like this, but
> today 'kiss' principle rules.

I'll try it in Python.  I bet I can get it to be shorter than the current code.

>
>>> complexity is actually described in the doc.
>>> There are several limits. Verifier will be aborted if it walks
>>> more then 32k instructions or more then 1k branches.
>>> So the very worst case takes micro seconds to reject
>>> the program. So I don't see your concern.
>>
>> That this will randomly fail, then.  For all I know, there are
>> existing valid BPF programs with vastly more than 32k "instructions"
>> as counted by the verifier.
>
> you need to double check your data :)
> classic bpf limit is 4k instructions per program.
> We're keeping the same limit for eBPF.
> 32k limit says that verifier will visit each instruction
> no more than 8 times.
> if we have a program full of branches, then yes, 32k limit will
> be reached and that's exactly what 'state pruning' patch is
> addressing! As I already said, I dropped it out of this set
> to ease review and to keep patch set size minimal.
> You can see it my tree:
> https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/commit/?h=v14=1d9529ae4ce24bc31ca245a156299aa9e59a29f0
> I was planning to send it next.
> It's small incremental patch on top of existing things.

Yes, but does it work reliably?

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: eBPF verifier thoughts (Re: [PATCH v15 net-next 00/11] eBPF syscall, verifier, testsuite)

2014-09-26 Thread Alexei Starovoitov
On Fri, Sep 26, 2014 at 1:42 PM, Andy Lutomirski  wrote:
> On Fri, Sep 26, 2014 at 1:09 PM, Alexei Starovoitov  wrote:
>> On Fri, Sep 26, 2014 at 12:51 PM, Andy Lutomirski  
>> wrote:
>>> On Fri, Sep 26, 2014 at 12:34 PM, Andy Lutomirski  
>>> wrote:
>>>
>>> To add one more point:
>>>
>>> With the current verifier design, it's impossible to write a userspace
>>> tool that can take an eBPF program and check it.  The verification is
>>> far too context-dependent for that to be possible.  I won't go so far
>>> as to say that a userspace tool needs to *exist*, but I strongly
>>> object to exposing a verification algorithm that *precludes* writing
>>> such a tool.
>>
>> that's just not true.
>> why is it not possible?
>
> Because the types of referenced objects aren't encoded in the blob
> that a user program loads, unless I'm missing something.

patch #8 'handle pseudo BPF_LD_IMM64 insn' of this set
handles first type == map. Other types will be added in the future.
The same verification can be done in user space.
It's pretty much copy paste for everything from the kernel.
I don't understand yet why you really must do it in in userspace
in addition to doing it in kernel. It's definitely doable.
Instead of asking kernel to create a map, user space
can just remember map attributes (key_size, value_size)
and continues verification in userspace.

> But the eBPF binary doesn't encode this information. In fact, the
> caller of an ebpf syscall may not even have access to this
> information.

I don't follow. What info are you talking about?
are you saying that program only that references maps via fds
is not verifiable unless one knows what this fds refer to?
yeah, but we're talking user space verification here.
user knows what maps it creates with what attributes.
Also we can add a command to this syscall to fetch map
attributes. That would be trivial _incremental_ addition, right?

> I think this is addressable as a smallish change on top of your code.
> Rather than looking up a map when you need to learn its key and value
> size, I think that all you need to do is to look in a program section
> for the key and value size (and the fact that it's a map) and confirm
> that the referenced map *matches* the stored values.

we can add extra info to the program that will encode
program assumptions about maps. Sure. Though I think
it's extra info that kernel doesn't really need, since it can
only check that program assumptions match to what
kernel already knows. Kernel cannot rely on them.
So I'm not sure what this extra check really buys.

Anyway, if you think it's a smallish change, we can do it
incrementally on top of existing stuff, right?
Why this arguing then?
Sounds like you want to help with the development?
This is great! I'm all for it :)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 00/17] Cross-architecture definitions of relaxed MMIO accessors

2014-09-26 Thread Arnd Bergmann
On Friday 26 September 2014 14:46:08 Russell King - ARM Linux wrote:
> 
> Obviously, this does nothing for the:
> 
> include/asm-generic/io.h:804:29: error: redefinition of 'virt_to_bus'
> include/asm-generic/io.h:809:21: error: redefinition of 'bus_to_virt'
> 
> errors which are also reported in Olof's build system.
> 
> Given how close we are to the merge window, I'd suggest this stuff gets
> reverted so that it can have a better period of testing, rather than
> stuffing it into -next at such a critical time.

Right, I've fixed these all up locally, but then found new warnings on
allmodconfig builds that I haven't been able to track down yet.

I've dropped everything from the asm-generic tree now, so we can get
a clean linux-next tree.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ACPI / i915: Update the condition to ignore firmware backlight change request

2014-09-26 Thread Rafael J. Wysocki
On Friday, September 26, 2014 10:30:08 AM Aaron Lu wrote:
> Some of the Thinkpads' firmware will issue a backlight change request
> through i915 operation region unconditionally on AC plug/unplug, the
> backlight level used is arbitrary and thus should be ignored. This is
> handled by commit 0b9f7d93ca61 (ACPI / i915: ignore firmware requests
> for backlight change). Then there is a Dell laptop whose vendor backlight
> interface also makes use of operation region to change backlight level
> and with the above commit, that interface no long works. The condition
> used to ignore the backlight change request from firmware is thus
> changed to: if the vendor backlight interface is not in use and the ACPI
> backlight interface is broken, we ignore the requests; oterwise, we keep
> processing them.
> 
> Reference: https://lkml.org/lkml/2014/9/23/854
> Reported-and-tested-by: Pali Rohár 
> Cc:  # v3.16 and later
> Signed-off-by: Aaron Lu 

Daniel, any objections?

> ---
>  drivers/gpu/drm/i915/intel_opregion.c | 16 +++-
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_opregion.c 
> b/drivers/gpu/drm/i915/intel_opregion.c
> index ca52ad2ae7d1..d8de1d5140a7 100644
> --- a/drivers/gpu/drm/i915/intel_opregion.c
> +++ b/drivers/gpu/drm/i915/intel_opregion.c
> @@ -396,6 +396,16 @@ int intel_opregion_notify_adapter(struct drm_device 
> *dev, pci_power_t state)
>   return -EINVAL;
>  }
>  
> +/*
> + * If the vendor backlight interface is not in use and ACPI backlight 
> interface
> + * is broken, do not bother processing backlight change requests from 
> firmware.
> + */
> +static bool should_ignore_backlight_request(void)
> +{
> + return acpi_video_backlight_support() &&
> +!acpi_video_verify_backlight_support();
> +}
> +
>  static u32 asle_set_backlight(struct drm_device *dev, u32 bclp)
>  {
>   struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -404,11 +414,7 @@ static u32 asle_set_backlight(struct drm_device *dev, 
> u32 bclp)
>  
>   DRM_DEBUG_DRIVER("bclp = 0x%08x\n", bclp);
>  
> - /*
> -  * If the acpi_video interface is not supposed to be used, don't
> -  * bother processing backlight level change requests from firmware.
> -  */
> - if (!acpi_video_verify_backlight_support()) {
> + if (should_ignore_backlight_request()) {
>   DRM_DEBUG_KMS("opregion backlight request ignored\n");
>   return 0;
>   }
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v12 08/12] PCI: OF: Introduce helper function for retrieving PCI domain numbers

2014-09-26 Thread liviu
On Fri, Sep 26, 2014 at 01:20:39PM -0500, Rob Herring wrote:
> On Tue, Sep 23, 2014 at 2:01 PM, Liviu Dudau  wrote:
> > Add pci_get_new_domain_nr() to allocate a new domain number
> > and of_get_pci_domain_nr() to retrieve the PCI domain number
> > of a given device from DT. Host bridge drivers or architecture
> > specific code can choose to implement their PCI domain number
> > policy using these two functions.
> >
> > Using of_get_pci_domain_nr() guarantees a stable PCI domain
> > number on every boot provided that all host bridge controllers
> > are assigned a number in the device tree using "linux,pci-domain"
> > property. Mix use of pci_get_new_domain_nr() and of_get_pci_domain_nr()
> > is not recommended as it can lead to potentially conflicting
> > domain numbers being assigned to root busses behind different
> > host bridges.
> >
> > Cc: Bjorn Helgaas 
> > Cc: Arnd Bergmann 
> > Cc: Grant Likely 
> > Cc: Rob Herring 
> > Cc: Catalin Marinas 
> > Signed-off-by: Liviu Dudau 
> 
> Looks pretty good, but a couple of comments that can be addressed as
> follow-up patches.
> 
> > ---
> >  drivers/of/of_pci.c| 25 +
> >  drivers/pci/pci.c  |  9 +
> >  include/linux/of_pci.h |  7 +++
> >  include/linux/pci.h|  3 +++
> >  4 files changed, 44 insertions(+)
> >
> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> > index 8481996..82d172f 100644
> > --- a/drivers/of/of_pci.c
> > +++ b/drivers/of/of_pci.c
> > @@ -89,6 +89,31 @@ int of_pci_parse_bus_range(struct device_node *node, 
> > struct resource *res)
> >  }
> >  EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
> >
> > +/**
> > + * This function will try to obtain the host bridge domain number by
> > + * finding a property called "linux,pci-domain" of the given device node.
> > + *
> > + * @node: device tree node with the domain information
> > + *
> > + * Returns the associated domain number from DT in the range [0-0x], or
> > + * a negative value if the required property is not found.
> > + */
> > +int of_get_pci_domain_nr(struct device_node *node)
> > +{
> > +   const __be32 *value;
> > +   int len;
> > +   u16 domain;
> > +
> > +   value = of_get_property(node, "linux,pci-domain", );
> 
> This needs to be documented.

I would say needs agreed on *and* documented.

> 
> > +   if (!value || len < sizeof(*value))
> > +   return -EINVAL;
> > +
> > +   domain = (u16)be32_to_cpup(value);
> 
> of_property_read_u32 would not work here?

It should, I guess, but I want to limit the domain number range to 16 bits.

Best regards,
Liviu

> 
> > +
> > +   return domain;
> > +}
> > +EXPORT_SYMBOL_GPL(of_get_pci_domain_nr);
> > +
> >  #ifdef CONFIG_PCI_MSI
> >
> >  static LIST_HEAD(of_pci_msi_chip_list);
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 2c9ac70..d36f35f 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -4406,6 +4406,15 @@ static void pci_no_domains(void)
> >  #endif
> >  }
> >
> > +#ifdef CONFIG_PCI_DOMAINS
> > +static atomic_t __domain_nr = ATOMIC_INIT(-1);
> > +
> > +int pci_get_new_domain_nr(void)
> > +{
> > +   return atomic_inc_return(&__domain_nr);
> > +}
> > +#endif
> > +
> >  /**
> >   * pci_ext_cfg_avail - can we access extended PCI config space?
> >   *
> > diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
> > index dde3a4a..71062e9 100644
> > --- a/include/linux/of_pci.h
> > +++ b/include/linux/of_pci.h
> > @@ -15,6 +15,7 @@ struct device_node *of_pci_find_child_device(struct 
> > device_node *parent,
> >  int of_pci_get_devfn(struct device_node *np);
> >  int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
> >  int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
> > +int of_get_pci_domain_nr(struct device_node *node);
> >  #else
> >  static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct 
> > of_phandle_args *out_irq)
> >  {
> > @@ -43,6 +44,12 @@ of_pci_parse_bus_range(struct device_node *node, struct 
> > resource *res)
> >  {
> > return -EINVAL;
> >  }
> > +
> > +static inline int
> > +of_get_pci_domain_nr(struct device_node *node)
> > +{
> > +   return -1;
> > +}
> >  #endif
> >
> >  #if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index a494e5d..150da2d 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -1285,10 +1285,12 @@ void pci_cfg_access_unlock(struct pci_dev *dev);
> >   */
> >  #ifdef CONFIG_PCI_DOMAINS
> >  extern int pci_domains_supported;
> > +int pci_get_new_domain_nr(void);
> >  #else
> >  enum { pci_domains_supported = 0 };
> >  static inline int pci_domain_nr(struct pci_bus *bus) { return 0; }
> >  static inline int pci_proc_domain(struct pci_bus *bus) { return 0; }
> > +static inline int pci_get_new_domain_nr(void) { return -ENOSYS; }
> >  #endif /* CONFIG_PCI_DOMAINS */
> >
> >  /*
> > @@ -1417,6 +1419,7 @@ static inline 

Re: [PATCH v2] x86: Quark: Add if/else to setup_arch for Quark TLB bug

2014-09-26 Thread Bryan O'Donoghue

On 26/09/14 21:59, Henrique de Moraes Holschuh wrote:


I'm confused, now.

Wasn't the other patch -- which just added a comment -- the one selected as
a better fix, because there is absolutely no point in calling __flush_tlb()
on Quark X1000 *right after* you just flushed the TLB [on these processors]
by doing a load_cr3() ?

Should this one ([PATCH v2] x86: Quark: Add if/else to setup_arch for Quark
TLB bug) be ignored?  Or should the other one which just adds a comment be
ignored?


Hi Henrique.

My view is that the CR3 load should have flushed the TLB in it's entirety.

Ong Boong Leong said that a discussion he had which included HPA 
concluded with a flush of the TLB being required after the CR3 reload.


The current code

a) Writes to CR3.
On Quark that will flush the entire TLB
Ref: Quark SoC X1000 Core Developer's Manual section 6.4.11

On other processors this won't invalidate the entire TLB under every 
scenario. For example a PTE with PGE on may not be flushed.


Ref: Intel 64 and IA32 Architectures .. vol (3a, 3b, 3c) 325384.pdf 
section 4.10.4.1


b) __flush_tlb_all
Depending on the value of cpu_has_pge() - which will be true for ~all 
x86 processors will call


native_write_cr4(cr4 & ~X86_CR4_PGE);

According to the same section "4.10.4.1" this will nuke the rest of the 
TLB entries even with PGE bits set in the PTEs


So the steps a & b are the appropriate sets to take to entirely flush 
the TLB for most processors i.e. one's that support PGE


The suggestion is that on Quark we should flush the TLB again.

The documentation doesn't really support that statement but, my view is 
OK if we are going to accommodate that position let's not force a third 
TLB flush on everybody who isn't running a Quark.


From the technical/documentation standpoint I don't think Quark is 
forcing a step c - if we do decide to add a __flush_tlb() call for the 
sake of Quark anyway - then let's not foist the extra cycles on every 
other x86 processor out there.


My preference is

1. Just comment the code as is to explain why it works for Quark.

If that's not good enough for people then

2. if/else the flow so that Quark does __flush_tlb() and the rest of the 
world does a __flush_tlb_all()



--
BOD
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] vfs fixes

2014-09-26 Thread Joachim Eastwood
On 26 September 2014 22:58, Al Viro  wrote:
> On Fri, Sep 26, 2014 at 10:46:14PM +0200, Joachim Eastwood wrote:
>> On 14 September 2014 21:47, Al Viro  wrote:
>> > double iput() on failure exit in lustre, racy removal of spliced dentries
>> > from ->s_anon in __d_materialise_dentry() plus a bunch of assorted RCU 
>> > pathwalk
>> > fixes.  Please, pull from the usual place -
>> > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus
>> >
>> > Shortlog:
>> > Al Viro (5):
>> >   [fix] lustre: d_make_root() does iput() on dentry allocation failure
>> >   move the call of __d_drop(anon) into __d_materialise_unique(dentry, 
>> > anon)
>> >   fix bogus read_seqretry() checks introduced in b37199e
>> >   don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu()
>> >   be careful with nd->inode in path_init() and follow_dotdot_rcu()
>>
>> Hi,
>>
>> Commit 4023bfc9f351a7994 "be careful with nd->inode in path_init() and
>> follow_dotdot_rcu(), seem to hang my ARM no-MMU platform when mounting
>> the ramdisk.
>>
>> 3.17-rc4 - works
>> 3.17-rc5 - works with 4023bfc9f351a7994 reverted.
>>
>> Boot log with from rc5:
>> [ 5.81] TCP: cubic registered
>> [ 5.82] NET: Registered protocol family 17
>> [ 5.86] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware 
>> clock
>> [ 5.91] mmc_host mmc0: Bus speed (slot 0) = 1200Hz (slot req
>> 2500Hz, actual 1200HZ div = 0)
>> [ 5.93] mmc0: new SDHC card at address 0007
>> [ 5.95] mmcblk0: mmc0:0007 SD08G 7.42 GiB
>> [ 6.15] clk: Not disabling unused clocks
>> [ 81.24] random: nonblocking pool is initialized
>>
>> And there it just hangs it seems.
>>
>>
>> With patch reverted
>> [ 5.81] TCP: cubic registered
>> [ 5.82] NET: Registered protocol family 17
>> [ 5.85] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware 
>> clock
>> [ 6.10] clk: Not disabling unused clocks
>> [ 6.11] RAMDISK: gzip image found at block 0
>> [ 9.59] VFS: Mounted root (ext2 filesystem) readonly on device 1:0.
>> [ 9.60] devtmpfs: mounted
>> [ 9.61] Freeing unused kernel memory: 68K (281e5000 - 281f6000)
>>
>> And then user space starts.
>
> *blink*  What happens to mmc-related messages on successful boot?  And what
> in that commit could've possibly lead to those not being produced?

Now I am puzzled too. I can not longer reproduce that hang.

I am guessing it was probably related to the mmc card being flaky or
something random like that.

Sorry for noise!

regards,
Joachim Eastwood
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: irq mask swapping during suspend/resume

2014-09-26 Thread Rafael J. Wysocki
On Friday, September 26, 2014 11:39:28 PM Rafael J. Wysocki wrote:
> On Friday, September 26, 2014 09:58:39 AM Eric Caruso wrote:
> > I was putting together a prototype for this, and ran into a design
> > issue. It's not obvious how to get from the struct wakeup_source to
> > places where we hold all of the relevant device information or irq
> > information. If we were to walk the list of wakeup source objects,
> > where would we actually get the irq information to call
> > enable_irq_wake()?
> > 
> > Does this just require a bunch of extra plumbing we don't have right
> > now, and this is what you meant by changes to wakeup_source_create()
> > and friends?
> 
> Yes, that's it.
> 
> But at least some drivers call enable_irq_wake() today, so it should just
> work for devices handled by them.  That may not be all devices you want,
> though.

Ah, I've confused threads.  So the paragraph above doesn't make sense, but
the one below does. :-)  Please don't top-post, it makes things rather
difficult to follow.

> And BTW, there is a problem with the approach I was talking about which is
> the sysfs interface for enabling wakeup, so we actually need to put wakeup
> IRQ information into struct dev_pm_info and point to that from the
> associated wakeup source object.  It still should be doable though.
> 
> > On Sat, Sep 20, 2014 at 5:48 PM, Rafael J. Wysocki  
> > wrote:
> > > On Thursday, September 18, 2014 01:32:06 PM Thomas Gleixner wrote:
> > >> On Wed, 17 Sep 2014, Dmitry Torokhov wrote:
> > >> > Hi Thomas,
> > >> >
> > >> > On Wednesday, September 17, 2014 12:05:42 PM Thomas Gleixner wrote:
> > >> > > On Tue, 16 Sep 2014, Eric Caruso wrote:
> > >> > > > We would like to be able to set different irq masks for triggers 
> > >> > > > during
> > >> > > > normal operation and for waking up the system. For example, while 
> > >> > > > a laptop
> > >> > > > is awake, closing the lid and opening the lid should both fire an
> > >> > > > interrupt, but when the system is asleep, we would like to stay 
> > >> > > > asleep
> > >> > > > when
> > >> > > > closing the lid.
> > >> > > >
> > >> > > > We are thinking about stashing the irq mask used specifically for 
> > >> > > > waking
> > >> > > > the system up in the irq_desc struct, and then swapping it during
> > >> > > > enable_irq_wake and disable_irq_wake calls. Devices that do not 
> > >> > > > specify a
> > >> > > > different wake mask will use their normal trigger mask for both
> > >> > > > situations.
> > >> > > >
> > >> > > > Is this acceptable?
> > >> > >
> > >> > > Not really. Why should irq_desc provide storage for random
> > >> > > configurations and bind them to some random system state?
> > >> > >
> > >> > > What's wrong with calling
> > >> > >
> > >> > >irq_set_type(irq, B);
> > >> > >enable_irq_wake(irq);
> > >> > >
> > >> > >disable_irq_wake(irq);
> > >> > >irq_set_type(irq, A);
> > >> >
> > >> > The desire is to avoid doing it in [every] driver but rather have it 
> > >> > done
> > >> > centrally by device/PM core. It does not have to be irq_desc though,
> > >> > maybe you can suggest a better place for it (aside of the individual 
> > >> > driver
> > >> > code that is)?
> > >>
> > >> Well, if it should be done by the device/pm core then you want to
> > >> store that information in the device related data structure.
> > >>
> > >> struct dev_pm_info might be the right place for it, but that's up to
> > >> Rafael.
> > >>
> > >> So driver would set
> > >>
> > >>dev->power.update_wakeirq_type = true;
> > >>dev->power.irq_type_normal = IRQ_TYPE_EDGE_BOTH;
> > >>dev->power.irq_type_sleep = IRQ_TYPE_EDGE_LOW;
> > >>
> > >> And the dev/PM core can issue the calls on suspend/resume.
> > >
> > > So I'd rather put that into the struct wakeup_source pointed to by
> > > the wakeup pointer in struct dev_pm_info.  That would give us a mapping
> > > between wakeup source objects and wakeup interrupts and which would make
> > > a fair amount of sense in my view.
> > >
> > > Then, we could simply walk the list of wakeup source objects before
> > > suspend_device_irqs() and call enable_irq_wake() etc. for all of the
> > > interrupts in question without drivers having to worry about that.
> > > We also could save the current IRQ type for them at that point and
> > > restore it during resume.
> > >
> > > Of course, that would require some changes to wakeup_source_create()
> > > and friends, but is probably worth doing.
> > >
> > > Still, before we start making those changes, here's a bunch of questions
> > > to answer:
> > >
> > > (1) Say a wakeup interrupt is shared between two drivers and one of them
> > > asks for a different "IRQ type for sleep" than the other one.  How are
> > > we going to resolve such conflicts?
> > >
> > > (2) Can platforms place restrictions on the IRQ type to be used with a 
> > > given
> > > line?  If so, how do we handle situations in which the requested
> > > "IRQ type for sleep" is 

Re: eBPF verifier thoughts (Re: [PATCH v15 net-next 00/11] eBPF syscall, verifier, testsuite)

2014-09-26 Thread Alexei Starovoitov
On Fri, Sep 26, 2014 at 1:39 PM, Andy Lutomirski  wrote:
>> not quite. there is a distinction between key and value.
>> They both come from map definition and correspond to key_size
>> and value_size, so they have to have two different corresponding
>> _internal_ types 'ptr_to_map_key' and 'ptr_to_map_value'
>> This distinction is needed to properly describe function
>> arguments constraints.
>
> But they're still just pointers to buffers of some size known to the
> verifier, right?  By calling them "pointer to map key" and "pointer to
> map value" you're tying them to map objects in a way that makes little
> sense to me.

'pointer_to_map_key' is internal argument constraint of the
in-kernel helper function. It tells verifier how to check the values
passed into function.
Just pointer + size abstraction is not enough here.
verifier has to know the type of what it's checking.

> So what's "spill part"?  Unless I misunderstood the stack tracking
> code, you're tracking each byte separately.
>
> You're also tracking the type for each stack slot separately for each
> instruction.  That looks like it'll account for the considerable
> majority of total memory usage.

verifier has to track each byte separately, because
malicious program may write a pointer into stack with 8-byte
write, then modify single byte with 1-byte write and then
try to read 8-byte back. Verifier has to catch that and
that's why it's tracking every byte-sized slot independently.

> I don't like the fact that the function proto comes from the
> environment instead of from the program.

that's must have.
in-kernel function argument constraints must come from
kernel. where else?
User program says I want to call function foo() and here
is my code that invokes it. Kernel sees prototype of this
foo() and checks arguments.
There is no point for user space program to also
pass foo() constraints. The only thing kernel can do
with this extra info is to check that it matches what
kernel already knows.

>> nope. breadth-first just doesn't work at all.
>
> Sorry, I didn't actually mean BFS.  I meant to order the search such
> that all incoming control flow edges to an insn are visited before any
> of the outgoing edges are visited.

hmm. I'm not sure how exactly you plan on achieving that.
I don't think we want to see real control/data flow graph
analysis in the kernel the way compilers do things.
It will be tens of thousands lines of code.
The algorithm you see in this verifier is straight forward and
tiny. I guess when time passes by when may get enough
courage to attempt something like this, but
today 'kiss' principle rules.

>> complexity is actually described in the doc.
>> There are several limits. Verifier will be aborted if it walks
>> more then 32k instructions or more then 1k branches.
>> So the very worst case takes micro seconds to reject
>> the program. So I don't see your concern.
>
> That this will randomly fail, then.  For all I know, there are
> existing valid BPF programs with vastly more than 32k "instructions"
> as counted by the verifier.

you need to double check your data :)
classic bpf limit is 4k instructions per program.
We're keeping the same limit for eBPF.
32k limit says that verifier will visit each instruction
no more than 8 times.
if we have a program full of branches, then yes, 32k limit will
be reached and that's exactly what 'state pruning' patch is
addressing! As I already said, I dropped it out of this set
to ease review and to keep patch set size minimal.
You can see it my tree:
https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/commit/?h=v14=1d9529ae4ce24bc31ca245a156299aa9e59a29f0
I was planning to send it next.
It's small incremental patch on top of existing things.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] dma: imx-sdma: fix another incorrect __init annotation

2014-09-26 Thread Arnd Bergmann
In e34b731faa7d1 ("dma: imx-sdma: Remove spurious __init annotation on
sdma_probe()"), Mark found an extraneous __init label and fixed it.

However, he missed another one, because now we get this other warning:

WARNING: drivers/dma/imx-sdma.o(.text+0x3bb4): Section mismatch in reference 
from the function sdma_probe() to the function .init.text:sdma_get_firmware()
The function sdma_probe() references
the function __init sdma_get_firmware().

Same reasoning as the last time, the function may get called at
runtime, so it can't be __init.

Signed-off-by: Arnd Bergmann 

diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index 52ce1d21154c..88afc48c2ca7 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -1334,7 +1334,7 @@ err_firmware:
release_firmware(fw);
 }
 
-static int __init sdma_get_firmware(struct sdma_engine *sdma,
+static int sdma_get_firmware(struct sdma_engine *sdma,
const char *fw_name)
 {
int ret;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 01/11 v6] coresight: add CoreSight core layer framework

2014-09-26 Thread Mathieu Poirier
On 23 September 2014 23:45, Greg KH  wrote:
> On Thu, Sep 18, 2014 at 05:09:11PM -0600, Mathieu Poirier wrote:
>> On 12 September 2014 12:16, Greg KH  wrote:
>> > On Fri, Sep 12, 2014 at 11:41:44AM -0600, Mathieu Poirier wrote:
>> >> Good morning and thanks for the review.  Pls see comments below.
>> >>
>> >> Mathieu
>> >>
>> >> On 11 September 2014 14:33, Greg KH  wrote:
>> >> > Some first impressions in glancing at the code, not a complete review at
>> >> > all:
>> >> >
>> >> > On Thu, Sep 11, 2014 at 09:49:08AM -0600, mathieu.poir...@linaro.org 
>> >> > wrote:
>> >> >> --- /dev/null
>> >> >> +++ b/drivers/coresight/coresight.c
>> >> >> @@ -0,0 +1,663 @@
>> >> >> +/* Copyright (c) 2012, The Linux Foundation. All rights reserved.
>> >> >> + *
>> >> >> + * This program is free software; you can redistribute it and/or 
>> >> >> modify
>> >> >> + * it under the terms of the GNU General Public License version 2 and
>> >> >> + * only version 2 as published by the Free Software Foundation.
>> >> >> + *
>> >> >> + * This program is distributed in the hope that it will be useful,
>> >> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> >> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> >> >> + * GNU General Public License for more details.
>> >> >> + */
>> >> >> +
>> >> >> +#define pr_fmt(fmt) "coresight: " fmt
>> >> >
>> >> > MODULE_NAME ?
>> >>
>> >> Is this what you had in mind?
>> >>
>> >> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>> >
>> > Sorry, yes, that's much better and "portable".
>> >
>> > But as you are using the driver model, the odds of you ever using a pr_*
>> > call is all but none, so why do you even need this?
>>
>> You suggest to use dev_* instead?  My (perhaps flawed) logic was that
>> the framework ins't associated with a single device but a set of them.
>
> Yes, but a single device is causing a message to be emitted, so
> reference that please.
>
>> >> >> + struct coresight_connection *conns;
>> >> >> + int nr_conns;
>> >> >> + enum coresight_dev_type type;
>> >> >> + struct coresight_dev_subtype subtype;
>> >> >> + const struct coresight_ops *ops;
>> >> >> + struct dentry *de;
>> >> >
>> >> > All devices have a dentry?  in debugfs?  isn't that overkill?
>> >>
>> >> All devices along a "path" can require specific configuration, which
>> >> can only be made by a user.  Exposing the registers via debugfs seemed
>> >> like the most plausible solution.
>> >
>> > configuration is not to be done through debugfs, especially as that is
>> > restricted to root, and lots of systems don't even have it.  debugfs is
>> > always optional, don't make your code depend on it to work properly.
>>
>> Humm... You are the first one to be of the opinion that debugfs
>> shouldn't be used for coresight configuration - other reviewers have
>> unequivocally requested that configurables be moved under debugfs.
>> Debugfs seemed to be a good fit, specifically since ftrace is already
>> in there.  We could enable CONFIG_DEBUG_FS automatically (the same way
>> I do it for AMBA) when coresight gets enabled in the kernel config...
>>
>> If not debugfs, where then?  I'd welcome a suggestion.
>
> It is a mistake that ftrace is in debugfs, the developers of that code
> admit it.

That's interesting to know.

>
> Your system should be able to run just fine if debugfs is disabled, if
> your interface is "optional" like that, great, put it in debugfs,
> otherwise put it somewhere else.

I lumped the entries under /sys/bus/coresight/, the same way the
orgininal author had them.

>
> As to where else to put it, what exactly are you trying to do here?
> What do you want to export / import?  Who would use it?  How would they
> use it?  When would they use it?
>
>> >> >> + struct device dev;
>> >> >> + struct kref kref;
>> >> >
>> >> > You CAN NOT have two reference counts in the same structure, that's a
>> >> > huge design mistake.  Stick with one reference count, otherwise they are
>> >> > guaranteed to get out of sync and bad things will happen.
>> >>
>> >> In this case the additional @kref is for accounting of components
>> >> within the coresight framework only.  When the amba framework calls
>> >> the driver's _probe() routine the kref count is already equal to '2'
>> >> (as expected), even if no other coresight components have used that
>> >> device.  Knowing when a component is no longer in use by the framework
>> >> (but still available in the system) is important for coresight cleanup
>> >> consideration, i.e switch off the component to save power.
>> >>
>> >> The kref helpers don't expose the refcount and @kref_sub will only
>> >> call the cleanup method when refcount is '1'.  How else should I
>> >> proceed?
>> >
>> > Don't use a kref at all, you should never need to "know" a refcount
>> > value.  If you do, your logic is incorrect, sorry.  Just use the normal
>> > driver core functions for when to cleanup memory and the like and you
>> > 

Re: irq mask swapping during suspend/resume

2014-09-26 Thread Rafael J. Wysocki
On Friday, September 26, 2014 09:58:39 AM Eric Caruso wrote:
> I was putting together a prototype for this, and ran into a design
> issue. It's not obvious how to get from the struct wakeup_source to
> places where we hold all of the relevant device information or irq
> information. If we were to walk the list of wakeup source objects,
> where would we actually get the irq information to call
> enable_irq_wake()?
> 
> Does this just require a bunch of extra plumbing we don't have right
> now, and this is what you meant by changes to wakeup_source_create()
> and friends?

Yes, that's it.

But at least some drivers call enable_irq_wake() today, so it should just
work for devices handled by them.  That may not be all devices you want,
though.

And BTW, there is a problem with the approach I was talking about which is
the sysfs interface for enabling wakeup, so we actually need to put wakeup
IRQ information into struct dev_pm_info and point to that from the
associated wakeup source object.  It still should be doable though.

> On Sat, Sep 20, 2014 at 5:48 PM, Rafael J. Wysocki  wrote:
> > On Thursday, September 18, 2014 01:32:06 PM Thomas Gleixner wrote:
> >> On Wed, 17 Sep 2014, Dmitry Torokhov wrote:
> >> > Hi Thomas,
> >> >
> >> > On Wednesday, September 17, 2014 12:05:42 PM Thomas Gleixner wrote:
> >> > > On Tue, 16 Sep 2014, Eric Caruso wrote:
> >> > > > We would like to be able to set different irq masks for triggers 
> >> > > > during
> >> > > > normal operation and for waking up the system. For example, while a 
> >> > > > laptop
> >> > > > is awake, closing the lid and opening the lid should both fire an
> >> > > > interrupt, but when the system is asleep, we would like to stay 
> >> > > > asleep
> >> > > > when
> >> > > > closing the lid.
> >> > > >
> >> > > > We are thinking about stashing the irq mask used specifically for 
> >> > > > waking
> >> > > > the system up in the irq_desc struct, and then swapping it during
> >> > > > enable_irq_wake and disable_irq_wake calls. Devices that do not 
> >> > > > specify a
> >> > > > different wake mask will use their normal trigger mask for both
> >> > > > situations.
> >> > > >
> >> > > > Is this acceptable?
> >> > >
> >> > > Not really. Why should irq_desc provide storage for random
> >> > > configurations and bind them to some random system state?
> >> > >
> >> > > What's wrong with calling
> >> > >
> >> > >irq_set_type(irq, B);
> >> > >enable_irq_wake(irq);
> >> > >
> >> > >disable_irq_wake(irq);
> >> > >irq_set_type(irq, A);
> >> >
> >> > The desire is to avoid doing it in [every] driver but rather have it done
> >> > centrally by device/PM core. It does not have to be irq_desc though,
> >> > maybe you can suggest a better place for it (aside of the individual 
> >> > driver
> >> > code that is)?
> >>
> >> Well, if it should be done by the device/pm core then you want to
> >> store that information in the device related data structure.
> >>
> >> struct dev_pm_info might be the right place for it, but that's up to
> >> Rafael.
> >>
> >> So driver would set
> >>
> >>dev->power.update_wakeirq_type = true;
> >>dev->power.irq_type_normal = IRQ_TYPE_EDGE_BOTH;
> >>dev->power.irq_type_sleep = IRQ_TYPE_EDGE_LOW;
> >>
> >> And the dev/PM core can issue the calls on suspend/resume.
> >
> > So I'd rather put that into the struct wakeup_source pointed to by
> > the wakeup pointer in struct dev_pm_info.  That would give us a mapping
> > between wakeup source objects and wakeup interrupts and which would make
> > a fair amount of sense in my view.
> >
> > Then, we could simply walk the list of wakeup source objects before
> > suspend_device_irqs() and call enable_irq_wake() etc. for all of the
> > interrupts in question without drivers having to worry about that.
> > We also could save the current IRQ type for them at that point and
> > restore it during resume.
> >
> > Of course, that would require some changes to wakeup_source_create()
> > and friends, but is probably worth doing.
> >
> > Still, before we start making those changes, here's a bunch of questions
> > to answer:
> >
> > (1) Say a wakeup interrupt is shared between two drivers and one of them
> > asks for a different "IRQ type for sleep" than the other one.  How are
> > we going to resolve such conflicts?
> >
> > (2) Can platforms place restrictions on the IRQ type to be used with a given
> > line?  If so, how do we handle situations in which the requested
> > "IRQ type for sleep" is different from what the given line can use?
> > Do we need to resolve that at the struct wakeup_source creation time or
> > can we do that later (during suspend?) and how?
> >
> > Rafael
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
I speak 

Re: [PATCH v5 RESEND 1/3] ARM: EXYNOS: Move code from hotplug.c to platsmp.c

2014-09-26 Thread Arnd Bergmann
On Friday 05 September 2014, Krzysztof Kozlowski wrote:
> The commit only moves code around with one additional observable change:
> the hotplug.c was compiled with custom CFLAGS (-march=armv7-a). These
> CFLAGS are not necessary any more.

This turns out to be wrong, and your change broke 'allmodconfig' builds
in linux-next. Please apply this patch on top.

Arnd

8<--
>From 4ba6bf8806caec386e35930314dbad071284c837 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann 
Date: Fri, 26 Sep 2014 23:09:38 +0200
Subject: [PATCH] ARM: EXYNOS: fix build error in platsmp.c

/tmp/ccYeWL3V.s: Assembler messages:
/tmp/ccYeWL3V.s:659: Error: selected processor does not support ARM mode `isb '
/tmp/ccYeWL3V.s:664: Error: selected processor does not support ARM mode `isb '
/tmp/ccYeWL3V.s:665: Error: selected processor does not support ARM mode `dsb '
make[3]: *** [arch/arm/mach-exynos/platsmp.o] Error 1

Signed-off-by: Arnd Bergmann 
Fixes: 17342534e1d932 ("ARM: EXYNOS: Move code from hotplug.c to platsmp.c")

diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
index 4e49d4efb264..64324bf5edb4 100644
--- a/arch/arm/mach-exynos/Makefile
+++ b/arch/arm/mach-exynos/Makefile
@@ -21,6 +21,7 @@ obj-$(CONFIG_PM_SLEEP)+= suspend.o
 obj-$(CONFIG_PM_GENERIC_DOMAINS) += pm_domains.o
 
 obj-$(CONFIG_SMP)  += platsmp.o headsmp.o
+CFLAGS_platsmp.o   := -march=armv7-a
 
 plus_sec := $(call as-instr,.arch_extension sec,+sec)
 AFLAGS_exynos-smc.o:=-Wa,-march=armv7-a$(plus_sec)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Stupid NVIDIA 3D vgaarb.c patch

2014-09-26 Thread Ilia Mirkin
On Fri, Sep 26, 2014 at 5:08 PM, Aaron Plattner  wrote:
> On 09/23/2014 01:29 PM, Benjamin Herrenschmidt wrote:
>>
>> On Mon, 2014-09-22 at 13:43 -0700, Linus Torvalds wrote:
>>>
>>> Adding proper people and mailing lists..
>>>
>>> The PCI_CLASS_DISPLAY_VGA test goes back to the very beginning by
>>> BenH, and I have no idea if adding PCI_CLASS_DISPLAY_3D is
>>> appropriate, but hopefully somebody does. The fact that it makes
>>> things work certainly argues fairly convincingly for it, but I want
>>> some backup here.
>>>
>>> Dave, BenH?
>>>
>>> Christopher(?), can you please also specify which laptop etc. And the
>>> patch itself seems to have come from somebody else, unless you're
>>> Lekensteyn. So we'd want to get the provenance of that too.
>>
>>
>> Hrm, that sucks. "3D" could mean anything really, we might need an
>> explicit vendor ID check as well and maybe even device ID ...
>
>
> If my understanding is correct, the board designers explicitly mark them as
> "3D controller" when they don't have any outputs connected, specifically so
> the SBIOS won't choose them as the boot VGA device. Depending on the design,
> some GPUs on these 3D controller boards don't have a display engine at all,
> while others still have it in the silicon but leave it disabled at runtime.
> In either case, VGA should not be routed to them and I don't think they
> should need to participate in VGA arbitration.

Without commenting on the vga aspects of this discussion (about which
I know next to nothing), I'm moderately sure the nouveau team has seen
optimus setups where the secondary GPU reports itself as a 3d
controller, but has a working display engine, and outputs connected to
it. And I believe the inverse has also happened (a device that reports
itself as VGA but has no outputs advertised and a potentially-absent
display engine).

  -ilia
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   3   4   5   6   7   8   9   10   >