date:20180511

[PATCH] pstore: Convert internal records to timespec64

2018-05-11 Thread Kees Cook

This prepares pstore for converting the VFS layer to timespec64.

Cc: Deepa Dinamani 
Signed-off-by: Kees Cook 
---
 fs/pstore/inode.c  | 3 ++-
 fs/pstore/platform.c   | 2 +-
 include/linux/pstore.h | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 5fcb845b9fec..75afe5eb0574 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -392,7 +392,8 @@ int pstore_mkfile(struct dentry *root, struct pstore_record *record)
inode->i_private = private;
 
if (record->time.tv_sec)
-   inode->i_mtime = inode->i_ctime = record->time;
+   inode->i_mtime = inode->i_ctime =
+   timespec64_to_timespec(record->time);
 
d_add(dentry, inode);
 
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index dc720573fd53..c238ab8ba31d 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -328,7 +328,7 @@ void pstore_record_init(struct pstore_record *record,
record->psi = psinfo;
 
/* Report zeroed timestamp if called before timekeeping has resumed. */
-   record->time = ns_to_timespec(ktime_get_real_fast_ns());
+   record->time = ns_to_timespec64(ktime_get_real_fast_ns());
 }
 
 /*
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index 61f806a7fe29..a15bc4d48752 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -71,7 +71,7 @@ struct pstore_record {
struct pstore_info  *psi;
enum pstore_type_id type;
u64 id;
-   struct timespec time;
+   struct timespec64   time;
char*buf;
ssize_t size;
ssize_t ecc_notice_size;
-- 
2.17.0


-- 
Kees Cook
Pixel Security

Re: [PATCH 6/6] vfs: change inode times to use struct timespec64

2018-05-11 Thread Kees Cook

On Fri, May 11, 2018 at 9:59 PM, Deepa Dinamani  wrote:
> diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
> index 5fcb845b9fec..fb681d302bb3 100644
> --- a/fs/pstore/inode.c
> +++ b/fs/pstore/inode.c
> @@ -392,7 +392,7 @@ int pstore_mkfile(struct dentry *root, struct pstore_record *record)
> inode->i_private = private;
>
> if (record->time.tv_sec)
> -   inode->i_mtime = inode->i_ctime = record->time;
> +   inode->i_mtime = inode->i_ctime = 
> timespec_to_timespec64(record->time);
>
> d_add(dentry, inode);

I'm fine with just converting pstore internally to timespec64 right now. Is
it correct to say that I should use timespec64_to_timespec() here
until this flag day patch lands? And I'd need to do this as well, yes?

fs/pstore/platform.c: record->time =
ns_to_timespec64(ktime_get_real_fast_ns());
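
A minimal user-space sketch of the two conversions being discussed, under stated assumptions: the struct and function names (model_timespec64, model_timespec32, model_ns_to_timespec64, model_timespec64_to_timespec32) are hypothetical stand-ins, not the kernel definitions, and the explicit range check is an addition for illustration only. It shows why narrowing a 64-bit seconds value needs care until every consumer is converted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-ins for a 64-bit and a 32-bit seconds timestamp. */
struct model_timespec64 { int64_t tv_sec; long tv_nsec; };
struct model_timespec32 { int32_t tv_sec; long tv_nsec; };

/* Split a nanosecond count into seconds plus a non-negative remainder. */
static struct model_timespec64 model_ns_to_timespec64(int64_t ns)
{
	struct model_timespec64 ts;
	int64_t rem = ns % MODEL_NSEC_PER_SEC;

	ts.tv_sec = ns / MODEL_NSEC_PER_SEC;
	if (rem < 0) {
		/* C division truncates toward zero; normalize the remainder. */
		ts.tv_sec--;
		rem += MODEL_NSEC_PER_SEC;
	}
	ts.tv_nsec = (long)rem;
	return ts;
}

/*
 * Narrow to 32-bit seconds. Returns false when the value does not fit,
 * which is the case a plain cast would silently truncate.
 */
static bool model_timespec64_to_timespec32(struct model_timespec64 in,
					   struct model_timespec32 *out)
{
	if (in.tv_sec > INT32_MAX || in.tv_sec < INT32_MIN)
		return false;
	out->tv_sec = (int32_t)in.tv_sec;
	out->tv_nsec = in.tv_nsec;
	return true;
}
```

In this model, a record timestamp past the 32-bit range survives inside the 64-bit struct and only fails at the narrowing step, which is the step an interim timespec64_to_timespec() call would introduce.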

Thanks!

-Kees

-- 
Kees Cook
Pixel Security

[tip:sched/urgent] Revert "sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine()"

2018-05-11 Thread tip-bot for Mel Gorman

Commit-ID:  789ba28013ce23dbf5e9f5f014f4233b35523bf3
Gitweb: https://git.kernel.org/tip/789ba28013ce23dbf5e9f5f014f4233b35523bf3
Author: Mel Gorman 
AuthorDate: Wed, 9 May 2018 17:31:15 +0100
Committer:  Ingo Molnar 
CommitDate: Sat, 12 May 2018 08:37:56 +0200

Revert "sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine()"

This reverts commit 7347fc87dfe6b7315e74310ee1243dc222c68086.

Srikar Dronamraju pointed out that while the commit in question did show
a performance improvement on ppc64, it did so at the cost of disabling
active CPU migration by automatic NUMA balancing, which was not the intent.
The issue was a serious flaw in the logic: active balancing never happened
if SD_WAKE_AFFINE was disabled on scheduler domains. Even when it's enabled,
the logic is still bizarre and against the original intent.

Investigation showed that fixing the patch the way he suggested,
using the correct comparison for jiffies values, or introducing a new
numa_migrate_deferred variable in task_struct all perform similarly to a
revert, with a mix of gains and losses depending on the workload, machine
and socket count.
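
For context on "the correct comparison for jiffies values": a free-running tick counter wraps, so a plain `>` gives the wrong answer near the wrap point. The sketch below is a user-space model of the usual wraparound-safe idiom; model_time_after is a hypothetical name, and the cast assumes the common two's-complement behavior of gcc and clang.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Returns true if tick value a is later than tick value b, treating the
 * counter as a ring. The unsigned subtraction wraps, and the sign of the
 * result (viewed as a signed long) tells which one is ahead.
 */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}
```

With b just below the wrap point and a ten ticks later, `a > b` is false even though a is the later time, while model_time_after(a, b) is true.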

The original intent of the commit was to handle a problem whereby
wake_affine, idle balancing and automatic NUMA balancing disagree on the
appropriate placement for a task. This was particularly true for cases where
a single task was a massive waker of tasks but where wake_wide logic did
not apply. It was most noticeable when a futex (a barrier) woke
all worker threads and tried pulling the wakees to the waker nodes. In that
specific case, it could be handled by tuning MPI or OpenMP appropriately,
but the behavior is not illogical and was worth attempting to fix. However,
the approach was wrong. Given that we're at rc4 and a fix is not obvious,
it's better to play safe, revert this commit and retry later.

Signed-off-by: Mel Gorman 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Srikar Dronamraju 
Cc: Linus Torvalds 
Cc: Thomas Gleixner 
Cc: efa...@gmx.de
Cc: ggherdov...@suse.cz
Cc: h...@zytor.com
Cc: m...@codeblueprint.co.uk
Cc: m...@ellerman.id.au
Link: http://lkml.kernel.org/r/20180509163115.6fnnyeg4vdm2c...@techsingularity.net
Signed-off-by: Ingo Molnar 
---
 kernel/sched/fair.c | 57 +
 1 file changed, 1 insertion(+), 56 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 54dc31e7ab9b..f43627c6bb3d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1854,7 +1854,6 @@ static int task_numa_migrate(struct task_struct *p)
 static void numa_migrate_preferred(struct task_struct *p)
 {
unsigned long interval = HZ;
-   unsigned long numa_migrate_retry;
 
/* This task has no NUMA fault statistics yet */
if (unlikely(p->numa_preferred_nid == -1 || !p->numa_faults))
@@ -1862,18 +1861,7 @@ static void numa_migrate_preferred(struct task_struct *p)
 
/* Periodically retry migrating the task to the preferred node */
interval = min(interval, msecs_to_jiffies(p->numa_scan_period) / 16);
-   numa_migrate_retry = jiffies + interval;
-
-   /*
-* Check that the new retry threshold is after the current one. If
-* the retry is in the future, it implies that wake_affine has
-* temporarily asked NUMA balancing to backoff from placement.
-*/
-   if (numa_migrate_retry > p->numa_migrate_retry)
-   return;
-
-   /* Safe to try placing the task on the preferred node */
-   p->numa_migrate_retry = numa_migrate_retry;
+   p->numa_migrate_retry = jiffies + interval;
 
/* Success if task is already running on preferred CPU */
if (task_node(p) == p->numa_preferred_nid)
@@ -5922,48 +5910,6 @@ wake_affine_weight(struct sched_domain *sd, struct task_struct *p,
return this_eff_load < prev_eff_load ? this_cpu : nr_cpumask_bits;
 }
 
-#ifdef CONFIG_NUMA_BALANCING
-static void
-update_wa_numa_placement(struct task_struct *p, int prev_cpu, int target)
-{
-   unsigned long interval;
-
-   if (!static_branch_likely(&sched_numa_balancing))
-   return;
-
-   /* If balancing has no preference then continue gathering data */
-   if (p->numa_preferred_nid == -1)
-   return;
-
-   /*
-* If the wakeup is not affecting locality then it is neutral from
-* the perspective of NUMA balacing so continue gathering data.
-*/
-   if (cpu_to_node(prev_cpu) == cpu_to_node(target))
-   return;
-
-   /*
-* Temporarily prevent NUMA balancing trying to place waker/wakee after
-* wakee has been moved by wake_affine. This will potentially allow
-* related tasks to converge and update their data placement. The
-* 4 * numa_scan_period is to allow the two-pass filter to migrate
-* hot data to the wakers node.
-*/
-   interval = max(sysctl_numa_balancing

[PATCH v4] mm: Change return type to vm_fault_t

2018-05-11 Thread Souptick Joarder

Use new return type vm_fault_t for fault handler
in struct vm_operations_struct. For now, this is
just documenting that the function returns a
VM_FAULT value rather than an errno.  Once all
instances are converted, vm_fault_t will become
a distinct type.
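
To illustrate the distinction between a VM_FAULT value and an errno, here is a small user-space sketch. Everything in it is hypothetical: model_vm_fault_t, the MODEL_VM_FAULT_* constants and their values, and model_errno_to_fault are made up for illustration and are not the kernel definitions. The point is only that a fault handler returns a code from its own namespace, so a negative errno must be translated rather than passed through.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical fault-code type, separate from plain int error returns. */
typedef unsigned int model_vm_fault_t;

#define MODEL_VM_FAULT_OOM    0x0001u
#define MODEL_VM_FAULT_SIGBUS 0x0002u
#define MODEL_VM_FAULT_NOPAGE 0x0100u

/*
 * Translate an errno-style result (0 or negative) into a fault code.
 * Returning err directly would mix two unrelated value spaces.
 */
static model_vm_fault_t model_errno_to_fault(int err)
{
	if (err == 0)
		return MODEL_VM_FAULT_NOPAGE;
	if (err == -ENOMEM)
		return MODEL_VM_FAULT_OOM;
	return MODEL_VM_FAULT_SIGBUS;
}
```

Giving the return type its own name lets a later change make it a distinct type, at which point a handler that returns an errno by mistake can be flagged by tooling.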

commit 1c8f422059ae ("mm: change return type to
vm_fault_t")

Signed-off-by: Souptick Joarder 
Reviewed-by: Matthew Wilcox 
---
 include/linux/mm_types.h | 6 +++---
 mm/hugetlb.c | 2 +-
 mm/mmap.c| 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 2161234..cde40e6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -627,9 +627,9 @@ struct vm_special_mapping {
 * If non-NULL, then this is called to resolve page faults
 * on the special mapping.  If used, .pages is not checked.
 */
-   int (*fault)(const struct vm_special_mapping *sm,
-struct vm_area_struct *vma,
-struct vm_fault *vmf);
+   vm_fault_t (*fault)(const struct vm_special_mapping *sm,
+   struct vm_area_struct *vma,
+   struct vm_fault *vmf);
 
int (*mremap)(const struct vm_special_mapping *sm,
 struct vm_area_struct *new_vma);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2186791..7e00bd3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3159,7 +3159,7 @@ static unsigned long hugetlb_vm_op_pagesize(struct vm_area_struct *vma)
  * hugegpage VMA.  do_page_fault() is supposed to trap this, so BUG is we get
  * this far.
  */
-static int hugetlb_vm_op_fault(struct vm_fault *vmf)
+static vm_fault_t hugetlb_vm_op_fault(struct vm_fault *vmf)
 {
BUG();
return 0;
diff --git a/mm/mmap.c b/mm/mmap.c
index 188f195..bdd4ba9a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3228,7 +3228,7 @@ void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, long npages)
mm->data_vm += npages;
 }
 
-static int special_mapping_fault(struct vm_fault *vmf);
+static vm_fault_t special_mapping_fault(struct vm_fault *vmf);
 
 /*
  * Having a close hook prevents vma merging regardless of flags.
@@ -3267,7 +3267,7 @@ static int special_mapping_mremap(struct vm_area_struct *new_vma)
.fault = special_mapping_fault,
 };
 
-static int special_mapping_fault(struct vm_fault *vmf)
+static vm_fault_t special_mapping_fault(struct vm_fault *vmf)
 {
struct vm_area_struct *vma = vmf->vma;
pgoff_t pgoff;
-- 
1.9.1

[PATCH] staging: lustre: Fix an error handling path in 'client_common_fill_super()'

2018-05-11 Thread Christophe JAILLET

According to the error handling paths before and after this one, we should go
to 'out_md_fid' here, instead of 'out_md', if 'obd_connect()' fails.

Signed-off-by: Christophe JAILLET 
---
The last goto 'out_lock_cn_cb' looks spurious but is correct.
In case of error, 'd_make_root()' performs an 'iput()', so skipping it in
the error handling path looks fine to me.
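
A minimal sketch of the goto-unwinding pattern this fix relies on. The label names out_md and out_md_fid come from the patch; everything else (model_fill_super, the MODEL_UNDO_* flags, the fail_at parameter and the assumed order of setup steps) is hypothetical. Each label undoes one step and falls through to the labels for earlier steps, so a failure must jump to the label matching the last step that succeeded.

```c
#include <assert.h>

#define MODEL_UNDO_MD  0x1u /* hypothetical: disconnect the md export */
#define MODEL_UNDO_FID 0x2u /* hypothetical: tear down the md fid state */

/*
 * fail_at selects the failing step: 0 = none, 1 = md connect,
 * 2 = md fid init, 3 = dt connect. *undone records which cleanups ran.
 */
static int model_fill_super(int fail_at, unsigned int *undone)
{
	*undone = 0;

	if (fail_at == 1)
		goto out;		/* nothing set up yet */
	if (fail_at == 2)
		goto out_md;		/* only the md connection exists */
	if (fail_at == 3)
		goto out_md_fid;	/* md connection and fid state exist */
	return 0;

out_md_fid:
	*undone |= MODEL_UNDO_FID;
out_md:
	*undone |= MODEL_UNDO_MD;
out:
	return -1;
}
```

Jumping to out_md from the third step would skip the fid cleanup in this model, which is the kind of leak the patch description points at.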
---
 drivers/staging/lustre/lustre/llite/llite_lib.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 60dbe888e336..83eb2da2c9ad 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -400,11 +400,11 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt)
LCONSOLE_ERROR_MSG(0x150,
   "An OST (dt %s) is performing recovery, of 
which this client is not a part.  Please wait for recovery to complete, abort, 
or time out.\n",
   dt);
-   goto out_md;
+   goto out_md_fid;
} else if (err) {
CERROR("%s: Cannot connect to %s: rc = %d\n",
   sbi->ll_dt_exp->exp_obd->obd_name, dt, err);
-   goto out_md;
+   goto out_md_fid;
}
 
sbi->ll_dt_exp->exp_connect_data = *data;
-- 
2.17.0

Re: [PATCH] rcu: Report a quiescent state when it's exactly in the state

2018-05-11 Thread Joel Fernandes

On Fri, May 11, 2018 at 10:08:24PM -0700, Paul E. McKenney wrote:
> On Fri, May 11, 2018 at 03:41:38PM -0700, Joel Fernandes wrote:
> > On Fri, May 11, 2018 at 09:17:46AM -0700, Paul E. McKenney wrote:
> > > On Fri, May 11, 2018 at 09:57:54PM +0900, Byungchul Park wrote:
> > > > Hello folks,
> > > > 
> > > > I think I wrote the title in a misleading way.
> > > > 
> > > > Please change the title to something else such as,
> > > > "rcu: Report a quiescent state when it's in the state" or,
> > > > "rcu: Add points reporting quiescent states where proper" or so on.
> > > > 
> > > > On 2018-05-11 오후 5:30, Byungchul Park wrote:
> > > > >We expect a quiescent state of TASKS_RCU when cond_resched_tasks_rcu_qs()
> > > > >is called, no matter whether it actually be scheduled or not. However,
> > > > >it currently doesn't report the quiescent state when the task enters
> > > > >into __schedule() as it's called with preempt = true. So make it report
> > > > >the quiescent state unconditionally when cond_resched_tasks_rcu_qs() is
> > > > >called.
> > > > >
> > > > >And in TINY_RCU, even though the quiescent state of rcu_bh also should
> > > > >be reported when the tick interrupt comes from user, it doesn't. So make
> > > > >it reported.
> > > > >
> > > > >Lastly in TREE_RCU, rcu_note_voluntary_context_switch() should be
> > > > >reported when the tick interrupt comes from not only user but also idle,
> > > > >as an extended quiescent state.
> > > > >
> > > > >Signed-off-by: Byungchul Park 
> > > > >---
> > > > >  include/linux/rcupdate.h | 4 ++--
> > > > >  kernel/rcu/tiny.c| 6 +++---
> > > > >  kernel/rcu/tree.c| 4 ++--
> > > > >  3 files changed, 7 insertions(+), 7 deletions(-)
> > > > >
> > > > >diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > > >index ee8cf5fc..7432261 100644
> > > > >--- a/include/linux/rcupdate.h
> > > > >+++ b/include/linux/rcupdate.h
> > > > >@@ -195,8 +195,8 @@ static inline void exit_tasks_rcu_finish(void) { }
> > > > >   */
> > > > >  #define cond_resched_tasks_rcu_qs() \
> > > > >  do { \
> > > > >-  if (!cond_resched()) \
> > > > >-  rcu_note_voluntary_context_switch_lite(current); \
> > > > >+  rcu_note_voluntary_context_switch_lite(current); \
> > > > >+  cond_resched(); \
> > > 
> > > Ah, good point.
> > > 
> > > Peter, I have to ask...  Why is "cond_resched()" considered a preemption
> > > while "schedule()" is not?
> > 
> > In fact, something interesting I inferred from the __schedule loop related to
> > your question:
> > 
> > switch_count can either be set to prev->nivcsw or prev->nvcsw. If we can
> > assume that switch_count reflects whether the context switch is involuntary
> > or voluntary,
> > 
> > task-running-state  preempt  switch_count
> > 0 (running)         1        involuntary
> > 0                   0        involuntary
> > 1                   0        voluntary
> > 1                   1        involuntary
> > 
> > According to the above table, both the task's running state and the preempt
> > parameter to __schedule should be used together to determine if the switch is
> > a voluntary one or not.
> > 
> > So this code in rcu_note_context_switch should really be:
> > if (!preempt && !(current->state & TASK_RUNNING))

I should have written here: !preempt && current->state
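
As a rough user-space sketch of that condition (not the kernel code; model_switch_is_voluntary and its parameters are made-up names), the four rows of the table reduce to a single expression:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * task_state is 0 while the task is running and nonzero once it has
 * marked itself as sleeping; preempt mirrors the __schedule() argument.
 * Only a non-preempt call from a non-running task counts as voluntary.
 */
static bool model_switch_is_voluntary(long task_state, bool preempt)
{
	return !preempt && task_state != 0;
}
```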

> > rcu_note_voluntary_context_switch_lite(current);
> > 
> > According to the above table, cond_resched always classifies as an
> > involuntary switch which makes sense to me. Even though cond_resched is
> > explicitly called, it's still sort of involuntary in the sense it's not called
> > into the scheduler for sleeping, but rather for seeing if something else can
> > run instead (a preemption point). In fact, none of the task deactivation in the
> > __schedule loop will run if cond_resched is used.
> > 
> > I agree that if schedule was called directly but with TASK_RUNNING=1, then
> > that could probably be classified an involuntary switch too...
> > 
> > Also since we're deciding to call rcu_note_voluntary_context_switch_lite
> > unconditionally, then IMO this comment on that macro:
> > 
> > /*
> >  * Note a voluntary context switch for RCU-tasks benefit.  This is a
> >  * macro rather than an inline function to avoid #include hell.
> >  */
> >  #ifdef CONFIG_TASKS_RCU
> >  #define rcu_note_voluntary_context_switch_lite(t)
> > 
> > Should be changed to:
> > 
> > /*
> >  * Note an attempt to perform a voluntary context switch for RCU-tasks
> >  * benefit.  This is called even in situations where a context switch
> >  * didn't really happen even though it was requested. This is a
> >  * macro rather than an inline function to avoid #include hell.
> >  */
> >  #ifdef CONFIG_TASKS_RCU
> >  #define rcu_note_voluntary_context_switch_lite(t)
> > 
> > Right?
> > 
> > Correct me if I'm wrong about anything, thanks,
> 
> The starting point for me is that Ta

Re: [PATCH v3] mm: Change return type to vm_fault_t

2018-05-11 Thread Souptick Joarder

On Sat, May 12, 2018 at 11:55 AM, Souptick Joarder  wrote:
> On Sat, May 12, 2018 at 11:50 AM, Joe Perches  wrote:
>> On Sat, 2018-05-12 at 11:47 +0530, Souptick Joarder wrote:
>>> Use new return type vm_fault_t for fault handler
>>> in struct vm_operations_struct. For now, this is
>>> just documenting that the function returns a
>>> VM_FAULT value rather than an errno.  Once all
>>> instances are converted, vm_fault_t will become
>>> a distinct type.
>>
>> trivia:
>>
>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>> []
>>> @@ -627,7 +627,7 @@ struct vm_special_mapping {
>>>* If non-NULL, then this is called to resolve page faults
>>>* on the special mapping.  If used, .pages is not checked.
>>>*/
>>> - int (*fault)(const struct vm_special_mapping *sm,
>>> + vm_fault_t (*fault)(const struct vm_special_mapping *sm,
>>>struct vm_area_struct *vma,
>>>struct vm_fault *vmf);
>>
>>
>> It'd be nicer to realign the 2nd and 3rd arguments
>> on the subsequent lines.
>>
>> vm_fault_t (*fault)(const struct vm_special_mapping *sm,
>> struct vm_area_struct *vma,
>> struct vm_fault *vmf);
>>
>
> I just posted v3. Do you want me to send a v4 with the
> realignment?

Sorry, please ignore this mail.

Re: [PATCH v3] mm: Change return type to vm_fault_t

2018-05-11 Thread Souptick Joarder

On Sat, May 12, 2018 at 11:50 AM, Joe Perches  wrote:
> On Sat, 2018-05-12 at 11:47 +0530, Souptick Joarder wrote:
>> Use new return type vm_fault_t for fault handler
>> in struct vm_operations_struct. For now, this is
>> just documenting that the function returns a
>> VM_FAULT value rather than an errno.  Once all
>> instances are converted, vm_fault_t will become
>> a distinct type.
>
> trivia:
>
>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> []
>> @@ -627,7 +627,7 @@ struct vm_special_mapping {
>>* If non-NULL, then this is called to resolve page faults
>>* on the special mapping.  If used, .pages is not checked.
>>*/
>> - int (*fault)(const struct vm_special_mapping *sm,
>> + vm_fault_t (*fault)(const struct vm_special_mapping *sm,
>>struct vm_area_struct *vma,
>>struct vm_fault *vmf);
>
>
> It'd be nicer to realign the 2nd and 3rd arguments
> on the subsequent lines.
>
> vm_fault_t (*fault)(const struct vm_special_mapping *sm,
> struct vm_area_struct *vma,
> struct vm_fault *vmf);
>

I just posted v3. Do you want me to send a v4 with the
realignment?

Re: [PATCH v3] mm: Change return type to vm_fault_t

2018-05-11 Thread Joe Perches

On Sat, 2018-05-12 at 11:47 +0530, Souptick Joarder wrote:
> Use new return type vm_fault_t for fault handler
> in struct vm_operations_struct. For now, this is
> just documenting that the function returns a
> VM_FAULT value rather than an errno.  Once all
> instances are converted, vm_fault_t will become
> a distinct type.

trivia:

> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
[]
> @@ -627,7 +627,7 @@ struct vm_special_mapping {
>* If non-NULL, then this is called to resolve page faults
>* on the special mapping.  If used, .pages is not checked.
>*/
> - int (*fault)(const struct vm_special_mapping *sm,
> + vm_fault_t (*fault)(const struct vm_special_mapping *sm,
>struct vm_area_struct *vma,
>struct vm_fault *vmf);


It'd be nicer to realign the 2nd and 3rd arguments
on the subsequent lines.

vm_fault_t (*fault)(const struct vm_special_mapping *sm,
struct vm_area_struct *vma,
struct vm_fault *vmf);

[PATCH v3] mm: Change return type to vm_fault_t

2018-05-11 Thread Souptick Joarder

Use new return type vm_fault_t for fault handler
in struct vm_operations_struct. For now, this is
just documenting that the function returns a
VM_FAULT value rather than an errno.  Once all
instances are converted, vm_fault_t will become
a distinct type.

commit 1c8f422059ae ("mm: change return type to
vm_fault_t")

Signed-off-by: Souptick Joarder 
Reviewed-by: Matthew Wilcox 
---
v2: updated the change log

v3: added  changes
into the same patch

 include/linux/mm_types.h | 2 +-
 mm/hugetlb.c | 2 +-
 mm/mmap.c| 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 2161234..11acfdb 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -627,7 +627,7 @@ struct vm_special_mapping {
 * If non-NULL, then this is called to resolve page faults
 * on the special mapping.  If used, .pages is not checked.
 */
-   int (*fault)(const struct vm_special_mapping *sm,
+   vm_fault_t (*fault)(const struct vm_special_mapping *sm,
 struct vm_area_struct *vma,
 struct vm_fault *vmf);
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2186791..7e00bd3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3159,7 +3159,7 @@ static unsigned long hugetlb_vm_op_pagesize(struct vm_area_struct *vma)
  * hugegpage VMA.  do_page_fault() is supposed to trap this, so BUG is we get
  * this far.
  */
-static int hugetlb_vm_op_fault(struct vm_fault *vmf)
+static vm_fault_t hugetlb_vm_op_fault(struct vm_fault *vmf)
 {
BUG();
return 0;
diff --git a/mm/mmap.c b/mm/mmap.c
index 188f195..bdd4ba9a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3228,7 +3228,7 @@ void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, long npages)
mm->data_vm += npages;
 }
 
-static int special_mapping_fault(struct vm_fault *vmf);
+static vm_fault_t special_mapping_fault(struct vm_fault *vmf);
 
 /*
  * Having a close hook prevents vma merging regardless of flags.
@@ -3267,7 +3267,7 @@ static int special_mapping_mremap(struct vm_area_struct *new_vma)
.fault = special_mapping_fault,
 };
 
-static int special_mapping_fault(struct vm_fault *vmf)
+static vm_fault_t special_mapping_fault(struct vm_fault *vmf)
 {
struct vm_area_struct *vma = vmf->vma;
pgoff_t pgoff;
-- 
1.9.1

Re: [tip/core/rcu,16/21] rcu: Add funnel locking to rcu_start_this_gp()

2018-05-11 Thread Joel Fernandes

On Sun, Apr 22, 2018 at 08:03:39PM -0700, Paul E. McKenney wrote:
> The rcu_start_this_gp() function had a simple form of funnel locking that
> used only the leaves and root of the rcu_node tree, which is fine for
> systems with only a few hundred CPUs, but sub-optimal for systems having
> thousands of CPUs.  This commit therefore adds full-tree funnel locking.
> 
> This variant of funnel locking is unusual in the following ways:
> 
> 1.  The leaf-level rcu_node structure's ->lock is held throughout.
>     Other funnel-locking implementations drop the leaf-level lock
>     before progressing to the next level of the tree.
> 
> 2.  Funnel locking can be started at the root, which is convenient
>     for code that already holds the root rcu_node structure's ->lock.
>     Other funnel-locking implementations start at the leaves.
> 
> 3.  If an rcu_node structure other than the initial one believes
>     that a grace period is in progress, it is not necessary to
>     go further up the tree.  This is because grace-period cleanup
>     scans the full tree, so that marking the need for a subsequent
>     grace period anywhere in the tree suffices -- but only if
>     a grace period is currently in progress.
> 
> 4.  It is possible that the RCU grace-period kthread has not yet
>     started, and this case must be handled appropriately.
> 
> However, the general approach of using a tree to control lock contention
> is still in place.
> 
> Signed-off-by: Paul E. McKenney 
> ---
>  kernel/rcu/tree.c | 92 
> +--
>  1 file changed, 35 insertions(+), 57 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 94519c7d552f..d3c769502929 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1682,74 +1682,52 @@ static bool rcu_start_this_gp(struct rcu_node *rnp, struct rcu_data *rdp,
>  {
>   bool ret = false;
>   struct rcu_state *rsp = rdp->rsp;
> - struct rcu_node *rnp_root = rcu_get_root(rsp);
> -
> - raw_lockdep_assert_held_rcu_node(rnp);
> -
> - /* If the specified GP is already known needed, return to caller. */
> - trace_rcu_this_gp(rnp, rdp, c, TPS("Startleaf"));
> - if (need_future_gp_element(rnp, c)) {
> - trace_rcu_this_gp(rnp, rdp, c, TPS("Prestartleaf"));
> - goto out;
> - }
> + struct rcu_node *rnp_root;
>  
>   /*
> -  * If this rcu_node structure believes that a grace period is in
> -  * progress, then we must wait for the one following, which is in
> -  * "c".  Because our request will be noticed at the end of the
> -  * current grace period, we don't need to explicitly start one.
> +  * Use funnel locking to either acquire the root rcu_node
> +  * structure's lock or bail out if the need for this grace period
> +  * has already been recorded -- or has already started.  If there
> +  * is already a grace period in progress in a non-leaf node, no
> +  * recording is needed because the end of the grace period will
> +  * scan the leaf rcu_node structures.  Note that rnp->lock must
> +  * not be released.
>*/
> - if (rnp->gpnum != rnp->completed) {
> - need_future_gp_element(rnp, c) = true;
> - trace_rcu_this_gp(rnp, rdp, c, TPS("Startedleaf"));
> - goto out;

I'll refer to the removed lines above as [1], since I want to come back to
them later in this message.

> + raw_lockdep_assert_held_rcu_node(rnp);
> + trace_rcu_this_gp(rnp, rdp, c, TPS("Startleaf"));
> + for (rnp_root = rnp; 1; rnp_root = rnp_root->parent) {
> + if (rnp_root != rnp)
> + raw_spin_lock_rcu_node(rnp_root);
> + if (need_future_gp_element(rnp_root, c) ||
> + ULONG_CMP_GE(rnp_root->gpnum, c) ||
> + (rnp != rnp_root &&
> +  rnp_root->gpnum != rnp_root->completed)) {
> + trace_rcu_this_gp(rnp_root, rdp, c, TPS("Prestarted"));
> + goto unlock_out;

I was a bit confused about the implementation of the above for loop:

In the previous code (which I refer to in the negative diff [1]), we were
checking the leaf, and if the leaf believed that RCU was not idle, then we
were marking the need for the future GP and quitting this function. In the
new code, it seems like even if the leaf believes RCU is not-idle, we still
go all the way up the tree.

I think the big change is that, in the new for loop above, we either bail out
if a future GP need was already marked by an intermediate node, or we go
marking up the whole tree about the need for one.

If a leaf believes RCU is not idle, can we not just mark the future GP need
like before and return? It seems we would otherwise increase the lock
contention since now we lock intermediate nodes and then finally even the
root, whereas before we were not doing that if the leaf believed RCU was not
idle.
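
To make the walk being discussed concrete, here is a simplified user-space model of a leaf-to-root funnel. It is only a sketch under stated assumptions: struct model_node, model_funnel and the visits counter (standing in for taking a node's lock) are hypothetical, the grace-period number comparison from the quoted condition is left out, and the part after the bail-out check, which the quoted hunk does not show, is assumed to record the request and move to the parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_node {
	struct model_node *parent;	/* NULL at the root */
	bool need_gp;			/* future GP request recorded here */
	bool gp_in_progress;		/* node believes a GP is running */
	int visits;			/* stands in for lock acquisitions */
};

/*
 * Walk from the starting node toward the root. Returns true if the
 * caller reached the root and would have to start the grace period,
 * false if an earlier request or an in-progress GP made that unnecessary.
 */
static bool model_funnel(struct model_node *start)
{
	struct model_node *n;

	for (n = start; ; n = n->parent) {
		n->visits++;
		/* The in-progress check applies only above the start node. */
		if (n->need_gp || (n != start && n->gp_in_progress))
			return false;
		n->need_gp = true;
		if (!n->parent)
			return true;
	}
}
```

In this model a starting leaf that believes a grace period is in progress still walks upward, which is the extra locking the question above is about; a second requester sharing an interior node stops there without touching the root.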

I am sorry

Re: [PATCH] mtd: nand: Add support for reading ooblayout from device tree

2018-05-11 Thread Boris Brezillon

Hi Paul,

On Fri, 11 May 2018 23:29:12 +0200
Paul Cercueil  wrote:

> By specifying the properties "mtd-oob-ecc" and "mtd-oob-free", it is
> now possible to specify from devicetree where the ECC data is located
> inside the OOB region.

Why would we want to do that? I mean, ECC/free regions are ECC
controller dependent (and NAND chip dependent for the OOB size part),
so there's no reason to describe it in the DT. And more importantly,
people are likely to get it wrong.

I'm curious, why do you need that?

Regards,

Boris

> 
> Signed-off-by: Paul Cercueil 
> ---
>  Documentation/devicetree/bindings/mtd/nand.txt |  7 +
>  drivers/mtd/nand/raw/nand_base.c   | 42 
> ++
>  2 files changed, 49 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/mtd/nand.txt 
> b/Documentation/devicetree/bindings/mtd/nand.txt
> index 8bb11d809429..118ea92787cb 100644
> --- a/Documentation/devicetree/bindings/mtd/nand.txt
> +++ b/Documentation/devicetree/bindings/mtd/nand.txt
> @@ -45,6 +45,13 @@ Optional NAND chip properties:
>as reliable as possible.
>  - nand-rb: shall contain the native Ready/Busy ids.
>  
> +- nand-oob-ecc:  couples of integers, specifying the offset
> +  and length of the ECC data in the OOB region. There can be 
> more
> +  than one couple.
> +- nand-oob-free:  couples of integers, specifying the offset
> +  and length of a free-to-use area in the OOB region. There 
> can be
> +  more than one couple.
> +
>  The ECC strength and ECC step size properties define the correction 
> capability
>  of a controller. Together, they say a controller can correct "{strength} bit
>  errors per {size} bytes".
> diff --git a/drivers/mtd/nand/raw/nand_base.c 
> b/drivers/mtd/nand/raw/nand_base.c
> index 72f3a89da513..c905531effb0 100644
> --- a/drivers/mtd/nand/raw/nand_base.c
> +++ b/drivers/mtd/nand/raw/nand_base.c
> @@ -213,6 +213,43 @@ static const struct mtd_ooblayout_ops 
> nand_ooblayout_lp_hamming_ops = {
>   .free = nand_ooblayout_free_lp_hamming,
>  };
>  
> +static int nand_oob_of(struct device_node *np, int section,
> +struct mtd_oob_region *oobregion, const char *prop)
> +{
> + int ret = of_property_read_u32_index(np, prop,
> + section * 2, &oobregion->offset);
> + if (ret == -EOVERFLOW)
> + return -ERANGE; /* We're done */
> + if (ret)
> + return ret;
> +
> + ret = of_property_read_u32_index(np, prop,
> + section * 2 + 1, &oobregion->length);
> + if (ret == -EOVERFLOW)
> + return -EINVAL; /* We must have an even number of integers */
> +
> + return ret;
> +}
> +
> +static int nand_ooblayout_ecc_of(struct mtd_info *mtd, int section,
> +  struct mtd_oob_region *oobregion)
> +{
> + return nand_oob_of(mtd->dev.of_node, section,
> + oobregion, "nand-oob-ecc");
> +}
> +
> +static int nand_ooblayout_free_of(struct mtd_info *mtd, int section,
> +  struct mtd_oob_region *oobregion)
> +{
> + return nand_oob_of(mtd->dev.of_node, section,
> + oobregion, "nand-oob-free");
> +}
> +
> +static const struct mtd_ooblayout_ops nand_ooblayout_of_ops = {
> + .ecc = nand_ooblayout_ecc_of,
> + .free = nand_ooblayout_free_of,
> +};
> +
>  static int check_offs_len(struct mtd_info *mtd,
>   loff_t ofs, uint64_t len)
>  {
> @@ -5843,6 +5880,11 @@ static int nand_dt_init(struct nand_chip *chip)
>   if (of_property_read_bool(dn, "nand-ecc-maximize"))
>   chip->ecc.options |= NAND_ECC_MAXIMIZE;
>  
> + if (!chip->mtd.ooblayout &&
> + of_property_read_bool(dn, "nand-oob-ecc") &&
> + of_property_read_bool(dn, "nand-oob-free"))
> + chip->mtd.ooblayout = &nand_ooblayout_of_ops;
> +
>   return 0;
>  }
>
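
For readers following the quoted nand_oob_of() helper, the pair lookup it performs can be modeled in user space as below. This is a sketch, not the driver code: model_oob_region, model_oob_lookup and the flat cells array (standing in for the device tree property) are hypothetical names. It keeps the quoted error convention, where running off the end on the first read means "no more sections" (-ERANGE) and on the second read means an odd number of cells (-EINVAL).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct model_oob_region {
	uint32_t offset;
	uint32_t length;
};

/* Fetch the section-th (offset, length) pair from a flat cell array. */
static int model_oob_lookup(const uint32_t *cells, size_t ncells,
			    int section, struct model_oob_region *region)
{
	size_t idx;

	if (section < 0)
		return -EINVAL;
	idx = (size_t)section * 2;
	if (idx >= ncells)
		return -ERANGE;	/* past the last pair: iteration is done */
	if (idx + 1 >= ncells)
		return -EINVAL;	/* offset without a length: odd cell count */

	region->offset = cells[idx];
	region->length = cells[idx + 1];
	return 0;
}
```

A caller would iterate section = 0, 1, 2, ... until -ERANGE, which makes clear how much the result depends on the property being written correctly.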

Re: [PATCH v2 03/12] arm: dts: mt7623: fix invalid memory node being generated

2018-05-11 Thread Sean Wang

On Fri, 2018-05-11 at 17:03 +0200, Matthias Brugger wrote:
> 
> On 04/11/2018 10:53 AM, sean.w...@mediatek.com wrote:
> > From: Sean Wang 
> > 
> > The two wrong nodes below in the existing DTS files would cause a boot
> > failure, since address 0 is in fact not where the memory device is
> > located.
> > 
> > memory {
> > device_type = "memory";
> > reg = <0x0 0x0 0x0 0x0>;
> > };
> > 
> > memory@8000 {
> > reg = <0x0 0x8000 0x0 0x4000>;
> > };
> > 
> > In order to avoid having a memory node starting at address 0, we can't
> > include the file skeleton64.dtsi and instead need to explicitly
> > define a few of the properties the DTS relies on, such as #address-cells
> > and #size-cells in the root node and device_type in the node memory@8000.
> > 
> > Cc: sta...@vger.kernel.org
> > Fixes: 31ac0d69a1d4 ("ARM: dts: mediatek: add MT7623 basic support")
> > Signed-off-by: Sean Wang 
> > Cc: Rob Herring 
> > ---
> >  arch/arm/boot/dts/mt7623.dtsi | 3 ++-
> >  arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts | 1 +
> >  arch/arm/boot/dts/mt7623n-rfb.dtsi| 1 +
> >  3 files changed, 4 insertions(+), 1 deletion(-)
> > 
> 
> merged. We would need this at least for mt2701 as well, correct?
> Would you mind providing a patch?
> 
> Regards,
> Matthias
> 

Thanks! I agree the same problem could happen on mt2701, so I'm
happy to come up with a patch for that.

Sean

> > diff --git a/arch/arm/boot/dts/mt7623.dtsi b/arch/arm/boot/dts/mt7623.dtsi
> > index fec4715..406a9f3 100644
> > --- a/arch/arm/boot/dts/mt7623.dtsi
> > +++ b/arch/arm/boot/dts/mt7623.dtsi
> > @@ -15,11 +15,12 @@
> >  #include 
> >  #include 
> >  #include 
> > -#include "skeleton64.dtsi"
> >  
> >  / {
> > compatible = "mediatek,mt7623";
> > interrupt-parent = <&sysirq>;
> > +   #address-cells = <2>;
> > +   #size-cells = <2>;
> >  
> > cpu_opp_table: opp-table {
> > compatible = "operating-points-v2";
> > diff --git a/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts 
> > b/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts
> > index bbf56f8..5938e4c 100644
> > --- a/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts
> > +++ b/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts
> > @@ -109,6 +109,7 @@
> > };
> >  
> > memory@8000 {
> > +   device_type = "memory";
> > reg = <0 0x8000 0 0x4000>;
> > };
> >  };
> > diff --git a/arch/arm/boot/dts/mt7623n-rfb.dtsi 
> > b/arch/arm/boot/dts/mt7623n-rfb.dtsi
> > index a199ae7..343e8ef 100644
> > --- a/arch/arm/boot/dts/mt7623n-rfb.dtsi
> > +++ b/arch/arm/boot/dts/mt7623n-rfb.dtsi
> > @@ -40,6 +40,7 @@
> > };
> >  
> > memory@8000 {
> > +   device_type = "memory";
> > reg = <0 0x8000 0 0x4000>;
> > };
> >  
> >

Re: [PATCH] rcu: Report a quiescent state when it's exactly in the state

2018-05-11 Thread Paul E. McKenney

On Fri, May 11, 2018 at 03:41:38PM -0700, Joel Fernandes wrote:
> On Fri, May 11, 2018 at 09:17:46AM -0700, Paul E. McKenney wrote:
> > On Fri, May 11, 2018 at 09:57:54PM +0900, Byungchul Park wrote:
> > > Hello folks,
> > > 
> > > I think I wrote the title in a misleading way.
> > > 
> > > Please change the title to something else such as,
> > > "rcu: Report a quiescent state when it's in the state" or,
> > > "rcu: Add points reporting quiescent states where proper" or so on.
> > > 
> > > On 2018-05-11 오후 5:30, Byungchul Park wrote:
> > > >We expect a quiescent state of TASKS_RCU when cond_resched_tasks_rcu_qs()
> > > > >is called, no matter whether it is actually scheduled or not. However,
> > > >it currently doesn't report the quiescent state when the task enters
> > > >into __schedule() as it's called with preempt = true. So make it report
> > > >the quiescent state unconditionally when cond_resched_tasks_rcu_qs() is
> > > >called.
> > > >
> > > > >And in TINY_RCU, even though the quiescent state of rcu_bh should also
> > > > >be reported when the tick interrupt comes from user, it isn't. So
> > > > >report it.
> > > >
> > > >Lastly in TREE_RCU, rcu_note_voluntary_context_switch() should be
> > > >reported when the tick interrupt comes from not only user but also idle,
> > > >as an extended quiescent state.
> > > >
> > > >Signed-off-by: Byungchul Park 
> > > >---
> > > >  include/linux/rcupdate.h | 4 ++--
> > > >  kernel/rcu/tiny.c| 6 +++---
> > > >  kernel/rcu/tree.c| 4 ++--
> > > >  3 files changed, 7 insertions(+), 7 deletions(-)
> > > >
> > > >diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > >index ee8cf5fc..7432261 100644
> > > >--- a/include/linux/rcupdate.h
> > > >+++ b/include/linux/rcupdate.h
> > > >@@ -195,8 +195,8 @@ static inline void exit_tasks_rcu_finish(void) { }
> > > >   */
> > > >  #define cond_resched_tasks_rcu_qs() \
> > > >  do { \
> > > >-if (!cond_resched()) \
> > > >-rcu_note_voluntary_context_switch_lite(current); \
> > > >+rcu_note_voluntary_context_switch_lite(current); \
> > > >+cond_resched(); \
> > 
> > Ah, good point.
> > 
> > Peter, I have to ask...  Why is "cond_resched()" considered a preemption
> > while "schedule()" is not?
> 
> In fact, here is something interesting I inferred from the __schedule loop,
> related to your question:
> 
> switch_count can either be set to prev->nivcsw or prev->nvcsw. If we can
> assume that switch_count reflects whether the context switch is involuntary
> or voluntary,
> 
> task-running-state   preempt   switch_count
> 0 (running)          1         involuntary
> 0                    0         involuntary
> 1                    0         voluntary
> 1                    1         involuntary
> 
> According to the above table, both the task's running state and the preempt
> parameter to __schedule should be used together to determine if the switch is
> a voluntary one or not.
> 
> So this code in rcu_note_context_switch should really be:
> if (!preempt && !(current->state & TASK_RUNNING))
>   rcu_note_voluntary_context_switch_lite(current);
> 
> According to the above table, cond_resched always classifies as an
> involuntary switch which makes sense to me. Even though cond_resched is
> explicitly called, it's still sort of involuntary in the sense it's not called
> into the scheduler for sleeping, but rather for seeing if something else can
> run instead (a preemption point). In fact, none of the task deactivation in the
> __schedule loop will run if cond_resched is used.
> 
> I agree that if schedule was called directly but with TASK_RUNNING=1, then
> that could probably be classified as an involuntary switch too...
> 
> Also since we're deciding to call rcu_note_voluntary_context_switch_lite
> unconditionally, then IMO this comment on that macro:
> 
> /*
>  * Note a voluntary context switch for RCU-tasks benefit.  This is a
>  * macro rather than an inline function to avoid #include hell.
>  */
>  #ifdef CONFIG_TASKS_RCU
>  #define rcu_note_voluntary_context_switch_lite(t)
> 
> Should be changed to:
> 
> /*
>  * Note an attempt to perform a voluntary context switch for RCU-tasks
>  * benefit.  This is called even in situations where a context switch
>  * didn't really happen even though it was requested. This is a
>  * macro rather than an inline function to avoid #include hell.
>  */
>  #ifdef CONFIG_TASKS_RCU
>  #define rcu_note_voluntary_context_switch_lite(t)
> 
> Right?
> 
> Correct me if I'm wrong about anything, thanks,

The starting point for me is that Tasks RCU is a special-purpose mechanism
for freeing trampolines in PREEMPT=y kernels.  The approach is to arrange
for the trampoline to be inaccessible to future execution, wait for a
tasks-RCU grace period, then free the trampoline.  So a tasks-RCU grace
period must wait until all tasks have spent at least some time outside
of a trampoline.  My understanding

[PATCH 2/6] lustre: Use long long type to print inode time

2018-05-11 Thread Deepa Dinamani

Subsequent patches in the series convert inode timestamps
to use struct timespec64 instead of struct timespec as
part of solving the y2038 problem.

Convert these print formats to use long long types to
avoid warnings and errors on conversion.

Signed-off-by: Deepa Dinamani 
CC: andreas.dil...@intel.com
---
 drivers/staging/lustre/lustre/llite/llite_lib.c | 12 +++-
 drivers/staging/lustre/lustre/lmv/lmv_obd.c |  7 ---
 drivers/staging/lustre/lustre/mdc/mdc_reint.c   |  6 +++---
 drivers/staging/lustre/lustre/obdclass/obdo.c   |  6 +++---
 4 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c 
b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 60dbe888e336..dc31966bbf3c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1482,8 +1482,9 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr 
*attr, bool hsm_import)
}
 
if (attr->ia_valid & (ATTR_MTIME | ATTR_CTIME))
-   CDEBUG(D_INODE, "setting mtime %lu, ctime %lu, now = %llu\n",
-  LTIME_S(attr->ia_mtime), LTIME_S(attr->ia_ctime),
+   CDEBUG(D_INODE, "setting mtime %llu, ctime %llu, now = %llu\n",
+  (unsigned long long)LTIME_S(attr->ia_mtime),
+  (unsigned long long)LTIME_S(attr->ia_ctime),
   (s64)ktime_get_real_seconds());
 
if (S_ISREG(inode->i_mode))
@@ -1760,9 +1761,10 @@ int ll_update_inode(struct inode *inode, struct 
lustre_md *md)
if (body->mbo_valid & OBD_MD_FLMTIME) {
if (body->mbo_mtime > LTIME_S(inode->i_mtime)) {
CDEBUG(D_INODE,
-  "setting ino %lu mtime from %lu to %llu\n",
-  inode->i_ino, LTIME_S(inode->i_mtime),
-  body->mbo_mtime);
+  "setting ino %lu mtime from %llu to %llu\n",
+  inode->i_ino,
+  (unsigned long long)LTIME_S(inode->i_mtime),
+  (unsigned long long)body->mbo_mtime);
LTIME_S(inode->i_mtime) = body->mbo_mtime;
}
lli->lli_mtime = body->mbo_mtime;
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c 
b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 7198a6384028..88e05a53716e 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -3029,11 +3029,12 @@ static int lmv_merge_attr(struct obd_export *exp,
for (i = 0; i < lsm->lsm_md_stripe_count; i++) {
struct inode *inode = lsm->lsm_md_oinfo[i].lmo_root;
 
-   CDEBUG(D_INFO, "" DFID " size %llu, blocks %llu nlink %u, atime 
%lu ctime %lu, mtime %lu.\n",
+   CDEBUG(D_INFO, "" DFID " size %llu, blocks %llu nlink %u, atime 
%llu ctime %llu, mtime %llu.\n",
   PFID(&lsm->lsm_md_oinfo[i].lmo_fid),
   i_size_read(inode), (unsigned long long)inode->i_blocks,
-  inode->i_nlink, LTIME_S(inode->i_atime),
-  LTIME_S(inode->i_ctime), LTIME_S(inode->i_mtime));
+  inode->i_nlink, (unsigned long 
long)LTIME_S(inode->i_atime),
+  (unsigned long long)LTIME_S(inode->i_ctime),
+  (unsigned long long)LTIME_S(inode->i_mtime));
 
/* for slave stripe, it needs to subtract nlink for . and .. */
if (i)
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_reint.c 
b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
index 488b98007558..f1ccf8d26ddc 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_reint.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
@@ -129,9 +129,9 @@ int mdc_setattr(struct obd_export *exp, struct md_op_data 
*op_data,
}
 
if (op_data->op_attr.ia_valid & (ATTR_MTIME | ATTR_CTIME))
-   CDEBUG(D_INODE, "setting mtime %ld, ctime %ld\n",
-  LTIME_S(op_data->op_attr.ia_mtime),
-  LTIME_S(op_data->op_attr.ia_ctime));
+   CDEBUG(D_INODE, "setting mtime %lld, ctime %lld\n",
+  (long long)LTIME_S(op_data->op_attr.ia_mtime),
+  (long long)LTIME_S(op_data->op_attr.ia_ctime));
mdc_setattr_pack(req, op_data, ea, ealen);
 
ptlrpc_request_set_replen(req);
diff --git a/drivers/staging/lustre/lustre/obdclass/obdo.c 
b/drivers/staging/lustre/lustre/obdclass/obdo.c
index c4503bc36591..8f4054aa970b 100644
--- a/drivers/staging/lustre/lustre/obdclass/obdo.c
+++ b/drivers/staging/lustre/lustre/obdclass/obdo.c
@@ -60,9 +60,9 @@ void obdo_from_inode(struct obdo *dst, struct inode *src, u32 
valid)
u32 newvalid = 0;
 
if (valid & (OBD_MD_FLCTIME | OBD_MD_FLMTIME))
-   CDEBUG(D_INODE, "valid %x, new time %lu

[PATCH 5/6] udf: Simplify calls to udf_disk_stamp_to_time

2018-05-11 Thread Deepa Dinamani

Subsequent patches in the series convert inode timestamps
to use struct timespec64 instead of struct timespec as
part of solving the y2038 problem.

commit fd3cfad374d4 ("udf: Convert udf_disk_stamp_to_time() to use mktime64()")
eliminated the NULL return condition from udf_disk_stamp_to_time().
udf_time_to_disk_time() is always called with a valid dest pointer and
the return value is ignored.
Further, the caller can just as well check the dest pointer being passed
in rather than the return value.
Make both functions return void.

This will make the inode timestamp conversion simpler.

Signed-off-by: Deepa Dinamani 
Cc: j...@suse.com
---
 fs/udf/inode.c   | 28 +++-
 fs/udf/super.c   | 16 +---
 fs/udf/udfdecl.h |  4 ++--
 fs/udf/udftime.c |  9 ++---
 4 files changed, 20 insertions(+), 37 deletions(-)

diff --git a/fs/udf/inode.c b/fs/udf/inode.c
index c80765d62f7e..df2378d6ebb4 100644
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -1443,15 +1443,9 @@ static int udf_read_inode(struct inode *inode, bool 
hidden_inode)
inode->i_blocks = le64_to_cpu(fe->logicalBlocksRecorded) <<
(inode->i_sb->s_blocksize_bits - 9);
 
-   if (!udf_disk_stamp_to_time(&inode->i_atime, fe->accessTime))
-   inode->i_atime = sbi->s_record_time;
-
-   if (!udf_disk_stamp_to_time(&inode->i_mtime,
-   fe->modificationTime))
-   inode->i_mtime = sbi->s_record_time;
-
-   if (!udf_disk_stamp_to_time(&inode->i_ctime, fe->attrTime))
-   inode->i_ctime = sbi->s_record_time;
+   udf_disk_stamp_to_time(&inode->i_atime, fe->accessTime);
+   udf_disk_stamp_to_time(&inode->i_mtime, fe->modificationTime);
+   udf_disk_stamp_to_time(&inode->i_ctime, fe->attrTime);
 
iinfo->i_unique = le64_to_cpu(fe->uniqueID);
iinfo->i_lenEAttr = le32_to_cpu(fe->lengthExtendedAttr);
@@ -1461,18 +1455,10 @@ static int udf_read_inode(struct inode *inode, bool 
hidden_inode)
inode->i_blocks = le64_to_cpu(efe->logicalBlocksRecorded) <<
(inode->i_sb->s_blocksize_bits - 9);
 
-   if (!udf_disk_stamp_to_time(&inode->i_atime, efe->accessTime))
-   inode->i_atime = sbi->s_record_time;
-
-   if (!udf_disk_stamp_to_time(&inode->i_mtime,
-   efe->modificationTime))
-   inode->i_mtime = sbi->s_record_time;
-
-   if (!udf_disk_stamp_to_time(&iinfo->i_crtime, efe->createTime))
-   iinfo->i_crtime = sbi->s_record_time;
-
-   if (!udf_disk_stamp_to_time(&inode->i_ctime, efe->attrTime))
-   inode->i_ctime = sbi->s_record_time;
+   udf_disk_stamp_to_time(&inode->i_atime, efe->accessTime);
+   udf_disk_stamp_to_time(&inode->i_mtime, efe->modificationTime);
+   udf_disk_stamp_to_time(&iinfo->i_crtime, efe->createTime);
+   udf_disk_stamp_to_time(&inode->i_ctime, efe->attrTime);
 
iinfo->i_unique = le64_to_cpu(efe->uniqueID);
iinfo->i_lenEAttr = le32_to_cpu(efe->lengthExtendedAttr);
diff --git a/fs/udf/super.c b/fs/udf/super.c
index 0d27d41f5c6e..bd0ae64bc31c 100644
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -862,6 +862,9 @@ static int udf_load_pvoldesc(struct super_block *sb, 
sector_t block)
struct buffer_head *bh;
uint16_t ident;
int ret = -ENOMEM;
+#ifdef UDFFS_DEBUG
+   struct timestamp *ts;
+#endif
 
outstr = kmalloc(128, GFP_NOFS);
if (!outstr)
@@ -880,15 +883,14 @@ static int udf_load_pvoldesc(struct super_block *sb, 
sector_t block)
 
pvoldesc = (struct primaryVolDesc *)bh->b_data;
 
-   if (udf_disk_stamp_to_time(&UDF_SB(sb)->s_record_time,
- pvoldesc->recordingDateAndTime)) {
+   udf_disk_stamp_to_time(&UDF_SB(sb)->s_record_time,
+ pvoldesc->recordingDateAndTime);
 #ifdef UDFFS_DEBUG
-   struct timestamp *ts = &pvoldesc->recordingDateAndTime;
-   udf_debug("recording time %04u/%02u/%02u %02u:%02u (%x)\n",
- le16_to_cpu(ts->year), ts->month, ts->day, ts->hour,
- ts->minute, le16_to_cpu(ts->typeAndTimezone));
+   ts = &pvoldesc->recordingDateAndTime;
+   udf_debug("recording time %04u/%02u/%02u %02u:%02u (%x)\n",
+ le16_to_cpu(ts->year), ts->month, ts->day, ts->hour,
+ ts->minute, le16_to_cpu(ts->typeAndTimezone));
 #endif
-   }
 
ret = udf_dstrCS0toChar(sb, outstr, 31, pvoldesc->volIdent, 32);
if (ret < 0)
diff --git a/fs/udf/udfdecl.h b/fs/udf/udfdecl.h
index fc8d1b3384d2..bae311b59400 100644
--- a/fs/udf/udfdecl.h
+++ b/fs/udf/udfdecl.h
@@ -253,8 +253,8 @@ extern struct long_ad *

[PATCH 6/6] vfs: change inode times to use struct timespec64

2018-05-11 Thread Deepa Dinamani

struct timespec is not y2038 safe. Transition vfs to use
y2038 safe struct timespec64 instead.

The change was made with the help of the following coccinelle
script. This catches about 80% of the changes.
All the header file and logic changes are included in the
first 5 rules. The rest are trivial substitutions.
I avoid changing any of the function signatures or any other
filesystem specific data structures to keep the patch simple
for review.

The script could be made a little shorter by combining different cases,
but this version was sufficient for my use case.

virtual patch

@ depends on patch @
identifier now;
@@
- struct timespec
+ struct timespec64
  current_time ( ... )
  {
- struct timespec now = current_kernel_time();
+ struct timespec64 now = current_kernel_time64();
  ...
- return timespec_trunc(
+ return timespec64_trunc(
  ... );
  }

@ depends on patch @
identifier xtime;
@@
 struct \( iattr \| inode \| kstat \) {
 ...
-   struct timespec xtime;
+   struct timespec64 xtime;
 ...
 }

@ depends on patch @
identifier t;
@@
 struct inode_operations {
 ...
int (*update_time) (...,
-   struct timespec t,
+   struct timespec64 t,
...);
 ...
 }

@ depends on patch @
identifier t;
identifier fn_update_time =~ "update_time$";
@@
 fn_update_time (...,
- struct timespec *t,
+ struct timespec64 *t,
 ...) { ... }

@ depends on patch @
identifier t;
@@
lease_get_mtime( ... ,
- struct timespec *t
+ struct timespec64 *t
  ) { ... }

@te depends on patch forall@
identifier ts;
local idexpression struct inode *inode_node;
identifier i_xtime =~ "^i_[acm]time$";
identifier ia_xtime =~ "^ia_[acm]time$";
identifier fn_update_time =~ "update_time$";
identifier fn;
expression e, E3;
local idexpression struct inode *node1;
local idexpression struct inode *node2;
local idexpression struct iattr *attr1;
local idexpression struct iattr *attr2;
local idexpression struct iattr attr;
identifier i_xtime1 =~ "^i_[acm]time$";
identifier i_xtime2 =~ "^i_[acm]time$";
identifier ia_xtime1 =~ "^ia_[acm]time$";
identifier ia_xtime2 =~ "^ia_[acm]time$";
@@
(
(
- struct timespec ts;
+ struct timespec64 ts;
|
- struct timespec ts = current_time(inode_node);
+ struct timespec64 ts = current_time(inode_node);
)

<+... when != ts
(
- timespec_equal(&inode_node->i_xtime, &ts)
+ timespec64_equal(&inode_node->i_xtime, &ts)
|
- timespec_equal(&ts, &inode_node->i_xtime)
+ timespec64_equal(&ts, &inode_node->i_xtime)
|
- timespec_compare(&inode_node->i_xtime, &ts)
+ timespec64_compare(&inode_node->i_xtime, &ts)
|
- timespec_compare(&ts, &inode_node->i_xtime)
+ timespec64_compare(&ts, &inode_node->i_xtime)
|
ts = current_time(e)
|
fn_update_time(..., &ts,...)
|
inode_node->i_xtime = ts
|
node1->i_xtime = ts
|
ts = inode_node->i_xtime
|
<+... attr1->ia_xtime ...+> = ts
|
ts = attr1->ia_xtime
|
ts.tv_sec
|
ts.tv_nsec
|
btrfs_set_stack_timespec_sec(..., ts.tv_sec)
|
btrfs_set_stack_timespec_nsec(..., ts.tv_nsec)
|
- ts = timespec64_to_timespec(
+ ts =
...
-)
|
- ts = ktime_to_timespec(
+ ts = ktime_to_timespec64(
...)
|
- ts = E3
+ ts = timespec_to_timespec64(E3)
|
- ktime_get_real_ts(&ts)
+ ktime_get_real_ts64(&ts)
|
fn(...,
- ts
+ timespec64_to_timespec(ts)
,...)
)
...+>
(
<... when != ts
- return ts;
+ return timespec64_to_timespec(ts);
...>
)
|
- timespec_equal(&node1->i_xtime1, &node2->i_xtime2)
+ timespec64_equal(&node1->i_xtime1, &node2->i_xtime2)
|
- timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2)
+ timespec64_equal(&node1->i_xtime1, &attr2->ia_xtime2)
|
- timespec_compare(&node1->i_xtime1, &node2->i_xtime2)
+ timespec64_compare(&node1->i_xtime1, &node2->i_xtime2)
|
node1->i_xtime1 =
- timespec_trunc(attr1->ia_xtime1,
+ timespec64_trunc(attr1->ia_xtime1,
...)
|
- attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2,
+ attr1->ia_xtime1 =  timespec64_trunc(attr2->ia_xtime2,
...)
|
- ktime_get_real_ts(&attr1->ia_xtime1)
+ ktime_get_real_ts64(&attr1->ia_xtime1)
|
- ktime_get_real_ts(&attr.ia_xtime1)
+ ktime_get_real_ts64(&attr.ia_xtime1)
)

@ depends on patch @
struct inode *node;
struct iattr *attr;
identifier fn;
identifier i_xtime =~ "^i_[acm]time$";
identifier ia_xtime =~ "^ia_[acm]time$";
expression e;
@@
(
- fn(node->i_xtime);
+ fn(timespec64_to_timespec(node->i_xtime));
|
 fn(...,
- node->i_xtime);
+ timespec64_to_timespec(node->i_xtime));
|
- e = fn(attr->ia_xtime);
+ e = fn(timespec64_to_timespec(attr->ia_xtime));
)

@ depends on patch forall @
struct inode *node;
struct iattr *attr;
identifier i_xtime =~ "^i_[acm]time$";
identifier ia_xtime =~ "^ia_[acm]time$";
identifier fn;
@@
{
+ struct timespec ts;
<+...
(
+ ts = timespec64_to_timespec(node->i_xtime);
fn (...,
- &node->i_xtime,
+ &ts,
...);
|
+ ts = timespec64_to_timespec(attr->ia_xtime);
fn (...,
- &attr->ia_xtime,
+ &ts,
...);
)
...+>
}

@ depends on patch forall @
struct inode *node;
struct iattr *attr;
struct kstat *stat;
identifier ia_xtime =~ "^ia_[acm]time$";
identifier i_xtime =~ "^i_[acm]time$";
identifier xtime =~ "^[acm]time$";
identifier fn, ret;
@@
{
+ stru

[PATCH 3/6] ceph: make inode time prints to be long long

2018-05-11 Thread Deepa Dinamani

Subsequent patches in the series convert inode timestamps
to use struct timespec64 instead of struct timespec as
part of solving the y2038 problem.

Convert these print formats to use long long types to
avoid warnings and errors on conversion.

Signed-off-by: Deepa Dinamani 
Cc: z...@redhat.com
Cc: ceph-de...@vger.kernel.org
---
 fs/ceph/inode.c | 42 +-
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index ae056927080d..676065a1ea62 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -671,18 +671,18 @@ void ceph_fill_file_time(struct inode *inode, int issued,
  CEPH_CAP_XATTR_EXCL)) {
if (ci->i_version == 0 ||
timespec_compare(ctime, &inode->i_ctime) > 0) {
-   dout("ctime %ld.%09ld -> %ld.%09ld inc w/ cap\n",
-inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
-ctime->tv_sec, ctime->tv_nsec);
+   dout("ctime %lld.%09ld -> %lld.%09ld inc w/ cap\n",
+(long long)inode->i_ctime.tv_sec, 
inode->i_ctime.tv_nsec,
+(long long)ctime->tv_sec, ctime->tv_nsec);
inode->i_ctime = *ctime;
}
if (ci->i_version == 0 ||
ceph_seq_cmp(time_warp_seq, ci->i_time_warp_seq) > 0) {
/* the MDS did a utimes() */
-   dout("mtime %ld.%09ld -> %ld.%09ld "
+   dout("mtime %lld.%09ld -> %lld.%09ld "
 "tw %d -> %d\n",
-inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
-mtime->tv_sec, mtime->tv_nsec,
+(long long)inode->i_mtime.tv_sec, 
inode->i_mtime.tv_nsec,
+(long long)mtime->tv_sec, mtime->tv_nsec,
 ci->i_time_warp_seq, (int)time_warp_seq);
 
inode->i_mtime = *mtime;
@@ -691,17 +691,17 @@ void ceph_fill_file_time(struct inode *inode, int issued,
} else if (time_warp_seq == ci->i_time_warp_seq) {
/* nobody did utimes(); take the max */
if (timespec_compare(mtime, &inode->i_mtime) > 0) {
-   dout("mtime %ld.%09ld -> %ld.%09ld inc\n",
-inode->i_mtime.tv_sec,
+   dout("mtime %lld.%09ld -> %lld.%09ld inc\n",
+(long long)inode->i_mtime.tv_sec,
 inode->i_mtime.tv_nsec,
-mtime->tv_sec, mtime->tv_nsec);
+(long long)mtime->tv_sec, mtime->tv_nsec);
inode->i_mtime = *mtime;
}
if (timespec_compare(atime, &inode->i_atime) > 0) {
-   dout("atime %ld.%09ld -> %ld.%09ld inc\n",
-inode->i_atime.tv_sec,
+   dout("atime %lld.%09ld -> %lld.%09ld inc\n",
+(long long)inode->i_atime.tv_sec,
 inode->i_atime.tv_nsec,
-atime->tv_sec, atime->tv_nsec);
+(long long)atime->tv_sec, atime->tv_nsec);
inode->i_atime = *atime;
}
} else if (issued & CEPH_CAP_FILE_EXCL) {
@@ -2015,9 +2015,9 @@ int __ceph_setattr(struct inode *inode, struct iattr 
*attr)
}
 
if (ia_valid & ATTR_ATIME) {
-   dout("setattr %p atime %ld.%ld -> %ld.%ld\n", inode,
-inode->i_atime.tv_sec, inode->i_atime.tv_nsec,
-attr->ia_atime.tv_sec, attr->ia_atime.tv_nsec);
+   dout("setattr %p atime %lld.%ld -> %lld.%ld\n", inode,
+(long long)inode->i_atime.tv_sec, inode->i_atime.tv_nsec,
+(long long)attr->ia_atime.tv_sec, attr->ia_atime.tv_nsec);
if (issued & CEPH_CAP_FILE_EXCL) {
ci->i_time_warp_seq++;
inode->i_atime = attr->ia_atime;
@@ -2037,9 +2037,9 @@ int __ceph_setattr(struct inode *inode, struct iattr 
*attr)
}
}
if (ia_valid & ATTR_MTIME) {
-   dout("setattr %p mtime %ld.%ld -> %ld.%ld\n", inode,
-inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
-attr->ia_mtime.tv_sec, attr->ia_mtime.tv_nsec);
+   dout("setattr %p mtime %lld.%ld -> %lld.%ld\n", inode,
+(long long)inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
+(long long)attr->ia_mtime.tv_sec, attr->ia_mtime.tv_nsec);
if (issued & CEP
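
The format-string change above can be tried outside the kernel. What follows
is only a minimal user-space sketch, not ceph code: struct ts64 and
format_ts64() are hypothetical names made up for illustration. It shows the
same idiom the patch uses, casting the seconds field to long long so that the
argument matches %lld regardless of how wide the underlying type is.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a timestamp whose seconds field is 64 bits wide. */
struct ts64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/*
 * Format "sec.nsec" into buf. The (long long) cast makes the argument
 * match %lld whether int64_t is long or long long on the target ABI.
 */
static int format_ts64(char *buf, size_t len, const struct ts64 *t)
{
	return snprintf(buf, len, "%lld.%09ld",
			(long long)t->tv_sec, t->tv_nsec);
}
```

Without the cast, passing a 64-bit value to %ld on a 32-bit target is the kind
of mismatch the compiler warns about, which is what this patch avoids ahead of
the type switch.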

[PATCH 1/6] fs: add timespec64_truncate()

2018-05-11 Thread Deepa Dinamani

As vfs moves to using struct timespec64 to represent times,
update the argument to timespec_trunc() to use
struct timespec64. Also change the name of the function.
The rest of the implementation logic is the same.

Move this to fs/inode.c instead of kernel/time/time.c as all the
users of this api are filesystems.

Signed-off-by: Deepa Dinamani 
Cc: 
---
 fs/inode.c | 24 
 include/linux/fs.h |  1 +
 2 files changed, 25 insertions(+)

diff --git a/fs/inode.c b/fs/inode.c
index 13ceb98c3bd3..93af998ee290 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2110,6 +2110,30 @@ void inode_nohighmem(struct inode *inode)
 }
 EXPORT_SYMBOL(inode_nohighmem);
 
+/**
+ * timespec64_trunc - Truncate timespec64 to a granularity
+ * @t: Timespec64
+ * @gran: Granularity in ns.
+ *
+ * Truncate a timespec64 to a granularity. Always rounds down. gran must
+ * not be 0 nor greater than a second (NSEC_PER_SEC, or 10^9 ns).
+ */
+struct timespec64 timespec64_trunc(struct timespec64 t, unsigned gran)
+{
+   /* Avoid division in the common cases 1 ns and 1 s. */
+   if (gran == 1) {
+   /* nothing */
+   } else if (gran == NSEC_PER_SEC) {
+   t.tv_nsec = 0;
+   } else if (gran > 1 && gran < NSEC_PER_SEC) {
+   t.tv_nsec -= t.tv_nsec % gran;
+   } else {
+   WARN(1, "illegal file time granularity: %u", gran);
+   }
+   return t;
+}
+EXPORT_SYMBOL(timespec64_trunc);
+
 /**
  * current_time - Return FS time
  * @inode: inode.
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0eedf745667b..381c77a37404 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1477,6 +1477,7 @@ static inline void i_gid_write(struct inode *inode, gid_t 
gid)
inode->i_gid = make_kgid(inode->i_sb->s_user_ns, gid);
 }
 
+extern struct timespec64 timespec64_trunc(struct timespec64 t, unsigned gran);
 extern struct timespec current_time(struct inode *inode);
 
 /*
-- 
2.17.0
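
For anyone who wants to poke at the rounding behaviour without building a
kernel, here is a rough user-space sketch of the same logic. It is an
assumption-laden illustration rather than the kernel function: struct ts64 and
ts64_trunc() are hypothetical stand-ins, NSEC_PER_SEC is redefined locally,
and fprintf() takes the place of WARN().

```c
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical user-space stand-in for the kernel's struct timespec64. */
struct ts64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/*
 * Round t down to a multiple of gran nanoseconds.
 * gran is expected to be in the range 1..NSEC_PER_SEC.
 */
static struct ts64 ts64_trunc(struct ts64 t, unsigned gran)
{
	if (gran == 1) {
		/* 1 ns granularity: nothing to truncate */
	} else if (gran == NSEC_PER_SEC) {
		/* 1 s granularity: drop the whole nanosecond part */
		t.tv_nsec = 0;
	} else if (gran > 1 && gran < NSEC_PER_SEC) {
		/* general case: subtract the remainder */
		t.tv_nsec -= t.tv_nsec % gran;
	} else {
		/* the kernel version uses WARN() here */
		fprintf(stderr, "illegal file time granularity: %u\n", gran);
	}
	return t;
}
```

With a granularity of 1000 (microseconds), a tv_nsec of 123456789 becomes
123456000; tv_sec is never touched, so the result always rounds down.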

[PATCH 4/6] fs: nfs: get rid of memcpys for inode times

2018-05-11 Thread Deepa Dinamani

Subsequent patches in the series convert inode timestamps
to use struct timespec64 instead of struct timespec as
part of solving the y2038 problem.
This will lead to type mismatch for memcpys.
Use regular assignments instead.

Signed-off-by: Deepa Dinamani 
Cc: trond.mykleb...@primarydata.com
---
 fs/nfs/inode.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index bd15d0b57626..55b62254dd7c 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1315,13 +1315,13 @@ static void nfs_wcc_update_inode(struct inode *inode, 
struct nfs_fattr *fattr)
if ((fattr->valid & NFS_ATTR_FATTR_PRECTIME)
&& (fattr->valid & NFS_ATTR_FATTR_CTIME)
&& timespec_equal(&inode->i_ctime, &fattr->pre_ctime)) {
-   memcpy(&inode->i_ctime, &fattr->ctime, sizeof(inode->i_ctime));
+   inode->i_ctime = fattr->ctime;
}
 
if ((fattr->valid & NFS_ATTR_FATTR_PREMTIME)
&& (fattr->valid & NFS_ATTR_FATTR_MTIME)
&& timespec_equal(&inode->i_mtime, &fattr->pre_mtime)) {
-   memcpy(&inode->i_mtime, &fattr->mtime, sizeof(inode->i_mtime));
+   inode->i_mtime = fattr->mtime;
if (S_ISDIR(inode->i_mode))
nfs_set_cache_invalid(inode, NFS_INO_INVALID_DATA);
}
@@ -1667,12 +1667,12 @@ int nfs_post_op_update_inode_force_wcc_locked(struct 
inode *inode, struct nfs_fa
}
if ((fattr->valid & NFS_ATTR_FATTR_CTIME) != 0 &&
(fattr->valid & NFS_ATTR_FATTR_PRECTIME) == 0) {
-   memcpy(&fattr->pre_ctime, &inode->i_ctime, 
sizeof(fattr->pre_ctime));
+   fattr->pre_ctime = inode->i_ctime;
fattr->valid |= NFS_ATTR_FATTR_PRECTIME;
}
if ((fattr->valid & NFS_ATTR_FATTR_MTIME) != 0 &&
(fattr->valid & NFS_ATTR_FATTR_PREMTIME) == 0) {
-   memcpy(&fattr->pre_mtime, &inode->i_mtime, 
sizeof(fattr->pre_mtime));
+   fattr->pre_mtime = inode->i_mtime;
fattr->valid |= NFS_ATTR_FATTR_PREMTIME;
}
if ((fattr->valid & NFS_ATTR_FATTR_SIZE) != 0 &&
@@ -1829,7 +1829,7 @@ static int nfs_update_inode(struct inode *inode, struct 
nfs_fattr *fattr)
}
 
if (fattr->valid & NFS_ATTR_FATTR_MTIME) {
-   memcpy(&inode->i_mtime, &fattr->mtime, sizeof(inode->i_mtime));
+   inode->i_mtime = fattr->mtime;
} else if (server->caps & NFS_CAP_MTIME) {
nfsi->cache_validity |= save_cache_validity &
(NFS_INO_INVALID_MTIME
@@ -1838,7 +1838,7 @@ static int nfs_update_inode(struct inode *inode, struct 
nfs_fattr *fattr)
}
 
if (fattr->valid & NFS_ATTR_FATTR_CTIME) {
-   memcpy(&inode->i_ctime, &fattr->ctime, sizeof(inode->i_ctime));
+   inode->i_ctime = fattr->ctime;
} else if (server->caps & NFS_CAP_CTIME) {
nfsi->cache_validity |= save_cache_validity &
(NFS_INO_INVALID_CTIME
@@ -1875,7 +1875,7 @@ static int nfs_update_inode(struct inode *inode, struct 
nfs_fattr *fattr)
 
 
if (fattr->valid & NFS_ATTR_FATTR_ATIME)
-   memcpy(&inode->i_atime, &fattr->atime, sizeof(inode->i_atime));
+   inode->i_atime = fattr->atime;
else if (server->caps & NFS_CAP_ATIME) {
nfsi->cache_validity |= save_cache_validity &
(NFS_INO_INVALID_ATIME
-- 
2.17.0
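
A small illustration of why assignment is the safer form once the two sides
stop sharing a type. This is a user-space sketch under stated assumptions:
struct ts32, struct ts64 and ts32_to_ts64() are hypothetical names, not the
kernel's types; the helper is only analogous in spirit to
timespec_to_timespec64().

```c
#include <stdint.h>

/* Hypothetical stand-ins: a timestamp with 32-bit fields and one with 64-bit fields. */
struct ts32 { int32_t tv_sec; int32_t tv_nsec; };
struct ts64 { int64_t tv_sec; int64_t tv_nsec; };

/* Explicit field-by-field conversion between the two layouts. */
static struct ts64 ts32_to_ts64(struct ts32 in)
{
	struct ts64 out = { .tv_sec = in.tv_sec, .tv_nsec = in.tv_nsec };
	return out;
}

/*
 * Given: struct ts64 dst; struct ts32 src;
 *
 *   memcpy(&dst, &src, sizeof(dst));  compiles, but reads past src and
 *                                     lands the fields at the wrong offsets
 *   dst = src;                        does not compile: incompatible types
 *   dst = ts32_to_ts64(src);          compiles and is correct
 */
```

The point is that memcpy() hides a type mismatch while an assignment exposes
it at build time, so the later conversion only has to insert an explicit
helper where the compiler complains.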

[PATCH 0/6] Transition vfs to 64-bit timestamps

2018-05-11 Thread Deepa Dinamani

The series aims to switch vfs timestamps to use
struct timespec64. Currently vfs uses struct timespec,
which is not y2038 safe.

The series involves the following:
1. Add vfs helper functions for supporting struct timespec64 timestamps.
2. Cast prints of vfs timestamps to avoid warnings after the switch.
3. Simplify code using vfs timestamps so that the actual
   replacement becomes easy.
4. Convert vfs timestamps to use struct timespec64 using a script.
   This is a flag day patch.

I've tried to keep the conversions with the script simple, to
aid in the reviews. I've kept all the internal filesystem data
structures and function signatures the same.

Next steps:
1. Convert APIs that can handle timespec64, instead of converting
   timestamps at the boundaries.
2. Update internal data structures to avoid timestamp conversions.

Deepa Dinamani (6):
  fs: add timespec64_truncate()
  lustre: Use long long type to print inode time
  ceph: make inode time prints to be long long
  fs: nfs: get rid of memcpys for inode times
  udf: Simplify calls to udf_disk_stamp_to_time
  vfs: change inode times to use struct timespec64

 .../staging/lustre/lustre/llite/llite_lib.c   | 12 +--
 drivers/staging/lustre/lustre/lmv/lmv_obd.c   |  7 +-
 drivers/staging/lustre/lustre/mdc/mdc_reint.c |  6 +-
 drivers/staging/lustre/lustre/obdclass/obdo.c |  6 +-
 drivers/tty/tty_io.c  | 15 +++-
 drivers/usb/gadget/function/f_fs.c|  2 +-
 fs/adfs/inode.c   |  7 +-
 fs/afs/fsclient.c |  2 +-
 fs/attr.c | 14 +--
 fs/bad_inode.c|  2 +-
 fs/btrfs/file.c   |  6 +-
 fs/btrfs/inode.c  |  8 +-
 fs/btrfs/ioctl.c  |  4 +-
 fs/btrfs/root-tree.c  |  4 +-
 fs/btrfs/transaction.c|  2 +-
 fs/ceph/addr.c| 12 +--
 fs/ceph/cache.c   |  4 +-
 fs/ceph/caps.c|  6 +-
 fs/ceph/file.c|  6 +-
 fs/ceph/inode.c   | 86 ++-
 fs/ceph/mds_client.c  |  7 +-
 fs/ceph/snap.c|  6 +-
 fs/cifs/cache.c   |  4 +-
 fs/cifs/fscache.c |  8 +-
 fs/cifs/inode.c   | 26 +++---
 fs/coda/coda_linux.c  | 12 +--
 fs/configfs/inode.c   | 12 +--
 fs/cramfs/inode.c |  2 +-
 fs/ext4/ext4.h| 34 +---
 fs/ext4/ialloc.c  |  4 +-
 fs/ext4/namei.c   |  2 +-
 fs/f2fs/f2fs.h| 10 ++-
 fs/f2fs/file.c| 12 +--
 fs/f2fs/inode.c   | 12 +--
 fs/f2fs/namei.c   |  4 +-
 fs/fat/inode.c| 20 +++--
 fs/fat/namei_msdos.c  | 21 +++--
 fs/fat/namei_vfat.c   | 22 +++--
 fs/fuse/inode.c   |  2 +-
 fs/gfs2/dir.c |  6 +-
 fs/gfs2/glops.c   |  4 +-
 fs/hfs/inode.c|  4 +-
 fs/hfsplus/inode.c| 12 +--
 fs/hostfs/hostfs_kern.c   |  6 +-
 fs/inode.c| 58 +
 fs/jffs2/dir.c| 18 ++--
 fs/jffs2/file.c   |  2 +-
 fs/jffs2/fs.c | 12 +--
 fs/kernfs/dir.c   |  4 +-
 fs/kernfs/inode.c |  8 +-
 fs/locks.c|  2 +-
 fs/nfs/callback_proc.c|  4 +-
 fs/nfs/fscache-index.c|  4 +-
 fs/nfs/fscache.c  | 12 +--
 fs/nfs/inode.c| 39 +
 fs/nfs/nfs2xdr.c  | 25 +++---
 fs/nfs/nfs3xdr.c  |  8 +-
 fs/nfs/nfs4xdr.c  |  7 +-
 fs/nfsd/blocklayout.c |  8 +-
 fs/nfsd/nfs3xdr.c | 14 +--
 fs/nfsd/nfs4xdr.c |  7 +-
 fs/nfsd/nfsxdr.c  |  2 +-
 fs/ntfs/inode.c   | 30 +++
 fs/ocfs2/dlmglue.c| 20 +++--
 fs/ocfs2/file.c   |  6 +-
 fs/orangefs/inode.c   |  2 +-
 fs/orangefs/orangefs-kernel.h |  2 +-
 fs/overlayfs/inode.c  |  2 +-
 fs/overlayfs/overlayfs.h

Re: [Ksummit-discuss] bug-introducing patches

2018-05-11 Thread Stephen Rothwell

Hi all,

On Wed, 9 May 2018 20:47:27 +1000 Stephen Rothwell  
wrote:
>
> On Wed, 9 May 2018 18:03:46 +0900 Mark Brown  wrote:
> >
> > On Wed, May 09, 2018 at 10:47:57AM +0200, Daniel Vetter wrote:  
> > > On Wed, May 9, 2018 at 10:44 AM, Mark Brown  wrote:   
> > >  
> >   
> > > > I think this is an excellent idea, copying in Stephen for his input.
> > > > I'm currently on holiday but unless someone convinces me it's a terrible
> > > > idea I'm willing to at least give it a go on a trial basis once I'm back
> > > > home.
> >   
> > > Since Stephen merges all -fixes branches first, before merging all the
> > > -next branches, he already generates that as part of linux-next. All
> > > he'd need to do is push that intermediate state out to some
> > > linux-fixes branch for consumption by test bots.  
> 
> Good idea ... I will see what I can do.

See my announcement of a pending-fixes branch in linux-next (on LKML
and others)

> I currently have 44 such fixes branches.  More welcome!

We are up to 55.

-- 
Cheers,
Stephen Rothwell



linux-next: a new pending-fixes branch

2018-05-11 Thread Stephen Rothwell

Hi all,

As an outcome of some discussion, I have added a pending-fixes branch
to linux-next.  This branch contains Linus' tree merged with branches
containing only fixes pending for the current release.  The branch is a
strict subset of linux-next each day (and so rebases like linux-next
does).

It would be good if this branch could be tested by the 0-Day service
and any other testing that people do - in the hope of sending fewer
"fixes causing bugs" patches to Linus.

There is no intention that bug fixes for Linus' tree should
necessarily be tested in linux-next before being forwarded, but
(especially for slightly less urgent bugs, at least) it seems like a
good idea.

I currently have 55 branches of bug fixes included.  As of yesterday
the branch contains 165 commits and looks like this:


 .mailmap   |   3 +
 .../devicetree/bindings/net/can/rcar_canfd.txt |   4 +-
 MAINTAINERS|   8 +-
 arch/arm/boot/compressed/Makefile  |   8 +-
 arch/arm/boot/compressed/head.S|  20 +-
 arch/arm/boot/dts/dm8148-evm.dts   |   2 +-
 arch/arm/boot/dts/dm8148-t410.dts  |   2 +-
 arch/arm/boot/dts/dm8168-evm.dts   |   2 +-
 arch/arm/boot/dts/dra62x-j5eco-evm.dts |   2 +-
 arch/arm/boot/dts/imx35.dtsi   |   4 +-
 arch/arm/boot/dts/imx53.dtsi   |   4 +-
 arch/arm/boot/dts/logicpd-som-lv.dtsi  |  11 +-
 arch/arm/kernel/machine_kexec.c|  36 ++--
 arch/arm/mach-omap1/ams-delta-fiq.c|  28 +--
 arch/arm/mach-omap2/powerdomain.c  |   4 +-
 arch/powerpc/include/asm/ftrace.h  |  29 ++-
 arch/powerpc/include/asm/paca.h|   1 -
 arch/powerpc/include/asm/topology.h|  13 +-
 drivers/atm/firestream.c   |   2 +-
 drivers/atm/zatm.c |   3 +
 drivers/bluetooth/btusb.c  |  19 +-
 drivers/dma/pl330.c|  28 ---
 drivers/gpu/drm/bridge/Kconfig |   1 +
 drivers/gpu/drm/drm_atomic.c   |   8 +
 drivers/gpu/drm/i915/intel_cdclk.c |  41 -
 drivers/gpu/drm/i915/intel_display.c   |   2 +
 drivers/gpu/drm/i915/intel_dp.c|  20 --
 drivers/gpu/drm/i915/intel_lvds.c  |   3 +-
 drivers/gpu/drm/omapdrm/dss/dispc.c|  20 +-
 drivers/gpu/drm/omapdrm/dss/hdmi4.c|   2 +-
 drivers/gpu/drm/omapdrm/dss/hdmi4_core.c   |   7 +-
 drivers/gpu/drm/omapdrm/dss/hdmi5.c|   2 +-
 drivers/gpu/drm/omapdrm/omap_connector.c   |  10 +
 drivers/gpu/drm/omapdrm/omap_dmm_tiler.c   |   6 +-
 drivers/gpu/drm/omapdrm/tcm-sita.c |   2 +-
 drivers/gpu/drm/vc4/vc4_dpi.c  |  25 ++-
 drivers/gpu/drm/vc4/vc4_plane.c|   2 +-
 drivers/hwmon/k10temp.c|  40 +++-
 drivers/iio/adc/Kconfig|   1 +
 drivers/iio/adc/ad7793.c   |  75 +++-
 drivers/iio/adc/at91-sama5d2_adc.c |  41 -
 drivers/iio/adc/stm32-dfsdm-adc.c  |  17 +-
 drivers/iio/buffer/industrialio-buffer-dma.c   |   2 +-
 drivers/iio/buffer/kfifo_buf.c |  11 +-
 .../iio/common/hid-sensors/hid-sensor-trigger.c|   8 +-
 drivers/media/usb/uvc/uvc_ctrl.c   |  17 +-
 drivers/mtd/nand/onenand/omap2.c   | 105 ---
 drivers/mtd/nand/raw/marvell_nand.c|  12 +-
 drivers/mtd/nand/raw/nand_base.c   |   5 +
 drivers/net/can/dev.c  |   2 +-
 drivers/net/can/flexcan.c  |  26 +--
 drivers/net/can/spi/hi311x.c   |  11 +-
 drivers/net/can/usb/kvaser_usb.c   |   2 +-
 drivers/net/dsa/mv88e6xxx/chip.c   |  26 +++
 drivers/net/dsa/mv88e6xxx/chip.h   |   1 +
 drivers/net/dsa/mv88e6xxx/global2.c|   2 +-
 drivers/net/ethernet/aquantia/atlantic/aq_nic.c|   3 +
 drivers/net/ethernet/aquantia/atlantic/aq_nic.h|   1 +
 .../net/ethernet/aquantia/atlantic/aq_pci_func.c   |  20 +-
 drivers/net/ethernet/broadcom/tg3.c|   9 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c|   7 +-
 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c|  16 ++
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |   8 +-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |   7 +-
 drivers/net/ethernet/netronome/nfp/bpf/main.c  |   2 +-
 drivers/net/ethernet/netronome/nfp/flower/main.c   |  19 --
 drivers/net/ethernet/ni/nixge.c|  10 +-
 drivers/net/ethernet/qlogic/qed/qed_l2.c   |   6 +-
 drivers/net/ethernet/qlogic

Re: [BUGFIX PATCH v3 0/4] arm: kprobes: Fix to prohibit probing on unsafe functions

2018-05-11 Thread Greg KH

On Sat, May 12, 2018 at 09:42:21AM +0900, Masami Hiramatsu wrote:
> Hi Greg,
> 
> Could you pick this series to stable?

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

thanks,

greg k-h

linux-next: added the vfs-fixes tree

2018-05-11 Thread Stephen Rothwell

Hi Al,

As requested I have added the vfs-fixes tree
(git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git#fixes) to
linux-next from Monday.

Thanks for adding your subsystem tree as a participant of linux-next.  As
you may know, this is not a judgement of your code.  The purpose of
linux-next is for integration testing and to lower the impact of
conflicts between subsystems in the next merge window. 

You will need to ensure that the patches/commits in your tree/series have
been:
 * submitted under GPL v2 (or later) and include the Contributor's
Signed-off-by,
 * posted to the relevant mailing list,
 * reviewed by you (or another maintainer of your subsystem tree),
 * successfully unit tested, and 
 * destined for the current or next Linux merge window.

Basically, this should be just what you would send to Linus (or ask him
to fetch).  It is allowed to be rebased if you deem it necessary.

-- 
Cheers,
Stephen Rothwell 
s...@canb.auug.org.au



Re: [Ksummit-discuss] bug-introducing patches

2018-05-11 Thread Stephen Rothwell

Hi David,

On Fri, 11 May 2018 10:47:01 +0200 David Sterba  wrote:
>
> Please add
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git next-fixes

Added from Monday (as btrfs-fixes).

Thanks for adding your subsystem tree as a participant of linux-next.  As
you may know, this is not a judgement of your code.  The purpose of
linux-next is for integration testing and to lower the impact of
conflicts between subsystems in the next merge window. 

You will need to ensure that the patches/commits in your tree/series have
been:
 * submitted under GPL v2 (or later) and include the Contributor's
Signed-off-by,
 * posted to the relevant mailing list,
 * reviewed by you (or another maintainer of your subsystem tree),
 * successfully unit tested, and 
 * destined for the current or next Linux merge window.

Basically, this should be just what you would send to Linus (or ask him
to fetch).  It is allowed to be rebased if you deem it necessary.

-- 
Cheers,
Stephen Rothwell 
s...@canb.auug.org.au


Re: linux-next: build warning after merge of the mac80211-next tree

2018-05-11 Thread Stephen Rothwell

Hi all,

Just cc'ing the wireless list at Kalle's suggestion.

On Wed, 9 May 2018 14:56:24 +1000 Stephen Rothwell  
wrote:
>
> Hi Johannes,
> 
> After merging the mac80211-next tree, today's linux-next build (arm
> multi_v7_defconfig) produced this warning:
> 
> drivers/net/wireless/marvell/mwifiex/uap_event.c: In function 'mwifiex_process_uap_event':
> drivers/net/wireless/marvell/mwifiex/uap_event.c:333:1: warning: the frame size of 1680 bytes is larger than 1024 bytes [-Wframe-larger-than=]
>  }
>  ^
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c: In function 'brcmf_notify_connect_status_ap':
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c:5530:1: warning: the frame size of 1680 bytes is larger than 1024 bytes [-Wframe-larger-than=]
>  }
>  ^
> 
> Maybe introduced by commit
> 
>   52539ca89f36 ("cfg80211: Expose TXQ stats and parameters to userspace")

-- 
Cheers,
Stephen Rothwell



Re: linux-next: build warning after merge of the mac80211-next tree

2018-05-11 Thread Stephen Rothwell

Hi Kalle,

On Fri, 11 May 2018 15:20:23 +0300 Kalle Valo  wrote:
>
> Btw Stephen for mac80211 reports it would be a good idea to also cc
> linux-wireless list, in case Johannes is not around etc.

Thanks for the suggestion.  Done.

-- 
Cheers,
Stephen Rothwell



Fwd: KASAN: use-after-free Write in write_mem

2018-05-11 Thread Kyungtae Kim

-- Forwarded message --
From: Kyungtae Kim 
Date: Fri, May 11, 2018 at 11:38 AM
Subject: KASAN: use-after-free Write in write_mem
To: a...@arndb.de, gre...@linuxfoundation.org, linux-kernel@vger.kernel.org
Cc: Byoungyoung Lee , DaeLyong Jeong



We report the crash:
"KASAN: use-after-free Write in write_mem"

This crash was found in v4.17-rc3. Specifically, memory access (write
operation) is invalid, and it is detected by KASAN.

C repro code:
 https://kiwi.cs.purdue.edu/static/alexkkid-fuzzer/repro-3c6e1.c
kernel config:
 https://kiwi.cs.purdue.edu/static/alexkkid-fuzzer/kernel-config-v4.17-rc3

Crash log:

Write of size 4096 at addr 8801 by task syz-executor1/3358

CPU: 0 PID: 3358 Comm: syz-executor1 Not tainted 4.17.0-rc3 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xc7/0x138 lib/dump_stack.c:113
 print_address_description+0x6a/0x280 mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report+0x22f/0x350 mm/kasan/report.c:412
 check_memory_region_inline mm/kasan/kasan.c:260 [inline]
 check_memory_region+0x13b/0x1a0 mm/kasan/kasan.c:267
 kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
 _copy_from_user+0xb7/0x100 lib/usercopy.c:12
 copy_from_user include/linux/uaccess.h:147 [inline]
 write_mem+0x8f/0x190 drivers/char/mem.c:240
 __vfs_write+0x10d/0x610 fs/read_write.c:485
 vfs_write+0x187/0x500 fs/read_write.c:549
 ksys_write+0xd4/0x1a0 fs/read_write.c:598
 __do_sys_write fs/read_write.c:610 [inline]
 __se_sys_write fs/read_write.c:607 [inline]
 __x64_sys_write+0x73/0xb0 fs/read_write.c:607
 do_syscall_64+0xa4/0x460 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:7f6f7f254c68 EFLAGS: 0246 ORIG_RAX: 0001
RAX: ffda RBX: 7f6f7f2556cc RCX: 004497b9
RDX: ffad RSI: 2000 RDI: 0013
RBP: 0071bea0 R08:  R09: 
R10:  R11: 0246 R12: 
R13: 9ee8 R14: 006f0f88 R15: 7f6f7f255700

The buggy address belongs to the page:
page:ea000400 count:0 mapcount:-127 mapping: index:0x0
flags: 0x0()
raw:    ff80
raw: 88013fff91e0 ea002020 0004 
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 8800ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 8800ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>8801: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
   ^
 88010080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 88010100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff


Thanks,
Kyungtae Kim

nios2 build: empty_zero_page ?

2018-05-11 Thread Randy Dunlap

Hi,

I bet this has come up before, but my search foo didn't find anything.

When (cross) building nios2 allmodconfig, I am seeing this build error:

ERROR: "empty_zero_page" [net/ceph/libceph.ko] undefined!
ERROR: "empty_zero_page" [fs/exofs/exofs.ko] undefined!
ERROR: "empty_zero_page" [fs/crypto/fscrypto.ko] undefined!
ERROR: "empty_zero_page" [fs/cramfs/cramfs.ko] undefined!
ERROR: "empty_zero_page" [drivers/usb/wusbcore/wusbcore.ko] undefined!
ERROR: "flush_icache_range" [drivers/misc/lkdtm/lkdtm.ko] undefined!
ERROR: "empty_zero_page" [drivers/md/dm-mod.ko] undefined!

and arch/nios2/mm/init.c references empty_zero_page, but I don't see
anywhere that it is defined.

Help?

There are plenty of other build issues also, but they all are about arithmetic
that is sometimes provided by libc functions, e.g.:

ERROR: "__ucmpdi2" [drivers/media/i2c/adv7842.ko] undefined!
ERROR: "__ashrdi3" [drivers/mtd/nand/onenand/onenand.ko] undefined!
ERROR: "__ashldi3" [fs/btrfs/btrfs.ko] undefined!
ERROR: "__lshrdi3" [drivers/mtd/tests/mtd_nandbiterrs.ko] undefined!

thanks,
-- 
~Randy
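
For context on the second group of errors: symbols such as __ucmpdi2, __ashrdi3, __ashldi3 and __lshrdi3 are conventionally 64-bit integer helper routines from the compiler's runtime library (libgcc), which the compiler may call on 32-bit targets. The sketch below illustrates only the documented return convention of an unsigned 64-bit compare helper (0 if a < b, 1 if equal, 2 if a > b), built from 32-bit halves. The names example_ucmpdi2 and example_ucmpdi2_check are hypothetical, and this is not a proposed fix for the nios2 build.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in following the libgcc __ucmpdi2 convention:
 * returns 0 if a < b, 1 if a == b, 2 if a > b.
 * The comparison is done on 32-bit halves, high word first.
 */
static inline int example_ucmpdi2(uint64_t a, uint64_t b)
{
	uint32_t a_hi = (uint32_t)(a >> 32), b_hi = (uint32_t)(b >> 32);
	uint32_t a_lo = (uint32_t)a, b_lo = (uint32_t)b;

	if (a_hi != b_hi)
		return a_hi < b_hi ? 0 : 2;
	if (a_lo != b_lo)
		return a_lo < b_lo ? 0 : 2;
	return 1;
}

/* Small self-check covering the three possible results. */
static inline void example_ucmpdi2_check(void)
{
	assert(example_ucmpdi2(1, 2) == 0);
	assert(example_ucmpdi2(2, 2) == 1);
	assert(example_ucmpdi2(0x100000000ULL, 0xffffffffULL) == 2);
}
```

An "undefined" error for such a symbol usually means that some 64-bit operation in the module was compiled into a helper call that nothing in the link provides.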

Re: [PATCH] rcu: trace: Remove Startedleaf from trace events comment

2018-05-11 Thread Paul E. McKenney

On Fri, May 11, 2018 at 06:29:57PM -0700, Joel Fernandes wrote:
> On Fri, May 11, 2018 at 6:29 PM, Joel Fernandes (Google)
>  wrote:
> >
> > As part of the gp_seq clean up, the Startleaf condition doesn't occur
> > anymore. Remove it from the comment in the trace event file.
> 
> Sorry, I meant here Startedleaf. Let me know if you want me to resend the 
> patch.

Please do, as it saves me making another typo when attempting to fix it.

Thanx, Paul

Re: [PATCH v2] rcu: Add comment documenting how rcu_seq_snap works

2018-05-11 Thread Randy Dunlap

On 05/11/2018 07:20 PM, Joel Fernandes (Google) wrote:
> rcu_seq_snap may be tricky for someone looking at it for the first time.
> Lets document how it works with an example to make it easier.
> 
> Signed-off-by: Joel Fernandes (Google) 
> ---
> v2 changes: Corrections as suggested by Randy.
> 
>  kernel/rcu/rcu.h | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index 003671825d62..533bc1087371 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
>   WRITE_ONCE(*sp, rcu_seq_endval(sp));
>  }
>  
> -/* Take a snapshot of the update side's sequence number. */
> +/*
> + * Take a snapshot of the update side's sequence number.
> + *
> + * This function predicts what the grace period number will be the next
> + * time an RCU callback will be executed, given the current grace period's
> + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> + * already in progress.
> + *
> + * We do this with a single addition and masking.
> + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) 
> of
> + * the seq is used to track if a GP is in progress or not, its sufficient if 
> we

  it's

> + * add (2+1) and mask with ~1. Lets see why with an example:

  Let's

I.e., Let's not be so casual with (dropping) apostrophes.
But v3 can wait for other comments. :)

> + *
> + * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
> + * To get the next GP number, we have to at least add 0b10 to this (0x1 << 1)
> + * to account for the state bit. However, if the current seq is 7 (gp is 3 
> and
> + * state bit is 1), then it means the current grace period is already in
> + * progress so the next time the callback will run is at the end of grace
> + * period number gp+2. To account for the extra +1, we just overflow the LSB 
> by
> + * adding another 0x1 and masking with ~0x1. In case no GP was in progress 
> (RCU
> + * is idle), then the addition of the extra 0x1 and masking will have no
> + * effect. This is calculated as below.
> + */
>  static inline unsigned long rcu_seq_snap(unsigned long *sp)
>  {
>   unsigned long s;
> 


-- 
~Randy

Re: Another NVMe failure, this time with AER info

2018-05-11 Thread Ming Lei

On Sat, May 12, 2018 at 12:57 AM, Bjorn Helgaas  wrote:
> Andrew wrote:
>> A friend of mine has a brand new LG laptop that has intermittent NVMe
>> failures.  They mostly happen during a suspend/resume cycle
>> (apparently during suspend, not resume).  Unlike the earlier
>> Dell/Samsung issue, the NVMe device isn't completely gone -- MMIO
>> reads fail, but PCI configuration space is apparently still there:
>
>> nvme nvme0: controller is down; will reset: CSTS=0x, PCI_STATUS=0x10
>
>> and it comes with a nice AER dump:
>
>> [12720.894411] pcieport :00:1c.0: AER: Multiple Corrected error received: id=00e0
>> [12720.909747] pcieport :00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Transmitter ID)
>> [12720.909751] pcieport :00:1c.0:   device [8086:9d14] error status/mask=1001/2000
>> [12720.909754] pcieport :00:1c.0:[ 0] Receiver Error (First)
>> [12720.909756] pcieport :00:1c.0:[12] Replay Timer Timeout
>
> I opened this bugzilla and attached the dmesg and lspci -vv output to
> it: https://bugzilla.kernel.org/show_bug.cgi?id=199695
>
> The root port at 00:1c.0 leads to the NVMe device at 01:00.0 (this is
> nvme0):
>
>   00:1c.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #5 (rev f1) (prog-if 00 [Normal decode])
> Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
>   01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961 (prog-if 02 [NVM Express])
> Subsystem: Samsung Electronics Co Ltd Device a801
>
> We reported several corrected errors before the nvme timeout:
>
>   [12750.281158] nvme nvme0: controller is down; will reset: CSTS=0x, PCI_STATUS=0x10
>   [12750.297594] nvme nvme0: I/O 455 QID 2 timeout, disable controller
>   [12750.305196] nvme :01:00.0: enabling device ( -> 0002)
>   [12750.305465] nvme nvme0: Removing after probe failure status: -19
>   [12750.313188] nvme nvme0: I/O 456 QID 2 timeout, disable controller
>   [12750.329152] nvme nvme0: I/O 457 QID 2 timeout, disable controller
>
> The corrected errors are supposedly recovered in hardware without
> software intervention, and AER logs them for informational purposes.
>
> But it seems very likely that these corrected errors are related to
> the nvme timeout: the first corrected errors were logged at
> 12720.894411, nvme_io_timeout defaults to 30 seconds, and the nvme
> timeout was at 12750.281158.

The following patchset might help this issue:

https://marc.info/?l=linux-block&m=152604179903505&w=2

--
Ming Lei
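
The timing argument in the quoted analysis can be checked with simple arithmetic. The sketch below takes the two dmesg timestamps and the 30 second nvme_io_timeout default cited above, converts them to microseconds, and computes the gap. The names EX_FIRST_AER_US, EX_NVME_RESET_US, EX_IO_TIMEOUT_US and example_gap_us are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Timestamps from the quoted dmesg output, in microseconds since boot. */
#define EX_FIRST_AER_US   12720894411LL	/* first corrected error logged */
#define EX_NVME_RESET_US  12750281158LL	/* "controller is down" message */

/* 30 seconds, the nvme_io_timeout default cited in the message. */
#define EX_IO_TIMEOUT_US  30000000LL

/* Gap between the first corrected error and the nvme timeout message. */
static inline int64_t example_gap_us(void)
{
	return EX_NVME_RESET_US - EX_FIRST_AER_US;
}

/* The gap is 29.386747 s, i.e. within about 0.62 s of the timeout. */
static inline void example_gap_check(void)
{
	assert(example_gap_us() == 29386747LL);
	assert(EX_IO_TIMEOUT_US - example_gap_us() == 613253LL);
}
```

A gap this close to the timeout value is consistent with an I/O submitted shortly before the first logged error never completing, though the numbers alone do not prove that.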

[PATCH v3 0/2] regulator: add QCOM RPMh regulator driver

2018-05-11 Thread David Collins

This patch series adds a driver and device tree binding documentation for
PMIC regulator control via Resource Power Manager-hardened (RPMh) on some
Qualcomm Technologies, Inc. SoCs such as SDM845.  RPMh is a hardware block
which contains several accelerators which are used to manage various
hardware resources that are shared between the processors of the SoC.  The
final hardware state of a regulator is determined within RPMh by performing
max aggregation of the requests made by all of the processors.

The RPMh regulator driver depends upon the RPMh driver [1] and command DB
driver [2] which are both still undergoing review.  It also depends upon
two recent of_regulator changes: [3] and [4].

Changes since v2 [5]:
 - Replaced '_' with '-' in device tree supply property names
 - Renamed qcom_rpmh-regulator.c to be qcom-rpmh-regulator.c
 - Updated various DT property names to use "microvolt" and "microamp"
 - Moved allowed modes constraint specification out of the driver [4]
 - Replaced rpmh_client with device pointer to match new RPMh API [1]
 - Corrected drms mode threshold checking
 - Initialized voltage_selector to -EINVAL when not specified in DT
 - Added constants for PMIC regulator hardware modes
 - Corrected type sign of mode mapping tables
 - Made variable names for mode arrays plural
 - Simplified Kconfig depends on
 - Removed unnecessary constants and struct fields
 - Added some descriptive comments

Changes since v1 [6]:
 - Addressed review feedback from Doug, Mark, and Stephen
 - Replaced set_voltage()/get_voltage() callbacks with set_voltage_sel()/
   get_voltage_sel()
 - Added set_bypass()/get_bypass() callbacks for BOB pass-through mode
   control
 - Removed top-level PMIC data structures
 - Removed initialization variables from structs and passed them as
   function parameters
 - Removed various comments and error messages
 - Simplified mode handling
 - Refactored per-PMIC rpmh-regulator data specification
 - Simplified probe function
 - Moved header into DT patch
 - Removed redundant property listings from DT binding documentation

[1]: https://lkml.org/lkml/2018/5/9/729
[2]: https://lkml.org/lkml/2018/4/10/714
[3]: https://patchwork.kernel.org/patch/10348629
[4]: https://lkml.org/lkml/2018/5/11/696
[5]: https://lkml.org/lkml/2018/4/13/687
[6]: https://lkml.org/lkml/2018/3/16/1431

David Collins (2):
  regulator: dt-bindings: add QCOM RPMh regulator bindings
  regulator: add QCOM RPMh regulator driver

 .../bindings/regulator/qcom,rpmh-regulator.txt | 208 +
 drivers/regulator/Kconfig  |   9 +
 drivers/regulator/Makefile |   1 +
 drivers/regulator/qcom-rpmh-regulator.c| 925 +
 .../dt-bindings/regulator/qcom,rpmh-regulator.h|  36 +
 5 files changed, 1179 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
 create mode 100644 drivers/regulator/qcom-rpmh-regulator.c
 create mode 100644 include/dt-bindings/regulator/qcom,rpmh-regulator.h

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
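
The "max aggregation" mentioned in the cover letter can be pictured as taking the largest value among the requests from all processors. The sketch below shows only that idea, under the assumption that each processor votes with one voltage in microvolts. The name example_max_aggregate and the sample values are hypothetical, and the real aggregation happens inside the RPMh hardware rather than in driver code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical illustration of max aggregation: each entry is one
 * processor's requested voltage in microvolts, and the aggregated
 * result is the largest request. With no requests the result is 0.
 */
static inline unsigned int example_max_aggregate(const unsigned int *requests_uv,
						 size_t n)
{
	unsigned int result = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (requests_uv[i] > result)
			result = requests_uv[i];
	}
	return result;
}

/* Three made-up votes; the highest one wins. */
static inline void example_max_aggregate_check(void)
{
	const unsigned int votes_uv[] = { 800000, 1200000, 1000000 };

	assert(example_max_aggregate(votes_uv, 3) == 1200000);
}
```

Under this model a processor that lowers its own request only lowers the final state if no other processor is asking for more.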

[PATCH v3 2/2] regulator: add QCOM RPMh regulator driver

2018-05-11 Thread David Collins

Add the QCOM RPMh regulator driver to manage PMIC regulators
which are controlled via RPMh on some Qualcomm Technologies, Inc.
SoCs.  RPMh is a hardware block which contains several
accelerators which are used to manage various hardware resources
that are shared between the processors of the SoC.  The final
hardware state of a regulator is determined within RPMh by
performing max aggregation of the requests made by all of the
processors.

Add support for PMIC regulator control via the voltage regulator
manager (VRM) and oscillator buffer (XOB) RPMh accelerators.  VRM
supports manipulation of enable state, voltage, mode, and
headroom voltage.  XOB supports manipulation of enable state.

Signed-off-by: David Collins 
---
 drivers/regulator/Kconfig   |   9 +
 drivers/regulator/Makefile  |   1 +
 drivers/regulator/qcom-rpmh-regulator.c | 925 
 3 files changed, 935 insertions(+)
 create mode 100644 drivers/regulator/qcom-rpmh-regulator.c

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 4efae3b..1a69bdc 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -671,6 +671,15 @@ config REGULATOR_QCOM_RPM
  Qualcomm RPM as a module. The module will be named
  "qcom_rpm-regulator".
 
+config REGULATOR_QCOM_RPMH
+   tristate "Qualcomm Technologies, Inc. RPMh regulator driver"
+   depends on QCOM_RPMH || COMPILE_TEST
+   help
+ This driver supports control of PMIC regulators via the RPMh hardware
+ block found on Qualcomm Technologies Inc. SoCs.  RPMh regulator
+ control allows for voting on regulator state between multiple
+ processors within the SoC.
+
 config REGULATOR_QCOM_SMD_RPM
tristate "Qualcomm SMD based RPM regulator driver"
depends on QCOM_SMD_RPM
diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile
index d81fb02..906f048 100644
--- a/drivers/regulator/Makefile
+++ b/drivers/regulator/Makefile
@@ -77,6 +77,7 @@ obj-$(CONFIG_REGULATOR_MT6323)+= mt6323-regulator.o
 obj-$(CONFIG_REGULATOR_MT6380) += mt6380-regulator.o
 obj-$(CONFIG_REGULATOR_MT6397) += mt6397-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_RPM) += qcom_rpm-regulator.o
+obj-$(CONFIG_REGULATOR_QCOM_RPMH) += qcom-rpmh-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_SMD_RPM) += qcom_smd-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_SPMI) += qcom_spmi-regulator.o
 obj-$(CONFIG_REGULATOR_PALMAS) += palmas-regulator.o
diff --git a/drivers/regulator/qcom-rpmh-regulator.c b/drivers/regulator/qcom-rpmh-regulator.c
new file mode 100644
index 000..991ecc1
--- /dev/null
+++ b/drivers/regulator/qcom-rpmh-regulator.c
@@ -0,0 +1,925 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved. */
+
+#define pr_fmt(fmt) "%s: " fmt, __func__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+
+/**
+ * enum rpmh_regulator_type - supported RPMh accelerator types
+ * %VRM:   RPMh VRM accelerator which supports voting on enable, voltage,
+ * mode, and headroom voltage of LDO, SMPS, and BOB type PMIC
+ * regulators.
+ * %XOB:   RPMh XOB accelerator which supports voting on the enable state
+ * of PMIC regulators.
+ */
+enum rpmh_regulator_type {
+   VRM,
+   XOB,
+};
+
+#define RPMH_VRM_HEADROOM_MAX_UV   511000
+
+#define RPMH_REGULATOR_REG_VRM_VOLTAGE 0x0
+#define RPMH_REGULATOR_REG_ENABLE  0x4
+#define RPMH_REGULATOR_REG_VRM_MODE0x8
+#define RPMH_REGULATOR_REG_VRM_HEADROOM0xC
+
+#define RPMH_REGULATOR_MODE_COUNT  4
+
+#define PMIC4_LDO_MODE_RETENTION   4
+#define PMIC4_LDO_MODE_LPM 5
+#define PMIC4_LDO_MODE_HPM 7
+
+#define PMIC4_SMPS_MODE_RETENTION  4
+#define PMIC4_SMPS_MODE_PFM5
+#define PMIC4_SMPS_MODE_AUTO   6
+#define PMIC4_SMPS_MODE_PWM7
+
+#define PMIC4_BOB_MODE_PASS0
+#define PMIC4_BOB_MODE_PFM 1
+#define PMIC4_BOB_MODE_AUTO2
+#define PMIC4_BOB_MODE_PWM 3
+
+/**
+ * struct rpmh_vreg_hw_data - RPMh regulator hardware configurations
+ * @regulator_type:RPMh accelerator type used to manage this
+ * regulator
+ * @ops:   Pointer to regulator ops callback structure
+ * @voltage_range: The single range of voltages supported by this
+ * PMIC regulator type
+ * @n_voltages:The number of unique voltage set points defined
+ * by voltage_range
+ * @pmic_mode_map: Array indexed by regulator framework mode
+ * containing PMIC ha

[PATCH v3 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings

2018-05-11 Thread David Collins

Introduce bindings for RPMh regulator devices found on some
Qualcomm Technologies, Inc. SoCs.  These devices allow a given
processor within the SoC to make PMIC regulator requests which
are aggregated within the RPMh hardware block along with requests
from other processors in the SoC to determine the final PMIC
regulator hardware state.

Signed-off-by: David Collins 
---
 .../bindings/regulator/qcom,rpmh-regulator.txt | 208 +
 .../dt-bindings/regulator/qcom,rpmh-regulator.h|  36 
 2 files changed, 244 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
 create mode 100644 include/dt-bindings/regulator/qcom,rpmh-regulator.h

diff --git a/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt b/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
new file mode 100644
index 000..ad2185e
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
@@ -0,0 +1,208 @@
+Qualcomm Technologies, Inc. RPMh Regulators
+
+rpmh-regulator devices support PMIC regulator management via the Voltage
+Regulator Manager (VRM) and Oscillator Buffer (XOB) RPMh accelerators.  The APPS
+processor communicates with these hardware blocks via a Resource State
+Coordinator (RSC) using command packets.  The VRM allows changing four
+parameters for a given regulator: enable state, output voltage, operating mode,
+and minimum headroom voltage.  The XOB allows changing only a single parameter
+for a given regulator: its enable state.  Despite its name, the XOB is capable
+of controlling the enable state of any PMIC peripheral.  It is used for clock
+buffers, low-voltage switches, and LDO/SMPS regulators which have a fixed
+voltage and mode.
+
+===
+Required Node Structure
+===
+
+RPMh regulators must be described in two levels of device nodes.  The first
+level describes the PMIC containing the regulators and must reside within an
+RPMh device node.  The second level describes each regulator within the PMIC
+which is to be used on the board.  Each of these regulators maps to a single
+RPMh resource.
+
+The names used for regulator nodes must match those supported by a given PMIC.
+Supported regulator node names:
+   PM8998: smps1 - smps13, ldo1 - ldo28, lvs1 - lvs2
+   PMI8998:bob
+   PM8005: smps1 - smps4
+
+
+First Level Nodes - PMIC
+
+
+- compatible
+   Usage:  required
+   Value type: 
+   Definition: Must be one of: "qcom,pm8998-rpmh-regulators",
+   "qcom,pmi8998-rpmh-regulators" or
+   "qcom,pm8005-rpmh-regulators".
+
+- qcom,pmic-id
+   Usage:  required
+   Value type: 
+   Definition: RPMh resource name suffix used for the regulators found on
+   this PMIC.  Typical values: "a", "b", "c", "d", "e", "f".
+
+- vdd-s1-supply
+- vdd-s2-supply
+- vdd-s3-supply
+- vdd-s4-supply
+   Usage:  optional (PM8998 and PM8005 only)
+   Value type: 
+   Definition: phandle of the parent supply regulator of one or more of the
+   regulators for this PMIC.
+
+- vdd-s5-supply
+- vdd-s6-supply
+- vdd-s7-supply
+- vdd-s8-supply
+- vdd-s9-supply
+- vdd-s10-supply
+- vdd-s11-supply
+- vdd-s12-supply
+- vdd-s13-supply
+- vdd-l1-l27-supply
+- vdd-l2-l8-l17-supply
+- vdd-l3-l11-supply
+- vdd-l4-l5-supply
+- vdd-l6-supply
+- vdd-l7-l12-l14-l15-supply
+- vdd-l9-supply
+- vdd-l10-l23-l25-supply
+- vdd-l13-l19-l21-supply
+- vdd-l16-l28-supply
+- vdd-l18-l22-supply
+- vdd-l20-l24-supply
+- vdd-l26-supply
+- vin-lvs-1-2-supply
+   Usage:  optional (PM8998 only)
+   Value type: 
+   Definition: phandle of the parent supply regulator of one or more of the
+   regulators for this PMIC.
+
+- vdd-bob-supply
+   Usage:  optional (PMI8998 only)
+   Value type: 
+   Definition: BOB regulator parent supply phandle
+
+===
+Second Level Nodes - Regulators
+===
+
+- qcom,regulator-initial-microvolt
+   Usage:  optional; VRM regulators only
+   Value type: 
+   Definition: Specifies the initial voltage in microvolts to request for a
+   VRM regulator.
+
+- regulator-initial-mode
+   Usage:  optional; VRM regulators only
+   Value type: 
+   Definition: Specifies the initial mode to request for a VRM regulator.
+   Supported values are RPMH_REGULATOR_MODE_* which are defined
+   in [1] (i.e. 0 to 3).  This property may be specified even
+   if the regulator-allow-set-load property is not specified.
+
+- qcom,allowed-drms-modes
+   Usage:  required if regulator-allow-set-load is specified;
+   VRM regulators only
+   Value type: 
+   Definition: A list of integers specifying

[PATCH v2] rcu: Add comment documenting how rcu_seq_snap works

2018-05-11 Thread Joel Fernandes (Google)

rcu_seq_snap may be tricky for someone looking at it for the first time.
Lets document how it works with an example to make it easier.

Signed-off-by: Joel Fernandes (Google) 
---
v2 changes: Corrections as suggested by Randy.

 kernel/rcu/rcu.h | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 003671825d62..533bc1087371 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
WRITE_ONCE(*sp, rcu_seq_endval(sp));
 }
 
-/* Take a snapshot of the update side's sequence number. */
+/*
+ * Take a snapshot of the update side's sequence number.
+ *
+ * This function predicts what the grace period number will be the next
+ * time an RCU callback will be executed, given the current grace period's
+ * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
+ * already in progress.
+ *
+ * We do this with a single addition and masking.
+ * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) of
+ * the seq is used to track if a GP is in progress or not, its sufficient if we
+ * add (2+1) and mask with ~1. Lets see why with an example:
+ *
+ * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
+ * To get the next GP number, we have to at least add 0b10 to this (0x1 << 1)
+ * to account for the state bit. However, if the current seq is 7 (gp is 3 and
+ * state bit is 1), then it means the current grace period is already in
+ * progress so the next time the callback will run is at the end of grace
+ * period number gp+2. To account for the extra +1, we just overflow the LSB by
+ * adding another 0x1 and masking with ~0x1. In case no GP was in progress (RCU
+ * is idle), then the addition of the extra 0x1 and masking will have no
+ * effect. This is calculated as below.
+ */
 static inline unsigned long rcu_seq_snap(unsigned long *sp)
 {
unsigned long s;
-- 
2.17.0.441.gb46fe60e1d-goog
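
The arithmetic described in the comment can be checked in user space. What follows is a minimal sketch, not the kernel implementation: it assumes the simplified single state bit used in the comment's example, and the names SEQ_CTR_SHIFT, SEQ_STATE_MASK, seq_snap_model() and seq_gp() are hypothetical. It also leaves out the READ_ONCE() and memory-ordering concerns of the real function.

```c
/* Assumption: a single state bit, as in the comment's example. */
#define SEQ_CTR_SHIFT   1
#define SEQ_STATE_MASK  ((1UL << SEQ_CTR_SHIFT) - 1)

/*
 * User-space model of the snapshot arithmetic: add one full
 * grace-period step plus the state mask, then clear the state bits.
 * With a one-bit mask this is (seq + 3) & ~1.
 */
static unsigned long seq_snap_model(unsigned long seq)
{
	return (seq + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

/* Grace-period counter encoded in a sequence value. */
static unsigned long seq_gp(unsigned long seq)
{
	return seq >> SEQ_CTR_SHIFT;
}
```

Under these assumptions, an idle seq of 6 (gp 3) maps to 8 (gp 4, that is gp+1), while an in-progress seq of 7 (gp 3, state bit set) maps to 10 (gp 5, that is gp+2), matching the two cases in the comment.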

[PATCH] rcu: Add comment documenting how rcu_seq_snap works

2018-05-11 Thread Randy Dunlap

On 05/11/2018 05:33 PM, Joel Fernandes (Google) wrote:
> rcu_seq_snap may be tricky for someone looking at it for the first time.
> Lets document how it works with an example to make it easier.
> 
> Signed-off-by: Joel Fernandes (Google) 
> ---
>  kernel/rcu/rcu.h | 23 ++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index 003671825d62..004ace3d22c2 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -91,7 +91,28 @@ static inline void rcu_seq_end(unsigned long *sp)
>   WRITE_ONCE(*sp, rcu_seq_endval(sp));
>  }
>  
> -/* Take a snapshot of the update side's sequence number. */
> +/*
> + * Take a snapshot of the update side's sequence number.
> + *
> + * This function predicts what the grace period number will be the next
> + * time an RCU callback will be executed, given the current grace period's
> + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> + * already in progress.
> + *
> + * We do this with a single addition and masking.
> + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) of
> + * the seq is used to track if a GP is in progress or not, its sufficient if we
> + * add (2+1) and mask with ~1. Lets see why with an example:
> + *
> + * Say the current seq is 6 which is 0x110 (gp is 3 and state bit is 0).

0b110
   or   0x6

> + * To get the next GP number, we have to atleast add 0x10 to this (0x1 << 1) to

at least add 0b10

> + * account for the state bit. However, if the current seq is 7 (GP num is 3
> + * and state bit is 1), then it means the current grace period is already
> + * in progress so the next the callback will run is at gp+2. To account for

  so the next time? the callback will run

> + * the extra +1, we just overflow the LSB by adding another 0x1 and masking
> + * with ~0x1. Incase no GP was in progress (RCU is idle), then the adding

 In case

> + * by 0x1 and masking will have no effect. This is calculated as below.
> + */
>  static inline unsigned long rcu_seq_snap(unsigned long *sp)
>  {
>   unsigned long s;
> 


-- 
~Randy

[PATCH 1/2] regulator: of: add property for allowed modes specification

2018-05-11 Thread David Collins

Add a common device tree property for regulator nodes to support
the specification of allowed operating modes.

Signed-off-by: David Collins 
---
 Documentation/devicetree/bindings/regulator/regulator.txt | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/regulator/regulator.txt b/Documentation/devicetree/bindings/regulator/regulator.txt
index 2babe15b..c627aa0 100644
--- a/Documentation/devicetree/bindings/regulator/regulator.txt
+++ b/Documentation/devicetree/bindings/regulator/regulator.txt
@@ -59,6 +59,11 @@ Optional properties:
 - regulator-initial-mode: initial operating mode. The set of possible operating
   modes depends on the capabilities of every hardware so each device binding
   documentation explains which values the regulator supports.
+- regulator-allowed-modes: list of operating modes that software is allowed to
+  configure for the regulator at run-time.  Elements may be specified in any
+  order.  The set of possible operating modes depends on the capabilities of
+  every hardware so each device binding document explains which values the
+  regulator supports.
 - regulator-system-load: Load in uA present on regulator that is not captured by
   any consumer request.
 - regulator-pull-down: Enable pull down resistor when the regulator is disabled.
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

[PATCH 2/2] regulator: of: add support for allowed modes configuration

2018-05-11 Thread David Collins

Add support for configuring the machine constraints
valid_modes_mask element based on a list of allowed modes
specified via a device tree property.

Signed-off-by: David Collins 
---
 drivers/regulator/of_regulator.c | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/regulator/of_regulator.c b/drivers/regulator/of_regulator.c
index 0d3f73e..d61fed2 100644
--- a/drivers/regulator/of_regulator.c
+++ b/drivers/regulator/of_regulator.c
@@ -32,7 +32,7 @@ static void of_get_regulation_constraints(struct device_node *np,
 	struct regulator_state *suspend_state;
 	struct device_node *suspend_np;
 	unsigned int mode;
-	int ret, i;
+	int ret, i, len;
 	u32 pval;
 
 	constraints->name = of_get_property(np, "regulator-name", NULL);
@@ -136,6 +136,33 @@ static void of_get_regulation_constraints(struct device_node *np,
 		}
 	}
 
+	len = of_property_count_elems_of_size(np, "regulator-allowed-modes",
+						sizeof(u32));
+	if (len > 0) {
+		if (desc && desc->of_map_mode) {
+			for (i = 0; i < len; i++) {
+				ret = of_property_read_u32_index(np,
+					"regulator-allowed-modes", i, &pval);
+				if (ret) {
+					pr_err("%s: couldn't read allowed modes index %d, ret=%d\n",
+						np->name, i, ret);
+					break;
+				}
+				mode = desc->of_map_mode(pval);
+				if (mode == REGULATOR_MODE_INVALID)
+					pr_err("%s: invalid regulator-allowed-modes element %u\n",
+						np->name, pval);
+				else
+					constraints->valid_modes_mask |= mode;
+			}
+			if (constraints->valid_modes_mask)
+				constraints->valid_ops_mask
+					|= REGULATOR_CHANGE_MODE;
+		} else {
+			pr_warn("%s: mode mapping not defined\n", np->name);
+		}
+	}
+
 	if (!of_property_read_u32(np, "regulator-system-load", &pval))
 		constraints->system_load = pval;
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
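
The mask-building loop in this patch can be modelled outside the kernel. The following is a rough sketch under stated assumptions: the MODE_* constants, example_map_mode() and build_valid_modes_mask() are hypothetical stand-ins for the REGULATOR_MODE_* values, a driver's of_map_mode callback, and the loop over the regulator-allowed-modes elements. Device tree access and error logging are left out.

```c
#include <stddef.h>

/* Hypothetical mode bits; stand-ins for the REGULATOR_MODE_* constants. */
#define MODE_INVALID 0x0u
#define MODE_FAST    0x1u
#define MODE_NORMAL  0x2u
#define MODE_IDLE    0x4u

/* Hypothetical driver callback: map a raw property value to a mode bit. */
static unsigned int example_map_mode(unsigned int dt_val)
{
	switch (dt_val) {
	case 0:
		return MODE_FAST;
	case 1:
		return MODE_NORMAL;
	case 2:
		return MODE_IDLE;
	default:
		return MODE_INVALID;
	}
}

/*
 * OR together the mode bits of every element that maps to a valid mode.
 * Elements that map to MODE_INVALID are skipped, so the order of the
 * list does not affect the result.
 */
static unsigned int build_valid_modes_mask(const unsigned int *vals, size_t n,
					   unsigned int (*map_mode)(unsigned int))
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int mode = map_mode(vals[i]);

		if (mode != MODE_INVALID)
			mask |= mode;
	}
	return mask;
}
```

With this hypothetical mapping, the list {2, 0, 7} would yield MODE_FAST | MODE_IDLE, since 7 maps to no valid mode. A non-zero mask is the condition under which the patch also sets REGULATOR_CHANGE_MODE in valid_ops_mask.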

[PATCH 0/2] regulator: of: add device tree property for allowed modes

2018-05-11 Thread David Collins

There is currently no accepted way to configure constraints->valid_modes_mask
for regulators defined in device tree.  This patch series defines a new
common regulator device tree property, regulator-allowed-modes, which can be
used to specify the set of modes that the regulator is allowed to use.
It also implements parsing for this new property inside of the
of_get_regulation_constraints() function.

David Collins (2):
  regulator: of: add property for allowed modes specification
  regulator: of: add support for allowed modes configuration

 .../devicetree/bindings/regulator/regulator.txt|  5 
 drivers/regulator/of_regulator.c   | 29 +-
 2 files changed, 33 insertions(+), 1 deletion(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

Re: [PATCH] rcu: trace: Remove Startedleaf from trace events comment

2018-05-11 Thread Joel Fernandes

On Fri, May 11, 2018 at 6:29 PM, Joel Fernandes (Google) wrote:
>
> As part of the gp_seq clean up, the Startleaf condition doesn't occur
> anymore. Remove it from the comment in the trace event file.

Sorry, I meant here Startedleaf. Let me know if you want me to resend the patch.

thanks,

- Joel

[PATCH] rcu: trace: Remove Startedleaf from trace events comment

2018-05-11 Thread Joel Fernandes (Google)

As part of the gp_seq clean up, the Startedleaf condition doesn't occur
anymore. Remove it from the comment in the trace event file.

Signed-off-by: Joel Fernandes (Google) 
---
 include/trace/events/rcu.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index ce9d1a1cac78..6d8dd04912d2 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -91,7 +91,6 @@ TRACE_EVENT(rcu_grace_period,
  *
  * "Startleaf": Request a grace period based on leaf-node data.
  * "Prestarted": Someone beat us to the request
- * "Startedleaf": Leaf-node start proved sufficient.
  * "Startedleafroot": Leaf-node start proved sufficient after checking root.
  * "Startedroot": Requested a nocb grace period based on root-node data.
  * "NoGPkthread": The RCU grace-period kthread has not yet started.
-- 
2.17.0.441.gb46fe60e1d-goog

[RFC] crypto: Remove mcryptd

2018-05-11 Thread Megha Dey

This patch attempts to remove the mcryptd interface and expose the
sha1 multibuffer algorithm as a proper ahash to the inner algorithm.

1. Host the flusher helper in sha1_mb.c instead of mcryptd.c (need to
change the names of these functions)
2. Remove unnecessary mcryptd structure mcryptd_hash_ctx
(combine sha_mb_ctx and mcryptd_hash_ctx)
3. Introduce a new simd_ahash_create_compat() similar to the
simd_skcipher_create_compat() in simd.c. This registers the outer
algorithm. Remove existing outer algorithm.
4. In the outer layer (simd wrapper), pass the right pointers to the
inner algorithm. (Will shift 3 and 4 to simd.c later.)
5. Remove mcryptd.c
6. Update the name, driver name and priority of inner algorithm.

Herbert,
I would like to know if the above approach is what you are suggesting.
The problem with this approach is there is no async workqueue context
which issues completions. Instead everything runs in a single thread of
execution. You had suggested that the SIMD wrapper will defer the job to
the Kthread context, but I am not sure that will be done.

Please let me know what you think.

Signed-off-by: Megha Dey 
---
 arch/x86/crypto/sha1-mb/sha1_mb.c | 312 +++--
 crypto/Makefile   |   1 -
 crypto/mcryptd.c  | 702 --
 include/crypto/mcryptd.h  |   5 -
 4 files changed, 200 insertions(+), 820 deletions(-)
 delete mode 100644 crypto/mcryptd.c

diff --git a/arch/x86/crypto/sha1-mb/sha1_mb.c b/arch/x86/crypto/sha1-mb/sha1_mb.c
index acf9fdf..b8c03ce 100644
--- a/arch/x86/crypto/sha1-mb/sha1_mb.c
+++ b/arch/x86/crypto/sha1-mb/sha1_mb.c
@@ -71,10 +71,62 @@
 
 #define FLUSH_INTERVAL 1000 /* in usec */
 
+static struct crypto_ahash *tfm_compact;
+
+struct mcryptd_flush_list {
+   struct list_head list;
+   struct mutex lock;
+};
+
+static struct mcryptd_flush_list __percpu *mcryptd_flist;
+
+void mcryptd_arm_flusher(struct mcryptd_alg_cstate *cstate, unsigned long delay)
+{
+   struct mcryptd_flush_list *flist;
+
+   if (!cstate->flusher_engaged) {
+   /* put the flusher on the flush list */
+   flist = per_cpu_ptr(mcryptd_flist, smp_processor_id());
+   mutex_lock(&flist->lock);
+   list_add_tail(&cstate->flush_list, &flist->list);
+   cstate->flusher_engaged = true;
+   cstate->next_flush = jiffies + delay;
+   queue_delayed_work_on(smp_processor_id(), kcrypto_wq,
+   &cstate->flush, delay);
+   mutex_unlock(&flist->lock);
+   }
+}
+
+void mcryptd_flusher(struct work_struct *__work)
+{
+   struct  mcryptd_alg_cstate  *alg_cpu_state;
+   struct  mcryptd_alg_state   *alg_state;
+   struct  mcryptd_flush_list  *flist;
+   int cpu;
+
+   cpu = smp_processor_id();
+   alg_cpu_state = container_of(to_delayed_work(__work),
+   struct mcryptd_alg_cstate, flush);
+   alg_state = alg_cpu_state->alg_state;
+   if (alg_cpu_state->cpu != cpu)
+   pr_debug("mcryptd error: work on cpu %d, should be cpu %d\n",
+   cpu, alg_cpu_state->cpu);
+
+   if (alg_cpu_state->flusher_engaged) {
+   flist = per_cpu_ptr(mcryptd_flist, cpu);
+   mutex_lock(&flist->lock);
+   list_del(&alg_cpu_state->flush_list);
+   alg_cpu_state->flusher_engaged = false;
+   mutex_unlock(&flist->lock);
+   alg_state->flusher(alg_cpu_state);
+   }
+}
+
 static struct mcryptd_alg_state sha1_mb_alg_state;
 
 struct sha1_mb_ctx {
-   struct mcryptd_ahash *mcryptd_tfm;
+   struct crypto_ahash *child;
+   struct mcryptd_alg_state *alg_state;
 };
 
 static inline struct mcryptd_hash_request_ctx
@@ -530,7 +582,6 @@ static int sha1_mb_update(struct ahash_request *areq)
struct sha1_hash_ctx *sha_ctx;
int ret = 0, nbytes;
 
-
/* sanity check */
if (rctx->tag.cpu != smp_processor_id()) {
pr_err("mcryptd error: cpu clash\n");
@@ -667,7 +718,6 @@ static int sha1_mb_final(struct ahash_request *areq)
sha_ctx = sha1_ctx_mgr_submit(cstate->mgr, sha_ctx, &data, 0,
HASH_LAST);
kernel_fpu_end();
-
/* check if anything is returned */
if (!sha_ctx)
return -EINPROGRESS;
@@ -707,21 +757,12 @@ static int sha1_mb_import(struct ahash_request *areq, const void *in)
 
 static int sha1_mb_async_init_tfm(struct crypto_tfm *tfm)
 {
-   struct mcryptd_ahash *mcryptd_tfm;
struct sha1_mb_ctx *ctx = crypto_tfm_ctx(tfm);
-   struct mcryptd_hash_ctx *mctx;
 
-   mcryptd_tfm = mcryptd_alloc_ahash("__intel_sha1-mb",
-   CRYPTO_ALG_INTERNAL,
-   CRYPTO_ALG_INTERNAL);
-   if (IS_ERR(mc

RE: [PATCH V8 1/5] crypto: Multi-buffer encryption infrastructure support

2018-05-11 Thread Dey, Megha



>-----Original Message-----
>From: Herbert Xu [mailto:herb...@gondor.apana.org.au]
>Sent: Thursday, May 10, 2018 9:46 PM
>To: Dey, Megha 
>Cc: linux-kernel@vger.kernel.org; linux-cry...@vger.kernel.org;
>da...@davemloft.net
>Subject: Re: [PATCH V8 1/5] crypto: Multi-buffer encryption infrastructure
>support
>
>On Fri, May 11, 2018 at 01:24:42AM +, Dey, Megha wrote:
>>
>> Are you suggesting that the SIMD wrapper will do what is currently being
>> done by the 'mcryptd_queue_worker' function (assuming FPU is not disabled),
>> i.e. dispatching the job to the inner algorithm?
>>
>> I have got rid of the mcryptd layer (have an inner layer, outer SIMD layer,
>> handled the pointers and completions accordingly), but am still facing some
>> issues after removing the per cpu mcryptd_cpu_queue.
>
>Why don't you post what you've got and we can work it out together?

Hi Herbert,

Sure, I will post an RFC patch (crypto: Remove mcryptd).

>
>Thanks,
>--
>Email: Herbert Xu  Home Page:
>http://gondor.apana.org.au/~herbert/
>PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: WARNING: bad unlock balance in xfs_iunlock

2018-05-11 Thread Dave Chinner

On Fri, May 11, 2018 at 10:59:53AM +0200, Dmitry Vyukov wrote:
> On Thu, May 10, 2018 at 1:22 AM, Dave Chinner  wrote:
> > On Wed, May 09, 2018 at 10:43:05AM +0200, Dmitry Vyukov wrote:
> >> Does "xfstests fuzzing infrastructure" use coverage-guidance?
> >
> > It's guided manually to fuzz a substantial proportion of the fields
> > in the on-disk format that are susceptible to fuzzing based
> > attacks. It's not complete coverage yet, but it's getting better and
> > better, and we're finding more problems from it than random bit
> > based fuzzing has ever uncovered.
> >
> > Also, the xfstests fuzzing defeats the CRC protection now built into
> > the metadata, which means it can exercise all the new filesystem
> > features that random bit fuzzers cannot exercise. That's the problem
> > with fuzzers like syzbot - they can only usefully fuzz the legacy
> > filesystem format which doesn't have CRC validation, nor many of the
> > other protections that the current filesystem format has to detect
> > corruption. This will also allow us to test things like online
> > repair of fuzzed structures
> 
> syzkaller has 2 techniques to deal with checksums, if you are
> interested I can go into more detail.

You can if you want, but I'm betting it basically comes down to
teaching syzkaller about parts of the on-disk format, similar to
AFL. And, like AFL, I doubt any XFS developer has the time to
add such support to syzbot.

> > Given the results we're getting from our own fuzzers, I don't see
> > much point in (XFS developers) investing huge amounts of effort to
> > make some other fuzzer equivalent to what we already have. If
> > someone else starts fuzzing the current format (v5) XFS filesystems
> > and finding problems we haven't, then I'm going to be interested in
> > their fuzzing tools.  But (guided) random bit perturbation fuzzing
> > of legacy filesystem formats is really not that useful or
> > interesting to us right now.
> 
> Just asked.
> 
> Note that coverage-guidance does not necessarily mean bit flipping.
> syzkaller combines coverage-guidance with grammar-awareness and other
> smartness.

Yup, I assumed that this would be the case - those sorts of
"directed fuzzing" techniques were pioneered by the Samba guys for
reverse engineering the SMB protocol used by MS servers all those
years ago. But at its most basic level, it's still using bit
flipping techniques to perturb the input and provoke responses.

> Based on our experience with network testing, main advantage of
> syzkaller over just feeding blobs as network packets (even if these
> blobs are built in a very smart way) is the following. syzkaller can
> build complex interactions between syscalls, external inputs and
> blobs.

Yup, nothing new there - that's what every other filesystem fuzzer
infrastructure does, too.  The problem with this is that it doesn't
pin-point the actual operation that tripped over the on-disk
corruption. It's catching downstream symptoms of an unknown,
undetected on-disk format corruption. i.e. it's a poor substitute
for explicit testing of structure bounds and data relationships of a
known format.

That's the fundamental premise of fuzz testing - most software does
not have robust validation of its inputs and so fuzzing those
inputs finds problems. We've moved on from the old "trust and don't
validate" model of filesystem structure architecture.  The on-disk
format is very well defined, it is constrained in most cases, and we
can validate most individual structures at runtime with relatively
little cost.

Hence the "structure bounds" exploits that fuzzers tend to exercise
are pretty much taken out of the picture, and that leaves us with
"data relationships" between structures as the main vector for
undetected corruptions. These are mostly detectable, and many are
correctable as the current on-disk format has a lot of redundant
information. So the space for fuzzers to detect problems is getting
smaller and smaller all the time.

IOWs, filesystem image fuzzers have their place, but if you want us
to take your fuzzing seriously then your fuzzer needs to understand
all the mechanisms we now use to detect corruptions to show us where
they are deficient. If your fuzzing doesn't expose flaws in our
current validation techniques, then it's really not useful to us.

> For example, handling of external network packets depend on if
> there is an open socket on that port, what setsockopts were called, if
> there is a pending receive, what flags were passed to that receive,
> were some data sent the other way, etc. For filesystems that would be
> various filesystem syscalls executed against the mounted image,
> concurrent umount, rebind, switch to read-only mode, etc.
> But maybe xfstests do this too, I don't know. Do they?

Generally there is no need to do this because we know exactly what
syscalls will trigger access and/or modification to on-disk
structures. Access to the on-disk structures triggers the built in
verifier infrastruc

Re: [PATCH 0/3] KVM: VMX: Allow to disable ioport intercept per-VM by userspace

2018-05-11 Thread Wanpeng Li

2018-05-11 23:40 GMT+08:00 Konrad Rzeszutek Wilk :
> On Mon, Apr 16, 2018 at 10:45:59PM -0700, Wanpeng Li wrote:
>> Tim Shearer reported that "There is a guest which is running a packet
>> forwarding app based on the DPDK (dpdk.org). The packet receive routine
>> writes to 0xc070 using glibc's "outw_p" function which does an additional
>> write to I/O port 0x80. It does this write for every packet that's
>> received, causing a flood of KVM userspace context switches". He uses
>> mpstat to observe a CPU performing L2 packet forwarding on a pinned
>> guest vCPU, the guest time is 95 percent when allowing I/O port 0x80
>> bypass, however, it is 65.78 percent when I/O port 0x80 bypass is
>> disabled.
>>
>> This patchset introduces per-VM I/O permission bitmaps; userspace
>> can disable the ioport intercept when it is more concerned about
>> performance than security.
>
> Could you kindly also add:
>
> Suggested-by: Konrad Rzeszutek Wilk 

Yeah, both you and Liran gave the original idea. :) Tim and Liran, any
review for the patchset?

Regards,
Wanpeng Li

Re: KASAN: null-ptr-deref Read in rds_ib_get_mr

2018-05-11 Thread Yanjun Zhu




On 2018/5/12 0:58, Santosh Shilimkar wrote:

On 5/11/2018 12:48 AM, Yanjun Zhu wrote:



On 2018/5/11 13:20, DaeRyong Jeong wrote:

We report the crash: KASAN: null-ptr-deref Read in rds_ib_get_mr

Note that this bug was previously reported by syzkaller.
https://syzkaller.appspot.com/bug?id=0bb56a5a48b000b52aa2b0d8dd20b1f545214d91
Nonetheless, this bug has not been fixed yet, and we hope that this report and
our analysis, which was aided by RaceFuzzer's features, will be helpful in
fixing the crash.

This crash has been found in v4.17-rc1 using RaceFuzzer (a modified
version of Syzkaller), which we describe more at the end of this
report. Our analysis shows that the race occurs when invoking two
syscalls concurrently, bind$rds and setsockopt$RDS_GET_MR.


Analysis:
We think the concurrent execution of __rds_rdma_map() and rds_bind()
causes the problem. __rds_rdma_map() checks whether rs->rs_bound_addr is 0
or not. But the concurrent execution with rds_bind() can by-pass this
check. Therefore, __rds_rdma_map() calls rs->rs_transport->get_mr() and
rds_ib_get_mr() causes the null deref at ib_rdma.c:544 in v4.17-rc1, when
dereferencing rs_conn.


Thread interleaving:
CPU0 (__rds_rdma_map)                           CPU1 (rds_bind)
                                                // rds_add_bound() sets rs->bound_addr as non-zero
                                                ret = rds_add_bound(rs, sin->sin_addr.s_addr, &sin->sin_port);
if (rs->rs_bound_addr == 0 || !rs->rs_transport) {
        ret = -ENOTCONN; /* XXX not a great errno */
        goto out;
}
                                                if (rs->rs_transport) { /* previously bound */
                                                        trans = rs->rs_transport;
                                                        if (trans->laddr_check(sock_net(sock->sk),
                                                                               sin->sin_addr.s_addr) != 0) {
                                                                ret = -ENOPROTOOPT;
                                                                // rds_remove_bound() sets rs->bound_addr as 0
                                                                rds_remove_bound(rs);
...
trans_private = rs->rs_transport->get_mr(sg, nents, rs,
                                         &mr->r_key);
(in rds_ib_get_mr())
struct rds_ib_connection *ic = rs->rs_conn->c_transport_data;


Call sequence (v4.17-rc1):
CPU0
rds_setsockopt
rds_get_mr
    __rds_rdma_map
    rds_ib_get_mr


CPU1
rds_bind
rds_add_bound
...
rds_remove_bound


Crash log:
==
BUG: KASAN: null-ptr-deref in rds_ib_get_mr+0x3a/0x150 net/rds/ib_rdma.c:544
Read of size 8 at addr 0068 by task syz-executor0/32067

CPU: 0 PID: 32067 Comm: syz-executor0 Not tainted 4.17.0-rc1 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x166/0x21c lib/dump_stack.c:113
  kasan_report_error mm/kasan/report.c:352 [inline]
  kasan_report+0x140/0x360 mm/kasan/report.c:412
  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
  __asan_load8+0x54/0x90 mm/kasan/kasan.c:699
  rds_ib_get_mr+0x3a/0x150 net/rds/ib_rdma.c:544
  __rds_rdma_map+0x521/0x9d0 net/rds/rdma.c:271
  rds_get_mr+0xad/0xf0 net/rds/rdma.c:333
  rds_setsockopt+0x57f/0x720 net/rds/af_rds.c:347
  __sys_setsockopt+0x147/0x230 net/socket.c:1903
  __do_sys_setsockopt net/socket.c:1914 [inline]
  __se_sys_setsockopt net/socket.c:1911 [inline]
  __x64_sys_setsockopt+0x67/0x80 net/socket.c:1911
  do_syscall_64+0x15f/0x4a0 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4563f9
RSP: 002b:7f6a2b3c2b28 EFLAGS: 0246 ORIG_RAX: 0036
RAX: ffda RBX: 0072bee0 RCX: 004563f9
RDX: 0002 RSI: 0114 RDI: 0015
RBP: 0575 R08: 0020 R09: 
R10: 2140 R11: 0246 R12: 7f6a2b3c36d4
R13:  R14: 006fd398 R15: 
==

diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
index e678699..2228b50 100644
--- a/net/rds/ib_rdma.c
+++ b/net/rds/ib_rdma.c
@@ -539,11 +539,17 @@ void rds_ib_flush_mrs(void)
  void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents,
 struct rds_sock *rs, u32 *key_ret)
  {
-   struct rds_ib_device *rds_ibdev;
+   struct rds_ib_device *rds_ibdev = NULL;
 struct rds_ib_mr *ibmr = NULL;
-   struct rds_ib_connection *ic = rs->rs_conn->c_transport_data;
+   struct rds_ib_connection *ic = NULL;
 int ret;

+   if (rs->rs_bound_addr == 0) {
+   ret = -EPERM;
+   goto out;
+   }
+

No, you can't return such an error for this API, and the
socket related checks need to be done at the core layer.
I remember fixing this race but probably never pushed
the fix upstream.

OK. Wait for your patch. :-)


The MR code is due for update with optimized FRWR code
which now stable enough. We will address this iss

Re: [BUGFIX PATCH v3 0/4] arm: kprobes: Fix to prohibit probing on unsafe functions

2018-05-11 Thread Masami Hiramatsu

Hi Greg,

Could you pick this series to stable?

Thank you,

On Tue, 8 May 2018 12:25:03 +0100
Russell King - ARM Linux  wrote:

> On Fri, May 04, 2018 at 01:14:31PM +0900, Masami Hiramatsu wrote:
> > Hi,
> > 
> > This is the 3rd version of bugfix series for kprobes on arm.
> > This series fixes 4 different issues which I found.
> > 
> >  - Fix to use smp_processor_id() after disabling preemption.
> >  - Prohibit probing on optimized_callback() for avoiding
> >recursive probe.
> >  - Prohibit kprobes on do_undefinstr() by same reason.
> >  - Prohibit kprobes on get_user() by same reason.
> > 
> > From v2, I included another 2 bugfixes (1/4 and 2/4)
> > which are not merged yet, and added "Cc: sta...@vger.kernel.org",
> > since there are obvious bugs.
> 
> Please submit them to the patch system, thanks.
> 
> > 
> > Thanks,
> > 
> > ---
> > 
> > Masami Hiramatsu (4):
> >   arm: kprobes: Fix to use get_kprobe_ctlblk after irq-disabed
> >   arm: kprobes: Prohibit probing on optimized_callback
> >   arm: kprobes: Prohibit kprobes on do_undefinstr
> >   arm: kprobes: Prohibit kprobes on get_user functions
> > 
> > 
> >  arch/arm/include/asm/assembler.h  |   10 ++
> >  arch/arm/kernel/traps.c   |5 -
> >  arch/arm/lib/getuser.S|   10 ++
> >  arch/arm/probes/kprobes/opt-arm.c |4 +++-
> >  4 files changed, 27 insertions(+), 2 deletions(-)
> > 
> > --
> > Masami Hiramatsu (Linaro) 
> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
> According to speedtest.net: 8.21Mbps down 510kbps up


-- 
Masami Hiramatsu

[PATCH] rcu: Add comment documenting how rcu_seq_snap works

2018-05-11 Thread Joel Fernandes (Google)

rcu_seq_snap may be tricky for someone looking at it for the first time.
Let's document how it works with an example to make it easier.

Signed-off-by: Joel Fernandes (Google) 
---
 kernel/rcu/rcu.h | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 003671825d62..004ace3d22c2 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -91,7 +91,28 @@ static inline void rcu_seq_end(unsigned long *sp)
WRITE_ONCE(*sp, rcu_seq_endval(sp));
 }
 
-/* Take a snapshot of the update side's sequence number. */
+/*
+ * Take a snapshot of the update side's sequence number.
+ *
+ * This function predicts what the grace period number will be the next
+ * time an RCU callback will be executed, given the current grace period's
+ * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
+ * already in progress.
+ *
+ * We do this with a single addition and masking.
+ * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) of
+ * the seq is used to track if a GP is in progress or not, its sufficient if we
+ * add (2+1) and mask with ~1. Lets see why with an example:
+ *
+ * Say the current seq is 6 which is 0x110 (gp is 3 and state bit is 0).
+ * To get the next GP number, we have to atleast add 0x10 to this (0x1 << 1) to
+ * account for the state bit. However, if the current seq is 7 (GP num is 3
+ * and state bit is 1), then it means the current grace period is already
+ * in progress so the next the callback will run is at gp+2. To account for
+ * the extra +1, we just overflow the LSB by adding another 0x1 and masking
+ * with ~0x1. Incase no GP was in progress (RCU is idle), then the adding
+ * by 0x1 and masking will have no effect. This is calculated as below.
+ */
 static inline unsigned long rcu_seq_snap(unsigned long *sp)
 {
unsigned long s;
-- 
2.17.0.441.gb46fe60e1d-goog

Re: [GIT] Networking

2018-05-11 Thread Linus Torvalds

On Fri, May 11, 2018 at 5:10 PM David Miller  wrote:

> I guess this is my reward for trying to break the monotony of
> pull requests :-)

I actually went back and checked a few older pull requests to see if this
had been going on for a while and I just hadn't noticed.

It just took me by surprise :^p

   Linus

Re: [GIT] Networking

2018-05-11 Thread David Miller

From: Linus Torvalds 
Date: Fri, 11 May 2018 14:25:59 -0700

> David, is there something you want to tell us?
> 
> Drugs are bad, m'kay..

I guess this is my reward for trying to break the monotony of
pull requests :-)

Re: [PATCH net] net: dsa: bcm_sf2: Fix RX_CLS_LOC_ANY overwrite for last rule

2018-05-11 Thread David Miller

From: Florian Fainelli 
Date: Fri, 11 May 2018 16:38:02 -0700

> David, please discard that for now, the IPv4 part is correct, but I am
> not fixing the bug correctly for the IPv6 part. v2 coming some time next
> week. Thank you!

Ok.

[PATCH v9 02/12] drivers: base: cacheinfo: setup DT cache properties early

2018-05-11 Thread Jeremy Linton

The original intent in cacheinfo was that an architecture
specific populate_cache_leaves() would probe the hardware
and then cache_shared_cpu_map_setup() and
cache_override_properties() would provide firmware help to
extend/expand upon what was probed. Arm64 was really
the only architecture that was working this way, and
with the removal of most of the hardware probing logic it
became clear that it was possible to simplify the logic a bit.

This patch combines the walk of the DT nodes with the
code updating the cache size/line_size and nr_sets.
cache_override_properties() (which was DT specific) is
then removed. The result is that cacheinfo.of_node is
no longer used as a temporary place to hold DT references
for future calls that update cache properties. That change
helps to clarify its one remaining use (matching
cacheinfo nodes that represent shared caches) which
will be used by the ACPI/PPTT code in the following patches.

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Acked-by: Sudeep Holla 
Acked-by: Ard Biesheuvel 
---
 arch/riscv/kernel/cacheinfo.c |  1 -
 drivers/base/cacheinfo.c  | 65 +++
 2 files changed, 29 insertions(+), 37 deletions(-)

diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
index 10ed2749e246..0bc86e5f8f3f 100644
--- a/arch/riscv/kernel/cacheinfo.c
+++ b/arch/riscv/kernel/cacheinfo.c
@@ -20,7 +20,6 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 struct device_node *node,
 enum cache_type type, unsigned int level)
 {
-   this_leaf->of_node = node;
this_leaf->level = level;
this_leaf->type = type;
/* not a sector cache */
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 09ccef7ddc99..a872523e8951 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -71,7 +71,7 @@ static inline int get_cacheinfo_idx(enum cache_type type)
return type;
 }
 
-static void cache_size(struct cacheinfo *this_leaf)
+static void cache_size(struct cacheinfo *this_leaf, struct device_node *np)
 {
const char *propname;
const __be32 *cache_size;
@@ -80,13 +80,14 @@ static void cache_size(struct cacheinfo *this_leaf)
ct_idx = get_cacheinfo_idx(this_leaf->type);
propname = cache_type_info[ct_idx].size_prop;
 
-   cache_size = of_get_property(this_leaf->of_node, propname, NULL);
+   cache_size = of_get_property(np, propname, NULL);
if (cache_size)
this_leaf->size = of_read_number(cache_size, 1);
 }
 
 /* not cache_line_size() because that's a macro in include/linux/cache.h */
-static void cache_get_line_size(struct cacheinfo *this_leaf)
+static void cache_get_line_size(struct cacheinfo *this_leaf,
+   struct device_node *np)
 {
const __be32 *line_size;
int i, lim, ct_idx;
@@ -98,7 +99,7 @@ static void cache_get_line_size(struct cacheinfo *this_leaf)
const char *propname;
 
propname = cache_type_info[ct_idx].line_size_props[i];
-   line_size = of_get_property(this_leaf->of_node, propname, NULL);
+   line_size = of_get_property(np, propname, NULL);
if (line_size)
break;
}
@@ -107,7 +108,7 @@ static void cache_get_line_size(struct cacheinfo *this_leaf)
this_leaf->coherency_line_size = of_read_number(line_size, 1);
 }
 
-static void cache_nr_sets(struct cacheinfo *this_leaf)
+static void cache_nr_sets(struct cacheinfo *this_leaf, struct device_node *np)
 {
const char *propname;
const __be32 *nr_sets;
@@ -116,7 +117,7 @@ static void cache_nr_sets(struct cacheinfo *this_leaf)
ct_idx = get_cacheinfo_idx(this_leaf->type);
propname = cache_type_info[ct_idx].nr_sets_prop;
 
-   nr_sets = of_get_property(this_leaf->of_node, propname, NULL);
+   nr_sets = of_get_property(np, propname, NULL);
if (nr_sets)
this_leaf->number_of_sets = of_read_number(nr_sets, 1);
 }
@@ -135,32 +136,27 @@ static void cache_associativity(struct cacheinfo *this_leaf)
this_leaf->ways_of_associativity = (size / nr_sets) / line_size;
 }
 
-static bool cache_node_is_unified(struct cacheinfo *this_leaf)
+static bool cache_node_is_unified(struct cacheinfo *this_leaf,
+ struct device_node *np)
 {
-   return of_property_read_bool(this_leaf->of_node, "cache-unified");
+   return of_property_read_bool(np, "cache-unified");
 }
 
-static void cache_of_override_properties(unsigned int cpu)
+static void cache_of_set_props(struct cacheinfo *this_leaf,
+  struct device_node *np)
 {
-   int index;
-   struct cacheinfo *this_leaf;
-   struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
-
-

[PATCH v9 04/12] arm64/acpi: Create arch specific cpu to acpi id helper

2018-05-11 Thread Jeremy Linton

It's helpful to be able to look up the acpi_processor_id associated
with a logical cpu. Provide an arm64 helper to do this.

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Acked-by: Ard Biesheuvel 
---
 arch/arm64/include/asm/acpi.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 32f465a80e4e..0db62a4cbce2 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -86,6 +86,10 @@ static inline bool acpi_has_cpu_in_madt(void)
 }
 
 struct acpi_madt_generic_interrupt *acpi_cpu_get_madt_gicc(int cpu);
+static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
+{
+   return  acpi_cpu_get_madt_gicc(cpu)->uid;
+}
 
 static inline void arch_fix_phys_package_id(int num, u32 slot) { }
 void __init acpi_init_cpus(void);
-- 
2.13.6
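
For illustration only, here is a minimal userspace sketch of the lookup
the helper above performs. The names mock_gicc, mock_madt and
mock_get_acpi_id_for_cpu are hypothetical stand-ins and not kernel APIs;
the real struct acpi_madt_generic_interrupt carries many more fields,
and the UID values are made up.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct acpi_madt_generic_interrupt. */
struct mock_gicc {
	uint32_t uid;	/* ACPI processor UID */
};

#define MOCK_NR_CPUS 4

/* Hypothetical per-CPU table; the UID values are invented for the example. */
static struct mock_gicc mock_madt[MOCK_NR_CPUS] = {
	{ .uid = 0x100 }, { .uid = 0x101 }, { .uid = 0x200 }, { .uid = 0x201 },
};

/* Stand-in for acpi_cpu_get_madt_gicc(): logical CPU -> its GICC entry. */
static struct mock_gicc *mock_cpu_get_madt_gicc(unsigned int cpu)
{
	return &mock_madt[cpu];
}

/* Same shape as the helper in the patch: logical CPU -> ACPI processor UID. */
static inline uint32_t mock_get_acpi_id_for_cpu(unsigned int cpu)
{
	return mock_cpu_get_madt_gicc(cpu)->uid;
}
```

Like the helper in the patch, the sketch does no range checking on the
cpu argument; callers are assumed to pass a valid logical CPU number.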

[PATCH v9 01/12] drivers: base: cacheinfo: move cache_setup_of_node()

2018-05-11 Thread Jeremy Linton

In preparation for the next patch, and to aid in
review of that patch, let's move cache_setup_of_node()
further down in the module without any changes.

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Reviewed-by: Sudeep Holla 
Acked-by: Ard Biesheuvel 
---
 drivers/base/cacheinfo.c | 80 
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index edf726267282..09ccef7ddc99 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -32,46 +32,6 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
 }
 
 #ifdef CONFIG_OF
-static int cache_setup_of_node(unsigned int cpu)
-{
-   struct device_node *np;
-   struct cacheinfo *this_leaf;
-   struct device *cpu_dev = get_cpu_device(cpu);
-   struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
-   unsigned int index = 0;
-
-   /* skip if of_node is already populated */
-   if (this_cpu_ci->info_list->of_node)
-   return 0;
-
-   if (!cpu_dev) {
-   pr_err("No cpu device for CPU %d\n", cpu);
-   return -ENODEV;
-   }
-   np = cpu_dev->of_node;
-   if (!np) {
-   pr_err("Failed to find cpu%d device node\n", cpu);
-   return -ENOENT;
-   }
-
-   while (index < cache_leaves(cpu)) {
-   this_leaf = this_cpu_ci->info_list + index;
-   if (this_leaf->level != 1)
-   np = of_find_next_cache_node(np);
-   else
-   np = of_node_get(np);/* cpu node itself */
-   if (!np)
-   break;
-   this_leaf->of_node = np;
-   index++;
-   }
-
-   if (index != cache_leaves(cpu)) /* not all OF nodes populated */
-   return -ENOENT;
-
-   return 0;
-}
-
 static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
   struct cacheinfo *sib_leaf)
 {
@@ -202,6 +162,46 @@ static void cache_of_override_properties(unsigned int cpu)
cache_associativity(this_leaf);
}
 }
+
+static int cache_setup_of_node(unsigned int cpu)
+{
+   struct device_node *np;
+   struct cacheinfo *this_leaf;
+   struct device *cpu_dev = get_cpu_device(cpu);
+   struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+   unsigned int index = 0;
+
+   /* skip if of_node is already populated */
+   if (this_cpu_ci->info_list->of_node)
+   return 0;
+
+   if (!cpu_dev) {
+   pr_err("No cpu device for CPU %d\n", cpu);
+   return -ENODEV;
+   }
+   np = cpu_dev->of_node;
+   if (!np) {
+   pr_err("Failed to find cpu%d device node\n", cpu);
+   return -ENOENT;
+   }
+
+   while (index < cache_leaves(cpu)) {
+   this_leaf = this_cpu_ci->info_list + index;
+   if (this_leaf->level != 1)
+   np = of_find_next_cache_node(np);
+   else
+   np = of_node_get(np);/* cpu node itself */
+   if (!np)
+   break;
+   this_leaf->of_node = np;
+   index++;
+   }
+
+   if (index != cache_leaves(cpu)) /* not all OF nodes populated */
+   return -ENOENT;
+
+   return 0;
+}
 #else
 static void cache_of_override_properties(unsigned int cpu) { }
 static inline int cache_setup_of_node(unsigned int cpu) { return 0; }
-- 
2.13.6

[PATCH v9 05/12] ACPI/PPTT: Add Processor Properties Topology Table parsing

2018-05-11 Thread Jeremy Linton

ACPI 6.2 adds a new table, which describes how processing units
are related to each other in a tree-like fashion. Caches are
also sprinkled throughout the tree and describe the properties
of the caches in relation to other caches and processing units.

Add the code to parse the cache hierarchy and report the total
number of levels of cache for a given core using
acpi_find_last_cache_level() as well as fill out the individual
core's cache information with cache_setup_acpi() once the
cpu_cacheinfo structure has been populated by the arch specific
code.

An additional patch later in the set adds the ability to report
peers in the topology using find_acpi_cpu_topology()
to report a unique ID for each processing unit at a given level
in the tree. These unique IDs can then be used to match related
processing units which exist as threads, within a given
package, etc.
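
Because the PPTT uses byte offsets relative to the start of the table,
every reference has to be validated before it is dereferenced. The
following is a minimal userspace sketch of that kind of bounds checking,
modeled on fetch_pptt_subtable() in the diff below. The struct layout,
the mock_ names and the 16-byte sample table are assumptions made for
illustration, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified subtable header: one type byte, one length byte. */
struct mock_subtable_header {
	uint8_t type;
	uint8_t length;	/* length of this subtable in bytes */
};

/* Made-up 16-byte table: a valid entry at offset 4, an overlong one at 12. */
static const uint8_t mock_table[16] = {
	[4] = 1, [5] = 8,	/* type 1, length 8: fits (4 + 8 <= 16) */
	[12] = 1, [13] = 8,	/* type 1, length 8: overruns (12 + 8 > 16) */
};

static const struct mock_subtable_header *
mock_fetch_subtable(const uint8_t *table, uint32_t table_len, uint32_t ref)
{
	const struct mock_subtable_header *entry;

	/* there is no subtable at reference 0 (or inside the first bytes) */
	if (ref < sizeof(*entry))
		return NULL;

	/* the subtable header itself must fit inside the table */
	if ((uint64_t)ref + sizeof(*entry) > table_len)
		return NULL;

	entry = (const struct mock_subtable_header *)(table + ref);

	/* a zero-length entry is treated as invalid */
	if (entry->length == 0)
		return NULL;

	/* the whole subtable must fit inside the table */
	if ((uint64_t)ref + entry->length > table_len)
		return NULL;

	return entry;
}
```

The sketch widens the offset to 64 bits before adding, which is a choice
made here to keep the arithmetic obviously overflow-free; it is not a
claim about how the kernel code handles that case.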

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Acked-by: Sudeep Holla 
Acked-by: Ard Biesheuvel 
---
 drivers/acpi/pptt.c  | 655 +++
 include/linux/acpi.h |   4 +
 2 files changed, 659 insertions(+)
 create mode 100644 drivers/acpi/pptt.c

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
new file mode 100644
index ..e5ea1974d1e3
--- /dev/null
+++ b/drivers/acpi/pptt.c
@@ -0,0 +1,655 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * pptt.c - parsing of Processor Properties Topology Table (PPTT)
+ *
+ * Copyright (C) 2018, ARM
+ *
+ * This file implements parsing of the Processor Properties Topology Table
+ * which is optionally used to describe the processor and cache topology.
+ * Due to the relative pointers used throughout the table, this doesn't
+ * leverage the existing subtable parsing in the kernel.
+ *
+ * The PPTT structure is an inverted tree, with each node potentially
+ * holding one or two inverted tree data structures describing
+ * the caches available at that level. Each cache structure optionally
+ * contains properties describing the cache at a given level which can be
+ * used to override hardware probed values.
+ */
+#define pr_fmt(fmt) "ACPI PPTT: " fmt
+
+#include 
+#include 
+#include 
+
+static struct acpi_subtable_header *fetch_pptt_subtable(struct acpi_table_header *table_hdr,
+   u32 pptt_ref)
+{
+   struct acpi_subtable_header *entry;
+
+   /* there isn't a subtable at reference 0 */
+   if (pptt_ref < sizeof(struct acpi_subtable_header))
+   return NULL;
+
+   if (pptt_ref + sizeof(struct acpi_subtable_header) > table_hdr->length)
+   return NULL;
+
+   entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr, pptt_ref);
+
+   if (entry->length == 0)
+   return NULL;
+
+   if (pptt_ref + entry->length > table_hdr->length)
+   return NULL;
+
+   return entry;
+}
+
+static struct acpi_pptt_processor *fetch_pptt_node(struct acpi_table_header *table_hdr,
+  u32 pptt_ref)
+{
+   return (struct acpi_pptt_processor *)fetch_pptt_subtable(table_hdr, pptt_ref);
+}
+
+static struct acpi_pptt_cache *fetch_pptt_cache(struct acpi_table_header *table_hdr,
+   u32 pptt_ref)
+{
+   return (struct acpi_pptt_cache *)fetch_pptt_subtable(table_hdr, pptt_ref);
+}
+
+static struct acpi_subtable_header *acpi_get_pptt_resource(struct acpi_table_header *table_hdr,
+  struct acpi_pptt_processor *node,
+  int resource)
+{
+   u32 *ref;
+
+   if (resource >= node->number_of_priv_resources)
+   return NULL;
+
+   ref = ACPI_ADD_PTR(u32, node, sizeof(struct acpi_pptt_processor));
+   ref += resource;
+
+   return fetch_pptt_subtable(table_hdr, *ref);
+}
+
+static inline bool acpi_pptt_match_type(int table_type, int type)
+{
+   return ((table_type & ACPI_PPTT_MASK_CACHE_TYPE) == type ||
+   table_type & ACPI_PPTT_CACHE_TYPE_UNIFIED & type);
+}
+
+/**
+ * acpi_pptt_walk_cache() - Attempt to find the requested acpi_pptt_cache
+ * @table_hdr: Pointer to the head of the PPTT table
+ * @local_level: passed res reflects this cache level
+ * @res: cache resource in the PPTT we want to walk
+ * @found: returns a pointer to the requested level if found
+ * @level: the requested cache level
+ * @type: the requested cache type
+ *
+ * Attempt to find a given cache level, while counting the max number
+ * of cache levels for the cache node.
+ *
+ * Given a pptt resource, verify that it is a cache node, then walk
+ * down each level of caches, counting how many levels are found
+ * as well as checking the cache type (icache, dcache, unified). If a
+ * level & type match, then we set found, and continue the search.
+ * On

[PATCH v9 07/12] drivers: base cacheinfo: Add support for ACPI based firmware tables

2018-05-11 Thread Jeremy Linton

Call ACPI cache parsing routines from base cacheinfo code if ACPI
is enabled. Also stub out cache_setup_acpi and acpi_find_last_cache_level
so that individual architectures can enable ACPI topology parsing.

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Acked-by: Sudeep Holla 
Acked-by: Ard Biesheuvel 
---
 drivers/base/cacheinfo.c  | 14 ++
 include/linux/cacheinfo.h | 17 +
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 597aacb233fc..2880e2ab01f5 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -206,7 +206,7 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
   struct cacheinfo *sib_leaf)
 {
/*
-* For non-DT systems, assume unique level 1 cache, system-wide
+* For non-DT/ACPI systems, assume unique level 1 caches, system-wide
 * shared caches for all other levels. This will be used only if
 * arch specific code has not populated shared_cpu_map
 */
@@ -214,6 +214,11 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 }
 #endif
 
+int __weak cache_setup_acpi(unsigned int cpu)
+{
+   return -ENOTSUPP;
+}
+
 static int cache_shared_cpu_map_setup(unsigned int cpu)
 {
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
@@ -227,8 +232,8 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
if (of_have_populated_dt())
ret = cache_setup_of_node(cpu);
else if (!acpi_disabled)
-   /* No cache property/hierarchy support yet in ACPI */
-   ret = -ENOTSUPP;
+   ret = cache_setup_acpi(cpu);
+
if (ret)
return ret;
 
@@ -279,7 +284,8 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
}
-   of_node_put(this_leaf->fw_token);
+   if (of_have_populated_dt())
+   of_node_put(this_leaf->fw_token);
}
 }
 
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 0c6f658054d2..89397e30e269 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -97,6 +97,23 @@ int func(unsigned int cpu) \
 struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
 int init_cache_level(unsigned int cpu);
 int populate_cache_leaves(unsigned int cpu);
+int cache_setup_acpi(unsigned int cpu);
+#ifndef CONFIG_ACPI
+/*
+ * acpi_find_last_cache_level is only called on ACPI enabled
+ * platforms using the PPTT for topology. This means that if
+ * the platform supports other firmware configuration methods
+ * we need to stub out the call when ACPI is disabled.
+ * ACPI enabled platforms not using PPTT won't be making calls
+ * to this function so we need not worry about them.
+ */
+static inline int acpi_find_last_cache_level(unsigned int cpu)
+{
+   return 0;
+}
+#else
+int acpi_find_last_cache_level(unsigned int cpu);
+#endif
 
 const struct attribute_group *cache_get_priv_group(struct cacheinfo *this_leaf);
 
-- 
2.13.6
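
As an illustration of the pattern this patch uses, the sketch below
shows a weak default that reports "not supported" until an arch or ACPI
build provides a strong definition, plus the DT-first, ACPI-second
selection. All mock_ names are hypothetical; MOCK_ENOTSUPP mirrors the
kernel-internal ENOTSUPP value (524), which userspace errno.h does not
define. The weak attribute assumes a GCC- or Clang-compatible compiler.

```c
#include <stdbool.h>

#define MOCK_ENOTSUPP 524	/* kernel-internal value, defined here by hand */

/* Weak default: used unless another object file supplies a strong symbol. */
__attribute__((weak)) int mock_cache_setup_acpi(unsigned int cpu)
{
	(void)cpu;
	return -MOCK_ENOTSUPP;
}

/* Hypothetical stand-in for cache_setup_of_node(); always succeeds here. */
static int mock_cache_setup_of_node(unsigned int cpu)
{
	(void)cpu;
	return 0;
}

/* Selection order as in the hunk above: device tree first, then ACPI. */
static int mock_cache_fw_setup(unsigned int cpu, bool have_dt, bool acpi_disabled)
{
	int ret = 0;

	if (have_dt)
		ret = mock_cache_setup_of_node(cpu);
	else if (!acpi_disabled)
		ret = mock_cache_setup_acpi(cpu);

	return ret;
}
```

With only the weak default linked in, an ACPI-booted system in this
sketch gets -MOCK_ENOTSUPP, which matches the behavior the removed
"No cache property/hierarchy support yet in ACPI" branch had.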

[PATCH v9 09/12] arm64: topology: rename cluster_id

2018-05-11 Thread Jeremy Linton

The cluster concept isn't architecturally defined for arm64.
Let's match the name of the arm64 topology field to the kernel macro
that uses it.

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Acked-by: Sudeep Holla 
Acked-by: Ard Biesheuvel 
Acked-by: Morten Rasmussen 
---
 arch/arm64/include/asm/topology.h |  4 ++--
 arch/arm64/kernel/topology.c  | 26 +-
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index c4f2d50491eb..6b10459e6905 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -7,14 +7,14 @@
 struct cpu_topology {
int thread_id;
int core_id;
-   int cluster_id;
+   int package_id;
cpumask_t thread_sibling;
cpumask_t core_sibling;
 };
 
 extern struct cpu_topology cpu_topology[NR_CPUS];
 
-#define topology_physical_package_id(cpu)  (cpu_topology[cpu].cluster_id)
+#define topology_physical_package_id(cpu)  (cpu_topology[cpu].package_id)
 #define topology_core_id(cpu)  (cpu_topology[cpu].core_id)
 #define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
 #define topology_sibling_cpumask(cpu)  (&cpu_topology[cpu].thread_sibling)
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 21868530018e..dc18b1e53194 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -47,7 +47,7 @@ static int __init get_cpu_for_node(struct device_node *node)
return cpu;
 }
 
-static int __init parse_core(struct device_node *core, int cluster_id,
+static int __init parse_core(struct device_node *core, int package_id,
 int core_id)
 {
char name[10];
@@ -63,7 +63,7 @@ static int __init parse_core(struct device_node *core, int cluster_id,
leaf = false;
cpu = get_cpu_for_node(t);
if (cpu >= 0) {
-   cpu_topology[cpu].cluster_id = cluster_id;
+   cpu_topology[cpu].package_id = package_id;
cpu_topology[cpu].core_id = core_id;
cpu_topology[cpu].thread_id = i;
} else {
@@ -85,7 +85,7 @@ static int __init parse_core(struct device_node *core, int cluster_id,
return -EINVAL;
}
 
-   cpu_topology[cpu].cluster_id = cluster_id;
+   cpu_topology[cpu].package_id = package_id;
cpu_topology[cpu].core_id = core_id;
} else if (leaf) {
pr_err("%pOF: Can't get CPU for leaf core\n", core);
@@ -101,7 +101,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
bool leaf = true;
bool has_cores = false;
struct device_node *c;
-   static int cluster_id __initdata;
+   static int package_id __initdata;
int core_id = 0;
int i, ret;
 
@@ -140,7 +140,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
}
 
if (leaf) {
-   ret = parse_core(c, cluster_id, core_id++);
+   ret = parse_core(c, package_id, core_id++);
} else {
pr_err("%pOF: Non-leaf cluster with core %s\n",
   cluster, name);
@@ -158,7 +158,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
pr_warn("%pOF: empty cluster\n", cluster);
 
if (leaf)
-   cluster_id++;
+   package_id++;
 
return 0;
 }
@@ -194,7 +194,7 @@ static int __init parse_dt_topology(void)
 * only mark cores described in the DT as possible.
 */
for_each_possible_cpu(cpu)
-   if (cpu_topology[cpu].cluster_id == -1)
+   if (cpu_topology[cpu].package_id == -1)
ret = -EINVAL;
 
 out_map:
@@ -224,7 +224,7 @@ static void update_siblings_masks(unsigned int cpuid)
for_each_possible_cpu(cpu) {
cpu_topo = &cpu_topology[cpu];
 
-   if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
+   if (cpuid_topo->package_id != cpu_topo->package_id)
continue;
 
cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
@@ -245,7 +245,7 @@ void store_cpu_topology(unsigned int cpuid)
struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
u64 mpidr;
 
-   if (cpuid_topo->cluster_id != -1)
+   if (cpuid_topo->package_id != -1)
goto topology_populated;
 
mpidr = read_cpuid_mpidr();
@@ -259,19 +259,19 @@ void store_cpu_topology(unsigned int cpuid)
/* Multiproc

[PATCH v9 08/12] arm64: Add support for ACPI based firmware tables

2018-05-11 Thread Jeremy Linton

The /sys cache entries should support ACPI/PPTT generated cache
topology information.  For arm64, if ACPI is enabled, determine
the max number of cache levels and populate them using the PPTT
table if one is available.

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Reviewed-by: Sudeep Holla 
Acked-by: Ard Biesheuvel 
---
 arch/arm64/kernel/cacheinfo.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
index 380f2e2fbed5..0bf0a835122f 100644
--- a/arch/arm64/kernel/cacheinfo.c
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -17,6 +17,7 @@
  * along with this program.  If not, see .
  */
 
+#include 
 #include 
 #include 
 
@@ -46,7 +47,7 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 
 static int __init_cache_level(unsigned int cpu)
 {
-   unsigned int ctype, level, leaves, of_level;
+   unsigned int ctype, level, leaves, fw_level;
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 
for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
@@ -59,15 +60,19 @@ static int __init_cache_level(unsigned int cpu)
leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
}
 
-   of_level = of_find_last_cache_level(cpu);
-   if (level < of_level) {
+   if (acpi_disabled)
+   fw_level = of_find_last_cache_level(cpu);
+   else
+   fw_level = acpi_find_last_cache_level(cpu);
+
+   if (level < fw_level) {
/*
 * some external caches not specified in CLIDR_EL1
 * the information may be available in the device tree
 * only unified external caches are considered here
 */
-   leaves += (of_level - level);
-   level = of_level;
+   leaves += (fw_level - level);
+   level = fw_level;
}
 
this_cpu_ci->num_levels = level;
-- 
2.13.6
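
The level adjustment in this patch can be sketched in isolation as
follows. This is a hedged userspace illustration, not the kernel code:
struct mock_levels and mock_merge_fw_levels() are hypothetical names,
and the inputs stand for the values the CPU's own ID registers reported
(hw_level, hw_leaves) and the highest cache level the firmware describes
(fw_level, from DT or PPTT).

```c
/* Hypothetical result pair: number of cache levels and number of leaves. */
struct mock_levels {
	unsigned int level;
	unsigned int leaves;
};

/*
 * If firmware describes more levels than the hardware registers do, each
 * extra level is assumed to be a single unified cache, so it adds exactly
 * one leaf. Otherwise the hardware-probed values are kept unchanged.
 */
static struct mock_levels mock_merge_fw_levels(unsigned int hw_level,
					       unsigned int hw_leaves,
					       unsigned int fw_level)
{
	struct mock_levels out = { hw_level, hw_leaves };

	if (out.level < fw_level) {
		out.leaves += fw_level - out.level;
		out.level = fw_level;
	}

	return out;
}
```

For example, two hardware levels with three leaves (split L1, unified
L2) plus a firmware-described level 3 yield three levels and four
leaves in this sketch.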

[PATCH v9 10/12] arm64: topology: enable ACPI/PPTT based CPU topology

2018-05-11 Thread Jeremy Linton

Propagate the topology information from the PPTT tree to the
cpu_topology array. We can get the thread id and core_id by assuming
certain levels of the PPTT tree correspond to those concepts.
The package_id is flagged in the tree and can be found by calling
find_acpi_cpu_topology_package() which terminates
its search when it finds an ACPI node flagged as the physical
package. If the tree doesn't contain enough levels to represent
all of the requested levels then the root node will be returned
for all subsequent levels.

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Acked-by: Sudeep Holla 
Acked-by: Ard Biesheuvel 
Acked-by: Morten Rasmussen 
---
 arch/arm64/kernel/topology.c | 45 +++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index dc18b1e53194..047d98e68502 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -11,6 +11,7 @@
  * for more details.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -22,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -296,6 +298,45 @@ static void __init reset_cpu_topology(void)
}
 }
 
+#ifdef CONFIG_ACPI
+/*
+ * Propagate the topology information of the processor_topology_node tree to the
+ * cpu_topology array.
+ */
+static int __init parse_acpi_topology(void)
+{
+   bool is_threaded;
+   int cpu, topology_id;
+
+   is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;
+
+   for_each_possible_cpu(cpu) {
+   topology_id = find_acpi_cpu_topology(cpu, 0);
+   if (topology_id < 0)
+   return topology_id;
+
+   if (is_threaded) {
+   cpu_topology[cpu].thread_id = topology_id;
+   topology_id = find_acpi_cpu_topology(cpu, 1);
+   cpu_topology[cpu].core_id   = topology_id;
+   } else {
+   cpu_topology[cpu].thread_id  = -1;
+   cpu_topology[cpu].core_id= topology_id;
+   }
+   topology_id = find_acpi_cpu_topology_package(cpu);
+   cpu_topology[cpu].package_id = topology_id;
+   }
+
+   return 0;
+}
+
+#else
+static inline int __init parse_acpi_topology(void)
+{
+   return -EINVAL;
+}
+#endif
+
 void __init init_cpu_topology(void)
 {
reset_cpu_topology();
@@ -304,6 +345,8 @@ void __init init_cpu_topology(void)
 * Discard anything that was parsed if we hit an error so we
 * don't use partial information.
 */
-   if (of_have_populated_dt() && parse_dt_topology())
+   if (!acpi_disabled && parse_acpi_topology())
+   reset_cpu_topology();
+   else if (of_have_populated_dt() && parse_dt_topology())
reset_cpu_topology();
 }
-- 
2.13.6
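
To make the level-to-field mapping in parse_acpi_topology() easier to
follow, here is a self-contained userspace sketch of the same control
flow. Everything prefixed mock_ is hypothetical: the lookup functions
stand in for find_acpi_cpu_topology() and
find_acpi_cpu_topology_package(), and the IDs they return (two threads
per core, one package) are invented for the example.

```c
#include <stdbool.h>

#define MOCK_NR_CPUS 4

struct mock_cpu_topology {
	int thread_id;
	int core_id;
	int package_id;
};

static struct mock_cpu_topology mock_topology[MOCK_NR_CPUS];

/*
 * Stand-in for find_acpi_cpu_topology(): level 0 is the leaf node,
 * level 1 is its parent. Returns a negative value for unknown CPUs.
 */
static int mock_find_topology(unsigned int cpu, int level)
{
	if (cpu >= MOCK_NR_CPUS)
		return -1;
	return level == 0 ? (int)cpu : 100 + (int)(cpu / 2);
}

/* Stand-in for find_acpi_cpu_topology_package(): one package for all CPUs. */
static int mock_find_package(unsigned int cpu)
{
	(void)cpu;
	return 1000;
}

static int mock_parse_acpi_topology(bool is_threaded)
{
	for (unsigned int cpu = 0; cpu < MOCK_NR_CPUS; cpu++) {
		int id = mock_find_topology(cpu, 0);

		if (id < 0)
			return id;

		if (is_threaded) {
			/* leaf is a thread; its parent is the core */
			mock_topology[cpu].thread_id = id;
			mock_topology[cpu].core_id = mock_find_topology(cpu, 1);
		} else {
			/* leaf is the core itself; no thread level */
			mock_topology[cpu].thread_id = -1;
			mock_topology[cpu].core_id = id;
		}
		mock_topology[cpu].package_id = mock_find_package(cpu);
	}

	return 0;
}
```

In the threaded case CPUs 2 and 3 end up sharing a core_id in this
sketch, while in the non-threaded case every CPU is its own core and
thread_id is left at -1.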

[PATCH v9 11/12] ACPI: Add PPTT to injectable table list

2018-05-11 Thread Jeremy Linton

Add ACPI_SIG_PPTT to the table so initrds can override the
system topology.

Signed-off-by: Geoffrey Blake 
Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Acked-by: Ard Biesheuvel 
---
 drivers/acpi/tables.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 849c4fb19b03..30d93bf7c6a2 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -457,7 +457,7 @@ static const char * const table_sigs[] = {
ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, ACPI_SIG_IORT,
-   ACPI_SIG_NFIT, ACPI_SIG_HMAT, NULL };
+   ACPI_SIG_NFIT, ACPI_SIG_HMAT, ACPI_SIG_PPTT, NULL };
 
 #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
 
-- 
2.13.6

[PATCH v9 06/12] ACPI: Enable PPTT support on ARM64

2018-05-11 Thread Jeremy Linton

Now that we have a PPTT parser, in preparation for its use
on arm64, let's build it.

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Reviewed-by: Sudeep Holla 
Acked-by: Ard Biesheuvel 
---
 arch/arm64/Kconfig| 1 +
 drivers/acpi/Kconfig  | 3 +++
 drivers/acpi/Makefile | 1 +
 3 files changed, 5 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index eb2cf4938f6d..cff33d9eff0c 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -7,6 +7,7 @@ config ARM64
select ACPI_REDUCED_HARDWARE_ONLY if ACPI
select ACPI_MCFG if ACPI
select ACPI_SPCR_TABLE if ACPI
+   select ACPI_PPTT if ACPI
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_DEBUG_VIRTUAL
select ARCH_HAS_DEVMEM_IS_ALLOWED
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 516d7b36d6fb..b533eeb6139d 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -547,6 +547,9 @@ config ACPI_CONFIGFS
 
 if ARM64
 source "drivers/acpi/arm64/Kconfig"
+
+config ACPI_PPTT
+   bool
 endif
 
 config TPS68470_PMIC_OPREGION
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 48e202752754..6d59aa109a91 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -88,6 +88,7 @@ obj-$(CONFIG_ACPI_BGRT)   += bgrt.o
 obj-$(CONFIG_ACPI_CPPC_LIB)+= cppc_acpi.o
 obj-$(CONFIG_ACPI_SPCR_TABLE)  += spcr.o
 obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o
+obj-$(CONFIG_ACPI_PPTT)+= pptt.o
 
 # processor has its own "processor." module_param namespace
 processor-y:= processor_driver.o
-- 
2.13.6

[PATCH v9 00/12] Support PPTT for ARM64

2018-05-11 Thread Jeremy Linton

ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
used to describe the processor and cache topology. Ideally it is
used to extend/override information provided by the hardware, but
right now ARM64 is entirely dependent on firmware-provided tables.

This patch set parses the table for the cache topology and CPU topology.
When we enable ACPI/PPTT for arm64 we map the package_id to the
PPTT node flagged as the physical package by the firmware.
This results in topologies that match what the remainder of the
system expects. Finally, we update the scheduler MC domain so that
it generally reflects the LLC unless the LLC is too large for the
NUMA domain (or package).

For example, on Juno:
[root@mammon-juno-rh topology]# lstopo-no-graphics
  Package L#0
L2 L#0 (1024KB)
  L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
  L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
  L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
  L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
L2 L#1 (2048KB)
  L1d L#4 (32KB) + L1i L#4 (48KB) + Core L#4 + PU L#4 (P#4)
  L1d L#5 (32KB) + L1i L#5 (48KB) + Core L#5 + PU L#5 (P#5)
  HostBridge L#0
PCIBridge
  PCIBridge
PCIBridge
  PCI 1095:3132
Block(Disk) L#0 "sda"
PCIBridge
  PCI 1002:68f9
GPU L#1 "renderD128"
GPU L#2 "card0"
GPU L#3 "controlD64"
PCIBridge
  PCI 11ab:4380
Net L#4 "enp8s0"

Git tree at:
http://linux-arm.org/git?p=linux-jlinton.git
branch: pptt_v9

v8->v9:
 Add further Acked-by/Tested-by tags (thanks everyone)
 kerneldoc, general comment and patch description tweaks.
 Squash the pptt.c module (#5 & #13) back together.
 remove a redundant () in an if, and rename a variable.

v7->v8:
 Modify the logic used to select the MC domain (the change
   shouldn't modify the sched domains on any existing machines
   compared to v7, only how they are built)
 Reduce the severity of some parsing messages.
 Fix s390 link problem.
 Further checks to deal with broken PPTT tables.
 Various style tweaks, SPDX license addition, etc.

(see previous cover letters for further changes)


Jeremy Linton (12):
  drivers: base: cacheinfo: move cache_setup_of_node()
  drivers: base: cacheinfo: setup DT cache properties early
  cacheinfo: rename of_node to fw_token
  arm64/acpi: Create arch specific cpu to acpi id helper
  ACPI/PPTT: Add Processor Properties Topology Table parsing
  ACPI: Enable PPTT support on ARM64
  drivers: base cacheinfo: Add support for ACPI based firmware tables
  arm64: Add support for ACPI based firmware tables
  arm64: topology: rename cluster_id
  arm64: topology: enable ACPI/PPTT based CPU topology
  ACPI: Add PPTT to injectable table list
  arm64: topology: divorce MC scheduling domain from core_siblings

 arch/arm64/Kconfig|   1 +
 arch/arm64/include/asm/acpi.h |   4 +
 arch/arm64/include/asm/topology.h |   6 +-
 arch/arm64/kernel/cacheinfo.c |  15 +-
 arch/arm64/kernel/topology.c  | 107 ++-
 arch/riscv/kernel/cacheinfo.c |   1 -
 drivers/acpi/Kconfig  |   3 +
 drivers/acpi/Makefile |   1 +
 drivers/acpi/pptt.c   | 655 ++
 drivers/acpi/tables.c |   2 +-
 drivers/base/cacheinfo.c  | 157 -
 include/linux/acpi.h  |   4 +
 include/linux/cacheinfo.h |  25 +-
 13 files changed, 874 insertions(+), 107 deletions(-)
 create mode 100644 drivers/acpi/pptt.c

-- 
2.13.6

[PATCH v9 12/12] arm64: topology: divorce MC scheduling domain from core_siblings

2018-05-11 Thread Jeremy Linton

Now that we have an accurate view of the physical topology
we need to represent it correctly to the scheduler. Generally MC
should equal the LLC in the system, but there are a number of
special cases that need to be dealt with.

In the case of NUMA in socket, we need to ensure that the sched
domain we build for the MC layer isn't larger than the DIE above it.
Similarly, for LLCs that might exist in cross-socket interconnect or
directory hardware, we need to ensure that MC is shrunk to the socket
or NUMA node.

This patch builds a sibling mask for the LLC, and then picks the
smallest of LLC, socket siblings, or NUMA node siblings, which
gives us the behavior described above. This is ever so slightly
different than the similar alternative where we look for a cache
layer less than or equal to the socket/NUMA siblings.

The logic to pick the MC layer affects all arm64 machines, but
only changes the behavior for DT/MPIDR systems if the NUMA domain
is smaller than the core siblings (generally set to the cluster).
This potentially fixes a bug in DT systems, but really
it only affects ACPI systems where the core siblings are correctly
set to the socket siblings. Thus all currently available ACPI
systems should have MC equal to LLC, including the NUMA in socket
machines where the LLC is partitioned between the NUMA nodes.
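
The "pick the smallest of LLC, socket siblings, or NUMA node siblings"
rule can be sketched with plain bitmasks. This is an illustrative
userspace model of the selection in cpu_coregroup_mask() from the diff
below, under the assumption of at most 64 CPUs; mock_mask_t,
mock_subset() and mock_coregroup_mask() are hypothetical names, with
mock_subset() following the sense of cpumask_subset() (first argument
is contained in the second).

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t mock_mask_t;	/* one bit per CPU; stand-in for cpumask_t */

/* true if every CPU set in a is also set in b */
static bool mock_subset(mock_mask_t a, mock_mask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Start from the NUMA node mask, shrink to the package siblings if they
 * fit inside it, then shrink to the LLC siblings if the LLC is known
 * (llc_id != -1) and fits inside the current choice.
 */
static mock_mask_t mock_coregroup_mask(mock_mask_t numa, mock_mask_t package,
				       mock_mask_t llc, int llc_id)
{
	mock_mask_t mask = numa;

	if (mock_subset(package, mask))
		mask = package;
	if (llc_id != -1 && mock_subset(llc, mask))
		mask = llc;

	return mask;
}
```

With a NUMA-in-socket layout (node 0x0F inside package 0xFF) the result
stays at the node mask, and an LLC that spans sockets (0xFF with a 0x0F
package) is ignored in favor of the package mask.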

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Acked-by: Sudeep Holla 
Acked-by: Ard Biesheuvel 
Acked-by: Morten Rasmussen 
---
 arch/arm64/include/asm/topology.h |  2 ++
 arch/arm64/kernel/topology.c  | 36 +++-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index 6b10459e6905..df48212f767b 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -8,8 +8,10 @@ struct cpu_topology {
int thread_id;
int core_id;
int package_id;
+   int llc_id;
cpumask_t thread_sibling;
cpumask_t core_sibling;
+   cpumask_t llc_siblings;
 };
 
 extern struct cpu_topology cpu_topology[NR_CPUS];
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 047d98e68502..7415c166281f 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -13,6 +13,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -214,7 +215,19 @@ EXPORT_SYMBOL_GPL(cpu_topology);
 
 const struct cpumask *cpu_coregroup_mask(int cpu)
 {
-   return &cpu_topology[cpu].core_sibling;
+   const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
+
+   /* Find the smaller of NUMA, core or LLC siblings */
+   if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
+   /* not numa in package, lets use the package siblings */
+   core_mask = &cpu_topology[cpu].core_sibling;
+   }
+   if (cpu_topology[cpu].llc_id != -1) {
+   if (cpumask_subset(&cpu_topology[cpu].llc_siblings, core_mask))
+   core_mask = &cpu_topology[cpu].llc_siblings;
+   }
+
+   return core_mask;
 }
 
 static void update_siblings_masks(unsigned int cpuid)
@@ -226,6 +239,9 @@ static void update_siblings_masks(unsigned int cpuid)
for_each_possible_cpu(cpu) {
cpu_topo = &cpu_topology[cpu];
 
+   if (cpuid_topo->llc_id == cpu_topo->llc_id)
+   cpumask_set_cpu(cpu, &cpuid_topo->llc_siblings);
+
if (cpuid_topo->package_id != cpu_topo->package_id)
continue;
 
@@ -291,6 +307,10 @@ static void __init reset_cpu_topology(void)
cpu_topo->core_id = 0;
cpu_topo->package_id = -1;
 
+   cpu_topo->llc_id = -1;
+   cpumask_clear(&cpu_topo->llc_siblings);
+   cpumask_set_cpu(cpu, &cpu_topo->llc_siblings);
+
cpumask_clear(&cpu_topo->core_sibling);
cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
cpumask_clear(&cpu_topo->thread_sibling);
@@ -311,6 +331,8 @@ static int __init parse_acpi_topology(void)
is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;
 
for_each_possible_cpu(cpu) {
+   int i, cache_id;
+
topology_id = find_acpi_cpu_topology(cpu, 0);
if (topology_id < 0)
return topology_id;
@@ -325,6 +347,18 @@ static int __init parse_acpi_topology(void)
}
topology_id = find_acpi_cpu_topology_package(cpu);
cpu_topology[cpu].package_id = topology_id;
+
+   i = acpi_find_last_cache_level(cpu);
+
+   if (i > 0) {
+   /*
+* this is the only part of cpu_topology that has
+* a direct relationship with the cache topology
+

[PATCH v9 03/12] cacheinfo: rename of_node to fw_token

2018-05-11 Thread Jeremy Linton

Rename and change the type of of_node to indicate
it is a generic pointer which is generally only used
for comparison purposes. In a later patch we will put
an ACPI/PPTT token pointer in fw_token so that
the code which builds the shared cpu masks can be reused.

Signed-off-by: Jeremy Linton 
Tested-by: Ard Biesheuvel 
Tested-by: Vijaya Kumar K 
Tested-by: Xiongfeng Wang 
Tested-by: Tomasz Nowicki 
Acked-by: Sudeep Holla 
Acked-by: Ard Biesheuvel 
---
 drivers/base/cacheinfo.c  | 16 +---
 include/linux/cacheinfo.h |  8 +++-
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index a872523e8951..597aacb233fc 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -35,7 +35,7 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
 static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
   struct cacheinfo *sib_leaf)
 {
-   return sib_leaf->of_node == this_leaf->of_node;
+   return sib_leaf->fw_token == this_leaf->fw_token;
 }
 
 /* OF properties to query for a given cache type */
@@ -167,9 +167,10 @@ static int cache_setup_of_node(unsigned int cpu)
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
unsigned int index = 0;
 
-   /* skip if of_node is already populated */
-   if (this_cpu_ci->info_list->of_node)
+   /* skip if fw_token is already populated */
+   if (this_cpu_ci->info_list->fw_token) {
return 0;
+   }
 
if (!cpu_dev) {
pr_err("No cpu device for CPU %d\n", cpu);
@@ -190,7 +191,7 @@ static int cache_setup_of_node(unsigned int cpu)
if (!np)
break;
cache_of_set_props(this_leaf, np);
-   this_leaf->of_node = np;
+   this_leaf->fw_token = np;
index++;
}
 
@@ -278,7 +279,7 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
}
-   of_node_put(this_leaf->of_node);
+   of_node_put(this_leaf->fw_token);
}
 }
 
@@ -323,8 +324,9 @@ static int detect_cache_attributes(unsigned int cpu)
if (ret)
goto free_ci;
/*
-* For systems using DT for cache hierarchy, of_node and shared_cpu_map
-* will be set up here only if they are not populated already
+* For systems using DT for cache hierarchy, fw_token
+* and shared_cpu_map will be set up here only if they are
+* not populated already
 */
ret = cache_shared_cpu_map_setup(cpu);
if (ret) {
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 3d9805297cda..0c6f658054d2 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -34,9 +34,8 @@ enum cache_type {
  * @shared_cpu_map: logical cpumask representing all the cpus sharing
  * this cache node
  * @attributes: bitfield representing various cache attributes
- * @of_node: if devicetree is used, this represents either the cpu node in
- * case there's no explicit cache node or the cache node itself in the
- * device tree
+ * @fw_token: Unique value used to determine if different cacheinfo
+ * structures represent a single hardware cache instance.
  * @disable_sysfs: indicates whether this node is visible to the user via
  * sysfs or not
  * @priv: pointer to any private data structure specific to particular
@@ -65,8 +64,7 @@ struct cacheinfo {
 #define CACHE_ALLOCATE_POLICY_MASK \
(CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE)
 #define CACHE_ID   BIT(4)
-
-   struct device_node *of_node;
+   void *fw_token;
bool disable_sysfs;
void *priv;
 };
-- 
2.13.6
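
The idea behind the rename can be shown in a few lines: the token is never dereferenced by the sharing check, only compared. The following is a minimal user-space sketch, not the kernel code; `struct cacheinfo_sketch` and `leaves_are_shared()` are hypothetical stand-ins for `struct cacheinfo` and `cache_leaves_are_shared()`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct cacheinfo. */
struct cacheinfo_sketch {
	unsigned int level;
	void *fw_token;	/* opaque identity of one hardware cache instance */
};

/*
 * Two leaves describe the same hardware cache when their tokens are
 * identical. The token is only compared, never dereferenced, so it can
 * point at a DT node, an ACPI table entry, or anything else unique.
 */
bool leaves_are_shared(const struct cacheinfo_sketch *this_leaf,
		       const struct cacheinfo_sketch *sib_leaf)
{
	return sib_leaf->fw_token == this_leaf->fw_token;
}
```

Because the comparison is by pointer identity, whatever firmware description fills in the token only has to guarantee that one cache instance always yields the same pointer.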

Re: [PATCH net] net: dsa: bcm_sf2: Fix RX_CLS_LOC_ANY overwrite for last rule

2018-05-11 Thread Florian Fainelli

On 05/11/2018 04:24 PM, Florian Fainelli wrote:
> When we let the kernel pick up a rule location with RX_CLS_LOC_ANY, we
> would be able to overwrite the last rules because of a number of issues:
> 
> - the IPv4 code path would not be checking that rule_index is within
>   bounds, the IPv6 code path would only be checking the second index and
>   not the first one
> 
> - find_first_zero_bit() needs to operate on the full bitmap size
>   (priv->num_cfp_rules) otherwise it would be off by one in the results
>   it returns and the checks against bcm_sf2_cfp_rule_size() would be non
>   functioning
> 
> Fixes: 3306145866b6 ("net: dsa: bcm_sf2: Move IPv4 CFP processing to specific 
> functions")
> Fixes: ba0696c22e7c ("net: dsa: bcm_sf2: Add support for IPv6 CFP rules")
> Signed-off-by: Florian Fainelli 

David, please discard that for now, the IPv4 part is correct, but I am
not fixing the bug correctly for the IPv6 part. v2 coming some time next
week. Thank you!
-- 
Florian

Re: [PATCH RESEND v2 1/2] perf cs-etm: Support unknown_thread in cs_etm_auxtrace

2018-05-11 Thread Leo Yan

On Fri, May 11, 2018 at 10:48:00AM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, May 10, 2018 at 12:01:59PM +0800, Leo Yan wrote:
> > CoreSight doesn't allocate thread structure for unknown_thread in etm
> > auxtrace, so unknown_thread is NULL pointer.  If the perf data doesn't
> > contain valid tid and then cs_etm__mem_access() uses unknown_thread
> > instead as thread handler, this results in segmentation fault when
> > thread__find_addr_map() accesses thread handler.
> > 
> > This commit creates new thread data which is used by unknown_thread, so
> > CoreSight tracing can roll back to use unknown_thread if perf data
> > doesn't include valid thread info.  This commit also releases thread
> > data for initialization failure case and for normal auxtrace free flow.
> > 
> > Signed-off-by: Leo Yan 
> > Acked-by: Mathieu Poirier 
> 
> Thanks, applied to perf/urgent.
> 
> And please use a more descriptive, eye catching summary, something like:
> 
>   perf cs-etm: Fix segfault when accessing NULL unknown_thread variable
> 
> :-)

Thanks for the suggestion.  Indeed, this patch is a fix rather than
a new feature, so the subject should reflect that.

Thanks,
Leo Yan

[PATCH net] net: dsa: bcm_sf2: Fix RX_CLS_LOC_ANY overwrite for last rule

2018-05-11 Thread Florian Fainelli

When we let the kernel pick up a rule location with RX_CLS_LOC_ANY, we
would be able to overwrite the last rules because of a number of issues:

- the IPv4 code path would not be checking that rule_index is within
  bounds, the IPv6 code path would only be checking the second index and
  not the first one

- find_first_zero_bit() needs to operate on the full bitmap size
  (priv->num_cfp_rules) otherwise it would be off by one in the results
  it returns and the checks against bcm_sf2_cfp_rule_size() would be non
  functioning

Fixes: 3306145866b6 ("net: dsa: bcm_sf2: Move IPv4 CFP processing to specific 
functions")
Fixes: ba0696c22e7c ("net: dsa: bcm_sf2: Add support for IPv6 CFP rules")
Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2_cfp.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2_cfp.c b/drivers/net/dsa/bcm_sf2_cfp.c
index 23b45da784cb..ade5fa3d747d 100644
--- a/drivers/net/dsa/bcm_sf2_cfp.c
+++ b/drivers/net/dsa/bcm_sf2_cfp.c
@@ -354,10 +354,13 @@ static int bcm_sf2_cfp_ipv4_rule_set(struct bcm_sf2_priv 
*priv, int port,
/* Locate the first rule available */
if (fs->location == RX_CLS_LOC_ANY)
rule_index = find_first_zero_bit(priv->cfp.used,
-bcm_sf2_cfp_rule_size(priv));
+priv->num_cfp_rules);
else
rule_index = fs->location;
 
+   if (rule_index > bcm_sf2_cfp_rule_size(priv))
+   return -ENOSPC;
+
layout = &udf_tcpip4_layout;
/* We only use one UDF slice for now */
slice_num = bcm_sf2_get_slice_number(layout, 0);
@@ -563,9 +566,11 @@ static int bcm_sf2_cfp_ipv6_rule_set(struct bcm_sf2_priv 
*priv, int port,
 */
if (fs->location == RX_CLS_LOC_ANY)
rule_index[0] = find_first_zero_bit(priv->cfp.used,
-   
bcm_sf2_cfp_rule_size(priv));
+   priv->num_cfp_rules);
else
rule_index[0] = fs->location;
+   if (rule_index[0] > bcm_sf2_cfp_rule_size(priv))
+   return -ENOSPC;
 
/* Flag it as used (cleared on error path) such that we can immediately
 * obtain a second one to chain from.
@@ -573,7 +578,7 @@ static int bcm_sf2_cfp_ipv6_rule_set(struct bcm_sf2_priv 
*priv, int port,
set_bit(rule_index[0], priv->cfp.used);
 
rule_index[1] = find_first_zero_bit(priv->cfp.used,
-   bcm_sf2_cfp_rule_size(priv));
+   priv->num_cfp_rules);
if (rule_index[1] > bcm_sf2_cfp_rule_size(priv)) {
ret = -ENOSPC;
goto out_err;
-- 
2.14.1
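
The bounds problem described in the commit message follows from how a "first zero bit" search reports failure: it returns the size it was given, so the caller has to compare the result against that same size. The following is a minimal user-space sketch of that pattern, not the driver's code; `first_zero_bit()` and `alloc_slot()` are hypothetical names, and the driver's separate bcm_sf2_cfp_rule_size() limit is not modelled here.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return the index of the first clear bit, or nbits if every bit is set. */
size_t first_zero_bit(const unsigned long *map, size_t nbits)
{
	size_t i;

	for (i = 0; i < nbits; i++) {
		if (!(map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD))))
			return i;
	}
	return nbits;
}

/*
 * Claim the first free slot in a bitmap of nbits entries.
 * A search result equal to nbits means "nothing free", so the bounds
 * check must use the same size that was passed to the search.
 */
int alloc_slot(unsigned long *map, size_t nbits)
{
	size_t idx = first_zero_bit(map, nbits);

	if (idx >= nbits)
		return -ENOSPC;

	map[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);
	return (int)idx;
}
```

If the search and the check use different sizes, the "nothing free" return value can look like a valid index, which is how an existing rule could end up being overwritten.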

mmotm 2018-05-11-16-28 uploaded

2018-05-11 Thread akpm

The mm-of-the-moment snapshot 2018-05-11-16-28 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (4.x
or 4.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.

A git tree which contains the memory management portion of this tree is
maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
by Michal Hocko.  It contains the patches which are between the
"#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
file, http://www.ozlabs.org/~akpm/mmotm/series.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/

To develop on top of mmotm git:

  $ git remote add mmotm git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
  $ git remote update mmotm
  $ git checkout -b topic mmotm/master
  
  $ git send-email mmotm/master.. [...]

To rebase a branch with older patches to a new mmotm release:

  $ git remote update mmotm
  $ git rebase --onto mmotm/master  topic




The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is available at

http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/

and use of this tree is similar to
http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/, described above.


This mmotm tree contains the following patches against 4.17-rc4:
(patches marked "*" will be included in linux-next)

  origin.patch
  i-need-old-gcc.patch
* maintainers-update-email-address-in-maintainers-entries.patch
* kasan-prohibit-kasanstructleak-combination.patch
* lib-avoid-soft-lockup-in-test_find_first_bit.patch
* init-fix-false-positives-in-wx-checking.patch
* z3fold-fix-reclaim-lock-ups.patch
* mm-sections-are-not-offlined-during-memory-hotremove.patch
* mm-dont-show-nr_indirectly_reclaimable-in-proc-vmstat.patch
* proc-kcore-dont-bounds-check-against-address-0.patch
* mm-migrate-fix-double-call-of-radix_tree_replace_slot.patch
* mm-oom-fix-concurrent-munlock-and-oom-reaper-unmap.patch
* ocfs2-take-inode-cluster-lock-before-moving-reflinked-inode-from-orphan-dir.patch
* scripts-faddr2line-fix-error-when-addr2line-output-contains-discriminator.patch
* rbtree-include-rcuh-because-we-use-it.patch
* memcg-remove-memcg_cgroup-id-from-idr-on-mem_cgroup_css_alloc-failure.patch
* ocfs2-submit-another-bio-if-current-bio-is-full.patch
* mm-memory_hotplug-fix-leftover-use-of-struct-page-during-hotplug.patch
* mm-allow-deferred-page-init-for-vmemmap-only.patch
* lib-test_bitmapc-fix-bitmap-optimisation-tests-to-report-errors-correctly.patch
* include-mm-adding-new-inline-function-vmf_error.patch
* radix-tree-test-suite-fix-mapshift-build-target.patch
* radix-tree-test-suite-fix-compilation-issue.patch
* radix-tree-test-suite-add-item_delete_rcu.patch
* radix-tree-test-suite-multi-order-iteration-race.patch
* radix-tree-fix-multi-order-iteration-race.patch
* arm-arch-arm-include-asm-pageh-needs-personalityh.patch
* fs-dax-adding-new-return-type-vm_fault_t.patch
* prctl-add-pr_et_pdeathsig_proc.patch
* ocfs2-clean-up-redundant-function-declarations.patch
* ocfs2-ocfs2_inode_lock_tracker-does-not-distinguish-lock-level.patch
* ocfs2-eliminate-a-misreported-warning.patch
* ocfs2-correct-the-comments-position-of-the-structure-ocfs2_dir_block_trailer.patch
* ocfs2-get-rid-of-ocfs2_is_o2cb_active-function.patch
* ocfs2-without-quota-support-try-to-avoid-calling-quota-recovery.patch
* ocfs2-without-quota-support-try-to-avoid-calling-quota-recovery-checkpatch-fixes.patch
* ocfs2-dont-put-and-assign-null-to-bh-allocated-outside.patch
* ocfs2-dont-use-iocb-when-eiocbqueued-returns.patch
* block-restore-proc-partitions-to-not-display-non-partitionable-removable-devices.patch
* net-9p-detecting-invalid-options-as-much-as-possible.patch
* fs-9p-detecting-invalid-options-as-much-as-possible.patch
* dentry-fix-kmemcheck-splat-at-take_dentry_name_snapshot.patch
* namei-allow-restricted-o_creat-of-fifos-and-regular-files.pat

Re: [PATCH] input: fix coding style issues in input.c

2018-05-11 Thread Nick Simonov

On Wed, May 09, 2018 at 05:33:13PM -0700, Dmitry Torokhov wrote:
> Hi Nick,
> 
> On Wed, May 09, 2018 at 05:07:14PM +0300, Nick Simonov wrote:
> > This is a patch to the input.c file that fixes
> > up warning found by checkpatch.pl tool
> > 
> > Signed-off-by: Nick Simonov 
> > ---
> >  drivers/input/input.c | 52 
> > ---
> >  1 file changed, 33 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/input/input.c b/drivers/input/input.c
> > index 9785546..e18fdae 100644
> > --- a/drivers/input/input.c
> > +++ b/drivers/input/input.c
> > @@ -1,3 +1,4 @@
> > +// SPDX-License-Identifier: GPL-2.0
> >  /*
> >   * The input core
> >   *
> > @@ -252,7 +253,8 @@ static int input_handle_abs_event(struct input_dev *dev,
> > }
> >  
> > /* Flush pending "slot" event */
> > -   if (is_mt_event && mt && mt->slot != input_abs_get_val(dev, 
> > ABS_MT_SLOT)) {
> > +   if (is_mt_event && mt && mt->slot !=
> > +   input_abs_get_val(dev, ABS_MT_SLOT)) {
> > input_abs_set_val(dev, ABS_MT_SLOT, mt->slot);
> 
> So now it is not immediately clear what is part of condition and what is
> part of body.
> 
> I am sorry to say, but with most of these changes the cure is worse than
> the disease. If you were fixing the code and adjusted the affected lines
> so they are under 80 columns limit that would be one thing, but just
> reformatting for the sake of it is not really helpful.
> 
> Thanks.
> 
> -- 
> Dmitry

Dmitry, thanks for your comment. I dropped all my changes except one
and prepared a new patch for it.

In input_set_capability(), the default case of the switch statement
passes the hard-coded function name "input_set_capability" to pr_err().
I replaced it with "%s" and __func__.

From 2aef27ca4896b8d9e64fd1417965793acfba3653 Mon Sep 17 00:00:00 2001
From: Nick Simonov 
Date: Sat, 12 May 2018 01:24:47 +0300
Subject: [PATCH] input: replace hard coded string with __func__ in pr_err()

Replace the hardcoded string "input_set_capability"
in the pr_err() call with "%s" and __func__.

Signed-off-by: Nick Simonov 
---
 drivers/input/input.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/input/input.c b/drivers/input/input.c
index 9785546..6365c19 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -1943,8 +1943,7 @@ void input_set_capability(struct input_dev *dev, unsigned 
int type, unsigned int
break;
 
default:
-   pr_err("input_set_capability: unknown type %u (code %u)\n",
-  type, code);
+   pr_err("%s: unknown type %u (code %u)\n", __func__, type, code);
dump_stack();
return;
}
-- 
2.7.4
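
For readers unfamiliar with the idiom: `__func__` is a standard C99 predefined identifier that expands to the name of the enclosing function, so the message stays correct if the function is ever renamed. The following is a minimal user-space sketch using snprintf() in place of pr_err(); the function name `format_unknown_type()` is made up for illustration.

```c
#include <stdio.h>

/*
 * Format the same style of message as the patch, but into a caller
 * supplied buffer. __func__ expands to "format_unknown_type" here.
 */
int format_unknown_type(char *buf, size_t len,
			unsigned int type, unsigned int code)
{
	return snprintf(buf, len, "%s: unknown type %u (code %u)\n",
			__func__, type, code);
}
```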

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-11 Thread Logan Gunthorpe


On 5/11/2018 4:24 PM, Stephen Bates wrote:
> All
>
> > Alex (or anyone else) can you point to where IOVA addresses are generated?
>
> A case of RTFM perhaps (though a pointer to the code would still be
> appreciated).
>
> https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt
>
> Some exceptions to IOVA
> ---
> Interrupt ranges are not address translated, (0xfee0 - 0xfeef).
> The same is true for peer to peer transactions. Hence we reserve the
> address from PCI MMIO ranges so they are not allocated for IOVA addresses.


Hmm, except I'm not sure how to interpret that. It sounds like there
can't be an IOVA address that overlaps with the PCI MMIO range, which is
good and what I'd expect.

But for peer to peer they say they don't translate the address, which
implies to me that the intention is for a peer to peer address not to be
mapped in the same way using the dma_map interface (though of course if
you were using ATS you'd want this for sure). Unless the existing
dma_map commands notice a PCI MMIO address and handle it differently,
but I don't see how.


Logan

Re: [PATCH] rcu: Report a quiescent state when it's exactly in the state

2018-05-11 Thread Joel Fernandes

On Fri, May 11, 2018 at 09:17:46AM -0700, Paul E. McKenney wrote:
> On Fri, May 11, 2018 at 09:57:54PM +0900, Byungchul Park wrote:
> > Hello folks,
> > 
> > I think I wrote the title in a misleading way.
> > 
> > Please change the title to something else such as,
> > "rcu: Report a quiescent state when it's in the state" or,
> > "rcu: Add points reporting quiescent states where proper" or so on.
> > 
> > On 2018-05-11 5:30 PM, Byungchul Park wrote:
> > >We expect a quiescent state of TASKS_RCU when cond_resched_tasks_rcu_qs()
> > >is called, no matter whether it actually be scheduled or not. However,
> > >it currently doesn't report the quiescent state when the task enters
> > >into __schedule() as it's called with preempt = true. So make it report
> > >the quiescent state unconditionally when cond_resched_tasks_rcu_qs() is
> > >called.
> > >
> > >And in TINY_RCU, even though the quiescent state of rcu_bh also should
> > >be reported when the tick interrupt comes from user, it doesn't. So make
> > >it reported.
> > >
> > >Lastly in TREE_RCU, rcu_note_voluntary_context_switch() should be
> > >reported when the tick interrupt comes from not only user but also idle,
> > >as an extended quiescent state.
> > >
> > >Signed-off-by: Byungchul Park 
> > >---
> > >  include/linux/rcupdate.h | 4 ++--
> > >  kernel/rcu/tiny.c| 6 +++---
> > >  kernel/rcu/tree.c| 4 ++--
> > >  3 files changed, 7 insertions(+), 7 deletions(-)
> > >
> > >diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > >index ee8cf5fc..7432261 100644
> > >--- a/include/linux/rcupdate.h
> > >+++ b/include/linux/rcupdate.h
> > >@@ -195,8 +195,8 @@ static inline void exit_tasks_rcu_finish(void) { }
> > >   */
> > >  #define cond_resched_tasks_rcu_qs() \
> > >  do { \
> > >-  if (!cond_resched()) \
> > >-  rcu_note_voluntary_context_switch_lite(current); \
> > >+  rcu_note_voluntary_context_switch_lite(current); \
> > >+  cond_resched(); \
> 
> Ah, good point.
> 
> Peter, I have to ask...  Why is "cond_resched()" considered a preemption
> while "schedule()" is not?

In fact, something interesting I inferred from the __schedule loop related to
your question:

switch_count can be set to either prev->nivcsw or prev->nvcsw. If we can
assume that switch_count reflects whether the context switch is involuntary
or voluntary,

task-running-state   preempt   switch_count
0 (running)          1         involuntary
0                    0         involuntary
1                    0         voluntary
1                    1         involuntary
According to the above table, both the task's running state and the preempt
parameter to __schedule should be used together to determine if the switch is
a voluntary one or not.
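
The table above can be read as a two-input rule. The following is a minimal user-space sketch of it, assuming (as the table does) that a state of 0 means the task is still runnable and any nonzero state means it is going to sleep; the enum and function names are made up for illustration and are not kernel API.

```c
#include <stdbool.h>

enum switch_kind { SWITCH_INVOLUNTARY, SWITCH_VOLUNTARY };

/*
 * Classify a context switch from the two inputs in the table.
 * state:   0 if the task is still runnable, nonzero if it is blocking
 * preempt: true if the scheduler was entered as a preemption
 */
enum switch_kind classify_switch(long state, bool preempt)
{
	/* Only a non-preempting switch away from a blocking task is voluntary. */
	if (!preempt && state != 0)
		return SWITCH_VOLUNTARY;

	return SWITCH_INVOLUNTARY;
}
```

Under this rule a runnable task that reaches the scheduler through a preemption point is always counted as involuntary, which matches the first and second rows of the table.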

So this code in rcu_note_context_switch should really be:
if (!preempt && !(current->state & TASK_RUNNING))
rcu_note_voluntary_context_switch_lite(current);

According to the above table, cond_resched always classifies as an
involuntary switch, which makes sense to me. Even though cond_resched is
explicitly called, it's still sort of involuntary in the sense that it's not
called into the scheduler for sleeping, but rather for seeing if something
else can run instead (a preemption point). In fact, none of the task
deactivation in the __schedule loop will run if cond_resched is used.

I agree that if schedule was called directly but with TASK_RUNNING=1, then
that could probably be classified an involuntary switch too...

Also since we're deciding to call rcu_note_voluntary_context_switch_lite
unconditionally, then IMO this comment on that macro:

/*
 * Note a voluntary context switch for RCU-tasks benefit.  This is a
 * macro rather than an inline function to avoid #include hell.
 */
 #ifdef CONFIG_TASKS_RCU
 #define rcu_note_voluntary_context_switch_lite(t)

Should be changed to:

/*
 * Note an attempt to perform a voluntary context switch for RCU-tasks
 * benefit.  This is called even in situations where a context switch
 * didn't really happen even though it was requested. This is a
 * macro rather than an inline function to avoid #include hell.
 */
 #ifdef CONFIG_TASKS_RCU
 #define rcu_note_voluntary_context_switch_lite(t)

Right?

Correct me if I'm wrong about anything, thanks,

- Joel

Re: [PATCH] spi: spi-geni-qcom: Add SPI driver support for GENI based QUP

2018-05-11 Thread Stephen Boyd

Quoting Girish Mahadevan (2018-05-03 14:34:43)
> diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig
> index 9b31351..358d60a 100644
> --- a/drivers/spi/Kconfig
> +++ b/drivers/spi/Kconfig
> @@ -564,6 +564,18 @@ config SPI_QUP
>   This driver can also be built as a module.  If so, the module
>   will be called spi_qup.
>  
> +config SPI_QCOM_GENI
> +   tristate "Qualcomm SPI controller with QUP interface"

This is the same help text as the SPI_QUP config up above. Please make
it different somehow by adding GENI or something like that instead of
QUP?

> +   depends on ARCH_QCOM || (ARM && COMPILE_TEST)

This driver uses the GENI wrapper code so it may need to have a better
Kconfig dependency than this.

> +   help
> + This driver supports GENI serial engine based SPI controller in
> + master mode on the Qualcomm Technologies Inc.'s SoCs. If you say
> + yes to this option, support will be included for the built-in SPI

Drop "built-in"?

> + interface on the Qualcomm Technologies Inc.'s SoCs.
> +
> + This driver can also be built as a module.  If so, the module
> + will be called spi-geni-qcom.
> +
>  config SPI_S3C24XX
> tristate "Samsung S3C24XX series SPI"
> depends on ARCH_S3C24XX
> diff --git a/drivers/spi/Makefile b/drivers/spi/Makefile
> index a3ae2b7..cc90d6e 100644
> --- a/drivers/spi/Makefile
> +++ b/drivers/spi/Makefile
> @@ -77,6 +77,7 @@ spi-pxa2xx-platform-objs  := spi-pxa2xx.o 
> spi-pxa2xx-dma.o
>  obj-$(CONFIG_SPI_PXA2XX)   += spi-pxa2xx-platform.o
>  obj-$(CONFIG_SPI_PXA2XX_PCI)   += spi-pxa2xx-pci.o
>  obj-$(CONFIG_SPI_QUP)  += spi-qup.o
> +obj-$(CONFIG_SPI_QCOM_GENI)+= spi-geni-qcom.o

This should come before QUP.

>  obj-$(CONFIG_SPI_ROCKCHIP) += spi-rockchip.o
>  obj-$(CONFIG_SPI_RB4XX)+= spi-rb4xx.o
>  obj-$(CONFIG_SPI_RSPI) += spi-rspi.o
> diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> new file mode 100644
> index 000..eecc634
> --- /dev/null
> +++ b/drivers/spi/spi-geni-qcom.c
> @@ -0,0 +1,766 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2017-2018, The Linux foundation. All rights reserved.
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

include platform_device.h instead of of_platform.h?

> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define SPI_NUM_CHIPSELECT 4

Why do we need the define? It's used one place.

> +#define SPI_XFER_TIMEOUT_MS250

Same comment.

> +/* SPI SE specific registers */
> +#define SE_SPI_CPHA0x224
> +#define SE_SPI_LOOPBACK0x22c
> +#define SE_SPI_CPOL0x230
> +#define SE_SPI_DEMUX_OUTPUT_INV0x24c
> +#define SE_SPI_DEMUX_SEL   0x250
> +#define SE_SPI_TRANS_CFG   0x25c
> +#define SE_SPI_WORD_LEN0x268
> +#define SE_SPI_TX_TRANS_LEN0x26c
> +#define SE_SPI_RX_TRANS_LEN0x270
> +#define SE_SPI_PRE_POST_CMD_DLY0x274
> +#define SE_SPI_DELAY_COUNTERS  0x278
> +
> +/* SE_SPI_CPHA register fields */
> +#define CPHA   BIT(0)

Can you put these defines next to the register that they correspond to?
Then we don't need the duplicate comment to indicate what registers they
are used with. 

> +
> +/* SE_SPI_LOOPBACK register fields */
> +#define LOOPBACK_ENABLE0x1
> +#define NORMAL_MODE0x0
> +#define LOOPBACK_MSK   GENMASK(1, 0)
> +
> +/* SE_SPI_CPOL register fields */
> +#define CPOL   BIT(2)
> +
> +/* SE_SPI_DEMUX_OUTPUT_INV register fields */
> +#define CS_DEMUX_OUTPUT_INV_MSKGENMASK(3, 0)
> +
> +/* SE_SPI_DEMUX_SEL register fields */
> +#define CS_DEMUX_OUTPUT_SELGENMASK(3, 0)
> +
> +/* SE_SPI_TX_TRANS_CFG register fields */
> +#define CS_TOGGLE  BIT(0)
> +
> +/* SE_SPI_WORD_LEN register fields */
> +#define WORD_LEN_MSK   GENMASK(9, 0)
> +#define MIN_WORD_LEN   4
> +
> +/* SPI_TX/SPI_RX_TRANS_LEN fields */
> +#define TRANS_LEN_MSK  GENMASK(23, 0)
> +
> +/* SE_SPI_DELAY_COUNTERS */
> +#define SPI_INTER_WORDS_DELAY_MSK  GENMASK(9, 0)
> +#define SPI_CS_CLK_DELAY_MSK   GENMASK(19, 10)
> +#define SPI_CS_CLK_DELAY_SHFT  10
> +
> +/* M_CMD OP codes for SPI */
> +#define SPI_TX_ONLY1
> +#define SPI_RX_ONLY2
> +#define SPI_FULL_DUPLEX3
> +#define SPI_TX_RX  7
> +#define SPI_CS_ASSERT  8
> +#define SPI_CS_DEASSERT9
> +#define SPI_SCK_ONLY   10
> +/* M_CMD params for SPI */
> +#define SPI_PRE_CMD_DELAY  BIT(0)
> +#define TIMESTAMP_BEFORE   BIT(1)
> +#define FRAGMENTATION  BIT(2)
> +#define TIMESTAMP_AFTERBIT(3)
> +#define POST_CMD_DELAY BIT(4)
> +
> +static irqreturn_t geni_spi_isr(int irq, void *dev);
> +
> +struct spi_geni_m

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-11 Thread Stephen Bates

All

> Alex (or anyone else) can you point to where IOVA addresses are generated?

A case of RTFM perhaps (though a pointer to the code would still be 
appreciated).

https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt

Some exceptions to IOVA
---
Interrupt ranges are not address translated, (0xfee0 - 0xfeef).
The same is true for peer to peer transactions. Hence we reserve the
address from PCI MMIO ranges so they are not allocated for IOVA addresses.

Cheers

Stephen

Re: [PATCH ghak81 RFC V1 1/5] audit: normalize loginuid read access

2018-05-11 Thread Richard Guy Briggs

On 2018-05-10 17:21, Richard Guy Briggs wrote:
> On 2018-05-09 11:13, Paul Moore wrote:
> > On Fri, May 4, 2018 at 4:54 PM, Richard Guy Briggs  wrote:
> > > Recognizing that the loginuid is an internal audit value, use an access
> > > function to retrieve the audit loginuid value for the task rather than
> > > reaching directly into the task struct to get it.
> > >
> > > Signed-off-by: Richard Guy Briggs 
> > > ---
> > >  kernel/auditsc.c | 16 
> > >  1 file changed, 8 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > > index 479c031..f3817d0 100644
> > > --- a/kernel/auditsc.c
> > > +++ b/kernel/auditsc.c
> > > @@ -374,7 +374,7 @@ static int audit_field_compare(struct task_struct 
> > > *tsk,
> > > case AUDIT_COMPARE_EGID_TO_OBJ_GID:
> > > return audit_compare_gid(cred->egid, name, f, ctx);
> > > case AUDIT_COMPARE_AUID_TO_OBJ_UID:
> > > -   return audit_compare_uid(tsk->loginuid, name, f, ctx);
> > > +   return audit_compare_uid(audit_get_loginuid(tsk), name, 
> > > f, ctx);
> > > case AUDIT_COMPARE_SUID_TO_OBJ_UID:
> > > return audit_compare_uid(cred->suid, name, f, ctx);
> > > case AUDIT_COMPARE_SGID_TO_OBJ_GID:
> > > @@ -385,7 +385,7 @@ static int audit_field_compare(struct task_struct 
> > > *tsk,
> > > return audit_compare_gid(cred->fsgid, name, f, ctx);
> > > /* uid comparisons */
> > > case AUDIT_COMPARE_UID_TO_AUID:
> > > -   return audit_uid_comparator(cred->uid, f->op, 
> > > tsk->loginuid);
> > > +   return audit_uid_comparator(cred->uid, f->op, 
> > > audit_get_loginuid(tsk));
> > > case AUDIT_COMPARE_UID_TO_EUID:
> > > return audit_uid_comparator(cred->uid, f->op, cred->euid);
> > > case AUDIT_COMPARE_UID_TO_SUID:
> > > @@ -394,11 +394,11 @@ static int audit_field_compare(struct task_struct 
> > > *tsk,
> > > return audit_uid_comparator(cred->uid, f->op, 
> > > cred->fsuid);
> > > /* auid comparisons */
> > > case AUDIT_COMPARE_AUID_TO_EUID:
> > > -   return audit_uid_comparator(tsk->loginuid, f->op, 
> > > cred->euid);
> > > +   return audit_uid_comparator(audit_get_loginuid(tsk), 
> > > f->op, cred->euid);
> > > case AUDIT_COMPARE_AUID_TO_SUID:
> > > -   return audit_uid_comparator(tsk->loginuid, f->op, 
> > > cred->suid);
> > > +   return audit_uid_comparator(audit_get_loginuid(tsk), 
> > > f->op, cred->suid);
> > > case AUDIT_COMPARE_AUID_TO_FSUID:
> > > -   return audit_uid_comparator(tsk->loginuid, f->op, 
> > > cred->fsuid);
> > > +   return audit_uid_comparator(audit_get_loginuid(tsk), 
> > > f->op, cred->fsuid);
> > > /* euid comparisons */
> > > case AUDIT_COMPARE_EUID_TO_SUID:
> > > return audit_uid_comparator(cred->euid, f->op, 
> > > cred->suid);
> > > @@ -611,7 +611,7 @@ static int audit_filter_rules(struct task_struct *tsk,
> > > result = match_tree_refs(ctx, rule->tree);
> > > break;
> > > case AUDIT_LOGINUID:
> > > -   result = audit_uid_comparator(tsk->loginuid, 
> > > f->op, f->uid);
> > > +   result = 
> > > audit_uid_comparator(audit_get_loginuid(tsk), f->op, f->uid);
> > > break;
> > > case AUDIT_LOGINUID_SET:
> > > result = 
> > > audit_comparator(audit_loginuid_set(tsk), f->op, f->val);
> > > @@ -2287,8 +2287,8 @@ int audit_signal_info(int sig, struct task_struct 
> > > *t)
> > > (sig == SIGTERM || sig == SIGHUP ||
> > >  sig == SIGUSR1 || sig == SIGUSR2)) {
> > > audit_sig_pid = task_tgid_nr(tsk);
> > > -   if (uid_valid(tsk->loginuid))
> > > -   audit_sig_uid = tsk->loginuid;
> > > +   if (uid_valid(audit_get_loginuid(tsk)))
> > > +   audit_sig_uid = audit_get_loginuid(tsk);
> > 
> > I realize this comment is a little silly given the nature of loginuid,
> > but if we are going to abstract away loginuid accesses (which I think
> > is good), we should probably access it once, store it in a local
> > variable, perform the validity check on the local variable, then
> > commit the local variable to audit_sig_uid.  I realize a TOCTOU
> > problem is unlikely here, but with this new layer of abstraction it
> > seems that some additional safety might be a good thing.
> 
> Ok, I'll just assign it to where it is going and check it there, holding
> the audit_ctl_lock the whole time, since it should have been done
> anyways for all of audit_sig_{pid,uid,sid} anyways to get a consistent
> view from the AUDIT_SIGNAL_INFO fetch.

Hmmm, holding audit_ctl_lock won't work because it could sleep trying to
get the lock and the sig

[PATCH] drivers: bluetooth: hci_serdev: Removed unnecessary curly braces

2018-05-11 Thread Vaibhav Murkute

checkpatch.pl warns about these unnecessary curly braces,
so remove them.

Signed-off-by: Vaibhav Murkute 
---
 drivers/bluetooth/hci_serdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/bluetooth/hci_serdev.c b/drivers/bluetooth/hci_serdev.c
index e0e6461..137c314 100644
--- a/drivers/bluetooth/hci_serdev.c
+++ b/drivers/bluetooth/hci_serdev.c
@@ -204,9 +204,9 @@ static int hci_uart_setup(struct hci_dev *hdev)
return 0;
}
 
-   if (skb->len != sizeof(*ver)) {
+   if (skb->len != sizeof(*ver))
bt_dev_err(hdev, "Event length mismatch for version info");
-   }
+
 
kfree_skb(skb);
return 0;
-- 
2.7.4

Re: [PATCH v1 11/13] dt-bindings: power: add PX30 SoCs header for power-domain

2018-05-11 Thread Heiko Stuebner

Hi Elaine,


Am Freitag, 11. Mai 2018, 05:34:32 CEST schrieb Elaine Zhang:
> According to a description from TRM, add all the power domains.
> 
> Signed-off-by: Elaine Zhang 
> Signed-off-by: Finley Xiao 

that's a bit ambiguous, having the Signed-offs like above.

Either you are the author+sender of the patch, then Finley's
Signed-off should go away ... or Finley is the author and you
are the sender, then the author of the patch should be set
correctly (--author option for git) and the Signed-offs should
switch places (= Finley first, then yours). Please fix.

This seems to be true for most patches in this series.

> ---
>  include/dt-bindings/power/px30-power.h | 32 
>  1 file changed, 32 insertions(+)
>  create mode 100644 include/dt-bindings/power/px30-power.h
> 
> diff --git a/include/dt-bindings/power/px30-power.h 
> b/include/dt-bindings/power/px30-power.h
> new file mode 100644
> index ..4ed482e80950
> --- /dev/null
> +++ b/include/dt-bindings/power/px30-power.h

Here I have a naming question. When looking at the vendor kernel
it looks like the px30 is largely related to the rk3326.
(rk3326.dtsi including the px30.dtsi)

What is the reason for basing the naming on the px30 this time? And could
we possibly keep to rk names for the basic things in the kernel, thus
keeping the pxXX as second name, like with the other px-variants before?


Thanks
Heiko

Re: [PATCH 1/2] bcachefs: On disk data structures

2018-05-11 Thread Kent Overstreet

On Fri, May 11, 2018 at 06:32:33PM +1000, Dave Chinner wrote:
> Hi Kent,
> 
> I haven't really had time to digest this in any real detail,
> but I've noticed a couple of things that worry me...
> 
> On Tue, May 08, 2018 at 06:17:59PM -0400, Kent Overstreet wrote:
> > Signed-off-by: Kent Overstreet 
> > ---
> >  fs/bcachefs/bcachefs_format.h | 1448 +
> >  1 file changed, 1448 insertions(+)
> >  create mode 100644 fs/bcachefs/bcachefs_format.h
> > 
> > diff --git a/fs/bcachefs/bcachefs_format.h b/fs/bcachefs/bcachefs_format.h
> > new file mode 100644
> > index 00..0961585c7e
> > --- /dev/null
> > +++ b/fs/bcachefs/bcachefs_format.h
> > @@ -0,0 +1,1448 @@
> > +#ifndef _BCACHEFS_FORMAT_H
> > +#define _BCACHEFS_FORMAT_H
> .
> > +/* Btree keys - all units are in sectors */
> > +
> > +struct bpos {
> > +   /* Word order matches machine byte order */
> > +#if defined(__LITTLE_ENDIAN)
> > +   __u32   snapshot;
> > +   __u64   offset;
> > +   __u64   inode;
> > +#elif defined(__BIG_ENDIAN)
> > +   __u64   inode;
> > +   __u64   offset; /* Points to end of extent - sectors */
> > +   __u32   snapshot;
> > +#else
> 
> Mostly my concerns are about these endian constructs - is the on
> disk structure big endian or little endian, and how do you ensure
> that everything you read and write to the on-disk format is in the
> correct endian notation? I think your on-disk format is little
> endian (from the definitions later in the file) but these don't look
> like endian neutral structures

Darrick already commented on struct bpos too, I added a big comment explaining
what's going on there :)

The majority is little endian/endian neutral. For struct bpos, I'll quote the
comment I wrote earlier:

/*
 * Word order matches machine byte order - btree code treats a bpos as a
 * single large integer, for search/comparison purposes
 *
 * Note that wherever a bpos is embedded in another on disk data
 * structure, it has to be byte swabbed when reading in metadata that
 * wasn't written in native endian order:
 */

With the way the core lookup code in bset.c works, this is really the only sane way
of doing it - bcache doesn't work on big endian machines, and because within a
key word order does not match machine byte order I tried and gave up trying to
fix it there. I spent a lot of time getting this right when I broke the on disk
format :)

The byte swabbing we have to do when reading in metadata from a different endian
machine is dead simple:

void bch2_bpos_swab(struct bpos *p)
{
u8 *l = (u8 *) p;
u8 *h = ((u8 *) &p[1]) - 1;

while (l < h) {
swap(*l, *h);
l++;
--h;
}
}

void bch2_bkey_swab_key(const struct bkey_format *_f, struct bkey_packed *k)
{
const struct bkey_format *f = bkey_packed(k) ? _f : 
&bch2_bkey_format_current;
u8 *l = k->key_start;
u8 *h = (u8 *) (k->_data + f->key_u64s) - 1;

while (l < h) {
swap(*l, *h);
l++;
--h;
}
}
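
To make the reversal concrete, here is a minimal userspace sketch of the same
idea. It assumes a packed 20-byte layout; struct demo_bpos, reverse_bytes() and
demo_bpos_swab() are hypothetical names for illustration, not the bcachefs
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct bpos in its little endian word order.
 * Packed (an assumption of this sketch) so it is exactly 20 bytes.
 */
struct demo_bpos {
	uint32_t snapshot;
	uint64_t offset;
	uint64_t inode;
} __attribute__((packed));

/* Reverse the bytes of buf[0..len) in place. */
static void reverse_bytes(uint8_t *buf, size_t len)
{
	uint8_t *l = buf;
	uint8_t *h;

	assert(buf != NULL);
	if (len == 0)
		return;
	h = buf + len - 1;

	while (l < h) {
		uint8_t tmp = *l;

		*l++ = *h;
		*h-- = tmp;
	}
}

/*
 * Treat the whole struct as one large integer and reverse it. This both
 * byte swaps every field and flips the field order, so a little endian
 * {snapshot, offset, inode} becomes a big endian {inode, offset, snapshot}.
 */
static void demo_bpos_swab(struct demo_bpos *p)
{
	reverse_bytes((uint8_t *)p, sizeof(*p));
}
```

Applying demo_bpos_swab() twice gives back the original bytes, which is why the
same helper can be used in both directions.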

> That's apart from the fact all the endian defines make the code
> really hard to read, and probably a pain to maintain, and it doubles
> the test matrix because any on-disk change has to be validated on
> both little endian and big endian machines

The testing does kind of suck, but it needs to happen anyways, not just for the
structures playing weird endianness games. I just finished fixing ktest's
foreign architecture support, so I'm going to be testing on 32 bit big endian
mips as soon as I finish getting bcachefs-tools to build (I've tested all the
stuff you've pointed out with powerpc virtual machines, but that was ages ago
and the on disk format has had other additions since then).

> > +union bch_extent_entry {
> > +#if defined(__LITTLE_ENDIAN) ||  __BITS_PER_LONG == 64
> > +   unsigned long   type;
> > +#elif __BITS_PER_LONG == 32
> > +   struct {
> > +   unsigned long   pad;
> > +   unsigned long   type;
> > +   };
> > +#else
> 
> This is another worry - using "long" in the on disk structure
> definition. If this is in-memory structures, then use
> le64_to_cpu/cpu_to_le64 to convert the value from the on-disk value
> to the in-memory, cpu order value

bch_extent is a more debatable use of fancy endianness/byte swabbing tricks...
it's not necessary for any fundamental algorithmic reasons here like it is with
bpos/bkey, it's mainly because as extents are the biggest fraction of total
metadata I was trying to optimize space efficiency as much as possible, without
having the cost of a pack/unpack (like I have for inodes now, which is actually
a bit painful). That and I really hate the KEY_INODE()/SET_KEY_INODE() style
accessors bcache uses.

I'm not sure I'll do that again in the future though, when I w

[PATCH 2/2] soc: bcm: brcmstb: Add missing DDR MEMC compatible strings

2018-05-11 Thread Florian Fainelli

We would not be matching the following chip/compatible strings
combinations, which would lead to not setting the warm boot flag
correctly, fix that:

7260A0/B0: brcm,brcmstb-memc-ddr-rev-b.2.1
7255A0: brcm,brcmstb-memc-ddr-rev-b.2.3
7278Bx: brcm,brcmstb-memc-ddr-rev-b.3.1

The B2.1 core (which is in 7260 A0 and B0) doesn't have the
SHIMPHY_ADDR_CNTL_0_DDR_PAD_CNTRL setup in the memsys init code, nor
does it have the warm boot flag re-definition on entry. Those changes
were for B2.2 and later MEMSYS cores. Fall back to the previous S2/S3
entry method for these specific chips.

Fixes: 0b741b8234c8 ("soc: bcm: brcmstb: Add support for S2/S3/S5 suspend 
states (ARM)")
Signed-off-by: Florian Fainelli 
---
 Documentation/devicetree/bindings/arm/bcm/brcm,brcmstb.txt |  3 +++
 drivers/soc/bcm/brcmstb/pm/pm-arm.c| 12 
 2 files changed, 15 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/bcm/brcm,brcmstb.txt 
b/Documentation/devicetree/bindings/arm/bcm/brcm,brcmstb.txt
index fb762059e68e..104cc9b41df4 100644
--- a/Documentation/devicetree/bindings/arm/bcm/brcm,brcmstb.txt
+++ b/Documentation/devicetree/bindings/arm/bcm/brcm,brcmstb.txt
@@ -189,8 +189,11 @@ Power-Down (SRPD), among other things.
 
 Required properties:
 - compatible : should contain one of these
+   "brcm,brcmstb-memc-ddr-rev-b.2.1"
"brcm,brcmstb-memc-ddr-rev-b.2.2"
+   "brcm,brcmstb-memc-ddr-rev-b.2.3"
"brcm,brcmstb-memc-ddr-rev-b.3.0"
+   "brcm,brcmstb-memc-ddr-rev-b.3.1"
"brcm,brcmstb-memc-ddr"
 - reg: the MEMC DDR register range
 
diff --git a/drivers/soc/bcm/brcmstb/pm/pm-arm.c 
b/drivers/soc/bcm/brcmstb/pm/pm-arm.c
index ade724677238..a5577dd5eb08 100644
--- a/drivers/soc/bcm/brcmstb/pm/pm-arm.c
+++ b/drivers/soc/bcm/brcmstb/pm/pm-arm.c
@@ -627,14 +627,26 @@ static const struct of_device_id ddr_shimphy_dt_ids[] = {
 };
 
 static const struct of_device_id brcmstb_memc_of_match[] = {
+   {
+   .compatible = "brcm,brcmstb-memc-ddr-rev-b.2.1",
+   .data = &ddr_seq,
+   },
{
.compatible = "brcm,brcmstb-memc-ddr-rev-b.2.2",
.data = &ddr_seq_b22,
},
+   {
+   .compatible = "brcm,brcmstb-memc-ddr-rev-b.2.3",
+   .data = &ddr_seq_b22,
+   },
{
.compatible = "brcm,brcmstb-memc-ddr-rev-b.3.0",
.data = &ddr_seq_b22,
},
+   {
+   .compatible = "brcm,brcmstb-memc-ddr-rev-b.3.1",
+   .data = &ddr_seq_b22,
+   },
{
.compatible = "brcm,brcmstb-memc-ddr",
.data = &ddr_seq,
-- 
2.14.1
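
For readers following along, the resulting mapping from compatible string to
entry sequence can be sketched in userspace as below. This is a simplified
exact-string lookup, not the kernel's OF matching code, and the demo_* names
are hypothetical; DEMO_SEQ_LEGACY stands in for ddr_seq and DEMO_SEQ_B22 for
ddr_seq_b22.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two sequence descriptors in pm-arm.c. */
enum demo_seq { DEMO_SEQ_LEGACY, DEMO_SEQ_B22 };

struct demo_match {
	const char *compatible;
	enum demo_seq seq;
};

/* Same order and same pairing as the table after this patch. */
static const struct demo_match demo_memc_match[] = {
	{ "brcm,brcmstb-memc-ddr-rev-b.2.1", DEMO_SEQ_LEGACY },
	{ "brcm,brcmstb-memc-ddr-rev-b.2.2", DEMO_SEQ_B22 },
	{ "brcm,brcmstb-memc-ddr-rev-b.2.3", DEMO_SEQ_B22 },
	{ "brcm,brcmstb-memc-ddr-rev-b.3.0", DEMO_SEQ_B22 },
	{ "brcm,brcmstb-memc-ddr-rev-b.3.1", DEMO_SEQ_B22 },
	{ "brcm,brcmstb-memc-ddr",           DEMO_SEQ_LEGACY },
};

/* Return the entry whose string matches exactly, or NULL if none does. */
static const struct demo_match *demo_lookup(const char *compatible)
{
	size_t i;
	size_t n = sizeof(demo_memc_match) / sizeof(demo_memc_match[0]);

	for (i = 0; i < n; i++)
		if (strcmp(demo_memc_match[i].compatible, compatible) == 0)
			return &demo_memc_match[i];

	return NULL;
}
```

Only the B2.1 and generic entries resolve to the older sequence; every other
listed revision uses the B2.2 one.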

[PATCH 1/2] soc: bcm: brcmstb: pm: Add support for newer rev B3.0 controllers

2018-05-11 Thread Florian Fainelli

From: Doug Berger 

Update the Device Tree binding document and add a matching entry for the
MEMC DDR controller revision B3.0 which is found on chips like 7278A0
and newer.

Signed-off-by: Doug Berger 
[florian: tweak commit message, make it apply to upstream kernel]
Signed-off-by: Florian Fainelli 
---
 Documentation/devicetree/bindings/arm/bcm/brcm,brcmstb.txt | 1 +
 drivers/soc/bcm/brcmstb/pm/pm-arm.c| 4 
 2 files changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/bcm/brcm,brcmstb.txt 
b/Documentation/devicetree/bindings/arm/bcm/brcm,brcmstb.txt
index c052caad36e8..fb762059e68e 100644
--- a/Documentation/devicetree/bindings/arm/bcm/brcm,brcmstb.txt
+++ b/Documentation/devicetree/bindings/arm/bcm/brcm,brcmstb.txt
@@ -190,6 +190,7 @@ Power-Down (SRPD), among other things.
 Required properties:
 - compatible : should contain one of these
"brcm,brcmstb-memc-ddr-rev-b.2.2"
+   "brcm,brcmstb-memc-ddr-rev-b.3.0"
"brcm,brcmstb-memc-ddr"
 - reg: the MEMC DDR register range
 
diff --git a/drivers/soc/bcm/brcmstb/pm/pm-arm.c 
b/drivers/soc/bcm/brcmstb/pm/pm-arm.c
index dcf8c8065508..ade724677238 100644
--- a/drivers/soc/bcm/brcmstb/pm/pm-arm.c
+++ b/drivers/soc/bcm/brcmstb/pm/pm-arm.c
@@ -631,6 +631,10 @@ static const struct of_device_id brcmstb_memc_of_match[] = 
{
.compatible = "brcm,brcmstb-memc-ddr-rev-b.2.2",
.data = &ddr_seq_b22,
},
+   {
+   .compatible = "brcm,brcmstb-memc-ddr-rev-b.3.0",
+   .data = &ddr_seq_b22,
+   },
{
.compatible = "brcm,brcmstb-memc-ddr",
.data = &ddr_seq,
-- 
2.14.1

[PATCH 0/2] soc: bcm: brcmstb: Updates to support newer controllers

2018-05-11 Thread Florian Fainelli

Hi all,

This patch series adds support for newer revisions of the memory controller
which is necessary to make sure we do use the right programming sequence to
enter S2 and S3 suspend/resume modes.

Doug Berger (1):
  soc: bcm: brcmstb: pm: Add support for newer rev B3.0 controllers

Florian Fainelli (1):
  soc: bcm: brcmstb: Add missing DDR MEMC compatible strings

 .../devicetree/bindings/arm/bcm/brcm,brcmstb.txt |  4 
 drivers/soc/bcm/brcmstb/pm/pm-arm.c  | 16 
 2 files changed, 20 insertions(+)

-- 
2.14.1

Re: [PATCH net] macmace: Set platform device coherent_dma_mask

2018-05-11 Thread Michael Schmitz

Hi Finn,

Am 11.05.2018 um 22:06 schrieb Finn Thain:
>> You would have to be careful not to overwrite a pdev->dev.dma_mask and 
>> pdev->dev.dma_coherent_mask that might have been set in a platform 
>> device passed via platform_device_register here. Coldfire is the only 
>> m68k platform currently using that, but there might be others in future.
>>
> 
> That Coldfire patch could be reverted if this is a better solution.

True, but there might be other uses for deviating from a platform
default (I'm thinking of Atari SCSI and floppy drivers here). But we
could choose the correct mask to set in arch_setup_pdev_archdata()
instead, as it's a platform property not a driver property in that case.

>> ... But I don't think there are smaller DMA masks used by m68k drivers 
>> that use the platform device mechanism at present. I've only looked at 
>> arch/m68k though.
> 
> So we're back at the same problem that Geert's suggestion also raised: how 
> to identify potentially affected platform devices and drivers?
> 
> Maybe we can take a leaf out of Christoph's book, and leave a noisy 
> WARNING splat in the log.
> 
> void arch_setup_pdev_archdata(struct platform_device *pdev)
> {
> WARN_ON_ONCE(pdev->dev.coherent_dma_mask != DMA_MASK_NONE ||
>  pdev->dev.dma_mask != NULL);

I'd suggest using WARN_ON() so we catch all uses on a particular platform.

I initially thought it necessary to warn on unset mask here, but I see
that would throw up a lot of redundant false positives.
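
A rough userspace model of what I have in mind, purely as a sketch: warn every
time a device arrives with a mask already set and leave it untouched, otherwise
apply a platform default. The demo_* names and the 32-bit default are
assumptions for illustration, not existing kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_MASK_NONE 0ULL          /* assumed "unset" value */
#define DEMO_MASK_32   0xffffffffULL /* assumed platform default */

/* Hypothetical, cut-down stand-in for the DMA fields of struct device. */
struct demo_dev {
	uint64_t *dma_mask;          /* NULL when unset */
	uint64_t coherent_dma_mask;  /* DEMO_MASK_NONE when unset */
};

/* Counts every warning, modelling WARN_ON() rather than WARN_ON_ONCE(). */
static unsigned int demo_warnings;

/*
 * Apply the platform default only when nothing was set beforehand.
 * mask_storage is where the streaming mask lives if we install one.
 */
static void demo_setup_archdata(struct demo_dev *dev, uint64_t *mask_storage)
{
	if (dev->dma_mask != NULL ||
	    dev->coherent_dma_mask != DEMO_MASK_NONE) {
		demo_warnings++;
		fprintf(stderr, "demo: masks already set, not overwriting\n");
		return;
	}

	*mask_storage = DEMO_MASK_32;
	dev->dma_mask = mask_storage;
	dev->coherent_dma_mask = DEMO_MASK_32;
}
```

With a per-call warning, each pre-configured device shows up in the log, which
is the point of preferring WARN_ON() here.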

Cheers,

Michael

Re: [PATCH v2] MIPS: c-r4k: fix data corruption related to cache coherence.

2018-05-11 Thread James Hogan

On Tue, May 08, 2018 at 11:22:36AM +1000, NeilBrown wrote:
> On Mon, May 07 2018, James Hogan wrote:
> 
> > On Mon, May 07, 2018 at 07:40:49AM +1000, NeilBrown wrote:
> >> 
> >> Hi James,
> >>  this hasn't appeared in linux-next yet, or in any branch
> >>  of
> >>git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips.git
> >> 
> >>  Should I expect it to?
> >
> > Sorry Neil, I haven't applied it yet. I'm planning to get a few fixes
> > sorted this week, at which point it would land in the mips-fixes branch
> > at the above repo.
> 
> Cool, thanks.  I just wanted to be sure it hadn't got lost somehow.

Now pushed.

Thanks
James



Re: [PATCH v2] MIPS: Fix build with DEBUG_ZBOOT and MACH_JZ4770

2018-05-11 Thread James Hogan

On Wed, Mar 28, 2018 at 05:38:12PM +0200, Paul Cercueil wrote:
> The debug definitions were missing for MACH_JZ4770, resulting in a build
> failure when DEBUG_ZBOOT was set.
> 
> Since the UART addresses are the same across all Ingenic SoCs, we just
> use a #ifdef CONFIG_MACH_INGENIC instead of checking for individual
> Ingenic SoCs.
> 
> Additionally, I added a #define for the UART0 address in-code and dropped
> the  include, for the reason that this include
> file is slowly being phased out as the whole platform is being moved to
> devicetree.
> 
> Signed-off-by: Paul Cercueil 

Applied for 4.17 with fixes tag and 4.16 stable tag.

Thanks
James



Re: [PATCH 3/6] firmware: differentiate between signed regulatory.db and other firmware

2018-05-11 Thread Luis R. Rodriguez

On Fri, May 11, 2018 at 01:00:26AM -0400, Mimi Zohar wrote:
> On Thu, 2018-05-10 at 23:26 +, Luis R. Rodriguez wrote:
> > On Wed, May 09, 2018 at 10:00:58PM -0400, Mimi Zohar wrote:
> > > On Wed, 2018-05-09 at 23:48 +, Luis R. Rodriguez wrote:
> > > > On Wed, May 09, 2018 at 06:06:57PM -0400, Mimi Zohar wrote:
> > > 
> > > > > > > Yes, writing regdb as a micro/mini LSM sounds reasonable.  The LSM
> > > > > > > would differentiate between other firmware and the regulatory.db 
> > > > > > > based
> > > > > > > on the firmware's pathname.
> > > > > > 
> > > > > > If that is the only way then it would be silly to do the mini LSM 
> > > > > > as all
> > > > > > calls would have to have the check. A special LSM hook for just the
> > > > > > regulatory db also doesn't make much sense.
> > > > > 
> > > > > All calls to request_firmware() are already going through this LSM
> > > > > hook.  I should have said, it would be based on both READING_FIRMWARE
> > > > > and the firmware's pathname.
> > > > 
> > > > Yes, but it would still be a strcmp() computation added for all
> > > > READING_FIRMWARE. In that sense, the current arrangement is only open 
> > > > coding the
> > > > signature verification for the regulatory.db file.  One way to avoid 
> > > > this would
> > > > be to add an LSM specific to the regulatory db
> > > 
> > > Casey already commented on this suggestion.
> > 
> > Sorry but I must have missed this, can you send me the email or URL where 
> > he did that?
> > I never got a copy of that email I think.
> 
> My mistake.  I've posted similar patches for kexec_load and for the
> firmware sysfs fallback, both call security_kernel_read_file().
> Casey's comment was in regards to kexec_load[1], not for the sysfs
> fallback mode.  Here's the link to the most recent version of the
> kexec_load patches.[2]
> 
> [1] 
> http://kernsec.org/pipermail/linux-security-module-archive/2018-May/006690.html
> [2] 
> http://kernsec.org/pipermail/linux-security-module-archive/2018-May/006854.html

It seems I share Eric's concerns on these threads about the general architecture;
below are some notes which I think may help in the long term in that regard.

In the firmware_loader case we have *one* subsystem which has open coded firmware
signing -- the wireless subsystem open codes firmware verification by doing two
request_firmware() calls, one for regulatory.db and one for regulatory.db.p7s,
and then it does its own check. In this patch set you suggested adding
a new READING_FIRMWARE_REGULATORY_DB. But your first patch in the series also
adds READING_FIRMWARE_FALLBACK for the fallback case where we enable use of
the old sysfs loading facility.

My concerns are two fold for this case:

a) This would mean adding a new READING_* ID tag for any kernel mechanism which
open codes its own signature verification scheme.

b) The way it was implemented was to do (just showing
READING_FIRMWARE_REGULATORY_DB here):

diff --git a/drivers/base/firmware_loader/main.c 
b/drivers/base/firmware_loader/main.c
index eb34089e4299..d7cdf04a8681 100644
--- a/drivers/base/firmware_loader/main.c
+++ b/drivers/base/firmware_loader/main.c
@@ -318,6 +318,11 @@ fw_get_filesystem_firmware(struct device *device, struct 
fw_priv *fw_priv)
break;
}

+#ifdef CONFIG_CFG80211_REQUIRE_SIGNED_REGDB
+   if ((strcmp(fw_priv->fw_name, "regulatory.db") == 0) ||
+   (strcmp(fw_priv->fw_name, "regulatory.db.p7s") == 0))
+   id = READING_FIRMWARE_REGULATORY_DB;
+#endif
fw_priv->size = 0;
rc = kernel_read_file_from_path(path, &fw_priv->data, &size,
msize, id);

This is an eyesore, and in turn would mean adding yet more #ifdefs for any
code in the kernel which open codes other firmware signing efforts with
its own kconfig...

I gather from reading the threads above that Eric's concerns are the re-use of
an API for security to read files for something which is really not a file, but
a binary blob of some sort, and Casey's rebuttal is that adding more hooks for
small things is a bad idea.

In light of all this I'll say that the concerns Eric has are unfortunately
too late, that ship sailed eons ago. The old non-fd API for module loading,
init_module(), calls security_kernel_read_file(NULL, READING_MODULE). Your
patch in this series adds
security_kernel_read_file(NULL, READING_FIRMWARE_FALLBACK)
for the old sysfs loading facility.

So in this regard, I think we have no other option but what you suggested, to
add a wrapper, say a security_kernel_read_blob() wrapper that calls
security_kernel_read_file(NULL, id); and make the following legacy calls use
it:

  o kernel/module.c for init_module()
  o kexec_load()
  o firmware loader sysfs facility
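
As a minimal sketch of that wrapper, modelled in userspace: the hook is stubbed
so it only records its arguments, and every demo_* name is hypothetical. Only
the two IDs mentioned above are modelled.

```c
#include <assert.h>
#include <stddef.h>

struct demo_file; /* opaque stand-in for struct file */

/* Stand-ins for the READING_* IDs named in this thread. */
enum demo_read_id {
	DEMO_READING_MODULE,
	DEMO_READING_FIRMWARE_FALLBACK,
};

/* Recorded by the stub so a caller can inspect what the hook saw. */
static const struct demo_file *demo_last_file;
static enum demo_read_id demo_last_id;
static int demo_calls;

/* Stub for the existing hook: record the arguments and allow the read. */
static int demo_security_kernel_read_file(const struct demo_file *file,
					  enum demo_read_id id)
{
	demo_last_file = file;
	demo_last_id = id;
	demo_calls++;
	return 0;
}

/*
 * Wrapper for callers that hold a blob rather than a file: it makes the
 * NULL file argument explicit in one place instead of at every call site.
 */
static int demo_security_kernel_read_blob(enum demo_read_id id)
{
	return demo_security_kernel_read_file(NULL, id);
}
```

The legacy callers listed above would then call the blob variant, and the
NULL-file convention lives in exactly one function.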

I think it's fair then to add a new READING entry per functionality here
*but* with the compromise that we *document* that such interfaces are
discouraged, i

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-11 Thread Stephen Bates

>I find this hard to believe. There's always the possibility that some 
>part of the system doesn't support ACS so if the PCI bus addresses and 
>IOVA overlap there's a good chance that P2P and ATS won't work at all on 
>some hardware.

I tend to agree but this comes down to how IOVA addresses are generated in the 
kernel. Alex (or anyone else) can you point to where IOVA addresses are 
generated? As Logan stated earlier, p2pdma bypasses this and programs the PCI 
bus address directly but other IO going to the same PCI EP may flow through the 
IOMMU and be programmed with IOVA rather than PCI bus addresses.

> I prefer 
>the option to disable the ACS bit on boot and let the existing code put 
>the devices into their own IOMMU group (as it should already do to 
>support hardware that doesn't have ACS support).

+1

Stephen

[PATCH] ring_buffer: Update logging to use single line output

2018-05-11 Thread Joe Perches

With a possible change to pr_fmt coming, the logging output can
become unbalanced when an initial line has a prefix and subsequent
lines do not when a multiple line pr_ is emitted.

Fix it by emitting a single line.

Miscellanea:

o Convert consecutive tests of total_lost and !total_lost to if/else

Signed-off-by: Joe Perches 
---
 kernel/trace/ring_buffer.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index c9cb9767d49b..ee74494a2da3 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -5132,10 +5132,9 @@ static __init int test_ringbuffer(void)
pr_info("total events:   %ld\n", total_lost + 
total_read);
pr_info("  recorded len bytes:   %ld\n", total_len);
pr_info(" recorded size bytes:   %ld\n", total_size);
-   if (total_lost)
-   pr_info(" With dropped events, record len and size may 
not match\n"
-   " alloced and written from above\n");
-   if (!total_lost) {
+   if (total_lost) {
+   pr_info(" With dropped events, record len and size may 
not match alloced and written from above\n");
+   } else {
if (RB_WARN_ON(buffer, total_len != total_alloc ||
   total_size != total_written))
break;

Re: [PATCH v4 1/7] interconnect: Add generic on-chip interconnect API

2018-05-11 Thread Evan Green

Hi Georgi,

On Fri, Mar 9, 2018 at 1:12 PM Georgi Djakov 
wrote:

> This patch introduces a new API to get requirements and configure the
> interconnect buses across the entire chipset to fit with the current
> demand.

> The API is using a consumer/provider-based model, where the providers are
> the interconnect buses and the consumers could be various drivers.
> The consumers request interconnect resources (path) between endpoints and
> set the desired constraints on this data flow path. The providers receive
> requests from consumers and aggregate these requests for all master-slave
> pairs on that path. Then the providers configure each node participating in
> the topology according to the requested data flow path, physical links and
> constraints. The topology could be complicated and multi-tiered and is SoC
> specific.

> Signed-off-by: Georgi Djakov 
> ---
>   Documentation/interconnect/interconnect.rst |  96 ++
>   drivers/Kconfig |   2 +
>   drivers/Makefile|   1 +
>   drivers/interconnect/Kconfig|  10 +
>   drivers/interconnect/Makefile   |   1 +
>   drivers/interconnect/core.c | 489

>   include/linux/interconnect-provider.h   | 109 +++
>   include/linux/interconnect.h|  40 +++
>   8 files changed, 748 insertions(+)
>   create mode 100644 Documentation/interconnect/interconnect.rst
>   create mode 100644 drivers/interconnect/Kconfig
>   create mode 100644 drivers/interconnect/Makefile
>   create mode 100644 drivers/interconnect/core.c
>   create mode 100644 include/linux/interconnect-provider.h
>   create mode 100644 include/linux/interconnect.h

...
> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
> new file mode 100644
> index ..6306e258b9b9
> --- /dev/null
> +++ b/drivers/interconnect/core.c
> @@ -0,0 +1,489 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Interconnect framework core driver
> + *
> + * Copyright (c) 2018, Linaro Ltd.
> + * Author: Georgi Djakov 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static DEFINE_IDR(icc_idr);
> +static LIST_HEAD(icc_provider_list);
> +static DEFINE_MUTEX(icc_provider_list_mutex);
> +static DEFINE_MUTEX(icc_path_mutex);
> +
> +/**
> + * struct icc_req - constraints that are attached to each node
> + *
> + * @req_node: entry in list of requests for the particular @node
> + * @node: the interconnect node to which this constraint applies
> + * @avg_bw: an integer describing the average bandwidth in kbps
> + * @peak_bw: an integer describing the peak bandwidth in kbps
> + */
> +struct icc_req {
> +   struct hlist_node req_node;
> +   struct icc_node *node;
> +   u32 avg_bw;
> +   u32 peak_bw;
> +};
> +
> +/**
> + * struct icc_path - interconnect path structure
> + * @num_nodes: number of hops (nodes)
> + * @reqs: array of the requests applicable to this path of nodes
> + */
> +struct icc_path {
> +   size_t num_nodes;
> +   struct icc_req reqs[0];
> +};
> +
> +static struct icc_node *node_find(const int id)
> +{
> +   struct icc_node *node;
> +
> +   node = idr_find(&icc_idr, id);
> +
> +   return node;
> +}
> +
> +static struct icc_path *path_allocate(struct icc_node *node, ssize_t num_nodes)
> +{

So node is really the destination, correct? Then we use ->reverse to walk
backwards num_nodes steps towards the source. It might increase readability
to call the parameter dest, then assign that to a local called node for
traversal.
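
Roughly what I mean, as a userspace sketch with hypothetical demo_* types (a
plain pointer array instead of the icc_req entries, and calloc() instead of
kzalloc()):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, cut-down node: only what the backwards walk needs. */
struct demo_node {
	int id;
	struct demo_node *reverse; /* previous hop, saved during the search */
};

struct demo_path {
	size_t num_nodes;
	struct demo_node *nodes[]; /* nodes[0] is dest, the last is the source */
};

/*
 * Build a path by walking num_nodes steps back from dest via ->reverse.
 * Returns NULL on allocation failure or if the chain is too short.
 */
static struct demo_path *demo_path_allocate(struct demo_node *dest,
					    size_t num_nodes)
{
	struct demo_node *node = dest; /* local cursor, dest stays untouched */
	struct demo_path *path;
	size_t i;

	path = calloc(1, sizeof(*path) + num_nodes * sizeof(path->nodes[0]));
	if (!path)
		return NULL;

	path->num_nodes = num_nodes;

	for (i = 0; i < num_nodes; i++) {
		if (!node) {
			free(path);
			return NULL;
		}
		path->nodes[i] = node;
		node = node->reverse;
	}

	return path;
}
```

With the parameter named dest it is obvious at the call site which end of the
path is being passed in, and the cursor variable carries the traversal.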

> +   struct icc_path *path;
> +   size_t i;
> +
> +   path = kzalloc(sizeof(*path) + num_nodes * sizeof(*path->reqs),
> +  GFP_KERNEL);
> +   if (!path)
> +   return ERR_PTR(-ENOMEM);
> +
> +   path->num_nodes = num_nodes;
> +
> +   for (i = 0; i < num_nodes; i++) {
> +   hlist_add_head(&path->reqs[i].req_node, &node->req_list);
> +
> +   path->reqs[i].node = node;
> +   /* reference to previous node was saved during path traversal */
> +   node = node->reverse;
> +   }
> +
> +   return path;
> +}
> +
> +static struct icc_path *path_find(struct icc_node *src, struct icc_node *dst)
> +{
> +   struct icc_node *node = NULL;
> +   struct list_head traverse_list;
> +   struct list_head edge_list;
> +   struct list_head tmp_list;
> +   size_t i, number = 0;
> +   bool found = false;
> +
> +   INIT_LIST_HEAD(&traverse_list);
> +   INIT_LIST_HEAD(&edge_list);
> +   INIT_LIST_HEAD(&tmp_list);

tmp_list is really the list of nodes you've already visited and need to
remember to reset is_traversed for. Maybe calling this done_list or
visited_list would be more descriptive.

> +
> +   list_add_tail(&src->search_list, &traverse_list);

For added paranoia, you could set src->rever

Re: [PATCH v3 5/8] MIPS: jz4740: dts: Add bindings for the jz4740-wdt driver

2018-05-11 Thread Guenter Roeck

On Fri, May 11, 2018 at 10:15:55PM +0100, James Hogan wrote:
> On Fri, May 11, 2018 at 02:14:16PM -0700, Guenter Roeck wrote:
> > On Fri, May 11, 2018 at 09:54:14PM +0100, James Hogan wrote:
> > > On Fri, May 11, 2018 at 01:17:04PM -0300, Paul Cercueil wrote:
> > > > Le 11 mai 2018 11:52, James Hogan  a écrit :
> > > > > Otherwise 
> > > > > Acked-by: James Hogan  
> > > > >
> > > > > I'm happy to apply for 4.18 with that change if you want it to go 
> > > > > through the MIPS tree. 
> > > > 
> > > > Yes please!
> > > 
> > > Done
> > > 
> > Does that include the watchdog changes ? No problem with it, just asking to 
> > make
> > sure that those don't get lost.
> 
> Yes, I suppose I was taking your reviewed-by as an ack.
> 
Ok.

Thanks,
Guenter

Re: [PATCH v4 4/7] interconnect: qcom: Add RPM communication

2018-05-11 Thread Evan Green

On Fri, Mar 9, 2018 at 1:11 PM Georgi Djakov 
wrote:

> On some Qualcomm SoCs, there is a remote processor, which controls some of
> the Network-On-Chip interconnect resources. Other CPUs express their needs
> by communicating with this processor. Add a driver to handle communication
> with this remote processor.

> Signed-off-by: Georgi Djakov 
> ---
>   .../devicetree/bindings/interconnect/qcom-smd.txt  | 31 
>   drivers/interconnect/qcom/Makefile |  1 +
>   drivers/interconnect/qcom/smd-rpm.c| 90
++
>   drivers/interconnect/qcom/smd-rpm.h| 15 
>   4 files changed, 137 insertions(+)
>   create mode 100644 Documentation/devicetree/bindings/interconnect/qcom-smd.txt
>   create mode 100644 drivers/interconnect/qcom/Makefile
>   create mode 100644 drivers/interconnect/qcom/smd-rpm.c
>   create mode 100644 drivers/interconnect/qcom/smd-rpm.h

> diff --git a/Documentation/devicetree/bindings/interconnect/qcom-smd.txt b/Documentation/devicetree/bindings/interconnect/qcom-smd.txt
> new file mode 100644
> index ..14e83ed7019b
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/interconnect/qcom-smd.txt
> @@ -0,0 +1,31 @@
> +Qualcomm SMD-RPM interconnect driver binding
> +
> +The RPM is a dedicated hardware engine for managing the shared
> +SoC resources in order to keep the lowest power profile. It
> +communicates with other hardware subsystems via shared memory
> +and accepts requests for various resources.

You never say what RPM or SMD stands for. RPM is Resource Power Manager,
right? But I'm not in the know about SMD. Can you define these somewhere?

-Evan

Re: [PATCH v4 5/7] interconnect: qcom: Add msm8916 interconnect provider driver

2018-05-11 Thread Evan Green

Hi Georgi,

On Fri, Mar 9, 2018 at 1:11 PM Georgi Djakov 
wrote:

> Add driver for the Qualcomm interconnect buses found in msm8916 based
> platforms.

> Signed-off-by: Georgi Djakov 
> ---
>drivers/interconnect/Kconfig|   5 +
>drivers/interconnect/Makefile   |   1 +
>drivers/interconnect/qcom/Kconfig   |  11 +
>drivers/interconnect/qcom/Makefile  |   2 +
>drivers/interconnect/qcom/msm8916.c | 482

>include/linux/interconnect/qcom.h   | 350 ++
>6 files changed, 851 insertions(+)
>create mode 100644 drivers/interconnect/qcom/Kconfig
>create mode 100644 drivers/interconnect/qcom/msm8916.c
>create mode 100644 include/linux/interconnect/qcom.h
...
> diff --git a/drivers/interconnect/qcom/msm8916.c b/drivers/interconnect/qcom/msm8916.c
> new file mode 100644
> index ..d5b54f8261c8
> --- /dev/null
> +++ b/drivers/interconnect/qcom/msm8916.c
...
> +static int qnoc_probe(struct platform_device *pdev)
> +{
> +   const struct qcom_icc_desc *desc;
> +   struct qcom_icc_node **qnodes;
> +   struct qcom_icc_provider *qp;
> +   struct resource *res;
> +   struct icc_provider *provider;
> +   size_t num_nodes, i;
> +   int ret;
> +
> +   /* wait for RPM */
> +   if (!qcom_icc_rpm_smd_available())
> +   return -EPROBE_DEFER;
> +
> +   desc = of_device_get_match_data(&pdev->dev);
> +   if (!desc)
> +   return -EINVAL;
> +
> +   qnodes = desc->nodes;
> +   num_nodes = desc->num_nodes;
> +
> +   qp = devm_kzalloc(&pdev->dev, sizeof(*qp), GFP_KERNEL);
> +   if (!qp)
> +   return -ENOMEM;
> +
> +   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +   qp->base = devm_ioremap_resource(&pdev->dev, res);
> +   if (IS_ERR(qp->base))
> +   return PTR_ERR(qp->base);
> +
> +   qp->bus_clk = devm_clk_get(&pdev->dev, "bus_clk");
> +   if (IS_ERR(qp->bus_clk))
> +   return PTR_ERR(qp->bus_clk);
> +
> +   qp->bus_a_clk = devm_clk_get(&pdev->dev, "bus_a_clk");
> +   if (IS_ERR(qp->bus_a_clk))
> +   return PTR_ERR(qp->bus_a_clk);
> +
> +   provider = &qp->provider;
> +   provider->dev = &pdev->dev;
> +   provider->set = &qcom_icc_set;
> +   INIT_LIST_HEAD(&provider->nodes);
> +   provider->data = qp;
> +
> +   ret = icc_add_provider(provider);
> +   if (ret) {
> +   dev_err(&pdev->dev, "error adding interconnect provider\n");
> +   return ret;
> +   }
> +
> +   for (i = 0; i < num_nodes; i++) {
> +   struct icc_node *node;
> +   int ret;
> +   size_t j;
> +
> +   node = icc_node_create(qnodes[i]->id);
> +   if (IS_ERR(node)) {
> +   ret = PTR_ERR(node);
> +   goto err;
> +   }
> +
> +   node->name = qnodes[i]->name;
> +   node->data = qnodes[i];
> +   icc_node_add(node, provider);
> +
> +   dev_dbg(&pdev->dev, "registered node %p %s %d\n", node,
> +   qnodes[i]->name, node->id);
> +
> +   /* populate links */
> +   for (j = 0; j < qnodes[i]->num_links; j++)
> +   if (qnodes[i]->links[j])
> +   icc_link_create(node, qnodes[i]->links[j]);
> +
> +   ret = qcom_icc_init(node);
> +   if (ret)
> +   dev_err(&pdev->dev, "%s init error (%d)\n", node->name,
> +   ret);

Don't you want to call qcom_icc_init before icc_link_create? As soon as
icc_link_create is called, the node is connected to the graph, and
qcom_icc_set can be called.
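
To make the ordering concern concrete, here is a minimal userspace sketch. It is not the interconnect framework API: toy_node, toy_init, toy_link, toy_set and toy_register are hypothetical stand-ins for the node, qcom_icc_init, icc_link_create, qcom_icc_set and the loop body in qnoc_probe. It only models the idea that a node should be fully initialized before it becomes reachable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a graph node; not the real icc_node. */
struct toy_node {
	bool initialized;	/* set by toy_init(), models qcom_icc_init() */
	bool linked;		/* set by toy_link(), models icc_link_create() */
};

/* Models the setup that has to finish before requests can arrive. */
static int toy_init(struct toy_node *n)
{
	n->initialized = true;
	return 0;
}

/* Models publishing the node into the graph. */
static void toy_link(struct toy_node *n)
{
	n->linked = true;
}

/* Models a set() request coming in through the graph. */
static int toy_set(const struct toy_node *n)
{
	if (!n->linked)
		return 0;			/* unreachable: nothing to do */
	return n->initialized ? 0 : -1;		/* reachable but not ready */
}

/* Suggested order: initialize first, link only if that succeeded. */
static int toy_register(struct toy_node *n)
{
	int ret = toy_init(n);

	if (ret)
		return ret;
	toy_link(n);
	return 0;
}
```

With this order a failed init also never leaves a half-set-up node linked into the graph, which the dev_err-and-continue path in the patch would.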

> +   }
> +
> +   platform_set_drvdata(pdev, provider);
> +
> +   return ret;
> +err:
> +   icc_del_provider(provider);
> +   return ret;
> +}
> +
> +static int qnoc_remove(struct platform_device *pdev)
> +{
> +   struct icc_provider *provider = platform_get_drvdata(pdev);
> +
> +   icc_del_provider(provider);

Presumably in the framework nodes and links ought to get cleaned up
somewhere too, right? As it is now, the devm code frees provider when this
device is removed, even though provider is still very connected in the
graph via the nodes and links.
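
A rough userspace sketch of the kind of teardown I have in mind. All names here (toy_provider, toy_node, toy_node_add, toy_provider_teardown) are made up for illustration and say nothing about how the framework would actually implement it; the point is only that every node hanging off the provider is unlinked and released before the provider's own memory goes away.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node, kept on a singly linked list owned by the provider. */
struct toy_node {
	struct toy_node *next;
};

/* Hypothetical provider; in the driver this memory is devm-managed. */
struct toy_provider {
	struct toy_node *nodes;
	int num_nodes;
};

/* Allocate a node and attach it to the provider. */
static int toy_node_add(struct toy_provider *p)
{
	struct toy_node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->next = p->nodes;
	p->nodes = n;
	p->num_nodes++;
	return 0;
}

/* Unlink and free every node; returns how many were released. */
static int toy_provider_teardown(struct toy_provider *p)
{
	int freed = 0;

	while (p->nodes) {
		struct toy_node *n = p->nodes;

		p->nodes = n->next;
		free(n);
		freed++;
	}
	p->num_nodes = 0;
	return freed;
}
```

Only after something like toy_provider_teardown() has run would it be safe for the provider itself to be freed.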

> +
> +   return 0;
> +}
> +
> +static const struct of_device_id qnoc_of_match[] = {
> +   { .compatible = "qcom,msm8916-pnoc", .data = &msm8916_pnoc },
> +   { .compatible = "qcom,msm8916-snoc", .data = &msm8916_snoc },
> +   { .compatible = "qcom,msm8916-bimc", .data = &msm8916_bimc },
> +   { },
> +};
> +MODULE_DEVICE_TABLE(of, qnoc_of_match);
> +
> +static struct platform_driver qnoc_driver = {
> +   .probe = qnoc_probe,
> +   .remove = qnoc_remove,
> +   .driver = {
> +   .name = "qnoc-msm8916",
> +   .of_match_table = qn

Re: [PATCH v4 7/7] interconnect: Allow endpoints translation via DT

2018-05-11 Thread Evan Green

On Fri, Mar 9, 2018 at 1:10 PM Georgi Djakov wrote:

> Currently we support only platform data for specifying the interconnect
> endpoints. As now the endpoints are hard-coded into the consumer driver
> this may leed to complications when a single driver is used by multiple

Nit: s/leed/lead/

-Evan

[PATCH] mtd: nand: Add support for reading ooblayout from device tree

2018-05-11 Thread Paul Cercueil

By specifying the properties "nand-oob-ecc" and "nand-oob-free", it is
now possible to describe in the device tree where the ECC data and the
free-to-use areas are located inside the OOB region.

Signed-off-by: Paul Cercueil 
---
 Documentation/devicetree/bindings/mtd/nand.txt |  7 +
 drivers/mtd/nand/raw/nand_base.c   | 42 ++
 2 files changed, 49 insertions(+)

diff --git a/Documentation/devicetree/bindings/mtd/nand.txt b/Documentation/devicetree/bindings/mtd/nand.txt
index 8bb11d809429..118ea92787cb 100644
--- a/Documentation/devicetree/bindings/mtd/nand.txt
+++ b/Documentation/devicetree/bindings/mtd/nand.txt
@@ -45,6 +45,13 @@ Optional NAND chip properties:
 as reliable as possible.
 - nand-rb: shall contain the native Ready/Busy ids.
 
+- nand-oob-ecc:  couples of integers, specifying the offset
+and length of the ECC data in the OOB region. There can be more
+than one couple.
+- nand-oob-free:  couples of integers, specifying the offset
+and length of a free-to-use area in the OOB region. There can be
+more than one couple.
+
 The ECC strength and ECC step size properties define the correction capability
 of a controller. Together, they say a controller can correct "{strength} bit
 errors per {size} bytes".
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 72f3a89da513..c905531effb0 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -213,6 +213,43 @@ static const struct mtd_ooblayout_ops nand_ooblayout_lp_hamming_ops = {
.free = nand_ooblayout_free_lp_hamming,
 };
 
+static int nand_oob_of(struct device_node *np, int section,
+  struct mtd_oob_region *oobregion, const char *prop)
+{
+   int ret = of_property_read_u32_index(np, prop,
+   section * 2, &oobregion->offset);
+   if (ret == -EOVERFLOW)
+   return -ERANGE; /* We're done */
+   if (ret)
+   return ret;
+
+   ret = of_property_read_u32_index(np, prop,
+   section * 2 + 1, &oobregion->length);
+   if (ret == -EOVERFLOW)
+   return -EINVAL; /* We must have an even number of integers */
+
+   return ret;
+}
+
+static int nand_ooblayout_ecc_of(struct mtd_info *mtd, int section,
+struct mtd_oob_region *oobregion)
+{
+   return nand_oob_of(mtd->dev.of_node, section,
+   oobregion, "nand-oob-ecc");
+}
+
+static int nand_ooblayout_free_of(struct mtd_info *mtd, int section,
+struct mtd_oob_region *oobregion)
+{
+   return nand_oob_of(mtd->dev.of_node, section,
+   oobregion, "nand-oob-free");
+}
+
+static const struct mtd_ooblayout_ops nand_ooblayout_of_ops = {
+   .ecc = nand_ooblayout_ecc_of,
+   .free = nand_ooblayout_free_of,
+};
+
 static int check_offs_len(struct mtd_info *mtd,
loff_t ofs, uint64_t len)
 {
@@ -5843,6 +5880,11 @@ static int nand_dt_init(struct nand_chip *chip)
if (of_property_read_bool(dn, "nand-ecc-maximize"))
chip->ecc.options |= NAND_ECC_MAXIMIZE;
 
+   if (!chip->mtd.ooblayout &&
+   of_property_read_bool(dn, "nand-oob-ecc") &&
+   of_property_read_bool(dn, "nand-oob-free"))
+   chip->mtd.ooblayout = &nand_ooblayout_of_ops;
+
return 0;
 }
 
-- 
2.11.0
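
The pair lookup done by nand_oob_of() above can be modelled in userspace roughly as follows. This is a sketch under assumptions, not kernel code: a plain uint32_t array stands in for the device tree property, and oob_region and oob_region_from_cells are hypothetical names. Unlike the patch, it writes the region only on success and rejects a negative section.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct mtd_oob_region. */
struct oob_region {
	uint32_t offset;
	uint32_t length;
};

/*
 * cells/ncells model the flat list of integers in the property.
 * Section N uses cells[2N] as offset and cells[2N + 1] as length.
 */
static int oob_region_from_cells(const uint32_t *cells, size_t ncells,
				 int section, struct oob_region *region)
{
	size_t idx;

	if (section < 0)
		return -EINVAL;	/* assumption: not handled in the patch */

	idx = (size_t)section * 2;
	if (idx >= ncells)
		return -ERANGE;	/* past the last pair: we're done */
	if (idx + 1 >= ncells)
		return -EINVAL;	/* offset without a matching length */

	region->offset = cells[idx];
	region->length = cells[idx + 1];
	return 0;
}
```

A caller would walk section = 0, 1, 2, ... until -ERANGE comes back, which matches how the patch signals the end of the list; the cell values used to exercise it are arbitrary.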

Re: [GIT] Networking

2018-05-11 Thread Linus Torvalds

David, is there something you want to tell us?

Drugs are bad, m'kay..

   Linus

On Fri, May 11, 2018 at 2:00 PM David Miller  wrote:

> "from Kevin Easton", "Thanks to Bhadram Varka", "courtesy of Gustavo A.
> R.  Silva", "To Eric Dumazet we are most grateful for this fix", "This
> fix from YU Bo, we do appreciate", "Once again we are blessed by the
> honorable Eric Dumazet with this fix", "This fix is bestowed upon us by
> Andrew Tomt", "another great gift from Eric Dumazet", "to Hangbin Liu we
> give thanks for this", "Paolo Abeni, he gave us this", "thank you Moshe
> Shemesh", "from our good brother David Howells", "Daniel Juergens,
> you're the best!", "Debabrata Benerjee saved us!", "The ship is now
> water tight, thanks to Andrey Ignatov", "from Colin Ian King, man we've
> got holes everywhere!", "Jiri Pirko what would we do without you!

Re: [PATCH] ocfs2: ocfs2_inode_lock_tracker does not distinguish lock level

2018-05-11 Thread Andrew Morton

On Fri, 11 May 2018 12:16:51 +0800 Larry Chen  wrote:

> > Nice changelog, but it gives no information about the severity of the
> > bug: how often does it hit and what is the end-user impact.
> >
> > This info is needed so that I and others can decide which kernel
> > version(s) need the patch, so please always include it when fixing a
> > bug, thanks.
> 
> Thanks for your review and feel sorry for not providing enough information.
> 
> For the status quo of ocfs2, without this patch, neither a bug nor end-user
> impact will be caused because the wrong logic is avoided.
> 
> But I'm afraid this generic interface, may be called by other
> developers in future and used in this situation.
> 
>      a process
> ocfs2_inode_lock_tracker(ex=0)
> ocfs2_inode_lock_tracker(ex=1)

OK, thanks.

> By the way, should I resend this patch with this info included?

I pasted the above into my copy of the changelog so we're good.

[PATCH net-next 2/3] net: dsa: mv88e6xxx: add IEEE and IP mapping ops

2018-05-11 Thread Vivien Didelot

All Marvell switch families except 88E6390 have direct registers in
Global 1 for IEEE and IP priorities override mapping. The 88E6390 uses
indirect tables instead.

Add .ieee_pri_map and .ip_pri_map ops to distinguish the two cases and
call them from a mv88e6xxx_pri_setup helper. Only non-6390 chips are
concerned at the moment.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6xxx/chip.c| 94 +++--
 drivers/net/dsa/mv88e6xxx/chip.h|  3 +
 drivers/net/dsa/mv88e6xxx/global1.c | 58 ++
 drivers/net/dsa/mv88e6xxx/global1.h |  3 +
 4 files changed, 127 insertions(+), 31 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 1cebde80b101..df92fed44674 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1104,6 +1104,25 @@ static void mv88e6xxx_port_stp_state_set(struct dsa_switch *ds, int port,
dev_err(ds->dev, "p%d: failed to update state\n", port);
 }
 
+static int mv88e6xxx_pri_setup(struct mv88e6xxx_chip *chip)
+{
+   int err;
+
+   if (chip->info->ops->ieee_pri_map) {
+   err = chip->info->ops->ieee_pri_map(chip);
+   if (err)
+   return err;
+   }
+
+   if (chip->info->ops->ip_pri_map) {
+   err = chip->info->ops->ip_pri_map(chip);
+   if (err)
+   return err;
+   }
+
+   return 0;
+}
+
 static int mv88e6xxx_devmap_setup(struct mv88e6xxx_chip *chip)
 {
int target, port;
@@ -2252,37 +2271,6 @@ static int mv88e6xxx_g1_setup(struct mv88e6xxx_chip *chip)
 {
int err;
 
-   /* Configure the IP ToS mapping registers. */
-   err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IP_PRI_0, 0x);
-   if (err)
-   return err;
-   err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IP_PRI_1, 0x);
-   if (err)
-   return err;
-   err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IP_PRI_2, 0x);
-   if (err)
-   return err;
-   err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IP_PRI_3, 0x);
-   if (err)
-   return err;
-   err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IP_PRI_4, 0x);
-   if (err)
-   return err;
-   err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IP_PRI_5, 0x);
-   if (err)
-   return err;
-   err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IP_PRI_6, 0x);
-   if (err)
-   return err;
-   err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IP_PRI_7, 0x);
-   if (err)
-   return err;
-
-   /* Configure the IEEE 802.1p priority mapping register. */
-   err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_IEEE_PRI, 0xfa41);
-   if (err)
-   return err;
-
/* Initialize the statistics unit */
err = mv88e6xxx_stats_set_histogram(chip);
if (err)
@@ -2365,6 +2353,10 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
if (err)
goto unlock;
 
+   err = mv88e6xxx_pri_setup(chip);
+   if (err)
+   goto unlock;
+
/* Setup PTP Hardware Clock and timestamping */
if (chip->info->ptp_support) {
err = mv88e6xxx_ptp_setup(chip);
@@ -2592,6 +2584,8 @@ static int mv88e6xxx_set_eeprom(struct dsa_switch *ds,
 
 static const struct mv88e6xxx_ops mv88e6085_ops = {
/* MV88E6XXX_FAMILY_6097 */
+   .ieee_pri_map = mv88e6085_g1_ieee_pri_map,
+   .ip_pri_map = mv88e6085_g1_ip_pri_map,
.irl_init_all = mv88e6352_g2_irl_init_all,
.set_switch_mac = mv88e6xxx_g1_set_switch_mac,
.phy_read = mv88e6185_phy_ppu_read,
@@ -2628,6 +2622,8 @@ static const struct mv88e6xxx_ops mv88e6085_ops = {
 
 static const struct mv88e6xxx_ops mv88e6095_ops = {
/* MV88E6XXX_FAMILY_6095 */
+   .ieee_pri_map = mv88e6085_g1_ieee_pri_map,
+   .ip_pri_map = mv88e6085_g1_ip_pri_map,
.set_switch_mac = mv88e6xxx_g1_set_switch_mac,
.phy_read = mv88e6185_phy_ppu_read,
.phy_write = mv88e6185_phy_ppu_write,
@@ -2652,6 +2648,8 @@ static const struct mv88e6xxx_ops mv88e6095_ops = {
 
 static const struct mv88e6xxx_ops mv88e6097_ops = {
/* MV88E6XXX_FAMILY_6097 */
+   .ieee_pri_map = mv88e6085_g1_ieee_pri_map,
+   .ip_pri_map = mv88e6085_g1_ip_pri_map,
.irl_init_all = mv88e6352_g2_irl_init_all,
.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
.phy_read = mv88e6xxx_g2_smi_phy_read,
@@ -2686,6 +2684,8 @@ static const struct mv88e6xxx_ops mv88e6097_ops = {
 
 static const struct mv88e6xxx_ops mv88e6123_ops = {
/* MV88E6XXX_FAMILY_6165 */
+   .ieee_pri_map = mv88e6085_g1_ieee_pri_map,
+   .ip_pri_map = mv88e6085_g1_ip_pri_map,
.irl_init_all = mv88e6352_g2_irl_init_all,
.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
.phy_read = mv88e6xxx_g2_smi_phy_read,
@@ -2714,6 +2714,8 @@ static const struct mv88e6xxx_ops mv88e6123
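
The optional-op pattern used by mv88e6xxx_pri_setup can be tried outside the kernel with a sketch like the one below. The toy_ names are hypothetical and the error value is arbitrary; only the shape (a NULL op means the family has nothing to do, and the first error stops the sequence) mirrors the helper in the patch.

```c
#include <assert.h>
#include <stddef.h>

struct toy_chip;

/* Per-family operations; a NULL pointer means "not applicable". */
struct toy_ops {
	int (*ieee_pri_map)(struct toy_chip *chip);
	int (*ip_pri_map)(struct toy_chip *chip);
};

struct toy_chip {
	const struct toy_ops *ops;
	int calls;	/* counts how many ops ran, for the demo only */
};

static int toy_map_ok(struct toy_chip *chip)
{
	chip->calls++;
	return 0;
}

static int toy_map_fail(struct toy_chip *chip)
{
	chip->calls++;
	return -5;	/* arbitrary error code for the demo */
}

/* Run each op the family provides and stop at the first error. */
static int toy_pri_setup(struct toy_chip *chip)
{
	int err;

	if (chip->ops->ieee_pri_map) {
		err = chip->ops->ieee_pri_map(chip);
		if (err)
			return err;
	}

	if (chip->ops->ip_pri_map) {
		err = chip->ops->ip_pri_map(chip);
		if (err)
			return err;
	}

	return 0;
}
```

A family that leaves both pointers unset, as the 88E6390 does in this patch, simply gets a successful no-op.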

[PATCH net-next 0/3] net: dsa: mv88e6xxx: remove Global 1 setup

2018-05-11 Thread Vivien Didelot

The mv88e6xxx driver is still writing arbitrary registers at setup time,
e.g. priority override bits. Add ops for them and provide specific setup
functions for priority and stats before getting rid of the erroneous
mv88e6xxx_g1_setup code, as previously done with Global 2.

Vivien Didelot (3):
  net: dsa: mv88e6xxx: use helper for 6390 histogram
  net: dsa: mv88e6xxx: add IEEE and IP mapping ops
  net: dsa: mv88e6xxx: add a stats setup function

 drivers/net/dsa/mv88e6xxx/chip.c| 121 +---
 drivers/net/dsa/mv88e6xxx/chip.h|   3 +
 drivers/net/dsa/mv88e6xxx/global1.c |  73 ++---
 drivers/net/dsa/mv88e6xxx/global1.h |  15 +++-
 4 files changed, 149 insertions(+), 63 deletions(-)

-- 
2.17.0

[PATCH net-next 1/3] net: dsa: mv88e6xxx: use helper for 6390 histogram

2018-05-11 Thread Vivien Didelot

The Marvell 88E6390 model has its histogram mode bits moved in the
Global 1 Control 2 register. Use the previously introduced
mv88e6xxx_g1_ctl2_mask helper to set them.

At the same time complete the documentation of the said register.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6xxx/global1.c | 15 +++
 drivers/net/dsa/mv88e6xxx/global1.h | 12 +---
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/global1.c b/drivers/net/dsa/mv88e6xxx/global1.c
index 244ee1ff9edc..0f2b05342c18 100644
--- a/drivers/net/dsa/mv88e6xxx/global1.c
+++ b/drivers/net/dsa/mv88e6xxx/global1.c
@@ -393,18 +393,9 @@ int mv88e6390_g1_rmu_disable(struct mv88e6xxx_chip *chip)
 
 int mv88e6390_g1_stats_set_histogram(struct mv88e6xxx_chip *chip)
 {
-   u16 val;
-   int err;
-
-   err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_CTL2, &val);
-   if (err)
-   return err;
-
-   val |= MV88E6XXX_G1_CTL2_HIST_RX_TX;
-
-   err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_CTL2, val);
-
-   return err;
+   return mv88e6xxx_g1_ctl2_mask(chip, MV88E6390_G1_CTL2_HIST_MODE_MASK,
+ MV88E6390_G1_CTL2_HIST_MODE_RX |
+ MV88E6390_G1_CTL2_HIST_MODE_TX);
 }
 
 int mv88e6xxx_g1_set_device_number(struct mv88e6xxx_chip *chip, int index)
diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h
index e186a026e1b1..c357b3ca9a09 100644
--- a/drivers/net/dsa/mv88e6xxx/global1.h
+++ b/drivers/net/dsa/mv88e6xxx/global1.h
@@ -201,12 +201,13 @@
 
 /* Offset 0x1C: Global Control 2 */
 #define MV88E6XXX_G1_CTL2  0x1c
-#define MV88E6XXX_G1_CTL2_HIST_RX  0x0040
-#define MV88E6XXX_G1_CTL2_HIST_TX  0x0080
-#define MV88E6XXX_G1_CTL2_HIST_RX_TX   0x00c0
 #define MV88E6185_G1_CTL2_CASCADE_PORT_MASK0xf000
 #define MV88E6185_G1_CTL2_CASCADE_PORT_NONE0xe000
 #define MV88E6185_G1_CTL2_CASCADE_PORT_MULTI   0xf000
+#define MV88E6352_G1_CTL2_HEADER_TYPE_MASK 0xc000
+#define MV88E6352_G1_CTL2_HEADER_TYPE_ORIG 0x
+#define MV88E6352_G1_CTL2_HEADER_TYPE_MGMT 0x4000
+#define MV88E6390_G1_CTL2_HEADER_TYPE_LAG  0x8000
 #define MV88E6352_G1_CTL2_RMU_MODE_MASK0x3000
 #define MV88E6352_G1_CTL2_RMU_MODE_DISABLED0x
 #define MV88E6352_G1_CTL2_RMU_MODE_PORT_4  0x1000
@@ -223,6 +224,11 @@
 #define MV88E6390_G1_CTL2_RMU_MODE_PORT_10 0x0300
 #define MV88E6390_G1_CTL2_RMU_MODE_ALL_DSA 0x0600
 #define MV88E6390_G1_CTL2_RMU_MODE_DISABLED0x0700
+#define MV88E6390_G1_CTL2_HIST_MODE_MASK   0x00c0
+#define MV88E6390_G1_CTL2_HIST_MODE_RX 0x0040
+#define MV88E6390_G1_CTL2_HIST_MODE_TX 0x0080
+#define MV88E6352_G1_CTL2_CTR_MODE_MASK0x0060
+#define MV88E6390_G1_CTL2_CTR_MODE 0x0020
 #define MV88E6XXX_G1_CTL2_DEVICE_NUMBER_MASK   0x001f
 
 /* Offset 0x1D: Stats Operation Register */
-- 
2.17.0
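
For readers unfamiliar with mask helpers, the sketch below shows the usual read-modify-write arithmetic. The body of mv88e6xxx_g1_ctl2_mask is not part of this patch, so this is an assumption about what such a helper conventionally computes, not its implementation; reg_update_bits is a hypothetical name, and the three constants copy the values of the MV88E6390_G1_CTL2_HIST_MODE_* defines above.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the MV88E6390_G1_CTL2_HIST_MODE_* defines. */
#define HIST_MODE_MASK	0x00c0
#define HIST_MODE_RX	0x0040
#define HIST_MODE_TX	0x0080

/*
 * Clear the field selected by mask, then set the requested bits.
 * Bits outside the mask are preserved from the old register value.
 */
static uint16_t reg_update_bits(uint16_t old, uint16_t mask, uint16_t val)
{
	return (uint16_t)((old & ~mask) | (val & mask));
}
```

Compared with the removed "val |= ..." sequence, a masked update can also clear bits inside the field, for example going from RX and TX counting to RX only.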
