date:20150305

Re: [PATCH 1/1] x86/fpu: math_state_restore() should not blindly disable irqs

2015-03-05 Thread Ingo Molnar

* Oleg Nesterov  wrote:

> On 03/05, Ingo Molnar wrote:
> >
> > * Oleg Nesterov  wrote:
> >
> > > --- a/arch/x86/kernel/traps.c
> > > +++ b/arch/x86/kernel/traps.c
> > > @@ -774,7 +774,10 @@ void math_state_restore(void)
> > >   struct task_struct *tsk = current;
> > >
> > >   if (!tsk_used_math(tsk)) {
> > > - local_irq_enable();
> > > + bool disabled = irqs_disabled();
> > > +
> > > + if (disabled)
> > > + local_irq_enable();
> > >   /*
> > >* does a slab alloc which can sleep
> > >*/
> > > @@ -785,7 +788,9 @@ void math_state_restore(void)
> > >   do_group_exit(SIGKILL);
> > >   return;
> > >   }
> > > - local_irq_disable();
> > > +
> > > + if (disabled)
> > > + local_irq_disable();
> > >   }
> >
> > Yuck!
> >
> > Is there a fundamental reason why we cannot simply enable irqs and
> > leave them enabled? Math state restore is not atomic and cannot really
> > be atomic.
> 
> You know, I didn't even try to verify ;) but see below.

So I'm thinking about the attached patch.

> Most probably we can simply enable irqs, yes. But what about older 
> kernels, how can we check?
>
> And let me repeat, I strongly believe that this !tsk_used_math() 
> case in math_state_restore() must die. And unlazy_fpu() in 
> init_fpu(). And both __restore_xstate_sig() and flush_thread() 
> should not use math_state_restore() at all. At least in its current 
> form.

Agreed.

> But this is obviously not -stable material.
> 
> That said, I'll try to look into git history tomorrow.

So I think the reasons are:

 - historic: because math_state_restore() started out as an interrupt 
   routine (from the IRQ13 days)

 - hardware imposed: the handler is executed with irqs off

 - it's probably the fastest implementation: we just run with the 
   natural irqs-off state the handler executes with.

So there's nothing outright wrong about executing with irqs off in a 
trap handler.

> [...] The patch above looks "obviously safe", but perhaps I am 
> paranoid too much...

IMHO your hack above isn't really acceptable, even for a backport.
So let's test the patch below (assuming it's the right thing to do)
and move forward?

Thanks,

Ingo

==>
From: Ingo Molnar 
Date: Fri, 6 Mar 2015 08:37:57 +0100
Subject: [PATCH] x86/fpu: Don't disable irqs in math_state_restore()

math_state_restore() was historically called with irqs disabled,
because that's how the hardware generates the trap, and also because
back in the day it was possible for it to be an asynchronous
interrupt, and interrupt handlers run with irqs off.

These days it's always an instruction trap, and furthermore it does
inevitably complex things such as memory allocation and signal
processing, which are not done with irqs disabled.

So keep irqs enabled.

This might surprise in-kernel FPU users that somehow relied on
interrupts being disabled across FPU usage - but that's
fundamentally fragile anyway due to the non-atomicity of FPU state
restores. The trap return will restore interrupts to their previous
state, but if FPU ops trigger math_state_restore() there's no
guarantee of atomicity anymore.

To warn about in-kernel irqs-off users of FPU state we might want to 
pass 'struct pt_regs' to math_state_restore() and check the trapped 
state for irqs disabled (flags has IF cleared) and kernel context - 
but that's for a later patch.

Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Fenghua Yu 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Oleg Nesterov 
Cc: Quentin Casasnovas 
Cc: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/traps.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 950815a138e1..52f9e4057cee 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -844,8 +844,9 @@ void math_state_restore(void)
 {
struct task_struct *tsk = current;

+   local_irq_enable();
+
if (!tsk_used_math(tsk)) {
-   local_irq_enable();
/*
 * does a slab alloc which can sleep
 */
@@ -856,7 +857,6 @@ void math_state_restore(void)
do_group_exit(SIGKILL);
return;
}
-   local_irq_disable();
}

/* Avoid __kernel_fpu_begin() right after __thread_fpu_begin() */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
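
The follow-up check mentioned in the commit message above (trapped state
has irqs disabled and is kernel context) could look roughly like the
user-space sketch below. The struct and helper names are hypothetical
stand-ins and not the kernel's struct pt_regs API; the only facts relied on
are that EFLAGS.IF is bit 9 and that the low two bits of CS carry the
privilege level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the trapped register state. */
struct trap_regs {
    uint64_t flags; /* saved EFLAGS/RFLAGS */
    uint16_t cs;    /* saved code segment selector */
};

#define EFLAGS_IF   (UINT64_C(1) << 9) /* interrupt enable flag */
#define CS_RPL_MASK 0x3                /* privilege level bits of CS */

/* True if the trap came from ring 0, i.e. kernel context. */
static bool trapped_in_kernel(const struct trap_regs *regs)
{
    assert(regs != NULL);
    return (regs->cs & CS_RPL_MASK) == 0;
}

/* True if interrupts were disabled (IF cleared) when the trap hit. */
static bool trapped_with_irqs_off(const struct trap_regs *regs)
{
    assert(regs != NULL);
    return (regs->flags & EFLAGS_IF) == 0;
}

/* The combination a later patch might want to warn about. */
static bool fpu_trap_is_suspicious(const struct trap_regs *regs)
{
    return trapped_in_kernel(regs) && trapped_with_irqs_off(regs);
}
```

In the kernel the result would presumably feed a one-time warning rather
than change behaviour, but that part is left open here just as it is in the
commit message.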

[PATCH] phy: core: Fixup return value of phy_exit when !pm_runtime_enabled

2015-03-05 Thread Axel Lin

When phy_pm_runtime_get_sync() returns -ENOTSUPP, phy_exit() also returns
-ENOTSUPP if !phy->ops->exit. Fix it.
Also move the code that overrides ret next to the code that sets it.
I think it is less error prone this way.

Signed-off-by: Axel Lin 
---
 drivers/phy/phy-core.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
index a12d353..250dc6c 100644
--- a/drivers/phy/phy-core.c
+++ b/drivers/phy/phy-core.c
@@ -223,6 +223,7 @@ int phy_init(struct phy *phy)
ret = phy_pm_runtime_get_sync(phy);
if (ret < 0 && ret != -ENOTSUPP)
return ret;
+   ret = 0; /* Override possible ret == -ENOTSUPP */
 
mutex_lock(&phy->mutex);
if (phy->init_count == 0 && phy->ops->init) {
@@ -231,8 +232,6 @@ int phy_init(struct phy *phy)
dev_err(&phy->dev, "phy init failed --> %d\n", ret);
goto out;
}
-   } else {
-   ret = 0; /* Override possible ret == -ENOTSUPP */
}
++phy->init_count;
 
@@ -253,6 +252,7 @@ int phy_exit(struct phy *phy)
ret = phy_pm_runtime_get_sync(phy);
if (ret < 0 && ret != -ENOTSUPP)
return ret;
+   ret = 0; /* Override possible ret == -ENOTSUPP */
 
mutex_lock(&phy->mutex);
if (phy->init_count == 1 && phy->ops->exit) {
@@ -287,6 +287,7 @@ int phy_power_on(struct phy *phy)
ret = phy_pm_runtime_get_sync(phy);
if (ret < 0 && ret != -ENOTSUPP)
return ret;
+   ret = 0; /* Override possible ret == -ENOTSUPP */
 
mutex_lock(&phy->mutex);
if (phy->power_count == 0 && phy->ops->power_on) {
@@ -295,8 +296,6 @@ int phy_power_on(struct phy *phy)
dev_err(&phy->dev, "phy poweron failed --> %d\n", ret);
goto out;
}
-   } else {
-   ret = 0; /* Override possible ret == -ENOTSUPP */
}
++phy->power_count;
mutex_unlock(&phy->mutex);
-- 
1.9.1
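
The control flow this patch moves to can be tried out in user space with
the sketch below. Everything prefixed fake_ is a hypothetical stand-in for
the phy core (no mutex, no runtime PM put), and ENOTSUPP is defined locally
because it is a kernel-internal errno value. The point is only that ret is
reset right after the runtime PM call, so a missing exit callback can no
longer leak -ENOTSUPP to the caller.

```c
#include <assert.h>
#include <stddef.h>

#define ENOTSUPP 524 /* kernel-internal errno, defined here for the sketch */

/* Hypothetical stand-in for struct phy: only what the sketch needs. */
struct fake_phy {
    int runtime_pm_enabled;         /* 0 mimics !pm_runtime_enabled() */
    int init_count;
    int (*exit)(struct fake_phy *); /* may be NULL, like phy->ops->exit */
};

/* Stand-in for phy_pm_runtime_get_sync(). */
static int fake_runtime_get_sync(struct fake_phy *phy)
{
    return phy->runtime_pm_enabled ? 0 : -ENOTSUPP;
}

/* Simplified phy_exit(): locking and runtime PM put are left out. */
static int fake_phy_exit(struct fake_phy *phy)
{
    int ret;

    assert(phy != NULL);

    ret = fake_runtime_get_sync(phy);
    if (ret < 0 && ret != -ENOTSUPP)
        return ret;
    ret = 0; /* Override possible ret == -ENOTSUPP */

    if (phy->init_count == 1 && phy->exit) {
        ret = phy->exit(phy);
        if (ret < 0)
            return ret;
    }
    --phy->init_count;
    return ret;
}
```

With runtime PM disabled and no exit callback, fake_phy_exit() returns 0;
without the override line it would return -ENOTSUPP.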




[PATCH v3 1/2] locks: Split insert/delete block functions into flock/posix parts

2015-03-05 Thread Daniel Wagner

The locks_insert/delete_block() functions are used for flock, posix
and lease types. blocked_lock_lock is used to serialize all access to
fl_link, fl_block, fl_next and blocked_hash. Here, we prepare the
stage for using blocked_lock_lock only to protect blocked_hash.

Signed-off-by: Daniel Wagner 
Cc: Jeff Layton 
Cc: "J. Bruce Fields" 
Cc: Alexander Viro 
---
 fs/locks.c | 49 -
 1 file changed, 40 insertions(+), 9 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index d4992a1..0c37d68 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -611,11 +611,20 @@ static void locks_delete_global_blocked(struct file_lock *waiter)
  */
 static void __locks_delete_block(struct file_lock *waiter)
 {
-   locks_delete_global_blocked(waiter);
list_del_init(&waiter->fl_block);
waiter->fl_next = NULL;
 }
 
+/* Posix block variant of __locks_delete_block.
+ *
+ * Must be called with blocked_lock_lock held.
+ */
+static void __locks_delete_posix_block(struct file_lock *waiter)
+{
+   locks_delete_global_blocked(waiter);
+   __locks_delete_block(waiter);
+}
+
 static void locks_delete_block(struct file_lock *waiter)
 {
spin_lock(&blocked_lock_lock);
@@ -623,6 +632,13 @@ static void locks_delete_block(struct file_lock *waiter)
spin_unlock(&blocked_lock_lock);
 }
 
+static void locks_delete_posix_block(struct file_lock *waiter)
+{
+   spin_lock(&blocked_lock_lock);
+   __locks_delete_posix_block(waiter);
+   spin_unlock(&blocked_lock_lock);
+}
+
 /* Insert waiter into blocker's block list.
  * We use a circular list so that processes can be easily woken up in
  * the order they blocked. The documentation doesn't require this but
@@ -639,8 +655,17 @@ static void __locks_insert_block(struct file_lock *blocker,
BUG_ON(!list_empty(&waiter->fl_block));
waiter->fl_next = blocker;
list_add_tail(&waiter->fl_block, &blocker->fl_block);
-   if (IS_POSIX(blocker) && !IS_OFDLCK(blocker))
-   locks_insert_global_blocked(waiter);
+}
+
+/* Posix block variant of __locks_insert_block.
+ *
+ * Must be called with flc_lock and blocked_lock_lock held.
+ */
+static void __locks_insert_posix_block(struct file_lock *blocker,
+   struct file_lock *waiter)
+{
+   __locks_insert_block(blocker, waiter);
+   locks_insert_global_blocked(waiter);
 }
 
 /* Must be called with flc_lock held. */
@@ -675,7 +700,10 @@ static void locks_wake_up_blocks(struct file_lock *blocker)
 
waiter = list_first_entry(&blocker->fl_block,
struct file_lock, fl_block);
-   __locks_delete_block(waiter);
+   if (IS_POSIX(blocker) && !IS_OFDLCK(blocker))
+   __locks_delete_posix_block(waiter);
+   else
+   __locks_delete_block(waiter);
if (waiter->fl_lmops && waiter->fl_lmops->lm_notify)
waiter->fl_lmops->lm_notify(waiter);
else
@@ -985,7 +1013,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
spin_lock(&blocked_lock_lock);
if (likely(!posix_locks_deadlock(request, fl))) {
error = FILE_LOCK_DEFERRED;
-   __locks_insert_block(fl, request);
+   __locks_insert_posix_block(fl, request);
}
spin_unlock(&blocked_lock_lock);
goto out;
@@ -1186,7 +1214,7 @@ int posix_lock_file_wait(struct file *filp, struct file_lock *fl)
if (!error)
continue;
 
-   locks_delete_block(fl);
+   locks_delete_posix_block(fl);
break;
}
return error;
@@ -1283,7 +1311,7 @@ int locks_mandatory_area(int read_write, struct inode *inode,
continue;
}
 
-   locks_delete_block(&fl);
+   locks_delete_posix_block(&fl);
break;
}
 
@@ -2104,7 +2132,10 @@ static int do_lock_file_wait(struct file *filp, unsigned int cmd,
if (!error)
continue;
 
-   locks_delete_block(fl);
+   if (IS_POSIX(fl) && !IS_OFDLCK(fl))
+   locks_delete_posix_block(fl);
+   else
+   locks_delete_block(fl);
break;
}
 
@@ -2468,7 +2499,7 @@ posix_unblock_lock(struct file_lock *waiter)
 
spin_lock(&blocked_lock_lock);
if (waiter->fl_next)
-   __locks_delete_block(waiter);
+   __locks_delete_posix_block(waiter);
else
status = -ENOENT;
spin_unlock(&blocked_lock_lock);
-- 
2.1.0
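
The shape of the split can be seen in the miniature below: the plain
variant only touches the per-blocker list, the posix variant additionally
drops the global hash entry and then delegates. This is a sketch under
loud assumptions: struct waiter and its boolean fields are hypothetical
replacements for struct file_lock, fl_block and blocked_hash, the is_posix
flag stands in for the IS_POSIX(blocker) && !IS_OFDLCK(blocker) test, and
all locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a blocked lock request. */
struct waiter {
    struct waiter *blocker; /* plays the role of fl_next */
    bool on_block_list;     /* stands in for fl_block linkage */
    bool in_global_hash;    /* stands in for blocked_hash membership */
    bool is_posix;          /* traditional POSIX lock, not OFD */
};

/* Like __locks_delete_block(): per-blocker state only. */
static void delete_block(struct waiter *w)
{
    assert(w != NULL);
    w->on_block_list = false;
    w->blocker = NULL;
}

/* Like __locks_delete_posix_block(): global hash first, then delegate. */
static void delete_posix_block(struct waiter *w)
{
    assert(w != NULL);
    w->in_global_hash = false;
    delete_block(w);
}

/* Callers pick the variant by lock type, as locks_wake_up_blocks() does. */
static void wake_up_waiter(struct waiter *w)
{
    if (w->is_posix)
        delete_posix_block(w);
    else
        delete_block(w);
}
```

Keeping the hash handling out of the plain variant is what lets the second
patch of the series narrow what blocked_lock_lock has to cover.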


[PATCH v3 2/2] locks: Use blocked_lock_lock only to protect blocked_hash

2015-03-05 Thread Daniel Wagner

blocked_lock_lock and file_lock_lglock are used to protect file_lock's
fl_link, fl_block, fl_next, blocked_hash and the percpu
file_lock_list.

Let's use blocked_lock_lock only to protect blocked_hash since it is a
global lock.

Whenever we insert a new lock we grab, besides the flc_lock, the
corresponding file_lock_lglock as well. The global blocked_lock_lock
is only used when blocked_hash is involved.

Since we already use fl_link_cpu to remember which percpu
file_lock_list references a blocker, we are just going to use it for
all waiters as well.

Note fl_list is protected by flc_lock. It's easy to get confused...

Signed-off-by: Daniel Wagner 
Cc: Jeff Layton 
Cc: "J. Bruce Fields" 
Cc: Alexander Viro 
---
 fs/locks.c | 72 ++
 1 file changed, 39 insertions(+), 33 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 0c37d68..661e58b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -162,6 +162,20 @@ int lease_break_time = 45;
  * keep a list on each CPU, with each list protected by its own spinlock via
  * the file_lock_lglock. Note that alterations to the list also require that
  * the relevant flc_lock is held.
+ *
+ * In addition, it also protects the fl->fl_block list, and the fl->fl_next
+ * pointer for file_lock structures that are acting as lock requests (in
+ * contrast to those that are acting as records of acquired locks).
+ *
+ * file_lock structures acting as lock requests (waiters) use the same
+ * spinlock as those acting as lock holder (blocker). E.g. the
+ * blocker is initially added to the file_lock_list living on CPU 0,
+ * all waiters on that blocker are serialized via CPU 0 (see
+ * fl_link_cpu usage).
+ *
+ * In particular, adding an entry to the fl_block list requires that you hold
+ * both the flc_lock and the blocked_lock_lock (acquired in that order).
+ * Deleting an entry from the list however only requires the file_lock_lglock.
  */
 DEFINE_STATIC_LGLOCK(file_lock_lglock);
 static DEFINE_PER_CPU(struct hlist_head, file_lock_list);
@@ -183,19 +197,6 @@ static DEFINE_HASHTABLE(blocked_hash, BLOCKED_HASH_BITS);
 /*
  * This lock protects the blocked_hash. Generally, if you're accessing it, you
  * want to be holding this lock.
- *
- * In addition, it also protects the fl->fl_block list, and the fl->fl_next
- * pointer for file_lock structures that are acting as lock requests (in
- * contrast to those that are acting as records of acquired locks).
- *
- * Note that when we acquire this lock in order to change the above fields,
- * we often hold the flc_lock as well. In certain cases, when reading the fields
- * protected by this lock, we can skip acquiring it iff we already hold the
- * flc_lock.
- *
- * In particular, adding an entry to the fl_block list requires that you hold
- * both the flc_lock and the blocked_lock_lock (acquired in that order).
- * Deleting an entry from the list however only requires the file_lock_lock.
  */
 static DEFINE_SPINLOCK(blocked_lock_lock);
 
@@ -607,7 +608,7 @@ static void locks_delete_global_blocked(struct file_lock *waiter)
 /* Remove waiter from blocker's block list.
  * When blocker ends up pointing to itself then the list is empty.
  *
- * Must be called with blocked_lock_lock held.
+ * Must be called with file_lock_lglock held.
  */
 static void __locks_delete_block(struct file_lock *waiter)
 {
@@ -617,7 +618,7 @@ static void __locks_delete_block(struct file_lock *waiter)
 
 /* Posix block variant of __locks_delete_block.
  *
- * Must be called with blocked_lock_lock held.
+ * Must be called with file_lock_lglock held.
  */
 static void __locks_delete_posix_block(struct file_lock *waiter)
 {
@@ -627,16 +628,18 @@ static void __locks_delete_posix_block(struct file_lock *waiter)
 
 static void locks_delete_block(struct file_lock *waiter)
 {
-   spin_lock(&blocked_lock_lock);
+   lg_local_lock_cpu(&file_lock_lglock, waiter->fl_link_cpu);
__locks_delete_block(waiter);
-   spin_unlock(&blocked_lock_lock);
+   lg_local_unlock_cpu(&file_lock_lglock, waiter->fl_link_cpu);
 }
 
 static void locks_delete_posix_block(struct file_lock *waiter)
 {
+   lg_local_lock_cpu(&file_lock_lglock, waiter->fl_link_cpu);
spin_lock(&blocked_lock_lock);
__locks_delete_posix_block(waiter);
spin_unlock(&blocked_lock_lock);
+   lg_local_unlock_cpu(&file_lock_lglock, waiter->fl_link_cpu);
 }
 
 /* Insert waiter into blocker's block list.
@@ -644,22 +647,23 @@ static void locks_delete_posix_block(struct file_lock *waiter)
  * the order they blocked. The documentation doesn't require this but
  * it seems like the reasonable thing to do.
  *
- * Must be called with both the flc_lock and blocked_lock_lock held. The
- * fl_block list itself is protected by the blocked_lock_lock, but by ensuring
+ * Must be called with both the flc_lock and file_lock_lglock held. The
+ * fl_block list itself is protected by the file_lock_lglock, but by ensuring

[PATCH v3 0/2] Use blocked_lock_lock only to protect blocked_hash

2015-03-05 Thread Daniel Wagner

Hi,

Finally, I got a bigger machine and did a quick test round. I expected
to see some improvements but the results do not show any real gain. So
they are merely refactoring patches.

4x Intel(R) Xeon(R) CPU E5-4610 v2 @ 2.30GHz

4.0.0-rc2/flock01.data
# NumSamples = 3; Min = 47160.80; Max = 47555.42
# Mean = 47294.254786; Variance = 34110.284932; SD = 184.689699; Median 47166.534982
# each ∎ represents a count of 1
47160.8049 - 47200.2668 [ 2]: ∎∎
47200.2668 - 47239.7288 [ 0]: 
47239.7288 - 47279.1908 [ 0]: 
47279.1908 - 47318.6527 [ 0]: 
47318.6527 - 47358.1147 [ 0]: 
47358.1147 - 47397.5767 [ 0]: 
47397.5767 - 47437.0386 [ 0]: 
47437.0386 - 47476.5006 [ 0]: 
47476.5006 - 47515.9625 [ 0]: 
47515.9625 - 47555.4245 [ 1]: ∎

patched/flock01.data
# NumSamples = 21; Min = 45877.22; Max = 50206.70
# Mean = 47042.844720; Variance = 752166.966346; SD = 867.275600; Median 46939.811380
# each ∎ represents a count of 1
45877.2235 - 46310.1709 [ 2]: ∎∎
46310.1709 - 46743.1182 [ 7]: ∎∎∎∎∎∎∎
46743.1182 - 47176.0655 [ 3]: ∎∎∎
47176.0655 - 47609.0128 [ 6]: ∎∎∎∎∎∎
47609.0128 - 48041.9602 [ 2]: ∎∎
48041.9602 - 48474.9075 [ 0]: 
48474.9075 - 48907.8548 [ 0]: 
48907.8548 - 49340.8021 [ 0]: 
49340.8021 - 49773.7495 [ 0]: 
49773.7495 - 50206.6968 [ 1]: ∎


4.0.0-rc2/flock02.data
# NumSamples = 1786; Min = 1.86; Max = 3.13
# Mean = 2.204980; Variance = 0.015900; SD = 0.126096; Median 2.177549
# each ∎ represents a count of 13
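1.8606 - 1.9880 [     5]: 
1.9880 - 2.1154 [   315]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
2.1154 - 2.2427 [  1040]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
2.2427 - 2.3701 [   272]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
2.3701 - 2.4975 [    75]: ∎∎∎∎∎
2.4975 - 2.6249 [    42]: ∎∎∎
2.6249 - 2.7523 [    28]: ∎∎
2.7523 - 2.8796 [     7]: 
2.8796 - 3.0070 [     1]: 
3.0070 - 3.1344 [     1]: 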

patched/flock02.data
# NumSamples = 4586; Min = 2.14; Max = 4.31
# Mean = 2.619467; Variance = 0.043192; SD = 0.207828; Median 2.575378
# each ∎ represents a count of 27
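2.1385 - 2.3561 [   186]: ∎∎∎∎∎∎
2.3561 - 2.5737 [  2079]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
2.5737 - 2.7914 [  1642]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
2.7914 - 3.0090 [   355]: ∎∎∎∎∎∎∎∎∎∎∎∎∎
3.0090 - 3.2266 [   246]: ∎∎∎∎∎∎∎∎∎
3.2266 - 3.4442 [    66]: ∎∎
3.4442 - 3.6618 [     9]: 
3.6618 - 3.8795 [     1]: 
3.8795 - 4.0971 [     0]: 
4.0971 - 4.3147 [     2]: 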


4.0.0-rc2/lease01.data
# NumSamples = 12; Min = 1097.16; Max = 1255.06
# Mean = 1184.550432; Variance = 1590.438052; SD = 39.880297; Median 1190.704582
# each ∎ represents a count of 1
 1097.1556 -  1112.9460 [ 1]: ∎
 1112.9460 -  1128.7363 [ 0]: 
 1128.7363 -  1144.5267 [ 1]: ∎
 1144.5267 -  1160.3170 [ 0]: 
 1160.3170 -  1176.1074 [ 2]: ∎∎
 1176.1074 -  1191.8977 [ 2]: ∎∎
 1191.8977 -  1207.6881 [ 2]: ∎∎
 1207.6881 -  1223.4784 [ 3]: ∎∎∎
 1223.4784 -  1239.2688 [ 0]: 
 1239.2688 -  1255.0591 [ 1]: ∎

patched/lease01.data
# NumSamples = 14; Min = 1055.00; Max = 1213.97
# Mean = 1128.800723; Variance = 2225.466357; SD = 47.174849; Median 1114.384900
# each ∎ represents a count of 1
 1054.9959 -  1070.8932 [ 2]: ∎∎
 1070.8932 -  1086.7906 [ 1]: ∎
 1086.7906 -  1102.6879 [ 1]: ∎
 1102.6879 -  1118.5853 [ 4]: ∎∎∎∎
 1118.5853 -  1134.4826 [ 0]: 
 1134.4826 -  1150.3800 [ 1]: ∎
 1150.3800 -  1166.2773 [ 2]: ∎∎
 1166.2773 -  1182.1747 [ 0]: 
 1182.1747 -  1198.0720 [ 2]: ∎∎
 1198.0720 -  1213.9694 [ 1]: ∎


4.0.0-rc2/lease02.data
# NumSamples = 12; Min = 841.43; Max = 911.82
# Mean = 888.716745; Variance = 317.221486; SD = 17.810713; Median 894.897002
# each ∎ represents a count of 1
  841.4339 -   848.4727 [ 1]: ∎
  848.4727 -   855.5115 [ 0]: 
  855.5115 -   862.5503 [ 0]: 
  862.5503 -   869.5891 [ 0]: 
  869.5891 -   876.6278 [ 2]: ∎∎
  876.6278 -   883.6666 [ 1]: ∎
  883.6666 -   890.7054 [ 1]: ∎
  890.7054 -   897.7442 [ 3]: ∎∎∎
  897.7442 -   904.7830 [ 2]: ∎∎
  904.7830 -   911.8218 [ 2]: ∎∎

patched/lease02.data
# NumSamples = 26; Min = 845.36; Max = 917.22
# Mean = 886.178134; Variance = 320.861100; SD = 17.912596; Median 889.109363
# each ∎ represents a count of 1
  845.3620 -   852.5481 [ 2]: ∎∎
  852.5481 -   859.7343 [ 1]: ∎
  859.7343 -   866.9204 [ 1]: ∎
  866.9204 -   874.1065 [ 2]: ∎∎
  874.1065 -   881.2926 [ 3]: ∎∎∎
  881.2926 -   888.4788 [ 2]: ∎∎
  888.4788 -   895.6649 [ 6]: ∎∎∎∎∎∎
  895.6649 -   902.8510 [ 4]: ∎∎∎∎
  902.8510 -   910.0372 [ 2]: ∎∎
  910.0372 -   917.2233 [ 3]: ∎∎∎


4.0.0-rc2/posix01.data
# NumSamples = 5; Min = 46659.56; Max = 48332.45
# Mean = 47237.374603; Variance = 337801.6

Re: [PATCH] checkpatch: Add spell checking of email subject line

2015-03-05 Thread Jani Nikula

On Thu, 05 Mar 2015, Joe Perches  wrote:
> Only commit log and patch additions are checked for
> typos and spelling errors currently.  Add a check
> of the email subject line too.
>
> Suggested-by: Jani Nikula 
> Signed-off-by: Joe Perches 

Thanks Joe.

FWIW,

Tested-by: Jani Nikula 

> ---
>  scripts/checkpatch.pl | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 421bbb4..c061a63 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -2303,7 +2303,8 @@ sub process {
>   }
>  
>  # Check for various typo / spelling mistakes
> - if (defined($misspellings) && ($in_commit_log || $line =~ /^\+/)) {
> + if (defined($misspellings) &&
> + ($in_commit_log || $line =~ /^(?:\+|Subject:)/i)) {
>   while ($rawline =~ /(?:^|[^a-z@])($misspellings)(?:$|[^a-z@])/gi) {
>   my $typo = $1;
>   my $typo_fix = $spelling_fix{lc($typo)};
>
>

-- 
Jani Nikula, Intel Open Source Technology Center
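
For anyone wanting to poke at the new line filter outside of Perl, the
sketch below re-creates the second half of the condition with POSIX
regcomp(). It is an approximation, not checkpatch code: the function name
is made up, the $in_commit_log half is left out, and since POSIX extended
regular expressions have no (?:...) syntax a plain group is used instead.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if the line would be spell-checked under the new
 * condition: an added line ('+') or a Subject: header, matched
 * case-insensitively like the /i flag in the patch.
 */
static bool line_is_spellchecked(const char *line)
{
    regex_t re;
    bool match;

    assert(line != NULL);

    /* Equivalent of /^(?:\+|Subject:)/i in POSIX ERE syntax. */
    if (regcomp(&re, "^(\\+|Subject:)",
                REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        return false;

    match = regexec(&re, line, 0, NULL, 0) == 0;
    regfree(&re);
    return match;
}
```

Added lines and subject headers in any letter case match; context and
removed lines do not.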

Re: [RFC 0/6] the big khugepaged redesign

2015-03-05 Thread Vlastimil Babka

On 03/06/2015 01:21 AM, Andres Freund wrote:
> Long mail ahead, sorry for that.

No problem, thanks a lot!

> TL;DR: THP is still noticeable, but not nearly as bad.
> 
> On 2015-03-05 17:30:16 +0100, Vlastimil Babka wrote:
>> That however means the workload is based on hugetlbfs and shouldn't trigger
>> THP page fault activity, which is the aim of this patchset. Some more
>> googling made me recall that last LSF/MM, postgresql people mentioned THP
>> issues and pointed at compaction. See http://lwn.net/Articles/591723/ That's
>> exactly where this patchset should help, but I obviously won't be able to
>> measure this before LSF/MM...
> 
> Just as a reference, this is how some the more extreme profiles looked
> like in the past:
> 
>> 96.50%postmaster  [kernel.kallsyms] [k] _spin_lock_irq
>>   |
>>   --- _spin_lock_irq
>>  |
>>  |--99.87%-- compact_zone
>>  |  compact_zone_order
>>  |  try_to_compact_pages
>>  |  __alloc_pages_nodemask
>>  |  alloc_pages_vma
>>  |  do_huge_pmd_anonymous_page
>>  |  handle_mm_fault
>>  |  __do_page_fault
>>  |  do_page_fault
>>  |  page_fault
>>  |  0x631d98
>>   --0.13%-- [...]
> 
> That specific profile is from a rather old kernel as you probably
> recognize.

Yeah, sounds like synchronous compaction before it was forbidden for THP page
faults...

>> I'm CCing the psql guys from last year LSF/MM - do you have any insight about
>> psql performance with THPs enabled/disabled on recent kernels, where e.g.
>> compaction is no longer synchronous for THP page faults?
> 
> So, I've managed to get a machine upgraded to 3.19. 4 x E5-4620, 256GB
> RAM.
> 
> First of: It's noticeably harder to trigger problems than it used to
> be. But, I can still trigger various problems that are much worse with
> THP enabled than without.
> 
> There seem to be various different bottlenecks; I can get somewhat
> different profiles.
> 
> In a somewhat artificial workload, that tries to simulate what I've seen
> trigger the problem at a customer, I can quite easily trigger large
> differences between THP=enable and THP=never.  There's two types of
> tasks running, one purely OLTP, another doing somewhat more complex
> statements that require a fair amount of process local memory.
> 
> (ignore the absolute numbers for progress, I just waited for somewhat
> stable results while doing other stuff)
> 
> THP off:
> Task 1 solo:
> progress: 200.0 s, 391442.0 tps, 0.654 ms lat
> progress: 201.0 s, 394816.1 tps, 0.683 ms lat
> progress: 202.0 s, 409722.5 tps, 0.625 ms lat
> progress: 203.0 s, 384794.9 tps, 0.665 ms lat
> 
> combined:
> Task 1:
> progress: 144.0 s, 25430.4 tps, 10.067 ms lat
> progress: 145.0 s, 22260.3 tps, 11.500 ms lat
> progress: 146.0 s, 24089.9 tps, 10.627 ms lat
> progress: 147.0 s, 25888.8 tps, 9.888 ms lat
> 
> Task 2:
> progress: 24.4 s, 30.0 tps, 2134.043 ms lat
> progress: 26.5 s, 29.8 tps, 2150.487 ms lat
> progress: 28.4 s, 29.7 tps, 2151.557 ms lat
> progress: 30.4 s, 28.5 tps, 2245.304 ms lat
> 
> flat profile:
>  6.07%  postgres  postgres[.] heap_form_minimal_tuple
>  4.36%  postgres  postgres[.] heap_fill_tuple
>  4.22%  postgres  postgres[.] ExecStoreMinimalTuple
>  4.11%  postgres  postgres[.] AllocSetAlloc
>  3.97%  postgres  postgres[.] advance_aggregates
>  3.94%  postgres  postgres[.] advance_transition_function
>  3.94%  postgres  postgres[.] ExecMakeTableFunctionResult
>  3.33%  postgres  postgres[.] heap_compute_data_size
>  3.30%  postgres  postgres[.] MemoryContextReset
>  3.28%  postgres  postgres[.] ExecScan
>  3.04%  postgres  postgres[.] ExecProject
>  2.96%  postgres  postgres[.] generate_series_step_int4
>  2.94%  postgres  [kernel.kallsyms]   [k] clear_page_c
> 
> (i.e. most of it postgres, cache miss bound)
> 
> THP on:
> Task 1 solo:
> progress: 140.0 s, 390458.1 tps, 0.656 ms lat
> progress: 141.0 s, 391174.2 tps, 0.654 ms lat
> progress: 142.0 s, 394828.8 tps, 0.648 ms lat
> progress: 143.0 s, 398156.2 tps, 0.643 ms lat
> 
> Task 1:
> progress: 179.0 s, 23963.1 tps, 10.683 ms lat
> progress: 180.0 s, 22712.9 tps, 11.271 ms lat
> progress: 181.0 s, 21211.4 tps, 12.069 ms lat
> progress: 182.0 s, 23207.8 tps, 11.031 ms lat
> 
> Task 2:
> progress: 28.2 s, 19.1 tps, 3349.747 ms lat
> progress: 31.0 s, 19.8 tps, 3230.589 ms lat
> progress: 34.3 s, 21.5 tps, 2979.113 ms lat
> progress: 37.4 s, 20.9 tps, 3055.143 ms lat

So that's 1/3 worse tps for task 2? Not very nice...
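
The "1/3 worse" estimate can be checked against the four task 2 samples
quoted for each setting. The sketch below only averages those quoted
numbers and takes the relative drop; the helper names are made up, and four
samples per side is of course a very small basis.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(v != NULL && n > 0);
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Fractional drop from baseline to changed (0.25 means 25% lower). */
static double relative_drop(double baseline, double changed)
{
    assert(baseline > 0.0);
    return (baseline - changed) / baseline;
}

/* Task 2 tps samples as quoted above. */
static const double task2_thp_off[] = { 30.0, 29.8, 29.7, 28.5 };
static const double task2_thp_on[]  = { 19.1, 19.8, 21.5, 20.9 };
```

The means come out at 29.5 tps with THP off and roughly 20.3 tps with THP
on, a drop of about 31%, so "1/3 worse" is a fair summary.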

> flat

Re: [PATCH v2] ASoC: Add support for NAU8824 codec to ASoC

2015-03-05 Thread Chih-Chiang Chang



On 2015/3/5 06:32 AM, Paul Bolle wrote:
> Chih-Chiang Chang wrote on Wed 04-03-2015 at 20:53 [+0800]:
>> From fe37688e226f83ba477a3c2fbc1e64946cd4ec4e Mon Sep 17 00:00:00 2001
>> From: Chih-Chiang Chang 
>> Date: Wed, 4 Mar 2015 20:03:21 +0800
>> Subject: [PATCH v2] ASoC: Add support for NAU8824 codec to ASoC
> 
> It seems that none of those lines were needed.
Sorry for the wrong patch format; I will remove these lines in the next submission.
> 
>> --- /dev/null
>> +++ b/include/sound/nau8824.h
>> @@ -0,0 +1,22 @@
>> +/*
>> + * linux/sound/nau8824.h -- Platform data for NAU8824
>> + *
>> + * Copyright 2015 Nuvoton Technology Corp.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#ifndef __LINUX_SND_NAU8824_H
>> +#define __LINUX_SND_NAU8824_H
>> +
>> +struct nau8824_platform_data {
>> +   unsigned int audio_mclk1;
>> +   unsigned int gpio_irq;
>> +   int naudint_irq;
>> +   int headset_detect;
>> +   int button_press_detect;
>> +};
>> +
>> +#endif
> 
> In the future something other than just sound/soc/codecs/nau8824.h is
> going to include this header, right?
> 
>> --- /dev/null
>> +++ b/sound/soc/codecs/nau8824.c
>> @@ -0,0 +1,807 @@
>> +/*
>> + * linux/sound/soc/codecs/nau8824.c
>> + *
>> + * Copyright 2015 Nuvoton Technology Corp.
>> + * Author: Meng-Huang Kuo 
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
> 
> This states the license is GPL v2. (So do the two headers this patch
> adds.)
> 
>> +MODULE_LICENSE("GPL");
> 
> So that should probably be
> MODULE_LICENSE("GPL v2");
We will modify the code to be GPL v2.
> 
> 
> Paul Bolle
> 

[PATCH perf/core v2 2/5] perf-probe: Fix --line to handle aliased symbols in glibc

2015-03-05 Thread Masami Hiramatsu

Fix perf probe --line to handle aliased symbols correctly
in glibc.

This makes the line_range search fall back to the address-based
alternative search, the same as --add and --vars do.

Without this patch:
  -
  # ./perf probe -x /usr/lib64/libc-2.17.so -L malloc
  Specified source line is not found.
Error: Failed to show lines.
  -

With this patch:
  -
  # ./perf probe -x /usr/lib64/libc-2.17.so -L malloc
  <__libc_malloc@/usr/src/debug/glibc-2.17-c758a686/malloc/malloc.c:0>
0  __libc_malloc(size_t bytes)
1  {
 mstate ar_ptr;
 void *victim;

 __malloc_ptr_t (*hook) (size_t, const __malloc_ptr_t)
6  = force_reg (__malloc_hook);
7if (__builtin_expect (hook != NULL, 0))
8  return (*hook)(bytes, RETURN_ADDRESS (0));

   10arena_lookup(ar_ptr);

   12arena_lock(ar_ptr, bytes);
  -

Note that this actually shows __libc_malloc, since it is
the real instance of malloc. Users can pass either __libc_malloc
or malloc to --line.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/util/probe-event.c |   35 +--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index b8f4578..4cfd121 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -353,6 +353,31 @@ static int get_alternative_probe_event(struct debuginfo *dinfo,
return ret;
 }
 
+static int get_alternative_line_range(struct debuginfo *dinfo,
+ struct line_range *lr,
+ const char *target, bool user)
+{
+   struct perf_probe_point pp = { 0 }, result = { 0 };
+   int ret, len = 0;
+
+   pp.function = lr->function;
+   pp.file = lr->file;
+   pp.line = lr->start;
+   if (lr->end != INT_MAX)
+   len = lr->end - lr->start;
+   ret = find_alternative_probe_point(dinfo, &pp, &result,
+  target, user);
+   if (!ret) {
+   lr->function = result.function;
+   lr->file = result.file;
+   lr->start = result.line;
+   if (lr->end != INT_MAX)
+   lr->end = lr->start + len;
+   clear_perf_probe_point(&pp);
+   }
+   return ret;
+}
+
 /* Open new debuginfo of given module */
 static struct debuginfo *open_debuginfo(const char *module, bool silent)
 {
@@ -734,7 +759,8 @@ static int _show_one_line(FILE *fp, int l, bool skip, bool show_num)
  * Show line-range always requires debuginfo to find source file and
  * line number.
  */
-static int __show_line_range(struct line_range *lr, const char *module)
+static int __show_line_range(struct line_range *lr, const char *module,
+bool user)
 {
int l = 1;
struct int_node *ln;
@@ -750,6 +776,11 @@ static int __show_line_range(struct line_range *lr, const char *module)
return -ENOENT;
 
ret = debuginfo__find_line_range(dinfo, lr);
+   if (!ret) { /* Not found, retry with an alternative */
+   ret = get_alternative_line_range(dinfo, lr, module, user);
+   if (!ret)
+   ret = debuginfo__find_line_range(dinfo, lr);
+   }
debuginfo__delete(dinfo);
if (ret == 0 || ret == -ENOENT) {
pr_warning("Specified source line is not found.\n");
@@ -819,7 +850,7 @@ int show_line_range(struct line_range *lr, const char *module, bool user)
ret = init_symbol_maps(user);
if (ret < 0)
return ret;
-   ret = __show_line_range(lr, module);
+   ret = __show_line_range(lr, module, user);
exit_symbol_maps();
 
return ret;
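
The range handling in get_alternative_line_range() above boils down to
"move the start, keep the length, leave an open end alone". A stand-alone
sketch of just that arithmetic follows; struct mini_line_range and
rebase_line_range() are hypothetical names, and reading INT_MAX as "no
explicit end line" is an assumption drawn from the checks in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical cut-down line range: just the numeric bounds. */
struct mini_line_range {
    int start;
    int end; /* INT_MAX is taken to mean "no explicit end line" */
};

/* Move the range to new_start, preserving its length if it has one. */
static void rebase_line_range(struct mini_line_range *lr, int new_start)
{
    int len = 0;

    assert(lr != NULL);

    if (lr->end != INT_MAX)
        len = lr->end - lr->start;
    lr->start = new_start;
    if (lr->end != INT_MAX)
        lr->end = lr->start + len;
}
```

A range of 10..20 rebased to 100 becomes 100..110, while 0..INT_MAX
rebased to 5 becomes 5..INT_MAX.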


[PATCH perf/core v2 0/5] perf-probe: improve glibc support

2015-03-05 Thread Masami Hiramatsu

Hi,

Here is a series of patches which improves perf-probe to
handle glibc's aliased symbols and weak symbols more
correctly.

This version includes 2 new patches from Namhyung (Thanks!)
which solve a problem with weak symbols. I added a fix
to his latter patch to modify find_alternative_probe_point,
and dropped a bugfix which is already merged.
So this series merges the 2 series below.

http://lkml.kernel.org/g/20150302124939.9191.33564.stgit@localhost.localdomain
http://lkml.kernel.org/g/1425477143-5310-1-git-send-email-namhy...@kernel.org

==
A major known issue of probing on glibc is that
some aliased symbols (e.g. malloc) and weak symbols
(e.g. calloc) cannot be found by perf-probe.

Actually, glibc's malloc symbol is just an alias of
__libc_malloc. Its debuginfo knows only __libc_malloc,
and perf's symbol map knows only malloc. This difference
always confuses users: they can see malloc in perf
report or annotate, but they cannot probe on it, nor
find its definition with the --line option.

And weak symbols are dropped when loading.

Previously, I made commit 906451b98b67, which solved
this problem partly but not completely.
So I decided to solve this issue completely by finding
symbols like malloc in perf's symbol map, and
converting the symbol's address into debuginfo's
location information.

With this series, you can use --vars, --line and --add
with the aliased symbols and weak symbols on glibc.

Thank you,


---

Masami Hiramatsu (3):
  perf-probe: Fix to handle aliased symbols in glibc
  perf-probe: Fix --line to handle aliased symbols in glibc
  Revert "perf probe: Fix to fall back to find probe point in symbols"

Namhyung Kim (2):
  perf symbols: Allow symbol alias when loading map for symbol name
  perf probe: Allow weak symbols to be probed


 tools/perf/util/machine.c|2 
 tools/perf/util/map.c|6 +
 tools/perf/util/map.h|8 +-
 tools/perf/util/probe-event.c|  185 +-
 tools/perf/util/symbol-elf.c |5 +
 tools/perf/util/symbol-minimal.c |2 
 tools/perf/util/symbol.c |8 +-
 tools/perf/util/symbol.h |5 +
 8 files changed, 182 insertions(+), 39 deletions(-)

--
Masami HIRAMATSU
Software Platform Research Dpt. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com

[PATCH perf/core v2 5/5] perf probe: Allow weak symbols to be probed

2015-03-05 Thread Masami Hiramatsu

From: Namhyung Kim 

perf probe currently refuses to add probes on weak symbols.  But there are
cases where the given name exists only as a weak symbol, so we cannot add a
probe.

  $ perf probe -x /usr/lib/libc.so.6 -a calloc
  Failed to find symbol calloc in /usr/lib/libc-2.21.so
Error: Failed to add events.

  $ nm /usr/lib/libc.so.6 | grep calloc
  0007b1f0 t __calloc
  0007b1f0 T __libc_calloc
  0007b1f0 W calloc

This change will result in duplicate probes when strong and weak symbols
co-exist in a binary.  But I think it's not a big problem since probes
at the weak symbol will never be hit anyway.
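
As a rough illustration of the filter this patch removes, here is a minimal
C sketch. It is not perf code: the demo_* names and the toy symbol table are
hypothetical, and only the binding values (0, 1, 2) are taken from the ELF
definitions of STB_LOCAL, STB_GLOBAL and STB_WEAK. The table mirrors the nm
output above, where 't' is a local, 'T' a global and 'W' a weak symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ELF symbol binding values; same numbers as STB_* in <elf.h>. */
enum demo_binding {
	DEMO_STB_LOCAL = 0,
	DEMO_STB_GLOBAL = 1,
	DEMO_STB_WEAK = 2
};

struct demo_symbol {
	const char *name;
	unsigned long start;
	enum demo_binding binding;
};

/* Toy symbol table mirroring the nm output quoted in the commit message. */
static const struct demo_symbol demo_symtab[] = {
	{ "__calloc",      0x7b1f0, DEMO_STB_LOCAL  },
	{ "__libc_calloc", 0x7b1f0, DEMO_STB_GLOBAL },
	{ "calloc",        0x7b1f0, DEMO_STB_WEAK   },
};

/*
 * Count the symbols called `name`. With allow_weak == 0 this behaves like
 * the old check (only local and global bindings count); with allow_weak != 0
 * every matching symbol counts, as after this patch.
 */
int demo_count_probe_functions(const char *name, int allow_weak)
{
	int found = 0;
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(demo_symtab) / sizeof(demo_symtab[0]); i++) {
		const struct demo_symbol *sym = &demo_symtab[i];

		if (strcmp(sym->name, name) != 0)
			continue;
		if (!allow_weak && sym->binding == DEMO_STB_WEAK)
			continue;
		found++;
	}
	return found;
}
```

With the old behaviour, demo_count_probe_functions("calloc", 0) finds nothing,
which corresponds to the "Failed to find symbol calloc" error above.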

Signed-off-by: Masami Hiramatsu 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/probe-event.c |   12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index c379ea0..f9c1e53 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -309,10 +309,8 @@ static int find_alternative_probe_point(struct debuginfo *dinfo,
 
/* Find the address of given function */
map__for_each_symbol_by_name(map, pp->function, sym) {
-   if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) {
-   address = sym->start;
-   break;
-   }
+   address = sym->start;
+   break;
}
if (!address) {
ret = -ENOENT;
@@ -2484,8 +2482,7 @@ static int find_probe_functions(struct map *map, char *name)
struct symbol *sym;
 
map__for_each_symbol_by_name(map, name, sym) {
-   if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL)
-   found++;
+   found++;
}
 
return found;
@@ -2845,8 +2842,7 @@ static struct strfilter *available_func_filter;
 static int filter_available_functions(struct map *map __maybe_unused,
  struct symbol *sym)
 {
-   if ((sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) &&
-   strfilter__compare(available_func_filter, sym->name))
+   if (strfilter__compare(available_func_filter, sym->name))
return 0;
return 1;
 }


[PATCH perf/core v2 3/5] Revert "perf probe: Fix to fall back to find probe point in symbols"

2015-03-05 Thread Masami Hiramatsu

This reverts commit 906451b98b67 ("perf probe: Fix to fall back to find probe point in symbols").

Since perf-probe now retries with the address of the given symbol
looked up in the map before reaching this path, this fallback routine
is no longer needed.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/util/probe-event.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 4cfd121..c379ea0 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -630,11 +630,9 @@ static int try_to_find_probe_trace_events(struct perf_probe_event *pev,
}
 
if (ntevs == 0) {   /* No error but failed to find probe point. */
-   pr_warning("Probe point '%s' not found in debuginfo.\n",
+   pr_warning("Probe point '%s' not found.\n",
   synthesize_perf_probe_point(&pev->point));
-   if (need_dwarf)
-   return -ENOENT;
-   return 0;
+   return -ENOENT;
}
/* Error path : ntevs < 0 */
pr_debug("An error occurred in debuginfo analysis (%d).\n", ntevs);


[PATCH perf/core v2 4/5] perf symbols: Allow symbol alias when loading map for symbol name

2015-03-05 Thread Masami Hiramatsu

From: Namhyung Kim 

When perf probe tries to add a probe in a binary using a symbol name, it
sometimes fails because some symbols were discarded while loading the dso.
When resolving an address to a symbol, it is better to have just one
symbol at a given address.  But for finding an address from a symbol name,
it is better to keep all names (including aliases).

Add and propagate a new allow_alias argument to dso (and map) load
functions so that it can keep those duplicate symbol aliases.
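
The effect of the flag can be sketched with a toy table. This is only a
model under stated assumptions: demo_fixup_duplicate() and the addresses are
hypothetical, and the real symbols__fixup_duplicate() picks which duplicate
to keep by its own rules, whereas this sketch simply keeps the first name at
each address.

```c
#include <assert.h>
#include <stddef.h>

struct demo_sym {
	const char *name;
	unsigned long start;
};

/*
 * Compact `syms` (sorted by start address) in place and return the new
 * count. With allow_alias == 0 only the first symbol at each address is
 * kept; with allow_alias != 0 every name survives.
 */
size_t demo_fixup_duplicate(struct demo_sym *syms, size_t n, int allow_alias)
{
	size_t in, out = 0;

	assert(syms != NULL || n == 0);
	if (allow_alias)
		return n;

	for (in = 0; in < n; in++) {
		if (out > 0 && syms[out - 1].start == syms[in].start)
			continue;	/* alias of the symbol just kept */
		syms[out++] = syms[in];
	}
	return out;
}

/* Load a fresh toy table and report how many symbols remain. */
size_t demo_count_after_load(int allow_alias)
{
	struct demo_sym syms[] = {
		{ "__libc_malloc", 0x80000 },
		{ "malloc",        0x80000 },
		{ "free",          0x80400 },
	};

	return demo_fixup_duplicate(syms, sizeof(syms) / sizeof(syms[0]),
				    allow_alias);
}
```

Without aliases the table shrinks from three entries to two, so a later
lookup by the name "malloc" would miss; with aliases all three remain.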

Acked-by: Masami Hiramatsu 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/machine.c|2 +-
 tools/perf/util/map.c|6 +++---
 tools/perf/util/map.h|8 +++-
 tools/perf/util/symbol-elf.c |5 +++--
 tools/perf/util/symbol-minimal.c |2 +-
 tools/perf/util/symbol.c |8 +---
 tools/perf/util/symbol.h |5 +++--
 7 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 24f8c97..01ba9b6 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1128,7 +1128,7 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
 * preload dso of guest kernel and modules
 */
dso__load(kernel, machine->vmlinux_maps[MAP__FUNCTION],
- NULL);
+ NULL, false);
}
}
return 0;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 62ca9f2..711e072 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -248,7 +248,7 @@ void map__fixup_end(struct map *map)
 
 #define DSO__DELETED "(deleted)"
 
-int map__load(struct map *map, symbol_filter_t filter)
+int __map__load(struct map *map, symbol_filter_t filter, bool allow_alias)
 {
const char *name = map->dso->long_name;
int nr;
@@ -256,7 +256,7 @@ int map__load(struct map *map, symbol_filter_t filter)
if (dso__loaded(map->dso, map->type))
return 0;
 
-   nr = dso__load(map->dso, map, filter);
+   nr = dso__load(map->dso, map, filter, allow_alias);
if (nr < 0) {
if (map->dso->has_build_id) {
char sbuild_id[BUILD_ID_SIZE * 2 + 1];
@@ -304,7 +304,7 @@ struct symbol *map__find_symbol(struct map *map, u64 addr,
 struct symbol *map__find_symbol_by_name(struct map *map, const char *name,
symbol_filter_t filter)
 {
-   if (map__load(map, filter) < 0)
+   if (__map__load(map, filter, true) < 0)
return NULL;
 
if (!dso__sorted_by_name(map->dso, map->type))
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 0e42438..ba15607 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -149,7 +149,13 @@ size_t map__fprintf_dsoname(struct map *map, FILE *fp);
 int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
 FILE *fp);
 
-int map__load(struct map *map, symbol_filter_t filter);
+int __map__load(struct map *map, symbol_filter_t filter, bool allow_alias);
+
+static inline int map__load(struct map *map, symbol_filter_t filter)
+{
+   return __map__load(map, filter, false);
+}
+
 struct symbol *map__find_symbol(struct map *map,
u64 addr, symbol_filter_t filter);
 struct symbol *map__find_symbol_by_name(struct map *map, const char *name,
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index ada1676..fb630f8 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -754,7 +754,7 @@ static bool want_demangle(bool is_kernel_sym)
 
 int dso__load_sym(struct dso *dso, struct map *map,
  struct symsrc *syms_ss, struct symsrc *runtime_ss,
- symbol_filter_t filter, int kmodule)
+ symbol_filter_t filter, int kmodule, bool allow_alias)
 {
struct kmap *kmap = dso->kernel ? map__kmap(map) : NULL;
struct map *curr_map = map;
@@ -1048,7 +1048,8 @@ new_symbol:
 * For misannotated, zeroed, ASM function sizes.
 */
if (nr > 0) {
-   symbols__fixup_duplicate(&dso->symbols[map->type]);
+   if (!allow_alias)
+   symbols__fixup_duplicate(&dso->symbols[map->type]);
symbols__fixup_end(&dso->symbols[map->type]);
if (kmap) {
/*
diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c
index d7efb03..fefeeb3 100644
--- a/tools/perf/util/symbol-minimal.c
+++ b/tools/perf/util/symbol-minimal.c
@@ -334,7 +334,7 @@ int dso__load_sym(struct dso *dso, struct map *map __maybe_unused,
  struct symsrc *ss,
  struct symsrc *runtime_ss __maybe_unused,
  symbol_filter_t filter __maybe_unused,
- int kmodule __maybe_unused)
+

[PATCH perf/core v2 1/5] perf-probe: Fix to handle aliased symbols in glibc

2015-03-05 Thread Masami Hiramatsu

Fix perf probe to handle aliased symbols correctly in glibc.
In glibc, several symbols are defined as aliases of
__libc_XXX, e.g. malloc is an alias of __libc_malloc.
In such cases, dwarf has no subroutine instances of the
alias functions (e.g. no "malloc" instance), but the map
has that symbol and its address.
Thus, if we search for the aliased symbol in debuginfo, we
always fail to find it, but it is in the map.

To solve this problem, this falls back to an address-based
alternative search, which searches for the symbol in the map,
translates its address to the alternative (correct) function
name by using debuginfo, and retries to find the alternative
function point from debuginfo.

This adds the fallback process to the --vars, --line and --add
options. So, now you can use those on malloc@libc :)
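
The lookup order described above can be modelled with two small tables.
This is a sketch only, assuming toy data: the demo_* helpers, the tables and
the address 0x80000 are all hypothetical and stand in for perf's symbol map
and the DWARF debuginfo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct demo_entry {
	const char *name;
	unsigned long addr;
};

/* Symbol map: knows the alias as well as the real name. */
static const struct demo_entry demo_map[] = {
	{ "__libc_malloc", 0x80000 },
	{ "malloc",        0x80000 },
};

/* Debuginfo: only knows the function that was actually defined. */
static const struct demo_entry demo_debuginfo[] = {
	{ "__libc_malloc", 0x80000 },
};

static const struct demo_entry *
demo_find_by_name(const struct demo_entry *tbl, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tbl[i].name, name) == 0)
			return &tbl[i];
	return NULL;
}

static const struct demo_entry *
demo_find_by_addr(const struct demo_entry *tbl, size_t n, unsigned long addr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].addr == addr)
			return &tbl[i];
	return NULL;
}

/* Return the debuginfo function name to probe for `name`, or NULL. */
const char *demo_resolve_probe_function(const char *name)
{
	const struct demo_entry *e;

	assert(name != NULL);

	/* 1. Direct search in debuginfo. */
	e = demo_find_by_name(demo_debuginfo, DEMO_ARRAY_SIZE(demo_debuginfo),
			      name);
	if (e)
		return e->name;

	/* 2. Fallback: name -> address via the symbol map. */
	e = demo_find_by_name(demo_map, DEMO_ARRAY_SIZE(demo_map), name);
	if (!e)
		return NULL;

	/* 3. address -> alternative function name via debuginfo. */
	e = demo_find_by_addr(demo_debuginfo, DEMO_ARRAY_SIZE(demo_debuginfo),
			      e->addr);
	return e ? e->name : NULL;
}
```

Here demo_resolve_probe_function("malloc") yields "__libc_malloc", which is
the same translation visible in the "@<__libc_malloc+0>" output below.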

Without this patch;
  -
  # ./perf probe -x /usr/lib64/libc-2.17.so -V malloc
  Failed to find the address of malloc
Error: Failed to show vars.
  # ./perf probe -x /usr/lib64/libc-2.17.so -a "malloc bytes"
  Probe point 'malloc' not found in debuginfo.
Error: Failed to add events.
  -

With this patch;
  -
  # ./perf probe -x /usr/lib64/libc-2.17.so -V malloc
  Available variables at malloc
  @<__libc_malloc+0>
  size_t  bytes
  # ./perf probe -x /usr/lib64/libc-2.17.so -a "malloc bytes"
  Added new event:
probe_libc:malloc(on malloc in /usr/lib64/libc-2.17.so with bytes)

  You can now use it in all perf tools, such as:

  perf record -e probe_libc:malloc -aR sleep 1
  -

Reported-by: Arnaldo Carvalho de Melo 
Signed-off-by: Masami Hiramatsu 
---
 tools/perf/util/probe-event.c |  140 -
 1 file changed, 124 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 1c570c2fa7..b8f4578 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -178,6 +178,25 @@ static struct map *kernel_get_module_map(const char *module)
return NULL;
 }
 
+static struct map *get_target_map(const char *target, bool user)
+{
+   /* Init maps of given executable or kernel */
+   if (user)
+   return dso__new_map(target);
+   else
+   return kernel_get_module_map(target);
+}
+
+static void put_target_map(struct map *map, bool user)
+{
+   if (map && user) {
+   /* Only the user map needs to be released */
+   dso__delete(map->dso);
+   map__delete(map);
+   }
+}
+
+
 static struct dso *kernel_get_module_dso(const char *module)
 {
struct dso *dso;
@@ -249,6 +268,13 @@ out:
return ret;
 }
 
+static void clear_perf_probe_point(struct perf_probe_point *pp)
+{
+   free(pp->file);
+   free(pp->function);
+   free(pp->lazy_line);
+}
+
 static void clear_probe_trace_events(struct probe_trace_event *tevs, int ntevs)
 {
int i;
@@ -258,6 +284,74 @@ static void clear_probe_trace_events(struct probe_trace_event *tevs, int ntevs)
 }
 
 #ifdef HAVE_DWARF_SUPPORT
+/*
+ * Some binaries like glibc have special symbols which are on the symbol
+ * table, but not in the debuginfo. If we can find the address of the
+ * symbol from map, we can translate the address back to the probe point.
+ */
+static int find_alternative_probe_point(struct debuginfo *dinfo,
+   struct perf_probe_point *pp,
+   struct perf_probe_point *result,
+   const char *target, bool uprobes)
+{
+   struct map *map = NULL;
+   struct symbol *sym;
+   u64 address = 0;
+   int ret = -ENOENT;
+
+   /* This can work only for function-name based one */
+   if (!pp->function || pp->file)
+   return -ENOTSUP;
+
+   map = get_target_map(target, uprobes);
+   if (!map)
+   return -EINVAL;
+
+   /* Find the address of given function */
+   map__for_each_symbol_by_name(map, pp->function, sym) {
+   if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) {
+   address = sym->start;
+   break;
+   }
+   }
+   if (!address) {
+   ret = -ENOENT;
+   goto out;
+   }
+   pr_debug("Symbol %s address found : %lx\n", pp->function, address);
+
+   ret = debuginfo__find_probe_point(dinfo, (unsigned long)address,
+ result);
+   if (ret <= 0)
+   ret = (!ret) ? -ENOENT : ret;
+   else {
+   result->offset += pp->offset;
+   result->line += pp->line;
+   ret = 0;
+   }
+
+out:
+   put_target_map(map, uprobes);
+   return ret;
+
+}
+
+static int get_alternative_probe_event(struct debuginfo *dinfo,
+  struct perf_probe_event *pev,
+  struct perf_probe_point *tmp,
+

Re: [PATCH v2 6/6] x86, asm: Rename INIT_TSS_IST to TSS_IST

2015-03-05 Thread Ingo Molnar


* Andy Lutomirski  wrote:

> This has nothing to do with the init thread or the initial anything.
> It's just the TSS.
> 
> Signed-off-by: Andy Lutomirski 
> ---
>  arch/x86/kernel/entry_64.S | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 0c00fd80249a..c86f83e95f15 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -959,7 +959,7 @@ apicinterrupt IRQ_WORK_VECTOR \
>  /*
>   * Exception entry points.
>   */
> -#define INIT_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
> +#define TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
>  
>  .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
>  ENTRY(\sym)
> @@ -1015,13 +1015,13 @@ ENTRY(\sym)
>   .endif
>  
>   .if \shift_ist != -1
> - subq $EXCEPTION_STKSZ, INIT_TSS_IST(\shift_ist)
> + subq $EXCEPTION_STKSZ, TSS_IST(\shift_ist)
>   .endif
>  
>   call \do_sym
>  
>   .if \shift_ist != -1
> - addq $EXCEPTION_STKSZ, INIT_TSS_IST(\shift_ist)
> + addq $EXCEPTION_STKSZ, TSS_IST(\shift_ist)
>   .endif
>  
>   /* these procedures expect "no swapgs" flag in ebx */

If you don't mind I've renamed this to 'CPU_TSS_IST', to be in line 
with cpu_tss.

The per-cpuness of this symbol gets lost at the usage sites, because 
the PER_CPU_VAR() reference is hidden in a macro.
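
For readers unfamiliar with the macro, the arithmetic inside it can be
sketched in plain C. This is an illustration under stated assumptions:
struct demo_tss is a hypothetical stand-in and not the real TSS layout; only
the 1-based index, the 8-byte slot size and the seven IST slots are taken
from the macro and the x86-64 architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the TSS: only the position of ist[] matters here. */
struct demo_tss {
	uint64_t other_fields[4];	/* hypothetical, not the real layout */
	uint64_t ist[7];		/* seven 8-byte IST stack pointers */
};

/* Byte offset of IST slot x (1-based), mirroring TSS_ist + ((x) - 1) * 8. */
size_t demo_tss_ist_offset(int x)
{
	assert(x >= 1 && x <= 7);
	return offsetof(struct demo_tss, ist) + (size_t)(x - 1) * 8;
}
```

The macro adds the per-cpu base of cpu_tss on top of this offset, which is
the part that is not visible at the usage sites.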

Thanks,

Ingo

Re: [RESEND PATCH] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path

2015-03-05 Thread HATAYAMA Daisuke

From: Vivek Goyal 
Subject: Re: [RESEND PATCH] kernel/panic/kexec: fix 
"crash_kexec_post_notifiers" option issue in oops path
Date: Thu, 5 Mar 2015 17:22:04 -0500

> On Thu, Mar 05, 2015 at 05:19:30PM -0500, Vivek Goyal wrote:
>> On Wed, Mar 04, 2015 at 05:56:48PM +0900, HATAYAMA Daisuke wrote:
>> > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
>> > "crash_kexec_post_notifiers" kernel boot option, which toggles
>> > wheather panic() calls crash_kexec() before or after panic_notifiers
>> > and dump kmsg.
>> > 
>> > The problem is that the commit overlooks panic_on_oops kernel boot
>> > option. If it is enabled, crash_kexec() is called directly without
>> > going through panic() in oops path.
>> > 
>> > To fix this issue, this patch adds a check to
>> > "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
>> > 
>> > Signed-off-by: HATAYAMA Daisuke 
>> > Acked-by: Baoquan He 
>> > Tested-by: Hidehiro Kawai 
>> > ---
>> >  include/linux/kernel.h | 3 +++
>> >  kernel/kexec.c | 2 ++
>> >  kernel/panic.c | 2 +-
>> >  3 files changed, 6 insertions(+), 1 deletion(-)
>> > 
>> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
>> > index 64ce58b..f47379f 100644
>> > --- a/include/linux/kernel.h
>> > +++ b/include/linux/kernel.h
>> > @@ -426,6 +426,9 @@ extern int panic_on_unrecovered_nmi;
>> >  extern int panic_on_io_nmi;
>> >  extern int panic_on_warn;
>> >  extern int sysctl_panic_on_stackoverflow;
>> > +
>> > +extern bool crash_kexec_post_notifiers;
>> > +
>> >  /*
>> >   * Only to be used by arch init code. If the user over-wrote the default
>> >   * CONFIG_PANIC_TIMEOUT, honor it.
>> > diff --git a/kernel/kexec.c b/kernel/kexec.c
>> > index 9a8a01a..0ecf252 100644
>> > --- a/kernel/kexec.c
>> > +++ b/kernel/kexec.c
>> > @@ -84,6 +84,8 @@ struct resource crashk_low_res = {
>> >  
>> >  int kexec_should_crash(struct task_struct *p)
>> >  {
>> > +  if (crash_kexec_post_notifiers)
>> > +  return 0;
>> 
>> This is little confusing. So if crash_kexec_post_notifiers is set but
>> panic_on_oops is not set, still we will return?
>> 
>> Should we do this only if panic_on_oops is set? IOW, how about following
>> 
>>  if (panic_on_oops && crash_kexec_post_notifiers)
>>  return 0;
>> 
>> And then also put a comment explaining the rationale.
> 
> Ok, I went through the previous version of patch and discussion there
> which says that all the 4 conditions lead to panic. So putting above
> code should be fine.
> 
> Can you please atleast put a comment here to explain it as it was not
> obvious. Just mention that all the checks below lead to panic hence
> if user wants to run panic notifiers then don't run crash_kexec() yet.
> It will be run after panic notifiers.
> 

Thanks for your review.

Yes, I'll add such a comment in the next version of the patch.

--
Thanks.
HATAYAMA, Daisuke


Re: x32 + audit status?

2015-03-05 Thread Paul Moore

On Thu, Mar 5, 2015 at 6:07 PM, Andy Lutomirski  wrote:
> On Mar 5, 2015 10:32 AM, "David Drysdale"  wrote:
>>
>> Hi,
>>
>> Do we currently expect the audit system to work with x32 syscalls?
>>
>> I was playing with the audit system for the first time today (on
>> v4.0-rc2, due to [1]), and it didn't seem to work for me.  (Tweaking
>> ptrace.c like the patch below seemed to help, but I may just have
>> configured something wrong.)
>>
>> I know there was a bunch of activity around this area in mid-2014,
>> but I'm not sure what the final position was...
>
> It's totally broken, and it needs ABI work.  I think it should keep
> the high syscall numbers, which means that both userspace and the
> audit core need to learn how to deal with it.

What Andy said.  It's on the list of things to fix, but to be brutally
honest, it's not very high on the list due to lack of interest from
people asking for audit/x32 support.

Re: [PATCH] ASoC: Add support for NAU8824 codec to ASoC

2015-03-05 Thread Chih-Chiang Chang



On 2015/3/4 08:55 PM, Mark Brown wrote:
> On Wed, Mar 04, 2015 at 08:35:52PM +0800, Chih-Chiang Chang wrote:
>> On 2015/2/24 10:13 PM, Mark Brown wrote:
> 
>>> I would have expected the headphone volume control to be a stereo
>>> (double) control - same for speakers.
> 
>> The nau8824 related registers which control left/right volume are located
>> in different addresses and different shift bits. Since there is no available
>> preprocessor macro to meet our requirements, the driver consists of 
>> left/right
>> volume control separately.
> 
> Add relevant control types if you need them, it's important to have
> proper stereo controls available to userspace.
We cannot find a suitable macro in "include/sound/soc.h", so we want to add
the two macros below for our chip.
SOC_DOUBLE_L_R_VALUE
SOC_DOUBLE_L_R_TLV
> 
 +struct nau8824_init_reg {
 +u8 reg;
 +u16 val;
 +};
> 
>>> This looks like you're reimplementing regmap's register sequence
>>> stuff...  It's also a *very* large sequence you have, are you sure it's
>>> all required?  It seems like this may be doing a bunch of machine
>>> specific configuration but since it's all magic numbers it's hard to
>>> tell.
> 
>> Initial settings are arranged in order
> 
> This doesn't answer or address my concern.
This large number of register settings is used to initialize our codec, and
some other codecs do the same. We will remove a few unnecessary register
default settings and add comments for the registers.
> 
 +/* Dynamic Headset detection enabled */
 +snd_soc_update_bits(codec, 0x01, 0x400, 0x400);
 +snd_soc_update_bits(codec, 0x02, 0x0008, 0x0008);
 +snd_soc_update_bits(codec, 0x0f, 0x0300, 0x0100);
 +snd_soc_write(codec, 0x09, 0xE000);
 +snd_soc_write(codec, NAU8824_IRQ_SETTING, 0x1006);
 +snd_soc_write(codec, 0x13, 0x1615);
 +snd_soc_write(codec, 0x15, 0x0414);
 +snd_soc_update_bits(codec, 0x16, 0xFF00, 0x5900);
 +snd_soc_update_bits(codec, 0x66, 0x0070, 0x0060);
> 
>>> Too many magic numbers here I think and this looks like it should be
>>> configured using platform data and/or the machine driver (what if the
>>> headphone detection/IRQ aren't wired up?).  I'd also expect to see
>>> reporting via the standard interfaces for jack reporting.
> 
>> The above initial settings are for jack detection. As for other jack
>> detection flow, it will be implemented in machine driver but not be included 
>> in
>> this application.
> 
> Please either remove this for now or implement it properly.
We will remove it.
> 
>> ===
>> The privileged confidential information contained in this email is intended 
>> for use only by the addressees as indicated by the original sender of this 
>> email. If you are not the addressee indicated in this email or are not 
>> responsible for delivery of the email to such a person, please kindly reply 
>> to the sender indicating this fact and delete all copies of it from your 
>> computer and network server immediately. Your cooperation is highly 
>> appreciated. It is advised that any unauthorized use of confidential 
>> information of Nuvoton is strictly prohibited; and any information in this 
>> email irrelevant to the official business of Nuvoton shall be deemed as 
>> neither given nor endorsed by Nuvoton.
> 
> Don't include noise like this in upstream communication, if your company
> won't fix this then please use an external mail account for upstream
> communication.
Our MIS team reports that they have disabled appending the message to mail.
I hope you do not see it in this mail.
> 

Re: [RFC/PATCH 2/2] perf probe: Allow weak symbols to be probed

2015-03-05 Thread Namhyung Kim

On Fri, Mar 6, 2015 at 4:05 PM, Masami Hiramatsu
 wrote:
> (2015/03/04 22:52), Namhyung Kim wrote:
>> It currently prevents adding probes in weak symbols.  But there're cases
>> that given name is an only weak symbol so that we cannot add probe.
>>
>>   $ perf probe -x /usr/lib/libc.so.6 -a calloc
>>   Failed to find symbol calloc in /usr/lib/libc-2.21.so
>> Error: Failed to add events.
>>
>>   $ nm /usr/lib/libc.so.6 | grep calloc
>>   0007b1f0 t __calloc
>>   0007b1f0 T __libc_calloc
>>   0007b1f0 W calloc
>>
>> This change will result in duplicate probes when strong and weak symbols
>> co-exist in a binary.  But I think it's not a big problem since probes
>> at the weak symbol will never be hit anyway.
>>
>> Cc: Masami Hiramatsu 
>> Signed-off-by: Namhyung Kim 
>> ---
>>  tools/perf/util/probe-event.c | 6 ++
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
>> index 1c570c2fa7cc..12b7d018106e 100644
>> --- a/tools/perf/util/probe-event.c
>> +++ b/tools/perf/util/probe-event.c
>> @@ -2339,8 +2339,7 @@ static int find_probe_functions(struct map *map, char 
>> *name)
>>   struct symbol *sym;
>>
>>   map__for_each_symbol_by_name(map, name, sym) {
>> - if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL)
>> - found++;
>> + found++;
>>   }
>
> Ah, I've found this is the magic...
> Here, we need another fix on my series.

Oops, right.  I didn't base on your patch so missed this function.

Thanks,
Namhyung


>
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 22392b06..f9c1e53 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -309,10 +309,8 @@ static int find_alternative_probe_point(struct debuginfo 
> *d
>
> /* Find the address of given function */
> map__for_each_symbol_by_name(map, pp->function, sym) {
> -   if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) {
> -   address = sym->start;
> -   break;
> -   }
> +   address = sym->start;
> +   break;
> }
> if (!address) {
> ret = -ENOENT;
> ---
>
> With this fix, I could get variables on waitpid and calloc.
> -
> # ./perf probe -x /lib64/libc-2.17.so -V waitpid
> Available variables at waitpid
> @<__libc_waitpid+0>
> __pid_t pid
> int oldtype
> int options
> int*stat_loc
> -
>
> I'll update and include it my series.
>
> Thank you!
>
>>
>>   return found;
>> @@ -2708,8 +2707,7 @@ static struct strfilter *available_func_filter;
>>  static int filter_available_functions(struct map *map __maybe_unused,
>> struct symbol *sym)
>>  {
>> - if ((sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) &&
>> - strfilter__compare(available_func_filter, sym->name))
>> + if (strfilter__compare(available_func_filter, sym->name))
>>   return 0;
>>   return 1;
>>  }
>>
>
>
> --
> Masami HIRAMATSU
> Software Platform Research Dept. Linux Technology Research Center
> Hitachi, Ltd., Yokohama Research Laboratory
> E-mail: masami.hiramatsu...@hitachi.com
>
>



-- 
Thanks,
Namhyung

Re: [RFC 00/16] Introduce ZONE_CMA

2015-03-05 Thread Joonsoo Kim

On Thu, Mar 05, 2015 at 06:48:50PM +0100, Vlastimil Babka wrote:
> On 03/05/2015 05:53 PM, Vlastimil Babka wrote:
> > On 02/12/2015 08:32 AM, Joonsoo Kim wrote:
> >> 
> >> 1) Break non-overlapped zone assumption
> >> CMA regions could be spread to all memory range, so, to keep all of them
> >> into one zone, span of ZONE_CMA would be overlap to other zones'.
> > 
> > From patch 13/16 ut seems to me that indeed the ZONE_CMA spans the area of 
> > all
> > other zones. This seems very inefficient for e.g. compaction scanners, which
> > will repeatedly skip huge amounts of pageblocks that don't belong to 
> > ZONE_CMA.
> > Could you instead pick only a single zone on a node from which you steal the
> > pages? That would allow to keep the span low.

Hello, Vlastimil.

CMA is used for DMA now and it sometimes has memory range constraints,
so we cannot always keep the zone span small. But the current implementation
unnecessarily sets up ZONE_CMA's span from start_pfn of the node to
end_pfn of the node. I will change it to the range where we actually steal
pages. Most CMA use cases would probably use a small, limited range of
memory, so it shouldn't impose a critical performance problem on the zone's
pfn iterators such as compaction scanners.
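
A toy model of the cost being discussed, assuming a 16-page node in which
only pfns 10..12 were stolen for CMA. All demo_* names and numbers are
hypothetical; the point is only that a scanner walking a zone's span has to
skip every pfn that belongs to another zone.

```c
#include <assert.h>
#include <stddef.h>

enum demo_zone_id { DEMO_ZONE_NORMAL = 0, DEMO_ZONE_CMA = 1 };

#define DEMO_NR_PFNS 16

/* Which zone owns each pfn on the toy node; unlisted pfns are NORMAL. */
static const enum demo_zone_id demo_pfn_zone[DEMO_NR_PFNS] = {
	[10] = DEMO_ZONE_CMA, [11] = DEMO_ZONE_CMA, [12] = DEMO_ZONE_CMA,
};

/*
 * Walk the half-open pfn range [start_pfn, end_pfn) the way a zone scanner
 * would, and count the pfns that must be skipped because they do not belong
 * to the CMA zone.
 */
size_t demo_count_skipped(unsigned long start_pfn, unsigned long end_pfn)
{
	size_t skipped = 0;
	unsigned long pfn;

	assert(start_pfn <= end_pfn && end_pfn <= DEMO_NR_PFNS);
	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		if (demo_pfn_zone[pfn] != DEMO_ZONE_CMA)
			skipped++;
	}
	return skipped;
}
```

With a node-wide span, demo_count_skipped(0, 16) skips 13 of 16 pfns; with
the span limited to the stolen range, demo_count_skipped(10, 13) skips none.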

> > 
> > Another disadvantage I see is that to allocate from ZONE_CMA you will have 
> > now
> > to reclaim enough pages within the zone itself. I think think the cma 
> > allocation
> 
>   I don't think...
> 
> > supports migrating pages from ZONE_CMA to the adjacent non-CMA zone, which 
> > would
> > be equivalent to migration from MIGRATE_CMA pageblocks to the rest of the 
> > zone?

I'm not sure I understand your question correctly.

cma allocation uses alloc_migrate_target() to get a migration target
freepage and it doesn't impose any zone constraint, so migrating pages
from ZONE_CMA to the adjacent non-CMA zone is possible. Am I
understanding your question correctly? If I misunderstand, please let
me know.

Thanks.

> >> I'm not sure that there is an assumption about possibility of zone overlap
> >> But, if ZONE_CMA is introduced, this assumption becomes reality
> >> so we should deal with this situation. I investigated most of sites
> >> that iterates pfn on certain zone and found that they normally doesn't
> >> consider zone overlap. I tried to handle these cases by myself in the
> >> early of this series. I hope that there is no more site that depends on
> >> non-overlap zone assumption when iterating pfn on certain zone.
> >> 
> >> I passed boot test on x86, ARM32 and ARM64. I did some stress tests
> >> on x86 and there is no problem. Feel free to enjoy and please give me
> >> a feedback. :)
> >> 
> >> This patchset is based on v3.18.
> >> 
> >> Thanks.
> >> 
> >> [1] https://lkml.org/lkml/2014/5/28/64
> >> [2] https://lkml.org/lkml/2014/11/4/55 
> >> [3] https://lkml.org/lkml/2014/10/15/623
> >> [4] https://lkml.org/lkml/2014/5/30/320
> >> 
> >> 
> >> Joonsoo Kim (16):
> >>   mm/page_alloc: correct highmem memory statistics
> >>   mm/writeback: correct dirty page calculation for highmem
> >>   mm/highmem: make nr_free_highpages() handles all highmem zones by
> >> itself
> >>   mm/vmstat: make node_page_state() handles all zones by itself
> >>   mm/vmstat: watch out zone range overlap
> >>   mm/page_alloc: watch out zone range overlap
> >>   mm/page_isolation: watch out zone range overlap
> >>   power: watch out zone range overlap
> >>   mm/cma: introduce cma_total_pages() for future use
> >>   mm/highmem: remove is_highmem_idx()
> >>   mm/page_alloc: clean-up free_area_init_core()
> >>   mm/cma: introduce new zone, ZONE_CMA
> >>   mm/cma: populate ZONE_CMA and use this zone when GFP_HIGHUSERMOVABLE
> >>   mm/cma: print stealed page count
> >>   mm/cma: remove ALLOC_CMA
> >>   mm/cma: remove MIGRATE_CMA
> >> 
> >>  arch/x86/include/asm/sparsemem.h  |2 +-
> >>  arch/x86/mm/highmem_32.c  |3 +
> >>  include/linux/cma.h   |9 ++
> >>  include/linux/gfp.h   |   31 +++---
> >>  include/linux/mempolicy.h |2 +-
> >>  include/linux/mm.h|1 +
> >>  include/linux/mmzone.h|   58 +-
> >>  include/linux/page-flags-layout.h |2 +
> >>  include/linux/vm_event_item.h |8 +-
> >>  include/linux/vmstat.h|   26 +
> >>  kernel/power/snapshot.c   |   15 +++
> >>  lib/show_mem.c|2 +-
> >>  mm/cma.c  |   70 ++--
> >>  mm/compaction.c   |6 +-
> >>  mm/highmem.c  |   12 +-
> >>  mm/hugetlb.c  |2 +-
> >>  mm/internal.h |3 +-
> >>  mm/memory_hotplug.c   |3 +
> >>  mm/mempolicy.c|3 +-
> >>  mm/page-writeback.c   |8 +-
> >>  mm/page_alloc.c   |  223 
> >> +
> >>  mm/page_isolation.c   |   14

[PATCH] crypto: RNGs must return 0 in success case

2015-03-05 Thread Stephan Mueller

Change the RNGs to always return 0 in success case.

This patch ensures that seqiv.c works with RNGs other than krng. seqiv
expects that any return code other than 0 is an error. Without the
patch, rfc4106(gcm(aes)) will not work when using a DRBG or an ANSI
X9.31 RNG.

Signed-off-by: Stephan Mueller 
---
 crypto/ansi_cprng.c  | 6 +-
 crypto/drbg.c| 7 ++-
 include/crypto/rng.h | 3 +--
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/crypto/ansi_cprng.c b/crypto/ansi_cprng.c
index 6f5bebc..765fe76 100644
--- a/crypto/ansi_cprng.c
+++ b/crypto/ansi_cprng.c
@@ -210,7 +210,11 @@ static int get_prng_bytes(char *buf, size_t nbytes, struct prng_context *ctx,
byte_count = DEFAULT_BLK_SZ;
}
 
-   err = byte_count;
+   /*
+* Return 0 in case of success as mandated by the kernel
+* crypto API interface definition.
+*/
+   err = 0;
 
dbgprint(KERN_CRIT "getting %d random bytes for context %p\n",
byte_count, ctx);
diff --git a/crypto/drbg.c b/crypto/drbg.c
index 56c1d7e..b69409c 100644
--- a/crypto/drbg.c
+++ b/crypto/drbg.c
@@ -1280,7 +1280,7 @@ static void drbg_restore_shadow(struct drbg_state *drbg,
  *   as defined in SP800-90A. The additional input is mixed into
  *   the state in addition to the pulled entropy.
  *
- * return: generated number of bytes
+ * return: 0 when all bytes are generated; < 0 in case of an error
  */
 static int drbg_generate(struct drbg_state *drbg,
 unsigned char *buf, unsigned int buflen,
@@ -1419,6 +1419,11 @@ static int drbg_generate(struct drbg_state *drbg,
}
 #endif
 
+   /*
+* All operations were successful, return 0 as mandated by
+* the kernel crypto API interface.
+*/
+   len = 0;
 err:
shadow->d_ops->crypto_fini(shadow);
drbg_restore_shadow(drbg, &shadow);
diff --git a/include/crypto/rng.h b/include/crypto/rng.h
index a16fb10..6e28ea5 100644
--- a/include/crypto/rng.h
+++ b/include/crypto/rng.h
@@ -103,8 +103,7 @@ static inline void crypto_free_rng(struct crypto_rng *tfm)
  * This function fills the caller-allocated buffer with random numbers using the
  * random number generator referenced by the cipher handle.
  *
- * Return: > 0 function was successful and returns the number of generated
- *bytes; < 0 if an error occurred
+ * Return: 0 function was successful; < 0 if an error occurred
  */
 static inline int crypto_rng_get_bytes(struct crypto_rng *tfm,
   u8 *rdata, unsigned int dlen)
-- 
2.1.0
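
To illustrate why a byte-count return value breaks a caller such as seqiv, here is a minimal user-space sketch in C. The names fake_rng_old(), fake_rng_new() and caller_get_iv() are hypothetical stand-ins and not kernel functions; the only things taken from the patch are the convention (0 on success, < 0 on error) and the fact that the caller treats any non-zero return as an error.

```c
#include <string.h>

/* Hypothetical RNG using the old convention: returns the number of bytes generated. */
static int fake_rng_old(unsigned char *buf, unsigned int len)
{
	memset(buf, 0xA5, len);	/* stand-in for real random data */
	return (int)len;
}

/* Hypothetical RNG using the new convention: 0 on success, < 0 on error. */
static int fake_rng_new(unsigned char *buf, unsigned int len)
{
	memset(buf, 0xA5, len);	/* stand-in for real random data */
	return 0;
}

typedef int (*rng_fn)(unsigned char *buf, unsigned int len);

/*
 * Caller modeled on the check described in the commit message:
 * any non-zero return code is treated as a failure.
 */
static int caller_get_iv(rng_fn rng, unsigned char *iv, unsigned int ivlen)
{
	int err = rng(iv, ivlen);

	if (err)
		return -1;	/* a positive byte count lands here as well */
	return 0;
}
```

With the old convention the caller fails even though the buffer was filled; with the new one it succeeds.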


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/2] f2fs: reduce searching region of segmap when set free section

2015-03-05 Thread Wanpeng Li

When freeing a segment, __set_free() checks whether all segments in its
section are free, in order to mark the section as free. But the search
region of the segmap runs from the start segno to the last segno of the
main area, which is unnecessary. So let's check only the segment bitmap
of the target section.

Signed-off-by: Wanpeng Li 
---
 fs/f2fs/segment.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 7fd3511..85d7fa7 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -336,7 +336,8 @@ static inline void __set_free(struct f2fs_sb_info *sbi, unsigned int segno)
clear_bit(segno, free_i->free_segmap);
free_i->free_segments++;
 
-   next = find_next_bit(free_i->free_segmap, MAIN_SEGS(sbi), start_segno);
+   next = find_next_bit(free_i->free_segmap,
+   start_segno + sbi->segs_per_sec, start_segno);
if (next >= start_segno + sbi->segs_per_sec) {
clear_bit(secno, free_i->free_secmap);
free_i->free_sections++;
-- 
1.9.1
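
The effect of bounding the search can be sketched in user-space C. This is only an illustration: next_set_bit() is a hypothetical, naive helper that mimics the return convention of find_next_bit() (it returns the size argument when no set bit is found), and section_is_free() derives the section start from the segment number, which is an assumption made to keep the example self-contained.

```c
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Mark one segment as in use. */
static void bitmap_set(unsigned long *map, unsigned int bit)
{
	map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

static bool bitmap_test(const unsigned long *map, unsigned int bit)
{
	return (map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/* Naive stand-in: index of the first set bit in [offset, size), or size if none. */
static unsigned int next_set_bit(const unsigned long *map,
				 unsigned int size, unsigned int offset)
{
	for (unsigned int i = offset; i < size; i++)
		if (bitmap_test(map, i))
			return i;
	return size;
}

/*
 * A section is free when none of its segments is in use. The search
 * stops at the end of the section instead of scanning the whole map.
 */
static bool section_is_free(const unsigned long *in_use,
			    unsigned int segno, unsigned int segs_per_sec)
{
	unsigned int start = segno / segs_per_sec * segs_per_sec;
	unsigned int end = start + segs_per_sec;

	return next_set_bit(in_use, end, start) >= end;
}
```

Because the search never runs past the end of the section, the result of the `next >= start_segno + sbi->segs_per_sec` test in the patch is unchanged; only the scanned range shrinks.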


[PATCH 1/2] f2fs: fix extent cache memory leak

2015-03-05 Thread Wanpeng Li

The extent tree/node slab caches are created during f2fs insmod;
however, they aren't destroyed during f2fs rmmod. This patch fixes
that by destroying the extent tree/node slab caches on rmmod.

Signed-off-by: Wanpeng Li 
---
 fs/f2fs/super.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index e649f21..0b8a2d8 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1323,6 +1323,7 @@ static void __exit exit_f2fs_fs(void)
remove_proc_entry("fs/f2fs", NULL);
f2fs_destroy_root_stats();
unregister_filesystem(&f2fs_fs_type);
+   destroy_extent_cache();
destroy_checkpoint_caches();
destroy_segment_manager_caches();
destroy_node_manager_caches();
-- 
1.9.1


Re: parent/child hierarchy for regulator

2015-03-05 Thread Peter Chen

On Thu, Mar 05, 2015 at 12:22:34PM +, Mark Brown wrote:
> On Thu, Mar 05, 2015 at 06:35:36PM +0800, Peter Chen wrote:
> 
> > Any good ways at code/dts to show parent/child hierarchy for regulator?
> 
> There's plenty of examples in mainline...
> 

Thanks, I found the answer for adding a parent regulator to a fixed
regulator. It is very easy: we only need to add
'vin-supply = <&parent_reg>;' to the fixed regulator's properties.

-- 

Best Regards,
Peter Chen

Re: [RFC 13/16] mm/cma: populate ZONE_CMA and use this zone when GFP_HIGHUSERMOVABLE

2015-03-05 Thread Joonsoo Kim

On Tue, Mar 03, 2015 at 01:58:46PM +0530, Aneesh Kumar K.V wrote:
> Joonsoo Kim  writes:
> 
> > Until now, reserved pages for CMA have been managed together with normal
> > pages in the same zone. This approach has numerous problems and fixing
> > them isn't easy. To fix this situation, ZONE_CMA was introduced in the
> > previous patch, but not yet populated. This patch implements population
> > of ZONE_CMA by stealing reserved pages from normal zones. This stealing
> > breaks one uncertain assumption about zones, namely that zones don't overlap.
> > Earlier in this series, checks were added to every zone span
> > iterator to handle zone overlap, so breaking this assumption
> > should cause no problems.
> >
> > To utilize this zone, user should use GFP_HIGHUSERMOVABLE, because
> > these pages are only applicable for movable type and ZONE_CMA could
> > contain highmem.
> >
> > Implementation itself is very easy to understand. Do steal when cma
> > area is initialized and recalculate values for per zone data structure.
> >
> > Signed-off-by: Joonsoo Kim 
> > ---
> >  include/linux/gfp.h |   10 --
> >  include/linux/mm.h  |1 +
> >  mm/cma.c|   23 ---
> >  mm/page_alloc.c |   42 +++---
> >  4 files changed, 64 insertions(+), 12 deletions(-)
> >
> > diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> > index 619eb20..d125440 100644
> > --- a/include/linux/gfp.h
> > +++ b/include/linux/gfp.h
> > @@ -186,6 +186,12 @@ static inline int gfpflags_to_migratetype(const gfp_t gfp_flags)
> >  #define OPT_ZONE_DMA32 ZONE_NORMAL
> >  #endif
> >  
> > +#ifdef CONFIG_CMA
> > +#define OPT_ZONE_CMA ZONE_CMA
> > +#else
> > +#define OPT_ZONE_CMA ZONE_MOVABLE
> > +#endif
> > +
> 
> Does that mean with CONFIG_CMA we always try ZONE_CMA first and then
> fall back to ZONE_MOVABLE? If so, won't we hit temporary CMA allocation
> failures that can result from pinned movable pages?

Hello, Aneesh.

IIUC, Johannes's fair allocation policy patchset makes us use
individual zones fairly. So, before the free pages in ZONE_CMA are
exhausted, ZONE_MOVABLE will be used. :)

Thanks.

Re: [RFC/PATCH 2/2] perf probe: Allow weak symbols to be probed

2015-03-05 Thread Masami Hiramatsu

(2015/03/04 22:52), Namhyung Kim wrote:
> It currently prevents adding probes in weak symbols.  But there're cases
> that given name is an only weak symbol so that we cannot add probe.
> 
>   $ perf probe -x /usr/lib/libc.so.6 -a calloc
>   Failed to find symbol calloc in /usr/lib/libc-2.21.so
> Error: Failed to add events.
> 
>   $ nm /usr/lib/libc.so.6 | grep calloc
>   0007b1f0 t __calloc
>   0007b1f0 T __libc_calloc
>   0007b1f0 W calloc
> 
> This change will result in duplicate probes when strong and weak symbols
> co-exist in a binary.  But I think it's not a big problem since probes
> at the weak symbol will never be hit anyway.
> 
> Cc: Masami Hiramatsu 
> Signed-off-by: Namhyung Kim 
> ---
>  tools/perf/util/probe-event.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 1c570c2fa7cc..12b7d018106e 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2339,8 +2339,7 @@ static int find_probe_functions(struct map *map, char *name)
>   struct symbol *sym;
>  
>   map__for_each_symbol_by_name(map, name, sym) {
> - if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL)
> - found++;
> + found++;
>   }

Ah, I've found this is the magic...
Here, we need another fix on my series.

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 22392b06..f9c1e53 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -309,10 +309,8 @@ static int find_alternative_probe_point(struct debuginfo *d

/* Find the address of given function */
map__for_each_symbol_by_name(map, pp->function, sym) {
-   if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) {
-   address = sym->start;
-   break;
-   }
+   address = sym->start;
+   break;
}
if (!address) {
ret = -ENOENT;
---

With this fix, I could get variables on waitpid and calloc.
-
# ./perf probe -x /lib64/libc-2.17.so -V waitpid
Available variables at waitpid
@<__libc_waitpid+0>
__pid_t pid
int oldtype
int options
int*stat_loc
-

I'll update it and include it in my series.

Thank you!
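
The effect of dropping the binding check can be sketched with a small self-contained C example. The table below mirrors the three calloc entries from the nm output quoted above; struct sym, enum bind and count_matches() are hypothetical simplifications and not the perf data structures. The enum values follow the ELF STB_LOCAL/STB_GLOBAL/STB_WEAK numbering.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the ELF symbol bindings STB_LOCAL (0), STB_GLOBAL (1), STB_WEAK (2). */
enum bind { BIND_LOCAL = 0, BIND_GLOBAL = 1, BIND_WEAK = 2 };

struct sym {
	const char *name;
	enum bind binding;
	unsigned long start;
};

/* The three symbols shown by "nm ... | grep calloc" in the quoted message. */
static const struct sym libc_syms[] = {
	{ "__calloc",      BIND_LOCAL,  0x7b1f0 },
	{ "__libc_calloc", BIND_GLOBAL, 0x7b1f0 },
	{ "calloc",        BIND_WEAK,   0x7b1f0 },
};

/*
 * Count symbols matching name. With allow_weak == false this applies the
 * old filter (only global or local bindings count); with allow_weak == true
 * every binding counts, as in the patched code.
 */
static int count_matches(const struct sym *syms, size_t n,
			 const char *name, bool allow_weak)
{
	int found = 0;

	for (size_t i = 0; i < n; i++) {
		if (strcmp(syms[i].name, name) != 0)
			continue;
		if (!allow_weak && syms[i].binding != BIND_GLOBAL &&
		    syms[i].binding != BIND_LOCAL)
			continue;
		found++;
	}
	return found;
}
```

With the old filter, a lookup of "calloc" finds nothing because the only symbol with that name is weak; without the filter it finds one match at the same address as __libc_calloc.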

>  
>   return found;
> @@ -2708,8 +2707,7 @@ static struct strfilter *available_func_filter;
>  static int filter_available_functions(struct map *map __maybe_unused,
> struct symbol *sym)
>  {
> - if ((sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) &&
> - strfilter__compare(available_func_filter, sym->name))
> + if (strfilter__compare(available_func_filter, sym->name))
>   return 0;
>   return 1;
>  }
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: [PATCH v9 14/21] ACPI / processor: Make it possible to get CPU hardware ID via GICC

2015-03-05 Thread Hanjun Guo

On 2015/3/5 23:19, Catalin Marinas wrote:
> On Thu, Mar 05, 2015 at 02:13:58PM +0100, Rafael J. Wysocki wrote:
>> On Thu, Mar 5, 2015 at 12:27 PM, Catalin Marinas
>>  wrote:
>>> On Thu, Mar 05, 2015 at 04:03:21PM +0800, Hanjun Guo wrote:
>>>> On 2015/3/5 6:46, Rafael J. Wysocki wrote:
>>>>> IMO, you really need to define phys_cpuid_t in a common place or people will
>>>>> forget that it may be 64-bit, because they'll only be looking at their arch.
>>>> Since x86 and ARM64 are using different types for phys_cpuid_t, we need to
>>>> introduce something like the following if we define it in a common place:
>>>>
>>>> in linux/acpi.h,
>>>>
>>>> #if defined(CONFIG_X86) || defined(CONFIG_IA64)
>>>> typedef u32 phys_cpuid_t;
>>>> #define PHYS_CPUID_INVALID (phys_cpuid_t)(-1)
>>>> #elif defined(CONFIG_ARM64)
>>>> typedef u64 phys_cpuid_t;
>>>> #define PHYS_CPUID_INVALID INVALID_HWID
>>>> #endif
>>>>
>>>> I think it's awful, did I miss something?
>> Well, you can define the type and PHYS_CPUID_INVALID in the arch
>> code and then do this in a common header:
>>
>> #ifndef PHYS_CPUID_INVALID
>> typedef u32 phys_cpuid_t;
>> #define PHYS_CPUID_INVALID (phys_cpuid_t)(-1)
>> #endif
>>
>> That would allow you to avoid the need to duplicate the
>> definitions where it is not necessary.
> It's fine by me.

I will update the patch.

Thanks
Hanjun
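
The fallback pattern Rafael suggests can be sketched as a single self-contained C file. This is an illustration under stated assumptions: ARCH_HAS_64BIT_HWID is a hypothetical switch that simulates an architecture header providing its own definitions, the 64-bit invalid value is a placeholder rather than the real INVALID_HWID, and phys_cpuid_is_valid() is a hypothetical helper. The macro is also fully parenthesized here, unlike the version quoted in the thread.

```c
#include <stdint.h>

/*
 * Simulated arch header: an architecture that needs a 64-bit hardware ID
 * defines both the type and PHYS_CPUID_INVALID itself.
 */
#ifdef ARCH_HAS_64BIT_HWID
typedef uint64_t phys_cpuid_t;
#define PHYS_CPUID_INVALID ((phys_cpuid_t)UINT64_MAX)	/* placeholder value */
#endif

/*
 * Simulated common header: provide the 32-bit default only when the
 * arch code has not already defined PHYS_CPUID_INVALID.
 */
#ifndef PHYS_CPUID_INVALID
typedef uint32_t phys_cpuid_t;
#define PHYS_CPUID_INVALID ((phys_cpuid_t)(-1))
#endif

/* Hypothetical helper so callers never compare against a raw constant. */
static int phys_cpuid_is_valid(phys_cpuid_t id)
{
	return id != PHYS_CPUID_INVALID;
}
```

Built without ARCH_HAS_64BIT_HWID, the type is 32 bits wide and the invalid value is 0xffffffff; architectures that only need the default do not have to repeat the definitions.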


RE: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked

2015-03-05 Thread Wu, Feng



> -Original Message-
> From: Marcelo Tosatti [mailto:mtosa...@redhat.com]
> Sent: Wednesday, March 04, 2015 8:06 PM
> To: Wu, Feng
> Cc: t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org;
> g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org;
> j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com;
> eric.au...@linaro.org; linux-kernel@vger.kernel.org;
> io...@lists.linux-foundation.org; k...@vger.kernel.org
> Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU
> is blocked
> 
> On Mon, Mar 02, 2015 at 01:36:51PM +, Wu, Feng wrote:
> >
> >
> > > -Original Message-
> > > From: Marcelo Tosatti [mailto:mtosa...@redhat.com]
> > > Sent: Friday, February 27, 2015 7:41 AM
> > > To: Wu, Feng
> > > Cc: t...@linutronix.de; mi...@redhat.com; h...@zytor.com;
> x...@kernel.org;
> > > g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org;
> > > j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com;
> > > eric.au...@linaro.org; linux-kernel@vger.kernel.org;
> > > io...@lists.linux-foundation.org; k...@vger.kernel.org
> > > Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when
> vCPU
> > > is blocked
> > >
> > > On Fri, Dec 12, 2014 at 11:14:58PM +0800, Feng Wu wrote:
> > > > This patch updates the Posted-Interrupts Descriptor when vCPU
> > > > is blocked.
> > > >
> > > > pre-block:
> > > > - Add the vCPU to the blocked per-CPU list
> > > > - Clear 'SN'
> > > > - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR
> > > >
> > > > post-block:
> > > > - Remove the vCPU from the per-CPU list
> > > >
> > > > Signed-off-by: Feng Wu 
> > > > ---
> > > >  arch/x86/include/asm/kvm_host.h |  2 +
> > > >  arch/x86/kvm/vmx.c  | 96
> > > +
> > > >  arch/x86/kvm/x86.c  | 22 +++---
> > > >  include/linux/kvm_host.h|  4 ++
> > > >  virt/kvm/kvm_main.c |  6 +++
> > > >  5 files changed, 123 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/arch/x86/include/asm/kvm_host.h
> > > b/arch/x86/include/asm/kvm_host.h
> > > > index 13e3e40..32c110a 100644
> > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > @@ -101,6 +101,8 @@ static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t
> > > base_gfn, int level)
> > > >
> > > >  #define ASYNC_PF_PER_VCPU 64
> > > >
> > > > +extern void (*wakeup_handler_callback)(void);
> > > > +
> > > >  enum kvm_reg {
> > > > VCPU_REGS_RAX = 0,
> > > > VCPU_REGS_RCX = 1,
> > > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > > > index bf2e6cd..a1c83a2 100644
> > > > --- a/arch/x86/kvm/vmx.c
> > > > +++ b/arch/x86/kvm/vmx.c
> > > > @@ -832,6 +832,13 @@ static DEFINE_PER_CPU(struct vmcs *,
> > > current_vmcs);
> > > >  static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
> > > >  static DEFINE_PER_CPU(struct desc_ptr, host_gdt);
> > > >
> > > > +/*
> > > > + * We maintain a per-CPU linked list of vCPUs, so in wakeup_handler() we
> > > > + * can find which vCPU should be woken up.
> > > > + */
> > > > +static DEFINE_PER_CPU(struct list_head, blocked_vcpu_on_cpu);
> > > > +static DEFINE_PER_CPU(spinlock_t, blocked_vcpu_on_cpu_lock);
> > > > +
> > > >  static unsigned long *vmx_io_bitmap_a;
> > > >  static unsigned long *vmx_io_bitmap_b;
> > > >  static unsigned long *vmx_msr_bitmap_legacy;
> > > > @@ -1921,6 +1928,7 @@ static void vmx_vcpu_load(struct kvm_vcpu
> *vcpu,
> > > int cpu)
> > > > struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
> > > > struct pi_desc old, new;
> > > > unsigned int dest;
> > > > +   unsigned long flags;
> > > >
> > > > memset(&old, 0, sizeof(old));
> > > > memset(&new, 0, sizeof(new));
> > > > @@ -1942,6 +1950,20 @@ static void vmx_vcpu_load(struct kvm_vcpu
> > > *vcpu, int cpu)
> > > > new.nv = POSTED_INTR_VECTOR;
> > > > } while (cmpxchg(&pi_desc->control, old.control,
> > > > new.control) != old.control);
> > > > +
> > > > +   /*
> > > > +* Delete the vCPU from the related wakeup queue
> > > > +* if we are resuming from blocked state
> > > > +*/
> > > > +   if (vcpu->blocked) {
> > > > +   vcpu->blocked = false;
> > > > +   
> > > > spin_lock_irqsave(&per_cpu(blocked_vcpu_on_cpu_lock,
> > > > +   vcpu->wakeup_cpu), flags);
> > > > +   list_del(&vcpu->blocked_vcpu_list);
> > > > +
>   spin_unlock_irqrestore(&per_cpu(blocked_vcpu_on_cpu_lock,
> > > > +   vcpu->wakeup_cpu), flags);
> > > > +   vcpu->wakeup_cpu = -1;
> > > > +   }
> > > > }
> > > >  }
> > > >
> > > > @@ -1950,6 +1972,9 @@ static void vmx_vcpu_put(struct kvm_vc

Re: [PATCH v9 02/21] ACPI / processor: Introduce phys_cpuid_t for CPU hardware ID

2015-03-05 Thread Hanjun Guo

On 2015/3/5 21:23, Rafael J. Wysocki wrote:
> On Thu, Mar 5, 2015 at 8:44 AM, Hanjun Guo  wrote:
>> On 2015/3/5 6:29, Rafael J. Wysocki wrote:
>>> On Wednesday, February 25, 2015 04:39:42 PM Hanjun Guo wrote:
> [cut]
>
>>>> @@ -190,7 +190,7 @@ int acpi_map_cpuid(int phys_id, u32 acpi_id)
>>>>  if (nr_cpu_ids <= 1 && acpi_id == 0)
>>>>  return acpi_id;
>>>>  else
>>>> -return phys_id;
>>>> +return -1;
>>> Can we use a proper error code here?
>> I'm afraid not. In the ACPI processor drivers, -1 is treated as an
>> invalid logical CPU number. If we return an error code here, we need
>> to modify multiple places from "if (cpu_logical_num == -1)" to
> Oh, silly stuff.
>
>> "if (! (cpu_logical_num < 0))" too, so for me, I prefer to keep it as
>> -1, but I'm open for suggestions.
> OK
>
> I think we need something like invalid_logical_cpuid() and use it
> in all of those checks instead of the direct comparisons, but we
> can make those changes later.

OK, I've added this to my TODO list, thanks for the suggestion.

Thanks
Hanjun
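
A helper along the lines Rafael describes could look like the sketch below. It is only an illustration: the body of invalid_logical_cpuid() is an assumption (treating every negative value as invalid), and map_cpuid_sketch() is a hypothetical stand-in that mirrors just the hunk quoted above, not the real acpi_map_cpuid().

```c
#include <stdbool.h>

/*
 * Assumed semantics: any negative value is an invalid logical CPU number.
 * Callers use this instead of comparing against -1 directly.
 */
static inline bool invalid_logical_cpuid(int cpu)
{
	return cpu < 0;
}

/* Hypothetical stand-in for the tail of the mapping function in the hunk. */
static int map_cpuid_sketch(int nr_cpu_ids, unsigned int acpi_id)
{
	if (nr_cpu_ids <= 1 && acpi_id == 0)
		return (int)acpi_id;
	return -1;	/* could later become a proper error code */
}
```

Once every caller goes through the helper, the mapping function can return a negative errno value instead of -1 without any caller needing to change.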


Re: [PATCH 0/17] crypto: talitos - Add support for SEC1

2015-03-05 Thread leroy christophe




On 06/03/2015 01:21, Kim Phillips wrote:

On Thu, 5 Mar 2015 17:46:05 +0100
Christophe Leroy  wrote:


[15/17] crypto: talitos - Implementation of SEC1

...


[16/17] crypto: talitos - SEC1 bugs on 0 data hash
[17/17] crypto: talitos - Update DT bindings with SEC1

This patchseries doesn't apply, at least on top of Herbert's
cryptodev-2.6 tree, as of today:

Applying: crypto: talitos - Implementation of SEC1
error: patch failed: drivers/crypto/talitos.c:655
error: drivers/crypto/talitos.c: patch does not apply

It applied OK on linux-next as of yesterday.
I will rebase the series on cryptodev-2.6.

Christophe


Re: [PATCH v2 1/4] i2c: sunxi: Add Reduced Serial Bus (RSB) support

2015-03-05 Thread Wolfram Sang


> From that regard, RSB is a multiple device bus, using addresses, just
> like I2C. The way it communicates is basically the one used by P2WI.

I am not keen to allow everything which "is a bus and has addresses"
into the I2C realm. The addresses are 12 bit, whilst I2C has at maximum
10 bit which is rarely used, so mostly 7 bit are used. It has a runtime
readdressing mechanism which is not present in standard I2C. And if you
look at the protocol with no acks but parities, IMO it doesn't look
closer to I2C than to other two wire protocols. So, being in I2C needs
more arguments.

And while the outcome could be that it really makes sense to add RSB to
I2C with I2C_FUNCS_RSB added, it could also be that there is a more
suitable place for custom busses in the kernel.

Also, the fact that P2WI is in I2C is not an argument IMO. It could have
been a mistake to pick it up.

> So really, it just is more I2C-alike than P2WI has ever been.

Because it has addresses? I disagree.

> Good thing that we are not talking about a full review then, but more
> a philosophical discussion.

Exactly. This is why I wanted to bring this in early.



signature.asc
Description: Digital signature

Re: [RFC/PATCH 2/2] perf probe: Allow weak symbols to be probed

2015-03-05 Thread Masami Hiramatsu

(2015/03/06 15:15), Namhyung Kim wrote:
> Hi Masami,
> 
> On Thu, Mar 05, 2015 at 12:57:21AM +0900, Masami Hiramatsu wrote:
>> (2015/03/04 22:52), Namhyung Kim wrote:
>>> It currently prevents adding probes in weak symbols.  But there're cases
>>> that given name is an only weak symbol so that we cannot add probe.
>>>
>>>   $ perf probe -x /usr/lib/libc.so.6 -a calloc
>>>   Failed to find symbol calloc in /usr/lib/libc-2.21.so
>>> Error: Failed to add events.
>>>
>>>   $ nm /usr/lib/libc.so.6 | grep calloc
>>>   0007b1f0 t __calloc
>>>   0007b1f0 T __libc_calloc
>>>   0007b1f0 W calloc
>>>
>>> This change will result in duplicate probes when strong and weak symbols
>>> co-exist in a binary.  But I think it's not a big problem since probes
>>> at the weak symbol will never be hit anyway.
>>
>> Hmm, even on my previous series, I got an error with calloc and waitpid.
>>
>> $ ./perf probe -x /usr/lib64/libc-2.17.so -vvV calloc
>> probe-definition(0): calloc
>> symbol:calloc file:(null) line:0 offset:0 return:0 lazy:(null)
>> 0 arguments
>> Open Debuginfo file: /usr/lib/debug/usr/lib64/libc-2.17.so.debug
>> Searching variables at calloc
>> Failed to find the address of calloc
>>   Error: Failed to show vars. Reason: No such file or directory (Code: -2)
>>
>> However, it seems that calloc is loaded as a symbol.
>>
>> $ ./perf probe -x /usr/lib64/libc-2.17.so -V calloc
>> ...
>> symbol__new: __xstat64 0xe7340-0xe7385
>> symbol__new: calloc 0x80a90-0x80d2a
>> symbol__new: msgget 0xf7940-0xf7961
>> ...
>>
>> FYI, without these patches, I see the same result (calloc is loaded)
> 
> I'm bit confused with the English ;-).  So you mean that now you *can*
> probe calloc and waitpid with this patch, right?

Ah, sorry for confusing you. I meant that I couldn't probe it
even with your patch. I'm not sure why: the calloc symbol is
created, as the messages above show, but it can't be found in the map...

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: [cgroup] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives()

2015-03-05 Thread Fengguang Wu

Hi Vladimir,

On Fri, Mar 06, 2015 at 09:09:37AM +0300, Vladimir Davydov wrote:
> Hi,
> 
> This bug should have been fixed by "[PATCH -next] cpuset: initialize
> cpuset a bit early":
> 
> http://www.spinics.net/lists/cgroups/msg12599.html

OK, sorry for the late report! I only searched for the full commit id
when looking for possible duplicates; I should check the patch subject, too.

Thanks,
Fengguang

> On Fri, Mar 06, 2015 at 01:57:58PM +0800, Fengguang Wu wrote:
> > [0.021989] [ cut here ]
> > [0.022816] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives+0x25/0x2e()
> > [0.024000] You're using static_cpu_has before alternatives have run!
> > [0.024000] Modules linked in:
> > 
> > [0.024000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0-rc1-4-g295458e #455
> > [0.024000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
> > [0.024000]  0009
> > [0.024000]  0009 81e03cc8 81e03cc8 
> > 81674d02 81674d02 810ca88e 810ca88e
> > 
> > [0.024000]  81e03d18
> > [0.024000]  81e03d18 81e03d08 81e03d08 
> > 81073d6f 81073d6f  
> > 
> > [0.024000]  81018f79
> > [0.024000]  81018f79 81e03e38 81e03e38 
> >   0002 0002
> > 
> > [0.024000] Call Trace:
> > [0.024000]  [] dump_stack+0xa0/0xd5
> > [0.024000]  [] ? console_unlock+0x496/0x4ef
> > [0.024000]  [] warn_slowpath_common+0xc8/0xf7
> > [0.024000]  [] ? warn_pre_alternatives+0x25/0x2e
> > [0.024000]  [] warn_slowpath_fmt+0x4f/0x58
> > [0.024000]  [] ? native_iret+0x7/0x7
> > [0.024000]  [] warn_pre_alternatives+0x25/0x2e
> > [0.024000]  [] __do_page_fault+0x2b4/0x7c2
> > [0.024000]  [] do_page_fault+0x3e/0x4a
> > [0.024000]  [] do_async_page_fault+0x3a/0xb9
> > [0.024000]  [] async_page_fault+0x28/0x30
> > [0.024000]  [] ? cpumask_copy+0x2c/0x2f
> > [0.024000]  [] ? cpuset_bind+0x5b/0xc4
> > [0.024000]  [] cgroup_init+0x2fa/0x3d3
> > [0.024000]  [] start_kernel+0x6ed/0x755
> > [0.024000]  [] ? early_idt_handlers+0x120/0x120
> > [0.024000]  [] x86_64_start_reservations+0x46/0x4f
> > [0.024000]  [] x86_64_start_kernel+0x1b0/0x1c6
> > [0.024000] ---[ end trace 37d9a871c47a31bc ]---

Re: [PATCH stable 3.10, 3.12, 3.14] MIPS: Export FP functions used by lose_fpu(1) for KVM

2015-03-05 Thread Greg Kroah-Hartman

On Thu, Mar 05, 2015 at 04:08:44PM +, James Hogan wrote:
> [ Upstream commit 3ce465e04bfd8de9956d515d6e9587faac3375dc ]
> 
> Export the _save_fp asm function used by the lose_fpu(1) macro to GPL
> modules so that KVM can make use of it when it is built as a module.
> 
> This fixes the following build error when CONFIG_KVM=m due to commit
> f798217dfd03 ("KVM: MIPS: Don't leak FPU/DSP to guest"):
> 
> ERROR: "_save_fp" [arch/mips/kvm/kvm.ko] undefined!
> 
> Signed-off-by: James Hogan 
> Fixes: f798217dfd03 (KVM: MIPS: Don't leak FPU/DSP to guest)
> Cc: Paolo Bonzini 
> Cc: Ralf Baechle 
> Cc: Paul Burton 
> Cc: Gleb Natapov 
> Cc: k...@vger.kernel.org
> Cc: linux-m...@linux-mips.org
> Cc:  # 3.10...3.15
> Patchwork: https://patchwork.linux-mips.org/patch/9260/
> Signed-off-by: Ralf Baechle 
> [james.ho...@imgtec.com: Only export when CPU_R4K_FPU=y prior to v3.16,
>  so as not to break the Octeon build which excludes FPU support. KVM
>  depends on MIPS32r2 anyway.]
> Signed-off-by: James Hogan 
> ---
> Apologies for the previous cavium_octeon_defconfig link breakage.
> Octeon has the symbol since 3.16, but not before. This backport should
> do the trick for stable 3.10, 3.12, and 3.14. Build tested with
> cavium_octeon_defconfig and malta_kvm_defconfig on those stable
> branches.
> ---
>  arch/mips/kernel/mips_ksyms.c | 8 
>  1 file changed, 8 insertions(+)

Now fixed up, thanks.

greg k-h

[V5 PATCH 0/2] Introduce ACPI support for ahci_platform driver

2015-03-05 Thread Suravee Suthikulpanit

This patch series introduces ACPI support for the AHCI platform driver.
Existing ACPI support for AHCI assumes the device controller is a PCI device.
Since there is no ACPI _CID for a generic AHCI controller, the driver
could not use it for matching devices. Therefore, this patch introduces
a mechanism for drivers to match devices using the ACPI _CLS method.
_CLS contains the PCI-defined class code.

This patch series also modifies the ACPI modalias to add the class code to the
existing format, which currently only uses _HID and _CIDs. This is required
to support loadable modules with _CLS.
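
As a rough illustration of class-code matching, the sketch below packs the three PCI class-code fields (base class, subclass, programming interface) into one 24-bit value and compares it against a table entry. struct cls_id, make_class_code() and cls_match() are hypothetical names, and the mask field is an assumption added for the example; none of this is the interface introduced by the series. The value 0x010601 is the PCI class code for a SATA controller with the AHCI programming interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical match-table entry: class code plus a mask of significant bits. */
struct cls_id {
	uint32_t cls;
	uint32_t cls_msk;
};

/* Pack base class, subclass and programming interface into 24 bits. */
static uint32_t make_class_code(uint8_t base, uint8_t sub, uint8_t prog_if)
{
	return ((uint32_t)base << 16) | ((uint32_t)sub << 8) | prog_if;
}

/* A device matches when all bits selected by the mask are equal. */
static bool cls_match(const struct cls_id *id, uint32_t dev_cls)
{
	return id->cls_msk != 0 && ((dev_cls ^ id->cls) & id->cls_msk) == 0;
}
```

A full mask (0xffffff) requires an exact class-code match, while masking out the low byte would accept any programming interface within the same base class and subclass.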

This patch series is rebased from and tested with:

http://git.linaro.org/leg/acpi/acpi.git acpi-5.1-v9

This topic was discussed earlier here (as part of introducing support for
AMD Seattle SATA controller):

http://marc.info/?l=linux-arm-kernel&m=141083492521584&w=2

Changes from V4 (https://lkml.org/lkml/2015/3/2/56)
* [1/2] Bug fixed: Reorder the declaration of
  struct acpi_pnp_device_id cls in the struct acpi_device_info
  (include/acpi/actypes.h) since compatible_id_list must be last one.
* [2/2] Added Acked-by: Tejun Heo 

Changes from V3 (https://lkml.org/lkml/2015/2/8/106)
* Instead of introducing new structure acpi_device_cls, add cls into
  the acpi_device_id, and modify the __acpi_match_device
  to also match for cls. (per Mika suggestion.)
* Add loadable module support, which requires changes in ACPI
  modalias. (per Mika suggestion.)
* Rebased and tested with acpi-5.1-v9

Changes from V2 (https://lkml.org/lkml/2015/1/5/662)
* Update with review comment from Rafael in patch 1/2
* Rebased and tested with acpi-5.1-v8

Changes from V1 (https://lkml.org/lkml/2014/12/19/345)
* Rebased to 3.19.0-rc2
* Change from acpi_cls in device_driver to acpi_match_cls (Hanjun comment)
* Change the matching logic in acpi_driver_match_device() due to the new
  special PRP0001 _HID.
* Simplify the return type of acpi_match_device_cls() to boolean.

Changes from RFC (https://lkml.org/lkml/2014/12/17/446)
* Remove #ifdef and make non-ACPI version of the acpi_match_device_cls
  as inline. (per Arnd)
* Simplify logic to retrieve and evaluate _CLS handle. (per Hanjun)

Suravee Suthikulpanit (2):
  ACPI / scan: Add support for ACPI _CLS device matching
  ata: ahci_platform: Add ACPI _CLS matching

 drivers/acpi/acpica/acutils.h |  3 ++
 drivers/acpi/acpica/nsxfname.c| 21 ++--
 drivers/acpi/acpica/utids.c   | 71 +++
 drivers/acpi/scan.c   | 17 --
 drivers/ata/Kconfig   |  2 +-
 drivers/ata/ahci_platform.c   |  9 +
 include/acpi/acnames.h|  1 +
 include/acpi/actypes.h|  4 ++-
 include/linux/mod_devicetable.h   |  1 +
 scripts/mod/devicetable-offsets.c |  1 +
 scripts/mod/file2alias.c  | 13 +--
 11 files changed, 134 insertions(+), 9 deletions(-)

-- 
2.1.0


[V5 PATCH 1/2] ACPI / scan: Add support for ACPI _CLS device matching

2015-03-05 Thread Suravee Suthikulpanit

Device drivers typically use ACPI _HIDs/_CIDs listed in struct device_driver
acpi_match_table to match devices. However, for generic drivers, we do not
want to list _HID for all supported devices. Also, certain classes of devices
do not have _CID (e.g. SATA, USB). Instead, we can leverage ACPI _CLS,
which specifies PCI-defined class code (i.e. base-class, subclass and
programming interface). This patch adds support for matching ACPI devices using
the _CLS method.

To support loadable modules, the current design uses _HID or _CID to match a
device's modalias. Matching with _CLS requires modifying the current ACPI
modalias key to include _CLS. This patch appends the PCI-defined class code
to the existing ACPI modalias as follows.

acpi:<HID>:<CID1>:<CID2>:..:<CIDn>:<bbsspp>:
E.g:
# cat /sys/devices/platform/AMDI0600:00/modalias
acpi:AMDI0600:010601:

where bb is the base-class code, ss is the sub-class code, and pp is the
programming interface code.
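
As a rough illustration of the class-code suffix described above, here is a
minimal userspace sketch. It is not code from the patch: the helper name
format_acpi_modalias() is hypothetical, it only handles a device with a _HID
and no _CIDs, and the two-hex-digits-per-field formatting is an assumption
based on the "010601" example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): build the modalias string
 * "acpi:<HID>:<bbsspp>:" for a device that has a _HID, a _CLS and no _CIDs.
 * bb = base class, ss = sub-class, pp = programming interface.
 * Returns the string length, or -1 if buf is too small.
 */
static int format_acpi_modalias(char *buf, size_t len, const char *hid,
                                unsigned char bb, unsigned char ss,
                                unsigned char pp)
{
    /* Hex width and case are assumptions taken from the 010601 example. */
    int n = snprintf(buf, len, "acpi:%s:%02X%02X%02X:", hid, bb, ss, pp);

    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

Called with "AMDI0600" and the bytes 0x01, 0x06, 0x01, this should produce
the same string as the modalias shown in the example above.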

Since there would not be _HID/_CID in the ACPI matching table of the driver,
this patch adds a field to acpi_device_id to specify the matching _CLS.

static const struct acpi_device_id ahci_acpi_match[] = {
{ "", 0, PCI_CLASS_STORAGE_SATA_AHCI },
{},
};

In this case, the corresponding entry in the modules.alias file would be:

alias acpi*:010601:* ahci_platform
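
To see why that wildcard alias would pick up the example device, here is a
small userspace sketch. It uses POSIX fnmatch() as a stand-in for the
glob-style matching done at module-load time; that the module tools behave
exactly like fnmatch() is an assumption, and alias_matches() is a
hypothetical helper.

```c
#include <fnmatch.h>

/*
 * Hypothetical helper: return 1 if a device modalias matches a
 * modules.alias pattern, 0 otherwise. fnmatch() is only a stand-in here.
 */
static int alias_matches(const char *pattern, const char *modalias)
{
    return fnmatch(pattern, modalias, 0) == 0;
}
```

With the pattern "acpi*:010601:*", the modalias "acpi:AMDI0600:010601:" from
the earlier example matches regardless of the _HID, while a modalias carrying
a different class code does not.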

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/acpi/acpica/acutils.h |  3 ++
 drivers/acpi/acpica/nsxfname.c| 21 ++--
 drivers/acpi/acpica/utids.c   | 71 +++
 drivers/acpi/scan.c   | 17 --
 include/acpi/acnames.h|  1 +
 include/acpi/actypes.h|  4 ++-
 include/linux/mod_devicetable.h   |  1 +
 scripts/mod/devicetable-offsets.c |  1 +
 scripts/mod/file2alias.c  | 13 +--
 9 files changed, 124 insertions(+), 8 deletions(-)

diff --git a/drivers/acpi/acpica/acutils.h b/drivers/acpi/acpica/acutils.h
index c2f03e8..2aef850 100644
--- a/drivers/acpi/acpica/acutils.h
+++ b/drivers/acpi/acpica/acutils.h
@@ -430,6 +430,9 @@ acpi_status
 acpi_ut_execute_CID(struct acpi_namespace_node *device_node,
struct acpi_pnp_device_id_list ** return_cid_list);
 
+acpi_status
+acpi_ut_execute_CLS(struct acpi_namespace_node *device_node,
+   struct acpi_pnp_device_id **return_id);
 /*
  * utlock - reader/writer locks
  */
diff --git a/drivers/acpi/acpica/nsxfname.c b/drivers/acpi/acpica/nsxfname.c
index d66c326..590ef06 100644
--- a/drivers/acpi/acpica/nsxfname.c
+++ b/drivers/acpi/acpica/nsxfname.c
@@ -276,11 +276,12 @@ acpi_get_object_info(acpi_handle handle,
struct acpi_pnp_device_id *hid = NULL;
struct acpi_pnp_device_id *uid = NULL;
struct acpi_pnp_device_id *sub = NULL;
+   struct acpi_pnp_device_id *cls = NULL;
char *next_id_string;
acpi_object_type type;
acpi_name name;
u8 param_count = 0;
-   u8 valid = 0;
+   u16 valid = 0;
u32 info_size;
u32 i;
acpi_status status;
@@ -320,7 +321,7 @@ acpi_get_object_info(acpi_handle handle,
if ((type == ACPI_TYPE_DEVICE) || (type == ACPI_TYPE_PROCESSOR)) {
/*
 * Get extra info for ACPI Device/Processor objects only:
-* Run the Device _HID, _UID, _SUB, and _CID methods.
+* Run the Device _HID, _UID, _SUB, _CID and _CLS methods.
 *
 * Note: none of these methods are required, so they may or may
 * not be present for this device. The Info->Valid bitfield is used
@@ -351,6 +352,14 @@ acpi_get_object_info(acpi_handle handle,
valid |= ACPI_VALID_SUB;
}
 
+   /* Execute the Device._CLS method */
+
+   status = acpi_ut_execute_CLS(node, &cls);
+   if (ACPI_SUCCESS(status)) {
+   info_size += cls->length;
+   valid |= ACPI_VALID_CLS;
+   }
+
/* Execute the Device._CID method */
 
status = acpi_ut_execute_CID(node, &cid_list);
@@ -468,6 +477,11 @@ acpi_get_object_info(acpi_handle handle,
sub, next_id_string);
}
 
+   if (cls) {
+   next_id_string = acpi_ns_copy_device_id(&info->cls,
+   cls, next_id_string);
+   }
+
if (cid_list) {
info->compatible_id_list.count = cid_list->count;
info->compatible_id_list.list_size = cid_list->list_size;
@@ -507,6 +521,9 @@ cleanup:
if (sub) {
ACPI_FREE(sub);
}
+   if (cls) {
+   ACPI_FREE(cls);
+   }
if (cid_list) {
ACPI_FREE(cid_list);
}
diff --git a/drivers/acpi/acpica/utids.c b/drivers/acpi/acpica/utids.c
index 27431cf..a64b5d1 100644
--- a/drivers/acpi/acpi

[PATCH 1/2] arm64: mediatek: Select PINCTRL for Mediatek platform

2015-03-05 Thread Yingjoe Chen

These 2 patches are fixups for the MT8173 pinctrl driver:
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-January/320066.html

The arm64 maintainers don't want to add MACH_* in Kconfig, so this patch
replaces the first one in that series.

Matthias,
Can you take this one?

--
MediaTek SoCs expect to work with a pinctrl driver.
Select PINCTRL if ARCH_MEDIATEK is selected.

Signed-off-by: Yingjoe Chen 
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e627ead..a2ddd3f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -151,6 +151,7 @@ menu "Platform selection"
 config ARCH_MEDIATEK
bool "Mediatek MT65xx & MT81xx ARMv8 SoC"
select ARM_GIC
+   select PINCTRL
help
  Support for Mediatek MT65xx & MT81xx ARMv8 SoCs
 
-- 
1.8.1.1.dirty


[PATCH 2/2] pinctrl: mediatek: Adjust mt8173 pinctrl kconfig

2015-03-05 Thread Yingjoe Chen

Linus,
This one makes the PINCTRL_MT8173 option user selectable and is based on
mtk-staging in your tree. If you think this is OK, please apply it or
squash it into the previous change. Thanks.

--
ARM64 maintainer doesn't want to add MACH_* for each SoC.
Adjust mt8173 pinctrl kconfig entry so user can manually select it.

Also make PINCTRL_MT8135 build when COMPILE_TEST is enabled.

Signed-off-by: Yingjoe Chen 
---
 drivers/pinctrl/mediatek/Kconfig | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/pinctrl/mediatek/Kconfig b/drivers/pinctrl/mediatek/Kconfig
index 49b8649..1472f0e 100644
--- a/drivers/pinctrl/mediatek/Kconfig
+++ b/drivers/pinctrl/mediatek/Kconfig
@@ -1,4 +1,4 @@
-if ARCH_MEDIATEK
+if ARCH_MEDIATEK || COMPILE_TEST
 
 config PINCTRL_MTK_COMMON
bool
@@ -8,11 +8,13 @@ config PINCTRL_MTK_COMMON
select OF_GPIO
 
 config PINCTRL_MT8135
-   def_bool MACH_MT8135
+   def_bool MACH_MT8135 || COMPILE_TEST
select PINCTRL_MTK_COMMON
 
 config PINCTRL_MT8173
-   def_bool MACH_MT8173
+   bool "Mediatek MT8173 pin control"
+   def_bool y
+   depends on ARM64 || COMPILE_TEST
select PINCTRL_MTK_COMMON
 
 endif
-- 
1.8.1.1.dirty


Re: [RFC/PATCH 2/2] perf probe: Allow weak symbols to be probed

2015-03-05 Thread Namhyung Kim

Hi Masami,

On Thu, Mar 05, 2015 at 12:57:21AM +0900, Masami Hiramatsu wrote:
> (2015/03/04 22:52), Namhyung Kim wrote:
> > It currently prevents adding probes in weak symbols.  But there're cases
> > that given name is an only weak symbol so that we cannot add probe.
> > 
> >   $ perf probe -x /usr/lib/libc.so.6 -a calloc
> >   Failed to find symbol calloc in /usr/lib/libc-2.21.so
> > Error: Failed to add events.
> > 
> >   $ nm /usr/lib/libc.so.6 | grep calloc
> >   0007b1f0 t __calloc
> >   0007b1f0 T __libc_calloc
> >   0007b1f0 W calloc
> > 
> > This change will result in duplicate probes when strong and weak symbols
> > co-exist in a binary.  But I think it's not a big problem since probes
> > at the weak symbol will never be hit anyway.
> 
> Hmm, even on my previous series, I got an error with calloc and waitpid.
> 
> $ ./perf probe -x /usr/lib64/libc-2.17.so -vvV calloc
> probe-definition(0): calloc
> symbol:calloc file:(null) line:0 offset:0 return:0 lazy:(null)
> 0 arguments
> Open Debuginfo file: /usr/lib/debug/usr/lib64/libc-2.17.so.debug
> Searching variables at calloc
> Failed to find the address of calloc
>   Error: Failed to show vars. Reason: No such file or directory (Code: -2)
> 
> However, it seems that calloc is loaded as a symbol.
> 
> $ ./perf probe -x /usr/lib64/libc-2.17.so -V calloc
> ...
> symbol__new: __xstat64 0xe7340-0xe7385
> symbol__new: calloc 0x80a90-0x80d2a
> symbol__new: msgget 0xf7940-0xf7961
> ...
> 
> FYI, without these patches, I see the same result (calloc is loaded)

I'm a bit confused by the English ;-).  So you mean that now you *can*
probe calloc and waitpid with this patch, right?

Thanks,
Namhyung

Re: [PATCH 1/5] mtd: nand: vf610_nfc: Freescale NFC for VF610, MPC5125 and others

2015-03-05 Thread Sascha Hauer

Hi Stefan,

On Thu, Mar 05, 2015 at 12:10:20AM +0100, Stefan Agner wrote:
> +
> +static int vf610_nfc_probe_dt(struct device *dev, struct vf610_nfc_config *cfg)
> +{
> + struct device_node *np = dev->of_node;
> + int buswidth;
> + u32 clkrate;
> +
> + if (!np)
> + return 1;
> +
> + cfg->flash_bbt = of_get_nand_on_flash_bbt(np);
> +
> + if (!of_property_read_u32(np, "clock-frequency", &clkrate))
> + cfg->clkrate = clkrate;

Normally the clock-frequency property tells the driver at which
frequency the device actually is running, not to tell the driver at
which frequency the device *should* run. It's strange to use the value
of the clock-frequency property as input to clk_set_rate(). Maybe the
assigned clock binding is more appropriate here, see
Documentation/devicetree/bindings/clock/clock-bindings.txt.

BTW the above can more easily be written as:

of_property_read_u32(np, "clock-frequency", &cfg->clkrate);

No return value checking necessary.
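
The reason the check can go is that the read helper only writes to its output
argument when it succeeds, so a missing property leaves the previous value in
place. A minimal userspace model of that contract, with a hypothetical
struct prop table and read_u32_prop() standing in for the device tree and
of_property_read_u32():

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device tree property. */
struct prop {
    const char *name;
    uint32_t value;
};

/*
 * Model of the "write only on success" contract: returns 0 and stores the
 * value if the property exists, otherwise returns -1 and leaves *out alone.
 */
static int read_u32_prop(const struct prop *props, size_t n,
                         const char *name, uint32_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].name, name) == 0) {
            *out = props[i].value;
            return 0;
        }
    }
    return -1;
}
```

With this contract a caller can pass the address of a field that already
holds its default and ignore the return value. In the driver this would also
assume that cfg->clkrate is a u32.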

> +static int vf610_nfc_probe(struct platform_device *pdev)
> +{
> + struct vf610_nfc *nfc;
> + struct resource *res;
> + struct mtd_info *mtd;
> + struct nand_chip *chip;
> + struct vf610_nfc_config *cfg;
> + int err = 0;
> + int page_sz;
> + int irq;
> +
> + nfc = devm_kzalloc(&pdev->dev, sizeof(*nfc), GFP_KERNEL);
> + if (!nfc)
> + return -ENOMEM;
> +
> + nfc->cfg = devm_kzalloc(&pdev->dev, sizeof(*nfc), GFP_KERNEL);
> + if (!nfc->cfg)
> + return -ENOMEM;
> + cfg = nfc->cfg;

Why is nfc->cfg allocated separately instead of embedding it into struct
vf610_nfc? Is this some platform_data leftover you can remove now?

> +
> + nfc->dev = &pdev->dev;
> + nfc->page = -1;
> + mtd = &nfc->mtd;
> + chip = &nfc->chip;
> +
> + mtd->priv = chip;
> + mtd->owner = THIS_MODULE;
> + mtd->dev.parent = nfc->dev;
> + mtd->name = DRV_NAME;
> +
> + err = vf610_nfc_probe_dt(nfc->dev, cfg);
> + if (err)
> + return -ENODEV;

Does this driver work without device tree or not? Currently the driver
bails out when device tree support is enabled but no device node is
given. When device tree support is disabled in the kernel though the
driver happily continues here.

Sascha

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |

[PATCH v2 1/2] net/macb: unify clock management

2015-03-05 Thread Boris Brezillon

From: Cyrille Pitchen 

Most of the functions from the Common Clk Framework handle NULL pointer as
input argument.

Since the TX clock is optional, we now set tx_clk to NULL value
instead of ERR_PTR(-ENOENT) when this clock is not available. This simplifies
the clock management and avoids the need to test tx_clk value.
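
The pattern this relies on can be modelled in userspace as below. This is a
toy sketch, not the Common Clk Framework itself: struct toy_clk and the
toy_clk_*() functions are hypothetical stand-ins that only mimic the
"NULL means no clock, treat it as a no-op" behaviour described above.

```c
#include <stddef.h>

/* Toy stand-in for a clock handle; only tracks an enable count. */
struct toy_clk {
    int enable_count;
};

/* A NULL clock is accepted and treated as "nothing to do". */
static int toy_clk_prepare_enable(struct toy_clk *clk)
{
    if (!clk)
        return 0;
    clk->enable_count++;
    return 0;
}

static void toy_clk_disable_unprepare(struct toy_clk *clk)
{
    if (!clk)
        return;
    clk->enable_count--;
}
```

Because both helpers tolerate NULL, callers can invoke them unconditionally
for an optional clock, which is what lets the IS_ERR() checks in the diff
below go away.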

Signed-off-by: Cyrille Pitchen 
Acked-by: Boris Brezillon 
Acked-by: Alexandre Belloni 
---
 drivers/net/ethernet/cadence/macb.c | 31 ++-
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index a7dbf04..a429cf8 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -213,6 +213,9 @@ static void macb_set_tx_clk(struct clk *clk, int speed, struct net_device *dev)
 {
long ferr, rate, rate_rounded;
 
+   if (!clk)
+   return;
+
switch (speed) {
case SPEED_10:
rate = 250;
@@ -292,8 +295,7 @@ static void macb_handle_link_change(struct net_device *dev)
 
spin_unlock_irqrestore(&bp->lock, flags);
 
-   if (!IS_ERR(bp->tx_clk))
-   macb_set_tx_clk(bp->tx_clk, phydev->speed, dev);
+   macb_set_tx_clk(bp->tx_clk, phydev->speed, dev);
 
if (status_change) {
if (phydev->link) {
@@ -2244,6 +2246,8 @@ static int macb_probe(struct platform_device *pdev)
}
 
tx_clk = devm_clk_get(&pdev->dev, "tx_clk");
+   if (IS_ERR(tx_clk))
+   tx_clk = NULL;
 
err = clk_prepare_enable(pclk);
if (err) {
@@ -2257,13 +2261,10 @@ static int macb_probe(struct platform_device *pdev)
goto err_out_disable_pclk;
}
 
-   if (!IS_ERR(tx_clk)) {
-   err = clk_prepare_enable(tx_clk);
-   if (err) {
-   dev_err(&pdev->dev, "failed to enable tx_clk (%u)\n",
-   err);
-   goto err_out_disable_hclk;
-   }
+   err = clk_prepare_enable(tx_clk);
+   if (err) {
+   dev_err(&pdev->dev, "failed to enable tx_clk (%u)\n", err);
+   goto err_out_disable_hclk;
}
 
err = -ENOMEM;
@@ -2436,8 +2437,7 @@ err_out_unregister_netdev:
 err_out_free_netdev:
free_netdev(dev);
 err_out_disable_clocks:
-   if (!IS_ERR(tx_clk))
-   clk_disable_unprepare(tx_clk);
+   clk_disable_unprepare(tx_clk);
 err_out_disable_hclk:
clk_disable_unprepare(hclk);
 err_out_disable_pclk:
@@ -2461,8 +2461,7 @@ static int macb_remove(struct platform_device *pdev)
kfree(bp->mii_bus->irq);
mdiobus_free(bp->mii_bus);
unregister_netdev(dev);
-   if (!IS_ERR(bp->tx_clk))
-   clk_disable_unprepare(bp->tx_clk);
+   clk_disable_unprepare(bp->tx_clk);
clk_disable_unprepare(bp->hclk);
clk_disable_unprepare(bp->pclk);
free_netdev(dev);
@@ -2480,8 +2479,7 @@ static int __maybe_unused macb_suspend(struct device *dev)
netif_carrier_off(netdev);
netif_device_detach(netdev);
 
-   if (!IS_ERR(bp->tx_clk))
-   clk_disable_unprepare(bp->tx_clk);
+   clk_disable_unprepare(bp->tx_clk);
clk_disable_unprepare(bp->hclk);
clk_disable_unprepare(bp->pclk);
 
@@ -2496,8 +2494,7 @@ static int __maybe_unused macb_resume(struct device *dev)
 
clk_prepare_enable(bp->pclk);
clk_prepare_enable(bp->hclk);
-   if (!IS_ERR(bp->tx_clk))
-   clk_prepare_enable(bp->tx_clk);
+   clk_prepare_enable(bp->tx_clk);
 
netif_device_attach(netdev);
 
-- 
1.9.1


[PATCH v2 2/2] net/macb: merge at91_ether driver into macb driver

2015-03-05 Thread Boris Brezillon

From: Cyrille Pitchen 

The macb and at91_ether drivers can be compiled as modules, but the at91_ether
driver uses some functions and variables defined in the macb one, thus
creating a dependency on the macb driver.

Since these drivers are sharing the same logic we can easily merge
at91_ether into macb.

In order to factorize common probing logic we've added an ->init() function
to struct macb_config (the structure associated with the compatible
string), and moved macb specific init code from macb_probe to macb_init.
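
A simplified userspace sketch of this kind of per-variant init hook follows.
All names (toy_dev, toy_config, toy_probe and the two init functions) are
hypothetical, and the ring sizes are illustrative values only (9 echoes
MAX_RX_DESCR from at91_ether.c, 128 is arbitrary); the real macb_config and
macb_probe() are more involved.

```c
#include <stddef.h>

/* Hypothetical device state shared by both variants. */
struct toy_dev {
    int rx_ring_size;
};

/* Per-compatible configuration carrying a variant-specific init hook. */
struct toy_config {
    int (*init)(struct toy_dev *dev);
};

static int toy_macb_init(struct toy_dev *dev)
{
    dev->rx_ring_size = 128;    /* arbitrary illustrative value */
    return 0;
}

static int toy_at91ether_init(struct toy_dev *dev)
{
    dev->rx_ring_size = 9;      /* mirrors MAX_RX_DESCR in at91_ether.c */
    return 0;
}

static const struct toy_config toy_macb_config = { .init = toy_macb_init };
static const struct toy_config toy_at91ether_config = {
    .init = toy_at91ether_init,
};

/* Common probe path: shared setup first, then the variant's init hook. */
static int toy_probe(struct toy_dev *dev, const struct toy_config *cfg)
{
    dev->rx_ring_size = 0;      /* stands in for the shared probing logic */

    if (cfg && cfg->init)
        return cfg->init(dev);
    return toy_macb_init(dev);  /* default when no hook is supplied */
}
```

The point of the design is that the probe function stays identical for both
variants and only the hook selected through the compatible string differs.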

Signed-off-by: Cyrille Pitchen 
Signed-off-by: Boris Brezillon 
---
 drivers/net/ethernet/cadence/Kconfig  |   8 -
 drivers/net/ethernet/cadence/Makefile |   1 -
 drivers/net/ethernet/cadence/at91_ether.c | 481 --
 drivers/net/ethernet/cadence/macb.c   | 641 ++
 drivers/net/ethernet/cadence/macb.h   |  10 +-
 5 files changed, 485 insertions(+), 656 deletions(-)
 delete mode 100644 drivers/net/ethernet/cadence/at91_ether.c

diff --git a/drivers/net/ethernet/cadence/Kconfig b/drivers/net/ethernet/cadence/Kconfig
index 321d2ad..fb8d09b 100644
--- a/drivers/net/ethernet/cadence/Kconfig
+++ b/drivers/net/ethernet/cadence/Kconfig
@@ -20,14 +20,6 @@ config NET_CADENCE
 
 if NET_CADENCE
 
-config ARM_AT91_ETHER
-   tristate "AT91RM9200 Ethernet support"
-   depends on HAS_DMA && (ARCH_AT91 || COMPILE_TEST)
-   select MACB
-   ---help---
- If you wish to compile a kernel for the AT91RM9200 and enable
- ethernet support, then you should always answer Y to this.
-
 config MACB
tristate "Cadence MACB/GEM support"
depends on HAS_DMA && (PLATFORM_AT32AP || ARCH_AT91 || ARCH_PICOXCELL || ARCH_ZYNQ || MICROBLAZE || COMPILE_TEST)
diff --git a/drivers/net/ethernet/cadence/Makefile b/drivers/net/ethernet/cadence/Makefile
index 9068b83..91f79b1 100644
--- a/drivers/net/ethernet/cadence/Makefile
+++ b/drivers/net/ethernet/cadence/Makefile
@@ -2,5 +2,4 @@
 # Makefile for the Atmel network device drivers.
 #
 
-obj-$(CONFIG_ARM_AT91_ETHER) += at91_ether.o
 obj-$(CONFIG_MACB) += macb.o
diff --git a/drivers/net/ethernet/cadence/at91_ether.c b/drivers/net/ethernet/cadence/at91_ether.c
deleted file mode 100644
index 7ef55f5..000
--- a/drivers/net/ethernet/cadence/at91_ether.c
+++ /dev/null
@@ -1,481 +0,0 @@
-/*
- * Ethernet driver for the Atmel AT91RM9200 (Thunder)
- *
- *  Copyright (C) 2003 SAN People (Pty) Ltd
- *
- * Based on an earlier Atmel EMAC macrocell driver by Atmel and Lineo Inc.
- * Initial version by Rick Bronson 01/11/2003
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "macb.h"
-
-/* 1518 rounded up */
-#define MAX_RBUFF_SZ   0x600
-/* max number of receive buffers */
-#define MAX_RX_DESCR   9
-
-/* Initialize and start the Receiver and Transmit subsystems */
-static int at91ether_start(struct net_device *dev)
-{
-   struct macb *lp = netdev_priv(dev);
-   dma_addr_t addr;
-   u32 ctl;
-   int i;
-
-   lp->rx_ring = dma_alloc_coherent(&lp->pdev->dev,
-(MAX_RX_DESCR *
- sizeof(struct macb_dma_desc)),
-&lp->rx_ring_dma, GFP_KERNEL);
-   if (!lp->rx_ring)
-   return -ENOMEM;
-
-   lp->rx_buffers = dma_alloc_coherent(&lp->pdev->dev,
-   MAX_RX_DESCR * MAX_RBUFF_SZ,
-   &lp->rx_buffers_dma, GFP_KERNEL);
-   if (!lp->rx_buffers) {
-   dma_free_coherent(&lp->pdev->dev,
- MAX_RX_DESCR * sizeof(struct macb_dma_desc),
- lp->rx_ring, lp->rx_ring_dma);
-   lp->rx_ring = NULL;
-   return -ENOMEM;
-   }
-
-   addr = lp->rx_buffers_dma;
-   for (i = 0; i < MAX_RX_DESCR; i++) {
-   lp->rx_ring[i].addr = addr;
-   lp->rx_ring[i].ctrl = 0;
-   addr += MAX_RBUFF_SZ;
-   }
-
-   /* Set the Wrap bit on the last descriptor */
-   lp->rx_ring[MAX_RX_DESCR - 1].addr |= MACB_BIT(RX_WRAP);
-
-   /* Reset buffer index */
-   lp->rx_tail = 0;
-
-   /* Program address of descriptor list in Rx Buffer Queue register */
-   macb_writel(lp, RBQP, lp->rx_ring_dma);
-
-   /* Enable Receive and Transmit */
-   ctl = macb_readl(lp, NCR);
-   macb_writel(lp, NCR, ctl | MACB_BIT(RE) | MACB_BIT(TE));
-
-   return 0;
-}
-
-/* Open the ethernet interface */
-static int at91ether_open(struct ne

[PATCH v2 0/2] net/macb: merge at91_ether driver into macb driver

2015-03-05 Thread Boris Brezillon

Hello,

The rm9200 boards use the dedicated at91_ether driver instead of the
regular macb driver.

Both the macb and at91_ether drivers can be compiled as separate
modules.
Since the at91_ether driver uses code from the macb driver, at91_ether.ko
depends on macb.ko.

However the macb.ko module always fails to load on rm9200 boards: the
macb_probe() function expects a hclk clock which doesn't exist on rm9200.
Then the at91_ether.ko can't be loaded in turn due to unresolved
dependencies.

This series of patches fixes this issue by merging at91_ether into macb.

Best Regards,

Boris

Changes since v1:
- rework probe functions to share common probing logic

Cyrille Pitchen (2):
  net/macb: unify clock management
  net/macb: merge at91_ether driver into macb driver

 drivers/net/ethernet/cadence/Kconfig  |   8 -
 drivers/net/ethernet/cadence/Makefile |   1 -
 drivers/net/ethernet/cadence/at91_ether.c | 481 --
 drivers/net/ethernet/cadence/macb.c   | 662 ++
 drivers/net/ethernet/cadence/macb.h   |  10 +-
 5 files changed, 494 insertions(+), 668 deletions(-)
 delete mode 100644 drivers/net/ethernet/cadence/at91_ether.c

-- 
1.9.1


[V5 PATCH 2/2] ata: ahci_platform: Add ACPI _CLS matching

2015-03-05 Thread Suravee Suthikulpanit

This patch adds ACPI support for the AHCI platform driver, which uses the
_CLS method to match the device.

The following is an example of ASL structure in DSDT for a SATA controller,
which contains _CLS package to be matched by the ahci_platform driver:

  Device (AHC0) // AHCI Controller
  {
Name(_HID, "AMDI0600")
Name (_CCA, 1)
Name (_CLS, Package (3)
{
  0x01, // Base Class: Mass Storage
  0x06, // Sub-Class: serial ATA
  0x01, // Interface: AHCI
})
Name (_CRS, ResourceTemplate ()
{
  Memory32Fixed (ReadWrite, 0xE030, 0x0001)
  Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive,,,) { 387 }
})
  }
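
For illustration, the three _CLS bytes above combine into the 24-bit
PCI-style class code that the driver table compares against. The sketch
below is a userspace model, not the kernel matching code: the helper names
are hypothetical, and SATA_AHCI_CLASS is defined locally with the value
0x010601 that PCI_CLASS_STORAGE_SATA_AHCI carries.

```c
#include <stdint.h>

/* Local copy of the AHCI class code (base 0x01, sub 0x06, prog-if 0x01). */
#define SATA_AHCI_CLASS 0x010601u

/* Hypothetical helper: pack the three _CLS package bytes into one code. */
static uint32_t cls_to_class_code(uint8_t base, uint8_t sub, uint8_t prog_if)
{
    return ((uint32_t)base << 16) | ((uint32_t)sub << 8) | prog_if;
}

/* Hypothetical helper: exact comparison of device and driver class codes. */
static int cls_matches(uint32_t device_cls, uint32_t driver_cls)
{
    return device_cls == driver_cls;
}
```

Packing 0x01, 0x06, 0x01 gives 0x010601, which is why the AHC0 device in the
ASL example would match a table entry using PCI_CLASS_STORAGE_SATA_AHCI.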

Also, since the ATA driver should not require PCI support for ATA_ACPI,
this patch removes the PCI dependency in drivers/ata/Kconfig.

Acked-by: Tejun Heo 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/ata/Kconfig | 2 +-
 drivers/ata/ahci_platform.c | 9 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
index 5f60155..50305e3 100644
--- a/drivers/ata/Kconfig
+++ b/drivers/ata/Kconfig
@@ -48,7 +48,7 @@ config ATA_VERBOSE_ERROR
 
 config ATA_ACPI
bool "ATA ACPI Support"
-   depends on ACPI && PCI
+   depends on ACPI
default y
help
  This option adds support for ATA-related ACPI objects.
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 78d6ae0..842cd13 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "ahci.h"
 
 #define DRV_NAME "ahci"
@@ -78,12 +80,19 @@ static const struct of_device_id ahci_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, ahci_of_match);
 
+static const struct acpi_device_id ahci_acpi_match[] = {
+   { "", 0, PCI_CLASS_STORAGE_SATA_AHCI },
+   {},
+};
+MODULE_DEVICE_TABLE(acpi, ahci_acpi_match);
+
 static struct platform_driver ahci_driver = {
.probe = ahci_probe,
.remove = ata_platform_remove_one,
.driver = {
.name = DRV_NAME,
.of_match_table = ahci_of_match,
+   .acpi_match_table = ahci_acpi_match,
.pm = &ahci_pm_ops,
},
 };
-- 
2.1.0


Re: [cgroup] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives()

2015-03-05 Thread Vladimir Davydov

Hi,

This bug should have been fixed by "[PATCH -next] cpuset: initialize
cpuset a bit early":

http://www.spinics.net/lists/cgroups/msg12599.html

Thanks,
Vladimir

On Fri, Mar 06, 2015 at 01:57:58PM +0800, Fengguang Wu wrote:
> [0.021989] [ cut here ]
> [0.021989] [ cut here ]
> [0.022816] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 
> warn_pre_alternatives+0x25/0x2e()
> [0.022816] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 
> warn_pre_alternatives+0x25/0x2e()
> [0.024000] You're using static_cpu_has before alternatives have run!
> [0.024000] You're using static_cpu_has before alternatives have run!
> [0.024000] Modules linked in:
> [0.024000] Modules linked in:
> 
> [0.024000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> 4.0.0-rc1-4-g295458e #455
> [0.024000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> 4.0.0-rc1-4-g295458e #455
> [0.024000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> 1.7.5-20140531_083030-gandalf 04/01/2014
> [0.024000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> 1.7.5-20140531_083030-gandalf 04/01/2014
> [0.024000]  0009
> [0.024000]  0009 81e03cc8 81e03cc8 
> 81674d02 81674d02 810ca88e 810ca88e
> 
> [0.024000]  81e03d18
> [0.024000]  81e03d18 81e03d08 81e03d08 
> 81073d6f 81073d6f  
> 
> [0.024000]  81018f79
> [0.024000]  81018f79 81e03e38 81e03e38 
>   0002 0002
> 
> [0.024000] Call Trace:
> [0.024000] Call Trace:
> [0.024000]  [] dump_stack+0xa0/0xd5
> [0.024000]  [] dump_stack+0xa0/0xd5
> [0.024000]  [] ? console_unlock+0x496/0x4ef
> [0.024000]  [] ? console_unlock+0x496/0x4ef
> [0.024000]  [] warn_slowpath_common+0xc8/0xf7
> [0.024000]  [] warn_slowpath_common+0xc8/0xf7
> [0.024000]  [] ? warn_pre_alternatives+0x25/0x2e
> [0.024000]  [] ? warn_pre_alternatives+0x25/0x2e
> [0.024000]  [] warn_slowpath_fmt+0x4f/0x58
> [0.024000]  [] warn_slowpath_fmt+0x4f/0x58
> [0.024000]  [] ? native_iret+0x7/0x7
> [0.024000]  [] ? native_iret+0x7/0x7
> [0.024000]  [] warn_pre_alternatives+0x25/0x2e
> [0.024000]  [] warn_pre_alternatives+0x25/0x2e
> [0.024000]  [] __do_page_fault+0x2b4/0x7c2
> [0.024000]  [] __do_page_fault+0x2b4/0x7c2
> [0.024000]  [] do_page_fault+0x3e/0x4a
> [0.024000]  [] do_page_fault+0x3e/0x4a
> [0.024000]  [] do_async_page_fault+0x3a/0xb9
> [0.024000]  [] do_async_page_fault+0x3a/0xb9
> [0.024000]  [] async_page_fault+0x28/0x30
> [0.024000]  [] async_page_fault+0x28/0x30
> [0.024000]  [] ? cpumask_copy+0x2c/0x2f
> [0.024000]  [] ? cpumask_copy+0x2c/0x2f
> [0.024000]  [] ? cpuset_bind+0x5b/0xc4
> [0.024000]  [] ? cpuset_bind+0x5b/0xc4
> [0.024000]  [] cgroup_init+0x2fa/0x3d3
> [0.024000]  [] cgroup_init+0x2fa/0x3d3
> [0.024000]  [] start_kernel+0x6ed/0x755
> [0.024000]  [] start_kernel+0x6ed/0x755
> [0.024000]  [] ? early_idt_handlers+0x120/0x120
> [0.024000]  [] ? early_idt_handlers+0x120/0x120
> [0.024000]  [] x86_64_start_reservations+0x46/0x4f
> [0.024000]  [] x86_64_start_reservations+0x46/0x4f
> [0.024000]  [] x86_64_start_kernel+0x1b0/0x1c6
> [0.024000]  [] x86_64_start_kernel+0x1b0/0x1c6
> [0.024000] ---[ end trace 37d9a871c47a31bc ]---
> [0.024000] ---[ end trace 37d9a871c47a31bc ]---

Re: [PATCH] do_fork(): Rename 'stack_size' argument to reflect actual use

2015-03-05 Thread Alex Dowad



On 05/03/15 22:29, David Rientjes wrote:

On Thu, 5 Mar 2015, Alex Dowad wrote:


diff --git a/kernel/fork.c b/kernel/fork.c
index cf65139..b38a2ae 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1186,10 +1186,12 @@ init_task_pid(struct task_struct *task, enum
pid_type type, struct pid *pid)
* It copies the registers, and all the appropriate
* parts of the process environment (as per the clone
* flags). The actual kick-off is left to the caller.
+ *
+ * When copying a kernel thread, 'stack_start' is the function to run.
*/
   static struct task_struct *copy_process(unsigned long clone_flags,
unsigned long stack_start,
-   unsigned long stack_size,
+   unsigned long kthread_arg,
int __user *child_tidptr,
struct pid *pid,
int trace)
@@ -1401,7 +1403,7 @@ static struct task_struct *copy_process(unsigned
long clone_flags,
retval = copy_io(clone_flags, p);
if (retval)
goto bad_fork_cleanup_namespaces;
-   retval = copy_thread(clone_flags, stack_start, stack_size, p);
+   retval = copy_thread(clone_flags, stack_start, kthread_arg, p);
if (retval)
goto bad_fork_cleanup_io;
   @@ -1629,8 +1631,8 @@ struct task_struct *fork_idle(int cpu)
* it and waits for it to finish using the VM if required.
*/
   long do_fork(unsigned long clone_flags,
- unsigned long stack_start,
- unsigned long stack_size,
+ unsigned long stack_start, /* or function for kthread to run */
+ unsigned long kthread_arg,
  int __user *parent_tidptr,
  int __user *child_tidptr)
   {

Looks fine, but I'm not sure about commenting functional formals.  Since
copy_process() and do_fork() can have formals with different meanings,
then why not just rename them "arg1" and "arg2" respectively and then
define in the comment above the function what the possible combinations
are?

The second argument is *only* ever used for one thing: an argument passed to a
kernel thread. That's why I would like to rename it to "kthread_arg". The
previous argument (currently named "stack_start") is indeed used for 2
different things: a new stack pointer for a user thread, or a function to be
executed by a kernel thread. Rather than "arg1", what would you think of
something like "sp_or_fn", or "usp_or_fn"?


I would recommend exactly "arg" since it can be used for multiple purposes
and if the formal could ever be used for a third purpose we don't want to
go through another re-naming patch to change it from sp_or_fn or
usp_or_fn.

If that's done, then the comment above the function could define what arg
can represent.

Do others concur with this idea? Personally, I feel the code will be 
more readable/maintainable if the naming of args/variables/etc reflects 
what they are actually used for.


(Case in point: on IA64, copy_thread() adds the kernel thread arg to the 
user stack pointer. The kernel thread arg is always 0 when forking a 
user process, so this "works", but it's certainly not what the author 
intended. Good names make it harder to write buggy code!)


For readability, using the same arg for 2 different purposes is a bad 
practice (though it might be good for keeping the object code small). I 
hate to think that "arg" might be co-opted for another purpose again.


[bdi] BUG: unable to handle kernel NULL pointer dereference at 0000000000000550

2015-03-05 Thread Fengguang Wu

/0xc2
[0.704142]  [] evict+0xa2/0x15e
[0.704142]  [] iput+0x160/0x16d
[0.704142]  [] bdput+0xd/0xf
[0.704142]  [] __blkdev_put+0x166/0x18a
[0.704142]  [] blkdev_put+0x114/0x11d
[0.704142]  [] add_disk+0x44d/0x461
[0.704142]  [] brd_init+0x95/0x160
[0.704142]  [] ? ramdisk_size+0x1a/0x1a
[0.704142]  [] do_one_initcall+0xe8/0x175
[0.704142]  [] kernel_init_freeable+0x1d0/0x258
[0.704142]  [] ? rest_init+0xbc/0xbc
[0.704142]  [] kernel_init+0x9/0xd5
[0.704142]  [] ret_from_fork+0x7c/0xb0
[0.704142]  [] ? rest_init+0xbc/0xbc
[0.704142] Code: ca 48 c1 ea 04 29 d0 ba 01 00 00 00 89 8f 80 08 00 00 ff 
c8 85 c0 0f 4e c2 89 87 84 08 00 00 c3 48 8b 87 10 01 00 00 55 48 89 e5 <48> 8b 
80 50 05 00 00 5d 48 05 58 02 00 00 c3 48 89 fa 31 c0 b9 
[0.704142] RIP  [] blk_get_backing_dev_info+0xb/0x1a
[0.704142]  RSP 
[0.704142] CR2: 0550
[0.704142] ---[ end trace 5c64cf25111d3d67 ]---
[0.704142] Kernel panic - not syncing: Fatal exception

git bisect start 45b8e7be563c57fc42d69d5239b4829b5586620d 13a7a6ac0a11197edcd0f756a035f472b42cdf8b --
git bisect  bad 980171ac3db20fc792b9b1298067344725a5a285  # 19:07  0- 20  Merge 'luto/x86/entry' into devel-xian-x86_64-201503051818
git bisect  bad 7a2a5fad21b95990713cbdfaccc9eeba4e98f9b8  # 19:13  0- 20  Merge 'kees/format-security' into devel-xian-x86_64-201503051818
git bisect good cadb5884edc7353ecb245cf0874ead1f9565f2a7  # 19:29 20+  0  Merge 'trace/ftrace/urgent' into devel-xian-x86_64-201503051818
git bisect good 30abe812fb9b18b25ebb9d2d214a70013a191ccb  # 19:34 20+  0  Merge 'paulburton/wip-ci20-v4.0' into devel-xian-x86_64-201503051818
git bisect good 0d0fc17147f433ffe27f8d2fcd3b29e109694fe3  # 19:40 20+  0  Merge 'arm-soc/next/drivers' into devel-xian-x86_64-201503051818
git bisect  bad caca114c0271d4df06e2ff1acee68dd62be43d66  # 20:03  0- 20  Merge 'josef-btrfs/superblock-scaling' into devel-xian-x86_64-201503051818
git bisect good d2ee19114357bdf21c59a3ac61eb053ef1c0dc4e  # 20:15 20+  8  inode: rename i_wb_list to i_io_list
git bisect  bad 63738525a6ebdf74bb3eb1c3dba16c0bb6895d97  # 20:28  0- 20  inode: convert per-sb inode list to a list_lru
git bisect  bad a05899067cddc24276e43e0d440da791738cf967  # 20:42  0- 20  writeback: periodically trim the writeback list
git bisect  bad 40ceea09e84d1b9319236b27ad3162422310e5d0  # 21:12  0- 20  bdi: add a new writeback list for sync
# first bad commit: [40ceea09e84d1b9319236b27ad3162422310e5d0] bdi: add a new writeback list for sync
git bisect good d2ee19114357bdf21c59a3ac61eb053ef1c0dc4e  # 21:14 60+  8  inode: rename i_wb_list to i_io_list
# extra tests with DEBUG_INFO
git bisect  bad 40ceea09e84d1b9319236b27ad3162422310e5d0  # 22:55  0- 22  bdi: add a new writeback list for sync
# extra tests on HEAD of linux-devel/devel-xian-x86_64-201503051818
git bisect  bad 45b8e7be563c57fc42d69d5239b4829b5586620d  # 22:55  0- 12  0day head guard for 'devel-xian-x86_64-201503051818'
# extra tests on tree/branch josef-btrfs/superblock-scaling
git bisect  bad d119f33d7f868e92c2d7fd21da1aade94584994d  # 23:13  0- 60  inode: don't softlockup when evicting inodes
# extra tests on tree/branch linus/master
git bisect good 6587457b4b3d663b237a0f95ddf6e67d1828c8ea  # 23:41 60+  2  Merge tag 'dma-buf-for-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf
# extra tests on tree/branch next/master
git bisect good cbbf783608bd1f177fd8b1f6498bb2481116beed  # 23:53 60+  0  Add linux-next specific files for 20150305


This script may reproduce the error.


#!/bin/bash

kernel=$1
initrd=yocto-minimal-x86_64.cgz

wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd

kvm=(
qemu-system-x86_64
-cpu kvm64
-enable-kvm
-kernel $kernel
-initrd $initrd
-m 320
-smp 1
-net nic,vlan=1,model=e1000
-net user,vlan=1
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null 
)

append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
rd.udev.log-priority=err
systemd.log_target=journal
systemd.log_level=warning
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)

"${kvm[@]}" --append &q

[PCI] BUG: unable to handle kernel

2015-03-05 Thread Fengguang Wu

Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git

commit 0b2af171520e5d5e7d5b5f479b90a6a5014d9df6
Author: Murali Karicheri 
AuthorDate: Tue Mar 3 12:52:13 2015 -0500
Commit: Bjorn Helgaas 
CommitDate: Tue Mar 3 14:42:58 2015 -0600

PCI: Update DMA configuration from DT

If there is a DT node available for the root bridge's parent device, use
the DMA configuration from that device node.  For example, Keystone PCI
devices would require dma_pfn_offset to be set correctly in the device
structure of the PCI device in order to have the correct DMA mask.  The DT
node will have dma-ranges defined for this.  Also support using the DT
property dma-coherent to allow coherent DMA operation by the PCI device.

Use the new helper function of_pci_dma_configure() to update the device DMA
configuration.  This fixes DMA on systems where DMA addresses are a
constant offset from CPU physical addresses.

Tested-by: Suravee Suthikulpanit (AMD Seattle)
Signed-off-by: Murali Karicheri 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Catalin Marinas 
Acked-by: Will Deacon 
CC: Joerg Roedel 
CC: Grant Likely 
CC: Rob Herring 
CC: Russell King 
CC: Arnd Bergmann 

+------------------------------------------+------------+------------+-----------------+
|                                          | bdc567f9c1 | 0b2af17152 | v4.0-rc2_030422 |
+------------------------------------------+------------+------------+-----------------+
| boot_successes                           | 47         | 0          | 0               |
| boot_failures                            | 33         | 20         | 12              |
| page_allocation_failure:order:#,mode     | 33         |            |                 |
| backtrace:btrfs_test_extent_io           | 33         |            |                 |
| backtrace:init_btrfs_fs                  | 33         |            |                 |
| backtrace:kernel_init_freeable           | 33         | 20         | 12              |
| BUG:unable_to_handle_kernel              | 0          | 20         | 12              |
| Oops                                     | 0          | 20         | 12              |
| EIP_is_at_of_pci_dma_configure           | 0          | 20         | 12              |
| Kernel_panic-not_syncing:Fatal_exception | 0          | 20         | 12              |
| backtrace:acpi_bus_scan                  | 0          | 20         | 12              |
| backtrace:acpi_scan_init                 | 0          | 20         | 12              |
| backtrace:acpi_init                      | 0          | 20         | 12              |
+------------------------------------------+------------+------------+-----------------+

[0.573023] pci_bus :00: root bus resource [mem 0x1400-0xfebf window]
[0.573381] pci :00:00.0: [8086:1237] type 00 class 0x06
[0.574397] BUG: unable to handle kernel NULL pointer dereference at 01c4
[0.575439] IP: [<79a20c33>] of_pci_dma_configure+0x33/0x70
[0.576231] *pde =  
[0.57] Oops:  [#1] SMP 
[0.57] Modules linked in:
[0.57] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.0.0-rc1-6-g0b2af17 #6
[0.57] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[0.57] task: 7806 ti: 78068000 task.ti: 78068000
[0.57] EIP: 0060:[<79a20c33>] EFLAGS: 00010246 CPU: 0
[0.57] EIP is at of_pci_dma_configure+0x33/0x70
[0.57] EAX:  EBX: 78011800 ECX:  EDX: 0005
[0.57] ESI: 781d8400 EDI: 781d8000 EBP: 78069cd0 ESP: 78069cc8
[0.57]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[0.57] CR0: 8005003b CR2: 01c4 CR3: 0229f000 CR4: 06d0
[0.57] Stack:
[0.57]  78011800
[0.57]  7

RE: [PATCH v2] ixgbe: make VLAN filter conditional

2015-03-05 Thread Hiroshi Shimamoto

> From: Hiroshi Shimamoto 
> 
> Disable hardware VLAN filtering if netdev->features VLAN flag is dropped.
> 
> In the SR-IOV case, there is a use case which needs the VLAN filter disabled.
> For example, we need to build a network function with a VF in a virtualized
> environment. That network function may be a software switch, a router,
> etc. This means the network function will be an end point which
> terminates many VLANs.
> 
> In the current implementation, VLAN filtering is always turned on and a
> VF can receive only 63 VLANs. This means that only 63 VLANs can be terminated
> on one NIC.
> 
> With this patch, if the user turns VLAN filtering off on the host, a VF
> can receive every VLAN packet.
> 
> VLAN filtering can be turned on or off only while SR-IOV is disabled;
> otherwise the operation is rejected.

Hi,

Any comments on this?
I added a warning message and now prevent the operation while SR-IOV is enabled.


thanks,
Hiroshi


[cpumask] WARNING: CPU: 0 PID: 0 at lib/list_debug.c:29 __list_add()

2015-03-05 Thread Fengguang Wu

bsolete cpu function usage.
git bisect good c099221e5944e36612c4079d888a38530a667645  # 13:27 20+ 20  cpumask: remove deprecated functions.
git bisect  bad f754909a13e848900abee1014ca29b9b4e33b6ff  # 13:34  0- 20  cpumask: only allocate nr_cpumask_bits.
git bisect good 7928baeec50516d4f632f2b9a66925a3fc1126b0  # 13:56 20+  0  Fix weird uses of num_online_cpus().
# first bad commit: [f754909a13e848900abee1014ca29b9b4e33b6ff] cpumask: only allocate nr_cpumask_bits.
git bisect good 7928baeec50516d4f632f2b9a66925a3fc1126b0  # 13:58 60+ 16  Fix weird uses of num_online_cpus().
# extra tests with DEBUG_INFO
git bisect good f754909a13e848900abee1014ca29b9b4e33b6ff  # 14:15 60+ 60  cpumask: only allocate nr_cpumask_bits.
# extra tests on HEAD of linux-devel/devel-snb-smoke-201503051145
git bisect  bad e9d45bb15ba89a3ef7b6dde0d12d15d4964e74de  # 14:15  0- 12  0day head guard for 'devel-snb-smoke-201503051145'
# extra tests on tree/branch rusty/cpumask-next
git bisect  bad f754909a13e848900abee1014ca29b9b4e33b6ff  # 14:23  0- 20  cpumask: only allocate nr_cpumask_bits.
# extra tests with first bad commit reverted
# extra tests on tree/branch linus/master
git bisect good 6587457b4b3d663b237a0f95ddf6e67d1828c8ea  # 14:32 60+ 20  Merge tag 'dma-buf-for-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf
# extra tests on tree/branch next/master
git bisect good cbbf783608bd1f177fd8b1f6498bb2481116beed  # 14:52 60+ 60  Add linux-next specific files for 20150305


This script may reproduce the error.


#!/bin/bash

kernel=$1

kvm=(
qemu-system-x86_64
-cpu kvm64
-enable-kvm
-kernel $kernel
-m 320
-smp 1
-net nic,vlan=1,model=e1000
-net user,vlan=1
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null 
)

append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
rd.udev.log-priority=err
systemd.log_target=journal
systemd.log_level=warning
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)

"${kvm[@]}" --append "${append[*]}"


Thanks,
Fengguang
early console in setup code
Probing EDD (edd=off to disable)... ok
early console in decompress_kernel
KASLR using RDTSC...

Decompressing Linux... Parsing ELF... Performing relocations... done.
Booting the kernel.
[0.00] Initializing cgroup subsys cpuset
[0.00] Linux version 4.0.0-rc2-00166-gf754909 (kbuild@snb) (gcc version 
4.9.1 (Debian 4.9.1-19) ) #35 SMP PREEMPT Thu Mar 5 13:32:58 CST 2015
[0.00] Command line: hung_task_panic=1 earlyprintk=ttyS0,115200 
rd.udev.log-priority=err systemd.log_target=journal systemd.log_level=warning 
debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 
panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 
prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal  root=/dev/ram0 
rw 
link=/kbuild-tests/run-queue/kvm/x86_64-randconfig-s0-03031804/linux-devel:devel-snb-smoke-201503051145:f754909a13e848900abee1014ca29b9b4e33b6ff:bisect-linux-3/.vmlinuz-f754909a13e848900abee1014ca29b9b4e33b6ff-2015030515-16-client6
 branch=linux-devel/devel-snb-smoke-201503051145 
BOOT_IMAGE=/kernel/x86_64-randconfig-s0-03031804/f754909a13e848900abee1014ca29b9b4e33b6ff/vmlinuz-4.0.0-rc2-00166-gf754909
 drbd.minor_count=8
[0.00] KERNEL supported cpus:
[0.00]   AMD AuthenticAMD
[0.00]   Centaur CentaurHauls
[0.00] CPU: vendor_id 'GenuineIntel' unknown, using generic init.
[0.00] CPU: Your system may be unstable.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000f-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x13fd] usable
[0.00] BIOS-e820: [mem 0x13fe-0x13ff] reserved
[0.00] BIOS-e820: [mem 0xfeffc000-0xfeff] reserved
[0.00] BIOS-e820: [mem 0xfffc-0x] reserved
[0.00] bootconsole [earlyser0] enabled
[0.00] NX (Execute Disable) protection: active
[0.00] e820: update [mem 0x01fb21

[cgroup] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives()

2015-03-05 Thread Fengguang Wu

Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git 
revert-295458e67284f57d154ec8156a22797c0cfb044a-295458e67284f57d154ec8156a22797c0cfb044a

commit 295458e67284f57d154ec8156a22797c0cfb044a
Author: Vladimir Davydov 
AuthorDate: Thu Feb 19 17:34:46 2015 +0300
Commit: Tejun Heo 
CommitDate: Mon Mar 2 12:11:01 2015 -0500

cgroup: call cgroup_subsys->bind on cgroup subsys initialization

Currently, we call cgroup_subsys->bind only on unmount, remount, and
when creating a new root on mount. Since the default hierarchy root is
created in cgroup_init, we will not call cgroup_subsys->bind if the
default hierarchy is freshly mounted. As a result, some controllers will
behave incorrectly (most notably, the "memory" controller will not
enable hierarchy support). Fix this by calling cgroup_subsys->bind right
after initializing a cgroup subsystem.

Signed-off-by: Vladimir Davydov 
Signed-off-by: Tejun Heo 

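+------------------------------------------------------------------+------------+------------+------------+
|                                                                  | 283cb41f42 | 295458e672 | 65cf2c9599 |
+------------------------------------------------------------------+------------+------------+------------+
| boot_successes                                                   | 60         | 0          | 0          |
| boot_failures                                                    | 0          | 20         | 12         |
| WARNING:at_arch/x86/kernel/cpu/common.c:#warn_pre_alternatives() | 0          | 20         | 12         |
| BUG:unable_to_handle_kernel                                      | 0          | 20         | 12         |
| Oops                                                             | 0          | 20         | 12         |
| RIP:cpumask_copy                                                 | 0          | 20         | 12         |
| Kernel_panic-not_syncing:Fatal_exception                         | 0          | 20         | 12         |
| backtrace:async_page_fault                                       | 0          | 20         | 12         |
| backtrace:cgroup_init                                            | 0          | 20         | 12         |
+------------------------------------------------------------------+------------+------------+------------+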

[0.020009] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[0.021989] [ cut here ]
[0.022816] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives+0x25/0x2e()
[0.024000] You're using static_cpu_has before alternatives have run!
[0.024000] Modules linked in:
[0.024000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0-rc1-4-g295458e #455
[0.024000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[0.024000]  0009
[0.024000]  0009 81e03cc8 81e03cc8 
81674d02 81674d02 810ca88e 810ca88e

[0.024000]  81e03d18
[0.024000]  81e03d18 81e03d08 81e03d08 
81073d6f 81073d6f  

[0.024000]  81018f79
[0.024000]  81018f79 81e03e38 81e03e38 
  0002 0002

[0.024000] Call Trace:
[0.024000]  [] dump_stack+0xa0/0xd5
[0.024000]  [] ? console_unlock+0x496/0x4ef
[0.024000]  [] warn_slowpath_common+0xc8/0xf7
[0.024000]  [] ? warn_pre_alternatives+0x25/0x2e
[0.024000]  [] warn_slowpath_fmt+0x4f/0x58
[0.024000]  [] ? native_iret+0x7/0x7
[0.024000]  [] warn_pre_alternatives+0x25/0x2e
[0.024000]  [] __do_page_fault+0x2b4/0x7c2
[0.024000]  [] do_page_fault+0x3e/0x4a
[0.024000]  [] do_async_page_fault+0x3a/0xb9
[0.024000]  [] do_async_page_fault+0x3

[x86/xen] WARNING: CPU: 0 PID: 1 at arch/x86/xen/apic.c:73 xen_apic_write()

2015-03-05 Thread Fengguang Wu

Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip 
revert-3f4560207f796d5f79c18329d5a5d383fe3c97bb-3f4560207f796d5f79c18329d5a5d383fe3c97bb

commit 3f4560207f796d5f79c18329d5a5d383fe3c97bb
Author: Konrad Rzeszutek Wilk 
AuthorDate: Mon Mar 2 12:06:23 2015 -0500
Commit: David Vrabel 
CommitDate: Mon Mar 2 17:15:05 2015 +

x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs

Instead of mangling the default APIC driver, provide a Xen PV guest
specific one that explicitly provides appropriate methods.

This allows us to report that all APIC IDs are valid, allowing dom0
to boot with more than 255 VCPUs.

Since the probe order of APIC drivers is link dependent, we add a
late probe function to switch to the Xen PV driver if that hadn't been
done during bootup.

Suggested-by: David Vrabel 
Reported-by: Cathy Avery 
Signed-off-by: Konrad Rzeszutek Wilk 
Signed-off-by: David Vrabel 

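+--------------------------------------------------+------------+------------+------------+
|                                                  | dbc36df319 | 3f4560207f | 64abd71342 |
+--------------------------------------------------+------------+------------+------------+
| boot_successes                                   | 60         | 0          | 0          |
| boot_failures                                    | 0          | 20         | 12         |
| WARNING:at_arch/x86/xen/apic.c:#xen_apic_write() | 0          | 20         | 12         |
| BUG:kernel_boot_hang                             | 0          | 9          | 2          |
| backtrace:native_smp_prepare_cpus                | 0          | 20         | 12         |
| backtrace:kernel_init_freeable                   | 0          | 20         | 12         |
+--------------------------------------------------+------------+------------+------------+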

[0.021336] Freeing SMP alternatives memory: 32K (8264c000 - 
82654000)
[0.027813] Getting VERSION: 0
[0.028005] [ cut here ]
[0.028838] WARNING: CPU: 0 PID: 1 at arch/x86/xen/apic.c:73 
xen_apic_write+0x15/0x17()
[0.032006] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.0.0-rc1-8-g3f45602 #10
[0.033313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.7.5-20140531_083030-gandalf 04/01/2014
[0.035045]  0009 88001144fe58 81b4bc0c 
005e
[0.036329]   88001144fe98 810729f0 

[0.037692]  81009b3b 0008  
a108
[0.039067] Call Trace:
[0.039506]  [] dump_stack+0x4c/0x6e
[0.040008]  [] warn_slowpath_common+0x92/0xac
[0.041048]  [] ? xen_apic_write+0x15/0x17
[0.042032]  [] warn_slowpath_null+0x15/0x17
[0.044007]  [] xen_apic_write+0x15/0x17
[0.044964]  [] verify_local_APIC+0x50/0x1a5
[0.045980]  [] native_smp_prepare_cpus+0x1f9/0x2d2
[0.047093]  [] kernel_init_freeable+0x115/0x258
[0.048007]  [] ? rest_init+0xbc/0xbc
[0.048915]  [] kernel_init+0x9/0xd5
[0.049810]  [] ret_from_fork+0x7c/0xb0
[0.050753]  [] ? rest_init+0xbc/0xbc
[0.052021] ---[ end trace 2224f94bfa1995b9 ]---
[0.052835] Getting VERSION: 0

git bisect start 64abd713427959b0c88f3f7ddc3a519d9628 c517d838eb7d07bbe9507871fab3931deccff539 --
git bisect  bad 564cdc396432bb58399bee9c85d2f9c9dbd1f4c8  # 02:32  0- 20  Merge 'xen-tip/devel/for-linus-4.1' into devel-xian-x86_64-201503030145
git bisect good 42d429fb535b1ed2a8f2bd64e5e2b0d1507020e8  # 03:30 20+  0  Merge 'slave-dma/for-linus' into devel-xian-x86_64-201503030145
git bisect good 358928c49c157cfd513af221ccabe22434a63bbe  # 03:55 20+  4  Merge 'cgroup/for-4.1' into devel-xian-x86_64-201503030145
git bisect good 058e1fa5f35fbd876af4e1bcc1f938218a28706e  # 04:16 20+  0  Merge 'wq/for-4.0-fixes' into devel-xian-x86_64-201503030145
git bisect good 06324125b0143ed0efe6c3db9b210ce2fe0f255d  # 04:27 20+ 10  xen: synchronize include/xen/interface/xen.h with xen
git bisect good f227b2ffd052e52f51c78c692eec4ccfba180d31  # 04:43 20+  1  xen: use generated hypercall symbols in arch/x86/xen/xen-head.S
git bisect  bad 3f4560207f796d5f79c18329d5a5d383fe3c97bb  # 05:11  0- 11  x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs
git bisect good dbc36df3197da8364f9a58f76970968c7862eb60  # 05:45 20+  0  xen/pciback: Don't print scary messages when unsupported by hypervisor.
# first bad commit: [3f4560207f796d5f79c18329d5a5d383fe3c97bb] x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs
git bisect good dbc36df3197da8364f9a58f76970968c7862eb60  # 05:48 60+  0  xen/pciback: Don't print scary messages when unsupported by hypervisor.
# extra tests with DEBUG_INFO
git bisect good 3f4560207f796d5f79c

Re: [PATCH v2 3/4] cpufreq: mediatek: add Mediatek cpufreq driver

2015-03-05 Thread Pi-Cheng Chen

+cc Sascha

On 5 March 2015 at 17:55, Viresh Kumar  wrote:
> On 5 March 2015 at 12:57, Pi-Cheng Chen  wrote:
>
>> On 4 March 2015 at 19:09, Viresh Kumar  wrote:
>> There are 2 clusters, but only the big cluster needs to do voltage scaling
>> in the notifier, since the voltage controlling is done by the cpufreq-dt
>> driver in this version.
>> Therefore only one dvfs_info struct here.
>
> Do you really think its readable enough that way? You must have added some
> comments on how this is working. Also, what about putting this stuff in your
> regulator driver, so that you don't really have to do this in PRE/POST
> notifiers.

Okay. I will add comments to describe the details. About putting that
logic into the regulator driver: I think you mean creating a "virtual
regulator device" and moving all of the voltage-control complexity into that
driver, right? It may be a good idea in this case, but I am not sure whether
this kind of virtual regulator is acceptable. Flexibility might also be an
issue, since we might use a different PMIC for the same SoC on different
boards.

>
>>>> +   inter_clk = clk_get(&pdev->dev, NULL);
>>>
>>> How is this supposed to work ? How will pdev->dev give intermediate
>>> clock ?
>>
>> It works with the device tree binding in the 2nd patch of this series, too.
>> Since the cpufreq node is not allowed, would you have some suggestions on
>> how to get the intermediate clock source in this case?
>
> How exactly? I am not doubting your work, just that I don't know how that DT
> binding will reflect here with clock_get for pdev->dev..

Please correct me if I am wrong. IIUC, the call chain is:
clk_get() -> __of_clk_get_by_name() -> __of_clk_get()
The "mtk-cpufreq" device tree node specifies the intermediate clock source in
its "clocks" property, and the pdev here comes from that node, so
__of_clk_get() can obtain the clock specifier by calling
of_parse_phandle_with_args() on the "clocks" property.

>
>>>> +   pd->independent_clocks = 1,
>>>
>>> s/,/; ??
>>
>> It's strange that I didn't get a compiling error here.
>> Will fix it.
>
> It's a perfectly valid statement :) and so no errors. Both will execute as
> they would with ';', just that the value of the latter one will be
> returned. But there is no variable on the LHS (left-hand side), so the
> value doesn't matter.

Thanks for your explanation. :)
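
For reference, a minimal user-space sketch of the comma-operator behaviour described above. The struct pdata layout, its "other" field and init_with_comma() are hypothetical and only there to give the second operand something to assign; they are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the platform data; only the first field name
 * comes from the patch under review. */
struct pdata {
	int independent_clocks;
	int other;
};

/* The first assignment ends with ',' instead of ';'. The compiler treats
 * the two assignments as ONE expression statement joined by the comma
 * operator: the left operand is evaluated first, then the right one. */
int init_with_comma(struct pdata *pd)
{
	assert(pd != NULL);

	pd->independent_clocks = 1,	/* left operand: evaluated, value discarded */
	pd->other = 2;			/* right operand: value of the whole expression */

	return pd->independent_clocks + pd->other;
}
```

Both assignments take effect in order, exactly as they would with ';', which is why the typo compiled without an error.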

>
>>> Don't want to free OPP table here on error ?
>>
>> Please correct me if I was wrong. Since the OPP table in the dvfs_info is
>> allocated by devm_kzalloc(), it is supposed to be freed if the probe function
>> failed, isn't it?
>>
>> And the OPP table initialized by of_init_opp_table() in cpu_opp_table_init()
>> was freed right before the function return since it will be initialized
>> again in the cpufreq-dt driver.
>
> Okay, I was talking about this only and I missed it. We probably need to fix
> this in OPP library so that multiple callers are allowed.
>
>>>> +   dev = platform_device_register_data(NULL, "cpufreq-dt", -1, pd,
>>>> +   sizeof(*pd));
>>>
>>> So this routine is going to be called only once. Then how are you
>>> initializing stuff for both the clusters in the upper for loop? It looked
>>> very, very confusing.
>>
>> Please let me clarify this here.
>> We have two clusters, one for big and another for little cores. For the
>> little cores' cluster, only one voltage source needs to be controlled when
>> doing CPU DVFS. Therefore the voltage scaling of the little cores' cluster
>> is done in cpufreq-dt.
>> But for the big cores' cluster, there are two voltage sources to be
>> controlled, and these two voltage sources need to be scaled up and down in
>> a SoC-specific manner which is implemented in the
>> mtk_cpufreq_voltage_trace() function.
>> Hence, we put the voltage scaling of the big cores' cluster in the cpufreq
>> notifier, and that's also why we need a mtk-cpufreq driver in addition to
>> cpufreq-dt.
>>
>> In the confusing loop above, I am trying to solve two problems:
>> 1. to find out which CPUs share the same clock / power domains among all
>>    CPUs
>> 2. to initialize the dvfs_info which is only needed by the big cores'
>>    cluster
>>
>> I think that's why the loop looks so confusing. Maybe doing it in two
>> separate loops will make the code more readable? I'll try it in the next
>> version.
>
> Yes.

Combining the comments and suggestions from you and Sascha[1], I conclude that
the following architectural changes will be made in the next version:

1. Use the set_rate hook instead of determine_rate in the clk driver, and
switch explicitly to the intermediate PLL parent and back to the original CPU
PLL parent in set_rate.
2. Therefore we don't need intermediate frequency support in cpufreq-dt to
implement cpufreq support for Mediatek SoCs.
3. Use a clk notifier to handle the voltage control corresponding to the
intermediate clock rate.
4. Due to 3. we need to move all voltage controlling part back into
the notifier in
mtk-cpufr

Re: [PATCH v4] x86: mce: kexec: switch MCE handler for kexec/kdump

2015-03-05 Thread Naoya Horiguchi

On Thu, Mar 05, 2015 at 01:24:47AM +, Horiguchi Naoya(堀口 直也) wrote:
...
> > Is the "UC" entry at the end of the severities[] table just a catch-all for 
> > things that made it
> > past all the other entries? Does it ever really get used?
> 
> I read through the severity check table and it seems that all UC=1 case
> are already considered by the above entries, so it seems not used.

I was completely wrong: the "Uncorrected" entry is chosen when mca_cfg.ser is
false (in which case all checks marked SER_REQUIRED are skipped), UC=1, and
OVER=0.
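
The selection logic described above can be sketched as a first-match table walk. This is a toy model under loud assumptions: the bit positions, the three rules, and every name (ST_UC, ST_OVER, toy_rule, toy_classify, the toy_sev values) are hypothetical and are not the kernel's severities[] table; only the idea that SER-only rules are skipped when SER is unavailable comes from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits; NOT the real MCi_STATUS layout. */
#define ST_UC   (1u << 0)
#define ST_OVER (1u << 1)

enum toy_sev { TOY_NO_MATCH, TOY_UNCORRECTED, TOY_SER_RECOVERABLE };

struct toy_rule {
	uint32_t mask;		/* status bits this rule looks at */
	uint32_t result;	/* required value of those bits */
	int ser_required;	/* rule applies only when SER is available */
	enum toy_sev sev;
};

/* Order matters: the first rule that applies wins. */
static const struct toy_rule toy_rules[] = {
	{ ST_UC,           ST_UC, 1, TOY_SER_RECOVERABLE },
	{ ST_UC | ST_OVER, ST_UC, 0, TOY_UNCORRECTED },	/* UC=1, OVER=0 */
	{ 0,               0,     0, TOY_NO_MATCH },	/* catch-all */
};

enum toy_sev toy_classify(uint32_t status, int ser)
{
	size_t i;

	for (i = 0; i < sizeof(toy_rules) / sizeof(toy_rules[0]); i++) {
		const struct toy_rule *r = &toy_rules[i];

		if (r->ser_required && !ser)
			continue;	/* SER-only rule, skipped without SER */
		if ((status & r->mask) != r->result)
			continue;
		return r->sev;
	}
	assert(0);	/* unreachable: the catch-all rule always matches */
	return TOY_NO_MATCH;
}
```

In this toy table, with SER available every UC=1 status is claimed by the earlier SER rule, so the "Uncorrected" rule looks dead; it is reached only when SER is off, UC=1 and OVER=0, which mirrors the correction above.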

[BUG] uprobe: failed to work on 9pfs

2015-03-05 Thread He Kuang


Uprobe uses the inode address to index all registered uprobes in an
rb_tree. This works well on most filesystems but fails on 9pfs.

9pfs can allocate more than one VFS inode for the same file, so the inode
address seen when we create the uprobe is not the same as the inode address
seen when the program runs later. As a result, neither perf record nor
events/uprobe can capture the predefined uprobe events.
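
A small user-space sketch of the failure mode, under stated assumptions: toy_inode, toy_uprobe, find_by_address() and find_by_number() are hypothetical names, and a linear scan stands in for the rb_tree. It only illustrates why a lookup keyed on the object's address misses when the same file is represented by a second in-memory inode; it is not a proposed fix, since an inode number alone is not unique across filesystems.

```c
#include <assert.h>
#include <stddef.h>

/* Toy in-memory inode: two objects may describe the same file. */
struct toy_inode {
	unsigned long ino;	/* file identity within one filesystem */
};

/* Toy registered probe, keyed on (inode address, offset). */
struct toy_uprobe {
	const struct toy_inode *inode;
	unsigned long offset;
};

/* Lookup keyed on the inode's ADDRESS, as described in the report. */
const struct toy_uprobe *find_by_address(const struct toy_uprobe *tbl, size_t n,
					 const struct toy_inode *inode,
					 unsigned long offset)
{
	size_t i;

	assert(tbl != NULL && inode != NULL);
	for (i = 0; i < n; i++)
		if (tbl[i].inode == inode && tbl[i].offset == offset)
			return &tbl[i];
	return NULL;
}

/* Lookup keyed on the inode NUMBER, shown only for contrast. */
const struct toy_uprobe *find_by_number(const struct toy_uprobe *tbl, size_t n,
					const struct toy_inode *inode,
					unsigned long offset)
{
	size_t i;

	assert(tbl != NULL && inode != NULL);
	for (i = 0; i < n; i++)
		if (tbl[i].inode->ino == inode->ino && tbl[i].offset == offset)
			return &tbl[i];
	return NULL;
}
```

If the probe is registered against one toy_inode and the lookup at run time uses a second toy_inode with the same ino, find_by_address() returns NULL while find_by_number() finds the entry.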


linux-next: Tree for Mar 6

2015-03-05 Thread Stephen Rothwell

Hi all,

Changes since 20150305:

The net-next tree lost its build failure.

The vhost tree gained a conflict against the virtio tree.

Non-merge commits (relative to Linus' tree): 2757
 2807 files changed, 87638 insertions(+), 60274 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (this fails its final link) and i386, sparc, sparc64 and arm
defconfig.

Below is a summary of the state of the merge.

I am currently merging 207 trees (counting Linus' and 30 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (99aedde0869c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)
Merging fixes/master (b94d525e58dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging kbuild-current/rc-fixes (c517d838eb7d Linux 4.0-rc1)
Merging arc-current/for-curr (2ce7598c9a45 Linux 3.17-rc4)
Merging arm-current/fixes (23be7fdafa50 ARM: 8305/1: DMA: Fix kzalloc flags in __iommu_alloc_buffer())
Merging m68k-current/for-linus (4436820a98cd m68k/defconfig: Enable Ethernet bridging)
Merging metag-fixes/fixes (c2996cb29bfb metag: Fix KSTK_EIP() and KSTK_ESP() macros)
Merging mips-fixes/mips-fixes (1795cd9b3a91 Linux 3.16-rc5)
Merging powerpc-merge/merge (c517d838eb7d Linux 4.0-rc1)
Merging powerpc-merge-mpe/fixes (4ad04e598711 powerpc/iommu: Remove IOMMU device references via bus notifier)
Merging sparc/master (53eb2516972b sparc: semtimedop() unreachable due to comparison error)
Merging net/master (b0ab0afaebc8 net: eth: xgene: fix booting with devicetree)
Merging ipsec/master (ac37e2515c1a xfrm: release dst_orig in case of error in xfrm_lookup())
Merging sound-current/for-linus (f44f07cf3910 ALSA: line6: Clamp values correctly)
Merging pci-current/for-linus (4efe874aace5 PCI: Don't read past the end of sysfs "driver_override" buffer)
Merging wireless-drivers/master (c8f034558669 rtlwifi: Improve handling of IPv6 packets)
Merging driver-core.current/driver-core-linus (c517d838eb7d Linux 4.0-rc1)
Merging tty.current/tty-linus (c517d838eb7d Linux 4.0-rc1)
Merging usb.current/usb-linus (d3d5389475e8 Merge tag 'usb-serial-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus)
Merging usb-gadget-fixes/fixes (a0456399fb07 usb: gadget: configfs: don't NUL-terminate (sub)compatible ids)
Merging usb-serial-fixes/usb-linus (c7d373c3f0da usb: ftdi_sio: Add jtag quirk support for Cyber Cortex AV boards)
Merging staging.current/staging-linus (abe46b8932dd staging: comedi: adv_pci1710: fix AI INSN_READ for non-zero channel)
Merging char-misc.current/char-misc-linus (6c15a8516b81 mei: make device disabled on stop unconditionally)
Merging input-current/for-linus (20f02d66f042 Input: tc3589x-keypad - set IRQF_ONESHOT flag to ensure IRQ request)
Merging crypto-current/master (001eabfd54c0 crypto: arm/aes update NEON AES module to latest OpenSSL version)
Merging ide/master (f96fe225677b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging devicetree-current/devicetree/merge (6b1271de3723 of/unittest: Overlays with sub-devices tests)
Merging rr-fixes/fixes (f47689345931 lguest: update help text.)
Merging vfio-fixes/for-linus (7c2e211f3c95 vfio-pci: Fix the check on pci device type in vfio_pci_probe())
Merging kselftest-fixes/fixes (f5db310d77ef selftests/vm: fix link error for transhuge-stress test)
Merging drm-intel-fixes/for-linux-next-fixes (ab3be73fa7b4 drm/i915: gen4: work around hang during hibernation)
Merging asm-generic/master (643165c8bbc

Re: [PATCH v2] f2fs: fix max orphan inodes calculation

2015-03-05 Thread Wanpeng Li

Hi Changman,
On Fri, Mar 06, 2015 at 11:37:28AM +0800, Chao Yu wrote:
>Hi Changman,
>
>> -----Original Message-----
>> From: Changman Lee [mailto:cm224@samsung.com]
>> Sent: Tuesday, March 03, 2015 9:40 AM
>> To: linux-f2fs-de...@lists.sourceforge.net
>> Cc: Jaegeuk Kim; Chao Yu; linux-fsde...@vger.kernel.org; 
>> linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH v2] f2fs: fix max orphan inodes calculation
>> 
>> On Fri, Feb 27, 2015 at 05:38:13PM +0800, Wanpeng Li wrote:
>> > cp_payload is introduced for sit bitmap to support large volume, and it is
>> > just after the block of f2fs_checkpoint + nat bitmap, so the first segment
>> > should include F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload + orphan blocks.
>> > However, the current max orphan inodes calculation doesn't consider
>> > cp_payload; this patch fixes it by subtracting the number of cp_payload
>> > blocks from the total blocks of the first segment when calculating max
>> > orphan inodes.
>> >
>> > Signed-off-by: Wanpeng Li 
>> > ---
>> > v1 -> v2:
>> >  * adjust comments above the codes
>> >  * fix coding style issue
>> >
>> >  fs/f2fs/checkpoint.c | 12 +++-
>> >  1 file changed, 7 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>> > index db82e09..a914e99 100644
>> > --- a/fs/f2fs/checkpoint.c
>> > +++ b/fs/f2fs/checkpoint.c
>> > @@ -1103,13 +1103,15 @@ void init_ino_entry_info(struct f2fs_sb_info *sbi)
>> >}
>> >
>> >/*
>> > -   * considering 512 blocks in a segment 8 blocks are needed for cp
>> > -   * and log segment summaries. Remaining blocks are used to keep
>> > -   * orphan entries with the limitation one reserved segment
>> > -   * for cp pack we can have max 1020*504 orphan entries
>> > +   * considering 512 blocks in a segment 8+cp_payload blocks are
>> > +   * needed for cp and log segment summaries. Remaining blocks are
>> > +   * used to keep orphan entries with the limitation one reserved
>> > +   * segment for cp pack we can have max 1020*(504-cp_payload)
>> > +   * orphan entries
>> > */
>> 
>> Hi all,
>> 
>> I think the code below gives us enough information, so the comment above
>> isn't needed. And someone could get confused by the 1020 constant.
>> What do you think about removing the comment?
>
>I agree with you.
>
>There is nothing special that needs attention in the statement below;
>its meaning can be read easily, since each macro in the statement
>clearly indicates its own meaning.
>
>So could you send another patch to remove it?

Agreed. You can clean it up. ;-)

Regards,
Wanpeng Li

>
>Thanks,
>
>> 
>> Regards,
>> Changman
>> 
>> >sbi->max_orphans = (sbi->blocks_per_seg - F2FS_CP_PACKS -
>> > -  NR_CURSEG_TYPE) * F2FS_ORPHANS_PER_BLOCK;
>> > +  NR_CURSEG_TYPE - __cp_payload(sbi)) *
>> > +  F2FS_ORPHANS_PER_BLOCK;
>> >  }
>> >
>> >  int __init create_checkpoint_caches(void)
>> > --
>> > 1.9.1
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
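
A minimal sketch of the arithmetic discussed in this thread, written as
stand-alone C rather than kernel code. The constant values (2 checkpoint
packs, 6 current-segment types, 1020 orphan entries per block) are
assumptions chosen to match the "8 blocks" and "1020*504" figures in the
comment being removed; max_orphans() is a hypothetical helper, not the
f2fs function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, picked to agree with the figures in the old comment. */
#define F2FS_CP_PACKS           2
#define NR_CURSEG_TYPE          6
#define F2FS_ORPHANS_PER_BLOCK  1020

/*
 * Hypothetical helper: blocks left in the first segment after the cp
 * packs, the segment summaries and the cp_payload, times the number of
 * orphan entries that fit in one block.
 */
static uint32_t max_orphans(uint32_t blocks_per_seg, uint32_t cp_payload)
{
	return (blocks_per_seg - F2FS_CP_PACKS - NR_CURSEG_TYPE - cp_payload) *
	       F2FS_ORPHANS_PER_BLOCK;
}
```

With 512 blocks per segment and no cp_payload this gives 504 * 1020
entries; every cp_payload block lowers the limit by another 1020.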

Re: [PATCH 1/5] mtd: nand: vf610_nfc: Freescale NFC for VF610, MPC5125 and others

2015-03-05 Thread Shawn Guo

On Thu, Mar 05, 2015 at 12:10:20AM +0100, Stefan Agner wrote:
> This driver supports Freescale NFC (NAND flash controller) found on
> Vybrid (VF610), MPC5125, MCF54418 and Kinetis K70.
> 
> Limitations:
> - DMA and pipelining not used
> - Pages larger than 2k are not supported
> - No hardware ECC
> 
> The driver has only been tested on Vybrid (VF610).
> 
> Signed-off-by: Bill Pringlemeir 
> Signed-off-by: Stefan Agner 
> ---
>  arch/arm/mach-imx/Kconfig|   1 +

This change shouldn't be part of driver patch.

Shawn

>  drivers/mtd/nand/Kconfig |  12 +
>  drivers/mtd/nand/Makefile|   1 +
>  drivers/mtd/nand/vf610_nfc.c | 730 
> +++
>  4 files changed, 744 insertions(+)
>  create mode 100644 drivers/mtd/nand/vf610_nfc.c
> 
> diff --git a/arch/arm/mach-imx/Kconfig b/arch/arm/mach-imx/Kconfig
> index e8627e0..de4a51a 100644
> --- a/arch/arm/mach-imx/Kconfig
> +++ b/arch/arm/mach-imx/Kconfig
> @@ -634,6 +634,7 @@ config SOC_VF610
>   select ARM_GIC
>   select PINCTRL_VF610
>   select PL310_ERRATA_769419 if CACHE_L2X0
> + select HAVE_NAND_VF610_NFC
>  
>   help
> This enable support for Freescale Vybrid VF610 processor.
> diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig
> index 5b76a17..1be30a6 100644
> --- a/drivers/mtd/nand/Kconfig
> +++ b/drivers/mtd/nand/Kconfig
> @@ -455,6 +455,18 @@ config MTD_NAND_MPC5121_NFC
> This enables the driver for the NAND flash controller on the
> MPC5121 SoC.
>  
> +config HAVE_NAND_VF610_NFC
> + bool
> +
> +config MTD_NAND_VF610_NFC
> + tristate "Support for Freescale NFC for VF610/MPC5125"
> + depends on HAVE_NAND_VF610_NFC
> + help
> +   Enables support for NAND Flash Controller on some Freescale
> +   processors like the VF610, MPC5125, MCF54418 or Kinetis K70.
> +   The driver supports a maximum 2k page size. The driver
> +   currently does not support hardware ECC.
> +
>  config MTD_NAND_MXC
>   tristate "MXC NAND support"
>   depends on ARCH_MXC
> diff --git a/drivers/mtd/nand/Makefile b/drivers/mtd/nand/Makefile
> index 582bbd05..e97ca7b 100644
> --- a/drivers/mtd/nand/Makefile
> +++ b/drivers/mtd/nand/Makefile
> @@ -45,6 +45,7 @@ obj-$(CONFIG_MTD_NAND_SOCRATES) += socrates_nand.o
>  obj-$(CONFIG_MTD_NAND_TXX9NDFMC) += txx9ndfmc.o
>  obj-$(CONFIG_MTD_NAND_NUC900) += nuc900_nand.o
>  obj-$(CONFIG_MTD_NAND_MPC5121_NFC) += mpc5121_nfc.o
> +obj-$(CONFIG_MTD_NAND_VF610_NFC) += vf610_nfc.o
>  obj-$(CONFIG_MTD_NAND_RICOH) += r852.o
>  obj-$(CONFIG_MTD_NAND_JZ4740) += jz4740_nand.o
>  obj-$(CONFIG_MTD_NAND_GPMI_NAND) += gpmi-nand/
> diff --git a/drivers/mtd/nand/vf610_nfc.c b/drivers/mtd/nand/vf610_nfc.c
> new file mode 100644
> index 000..101fd20
> --- /dev/null
> +++ b/drivers/mtd/nand/vf610_nfc.c
> @@ -0,0 +1,730 @@
> +/*
> + * Copyright 2009-2015 Freescale Semiconductor, Inc. and others
> + *
> + * Description: MPC5125, VF610, MCF54418 and Kinetis K70 Nand driver.
> + * Jason ported to M54418TWR and MVFA5 (VF610).
> + * Authors: Stefan Agner 
> + *  Bill Pringlemeir 
> + *  Shaohui Xie 
> + *  Jason Jin 
> + *
> + * Based on original driver mpc5121_nfc.c.
> + *
> + * This is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * Limitations:
> + * - Untested on MPC5125 and M54418.
> + * - DMA not used.
> + * - 2K pages or less.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DRV_NAME "vf610_nfc"
> +
> +/* Register Offsets */
> +#define NFC_FLASH_CMD1   0x3F00
> +#define NFC_FLASH_CMD2   0x3F04
> +#define NFC_COL_ADDR 0x3F08
> +#define NFC_ROW_ADDR 0x3F0c
> +#define NFC_ROW_ADDR_INC 0x3F14
> +#define NFC_FLASH_STATUS1 0x3F18
> +#define NFC_FLASH_STATUS2 0x3F1c
> +#define NFC_CACHE_SWAP   0x3F28
> +#define NFC_SECTOR_SIZE  0x3F2c
> +#define NFC_FLASH_CONFIG 0x3F30
> +#define NFC_IRQ_STATUS   0x3F38
> +
> +/* Addresses for NFC MAIN RAM BUFFER areas */
> +#define NFC_MAIN_AREA(n) ((n) *  0x1000)
> +
> +#define PAGE_2K  0x0800
> +#define OOB_64   0x0040
> +
> +/*
> + * NFC_CMD2[CODE] values. See section:
> + *  - 31.4.7 Flash Command Code Description, Vybrid manual
> + *  - 23.8.6 Flash Command Sequencer, MPC5125 manual
> + *
> + * Briefly these are bitmasks of controller cycles.
> + */
> +#define READ_PAGE_CMD_CODE   0x7EE0
> +#define PROGRAM

Re: [PATCH] pci: host: xgene: fix incorrectly returned address by map_bus

2015-03-05 Thread Bjorn Helgaas

On Tue, Feb 17, 2015 at 03:14:00PM -0800, Feng Kan wrote:
> The generic accessor functions for pci-xgene use a map_bus
> call that returns the base address but does not add the additional
> offset.
> 
> Signed-off-by: Feng Kan 

Applied to for-linus for v4.0, with acks from Tanmay and Rob.  Thanks!

> ---
>  drivers/pci/host/pci-xgene.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/host/pci-xgene.c b/drivers/pci/host/pci-xgene.c
> index aab5547..ee082c0 100644
> --- a/drivers/pci/host/pci-xgene.c
> +++ b/drivers/pci/host/pci-xgene.c
> @@ -127,7 +127,7 @@ static bool xgene_pcie_hide_rc_bars(struct pci_bus *bus, 
> int offset)
>   return false;
>  }
>  
> -static int xgene_pcie_map_bus(struct pci_bus *bus, unsigned int devfn,
> +static void __iomem *xgene_pcie_map_bus(struct pci_bus *bus, unsigned int 
> devfn,
> int offset)
>  {
>   struct xgene_pcie_port *port = bus->sysdata;
> @@ -137,7 +137,7 @@ static int xgene_pcie_map_bus(struct pci_bus *bus, 
> unsigned int devfn,
>   return NULL;
>  
>   xgene_pcie_set_rtdid_reg(bus, devfn);
> - return xgene_pcie_get_cfg_base(bus);
> + return xgene_pcie_get_cfg_base(bus) + offset;
>  }
>  
>  static struct pci_ops xgene_pcie_ops = {
> -- 
> 1.9.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] pci: host: xgene: fix incorrectly returned address by map_bus

2015-03-05 Thread Bjorn Helgaas

On Thu, Mar 05, 2015 at 02:57:55PM -0600, Rob Herring wrote:
> On Thu, Mar 5, 2015 at 10:38 AM, Bjorn Helgaas  wrote:
> > [+cc Mark]
> >
> > On Thu, Feb 26, 2015 at 06:21:51PM -0600, Bjorn Helgaas wrote:
> >> On Tue, Feb 17, 2015 at 03:14:00PM -0800, Feng Kan wrote:
> >> > The generic accessor functions for pci-xgene use a map_bus
> >> > call that returns the base address but does not add the additional
> >> > offset.
> >> >
> >> > Signed-off-by: Feng Kan 
> >> > ...

> >> > @@ -137,7 +137,7 @@ static int xgene_pcie_map_bus(struct pci_bus *bus, 
> >> > unsigned int devfn,
> >> > return NULL;
> >> >
> >> > xgene_pcie_set_rtdid_reg(bus, devfn);
> >> > -   return xgene_pcie_get_cfg_base(bus);
> >> > +   return xgene_pcie_get_cfg_base(bus) + offset;
> >>
> >> Where's the locking here?  ECAM doesn't need locking because the
> >> bus/dev/fn/offset is all encoded in the MMIO address.  But it looks
> >> like X-Gene doesn't work that way and bus/dev/fn is in the RTDID register.
> >>
> >> So it seems like X-Gene needs locking that not everybody needs.  Are you
> >> relying on higher-level locking somewhere?
> >> ...
> 
> There's no locking problem. The config accesses are all within the
> pci_lock spinlock and nothing else touches that register.

M.  Yes, you're right.  pci_bus_{read,write}_config_{byte,word,dword}()
all acquire pci_lock.  For anybody following along at home, here's the
path I was concerned about:

pci_read_config_byte
  pci_bus_read_config_byte
    lock(&pci_lock)                   # acquire pci_lock
    bus->ops->read/write              # struct pci_ops
      pci_generic_config_read         # gen_pci_ops
        bus->ops->map_bus
          xgene_pcie_map_bus          # xgene_pcie_ops
            xgene_pcie_set_rtdid_reg
              writel                  # requires mutex
        readb                         # config read

I'm not exactly sure *why* we do locking there, other than we're just
too scared to change it.  As far as I know, methods like ECAM shouldn't
require that lock, so it's sort of a shame to do it at the top level
like that.

Some of the low-level routines, like pci_{conf1,conf2,bios}, also use a
lock (pci_config_lock in these cases).  We do need it there because a
few paths do call the low-level routines directly.

Here's a typical path on x86:

pci_read_config_byte
  pci_bus_read_config_byte
    lock(&pci_lock)                   # acquire pci_lock
    bus->ops->read/write              # struct pci_ops
      pci_read                        # x86 pci_root_ops
        raw_pci_read
          raw_pci_ops->read
            pci_conf1_read            # x86 raw_pci_ops
              lock(&pci_config_lock)  # acquire pci_config_lock

And here's a path on x86 that uses the low-level routines directly and
requires the locking there:

acpi_os_read_pci_configuration
  raw_pci_read
    raw_pci_ops->read
      pci_conf1_read
        lock(&pci_config_lock)

So ideally I think the locking would be done in the low-level routines
that need it, and we could do without pci_lock.  But I don't know
whether that's practical at this point or not.

Bjorn
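
A minimal user-space sketch of "locking in the low-level routine that
needs it", as the last paragraph of the message above suggests. It is not
kernel code: a pthread mutex stands in for the spinlock, a plain variable
stands in for an RTDID-style selection register, and every name here
(cfg_lock, selected_devfn, cfg_space, indirect_cfg_read8) is hypothetical.
The point is only that selecting the target and reading the data are two
steps that must sit under one lock, whereas an ECAM-style access that
encodes bus/dev/fn in the address would not need it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NR_FUNCS 4
#define CFG_SIZE 16

/* Lock owned by the accessor itself, not by a top-level caller. */
static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int selected_devfn;            /* stands in for the selection register */
static uint8_t cfg_space[NR_FUNCS][CFG_SIZE];  /* fake config space */

/*
 * Two-step indirect read: select the function, then read the byte.
 * Without the lock, a concurrent caller could change selected_devfn
 * between the two steps.
 */
static int indirect_cfg_read8(unsigned int devfn, unsigned int offset,
			      uint8_t *val)
{
	if (devfn >= NR_FUNCS || offset >= CFG_SIZE)
		return -1;

	pthread_mutex_lock(&cfg_lock);
	selected_devfn = devfn;                    /* the "writel" step */
	*val = cfg_space[selected_devfn][offset];  /* the "readb" step */
	pthread_mutex_unlock(&cfg_lock);

	return 0;
}
```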

Re: [REGRESSION in 3.18][PPC] PA Semi fails to boot after: of/base: Fix PowerPC address parsing hack

2015-03-05 Thread Benjamin Herrenschmidt

On Thu, 2015-03-05 at 17:12 -0500, Steven Rostedt wrote:
> A bug in ftrace was reported to me that affects ARM and ARM64 but not
> x86. Looking at the code it appears to affect PowerPC as well. So I
> booted up my old PA Semi, to give it a try. The last time I booted it
> was for a 3.17 kernel. Unfortunately, for 4.0-rc2 it crashed with:

Argh. Well, we have one of these here but Michael who owns it is off til
Tuesday. 

Can you shoot me the DT (/proc/device-tree in a tarball) ? Olof, can the
DT be updated on this thing or should we add workarounds to Linux if
something is really missing ?

Cheers,
Ben.

> Unable to handle kernel paging request for data at address 0x
> Faulting instruction address: 0xc05cef88
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=2 PA Semi PWRficient
> Modules linked in:
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.0.0-rc2-test #50
> task: c0003816cb60 ti: c000381a4000 task.ti: c000381a4000
> NIP: c05cef88 LR: c007c1a0 CTR: c007c184
> REGS: c000381a7a00 TRAP: 0300   Not tainted  (4.0.0-rc2-test)
> MSR: 90009032   CR: 2228  XER: 
> DAR:  DSISR: 4000 SOFTE: 0 
> GPR00: c007c1a0 c000381a7c80 c0af4b98 0001 
> GPR04:   04ba 3d6de000 
> GPR08: 0100  c000381a4080  
> GPR12: 24044042 c300 ffed  
> GPR16: c0afb920 c000381a4000 c09ad648 c09ae580 
> GPR20: c000381a4080 c000381a4000 c000381a4080 c000381a4000 
> GPR24: c000381a4000 c000381a4000 c0afb880 c000381a4000 
> GPR28: c09f8790  c000381a4000 c0b02168 
> NIP [c05cef88] .check_astate+0x28/0x50
> LR [c007c1a0] sleep_common+0x14/0x74
> Call Trace:
> [c000381a7c80] [c0afb880] 0xc0afb880 (unreliable)
> [c000381a7cf0] [c007c1a0] sleep_common+0x14/0x74
> [c000381a7d30] [c00130f0] .arch_cpu_idle+0x70/0x160
> [c000381a7db0] [c00d6660] .cpu_startup_entry+0x320/0x5a0
> [c000381a7ee0] [c0034570] .start_secondary+0x290/0x2c0
> [c000381a7f90] [c0008bfc] start_secondary_prolog+0x10/0x14
> Instruction dump:
> 6000 6000 7c0802a6 f8010010 f821ff91 6000 6000 3d220003 
> 39296870 a86d0038 e9290010 7c0004ac <7c004c2c> 0c00 4c00012c 5463103a 
> ---[ end trace 40e864a431826b26 ]---
> 
> I kicked off a ktest bisect, and it came down to this commit:
> 
> commit 746c9e9f92dde2789908e51a354ba90a1962a2eb
> Author: Benjamin Herrenschmidt 
> Date:   Fri Nov 14 17:55:03 2014 +1100
> 
> of/base: Fix PowerPC address parsing hack
> 
> When I revert this from v4.0-rc2, I can successfully boot my PA Semi
> again.
> 
> -- Steve



Re: NMI watchdog triggering during load_balance

2015-03-05 Thread Mike Galbraith

On Thu, 2015-03-05 at 21:05 -0700, David Ahern wrote:
> Hi Peter/Mike/Ingo:
> 
> I've been banging my head against this wall for a week now and hoping you or 
> someone could shed some light on the problem.
> 
> On larger systems (256 to 1024 cpus) there are several use cases (e.g., 
> http://www.cs.virginia.edu/stream/) that regularly trigger the NMI 
> watchdog with the stack trace:
> 
> Call Trace:
> @  [0045d3d0] double_rq_lock+0x4c/0x68
> @  [004699c4] load_balance+0x278/0x740
> @  [008a7b88] __schedule+0x378/0x8e4
> @  [008a852c] schedule+0x68/0x78
> @  [0042c82c] cpu_idle+0x14c/0x18c
> @  [008a3a50] after_lock_tlb+0x1b4/0x1cc
> 
> Capturing data for all CPUs I tend to see load_balance related stack 
> traces on 700-800 cpus, with a few hundred blocked on _raw_spin_trylock_bh.
> 
> I originally thought it was a deadlock in the rq locking, but if I bump 
> the watchdog timeout the system eventually recovers (after 10-30+ 
> seconds of unresponsiveness) so it does not seem likely to be a deadlock.
> 
> This particular system has 1024 cpus:
> # lscpu
> Architecture:          sparc64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Big Endian
> CPU(s):                1024
> On-line CPU(s) list:   0-1023
> Thread(s) per core:    8
> Core(s) per socket:    4
> Socket(s):             32
> NUMA node(s):          4
> NUMA node0 CPU(s):     0-255
> NUMA node1 CPU(s):     256-511
> NUMA node2 CPU(s):     512-767
> NUMA node3 CPU(s):     768-1023
> 
> and there are 4 scheduling domains. An example of the domain debug 
> output (condensed for the email):
> 
> CPU970 attaching sched-domain:
>   domain 0: span 968-975 level SIBLING
>groups: 8 single CPU groups
>domain 1: span 968-975 level MC
> groups: 1 group with 8 cpus
> domain 2: span 768-1023 level CPU
>  groups: 4 groups with 256 cpus per group

Wow, that topology is horrid.  I'm not surprised that your box is
writhing in agony.  Can you twiddle that?

-Mike


Re: [PATCH 12/38] perf tools: Introduce thread__comm_time() helpers

2015-03-05 Thread Namhyung Kim

Hi Frederic and Arnaldo,

On Thu, Mar 05, 2015 at 05:08:56PM +0100, Frederic Weisbecker wrote:
> On Wed, Mar 04, 2015 at 09:02:55AM +0900, Namhyung Kim wrote:
> > Hi Frederic,
> > 
> > On Tue, Mar 03, 2015 at 05:28:40PM +0100, Frederic Weisbecker wrote:
> > > On Tue, Mar 03, 2015 at 12:07:24PM +0900, Namhyung Kim wrote:
> > > > When data file indexing is enabled, it processes all task, comm and mmap
> > > > events first and then goes to the sample events.  So all it sees is the
> > > > last comm of a thread although it has information at the time of sample.
> > > > 
> > > > Sort thread's comm by time so that it can find appropriate comm at the
> > > > sample time.  The thread__comm_time() will mostly work even if
> > > > PERF_SAMPLE_TIME bit is off since in that case, sample->time will be
> > > > -1 so it'll take the last comm anyway.
> > > > 
> > > > Cc: Frederic Weisbecker 
> > > > Signed-off-by: Namhyung Kim 
> > > > ---
> > > >  tools/perf/util/thread.c | 33 -
> > > >  tools/perf/util/thread.h |  2 ++
> > > >  2 files changed, 34 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
> > > > index 9ebc8b1f9be5..ad96725105c2 100644
> > > > --- a/tools/perf/util/thread.c
> > > > +++ b/tools/perf/util/thread.c
> > > > @@ -103,6 +103,21 @@ struct comm *thread__exec_comm(const struct thread 
> > > > *thread)
> > > > return last;
> > > >  }
> > > >  
> > > > +struct comm *thread__comm_time(const struct thread *thread, u64 
> > > > timestamp)
> > > 
> > > Usually thread__comm_foo() would suggest that we return the "foo" from a 
> > > thread comm.
> > > For example thread__comm_len() returns the len of the last thread comm.
> > > thread__comm_str() returns the string of the last thread comm.
> > 
> > Ah, okay.
> 
> I mean, that's just an impression, others may have a different one :o)

Right.  Although I agree with your idea of function naming, I'm not
sure it's worth changing every function call site for this - and for
similar machine__find(new)_thread()_time.

Arnaldo, what do you think?

Thanks,
Namhyung
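
A minimal sketch of the time-based comm lookup described in the quoted
commit message, as stand-alone C. It is not the perf implementation:
struct comm_entry and comm_at_time() are hypothetical, and the entries are
assumed to be kept sorted newest first. A timestamp of (u64)-1, which the
commit message says is what a sample carries when PERF_SAMPLE_TIME is off,
naturally selects the newest comm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: the time a comm took effect and its string. */
struct comm_entry {
	uint64_t start;
	const char *str;
};

/*
 * comms[] is assumed to be sorted newest first. Return the first entry
 * that was already in effect at 'timestamp'; if the sample predates all
 * of them, fall back to the oldest entry. Returns NULL for an empty list.
 */
static const struct comm_entry *comm_at_time(const struct comm_entry *comms,
					     size_t n, uint64_t timestamp)
{
	for (size_t i = 0; i < n; i++) {
		if (timestamp >= comms[i].start)
			return &comms[i];
	}
	return n ? &comms[n - 1] : NULL;
}
```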

Re: [PATCH] ARM: dts: imx: Add dr_mode host setting to all host-only usb instances

2015-03-05 Thread Shawn Guo

On Fri, Feb 27, 2015 at 09:06:00AM -0500, Matt Porter wrote:
> The chipidea driver adds an extra line of spam to the log when a
> host-only chipidea instance is left set to the default of a dual role
> controller.
> 
> [2.010873] ci_hdrc ci_hdrc.1: doesn't support gadget
> 
> Set the dr_mode property to host on all the host-only nodes
> to avoid this warning.
> 
> Signed-off-by: Matt Porter 

Applied, thanks.

Regression caused by using node_to_bdi()

2015-03-05 Thread Zhao Lei

Hi, Christoph Hellwig

resend: + cc lkml

I found a regression in v4.0-rc1 caused by this patch:
 Author: Christoph Hellwig 
 Date:   Wed Jan 14 10:42:36 2015 +0100
 fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info

The test process is as follows:
 2015-02-25 15:50:22: Start
 2015-02-25 15:50:22: Linux version:Linux btrfs 
4.0.0-rc1_HEAD_c517d838eb7d07bbe9507871fab3931deccff539_ #1 SMP Wed Feb 25 
10:59:10 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
 2015-02-25 15:50:25: mkfs.btrfs -f /dev/sdb1
 2015-02-25 15:50:27: mount /dev/sdb1 /data/ltf/tester
 2015-02-25 15:50:28: sysbench --test=fileio --num-threads=1 --file-num=1 
--file-block-size=32768 --file-total-size=4G --file-test-mode=seqwr 
--file-io-mode=sync --file-extra-flags= --file-fsync-freq=0 
--file-fsync-end=off --max-requests=131072
 2015-02-25 15:51:40: done sysbench

The result is as follows:
 v3.19-rc1: testcnt=40 average=135.677 range=[132.460,139.130] stdev=1.610 
cv=1.19%
 v4.0-rc1: testcnt=40 average=130.970 range=[127.980,132.050] stdev=1.012 
cv=0.77%

Then I bisected the above case between v3.19-rc1 and v4.0-rc1, and found that
this patch caused the regression.

Maybe it is because the kernel needs more time to call inode_to_bdi(), compared
with "using inode->i_mapping->backing_dev_info directly" in the old code.

Is there some way to speed it up (inlining, or accessing some variable in the
struct directly, ...)?

Thanks
Zhaolei




Re: [PATCH v9 00/21] Introduce ACPI for ARM64 based on ACPI 5.1

2015-03-05 Thread Hanjun Guo

On 2015/3/6 2:57, Olof Johansson wrote:
> Hi,

Hi Olof,

>
> On Wed, Feb 25, 2015 at 04:39:40PM +0800, Hanjun Guo wrote:
>> Changes since v8:
>>   - remove MPIDR packing things by introducing phys_cpuid_t;
>>
>>   - update patch acpi: fix acpi_os_ioremap for arm64 to follow
>> Rafael's suggestion;
>>
>>   - Squash patch (disable ACPI if ACPI less than 5.1) to patch
>> (Get RSDP and ACPI boot-time table);
>>
>>   - Move sleep_arm.c to arch/arm64/ and rename it as acpi_sleep.c 
>>
>>   - Rework the uefi generated empty dtb to enable acpi when no dtb
>> is available, thanks Ard for the updated patch.
>>
>>   - rework the function of register cpu for kexec case
>>
>>   - use pr_debug() instead of pr_info() when scanning MADT.
>>
>>   - rebase on top of 4.0-rc1
>>
> I've looked at most of the arch code besides GIC and some of the timer stuff,
> which I might revisit later, but the pieces I've seen seem reasonable. I've
> acked individual patches.

Thank you very much for the ACKs and review comments!

>
> There are some cleanups to be made, but that can be done incrementally on top,
> it's all internal implementation details.

Definitely in my TODO list :)

>
> I also haven't looked closely at the documentation patches yet, so I might
> have some comments on those showing up.

OK, thanks.

Regards
Hanjun


Re: [PATCH 08/38] perf record: Add --index option for building index table

2015-03-05 Thread Namhyung Kim

On Thu, Mar 05, 2015 at 08:56:44AM +0100, Jiri Olsa wrote:
> On Tue, Mar 03, 2015 at 12:07:20PM +0900, Namhyung Kim wrote:
> 
> SNIP
> 
> > +static int record__merge_index_files(struct record *rec, int nr_index)
> > +{
> > +   int i;
> > +   int ret = -1;
> > +   u64 offset;
> > +   char path[PATH_MAX];
> > +   struct perf_file_section *idx;
> > +   struct perf_data_file *file = &rec->file;
> > +   struct perf_session *session = rec->session;
> > +   int output_fd = perf_data_file__fd(file);
> > +
> > +   /* +1 for header file itself */
> > +   nr_index++;
> > +
> > +   idx = calloc(nr_index, sizeof(*idx));
> > +   if (idx == NULL)
> > +   goto out_close;
> > +
> > +   offset = lseek(output_fd, 0, SEEK_END);
> > +
> > +   idx[0].offset = session->header.data_offset;
> > +   idx[0].size   = offset - idx[0].offset;
> > +
> > +   for (i = 1; i < nr_index; i++) {
> > +   struct stat stbuf;
> > +   int fd = rec->fds[i];
> > +
> > +   if (fstat(fd, &stbuf) < 0)
> > +   goto out_close;
> > +
> > +   idx[i].offset = offset;
> > +   idx[i].size   = stbuf.st_size;
> > +
> > +   offset += stbuf.st_size;
> > +   }
> > +
> > +   /* copy sample events */
> > +   for (i = 1; i < nr_index; i++) {
> > +   int fd = rec->fds[i];
> > +
> > +   if (idx[i].size == 0)
> > +   continue;
> > +
> > +   if (copyfile_offset(fd, 0, output_fd, idx[i].offset,
> > +   idx[i].size) < 0)
> > +   goto out_close;
> > +   }
> 
> why not do the copy in previous loop as well?

I will change it in the next version.

Thanks,
Namhyung
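
A minimal sketch of the single-loop variant Jiri asks about, using
in-memory buffers in place of file descriptors so that it stands alone.
It is not the perf code: struct section and merge_sections() are
hypothetical, and memcpy() stands in for copyfile_offset(). Each source's
offset and size are recorded and its data copied in the same pass;
empty sources are still indexed but nothing is copied for them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an index entry: where a source landed. */
struct section {
	uint64_t offset;
	uint64_t size;
};

/*
 * Append n source buffers to 'out', starting at 'offset', and fill in
 * idx[] in the same loop. Returns the end offset, or -1 if a source
 * would not fit within out_cap bytes.
 */
static int64_t merge_sections(uint8_t *out, uint64_t out_cap, uint64_t offset,
			      const uint8_t *const *src,
			      const uint64_t *src_size,
			      struct section *idx, int n)
{
	for (int i = 0; i < n; i++) {
		idx[i].offset = offset;
		idx[i].size = src_size[i];

		if (src_size[i] == 0)
			continue;	/* nothing to copy for this source */

		if (src_size[i] > out_cap || offset > out_cap - src_size[i])
			return -1;

		memcpy(out + offset, src[i], src_size[i]);
		offset += src_size[i];
	}
	return (int64_t)offset;
}
```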

Re: [PATCH 07/38] perf tools: Handle indexed data file properly

2015-03-05 Thread Namhyung Kim

Hi Jiri,

On Wed, Mar 04, 2015 at 05:19:54PM +0100, Jiri Olsa wrote:
> On Tue, Mar 03, 2015 at 12:07:19PM +0900, Namhyung Kim wrote:
> > When perf detects data file has index table, process header part first
> > and then rest data files in a row.  Note that the indexed sample data is
> > recorded for each cpu/thread separately, it's already ordered with
> > respect to themselves so no need to use the ordered event queue
> > interface.
> > 
> > Signed-off-by: Namhyung Kim 
> > ---
> >  tools/perf/util/session.c | 62 
> > ++-
> >  tools/perf/util/session.h |  5 
> >  2 files changed, 55 insertions(+), 12 deletions(-)
> > 
> > diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> > index e4f166981ff0..00cd1ad427be 100644
> > --- a/tools/perf/util/session.c
> > +++ b/tools/perf/util/session.c
> > @@ -1300,11 +1300,10 @@ fetch_mmaped_event(struct perf_session *session,
> >  #define NUM_MMAPS 128
> >  #endif
> >  
> > -static int __perf_session__process_events(struct perf_session *session,
> > +static int __perf_session__process_events(struct perf_session *session, 
> > int fd,
> >   u64 data_offset, u64 data_size,
> >   u64 file_size, struct perf_tool *tool)
> >  {
> > -   int fd = perf_data_file__fd(session->file);
> 
> why is 'fd' passed separately here? we have single file now
> and the only 'file::fd' we use is in session, no?

You're right, it's a leftover from the old code.  Will change.

Thanks,
Namhyung

[PATCH v4 2/5] irqchip: gicv3-its: use 64KB page as default granule

2015-03-05 Thread Yun Wu

The page size field in register GITS_BASERn might be read-only
if an implementation only supports a single, fixed page size. But
currently the ITS driver will throw an error when PAGE_SIZE
is less than the minimum size supported by an ITS. So address
this problem by using 64KB pages as the default granule for all the
ITS base tables.

Acked-by: Marc Zyngier 
Signed-off-by: Yun Wu 
---
 drivers/irqchip/irq-gic-v3-its.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 69eeea3..f5bfa42 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -800,7 +800,7 @@ static int its_alloc_tables(struct its_node *its)
 {
int err;
int i;
-   int psz = PAGE_SIZE;
+   int psz = SZ_64K;
u64 shr = GITS_BASER_InnerShareable;

for (i = 0; i < GITS_BASER_NR_REGS; i++) {
--
1.8.0



[PATCH v4 0/5] enhance configuring an ITS

2015-03-05 Thread Yun Wu

This patch series makes some enhancements to ITS configuration in the
following aspects:

o make allocation of the ITS tables more sensible
o replace magic numbers with sensible macros
o guarantee a safe quiescent status before initializing an ITS

This patch series is based on Marc's branch[1], and tested on a HiSilicon
ARM64 board with GICv3 ITS hardware.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git 
irq/gic-fixes

v3 -> v4:
o Spell out device table instead of DT to avoid misunderstanding
o Change its_check_quiesced() to a more sensible name its_force_quiescent()

v2 -> v3:
o drop the patch of tracing LPI enabling status since Vladimir Murzin
  had already posted a similar patch
o fix several improper description issues

v1 -> v2:
o rebase to Marc's GIC fix branch
o drop size calculation for Device Table since Marc had already posted one
o guarantees a safe quiescent status before initializing an ITS as
  Marc suggested, rather than register a reboot notifier
o fix an issue about the enabling status of LPI feature

Yun Wu (5):
  irqchip: gicv3-its: zero itt before handling to hardware
  irqchip: gicv3-its: use 64KB page as default granule
  irqchip: gicv3-its: add limitation to page order
  irqchip: gicv3-its: define macros for GITS_CTLR fields
  irqchip: gicv3-its: support safe initialization

 drivers/irqchip/irq-gic-v3-its.c   | 46 +++---
 include/linux/irqchip/arm-gic-v3.h |  3 +++
 2 files changed, 46 insertions(+), 3 deletions(-)

--
1.8.0



[PATCH v4 1/5] irqchip: gicv3-its: zero itt before handling to hardware

2015-03-05 Thread Yun Wu

Some brain-dead implementations choose to insert ITEs in a rapid
sequence of disabled ITEs, and an un-zeroed ITT will confuse the ITS
when judging whether an ITE is really enabled or not. Considering
that such implementations are still permitted by the GICv3 architecture,
in which the ITT is not required to be zeroed before being handed to
hardware, we do the favor in the ITS driver.

Acked-by: Marc Zyngier 
Signed-off-by: Yun Wu 
---
 drivers/irqchip/irq-gic-v3-its.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 6850141..69eeea3 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1076,7 +1076,7 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
nr_ites = max(2UL, roundup_pow_of_two(nvecs));
sz = nr_ites * its->ite_size;
sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
-   itt = kmalloc(sz, GFP_KERNEL);
+   itt = kzalloc(sz, GFP_KERNEL);
lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);

if (!dev || !itt || !lpi_map) {
--
1.8.0
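
As a rough user-space illustration of the sizing and zeroing in the hunk above (not part of the patch): the sketch below mirrors the size computation from the diff context and uses calloc() where the kernel code uses kzalloc(). The 256-byte alignment and every *_sketch name are assumptions made for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed alignment; stands in for ITS_ITT_ALIGN in the driver. */
#define ITT_ALIGN_SKETCH 256UL

/* Smallest power of two that is >= n (n >= 1). */
static unsigned long roundup_pow2_sketch(unsigned long n)
{
    unsigned long p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Mirrors: nr_ites = max(2, roundup_pow_of_two(nvecs));
 *          sz = max(nr_ites * ite_size, ALIGN) + ALIGN - 1; */
static size_t itt_size_sketch(unsigned long nvecs, size_t ite_size)
{
    unsigned long nr_ites = roundup_pow2_sketch(nvecs);
    size_t sz;

    if (nr_ites < 2)
        nr_ites = 2;
    sz = nr_ites * ite_size;
    if (sz < ITT_ALIGN_SKETCH)
        sz = ITT_ALIGN_SKETCH;
    return sz + ITT_ALIGN_SKETCH - 1;
}

/* Returns a zero-filled buffer, so no stale bytes can look like enabled entries. */
static unsigned char *itt_alloc_zeroed_sketch(unsigned long nvecs,
                                              size_t ite_size, size_t *out_sz)
{
    size_t sz = itt_size_sketch(nvecs, ite_size);

    if (out_sz)
        *out_sz = sz;
    return calloc(1, sz);
}
```

The caller owns the returned buffer and releases it with free().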



[PATCH v4 3/5] irqchip: gicv3-its: add limitation to page order

2015-03-05 Thread Yun Wu

When the required size of the Device Table exceeds what the page
allocator can provide, the whole ITS fails to probe. This is not a
hardware problem; it is mainly a limitation of the kernel page
allocator. This patch lets the ITS proceed to the next
initialization stage with an explicit warning.

Acked-by: Marc Zyngier 
Signed-off-by: Yun Wu 
---
 drivers/irqchip/irq-gic-v3-its.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index f5bfa42..e8bda0b 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -828,6 +828,11 @@ static int its_alloc_tables(struct its_node *its)
u32 ids = GITS_TYPER_DEVBITS(typer);

order = get_order((1UL << ids) * entry_size);
+   if (order >= MAX_ORDER) {
+   order = MAX_ORDER - 1;
+   pr_warn("%s: Device Table too large, reduce its page order to %u\n",
+   its->msi_chip.of_node->full_name, order);
+   }
}

alloc_size = (1 << order) * PAGE_SIZE;
--
1.8.0
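
To make the clamp above concrete, here is a small user-space sketch of the order computation (not part of the patch). The 4 KiB page size, the MAX_ORDER value of 11, and all *_sketch names are assumptions chosen for illustration; real values depend on the kernel configuration.

```c
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096ULL  /* assumed page size */
#define MAX_ORDER_SKETCH 11U      /* assumed allocator limit */

/* Smallest order such that (1 << order) pages cover `size` bytes. */
static unsigned int get_order_sketch(uint64_t size)
{
    uint64_t pages = (size + PAGE_SIZE_SKETCH - 1) / PAGE_SIZE_SKETCH;
    unsigned int order = 0;

    while (((uint64_t)1 << order) < pages)
        order++;
    return order;
}

/* Order for a table of (1 << ids) entries, clamped to what the
 * allocator can serve; *clamped reports whether the clamp fired. */
static unsigned int device_table_order_sketch(unsigned int ids,
                                              uint64_t entry_size,
                                              int *clamped)
{
    unsigned int order = get_order_sketch(((uint64_t)1 << ids) * entry_size);

    *clamped = 0;
    if (order >= MAX_ORDER_SKETCH) {
        order = MAX_ORDER_SKETCH - 1;
        *clamped = 1;
    }
    return order;
}
```

Under these assumptions, 16 device ID bits with 8-byte entries need 128 pages (order 7) and fit, while 32 bits would need order 23 and are clamped to order 10.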



[PATCH v4 4/5] irqchip: gicv3-its: define macros for GITS_CTLR fields

2015-03-05 Thread Yun Wu

Define macros for GITS_CTLR fields to avoid using magic numbers.

Acked-by: Marc Zyngier 
Signed-off-by: Yun Wu 
---
 drivers/irqchip/irq-gic-v3-its.c   | 2 +-
 include/linux/irqchip/arm-gic-v3.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index e8bda0b..d13c24e 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1388,7 +1388,7 @@ static int its_probe(struct device_node *node, struct irq_domain *parent)
writeq_relaxed(baser, its->base + GITS_CBASER);
tmp = readq_relaxed(its->base + GITS_CBASER);
writeq_relaxed(0, its->base + GITS_CWRITER);
-   writel_relaxed(1, its->base + GITS_CTLR);
+   writel_relaxed(GITS_CTLR_ENABLE, its->base + GITS_CTLR);

if ((tmp ^ baser) & GITS_BASER_SHAREABILITY_MASK) {
pr_info("ITS: using cache flushing for cmd queue\n");
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 3459b43..c9d3002 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -134,6 +134,9 @@

 #define GITS_TRANSLATER                 0x10040

+#define GITS_CTLR_ENABLE                (1U << 0)
+#define GITS_CTLR_QUIESCENT             (1U << 31)
+
 #define GITS_TYPER_DEVBITS_SHIFT        13
 #define GITS_TYPER_DEVBITS(r)           ((((r) >> GITS_TYPER_DEVBITS_SHIFT) & 0x1f) + 1)
 #define GITS_TYPER_PTA                  (1UL << 19)
--
1.8.0



[PATCH v4 5/5] irqchip: gicv3-its: support safe initialization

2015-03-05 Thread Yun Wu

It's unsafe to change the configurations of an activated ITS directly
since this will lead to unpredictable results. This patch guarantees
the ITSes being initialized are quiescent.

Acked-by: Marc Zyngier 
Signed-off-by: Yun Wu 
---
 drivers/irqchip/irq-gic-v3-its.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index d13c24e..9e09aa0 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1320,6 +1320,34 @@ static const struct irq_domain_ops its_domain_ops = {
.deactivate = its_irq_domain_deactivate,
 };

+static int its_force_quiescent(void __iomem *base)
+{
+   u32 count = 1000000;    /* 1s */
+   u32 val;
+
+   val = readl_relaxed(base + GITS_CTLR);
+   if (val & GITS_CTLR_QUIESCENT)
+   return 0;
+
+   /* Disable the generation of all interrupts to this ITS */
+   val &= ~GITS_CTLR_ENABLE;
+   writel_relaxed(val, base + GITS_CTLR);
+
+   /* Poll GITS_CTLR and wait until ITS becomes quiescent */
+   while (1) {
+   val = readl_relaxed(base + GITS_CTLR);
+   if (val & GITS_CTLR_QUIESCENT)
+   return 0;
+
+   count--;
+   if (!count)
+   return -EBUSY;
+
+   cpu_relax();
+   udelay(1);
+   }
+}
+
 static int its_probe(struct device_node *node, struct irq_domain *parent)
 {
struct resource res;
@@ -1348,6 +1376,13 @@ static int its_probe(struct device_node *node, struct irq_domain *parent)
goto out_unmap;
}

+   err = its_force_quiescent(its_base);
+   if (err) {
+   pr_warn("%s: failed to quiesce, giving up\n",
+   node->full_name);
+   goto out_unmap;
+   }
+
pr_info("ITS: %s\n", node->full_name);

its = kzalloc(sizeof(*its), GFP_KERNEL);
--
1.8.0
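
The disable-then-poll pattern in its_force_quiescent() can be tried in user space with a simulated register (not part of the patch). In the sketch below, struct fake_its, fake_read_ctlr() and the settle_reads counter are hypothetical stand-ins for the memory-mapped GITS_CTLR; only the two bit positions are taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

#define CTLR_ENABLE    (1U << 0)
#define CTLR_QUIESCENT (1U << 31)

/* Hypothetical stand-in for the memory-mapped GITS_CTLR register. */
struct fake_its {
    uint32_t ctlr;
    unsigned int settle_reads; /* reads needed after disable before quiescent */
};

/* Simulated register read: once disabled, the device reports
 * quiescent after settle_reads further reads. */
static uint32_t fake_read_ctlr(struct fake_its *its)
{
    if (!(its->ctlr & CTLR_ENABLE) && !(its->ctlr & CTLR_QUIESCENT)) {
        if (its->settle_reads == 0)
            its->ctlr |= CTLR_QUIESCENT;
        else
            its->settle_reads--;
    }
    return its->ctlr;
}

/* Disable the device, then poll at most max_polls times. */
static int force_quiescent_sketch(struct fake_its *its, unsigned int max_polls)
{
    uint32_t val = fake_read_ctlr(its);

    if (val & CTLR_QUIESCENT)
        return 0;

    /* Clear the enable bit, as the patch does before polling. */
    val &= ~CTLR_ENABLE;
    its->ctlr = val;

    while (max_polls--) {
        val = fake_read_ctlr(its);
        if (val & CTLR_QUIESCENT)
            return 0;
    }
    return -EBUSY; /* budget exhausted */
}
```

The bounded budget is the point of the design: the caller gets -EBUSY instead of spinning forever on hardware that never settles.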



Re: [PATCH] pci: host: xgene: fix incorrectly returned address by map_bus

2015-03-05 Thread Bjorn Helgaas

On Thu, Mar 05, 2015 at 08:53:38AM -0800, Feng Kan wrote:
> Please take Mark's patch if you think it is better.
> 
> 
> 
> On Thu, Mar 5, 2015 at 8:38 AM, Bjorn Helgaas  wrote:
> > [+cc Mark]
> >
> > On Thu, Feb 26, 2015 at 06:21:51PM -0600, Bjorn Helgaas wrote:
> >> On Tue, Feb 17, 2015 at 03:14:00PM -0800, Feng Kan wrote:
> >> > The generic accessor functions for pci-xgene uses map_bus
> >> > call that returns the base address but did not add the additional
> >> > offset.
> >> >
> >> > Signed-off-by: Feng Kan 
> >> > ---
> >> >  drivers/pci/host/pci-xgene.c | 4 ++--
> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/drivers/pci/host/pci-xgene.c b/drivers/pci/host/pci-xgene.c
> >> > index aab5547..ee082c0 100644
> >> > --- a/drivers/pci/host/pci-xgene.c
> >> > +++ b/drivers/pci/host/pci-xgene.c
> >> > @@ -127,7 +127,7 @@ static bool xgene_pcie_hide_rc_bars(struct pci_bus 
> >> > *bus, int offset)
> >> > return false;
> >> >  }
> >> >
> >> > -static int xgene_pcie_map_bus(struct pci_bus *bus, unsigned int devfn,
> >> > +static void __iomem *xgene_pcie_map_bus(struct pci_bus *bus, unsigned 
> >> > int devfn,
> >> >   int offset)
> >> >  {
> >> > struct xgene_pcie_port *port = bus->sysdata;
> >> > @@ -137,7 +137,7 @@ static int xgene_pcie_map_bus(struct pci_bus *bus, 
> >> > unsigned int devfn,
> >> > return NULL;
> >> >
> >> > xgene_pcie_set_rtdid_reg(bus, devfn);
> >> > -   return xgene_pcie_get_cfg_base(bus);
> >> > +   return xgene_pcie_get_cfg_base(bus) + offset;
> >>
> >> Where's the locking here?  ECAM doesn't need locking because the
> >> bus/dev/fn/offset is all encoded in the MMIO address.  But it looks
> >> like X-Gene doesn't work that way and bus/dev/fn is in the RTDID register.
> >>
> >> So it seems like X-Gene needs locking that not everybody needs.  Are you
> >> relying on higher-level locking somewhere?
> >
> > Ping, what's going on here?  I've gotten at least three patches for this
> > offset issue, so we need to get it resolved.
> >
> > If there's no locking problem, I can just apply this and we'll be finished.
> > Actually, I think Mark's patch is better, because it correctly returns NULL
> > (failure) if xgene_pcie_get_cfg_base() fails.  So please review and ack
> > that one or explain why this one is better.

Huh, I could swear I saw a failure path in xgene_pcie_get_cfg_base().  But
I don't see a way it can fail, so I don't think it matters which way we fix
this.

Bjorn
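
For readers unfamiliar with the ECAM remark in the quoted review: in ECAM the bus, device/function and register offset are all encoded in the address itself, so an access carries no shared selector state. A minimal sketch of that standard encoding follows; the function name is made up for this example and is not taken from the driver under discussion.

```c
#include <stdint.h>

/* ECAM layout: bus in bits 27:20, devfn in bits 19:12,
 * register offset in bits 11:0 of the offset into the window. */
static uint32_t ecam_offset_sketch(uint8_t bus, uint8_t devfn, uint16_t where)
{
    return ((uint32_t)bus << 20) | ((uint32_t)devfn << 12) | (where & 0xfffu);
}
```

A controller that instead selects the target through a separate register, as the review says X-Gene does with RTDID, has a window between writing the selector and accessing config space, which is why the locking question comes up.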

NMI watchdog triggering during load_balance

2015-03-05 Thread David Ahern


Hi Peter/Mike/Ingo:

I've been banging my head against this wall for a week now and hoping you or
someone could shed some light on the problem.


On larger systems (256 to 1024 cpus) there are several use cases (e.g., 
http://www.cs.virginia.edu/stream/) that regularly trigger the NMI 
watchdog with the stack trace:


Call Trace:
@  [0045d3d0] double_rq_lock+0x4c/0x68
@  [004699c4] load_balance+0x278/0x740
@  [008a7b88] __schedule+0x378/0x8e4
@  [008a852c] schedule+0x68/0x78
@  [0042c82c] cpu_idle+0x14c/0x18c
@  [008a3a50] after_lock_tlb+0x1b4/0x1cc

Capturing data for all CPUs I tend to see load_balance related stack 
traces on 700-800 cpus, with a few hundred blocked on _raw_spin_trylock_bh.


I originally thought it was a deadlock in the rq locking, but if I bump 
the watchdog timeout the system eventually recovers (after 10-30+ 
seconds of unresponsiveness) so it does not seem likely to be a deadlock.


This particular system has 1024 cpus:
# lscpu
Architecture:          sparc64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Big Endian
CPU(s):                1024
On-line CPU(s) list:   0-1023
Thread(s) per core:    8
Core(s) per socket:    4
Socket(s):             32
NUMA node(s):          4
NUMA node0 CPU(s):     0-255
NUMA node1 CPU(s):     256-511
NUMA node2 CPU(s):     512-767
NUMA node3 CPU(s):     768-1023

and there are 4 scheduling domains. An example of the domain debug 
output (condensed for the email):


CPU970 attaching sched-domain:
 domain 0: span 968-975 level SIBLING
  groups: 8 single CPU groups
  domain 1: span 968-975 level MC
   groups: 1 group with 8 cpus
   domain 2: span 768-1023 level CPU
groups: 32 groups with 8 cpus per group
domain 3: span 0-1023 level NODE
 groups: 4 groups with 256 cpus per group


On an idle system (20 or so non-kernel threads such as mingetty, udev, 
...) perf top shows the task scheduler is consuming significant time:



   PerfTop:  136580 irqs/sec  kernel:99.9%  exact:  0.0% [1000Hz cycles],  (all, 1024 CPUs)

---

20.22%  [kernel]  [k] find_busiest_group
16.00%  [kernel]  [k] find_next_bit
 6.37%  [kernel]  [k] ktime_get_update_offsets
 5.70%  [kernel]  [k] ktime_get
...


This is a 2.6.39 kernel (yes, a relatively old one); 3.8 shows similar 
symptoms. 3.18 is much better.


From what I can tell load balancing is happening non-stop and there is
heavy contention in the run queue locks. I instrumented the rq locking
and under load (e.g., the stream test) regularly see single rq locks held
continuously for 2-3 seconds (e.g., at the end of the stream run which
has 1024 threads and the process is terminating).


I have been staring at and instrumenting the scheduling code for days.
It seems like the balancing of domains is regularly lining up on all or
almost all CPUs and it seems like the NODE domain causes the most damage
since it scans all cpus (i.e., in rebalance_domains() each domain pass
triggers a call to load_balance on all cpus at the same time). Just in
random snapshots during a stream test I have seen 1 pass through
rebalance_domains take > 17 seconds (custom tracepoints to tag start and
end).


Since each domain is a superset of the lower one, each pass through
load_balance regularly repeats the processing of the previous domain
(e.g., NODE domain repeats the cpus in the CPU domain). Multiply
that across 1024 cpus and it seems like a lot of duplication.


Does that make sense or am I off in the weeds?

Thanks,
David
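
A back-of-envelope sketch of the duplication described above, using the domain spans from the CPU970 output (8, 8, 256, 1024). It assumes, purely for illustration, that one balancing pass over a domain examines every CPU in that domain's span and that every CPU runs all four passes; the real scheduler has intervals and early exits, so this is a rough upper bound and not a measurement.

```c
#include <stddef.h>

/* CPUs examined by one CPU walking all of its domains once. */
static unsigned long scans_per_pass_sketch(const unsigned int *spans, size_t n)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += spans[i];
    return total;
}

/* The same walk repeated on every CPU in the system. */
static unsigned long total_scans_sketch(const unsigned int *spans, size_t n,
                                        unsigned int ncpus)
{
    return scans_per_pass_sketch(spans, n) * ncpus;
}
```

With these spans a single CPU looks at 1296 run queues per full walk, and 1024 CPUs doing so at the same time add up to 1,327,104 examinations, almost all of them contributed by the NODE level.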

Re: [PATCH -next] sensors: fix build of pwm-fan.c when THERMAL=m

2015-03-05 Thread Guenter Roeck


On 03/05/2015 03:27 PM, Randy Dunlap wrote:

From: Randy Dunlap 

Fix build errors when CONFIG_THERMAL=m and SENSORS_PWM_FAN=y
by restricting SENSORS_PWM_FAN to 'm' when THERMAL=m.

drivers/built-in.o: In function `pwm_fan_remove':
pwm-fan.c:(.text+0x22ba58): undefined reference to 
`thermal_cooling_device_unregister'
drivers/built-in.o: In function `pwm_fan_probe':
pwm-fan.c:(.text+0x22bebb): undefined reference to 
`thermal_of_cooling_device_register'
pwm-fan.c:(.text+0x22bf11): undefined reference to `thermal_cdev_update'

Signed-off-by: Randy Dunlap 
Cc: Kamil Debski 
---


Applied, thanks.

Guenter



Re: [PATCH 2/2] mm: numa: Do not clear PTEs or PMDs for NUMA hinting faults

2015-03-05 Thread Dave Chinner

On Thu, Mar 05, 2015 at 11:54:52PM +, Mel Gorman wrote:
> Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226
> 
>Across the board the 4.0-rc1 numbers are much slower, and the
>degradation is far worse when using the large memory footprint
>configs. Perf points straight at the cause - this is from 4.0-rc1
>on the "-o bhash=101073" config:
> 
>-   56.07%56.07%  [kernel][k] 
> default_send_IPI_mask_sequence_phys
>   - default_send_IPI_mask_sequence_phys
>  - 99.99% physflat_send_IPI_mask
> - 99.37% native_send_call_func_ipi
>  smp_call_function_many
>- native_flush_tlb_others
>   - 99.85% flush_tlb_page
>ptep_clear_flush
>try_to_unmap_one
>rmap_walk
>try_to_unmap
>migrate_pages
>migrate_misplaced_page
>  - handle_mm_fault
> - 99.73% __do_page_fault
>  trace_do_page_fault
>  do_async_page_fault
>+ async_page_fault
>   0.63% native_send_call_func_single_ipi
>  generic_exec_single
>  smp_call_function_single
> 
> This was bisected to commit 4d9424669946 ("mm: convert p[te|md]_mknonnuma
> and remaining page table manipulations") which clears PTEs and PMDs to make
> them PROT_NONE. This is tidy but tests on some benchmarks indicate that
> there are many more hinting faults trapped resulting in excessive migration.
> This is the result for the old autonuma benchmark for example.

[snip]

Doesn't fix the problem. Runtime is slightly improved (16m45s vs 17m35s)
but it's still much slower than 3.19 (6m5s).

Stats and profiles still roughly the same:

360,228  migrate:mm_migrate_pages ( +-  4.28% )

-   52.69%52.69%  [kernel][k] 
default_send_IPI_mask_sequence_phys
 default_send_IPI_mask_sequence_phys
   - physflat_send_IPI_mask
  - 97.28% native_send_call_func_ipi
   smp_call_function_many
   native_flush_tlb_others
   flush_tlb_page
   ptep_clear_flush
   try_to_unmap_one
   rmap_walk
   try_to_unmap
   migrate_pages
   migrate_misplaced_page
 - handle_mm_fault
- 99.59% __do_page_fault
 trace_do_page_fault
 do_async_page_fault
   + async_page_fault
  + 2.72% native_send_call_func_single_ipi

numa_hit 36678767
numa_miss 905234
numa_foreign 905234
numa_interleave 14802
numa_local 36656791
numa_other 927210
numa_pte_updates 92168450
numa_huge_pte_updates 0
numa_hint_faults 87573926
numa_hint_faults_local 29730293
numa_pages_migrated 30195890
pgmigrate_success 30195890
pgmigrate_fail 0

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource

2015-03-05 Thread Alex Williamson

On Fri, 2015-03-06 at 09:49 +0800, Jiang Liu wrote:
> On 2015/3/6 5:06, Alex Williamson wrote:
> > The IRQ resource for a device is established when pci_enabled_device()
> > is called on a fully disabled device (ie. enable_cnt == 0).  With
> > commit b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ
> > resources") this same IRQ resource is released when the driver is
> > unbound from the device, regardless of the enable_cnt.  This presents
> > the situation that an ill-behaved driver can now make a device
> > unusable to subsequent drivers by an imbalance in their use of
> > pci_enable/disable_device().  It's one thing to break your own device
> > if you're one of these ill-behaved drivers, but it's a serious
> > regression for secondary drivers like vfio-pci, which are innocent
> > of the transgressions of the previous driver.
> > 
> > Resolve by pushing the device to a fully disabled state before
> > releasing the IRQ resource.
> > 
> > Fixes: b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ resources")
> > Signed-off-by: Alex Williamson 
> > Cc: Jiang Liu 
> > ---
> >  arch/x86/pci/common.c |   13 -
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> > index 3d2612b..4810194 100644
> > --- a/arch/x86/pci/common.c
> > +++ b/arch/x86/pci/common.c
> > @@ -527,8 +527,19 @@ static int pci_irq_notifier(struct notifier_block *nb, 
> > unsigned long action,
> > if (action != BUS_NOTIFY_UNBOUND_DRIVER)
> > return NOTIFY_DONE;
> >  
> > -   if (pcibios_disable_irq)
> > +   if (pcibios_disable_irq) {
> > +   /*
> > +* Broken drivers may allow a device to be .remove()'d while
> > +* still enabled.  pci_enable_device() will only re-establish
> > +* dev->irq if the devices is fully disabled.  So if we want
> > +* to release the IRQ, we need to make sure the next driver
> > +* can re-establish it using pci_enable_device().
> > +*/
> > +   while (pci_is_enabled(dev))
> > +   pci_disable_device(dev);
> > +
> > pcibios_disable_irq(dev);
> > +   }
> Hi Alex,
>   Thanks for debugging and fixing it.
>   Will it be feasible to give a debug message to remind those
> driver authors to correctly disable PCI when unbinding?

I can certainly add a warning to the loop; it loses a bit of its teeth
here, though, since we can't specify which driver to blame at this point.
Maybe that warning and perhaps this enabling roll-back should happen in
drivers/pci/pci-driver.c:pci_device_remove().  Bjorn, would you prefer
it be done generically there?  Thanks,

Alex
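
The enable-count imbalance described in the quoted changelog can be modelled in a few lines of user-space C. Everything below is a simplified, hypothetical model (struct fake_pci_dev and the fake_* helpers are invented for this sketch); it only captures the rule stated above, that the IRQ is established when the device is enabled from a fully disabled state.

```c
/* Hypothetical model of a device with a reference-counted enable. */
struct fake_pci_dev {
    int enable_cnt;
    int irq_established;
};

/* The IRQ is only set up on the 0 -> 1 transition. */
static void fake_enable(struct fake_pci_dev *d)
{
    if (d->enable_cnt++ == 0)
        d->irq_established = 1;
}

static void fake_disable(struct fake_pci_dev *d)
{
    if (d->enable_cnt > 0)
        d->enable_cnt--;
}

/* Driver unbind: release the IRQ, optionally draining the count
 * first, as the loop in the patch does. */
static void fake_unbind(struct fake_pci_dev *d, int drain)
{
    if (drain)
        while (d->enable_cnt > 0)
            fake_disable(d);
    d->irq_established = 0;
}
```

If a first driver enables twice and disables once, unbinding without the drain leaves the count at 1, so the next driver's enable never hits the 0 -> 1 transition and the IRQ stays released; with the drain, the next enable re-establishes it.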


[GIT PULL] Please pull NFS client bugfixes

2015-03-05 Thread Trond Myklebust

Hi Linus,

The following changes since commit c517d838eb7d07bbe9507871fab3931deccff539:

  Linux 4.0-rc1 (2015-02-22 18:21:14 -0800)

are available in the git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.0-3

for you to fetch changes up to e11259f920d8cb3550e0f311c064bdabe1bc3aaf:

  NFSv4.1: Clear the old state by our client id before establishing a new lease 
(2015-03-03 21:52:30 -0500)


NFS client bugfixes for Linux 4.0

Highlights include:

- Fix a regression in the NFSv4 open state recovery code
- Fix a regression in the NFSv4 close code
- Fix regressions and side-effects of the loop-back mounted NFS fixes
  in 3.18, that cause the NFS read() syscall to return EBUSY.
- Fix regressions around the readdirplus code and how it interacts with
  the VFS lazy unmount changes that went into v3.18.
- Fix issues with out-of-order RPC call replies replacing updated
  attributes with stale ones (particularly after a truncate()).
- Fix an underflow checking issue with RPC/RDMA credits
- Fix a number of issues with the NFSv4 delegation return/free code.
- Fix issues around stale NFSv4.1 leases when doing a mount


Anna Schumaker (1):
  NFS: Fix stateid used for NFS v4 closes

Chuck Lever (1):
  xprtrdma: Store RDMA credits in unsigned variables

Trond Myklebust (23):
  Merge tag 'nfs-rdma-for-4.0-3' of 
git://git.linux-nfs.org/projects/anna/nfs-rdma
  NFSv4: nfs4_open_recover_helper() must set share access
  NFS: Ensure that buffered writes wait for O_DIRECT writes to complete
  NFS: Add a helper to set attribute barriers
  NFS: Add attribute update barriers to nfs_setattr_update_inode()
  NFS: Set an attribute barrier on all updates
  NFS: Add attribute update barriers to NFS writebacks
  NFSv4: Add attribute update barriers to delegreturn and pNFS layoutcommit
  NFS: Remove size hack in nfs_inode_attrs_need_update()
  NFS: Fix nfs_post_op_update_inode() to set an attribute barrier
  NFSv4: Set a barrier in the update_changeattr() helper
  NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()
  NFSv3: Use the readdir fileid as the mounted-on-fileid
  NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache()
  NFSv4: Don't call put_rpccred() under the rcu_read_lock()
  NFSv4: Ensure that we don't reap a delegation that is being returned
  NFSv4: Ensure we honour NFS_DELEGATION_RETURNING in 
nfs_inode_set_delegation()
  NFSv4: Pin the superblock while we're returning the delegation
  NFSv4: Ensure we skip delegations that are already being returned
  NFS: Fix a regression in the read() syscall
  NFS: Don't write enable new pages while an invalidation is proceeding
  NFSv4: Fix a race in NFSv4.1 server trunking discovery
  NFSv4.1: Clear the old state by our client id before establishing a new 
lease

 fs/nfs/client.c                 |   2 +-
 fs/nfs/delegation.c             |  45
 fs/nfs/dir.c                    |  22 ++--
 fs/nfs/file.c                   |  11 +++-
 fs/nfs/inode.c                  | 111 +---
 fs/nfs/internal.h               |   1 +
 fs/nfs/nfs3proc.c               |   4 +-
 fs/nfs/nfs3xdr.c                |   5 ++
 fs/nfs/nfs4client.c             |   9 ++--
 fs/nfs/nfs4proc.c               |  31 +++
 fs/nfs/nfs4session.h            |   1 +
 fs/nfs/nfs4state.c              |  18 ++-
 fs/nfs/proc.c                   |   6 +--
 fs/nfs/write.c                  |  30 +++
 include/linux/nfs_fs.h          |   5 +-
 net/sunrpc/xprtrdma/rpc_rdma.c  |   3 +-
 net/sunrpc/xprtrdma/xprt_rdma.h |   2 +-
 17 files changed, 244 insertions(+), 62 deletions(-)

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com





RE: [PATCH v2] f2fs: fix max orphan inodes calculation

2015-03-05 Thread Chao Yu

Hi Changman,

> -Original Message-
> From: Changman Lee [mailto:cm224@samsung.com]
> Sent: Tuesday, March 03, 2015 9:40 AM
> To: linux-f2fs-de...@lists.sourceforge.net
> Cc: Jaegeuk Kim; Chao Yu; linux-fsde...@vger.kernel.org; 
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] f2fs: fix max orphan inodes calculation
> 
> On Fri, Feb 27, 2015 at 05:38:13PM +0800, Wanpeng Li wrote:
> > cp_payload is introduced for sit bitmap to support large volume, and it is
> > just after the block of f2fs_checkpoint + nat bitmap, so the first segment
> > should include F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload + orphan blocks.
> > However, current max orphan inodes calculation don't consider cp_payload,
> > this patch fix it by reducing the number of cp_payload from total blocks of
> > the first segment when calculate max orphan inodes.
> >
> > Signed-off-by: Wanpeng Li 
> > ---
> > v1 -> v2:
> >  * adjust comments above the codes
> >  * fix coding style issue
> >
> >  fs/f2fs/checkpoint.c | 12 +++-
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> > index db82e09..a914e99 100644
> > --- a/fs/f2fs/checkpoint.c
> > +++ b/fs/f2fs/checkpoint.c
> > @@ -1103,13 +1103,15 @@ void init_ino_entry_info(struct f2fs_sb_info *sbi)
> > }
> >
> > /*
> > -* considering 512 blocks in a segment 8 blocks are needed for cp
> > -* and log segment summaries. Remaining blocks are used to keep
> > -* orphan entries with the limitation one reserved segment
> > -* for cp pack we can have max 1020*504 orphan entries
> > +* considering 512 blocks in a segment 8+cp_payload blocks are
> > +* needed for cp and log segment summaries. Remaining blocks are
> > +* used to keep orphan entries with the limitation one reserved
> > +* segment for cp pack we can have max 1020*(504-cp_payload)
> > +* orphan entries
> >  */
> 
> Hi all,
> 
> I think below code give us information enough so it doesn't need to
> describe above comments. And someone could get confused by 1020 constants.
> How do you think about removing comments.

I agree with you.

Nothing in the statement below needs special attention; its meaning is
easy to read because each macro in it is self-explanatory.

So could you send another patch to remove the comment?

Thanks,

> 
> Regards,
> Changman
> 
> > sbi->max_orphans = (sbi->blocks_per_seg - F2FS_CP_PACKS -
> > -   NR_CURSEG_TYPE) * F2FS_ORPHANS_PER_BLOCK;
> > +   NR_CURSEG_TYPE - __cp_payload(sbi)) *
> > +   F2FS_ORPHANS_PER_BLOCK;
> >  }
> >
> >  int __init create_checkpoint_caches(void)
> > --
> > 1.9.1


[PATCH] mfd: rtsx_usb: prevent DMA from stack

2015-03-05 Thread Roger Tseng

Functions rtsx_usb_ep0_read_register() and rtsx_usb_get_card_status()
both use arbitrary buffer addresses from arguments directly for DMA and
the buffers could be located in stack. This was caught by DMA-API debug
check.

Fix this by double-buffering through kzalloc'ed memory in both functions,
which guarantees the validity of the DMA buffer.

WARNING: CPU: 1 PID: 25 at lib/dma-debug.c:1166 check_for_stack+0x96/0xe0()
ehci-pci :00:1a.0: DMA-API: device driver maps memory from stack
[addr=8801199e3cef]
Modules linked in: rtsx_usb_ms arc4 memstick intel_rapl iosf_mbi
rtl8192ce snd_hda_codec_hdmi snd_hda_codec_realtek
snd_hda_codec_generic snd_hda_intel rtl_pci rtl8192c_common
snd_hda_controller x86_pkg_temp_thermal snd_hda_codec rtlwifi mac80211
coretemp kvm_intel kvm iTCO_wdt snd_hwdep snd_seq snd_seq_device
crct10dif_pclmul iTCO_vendor_support sparse_keymap cfg80211
crc32_pclmul snd_pcm crc32c_intel ghash_clmulni_intel rfkill i2c_i801
snd_timer shpchp snd serio_raw mei_me lpc_ich soundcore mei tpm_tis
tpm wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915
rtsx_usb_sdmmc mmc_core 8021q uas garp stp i2c_algo_bit llc mrp
drm_kms_helper usb_storage drm rtsx_usb mfd_core r8169 mii video
CPU: 1 PID: 25 Comm: kworker/1:2 Not tainted 3.20.0-0.rc0.git7.3.fc22.x86_64 #1
Hardware name: WB WB-B06211/WB-B0621, BIOS EB062IWB V1.0 12/12/2013
Workqueue: events rtsx_usb_ms_handle_req [rtsx_usb_ms]
  3d188e66 8801199e3808 8187642b
  8801199e3860 8801199e3848 810ab39a
 8801199e3864 8801199e3cef 880119b57098 880119b37320
Call Trace:
 [] dump_stack+0x4c/0x65
 [] warn_slowpath_common+0x8a/0xc0
 [] warn_slowpath_fmt+0x55/0x70
 [] ? _raw_spin_unlock_irqrestore+0x36/0x70
 [] check_for_stack+0x96/0xe0
 [] debug_dma_map_page+0x104/0x150
 [] usb_hcd_map_urb_for_dma+0x646/0x790
 [] usb_hcd_submit_urb+0x1d5/0xa90
 [] ? mark_held_locks+0x7f/0xc0
 [] ? mark_held_locks+0x7f/0xc0
 [] ? lockdep_init_map+0x65/0x5d0
 [] usb_submit_urb+0x42e/0x5f0
 [] usb_start_wait_urb+0x77/0x190
 [] ? __kmalloc+0x205/0x2d0
 [] usb_control_msg+0xdc/0x130
 [] rtsx_usb_ep0_read_register+0x59/0x70 [rtsx_usb]
 [] ? rtsx_usb_get_rsp+0x41/0x50 [rtsx_usb]
 [] rtsx_usb_ms_handle_req+0x7ce/0x9c5 [rtsx_usb_ms]

Reported-by: Josh Boyer 
Signed-off-by: Roger Tseng 
---
 drivers/mfd/rtsx_usb.c | 30 --
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/mfd/rtsx_usb.c b/drivers/mfd/rtsx_usb.c
index ede50244f265..dbd907d7170e 100644
--- a/drivers/mfd/rtsx_usb.c
+++ b/drivers/mfd/rtsx_usb.c
@@ -196,18 +196,27 @@ EXPORT_SYMBOL_GPL(rtsx_usb_ep0_write_register);
 int rtsx_usb_ep0_read_register(struct rtsx_ucr *ucr, u16 addr, u8 *data)
 {
u16 value;
+   u8 *buf;
+   int ret;
 
if (!data)
return -EINVAL;
-   *data = 0;
+
+   buf = kzalloc(sizeof(u8), GFP_KERNEL);
+   if (!buf)
+   return -ENOMEM;
 
addr |= EP0_READ_REG_CMD << EP0_OP_SHIFT;
value = swab16(addr);
 
-   return usb_control_msg(ucr->pusb_dev,
+   ret = usb_control_msg(ucr->pusb_dev,
usb_rcvctrlpipe(ucr->pusb_dev, 0), RTSX_USB_REQ_REG_OP,
USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE,
-   value, 0, data, 1, 100);
+   value, 0, buf, 1, 100);
+   *data = *buf;
+
+   kfree(buf);
+   return ret;
 }
 EXPORT_SYMBOL_GPL(rtsx_usb_ep0_read_register);
 
@@ -288,18 +297,27 @@ static int rtsx_usb_get_status_with_bulk(struct rtsx_ucr *ucr, u16 *status)
 int rtsx_usb_get_card_status(struct rtsx_ucr *ucr, u16 *status)
 {
int ret;
+   u16 *buf;
 
if (!status)
return -EINVAL;
 
-   if (polling_pipe == 0)
+   if (polling_pipe == 0) {
+   buf = kzalloc(sizeof(u16), GFP_KERNEL);
+   if (!buf)
+   return -ENOMEM;
+
ret = usb_control_msg(ucr->pusb_dev,
usb_rcvctrlpipe(ucr->pusb_dev, 0),
RTSX_USB_REQ_POLL,
USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE,
-   0, 0, status, 2, 100);
-   else
+   0, 0, buf, 2, 100);
+   *status = *buf;
+
+   kfree(buf);
+   } else {
ret = rtsx_usb_get_status_with_bulk(ucr, status);
+   }
 
/* usb_control_msg may return positive when success */
if (ret < 0)
-- 
2.1.4
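
The pattern used by the patch, transferring into a heap buffer and then copying the result to the caller's pointer, can be sketched in user space as follows. The xfer_fn callback and the fake_xfer_* helpers are hypothetical stand-ins for the USB control transfer, and calloc()/free() stand in for kzalloc()/kfree(). Unlike the patch, this sketch copies the result back only when the transfer succeeds; that is a choice made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical transfer callback: fills buf, returns bytes read or -errno. */
typedef int (*xfer_fn)(void *buf, size_t len);

static int fake_xfer_ok(void *buf, size_t len)
{
    memset(buf, 0xA5, len);
    return (int)len;
}

static int fake_xfer_fail(void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return -EIO;
}

/* Read one byte through a heap-allocated intermediate buffer, so the
 * transfer never touches the caller's (possibly on-stack) storage. */
static int read_byte_bounced_sketch(xfer_fn xfer, uint8_t *data)
{
    uint8_t *buf;
    int ret;

    if (!data)
        return -EINVAL;

    buf = calloc(1, sizeof(*buf));
    if (!buf)
        return -ENOMEM;

    ret = xfer(buf, 1);
    if (ret >= 0)
        *data = *buf; /* copy back only on success */

    free(buf);
    return ret;
}
```

The extra allocation costs a little per call, which is usually acceptable on a slow control path like a register read.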


linux-next: manual merge of the vhost tree with the virtio tree

2015-03-05 Thread Stephen Rothwell

Hi Michael,

Today's linux-next merge of the vhost tree got a conflict in
drivers/virtio/virtio_balloon.c between commit 7f8998200dcb
("virtio_balloon: annotate possible sleep waiting for event") from the
virtio tree and commit 2426d3b03d07 ("virtio-balloon: do not call
blocking ops when !TASK_RUNNING") from the vhost tree.

I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/virtio/virtio_balloon.c
index 06001ca71ea3,5a6ad6dbdec4..
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@@ -341,19 -343,17 +343,25 @@@ static int balloon(void *_vballoon
  
try_to_freeze();
  
 +  /*
 +   * Reading the config on the ccw backend involves an
 +   * allocation, so we may actually sleep and have an
 +   * extra iteration.  It's extremely unlikely, and this
 +   * isn't a fast path in any sense.
 +   */
 +  sched_annotate_sleep();
 +
-   wait_event_interruptible(vb->config_change,
-(diff = towards_target(vb)) != 0
-|| vb->need_stats_update
-|| kthread_should_stop()
-|| freezing(current));
+   add_wait_queue(&vb->config_change, &wait);
+   for (;;) {
+   if ((diff = towards_target(vb)) != 0 ||
+   vb->need_stats_update ||
+   kthread_should_stop() ||
+   freezing(current))
+   break;
+   wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
+   }
+   remove_wait_queue(&vb->config_change, &wait);
+ 
if (vb->need_stats_update)
stats_handle_request(vb);
if (diff > 0)



Mellanox Technologies MT23108 causes #MC exceptions under heavy load

2015-03-05 Thread Maxim Levitsky

We are running a CPU- and network-heavy test on the marmot.pdl.cmu.edu cluster.
It has a Mellanox Technologies MT23108 InfiniHost controller.

When we start using it for network communication, after just a few
minutes some of the nodes of the cluster die
with the following machine check exception.
I repeated this test over Ethernet a few times and have not had a single
failure so far (I thought I had one, but it turned out to be another,
unrelated issue).

It has already happened on most nodes of this 128-node cluster, so I
expect this to be a kernel bug.
Do you have any pointers on what we could try?

I compiled and tested the current HEAD of the vanilla kernel
(99aedde0869ce194539166ac5a4d2e1a20995348),
4.0.0-rc2,
but this happens even on 2.6.38 (which was in one of
their stock kernel images).

Best regards,
  Maxim Levitsky

The kernel log of failure captured via serial console:

[  297.575167] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  564.704428] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  951.619320] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  956.790789] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  957.301036] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  957.333938] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  957.924656] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  958.125879] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  958.147588] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  958.485607] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  959.050155] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  959.120109] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  960.048666] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  960.110928] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  960.754363] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  961.390093] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  972.199782] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  972.496511] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  983.078444] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  983.618178] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[  991.365565] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[ 1003.344498] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[ 1013.748036] Disabling lock debugging due to kernel taint
[ 1013.747903] [Hardware Error]: System Fatal error.
[ 1013.747903] [Hardware Error]: CPU:0 (f:5:1) MC4_STATUS[-|UE|-|PCC|-]: 0xb2070f0f
[ 1013.747903] [Hardware Error]: MC4 Error (node 0): Watchdog timeout due to lack of progress.
[ 1013.747903] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out)
[ 1013.747903] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 4: b2070f0f
[ 1013.747903] mce: [Hardware Error]: TSC 1a2dcecb6b8
[ 1013.747903] mce: [Hardware Error]: PROCESSOR 2:f51 TIME 1425610753 SOCKET 0 APIC 0 microcode 0
[ 1013.747903] [Hardware Error]: System Fatal error.
[ 1013.747903] [Hardware Error]: CPU:0 (f:5:1) MC4_STATUS[-|UE|-|PCC|-]: 0xb2070f0f
[ 1013.747903] [Hardware Error]: MC4 Error (node 0): Watchdog timeout due to lack of progress.
[ 1013.747903] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out)
[ 1013.747903] mce: [Hardware Error]: Machine check: Processor context corrupt
[ 1013.747903] Kernel panic - not syncing: Fatal machine check on current CPU
[ 1013.748036] [Hardware Error]: System Fatal error.
[ 1013.748036] [Hardware Error]: CPU:1 (f:5:1) MC4_STATUS[-|UE|-|PCC|-]: 0xb2070f0f
[ 1013.748036] [Hardware Error]: MC4 Error (node 1): Watchdog timeout due to lack of progress.
[ 1013.748036] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out)
[ 1013.747903] Kernel Offset: disabled
[ 1013.747903] ---[ end Kernel panic - not syncing: Fatal machine check on current CPU
[ 1019.239423] [ cut here ]
[ 1019.244144] WARNING: CPU: 0 PID: 13875 at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5f/0x70()
[ 1019.249416] Modules linked in: ib_ipoib ib_cm ib_sa nfsv2 nfs lockd sunrpc grace i2c_piix4 ib_mthca ib_mad ib_core ib_addr shpchp amd64_edac_mod i2c_amd756 k8temp amd_rng edac_core edac_mce_amd tg3 ptp pps_core sata_promise pata_amd
[ 1019.249416] CPU: 0 PID: 13875 Comm: java Tainted: G   M 4.0.0-rc2+ #1
[ 1019.249416] Hardware name: RIOWORKS HDAMA/HDAMA, BIOS V2.17 03/20/2006
[ 1019.249416]  007c 8801f8409a80 815f33ff 007c
[ 1019.249416]   8801f8409ac0 81055

Re: [PATCH 0/2] make automatic device_id generation possible

2015-03-05 Thread Sergey Senozhatsky

On (03/05/15 09:20), Minchan Kim wrote:
> In summary, I want to support only "cat /sys/class/zram-control/zram_add"
> unless you have feasible usecase.
> 
> What do you think about it?
> 

Hello Minchan,

I've tried to contact as many people (who have previously shown some
interest in on-demand device creation) as I could in every sane way (using
both lkml and google+), and it looks like people see no value in this
functionality.

so I'm happy to remove it. a cleanup patch will arrive later today. thanks
for raising this topic.

-ss

Re: pskb_expand_head: skb_shared BUG

2015-03-05 Thread Chris Dunlop

On Mon, Mar 02, 2015 at 11:45:11AM +1100, Chris Dunlop wrote:
> Heads up...
> 
> We've hit this BUG() in v3.10.70, v3.14.27 and v3.18.7:
> 
> net/core/skbuff.c:
> 1027 int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
> 1028  gfp_t gfp_mask)
> 1029 {
> 1030 int i;
> 1031 u8 *data;
> 1032 int size = nhead + skb_end_offset(skb) + ntail;
> 1033 long off;
> 1034 
> 1035 BUG_ON(nhead < 0);
> 1036 
> 1037 if (skb_shared(skb))
> 1038 BUG(); <<< BOOM!!!
> 
> This appears to be a regression in the 3.10.x stable series:
> we've been running for 11 months on v3.10.33 without problem, we
> upgraded to v3.14.27 and hit the BUG(), then again on upgrading
> to v3.18.7, then again after downgrading to v3.10.70.

Apologies, this was a false alarm.

There was indeed a regression, but it's in the upstream openvswitch code
rather than linux core. (Further details: a sharing of an
otherwise-unshared skb, causing us to hit the BUG() above, introduced in
v2.3, will be fixed in upcoming v2.3.2)

Cheers,

Chris

Re: [PATCH] net: fec: fix unbalanced clk disable on driver unbind

2015-03-05 Thread David Miller

From: Stefan Agner 
Date: Thu,  5 Mar 2015 15:09:29 +0100

> When the driver is removed (e.g. using unbind through sysfs), the
> clocks get disabled twice, once on fec_enet_close and once on
> fec_drv_remove. Since the clocks are enabled only once, this leads
> to a warning:
> 
> WARNING: CPU: 0 PID: 402 at drivers/clk/clk.c:992 clk_core_disable+0x64/0x68()
> 
> Remove the call to fec_enet_clk_enable in fec_drv_remove to balance
> the clock enable/disable calls again. This has been introduced by
> e8fcfcd5684a ("net: fec: optimize the clock management to save power").
> 
> Signed-off-by: Stefan Agner 

Applied, thanks.
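
For readers following the thread, the imbalance described in the quoted
changelog can be modelled with a toy enable counter. This is only a sketch
in plain userspace C: toy_clk, toy_clk_enable() and toy_clk_disable() are
hypothetical names, not the clk framework API, and a false return value
stands in for the WARNING quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted clock. */
struct toy_clk {
	unsigned int enable_count;
};

static void toy_clk_enable(struct toy_clk *clk)
{
	assert(clk);
	clk->enable_count++;
}

/*
 * Returns false (the toy equivalent of the WARNING) when asked to
 * disable a clock that is not enabled; the count is left untouched.
 */
static bool toy_clk_disable(struct toy_clk *clk)
{
	assert(clk);
	if (clk->enable_count == 0)
		return false;
	clk->enable_count--;
	return true;
}

/*
 * Models open -> close -> remove. remove_disables selects whether the
 * remove step also disables the clock (old behaviour) or not (fixed).
 */
bool toy_unbind_is_balanced(bool remove_disables)
{
	struct toy_clk clk = { 0 };
	bool ok = true;

	toy_clk_enable(&clk);			/* open: enabled once */
	ok = toy_clk_disable(&clk) && ok;	/* close: first disable */
	if (remove_disables)
		ok = toy_clk_disable(&clk) && ok; /* remove: second disable */
	return ok;
}
```

With remove_disables set, the second disable finds the count already at
zero, which is the unbalanced case the patch removes.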

Re: [PATCH] net: macb: Correct the MID field length value

2015-03-05 Thread David Miller

From: Michal Simek 
Date: Thu,  5 Mar 2015 15:02:10 +0100

> From: Punnaiah Choudary Kalluri 
> 
> The latest spec "I-IPA01-0266-USR Rev 10" limits the MID field length to a
> 12-bit value. For previous versions it is a 16-bit value.
> 
> This change will not break backward compatibility, as the latest ID value
> is 7 and within the 12-bit limit.
> 
> Signed-off-by: Punnaiah Choudary Kalluri 
> Signed-off-by: Michal Simek 

Applied, thanks.
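
The backward-compatibility argument in the quoted changelog can be checked
with a small field-extraction sketch. It assumes, purely for illustration,
that the ID field starts at bit 16 of the register; TOY_IDNUM_OFFSET and
toy_mid_idnum() are hypothetical names and the register values used with
them are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: the ID field starts at bit 16. */
#define TOY_IDNUM_OFFSET 16

/* Extract a 'width'-bit field starting at TOY_IDNUM_OFFSET. */
uint32_t toy_mid_idnum(uint32_t mid, unsigned int width)
{
	assert(width >= 1 && width <= 16);
	return (mid >> TOY_IDNUM_OFFSET) & ((1u << width) - 1u);
}
```

An ID of 7 reads back as 7 with either width; the two widths only disagree
when the top four bits of the old 16-bit field are non-zero.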

[PATCH v2 1/6] x86: Add this_cpu_sp0() to read sp0 for the current cpu

2015-03-05 Thread Andy Lutomirski

We currently store references to the top of the kernel stack in
multiple places: kernel_stack (with an offset) and
init_tss.x86_tss.sp0 (no offset).  The latter is defined by hardware
and is a clean canonical way to find the top of the stack.  Add an
accessor so we can start using it.

This needs minor paravirt tweaks.  On native, sp0 defines the top of
the kernel stack and is therefore always correct.  On Xen and
lguest, the hypervisor tracks the top of the stack, but we want to
start reading sp0 in the kernel.  Fixing this is simple: just update
our local copy of sp0 as well as the hypervisor's copy on task
switches.

Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: Rusty Russell 
Signed-off-by: Andy Lutomirski 
---
 arch/x86/include/asm/processor.h | 5 +
 arch/x86/kernel/process.c| 1 +
 arch/x86/lguest/boot.c   | 1 +
 arch/x86/xen/enlighten.c | 1 +
 4 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 7be2c9a6caba..71c3a826a690 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -564,6 +564,11 @@ static inline void native_swapgs(void)
 #endif
 }
 
+static inline unsigned long this_cpu_sp0(void)
+{
+   return this_cpu_read_stable(init_tss.x86_tss.sp0);
+}
+
 #ifdef CONFIG_PARAVIRT
 #include 
 #else
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 046e2d620bbe..ff5c9088b1c5 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -38,6 +38,7 @@
  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
  */
 __visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss) = INIT_TSS;
+EXPORT_PER_CPU_SYMBOL_GPL(init_tss);
 
 #ifdef CONFIG_X86_64
 static DEFINE_PER_CPU(unsigned char, is_idle);
diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index ac4453d8520e..8561585ee2c6 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -1076,6 +1076,7 @@ static void lguest_load_sp0(struct tss_struct *tss,
 {
lazy_hcall3(LHCALL_SET_STACK, __KERNEL_DS | 0x1, thread->sp0,
   THREAD_SIZE / PAGE_SIZE);
+   tss->x86_tss.sp0 = thread->sp0;
 }
 
 /* Let's just say, I wouldn't do debugging under a Guest. */
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 5240f563076d..81665c9f2132 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -912,6 +912,7 @@ static void xen_load_sp0(struct tss_struct *tss,
mcs = xen_mc_entry(0);
MULTI_stack_switch(mcs.mc, __KERNEL_DS, thread->sp0);
xen_mc_issue(PARAVIRT_LAZY_CPU);
+   tss->x86_tss.sp0 = thread->sp0;
 }
 
 static void xen_set_iopl_mask(unsigned mask)
-- 
2.1.0
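
The paravirt point in the changelog above (keep the local sp0 copy in sync
with the hypervisor's copy on task switches) can be sketched as a toy
model. Everything here is hypothetical userspace C: struct toy_cpu and the
toy_* functions only mimic the shape of the lguest and Xen hunks, and the
stack addresses are invented.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state: two places that each remember the stack top. */
struct toy_cpu {
	unsigned long hv_sp0;	/* copy tracked by the (toy) hypervisor */
	unsigned long tss_sp0;	/* local copy an sp0 accessor would read */
};

/* Models a paravirt load_sp0 hook running on a task switch. */
static void toy_load_sp0(struct toy_cpu *cpu, unsigned long sp0,
			 bool update_local)
{
	assert(cpu);
	cpu->hv_sp0 = sp0;		/* stands in for the hypercall */
	if (update_local)
		cpu->tss_sp0 = sp0;	/* the one-line addition */
}

/*
 * Switch from a task with stack top 0x1000 to one with 0x2000, then
 * report what a reader of the local copy would see.
 */
unsigned long toy_sp0_after_switch(bool update_local)
{
	struct toy_cpu cpu = { .hv_sp0 = 0x1000, .tss_sp0 = 0x1000 };

	toy_load_sp0(&cpu, 0x2000, update_local);
	return cpu.tss_sp0;
}
```

Without the local update the reader still sees the previous task's value,
which is why an accessor that reads sp0 needs both copies written.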


[PATCH v2 5/6] x86: Remove INIT_TSS and fold the definitions into cpu_tss

2015-03-05 Thread Andy Lutomirski

The INIT_TSS is unnecessary.  Just define the initial TSS where cpu_tss
is defined.

While we're at it, merge the 32-bit and 64-bit definitions.  The only
syntactic change is that 32-bit kernels were computing sp0 as long, but
now they compute it as unsigned long.

Verified by objdump: the contents and relocations of
.data..percpu..shared_aligned are unchanged on 32-bit and 64-bit
kernels.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/include/asm/processor.h | 20 
 arch/x86/kernel/process.c| 20 +++-
 2 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 117ee65473e2..f5e3ec63767d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -818,22 +818,6 @@ static inline void spin_lock_prefetch(const void *x)
.io_bitmap_ptr  = NULL,   \
 }
 
-/*
- * Note that the .io_bitmap member must be extra-big. This is because
- * the CPU will access an additional byte beyond the end of the IO
- * permission bitmap. The extra byte must be all 1 bits, and must
- * be within the limit.
- */
-#define INIT_TSS  {  \
-   .x86_tss = {  \
-   .sp0= sizeof(init_stack) + (long)&init_stack, \
-   .ss0= __KERNEL_DS,\
-   .ss1= __KERNEL_CS,\
-   .io_bitmap_base = INVALID_IO_BITMAP_OFFSET,   \
-},   \
-   .io_bitmap  = { [0 ... IO_BITMAP_LONGS] = ~0 },   \
-}
-
 extern unsigned long thread_saved_pc(struct task_struct *tsk);
 
 #define THREAD_SIZE_LONGS  (THREAD_SIZE/sizeof(unsigned long))
@@ -892,10 +876,6 @@ extern unsigned long thread_saved_pc(struct task_struct *tsk);
.sp0 = (unsigned long)&init_stack + sizeof(init_stack) \
 }
 
-#define INIT_TSS  { \
-   .x86_tss.sp0 = (unsigned long)&init_stack + sizeof(init_stack) \
-}
-
 /*
  * Return saved PC of a blocked thread.
  * What is this good for? it will be always the scheduler or ret_from_fork.
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 6f6087349231..f4c0af7fc3a0 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -37,7 +37,25 @@
  * section. Since TSS's are completely CPU-local, we want them
  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
  */
-__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = INIT_TSS;
+__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = {
+   .x86_tss = {
+   .sp0 = (unsigned long)&init_stack + sizeof(init_stack),
+#ifdef CONFIG_X86_32
+   .ss0 = __KERNEL_DS,
+   .ss1 = __KERNEL_CS,
+   .io_bitmap_base = INVALID_IO_BITMAP_OFFSET,
+#endif
+},
+#ifdef CONFIG_X86_32
+/*
+ * Note that the .io_bitmap member must be extra-big. This is because
+ * the CPU will access an additional byte beyond the end of the IO
+ * permission bitmap. The extra byte must be all 1 bits, and must
+ * be within the limit.
+ */
+   .io_bitmap  = { [0 ... IO_BITMAP_LONGS] = ~0 },
+#endif
+};
 EXPORT_PER_CPU_SYMBOL_GPL(cpu_tss);
 
 #ifdef CONFIG_X86_64
-- 
2.1.0
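
The io_bitmap initializer in the patch above uses a GNU C range designator
to fill one element more than IO_BITMAP_LONGS with all-ones. A minimal
userspace sketch of that idiom follows; TOY_BITMAP_LONGS and the toy_*
names are hypothetical, the size is arbitrary, and the range syntax needs
a compiler that accepts the GNU extension (gcc and clang do).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size, chosen only to keep the example small. */
#define TOY_BITMAP_LONGS 4

/* One extra element, mirroring the "extra-big" bitmap in the comment. */
static const unsigned long toy_bitmap[TOY_BITMAP_LONGS + 1] = {
	[0 ... TOY_BITMAP_LONGS] = ~0UL,	/* GNU C range designator */
};

/* Number of elements, including the trailing extra one. */
size_t toy_bitmap_len(void)
{
	return sizeof(toy_bitmap) / sizeof(toy_bitmap[0]);
}

/* Returns 1 if every element, including the last, is all ones. */
int toy_bitmap_all_ones(void)
{
	size_t i;

	for (i = 0; i < toy_bitmap_len(); i++)
		if (toy_bitmap[i] != ~0UL)
			return 0;
	return 1;
}
```

The range runs from 0 to TOY_BITMAP_LONGS inclusive, so the array has
TOY_BITMAP_LONGS + 1 elements and the trailing one is initialized too.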


[PATCH v2 6/6] x86, asm: Rename INIT_TSS_IST to TSS_IST

2015-03-05 Thread Andy Lutomirski

This has nothing to do with the init thread or the initial anything.
It's just the TSS.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/kernel/entry_64.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 0c00fd80249a..c86f83e95f15 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -959,7 +959,7 @@ apicinterrupt IRQ_WORK_VECTOR \
 /*
  * Exception entry points.
  */
-#define INIT_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
+#define TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
 
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
@@ -1015,13 +1015,13 @@ ENTRY(\sym)
.endif
 
.if \shift_ist != -1
-   subq $EXCEPTION_STKSZ, INIT_TSS_IST(\shift_ist)
+   subq $EXCEPTION_STKSZ, TSS_IST(\shift_ist)
.endif
 
call \do_sym
 
.if \shift_ist != -1
-   addq $EXCEPTION_STKSZ, INIT_TSS_IST(\shift_ist)
+   addq $EXCEPTION_STKSZ, TSS_IST(\shift_ist)
.endif
 
/* these procedures expect "no swapgs" flag in ebx */
-- 
2.1.0


[PATCH v2 0/6] Baby steps toward cleaning up KERNEL_STACK_OFFSET

2015-03-05 Thread Andy Lutomirski

Denys is right that KERNEL_STACK_OFFSET is a mess.  Let's start fixing
it.

This removes all C code that *reads* kernel_stack.  It also fixes the
KERNEL_STACK_OFFSET abomination in ia32_sysenter_target.

It does not fix the KERNEL_STACK_OFFSET abomination in GET_THREAD_INFO
and THREAD_INFO.  I think that should be its own patch.

It also doesn't change the two syscall targets.  To fix them, we should
make a decision.  Either we should make KERNEL_STACK_OFFSET have the
correct nonzero value to save an instruction or we should get rid of
kernel_stack entirely.

Changes from v1:
 - Fix missing export.
 - Fix lguest code.
 - Add more init_tss naming cleanups (Ingo's suggestion).
 - Changelog improvements (Ingo).
 - Improve the check in ist_begin_non_atomic (Denys).

Andy Lutomirski (6):
  x86: Add this_cpu_sp0() to read sp0 for the current cpu
  x86: Switch all C consumers of kernel_stack to this_cpu_sp0
  x86, asm: Change the 32-bit sysenter code to use sp0
  x86: Rename init_tss to cpu_tss
  x86: Remove INIT_TSS and fold the definitions into cpu_tss
  x86, asm: Rename INIT_TSS_IST to TSS_IST

 arch/x86/ia32/ia32entry.S  |  3 +--
 arch/x86/include/asm/processor.h   | 27 ++-
 arch/x86/include/asm/thread_info.h |  3 +--
 arch/x86/kernel/asm-offsets_64.c   |  1 +
 arch/x86/kernel/cpu/common.c   |  6 +++---
 arch/x86/kernel/entry_64.S |  6 +++---
 arch/x86/kernel/ioport.c   |  2 +-
 arch/x86/kernel/process.c  | 23 +--
 arch/x86/kernel/process_32.c   |  2 +-
 arch/x86/kernel/process_64.c   |  2 +-
 arch/x86/kernel/traps.c|  4 ++--
 arch/x86/kernel/vm86_32.c  |  4 ++--
 arch/x86/lguest/boot.c |  1 +
 arch/x86/power/cpu.c   |  2 +-
 arch/x86/xen/enlighten.c   |  1 +
 15 files changed, 46 insertions(+), 41 deletions(-)

-- 
2.1.0


[PATCH v2 2/6] x86: Switch all C consumers of kernel_stack to this_cpu_sp0

2015-03-05 Thread Andy Lutomirski

This will make modifying the semantics of kernel_stack easier.

The change to ist_begin_non_atomic() is necessary because sp0 no
longer points to the same THREAD_SIZE-aligned region as rsp; it's
one byte too high for that.  At Denys' suggestion, rather than
offsetting it, just check explicitly that we're in the correct range
ending at sp0.  This has the added benefit that we no longer assume
that the thread stack is aligned to THREAD_SIZE.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/include/asm/thread_info.h | 3 +--
 arch/x86/kernel/traps.c| 4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 1d4e4f279a32..a2fa1899494e 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -159,8 +159,7 @@ DECLARE_PER_CPU(unsigned long, kernel_stack);
 static inline struct thread_info *current_thread_info(void)
 {
struct thread_info *ti;
-   ti = (void *)(this_cpu_read_stable(kernel_stack) +
- KERNEL_STACK_OFFSET - THREAD_SIZE);
+   ti = (void *)(this_cpu_sp0() - THREAD_SIZE);
return ti;
 }
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 42819886be0c..484eb03a3f32 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -174,8 +174,8 @@ void ist_begin_non_atomic(struct pt_regs *regs)
 * will catch asm bugs and any attempt to use ist_preempt_enable
 * from double_fault.
 */
-   BUG_ON(((current_stack_pointer() ^ this_cpu_read_stable(kernel_stack))
-   & ~(THREAD_SIZE - 1)) != 0);
+   BUG_ON((unsigned long)(this_cpu_sp0() - current_stack_pointer()) >=
+  THREAD_SIZE);
 
preempt_count_sub(HARDIRQ_OFFSET);
 }
-- 
2.1.0
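
The difference between the old XOR-and-mask test and the new range test in
ist_begin_non_atomic() is easy to try out in userspace. The sketch below
is an illustration only: TOY_THREAD_SIZE and the toy_* helpers are
hypothetical, and the addresses used with them are invented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stack size; a power of two, as the old check needs. */
#define TOY_THREAD_SIZE 0x4000UL

/* Old style: both addresses must share one aligned block. */
bool toy_same_aligned_block(unsigned long sp, unsigned long ref)
{
	return ((sp ^ ref) & ~(TOY_THREAD_SIZE - 1)) == 0;
}

/*
 * New style: accept sp when sp0 - sp is below TOY_THREAD_SIZE.
 * The subtraction is unsigned, so sp > sp0 wraps to a huge value
 * and is rejected by the same comparison.
 */
bool toy_in_stack_range(unsigned long sp, unsigned long sp0)
{
	return (sp0 - sp) < TOY_THREAD_SIZE;
}
```

With sp0 = 0x10000 and sp = 0xFF00, the range test accepts the pointer
while the mask test rejects it, because sp0 itself sits just past the
aligned block that holds the stack. The range test also does not depend
on the stack being aligned to its size.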


[PATCH v2 4/6] x86: Rename init_tss to cpu_tss

2015-03-05 Thread Andy Lutomirski

It has nothing to do with init -- there's only one tss per cpu.

Other names considered include:
 - current_tss: Confusing because we never switch the tss.
 - singleton_tss: Too long.

This patch was generated with 's/init_tss/cpu_tss/g'.  Followup patches
will fix INIT_TSS and INIT_TSS_IST by hand.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/ia32/ia32entry.S| 2 +-
 arch/x86/include/asm/processor.h | 4 ++--
 arch/x86/kernel/cpu/common.c | 6 +++---
 arch/x86/kernel/entry_64.S   | 2 +-
 arch/x86/kernel/ioport.c | 2 +-
 arch/x86/kernel/process.c| 6 +++---
 arch/x86/kernel/process_32.c | 2 +-
 arch/x86/kernel/process_64.c | 2 +-
 arch/x86/kernel/vm86_32.c| 4 ++--
 arch/x86/power/cpu.c | 2 +-
 10 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 719db63b35c4..ad9efef65a6b 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -113,7 +113,7 @@ ENTRY(ia32_sysenter_target)
CFI_DEF_CFA rsp,0
CFI_REGISTERrsp,rbp
SWAPGS_UNSAFE_STACK
-   movqPER_CPU_VAR(init_tss + TSS_sp0), %rsp
+   movqPER_CPU_VAR(cpu_tss + TSS_sp0), %rsp
/*
 * No need to follow this irqs on/off section: the syscall
 * disabled irqs, here we enable it straight after entry:
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 71c3a826a690..117ee65473e2 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -282,7 +282,7 @@ struct tss_struct {
 
 } cacheline_aligned;
 
-DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss);
+DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss);
 
 /*
  * Save the original ist values for checking stack pointers during debugging
@@ -566,7 +566,7 @@ static inline void native_swapgs(void)
 
 static inline unsigned long this_cpu_sp0(void)
 {
-   return this_cpu_read_stable(init_tss.x86_tss.sp0);
+   return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
 }
 
 #ifdef CONFIG_PARAVIRT
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2346c95c6ab1..5d0f0cc7ea26 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -979,7 +979,7 @@ static void syscall32_cpu_init(void)
 void enable_sep_cpu(void)
 {
int cpu = get_cpu();
-   struct tss_struct *tss = &per_cpu(init_tss, cpu);
+   struct tss_struct *tss = &per_cpu(cpu_tss, cpu);
 
if (!boot_cpu_has(X86_FEATURE_SEP)) {
put_cpu();
@@ -1307,7 +1307,7 @@ void cpu_init(void)
 */
load_ucode_ap();
 
-   t = &per_cpu(init_tss, cpu);
+   t = &per_cpu(cpu_tss, cpu);
oist = &per_cpu(orig_ist, cpu);
 
 #ifdef CONFIG_NUMA
@@ -1391,7 +1391,7 @@ void cpu_init(void)
 {
int cpu = smp_processor_id();
struct task_struct *curr = current;
-   struct tss_struct *t = &per_cpu(init_tss, cpu);
+   struct tss_struct *t = &per_cpu(cpu_tss, cpu);
struct thread_struct *thread = &curr->thread;
 
wait_for_master_cpu(cpu);
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 622ce4254893..0c00fd80249a 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -959,7 +959,7 @@ apicinterrupt IRQ_WORK_VECTOR \
 /*
  * Exception entry points.
  */
-#define INIT_TSS_IST(x) PER_CPU_VAR(init_tss) + (TSS_ist + ((x) - 1) * 8)
+#define INIT_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
 
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index 4ddaf66ea35f..37dae792dbbe 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -54,7 +54,7 @@ asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on)
 * because the ->io_bitmap_max value must match the bitmap
 * contents:
 */
-   tss = &per_cpu(init_tss, get_cpu());
+   tss = &per_cpu(cpu_tss, get_cpu());
 
if (turn_on)
bitmap_clear(t->io_bitmap_ptr, from, num);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index ff5c9088b1c5..6f6087349231 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -37,8 +37,8 @@
  * section. Since TSS's are completely CPU-local, we want them
  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
  */
-__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss) = INIT_TSS;
-EXPORT_PER_CPU_SYMBOL_GPL(init_tss);
+__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = INIT_TSS;
+EXPORT_PER_CPU_SYMBOL_GPL(cpu_tss);
 
 #ifdef CONFIG_X86_64
 static DEFINE_PER_CPU(unsigned char, is_idle);
@@ -110,7 +110,7 @@ void exit_thread(void)
unsigned long *bp = t->io_bitmap_ptr;
 
if (bp) {
-   struct tss_struct *tss = &per_cpu(init_tss, get_cpu());
+

[PATCH v2 3/6] x86, asm: Change the 32-bit sysenter code to use sp0

2015-03-05 Thread Andy Lutomirski

The ia32 sysenter code loaded the top of the kernel stack into rsp
by loading kernel_stack and then adjusting it.  It can be simplified
to just read sp0 directly.

This requires the addition of a new asm-offsets entry for sp0.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/ia32/ia32entry.S| 3 +--
 arch/x86/kernel/asm-offsets_64.c | 1 +
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index ed9746340363..719db63b35c4 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -113,8 +113,7 @@ ENTRY(ia32_sysenter_target)
CFI_DEF_CFA rsp,0
CFI_REGISTERrsp,rbp
SWAPGS_UNSAFE_STACK
-   movqPER_CPU_VAR(kernel_stack), %rsp
-   addq$(KERNEL_STACK_OFFSET),%rsp
+   movqPER_CPU_VAR(init_tss + TSS_sp0), %rsp
/*
 * No need to follow this irqs on/off section: the syscall
 * disabled irqs, here we enable it straight after entry:
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index fdcbb4d27c9f..5ce6f2da8763 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -81,6 +81,7 @@ int main(void)
 #undef ENTRY
 
OFFSET(TSS_ist, tss_struct, x86_tss.ist);
+   OFFSET(TSS_sp0, tss_struct, x86_tss.sp0);
BLANK();
 
DEFINE(__NR_syscall_max, sizeof(syscalls_64) - 1);
-- 
2.1.0
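
The new asm-offsets entry exists so that assembly can address sp0 as a
constant offset from the start of the TSS. A rough userspace sketch of the
idea, with a made-up layout: struct toy_hw_tss, struct toy_tss_struct and
TOY_TSS_sp0 are hypothetical and do not match the real structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Made-up layout; only the nesting resembles the kernel's. */
struct toy_hw_tss {
	uint64_t reserved;
	uint64_t sp0;
};

struct toy_tss_struct {
	uint64_t pad;
	struct toy_hw_tss x86_tss;
};

/* What an OFFSET(TSS_sp0, ...)-style entry boils down to. */
#define TOY_TSS_sp0 offsetof(struct toy_tss_struct, x86_tss.sp0)

/* Read sp0 the way assembly would: base address plus a constant. */
uint64_t toy_read_sp0(const struct toy_tss_struct *tss)
{
	uint64_t val;

	assert(tss);
	memcpy(&val, (const unsigned char *)tss + TOY_TSS_sp0, sizeof(val));
	return val;
}

/* Build a toy TSS with a known sp0 and read it back by offset. */
uint64_t toy_read_sp0_demo(void)
{
	struct toy_tss_struct tss = {
		.pad = 0,
		.x86_tss = { .reserved = 0, .sp0 = 0x1234 },
	};

	return toy_read_sp0(&tss);
}
```

In the patch the same constant lets the sysenter path load the stack
pointer with a single movq instead of a load followed by an add.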


Re: [PATCH v2 1/7] ARM: at91: switch to multiplatform

2015-03-05 Thread Rob Herring

On Thu, Mar 5, 2015 at 5:35 PM, Alexandre Belloni
 wrote:
> On 05/03/2015 at 16:50:57 -0600, Rob Herring wrote :
>> > -config SOC_SAMA5
>> > +config ARCH_AT91
>> > bool
>> > -   select ATMEL_AIC5_IRQ
>> > +   select ARCH_REQUIRE_GPIOLIB
>> > select COMMON_CLK_AT91
>> > -   select CPU_V7
>> > +   select CLKDEV_LOOKUP
>>
>> This is already selected by COMMON_CLK I think.
>>
>> > select GENERIC_CLOCKEVENTS
>>
>> This is already selected.
>>
>
> I'm just moving options around I didn't add or remove any. That applies
> to most of your comments.

You are enabling multiplatform, which means you can drop selecting the
ones multiplatform selects. I've cleaned up the tree once for this
and I don't care to do it again.

Rob

[PATCH v6] x86: mce: kexec: switch MCE handler for kexec/kdump

2015-03-05 Thread Naoya Horiguchi

On Thu, Mar 05, 2015 at 09:37:52AM +, Naoya Horiguchi wrote:
...
> > With the above simplified versions used, the rest of the patch becomes
> > almost trivial.
> 
> Other than that, I'm OK to write in the simplified form.

Here is the updated one.

And I found some cleanups and/or tiny fixes (independent from this patch),
so will post them later.

Thanks,
Naoya Horiguchi

---
>From 8890e9976c525a4b480bf5f86008641688de8c11 Mon Sep 17 00:00:00 2001
From: Naoya Horiguchi 
Date: Fri, 6 Mar 2015 11:52:10 +0900
Subject: [PATCH v6] x86: mce: kexec: switch MCE handler for kexec/kdump

kexec disables (or "shoots down") all CPUs other than the crashing CPU before
entering the 2nd kernel. But the MCE handler is still enabled after that,
so if an MCE happens and is broadcast to the CPUs after the main thread starts
the 2nd kernel (which might not have initialized the MCE device yet, or might
decide not to enable it), the MCE handler runs only on the other CPUs (not on
the main thread), leading to a kernel panic in MCE synchronization. The
user-visible effect of this bug is kdump failure.

Our standard MCE handler do_machine_check() makes some assumptions about the
system's state, and it's hard to alter it to cover the kexec/kdump context, so
let's add another kdump-specific handler and switch to it.

Note that this problem has existed since the current MCE handler was
implemented in 2.6.32, and recently commit 716079f66eac ("mce: Panic when a
core has reached a timeout") made it more visible by changing the default
behavior of the synchronization timeout from "ignore" to "panic".

Signed-off-by: Naoya Horiguchi 
---
ChangeLog v5 -> v6:
- drop "CC stable" tag
- stop using/exporting mce_gather_info(), mce_(rd|wr)msrl(), and mce_panic()
- drop quirk_no_way_out() part, because quirk_sandybridge_ifu() (only possible
  callback) could just change a MCE_PANIC_SEVERITY case to a MCE_AR_SEVERITY
  case, which doesn't affect the panic/return decision.

ChangeLog v4 -> v5:
- drop MCE_UC/AR_SEVERITY re-ordering
- move most of code to arch/x86/kernel/crash.c
- export some MCE internal variables/routines via arch/x86/include/asm/mce.h

ChangeLog v3 -> v4:
- fixed AR and UC order in enum severity_level because UC is severer than AR
  by definition. Current code is not affected by this wrong order by chance.
- check severity in machine_check_under_kdump(), and call mce_panic() if the
  resultant severity is as bad as or worse than MCE_AR_SEVERITY.
- use static global variable kdump_cpu instead of mca_cfg->kdump_cpu
- reduce "#ifdef CONFIG_KEXEC"
- add "#ifdef CONFIG_X86_MCE" for declaration of machine_check_under_kdump()
  in mce.h
- update comment on switch_mce_handler_for_kdump()

ChangeLog v2 -> v3
- go to "switch MCE handler" approach

ChangeLog v1 -> v2
- clear MSR_IA32_MCG_CTL, MSR_IA32_MCx_CTL, and CR4.MCE instead of using
  global flag to ignore MCE events.
- fixed the description of the problem
---
 arch/x86/include/asm/mce.h| 14 +
 arch/x86/kernel/cpu/mcheck/mce-internal.h | 13 -
 arch/x86/kernel/crash.c   | 88 +++
 3 files changed, 102 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 51b26e895933..192267fcee73 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -248,4 +248,18 @@ struct cper_sec_mem_err;
 extern void apei_mce_report_mem_error(int corrected,
  struct cper_sec_mem_err *mem_err);
 
+enum severity_level {
+   MCE_NO_SEVERITY,
+   MCE_DEFERRED_SEVERITY,
+   MCE_UCNA_SEVERITY = MCE_DEFERRED_SEVERITY,
+   MCE_KEEP_SEVERITY,
+   MCE_SOME_SEVERITY,
+   MCE_AO_SEVERITY,
+   MCE_UC_SEVERITY,
+   MCE_AR_SEVERITY,
+   MCE_PANIC_SEVERITY,
+};
+
+int mce_severity(struct mce *a, int tolerant, char **msg, bool is_excp);
+
 #endif /* _ASM_X86_MCE_H */
diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 10b46906767f..909ee3ed95dd 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -1,18 +1,6 @@
 #include 
 #include 
 
-enum severity_level {
-   MCE_NO_SEVERITY,
-   MCE_DEFERRED_SEVERITY,
-   MCE_UCNA_SEVERITY = MCE_DEFERRED_SEVERITY,
-   MCE_KEEP_SEVERITY,
-   MCE_SOME_SEVERITY,
-   MCE_AO_SEVERITY,
-   MCE_UC_SEVERITY,
-   MCE_AR_SEVERITY,
-   MCE_PANIC_SEVERITY,
-};
-
 #define ATTR_LEN   16
 
 /* One object for each MCE bank, shared by all CPUs */
@@ -23,7 +11,6 @@ struct mce_bank {
    char attrname[ATTR_LEN]; /* attribute name */
 };
 
-int mce_severity(struct mce *a, int tolerant, char **msg, bool is_excp);
 struct dentry *mce_get_debugfs_dir(void);
 
 extern struct mce_bank *mce_banks;
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 6f3baedcb6f6..588a8b214356 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -34,6 +34,7 @@
 #in

Re: parent/child hierarchy for regulator

2015-03-05 Thread Peter Chen

On Thu, Mar 05, 2015 at 12:22:34PM +, Mark Brown wrote:
> On Thu, Mar 05, 2015 at 06:35:36PM +0800, Peter Chen wrote:
> 
> > Is there a good way in code/dts to express a parent/child hierarchy for regulators?
> 
> There's plenty of examples in mainline...
> 

Thanks, I will go back and study them again.

> > The related regulators on my platforms look like this:
> > PMIC (SWB 5v) --> Switch Chip (GPIO Regulator) --> USB VBUS
> 
> > The PMIC has one 5V regulator (e.g. swbst on the pfuze), which is the
> > input for the USB power switch chip. Two GPIOs on this switch chip
> > control whether 5V is output, and we register these two GPIOs as
> > fixed regulators. Currently, if regulator swbst is disabled, the GPIO
> > regulator has no way to know, so the VBUS voltage ends up wrong.
> 
> Can you please clarify why you're registering two fixed voltage
> regulators for the switch chip and how you're doing that?

There are two fixed regulators, one for each of the two USB VBUS outputs.
There is no relationship between them, but both need the PMIC 5V supply
(swbst on the pfuze) to be enabled.

> The picture
> above looks like you should just have a single regulator there and
> nothing should care if either regulator is enabled when querying the
> parent for its voltage.

I do need to care about the parent's status. Currently the USB code does
not take the parent regulator into account, so after the patch below the
VBUS voltage is incorrect: the parent regulator has no user, so it is
disabled after boot.

commit a6dcf9782f99a0d844b4d06f65cc990468424068
Author: Sean Cross 
Date:   Mon May 26 16:45:40 2014 +0800

regulator: pfuze100: Support SWB enable/disable

The SWB regulators have the ability to be turned on and off.
Add enable/disable support for these regulators.

-- 

Best Regards,
Peter Chen
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] ipv4: ip_check_defrag should not assume that skb_network_offset is zero

2015-03-05 Thread David Miller

From: Alexander Drozdov 
Date: Thu,  5 Mar 2015 10:29:39 +0300

> ip_check_defrag() may be used by af_packet to defragment outgoing packets.
> skb_network_offset() of af_packet's outgoing packets is not zero.
> 
> Signed-off-by: Alexander Drozdov 

Applied, thanks.
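
For readers following along, the offset problem named in the patch title can be illustrated in user space. The following is only a sketch under stated assumptions, not the kernel code: build_frame(), ip_version_at() and demo_offsets() are hypothetical helpers, and the 14-byte Ethernet header stands in for whatever link-layer header precedes the IP header in an outgoing af_packet buffer. Reading the IP header at offset 0 lands in the link-layer header; reading it at the network offset finds the real header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN 14u /* Ethernet header: 6 + 6 + 2 bytes (assumed link layer) */

/* Hypothetical helper: write a minimal Ethernet + IPv4 frame prefix.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t build_frame(uint8_t *buf, size_t buf_len)
{
    if (buf_len < ETH_HDR_LEN + 1)
        return 0;
    memset(buf, 0xff, 6);       /* broadcast destination MAC */
    memset(buf + 6, 0x02, 6);   /* arbitrary source MAC */
    buf[12] = 0x08;             /* EtherType 0x0800 = IPv4 */
    buf[13] = 0x00;
    buf[ETH_HDR_LEN] = 0x45;    /* IP version 4, header length 5 words */
    return ETH_HDR_LEN + 1;
}

/* Hypothetical helper: IP version nibble found at the given offset,
 * or -1 if the offset lies outside the frame. */
static int ip_version_at(const uint8_t *frame, size_t frame_len, size_t offset)
{
    if (offset >= frame_len)
        return -1;
    return frame[offset] >> 4;
}

/* Shows the difference between assuming offset 0 and using the real offset. */
static void demo_offsets(void)
{
    uint8_t frame[32];
    size_t n = build_frame(frame, sizeof(frame));

    assert(n == ETH_HDR_LEN + 1);
    /* Offset 0 reads the first MAC byte, which is not an IP header. */
    assert(ip_version_at(frame, n, 0) != 4);
    /* The network offset reads the actual IP header. */
    assert(ip_version_at(frame, n, ETH_HDR_LEN) == 4);
}
```

Calling demo_offsets() from a main() exercises both reads; the asserts hold only when the parser is given the real network offset.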

Re: [PATCH] netlink: drop (int) cast on length arg in NLMSG_OK

2015-03-05 Thread David Miller

From: Mike Frysinger 
Date: Thu,  5 Mar 2015 00:47:08 -0500

> The NLMSG_OK macro compares three things:
>  - the len arg from the user
>  - a size_t: sizeof(struct nlmsghdr)
>  - an int: sizeof(struct nlmsghdr) casted
>  - an u32: the nlmsghdr->nlmsg_len member
> 
> When building with -Wsign-compare, this macro triggers a signed compare
> warning.  This is because it compares len to an int, and then compares
> it to a u32.  If len is signed, we get a warning due to the last test.
> If len is unsigned, we get a warning due to the first test.  Like in
> strace:
> socketutils.c:145:8: warning: comparison between signed and unsigned
>   integer expressions [-Wsign-compare]
> 
> Let's drop the int cast on the first sizeof.  This way, once the user
> casts len to an unsigned value, everything shakes out correctly.
> 
> Signed-off-by: Mike Frysinger 

I don't think we can change this.  If you get rid of the 'int' cast
then code is going to end up with an unsigned comparison for the first
test even if 'len' is signed, and that's a potential security issue.
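
The concern about the first test can be reproduced in user space. The sketch below is an illustration under stated assumptions, not the uapi header: struct fake_nlmsghdr is a hypothetical stand-in with the same field layout as struct nlmsghdr, and the two helper functions model only the first clause of NLMSG_OK. The conversion to size_t is written out explicitly; in the macro it would happen implicitly through the usual arithmetic conversions once the cast is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nlmsghdr (same field layout). */
struct fake_nlmsghdr {
    unsigned int   nlmsg_len;
    unsigned short nlmsg_type;
    unsigned short nlmsg_flags;
    unsigned int   nlmsg_seq;
    unsigned int   nlmsg_pid;
};

/* First test with the (int) cast: both operands are signed,
 * so a negative len is rejected. */
static int first_test_with_cast(int len)
{
    return len >= (int)sizeof(struct fake_nlmsghdr);
}

/* First test without the cast: len is converted to size_t, so a
 * negative len becomes a very large value and passes. The explicit
 * cast mirrors what the compiler would do implicitly. */
static int first_test_without_cast(int len)
{
    return (size_t)len >= sizeof(struct fake_nlmsghdr);
}

/* Shows that the two variants disagree only for negative lengths. */
static void demo_negative_len(void)
{
    int hdr = (int)sizeof(struct fake_nlmsghdr);

    assert(first_test_with_cast(hdr) && first_test_without_cast(hdr));
    assert(!first_test_with_cast(1) && !first_test_without_cast(1));
    assert(!first_test_with_cast(-1));
    assert(first_test_without_cast(-1)); /* negative length accepted */
}
```

A caller that passes a signed, attacker-influenced length would then rely entirely on the remaining clauses of the macro to reject the message.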
