Re: [PATCH 1/1] x86/fpu: math_state_restore() should not blindly disable irqs
* Oleg Nesterov wrote:

> On 03/05, Ingo Molnar wrote:
> >
> > * Oleg Nesterov wrote:
> >
> > > --- a/arch/x86/kernel/traps.c
> > > +++ b/arch/x86/kernel/traps.c
> > > @@ -774,7 +774,10 @@ void math_state_restore(void)
> > >  	struct task_struct *tsk = current;
> > >
> > >  	if (!tsk_used_math(tsk)) {
> > > -		local_irq_enable();
> > > +		bool disabled = irqs_disabled();
> > > +
> > > +		if (disabled)
> > > +			local_irq_enable();
> > >  		/*
> > >  		 * does a slab alloc which can sleep
> > >  		 */
> > > @@ -785,7 +788,9 @@ void math_state_restore(void)
> > >  			do_group_exit(SIGKILL);
> > >  			return;
> > >  		}
> > > -		local_irq_disable();
> > > +
> > > +		if (disabled)
> > > +			local_irq_disable();
> > >  	}
> >
> > Yuck!
> >
> > Is there a fundamental reason why we cannot simply enable irqs and
> > leave them enabled? Math state restore is not atomic and cannot really
> > be atomic.
>
> You know, I didn't even try to verify ;) but see below.

So I'm thinking about the attached patch.

> Most probably we can simply enable irqs, yes. But what about older
> kernels, how can we check?
>
> And let me repeat, I strongly believe that this !tsk_used_math()
> case in math_state_restore() must die. And unlazy_fpu() in
> init_fpu(). And both __restore_xstate_sig() and flush_thread()
> should not use math_state_restore() at all. At least in its current
> form.

Agreed.

> But this is obviously not -stable material.
>
> That said, I'll try to look into git history tomorrow.

So I think the reasons are:

 - historic: because math_state_restore() started out as an interrupt
   routine (from the IRQ13 days)

 - hardware imposed: the handler is executed with irqs off

 - it's probably the fastest implementation: we just run with the
   natural irqs-off state the handler executes with.

So there's nothing outright wrong about executing with irqs off in a
trap handler.

> [...] The patch above looks "obviously safe", but perhaps I am
> paranoid too much...

IMHO your hack above isn't really acceptable, even for a backport.
So lets test the patch below (assuming it's the right thing to do)
and move forward?

Thanks,

	Ingo

==>
From: Ingo Molnar
Date: Fri, 6 Mar 2015 08:37:57 +0100
Subject: [PATCH] x86/fpu: Don't disable irqs in math_state_restore()

math_state_restore() was historically called with irqs disabled,
because that's how the hardware generates the trap, and also because
back in the days it was possible for it to be an asynchronous
interrupt and interrupt handlers run with irqs off.

These days it's always an instruction trap, and furthermore it does
inevitably complex things such as memory allocation and signal
processing, which is not done with irqs disabled.

So keep irqs enabled.

This might surprise in-kernel FPU users that somehow relied on
interrupts being disabled across FPU usage - but that's
fundamentally fragile anyway due to the inatomicity of FPU state
restores. The trap return will restore interrupts to its previous
state, but if FPU ops trigger math_state_restore() there's no
guarantee of atomicity anymore.

To warn about in-kernel irqs-off users of FPU state we might want to
pass 'struct pt_regs' to math_state_restore() and check the trapped
state for irqs disabled (flags has IF cleared) and kernel context -
but that's for a later patch.

Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Fenghua Yu
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Quentin Casasnovas
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar
---
 arch/x86/kernel/traps.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 950815a138e1..52f9e4057cee 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -844,8 +844,9 @@ void math_state_restore(void)
 {
 	struct task_struct *tsk = current;

+	local_irq_enable();
+
 	if (!tsk_used_math(tsk)) {
-		local_irq_enable();
 		/*
 		 * does a slab alloc which can sleep
 		 */
@@ -856,7 +857,6 @@ void math_state_restore(void)
 			do_group_exit(SIGKILL);
 			return;
 		}
-		local_irq_disable();
 	}

 	/* Avoid __kernel_fpu_begin() right after __thread_fpu_begin() */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
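A note on the check proposed in the last changelog paragraph (irqs off in the trapped state, plus kernel context): the following is only a userspace sketch of what such a test might look like, not code from any posted patch. struct mock_pt_regs and every mock_* name are hypothetical stand-ins. The only facts relied on are that IF is bit 9 of EFLAGS and that the low two bits of the saved CS selector hold the privilege level.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two 'struct pt_regs' fields used here. */
struct mock_pt_regs {
	unsigned long flags;	/* saved EFLAGS/RFLAGS at trap time */
	unsigned long cs;	/* saved code segment selector */
};

#define MOCK_EFLAGS_IF	0x200UL	/* bit 9: interrupt enable flag */
#define MOCK_CS_RPL	0x3UL	/* low two selector bits: privilege level */

/* True if the trap was taken from kernel context (privilege level 0). */
static bool mock_from_kernel(const struct mock_pt_regs *regs)
{
	return (regs->cs & MOCK_CS_RPL) == 0;
}

/* True if interrupts were disabled at the trapping instruction. */
static bool mock_irqs_were_off(const struct mock_pt_regs *regs)
{
	return !(regs->flags & MOCK_EFLAGS_IF);
}

/* The combination the changelog suggests warning about. */
static bool mock_should_warn(const struct mock_pt_regs *regs)
{
	return mock_from_kernel(regs) && mock_irqs_were_off(regs);
}
```

A user-mode trap never warns in this sketch, whatever its flags say, which matches the "in-kernel irqs-off users" wording. Where the real check would live and what it would print is left open, as in the changelog.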
[PATCH] phy: core: Fixup return value of phy_exit when !pm_runtime_enabled
When phy_pm_runtime_get_sync() returns -ENOTSUPP, phy_exit() also returns
-ENOTSUPP if !phy->ops->exit. Fix it.

Also move the code to override ret close to the code we got ret.
I think it is less error prone this way.

Signed-off-by: Axel Lin
---
 drivers/phy/phy-core.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
index a12d353..250dc6c 100644
--- a/drivers/phy/phy-core.c
+++ b/drivers/phy/phy-core.c
@@ -223,6 +223,7 @@ int phy_init(struct phy *phy)
 	ret = phy_pm_runtime_get_sync(phy);
 	if (ret < 0 && ret != -ENOTSUPP)
 		return ret;
+	ret = 0; /* Override possible ret == -ENOTSUPP */

 	mutex_lock(&phy->mutex);
 	if (phy->init_count == 0 && phy->ops->init) {
@@ -231,8 +232,6 @@ int phy_init(struct phy *phy)
 			dev_err(&phy->dev, "phy init failed --> %d\n", ret);
 			goto out;
 		}
-	} else {
-		ret = 0; /* Override possible ret == -ENOTSUPP */
 	}
 	++phy->init_count;

@@ -253,6 +252,7 @@ int phy_exit(struct phy *phy)
 	ret = phy_pm_runtime_get_sync(phy);
 	if (ret < 0 && ret != -ENOTSUPP)
 		return ret;
+	ret = 0; /* Override possible ret == -ENOTSUPP */

 	mutex_lock(&phy->mutex);
 	if (phy->init_count == 1 && phy->ops->exit) {
@@ -287,6 +287,7 @@ int phy_power_on(struct phy *phy)
 	ret = phy_pm_runtime_get_sync(phy);
 	if (ret < 0 && ret != -ENOTSUPP)
 		return ret;
+	ret = 0; /* Override possible ret == -ENOTSUPP */

 	mutex_lock(&phy->mutex);
 	if (phy->power_count == 0 && phy->ops->power_on) {
@@ -295,8 +296,6 @@ int phy_power_on(struct phy *phy)
 			dev_err(&phy->dev, "phy poweron failed --> %d\n", ret);
 			goto out;
 		}
-	} else {
-		ret = 0; /* Override possible ret == -ENOTSUPP */
 	}
 	++phy->power_count;
 	mutex_unlock(&phy->mutex);
--
1.9.1
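For readers following the phy_exit() case, here is a reduced userspace model of the control flow after the patch. It is a sketch under assumptions: struct mock_phy and mock_phy_exit() are hypothetical, the mutex and the runtime-PM put are left out, and pm_ret stands in for the return value of phy_pm_runtime_get_sync(). ENOTSUPP is a kernel-internal errno (524) that userspace <errno.h> does not provide, so it is defined locally.

```c
#define MOCK_ENOTSUPP 524	/* kernel-internal errno; not in userspace <errno.h> */

/* Hypothetical reduction of struct phy to what phy_exit() looks at. */
struct mock_phy {
	int init_count;
	int (*exit)(struct mock_phy *phy);	/* optional op, may be unset */
};

/*
 * Shape of phy_exit() after the patch. pm_ret plays the role of
 * phy_pm_runtime_get_sync()'s return value.
 */
static int mock_phy_exit(struct mock_phy *phy, int pm_ret)
{
	int ret = pm_ret;

	if (ret < 0 && ret != -MOCK_ENOTSUPP)
		return ret;
	ret = 0;	/* Override possible ret == -ENOTSUPP */

	if (phy->init_count == 1 && phy->exit) {
		ret = phy->exit(phy);
		if (ret < 0)
			return ret;
	}
	--phy->init_count;
	return ret;
}
```

With the override placed directly after the runtime-PM call, a PHY without an exit op returns 0 instead of leaking -ENOTSUPP, and real errors from the PM call still propagate unchanged.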
[PATCH v3 1/2] locks: Split insert/delete block functions into flock/posix parts
The locks_insert/delete_block() functions are used for flock, posix
and leases types. blocked_lock_lock is used to serialize all access to
fl_link, fl_block, fl_next and blocked_hash. Here, we prepare the
stage for using blocked_lock_lock only to protect blocked_hash.

Signed-off-by: Daniel Wagner
Cc: Jeff Layton
Cc: "J. Bruce Fields"
Cc: Alexander Viro
---
 fs/locks.c | 49 -
 1 file changed, 40 insertions(+), 9 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index d4992a1..0c37d68 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -611,11 +611,20 @@ static void locks_delete_global_blocked(struct file_lock *waiter)
  */
 static void __locks_delete_block(struct file_lock *waiter)
 {
-	locks_delete_global_blocked(waiter);
 	list_del_init(&waiter->fl_block);
 	waiter->fl_next = NULL;
 }

+/* Posix block variant of __locks_delete_block.
+ *
+ * Must be called with blocked_lock_lock held.
+ */
+static void __locks_delete_posix_block(struct file_lock *waiter)
+{
+	locks_delete_global_blocked(waiter);
+	__locks_delete_block(waiter);
+}
+
 static void locks_delete_block(struct file_lock *waiter)
 {
 	spin_lock(&blocked_lock_lock);
@@ -623,6 +632,13 @@ static void locks_delete_block(struct file_lock *waiter)
 	spin_unlock(&blocked_lock_lock);
 }

+static void locks_delete_posix_block(struct file_lock *waiter)
+{
+	spin_lock(&blocked_lock_lock);
+	__locks_delete_posix_block(waiter);
+	spin_unlock(&blocked_lock_lock);
+}
+
 /* Insert waiter into blocker's block list.
  * We use a circular list so that processes can be easily woken up in
  * the order they blocked. The documentation doesn't require this but
@@ -639,8 +655,17 @@ static void __locks_insert_block(struct file_lock *blocker,
 	BUG_ON(!list_empty(&waiter->fl_block));
 	waiter->fl_next = blocker;
 	list_add_tail(&waiter->fl_block, &blocker->fl_block);
-	if (IS_POSIX(blocker) && !IS_OFDLCK(blocker))
-		locks_insert_global_blocked(waiter);
+}
+
+/* Posix block variant of __locks_insert_block.
+ *
+ * Must be called with flc_lock and blocked_lock_lock held.
+ */
+static void __locks_insert_posix_block(struct file_lock *blocker,
+					struct file_lock *waiter)
+{
+	__locks_insert_block(blocker, waiter);
+	locks_insert_global_blocked(waiter);
 }

 /* Must be called with flc_lock held. */
@@ -675,7 +700,10 @@ static void locks_wake_up_blocks(struct file_lock *blocker)

 		waiter = list_first_entry(&blocker->fl_block,
 				struct file_lock, fl_block);
-		__locks_delete_block(waiter);
+		if (IS_POSIX(blocker) && !IS_OFDLCK(blocker))
+			__locks_delete_posix_block(waiter);
+		else
+			__locks_delete_block(waiter);
 		if (waiter->fl_lmops && waiter->fl_lmops->lm_notify)
 			waiter->fl_lmops->lm_notify(waiter);
 		else
@@ -985,7 +1013,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
 			spin_lock(&blocked_lock_lock);
 			if (likely(!posix_locks_deadlock(request, fl))) {
 				error = FILE_LOCK_DEFERRED;
-				__locks_insert_block(fl, request);
+				__locks_insert_posix_block(fl, request);
 			}
 			spin_unlock(&blocked_lock_lock);
 			goto out;
@@ -1186,7 +1214,7 @@ int posix_lock_file_wait(struct file *filp, struct file_lock *fl)
 		if (!error)
 			continue;

-		locks_delete_block(fl);
+		locks_delete_posix_block(fl);
 		break;
 	}
 	return error;
@@ -1283,7 +1311,7 @@ int locks_mandatory_area(int read_write, struct inode *inode,
 			continue;
 		}

-		locks_delete_block(&fl);
+		locks_delete_posix_block(&fl);
 		break;
 	}

@@ -2104,7 +2132,10 @@ static int do_lock_file_wait(struct file *filp, unsigned int cmd,
 		if (!error)
 			continue;

-		locks_delete_block(fl);
+		if (IS_POSIX(fl) && !IS_OFDLCK(fl))
+			locks_delete_posix_block(fl);
+		else
+			locks_delete_block(fl);
 		break;
 	}

@@ -2468,7 +2499,7 @@ posix_unblock_lock(struct file_lock *waiter)

 	spin_lock(&blocked_lock_lock);
 	if (waiter->fl_next)
-		__locks_delete_block(waiter);
+		__locks_delete_posix_block(waiter);
 	else
 		status = -ENOENT;
 	spin_unlock(&blocked_lock_lock);
--
2.1.0
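The layering this patch introduces (a generic block-list operation, plus a POSIX variant that also maintains the global hash) may be easier to see in a toy model. Everything here is hypothetical: struct mock_waiter reduces struct file_lock to two booleans, there is no locking at all, and the blocker_is_posix argument stands in for the IS_POSIX(blocker) && !IS_OFDLCK(blocker) test.

```c
#include <stdbool.h>

/* Hypothetical, much-reduced stand-in for a waiting struct file_lock. */
struct mock_waiter {
	bool on_block_list;	/* linked on the blocker's fl_block list */
	bool in_blocked_hash;	/* hashed in the global blocked_hash */
};

/* Generic part: only touches the per-blocker list. */
static void mock_delete_block(struct mock_waiter *w)
{
	w->on_block_list = false;
}

/* POSIX part: drop the global hash entry, then do the generic part. */
static void mock_delete_posix_block(struct mock_waiter *w)
{
	w->in_blocked_hash = false;
	mock_delete_block(w);
}

/* Caller-side dispatch, in the style of locks_wake_up_blocks(). */
static void mock_wake_one(struct mock_waiter *w, bool blocker_is_posix)
{
	if (blocker_is_posix)
		mock_delete_posix_block(w);
	else
		mock_delete_block(w);
}
```

The point of the split is that the generic path no longer touches blocked_hash at all, so only the POSIX path needs whatever lock ends up protecting the hash.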
[PATCH v3 2/2] locks: Use blocked_lock_lock only to protect blocked_hash
blocked_lock_lock and file_lock_lglock are used to protect file_lock's fl_link, fl_block, fl_next, blocked_hash and the percpu file_lock_list. Let's use blocked_lock_lock only to protect blocked_hash since it is a global lock. Whenever we insert a new lock we are going to grab besides the flc_lock also the corresponding file_lock_lglock. The global blocked_lock_lock is only used when blocked_hash is involved. Since we already use fl_link_cpu to remember which percpu file_lock_list is referencing to a blocker we just going to use it as well for all waiters. Note fl_list is protected by flc_lock. It's easy to get confused... Signed-off-by: Daniel Wagner Cc: Jeff Layton Cc: "J. Bruce Fields" Cc: Alexander Viro --- fs/locks.c | 72 ++ 1 file changed, 39 insertions(+), 33 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index 0c37d68..661e58b 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -162,6 +162,20 @@ int lease_break_time = 45; * keep a list on each CPU, with each list protected by its own spinlock via * the file_lock_lglock. Note that alterations to the list also require that * the relevant flc_lock is held. + * + * In addition, it also protects the fl->fl_block list, and the fl->fl_next + * pointer for file_lock structures that are acting as lock requests (in + * contrast to those that are acting as records of acquired locks). + * + * file_lock structures acting as lock requests (waiters) use the same + * spinlock as the those acting as lock holder (blocker). E.g. the + * blocker is initially added to the file_lock_list living on CPU 0, + * all waiters on that blocker are serialized via CPU 0 (see + * fl_link_cpu usage). + * + * In particular, adding an entry to the fl_block list requires that you hold + * both the flc_lock and the blocked_lock_lock (acquired in that order). + * Deleting an entry from the list however only requires the file_lock_gllock. 
*/ DEFINE_STATIC_LGLOCK(file_lock_lglock); static DEFINE_PER_CPU(struct hlist_head, file_lock_list); @@ -183,19 +197,6 @@ static DEFINE_HASHTABLE(blocked_hash, BLOCKED_HASH_BITS); /* * This lock protects the blocked_hash. Generally, if you're accessing it, you * want to be holding this lock. - * - * In addition, it also protects the fl->fl_block list, and the fl->fl_next - * pointer for file_lock structures that are acting as lock requests (in - * contrast to those that are acting as records of acquired locks). - * - * Note that when we acquire this lock in order to change the above fields, - * we often hold the flc_lock as well. In certain cases, when reading the fields - * protected by this lock, we can skip acquiring it iff we already hold the - * flc_lock. - * - * In particular, adding an entry to the fl_block list requires that you hold - * both the flc_lock and the blocked_lock_lock (acquired in that order). - * Deleting an entry from the list however only requires the file_lock_lock. */ static DEFINE_SPINLOCK(blocked_lock_lock); @@ -607,7 +608,7 @@ static void locks_delete_global_blocked(struct file_lock *waiter) /* Remove waiter from blocker's block list. * When blocker ends up pointing to itself then the list is empty. * - * Must be called with blocked_lock_lock held. + * Must be called with file_lock_lglock held. */ static void __locks_delete_block(struct file_lock *waiter) { @@ -617,7 +618,7 @@ static void __locks_delete_block(struct file_lock *waiter) /* Posix block variant of __locks_delete_block. * - * Must be called with blocked_lock_lock held. + * Must be called with file_lock_lglock held. 
*/ static void __locks_delete_posix_block(struct file_lock *waiter) { @@ -627,16 +628,18 @@ static void __locks_delete_posix_block(struct file_lock *waiter) static void locks_delete_block(struct file_lock *waiter) { - spin_lock(&blocked_lock_lock); + lg_local_lock_cpu(&file_lock_lglock, waiter->fl_link_cpu); __locks_delete_block(waiter); - spin_unlock(&blocked_lock_lock); + lg_local_unlock_cpu(&file_lock_lglock, waiter->fl_link_cpu); } static void locks_delete_posix_block(struct file_lock *waiter) { + lg_local_lock_cpu(&file_lock_lglock, waiter->fl_link_cpu); spin_lock(&blocked_lock_lock); __locks_delete_posix_block(waiter); spin_unlock(&blocked_lock_lock); + lg_local_unlock_cpu(&file_lock_lglock, waiter->fl_link_cpu); } /* Insert waiter into blocker's block list. @@ -644,22 +647,23 @@ static void locks_delete_posix_block(struct file_lock *waiter) * the order they blocked. The documentation doesn't require this but * it seems like the reasonable thing to do. * - * Must be called with both the flc_lock and blocked_lock_lock held. The - * fl_block list itself is protected by the blocked_lock_lock, but by ensuring + * Must be called with both the flc_lock and file_lock_lglock held. The + * fl_block list itself is protected by the file_lock_lglock, but by ensuring
[PATCH v3 0/2] Use blocked_lock_lock only to protect blocked_hash
Hi, Finally, I got a bigger machine and did a quick test round. I expected to see some improvements but the resutls do not show any real gain. So they are merely refactoring patches. 4x Intel(R) Xeon(R) CPU E5-4610 v2 @ 2.30GHz 4.0.0-rc2/flock01.data # NumSamples = 3; Min = 47160.80; Max = 47555.42 # Mean = 47294.254786; Variance = 34110.284932; SD = 184.689699; Median 47166.534982 # each ∎ represents a count of 1 47160.8049 - 47200.2668 [ 2]: ∎∎ 47200.2668 - 47239.7288 [ 0]: 47239.7288 - 47279.1908 [ 0]: 47279.1908 - 47318.6527 [ 0]: 47318.6527 - 47358.1147 [ 0]: 47358.1147 - 47397.5767 [ 0]: 47397.5767 - 47437.0386 [ 0]: 47437.0386 - 47476.5006 [ 0]: 47476.5006 - 47515.9625 [ 0]: 47515.9625 - 47555.4245 [ 1]: ∎ patched/flock01.data # NumSamples = 21; Min = 45877.22; Max = 50206.70 # Mean = 47042.844720; Variance = 752166.966346; SD = 867.275600; Median 46939.811380 # each ∎ represents a count of 1 45877.2235 - 46310.1709 [ 2]: ∎∎ 46310.1709 - 46743.1182 [ 7]: ∎∎∎ 46743.1182 - 47176.0655 [ 3]: ∎∎∎ 47176.0655 - 47609.0128 [ 6]: ∎∎ 47609.0128 - 48041.9602 [ 2]: ∎∎ 48041.9602 - 48474.9075 [ 0]: 48474.9075 - 48907.8548 [ 0]: 48907.8548 - 49340.8021 [ 0]: 49340.8021 - 49773.7495 [ 0]: 49773.7495 - 50206.6968 [ 1]: ∎ 4.0.0-rc2/flock02.data # NumSamples = 1786; Min = 1.86; Max = 3.13 # Mean = 2.204980; Variance = 0.015900; SD = 0.126096; Median 2.177549 # each ∎ represents a count of 13 1.8606 - 1.9880 [ 5]: 1.9880 - 2.1154 [ 315]: 2.1154 - 2.2427 [ 1040]: 2.2427 - 2.3701 [ 272]: 2.3701 - 2.4975 [75]: ∎ 2.4975 - 2.6249 [42]: ∎∎∎ 2.6249 - 2.7523 [28]: ∎∎ 2.7523 - 2.8796 [ 7]: 2.8796 - 3.0070 [ 1]: 3.0070 - 3.1344 [ 1]: patched/flock02.data # NumSamples = 4586; Min = 2.14; Max = 4.31 # Mean = 2.619467; Variance = 0.043192; SD = 0.207828; Median 2.575378 # each ∎ represents a count of 27 2.1385 - 2.3561 [ 186]: ∎∎ 2.3561 - 2.5737 [ 2079]: ∎ 2.5737 - 2.7914 [ 1642]: 2.7914 - 3.0090 [ 355]: ∎ 3.0090 - 3.2266 [ 246]: ∎ 3.2266 - 3.4442 [66]: ∎∎ 3.4442 - 3.6618 [ 9]: 3.6618 - 
3.8795 [ 1]: 3.8795 - 4.0971 [ 0]: 4.0971 - 4.3147 [ 2]: 4.0.0-rc2/lease01.data # NumSamples = 12; Min = 1097.16; Max = 1255.06 # Mean = 1184.550432; Variance = 1590.438052; SD = 39.880297; Median 1190.704582 # each ∎ represents a count of 1 1097.1556 - 1112.9460 [ 1]: ∎ 1112.9460 - 1128.7363 [ 0]: 1128.7363 - 1144.5267 [ 1]: ∎ 1144.5267 - 1160.3170 [ 0]: 1160.3170 - 1176.1074 [ 2]: ∎∎ 1176.1074 - 1191.8977 [ 2]: ∎∎ 1191.8977 - 1207.6881 [ 2]: ∎∎ 1207.6881 - 1223.4784 [ 3]: ∎∎∎ 1223.4784 - 1239.2688 [ 0]: 1239.2688 - 1255.0591 [ 1]: ∎ patched/lease01.data # NumSamples = 14; Min = 1055.00; Max = 1213.97 # Mean = 1128.800723; Variance = 2225.466357; SD = 47.174849; Median 1114.384900 # each ∎ represents a count of 1 1054.9959 - 1070.8932 [ 2]: ∎∎ 1070.8932 - 1086.7906 [ 1]: ∎ 1086.7906 - 1102.6879 [ 1]: ∎ 1102.6879 - 1118.5853 [ 4]: 1118.5853 - 1134.4826 [ 0]: 1134.4826 - 1150.3800 [ 1]: ∎ 1150.3800 - 1166.2773 [ 2]: ∎∎ 1166.2773 - 1182.1747 [ 0]: 1182.1747 - 1198.0720 [ 2]: ∎∎ 1198.0720 - 1213.9694 [ 1]: ∎ 4.0.0-rc2/lease02.data # NumSamples = 12; Min = 841.43; Max = 911.82 # Mean = 888.716745; Variance = 317.221486; SD = 17.810713; Median 894.897002 # each ∎ represents a count of 1 841.4339 - 848.4727 [ 1]: ∎ 848.4727 - 855.5115 [ 0]: 855.5115 - 862.5503 [ 0]: 862.5503 - 869.5891 [ 0]: 869.5891 - 876.6278 [ 2]: ∎∎ 876.6278 - 883. [ 1]: ∎ 883. 
- 890.7054 [ 1]: ∎ 890.7054 - 897.7442 [ 3]: ∎∎∎ 897.7442 - 904.7830 [ 2]: ∎∎ 904.7830 - 911.8218 [ 2]: ∎∎ patched/lease02.data # NumSamples = 26; Min = 845.36; Max = 917.22 # Mean = 886.178134; Variance = 320.861100; SD = 17.912596; Median 889.109363 # each ∎ represents a count of 1 845.3620 - 852.5481 [ 2]: ∎∎ 852.5481 - 859.7343 [ 1]: ∎ 859.7343 - 866.9204 [ 1]: ∎ 866.9204 - 874.1065 [ 2]: ∎∎ 874.1065 - 881.2926 [ 3]: ∎∎∎ 881.2926 - 888.4788 [ 2]: ∎∎ 888.4788 - 895.6649 [ 6]: ∎∎ 895.6649 - 902.8510 [ 4]: 902.8510 - 910.0372 [ 2]: ∎∎ 910.0372 - 917.2233 [ 3]: ∎∎∎ 4.0.0-rc2/posix01.data # NumSamples = 5; Min = 46659.56; Max = 48332.45 # Mean = 47237.374603; Variance = 337801.6
Re: [PATCH] checkpatch: Add spell checking of email subject line
On Thu, 05 Mar 2015, Joe Perches wrote:
> Only commit log and patch additions are checked for
> typos and spelling errors currently. Add a check
> of the email subject line too.
>
> Suggested-by: Jani Nikula
> Signed-off-by: Joe Perches

Thanks Joe. FWIW,

Tested-by: Jani Nikula

> ---
>  scripts/checkpatch.pl | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 421bbb4..c061a63 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -2303,7 +2303,8 @@ sub process {
>  		}
>
>  # Check for various typo / spelling mistakes
> -		if (defined($misspellings) && ($in_commit_log || $line =~ /^\+/)) {
> +		if (defined($misspellings) &&
> +		    ($in_commit_log || $line =~ /^(?:\+|Subject:)/i)) {
>  			while ($rawline =~ /(?:^|[^a-z@])($misspellings)(?:$|[^a-z@])/gi) {
>  				my $typo = $1;
>  				my $typo_fix = $spelling_fix{lc($typo)};
>
>

--
Jani Nikula, Intel Open Source Technology Center
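For anyone not fluent in Perl regexes: the widened condition selects a line for spell checking if we are in the commit log, if the line is a patch addition, or if it starts with "Subject:" in any letter case. checkpatch itself is Perl; the C below is only an illustrative rendering of that predicate, and line_gets_spellchecked() is a made-up name.

```c
#include <stdbool.h>
#include <strings.h>	/* strncasecmp() (POSIX) */

/*
 * Rough C equivalent of
 *   $in_commit_log || $line =~ /^(?:\+|Subject:)/i
 */
static bool line_gets_spellchecked(const char *line, bool in_commit_log)
{
	if (in_commit_log)
		return true;
	if (line[0] == '+')
		return true;
	/* "Subject:" is 8 characters; the /i flag makes it case-insensitive. */
	return strncasecmp(line, "Subject:", 8) == 0;
}
```

Context and removal lines outside the commit log are still skipped, exactly as before the patch.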
Re: [RFC 0/6] the big khugepaged redesign
On 03/06/2015 01:21 AM, Andres Freund wrote: > Long mail ahead, sorry for that. No problem, thanks a lot! > TL;DR: THP is still noticeable, but not nearly as bad. > > On 2015-03-05 17:30:16 +0100, Vlastimil Babka wrote: >> That however means the workload is based on hugetlbfs and shouldn't trigger >> THP >> page fault activity, which is the aim of this patchset. Some more googling >> made >> me recall that last LSF/MM, postgresql people mentioned THP issues and >> pointed >> at compaction. See http://lwn.net/Articles/591723/ That's exactly where this >> patchset should help, but I obviously won't be able to measure this before >> LSF/MM... > > Just as a reference, this is how some the more extreme profiles looked > like in the past: > >> 96.50%postmaster [kernel.kallsyms] [k] _spin_lock_irq >> | >> --- _spin_lock_irq >> | >> |--99.87%-- compact_zone >> | compact_zone_order >> | try_to_compact_pages >> | __alloc_pages_nodemask >> | alloc_pages_vma >> | do_huge_pmd_anonymous_page >> | handle_mm_fault >> | __do_page_fault >> | do_page_fault >> | page_fault >> | 0x631d98 >> --0.13%-- [...] > > That specific profile is from a rather old kernel as you probably > recognize. Yeah, sounds like synchronous compaction before it was forbidden for THP page faults... >> I'm CCing the psql guys from last year LSF/MM - do you have any insight about >> psql performance with THPs enabled/disabled on recent kernels, where e.g. >> compaction is no longer synchronous for THP page faults? > > So, I've managed to get a machine upgraded to 3.19. 4 x E5-4620, 256GB > RAM. > > First of: It's noticeably harder to trigger problems than it used to > be. But, I can still trigger various problems that are much worse with > THP enabled than without. > > There seem to be various different bottlenecks; I can get somewhat > different profiles. 
> > In a somewhat artificial workload, that tries to simulate what I've seen > trigger the problem at a customer, I can quite easily trigger large > differences between THP=enable and THP=never. There's two types of > tasks running, one purely OLTP, another doing somewhat more complex > statements that require a fair amount of process local memory. > > (ignore the absolute numbers for progress, I just waited for somewhat > stable results while doing other stuff) > > THP off: > Task 1 solo: > progress: 200.0 s, 391442.0 tps, 0.654 ms lat > progress: 201.0 s, 394816.1 tps, 0.683 ms lat > progress: 202.0 s, 409722.5 tps, 0.625 ms lat > progress: 203.0 s, 384794.9 tps, 0.665 ms lat > > combined: > Task 1: > progress: 144.0 s, 25430.4 tps, 10.067 ms lat > progress: 145.0 s, 22260.3 tps, 11.500 ms lat > progress: 146.0 s, 24089.9 tps, 10.627 ms lat > progress: 147.0 s, 25888.8 tps, 9.888 ms lat > > Task 2: > progress: 24.4 s, 30.0 tps, 2134.043 ms lat > progress: 26.5 s, 29.8 tps, 2150.487 ms lat > progress: 28.4 s, 29.7 tps, 2151.557 ms lat > progress: 30.4 s, 28.5 tps, 2245.304 ms lat > > flat profile: > 6.07% postgres postgres[.] heap_form_minimal_tuple > 4.36% postgres postgres[.] heap_fill_tuple > 4.22% postgres postgres[.] ExecStoreMinimalTuple > 4.11% postgres postgres[.] AllocSetAlloc > 3.97% postgres postgres[.] advance_aggregates > 3.94% postgres postgres[.] advance_transition_function > 3.94% postgres postgres[.] ExecMakeTableFunctionResult > 3.33% postgres postgres[.] heap_compute_data_size > 3.30% postgres postgres[.] MemoryContextReset > 3.28% postgres postgres[.] ExecScan > 3.04% postgres postgres[.] ExecProject > 2.96% postgres postgres[.] generate_series_step_int4 > 2.94% postgres [kernel.kallsyms] [k] clear_page_c > > (i.e. 
most of it postgres, cache miss bound) > > THP on: > Task 1 solo: > progress: 140.0 s, 390458.1 tps, 0.656 ms lat > progress: 141.0 s, 391174.2 tps, 0.654 ms lat > progress: 142.0 s, 394828.8 tps, 0.648 ms lat > progress: 143.0 s, 398156.2 tps, 0.643 ms lat > > Task 1: > progress: 179.0 s, 23963.1 tps, 10.683 ms lat > progress: 180.0 s, 22712.9 tps, 11.271 ms lat > progress: 181.0 s, 21211.4 tps, 12.069 ms lat > progress: 182.0 s, 23207.8 tps, 11.031 ms lat > > Task 2: > progress: 28.2 s, 19.1 tps, 3349.747 ms lat > progress: 31.0 s, 19.8 tps, 3230.589 ms lat > progress: 34.3 s, 21.5 tps, 2979.113 ms lat > progress: 37.4 s, 20.9 tps, 3055.143 ms lat So that's 1/3 worse tps for task 2? Not very nice... > flat
Re: [PATCH v2] ASoC: Add support for NAU8824 codec to ASoC
On 2015/3/5 06:32 AM, Paul Bolle wrote:
> Chih-Chiang Chang wrote on Wed 04-03-2015 at 20:53 [+0800]:
>> From fe37688e226f83ba477a3c2fbc1e64946cd4ec4e Mon Sep 17 00:00:00 2001
>> From: Chih-Chiang Chang
>> Date: Wed, 4 Mar 2015 20:03:21 +0800
>> Subject: [PATCH v2] ASoC: Add support for NAU8824 codec to ASoC
>
> It seems that none of those lines were needed.

Sorry for the wrong patch format, will remove these lines in next submit.

>
>> --- /dev/null
>> +++ b/include/sound/nau8824.h
>> @@ -0,0 +1,22 @@
>> +/*
>> + * linux/sound/nau8824.h -- Platform data for NAU8824
>> + *
>> + * Copyright 2015 Nuvoton Technology Corp.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#ifndef __LINUX_SND_NAU8824_H
>> +#define __LINUX_SND_NAU8824_H
>> +
>> +struct nau8824_platform_data {
>> +	unsigned int audio_mclk1;
>> +	unsigned int gpio_irq;
>> +	int naudint_irq;
>> +	int headset_detect;
>> +	int button_press_detect;
>> +};
>> +
>> +#endif
>
> In the future something other than just sound/soc/codecs/nau8824.h is
> going to include this header, right?
>
>> --- /dev/null
>> +++ b/sound/soc/codecs/nau8824.c
>> @@ -0,0 +1,807 @@
>> +/*
>> + * linux/sound/soc/codecs/nau8824.c
>> + *
>> + * Copyright 2015 Nuvoton Technology Corp.
>> + * Author: Meng-Huang Kuo
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>
> This states the license is GPL v2. (So do the two headers this patch
> adds.)
>
>> +MODULE_LICENSE("GPL");
>
> So that should probably be
>     MODULE_LICENSE("GPL v2");

We will modify the code to be GPL v2.

>
>
> Paul Bolle
>
[PATCH perf/core v2 2/5] perf-probe: Fix --line to handle aliased symbols in glibc
Fix perf probe --line to handle aliased symbols correctly in glibc.
This makes the line_range search fall back to the address-based
alternative search, the same as --add and --vars.

Without this patch;
  -
  # ./perf probe -x /usr/lib64/libc-2.17.so -L malloc
  Specified source line is not found.
    Error: Failed to show lines.
  -

With this patch;
  -
  # ./perf probe -x /usr/lib64/libc-2.17.so -L malloc
  <__libc_malloc@/usr/src/debug/glibc-2.17-c758a686/malloc/malloc.c:0>
        0  __libc_malloc(size_t bytes)
        1  {
             mstate ar_ptr;
             void *victim;

             __malloc_ptr_t (*hook) (size_t, const __malloc_ptr_t)
        6      = force_reg (__malloc_hook);
        7    if (__builtin_expect (hook != NULL, 0))
        8      return (*hook)(bytes, RETURN_ADDRESS (0));

       10    arena_lookup(ar_ptr);

       12    arena_lock(ar_ptr, bytes);
  -

Note that this actually shows __libc_malloc, since it is the real
instance of malloc. Users can pass either __libc_malloc or malloc
to --line.

Signed-off-by: Masami Hiramatsu
---
 tools/perf/util/probe-event.c | 35 +--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index b8f4578..4cfd121 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -353,6 +353,31 @@ static int get_alternative_probe_event(struct debuginfo *dinfo,
 	return ret;
 }

+static int get_alternative_line_range(struct debuginfo *dinfo,
+				      struct line_range *lr,
+				      const char *target, bool user)
+{
+	struct perf_probe_point pp = { 0 }, result = { 0 };
+	int ret, len = 0;
+
+	pp.function = lr->function;
+	pp.file = lr->file;
+	pp.line = lr->start;
+	if (lr->end != INT_MAX)
+		len = lr->end - lr->start;
+	ret = find_alternative_probe_point(dinfo, &pp, &result,
+					   target, user);
+	if (!ret) {
+		lr->function = result.function;
+		lr->file = result.file;
+		lr->start = result.line;
+		if (lr->end != INT_MAX)
+			lr->end = lr->start + len;
+		clear_perf_probe_point(&pp);
+	}
+	return ret;
+}
+
 /* Open new debuginfo of given module */
 static struct debuginfo *open_debuginfo(const char *module, bool silent)
 {
@@ -734,7 +759,8 @@ static int _show_one_line(FILE *fp, int l, bool skip, bool show_num)
  * Show line-range always requires debuginfo to find source file and
  * line number.
  */
-static int __show_line_range(struct line_range *lr, const char *module)
+static int __show_line_range(struct line_range *lr, const char *module,
+			     bool user)
 {
 	int l = 1;
 	struct int_node *ln;
@@ -750,6 +776,11 @@ static int __show_line_range(struct line_range *lr, const char *module)
 		return -ENOENT;

 	ret = debuginfo__find_line_range(dinfo, lr);
+	if (!ret) {	/* Not found, retry with an alternative */
+		ret = get_alternative_line_range(dinfo, lr, module, user);
+		if (!ret)
+			ret = debuginfo__find_line_range(dinfo, lr);
+	}
 	debuginfo__delete(dinfo);
 	if (ret == 0 || ret == -ENOENT) {
 		pr_warning("Specified source line is not found.\n");
@@ -819,7 +850,7 @@ int show_line_range(struct line_range *lr, const char *module, bool user)
 	ret = init_symbol_maps(user);
 	if (ret < 0)
 		return ret;
-	ret = __show_line_range(lr, module);
+	ret = __show_line_range(lr, module, user);
 	exit_symbol_maps();

 	return ret;
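One detail in get_alternative_line_range() worth spelling out is how the range is moved: its length is remembered, the start is replaced by the line found through the alternative lookup, and the end is recomputed, except that an open-ended range (end == INT_MAX) stays open-ended. A standalone sketch of just that arithmetic follows. struct mock_line_range and mock_rebase_line_range() are hypothetical names, and new_start stands in for result.line.

```c
#include <limits.h>

/* Hypothetical reduction of perf's struct line_range to the two fields used. */
struct mock_line_range {
	int start;
	int end;	/* INT_MAX means "no explicit end given" */
};

/*
 * Move the range so that it starts at new_start, keeping its length.
 * An open-ended range (end == INT_MAX) stays open-ended.
 */
static void mock_rebase_line_range(struct mock_line_range *lr, int new_start)
{
	int len = 0;

	if (lr->end != INT_MAX)
		len = lr->end - lr->start;
	lr->start = new_start;
	if (lr->end != INT_MAX)
		lr->end = lr->start + len;
}
```

So a request for lines 2 to 8 of the alias, resolved to a function that starts at line 100, would become 100 to 106 in this model, while a bare function name keeps showing everything to the end.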
[PATCH perf/core v2 0/5] perf-probe: improve glibc support
Hi, Here is a series of patches which improves perf-probe to handle glibc's aliased symbols and weak symbols more correctly. This version includes 2 new patches from Namhyung (Thanks!) which solves a problem on weak symbols. I added a fix on his latter patch to modify find_alternative_probe_point, and dropped a bugfix which is already merged. So, this series is a merged series of below 2 series. http://lkml.kernel.org/g/20150302124939.9191.33564.stgit@localhost.localdomain http://lkml.kernel.org/g/1425477143-5310-1-git-send-email-namhy...@kernel.org == A major known issue of probing on glibc is that the some aliased symbols(e.g. malloc) and weak symbols (e.g. calloc) can not find by perf-probe. Actually, glibc's malloc symbol is just an alias of __libc_malloc. Its debuginfo knows only __libc_malloc, and perf's symbol map knows only malloc. This difference always confuses users that they can see malloc by perf report or annotate, but they can not probe on it, nor find definitions by --line option. And weak symbols have been dropped when loading. Previously, I've made a commit 906451b98b67 which solved this problem partly, but not completely fixed. So I decided to solve this issue completely by finding the symbols like malloc from perf's symbol map, and converting the symbol's address into debuginfo's location infomation. With this series, you can use --vars, --line and --add with the aliased symbols and weak symbols on glibc. 
Thank you,

---

Masami Hiramatsu (3):
      perf-probe: Fix to handle aliased symbols in glibc
      perf-probe: Fix --line to handle aliased symbols in glibc
      Revert "perf probe: Fix to fall back to find probe point in symbols"

Namhyung Kim (2):
      perf symbols: Allow symbol alias when loading map for symbol name
      perf probe: Allow weak symbols to be probed


 tools/perf/util/machine.c        |    2
 tools/perf/util/map.c            |    6 +
 tools/perf/util/map.h            |    8 +-
 tools/perf/util/probe-event.c    |  185 +-
 tools/perf/util/symbol-elf.c     |    5 +
 tools/perf/util/symbol-minimal.c |    2
 tools/perf/util/symbol.c         |    8 +-
 tools/perf/util/symbol.h         |    5 +
 8 files changed, 182 insertions(+), 39 deletions(-)

--
Masami HIRAMATSU
Software Platform Research Dpt. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
[PATCH perf/core v2 5/5] perf probe: Allow weak symbols to be probed
From: Namhyung Kim

perf probe currently refuses to add probes on weak symbols. But there
are cases where the given name exists only as a weak symbol, so the
probe cannot be added.

  $ perf probe -x /usr/lib/libc.so.6 -a calloc
  Failed to find symbol calloc in /usr/lib/libc-2.21.so
    Error: Failed to add events.

  $ nm /usr/lib/libc.so.6 | grep calloc
  0007b1f0 t __calloc
  0007b1f0 T __libc_calloc
  0007b1f0 W calloc

This change will result in duplicate probes when strong and weak symbols
co-exist in a binary. But I think it's not a big problem since probes
at the weak symbol will never be hit anyway.

Signed-off-by: Masami Hiramatsu
Signed-off-by: Namhyung Kim
---
 tools/perf/util/probe-event.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index c379ea0..f9c1e53 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -309,10 +309,8 @@ static int find_alternative_probe_point(struct debuginfo *dinfo,
 
 	/* Find the address of given function */
 	map__for_each_symbol_by_name(map, pp->function, sym) {
-		if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) {
-			address = sym->start;
-			break;
-		}
+		address = sym->start;
+		break;
 	}
 	if (!address) {
 		ret = -ENOENT;
@@ -2484,8 +2482,7 @@ static int find_probe_functions(struct map *map, char *name)
 	struct symbol *sym;
 
 	map__for_each_symbol_by_name(map, name, sym) {
-		if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL)
-			found++;
+		found++;
 	}
 
 	return found;
@@ -2845,8 +2842,7 @@ static struct strfilter *available_func_filter;
 static int filter_available_functions(struct map *map __maybe_unused,
 				      struct symbol *sym)
 {
-	if ((sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) &&
-	    strfilter__compare(available_func_filter, sym->name))
+	if (strfilter__compare(available_func_filter, sym->name))
 		return 0;
 	return 1;
 }
[PATCH perf/core v2 3/5] Revert "perf probe: Fix to fall back to find probe point in symbols"
This reverts commit 906451b98b67 ("perf probe: Fix to fall back to
find probe point in symbols").

Since perf-probe now retries with the address of the given symbol,
looked up in the map, before reaching this path, this fallback routine
is no longer needed.

Signed-off-by: Masami Hiramatsu
---
 tools/perf/util/probe-event.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 4cfd121..c379ea0 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -630,11 +630,9 @@ static int try_to_find_probe_trace_events(struct perf_probe_event *pev,
 	}
 
 	if (ntevs == 0)	{	/* No error but failed to find probe point. */
-		pr_warning("Probe point '%s' not found in debuginfo.\n",
+		pr_warning("Probe point '%s' not found.\n",
 			   synthesize_perf_probe_point(&pev->point));
-		if (need_dwarf)
-			return -ENOENT;
-		return 0;
+		return -ENOENT;
 	}
 	/* Error path : ntevs < 0 */
 	pr_debug("An error occurred in debuginfo analysis (%d).\n", ntevs);
[PATCH perf/core v2 4/5] perf symbols: Allow symbol alias when loading map for symbol name
From: Namhyung Kim

When perf probe tries to add a probe in a binary using a symbol name, it
sometimes fails because some symbols were discarded while loading the
dso.

When resolving an address to a symbol, it is better to have just one
symbol at a given address. But for finding an address from a symbol
name, it is better to keep all names (including aliases).

Add and propagate a new allow_alias argument to the dso (and map) load
functions so that they can keep those duplicate symbol aliases.

Acked-by: Masami Hiramatsu
Signed-off-by: Namhyung Kim
---
 tools/perf/util/machine.c        | 2 +-
 tools/perf/util/map.c            | 6 +++---
 tools/perf/util/map.h            | 8 +++++++-
 tools/perf/util/symbol-elf.c     | 5 +++--
 tools/perf/util/symbol-minimal.c | 2 +-
 tools/perf/util/symbol.c         | 8 +++++---
 tools/perf/util/symbol.h         | 5 +++--
 7 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 24f8c97..01ba9b6 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1128,7 +1128,7 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
 			 * preload dso of guest kernel and modules
 			 */
 			dso__load(kernel, machine->vmlinux_maps[MAP__FUNCTION],
-				  NULL);
+				  NULL, false);
 		}
 	}
 	return 0;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 62ca9f2..711e072 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -248,7 +248,7 @@ void map__fixup_end(struct map *map)
 
 #define DSO__DELETED "(deleted)"
 
-int map__load(struct map *map, symbol_filter_t filter)
+int __map__load(struct map *map, symbol_filter_t filter, bool allow_alias)
 {
 	const char *name = map->dso->long_name;
 	int nr;
@@ -256,7 +256,7 @@ int map__load(struct map *map, symbol_filter_t filter)
 	if (dso__loaded(map->dso, map->type))
 		return 0;
 
-	nr = dso__load(map->dso, map, filter);
+	nr = dso__load(map->dso, map, filter, allow_alias);
 	if (nr < 0) {
 		if (map->dso->has_build_id) {
 			char sbuild_id[BUILD_ID_SIZE * 2 + 1];
@@ -304,7 +304,7 @@ struct symbol *map__find_symbol(struct map *map, u64 addr,
 struct symbol *map__find_symbol_by_name(struct map *map, const char *name,
 					symbol_filter_t filter)
 {
-	if (map__load(map, filter) < 0)
+	if (__map__load(map, filter, true) < 0)
 		return NULL;
 
 	if (!dso__sorted_by_name(map->dso, map->type))
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 0e42438..ba15607 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -149,7 +149,13 @@ size_t map__fprintf_dsoname(struct map *map, FILE *fp);
 int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
 			 FILE *fp);
 
-int map__load(struct map *map, symbol_filter_t filter);
+int __map__load(struct map *map, symbol_filter_t filter, bool allow_alias);
+
+static inline int map__load(struct map *map, symbol_filter_t filter)
+{
+	return __map__load(map, filter, false);
+}
+
 struct symbol *map__find_symbol(struct map *map, u64 addr,
 				symbol_filter_t filter);
 struct symbol *map__find_symbol_by_name(struct map *map, const char *name,
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index ada1676..fb630f8 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -754,7 +754,7 @@ static bool want_demangle(bool is_kernel_sym)
 
 int dso__load_sym(struct dso *dso, struct map *map,
 		  struct symsrc *syms_ss, struct symsrc *runtime_ss,
-		  symbol_filter_t filter, int kmodule)
+		  symbol_filter_t filter, int kmodule, bool allow_alias)
 {
 	struct kmap *kmap = dso->kernel ? map__kmap(map) : NULL;
 	struct map *curr_map = map;
@@ -1048,7 +1048,8 @@ new_symbol:
 	 * For misannotated, zeroed, ASM function sizes.
 	 */
 	if (nr > 0) {
-		symbols__fixup_duplicate(&dso->symbols[map->type]);
+		if (!allow_alias)
+			symbols__fixup_duplicate(&dso->symbols[map->type]);
 		symbols__fixup_end(&dso->symbols[map->type]);
 		if (kmap) {
 			/*
diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c
index d7efb03..fefeeb3 100644
--- a/tools/perf/util/symbol-minimal.c
+++ b/tools/perf/util/symbol-minimal.c
@@ -334,7 +334,7 @@ int dso__load_sym(struct dso *dso, struct map *map __maybe_unused,
 		  struct symsrc *ss,
 		  struct symsrc *runtime_ss __maybe_unused,
 		  symbol_filter_t filter __maybe_unused,
-		  int kmodule __maybe_unused)
+
[PATCH perf/core v2 1/5] perf-probe: Fix to handle aliased symbols in glibc
Fix perf probe to handle aliased symbols correctly in glibc.
In glibc, several symbols are defined as aliases of __libc_XXX,
e.g. malloc is an alias of __libc_malloc.
In such cases, dwarf has no subroutine instances of the alias functions
(e.g. no "malloc" instance), but the map has that symbol and its
address.
Thus, if we search for the aliased symbol in debuginfo, we always fail
to find it, even though it is in the map.

To solve this problem, this falls back to an address-based alternative
search, which searches for the symbol in the map, translates its address
to the alternative (correct) function name by using debuginfo, and
retries to find the alternative function point from debuginfo.

This adds the fallback process to the --vars, --line and --add options.
So, now you can use those on malloc@libc :)

Without this patch;
-----
# ./perf probe -x /usr/lib64/libc-2.17.so -V malloc
Failed to find the address of malloc
  Error: Failed to show vars.
# ./perf probe -x /usr/lib64/libc-2.17.so -a "malloc bytes"
Probe point 'malloc' not found in debuginfo.
  Error: Failed to add events.
-----

With this patch;
-----
# ./perf probe -x /usr/lib64/libc-2.17.so -V malloc
Available variables at malloc
        @<__libc_malloc+0>
                size_t  bytes
# ./perf probe -x /usr/lib64/libc-2.17.so -a "malloc bytes"
Added new event:
  probe_libc:malloc    (on malloc in /usr/lib64/libc-2.17.so with bytes)

You can now use it in all perf tools, such as:

        perf record -e probe_libc:malloc -aR sleep 1
-----

Reported-by: Arnaldo Carvalho de Melo
Signed-off-by: Masami Hiramatsu
---
 tools/perf/util/probe-event.c | 140
 1 file changed, 124 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 1c570c2fa7..b8f4578 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -178,6 +178,25 @@ static struct map *kernel_get_module_map(const char *module)
 	return NULL;
 }
 
+static struct map *get_target_map(const char *target, bool user)
+{
+	/* Init maps of given executable or kernel */
+	if (user)
+		return dso__new_map(target);
+	else
+		return kernel_get_module_map(target);
+}
+
+static void put_target_map(struct map *map, bool user)
+{
+	if (map && user) {
+		/* Only the user map needs to be released */
+		dso__delete(map->dso);
+		map__delete(map);
+	}
+}
+
+
 static struct dso *kernel_get_module_dso(const char *module)
 {
 	struct dso *dso;
@@ -249,6 +268,13 @@ out:
 	return ret;
 }
 
+static void clear_perf_probe_point(struct perf_probe_point *pp)
+{
+	free(pp->file);
+	free(pp->function);
+	free(pp->lazy_line);
+}
+
 static void clear_probe_trace_events(struct probe_trace_event *tevs, int ntevs)
 {
 	int i;
@@ -258,6 +284,74 @@ static void clear_probe_trace_events(struct probe_trace_event *tevs, int ntevs)
 }
 
 #ifdef HAVE_DWARF_SUPPORT
+/*
+ * Some binaries like glibc have special symbols which are on the symbol
+ * table, but not in the debuginfo. If we can find the address of the
+ * symbol from map, we can translate the address back to the probe point.
+ */
+static int find_alternative_probe_point(struct debuginfo *dinfo,
+					 struct perf_probe_point *pp,
+					 struct perf_probe_point *result,
+					 const char *target, bool uprobes)
+{
+	struct map *map = NULL;
+	struct symbol *sym;
+	u64 address = 0;
+	int ret = -ENOENT;
+
+	/* This can work only for function-name based one */
+	if (!pp->function || pp->file)
+		return -ENOTSUP;
+
+	map = get_target_map(target, uprobes);
+	if (!map)
+		return -EINVAL;
+
+	/* Find the address of given function */
+	map__for_each_symbol_by_name(map, pp->function, sym) {
+		if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) {
+			address = sym->start;
+			break;
+		}
+	}
+	if (!address) {
+		ret = -ENOENT;
+		goto out;
+	}
+	pr_debug("Symbol %s address found : %lx\n", pp->function, address);
+
+	ret = debuginfo__find_probe_point(dinfo, (unsigned long)address,
+					  result);
+	if (ret <= 0)
+		ret = (!ret) ? -ENOENT : ret;
+	else {
+		result->offset += pp->offset;
+		result->line += pp->line;
+		ret = 0;
+	}
+
+out:
+	put_target_map(map, uprobes);
+	return ret;
+
+}
+
+static int get_alternative_probe_event(struct debuginfo *dinfo,
+				       struct perf_probe_event *pev,
+				       struct perf_probe_point *tmp,
+
Re: [PATCH v2 6/6] x86, asm: Rename INIT_TSS_IST to TSS_IST
* Andy Lutomirski wrote:

> This has nothing to do with the init thread or the initial anything.
> It's just the TSS.
>
> Signed-off-by: Andy Lutomirski
> ---
>  arch/x86/kernel/entry_64.S | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 0c00fd80249a..c86f83e95f15 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -959,7 +959,7 @@ apicinterrupt IRQ_WORK_VECTOR \
>  /*
>   * Exception entry points.
>   */
> -#define INIT_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
> +#define TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
>
>  .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
>  ENTRY(\sym)
> @@ -1015,13 +1015,13 @@ ENTRY(\sym)
>  	.endif
>
>  	.if \shift_ist != -1
> -	subq $EXCEPTION_STKSZ, INIT_TSS_IST(\shift_ist)
> +	subq $EXCEPTION_STKSZ, TSS_IST(\shift_ist)
>  	.endif
>
>  	call \do_sym
>
>  	.if \shift_ist != -1
> -	addq $EXCEPTION_STKSZ, INIT_TSS_IST(\shift_ist)
> +	addq $EXCEPTION_STKSZ, TSS_IST(\shift_ist)
>  	.endif
>
>  	/* these procedures expect "no swapgs" flag in ebx */

If you don't mind I've renamed this to 'CPU_TSS_IST', to be in line
with cpu_tss. The per-cpuness of this symbol gets lost at the usage sites,
because the PER_CPU_VAR() reference is hidden in a macro.

Thanks,

	Ingo
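For readers who want to check the offset arithmetic inside the macro, here is a small userspace sketch. It assumes the x86-64 hardware TSS layout (a packed structure whose ist[] array starts at byte 36); struct hw_tss and tss_ist_offset() are illustrative names, not kernel symbols, and the per-cpu base that PER_CPU_VAR(cpu_tss) adds is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace copy of the x86-64 hardware TSS layout (assumption: this
 * matches the structure the TSS_ist asm offset is generated from).
 */
struct hw_tss {
	uint32_t reserved1;
	uint64_t sp0;
	uint64_t sp1;
	uint64_t sp2;
	uint64_t reserved2;
	uint64_t ist[7];
	uint32_t reserved3;
	uint32_t reserved4;
	uint16_t reserved5;
	uint16_t io_bitmap_base;
} __attribute__((packed));

/* Same arithmetic as the macro: TSS_ist + ((x) - 1) * 8, with x 1-based. */
static size_t tss_ist_offset(int x)
{
	assert(x >= 1 && x <= 7);
	return offsetof(struct hw_tss, ist) + (size_t)(x - 1) * 8;
}
```

The "- 1" is there because IST indices are 1-based in the IDT (0 means "no IST stack"), while the array in the TSS is 0-based.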
Re: [RESEND PATCH] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
From: Vivek Goyal
Subject: Re: [RESEND PATCH] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
Date: Thu, 5 Mar 2015 17:22:04 -0500

> On Thu, Mar 05, 2015 at 05:19:30PM -0500, Vivek Goyal wrote:
>> On Wed, Mar 04, 2015 at 05:56:48PM +0900, HATAYAMA Daisuke wrote:
>> > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
>> > "crash_kexec_post_notifiers" kernel boot option, which toggles
>> > whether panic() calls crash_kexec() before or after panic_notifiers
>> > and dump kmsg.
>> >
>> > The problem is that the commit overlooks panic_on_oops kernel boot
>> > option. If it is enabled, crash_kexec() is called directly without
>> > going through panic() in oops path.
>> >
>> > To fix this issue, this patch adds a check to
>> > "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
>> >
>> > Signed-off-by: HATAYAMA Daisuke
>> > Acked-by: Baoquan He
>> > Tested-by: Hidehiro Kawai
>> > ---
>> >  include/linux/kernel.h | 3 +++
>> >  kernel/kexec.c         | 2 ++
>> >  kernel/panic.c         | 2 +-
>> >  3 files changed, 6 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
>> > index 64ce58b..f47379f 100644
>> > --- a/include/linux/kernel.h
>> > +++ b/include/linux/kernel.h
>> > @@ -426,6 +426,9 @@ extern int panic_on_unrecovered_nmi;
>> >  extern int panic_on_io_nmi;
>> >  extern int panic_on_warn;
>> >  extern int sysctl_panic_on_stackoverflow;
>> > +
>> > +extern bool crash_kexec_post_notifiers;
>> > +
>> >  /*
>> >   * Only to be used by arch init code. If the user over-wrote the default
>> >   * CONFIG_PANIC_TIMEOUT, honor it.
>> > diff --git a/kernel/kexec.c b/kernel/kexec.c
>> > index 9a8a01a..0ecf252 100644
>> > --- a/kernel/kexec.c
>> > +++ b/kernel/kexec.c
>> > @@ -84,6 +84,8 @@ struct resource crashk_low_res = {
>> >
>> >  int kexec_should_crash(struct task_struct *p)
>> >  {
>> > +	if (crash_kexec_post_notifiers)
>> > +		return 0;
>>
>> This is a little confusing. So if crash_kexec_post_notifiers is set but
>> panic_on_oops is not set, still we will return?
>>
>> Should we do this only if panic_on_oops is set? IOW, how about following
>>
>> 	if (panic_on_oops && crash_kexec_post_notifiers)
>> 		return 0;
>>
>> And then also put a comment explaining the rationale.
>
> Ok, I went through the previous version of patch and discussion there
> which says that all the 4 conditions lead to panic. So putting above
> code should be fine.
>
> Can you please at least put a comment here to explain it as it was not
> obvious. Just mention that all the checks below lead to panic hence
> if user wants to run panic notifiers then don't run crash_kexec() yet.
> It will be run after panic notifiers.
>

Thanks for your review. Yes, I'll add such a comment in the next
version of the patch.

--
Thanks.
HATAYAMA, Daisuke
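The rationale agreed on in this thread can be sketched as a small userspace model. It is only an illustration: struct oops_ctx and should_crash_now() are hypothetical names, and the four conditions are assumed to be "in interrupt", "idle task (pid 0)", "global init" and panic_on_oops, i.e. the checks that are said above to all end in panic().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state the real function inspects. */
struct oops_ctx {
	bool in_interrupt;
	int pid;
	bool is_global_init;
	bool panic_on_oops;
	bool crash_kexec_post_notifiers;
};

/* Returns 1 if the oops path should call crash_kexec() right away. */
static int should_crash_now(const struct oops_ctx *c)
{
	assert(c != NULL);

	/*
	 * Every check below ends in panic(). If the user asked for the
	 * panic notifiers to run first, do not kexec here; panic() will
	 * do it after the notifiers have run.
	 */
	if (c->crash_kexec_post_notifiers)
		return 0;

	if (c->in_interrupt || !c->pid || c->is_global_init || c->panic_on_oops)
		return 1;
	return 0;
}
```

With panic_on_oops set and crash_kexec_post_notifiers clear the model returns 1; setting crash_kexec_post_notifiers makes it return 0 for every input, which is why the unconditional early return is equivalent to the narrower check suggested in the review.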
Re: x32 + audit status?
On Thu, Mar 5, 2015 at 6:07 PM, Andy Lutomirski wrote:
> On Mar 5, 2015 10:32 AM, "David Drysdale" wrote:
>>
>> Hi,
>>
>> Do we currently expect the audit system to work with x32 syscalls?
>>
>> I was playing with the audit system for the first time today (on
>> v4.0-rc2, due to [1]), and it didn't seem to work for me. (Tweaking
>> ptrace.c like the patch below seemed to help, but I may just have
>> configured something wrong.)
>>
>> I know there was a bunch of activity around this area in mid-2014,
>> but I'm not sure what the final position was...
>
> It's totally broken, and it needs ABI work. I think it should keep
> the high syscall numbers, which means that both userspace and the
> audit core need to learn how to deal with it.

What Andy said. It's on the list of things to fix, but to be brutally
honest, it's not very high on the list due to lack of interest from
people asking for audit/x32 support.
Re: [PATCH] ASoC: Add support for NAU8824 codec to ASoC
On 2015/3/4 08:55 PM, Mark Brown wrote:
> On Wed, Mar 04, 2015 at 08:35:52PM +0800, Chih-Chiang Chang wrote:
>> On 2015/2/24 10:13 PM, Mark Brown wrote:
>
>>> I would have expected the headphone volume control to be a stereo
>>> (double) control - same for speakers.
>
>> The nau8824 related registers which control left/right volume are located
>> in different addresses and different shift bits. Since there is no available
>> preprocessor macro to meet our requirements, the driver consists of left/right
>> volume control separately.
>
> Add relevant control types if you need them, it's important to have
> proper stereo controls available to userspace.

We cannot find a suitable macro in include/sound/soc.h, so we want to add
the two macros below for our chip.

SOC_DOUBLE_L_R_VALUE
SOC_DOUBLE_L_R_TLV

>
> +struct nau8824_init_reg {
> +	u8 reg;
> +	u16 val;
> +};
>
>>> This looks like you're reimplementing regmap's register sequence
>>> stuff... It's also a *very* large sequence you have, are you sure it's
>>> all required? It seems like this may be doing a bunch of machine
>>> specific configuration but since it's all magic numbers it's hard to
>>> tell.
>
>> Initial settings are arranged in order
>
> This doesn't answer or address my concern.

This large number of register settings is used to initialize our codec,
and some other codecs behave the same way. We will remove a few
unnecessary register default settings and add some comments for the
registers.

>
> +	/* Dynamic Headset detection enabled */
> +	snd_soc_update_bits(codec, 0x01, 0x400, 0x400);
> +	snd_soc_update_bits(codec, 0x02, 0x0008, 0x0008);
> +	snd_soc_update_bits(codec, 0x0f, 0x0300, 0x0100);
> +	snd_soc_write(codec, 0x09, 0xE000);
> +	snd_soc_write(codec, NAU8824_IRQ_SETTING, 0x1006);
> +	snd_soc_write(codec, 0x13, 0x1615);
> +	snd_soc_write(codec, 0x15, 0x0414);
> +	snd_soc_update_bits(codec, 0x16, 0xFF00, 0x5900);
> +	snd_soc_update_bits(codec, 0x66, 0x0070, 0x0060);
>
>>> Too many magic numbers here I think and this looks like it should be
>>> configured using platform data and/or the machine driver (what if the
>>> headphone detection/IRQ aren't wired up?). I'd also expect to see
>>> reporting via the standard interfaces for jack reporting.
>
>> The above initial settings are for jack detection. As for other jack
>> detection flow, it will be implemented in machine driver but not be included in
>> this application.
>
> Please either remove this for now or implement it properly.

We will remove it.

>
>> ===
>> The privileged confidential information contained in this email is intended
>> for use only by the addressees as indicated by the original sender of this
>> email. If you are not the addressee indicated in this email or are not
>> responsible for delivery of the email to such a person, please kindly reply
>> to the sender indicating this fact and delete all copies of it from your
>> computer and network server immediately. Your cooperation is highly
>> appreciated. It is advised that any unauthorized use of confidential
>> information of Nuvoton is strictly prohibited; and any information in this
>> email irrelevant to the official business of Nuvoton shall be deemed as
>> neither given nor endorsed by Nuvoton.
>
> Don't include noise like this in upstream communication, if your company
> won't fix this then please use an external mail account for upstream
> communication.

Our MIS reports that they have disabled appending this message to
mail. I hope you do not see it in this mail.
Re: [RFC/PATCH 2/2] perf probe: Allow weak symbols to be probed
On Fri, Mar 6, 2015 at 4:05 PM, Masami Hiramatsu wrote:
> (2015/03/04 22:52), Namhyung Kim wrote:
>> It currently prevents adding probes in weak symbols. But there're cases
>> that given name is an only weak symbol so that we cannot add probe.
>>
>>   $ perf probe -x /usr/lib/libc.so.6 -a calloc
>>   Failed to find symbol calloc in /usr/lib/libc-2.21.so
>>     Error: Failed to add events.
>>
>>   $ nm /usr/lib/libc.so.6 | grep calloc
>>   0007b1f0 t __calloc
>>   0007b1f0 T __libc_calloc
>>   0007b1f0 W calloc
>>
>> This change will result in duplicate probes when strong and weak symbols
>> co-exist in a binary. But I think it's not a big problem since probes
>> at the weak symbol will never be hit anyway.
>>
>> Cc: Masami Hiramatsu
>> Signed-off-by: Namhyung Kim
>> ---
>>  tools/perf/util/probe-event.c | 6 ++----
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
>> index 1c570c2fa7cc..12b7d018106e 100644
>> --- a/tools/perf/util/probe-event.c
>> +++ b/tools/perf/util/probe-event.c
>> @@ -2339,8 +2339,7 @@ static int find_probe_functions(struct map *map, char *name)
>>  	struct symbol *sym;
>>
>>  	map__for_each_symbol_by_name(map, name, sym) {
>> -		if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL)
>> -			found++;
>> +		found++;
>>  	}
>
> Ah, I've found this is the magic...
> Here, we need another fix on my series.

Oops, right. I didn't base it on your patch, so I missed this function.

Thanks,
Namhyung

>
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 22392b06..f9c1e53 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -309,10 +309,8 @@ static int find_alternative_probe_point(struct debuginfo *d
>
>  	/* Find the address of given function */
>  	map__for_each_symbol_by_name(map, pp->function, sym) {
> -		if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) {
> -			address = sym->start;
> -			break;
> -		}
> +		address = sym->start;
> +		break;
>  	}
>  	if (!address) {
>  		ret = -ENOENT;
> ---
>
> With this fix, I could get variables on waitpid and calloc.
> -----
> # ./perf probe -x /lib64/libc-2.17.so -V waitpid
> Available variables at waitpid
>         @<__libc_waitpid+0>
>                 __pid_t pid
>                 int     oldtype
>                 int     options
>                 int*    stat_loc
> -----
>
> I'll update and include it in my series.
>
> Thank you!
>
>>
>>  	return found;
>> @@ -2708,8 +2707,7 @@ static struct strfilter *available_func_filter;
>>  static int filter_available_functions(struct map *map __maybe_unused,
>>  				      struct symbol *sym)
>>  {
>> -	if ((sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) &&
>> -	    strfilter__compare(available_func_filter, sym->name))
>> +	if (strfilter__compare(available_func_filter, sym->name))
>>  		return 0;
>>  	return 1;
>>  }
>>
>
>
> --
> Masami HIRAMATSU
> Software Platform Research Dept. Linux Technology Research Center
> Hitachi, Ltd., Yokohama Research Laboratory
> E-mail: masami.hiramatsu...@hitachi.com
>
>

--
Thanks,
Namhyung
Re: [RFC 00/16] Introduce ZONE_CMA
On Thu, Mar 05, 2015 at 06:48:50PM +0100, Vlastimil Babka wrote:
> On 03/05/2015 05:53 PM, Vlastimil Babka wrote:
> > On 02/12/2015 08:32 AM, Joonsoo Kim wrote:
> >>
> >> 1) Break non-overlapped zone assumption
> >> CMA regions could be spread to all memory range, so, to keep all of them
> >> into one zone, span of ZONE_CMA would be overlap to other zones'.
> >
> > From patch 13/16 it seems to me that indeed the ZONE_CMA spans the area of all
> > other zones. This seems very inefficient for e.g. compaction scanners, which
> > will repeatedly skip huge amounts of pageblocks that don't belong to ZONE_CMA.
> > Could you instead pick only a single zone on a node from which you steal the
> > pages? That would allow to keep the span low.

Hello, Vlastimil.

CMA is used for DMA now, and it sometimes has a memory range constraint,
so we cannot always keep the zone span that low. But the current
implementation unnecessarily sets ZONE_CMA's span from the start_pfn of
the node to the end_pfn of the node. I will change it to the range where
we actually steal pages. Most use cases of CMA would probably use a
small, limited range of memory, so this should not impose a critical
performance problem on users of the zone's pfn iterator such as the
compaction scanners.

> >
> > Another disadvantage I see is that to allocate from ZONE_CMA you will have now
> > to reclaim enough pages within the zone itself. I think think the cma allocation
>
> I don't think...
>
> > supports migrating pages from ZONE_CMA to the adjacent non-CMA zone, which would
> > be equivalent to migration from MIGRATE_CMA pageblocks to the rest of the zone?

I'm not sure I understand your question correctly. cma allocation uses
alloc_migrate_target() to get a free page as the migration target, and
it doesn't impose any zone constraint, so migrating pages from ZONE_CMA
to the adjacent non-CMA zone is possible.

Am I understanding your question correctly? If I misunderstand, please
let me know.

Thanks.
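The cost being discussed here (a pfn walker skipping pages that fall inside a zone's span but belong to another zone) can be illustrated with a tiny userspace model. It is only a sketch under simplifying assumptions: struct zone_span, the owner[] array and zone_count_pages() are hypothetical stand-ins for struct zone, page_zone() and a pfn iterator.

```c
#include <assert.h>
#include <stddef.h>

enum zone_id { ZONE_NORMAL_ID, ZONE_CMA_ID };

/* Hypothetical zone descriptor: a pfn span [start_pfn, end_pfn). */
struct zone_span {
	enum zone_id id;
	size_t start_pfn;
	size_t end_pfn;
};

/*
 * Count the pages of @z by walking its span. Because spans may overlap,
 * each pfn must be checked against its owner (owner[pfn]) and skipped
 * when it belongs to another zone.
 */
static size_t zone_count_pages(const struct zone_span *z,
			       const enum zone_id *owner)
{
	size_t pfn, nr = 0;

	assert(z->start_pfn <= z->end_pfn);
	for (pfn = z->start_pfn; pfn < z->end_pfn; pfn++) {
		if (owner[pfn] != z->id)
			continue;	/* overlapped by another zone */
		nr++;
	}
	return nr;
}
```

The wider the span relative to the pages actually owned, the more iterations are wasted on the skip branch, which is why shrinking ZONE_CMA's span to the range where pages are actually stolen helps.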
> >> I'm not sure that there is an assumption about possibility of zone overlap
> >> But, if ZONE_CMA is introduced, this assumption becomes reality
> >> so we should deal with this situation. I investigated most of sites
> >> that iterates pfn on certain zone and found that they normally don't
> >> consider zone overlap. I tried to handle these cases by myself in the
> >> early of this series. I hope that there is no more site that depends on
> >> non-overlap zone assumption when iterating pfn on certain zone.
> >>
> >> I passed boot test on x86, ARM32 and ARM64. I did some stress tests
> >> on x86 and there is no problem. Feel free to enjoy and please give me
> >> a feedback. :)
> >>
> >> This patchset is based on v3.18.
> >>
> >> Thanks.
> >>
> >> [1] https://lkml.org/lkml/2014/5/28/64
> >> [2] https://lkml.org/lkml/2014/11/4/55
> >> [3] https://lkml.org/lkml/2014/10/15/623
> >> [4] https://lkml.org/lkml/2014/5/30/320
> >>
> >>
> >> Joonsoo Kim (16):
> >>   mm/page_alloc: correct highmem memory statistics
> >>   mm/writeback: correct dirty page calculation for highmem
> >>   mm/highmem: make nr_free_highpages() handles all highmem zones by
> >>     itself
> >>   mm/vmstat: make node_page_state() handles all zones by itself
> >>   mm/vmstat: watch out zone range overlap
> >>   mm/page_alloc: watch out zone range overlap
> >>   mm/page_isolation: watch out zone range overlap
> >>   power: watch out zone range overlap
> >>   mm/cma: introduce cma_total_pages() for future use
> >>   mm/highmem: remove is_highmem_idx()
> >>   mm/page_alloc: clean-up free_area_init_core()
> >>   mm/cma: introduce new zone, ZONE_CMA
> >>   mm/cma: populate ZONE_CMA and use this zone when GFP_HIGHUSERMOVABLE
> >>   mm/cma: print stealed page count
> >>   mm/cma: remove ALLOC_CMA
> >>   mm/cma: remove MIGRATE_CMA
> >>
> >>  arch/x86/include/asm/sparsemem.h  |    2 +-
> >>  arch/x86/mm/highmem_32.c          |    3 +
> >>  include/linux/cma.h               |    9 ++
> >>  include/linux/gfp.h               |   31 +++---
> >>  include/linux/mempolicy.h         |    2 +-
> >>  include/linux/mm.h                |    1 +
> >>  include/linux/mmzone.h            |   58 +-
> >>  include/linux/page-flags-layout.h |    2 +
> >>  include/linux/vm_event_item.h     |    8 +-
> >>  include/linux/vmstat.h            |   26 +
> >>  kernel/power/snapshot.c           |   15 +++
> >>  lib/show_mem.c                    |    2 +-
> >>  mm/cma.c                          |   70 ++--
> >>  mm/compaction.c                   |    6 +-
> >>  mm/highmem.c                      |   12 +-
> >>  mm/hugetlb.c                      |    2 +-
> >>  mm/internal.h                     |    3 +-
> >>  mm/memory_hotplug.c               |    3 +
> >>  mm/mempolicy.c                    |    3 +-
> >>  mm/page-writeback.c               |    8 +-
> >>  mm/page_alloc.c                   |  223
> >> +
> >>  mm/page_isolation.c               |   14
[PATCH] crypto: RNGs must return 0 in success case
Change the RNGs to always return 0 in success case.

This patch ensures that seqiv.c works with RNGs other than krng. seqiv
expects that any return code other than 0 is an error. Without the
patch, rfc4106(gcm(aes)) will not work when using a DRBG or an ANSI
X9.31 RNG.

Signed-off-by: Stephan Mueller
---
 crypto/ansi_cprng.c  | 6 +-
 crypto/drbg.c        | 7 ++-
 include/crypto/rng.h | 3 +--
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/crypto/ansi_cprng.c b/crypto/ansi_cprng.c
index 6f5bebc..765fe76 100644
--- a/crypto/ansi_cprng.c
+++ b/crypto/ansi_cprng.c
@@ -210,7 +210,11 @@ static int get_prng_bytes(char *buf, size_t nbytes, struct prng_context *ctx,
 		byte_count = DEFAULT_BLK_SZ;
 	}
 
-	err = byte_count;
+	/*
+	 * Return 0 in case of success as mandated by the kernel
+	 * crypto API interface definition.
+	 */
+	err = 0;
 
 	dbgprint(KERN_CRIT "getting %d random bytes for context %p\n",
 		byte_count, ctx);
diff --git a/crypto/drbg.c b/crypto/drbg.c
index 56c1d7e..b69409c 100644
--- a/crypto/drbg.c
+++ b/crypto/drbg.c
@@ -1280,7 +1280,7 @@ static void drbg_restore_shadow(struct drbg_state *drbg,
  * as defined in SP800-90A. The additional input is mixed into
  * the state in addition to the pulled entropy.
  *
- * return: generated number of bytes
+ * return: 0 when all bytes are generated; < 0 in case of an error
  */
 static int drbg_generate(struct drbg_state *drbg,
 			 unsigned char *buf, unsigned int buflen,
@@ -1419,6 +1419,11 @@ static int drbg_generate(struct drbg_state *drbg,
 	}
 #endif
 
+	/*
+	 * All operations were successful, return 0 as mandated by
+	 * the kernel crypto API interface.
+	 */
+	len = 0;
 err:
 	shadow->d_ops->crypto_fini(shadow);
 	drbg_restore_shadow(drbg, &shadow);
diff --git a/include/crypto/rng.h b/include/crypto/rng.h
index a16fb10..6e28ea5 100644
--- a/include/crypto/rng.h
+++ b/include/crypto/rng.h
@@ -103,8 +103,7 @@ static inline void crypto_free_rng(struct crypto_rng *tfm)
  * This function fills the caller-allocated buffer with random numbers using the
  * random number generator referenced by the cipher handle.
  *
- * Return: > 0 function was successful and returns the number of generated
- *	   bytes; < 0 if an error occurred
+ * Return: 0 function was successful; < 0 if an error occurred
  */
 static inline int crypto_rng_get_bytes(struct crypto_rng *tfm,
 				       u8 *rdata, unsigned int dlen)
-- 
2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
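The return-code mismatch described in this patch can be illustrated with a small user-space sketch. Everything below is hypothetical (the demo_* names are not kernel functions); it only assumes, as the commit message states, that the caller treats any nonzero return value as an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical RNG backend using the old convention: returns the byte count. */
static int demo_rng_old(unsigned char *buf, size_t len)
{
	memset(buf, 0xAA, len);	/* placeholder "random" data */
	return (int)len;
}

/* Hypothetical RNG backend using the new convention: 0 on success, < 0 on error. */
static int demo_rng_new(unsigned char *buf, size_t len)
{
	memset(buf, 0xAA, len);	/* placeholder "random" data */
	return 0;
}

/*
 * Hypothetical caller in the style the commit message attributes to seqiv:
 * any return code other than 0 is handled as a failure.
 */
static int demo_fill_iv(int (*rng)(unsigned char *, size_t),
			unsigned char *iv, size_t ivlen)
{
	int err = rng(iv, ivlen);

	if (err)
		return -1;
	return 0;
}
```

With the old convention the caller reports a failure even though the buffer was filled; with the new convention the same caller succeeds.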
[PATCH 2/2] f2fs: reduce searching region of segmap when set free section
In __set_free(), when one segment is freed we check whether all segments
in its section are free, in order to set the section to free status. But
the search of the segmap currently runs from the start segno to the last
segno of the main area, which is not necessary. So let's only check the
segment bitmap of the target section.

Signed-off-by: Wanpeng Li
---
 fs/f2fs/segment.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 7fd3511..85d7fa7 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -336,7 +336,8 @@ static inline void __set_free(struct f2fs_sb_info *sbi, unsigned int segno)
 	clear_bit(segno, free_i->free_segmap);
 	free_i->free_segments++;
 
-	next = find_next_bit(free_i->free_segmap, MAIN_SEGS(sbi), start_segno);
+	next = find_next_bit(free_i->free_segmap,
+			start_segno + sbi->segs_per_sec, start_segno);
 	if (next >= start_segno + sbi->segs_per_sec) {
 		clear_bit(secno, free_i->free_secmap);
 		free_i->free_sections++;
-- 
1.9.1
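The bounded search in this patch can be sketched in user space as follows. This is a simplified stand-in, not the kernel's find_next_bit(); the demo_* names are hypothetical, and the sketch assumes, as in the patch, that a set bit marks a segment that is still in use.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Simplified stand-in for find_next_bit(): return the index of the first
 * set bit in [offset, size), or size if there is none.
 */
static size_t demo_find_next_bit(const unsigned long *map, size_t size,
				 size_t offset)
{
	size_t i;

	for (i = offset; i < size; i++)
		if (map[i / DEMO_BITS_PER_LONG] &
		    (1UL << (i % DEMO_BITS_PER_LONG)))
			return i;
	return size;
}

/*
 * A section is free when no segment bit is set inside it. The search stops
 * at the end of the section instead of scanning the whole main area.
 */
static int demo_section_is_free(const unsigned long *segmap,
				size_t start_segno, size_t segs_per_sec)
{
	size_t end = start_segno + segs_per_sec;

	return demo_find_next_bit(segmap, end, start_segno) >= end;
}
```

Limiting the size argument to the end of the section keeps the result of the check the same, because the caller only compares the returned index against that same boundary.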
[PATCH 1/2] f2fs: fix extent cache memory leak
The extent tree/node slab caches are created during f2fs insmod; however,
they are not destroyed during f2fs rmmod. This patch fixes that by
destroying the extent tree/node slab caches on f2fs rmmod.

Signed-off-by: Wanpeng Li
---
 fs/f2fs/super.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index e649f21..0b8a2d8 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1323,6 +1323,7 @@ static void __exit exit_f2fs_fs(void)
 	remove_proc_entry("fs/f2fs", NULL);
 	f2fs_destroy_root_stats();
 	unregister_filesystem(&f2fs_fs_type);
+	destroy_extent_cache();
 	destroy_checkpoint_caches();
 	destroy_segment_manager_caches();
 	destroy_node_manager_caches();
-- 
1.9.1
Re: parent/child hierarchy for regulator
On Thu, Mar 05, 2015 at 12:22:34PM +, Mark Brown wrote:
> On Thu, Mar 05, 2015 at 06:35:36PM +0800, Peter Chen wrote:
>
> > Any good ways at code/dts to show parent/child hierarchy for regulator?
>
> There's plenty of examples in mainline...
>

Thanks, I got the answer for adding a parent regulator to a fixed
regulator. It is very easy: we only need to add
'vin-supply = <&parent_reg>;' to the fixed regulator's properties.

-- 
Best Regards,
Peter Chen
Re: [RFC 13/16] mm/cma: populate ZONE_CMA and use this zone when GFP_HIGHUSERMOVABLE
On Tue, Mar 03, 2015 at 01:58:46PM +0530, Aneesh Kumar K.V wrote: > Joonsoo Kim writes: > > > Until now, reserved pages for CMA are managed altogether with normal > > page in the same zone. This approach has numorous problems and fixing > > them isn't easy. To fix this situation, ZONE_CMA is introduced in > > previous patch, but, not yet populated. This patch implement population > > of ZONE_CMA by stealing reserved pages from normal zones. This stealing > > break one uncertain assumption on zone, that is, zone isn't overlapped. > > In the early of this series, some check is inserted to every zone's span > > iterator to handle zone overlap so there would be no problem with > > this assumption break. > > > > To utilize this zone, user should use GFP_HIGHUSERMOVABLE, because > > these pages are only applicable for movable type and ZONE_CMA could > > contain highmem. > > > > Implementation itself is very easy to understand. Do steal when cma > > area is initialized and recalculate values for per zone data structure. > > > > Signed-off-by: Joonsoo Kim > > --- > > include/linux/gfp.h | 10 -- > > include/linux/mm.h |1 + > > mm/cma.c| 23 --- > > mm/page_alloc.c | 42 +++--- > > 4 files changed, 64 insertions(+), 12 deletions(-) > > > > diff --git a/include/linux/gfp.h b/include/linux/gfp.h > > index 619eb20..d125440 100644 > > --- a/include/linux/gfp.h > > +++ b/include/linux/gfp.h > > @@ -186,6 +186,12 @@ static inline int gfpflags_to_migratetype(const gfp_t > > gfp_flags) > > #define OPT_ZONE_DMA32 ZONE_NORMAL > > #endif > > > > +#ifdef CONFIG_CMA > > +#define OPT_ZONE_CMA ZONE_CMA > > +#else > > +#define OPT_ZONE_CMA ZONE_MOVABLE > > +#endif > > + > > Does that mean with CONFIG_CMA we always try ZONE_CMA first and then > fallback to ZONE_MOVABLE ? If so won't we hit termporary CMA allocation > failures that can result with pinned movable pages ? Hello, Aneesh. IIUC, Johannes's fair allocation policy patchset makes us uses individual zones fairly. 
So, before freepage on ZONE_CMA is exhausted, ZONE_MOVABLE will be used. :)

Thanks.
Re: [RFC/PATCH 2/2] perf probe: Allow weak symbols to be probed
(2015/03/04 22:52), Namhyung Kim wrote: > It currently prevents adding probes in weak symbols. But there're cases > that given name is an only weak symbol so that we cannot add probe. > > $ perf probe -x /usr/lib/libc.so.6 -a calloc > Failed to find symbol calloc in /usr/lib/libc-2.21.so > Error: Failed to add events. > > $ nm /usr/lib/libc.so.6 | grep calloc > 0007b1f0 t __calloc > 0007b1f0 T __libc_calloc > 0007b1f0 W calloc > > This change will result in duplicate probes when strong and weak symbols > co-exist in a binary. But I think it's not a big problem since probes > at the weak symbol will never be hit anyway. > > Cc: Masami Hiramatsu > Signed-off-by: Namhyung Kim > --- > tools/perf/util/probe-event.c | 6 ++ > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c > index 1c570c2fa7cc..12b7d018106e 100644 > --- a/tools/perf/util/probe-event.c > +++ b/tools/perf/util/probe-event.c > @@ -2339,8 +2339,7 @@ static int find_probe_functions(struct map *map, char > *name) > struct symbol *sym; > > map__for_each_symbol_by_name(map, name, sym) { > - if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) > - found++; > + found++; > } Ah, I've found this is the magic... Here, we need another fix on my series. diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c index 22392b06..f9c1e53 100644 --- a/tools/perf/util/probe-event.c +++ b/tools/perf/util/probe-event.c @@ -309,10 +309,8 @@ static int find_alternative_probe_point(struct debuginfo *d /* Find the address of given function */ map__for_each_symbol_by_name(map, pp->function, sym) { - if (sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) { - address = sym->start; - break; - } + address = sym->start; + break; } if (!address) { ret = -ENOENT; --- With this fix, I could get variables on waitpid and calloc. 
- # ./perf probe -x /lib64/libc-2.17.so -V waitpid Available variables at waitpid @<__libc_waitpid+0> __pid_t pid int oldtype int options int*stat_loc - I'll update and include it my series. Thank you! > > return found; > @@ -2708,8 +2707,7 @@ static struct strfilter *available_func_filter; > static int filter_available_functions(struct map *map __maybe_unused, > struct symbol *sym) > { > - if ((sym->binding == STB_GLOBAL || sym->binding == STB_LOCAL) && > - strfilter__compare(available_func_filter, sym->name)) > + if (strfilter__compare(available_func_filter, sym->name)) > return 0; > return 1; > } > -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
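The effect of dropping the binding check can be sketched in plain C. The structure and function names below are hypothetical and are not the perf symbol API; the binding values follow the ELF specification (local 0, global 1, weak 2), and the symbol table mirrors the nm output quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ELF symbol binding values, as defined by the ELF specification. */
enum demo_binding {
	DEMO_STB_LOCAL = 0,
	DEMO_STB_GLOBAL = 1,
	DEMO_STB_WEAK = 2
};

/* Hypothetical, simplified symbol record. */
struct demo_symbol {
	const char *name;
	enum demo_binding binding;
	unsigned long start;
};

/* Mirrors the nm output quoted above: calloc exists only as a weak symbol. */
static const struct demo_symbol demo_libc_syms[] = {
	{ "__calloc",      DEMO_STB_LOCAL,  0x7b1f0 },
	{ "__libc_calloc", DEMO_STB_GLOBAL, 0x7b1f0 },
	{ "calloc",        DEMO_STB_WEAK,   0x7b1f0 },
};

/*
 * Return the start address of the first symbol matching name, or 0 if none.
 * With allow_weak == 0 this behaves like the old GLOBAL/LOCAL-only filter.
 */
static unsigned long demo_lookup(const struct demo_symbol *syms, size_t n,
				 const char *name, int allow_weak)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(syms[i].name, name) != 0)
			continue;
		if (!allow_weak && syms[i].binding == DEMO_STB_WEAK)
			continue;
		return syms[i].start;
	}
	return 0;
}
```

With the filter in place the lookup of calloc finds nothing, which corresponds to the "Failed to find the address" error reported in the thread; without it, the weak symbol supplies the address.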
Re: [PATCH v9 14/21] ACPI / processor: Make it possible to get CPU hardware ID via GICC
On 2015/3/5 23:19, Catalin Marinas wrote:
> On Thu, Mar 05, 2015 at 02:13:58PM +0100, Rafael J. Wysocki wrote:
>> On Thu, Mar 5, 2015 at 12:27 PM, Catalin Marinas wrote:
>>> On Thu, Mar 05, 2015 at 04:03:21PM +0800, Hanjun Guo wrote:
>>>> On 2015/3/5 6:46, Rafael J. Wysocki wrote:
>>>>> IMO, you really need to define phys_cpuid_t in a common place or people will
>>>>> forget that it may be 64-bit, because they'll only be looking at their arch.
>>>> Since x86 and ARM64 are using different types for phys_cpuid_t, we need
>>>> to introduce something like following if define it in common place:
>>>>
>>>> in linux/acpi.h,
>>>>
>>>> #if defined(CONFIG_X86) || defined(CONFIG_IA64)
>>>> typedef u32 phys_cpuid_t;
>>>> #define PHYS_CPUID_INVALID (phys_cpuid_t)(-1)
>>>> #else if defined(CONFIG_ARM64)
>>>> typedef u64 phys_cpuid_t;
>>>> #define PHYS_CPUID_INVALID INVALID_HWID
>>>> #endif
>>>>
>>>> I think it's awful, did I miss something?
>> Well, you can define the type and PHYS_CPUID_INVALID in the arch
>> code and then do this in a common header:
>>
>> #ifndef PHYS_CPUID_INVALID
>> typedef u32 phys_cpuid_t;
>> #define PHYS_CPUID_INVALID (phys_cpuid_t)(-1)
>> #endif
>>
>> That would allow you to avoid the need to duplicate the
>> definitions where it is not necessary.
> It's fine by me.

I will update the patch.

Thanks
Hanjun
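The fallback pattern suggested in the quoted reply can be tried out in user space. This is only a sketch: it uses stdint types in place of the kernel's u32/u64, the 64-bit override shown in the comment is a hypothetical example, and demo_cpuid_is_valid() is an illustrative helper that is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * An architecture header would be included first. A 64-bit architecture
 * could provide its own definitions there, for example (hypothetical):
 *
 *   typedef uint64_t phys_cpuid_t;
 *   #define PHYS_CPUID_INVALID ((phys_cpuid_t)UINT64_MAX)
 */

/* Common header: fall back to a 32-bit ID unless the arch defined its own. */
#ifndef PHYS_CPUID_INVALID
typedef uint32_t phys_cpuid_t;
#define PHYS_CPUID_INVALID ((phys_cpuid_t)(-1))
#endif

/* Illustrative helper: callers compare against the macro, not a literal. */
static int demo_cpuid_is_valid(phys_cpuid_t id)
{
	return id != PHYS_CPUID_INVALID;
}
```

Because the guard tests the macro rather than a CONFIG_* symbol, the common header does not need to list architectures, and only those needing a wider type have to define anything.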
RE: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
> -Original Message- > From: Marcelo Tosatti [mailto:mtosa...@redhat.com] > Sent: Wednesday, March 04, 2015 8:06 PM > To: Wu, Feng > Cc: t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org; > g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org; > j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com; > eric.au...@linaro.org; linux-kernel@vger.kernel.org; > io...@lists.linux-foundation.org; k...@vger.kernel.org > Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU > is blocked > > On Mon, Mar 02, 2015 at 01:36:51PM +, Wu, Feng wrote: > > > > > > > -Original Message- > > > From: Marcelo Tosatti [mailto:mtosa...@redhat.com] > > > Sent: Friday, February 27, 2015 7:41 AM > > > To: Wu, Feng > > > Cc: t...@linutronix.de; mi...@redhat.com; h...@zytor.com; > x...@kernel.org; > > > g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org; > > > j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com; > > > eric.au...@linaro.org; linux-kernel@vger.kernel.org; > > > io...@lists.linux-foundation.org; k...@vger.kernel.org > > > Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when > vCPU > > > is blocked > > > > > > On Fri, Dec 12, 2014 at 11:14:58PM +0800, Feng Wu wrote: > > > > This patch updates the Posted-Interrupts Descriptor when vCPU > > > > is blocked. 
> > > > > > > > pre-block: > > > > - Add the vCPU to the blocked per-CPU list > > > > - Clear 'SN' > > > > - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR > > > > > > > > post-block: > > > > - Remove the vCPU from the per-CPU list > > > > > > > > Signed-off-by: Feng Wu > > > > --- > > > > arch/x86/include/asm/kvm_host.h | 2 + > > > > arch/x86/kvm/vmx.c | 96 > > > + > > > > arch/x86/kvm/x86.c | 22 +++--- > > > > include/linux/kvm_host.h| 4 ++ > > > > virt/kvm/kvm_main.c | 6 +++ > > > > 5 files changed, 123 insertions(+), 7 deletions(-) > > > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h > > > b/arch/x86/include/asm/kvm_host.h > > > > index 13e3e40..32c110a 100644 > > > > --- a/arch/x86/include/asm/kvm_host.h > > > > +++ b/arch/x86/include/asm/kvm_host.h > > > > @@ -101,6 +101,8 @@ static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t > > > base_gfn, int level) > > > > > > > > #define ASYNC_PF_PER_VCPU 64 > > > > > > > > +extern void (*wakeup_handler_callback)(void); > > > > + > > > > enum kvm_reg { > > > > VCPU_REGS_RAX = 0, > > > > VCPU_REGS_RCX = 1, > > > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > > > > index bf2e6cd..a1c83a2 100644 > > > > --- a/arch/x86/kvm/vmx.c > > > > +++ b/arch/x86/kvm/vmx.c > > > > @@ -832,6 +832,13 @@ static DEFINE_PER_CPU(struct vmcs *, > > > current_vmcs); > > > > static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu); > > > > static DEFINE_PER_CPU(struct desc_ptr, host_gdt); > > > > > > > > +/* > > > > + * We maintian a per-CPU linked-list of vCPU, so in wakeup_handler() we > > > > + * can find which vCPU should be waken up. 
> > > > + */ > > > > +static DEFINE_PER_CPU(struct list_head, blocked_vcpu_on_cpu); > > > > +static DEFINE_PER_CPU(spinlock_t, blocked_vcpu_on_cpu_lock); > > > > + > > > > static unsigned long *vmx_io_bitmap_a; > > > > static unsigned long *vmx_io_bitmap_b; > > > > static unsigned long *vmx_msr_bitmap_legacy; > > > > @@ -1921,6 +1928,7 @@ static void vmx_vcpu_load(struct kvm_vcpu > *vcpu, > > > int cpu) > > > > struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu); > > > > struct pi_desc old, new; > > > > unsigned int dest; > > > > + unsigned long flags; > > > > > > > > memset(&old, 0, sizeof(old)); > > > > memset(&new, 0, sizeof(new)); > > > > @@ -1942,6 +1950,20 @@ static void vmx_vcpu_load(struct kvm_vcpu > > > *vcpu, int cpu) > > > > new.nv = POSTED_INTR_VECTOR; > > > > } while (cmpxchg(&pi_desc->control, old.control, > > > > new.control) != old.control); > > > > + > > > > + /* > > > > +* Delete the vCPU from the related wakeup queue > > > > +* if we are resuming from blocked state > > > > +*/ > > > > + if (vcpu->blocked) { > > > > + vcpu->blocked = false; > > > > + > > > > spin_lock_irqsave(&per_cpu(blocked_vcpu_on_cpu_lock, > > > > + vcpu->wakeup_cpu), flags); > > > > + list_del(&vcpu->blocked_vcpu_list); > > > > + > spin_unlock_irqrestore(&per_cpu(blocked_vcpu_on_cpu_lock, > > > > + vcpu->wakeup_cpu), flags); > > > > + vcpu->wakeup_cpu = -1; > > > > + } > > > > } > > > > } > > > > > > > > @@ -1950,6 +1972,9 @@ static void vmx_vcpu_put(struct kvm_vc
Re: [PATCH v9 02/21] ACPI / processor: Introduce phys_cpuid_t for CPU hardware ID
On 2015/3/5 21:23, Rafael J. Wysocki wrote:
> On Thu, Mar 5, 2015 at 8:44 AM, Hanjun Guo wrote:
>> On 2015/3/5 6:29, Rafael J. Wysocki wrote:
>>> On Wednesday, February 25, 2015 04:39:42 PM Hanjun Guo wrote:
>
> [cut]
>
>>>> @@ -190,7 +190,7 @@ int acpi_map_cpuid(int phys_id, u32 acpi_id)
>>>>  		if (nr_cpu_ids <= 1 && acpi_id == 0)
>>>>  			return acpi_id;
>>>>  		else
>>>> -			return phys_id;
>>>> +			return -1;
>>> Can we use a proper error code here?
>> I'm afraid not. In ACPI processor drivers, -1 will be deemed to
>> invalid cpu logical number, if we return error code here, we need
>> to modify multi places of "if (cpu_logical_num == -1)" to
>
> Oh, silly stuff.
>
>> "if (! (cpu_logical_num < 0))" too, so for me, I prefer to keep it as
>> -1, but I'm open for suggestions.
>
> OK
>
> I think we need something like invalid_logical_cpuid() and use it
> in all of those checks instead of the direct comparisons, but we
> can make those changes later.

OK, I recorded this as one of my TODO list, thanks for the suggestions.

Thanks
Hanjun
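A helper of the kind suggested above might look like the following sketch. The name and signature are hypothetical (the thread only says "something like invalid_logical_cpuid()"); the assumption is that every negative value, whether -1 or a proper error code, should count as an invalid logical CPU number.

```c
#include <assert.h>

/*
 * Hypothetical helper: treat any negative value as an invalid logical CPU
 * number, so callers no longer compare against -1 directly.
 */
static inline int demo_invalid_logical_cpuid(int cpu)
{
	return cpu < 0;
}
```

With such a helper in place, a later change from -1 to a real error code would not require touching each call site again.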
Re: [PATCH 0/17] crypto: talitos - Add support for SEC1
On 06/03/2015 01:21, Kim Phillips wrote:
> On Thu, 5 Mar 2015 17:46:05 +0100 Christophe Leroy wrote:
>
>> [15/17] crypto: talitos - Implementation of SEC1
> ...
>> [16/17] crypto: talitos - SEC1 bugs on 0 data hash
>> [17/17] crypto: talitos - Update DT bindings with SEC1
>
> This patchseries doesn't apply, at least on top of Herbert's
> cryptodev-2.6 tree, as of today:
>
> Applying: crypto: talitos - Implementation of SEC1
> error: patch failed: drivers/crypto/talitos.c:655
> error: drivers/crypto/talitos.c: patch does not apply

It was applying ok on linux-next as of yesterday. I will rebase the
series on cryptodev-2.6.

Christophe
Re: [PATCH v2 1/4] i2c: sunxi: Add Reduced Serial Bus (RSB) support
> From that regard, RSB is a multiple device bus, using addresses, just
> like I2C. The way it communicates is basically the one used by P2WI.

I am not keen to allow everything which "is a bus and has addresses"
into the I2C realm. The addresses are 12 bit, whilst I2C has at maximum
10 bit which is rarely used, so mostly 7 bit are used. It has a runtime
readdressing mechanism which is not present in standard I2C. And if you
look at the protocol with no acks but parities, IMO it doesn't look
closer to I2C than to other two wire protocols.

So, being in I2C needs more arguments. And while the outcome could be
that it really makes sense to add RSB to I2C with I2C_FUNCS_RSB added,
it could also be that there is a more suitable place for custom busses
in the kernel.

Also, the fact that P2WI is in I2C is not an argument IMO. It could have
been a mistake to pick it up.

> So really, it just is more I2C-alike than P2WI has ever been.

Because it has addresses? I disagree.

> Good thing that we are not talking about a full review then, but more
> a philosophical discussion.

Exactly. This is why I wanted to bring this in early.
Re: [RFC/PATCH 2/2] perf probe: Allow weak symbols to be probed
(2015/03/06 15:15), Namhyung Kim wrote: > Hi Masami, > > On Thu, Mar 05, 2015 at 12:57:21AM +0900, Masami Hiramatsu wrote: >> (2015/03/04 22:52), Namhyung Kim wrote: >>> It currently prevents adding probes in weak symbols. But there're cases >>> that given name is an only weak symbol so that we cannot add probe. >>> >>> $ perf probe -x /usr/lib/libc.so.6 -a calloc >>> Failed to find symbol calloc in /usr/lib/libc-2.21.so >>> Error: Failed to add events. >>> >>> $ nm /usr/lib/libc.so.6 | grep calloc >>> 0007b1f0 t __calloc >>> 0007b1f0 T __libc_calloc >>> 0007b1f0 W calloc >>> >>> This change will result in duplicate probes when strong and weak symbols >>> co-exist in a binary. But I think it's not a big problem since probes >>> at the weak symbol will never be hit anyway. >> >> Hmm, even on my previous series, I got an error with calloc and waitpid. >> >> $ ./perf probe -x /usr/lib64/libc-2.17.so -vvV calloc >> probe-definition(0): calloc >> symbol:calloc file:(null) line:0 offset:0 return:0 lazy:(null) >> 0 arguments >> Open Debuginfo file: /usr/lib/debug/usr/lib64/libc-2.17.so.debug >> Searching variables at calloc >> Failed to find the address of calloc >> Error: Failed to show vars. Reason: No such file or directory (Code: -2) >> >> However, it seems that calloc is loaded as a symbol. >> >> $ ./perf probe -x /usr/lib64/libc-2.17.so -V calloc >> ... >> symbol__new: __xstat64 0xe7340-0xe7385 >> symbol__new: calloc 0x80a90-0x80d2a >> symbol__new: msgget 0xf7940-0xf7961 >> ... >> >> FYI, without these patches, I see the same result (calloc is loaded) > > I'm bit confused with the English ;-). So you mean that now you *can* > probe calloc and waitpid with this patch, right? Ah, sorry for confusing you. I meant that I couldn't probe it even with your patch. I'm not sure why, since the calloc symbol is created as above message, but can't find in the map... Thank you, -- Masami HIRAMATSU Software Platform Research Dept. 
Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
Re: [cgroup] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives()
Hi Vladimir, On Fri, Mar 06, 2015 at 09:09:37AM +0300, Vladimir Davydov wrote: > Hi, > > This bug should have been fixed by "[PATCH -next] cpuset: initialize > cpuset a bit early": > > http://www.spinics.net/lists/cgroups/msg12599.html OK, sorry for the late report! I only searched for the full commit id for possible duplicates, should check the patch subject, too. Thanks, Fengguang > On Fri, Mar 06, 2015 at 01:57:58PM +0800, Fengguang Wu wrote: > > [0.021989] [ cut here ] > > [0.021989] [ cut here ] > > [0.022816] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 > > warn_pre_alternatives+0x25/0x2e() > > [0.022816] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 > > warn_pre_alternatives+0x25/0x2e() > > [0.024000] You're using static_cpu_has before alternatives have run! > > [0.024000] You're using static_cpu_has before alternatives have run! > > [0.024000] Modules linked in: > > [0.024000] Modules linked in: > > > > [0.024000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted > > 4.0.0-rc1-4-g295458e #455 > > [0.024000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted > > 4.0.0-rc1-4-g295458e #455 > > [0.024000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > 1.7.5-20140531_083030-gandalf 04/01/2014 > > [0.024000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > 1.7.5-20140531_083030-gandalf 04/01/2014 > > [0.024000] 0009 > > [0.024000] 0009 81e03cc8 81e03cc8 > > 81674d02 81674d02 810ca88e 810ca88e > > > > [0.024000] 81e03d18 > > [0.024000] 81e03d18 81e03d08 81e03d08 > > 81073d6f 81073d6f > > > > [0.024000] 81018f79 > > [0.024000] 81018f79 81e03e38 81e03e38 > > 0002 0002 > > > > [0.024000] Call Trace: > > [0.024000] Call Trace: > > [0.024000] [] dump_stack+0xa0/0xd5 > > [0.024000] [] dump_stack+0xa0/0xd5 > > [0.024000] [] ? console_unlock+0x496/0x4ef > > [0.024000] [] ? console_unlock+0x496/0x4ef > > [0.024000] [] warn_slowpath_common+0xc8/0xf7 > > [0.024000] [] warn_slowpath_common+0xc8/0xf7 > > [0.024000] [] ? 
warn_pre_alternatives+0x25/0x2e > > [0.024000] [] ? warn_pre_alternatives+0x25/0x2e > > [0.024000] [] warn_slowpath_fmt+0x4f/0x58 > > [0.024000] [] warn_slowpath_fmt+0x4f/0x58 > > [0.024000] [] ? native_iret+0x7/0x7 > > [0.024000] [] ? native_iret+0x7/0x7 > > [0.024000] [] warn_pre_alternatives+0x25/0x2e > > [0.024000] [] warn_pre_alternatives+0x25/0x2e > > [0.024000] [] __do_page_fault+0x2b4/0x7c2 > > [0.024000] [] __do_page_fault+0x2b4/0x7c2 > > [0.024000] [] do_page_fault+0x3e/0x4a > > [0.024000] [] do_page_fault+0x3e/0x4a > > [0.024000] [] do_async_page_fault+0x3a/0xb9 > > [0.024000] [] do_async_page_fault+0x3a/0xb9 > > [0.024000] [] async_page_fault+0x28/0x30 > > [0.024000] [] async_page_fault+0x28/0x30 > > [0.024000] [] ? cpumask_copy+0x2c/0x2f > > [0.024000] [] ? cpumask_copy+0x2c/0x2f > > [0.024000] [] ? cpuset_bind+0x5b/0xc4 > > [0.024000] [] ? cpuset_bind+0x5b/0xc4 > > [0.024000] [] cgroup_init+0x2fa/0x3d3 > > [0.024000] [] cgroup_init+0x2fa/0x3d3 > > [0.024000] [] start_kernel+0x6ed/0x755 > > [0.024000] [] start_kernel+0x6ed/0x755 > > [0.024000] [] ? early_idt_handlers+0x120/0x120 > > [0.024000] [] ? early_idt_handlers+0x120/0x120 > > [0.024000] [] x86_64_start_reservations+0x46/0x4f > > [0.024000] [] x86_64_start_reservations+0x46/0x4f > > [0.024000] [] x86_64_start_kernel+0x1b0/0x1c6 > > [0.024000] [] x86_64_start_kernel+0x1b0/0x1c6 > > [0.024000] ---[ end trace 37d9a871c47a31bc ]--- > > [0.024000] ---[ end trace 37d9a871c47a31bc ]--- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH stable 3.10, 3.12, 3.14] MIPS: Export FP functions used by lose_fpu(1) for KVM
On Thu, Mar 05, 2015 at 04:08:44PM +, James Hogan wrote: > [ Upstream commit 3ce465e04bfd8de9956d515d6e9587faac3375dc ] > > Export the _save_fp asm function used by the lose_fpu(1) macro to GPL > modules so that KVM can make use of it when it is built as a module. > > This fixes the following build error when CONFIG_KVM=m due to commit > f798217dfd03 ("KVM: MIPS: Don't leak FPU/DSP to guest"): > > ERROR: "_save_fp" [arch/mips/kvm/kvm.ko] undefined! > > Signed-off-by: James Hogan > Fixes: f798217dfd03 (KVM: MIPS: Don't leak FPU/DSP to guest) > Cc: Paolo Bonzini > Cc: Ralf Baechle > Cc: Paul Burton > Cc: Gleb Natapov > Cc: k...@vger.kernel.org > Cc: linux-m...@linux-mips.org > Cc: # 3.10...3.15 > Patchwork: https://patchwork.linux-mips.org/patch/9260/ > Signed-off-by: Ralf Baechle > [james.ho...@imgtec.com: Only export when CPU_R4K_FPU=y prior to v3.16, > so as not to break the Octeon build which excludes FPU support. KVM > depends on MIPS32r2 anyway.] > Signed-off-by: James Hogan > --- > Appologies for the previous cavium_octeon_defconfig link breakage. > Octeon has the symbol since 3.16, but not before. This backport should > do the trick for stable 3.10, 3.12, and 3.14. Build tested with > cavium_octeon_defconfig and malta_kvm_defconfig on those stable > branches. > --- > arch/mips/kernel/mips_ksyms.c | 8 > 1 file changed, 8 insertions(+) Now fixed up, thanks. greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[V5 PATCH 0/2] Introduce ACPI support for ahci_platform driver
This patch series introduce ACPI support for AHCI platform driver. Existing ACPI support for AHCI assumes the device controller is a PCI device. Since there is no ACPI _CID for generic AHCI controller, the driver could not use it for matching devices. Therefore, this patch introduces a mechanism for drivers to match devices using ACPI _CLS method. _CLS contains PCI-defined class-code. This patch series also modifies ACPI modalias to add class-code to the exisiting format, which currently only uses _HID and _CIDs. This is required to support loadable modules w/ _CLS. This patch series is rebased from and tested with: http://git.linaro.org/leg/acpi/acpi.git acpi-5.1-v9 This topic was discussed earlier here (as part of introducing support for AMD Seattle SATA controller): http://marc.info/?l=linux-arm-kernel&m=141083492521584&w=2 Changes from V4 (https://lkml.org/lkml/2015/3/2/56) * [1/2] Bug fixed: Reorder the declaration of struct acpi_pnp_device_id cls in the struct acpi_device_info (include/acpi/actypes.h) since compatible_id_list must be last one. * [2/2] Added Acked-by: Tejun Heo Changes from V3 (https://lkml.org/lkml/2015/2/8/106) * Instead of introducing new structure acpi_device_cls, add cls into the acpi_device_id, and modify the __acpi_match_device to also match for cls. (per Mika suggestion.) * Add loadable module support, which requires changes in ACPI modalias. (per Mika suggestion.) * Rebased and tested with acpi-5.1-v9 Changes from V2 (https://lkml.org/lkml/2015/1/5/662) * Update with review comment from Rafael in patch 1/2 * Rebased and tested with acpi-5.1-v8 Changes from V1 (https://lkml.org/lkml/2014/12/19/345) * Rebased to 3.19.0-rc2 * Change from acpi_cls in device_driver to acpi_match_cls (Hanjun comment) * Change the matching logic in acpi_driver_match_device() due to the new special PRP0001 _HID. * Simplify the return type of acpi_match_device_cls() to boolean. 
Changes from RFC (https://lkml.org/lkml/2014/12/17/446) * Remove #ifdef and make non-ACPI version of the acpi_match_device_cls as inline. (per Arnd) * Simplify logic to retrieve and evaluate _CLS handle. (per Hanjun) Suravee Suthikulpanit (2): ACPI / scan: Add support for ACPI _CLS device matching ata: ahci_platform: Add ACPI _CLS matching drivers/acpi/acpica/acutils.h | 3 ++ drivers/acpi/acpica/nsxfname.c| 21 ++-- drivers/acpi/acpica/utids.c | 71 +++ drivers/acpi/scan.c | 17 -- drivers/ata/Kconfig | 2 +- drivers/ata/ahci_platform.c | 9 + include/acpi/acnames.h| 1 + include/acpi/actypes.h| 4 ++- include/linux/mod_devicetable.h | 1 + scripts/mod/devicetable-offsets.c | 1 + scripts/mod/file2alias.c | 13 +-- 11 files changed, 134 insertions(+), 9 deletions(-) -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[V5 PATCH 1/2] ACPI / scan: Add support for ACPI _CLS device matching
Device drivers typically use ACPI _HIDs/_CIDs listed in struct
device_driver acpi_match_table to match devices. However, for generic
drivers, we do not want to list _HID for all supported devices. Also,
certain classes of devices do not have _CID (e.g. SATA, USB). Instead,
we can leverage ACPI _CLS, which specifies the PCI-defined class code
(i.e. base-class, subclass and programming interface).

This patch adds support for matching ACPI devices using the _CLS method.

To support loadable modules, the current design uses _HID or _CID to
match a device's modalias. With the new way of matching with _CLS, this
requires a modification to the current ACPI modalias key to include
_CLS. This patch appends the PCI-defined class code to the existing ACPI
modalias as follows.

    acpi:<HID>:<CID1>:<CID2>:..:<CIDn>:<bbsspp>:

E.g:
    # cat /sys/devices/platform/AMDI0600:00/modalias
    acpi:AMDI0600:010601:

where bb is the base-class code, ss is the sub-class code, and pp is the
programming interface code.

Since there would not be _HID/_CID in the ACPI matching table of the
driver, this patch adds a field to acpi_device_id to specify the
matching _CLS.
static const struct acpi_device_id ahci_acpi_match[] = { { "", 0, PCI_CLASS_STORAGE_SATA_AHCI }, {}, }; In this case, the corresponded entry in modules.alias file would be: alias acpi*:010601:* ahci_platform Signed-off-by: Suravee Suthikulpanit --- drivers/acpi/acpica/acutils.h | 3 ++ drivers/acpi/acpica/nsxfname.c| 21 ++-- drivers/acpi/acpica/utids.c | 71 +++ drivers/acpi/scan.c | 17 -- include/acpi/acnames.h| 1 + include/acpi/actypes.h| 4 ++- include/linux/mod_devicetable.h | 1 + scripts/mod/devicetable-offsets.c | 1 + scripts/mod/file2alias.c | 13 +-- 9 files changed, 124 insertions(+), 8 deletions(-) diff --git a/drivers/acpi/acpica/acutils.h b/drivers/acpi/acpica/acutils.h index c2f03e8..2aef850 100644 --- a/drivers/acpi/acpica/acutils.h +++ b/drivers/acpi/acpica/acutils.h @@ -430,6 +430,9 @@ acpi_status acpi_ut_execute_CID(struct acpi_namespace_node *device_node, struct acpi_pnp_device_id_list ** return_cid_list); +acpi_status +acpi_ut_execute_CLS(struct acpi_namespace_node *device_node, + struct acpi_pnp_device_id **return_id); /* * utlock - reader/writer locks */ diff --git a/drivers/acpi/acpica/nsxfname.c b/drivers/acpi/acpica/nsxfname.c index d66c326..590ef06 100644 --- a/drivers/acpi/acpica/nsxfname.c +++ b/drivers/acpi/acpica/nsxfname.c @@ -276,11 +276,12 @@ acpi_get_object_info(acpi_handle handle, struct acpi_pnp_device_id *hid = NULL; struct acpi_pnp_device_id *uid = NULL; struct acpi_pnp_device_id *sub = NULL; + struct acpi_pnp_device_id *cls = NULL; char *next_id_string; acpi_object_type type; acpi_name name; u8 param_count = 0; - u8 valid = 0; + u16 valid = 0; u32 info_size; u32 i; acpi_status status; @@ -320,7 +321,7 @@ acpi_get_object_info(acpi_handle handle, if ((type == ACPI_TYPE_DEVICE) || (type == ACPI_TYPE_PROCESSOR)) { /* * Get extra info for ACPI Device/Processor objects only: -* Run the Device _HID, _UID, _SUB, and _CID methods. +* Run the Device _HID, _UID, _SUB, _CID and _CLS methods. 
* * Note: none of these methods are required, so they may or may * not be present for this device. The Info->Valid bitfield is used @@ -351,6 +352,14 @@ acpi_get_object_info(acpi_handle handle, valid |= ACPI_VALID_SUB; } + /* Execute the Device._CLS method */ + + status = acpi_ut_execute_CLS(node, &cls); + if (ACPI_SUCCESS(status)) { + info_size += cls->length; + valid |= ACPI_VALID_CLS; + } + /* Execute the Device._CID method */ status = acpi_ut_execute_CID(node, &cid_list); @@ -468,6 +477,11 @@ acpi_get_object_info(acpi_handle handle, sub, next_id_string); } + if (cls) { + next_id_string = acpi_ns_copy_device_id(&info->cls, + cls, next_id_string); + } + if (cid_list) { info->compatible_id_list.count = cid_list->count; info->compatible_id_list.list_size = cid_list->list_size; @@ -507,6 +521,9 @@ cleanup: if (sub) { ACPI_FREE(sub); } + if (cls) { + ACPI_FREE(cls); + } if (cid_list) { ACPI_FREE(cid_list); } diff --git a/drivers/acpi/acpica/utids.c b/drivers/acpi/acpica/utids.c index 27431cf..a64b5d1 100644 --- a/drivers/acpi/acpi
[PATCH 1/2] arm64: mediatek: Select PINCTRL for Mediatek platform
These 2 patches are fixups for the MT8173 pinctrl driver: http://lists.infradead.org/pipermail/linux-arm-kernel/2015-January/320066.html The arm64 maintainers don't want to add MACH_* in Kconfig, so this patch replaces the first one in that series. Matthias, Can you take this one? -- MediaTek SoCs expect to work with a pinctrl driver. Select PINCTRL if ARCH_MEDIATEK is selected. Signed-off-by: Yingjoe Chen --- arch/arm64/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index e627ead..a2ddd3f 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -151,6 +151,7 @@ menu "Platform selection" config ARCH_MEDIATEK bool "Mediatek MT65xx & MT81xx ARMv8 SoC" select ARM_GIC + select PINCTRL help Support for Mediatek MT65xx & MT81xx ARMv8 SoCs -- 1.8.1.1.dirty -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] pinctrl: mediatek: Adjust mt8173 pinctrl kconfig
Linus, This one makes the PINCTRL_MT8173 option user selectable and is based on mtk-staging in your tree. If you think this is OK, please apply it or squash it into the previous change. Thanks. -- ARM64 maintainer doesn't want to add MACH_* for each SoC. Adjust mt8173 pinctrl kconfig entry so user can manually select it. Also make PINCTRL_MT8135 build when COMPILE_TEST is enabled. Signed-off-by: Yingjoe Chen --- drivers/pinctrl/mediatek/Kconfig | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/pinctrl/mediatek/Kconfig b/drivers/pinctrl/mediatek/Kconfig index 49b8649..1472f0e 100644 --- a/drivers/pinctrl/mediatek/Kconfig +++ b/drivers/pinctrl/mediatek/Kconfig @@ -1,4 +1,4 @@ -if ARCH_MEDIATEK +if ARCH_MEDIATEK || COMPILE_TEST config PINCTRL_MTK_COMMON bool @@ -8,11 +8,13 @@ config PINCTRL_MTK_COMMON select OF_GPIO config PINCTRL_MT8135 - def_bool MACH_MT8135 + def_bool MACH_MT8135 || COMPILE_TEST select PINCTRL_MTK_COMMON config PINCTRL_MT8173 - def_bool MACH_MT8173 + bool "Mediatek MT8173 pin control" + def_bool y + depends on ARM64 || COMPILE_TEST select PINCTRL_MTK_COMMON endif -- 1.8.1.1.dirty
Re: [RFC/PATCH 2/2] perf probe: Allow weak symbols to be probed
Hi Masami, On Thu, Mar 05, 2015 at 12:57:21AM +0900, Masami Hiramatsu wrote: > (2015/03/04 22:52), Namhyung Kim wrote: > > It currently prevents adding probes in weak symbols. But there're cases > > that given name is an only weak symbol so that we cannot add probe. > > > > $ perf probe -x /usr/lib/libc.so.6 -a calloc > > Failed to find symbol calloc in /usr/lib/libc-2.21.so > > Error: Failed to add events. > > > > $ nm /usr/lib/libc.so.6 | grep calloc > > 0007b1f0 t __calloc > > 0007b1f0 T __libc_calloc > > 0007b1f0 W calloc > > > > This change will result in duplicate probes when strong and weak symbols > > co-exist in a binary. But I think it's not a big problem since probes > > at the weak symbol will never be hit anyway. > > Hmm, even on my previous series, I got an error with calloc and waitpid. > > $ ./perf probe -x /usr/lib64/libc-2.17.so -vvV calloc > probe-definition(0): calloc > symbol:calloc file:(null) line:0 offset:0 return:0 lazy:(null) > 0 arguments > Open Debuginfo file: /usr/lib/debug/usr/lib64/libc-2.17.so.debug > Searching variables at calloc > Failed to find the address of calloc > Error: Failed to show vars. Reason: No such file or directory (Code: -2) > > However, it seems that calloc is loaded as a symbol. > > $ ./perf probe -x /usr/lib64/libc-2.17.so -V calloc > ... > symbol__new: __xstat64 0xe7340-0xe7385 > symbol__new: calloc 0x80a90-0x80d2a > symbol__new: msgget 0xf7940-0xf7961 > ... > > FYI, without these patches, I see the same result (calloc is loaded) I'm a bit confused by the English ;-). So you mean that now you *can* probe calloc and waitpid with this patch, right? Thanks, Namhyung
Re: [PATCH 1/5] mtd: nand: vf610_nfc: Freescale NFC for VF610, MPC5125 and others
Hi Stefan, On Thu, Mar 05, 2015 at 12:10:20AM +0100, Stefan Agner wrote: > + > +static int vf610_nfc_probe_dt(struct device *dev, struct vf610_nfc_config > *cfg) > +{ > + struct device_node *np = dev->of_node; > + int buswidth; > + u32 clkrate; > + > + if (!np) > + return 1; > + > + cfg->flash_bbt = of_get_nand_on_flash_bbt(np); > + > + if (!of_property_read_u32(np, "clock-frequency", &clkrate)) > + cfg->clkrate = clkrate; Normally the clock-frequency property tells the driver at which frequency the device actually is running, not to tell the driver at which frequency the device *should* run. It's strange to use the value of the clock-frequency property as input to clk_set_rate(). Maybe the assigned clock binding is more appropriate here, see Documentation/devicetree/bindings/clock/clock-bindings.txt. BTW the above can easier be written as: of_property_read_u32(np, "clock-frequency", &cfg->clkrate); No return value checking necessary. > +static int vf610_nfc_probe(struct platform_device *pdev) > +{ > + struct vf610_nfc *nfc; > + struct resource *res; > + struct mtd_info *mtd; > + struct nand_chip *chip; > + struct vf610_nfc_config *cfg; > + int err = 0; > + int page_sz; > + int irq; > + > + nfc = devm_kzalloc(&pdev->dev, sizeof(*nfc), GFP_KERNEL); > + if (!nfc) > + return -ENOMEM; > + > + nfc->cfg = devm_kzalloc(&pdev->dev, sizeof(*nfc), GFP_KERNEL); > + if (!nfc->cfg) > + return -ENOMEM; > + cfg = nfc->cfg; Why is nfc->cfg allocated separately instead of embedding it into struct vf610_nfc? Is this some platform_data leftover you can remove now? > + > + nfc->dev = &pdev->dev; > + nfc->page = -1; > + mtd = &nfc->mtd; > + chip = &nfc->chip; > + > + mtd->priv = chip; > + mtd->owner = THIS_MODULE; > + mtd->dev.parent = nfc->dev; > + mtd->name = DRV_NAME; > + > + err = vf610_nfc_probe_dt(nfc->dev, cfg); > + if (err) > + return -ENODEV; Does this driver work without device tree or not? 
Currently the driver bails out when device tree support is enabled but no device node is given. When device tree support is disabled in the kernel though the driver happily continues here. Sascha -- Pengutronix e.K. | | Industrial Linux Solutions | http://www.pengutronix.de/ | Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0| Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- |
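The simplification suggested in the review above (reading "clock-frequency" straight into cfg->clkrate without checking the return value) relies on of_property_read_u32() writing its output only when the property is found. Below is a minimal userspace sketch of that pattern. The names mock_prop, mock_cfg, mock_read_u32() and mock_probe_clkrate() are hypothetical stand-ins invented for illustration; this is not the driver's code and not the kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device-tree property; not a kernel type. */
struct mock_prop {
	const char *name;
	uint32_t value;
};

/* Hypothetical config struct holding the clock rate. */
struct mock_cfg {
	uint32_t clkrate;
};

/*
 * Mock of the "read u32 property" pattern: *out is written only on
 * success, and a negative errno is returned otherwise. A caller can
 * therefore pre-load a default and ignore the return value.
 */
int mock_read_u32(const struct mock_prop *props, size_t n,
		  const char *name, uint32_t *out)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(props[i].name, name) == 0) {
			*out = props[i].value;
			return 0;
		}
	}
	return -EINVAL;
}

/* Read straight into the config field; the default survives a miss. */
uint32_t mock_probe_clkrate(const struct mock_prop *props, size_t n,
			    uint32_t default_rate)
{
	struct mock_cfg cfg = { .clkrate = default_rate };

	mock_read_u32(props, n, "clock-frequency", &cfg.clkrate);
	return cfg.clkrate;
}
```

With a property list that contains "clock-frequency", the helper returns that value; with an empty list it returns the default unchanged, which is why the intermediate local variable and the if () become unnecessary.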
[PATCH v2 1/2] net/macb: unify clock management
From: Cyrille Pitchen Most of the functions from the Common Clk Framework handle NULL pointer as input argument. Since the TX clock is optional, we now set tx_clk to NULL value instead of ERR_PTR(-ENOENT) when this clock is not available. This simplifies the clock management and avoid the need to test tx_clk value. Signed-off-by: Cyrille Pitchen Acked-by: Boris Brezillon Acked-by: Alexandre Belloni --- drivers/net/ethernet/cadence/macb.c | 31 ++- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c index a7dbf04..a429cf8 100644 --- a/drivers/net/ethernet/cadence/macb.c +++ b/drivers/net/ethernet/cadence/macb.c @@ -213,6 +213,9 @@ static void macb_set_tx_clk(struct clk *clk, int speed, struct net_device *dev) { long ferr, rate, rate_rounded; + if (!clk) + return; + switch (speed) { case SPEED_10: rate = 250; @@ -292,8 +295,7 @@ static void macb_handle_link_change(struct net_device *dev) spin_unlock_irqrestore(&bp->lock, flags); - if (!IS_ERR(bp->tx_clk)) - macb_set_tx_clk(bp->tx_clk, phydev->speed, dev); + macb_set_tx_clk(bp->tx_clk, phydev->speed, dev); if (status_change) { if (phydev->link) { @@ -2244,6 +2246,8 @@ static int macb_probe(struct platform_device *pdev) } tx_clk = devm_clk_get(&pdev->dev, "tx_clk"); + if (IS_ERR(tx_clk)) + tx_clk = NULL; err = clk_prepare_enable(pclk); if (err) { @@ -2257,13 +2261,10 @@ static int macb_probe(struct platform_device *pdev) goto err_out_disable_pclk; } - if (!IS_ERR(tx_clk)) { - err = clk_prepare_enable(tx_clk); - if (err) { - dev_err(&pdev->dev, "failed to enable tx_clk (%u)\n", - err); - goto err_out_disable_hclk; - } + err = clk_prepare_enable(tx_clk); + if (err) { + dev_err(&pdev->dev, "failed to enable tx_clk (%u)\n", err); + goto err_out_disable_hclk; } err = -ENOMEM; @@ -2436,8 +2437,7 @@ err_out_unregister_netdev: err_out_free_netdev: free_netdev(dev); err_out_disable_clocks: - if (!IS_ERR(tx_clk)) - clk_disable_unprepare(tx_clk); 
+ clk_disable_unprepare(tx_clk); err_out_disable_hclk: clk_disable_unprepare(hclk); err_out_disable_pclk: @@ -2461,8 +2461,7 @@ static int macb_remove(struct platform_device *pdev) kfree(bp->mii_bus->irq); mdiobus_free(bp->mii_bus); unregister_netdev(dev); - if (!IS_ERR(bp->tx_clk)) - clk_disable_unprepare(bp->tx_clk); + clk_disable_unprepare(bp->tx_clk); clk_disable_unprepare(bp->hclk); clk_disable_unprepare(bp->pclk); free_netdev(dev); @@ -2480,8 +2479,7 @@ static int __maybe_unused macb_suspend(struct device *dev) netif_carrier_off(netdev); netif_device_detach(netdev); - if (!IS_ERR(bp->tx_clk)) - clk_disable_unprepare(bp->tx_clk); + clk_disable_unprepare(bp->tx_clk); clk_disable_unprepare(bp->hclk); clk_disable_unprepare(bp->pclk); @@ -2496,8 +2494,7 @@ static int __maybe_unused macb_resume(struct device *dev) clk_prepare_enable(bp->pclk); clk_prepare_enable(bp->hclk); - if (!IS_ERR(bp->tx_clk)) - clk_prepare_enable(bp->tx_clk); + clk_prepare_enable(bp->tx_clk); netif_device_attach(netdev); -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
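The idea behind this patch is that an optional clock stored as NULL can be passed to the clock helpers unconditionally, because the helpers treat NULL as "no clock". The following is a minimal userspace sketch of that pattern, not the macb driver and not the Common Clock Framework itself; mock_clk and all mock_* functions are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature clock object; the real struct clk is opaque. */
struct mock_clk {
	int enable_count;
};

/* NULL is treated as "no clock": enabling it succeeds and does nothing. */
int mock_clk_prepare_enable(struct mock_clk *clk)
{
	if (!clk)
		return 0;
	clk->enable_count++;
	return 0;
}

/* Disabling a NULL clock is likewise a no-op. */
void mock_clk_disable_unprepare(struct mock_clk *clk)
{
	if (!clk)
		return;
	clk->enable_count--;
}

/*
 * Probe-style bring-up: the optional tx clock is passed through
 * unconditionally, with no per-call-site validity test.
 */
int mock_clocks_up(struct mock_clk *pclk, struct mock_clk *tx_clk)
{
	int err;

	err = mock_clk_prepare_enable(pclk);
	if (err)
		return err;

	err = mock_clk_prepare_enable(tx_clk);
	if (err) {
		mock_clk_disable_unprepare(pclk);
		return err;
	}
	return 0;
}

/* Remove/suspend-style teardown, in reverse order of bring-up. */
void mock_clocks_down(struct mock_clk *pclk, struct mock_clk *tx_clk)
{
	mock_clk_disable_unprepare(tx_clk);
	mock_clk_disable_unprepare(pclk);
}
```

Calling mock_clocks_up(&pclk, NULL) behaves the same as on a board without a tx clock: only pclk is counted, and the teardown path needs no special case either.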
[PATCH v2 2/2] net/macb: merge at91_ether driver into macb driver
From: Cyrille Pitchen macb and at91_ether drivers can be compiled as modules, but the at91_ether driver use some functions and variables defined in the macb one, thus creating a dependency on the macb driver. Since these drivers are sharing the same logic we can easily merge at91_ether into macb. In order to factorize common probing logic we've added an ->init() function to struct macb_config (the structure associated with the compatible string), and moved macb specific init code from macb_probe to macb_init. Signed-off-by: Cyrille Pitchen Signed-off-by: Boris Brezillon --- drivers/net/ethernet/cadence/Kconfig | 8 - drivers/net/ethernet/cadence/Makefile | 1 - drivers/net/ethernet/cadence/at91_ether.c | 481 -- drivers/net/ethernet/cadence/macb.c | 641 ++ drivers/net/ethernet/cadence/macb.h | 10 +- 5 files changed, 485 insertions(+), 656 deletions(-) delete mode 100644 drivers/net/ethernet/cadence/at91_ether.c diff --git a/drivers/net/ethernet/cadence/Kconfig b/drivers/net/ethernet/cadence/Kconfig index 321d2ad..fb8d09b 100644 --- a/drivers/net/ethernet/cadence/Kconfig +++ b/drivers/net/ethernet/cadence/Kconfig @@ -20,14 +20,6 @@ config NET_CADENCE if NET_CADENCE -config ARM_AT91_ETHER - tristate "AT91RM9200 Ethernet support" - depends on HAS_DMA && (ARCH_AT91 || COMPILE_TEST) - select MACB - ---help--- - If you wish to compile a kernel for the AT91RM9200 and enable - ethernet support, then you should always answer Y to this. - config MACB tristate "Cadence MACB/GEM support" depends on HAS_DMA && (PLATFORM_AT32AP || ARCH_AT91 || ARCH_PICOXCELL || ARCH_ZYNQ || MICROBLAZE || COMPILE_TEST) diff --git a/drivers/net/ethernet/cadence/Makefile b/drivers/net/ethernet/cadence/Makefile index 9068b83..91f79b1 100644 --- a/drivers/net/ethernet/cadence/Makefile +++ b/drivers/net/ethernet/cadence/Makefile @@ -2,5 +2,4 @@ # Makefile for the Atmel network device drivers. 
# -obj-$(CONFIG_ARM_AT91_ETHER) += at91_ether.o obj-$(CONFIG_MACB) += macb.o diff --git a/drivers/net/ethernet/cadence/at91_ether.c b/drivers/net/ethernet/cadence/at91_ether.c deleted file mode 100644 index 7ef55f5..000 --- a/drivers/net/ethernet/cadence/at91_ether.c +++ /dev/null @@ -1,481 +0,0 @@ -/* - * Ethernet driver for the Atmel AT91RM9200 (Thunder) - * - * Copyright (C) 2003 SAN People (Pty) Ltd - * - * Based on an earlier Atmel EMAC macrocell driver by Atmel and Lineo Inc. - * Initial version by Rick Bronson 01/11/2003 - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version - * 2 of the License, or (at your option) any later version. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "macb.h" - -/* 1518 rounded up */ -#define MAX_RBUFF_SZ 0x600 -/* max number of receive buffers */ -#define MAX_RX_DESCR 9 - -/* Initialize and start the Receiver and Transmit subsystems */ -static int at91ether_start(struct net_device *dev) -{ - struct macb *lp = netdev_priv(dev); - dma_addr_t addr; - u32 ctl; - int i; - - lp->rx_ring = dma_alloc_coherent(&lp->pdev->dev, -(MAX_RX_DESCR * - sizeof(struct macb_dma_desc)), -&lp->rx_ring_dma, GFP_KERNEL); - if (!lp->rx_ring) - return -ENOMEM; - - lp->rx_buffers = dma_alloc_coherent(&lp->pdev->dev, - MAX_RX_DESCR * MAX_RBUFF_SZ, - &lp->rx_buffers_dma, GFP_KERNEL); - if (!lp->rx_buffers) { - dma_free_coherent(&lp->pdev->dev, - MAX_RX_DESCR * sizeof(struct macb_dma_desc), - lp->rx_ring, lp->rx_ring_dma); - lp->rx_ring = NULL; - return -ENOMEM; - } - - addr = lp->rx_buffers_dma; - for (i = 0; i < MAX_RX_DESCR; i++) { - lp->rx_ring[i].addr = addr; - lp->rx_ring[i].ctrl = 0; - addr += MAX_RBUFF_SZ; - } - - /* Set the Wrap bit on the last descriptor */ - 
lp->rx_ring[MAX_RX_DESCR - 1].addr |= MACB_BIT(RX_WRAP); - - /* Reset buffer index */ - lp->rx_tail = 0; - - /* Program address of descriptor list in Rx Buffer Queue register */ - macb_writel(lp, RBQP, lp->rx_ring_dma); - - /* Enable Receive and Transmit */ - ctl = macb_readl(lp, NCR); - macb_writel(lp, NCR, ctl | MACB_BIT(RE) | MACB_BIT(TE)); - - return 0; -} - -/* Open the ethernet interface */ -static int at91ether_open(struct ne
[PATCH v2 0/2] net/macb: merge at91_ether driver into macb driver
Hello, The rm9200 boards use the dedicated at91_ether driver instead of the regular macb driver. Both the macb and at91_ether drivers can be compiled as separate modules. Since the at91_ether driver uses code from the macb driver, at91_ether.ko depends on macb.ko. However the macb.ko module always fails to load on rm9200 boards: the macb_probe() function expects a hclk clock which doesn't exist on rm9200. Then at91_ether.ko can't be loaded in turn due to unresolved dependencies. This series of patches fixes this issue by merging at91_ether into macb. Best Regards, Boris Changes since v1: - rework probe functions to share common probing logic Cyrille Pitchen (2): net/macb: unify clock management net/macb: merge at91_ether driver into macb driver drivers/net/ethernet/cadence/Kconfig | 8 - drivers/net/ethernet/cadence/Makefile | 1 - drivers/net/ethernet/cadence/at91_ether.c | 481 -- drivers/net/ethernet/cadence/macb.c | 662 ++ drivers/net/ethernet/cadence/macb.h | 10 +- 5 files changed, 494 insertions(+), 668 deletions(-) delete mode 100644 drivers/net/ethernet/cadence/at91_ether.c -- 1.9.1
[V5 PATCH 2/2] ata: ahci_platform: Add ACPI _CLS matching
This patch adds ACPI supports for AHCI platform driver, which uses _CLS method to match the device. The following is an example of ASL structure in DSDT for a SATA controller, which contains _CLS package to be matched by the ahci_platform driver: Device (AHC0) // AHCI Controller { Name(_HID, "AMDI0600") Name (_CCA, 1) Name (_CLS, Package (3) { 0x01, // Base Class: Mass Storage 0x06, // Sub-Class: serial ATA 0x01, // Interface: AHCI }) Name (_CRS, ResourceTemplate () { Memory32Fixed (ReadWrite, 0xE030, 0x0001) Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive,,,) { 387 } }) } Also, since ATA driver should not require PCI support for ATA_ACPI, this patch removes dependency in the driver/ata/Kconfig. Acked-by: Tejun Heo Signed-off-by: Suravee Suthikulpanit --- drivers/ata/Kconfig | 2 +- drivers/ata/ahci_platform.c | 9 + 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig index 5f60155..50305e3 100644 --- a/drivers/ata/Kconfig +++ b/drivers/ata/Kconfig @@ -48,7 +48,7 @@ config ATA_VERBOSE_ERROR config ATA_ACPI bool "ATA ACPI Support" - depends on ACPI && PCI + depends on ACPI default y help This option adds support for ATA-related ACPI objects. 
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c index 78d6ae0..842cd13 100644 --- a/drivers/ata/ahci_platform.c +++ b/drivers/ata/ahci_platform.c @@ -20,6 +20,8 @@ #include #include #include +#include +#include #include "ahci.h" #define DRV_NAME "ahci" @@ -78,12 +80,19 @@ static const struct of_device_id ahci_of_match[] = { }; MODULE_DEVICE_TABLE(of, ahci_of_match); +static const struct acpi_device_id ahci_acpi_match[] = { + { "", 0, PCI_CLASS_STORAGE_SATA_AHCI }, + {}, +}; +MODULE_DEVICE_TABLE(acpi, ahci_acpi_match); + static struct platform_driver ahci_driver = { .probe = ahci_probe, .remove = ata_platform_remove_one, .driver = { .name = DRV_NAME, .of_match_table = ahci_of_match, + .acpi_match_table = ahci_acpi_match, .pm = &ahci_pm_ops, }, }; -- 2.1.0
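To make the relationship between the three-byte _CLS package (0x01, 0x06, 0x01), the PCI_CLASS_STORAGE_SATA_AHCI constant and the "acpi:AMDI0600:010601:" modalias example concrete, here is a small userspace sketch. It is only an illustration under stated assumptions: the sketch_* functions and SKETCH_CLASS_SATA_AHCI are hypothetical names, the modalias layout is inferred from the single example given in this series, and none of this is the ACPICA or scan.c code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same numeric value as PCI_CLASS_STORAGE_SATA_AHCI (0x010601). */
#define SKETCH_CLASS_SATA_AHCI 0x010601u

/* Pack the three _CLS bytes into one 24-bit class code (bbsspp). */
uint32_t sketch_pack_cls(uint8_t base, uint8_t sub, uint8_t prog_if)
{
	return ((uint32_t)base << 16) | ((uint32_t)sub << 8) | prog_if;
}

/*
 * Format "acpi:<HID>:<bbsspp>:" for a device with one _HID and no
 * _CID, following the example modalias shown in this series.
 * Returns the snprintf() result.
 */
int sketch_modalias(char *buf, size_t len, const char *hid, uint32_t cls)
{
	return snprintf(buf, len, "acpi:%s:%06X:", hid, (unsigned int)cls);
}

/* A driver entry with class code 0 is treated as "no _CLS match". */
int sketch_cls_matches(uint32_t dev_cls, uint32_t drv_cls)
{
	return drv_cls != 0 && dev_cls == drv_cls;
}
```

Packing the bytes from the ASL example yields 0x010601, which equals the class constant in the driver's match table, and formatting it with the "AMDI0600" _HID reproduces the modalias string from the example.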
Re: [cgroup] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives()
Hi, This bug should have been fixed by "[PATCH -next] cpuset: initialize cpuset a bit early": http://www.spinics.net/lists/cgroups/msg12599.html Thanks, Vladimir On Fri, Mar 06, 2015 at 01:57:58PM +0800, Fengguang Wu wrote: > [0.021989] [ cut here ] > [0.021989] [ cut here ] > [0.022816] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 > warn_pre_alternatives+0x25/0x2e() > [0.022816] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 > warn_pre_alternatives+0x25/0x2e() > [0.024000] You're using static_cpu_has before alternatives have run! > [0.024000] You're using static_cpu_has before alternatives have run! > [0.024000] Modules linked in: > [0.024000] Modules linked in: > > [0.024000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted > 4.0.0-rc1-4-g295458e #455 > [0.024000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted > 4.0.0-rc1-4-g295458e #455 > [0.024000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.7.5-20140531_083030-gandalf 04/01/2014 > [0.024000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.7.5-20140531_083030-gandalf 04/01/2014 > [0.024000] 0009 > [0.024000] 0009 81e03cc8 81e03cc8 > 81674d02 81674d02 810ca88e 810ca88e > > [0.024000] 81e03d18 > [0.024000] 81e03d18 81e03d08 81e03d08 > 81073d6f 81073d6f > > [0.024000] 81018f79 > [0.024000] 81018f79 81e03e38 81e03e38 > 0002 0002 > > [0.024000] Call Trace: > [0.024000] Call Trace: > [0.024000] [] dump_stack+0xa0/0xd5 > [0.024000] [] dump_stack+0xa0/0xd5 > [0.024000] [] ? console_unlock+0x496/0x4ef > [0.024000] [] ? console_unlock+0x496/0x4ef > [0.024000] [] warn_slowpath_common+0xc8/0xf7 > [0.024000] [] warn_slowpath_common+0xc8/0xf7 > [0.024000] [] ? warn_pre_alternatives+0x25/0x2e > [0.024000] [] ? warn_pre_alternatives+0x25/0x2e > [0.024000] [] warn_slowpath_fmt+0x4f/0x58 > [0.024000] [] warn_slowpath_fmt+0x4f/0x58 > [0.024000] [] ? native_iret+0x7/0x7 > [0.024000] [] ? 
native_iret+0x7/0x7 > [0.024000] [] warn_pre_alternatives+0x25/0x2e > [0.024000] [] warn_pre_alternatives+0x25/0x2e > [0.024000] [] __do_page_fault+0x2b4/0x7c2 > [0.024000] [] __do_page_fault+0x2b4/0x7c2 > [0.024000] [] do_page_fault+0x3e/0x4a > [0.024000] [] do_page_fault+0x3e/0x4a > [0.024000] [] do_async_page_fault+0x3a/0xb9 > [0.024000] [] do_async_page_fault+0x3a/0xb9 > [0.024000] [] async_page_fault+0x28/0x30 > [0.024000] [] async_page_fault+0x28/0x30 > [0.024000] [] ? cpumask_copy+0x2c/0x2f > [0.024000] [] ? cpumask_copy+0x2c/0x2f > [0.024000] [] ? cpuset_bind+0x5b/0xc4 > [0.024000] [] ? cpuset_bind+0x5b/0xc4 > [0.024000] [] cgroup_init+0x2fa/0x3d3 > [0.024000] [] cgroup_init+0x2fa/0x3d3 > [0.024000] [] start_kernel+0x6ed/0x755 > [0.024000] [] start_kernel+0x6ed/0x755 > [0.024000] [] ? early_idt_handlers+0x120/0x120 > [0.024000] [] ? early_idt_handlers+0x120/0x120 > [0.024000] [] x86_64_start_reservations+0x46/0x4f > [0.024000] [] x86_64_start_reservations+0x46/0x4f > [0.024000] [] x86_64_start_kernel+0x1b0/0x1c6 > [0.024000] [] x86_64_start_kernel+0x1b0/0x1c6 > [0.024000] ---[ end trace 37d9a871c47a31bc ]--- > [0.024000] ---[ end trace 37d9a871c47a31bc ]--- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] do_fork(): Rename 'stack_size' argument to reflect actual use
On 05/03/15 22:29, David Rientjes wrote: On Thu, 5 Mar 2015, Alex Dowad wrote: diff --git a/kernel/fork.c b/kernel/fork.c index cf65139..b38a2ae 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1186,10 +1186,12 @@ init_task_pid(struct task_struct *task, enum pid_type type, struct pid *pid) * It copies the registers, and all the appropriate * parts of the process environment (as per the clone * flags). The actual kick-off is left to the caller. + * + * When copying a kernel thread, 'stack_start' is the function to run. */ static struct task_struct *copy_process(unsigned long clone_flags, unsigned long stack_start, - unsigned long stack_size, + unsigned long kthread_arg, int __user *child_tidptr, struct pid *pid, int trace) @@ -1401,7 +1403,7 @@ static struct task_struct *copy_process(unsigned long clone_flags, retval = copy_io(clone_flags, p); if (retval) goto bad_fork_cleanup_namespaces; - retval = copy_thread(clone_flags, stack_start, stack_size, p); + retval = copy_thread(clone_flags, stack_start, kthread_arg, p); if (retval) goto bad_fork_cleanup_io; @@ -1629,8 +1631,8 @@ struct task_struct *fork_idle(int cpu) * it and waits for it to finish using the VM if required. */ long do_fork(unsigned long clone_flags, - unsigned long stack_start, - unsigned long stack_size, + unsigned long stack_start, /* or function for kthread to run */ + unsigned long kthread_arg, int __user *parent_tidptr, int __user *child_tidptr) { Looks fine, but I'm not sure about commenting functional formals. Since copy_process() and do_fork() can have formals with different meanings, then why not just rename them "arg1" and "arg2" respectively and then define in the comment above the function what the possible combinations are? The second argument is *only* ever used for one thing: an argument passed to a kernel thread. That's why I would like to rename it to "kthread_arg". 
The previous argument (currently named "stack_start") is indeed used for 2 different things: a new stack pointer for a user thread, or a function to be executed by a kernel thread. Rather than "arg1", what would you think of something like "sp_or_fn", or "usp_or_fn"? I would recommend exactly "arg" since it can be used for multiple purposes and if the formal could ever be used for a third purpose we don't want to go through another re-naming patch to change it from sp_or_fn or usp_or_fn. If that's done, then the comment above the function could define what arg can represent. Do others concur with this idea? Personally, I feel the code will be more readable/maintainable if the naming of args/variables/etc reflects what they are actually used for. (Case in point: on IA64, copy_thread() adds the kernel thread arg to the user stack pointer. The kernel thread arg is always 0 when forking a user process, so this "works", but it's certainly not what the author intended. Good names make it harder to write buggy code!) For readability, using the same arg for 2 different purposes is a bad practice (though it might be good for keeping the object code small). I hate to think that "arg" might be co-opted for another purpose again. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[bdi] BUG: unable to handle kernel NULL pointer dereference at 0000000000000550
/0xc2 [0.704142] [] evict+0xa2/0x15e [0.704142] [] iput+0x160/0x16d [0.704142] [] bdput+0xd/0xf [0.704142] [] __blkdev_put+0x166/0x18a [0.704142] [] blkdev_put+0x114/0x11d [0.704142] [] add_disk+0x44d/0x461 [0.704142] [] brd_init+0x95/0x160 [0.704142] [] ? ramdisk_size+0x1a/0x1a [0.704142] [] do_one_initcall+0xe8/0x175 [0.704142] [] kernel_init_freeable+0x1d0/0x258 [0.704142] [] ? rest_init+0xbc/0xbc [0.704142] [] kernel_init+0x9/0xd5 [0.704142] [] ret_from_fork+0x7c/0xb0 [0.704142] [] ? rest_init+0xbc/0xbc [0.704142] Code: ca 48 c1 ea 04 29 d0 ba 01 00 00 00 89 8f 80 08 00 00 ff c8 85 c0 0f 4e c2 89 87 84 08 00 00 c3 48 8b 87 10 01 00 00 55 48 89 e5 <48> 8b 80 50 05 00 00 5d 48 05 58 02 00 00 c3 48 89 fa 31 c0 b9 [0.704142] RIP [] blk_get_backing_dev_info+0xb/0x1a [0.704142] RSP [0.704142] CR2: 0550 [0.704142] ---[ end trace 5c64cf25111d3d67 ]--- [0.704142] Kernel panic - not syncing: Fatal exception git bisect start 45b8e7be563c57fc42d69d5239b4829b5586620d 13a7a6ac0a11197edcd0f756a035f472b42cdf8b -- git bisect bad 980171ac3db20fc792b9b1298067344725a5a285 # 19:07 0- 20 Merge 'luto/x86/entry' into devel-xian-x86_64-201503051818 git bisect bad 7a2a5fad21b95990713cbdfaccc9eeba4e98f9b8 # 19:13 0- 20 Merge 'kees/format-security' into devel-xian-x86_64-201503051818 git bisect good cadb5884edc7353ecb245cf0874ead1f9565f2a7 # 19:29 20+ 0 Merge 'trace/ftrace/urgent' into devel-xian-x86_64-201503051818 git bisect good 30abe812fb9b18b25ebb9d2d214a70013a191ccb # 19:34 20+ 0 Merge 'paulburton/wip-ci20-v4.0' into devel-xian-x86_64-201503051818 git bisect good 0d0fc17147f433ffe27f8d2fcd3b29e109694fe3 # 19:40 20+ 0 Merge 'arm-soc/next/drivers' into devel-xian-x86_64-201503051818 git bisect bad caca114c0271d4df06e2ff1acee68dd62be43d66 # 20:03 0- 20 Merge 'josef-btrfs/superblock-scaling' into devel-xian-x86_64-201503051818 git bisect good d2ee19114357bdf21c59a3ac61eb053ef1c0dc4e # 20:15 20+ 8 inode: rename i_wb_list to i_io_list git bisect bad 
63738525a6ebdf74bb3eb1c3dba16c0bb6895d97 # 20:28 0- 20 inode: convert per-sb inode list to a list_lru git bisect bad a05899067cddc24276e43e0d440da791738cf967 # 20:42 0- 20 writeback: periodically trim the writeback list git bisect bad 40ceea09e84d1b9319236b27ad3162422310e5d0 # 21:12 0- 20 bdi: add a new writeback list for sync # first bad commit: [40ceea09e84d1b9319236b27ad3162422310e5d0] bdi: add a new writeback list for sync git bisect good d2ee19114357bdf21c59a3ac61eb053ef1c0dc4e # 21:14 60+ 8 inode: rename i_wb_list to i_io_list # extra tests with DEBUG_INFO git bisect bad 40ceea09e84d1b9319236b27ad3162422310e5d0 # 22:55 0- 22 bdi: add a new writeback list for sync # extra tests on HEAD of linux-devel/devel-xian-x86_64-201503051818 git bisect bad 45b8e7be563c57fc42d69d5239b4829b5586620d # 22:55 0- 12 0day head guard for 'devel-xian-x86_64-201503051818' # extra tests on tree/branch josef-btrfs/superblock-scaling git bisect bad d119f33d7f868e92c2d7fd21da1aade94584994d # 23:13 0- 60 inode: don't softlockup when evicting inodes # extra tests on tree/branch linus/master git bisect good 6587457b4b3d663b237a0f95ddf6e67d1828c8ea # 23:41 60+ 2 Merge tag 'dma-buf-for-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf # extra tests on tree/branch next/master git bisect good cbbf783608bd1f177fd8b1f6498bb2481116beed # 23:53 60+ 0 Add linux-next specific files for 20150305 This script may reproduce the error. 
#!/bin/bash kernel=$1 initrd=yocto-minimal-x86_64.cgz wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd kvm=( qemu-system-x86_64 -cpu kvm64 -enable-kvm -kernel $kernel -initrd $initrd -m 320 -smp 1 -net nic,vlan=1,model=e1000 -net user,vlan=1 -boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -serial stdio -display none -monitor null ) append=( hung_task_panic=1 earlyprintk=ttyS0,115200 rd.udev.log-priority=err systemd.log_target=journal systemd.log_level=warning debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal root=/dev/ram0 rw drbd.minor_count=8 ) "${kvm[@]}" --append &q
[PCI] BUG: unable to handle kernel
Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git

commit 0b2af171520e5d5e7d5b5f479b90a6a5014d9df6
Author:     Murali Karicheri
AuthorDate: Tue Mar 3 12:52:13 2015 -0500
Commit:     Bjorn Helgaas
CommitDate: Tue Mar 3 14:42:58 2015 -0600

    PCI: Update DMA configuration from DT

    If there is a DT node available for the root bridge's parent device, use
    the DMA configuration from that device node. For example, Keystone PCI
    devices would require dma_pfn_offset to be set correctly in the device
    structure of the PCI device in order to have the correct DMA mask. The DT
    node will have dma-ranges defined for this. Also support using the DT
    property dma-coherent to allow coherent DMA operation by the PCI device.

    Use the new helper function of_pci_dma_configure() to update the device DMA
    configuration.

    This fixes DMA on systems where DMA addresses are a constant offset from
    CPU physical addresses.

    Tested-by: Suravee Suthikulpanit (AMD Seattle)
    Signed-off-by: Murali Karicheri
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Catalin Marinas
    Acked-by: Will Deacon
    CC: Joerg Roedel
    CC: Grant Likely
    CC: Rob Herring
    CC: Russell King
    CC: Arnd Bergmann

+------------------------------------------+------------+------------+-----------------+
|                                          | bdc567f9c1 | 0b2af17152 | v4.0-rc2_030422 |
+------------------------------------------+------------+------------+-----------------+
| boot_successes                           | 47         | 0          | 0               |
| boot_failures                            | 33         | 20         | 12              |
| page_allocation_failure:order:#,mode     | 33         |            |                 |
| backtrace:btrfs_test_extent_io           | 33         |            |                 |
| backtrace:init_btrfs_fs                  | 33         |            |                 |
| backtrace:kernel_init_freeable           | 33         | 20         | 12              |
| BUG:unable_to_handle_kernel              | 0          | 20         | 12              |
| Oops                                     | 0          | 20         | 12              |
| EIP_is_at_of_pci_dma_configure           | 0          | 20         | 12              |
| Kernel_panic-not_syncing:Fatal_exception | 0          | 20         | 12              |
| backtrace:acpi_bus_scan                  | 0          | 20         | 12              |
| backtrace:acpi_scan_init                 | 0          | 20         | 12              |
| backtrace:acpi_init                      | 0          | 20         | 12              |
+------------------------------------------+------------+------------+-----------------+

[0.573023] pci_bus :00: root bus resource [mem 0x1400-0xfebf window]
[0.573381] pci :00:00.0: [8086:1237] type 00 class 0x06
[0.574397] BUG: unable to handle kernel NULL pointer dereference at 01c4
[0.575439] IP: [<79a20c33>] of_pci_dma_configure+0x33/0x70
[0.576231] *pde =
[0.57] Oops: [#1] SMP
[0.57] Modules linked in:
[0.57] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.0.0-rc1-6-g0b2af17 #6
[0.57] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[0.57] task: 7806 ti: 78068000 task.ti: 78068000
[0.57] EIP: 0060:[<79a20c33>] EFLAGS: 00010246 CPU: 0
[0.57] EIP is at of_pci_dma_configure+0x33/0x70
[0.57] EAX: EBX: 78011800 ECX: EDX: 0005
[0.57] ESI: 781d8400 EDI: 781d8000 EBP: 78069cd0 ESP: 78069cc8
[0.57] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[0.57] CR0: 8005003b CR2: 01c4 CR3: 0229f000 CR4: 06d0
[0.57] Stack:
[0.57] 78011800
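The fault address in the oops above (CR2: 01c4) is small but non-zero. One common way such an address arises, sketched here under assumptions, is a member read through a NULL structure pointer: the CPU then faults at the member's offset rather than at address 0. The structure `dev_sketch`, its layout and the helper `member_address()` are hypothetical; the report does not say which pointer in of_pci_dma_configure() was NULL.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT a kernel structure: the padding is chosen so
 * that `member` sits 0x1c4 bytes into the struct, matching the reported
 * CR2 value.
 */
struct dev_sketch {
	unsigned char pad[0x1c4];
	uint32_t member;
};

/*
 * Address the CPU would touch when evaluating base->member.
 * This only computes the address; nothing is dereferenced.
 */
uintptr_t member_address(uintptr_t base)
{
	return base + offsetof(struct dev_sketch, member);
}
```

Under these assumptions, `member_address(0)` evaluates to 0x1c4, which is the kind of address reported as a "NULL pointer dereference" even though it is not literally zero.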
RE: [PATCH v2] ixgbe: make VLAN filter conditional
> From: Hiroshi Shimamoto
>
> Disable hardware VLAN filtering if netdev->features VLAN flag is dropped.
>
> In SR-IOV case, there is a use case which needs to disable VLAN filter.
> For example, we need to make a network function with VF in virtualized
> environment. That network function may be a software switch, a router
> or etc. It means that that network function will be an end point which
> terminates many VLANs.
>
> In the current implementation, VLAN filtering always be turned on and
> VF can receive only 63 VLANs. It means that only 63 VLANs can be terminated
> in one NIC.
>
> With this patch, if the user turns VLAN filtering off on the host, VF
> can receive every VLAN packet.
>
> This VLAN filtering can be turned on or off when SR-IOV is disabled, if not
> the operation is rejected.

Hi,

any comment about this?
I added a warning message and prevent operation during SR-IOV is enabled.

thanks,
Hiroshi
[cpumask] WARNING: CPU: 0 PID: 0 at lib/list_debug.c:29 __list_add()
bsolete cpu function usage. git bisect good c099221e5944e36612c4079d888a38530a667645 # 13:27 20+ 20 cpumask: remove deprecated functions. git bisect bad f754909a13e848900abee1014ca29b9b4e33b6ff # 13:34 0- 20 cpumask: only allocate nr_cpumask_bits. git bisect good 7928baeec50516d4f632f2b9a66925a3fc1126b0 # 13:56 20+ 0 Fix weird uses of num_online_cpus(). # first bad commit: [f754909a13e848900abee1014ca29b9b4e33b6ff] cpumask: only allocate nr_cpumask_bits. git bisect good 7928baeec50516d4f632f2b9a66925a3fc1126b0 # 13:58 60+ 16 Fix weird uses of num_online_cpus(). # extra tests with DEBUG_INFO git bisect good f754909a13e848900abee1014ca29b9b4e33b6ff # 14:15 60+ 60 cpumask: only allocate nr_cpumask_bits. # extra tests on HEAD of linux-devel/devel-snb-smoke-201503051145 git bisect bad e9d45bb15ba89a3ef7b6dde0d12d15d4964e74de # 14:15 0- 12 0day head guard for 'devel-snb-smoke-201503051145' # extra tests on tree/branch rusty/cpumask-next git bisect bad f754909a13e848900abee1014ca29b9b4e33b6ff # 14:23 0- 20 cpumask: only allocate nr_cpumask_bits. # extra tests with first bad commit reverted # extra tests on tree/branch linus/master git bisect good 6587457b4b3d663b237a0f95ddf6e67d1828c8ea # 14:32 60+ 20 Merge tag 'dma-buf-for-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf # extra tests on tree/branch next/master git bisect good cbbf783608bd1f177fd8b1f6498bb2481116beed # 14:52 60+ 60 Add linux-next specific files for 20150305 This script may reproduce the error. 
#!/bin/bash

kernel=$1

kvm=(
	qemu-system-x86_64
	-cpu kvm64
	-enable-kvm
	-kernel $kernel
	-m 320
	-smp 1
	-net nic,vlan=1,model=e1000
	-net user,vlan=1
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-rtc base=localtime
	-serial stdio
	-display none
	-monitor null
)

append=(
	hung_task_panic=1
	earlyprintk=ttyS0,115200
	rd.udev.log-priority=err
	systemd.log_target=journal
	systemd.log_level=warning
	debug
	apic=debug
	sysrq_always_enabled
	rcupdate.rcu_cpu_stall_timeout=100
	panic=-1
	softlockup_panic=1
	nmi_watchdog=panic
	oops=panic
	load_ramdisk=2
	prompt_ramdisk=0
	console=ttyS0,115200
	console=tty0
	vga=normal
	root=/dev/ram0
	rw
	drbd.minor_count=8
)

"${kvm[@]}" --append "${append[*]}"

Thanks,
Fengguang

early console in setup code
Probing EDD (edd=off to disable)... ok
early console in decompress_kernel
KASLR using RDTSC...
Decompressing Linux... Parsing ELF... Performing relocations... done.
Booting the kernel.
[0.00] Initializing cgroup subsys cpuset
[0.00] Linux version 4.0.0-rc2-00166-gf754909 (kbuild@snb) (gcc version 4.9.1 (Debian 4.9.1-19) ) #35 SMP PREEMPT Thu Mar 5 13:32:58 CST 2015
[0.00] Command line: hung_task_panic=1 earlyprintk=ttyS0,115200 rd.udev.log-priority=err systemd.log_target=journal systemd.log_level=warning debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal root=/dev/ram0 rw link=/kbuild-tests/run-queue/kvm/x86_64-randconfig-s0-03031804/linux-devel:devel-snb-smoke-201503051145:f754909a13e848900abee1014ca29b9b4e33b6ff:bisect-linux-3/.vmlinuz-f754909a13e848900abee1014ca29b9b4e33b6ff-2015030515-16-client6 branch=linux-devel/devel-snb-smoke-201503051145 BOOT_IMAGE=/kernel/x86_64-randconfig-s0-03031804/f754909a13e848900abee1014ca29b9b4e33b6ff/vmlinuz-4.0.0-rc2-00166-gf754909 drbd.minor_count=8
[0.00] KERNEL supported cpus:
[0.00] AMD AuthenticAMD
[0.00] Centaur CentaurHauls
[0.00] CPU: vendor_id 'GenuineIntel' unknown,
using generic init. [0.00] CPU: Your system may be unstable. [0.00] e820: BIOS-provided physical RAM map: [0.00] BIOS-e820: [mem 0x-0x0009fbff] usable [0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved [0.00] BIOS-e820: [mem 0x000f-0x000f] reserved [0.00] BIOS-e820: [mem 0x0010-0x13fd] usable [0.00] BIOS-e820: [mem 0x13fe-0x13ff] reserved [0.00] BIOS-e820: [mem 0xfeffc000-0xfeff] reserved [0.00] BIOS-e820: [mem 0xfffc-0x] reserved [0.00] bootconsole [earlyser0] enabled [0.00] NX (Execute Disable) protection: active [0.00] e820: update [mem 0x01fb21
[cgroup] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives()
Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git revert-295458e67284f57d154ec8156a22797c0cfb044a-295458e67284f57d154ec8156a22797c0cfb044a

commit 295458e67284f57d154ec8156a22797c0cfb044a
Author:     Vladimir Davydov
AuthorDate: Thu Feb 19 17:34:46 2015 +0300
Commit:     Tejun Heo
CommitDate: Mon Mar 2 12:11:01 2015 -0500

    cgroup: call cgroup_subsys->bind on cgroup subsys initialization

    Currently, we call cgroup_subsys->bind only on unmount, remount, and when
    creating a new root on mount. Since the default hierarchy root is created
    in cgroup_init, we will not call cgroup_subsys->bind if the default
    hierarchy is freshly mounted. As a result, some controllers will behave
    incorrectly (most notably, the "memory" controller will not enable
    hierarchy support). Fix this by calling cgroup_subsys->bind right after
    initializing a cgroup subsystem.

    Signed-off-by: Vladimir Davydov
    Signed-off-by: Tejun Heo

+------------------------------------------------------------------+------------+------------+------------+
|                                                                  | 283cb41f42 | 295458e672 | 65cf2c9599 |
+------------------------------------------------------------------+------------+------------+------------+
| boot_successes                                                   | 60         | 0          | 0          |
| boot_failures                                                    | 0          | 20         | 12         |
| WARNING:at_arch/x86/kernel/cpu/common.c:#warn_pre_alternatives() | 0          | 20         | 12         |
| BUG:unable_to_handle_kernel                                      | 0          | 20         | 12         |
| Oops                                                             | 0          | 20         | 12         |
| RIP:cpumask_copy                                                 | 0          | 20         | 12         |
| Kernel_panic-not_syncing:Fatal_exception                         | 0          | 20         | 12         |
| backtrace:async_page_fault                                       | 0          | 20         | 12         |
| backtrace:cgroup_init                                            | 0          | 20         | 12         |
+------------------------------------------------------------------+------------+------------+------------+

[0.020009] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[0.021989] [ cut here ]
[0.022816] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives+0x25/0x2e()
[0.024000] You're using static_cpu_has before alternatives have run!
[0.024000] Modules linked in:
[0.024000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0-rc1-4-g295458e #455
[0.024000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[0.024000] 0009 81e03cc8 81674d02 810ca88e
[0.024000] 81e03d18 81e03d08 81073d6f
[0.024000] 81018f79 81e03e38 0002
[0.024000] Call Trace:
[0.024000] [] dump_stack+0xa0/0xd5
[0.024000] [] ? console_unlock+0x496/0x4ef
[0.024000] [] warn_slowpath_common+0xc8/0xf7
[0.024000] [] ? warn_pre_alternatives+0x25/0x2e
[0.024000] [] warn_slowpath_fmt+0x4f/0x58
[0.024000] [] ? native_iret+0x7/0x7
[0.024000] [] warn_pre_alternatives+0x25/0x2e
[0.024000] [] __do_page_fault+0x2b4/0x7c2
[0.024000] [] do_page_fault+0x3e/0x4a
[0.024000] [] do_async_page_fault+0x3a/0xb9
[x86/xen] WARNING: CPU: 0 PID: 1 at arch/x86/xen/apic.c:73 xen_apic_write()
Greetings, 0day kernel testing robot got the below dmesg and the first bad commit is git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip revert-3f4560207f796d5f79c18329d5a5d383fe3c97bb-3f4560207f796d5f79c18329d5a5d383fe3c97bb commit 3f4560207f796d5f79c18329d5a5d383fe3c97bb Author: Konrad Rzeszutek Wilk AuthorDate: Mon Mar 2 12:06:23 2015 -0500 Commit: David Vrabel CommitDate: Mon Mar 2 17:15:05 2015 + x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs Instead of mangling the default APIC driver, provide a Xen PV guest specific one that explicitly provides appropriate methods. This allows use to report that all APIC IDs are valid, allowing dom0 to boot with more than 255 VCPUs. Since the probe order of APIC drivers is link dependent, we add in an late probe function to change to the Xen PV if it hadn't been done during bootup. Suggested-by: David Vrabel Reported-by: Cathy Avery Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: David Vrabel +--++++ | | dbc36df319 | 3f4560207f | 64abd71342 | +--++++ | boot_successes | 60 | 0 | 0 | | boot_failures| 0 | 20 | 12 | | WARNING:at_arch/x86/xen/apic.c:#xen_apic_write() | 0 | 20 | 12 | | BUG:kernel_boot_hang | 0 | 9 | 2 | | backtrace:native_smp_prepare_cpus| 0 | 20 | 12 | | backtrace:kernel_init_freeable | 0 | 20 | 12 | +--++++ [0.021336] Freeing SMP alternatives memory: 32K (8264c000 - 82654000) [0.027813] Getting VERSION: 0 [0.028005] [ cut here ] [0.028838] WARNING: CPU: 0 PID: 1 at arch/x86/xen/apic.c:73 xen_apic_write+0x15/0x17() [0.032006] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.0.0-rc1-8-g3f45602 #10 [0.033313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [0.035045] 0009 88001144fe58 81b4bc0c 005e [0.036329] 88001144fe98 810729f0 [0.037692] 81009b3b 0008 a108 [0.039067] Call Trace: [0.039506] [] dump_stack+0x4c/0x6e [0.040008] [] warn_slowpath_common+0x92/0xac [0.041048] [] ? 
xen_apic_write+0x15/0x17 [0.042032] [] warn_slowpath_null+0x15/0x17 [0.044007] [] xen_apic_write+0x15/0x17 [0.044964] [] verify_local_APIC+0x50/0x1a5 [0.045980] [] native_smp_prepare_cpus+0x1f9/0x2d2 [0.047093] [] kernel_init_freeable+0x115/0x258 [0.048007] [] ? rest_init+0xbc/0xbc [0.048915] [] kernel_init+0x9/0xd5 [0.049810] [] ret_from_fork+0x7c/0xb0 [0.050753] [] ? rest_init+0xbc/0xbc [0.052021] ---[ end trace 2224f94bfa1995b9 ]--- [0.052835] Getting VERSION: 0 git bisect start 64abd713427959b0c88f3f7ddc3a519d9628 c517d838eb7d07bbe9507871fab3931deccff539 -- git bisect bad 564cdc396432bb58399bee9c85d2f9c9dbd1f4c8 # 02:32 0- 20 Merge 'xen-tip/devel/for-linus-4.1' into devel-xian-x86_64-201503030145 git bisect good 42d429fb535b1ed2a8f2bd64e5e2b0d1507020e8 # 03:30 20+ 0 Merge 'slave-dma/for-linus' into devel-xian-x86_64-201503030145 git bisect good 358928c49c157cfd513af221ccabe22434a63bbe # 03:55 20+ 4 Merge 'cgroup/for-4.1' into devel-xian-x86_64-201503030145 git bisect good 058e1fa5f35fbd876af4e1bcc1f938218a28706e # 04:16 20+ 0 Merge 'wq/for-4.0-fixes' into devel-xian-x86_64-201503030145 git bisect good 06324125b0143ed0efe6c3db9b210ce2fe0f255d # 04:27 20+ 10 xen: synchronize include/xen/interface/xen.h with xen git bisect good f227b2ffd052e52f51c78c692eec4ccfba180d31 # 04:43 20+ 1 xen: use generated hypercall symbols in arch/x86/xen/xen-head.S git bisect bad 3f4560207f796d5f79c18329d5a5d383fe3c97bb # 05:11 0- 11 x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs git bisect good dbc36df3197da8364f9a58f76970968c7862eb60 # 05:45 20+ 0 xen/pciback: Don't print scary messages when unsupported by hypervisor. # first bad commit: [3f4560207f796d5f79c18329d5a5d383fe3c97bb] x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs git bisect good dbc36df3197da8364f9a58f76970968c7862eb60 # 05:48 60+ 0 xen/pciback: Don't print scary messages when unsupported by hypervisor. # extra tests with DEBUG_INFO git bisect good 3f4560207f796d5f79c
Re: [PATCH v2 3/4] cpufreq: mediatek: add Mediatek cpufreq driver
+cc Sascha

On 5 March 2015 at 17:55, Viresh Kumar wrote:
> On 5 March 2015 at 12:57, Pi-Cheng Chen wrote:
>
>> On 4 March 2015 at 19:09, Viresh Kumar wrote:
>> There are 2 clusters, but only the big cluster need to do voltage scaling in the
>> notifier, since the voltage controlling is done by cpufreq-dt driver in this version.
>> Therefore only one dvfs_info struct here.
>
> Do you really think its readable enough that way? You must have added some
> comments on how this is working. Also, what about putting this stuff in your
> regulator driver, so that you don't really have to do this in PRE/POST
> notifiers.

Okay. I will add comments to describe some details about this.
About putting that stuff into the regulator driver, I think you mean creating a
"virtual regulator device" and putting all the voltage controlling complexity into
the driver, right? Maybe it's a good idea in this case, but I am not sure if this
kind of virtual regulator is acceptable. And the flexibility might be an issue,
since we might use a different PMIC for the same SoC on a different board.

> + inter_clk = clk_get(&pdev->dev, NULL);
>>>
>>> How is this supposed to work ? How will pdev->dev give intermediate
>>> clock ?
>>
>> It works with the device tree binding in the 2nd patch of this series, too.
>> Since the cpufreq node is not allowed, would you have some suggestions on
>> how to get the intermediate clock source in this case?
>
> How exactly? I am not doubting your work, just that I don't know how that DT
> binding will reflect here with clock_get for pdev->dev..

Please correct me if I was wrong. IIUC, it does:

clk_get() -> __of_clk_get_by_name() -> __of_clk_get()

The "mtk-cpufreq" device tree node specifies the intermediate clock source in
its "clocks" property. And the pdev here comes from the "mtk-cpufreq" device tree
node, so we can get the "clock specifier" by calling of_parse_phandle_with_args()
to find the "clocks" property in __of_clk_get().

> + pd->independent_clocks = 1,
>>>
>>> s/,/; ??
>> >> It's strange that I didn't get a compiling error here. >> Will fix it. > > Its a perfectly valid statement :) and so no errors. Both will execute as they > will in case of ';', just that output of the later one will be > returned. But there > in no variable on LHS (left-hand-side) and so the value doesn't matter. Thanks for your explanation. :) > >>> Don't want to free OPP table here on error ? >> >> Please correct me if I was wrong. Since the OPP table in the dvfs_info is >> allocated by devm_kzalloc(), it is supposed to be freed if the probe function >> failed, isn't it? >> >> And the OPP table initialized by of_init_opp_table() in cpu_opp_table_init() >> was freed right before the function return since it will be initialized >> again in >> the cpufreq-dt driver. > > Okay, I was talking about this only and I missed it. We probably need to fix > this in OPP library so that multiple callers are allowed. > + dev = platform_device_register_data(NULL, "cpufreq-dt", -1, pd, + sizeof(*pd)); >>> >>> So this routine is going to be called only once. Then how are you >>> initializing stuff >>> for both the clusters in the upper for loop ? It looked very very confusing. >> >> Please let me clarify this here. >> We have two clusters, one for big and another for little cores. For >> the little cores' >> cluster, only one voltage source needs to be controlled when doing CPU DVFS. >> Therefore the voltage scaling of little cores' cluster is done in the >> cpufreq-dt. >> But for the big cores' cluster, there are two voltage sources here to >> be controlled >> and these two voltage source need to be scaled up and down in a SoC specific >> manner which is implemented in the mtk_cpufreq_voltage_trace() function. >> Hence, we put the voltage scaling of big cores' cluster in the cpufreq >> notifier and >> that's also why we need a mtk-cpufreq driver in addition to cpufreq-dt. >> >> In the confusing loop above, I am trying to solve two problems: >> 1. 
to find out which CPUs share the same clock / power domains among all CPUs
>> 2. to initialize the dvfs_info which is only needed by big cores' cluster
>>
>> I think that's why the loop looks so confusing. Maybe doing it in two separate loops
>> will make the code more readable? I'll try it in next version.
>
> Yes.

Combining comments and suggestions from you and Sascha[1], I conclude some
architectural changes are going to be made in the next version:

1. Use set_rate hook instead of determine_rate in clk driver, and switch to
   intermediate PLL parent and back to original CPU PLL parent explicitly in set_rate
2. Therefore we don't need intermediate frequency support in cpufreq-dt to
   implement cpufreq support for Mediatek SoC
3. Use clk notifier to handle voltage controlling corresponding to
   intermediate clock rate
4. Due to 3. we need to move all voltage controlling part back into the
   notifier in mtk-cpufr
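On the `pd->independent_clocks = 1,` exchange earlier in this message: a minimal standalone C sketch of why a trailing comma in place of a semicolon still compiles and still performs both assignments. The struct `pdata_sketch`, its `marker` field and both functions are hypothetical stand-ins for illustration, not the driver's code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the platform data discussed above. */
struct pdata_sketch {
	bool independent_clocks;
	int marker;
};

/*
 * The first line ends in ',' instead of ';', so the compiler joins it
 * with the next expression through the comma operator:
 *
 *     (pd->independent_clocks = true), (pd->marker = 42);
 *
 * Both assignments execute, left to right. The value of the whole
 * expression (that of the right operand) is simply discarded.
 */
void fill_pdata(struct pdata_sketch *pd)
{
	pd->independent_clocks = true,
	pd->marker = 42;
}

/* When the result is used, the comma operator yields its right operand. */
int comma_value(void)
{
	int a = 0;
	int b = (a = 1, a + 1);	/* a is set to 1 first, then b = a + 1 = 2 */

	return a + b;
}
```

This matches the explanation quoted above: the statement is valid, and since nothing sits on the left-hand side to receive the result, the value of the comma expression does not matter.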
Re: [PATCH v4] x86: mce: kexec: switch MCE handler for kexec/kdump
On Thu, Mar 05, 2015 at 01:24:47AM +, Horiguchi Naoya(堀口 直也) wrote:
...
> > Is the "UC" entry at the end of the severities[] table just a catch-all for things that made it
> > past all the other entries? Does it ever really get used?
>
> I read through the severity check table and it seems that all UC=1 case
> are already considered by the above entries, so it seems not used.

I was completely wrong, the "Uncorrected" entry is chosen when mca_cfg.ser is false
(where all checks with SER_REQUIRED are skipped) and UC=1 and OVER=0.
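The selection rule described above can be modelled with a first-match table walk. This is only a sketch under assumptions: the rule table below is a made-up, four-entry illustration and not the kernel's severities[] table, and `sev_rule`, `rules` and `classify()` are hypothetical names. The only parts taken from the message are that entries marked as requiring SER are skipped when SER is unavailable, and that "Uncorrected" then matches for UC=1 and OVER=0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions of the OVER, UC and S flags in IA32_MCi_STATUS. */
#define ST_OVER	(1ULL << 62)
#define ST_UC	(1ULL << 61)
#define ST_S	(1ULL << 56)

/* One rule: it matches when (status & mask) == result. */
struct sev_rule {
	const char *msg;
	uint64_t mask;
	uint64_t result;
	bool ser_required;	/* rule only applies when SER is supported */
};

/* Illustrative rules only; the kernel's real table is longer and differs. */
static const struct sev_rule rules[] = {
	{ "Overflowed uncorrected", ST_OVER | ST_UC, ST_OVER | ST_UC, false },
	{ "SER-only example",       ST_UC | ST_S,    ST_UC | ST_S,    true  },
	{ "Uncorrected",            ST_OVER | ST_UC, ST_UC,           false },
	{ "Some other error",       0,               0,               false },
};

/* Walk the table top to bottom and return the first rule that applies. */
const char *classify(uint64_t status, bool ser)
{
	size_t i;

	for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
		const struct sev_rule *r = &rules[i];

		if (r->ser_required && !ser)
			continue;	/* skipped on machines without SER */
		if ((status & r->mask) == r->result)
			return r->msg;
	}
	return "unreachable";	/* the last rule matches everything */
}
```

In this model, a status with UC and S set is caught by the SER-only rule when `ser` is true, but falls through to "Uncorrected" when `ser` is false, which is the situation the message describes.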
[BUG] uprobe: failed to work on 9pfs
Uprobe uses the inode address to index all registered uprobes in an rb-tree. This works well on most filesystems but fails on 9pfs: 9pfs allocates more than one VFS inode for the same file, so the inode address seen when the uprobe is created is not the same as the inode address seen when the file is run later. As a result, neither perf record nor events/uprobe can capture the predefined uprobe events.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
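The failure reported in the uprobe message above can be illustrated with a small user-space model. It is a sketch under assumptions: `inode_sketch`, `probe_sketch`, the `file_id` field and `probe_matches()` are hypothetical and far simpler than the kernel's structures. The point is only that a lookup keyed by object address misses when the same file is represented by a second in-memory object.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for an in-memory inode. */
struct inode_sketch {
	unsigned long file_id;	/* identifies the underlying file in this model */
};

/* Hypothetical registry entry: the probe is keyed by the inode's address. */
struct probe_sketch {
	const struct inode_sketch *key;
	unsigned long offset;
};

/* Address-keyed match: succeeds only for the exact same in-memory object. */
bool probe_matches(const struct probe_sketch *p,
		   const struct inode_sketch *inode, unsigned long offset)
{
	return p->key == inode && p->offset == offset;
}
```

If two `inode_sketch` objects carry the same `file_id` but live at different addresses, a probe registered against the first one never matches a lookup made with the second, even though both describe the same file.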
linux-next: Tree for Mar 6
Hi all, Changes since 20150305: The net-next tree lost its build failure. The vhost tree gained a conflict against the virtio tree. Non-merge commits (relative to Linus' tree): 2757 2807 files changed, 87638 insertions(+), 60274 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a multi_v7_defconfig for arm. After the final fixups (if any), it is also built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and allyesconfig (this fails its final link) and i386, sparc, sparc64 and arm defconfig. Below is a summary of the state of the merge. I am currently merging 207 trees (counting Linus' and 30 trees of patches pending for Linus' tree). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. 
-- Cheers, Stephen Rothwells...@canb.auug.org.au $ git checkout master $ git reset --hard stable Merging origin/master (99aedde0869c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip) Merging fixes/master (b94d525e58dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net) Merging kbuild-current/rc-fixes (c517d838eb7d Linux 4.0-rc1) Merging arc-current/for-curr (2ce7598c9a45 Linux 3.17-rc4) Merging arm-current/fixes (23be7fdafa50 ARM: 8305/1: DMA: Fix kzalloc flags in __iommu_alloc_buffer()) Merging m68k-current/for-linus (4436820a98cd m68k/defconfig: Enable Ethernet bridging) Merging metag-fixes/fixes (c2996cb29bfb metag: Fix KSTK_EIP() and KSTK_ESP() macros) Merging mips-fixes/mips-fixes (1795cd9b3a91 Linux 3.16-rc5) Merging powerpc-merge/merge (c517d838eb7d Linux 4.0-rc1) Merging powerpc-merge-mpe/fixes (4ad04e598711 powerpc/iommu: Remove IOMMU device references via bus notifier) Merging sparc/master (53eb2516972b sparc: semtimedop() unreachable due to comparison error) Merging net/master (b0ab0afaebc8 net: eth: xgene: fix booting with devicetree) Merging ipsec/master (ac37e2515c1a xfrm: release dst_orig in case of error in xfrm_lookup()) Merging sound-current/for-linus (f44f07cf3910 ALSA: line6: Clamp values correctly) Merging pci-current/for-linus (4efe874aace5 PCI: Don't read past the end of sysfs "driver_override" buffer) Merging wireless-drivers/master (c8f034558669 rtlwifi: Improve handling of IPv6 packets) Merging driver-core.current/driver-core-linus (c517d838eb7d Linux 4.0-rc1) Merging tty.current/tty-linus (c517d838eb7d Linux 4.0-rc1) Merging usb.current/usb-linus (d3d5389475e8 Merge tag 'usb-serial-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus) Merging usb-gadget-fixes/fixes (a0456399fb07 usb: gadget: configfs: don't NUL-terminate (sub)compatible ids) Merging usb-serial-fixes/usb-linus (c7d373c3f0da usb: ftdi_sio: Add jtag quirk support for Cyber Cortex AV 
boards) Merging staging.current/staging-linus (abe46b8932dd staging: comedi: adv_pci1710: fix AI INSN_READ for non-zero channel) Merging char-misc.current/char-misc-linus (6c15a8516b81 mei: make device disabled on stop unconditionally) Merging input-current/for-linus (20f02d66f042 Input: tc3589x-keypad - set IRQF_ONESHOT flag to ensure IRQ request) Merging crypto-current/master (001eabfd54c0 crypto: arm/aes update NEON AES module to latest OpenSSL version) Merging ide/master (f96fe225677b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net) Merging devicetree-current/devicetree/merge (6b1271de3723 of/unittest: Overlays with sub-devices tests) Merging rr-fixes/fixes (f47689345931 lguest: update help text.) Merging vfio-fixes/for-linus (7c2e211f3c95 vfio-pci: Fix the check on pci device type in vfio_pci_probe()) Merging kselftest-fixes/fixes (f5db310d77ef selftests/vm: fix link error for transhuge-stress test) Merging drm-intel-fixes/for-linux-next-fixes (ab3be73fa7b4 drm/i915: gen4: work around hang during hibernation) Merging asm-generic/master (643165c8bbc
Re: [PATCH v2] f2fs: fix max orphan inodes calculation
Hi Changman, On Fri, Mar 06, 2015 at 11:37:28AM +0800, Chao Yu wrote: >Hi Changman, > >> -Original Message- >> From: Changman Lee [mailto:cm224@samsung.com] >> Sent: Tuesday, March 03, 2015 9:40 AM >> To: linux-f2fs-de...@lists.sourceforge.net >> Cc: Jaegeuk Kim; Chao Yu; linux-fsde...@vger.kernel.org; >> linux-kernel@vger.kernel.org >> Subject: Re: [PATCH v2] f2fs: fix max orphan inodes calculation >> >> On Fri, Feb 27, 2015 at 05:38:13PM +0800, Wanpeng Li wrote: >> > cp_payload is introduced for sit bitmap to support large volume, and it is >> > just after the block of f2fs_checkpoint + nat bitmap, so the first segment >> > should include F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload + orphan blocks. >> > However, current max orphan inodes calculation don't consider cp_payload, >> > this patch fix it by reducing the number of cp_payload from total blocks of >> > the first segment when calculate max orphan inodes. >> > >> > Signed-off-by: Wanpeng Li >> > --- >> > v1 -> v2: >> > * adjust comments above the codes >> > * fix coding style issue >> > >> > fs/f2fs/checkpoint.c | 12 +++- >> > 1 file changed, 7 insertions(+), 5 deletions(-) >> > >> > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c >> > index db82e09..a914e99 100644 >> > --- a/fs/f2fs/checkpoint.c >> > +++ b/fs/f2fs/checkpoint.c >> > @@ -1103,13 +1103,15 @@ void init_ino_entry_info(struct f2fs_sb_info *sbi) >> >} >> > >> >/* >> > - * considering 512 blocks in a segment 8 blocks are needed for cp >> > - * and log segment summaries. Remaining blocks are used to keep >> > - * orphan entries with the limitation one reserved segment >> > - * for cp pack we can have max 1020*504 orphan entries >> > + * considering 512 blocks in a segment 8+cp_payload blocks are >> > + * needed for cp and log segment summaries. 
Remaining blocks are >> > + * used to keep orphan entries with the limitation one reserved >> > + * segment for cp pack we can have max 1020*(504-cp_payload) >> > + * orphan entries >> > */ >> >> Hi all, >> >> I think below code give us information enough so it doesn't need to >> describe above comments. And someone could get confused by 1020 constants. >> How do you think about removing comments. > >I agree with you. > >There are nothing special need to be pay attention for the below statement, >all meaning of statement could be easily readed as each macro in statement >can indicate meaning of itself clearly. > >So could you send another patch to remove it? Agreed. You can cleanup it. ;-) Regards, Wanpeng Li > >Thanks, > >> >> Regards, >> Changman >> >> >sbi->max_orphans = (sbi->blocks_per_seg - F2FS_CP_PACKS - >> > - NR_CURSEG_TYPE) * F2FS_ORPHANS_PER_BLOCK; >> > + NR_CURSEG_TYPE - __cp_payload(sbi)) * >> > + F2FS_ORPHANS_PER_BLOCK; >> > } >> > >> > int __init create_checkpoint_caches(void) >> > -- >> > 1.9.1
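The arithmetic behind the quoted comment and the changed statement can be worked through as a short sketch. The macro names mirror the quoted hunk, but the numeric values are assumptions chosen to agree with the comment (512 blocks per segment, 2 + 6 = 8 blocks for cp packs and log segment summaries, 1020 orphan entries per block); `max_orphans_old()` and `max_orphans_new()` are hypothetical helper names.

```c
/*
 * Assumed values, consistent with the quoted comment:
 * 512 blocks per segment, 8 blocks reserved (2 cp packs + 6 current
 * segment summaries), 1020 orphan entries per block.
 */
#define BLOCKS_PER_SEG		512U
#define F2FS_CP_PACKS		2U
#define NR_CURSEG_TYPE		6U
#define F2FS_ORPHANS_PER_BLOCK	1020U

/* Old formula: ignores the cp_payload blocks. */
unsigned int max_orphans_old(void)
{
	return (BLOCKS_PER_SEG - F2FS_CP_PACKS - NR_CURSEG_TYPE) *
		F2FS_ORPHANS_PER_BLOCK;
}

/* Patched formula: cp_payload blocks no longer count as orphan space. */
unsigned int max_orphans_new(unsigned int cp_payload)
{
	return (BLOCKS_PER_SEG - F2FS_CP_PACKS - NR_CURSEG_TYPE - cp_payload) *
		F2FS_ORPHANS_PER_BLOCK;
}
```

With these assumed values the old limit is 1020 * 504 = 514080 entries. For a cp_payload of 2 blocks the patched limit becomes 1020 * 502 = 512040, so each payload block lowers the limit by 1020 entries.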
Re: [PATCH 1/5] mtd: nand: vf610_nfc: Freescale NFC for VF610, MPC5125 and others
On Thu, Mar 05, 2015 at 12:10:20AM +0100, Stefan Agner wrote: > This driver supports Freescale NFC (NAND flash controller) found on > Vybrid (VF610), MPC5125, MCF54418 and Kinetis K70. > > Limitations: > - DMA and pipelining not used > - Pages larger than 2k are not supported > - No hardware ECC > > The driver has only been tested on Vybrid (VF610). > > Signed-off-by: Bill Pringlemeir > Signed-off-by: Stefan Agner > --- > arch/arm/mach-imx/Kconfig| 1 + This change shouldn't be part of driver patch. Shawn > drivers/mtd/nand/Kconfig | 12 + > drivers/mtd/nand/Makefile| 1 + > drivers/mtd/nand/vf610_nfc.c | 730 > +++ > 4 files changed, 744 insertions(+) > create mode 100644 drivers/mtd/nand/vf610_nfc.c > > diff --git a/arch/arm/mach-imx/Kconfig b/arch/arm/mach-imx/Kconfig > index e8627e0..de4a51a 100644 > --- a/arch/arm/mach-imx/Kconfig > +++ b/arch/arm/mach-imx/Kconfig > @@ -634,6 +634,7 @@ config SOC_VF610 > select ARM_GIC > select PINCTRL_VF610 > select PL310_ERRATA_769419 if CACHE_L2X0 > + select HAVE_NAND_VF610_NFC > > help > This enable support for Freescale Vybrid VF610 processor. > diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig > index 5b76a17..1be30a6 100644 > --- a/drivers/mtd/nand/Kconfig > +++ b/drivers/mtd/nand/Kconfig > @@ -455,6 +455,18 @@ config MTD_NAND_MPC5121_NFC > This enables the driver for the NAND flash controller on the > MPC5121 SoC. > > +config HAVE_NAND_VF610_NFC > + bool > + > +config MTD_NAND_VF610_NFC > + tristate "Support for Freescale NFC for VF610/MPC5125" > + depends on HAVE_NAND_VF610_NFC > + help > + Enables support for NAND Flash Controller on some Freescale > + processors like the VF610, MPC5125, MCF54418 or Kinetis K70. > + The driver supports a maximum 2k page size. The driver > + currently does not support hardware ECC. 
> + > config MTD_NAND_MXC > tristate "MXC NAND support" > depends on ARCH_MXC > diff --git a/drivers/mtd/nand/Makefile b/drivers/mtd/nand/Makefile > index 582bbd05..e97ca7b 100644 > --- a/drivers/mtd/nand/Makefile > +++ b/drivers/mtd/nand/Makefile > @@ -45,6 +45,7 @@ obj-$(CONFIG_MTD_NAND_SOCRATES) += > socrates_nand.o > obj-$(CONFIG_MTD_NAND_TXX9NDFMC) += txx9ndfmc.o > obj-$(CONFIG_MTD_NAND_NUC900)+= nuc900_nand.o > obj-$(CONFIG_MTD_NAND_MPC5121_NFC) += mpc5121_nfc.o > +obj-$(CONFIG_MTD_NAND_VF610_NFC) += vf610_nfc.o > obj-$(CONFIG_MTD_NAND_RICOH) += r852.o > obj-$(CONFIG_MTD_NAND_JZ4740)+= jz4740_nand.o > obj-$(CONFIG_MTD_NAND_GPMI_NAND) += gpmi-nand/ > diff --git a/drivers/mtd/nand/vf610_nfc.c b/drivers/mtd/nand/vf610_nfc.c > new file mode 100644 > index 000..101fd20 > --- /dev/null > +++ b/drivers/mtd/nand/vf610_nfc.c > @@ -0,0 +1,730 @@ > +/* > + * Copyright 2009-2015 Freescale Semiconductor, Inc. and others > + * > + * Description: MPC5125, VF610, MCF54418 and Kinetis K70 Nand driver. > + * Jason ported to M54418TWR and MVFA5 (VF610). > + * Authors: Stefan Agner > + * Bill Pringlemeir > + * Shaohui Xie > + * Jason Jin > + * > + * Based on original driver mpc5121_nfc.c. > + * > + * This is free software; you can redistribute it and/or modify it > + * under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * Limitations: > + * - Untested on MPC5125 and M54418. > + * - DMA not used. > + * - 2K pages or less. 
> + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#define DRV_NAME"vf610_nfc" > + > +/* Register Offsets */ > +#define NFC_FLASH_CMD1 0x3F00 > +#define NFC_FLASH_CMD2 0x3F04 > +#define NFC_COL_ADDR 0x3F08 > +#define NFC_ROW_ADDR 0x3F0c > +#define NFC_ROW_ADDR_INC 0x3F14 > +#define NFC_FLASH_STATUS10x3F18 > +#define NFC_FLASH_STATUS20x3F1c > +#define NFC_CACHE_SWAP 0x3F28 > +#define NFC_SECTOR_SIZE 0x3F2c > +#define NFC_FLASH_CONFIG 0x3F30 > +#define NFC_IRQ_STATUS 0x3F38 > + > +/* Addresses for NFC MAIN RAM BUFFER areas */ > +#define NFC_MAIN_AREA(n) ((n) * 0x1000) > + > +#define PAGE_2K 0x0800 > +#define OOB_64 0x0040 > + > +/* > + * NFC_CMD2[CODE] values. See section: > + * - 31.4.7 Flash Command Code Description, Vybrid manual > + * - 23.8.6 Flash Command Sequencer, MPC5125 manual > + * > + * Briefly these are bitmasks of controller cycles. > + */ > +#define READ_PAGE_CMD_CODE 0x7EE0 > +#define PROGRAM
Re: [PATCH] pci: host: xgene: fix incorrectly returned address by map_bus
On Tue, Feb 17, 2015 at 03:14:00PM -0800, Feng Kan wrote: > The generic accessor functions for pci-xgene uses map_bus > call that returns the base address but did not add the additional > offset. > > Signed-off-by: Feng Kan Applied to for-linus for v4.0, with acks from Tanmay and Rob. Thanks! > --- > drivers/pci/host/pci-xgene.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/host/pci-xgene.c b/drivers/pci/host/pci-xgene.c > index aab5547..ee082c0 100644 > --- a/drivers/pci/host/pci-xgene.c > +++ b/drivers/pci/host/pci-xgene.c > @@ -127,7 +127,7 @@ static bool xgene_pcie_hide_rc_bars(struct pci_bus *bus, > int offset) > return false; > } > > -static int xgene_pcie_map_bus(struct pci_bus *bus, unsigned int devfn, > +static void __iomem *xgene_pcie_map_bus(struct pci_bus *bus, unsigned int > devfn, > int offset) > { > struct xgene_pcie_port *port = bus->sysdata; > @@ -137,7 +137,7 @@ static int xgene_pcie_map_bus(struct pci_bus *bus, > unsigned int devfn, > return NULL; > > xgene_pcie_set_rtdid_reg(bus, devfn); > - return xgene_pcie_get_cfg_base(bus); > + return xgene_pcie_get_cfg_base(bus) + offset; > } > > static struct pci_ops xgene_pcie_ops = { > -- > 1.9.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
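For anybody following along, the effect of the missing offset can be sketched in plain user-space C. This is only a model, not the driver: cfg_window, map_bus_buggy(), map_bus_fixed() and config_read_byte() are hypothetical names, and the byte array stands in for the mapped config space. It assumes the generic accessor dereferences whatever address map_bus returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the port's mapped config window. */
static uint8_t cfg_window[4096];

/* Before the fix: the offset argument is ignored, so every access hits register 0. */
static uint8_t *map_bus_buggy(int offset)
{
	(void)offset;
	return cfg_window;
}

/* After the fix: the returned address points at the requested register. */
static uint8_t *map_bus_fixed(int offset)
{
	return cfg_window + offset;
}

/* Model of a generic 1-byte config read: map the address, then dereference it. */
static uint8_t config_read_byte(uint8_t *(*map_bus)(int), int offset)
{
	uint8_t *addr = map_bus(offset);

	/* A NULL mapping is reported as all-ones, like a failed config read. */
	return addr ? *addr : 0xff;
}
```

With the buggy variant a read at any offset returns the byte at offset 0, which is why the generic accessors need the offset folded into the returned address.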
Re: [PATCH] pci: host: xgene: fix incorrectly returned address by map_bus
On Thu, Mar 05, 2015 at 02:57:55PM -0600, Rob Herring wrote: > On Thu, Mar 5, 2015 at 10:38 AM, Bjorn Helgaas wrote: > > [+cc Mark] > > > > On Thu, Feb 26, 2015 at 06:21:51PM -0600, Bjorn Helgaas wrote: > >> On Tue, Feb 17, 2015 at 03:14:00PM -0800, Feng Kan wrote: > >> > The generic accessor functions for pci-xgene uses map_bus > >> > call that returns the base address but did not add the additional > >> > offset. > >> > > >> > Signed-off-by: Feng Kan > >> > ... > >> > @@ -137,7 +137,7 @@ static int xgene_pcie_map_bus(struct pci_bus *bus, > >> > unsigned int devfn, > >> > return NULL; > >> > > >> > xgene_pcie_set_rtdid_reg(bus, devfn); > >> > - return xgene_pcie_get_cfg_base(bus); > >> > + return xgene_pcie_get_cfg_base(bus) + offset; > >> > >> Where's the locking here? ECAM doesn't need locking because the > >> bus/dev/fn/offset is all encoded in the MMIO address. But it looks > >> like X-Gene doesn't work that way and bus/dev/fn is in the RTDID register. > >> > >> So it seems like X-Gene needs locking that not everybody needs. Are you > >> relying on higher-level locking somewhere? > >> ... > > There's no locking problem. The config accesses are all within the > pci_lock spinlock and nothing else touches that register. M. Yes, you're right. pci_bus_{read,write}_config_{byte,word,dword}() all acquire pci_lock. For anybody following along at home, here's the path I was concerned about: pci_read_config_byte pci_bus_read_config_byte lock(&pci_lock) # acquire pci_lock bus->ops->read/write# struct pci_ops pci_generic_config_read # gen_pci_ops bus->ops->map_bus xgene_pcie_map_bus# xgene_pcie_ops xgene_pcie_set_rtdid_reg writel# requires mutex readb # config read I'm not exactly sure *why* we do locking there, other than we're just too scared to change it. As far as I know, methods like ECAM shouldn't require that lock, so it's sort of a shame to do it at the top level like that. 
Some of the low-level routines, like pci_{conf1,conf2,bios}, also use a lock (pci_config_lock in these cases). We do need it there because a few paths do call the low-level routines directly. Here's a typical path on x86: pci_read_config_byte pci_bus_read_config_byte lock(&pci_lock) # acquire pci_lock bus->ops->read/write# struct pci_ops pci_read # x86 pci_root_ops raw_pci_read raw_pci_ops->read pci_conf1_read # x86 raw_pci_ops lock(&pci_config_lock)# acquire pci_config_lock And here's a path on x86 that uses the low-level routines directly and requires the locking there: acpi_os_read_pci_configuration raw_pci_read raw_pci_ops->read pci_conf1_read lock(&pci_config_lock) So ideally I think the locking would be done in the low-level routines that need it, and we could do without pci_lock. But I don't know whether that's practical at this point or not. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESSION in 3.18][PPC] PA Semi fails to boot after: of/base: Fix PowerPC address parsing hack
On Thu, 2015-03-05 at 17:12 -0500, Steven Rostedt wrote: > A bug in ftrace was reported to me that affects ARM and ARM64 but not > x86. Looking at the code it appears to affect PowerPC as well. So I > booted up my old PA Semi, to give it a try. The last time I booted it > was for a 3.17 kernel. Unfortunately, for 4.0-rc2 it crashed with: Argh. Well, we have one of these here but Michael who owns it is off til Tuesday. Can you shoot me the DT (/proc/device-tree in a tarball) ? Olof, can the DT be updated on this thing or should we add workarounds to Linux if something is really missing ? Cheers, Ben. > Unable to handle kernel paging request for data at address 0x > Faulting instruction address: 0xc05cef88 > Oops: Kernel access of bad area, sig: 11 [#1] > SMP NR_CPUS=2 PA Semi PWRficient > Modules linked in: > CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.0.0-rc2-test #50 > task: c0003816cb60 ti: c000381a4000 task.ti: c000381a4000 > NIP: c05cef88 LR: c007c1a0 CTR: c007c184 > REGS: c000381a7a00 TRAP: 0300 Not tainted (4.0.0-rc2-test) > MSR: 90009032 CR: 2228 XER: > DAR: DSISR: 4000 SOFTE: 0 > GPR00: c007c1a0 c000381a7c80 c0af4b98 0001 > GPR04: 04ba 3d6de000 > GPR08: 0100 c000381a4080 > GPR12: 24044042 c300 ffed > GPR16: c0afb920 c000381a4000 c09ad648 c09ae580 > GPR20: c000381a4080 c000381a4000 c000381a4080 c000381a4000 > GPR24: c000381a4000 c000381a4000 c0afb880 c000381a4000 > GPR28: c09f8790 c000381a4000 c0b02168 > NIP [c05cef88] .check_astate+0x28/0x50 > LR [c007c1a0] sleep_common+0x14/0x74 > Call Trace: > [c000381a7c80] [c0afb880] 0xc0afb880 (unreliable) > [c000381a7cf0] [c007c1a0] sleep_common+0x14/0x74 > [c000381a7d30] [c00130f0] .arch_cpu_idle+0x70/0x160 > [c000381a7db0] [c00d6660] .cpu_startup_entry+0x320/0x5a0 > [c000381a7ee0] [c0034570] .start_secondary+0x290/0x2c0 > [c000381a7f90] [c0008bfc] start_secondary_prolog+0x10/0x14 > Instruction dump: > 6000 6000 7c0802a6 f8010010 f821ff91 6000 6000 3d220003 > 39296870 a86d0038 e9290010 7c0004ac <7c004c2c> 0c00 
4c00012c 5463103a > ---[ end trace 40e864a431826b26 ]--- > > I kicked off a ktest bisect, and it came down to this commit: > > commit 746c9e9f92dde2789908e51a354ba90a1962a2eb > Author: Benjamin Herrenschmidt > Date: Fri Nov 14 17:55:03 2014 +1100 > > of/base: Fix PowerPC address parsing hack > > When I revert this from v4.0-rc2, I can successfully boot my PA Semi > again. > > -- Steve -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: NMI watchdog triggering during load_balance
On Thu, 2015-03-05 at 21:05 -0700, David Ahern wrote: > Hi Peter/Mike/Ingo: > > I've been banging my against this wall for a week now and hoping you or > someone could shed some light on the problem. > > On larger systems (256 to 1024 cpus) there are several use cases (e.g., > http://www.cs.virginia.edu/stream/) that regularly trigger the NMI > watchdog with the stack trace: > > Call Trace: > @ [0045d3d0] double_rq_lock+0x4c/0x68 > @ [004699c4] load_balance+0x278/0x740 > @ [008a7b88] __schedule+0x378/0x8e4 > @ [008a852c] schedule+0x68/0x78 > @ [0042c82c] cpu_idle+0x14c/0x18c > @ [008a3a50] after_lock_tlb+0x1b4/0x1cc > > Capturing data for all CPUs I tend to see load_balance related stack > traces on 700-800 cpus, with a few hundred blocked on _raw_spin_trylock_bh. > > I originally thought it was a deadlock in the rq locking, but if I bump > the watchdog timeout the system eventually recovers (after 10-30+ > seconds of unresponsiveness) so it does not seem likely to be a deadlock. > > This particluar system has 1024 cpus: > # lscpu > Architecture: sparc64 > CPU op-mode(s):32-bit, 64-bit > Byte Order:Big Endian > CPU(s):1024 > On-line CPU(s) list: 0-1023 > Thread(s) per core:8 > Core(s) per socket:4 > Socket(s): 32 > NUMA node(s): 4 > NUMA node0 CPU(s): 0-255 > NUMA node1 CPU(s): 256-511 > NUMA node2 CPU(s): 512-767 > NUMA node3 CPU(s): 768-1023 > > and there are 4 scheduling domains. An example of the domain debug > output (condensed for the email): > > CPU970 attaching sched-domain: > domain 0: span 968-975 level SIBLING >groups: 8 single CPU groups >domain 1: span 968-975 level MC > groups: 1 group with 8 cpus > domain 2: span 768-1023 level CPU > groups: 4 groups with 256 cpus per group Wow, that topology is horrid. I'm not surprised that your box is writhing in agony. Can you twiddle that? 
-Mike
Re: [PATCH 12/38] perf tools: Introduce thread__comm_time() helpers
Hi Frederic and Arnaldo, On Thu, Mar 05, 2015 at 05:08:56PM +0100, Frederic Weisbecker wrote: > On Wed, Mar 04, 2015 at 09:02:55AM +0900, Namhyung Kim wrote: > > Hi Frederic, > > > > On Tue, Mar 03, 2015 at 05:28:40PM +0100, Frederic Weisbecker wrote: > > > On Tue, Mar 03, 2015 at 12:07:24PM +0900, Namhyung Kim wrote: > > > > When data file indexing is enabled, it processes all task, comm and mmap > > > > events first and then goes to the sample events. So all it sees is the > > > > last comm of a thread although it has information at the time of sample. > > > > > > > > Sort thread's comm by time so that it can find appropriate comm at the > > > > sample time. The thread__comm_time() will mostly work even if > > > > PERF_SAMPLE_TIME bit is off since in that case, sample->time will be > > > > -1 so it'll take the last comm anyway. > > > > > > > > Cc: Frederic Weisbecker > > > > Signed-off-by: Namhyung Kim > > > > --- > > > > tools/perf/util/thread.c | 33 - > > > > tools/perf/util/thread.h | 2 ++ > > > > 2 files changed, 34 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c > > > > index 9ebc8b1f9be5..ad96725105c2 100644 > > > > --- a/tools/perf/util/thread.c > > > > +++ b/tools/perf/util/thread.c > > > > @@ -103,6 +103,21 @@ struct comm *thread__exec_comm(const struct thread > > > > *thread) > > > > return last; > > > > } > > > > > > > > +struct comm *thread__comm_time(const struct thread *thread, u64 > > > > timestamp) > > > > > > Usually thread__comm_foo() would suggest that we return the "foo" from a > > > thread comm. > > > For example thread__comm_len() returns the len of the last thread comm. > > > thread__comm_str() returns the string of the last thread comm. > > > > Ah, okay. > > I mean, that's just an impression, others may have a different one :o) Right. 
Although I agree with your idea of function naming, I'm not sure it's worth changing every function call site for this - and for similar machine__find(new)_thread()_time. Arnaldo, What do you think? Thanks, Namhyung
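The time-based lookup described in the patch can be sketched roughly as below. This is a simplified model, not the perf code: struct comm_entry and comm_at_time() are hypothetical names, and the entries are assumed to be kept in an array sorted by ascending start time, which may differ from how perf stores them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified comm entry: a name and the time it took effect. */
struct comm_entry {
	const char *str;
	uint64_t start;
};

/*
 * Entries are sorted by ascending start time. Return the newest entry whose
 * start time is not after the timestamp, or the oldest entry if none match.
 */
static const struct comm_entry *
comm_at_time(const struct comm_entry *comms, size_t nr, uint64_t timestamp)
{
	size_t i;

	if (nr == 0)
		return NULL;

	/* Walk from newest to oldest and stop at the first entry in effect. */
	for (i = nr; i > 0; i--) {
		if (comms[i - 1].start <= timestamp)
			return &comms[i - 1];
	}
	return &comms[0];
}
```

A timestamp of (u64)-1, which is what the patch description says sample->time holds when PERF_SAMPLE_TIME is off, compares greater than or equal to every start time, so the lookup falls through to the last comm.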
Re: [PATCH] ARM: dts: imx: Add dr_mode host setting to all host-only usb instances
On Fri, Feb 27, 2015 at 09:06:00AM -0500, Matt Porter wrote:
> The chipidea driver adds an extra line of spam to the log when a
> host-only chipidea instance is left set to the default of a dual role
> controller.
>
> [2.010873] ci_hdrc ci_hdrc.1: doesn't support gadget
>
> Set the dr_mode property to host on all the host-only nodes
> to avoid this warning.
>
> Signed-off-by: Matt Porter

Applied, thanks.
Regression caused by using inode_to_bdi()
Hi, Christoph Hellwig

resend: + cc lkml

I found a regression in v4.0-rc1 caused by this patch:

  Author: Christoph Hellwig
  Date: Wed Jan 14 10:42:36 2015 +0100

  fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info

The test process is as follows:

  2015-02-25 15:50:22: Start
  2015-02-25 15:50:22: Linux version:Linux btrfs 4.0.0-rc1_HEAD_c517d838eb7d07bbe9507871fab3931deccff539_ #1 SMP Wed Feb 25 10:59:10 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
  2015-02-25 15:50:25: mkfs.btrfs -f /dev/sdb1
  2015-02-25 15:50:27: mount /dev/sdb1 /data/ltf/tester
  2015-02-25 15:50:28: sysbench --test=fileio --num-threads=1 --file-num=1 --file-block-size=32768 --file-total-size=4G --file-test-mode=seqwr --file-io-mode=sync --file-extra-flags= --file-fsync-freq=0 --file-fsync-end=off --max-requests=131072
  2015-02-25 15:51:40: done sysbench

The results are as follows:

  v3.19-rc1: testcnt=40 average=135.677 range=[132.460,139.130] stdev=1.610 cv=1.19%
  v4.0-rc1:  testcnt=40 average=130.970 range=[127.980,132.050] stdev=1.012 cv=0.77%

I then bisected the above case between v3.19-rc1 and v4.0-rc1 and found
that this patch caused the regression.

Maybe it is because the kernel needs more time to call inode_to_bdi() than
the old code, which used inode->i_mapping->backing_dev_info directly.

Is there some way to speed it up (inline it, or access some field in the
struct directly, ...)?

Thanks
Zhaolei
Re: [PATCH v9 00/21] Introduce ACPI for ARM64 based on ACPI 5.1
On 2015/3/6 2:57, Olof Johansson wrote:
> Hi,

Hi Olof,

>
> On Wed, Feb 25, 2015 at 04:39:40PM +0800, Hanjun Guo wrote:
>> Changes since v8:
>> - remove MPIDR packing things by introducing phys_cpuid_t;
>>
>> - update patch acpi: fix acpi_os_ioremap for arm64 to follow
>>   Rafael's suggestion;
>>
>> - Squash patch (disable ACPI if ACPI less than 5.1) to patch
>>   (Get RSDP and ACPI boot-time table);
>>
>> - Move sleep_arm.c to arch/arm64/ and rename it as acpi_sleep.c
>>
>> - Rework the uefi generated empty dtb to enable acpi when no dtb
>>   is available, thanks Ard for the updated patch.
>>
>> - rework the function of register cpu for kexec case
>>
>> - use pr_debug() instead of pr_info() when scanning MADT.
>>
>> - rebase on top of 4.0-rc1
>>
> I've looked at most of the arch code besides GIC and some of the timer stuff,
> which I might revisit later, but the pieces I've seen seem reasonable. I've
> acked individual patches.

Thank you very much for the ACKs and review comments!

>
> There are some cleanups to be made, but that can be done incrementally on top,
> it's all internal implementation details.

Definitely in my TODO list :)

>
> I also haven't looked closely at the documentation patches yet, so I might have
> some comments on those showing up.

OK, thanks.

Regards
Hanjun
Re: [PATCH 08/38] perf record: Add --index option for building index table
On Thu, Mar 05, 2015 at 08:56:44AM +0100, Jiri Olsa wrote: > On Tue, Mar 03, 2015 at 12:07:20PM +0900, Namhyung Kim wrote: > > SNIP > > > +static int record__merge_index_files(struct record *rec, int nr_index) > > +{ > > + int i; > > + int ret = -1; > > + u64 offset; > > + char path[PATH_MAX]; > > + struct perf_file_section *idx; > > + struct perf_data_file *file = &rec->file; > > + struct perf_session *session = rec->session; > > + int output_fd = perf_data_file__fd(file); > > + > > + /* +1 for header file itself */ > > + nr_index++; > > + > > + idx = calloc(nr_index, sizeof(*idx)); > > + if (idx == NULL) > > + goto out_close; > > + > > + offset = lseek(output_fd, 0, SEEK_END); > > + > > + idx[0].offset = session->header.data_offset; > > + idx[0].size = offset - idx[0].offset; > > + > > + for (i = 1; i < nr_index; i++) { > > + struct stat stbuf; > > + int fd = rec->fds[i]; > > + > > + if (fstat(fd, &stbuf) < 0) > > + goto out_close; > > + > > + idx[i].offset = offset; > > + idx[i].size = stbuf.st_size; > > + > > + offset += stbuf.st_size; > > + } > > + > > + /* copy sample events */ > > + for (i = 1; i < nr_index; i++) { > > + int fd = rec->fds[i]; > > + > > + if (idx[i].size == 0) > > + continue; > > + > > + if (copyfile_offset(fd, 0, output_fd, idx[i].offset, > > + idx[i].size) < 0) > > + goto out_close; > > + } > > why not do the copy in previous loop as well? I will change it in the next version. Thanks, Namhyung -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
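A rough sketch of what folding the two loops into one might look like, reduced to the offset bookkeeping. struct section and layout_sections() are hypothetical stand-ins (the real code uses struct perf_file_section, fstat() and copyfile_offset()), and the sizes are passed in as an array instead of being read from file descriptors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct perf_file_section. */
struct section {
	uint64_t offset;
	uint64_t size;
};

/*
 * Lay out sections 1..nr-1 back to back, starting at 'offset'.
 * sizes[i] is the size of the i-th per-cpu/thread file; entry 0 (the
 * header file itself) is assumed to be filled in by the caller.
 * Returns the offset just past the last section.
 */
static uint64_t layout_sections(struct section *idx, const uint64_t *sizes,
				size_t nr, uint64_t offset)
{
	size_t i;

	for (i = 1; i < nr; i++) {
		idx[i].offset = offset;
		idx[i].size = sizes[i];

		/*
		 * In a single-loop version the copy of the sample data
		 * would happen here, skipped when sizes[i] is zero.
		 */
		offset += sizes[i];
	}
	return offset;
}
```

Since each section's offset is known as soon as its size is, nothing in the layout itself forces a second pass.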
Re: [PATCH 07/38] perf tools: Handle indexed data file properly
Hi Jiri, On Wed, Mar 04, 2015 at 05:19:54PM +0100, Jiri Olsa wrote: > On Tue, Mar 03, 2015 at 12:07:19PM +0900, Namhyung Kim wrote: > > When perf detects data file has index table, process header part first > > and then rest data files in a row. Note that the indexed sample data is > > recorded for each cpu/thread separately, it's already ordered with > > respect to themselves so no need to use the ordered event queue > > interface. > > > > Signed-off-by: Namhyung Kim > > --- > > tools/perf/util/session.c | 62 > > ++- > > tools/perf/util/session.h | 5 > > 2 files changed, 55 insertions(+), 12 deletions(-) > > > > diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c > > index e4f166981ff0..00cd1ad427be 100644 > > --- a/tools/perf/util/session.c > > +++ b/tools/perf/util/session.c > > @@ -1300,11 +1300,10 @@ fetch_mmaped_event(struct perf_session *session, > > #define NUM_MMAPS 128 > > #endif > > > > -static int __perf_session__process_events(struct perf_session *session, > > +static int __perf_session__process_events(struct perf_session *session, > > int fd, > > u64 data_offset, u64 data_size, > > u64 file_size, struct perf_tool *tool) > > { > > - int fd = perf_data_file__fd(session->file); > > why is 'fd' passed separatelly here? we have single file now > and the only 'file::fd' we use is in session, no? You're right, it's a leftover from the old code. Will change. Thanks, Namhyung -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v4 2/5] irqchip: gicv3-its: use 64KB page as default granule
The field of page size in register GITS_BASERn might be read-only
if an implementation only supports a single, fixed page size. But
currently the ITS driver will throw out an error when PAGE_SIZE
is less than the minimum size supported by an ITS. So addressing
this problem by using 64KB pages as default granule for all the
ITS base tables.

Acked-by: Marc Zyngier
Signed-off-by: Yun Wu
---
 drivers/irqchip/irq-gic-v3-its.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 69eeea3..f5bfa42 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -800,7 +800,7 @@ static int its_alloc_tables(struct its_node *its)
 {
 	int err;
 	int i;
-	int psz = PAGE_SIZE;
+	int psz = SZ_64K;
 	u64 shr = GITS_BASER_InnerShareable;

 	for (i = 0; i < GITS_BASER_NR_REGS; i++) {
--
1.8.0
[PATCH v4 0/5] enhance configuring an ITS
This patch series makes some enhancements to ITS configuration in the
following aspects:

o make allocation of the ITS tables more sensible
o replace magic numbers with sensible macros
o guarantee a safe quiescent status before initializing an ITS

This patch series is based on Marc's branch[1], and tested on a Hisilicon
ARM64 board with GICv3 ITS hardware.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git irq/gic-fixes

v3 -> v4:
o Spell out device table instead of DT to avoid misunderstanding
o Change its_check_quiesced() to a more sensible name its_force_quiescent()

v2 -> v3:
o drop the patch of tracing LPI enabling status since Vladimir Murzin
  had already posted a similar patch
o fix several improper description issues

v1 -> v2:
o rebase to Marc's GIC fix branch
o drop size calculation for Device Table since Marc had already posted one
o guarantee a safe quiescent status before initializing an ITS as Marc
  suggested, rather than register a reboot notifier
o fix an issue about the enabling status of LPI feature

Yun Wu (5):
  irqchip: gicv3-its: zero itt before handling to hardware
  irqchip: gicv3-its: use 64KB page as default granule
  irqchip: gicv3-its: add limitation to page order
  irqchip: gicv3-its: define macros for GITS_CTLR fields
  irqchip: gicv3-its: support safe initialization

 drivers/irqchip/irq-gic-v3-its.c   | 46 +++---
 include/linux/irqchip/arm-gic-v3.h |  3 +++
 2 files changed, 46 insertions(+), 3 deletions(-)

--
1.8.0
[PATCH v4 1/5] irqchip: gicv3-its: zero itt before handling to hardware
Some kind of brain-dead implementations chooses to insert ITEes in rapid sequence of disabled ITEes, and an un-zeroed ITT will confuse ITS on judging whether an ITE is really enabled or not. Considering the implementations are still supported by the GICv3 architecture, in which ITT is not required to be zeroed before being handled to hardware, we do the favor in ITS driver. Acked-by: Marc Zyngier Signed-off-by: Yun Wu --- drivers/irqchip/irq-gic-v3-its.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 6850141..69eeea3 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -1076,7 +1076,7 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id, nr_ites = max(2UL, roundup_pow_of_two(nvecs)); sz = nr_ites * its->ite_size; sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1; - itt = kmalloc(sz, GFP_KERNEL); + itt = kzalloc(sz, GFP_KERNEL); lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis); if (!dev || !itt || !lpi_map) { -- 1.8.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
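The sizing arithmetic in the hunk context, together with the zeroed allocation, can be modelled in user space as below. This is only a sketch: ITT_ALIGN = 256 is an assumed value for ITS_ITT_ALIGN, calloc() stands in for kzalloc(), and roundup_pow2(), itt_alloc_size() and alloc_zeroed_itt() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ITT_ALIGN 256UL	/* assumed value of ITS_ITT_ALIGN */

/* Smallest power of two that is >= n (n >= 1). */
static unsigned long roundup_pow2(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Mirrors the nr_ites/sz computation shown in the diff context. */
static size_t itt_alloc_size(unsigned long nvecs, size_t ite_size)
{
	unsigned long nr_ites = roundup_pow2(nvecs);
	size_t sz;

	if (nr_ites < 2)
		nr_ites = 2;
	sz = nr_ites * ite_size;
	if (sz < ITT_ALIGN)
		sz = ITT_ALIGN;
	/* Slack so the table can be aligned inside the buffer. */
	return sz + ITT_ALIGN - 1;
}

/* User-space analogue of the kzalloc() call: every byte starts as zero. */
static unsigned char *alloc_zeroed_itt(unsigned long nvecs, size_t ite_size)
{
	return calloc(1, itt_alloc_size(nvecs, ite_size));
}
```

The point of the one-word change is the last helper: with a plain malloc()/kmalloc() the table contents are whatever was in memory before.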
[PATCH v4 3/5] irqchip: gicv3-its: add limitation to page order
When required size of Device Table is out of the page allocator's
capability, the whole ITS will fail in probing. This actually is
not the hardware's problem and is mainly a limitation of the kernel
page allocator. This patch will keep ITS going on to the next
initialization stage with an explicit warning.

Acked-by: Marc Zyngier
Signed-off-by: Yun Wu
---
 drivers/irqchip/irq-gic-v3-its.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index f5bfa42..e8bda0b 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -828,6 +828,11 @@ static int its_alloc_tables(struct its_node *its)
 			u32 ids = GITS_TYPER_DEVBITS(typer);

 			order = get_order((1UL << ids) * entry_size);
+			if (order >= MAX_ORDER) {
+				order = MAX_ORDER - 1;
+				pr_warn("%s: Device Table too large, reduce its page order to %u\n",
+					its->msi_chip.of_node->full_name, order);
+			}
 		}

 		alloc_size = (1 << order) * PAGE_SIZE;
--
1.8.0
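To make the clamp concrete, here is a small user-space model of the order computation. It is a sketch under stated assumptions: 4KB pages and MAX_ORDER = 11 are assumed values, and order_for_size() and device_table_order() are hypothetical names standing in for get_order() and the logic in its_alloc_tables().

```c
#include <assert.h>

#define PAGE_SHIFT_ASSUMED	12	/* assumes 4KB pages */
#define MAX_ORDER_ASSUMED	11	/* assumed allocator limit */

/* Smallest order such that (1 << order) pages cover 'size' bytes. */
static unsigned int order_for_size(unsigned long long size)
{
	unsigned int order = 0;
	unsigned long long span = 1ULL << PAGE_SHIFT_ASSUMED;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Order for a Device Table of 2^devbits entries, clamped as in the patch. */
static unsigned int device_table_order(unsigned int devbits,
				       unsigned int entry_size)
{
	unsigned int order = order_for_size((1ULL << devbits) * entry_size);

	if (order >= MAX_ORDER_ASSUMED)
		order = MAX_ORDER_ASSUMED - 1;
	return order;
}
```

Under these assumptions, 20 device ID bits with 8-byte entries would need 8MB (order 11), which gets reduced to order 10, so the table covers fewer device IDs than the hardware advertises.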
[PATCH v4 4/5] irqchip: gicv3-its: define macros for GITS_CTLR fields
Define macros for GITS_CTLR fields to avoid using magic numbers.

Acked-by: Marc Zyngier
Signed-off-by: Yun Wu
---
 drivers/irqchip/irq-gic-v3-its.c   | 2 +-
 include/linux/irqchip/arm-gic-v3.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index e8bda0b..d13c24e 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1388,7 +1388,7 @@ static int its_probe(struct device_node *node, struct irq_domain *parent)
 	writeq_relaxed(baser, its->base + GITS_CBASER);
 	tmp = readq_relaxed(its->base + GITS_CBASER);
 	writeq_relaxed(0, its->base + GITS_CWRITER);
-	writel_relaxed(1, its->base + GITS_CTLR);
+	writel_relaxed(GITS_CTLR_ENABLE, its->base + GITS_CTLR);

 	if ((tmp ^ baser) & GITS_BASER_SHAREABILITY_MASK) {
 		pr_info("ITS: using cache flushing for cmd queue\n");
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 3459b43..c9d3002 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -134,6 +134,9 @@
 #define GITS_TRANSLATER			0x10040

+#define GITS_CTLR_ENABLE		(1U << 0)
+#define GITS_CTLR_QUIESCENT		(1U << 31)
+
 #define GITS_TYPER_DEVBITS_SHIFT	13
 #define GITS_TYPER_DEVBITS(r)		((((r) >> GITS_TYPER_DEVBITS_SHIFT) & 0x1f) + 1)
 #define GITS_TYPER_PTA			(1UL << 19)
--
1.8.0
[PATCH v4 5/5] irqchip: gicv3-its: support safe initialization
It's unsafe to change the configurations of an activated ITS directly since this will lead to unpredictable results. This patch guarantees the ITSes being initialized are quiescent. Acked-by: Marc Zyngier Signed-off-by: Yun Wu --- drivers/irqchip/irq-gic-v3-its.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index d13c24e..9e09aa0 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -1320,6 +1320,34 @@ static const struct irq_domain_ops its_domain_ops = { .deactivate = its_irq_domain_deactivate, }; +static int its_force_quiescent(void __iomem *base) +{ + u32 count = 100;/* 1s */ + u32 val; + + val = readl_relaxed(base + GITS_CTLR); + if (val & GITS_CTLR_QUIESCENT) + return 0; + + /* Disable the generation of all interrupts to this ITS */ + val &= ~GITS_CTLR_ENABLE; + writel_relaxed(val, base + GITS_CTLR); + + /* Poll GITS_CTLR and wait until ITS becomes quiescent */ + while (1) { + val = readl_relaxed(base + GITS_CTLR); + if (val & GITS_CTLR_QUIESCENT) + return 0; + + count--; + if (!count) + return -EBUSY; + + cpu_relax(); + udelay(1); + } +} + static int its_probe(struct device_node *node, struct irq_domain *parent) { struct resource res; @@ -1348,6 +1376,13 @@ static int its_probe(struct device_node *node, struct irq_domain *parent) goto out_unmap; } + err = its_force_quiescent(its_base); + if (err) { + pr_warn("%s: failed to quiesce, giving up\n", + node->full_name); + goto out_unmap; + } + pr_info("ITS: %s\n", node->full_name); its = kzalloc(sizeof(*its), GFP_KERNEL); -- 1.8.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
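The disable-then-poll pattern used by its_force_quiescent() can be exercised in user space with a fake register. Everything below is a model under assumptions: struct fake_its, fake_read() and force_quiescent() are hypothetical, the "hardware" simply turns quiescent a fixed number of reads after being disabled, and the delay between polls is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define CTLR_ENABLE	(1U << 0)
#define CTLR_QUIESCENT	(1U << 31)

/* Hypothetical register model: turns quiescent N reads after being disabled. */
struct fake_its {
	uint32_t ctlr;
	unsigned int reads_until_quiet;
};

static uint32_t fake_read(struct fake_its *its)
{
	if (!(its->ctlr & CTLR_ENABLE) && !(its->ctlr & CTLR_QUIESCENT)) {
		if (its->reads_until_quiet == 0)
			its->ctlr |= CTLR_QUIESCENT;
		else
			its->reads_until_quiet--;
	}
	return its->ctlr;
}

/* Returns 0 once the model reports quiescent, -EBUSY after max_polls tries. */
static int force_quiescent(struct fake_its *its, unsigned int max_polls)
{
	uint32_t val = fake_read(its);

	if (val & CTLR_QUIESCENT)
		return 0;

	/* Disable first, then wait for the quiescent bit to show up. */
	its->ctlr = val & ~CTLR_ENABLE;

	while (max_polls--) {
		if (fake_read(its) & CTLR_QUIESCENT)
			return 0;
	}
	return -EBUSY;
}
```

The bounded loop is what lets the probe path give up with an error instead of hanging when the ITS never reports quiescent.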
Re: [PATCH] pci: host: xgene: fix incorrectly returned address by map_bus
On Thu, Mar 05, 2015 at 08:53:38AM -0800, Feng Kan wrote: > Please take Mark's patch if you think it is better. > > > > On Thu, Mar 5, 2015 at 8:38 AM, Bjorn Helgaas wrote: > > [+cc Mark] > > > > On Thu, Feb 26, 2015 at 06:21:51PM -0600, Bjorn Helgaas wrote: > >> On Tue, Feb 17, 2015 at 03:14:00PM -0800, Feng Kan wrote: > >> > The generic accessor functions for pci-xgene uses map_bus > >> > call that returns the base address but did not add the additional > >> > offset. > >> > > >> > Signed-off-by: Feng Kan > >> > --- > >> > drivers/pci/host/pci-xgene.c | 4 ++-- > >> > 1 file changed, 2 insertions(+), 2 deletions(-) > >> > > >> > diff --git a/drivers/pci/host/pci-xgene.c b/drivers/pci/host/pci-xgene.c > >> > index aab5547..ee082c0 100644 > >> > --- a/drivers/pci/host/pci-xgene.c > >> > +++ b/drivers/pci/host/pci-xgene.c > >> > @@ -127,7 +127,7 @@ static bool xgene_pcie_hide_rc_bars(struct pci_bus > >> > *bus, int offset) > >> > return false; > >> > } > >> > > >> > -static int xgene_pcie_map_bus(struct pci_bus *bus, unsigned int devfn, > >> > +static void __iomem *xgene_pcie_map_bus(struct pci_bus *bus, unsigned > >> > int devfn, > >> > int offset) > >> > { > >> > struct xgene_pcie_port *port = bus->sysdata; > >> > @@ -137,7 +137,7 @@ static int xgene_pcie_map_bus(struct pci_bus *bus, > >> > unsigned int devfn, > >> > return NULL; > >> > > >> > xgene_pcie_set_rtdid_reg(bus, devfn); > >> > - return xgene_pcie_get_cfg_base(bus); > >> > + return xgene_pcie_get_cfg_base(bus) + offset; > >> > >> Where's the locking here? ECAM doesn't need locking because the > >> bus/dev/fn/offset is all encoded in the MMIO address. But it looks > >> like X-Gene doesn't work that way and bus/dev/fn is in the RTDID register. > >> > >> So it seems like X-Gene needs locking that not everybody needs. Are you > >> relying on higher-level locking somewhere? > > > > Ping, what's going on here? I've gotten at least three patches for this > > offset issue, so we need to get it resolved. 
> > > > If there's no locking problem, I can just apply this and we'll be finished. > > Actually, I think Mark's patch is better, because it correctly returns NULL > > (failure) if xgene_pcie_get_cfg_base() fails. So please review and ack > > that one or explain why this one is better. Huh, I could swear I saw a failure path in xgene_pcie_get_cfg_base(). But I don't see a way it can fail, so I don't think it matters which way we fix this. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
NMI watchdog triggering during load_balance
Hi Peter/Mike/Ingo: I've been banging my head against this wall for a week now and am hoping you or someone could shed some light on the problem. On larger systems (256 to 1024 cpus) there are several use cases (e.g., http://www.cs.virginia.edu/stream/) that regularly trigger the NMI watchdog with the stack trace: Call Trace: @ [0045d3d0] double_rq_lock+0x4c/0x68 @ [004699c4] load_balance+0x278/0x740 @ [008a7b88] __schedule+0x378/0x8e4 @ [008a852c] schedule+0x68/0x78 @ [0042c82c] cpu_idle+0x14c/0x18c @ [008a3a50] after_lock_tlb+0x1b4/0x1cc Capturing data for all CPUs I tend to see load_balance related stack traces on 700-800 cpus, with a few hundred blocked on _raw_spin_trylock_bh. I originally thought it was a deadlock in the rq locking, but if I bump the watchdog timeout the system eventually recovers (after 10-30+ seconds of unresponsiveness) so it does not seem likely to be a deadlock. This particular system has 1024 cpus: # lscpu Architecture: sparc64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Big Endian CPU(s): 1024 On-line CPU(s) list: 0-1023 Thread(s) per core: 8 Core(s) per socket: 4 Socket(s): 32 NUMA node(s): 4 NUMA node0 CPU(s): 0-255 NUMA node1 CPU(s): 256-511 NUMA node2 CPU(s): 512-767 NUMA node3 CPU(s): 768-1023 and there are 4 scheduling domains. An example of the domain debug output (condensed for the email): CPU970 attaching sched-domain: domain 0: span 968-975 level SIBLING groups: 8 single CPU groups domain 1: span 968-975 level MC groups: 1 group with 8 cpus domain 2: span 768-1023 level CPU groups: 32 groups with 8 cpus per group domain 3: span 0-1023 level NODE groups: 4 groups with 256 cpus per group On an idle system (20 or so non-kernel threads such as mingetty, udev, ...)
perf top shows the task scheduler is consuming significant time: PerfTop: 136580 irqs/sec kernel:99.9% exact: 0.0% [1000Hz cycles], (all, 1024 CPUs) --- 20.22% [kernel] [k] find_busiest_group 16.00% [kernel] [k] find_next_bit 6.37% [kernel] [k] ktime_get_update_offsets 5.70% [kernel] [k] ktime_get ... This is a 2.6.39 kernel (yes, a relatively old one); 3.8 shows similar symptoms. 3.18 is much better. From what I can tell load balancing is happening non-stop and there is heavy contention in the run queue locks. I instrumented the rq locking and under load (e.g., the stream test) regularly see single rq locks held continuously for 2-3 seconds (e.g., at the end of the stream run which has 1024 threads and the process is terminating). I have been staring at and instrumenting the scheduling code for days. It seems like the balancing of domains is regularly lining up on all or almost all CPUs and it seems like the NODE domain causes the most damage since it scans all cpus (i.e., in rebalance_domains() each domain pass triggers a call to load_balance on all cpus at the same time). Just in random snapshots during a stream test I have seen 1 pass through rebalance_domains take > 17 seconds (custom tracepoints to tag start and end). Since each domain is a superset of the lower one each pass through load_balance regularly repeats the processing of the previous domain (e.g., NODE domain repeats the cpus in the CPU domain). Multiply that across 1024 cpus and it seems like a lot of duplication. Does that make sense or am I off in the weeds? Thanks, David
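A rough back-of-the-envelope version of the duplication argument, using the span sizes from the CPU970 domain output. It assumes, purely for illustration, that one balancing pass at a given level looks at every CPU in that level's span; the real load_balance() path works on groups and cached statistics, so treat the result as an upper-bound estimate rather than a measurement.

```c
#include <stddef.h>

/* Span sizes per domain level (SIBLING, MC, CPU, NODE) from the report. */
static const unsigned long domain_span[] = { 8, 8, 256, 1024 };

/* CPUs examined by one CPU if it balances every domain level once. */
static unsigned long cpus_scanned_per_pass(void)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < sizeof(domain_span) / sizeof(domain_span[0]); i++)
		total += domain_span[i];

	return total;
}

/* Estimate for the case where all CPUs run a full pass at the same time. */
static unsigned long cpus_scanned_system_wide(unsigned long nr_cpus)
{
	return nr_cpus * cpus_scanned_per_pass();
}
```

Under that assumption each CPU touches 1296 entries per full pass although only 1024 distinct CPUs exist, and a system-wide aligned pass touches 1024 * 1296 entries.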
Re: [PATCH -next] sensors: fix build of pwm-fan.c when THERMAL=m
On 03/05/2015 03:27 PM, Randy Dunlap wrote: From: Randy Dunlap Fix build errors when CONFIG_THERMAL=m and SENSORS_PWM_FAN=y by restricting SENSORS_PWM_FAN to 'm' when THERMAL=m. drivers/built-in.o: In function `pwm_fan_remove': pwm-fan.c:(.text+0x22ba58): undefined reference to `thermal_cooling_device_unregister' drivers/built-in.o: In function `pwm_fan_probe': pwm-fan.c:(.text+0x22bebb): undefined reference to `thermal_of_cooling_device_register' pwm-fan.c:(.text+0x22bf11): undefined reference to `thermal_cdev_update' Signed-off-by: Randy Dunlap Cc: Kamil Debski --- Applied, thanks. Guenter
Re: [PATCH 2/2] mm: numa: Do not clear PTEs or PMDs for NUMA hinting faults
On Thu, Mar 05, 2015 at 11:54:52PM +, Mel Gorman wrote: > Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226 > >Across the board the 4.0-rc1 numbers are much slower, and the >degradation is far worse when using the large memory footprint >configs. Perf points straight at the cause - this is from 4.0-rc1 >on the "-o bhash=101073" config: > >- 56.07%56.07% [kernel][k] > default_send_IPI_mask_sequence_phys > - default_send_IPI_mask_sequence_phys > - 99.99% physflat_send_IPI_mask > - 99.37% native_send_call_func_ipi > smp_call_function_many >- native_flush_tlb_others > - 99.85% flush_tlb_page >ptep_clear_flush >try_to_unmap_one >rmap_walk >try_to_unmap >migrate_pages >migrate_misplaced_page > - handle_mm_fault > - 99.73% __do_page_fault > trace_do_page_fault > do_async_page_fault >+ async_page_fault > 0.63% native_send_call_func_single_ipi > generic_exec_single > smp_call_function_single > > This was bisected to commit 4d9424669946 ("mm: convert p[te|md]_mknonnuma > and remaining page table manipulations") which clears PTEs and PMDs to make > them PROT_NONE. This is tidy but tests on some benchmarks indicate that > there are many more hinting faults trapped resulting in excessive migration. > This is the result for the old autonuma benchmark for example. [snip] Doesn't fix the problem. Runtime is slightly improved (16m45s vs 17m35s) but it's still much slower than 3.19 (6m5s).
Stats and profiles still roughly the same: 360,228 migrate:mm_migrate_pages ( +- 4.28% ) - 52.69%52.69% [kernel][k] default_send_IPI_mask_sequence_phys default_send_IPI_mask_sequence_phys - physflat_send_IPI_mask - 97.28% native_send_call_func_ipi smp_call_function_many native_flush_tlb_others flush_tlb_page ptep_clear_flush try_to_unmap_one rmap_walk try_to_unmap migrate_pages migrate_misplaced_page - handle_mm_fault - 99.59% __do_page_fault trace_do_page_fault do_async_page_fault + async_page_fault + 2.72% native_send_call_func_single_ipi numa_hit 36678767 numa_miss 905234 numa_foreign 905234 numa_interleave 14802 numa_local 36656791 numa_other 927210 numa_pte_updates 92168450 numa_huge_pte_updates 0 numa_hint_faults 87573926 numa_hint_faults_local 29730293 numa_pages_migrated 30195890 pgmigrate_success 30195890 pgmigrate_fail 0 Cheers, Dave. -- Dave Chinner da...@fromorbit.com
Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource
On Fri, 2015-03-06 at 09:49 +0800, Jiang Liu wrote: > On 2015/3/6 5:06, Alex Williamson wrote: > > The IRQ resource for a device is established when pci_enabled_device() > > is called on a fully disabled device (ie. enable_cnt == 0). With > > commit b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ > > resources") this same IRQ resource is released when the driver is > > unbound from the device, regardless of the enable_cnt. This presents > > the situation that an ill-behaved driver can now make a device > > unusable to subsequent drivers by an imbalance in their use of > > pci_enable/disable_device(). It's one thing to break your own device > > if you're one of these ill-behaved drivers, but it's a serious > > regression for secondary drivers like vfio-pci, which are innocent > > of the transgressions of the previous driver. > > > > Resolve by pushing the device to a fully disabled state before > > releasing the IRQ resource. > > > > Fixes: b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ resources") > > Signed-off-by: Alex Williamson > > Cc: Jiang Liu > > --- > > arch/x86/pci/common.c | 13 - > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c > > index 3d2612b..4810194 100644 > > --- a/arch/x86/pci/common.c > > +++ b/arch/x86/pci/common.c > > @@ -527,8 +527,19 @@ static int pci_irq_notifier(struct notifier_block *nb, > > unsigned long action, > > if (action != BUS_NOTIFY_UNBOUND_DRIVER) > > return NOTIFY_DONE; > > > > - if (pcibios_disable_irq) > > + if (pcibios_disable_irq) { > > + /* > > +* Broken drivers may allow a device to be .remove()'d while > > +* still enabled. pci_enable_device() will only re-establish > > +* dev->irq if the devices is fully disabled. So if we want > > +* to release the IRQ, we need to make sure the next driver > > +* can re-establish it using pci_enable_device(). 
> > +*/ > > + while (pci_is_enabled(dev)) > > + pci_disable_device(dev); > > + > > pcibios_disable_irq(dev); > > + } > Hi Alex, > Thanks for debugging and fixing it. > Will it be feasible to give a debug message to remind those > driver authors to correctly disable PCI when unbinding? I can certainly add a warning to the loop, though it loses a bit of its teeth here since we can't specify which driver to blame at this point. Maybe that warning and perhaps this enabling roll-back should happen in drivers/pci/pci-driver.c:pci_device_remove(). Bjorn, would you prefer it be done generically there? Thanks, Alex
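The enable-count imbalance described in this thread can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the PCI core: `fake_dev`, `fake_enable`, `fake_disable` and `fake_unbind` are hypothetical names, and the only behaviour modelled is that the IRQ is established on the 0 -> 1 enable transition and released at unbind.

```c
#include <stdbool.h>

/* Hypothetical model of a device's enable reference count. */
struct fake_dev {
	int enable_cnt;
	bool irq_assigned;
};

static void fake_enable(struct fake_dev *dev)
{
	/* The IRQ is only set up when the device was fully disabled. */
	if (dev->enable_cnt++ == 0)
		dev->irq_assigned = true;
}

static void fake_disable(struct fake_dev *dev)
{
	if (dev->enable_cnt > 0)
		dev->enable_cnt--;
}

/* Unbind path as proposed in the patch: drain the count, then release. */
static void fake_unbind(struct fake_dev *dev)
{
	while (dev->enable_cnt > 0)
		fake_disable(dev);

	dev->irq_assigned = false;
}
```

Without the draining loop, a driver that enables twice and disables once would leave the count at 1 after unbind, so the next driver's enable call would not be a 0 -> 1 transition and the released IRQ would never be re-established.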
[GIT PULL] Please pull NFS client bugfixes
Hi Linus, The following changes since commit c517d838eb7d07bbe9507871fab3931deccff539: Linux 4.0-rc1 (2015-02-22 18:21:14 -0800) are available in the git repository at: git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.0-3 for you to fetch changes up to e11259f920d8cb3550e0f311c064bdabe1bc3aaf: NFSv4.1: Clear the old state by our client id before establishing a new lease (2015-03-03 21:52:30 -0500) NFS client bugfixes for Linux 4.0 Highlights include: - Fix a regression in the NFSv4 open state recovery code - Fix a regression in the NFSv4 close code - Fix regressions and side-effects of the loop-back mounted NFS fixes in 3.18, that cause the NFS read() syscall to return EBUSY. - Fix regressions around the readdirplus code and how it interacts with the VFS lazy unmount changes that went into v3.18. - Fix issues with out-of-order RPC call replies replacing updated attributes with stale ones (particularly after a truncate()). - Fix an underflow checking issue with RPC/RDMA credits - Fix a number of issues with the NFSv4 delegation return/free code. 
- Fix issues around stale NFSv4.1 leases when doing a mount Anna Schumaker (1): NFS: Fix stateid used for NFS v4 closes Chuck Lever (1): xprtrdma: Store RDMA credits in unsigned variables Trond Myklebust (23): Merge tag 'nfs-rdma-for-4.0-3' of git://git.linux-nfs.org/projects/anna/nfs-rdma NFSv4: nfs4_open_recover_helper() must set share access NFS: Ensure that buffered writes wait for O_DIRECT writes to complete NFS: Add a helper to set attribute barriers NFS: Add attribute update barriers to nfs_setattr_update_inode() NFS: Set an attribute barrier on all updates NFS: Add attribute update barriers to NFS writebacks NFSv4: Add attribute update barriers to delegreturn and pNFS layoutcommit NFS: Remove size hack in nfs_inode_attrs_need_update() NFS: Fix nfs_post_op_update_inode() to set an attribute barrier NFSv4: Set a barrier in the update_changeattr() helper NFS: Don't invalidate a submounted dentry in nfs_prime_dcache() NFSv3: Use the readdir fileid as the mounted-on-fileid NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache() NFSv4: Don't call put_rpccred() under the rcu_read_lock() NFSv4: Ensure that we don't reap a delegation that is being returned NFSv4: Ensure we honour NFS_DELEGATION_RETURNING in nfs_inode_set_delegation() NFSv4: Pin the superblock while we're returning the delegation NFSv4: Ensure we skip delegations that are already being returned NFS: Fix a regression in the read() syscall NFS: Don't write enable new pages while an invalidation is proceeding NFSv4: Fix a race in NFSv4.1 server trunking discovery NFSv4.1: Clear the old state by our client id before establishing a new lease fs/nfs/client.c | 2 +- fs/nfs/delegation.c | 45 fs/nfs/dir.c| 22 ++-- fs/nfs/file.c | 11 +++- fs/nfs/inode.c | 111 +--- fs/nfs/internal.h | 1 + fs/nfs/nfs3proc.c | 4 +- fs/nfs/nfs3xdr.c| 5 ++ fs/nfs/nfs4client.c | 9 ++-- fs/nfs/nfs4proc.c | 31 +++ fs/nfs/nfs4session.h| 1 + fs/nfs/nfs4state.c | 18 ++- fs/nfs/proc.c | 6 +-- fs/nfs/write.c | 30 +++ 
include/linux/nfs_fs.h | 5 +- net/sunrpc/xprtrdma/rpc_rdma.c | 3 +- net/sunrpc/xprtrdma/xprt_rdma.h | 2 +- 17 files changed, 244 insertions(+), 62 deletions(-) -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.mykleb...@primarydata.com
RE: [PATCH v2] f2fs: fix max orphan inodes calculation
Hi Changman, > -Original Message- > From: Changman Lee [mailto:cm224@samsung.com] > Sent: Tuesday, March 03, 2015 9:40 AM > To: linux-f2fs-de...@lists.sourceforge.net > Cc: Jaegeuk Kim; Chao Yu; linux-fsde...@vger.kernel.org; > linux-kernel@vger.kernel.org > Subject: Re: [PATCH v2] f2fs: fix max orphan inodes calculation > > On Fri, Feb 27, 2015 at 05:38:13PM +0800, Wanpeng Li wrote: > > cp_payload is introduced for sit bitmap to support large volume, and it is > > just after the block of f2fs_checkpoint + nat bitmap, so the first segment > > should include F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload + orphan blocks. > > However, current max orphan inodes calculation don't consider cp_payload, > > this patch fix it by reducing the number of cp_payload from total blocks of > > the first segment when calculate max orphan inodes. > > > > Signed-off-by: Wanpeng Li > > --- > > v1 -> v2: > > * adjust comments above the codes > > * fix coding style issue > > > > fs/f2fs/checkpoint.c | 12 +++- > > 1 file changed, 7 insertions(+), 5 deletions(-) > > > > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c > > index db82e09..a914e99 100644 > > --- a/fs/f2fs/checkpoint.c > > +++ b/fs/f2fs/checkpoint.c > > @@ -1103,13 +1103,15 @@ void init_ino_entry_info(struct f2fs_sb_info *sbi) > > } > > > > /* > > -* considering 512 blocks in a segment 8 blocks are needed for cp > > -* and log segment summaries. Remaining blocks are used to keep > > -* orphan entries with the limitation one reserved segment > > -* for cp pack we can have max 1020*504 orphan entries > > +* considering 512 blocks in a segment 8+cp_payload blocks are > > +* needed for cp and log segment summaries. Remaining blocks are > > +* used to keep orphan entries with the limitation one reserved > > +* segment for cp pack we can have max 1020*(504-cp_payload) > > +* orphan entries > > */ > > Hi all, > > I think below code give us information enough so it doesn't need to > describe above comments. 
And someone could get confused by 1020 constants. > How do you think about removing comments. I agree with you. Nothing in the statement below needs special attention; its meaning can be read off easily, since each macro in it clearly describes itself. So could you send another patch to remove the comment? Thanks, > > Regards, > Changman > > > sbi->max_orphans = (sbi->blocks_per_seg - F2FS_CP_PACKS - > > - NR_CURSEG_TYPE) * F2FS_ORPHANS_PER_BLOCK; > > + NR_CURSEG_TYPE - __cp_payload(sbi)) * > > + F2FS_ORPHANS_PER_BLOCK; > > } > > > > int __init create_checkpoint_caches(void) > > -- > > 1.9.1
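For reference, the arithmetic behind the "1020*(504-cp_payload)" figure in the comment under discussion can be checked with a small stand-alone sketch. The constants are taken from that comment (512 blocks per segment, 8 blocks for cp packs and segment summaries, 1020 orphan entries per block); splitting the 8 into 2 + 6 and the macro names themselves are assumptions made for this example.

```c
/* Illustrative constants; see the lead-in for where they come from. */
#define BLOCKS_PER_SEG		512U
#define CP_PACKS		2U	/* assumed split of the 8 reserved blocks */
#define CURSEG_TYPES		6U	/* assumed split of the 8 reserved blocks */
#define ORPHANS_PER_BLOCK	1020U

/* Orphan entries that fit in the first segment for a given cp_payload. */
static unsigned int max_orphans(unsigned int cp_payload)
{
	return (BLOCKS_PER_SEG - CP_PACKS - CURSEG_TYPES - cp_payload) *
		ORPHANS_PER_BLOCK;
}
```

With cp_payload == 0 this gives 504 * 1020 entries, matching the old comment; every payload block reduces the limit by another 1020 entries.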
[PATCH] mfd: rtsx_usb: prevent DMA from stack
Functions rtsx_usb_ep0_read_register() and rtsx_usb_get_card_status() both use arbitrary buffer addresses from arguments directly for DMA and the buffers could be located in stack. This was caught by DMA-API debug check. Fixes this by using double-buffers via kzalloc in both functions to guarantee the validity of DMA buffer. WARNING: CPU: 1 PID: 25 at lib/dma-debug.c:1166 check_for_stack+0x96/0xe0() ehci-pci :00:1a.0: DMA-API: device driver maps memory from stack [addr=8801199e3cef] Modules linked in: rtsx_usb_ms arc4 memstick intel_rapl iosf_mbi rtl8192ce snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel rtl_pci rtl8192c_common snd_hda_controller x86_pkg_temp_thermal snd_hda_codec rtlwifi mac80211 coretemp kvm_intel kvm iTCO_wdt snd_hwdep snd_seq snd_seq_device crct10dif_pclmul iTCO_vendor_support sparse_keymap cfg80211 crc32_pclmul snd_pcm crc32c_intel ghash_clmulni_intel rfkill i2c_i801 snd_timer shpchp snd serio_raw mei_me lpc_ich soundcore mei tpm_tis tpm wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915 rtsx_usb_sdmmc mmc_core 8021q uas garp stp i2c_algo_bit llc mrp drm_kms_helper usb_storage drm rtsx_usb mfd_core r8169 mii video CPU: 1 PID: 25 Comm: kworker/1:2 Not tainted 3.20.0-0.rc0.git7.3.fc22.x86_64 #1 Hardware name: WB WB-B06211/WB-B0621, BIOS EB062IWB V1.0 12/12/2013 Workqueue: events rtsx_usb_ms_handle_req [rtsx_usb_ms] 3d188e66 8801199e3808 8187642b 8801199e3860 8801199e3848 810ab39a 8801199e3864 8801199e3cef 880119b57098 880119b37320 Call Trace: [] dump_stack+0x4c/0x65 [] warn_slowpath_common+0x8a/0xc0 [] warn_slowpath_fmt+0x55/0x70 [] ? _raw_spin_unlock_irqrestore+0x36/0x70 [] check_for_stack+0x96/0xe0 [] debug_dma_map_page+0x104/0x150 [] usb_hcd_map_urb_for_dma+0x646/0x790 [] usb_hcd_submit_urb+0x1d5/0xa90 [] ? mark_held_locks+0x7f/0xc0 [] ? mark_held_locks+0x7f/0xc0 [] ? lockdep_init_map+0x65/0x5d0 [] usb_submit_urb+0x42e/0x5f0 [] usb_start_wait_urb+0x77/0x190 [] ? 
__kmalloc+0x205/0x2d0 [] usb_control_msg+0xdc/0x130 [] rtsx_usb_ep0_read_register+0x59/0x70 [rtsx_usb] [] ? rtsx_usb_get_rsp+0x41/0x50 [rtsx_usb] [] rtsx_usb_ms_handle_req+0x7ce/0x9c5 [rtsx_usb_ms] Reported-by: Josh Boyer Signed-off-by: Roger Tseng --- drivers/mfd/rtsx_usb.c | 30 -- 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/drivers/mfd/rtsx_usb.c b/drivers/mfd/rtsx_usb.c index ede50244f265..dbd907d7170e 100644 --- a/drivers/mfd/rtsx_usb.c +++ b/drivers/mfd/rtsx_usb.c @@ -196,18 +196,27 @@ EXPORT_SYMBOL_GPL(rtsx_usb_ep0_write_register); int rtsx_usb_ep0_read_register(struct rtsx_ucr *ucr, u16 addr, u8 *data) { u16 value; + u8 *buf; + int ret; if (!data) return -EINVAL; - *data = 0; + + buf = kzalloc(sizeof(u8), GFP_KERNEL); + if (!buf) + return -ENOMEM; addr |= EP0_READ_REG_CMD << EP0_OP_SHIFT; value = swab16(addr); - return usb_control_msg(ucr->pusb_dev, + ret = usb_control_msg(ucr->pusb_dev, usb_rcvctrlpipe(ucr->pusb_dev, 0), RTSX_USB_REQ_REG_OP, USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE, - value, 0, data, 1, 100); + value, 0, buf, 1, 100); + *data = *buf; + + kfree(buf); + return ret; } EXPORT_SYMBOL_GPL(rtsx_usb_ep0_read_register); @@ -288,18 +297,27 @@ static int rtsx_usb_get_status_with_bulk(struct rtsx_ucr *ucr, u16 *status) int rtsx_usb_get_card_status(struct rtsx_ucr *ucr, u16 *status) { int ret; + u16 *buf; if (!status) return -EINVAL; - if (polling_pipe == 0) + if (polling_pipe == 0) { + buf = kzalloc(sizeof(u16), GFP_KERNEL); + if (!buf) + return -ENOMEM; + ret = usb_control_msg(ucr->pusb_dev, usb_rcvctrlpipe(ucr->pusb_dev, 0), RTSX_USB_REQ_POLL, USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE, - 0, 0, status, 2, 100); - else + 0, 0, buf, 2, 100); + *status = *buf; + + kfree(buf); + } else { ret = rtsx_usb_get_status_with_bulk(ucr, status); + } /* usb_control_msg may return positive when success */ if (ret < 0) -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to 
majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
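The bounce-buffer pattern used by the patch can be sketched in user space as follows. This is only a model: `fake_transfer()` is a hypothetical stand-in for usb_control_msg(), and calloc()/free() stand in for kzalloc()/kfree(). The point is that the transfer only ever sees heap memory, never the caller's (possibly on-stack) variable.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical transfer routine standing in for usb_control_msg(). */
static int fake_transfer(void *buf, size_t len)
{
	memset(buf, 0x5a, len);		/* pretend the device filled the buffer */
	return (int)len;		/* bytes transferred, negative on error */
}

static int read_register(uint8_t *data)
{
	uint8_t *buf;
	int ret;

	if (!data)
		return -EINVAL;

	/* Zeroed heap buffer: safe to hand to a DMA-capable transfer. */
	buf = calloc(1, sizeof(*buf));
	if (!buf)
		return -ENOMEM;

	ret = fake_transfer(buf, sizeof(*buf));

	/*
	 * Copied unconditionally, as in the patch; because the buffer starts
	 * out zeroed, the caller sees 0 if the transfer wrote nothing.
	 */
	*data = *buf;

	free(buf);
	return ret;
}
```

A caller can keep passing the address of a local variable, since only the temporary buffer is exposed to the transfer.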
linux-next: manual merge of the vhost tree with the virtio tree
Hi Michael, Today's linux-next merge of the vhost tree got a conflict in drivers/virtio/virtio_balloon.c between commit 7f8998200dcb ("virtio_balloon: annotate possible sleep waiting for event") from the virtio tree and commit 2426d3b03d07 ("virtio-balloon: do not call blocking ops when !TASK_RUNNING") from the vhost tree. I fixed it up (I think - see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwell s...@canb.auug.org.au diff --cc drivers/virtio/virtio_balloon.c index 06001ca71ea3,5a6ad6dbdec4.. --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@@ -341,19 -343,17 +343,25 @@@ static int balloon(void *_vballoon try_to_freeze(); + /* + * Reading the config on the ccw backend involves an + * allocation, so we may actually sleep and have an + * extra iteration. It's extremely unlikely, and this + * isn't a fast path in any sense. + */ + sched_annotate_sleep(); + - wait_event_interruptible(vb->config_change, -(diff = towards_target(vb)) != 0 -|| vb->need_stats_update -|| kthread_should_stop() -|| freezing(current)); + add_wait_queue(&vb->config_change, &wait); + for (;;) { + if ((diff = towards_target(vb)) != 0 || + vb->need_stats_update || + kthread_should_stop() || + freezing(current)) + break; + wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT); + } + remove_wait_queue(&vb->config_change, &wait); + if (vb->need_stats_update) stats_handle_request(vb); if (diff > 0)
Mellanox Technologies MT23108 causes #MC exceptions under heavy load
We are running a CPU- and network-heavy test on the marmot.pdl.cmu.edu cluster. It has a Mellanox Technologies MT23108 InfiniHost controller. When we start using it for network communication, some of the nodes of the cluster die after just a few minutes with the following machine check exception. I repeated this test over Ethernet a few times and have not had a single failure so far (I thought I had one, but it turned out to be another, unrelated issue). It has already happened on most nodes of this 128-node cluster, so I expect this to be a kernel bug. Do you have any pointers on what we could try? I compiled and tested the current HEAD of the vanilla kernel (99aedde0869ce194539166ac5a4d2e1a20995348), 4.0.0-rc2, but this happens even on 2.6.38 (which was in one of their stock kernel images). Best regards, Maxim Levitsky The kernel log of the failure, captured via serial console: [ 297.575167] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 564.704428] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 951.619320] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 956.790789] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 957.301036] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 957.333938] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 957.924656] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 958.125879] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 958.147588] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 958.485607] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 959.050155] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 959.120109] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 960.048666] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 960.110928] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 
960.754363] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 961.390093] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 972.199782] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 972.496511] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 983.078444] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 983.618178] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 991.365565] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 1003.344498] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL [ 1013.748036] Disabling lock debugging due to kernel taint [ 1013.747903] [Hardware Error]: System Fatal error. [ 1013.747903] [Hardware Error]: CPU:0 (f:5:1) MC4_STATUS[-|UE|-|PCC|-]: 0xb2070f0f [ 1013.747903] [Hardware Error]: MC4 Error (node 0): Watchdog timeout due to lack of progress. [ 1013.747903] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out) [ 1013.747903] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 4: b2070f0f [ 1013.747903] mce: [Hardware Error]: TSC 1a2dcecb6b8 [ 1013.747903] mce: [Hardware Error]: PROCESSOR 2:f51 TIME 1425610753 SOCKET 0 APIC 0 microcode 0 [ 1013.747903] [Hardware Error]: System Fatal error. [ 1013.747903] [Hardware Error]: CPU:0 (f:5:1) MC4_STATUS[-|UE|-|PCC|-]: 0xb2070f0f [ 1013.747903] [Hardware Error]: MC4 Error (node 0): Watchdog timeout due to lack of progress. [ 1013.747903] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out) [ 1013.747903] mce: [Hardware Error]: Machine check: Processor context corrupt [ 1013.747903] Kernel panic - not syncing: Fatal machine check on current CPU [ 1013.748036] [Hardware Error]: System Fatal error. 
[ 1013.748036] [Hardware Error]: CPU:1 (f:5:1) MC4_STATUS[-|UE|-|PCC|-]: 0xb2070f0f [ 1013.748036] [Hardware Error]: MC4 Error (node 1): Watchdog timeout due to lack of progress. [ 1013.748036] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out) [ 1013.747903] Kernel Offset: disabled [ 1013.747903] ---[ end Kernel panic - not syncing: Fatal machine check on current CPU [ 1019.239423] [ cut here ] [ 1019.244144] WARNING: CPU: 0 PID: 13875 at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5f/0x70() [ 1019.249416] Modules linked in: ib_ipoib ib_cm ib_sa nfsv2 nfs lockd sunrpc grace i2c_piix4 ib_mthca ib_mad ib_core ib_addr shpchp amd64_edac_mod i2c_amd756 k8temp amd_rng edac_core edac_mce_amd tg3 ptp pps_core sata_promise pata_amd [ 1019.249416] CPU: 0 PID: 13875 Comm: java Tainted: G M 4.0.0-rc2+ #1 [ 1019.249416] Hardware name: RIOWORKS HDAMA/HDAMA, BIOS V2.17 03/20/2006 [ 1019.249416] 007c 8801f8409a80 815f33ff 007c [ 1019.249416] 8801f8409ac0 81055
Re: [PATCH 0/2] make automatic device_id generation possible
On (03/05/15 09:20), Minchan Kim wrote: > In summary, I want to support only "cat /sys/class/zram-control/zram_add" > unless you have feasible usecase. > > What do you think about it? > Hello Minchan, I've tried to contact as many people (who have previously demonstrated some interest in on-demand device creation) as I could in every sane way (using both lkml and google+), and it looks like people see no value in this functionality. So I'm happy to remove it. A cleanup patch will arrive later today. Thanks for raising this topic. -ss
Re: pskb_expand_head: skb_shared BUG
On Mon, Mar 02, 2015 at 11:45:11AM +1100, Chris Dunlop wrote: > Heads up... > > We've hit this BUG() in v3.10.70, v3.14.27 and v3.18.7: > > net/core/skbuff.c: > 1027 int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail, > 1028 gfp_t gfp_mask) > 1029 { > 1030 int i; > 1031 u8 *data; > 1032 int size = nhead + skb_end_offset(skb) + ntail; > 1033 long off; > 1034 > 1035 BUG_ON(nhead < 0); > 1036 > 1037 if (skb_shared(skb)) > 1038 BUG(); <<< BOOM!!! > > This appears to be a regression in the 3.10.x stable series: > we've been running for 11 months on v3.10.33 without problem, we > upgraded to v3.14.27 and hit the BUG(), than again on upgrading > to v3.18.7, then again after downgrading to v3.10.70. Apologies, this was a false alarm. There was indeed a regression, but it's in the upstream openvswitch code rather than linux core. (Further details: a sharing of an otherwise-unshared skb, causing us to hit the BUG() above, introduced in v2.3, will be fixed in upcoming v2.3.2) Cheers, Chris
Re: [PATCH] net: fec: fix unbalanced clk disable on driver unbind
From: Stefan Agner Date: Thu, 5 Mar 2015 15:09:29 +0100 > When the driver is removed (e.g. using unbind through sysfs), the > clocks get disabled twice, once on fec_enet_close and once on > fec_drv_remove. Since the clocks are enabled only once, this leads > to a warning: > > WARNING: CPU: 0 PID: 402 at drivers/clk/clk.c:992 clk_core_disable+0x64/0x68() > > Remove the call to fec_enet_clk_enable in fec_drv_remove to balance > the clock enable/disable calls again. This has been introduce by > e8fcfcd5684a ("net: fec: optimize the clock management to save power"). > > Signed-off-by: Stefan Agner Applied, thanks.
Re: [PATCH] net: macb: Correct the MID field length value
From: Michal Simek Date: Thu, 5 Mar 2015 15:02:10 +0100 > From: Punnaiah Choudary Kalluri > > The latest spec "I-IPA01-0266-USR Rev 10" limit the MID field length to 12 bit > value. For previous versions it is 16 bit value. > > This change will not break the backward compatibility as the latest ID value > is > 7 and with in the 12 bit value limit. > > Signed-off-by: Punnaiah Choudary Kalluri > Signed-off-by: Michal Simek Applied, thanks.
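The backward-compatibility argument in the commit message (an ID of 7 reads the same through a 12-bit or a 16-bit field) can be illustrated with a generic bit-field helper. This is a sketch only: `field_get()` is a hypothetical helper, and the field position used in the examples (starting at bit 16) is an assumption for illustration, not taken from the patch.

```c
#include <stdint.h>

/*
 * Extract a field of 'size' bits starting at bit 'offset'.
 * 'size' must be smaller than 32 for the shift to be well defined.
 */
static uint32_t field_get(uint32_t reg, unsigned int offset, unsigned int size)
{
	return (reg >> offset) & ((1U << size) - 1);
}
```

For a register value such as 0x00070100, both field_get(reg, 16, 12) and field_get(reg, 16, 16) yield 7. The two widths only disagree when one of the top four bits of the register is set, which is why narrowing the field does not change the result for the IDs currently in use.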
[PATCH v2 1/6] x86: Add this_cpu_sp0() to read sp0 for the current cpu
We currently store references to the top of the kernel stack in
multiple places: kernel_stack (with an offset) and
init_tss.x86_tss.sp0 (no offset).  The latter is defined by hardware
and is a clean canonical way to find the top of the stack.  Add an
accessor so we can start using it.

This needs minor paravirt tweaks.  On native, sp0 defines the top of
the kernel stack and is therefore always correct.  On Xen and lguest,
the hypervisor tracks the top of the stack, but we want to start
reading sp0 in the kernel.  Fixing this is simple: just update our
local copy of sp0 as well as the hypervisor's copy on task switches.

Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: Rusty Russell
Signed-off-by: Andy Lutomirski
---
 arch/x86/include/asm/processor.h | 5 +
 arch/x86/kernel/process.c        | 1 +
 arch/x86/lguest/boot.c           | 1 +
 arch/x86/xen/enlighten.c         | 1 +
 4 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 7be2c9a6caba..71c3a826a690 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -564,6 +564,11 @@ static inline void native_swapgs(void)
 #endif
 }
 
+static inline unsigned long this_cpu_sp0(void)
+{
+	return this_cpu_read_stable(init_tss.x86_tss.sp0);
+}
+
 #ifdef CONFIG_PARAVIRT
 #include
 #else
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 046e2d620bbe..ff5c9088b1c5 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -38,6 +38,7 @@
  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
  */
 __visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss) = INIT_TSS;
+EXPORT_PER_CPU_SYMBOL_GPL(init_tss);
 
 #ifdef CONFIG_X86_64
 static DEFINE_PER_CPU(unsigned char, is_idle);
diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index ac4453d8520e..8561585ee2c6 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -1076,6 +1076,7 @@ static void lguest_load_sp0(struct tss_struct *tss,
 {
 	lazy_hcall3(LHCALL_SET_STACK, __KERNEL_DS | 0x1, thread->sp0,
 		   THREAD_SIZE / PAGE_SIZE);
+	tss->x86_tss.sp0 = thread->sp0;
 }
 
 /* Let's just say, I wouldn't do debugging under a Guest. */
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 5240f563076d..81665c9f2132 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -912,6 +912,7 @@ static void xen_load_sp0(struct tss_struct *tss,
 	mcs = xen_mc_entry(0);
 	MULTI_stack_switch(mcs.mc, __KERNEL_DS, thread->sp0);
 	xen_mc_issue(PARAVIRT_LAZY_CPU);
+	tss->x86_tss.sp0 = thread->sp0;
 }
 
 static void xen_set_iopl_mask(unsigned mask)
-- 
2.1.0
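The paravirt part of this patch (keep a local copy of sp0 in step with the hypervisor's copy) can be pictured with a small user-space model. This is only a sketch under stated assumptions: `demo_tss`, `demo_thread`, `demo_hypercall_set_stack()` and the other `demo_*` names are hypothetical stand-ins invented for illustration, not kernel, Xen or lguest interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only; none of these names exist in the kernel. */
struct demo_tss    { unsigned long sp0; };
struct demo_thread { unsigned long sp0; };

static struct demo_tss demo_cpu_tss;       /* stands in for the per-cpu TSS */
static unsigned long demo_hypervisor_sp0;  /* stands in for hypervisor state */

/* Stand-in for the hypercall that tells the hypervisor about the new stack. */
static void demo_hypercall_set_stack(unsigned long sp0)
{
	demo_hypervisor_sp0 = sp0;
}

/* Paravirt-style load_sp0: update the hypervisor's copy AND the local copy. */
static void demo_load_sp0(struct demo_tss *tss, const struct demo_thread *thread)
{
	assert(tss != NULL && thread != NULL);
	demo_hypercall_set_stack(thread->sp0);
	tss->sp0 = thread->sp0;
}

/* Accessor in the spirit of this_cpu_sp0(): it reads only the local copy. */
static unsigned long demo_this_cpu_sp0(void)
{
	return demo_cpu_tss.sp0;
}
```

Without the extra store in `demo_load_sp0()`, the accessor would keep returning a stale value after a task switch, which appears to be the situation the Xen and lguest hunks are meant to avoid.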
[PATCH v2 5/6] x86: Remove INIT_TSS and fold the definitions into cpu_tss
The INIT_TSS is unnecessary.  Just define the initial TSS where
cpu_tss is defined.

While we're at it, merge the 32-bit and 64-bit definitions.  The
only syntactic change is that 32-bit kernels were computing sp0 as
long, but now they compute it as unsigned long.

Verified by objdump: the contents and relocations of
.data..percpu..shared_aligned are unchanged on 32-bit and 64-bit
kernels.

Signed-off-by: Andy Lutomirski
---
 arch/x86/include/asm/processor.h | 20 
 arch/x86/kernel/process.c        | 20 +++-
 2 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 117ee65473e2..f5e3ec63767d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -818,22 +818,6 @@ static inline void spin_lock_prefetch(const void *x)
 	.io_bitmap_ptr		= NULL,					  \
 }
 
-/*
- * Note that the .io_bitmap member must be extra-big. This is because
- * the CPU will access an additional byte beyond the end of the IO
- * permission bitmap. The extra byte must be all 1 bits, and must
- * be within the limit.
- */
-#define INIT_TSS  {							  \
-	.x86_tss = {							  \
-		.sp0		= sizeof(init_stack) + (long)&init_stack, \
-		.ss0		= __KERNEL_DS,				  \
-		.ss1		= __KERNEL_CS,				  \
-		.io_bitmap_base	= INVALID_IO_BITMAP_OFFSET,		  \
-	 },								  \
-	.io_bitmap		= { [0 ... IO_BITMAP_LONGS] = ~0 },	  \
-}
-
 extern unsigned long thread_saved_pc(struct task_struct *tsk);
 
 #define THREAD_SIZE_LONGS      (THREAD_SIZE/sizeof(unsigned long))
@@ -892,10 +876,6 @@ extern unsigned long thread_saved_pc(struct task_struct *tsk);
 	.sp0 = (unsigned long)&init_stack + sizeof(init_stack) \
 }
 
-#define INIT_TSS  { \
-	.x86_tss.sp0 = (unsigned long)&init_stack + sizeof(init_stack) \
-}
-
 /*
  * Return saved PC of a blocked thread.
  * What is this good for? it will be always the scheduler or ret_from_fork.
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 6f6087349231..f4c0af7fc3a0 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -37,7 +37,25 @@
  * section. Since TSS's are completely CPU-local, we want them
  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
  */
-__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = INIT_TSS;
+__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = {
+	.x86_tss = {
+		.sp0 = (unsigned long)&init_stack + sizeof(init_stack),
+#ifdef CONFIG_X86_32
+		.ss0 = __KERNEL_DS,
+		.ss1 = __KERNEL_CS,
+		.io_bitmap_base	= INVALID_IO_BITMAP_OFFSET,
+#endif
+	 },
+#ifdef CONFIG_X86_32
+	 /*
+	  * Note that the .io_bitmap member must be extra-big. This is because
+	  * the CPU will access an additional byte beyond the end of the IO
+	  * permission bitmap. The extra byte must be all 1 bits, and must
+	  * be within the limit.
+	  */
+	.io_bitmap		= { [0 ... IO_BITMAP_LONGS] = ~0 },
+#endif
+};
 EXPORT_PER_CPU_SYMBOL_GPL(cpu_tss);
 
 #ifdef CONFIG_X86_64
-- 
2.1.0
[PATCH v2 6/6] x86, asm: Rename INIT_TSS_IST to TSS_IST
This has nothing to do with the init thread or the initial
anything.  It's just the TSS.

Signed-off-by: Andy Lutomirski
---
 arch/x86/kernel/entry_64.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 0c00fd80249a..c86f83e95f15 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -959,7 +959,7 @@ apicinterrupt IRQ_WORK_VECTOR \
 /*
  * Exception entry points.
  */
-#define INIT_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
+#define TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
 
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
@@ -1015,13 +1015,13 @@ ENTRY(\sym)
 	.endif
 
 	.if \shift_ist != -1
-	subq $EXCEPTION_STKSZ, INIT_TSS_IST(\shift_ist)
+	subq $EXCEPTION_STKSZ, TSS_IST(\shift_ist)
 	.endif
 
 	call \do_sym
 
 	.if \shift_ist != -1
-	addq $EXCEPTION_STKSZ, INIT_TSS_IST(\shift_ist)
+	addq $EXCEPTION_STKSZ, TSS_IST(\shift_ist)
 	.endif
 
 	/* these procedures expect "no swapgs" flag in ebx */
-- 
2.1.0
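For readers less used to the asm side, the arithmetic inside the renamed macro, `TSS_ist + ((x) - 1) * 8`, is just "offset of the IST array plus a 1-based index into 8-byte slots". The C sketch below illustrates that arithmetic only. `struct demo_tss` is a deliberately simplified, hypothetical layout and the `demo_*` helpers are invented for this example; it is not the real hardware TSS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_IST 7

/* Simplified stand-in, NOT the real hardware TSS layout. */
struct demo_tss {
	uint64_t sp0;
	uint64_t ist[DEMO_NR_IST];
};

/* Byte offset of 1-based IST slot x, mirroring TSS_ist + ((x) - 1) * 8. */
static size_t demo_ist_offset(int x)
{
	assert(x >= 1 && x <= DEMO_NR_IST);
	return offsetof(struct demo_tss, ist) + (size_t)(x - 1) * 8;
}

/* The same slot reached through ordinary array indexing, for comparison. */
static size_t demo_ist_offset_by_index(const struct demo_tss *t, int x)
{
	assert(t != NULL && x >= 1 && x <= DEMO_NR_IST);
	return (size_t)((const char *)&t->ist[x - 1] - (const char *)t);
}
```

Both helpers agree for every valid slot, which is all the macro relies on; the name change in the patch does not alter this computation.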
[PATCH v2 0/6] Baby steps toward cleaning up KERNEL_STACK_OFFSET
Denys is right that KERNEL_STACK_OFFSET is a mess.  Let's start fixing
it.

This removes all C code that *reads* kernel_stack.  It also fixes the
KERNEL_STACK_OFFSET abomination in ia32_sysenter_target.

It does not fix the KERNEL_STACK_OFFSET abomination in GET_THREAD_INFO
and THREAD_INFO.  I think that should be its own patch.

It also doesn't change the two syscall targets.  To fix them, we should
make a decision.  Either we should make KERNEL_STACK_OFFSET have the
correct nonzero value to save an instruction or we should get rid of
kernel_stack entirely.

Changes from v1:
 - Fix missing export.
 - Fix lguest code.
 - Add more init_tss naming cleanups (Ingo's suggestion).
 - Changelog improvements (Ingo).
 - Improve the check in ist_begin_non_atomic (Denys).

Andy Lutomirski (6):
  x86: Add this_cpu_sp0() to read sp0 for the current cpu
  x86: Switch all C consumers of kernel_stack to this_cpu_sp0
  x86, asm: Change the 32-bit sysenter code to use sp0
  x86: Rename init_tss to cpu_tss
  x86: Remove INIT_TSS and fold the definitions into cpu_tss
  x86, asm: Rename INIT_TSS_IST to TSS_IST

 arch/x86/ia32/ia32entry.S          | 3 +--
 arch/x86/include/asm/processor.h   | 27 ++-
 arch/x86/include/asm/thread_info.h | 3 +--
 arch/x86/kernel/asm-offsets_64.c   | 1 +
 arch/x86/kernel/cpu/common.c       | 6 +++---
 arch/x86/kernel/entry_64.S         | 6 +++---
 arch/x86/kernel/ioport.c           | 2 +-
 arch/x86/kernel/process.c          | 23 +--
 arch/x86/kernel/process_32.c       | 2 +-
 arch/x86/kernel/process_64.c       | 2 +-
 arch/x86/kernel/traps.c            | 4 ++--
 arch/x86/kernel/vm86_32.c          | 4 ++--
 arch/x86/lguest/boot.c             | 1 +
 arch/x86/power/cpu.c               | 2 +-
 arch/x86/xen/enlighten.c           | 1 +
 15 files changed, 46 insertions(+), 41 deletions(-)

-- 
2.1.0
[PATCH v2 2/6] x86: Switch all C consumers of kernel_stack to this_cpu_sp0
This will make modifying the semantics of kernel_stack easier.

The change to ist_begin_non_atomic() is necessary because sp0 no
longer points to the same THREAD_SIZE-aligned region as rsp; it's
one byte too high for that.  At Denys' suggestion, rather than
offsetting it, just check explicitly that we're in the correct range
ending at sp0.  This has the added benefit that we no longer assume
that the thread stack is aligned to THREAD_SIZE.

Signed-off-by: Andy Lutomirski
---
 arch/x86/include/asm/thread_info.h | 3 +--
 arch/x86/kernel/traps.c            | 4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 1d4e4f279a32..a2fa1899494e 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -159,8 +159,7 @@ DECLARE_PER_CPU(unsigned long, kernel_stack);
 static inline struct thread_info *current_thread_info(void)
 {
 	struct thread_info *ti;
-	ti = (void *)(this_cpu_read_stable(kernel_stack) +
-		      KERNEL_STACK_OFFSET - THREAD_SIZE);
+	ti = (void *)(this_cpu_sp0() - THREAD_SIZE);
 	return ti;
 }
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 42819886be0c..484eb03a3f32 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -174,8 +174,8 @@ void ist_begin_non_atomic(struct pt_regs *regs)
 	 * will catch asm bugs and any attempt to use ist_preempt_enable
 	 * from double_fault.
 	 */
-	BUG_ON(((current_stack_pointer() ^ this_cpu_read_stable(kernel_stack))
-		& ~(THREAD_SIZE - 1)) != 0);
+	BUG_ON((unsigned long)(this_cpu_sp0() - current_stack_pointer()) >=
+	       THREAD_SIZE);
 
 	preempt_count_sub(HARDIRQ_OFFSET);
 }
-- 
2.1.0
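The difference between the old XOR-and-mask test and the new range test in ist_begin_non_atomic() is easy to try out in user space. The sketch below is illustrative only: `DEMO_THREAD_SIZE` is an arbitrary power of two chosen for the example (the kernel's THREAD_SIZE depends on configuration), and the two `demo_*` helpers are hypothetical names that mirror the conditions in the hunk above with the sense inverted (they return true where BUG_ON would not fire).

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative value only; must be a power of two for the mask test. */
#define DEMO_THREAD_SIZE 16384UL

/* Old-style test: sp and top lie in the same DEMO_THREAD_SIZE-aligned block. */
static bool demo_same_aligned_block(unsigned long sp, unsigned long top)
{
	assert((DEMO_THREAD_SIZE & (DEMO_THREAD_SIZE - 1)) == 0);
	return ((sp ^ top) & ~(DEMO_THREAD_SIZE - 1)) == 0;
}

/*
 * New-style test: sp lies in (sp0 - DEMO_THREAD_SIZE, sp0].
 * If sp is above sp0 the unsigned subtraction wraps to a huge value,
 * so a single comparison rejects both "too low" and "too high".
 */
static bool demo_in_stack_below_sp0(unsigned long sp, unsigned long sp0)
{
	return (unsigned long)(sp0 - sp) < DEMO_THREAD_SIZE;
}
```

With an aligned stack whose top is `sp0`, the mask test applied to `sp0` itself fails for every valid stack pointer, because `sp0` already belongs to the next aligned block; that is the "one byte too high" problem from the changelog. The range test has no such dependence on alignment.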
[PATCH v2 4/6] x86: Rename init_tss to cpu_tss
It has nothing to do with init -- there's only one tss per cpu.

Other names considered include:
 - current_tss: Confusing because we never switch the tss.
 - singleton_tss: Too long.

This patch was generated with 's/init_tss/cpu_tss/g'.  Followup
patches will fix INIT_TSS and INIT_TSS_IST by hand.

Signed-off-by: Andy Lutomirski
---
 arch/x86/ia32/ia32entry.S        | 2 +-
 arch/x86/include/asm/processor.h | 4 ++--
 arch/x86/kernel/cpu/common.c     | 6 +++---
 arch/x86/kernel/entry_64.S       | 2 +-
 arch/x86/kernel/ioport.c         | 2 +-
 arch/x86/kernel/process.c        | 6 +++---
 arch/x86/kernel/process_32.c     | 2 +-
 arch/x86/kernel/process_64.c     | 2 +-
 arch/x86/kernel/vm86_32.c        | 4 ++--
 arch/x86/power/cpu.c             | 2 +-
 10 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 719db63b35c4..ad9efef65a6b 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -113,7 +113,7 @@ ENTRY(ia32_sysenter_target)
 	CFI_DEF_CFA	rsp,0
 	CFI_REGISTER	rsp,rbp
 	SWAPGS_UNSAFE_STACK
-	movq	PER_CPU_VAR(init_tss + TSS_sp0), %rsp
+	movq	PER_CPU_VAR(cpu_tss + TSS_sp0), %rsp
 	/*
 	 * No need to follow this irqs on/off section: the syscall
 	 * disabled irqs, here we enable it straight after entry:
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 71c3a826a690..117ee65473e2 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -282,7 +282,7 @@ struct tss_struct {
 
 } cacheline_aligned;
 
-DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss);
+DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss);
 
 /*
  * Save the original ist values for checking stack pointers during debugging
@@ -566,7 +566,7 @@ static inline void native_swapgs(void)
 
 static inline unsigned long this_cpu_sp0(void)
 {
-	return this_cpu_read_stable(init_tss.x86_tss.sp0);
+	return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
 }
 
 #ifdef CONFIG_PARAVIRT
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2346c95c6ab1..5d0f0cc7ea26 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -979,7 +979,7 @@ static void syscall32_cpu_init(void)
 void enable_sep_cpu(void)
 {
 	int cpu = get_cpu();
-	struct tss_struct *tss = &per_cpu(init_tss, cpu);
+	struct tss_struct *tss = &per_cpu(cpu_tss, cpu);
 
 	if (!boot_cpu_has(X86_FEATURE_SEP)) {
 		put_cpu();
@@ -1307,7 +1307,7 @@ void cpu_init(void)
 	 */
 	load_ucode_ap();
 
-	t = &per_cpu(init_tss, cpu);
+	t = &per_cpu(cpu_tss, cpu);
 	oist = &per_cpu(orig_ist, cpu);
 
 #ifdef CONFIG_NUMA
@@ -1391,7 +1391,7 @@ void cpu_init(void)
 {
 	int cpu = smp_processor_id();
 	struct task_struct *curr = current;
-	struct tss_struct *t = &per_cpu(init_tss, cpu);
+	struct tss_struct *t = &per_cpu(cpu_tss, cpu);
 	struct thread_struct *thread = &curr->thread;
 
 	wait_for_master_cpu(cpu);
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 622ce4254893..0c00fd80249a 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -959,7 +959,7 @@ apicinterrupt IRQ_WORK_VECTOR \
 /*
  * Exception entry points.
  */
-#define INIT_TSS_IST(x) PER_CPU_VAR(init_tss) + (TSS_ist + ((x) - 1) * 8)
+#define INIT_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
 
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index 4ddaf66ea35f..37dae792dbbe 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -54,7 +54,7 @@ asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on)
 	 * because the ->io_bitmap_max value must match the bitmap
 	 * contents:
 	 */
-	tss = &per_cpu(init_tss, get_cpu());
+	tss = &per_cpu(cpu_tss, get_cpu());
 
 	if (turn_on)
 		bitmap_clear(t->io_bitmap_ptr, from, num);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index ff5c9088b1c5..6f6087349231 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -37,8 +37,8 @@
  * section. Since TSS's are completely CPU-local, we want them
  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
  */
-__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss) = INIT_TSS;
-EXPORT_PER_CPU_SYMBOL_GPL(init_tss);
+__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = INIT_TSS;
+EXPORT_PER_CPU_SYMBOL_GPL(cpu_tss);
 
 #ifdef CONFIG_X86_64
 static DEFINE_PER_CPU(unsigned char, is_idle);
@@ -110,7 +110,7 @@ void exit_thread(void)
 	unsigned long *bp = t->io_bitmap_ptr;
 
 	if (bp) {
-		struct tss_struct *tss = &per_cpu(init_tss, get_cpu());
+
[PATCH v2 3/6] x86, asm: Change the 32-bit sysenter code to use sp0
The ia32 sysenter code loaded the top of the kernel stack into rsp
by loading kernel_stack and then adjusting it.  It can be simplified
to just read sp0 directly.  This requires the addition of a new
asm-offsets entry for sp0.

Signed-off-by: Andy Lutomirski
---
 arch/x86/ia32/ia32entry.S        | 3 +--
 arch/x86/kernel/asm-offsets_64.c | 1 +
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index ed9746340363..719db63b35c4 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -113,8 +113,7 @@ ENTRY(ia32_sysenter_target)
 	CFI_DEF_CFA	rsp,0
 	CFI_REGISTER	rsp,rbp
 	SWAPGS_UNSAFE_STACK
-	movq	PER_CPU_VAR(kernel_stack), %rsp
-	addq	$(KERNEL_STACK_OFFSET),%rsp
+	movq	PER_CPU_VAR(init_tss + TSS_sp0), %rsp
 	/*
 	 * No need to follow this irqs on/off section: the syscall
 	 * disabled irqs, here we enable it straight after entry:
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index fdcbb4d27c9f..5ce6f2da8763 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -81,6 +81,7 @@ int main(void)
 #undef ENTRY
 
 	OFFSET(TSS_ist, tss_struct, x86_tss.ist);
+	OFFSET(TSS_sp0, tss_struct, x86_tss.sp0);
 	BLANK();
 
 	DEFINE(__NR_syscall_max, sizeof(syscalls_64) - 1);
-- 
2.1.0
Re: [PATCH v2 1/7] ARM: at91: switch to multiplatform
On Thu, Mar 5, 2015 at 5:35 PM, Alexandre Belloni wrote:
> On 05/03/2015 at 16:50:57 -0600, Rob Herring wrote :
>> > -config SOC_SAMA5
>> > +config ARCH_AT91
>> >         bool
>> > -       select ATMEL_AIC5_IRQ
>> > +       select ARCH_REQUIRE_GPIOLIB
>> >         select COMMON_CLK_AT91
>> > -       select CPU_V7
>> > +       select CLKDEV_LOOKUP
>>
>> This is already selected by COMMON_CLK I think.
>>
>> >         select GENERIC_CLOCKEVENTS
>>
>> This is already selected.
>>
>
> I'm just moving options around I didn't add or remove any. That applies
> to most of your comments.

You are enabling multiplatform, which means you can drop selecting the
ones multiplatform selects. I've cleaned up the tree once for this
and I don't care to do it again.

Rob
[PATCH v6] x86: mce: kexec: switch MCE handler for kexec/kdump
On Thu, Mar 05, 2015 at 09:37:52AM +, Naoya Horiguchi wrote:
...
> > With the above simplified versions used, the rest of the patch becomes
> > almost trivial.
>
> Other than that, I'm OK to write in the simplified form.

Here is the updated one.
And I found some cleanups and/or tiny fixes (independent from this patch),
so will post them later.

Thanks,
Naoya Horiguchi
---
From 8890e9976c525a4b480bf5f86008641688de8c11 Mon Sep 17 00:00:00 2001
From: Naoya Horiguchi
Date: Fri, 6 Mar 2015 11:52:10 +0900
Subject: [PATCH v6] x86: mce: kexec: switch MCE handler for kexec/kdump

kexec disables (or "shoots down") all CPUs other than a crashing CPU before
entering the 2nd kernel. But the MCE handler is still enabled after that,
so if MCE happens and broadcasts over the CPUs after the main thread starts
the 2nd kernel (which might not initialize MCE device yet, or might decide
not to enable it,) MCE handler runs only on the other CPUs (not on the main
thread,) leading to kernel panic with MCE synchronization. The user-visible
effect of this bug is kdump failure.

Our standard MCE handler do_machine_check() makes some assumptions about the
system's status and it's hard to alter it to cover kexec/kdump context, so
let's add another kdump-specific one and switch to it.

Note that this problem exists since current MCE handler was implemented in
2.6.32, and recently commit 716079f66eac ("mce: Panic when a core has reached
a timeout") made it more visible by changing the default behavior of the
synchronization timeout from "ignore" to "panic".

Signed-off-by: Naoya Horiguchi
---
ChangeLog v5 -> v6:
- drop "CC stable" tag
- stop using/exporting mce_gather_info(), mce_(rd|wr)msrl(), and mce_panic()
- drop quirk_no_way_out() part, because quirk_sandybridge_ifu() (only
  possible callback) could just change a MCE_PANIC_SEVERITY case to
  a MCE_AR_SEVERITY case, which doesn't affect the panic/return decision.

ChangeLog v4 -> v5:
- drop MCE_UC/AR_SEVERITY re-ordering
- move most of code to arch/x86/kernel/crash.c
- export some MCE internal variables/routines via arch/x86/include/asm/mce.h

ChangeLog v3 -> v4:
- fixed AR and UC order in enum severity_level because UC is severer than AR
  by definition. Current code is not affected by this wrong order by chance.
- check severity in machine_check_under_kdump(), and call mce_panic() if
  the resultant severity is as bad as or worse than MCE_AR_SEVERITY.
- use static global variable kdump_cpu instead of mca_cfg->kdump_cpu
- reduce "#ifdef CONFIG_KEXEC"
- add "#ifdef CONFIG_X86_MCE" for declaration of machine_check_under_kdump()
  in mce.h
- update comment on switch_mce_handler_for_kdump()

ChangeLog v2 -> v3
- go to "switch MCE handler" approach

ChangeLog v1 -> v2
- clear MSR_IA32_MCG_CTL, MSR_IA32_MCx_CTL, and CR4.MCE instead of using
  global flag to ignore MCE events.
- fixed the description of the problem
---
 arch/x86/include/asm/mce.h                | 14 +
 arch/x86/kernel/cpu/mcheck/mce-internal.h | 13 -
 arch/x86/kernel/crash.c                   | 88 +++
 3 files changed, 102 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 51b26e895933..192267fcee73 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -248,4 +248,18 @@ struct cper_sec_mem_err;
 extern void apei_mce_report_mem_error(int corrected,
 				      struct cper_sec_mem_err *mem_err);
 
+enum severity_level {
+	MCE_NO_SEVERITY,
+	MCE_DEFERRED_SEVERITY,
+	MCE_UCNA_SEVERITY = MCE_DEFERRED_SEVERITY,
+	MCE_KEEP_SEVERITY,
+	MCE_SOME_SEVERITY,
+	MCE_AO_SEVERITY,
+	MCE_UC_SEVERITY,
+	MCE_AR_SEVERITY,
+	MCE_PANIC_SEVERITY,
+};
+
+int mce_severity(struct mce *a, int tolerant, char **msg, bool is_excp);
+
 #endif /* _ASM_X86_MCE_H */
diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 10b46906767f..909ee3ed95dd 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -1,18 +1,6 @@
 #include
 #include
 
-enum severity_level {
-	MCE_NO_SEVERITY,
-	MCE_DEFERRED_SEVERITY,
-	MCE_UCNA_SEVERITY = MCE_DEFERRED_SEVERITY,
-	MCE_KEEP_SEVERITY,
-	MCE_SOME_SEVERITY,
-	MCE_AO_SEVERITY,
-	MCE_UC_SEVERITY,
-	MCE_AR_SEVERITY,
-	MCE_PANIC_SEVERITY,
-};
-
 #define ATTR_LEN		16
 
 /* One object for each MCE bank, shared by all CPUs */
@@ -23,7 +11,6 @@ struct mce_bank {
 	char			attrname[ATTR_LEN];	/* attribute name */
 };
 
-int mce_severity(struct mce *a, int tolerant, char **msg, bool is_excp);
 struct dentry *mce_get_debugfs_dir(void);
 
 extern struct mce_bank *mce_banks;
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 6f3baedcb6f6..588a8b214356 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -34,6 +34,7 @@
 #in
Re: parent/child hierarchy for regulator
On Thu, Mar 05, 2015 at 12:22:34PM +, Mark Brown wrote:
> On Thu, Mar 05, 2015 at 06:35:36PM +0800, Peter Chen wrote:
>
> > Any good ways at code/dts to show parent/child hierarchy for regulator?
>
> There's plenty of examples in mainline...
>

Thanks, I will go back and study them again.

> > The related regulators at my platforms like below:
>
> > PMIC (SWB 5v) --> Switch Chip (GPIO Regulator) --> USB VBUS
> >
> > PMIC has one 5V regulator (eg, swbst at pfuse), and it is the input
> > for USB power switch chip, and there are two gpios at this switch
> > chip to control if 5V is output or not, we register these two gpios as
> > fixed regulators, currently, if regulator swbst is disabled, the
> > gpio regulator has no way to know, and cause the vbus voltage is wrong.
>
> Can you please clarify why you're registering two fixed voltage
> regulators for the switch chip and how you're doing that?

There are two fixed regulators, one for each of the two USB vbus lines. There
is no relationship between them, but both of them need the PMIC 5V
(swbst at pfuse) to be enabled.

> The picture
> above looks like you should just have a single regulator there and
> nothing should care if the either regulator is enabled when querying the
> parent for its voltage.

I do need to care about the parent's status. Currently the usb code does not
consider the parent regulator, so after the patch below the vbus voltage is
incorrect: the parent regulator is disabled after boot because it has no
user.

commit a6dcf9782f99a0d844b4d06f65cc990468424068
Author: Sean Cross
Date:   Mon May 26 16:45:40 2014 +0800

    regulator: pfuze100: Support SWB enable/disable

    The SWB regulators have the ability to be turned on and off. Add
    enable/disable support for these regulators.

-- 

Best Regards,
Peter Chen
Re: [PATCH] ipv4: ip_check_defrag should not assume that skb_network_offset is zero
From: Alexander Drozdov
Date: Thu, 5 Mar 2015 10:29:39 +0300

> ip_check_defrag() may be used by af_packet to defragment outgoing packets.
> skb_network_offset() of af_packet's outgoing packets is not zero.
>
> Signed-off-by: Alexander Drozdov

Applied, thanks.
Re: [PATCH] netlink: drop (int) cast on length arg in NLMSG_OK
From: Mike Frysinger
Date: Thu, 5 Mar 2015 00:47:08 -0500

> The NLMSG_OK macro compares three things:
>  - the len arg from the user
>  - a size_t: sizeof(struct nlmsghdr)
>  - an int: sizeof(struct nlmsghdr) casted
>  - an u32: the nlmsghdr->nlmsg_len member
>
> When building with -Wsign-compare, this macro triggers a signed compare
> warning.  This is because it compares len to an int, and then compares
> it to a u32.  If len is signed, we get a warning due to the last test.
> If len is unsigned, we get a warning due to the first test.  Like in
> strace:
> socketutils.c:145:8: warning: comparison between signed and unsigned
>       integer expressions [-Wsign-compare]
>
> Lets drop the int cast on the first sizeof.  This way, once the user
> casts len to an unsigned value, everything shakes out correctly.
>
> Signed-off-by: Mike Frysinger

I don't think we can change this.

If you get rid of the 'int' cast then code is going to end up with
an unsigned comparison for the first test even if 'len' is signed, and
that's a potential security issue.
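The concern about dropping the cast can be reproduced in a few lines of user-space C. What follows is a sketch under stated assumptions, not the real macro: `struct demo_hdr` is a hypothetical 16-byte stand-in for the netlink header, and the two `demo_*` helpers mirror the three tests described in the quoted changelog, with and without the `(int)` cast on `sizeof`. The explicit casts in the second helper spell out the conversions a compiler would apply implicitly on typical platforms where `int` is 32 bits wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the message header; not the real definition. */
struct demo_hdr {
	uint32_t len;	/* total message length, including this header */
	uint16_t type;
	uint16_t flags;
	uint32_t seq;
	uint32_t pid;
};

/* With the (int) cast: the first test is a signed comparison. */
static bool demo_ok_int_cast(const struct demo_hdr *h, int len)
{
	assert(h != NULL);
	return len >= (int)sizeof(struct demo_hdr) &&
	       h->len >= sizeof(struct demo_hdr) &&
	       h->len <= (uint32_t)len;	/* len >= 16 here, so the cast is safe */
}

/*
 * Without the cast: a signed len is converted to size_t for the first
 * test, so a negative value turns into a very large unsigned one.
 */
static bool demo_ok_no_cast(const struct demo_hdr *h, int len)
{
	assert(h != NULL);
	return (size_t)len >= sizeof(struct demo_hdr) &&
	       h->len >= sizeof(struct demo_hdr) &&
	       h->len <= (uint32_t)len;
}
```

Called with a negative `len` (for example an unchecked error return from a receive call), the first helper rejects the buffer while the second accepts it, which seems to be the kind of problem the reply has in mind.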