date:20140507

Re: [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!)

2014-05-07 Thread Minchan Kim

On Tue, Apr 29, 2014 at 02:21:19PM -0700, John Stultz wrote:
> Another few weeks and another volatile ranges patchset...
> 
> After getting the sense that a major objection to the earlier
> patches was the introduction of a new syscall (and its somewhat
> strange dual length/purged-bit return values), I spent some time
> trying to rework the vma manipulations so we can be sure we won't fail
> mid-way through changing volatility (basically making it atomic).
> I think I have it working, and thus, there is no longer the
> need for a new syscall, and we can go back to using madvise()
> to set and unset pages as volatile.

As I said in my reply to the other patch, I'm OK with this, but I'd
like to make it clear that we should support the zero-filled page model
as well as SIGBUS. If we want to use madvise, maybe we need another
advice flag like MADV_VOLATILE_SIGBUS.
> 
> 
> New changes are:
> 
> o Reworked vma manipulations to be atomic
> o Converted back to using madvise() as syscall interface
> o Integrated fix from Minchan to avoid SIGBUS faulting race
> o Caught/fixed subtle use-after-free bug w/ vma merging
> o Lots of minor cleanups and comment improvements
> 
> 
> Still on the TODO list
> 
> o Sort out how best to do page accounting when the volatility
>   is tracked on a per-mm basis.

What is your concern about page accounting?
Could you elaborate on it so that everybody understands your concern
clearly?

> o Revisit anonymous page aging on swapless systems

One idea is that we can force aging on a swapless system if the system
has volatile vmas or lazyfree pages. If the number of volatile vmas and
lazyfree pages is zero, we can stop the aging automatically.

> o Draft up re-adding tmpfs/shm file volatility support
> 
  o One concern from Minchan:
    I would really like the unmarking syscall to have O(1) cost.

The vrange syscall is for the benefit of others, not the caller. I mean that
if a process calls the vrange syscall, it sacrifices its own resources for
others when an emergency happens, so if the syscall is rather expensive,
nobody will want to use it.

One idea is to put an increasing counter in mm_struct and assign a token
from it to each volatile vma. Maybe we can squeeze it into the lower bits of
vma->vm_start if we don't want to bloat the vma size, because we always hold
mmap_sem for writing when we handle the vrange syscall.
Then we can store the token together with the purged mark in the pte when
the purge happens. With this, the unmarking syscall can bail out as soon as
it finds a purged entry, so the remaining ptes keep their purged state even
though the unmarking syscall is done. That's no problem, because if the vma
is marked volatile again, the token will change (i.e., increase) and no
longer match the pte's token. When a page fault occurs, we can compare the
tokens to decide whether to emit SIGBUS. If they don't match, we can ignore
the purged mark and just map a new page into the pte.

One problem is overflow of the counter. In that case, we can deliver a false
positive to the user, but that isn't severe either, because the user must be
prepared to handle SIGBUS anyway if they want to use the vrange syscall with
the SIGBUS model.
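
The token scheme described above can be modeled in a few lines of user-space C. This is only a sketch, not kernel code: every name here (struct vrange_mm, mark_volatile(), fault_is_sigbus(), and so on) is hypothetical, and treating a fault as SIGBUS only while the vma is currently volatile is an assumption about how the comparison would be used.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one counter per mm, one token per volatile vma. */
struct vrange_mm  { uint32_t counter; };
struct vrange_vma { uint32_t token; bool is_volatile; };
struct purged_pte { uint32_t token; bool purged; };

/* Marking volatile takes a fresh token, so stale purged ptes stop matching. */
static void mark_volatile(struct vrange_mm *mm, struct vrange_vma *vma)
{
	vma->token = ++mm->counter;
	vma->is_volatile = true;
}

/* Purging records the vma's current token in the pte. */
static void purge(const struct vrange_vma *vma, struct purged_pte *pte)
{
	pte->token = vma->token;
	pte->purged = true;
}

/* Unmarking is O(1) in this model: it never walks or cleans the ptes. */
static void mark_nonvolatile(struct vrange_vma *vma)
{
	vma->is_volatile = false;
}

/* Fault path: SIGBUS only if the purged pte belongs to the current marking. */
static bool fault_is_sigbus(const struct vrange_vma *vma,
			    const struct purged_pte *pte)
{
	return pte->purged && vma->is_volatile && pte->token == vma->token;
}
```

A pte purged under one marking no longer triggers SIGBUS once the range has been unmarked and marked again, because the vma's token has moved on. Counter wraparound is not handled here, which corresponds to the false-positive case mentioned above.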

> 
> Many thanks again to Minchan, Kosaki-san, Johannes, Jan, Rik,
> Hugh, and others for the great feedback and discussion at
> LSF-MM.
> 
> thanks
> -john
> 
> 
> Volatile ranges provides a method for userland to inform the kernel
> that a range of memory is safe to discard (ie: can be regenerated)
> but userspace may want to try to access it in the future.  It can be
> thought of as similar to MADV_DONTNEED, but that the actual freeing
> of the memory is delayed and only done under memory pressure, and the
> user can try to cancel the action and be able to quickly access any
> unpurged pages. The idea originated from Android's ashmem, but I've
> since learned that other OSes provide similar functionality.
> 
> This functionality allows for a number of interesting uses. One such
> example is: Userland caches that have kernel triggered eviction under
> memory pressure. This allows for the kernel to "rightsize" userspace
> caches for current system-wide workload. Things like image bitmap
> caches, or rendered HTML in a hidden browser tab, where the data is
> not visible and can be regenerated if needed, are good examples.
> 
> Both Chrome and Firefox already make use of volatile range-like
> functionality via the ashmem interface:
> https://hg.mozilla.org/releases/mozilla-b2g28_v1_3t/rev/a32c32b24a34
> 
> https://chromium.googlesource.com/chromium/src/base/+/47617a69b9a57796935e03d78931bd01b4806e70/memory/discardable_memory_allocator_android.cc
> 
> 
> The basic usage of volatile ranges is as so:
> 1) Userland marks a range of memory that can be regenerated if
> necessary as volatile
> 2) Before accessing the memory again, userland marks the memory as
> nonvolatile, and the kernel will provide notification if any pages in
> the range have been purged.
> 
> If userland accesses memory while it is volatile, it will either
> get the value stored at that memory if there has

Re: [PATCH 1/3] PM / OPP: Add support for descending order for cpufreq table

2014-05-07 Thread Viresh Kumar

On 8 May 2014 07:37, Jonghwan Choi  wrote:

As asked earlier by Nishanth:

- Avoid top-posting (the practice of putting your answer above the quoted
  text you are responding to).  It makes your response harder to read and
  makes a poor impression.

Reference: Documentation/development-process/2.Process

> I believe that 3 items are required for DVFS. Those are frequency, voltage,
> and divider value.

Not necessarily. People may need a multiplier as well or some other
configuration and so this stuff was left for drivers to implement.

> How about adding that divider value into struct dev_pm_opp like this;

Wouldn't work for all and so NAK.

> struct dev_pm_opp {
> struct list_head node;
>
> bool available;
> unsigned long rate;
> unsigned long u_volt;
> unsigned int ctl[2]; // Added
>
> struct device_opp *dev_opp;
> struct rcu_head head;
> };

Always paste a diff, it's impossible to read this :(

> In my test, it works very well..

Working isn't enough :)

You don't have a complicated list of dividers; these are simple
values from 0 to total-num-of-freq - 1, and that can be handled very
easily in your code. Please do it there.
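
As a rough illustration of keeping the divider in the driver, the sketch below fills a driver-private field with the table index. The struct and function names are hypothetical; this is not the kernel's cpufreq_frequency_table or any real driver.

```c
#include <stddef.h>

/* Hypothetical driver-local table entry, for illustration only. */
struct drv_freq_entry {
	unsigned int driver_data;	/* driver-private: here, the divider index */
	unsigned int frequency;		/* kHz */
};

/* Assign divider indices 0..n-1 in table order, inside the driver. */
static void drv_fill_divider_index(struct drv_freq_entry *table, size_t n)
{
	for (size_t i = 0; i < n; i++)
		table[i].driver_data = (unsigned int)i;
}

/* Return the divider index for a frequency, or -1 if it is not in the table. */
static int drv_divider_for(const struct drv_freq_entry *table, size_t n,
			   unsigned int khz)
{
	for (size_t i = 0; i < n; i++)
		if (table[i].frequency == khz)
			return (int)table[i].driver_data;
	return -1;
}
```

With this shape the common OPP structure needs no extra field, since the index is derived from the table position the driver already owns.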
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2/2] mm/page_alloc: DEBUG_VM checks for free_list placement of CMA and RESERVE pages

2014-05-07 Thread Joonsoo Kim

On Wed, May 07, 2014 at 04:59:07PM +0200, Vlastimil Babka wrote:
> On 05/07/2014 03:33 AM, Minchan Kim wrote:
> > On Mon, May 05, 2014 at 05:50:46PM +0200, Vlastimil Babka wrote:
> >> On 05/05/2014 04:36 PM, Sasha Levin wrote:
> >>> On 05/02/2014 08:08 AM, Vlastimil Babka wrote:
>  On 04/30/2014 11:46 PM, Sasha Levin wrote:
> >> On 04/03/2014 11:40 AM, Vlastimil Babka wrote:
>  For the MIGRATE_RESERVE pages, it is important they do not get 
>  misplaced
>  on free_list of other migratetype, otherwise the whole 
>  MIGRATE_RESERVE
>  pageblock might be changed to other migratetype in 
>  try_to_steal_freepages().
>  For MIGRATE_CMA, the pages also must not go to a different 
>  free_list, otherwise
>  they could get allocated as unmovable and result in CMA failure.
> 
>  This is ensured by setting the freepage_migratetype appropriately 
>  when placing
>  pages on pcp lists, and using the information when releasing them 
>  back to
>  free_list. It is also assumed that CMA and RESERVE pageblocks are 
>  created only
>  in the init phase. This patch adds DEBUG_VM checks to catch any 
>  regressions
>  introduced for this invariant.
> 
>  Cc: Yong-Taek Lee 
>  Cc: Bartlomiej Zolnierkiewicz 
>  Cc: Joonsoo Kim 
>  Cc: Mel Gorman 
>  Cc: Minchan Kim 
>  Cc: KOSAKI Motohiro 
>  Cc: Marek Szyprowski 
>  Cc: Hugh Dickins 
>  Cc: Rik van Riel 
>  Cc: Michal Nazarewicz 
>  Signed-off-by: Vlastimil Babka 
> >>
> >> Two issues with this patch.
> >>
> >> First:
> >>
> >> [ 3446.320082] kernel BUG at mm/page_alloc.c:1197!
> >> [ 3446.320082] invalid opcode:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
> >> [ 3446.320082] Dumping ftrace buffer:
> >> [ 3446.320082](ftrace buffer empty)
> >> [ 3446.320082] Modules linked in:
> >> [ 3446.320082] CPU: 1 PID: 8923 Comm: trinity-c42 Not tainted 
> >> 3.15.0-rc3-next-20140429-sasha-00015-g7c7e0a7-dirty #427
> >> [ 3446.320082] task: 88053e208000 ti: 88053e246000 task.ti: 
> >> 88053e246000
> >> [ 3446.320082] RIP: get_page_from_freelist (mm/page_alloc.c:1197 
> >> mm/page_alloc.c:1548 mm/page_alloc.c:2036)
> >> [ 3446.320082] RSP: 0018:88053e247778  EFLAGS: 00010002
> >> [ 3446.320082] RAX: 0003 RBX: eaf4 RCX: 
> >> 0008
> >> [ 3446.320082] RDX: 0002 RSI: 0003 RDI: 
> >> 00a0
> >> [ 3446.320082] RBP: 88053e247868 R08: 0007 R09: 
> >> 
> >> [ 3446.320082] R10: 88006ffcef00 R11:  R12: 
> >> 0014
> >> [ 3446.335888] R13: ea000115ffe0 R14: ea000115ffe0 R15: 
> >> 
> >> [ 3446.335888] FS:  7f8c9f059700() GS:88006ec0() 
> >> knlGS:
> >> [ 3446.335888] CS:  0010 DS:  ES:  CR0: 8005003b
> >> [ 3446.335888] CR2: 02cbc048 CR3: 00054cdb4000 CR4: 
> >> 06a0
> >> [ 3446.335888] DR0: 006de000 DR1: 006de000 DR2: 
> >> 
> >> [ 3446.335888] DR3:  DR6: 0ff0 DR7: 
> >> 0602
> >> [ 3446.335888] Stack:
> >> [ 3446.335888]  88053e247798 88006eddc0b8 0016 
> >> 
> >> [ 3446.335888]  88006ffd2068 88006ffdb008 0001 
> >> 
> >> [ 3446.335888]  88006ffdb000  0003 
> >> 0001
> >> [ 3446.335888] Call Trace:
> >> [ 3446.335888] __alloc_pages_nodemask (mm/page_alloc.c:2731)
> >> [ 3446.335888] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> >> [ 3446.335888] alloc_pages_vma (include/linux/mempolicy.h:76 
> >> mm/mempolicy.c:1998)
> >> [ 3446.335888] ? shmem_alloc_page (mm/shmem.c:881)
> >> [ 3446.335888] ? kvm_clock_read (arch/x86/include/asm/preempt.h:90 
> >> arch/x86/kernel/kvmclock.c:86)
> >> [ 3446.335888] shmem_alloc_page (mm/shmem.c:881)
> >> [ 3446.335888] ? __const_udelay (arch/x86/lib/delay.c:126)
> >> [ 3446.335888] ? __rcu_read_unlock (kernel/rcu/update.c:97)
> >> [ 3446.335888] ? find_get_entry (mm/filemap.c:979)
> >> [ 3446.335888] ? find_get_entry (mm/filemap.c:940)
> >> [ 3446.335888] ? find_lock_entry (mm/filemap.c:1024)
> >> [ 3446.335888] shmem_getpage_gfp (mm/shmem.c:1130)
> >> [ 3446.335888] ? sched_clock_local (kernel/sched/clock.c:214)
> >> [ 3446.335888] ? do_read_fault.isra.42 (mm/memory.c:3523)
> >> [ 3446.335888] shmem_fault (mm/shmem.c:1237)
> >> [ 3446.335888] ? do_read_fault.isra.42 (mm/memory.c:3523)
> >> [ 3446.335888] __do_fault

Re: [PATCH 2/2] mmc: rtsx: Revert "mmc: rtsx: add support for pre_req and post_req"

2014-05-07 Thread micky


Hi Lee

Sorry for the previous email. Only [PATCH 2/2] mmc: rtsx: Revert "mmc: rtsx:
add support for pre_req and post_req"
is needed as a 3.15 fix.

Best Regards.
micky.
On 04/29/2014 03:36 PM, Ulf Hansson wrote:

On 29 April 2014 03:54,  wrote:

>From: Micky Ching
>
>This reverts commit c42deffd5b53c9e583d83c7964854ede2f12410d.
>
>commit  did use
>mutex_unlock() in a tasklet, but mutex_unlock() can't be used in a
>tasklet (atomic context). The driver needs to use a mutex to avoid
>concurrency, so we can't use a tasklet here, and the patch needs to be removed.
>
>The spinlocks host->lock and pcr->lock may deadlock. One way to solve the
>deadlock is to remove host->lock in sd_isr_done_transfer(), but if we use a
>workqueue we can avoid using the spinlock and also avoid the problem.
>
>Signed-off-by: Micky Ching

Acked-by: Ulf Hansson




Re: [PATCH 1/3] PM / OPP: Add support for descending order for cpufreq table

2014-05-07 Thread Viresh Kumar

On Thu, May 8, 2014 at 7:25 AM, Nishanth Menon  wrote:
>> Is it acceptable?
>
> Personally, I feel that filling up driver_data should be left to the
> driver(caller of dev_pm_opp_init_cpufreq_table).

Exactly, and I never advised Jonghwan to update the common routine
for this. I wanted him to handle this in his driver only :)

Re: [PATCH 1/2] mmc: rtsx: Revert "mmc: rtsx: modify error handle and remove smatch warnings"

2014-05-07 Thread micky


Hi Lee

It seems Chris is too busy to respond, so could you help pick this
patch as a 3.15 fix?


Best Regards.
micky.
On 04/29/2014 03:30 PM, Ulf Hansson wrote:

On 29 April 2014 03:54,  wrote:

>From: Micky Ching
>
>This reverts commit 1f7b581b3ffcb2a8437397a02f4af89fa6934d08.
>
>The patch depends on commit c42deffd5b53c9e583d83c7964854ede2f12410d
>, but the previous
>patch was discarded. So we have to delete the patch.
>
>Signed-off-by: Micky Ching

Acked-by: Ulf Hansson



Re: [RFT PATCH -next ] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64

2014-05-07 Thread Masami Hiramatsu

(2014/05/08 13:47), Ananth N Mavinakayanahalli wrote:
> On Wed, May 07, 2014 at 08:55:51PM +0900, Masami Hiramatsu wrote:
> 
> ...
> 
>> +#if defined(CONFIG_PPC64) && (!defined(_CALL_ELF) || _CALL_ELF == 1)
>> +/*
>> + * On PPC64 ABIv1 the function pointer actually points to the
>> + * function's descriptor. The first entry in the descriptor is the
>> + * address of the function text.
>> + */
>> +#define constant_function_entry(fn) (((func_descr_t *)(fn))->entry)
>> +#else
>> +#define constant_function_entry(fn) ((unsigned long)(fn))
>> +#endif
>> +
>>  #endif /* __ASSEMBLY__ */
> 
> Hi Masami,
> 
> You could just use ppc_function_entry() instead.

No, I think ppc_function_entry() has two problems (on the latest -next kernel).

First, it is an inline function, which cannot be evaluated at build time.
Since NOKPROBE_SYMBOL() is used outside of any function, just like
EXPORT_SYMBOL(), we can only use preprocessor macros.
Second, on PPC64 ABI*v2*, ppc_function_entry() returns the local function entry,
which seems to be the global function entry + 2 insns. I'm not sure about the
implementation of kallsyms on PPC64 ABIv2, but I guess we need the global
function entry for kallsyms.
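
The first point can be seen in plain C, outside the kernel. In the sketch below, entry_macro() and entry_inline() are hypothetical stand-ins for the two approaches; it assumes GCC or Clang, which accept a function address cast to a pointer-sized integer in a file-scope initializer, and it does not model PPC64 function descriptors.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two approaches, for illustration only. */
#define entry_macro(fn) ((uintptr_t)(fn))

static inline uintptr_t entry_inline(void (*fn)(void))
{
	return (uintptr_t)fn;
}

static void probed(void) { }

/*
 * OK: the macro expands in place, so the initializer is just the
 * function's address, which the toolchain can resolve at link time.
 */
static uintptr_t blacklist_entry = entry_macro(probed);

/*
 * Not OK in C: a function call is not a constant expression, so a
 * file-scope initializer like the following is rejected:
 *
 *   static uintptr_t bad_entry = entry_inline(probed);
 */
```

This is why a data-emitting construct placed at file scope has to be built from macros rather than from inline helper functions.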

BTW, could you test this patch on the latest -next tree on PPC64 if possible?

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


Re: perf_fuzzer crash on pentium 4

2014-05-07 Thread Cyrill Gorcunov

On Thu, May 08, 2014 at 01:14:56AM -0400, Vince Weaver wrote:
> > 
> > There was a bug in the p4 pmu that Don (CC'ed) fixed not that long ago, but I fear
> > not all corner cases might be covered yet.
> 
> I hit the NMI warnings somewhat often on Intel hardware (Haswell, Core2) 
> but it usually doesn't make the system unusable like it does on p4.
> 
> I can try to get a trace, although I'm not sure it will be useful.  I 
> spent a lot of time getting a reproducible test case for the same warnings 
> on core2 and it was unclear what the problem was and it was never fixed.
> 
> The messages look like this:
> 
> [ 2944.203423] Uhhuh. NMI received for unknown reason 31 on CPU 0.
> [ 2944.208006] Do you have a strange power saving mode enabled?
> [ 2944.208006] Dazed and confused, but trying to continue
> [ 2944.208006] Uhhuh. NMI received for unknown reason 21 on CPU 0.
> [ 2944.208006] Do you have a strange power saving mode enabled?
> [ 2944.208006] Dazed and confused, but trying to continue
> [ 2944.208006] Uhhuh. NMI received for unknown reason 31 on CPU 0.
> [ 2944.208006] Do you have a strange power saving mode enabled?
> [ 2944.208006] Dazed and confused, but trying to continue
> 
> repeating forever, system is unusable.

Vince, is it possible to get a trace of exactly which events perf_fuzzer
pushed into the kernel? Maybe it would shed some light.

Re: [PATCH] cpufreq: Fix build error on some platforms that use cpufreq_for_each_*

2014-05-07 Thread Viresh Kumar

On 7 May 2014 22:03, Stratos Karafotis  wrote:
> On platforms that use cpufreq_for_each_* macros, build fails if
> CONFIG_CPU_FREQ=n, e.g. ARM/shmobile/koelsch/non-multiplatform:
>
> drivers/built-in.o: In function `clk_round_parent':
> clkdev.c:(.text+0xcf168): undefined reference to `cpufreq_next_valid'
> drivers/built-in.o: In function `clk_rate_table_find':
> clkdev.c:(.text+0xcf820): undefined reference to `cpufreq_next_valid'
> make[3]: *** [vmlinux] Error 1
>
> Fix this by making the cpufreq_next_valid function inline and moving it to
> cpufreq.h.
>
> Reported-by: Geert Uytterhoeven 
> Signed-off-by: Stratos Karafotis 
> ---
>  drivers/cpufreq/cpufreq.c | 11 ---
>  include/linux/cpufreq.h   | 11 +--
>  2 files changed, 9 insertions(+), 13 deletions(-)

Acked-by: Viresh Kumar 
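
The reason a header-level static inline avoids the link error can be sketched with a self-contained model. The names below (struct freq_entry, next_valid(), for_each_valid_entry()) are hypothetical and only mimic the shape of the iterator discussed in the patch; they are not the kernel's definitions.

```c
#include <stdbool.h>

#define FREQ_ENTRY_INVALID	(~0u)
#define FREQ_TABLE_END		(~1u)

struct freq_entry { unsigned int frequency; };

/*
 * As a static inline in a header, every user gets its own copy, so there
 * is no link-time dependency on a .c file that may not be built.
 */
static inline bool next_valid(struct freq_entry **pos)
{
	while ((*pos)->frequency != FREQ_TABLE_END) {
		if ((*pos)->frequency != FREQ_ENTRY_INVALID)
			return true;
		(*pos)++;
	}
	return false;
}

#define for_each_valid_entry(pos, table) \
	for (pos = (table); next_valid(&pos); pos++)

/* Example user: count the valid entries of a table. */
static unsigned int count_valid(struct freq_entry *table)
{
	struct freq_entry *pos;
	unsigned int n = 0;

	for_each_valid_entry(pos, table)
		n++;
	return n;
}
```

Code that only uses the iteration macro then builds regardless of whether the file that used to define the helper is compiled in.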

Re: perf_fuzzer crash on pentium 4

2014-05-07 Thread Cyrill Gorcunov

On Wed, May 07, 2014 at 10:00:50PM -0400, Don Zickus wrote:
> > 
> > If i'm right (btw it's possible to use addr2line helper?) then hwc->config
> > is corrupted and p4_config_get_bind returned nil simply because proper event
> > was not found. And I don't understand how it could happen because before
> > configuration gets written into hwc->config it's validated once obtained
> > from user-space as a raw event. Weird...
> 
> I think my commit 13beacee817d27a40ffc6f065ea0042685611dd5 explains this
> corruption.  Though I have to admit I haven't looked through the problem
> very closely yet.

Nope ;) Without the fix above we could (and we did) simply misconfigure the
counter, but never access out of array bounds. Anyway, Vince confirmed that
the fix from PeterZ healed the problem.

> IOW my lazy fix in that commit doesn't cover fuzzers and the real problem
> in p4_pmu_schedule_events. :-)

[patch -mm] mm, thp: avoid excessive compaction latency during fault fix

2014-05-07 Thread David Rientjes

mm-thp-avoid-excessive-compaction-latency-during-fault.patch excludes sync
compaction for all high order allocations other than thp.  What we really
want to do is suppress sync compaction for thp, but only during the page
fault path.

Orders greater than PAGE_ALLOC_COSTLY_ORDER aren't necessarily going to
loop again so this is the only way to exhaust our capabilities before
declaring that we can't allocate.

Reported-by: Hugh Dickins 
Signed-off-by: David Rientjes 
---
 mm/page_alloc.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2585,16 +2585,13 @@ rebalance:
 	if (page)
 		goto got_pg;
 
-	if (gfp_mask & __GFP_NO_KSWAPD) {
-		/*
-		 * Khugepaged is allowed to try MIGRATE_SYNC_LIGHT, the latency
-		 * of this allocation isn't critical.  Everything else, however,
-		 * should only be allowed to do MIGRATE_ASYNC to avoid excessive
-		 * stalls during fault.
-		 */
-		if ((current->flags & (PF_KTHREAD | PF_KSWAPD)) == PF_KTHREAD)
-			migration_mode = MIGRATE_SYNC_LIGHT;
-	}
+	/*
+	 * It can become very expensive to allocate transparent hugepages at
+	 * fault, so use asynchronous memory compaction for THP unless it is
+	 * khugepaged trying to collapse.
+	 */
+	if (!(gfp_mask & __GFP_NO_KSWAPD) || (current->flags & PF_KTHREAD))
+		migration_mode = MIGRATE_SYNC_LIGHT;
 
 	/*
 	 * If compaction is deferred for high-order allocations, it is because
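
The condition added by the hunk above can be read as a small truth table. The sketch below restates it in user-space C under two assumptions: is_thp stands in for (gfp_mask & __GFP_NO_KSWAPD), is_kthread stands in for (current->flags & PF_KTHREAD), and the mode is asynchronous unless the condition upgrades it. The enum and function names are hypothetical.

```c
#include <stdbool.h>

enum sketch_migrate_mode { SKETCH_ASYNC, SKETCH_SYNC_LIGHT };

/*
 * is_thp:     the allocation carries the THP-style "no kswapd" flag
 * is_kthread: the caller is a kernel thread (e.g. khugepaged)
 */
static enum sketch_migrate_mode pick_mode(bool is_thp, bool is_kthread)
{
	if (!is_thp || is_kthread)
		return SKETCH_SYNC_LIGHT;
	return SKETCH_ASYNC;	/* THP allocation in the page fault path */
}
```

Only the THP-at-fault case stays asynchronous; other high-order allocations and khugepaged get the light synchronous mode.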

Re: [PATCH v2 2/2] mm/compaction: avoid rescanning pageblocks in isolate_freepages

2014-05-07 Thread Joonsoo Kim

On Wed, May 07, 2014 at 02:09:10PM +0200, Vlastimil Babka wrote:
> The compaction free scanner in isolate_freepages() currently remembers PFN of
> the highest pageblock where it successfully isolates, to be used as the
> starting pageblock for the next invocation. The rationale behind this is that
> page migration might return free pages to the allocator when migration fails
> and we don't want to skip them if the compaction continues.
> 
> Since migration now returns free pages back to compaction code where they can
> be reused, this is no longer a concern. This patch changes isolate_freepages()
> so that the PFN for restarting is updated with each pageblock where isolation
> is attempted. Using stress-highalloc from mmtests, this resulted in 10%
> reduction of the pages scanned by the free scanner.

Hello,

Although this patch could reduce the number of pages scanned, it can also
skip scanning a fresh pageblock. If there is zone lock contention and we are
in async compaction, we stop scanning this pageblock immediately and
continue with the next pageblock. With this patch, next_free_pfn is updated
in this case too, so we never come back to this pageblock. Possibly this
lowers the compaction success rate, doesn't it?

Thanks.


[PATCH] sound: soc: intel: remove unneeded dependency from Makefile

2014-05-07 Thread Lad Prabhakar

From: "Lad, Prabhakar" 

This patch drops the entries from the Makefile, fixing the
following issues when make clean is run:

scripts/Makefile.clean:17: sound/soc/intel/board/Makefile: No such file or 
directory
scripts/Makefile.clean:17: sound/soc/intel/sst/Makefile: No such file or 
directory

Signed-off-by: Lad, Prabhakar 
---
 Found this issue on linux-next not sure if this is fixed in
 sound subsystem tree.
 
 sound/soc/intel/Makefile |6 --
 1 file changed, 6 deletions(-)

diff --git a/sound/soc/intel/Makefile b/sound/soc/intel/Makefile
index 57242c4..26544c5 100644
--- a/sound/soc/intel/Makefile
+++ b/sound/soc/intel/Makefile
@@ -32,9 +32,3 @@ PLATFORM_LIBS = platform-libs/controls_v2_dpcm.o
 
 snd-soc-sst-platform-objs := pcm.o compress.o $(PLATFORM_LIBS)
 obj-$(CONFIG_SND_SST_PLATFORM) += snd-soc-sst-platform.o
-
-# Relevant Machine driver
-obj-$(CONFIG_SND_SST_MACHINE) += board/
-
-# IPC driver
-obj-$(CONFIG_SND_INTEL_SST) += sst/
-- 
1.7.9.5


Re: [patch v3 6/6] mm, compaction: terminate async compaction when rescheduling

2014-05-07 Thread Joonsoo Kim

On Tue, May 06, 2014 at 07:22:52PM -0700, David Rientjes wrote:
> Async compaction terminates prematurely when need_resched(), see
> compact_checklock_irqsave().  This can never trigger, however, if the 
> cond_resched() in isolate_migratepages_range() always takes care of the 
> scheduling.
> 
> If the cond_resched() actually triggers, then terminate this pageblock scan 
> for 
> async compaction as well.

Hello,

I think that the same logic would be helpful for the cond_resched() in
isolate_freepages(). Also, isolate_freepages() doesn't have exit logic
for when it finds zone_lock contention. I think that fixing that would be
helpful as well.

Thanks.


Re: [PATCH v3] staging: rtl8188eu: fix potential leak in update_bcn_wps_ie()

2014-05-07 Thread Jes Sorensen

Christian Engelmayer  writes:
> Fix a potential leak in the error path of function update_bcn_wps_ie().
> Move the affected input verification to the beginning of the function so
> that it may return directly without leaking already allocated memory.
> Detected by Coverity - CID 1077718.
>
> Signed-off-by: Christian Engelmayer 
> ---
> v3: Resend after v2 failed to apply
>
>   * rebased against staging-next - commit 09c3fbba (staging: rtl8188eu:
> Remove 'u8 *pbuf' from struct recv_buf)
>   * fixed mua: no multipart, 7bit text/plain us-ascii
>
> v2: Added change suggested by Mateusz Guzik for the rtl8723au variant:
>
> Move the check before allocating the memory instead of freeing the
> resource afterwards in the error path.
>
> Compile tested and applies against branch staging-next of tree
> git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
> ---
>  drivers/staging/rtl8188eu/core/rtw_ap.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)

Acked-by: Jes Sorensen 
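
The pattern described in the changelog, checking inputs before allocating so that early returns cannot leak, looks roughly like this in generic C. copy_ie() is a hypothetical helper written for illustration; it is not the rtl8188eu code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper. Returns 0 on success, -1 on error. */
static int copy_ie(const unsigned char *ie, size_t len, unsigned char **out)
{
	unsigned char *buf;

	/* Validate first: nothing is allocated yet, so returning cannot leak. */
	if (ie == NULL || len == 0 || out == NULL)
		return -1;

	buf = malloc(len);
	if (buf == NULL)
		return -1;

	memcpy(buf, ie, len);
	*out = buf;	/* the caller owns the buffer and must free it */
	return 0;
}
```

Moving the check above the allocation removes the need for a cleanup path on the validation error.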

Re: [PATCH 4/4] MADV_VOLATILE: Add page purging logic & SIGBUS trap

2014-05-07 Thread Minchan Kim

On Tue, Apr 29, 2014 at 02:21:23PM -0700, John Stultz wrote:
> This patch adds the hooks in the vmscan logic to purge volatile pages
> and mark their pte as purged. With this, volatile pages will be purged
> under pressure, and their ptes' swap entries marked. If the purged pages
> are accessed before being marked non-volatile, we catch this and send a
> SIGBUS.
> 
> This is a simplified implementation that uses logic from Minchan's earlier
> efforts, so credit to Minchan for his work.
> 
> Cc: Andrew Morton 
> Cc: Android Kernel Team 
> Cc: Johannes Weiner 
> Cc: Robert Love 
> Cc: Mel Gorman 
> Cc: Hugh Dickins 
> Cc: Dave Hansen 
> Cc: Rik van Riel 
> Cc: Dmitry Adamushko 
> Cc: Neil Brown 
> Cc: Andrea Arcangeli 
> Cc: Mike Hommey 
> Cc: Taras Glek 
> Cc: Jan Kara 
> Cc: KOSAKI Motohiro 
> Cc: Michel Lespinasse 
> Cc: Minchan Kim 
> Cc: Keith Packard 
> Cc: linux...@kvack.org 
> Signed-off-by: John Stultz 
> ---
>  include/linux/mvolatile.h |   1 +
>  mm/internal.h |   2 -
>  mm/memory.c   |   7 +++
>  mm/mvolatile.c| 119 
> ++
>  mm/rmap.c |   5 ++
>  mm/vmscan.c   |  12 +
>  6 files changed, 144 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mvolatile.h b/include/linux/mvolatile.h
> index f53396b..8b797b7 100644
> --- a/include/linux/mvolatile.h
> +++ b/include/linux/mvolatile.h
> @@ -2,5 +2,6 @@
>  #define _LINUX_MVOLATILE_H
>  
>  int madvise_volatile(int bhv, unsigned long start, unsigned long end);
> +extern int purge_volatile_page(struct page *page);
>  
>  #endif /* _LINUX_MVOLATILE_H */
> diff --git a/mm/internal.h b/mm/internal.h
> index 07b6736..2213055 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -240,10 +240,8 @@ static inline void mlock_migrate_page(struct page 
> *newpage, struct page *page)
>  
>  extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);
>  
> -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  extern unsigned long vma_address(struct page *page,
>struct vm_area_struct *vma);
> -#endif
>  #else /* !CONFIG_MMU */
>  static inline int mlocked_vma_newpage(struct vm_area_struct *v, struct page 
> *p)
>  {
> diff --git a/mm/memory.c b/mm/memory.c
> index 037b812..cf024bd 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -61,6 +61,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -3067,6 +3068,12 @@ static int do_swap_page(struct mm_struct *mm, struct 
> vm_area_struct *vma,
>   migration_entry_wait(mm, pmd, address);
>   } else if (is_hwpoison_entry(entry)) {
>   ret = VM_FAULT_HWPOISON;
> + } else if (is_purged_entry(entry)) {
> + page_table = pte_offset_map_lock(mm, pmd, address,
> + );
> + if (likely(pte_same(*page_table, orig_pte)))
> + ret = VM_FAULT_SIGBUS;
> + goto unlock;
>   } else {
>   print_bad_pte(vma, address, orig_pte, NULL);
>   ret = VM_FAULT_SIGBUS;
> diff --git a/mm/mvolatile.c b/mm/mvolatile.c
> index 555d5c4..a7831d3 100644
> --- a/mm/mvolatile.c
> +++ b/mm/mvolatile.c
> @@ -232,3 +232,122 @@ out:
>  
>   return ret;
>  }
> +
> +
> +/**
> + * try_to_purge_one - Purge a volatile page from a vma
> + * @page: page to purge
> + * @vma: vma to purge page from
> + *
> + * Finds the pte for a page in a vma, marks the pte as purged
> + * and release the page.
> + */
> +static void try_to_purge_one(struct page *page, struct vm_area_struct *vma)
> +{
> + struct mm_struct *mm = vma->vm_mm;
> + pte_t *pte;
> + pte_t pteval;
> + spinlock_t *ptl;
> + unsigned long addr;
> +
> + VM_BUG_ON(!PageLocked(page));
> +
> + addr = vma_address(page, vma);
> + pte = page_check_address(page, mm, addr, , 0);
> + if (!pte)
> + return;
> +
> + BUG_ON(vma->vm_flags & (VM_SPECIAL|VM_LOCKED|VM_MIXEDMAP|VM_HUGETLB));
> +
> + flush_cache_page(vma, addr, page_to_pfn(page));
> + pteval = ptep_clear_flush(vma, addr, pte);
> +
> + update_hiwater_rss(mm);
> + if (PageAnon(page))
> + dec_mm_counter(mm, MM_ANONPAGES);
> + else
> + dec_mm_counter(mm, MM_FILEPAGES);

We can add the file-backed page part later, when we move on to support vrange-file.

> +
> + page_remove_rmap(page);
> + page_cache_release(page);
> +
> + set_pte_at(mm, addr, pte, swp_entry_to_pte(make_purged_entry()));
> +
> + pte_unmap_unlock(pte, ptl);
> + mmu_notifier_invalidate_page(mm, addr);
> +}
> +
> +
> +/**
> + * try_to_purge_vpage - check vma chain and purge from vmas marked volatile
> + * @page: page to purge
> + *
> + * Goes over all the vmas that hold a page, and where the vmas are volatile,
> + * purge the page from the vma.
> + *

Re: perf_fuzzer crash on pentium 4

2014-05-07 Thread Vince Weaver

On Thu, 8 May 2014, Cyrill Gorcunov wrote:

> > > The NMI issue is probably the only one that is p4 related, and I do get 
> > > the NMI warnings on other machines too, it's just the p4 is the only one 
> > > where it brings down the machine.
> > 
> > Vince, could you please provide more details on that? Is it possible
> > to somehow log which events were used by perf?
> 
> There was a bug in the p4 pmu that Don (CC'ed) fixed not that long ago, but I fear
> not all corner cases might be covered yet.

I hit the NMI warnings somewhat often on Intel hardware (Haswell, Core2) 
but it usually doesn't make the system unusable like it does on p4.

I can try to get a trace, although I'm not sure it will be useful.  I 
spent a lot of time getting a reproducible test case for the same warnings 
on core2 and it was unclear what the problem was and it was never fixed.

The messages look like this:

[ 2944.203423] Uhhuh. NMI received for unknown reason 31 on CPU 0.
[ 2944.208006] Do you have a strange power saving mode enabled?
[ 2944.208006] Dazed and confused, but trying to continue
[ 2944.208006] Uhhuh. NMI received for unknown reason 21 on CPU 0.
[ 2944.208006] Do you have a strange power saving mode enabled?
[ 2944.208006] Dazed and confused, but trying to continue
[ 2944.208006] Uhhuh. NMI received for unknown reason 31 on CPU 0.
[ 2944.208006] Do you have a strange power saving mode enabled?
[ 2944.208006] Dazed and confused, but trying to continue

repeating forever, system is unusable.

Vince
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2/2] kernel/stop_machine.c: remove false assignment to static

2014-05-07 Thread Fabian Frederick

On Wed, 7 May 2014 23:04:11 +0200
Peter Zijlstra  wrote:

> On Wed, May 07, 2014 at 10:46:56PM +0200, Fabian Frederick wrote:
> > This patch also fixes function prototype over 80 characters
> 
> And does it wrong.. Also, does it really matter to GCC that we init the
> bool? Surely it can see it's 0 and put it in .bss anyway?
It's considered redundant but maintainer's choice is the priority :)
 
> > Cc: Peter Zijlstra 
> > Cc: Andrew Morton 
> > Signed-off-by: Fabian Frederick 
> > ---
> >  kernel/stop_machine.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> > index 695f0c6..8d29ee2 100644
> > --- a/kernel/stop_machine.c
> > +++ b/kernel/stop_machine.c
> > @@ -42,7 +42,7 @@ struct cpu_stopper {
> >  
> >  static DEFINE_PER_CPU(struct cpu_stopper, cpu_stopper);
> >  static DEFINE_PER_CPU(struct task_struct *, cpu_stopper_task);
> > -static bool stop_machine_initialized = false;
> > +static bool stop_machine_initialized;
> >  
> >  /*
> >   * Avoids a race between stop_two_cpus and global stop_cpus, where
> > @@ -241,7 +241,8 @@ static void irq_cpu_stop_queue_work(void *arg)
> >   *
> >   * returns when both are completed.
> >   */
> > -int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *arg)
> > +int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn,
> > + void *arg)
> 
> It's only 84, so I didn't care to wrap it; it's more readable this way,
> but if you want it split, split it like:
> 
> int
> stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *arg)
> 
> Or if you really have to split arguments to it in groups that make
> sense; like:
> 
> int stop_two_cpus(unsigned int cpu1, unsigned int cpu2,
> cpu_stop_fn_t fn, void *arg)
> 
> But really, these are two unrelated changes, and should therefore not be
> in a single patch.
Ok, thanks

Fabian

Re: [PATCH] leds: Remove duplicated OOM message for individual driver

2014-05-07 Thread Bryan Wu

On Wed, May 7, 2014 at 8:25 PM, Xiubo Li  wrote:
> The per-driver OOM message is unnecessary because it duplicates the
> memory subsystem's generic OOM message.
>

Thanks, applied.
-Bryan

> Signed-off-by: Xiubo Li 
> ---
>  drivers/leds/leds-adp5520.c | 5 +
>  drivers/leds/leds-bd2802.c  | 4 +---
>  drivers/leds/leds-da903x.c  | 4 +---
>  drivers/leds/leds-da9052.c  | 3 +--
>  drivers/leds/leds-s3c24xx.c | 4 +---
>  drivers/leds/leds-sunfire.c | 4 +---
>  6 files changed, 6 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/leds/leds-adp5520.c b/drivers/leds/leds-adp5520.c
> index 7e311a1..9982691 100644
> --- a/drivers/leds/leds-adp5520.c
> +++ b/drivers/leds/leds-adp5520.c
> @@ -121,13 +121,10 @@ static int adp5520_led_probe(struct platform_device *pdev)
>
> led = devm_kzalloc(&pdev->dev, sizeof(*led) * pdata->num_leds,
> GFP_KERNEL);
> -   if (led == NULL) {
> -   dev_err(&pdev->dev, "failed to alloc memory\n");
> +   if (!led)
> return -ENOMEM;
> -   }
>
> ret = adp5520_led_prepare(pdev);
> -
> if (ret) {
> dev_err(&pdev->dev, "failed to write\n");
> return ret;
> diff --git a/drivers/leds/leds-bd2802.c b/drivers/leds/leds-bd2802.c
> index fb5a347..6078c15 100644
> --- a/drivers/leds/leds-bd2802.c
> +++ b/drivers/leds/leds-bd2802.c
> @@ -678,10 +678,8 @@ static int bd2802_probe(struct i2c_client *client,
> int ret, i;
>
> led = devm_kzalloc(&client->dev, sizeof(struct bd2802_led), GFP_KERNEL);
> -   if (!led) {
> -   dev_err(&client->dev, "failed to allocate driver data\n");
> +   if (!led)
> return -ENOMEM;
> -   }
>
> led->client = client;
> pdata = led->pdata = dev_get_platdata(&client->dev);
> diff --git a/drivers/leds/leds-da903x.c b/drivers/leds/leds-da903x.c
> index 2a4b87f..180bbd8 100644
> --- a/drivers/leds/leds-da903x.c
> +++ b/drivers/leds/leds-da903x.c
> @@ -109,10 +109,8 @@ static int da903x_led_probe(struct platform_device *pdev)
> }
>
> led = devm_kzalloc(&pdev->dev, sizeof(struct da903x_led), GFP_KERNEL);
> -   if (led == NULL) {
> -   dev_err(&pdev->dev, "failed to alloc memory for LED%d\n", id);
> +   if (!led)
> return -ENOMEM;
> -   }
>
> led->cdev.name = pdata->name;
> led->cdev.default_trigger = pdata->default_trigger;
> diff --git a/drivers/leds/leds-da9052.c b/drivers/leds/leds-da9052.c
> index 865d4fa..5aa7e31 100644
> --- a/drivers/leds/leds-da9052.c
> +++ b/drivers/leds/leds-da9052.c
> @@ -127,8 +127,7 @@ static int da9052_led_probe(struct platform_device *pdev)
> led = devm_kzalloc(&pdev->dev,
>sizeof(struct da9052_led) * pled->num_leds,
>GFP_KERNEL);
> -   if (led == NULL) {
> -   dev_err(&pdev->dev, "Failed to alloc memory\n");
> +   if (!led) {
> error = -ENOMEM;
> goto err;
> }
> diff --git a/drivers/leds/leds-s3c24xx.c b/drivers/leds/leds-s3c24xx.c
> index 98174e7..37a6a92 100644
> --- a/drivers/leds/leds-s3c24xx.c
> +++ b/drivers/leds/leds-s3c24xx.c
> @@ -77,10 +77,8 @@ static int s3c24xx_led_probe(struct platform_device *dev)
>
> led = devm_kzalloc(&dev->dev, sizeof(struct s3c24xx_gpio_led),
>GFP_KERNEL);
> -   if (led == NULL) {
> -   dev_err(&dev->dev, "No memory for device\n");
> +   if (!led)
> return -ENOMEM;
> -   }
>
> platform_set_drvdata(dev, led);
>
> diff --git a/drivers/leds/leds-sunfire.c b/drivers/leds/leds-sunfire.c
> index 388632d..0b8cc4a 100644
> --- a/drivers/leds/leds-sunfire.c
> +++ b/drivers/leds/leds-sunfire.c
> @@ -135,10 +135,8 @@ static int sunfire_led_generic_probe(struct platform_device *pdev,
> }
>
> p = devm_kzalloc(&pdev->dev, sizeof(*p), GFP_KERNEL);
> -   if (!p) {
> -   dev_err(&pdev->dev, "Could not allocate struct sunfire_drvdata\n");
> +   if (!p)
> return -ENOMEM;
> -   }
>
> for (i = 0; i < NUM_LEDS_PER_BOARD; i++) {
> struct led_classdev *lp = &p->leds[i].led_cdev;
> --
> 1.8.4
>

Re: [RFT PATCH -next ] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64

2014-05-07 Thread Ananth N Mavinakayanahalli

On Wed, May 07, 2014 at 08:55:51PM +0900, Masami Hiramatsu wrote:

...

> +#if defined(CONFIG_PPC64) && (!defined(_CALL_ELF) || _CALL_ELF == 1)
> +/*
> + * On PPC64 ABIv1 the function pointer actually points to the
> + * function's descriptor. The first entry in the descriptor is the
> + * address of the function text.
> + */
> +#define constant_function_entry(fn)  (((func_descr_t *)(fn))->entry)
> +#else
> +#define constant_function_entry(fn)  ((unsigned long)(fn))
> +#endif
> +
>  #endif /* __ASSEMBLY__ */

Hi Masami,

You could just use ppc_function_entry() instead.
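
To illustrate what the quoted comment describes, here is a minimal userspace
sketch of a descriptor-style function pointer. The names toy_func_descr and
toy_function_entry are made up for this example; it is not the kernel's
ppc_function_entry() implementation, only the idea that on a descriptor-based
ABI the pointer refers to a small structure whose first field is the text
address.

```c
/*
 * Hypothetical descriptor layout for illustration: the first entry holds
 * the address of the function text, as the quoted comment says.
 */
struct toy_func_descr {
	unsigned long entry;	/* address of the function text */
	unsigned long toc;	/* table-of-contents pointer */
	unsigned long env;	/* environment pointer */
};

/*
 * Return the text address for a function pointer. When the ABI uses
 * descriptors, dereference the descriptor; otherwise the pointer already
 * is the text address.
 */
static unsigned long toy_function_entry(const void *fn, int uses_descriptors)
{
	if (uses_descriptors)
		return ((const struct toy_func_descr *)fn)->entry;
	return (unsigned long)fn;
}
```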

Ananth


[PATCH] btree: Fix the bug to release whole btree nodes

2014-05-07 Thread Minfei Huang

I use the btree code from the latest kernel source (3.14-rc2) in my own
module. When the btree module is removed, a warning is raised:

kmem_cache_destroy btree_node: Slab cache still has objects
CPU: 13 PID: 9150 Comm: rmmod Tainted: GF  O 3.14.0-rc2 #1
Hardware name: Inspur NF5270M3/NF5270M3, BIOS CHEETAH_2.1.3 09/10/2013
881ff8643b18 881ffdc23ea8 815a4ecc 
881ff8643ac0 881ffdc23ec8 811610df 0880
a057da60 881ffdc23ed8 a057d57c 881ffdc23f78
Call Trace:
[] dump_stack+0x49/0x5d
[] kmem_cache_destroy+0xcf/0xe0
[] btree_module_exit+0x10/0x12 [btree]
[] SyS_delete_module+0x198/0x1f0
[] ? retint_swapgs+0xe/0x13
[] ? trace_hardirqs_on_caller+0xfd/0x1c0
[] ? trace_hardirqs_on_thunk+0x3a/0x3f
[] system_call_fastpath+0x16/0x1b

The cause is that btree_destroy() doesn't release the last btree node
when height = 1 and fill = 1.
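
A minimal userspace sketch of the leak and the fix, for illustration only.
toy_pool and toy_btree_head are hypothetical stand-ins for the kernel's
mempool and struct btree_head, and the live counter plays the role of the
slab's "still has objects" check.

```c
#include <stdlib.h>

/* Toy stand-in for a mempool: only counts outstanding objects. */
struct toy_pool {
	int live;	/* objects allocated and not yet freed */
};

/* Toy stand-in for struct btree_head. */
struct toy_btree_head {
	void *node;		/* root node, NULL when the tree is empty */
	struct toy_pool *pool;
};

static void *toy_pool_alloc(struct toy_pool *pool)
{
	void *obj = malloc(64);

	if (obj)
		pool->live++;
	return obj;
}

static void toy_pool_free(void *obj, struct toy_pool *pool)
{
	free(obj);
	pool->live--;
}

/*
 * Mirrors the idea of the patched btree_destroy(): a root node that is
 * still attached is returned to the pool before the pool goes away.
 * Clearing head->node afterwards is an extra step of this sketch.
 */
static void toy_btree_destroy(struct toy_btree_head *head)
{
	if (head->node) {
		toy_pool_free(head->node, head->pool);
		head->node = NULL;
	}
	head->pool = NULL;
}
```

Without the head->node check, the pool would be torn down while live is
still 1, which corresponds to the warning above.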

Signed-off-by: Minfei Huang 
CC: Joern Engel 
CC: Johannes Berg 
---
 lib/btree.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/lib/btree.c b/lib/btree.c
index f9a4846..725bf8b 100644
--- a/lib/btree.c
+++ b/lib/btree.c
@@ -198,6 +198,8 @@ EXPORT_SYMBOL_GPL(btree_init);
 
 void btree_destroy(struct btree_head *head)
 {
+   if (head->node)
+   mempool_free(head->node, head->mempool);
mempool_destroy(head->mempool);
head->mempool = NULL;
 }
-- 
1.7.1



Re: [PATCH 2/4] drm/exynos/mixer: use MXR_GRP_SXY_SY

2014-05-07 Thread Seung-Woo Kim

Hello Daniel,

On 2014년 05월 07일 23:14, Daniel Kurtz wrote:
> On Wed, May 7, 2014 at 1:14 PM, Seung-Woo Kim  wrote:
>> Hi Daniel,
>>
>> On 2014년 05월 05일 00:26, Daniel Kurtz wrote:
>>> Mixer hardware supports offsetting dma from start of source buffer using
>>> the MXR_GRP_SXY register.
>>>
>>> Signed-off-by: Daniel Kurtz 
>>> ---
>>>  drivers/gpu/drm/exynos/exynos_mixer.c | 8 +++-
>>>  1 file changed, 3 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/exynos/exynos_mixer.c b/drivers/gpu/drm/exynos/exynos_mixer.c
>>> index 475eb49..40cf39b 100644
>>> --- a/drivers/gpu/drm/exynos/exynos_mixer.c
>>> +++ b/drivers/gpu/drm/exynos/exynos_mixer.c
>>> @@ -529,13 +529,11 @@ static void mixer_graph_buffer(struct mixer_context *ctx, int win)
>>>
>>>   dst_x_offset = win_data->crtc_x;
>>>   dst_y_offset = win_data->crtc_y;
>>> + src_x_offset = win_data->fb_x;
>>> + src_y_offset = win_data->fb_y;
>>>
>>>   /* converting dma address base and source offset */
>>> - dma_addr = win_data->dma_addr
>>> - + (win_data->fb_x * win_data->bpp >> 3)
>>> - + (win_data->fb_y * win_data->fb_width * win_data->bpp >> 3);
>>> - src_x_offset = 0;
>>> - src_y_offset = 0;
>>> + dma_addr = win_data->dma_addr;
>>
>> Basically, you are right and the source offset register can be used. But
>> because the mixer resolution is limited to 1920x1080, I chose to modify
>> the source dma address so that one framebuffer bigger than 1920x1080
>> can be shown on both fimd and hdmi.
> 
> Hi Seung-Woo,
> 
> I do not see why the maximum MIXER resolution matters for choosing
> between offsetting BASE or using SXY.
> 
> Let's say you have one big 1920x1908 framebuffer, with a span of 1920,
> starting at dma_addr (there is no extra padding at the end of the
> line).
> Let's say you wanted the mixer to scan out 1920x1080 pixels starting
> from (0, 800) in the framebuffer, and start drawing them at (0,0) on
> the screen.
> 
> What we currently do is:
>   BASE = dma_addr + (800 * 1920 * 4)
>   SPAN = 1920
>   SXY = SX(0) | SY(0)
>   WH = W(1920) | H(1080)
>   DXY = DX(0) | DY(0)
> 
> I am proposing we do:
>   BASE = dma_addr
>   SPAN = 1920
>   SXY = SX(0) | SY(800)
>   WH = W(1920) | H(1080)
>   DXY = DX(0) | DY(0)
> 
> In both cases, the mixer resolution is 1920x1080.

In my test, I tried to show each half of one big framebuffer (3840 x 1080),
FIMD from 0 to 1079 and MIXER from 1080 to 3839, on exynos4210 and
exynos4412, and it failed to show a proper hdmi display. The same happens
with a 1920 x 2160 framebuffer. AFAIK, this is mainly because the mixer dma
has a limit on dma memory size.

In this case, I set the registers like this:
  BASE = dma_addr /* 3840 x 1080 x 4 */
  SPAN = 3840
  SXY = SX(1920) | SY(0)
  WH = W(1920) | H(1080)
  DXY = DX(0) | DY(0)
or:
  BASE = dma_addr /* 1920 x 2160 x 4 */
  SPAN = 1920
  SXY = SX(0) | SY(1080)
  WH = W(1920) | H(1080)
  DXY = DX(0) | DY(0)
but neither setting showed the hdmi display as I expected, so I used a
modified dma address.
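
For reference, the modified-base arithmetic from the lines the patch removes
can be sketched as a standalone helper. toy_scanout_base is a hypothetical
name and the parameters mirror the win_data fields used in the diff; this is
an illustration, not driver code.

```c
#include <stdint.h>

/*
 * Fold a source offset (fb_x, fb_y) into the DMA base address, the way
 * the removed lines do: skip fb_y full lines of fb_width pixels, then
 * fb_x pixels, with bpp bits per pixel (bits >> 3 gives bytes).
 */
static uint64_t toy_scanout_base(uint64_t dma_addr, uint32_t fb_x,
				 uint32_t fb_y, uint32_t fb_width,
				 uint32_t bpp)
{
	return dma_addr
		+ ((uint64_t)fb_x * bpp >> 3)
		+ ((uint64_t)fb_y * fb_width * bpp >> 3);
}
```

With a 1920-pixel span and 32 bpp, starting at row 800 skips
800 * 1920 * 4 = 6144000 bytes. With the SXY approach the base stays at
dma_addr and the same offset is expressed as SX/SY instead.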

> 
> My motivation for wanting to program an un-modified dma_addr into BASE
> is so we can then just check BASE_S to determine from which buffer the
> mixer is actively being scanned out without worrying about the source
> offset, since the source offset can change for a given framebuffer
> (for example, when doing panning, or if an overlay is used for a HW
> cursor).

Actually, this patch is exactly the same as my first implementation, so I
completely understand your motivation. Anyway, I was focused on extended
displays with one buffer, so I wrote the modified dma base address.

Thanks and Regards,
- Seung-Woo Kim

> 
> Best Regards,
> -Daniel
> 
>>
>> Regards,
>> - Seung-Woo Kim
>>
>>>
>>>   if (win_data->scan_flags & DRM_MODE_FLAG_INTERLACE)
>>>   ctx->interlace = true;
>>>
>>
>> --
>> Seung-Woo Kim
>> Samsung Software R&D Center
>> --
>>
> 

-- 
Seung-Woo Kim
Samsung Software R&D Center
--


Re: [PATCH] CPU hotplug: Slow down hotplug operations

2014-05-07 Thread Srivatsa S. Bhat

On 05/08/2014 01:52 AM, Thomas Gleixner wrote:
> On Wed, 7 May 2014, Andrew Morton wrote:
>> On Wed,  7 May 2014 21:57:41 +0200 Borislav Petkov  wrote:
>>
>>> We have all those eager tester dudes which scratch up a dirty script to
>>> pound on CPU hotplug senselessly and then report bugs they've managed to
>>> trigger.
>>>
>>> Well, first of all, most, if not all, bugs they trigger are CPU hotplug
>>> related anyway. But we know hotplug is full of duct tape and brown
>>> paper bags. So we end up clearly wasting too much time dealing with a
>>> mechanism we know it is b0rked in the first place.
>>>
>>> Oh, and I would understand if that pounding were close to some real
>>> usage patterns but I've yet to receive a justification for toggling
>>> cores on- and offline senselessly.
>>>
>>> In any case, before this gets rewritten properly (I'm being told we
>>> might get lucky after all) let's slow down hotplugging on purpose and
>>> thus make it uninteresting, as a temporary brown paper bag solution
>>> until the real thing gets done.
>>>
>>> This way we'll save us a lot of time and efforts in chasing the wrong
>>> bugs.
>>
>> Well, I only yesterday merged Srivatsa's `CPU hotplug, stop-machine:
>> plug race-window that leads to "IPI-to-offline-CPU"' bugfix.  That bug
>> presumably wouldn't have been fixed if this patch was in place.
> 
> True.
> 
> OTOH, if people would have spent the same amount of time to rewrite
> the hotplug mess, we would have a way bigger benefit. But no, we
> prefer to add more layers of duct tape and bandaid hackery to it. 
> 
> I tried a redesign and ran out of cycles, but the patches are out
> there and none of the folks who promised to complete them ever
> delivered. If nothing fundamental changes, I'm going to spend some
> serious time on it in the next couple of months.
>

Yeah, that's quite unfortunate. Even several of my own attempts to try
and fix some of the chronic issues of CPU hotplug (such as the removal
of CPU hotplug's dependency on stop-machine, consolidation of all the
duplicated and buggy CPU hotplug code in various architectures etc.) all
met a similar fate. Initially there was some amount of consensus on these
patchsets and designs, but eventually they got nowhere due to lack of any
further feedback or signs of upstream acceptance.


Stop-machine()-free CPU hotplug, v6:
http://lwn.net/Articles/538819/

With performance improvements:
http://article.gmane.org/gmane.linux.kernel/1435249

Attempt to upstream that patchset in parts, v3:
http://lwn.net/Articles/556727/

Generic SMP boot/cpu-hotplug framework to consolidate arch/ code:
https://lwn.net/Articles/500185/

But, luckily the recent work to fix the notifier deadlock mess actually
went upstream, fairly quickly. So we have one less CPU hotplug problem
to fix! :-)
https://lkml.org/lkml/2014/3/10/522

Regards,
Srivatsa S. Bhat


[PATCH] leds: Remove duplicated OOM message for individual driver

2014-05-07 Thread Xiubo Li

The per-driver OOM message is unnecessary because it duplicates the
memory subsystem's generic OOM message.

Signed-off-by: Xiubo Li 
---
 drivers/leds/leds-adp5520.c | 5 +
 drivers/leds/leds-bd2802.c  | 4 +---
 drivers/leds/leds-da903x.c  | 4 +---
 drivers/leds/leds-da9052.c  | 3 +--
 drivers/leds/leds-s3c24xx.c | 4 +---
 drivers/leds/leds-sunfire.c | 4 +---
 6 files changed, 6 insertions(+), 18 deletions(-)

diff --git a/drivers/leds/leds-adp5520.c b/drivers/leds/leds-adp5520.c
index 7e311a1..9982691 100644
--- a/drivers/leds/leds-adp5520.c
+++ b/drivers/leds/leds-adp5520.c
@@ -121,13 +121,10 @@ static int adp5520_led_probe(struct platform_device *pdev)
 
led = devm_kzalloc(&pdev->dev, sizeof(*led) * pdata->num_leds,
GFP_KERNEL);
-   if (led == NULL) {
-   dev_err(&pdev->dev, "failed to alloc memory\n");
+   if (!led)
return -ENOMEM;
-   }
 
ret = adp5520_led_prepare(pdev);
-
if (ret) {
dev_err(&pdev->dev, "failed to write\n");
return ret;
diff --git a/drivers/leds/leds-bd2802.c b/drivers/leds/leds-bd2802.c
index fb5a347..6078c15 100644
--- a/drivers/leds/leds-bd2802.c
+++ b/drivers/leds/leds-bd2802.c
@@ -678,10 +678,8 @@ static int bd2802_probe(struct i2c_client *client,
int ret, i;
 
led = devm_kzalloc(&client->dev, sizeof(struct bd2802_led), GFP_KERNEL);
-   if (!led) {
-   dev_err(&client->dev, "failed to allocate driver data\n");
+   if (!led)
return -ENOMEM;
-   }
 
led->client = client;
pdata = led->pdata = dev_get_platdata(&client->dev);
diff --git a/drivers/leds/leds-da903x.c b/drivers/leds/leds-da903x.c
index 2a4b87f..180bbd8 100644
--- a/drivers/leds/leds-da903x.c
+++ b/drivers/leds/leds-da903x.c
@@ -109,10 +109,8 @@ static int da903x_led_probe(struct platform_device *pdev)
}
 
led = devm_kzalloc(&pdev->dev, sizeof(struct da903x_led), GFP_KERNEL);
-   if (led == NULL) {
-   dev_err(&pdev->dev, "failed to alloc memory for LED%d\n", id);
+   if (!led)
return -ENOMEM;
-   }
 
led->cdev.name = pdata->name;
led->cdev.default_trigger = pdata->default_trigger;
diff --git a/drivers/leds/leds-da9052.c b/drivers/leds/leds-da9052.c
index 865d4fa..5aa7e31 100644
--- a/drivers/leds/leds-da9052.c
+++ b/drivers/leds/leds-da9052.c
@@ -127,8 +127,7 @@ static int da9052_led_probe(struct platform_device *pdev)
led = devm_kzalloc(&pdev->dev,
   sizeof(struct da9052_led) * pled->num_leds,
   GFP_KERNEL);
-   if (led == NULL) {
-   dev_err(&pdev->dev, "Failed to alloc memory\n");
+   if (!led) {
error = -ENOMEM;
goto err;
}
diff --git a/drivers/leds/leds-s3c24xx.c b/drivers/leds/leds-s3c24xx.c
index 98174e7..37a6a92 100644
--- a/drivers/leds/leds-s3c24xx.c
+++ b/drivers/leds/leds-s3c24xx.c
@@ -77,10 +77,8 @@ static int s3c24xx_led_probe(struct platform_device *dev)
 
led = devm_kzalloc(&dev->dev, sizeof(struct s3c24xx_gpio_led),
   GFP_KERNEL);
-   if (led == NULL) {
-   dev_err(&dev->dev, "No memory for device\n");
+   if (!led)
return -ENOMEM;
-   }
 
platform_set_drvdata(dev, led);
 
diff --git a/drivers/leds/leds-sunfire.c b/drivers/leds/leds-sunfire.c
index 388632d..0b8cc4a 100644
--- a/drivers/leds/leds-sunfire.c
+++ b/drivers/leds/leds-sunfire.c
@@ -135,10 +135,8 @@ static int sunfire_led_generic_probe(struct platform_device *pdev,
}
 
p = devm_kzalloc(&pdev->dev, sizeof(*p), GFP_KERNEL);
-   if (!p) {
-   dev_err(&pdev->dev, "Could not allocate struct sunfire_drvdata\n");
+   if (!p)
return -ENOMEM;
-   }
 
for (i = 0; i < NUM_LEDS_PER_BOARD; i++) {
struct led_classdev *lp = &p->leds[i].led_cdev;
-- 
1.8.4


Re: [PATCH v1 03/11] perf: Allow for multiple ring buffers per event

2014-05-07 Thread Alexander Shishkin

Peter Zijlstra  writes:

> How about something like this for the itrace thing?

It's much nicer than the page swizzling draft I was about to send you.

> You would mmap() the regular buffer; then write ->aux_{offset,size} in
> the control page. After which you can do a second mmap() with the .pgoff
> matching the aux_offset you gave and .length matching the aux_size you
> gave.

Why do we need aux_{offset,size} at all, then? Userspace should know how
they mmap()ed it.

> This way the mmap() content still looks like a single linear file (could
> be sparse if you leave a hole, although we could require the aux_offset
> to match the end of the data section).
>
> And there is still the single event->rb, not more.

Fair enough.

> Then, when data inside that aux data store changes they should inject a
> PERF_RECORD_AUX to indicate this did happen, which ties it back into the
> normal event flow.
>
> With this there should be no difficult page table tricks or anything.

True.

> The patch is way incomplete but should sketch enough of the idea..

Can I take it over?

> So the aux_head/tail values should also be in the file space and not
> start at 0 again, similar for the offsets in the AUX record.

With PERF_RECORD_AUX carrying offset and size, we shouldn't need
aux_{head,tail} either, don't you think?
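
To make the layout being discussed concrete, here is a small userspace sketch
of the consistency checks the draft applies to the second mmap(): the aux area
must not overlap the data section, must be page aligned, and its page offset
must match the vma's pgoff. toy_aux_mmap_ok and the TOY_PAGE_* constants are
made up for this example, and a 4 KiB page size is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page geometry for the sketch (4 KiB pages). */
#define TOY_PAGE_SHIFT	12
#define TOY_PAGE_SIZE	(1ULL << TOY_PAGE_SHIFT)

/*
 * Check that an aux mapping request is consistent with the offsets
 * published in the control page. All offsets are byte offsets into the
 * mmap()ed file; vm_pgoff is the page offset passed to the second mmap().
 */
static bool toy_aux_mmap_ok(uint64_t data_offset, uint64_t data_size,
			    uint64_t aux_offset, uint64_t vm_pgoff)
{
	/* The aux area has to start at or after the end of the data section. */
	if (aux_offset < data_offset + data_size)
		return false;

	/* The aux area has to start on a page boundary. */
	if (aux_offset & (TOY_PAGE_SIZE - 1))
		return false;

	/* The page offset of the mapping has to match the advertised offset. */
	return (aux_offset >> TOY_PAGE_SHIFT) == vm_pgoff;
}
```

With one control page followed by eight data pages, the data section ends at
byte 36864, so an aux mapping at pgoff 9 is the first one that passes.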

>
> ---
>  include/uapi/linux/perf_event.h | 19 +++
>  kernel/events/core.c| 51 
> +
>  kernel/events/internal.h|  6 +
>  kernel/events/ring_buffer.c |  8 +--
>  4 files changed, 72 insertions(+), 12 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 853bc1ccb395..adef7c0f1e7c 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -491,6 +491,13 @@ struct perf_event_mmap_page {
>*/
>   __u64   data_head;  /* head in the data section */
>   __u64   data_tail;  /* user-space written tail */
> + __u64   data_offset;
> + __u64   data_size;
> +
> + __u64   aux_head;
> + __u64   aux_tail;
> + __u64   aux_offset;
> + __u64   aux_size;
>  };
>  
>  #define PERF_RECORD_MISC_CPUMODE_MASK(7 << 0)
> @@ -705,6 +712,18 @@ enum perf_event_type {
>*/
>   PERF_RECORD_MMAP2   = 10,
>  
> + /*
> +  * Records that new data landed in the AUX buffer part.
> +  *
> +  * struct {
> +  *  struct perf_event_headerheader;
> +  *
> +  *  u64 aux_offset;
> +  *  u64 aux_size;
> +  * };
> +  */
> + PERF_RECORD_AUX = 11,
> +
>   PERF_RECORD_MAX,/* non-ABI */
>  };
>  
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 5129b1201050..993995a23b73 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4016,7 +4016,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)
>  
>  static const struct vm_operations_struct perf_mmap_vmops = {
>   .open   = perf_mmap_open,
> - .close  = perf_mmap_close,
> + .close  = perf_mmap_close, /* non mergable */
>   .fault  = perf_mmap_fault,
>   .page_mkwrite   = perf_mmap_fault,
>  };
> @@ -4030,6 +4030,7 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
>   struct ring_buffer *rb;
>   unsigned long vma_size;
>   unsigned long nr_pages;
> + unsigned long pgoff;
>   long user_extra, extra;
>   int ret = 0, flags = 0;
>  
> @@ -4045,7 +4046,50 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
>   return -EINVAL;
>  
>   vma_size = vma->vm_end - vma->vm_start;
> - nr_pages = (vma_size / PAGE_SIZE) - 1;
> +
> + if (vma->vm_pgoff == 0) {
> + nr_pages = (vma_size / PAGE_SIZE) - 1;
> + } else {
> + if (!event->rb)
> + return -EINVAL;
> +
> + nr_pages = vma_size / PAGE_SIZE;
> +
> + mutex_lock(&event->mmap_mutex);
> + ret = -EINVAL;
> + if (!event->rb)
> + goto err_aux_unlock;
> +
> + if (!atomic_inc_not_zero(&event->rb->mmap_count))
> + goto err_aux_unlock;
> +
> + if (userpg->aux_offset < userpg->data_offset + userpg->data_size)
> + goto err_aux_unlock;

The data_{offset,size} seem to be only set by userspace too, maybe we
can do away with these altogether unless we want to allow for it to be a
sparse file?

> + pgoff = userpg->aux_offset;

..and simply do a

pgoff = event->rb->nr_pages + 1;

?

> + if (pgoff & ~PAGE_MASK)
> + goto err_aux_unlock;
> +
> + pgoff >>= PAGE_SHIFT;
> + if (pgoff != vma->vm_pgoff)
> + goto

Re: [PATCH] usb: dwc3: ep0: fix delayed status is queued too early

2014-05-07 Thread Zhuang Jin Can

On Wed, May 07, 2014 at 12:59:06PM -0400, Alan Stern wrote:
> On Thu, 8 May 2014, Zhuang Jin Can wrote:
> 
> > > A similar problem can occur in the opposite sense: The thread queuing
> > > the delayed status request might be delayed for so long that another
> > > SETUP packet arrives from the host first.  In that case, the delayed
> > > status request is a response for a stale transfer, so it must not be
> > > sent to the host.
> > > 
> > > Do dwc3 and composite.c handle this case correctly?
> > > 
> > So the situation you describe is that we get the STATUS XferNotReady
> > event, but gadget queues a status request when control transfer already
> > failed.
> 
> When the host already timed out the control transfer and started a new 
> one.  Here's what I'm talking about:
> 
>   Host sends a Set-Configuration request.
> 
>   The UDC driver calls the gadget driver's setup function.
> 
>   The setup function returns DELAYED_STATUS.
> 
>   After a few seconds, the host gets tired of waiting and
>   sends a Get-Descriptor request
My understanding is that dwc3 will return NYET to the host for this
Get-Descriptor request transaction: dwc3 is still in the STATUS phase, so
there's no buffer to receive anything on ep0-out. So your comments below
are not applicable to dwc3.
> 
>   The gadget driver finally submits the delayed request response
>   to the Set-Configuration request.  But it is now too late,
>   because the host expects a response to the Get-Descriptor 
>   request.
> 
> >  dwc3 can't move to SETUP phase until the status request arrives,
> > so any SETUP transaction from host will fail. If status request
> > eventually arrives, it already missed the first control transfer, and
> > I don't know how the controller will behave. If we still can get a
> > STATUS XferComplete event without actually transfer anything on the
> > bus, then we can move back to SETUP PHASE which will remove the stale
> > delayed status request and start the new SETUP transaction. But I think
> > in this situation, the host should already have lost its patience and
> > started to reset the bus.
> 
> My point is that the UDC driver can't handle this.  Therefore the
> gadget driver has to prevent this from happening.
> 
> That means composite.c has to avoid sending delayed status responses if 
> a new SETUP packet has been received already.
> 
> > Per my understanding, it's impossible for dwc3 to send a stale STATUS
> > request for a new SETUP transaction. 
> 
> dwc3 won't know that the status response is stale.  It will think the 
> response was meant for the new transfer, not the old one.
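
One way to picture the guard Alan describes is a sequence token that is
handed out per SETUP packet and checked before a delayed status response is
queued. This is only an illustrative sketch with made-up names (toy_ep0,
toy_on_setup, toy_queue_delayed_status); it does not claim to be how
composite.c or dwc3 handle it.

```c
#include <stdbool.h>

/* Hypothetical per-endpoint state for the sketch. */
struct toy_ep0 {
	unsigned int setup_seq;		/* bumped on every SETUP packet */
	unsigned int status_sent;	/* status responses actually queued */
};

/* Called when a SETUP packet arrives; returns the token for this transfer. */
static unsigned int toy_on_setup(struct toy_ep0 *ep0)
{
	return ++ep0->setup_seq;
}

/*
 * Called by the (possibly late) thread that wants to complete a delayed
 * status stage. The response is dropped if a newer SETUP has arrived in
 * the meantime, because it would answer a stale transfer.
 */
static bool toy_queue_delayed_status(struct toy_ep0 *ep0, unsigned int token)
{
	if (token != ep0->setup_seq)
		return false;	/* stale: a newer control transfer started */

	ep0->status_sent++;
	return true;
}
```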
> 
> Alan Stern
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] Input: Introduce the use of managed version of kzalloc

2014-05-07 Thread Himangi Saraogi

This patch moves data allocated using kzalloc to managed data allocated
using devm_kzalloc and removes the now-unnecessary kfree calls from the
probe and remove functions. This data is the third argument to
da9052_request_irq in the two cases below.
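
As a rough userspace picture of why the kfree calls become unnecessary: a
managed allocation is recorded against the device and released together with
it. The sketch below uses made-up names (toy_device, toy_devm_zalloc,
toy_device_release) and is not the kernel's devres implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Bookkeeping header placed in front of every managed allocation. */
struct toy_devres {
	struct toy_devres *next;
	max_align_t payload[];	/* caller-visible memory starts here */
};

/* Hypothetical device that owns a list of managed allocations. */
struct toy_device {
	struct toy_devres *resources;
	int nr;			/* number of live managed allocations */
};

/* Allocate zeroed memory and tie its lifetime to the device. */
static void *toy_devm_zalloc(struct toy_device *dev, size_t size)
{
	struct toy_devres *res = calloc(1, sizeof(*res) + size);

	if (!res)
		return NULL;
	res->next = dev->resources;
	dev->resources = res;
	dev->nr++;
	return res->payload;
}

/* Release everything the device owns; callers never free individually. */
static void toy_device_release(struct toy_device *dev)
{
	while (dev->resources) {
		struct toy_devres *res = dev->resources;

		dev->resources = res->next;
		free(res);
		dev->nr--;
	}
}
```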

The following Coccinelle semantic patch was used for making the change:

@platform@
identifier p, probefn, removefn;
@@
struct platform_driver p = {
  .probe = probefn,
  .remove = removefn,
};

@prb@
identifier platform.probefn, pdev;
expression e, e1, e2;
@@
probefn(struct platform_device *pdev, ...) {
  <+...
- e = kzalloc(e1, e2)
+ e = devm_kzalloc(&pdev->dev, e1, e2)
  ...
?-kfree(e);
  ...+>
}

@rem depends on prb@
identifier platform.removefn;
expression e;
@@
removefn(...) {
  <...
- kfree(e);
  ...>
}

Signed-off-by: Himangi Saraogi 
Acked-by: Julia Lawall 
---
As a follow up patch I would like to know if it would be desirable to 
modify request_threaded_irq to devm_request_threaded_irq in the helper
function da9052_request_irq :
int da9052_request_irq(struct da9052 *da9052, int irq, char *name,
   irq_handler_t handler, void *data)
{
irq = da9052_map_irq(da9052, irq);
if (irq < 0)
return irq;

return request_threaded_irq(irq, NULL, handler,
 IRQF_TRIGGER_LOW | IRQF_ONESHOT,
 name, data);
}

 drivers/input/misc/da9052_onkey.c  | 4 +---
 drivers/input/touchscreen/da9052_tsi.c | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/input/misc/da9052_onkey.c b/drivers/input/misc/da9052_onkey.c
index 184c8f2..6fc8243 100644
--- a/drivers/input/misc/da9052_onkey.c
+++ b/drivers/input/misc/da9052_onkey.c
@@ -84,7 +84,7 @@ static int da9052_onkey_probe(struct platform_device *pdev)
return -EINVAL;
}
 
-   onkey = kzalloc(sizeof(*onkey), GFP_KERNEL);
+   onkey = devm_kzalloc(&pdev->dev, sizeof(*onkey), GFP_KERNEL);
input_dev = input_allocate_device();
if (!onkey || !input_dev) {
dev_err(&pdev->dev, "Failed to allocate memory\n");
@@ -126,7 +126,6 @@ err_free_irq:
cancel_delayed_work_sync(&onkey->work);
 err_free_mem:
input_free_device(input_dev);
-   kfree(onkey);
 
return error;
 }
@@ -139,7 +138,6 @@ static int da9052_onkey_remove(struct platform_device *pdev)
cancel_delayed_work_sync(&onkey->work);
 
input_unregister_device(onkey->input);
-   kfree(onkey);
 
return 0;
 }
diff --git a/drivers/input/touchscreen/da9052_tsi.c b/drivers/input/touchscreen/da9052_tsi.c
index ab64d58..dff6a2e 100644
--- a/drivers/input/touchscreen/da9052_tsi.c
+++ b/drivers/input/touchscreen/da9052_tsi.c
@@ -238,7 +238,7 @@ static int da9052_ts_probe(struct platform_device *pdev)
if (!da9052)
return -EINVAL;
 
-   tsi = kzalloc(sizeof(struct da9052_tsi), GFP_KERNEL);
+   tsi = devm_kzalloc(&pdev->dev, sizeof(struct da9052_tsi), GFP_KERNEL);
input_dev = input_allocate_device();
if (!tsi || !input_dev) {
error = -ENOMEM;
@@ -311,7 +311,6 @@ err_free_datardy_irq:
 err_free_pendwn_irq:
da9052_free_irq(tsi->da9052, DA9052_IRQ_PENDOWN, tsi);
 err_free_mem:
-   kfree(tsi);
input_free_device(input_dev);
 
return error;
@@ -327,7 +326,6 @@ static int  da9052_ts_remove(struct platform_device *pdev)
da9052_free_irq(tsi->da9052, DA9052_IRQ_PENDOWN, tsi);
 
input_unregister_device(tsi->dev);
-   kfree(tsi);
 
return 0;
 }
-- 
1.9.1


[PATCH v4 1/4] clk: samsung: add new Kconfig for Samsung common clock option

2014-05-07 Thread Pankaj Dubey

This patch adds a new Kconfig file that introduces the COMMON_CLK_SAMSUNG
option. Samsung platforms can select it to use the common clock
infrastructure.

CC: Mike Turquette 
Signed-off-by: Pankaj Dubey 
---
 drivers/clk/Kconfig |2 ++
 drivers/clk/samsung/Kconfig |3 +++
 2 files changed, 5 insertions(+)
 create mode 100644 drivers/clk/samsung/Kconfig

diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig
index 6f56d3a..ba24366 100644
--- a/drivers/clk/Kconfig
+++ b/drivers/clk/Kconfig
@@ -115,3 +115,5 @@ endmenu
 
 source "drivers/clk/bcm/Kconfig"
 source "drivers/clk/mvebu/Kconfig"
+
+source "drivers/clk/samsung/Kconfig"
diff --git a/drivers/clk/samsung/Kconfig b/drivers/clk/samsung/Kconfig
new file mode 100644
index 0000000..fc8696b
--- /dev/null
+++ b/drivers/clk/samsung/Kconfig
@@ -0,0 +1,3 @@
+config COMMON_CLK_SAMSUNG
+   bool
+   select COMMON_CLK
-- 
1.7.10.4


[PATCH v4 4/4] drivers: clk: use COMMON_CLK_SAMSUNG for Samsung clock support

2014-05-07 Thread Pankaj Dubey

This patch replaces PLAT_SAMSUNG with COMMON_CLK_SAMSUNG for Samsung
common clock support. Any Samsung SoC that wants to use the Samsung common
clock infrastructure can simply select COMMON_CLK_SAMSUNG.

CC: Mike Turquette 
Signed-off-by: Pankaj Dubey 
---
 drivers/clk/Makefile |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile
index 5f8a287..17d7f13 100644
--- a/drivers/clk/Makefile
+++ b/drivers/clk/Makefile
@@ -41,7 +41,7 @@ obj-$(CONFIG_PLAT_ORION)  += mvebu/
 obj-$(CONFIG_ARCH_MXS) += mxs/
 obj-$(CONFIG_COMMON_CLK_QCOM)  += qcom/
 obj-$(CONFIG_ARCH_ROCKCHIP)+= rockchip/
-obj-$(CONFIG_PLAT_SAMSUNG) += samsung/
+obj-$(CONFIG_COMMON_CLK_SAMSUNG)   += samsung/
 obj-$(CONFIG_ARCH_SHMOBILE_MULTI)  += shmobile/
 obj-$(CONFIG_ARCH_SIRF)+= sirf/
 obj-$(CONFIG_ARCH_SOCFPGA) += socfpga/
-- 
1.7.10.4


[PATCH v4 2/4] ARM: select COMMON_CLK_SAMSUNG for ARCH_EXYNOS and ARCH_S3C64XX

2014-05-07 Thread Pankaj Dubey

This patch selects COMMON_CLK_SAMSUNG for the EXYNOS and S3C64XX SoCs
and removes the COMMON_CLK selection, as COMMON_CLK_SAMSUNG selects its dependency.

CC: Russell King 
Signed-off-by: Pankaj Dubey 
---
 arch/arm/Kconfig |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index ab438cb..0edb868 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -754,7 +754,7 @@ config ARCH_S3C64XX
select ATAGS
select CLKDEV_LOOKUP
select CLKSRC_SAMSUNG_PWM
-   select COMMON_CLK
+   select COMMON_CLK_SAMSUNG
select CPU_V6K
select GENERIC_CLOCKEVENTS
select GPIO_SAMSUNG
@@ -835,7 +835,7 @@ config ARCH_EXYNOS
select ARCH_REQUIRE_GPIOLIB
select ARCH_SPARSEMEM_ENABLE
select ARM_GIC
-   select COMMON_CLK
+   select COMMON_CLK_SAMSUNG
select CPU_V7
select GENERIC_CLOCKEVENTS
select HAVE_S3C2410_I2C if I2C
-- 
1.7.10.4


[PATCH v4 3/4] ARM: S3C24XX: move S3C24XX clock Kconfig options to Samsung clock Kconfig file

2014-05-07 Thread Pankaj Dubey

This patch moves the S3C24XX specific clock Kconfig options into
"clk/samsung/Kconfig" and also removes the COMMON_CLK selection from
"mach-s3c24xx/Kconfig", as the S3C24XX common clock options
(S3C2412_COMMON_CLK and S3C2443_COMMON_CLK) select their dependency.

CC: Ben Dooks 
CC: Kukjin Kim 
CC: Russell King 
Signed-off-by: Pankaj Dubey 
---
 arch/arm/mach-s3c24xx/Kconfig |   14 --
 drivers/clk/samsung/Kconfig   |9 +
 2 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/arch/arm/mach-s3c24xx/Kconfig b/arch/arm/mach-s3c24xx/Kconfig
index fbafb9a..e645ece 100644
--- a/arch/arm/mach-s3c24xx/Kconfig
+++ b/arch/arm/mach-s3c24xx/Kconfig
@@ -39,7 +39,6 @@ config CPU_S3C2410
 
 config CPU_S3C2412
bool "SAMSUNG S3C2412"
-   select COMMON_CLK
select CPU_ARM926T
select CPU_LLSERIAL_S3C2440
select S3C2412_COMMON_CLK
@@ -50,7 +49,6 @@ config CPU_S3C2412
 
 config CPU_S3C2416
bool "SAMSUNG S3C2416/S3C2450"
-   select COMMON_CLK
select CPU_ARM926T
select CPU_LLSERIAL_S3C2440
select S3C2416_PM if PM
@@ -88,7 +86,6 @@ config CPU_S3C244X
 
 config CPU_S3C2443
bool "SAMSUNG S3C2443"
-   select COMMON_CLK
select CPU_ARM920T
select CPU_LLSERIAL_S3C2440
select S3C2443_COMMON_CLK
@@ -364,11 +361,6 @@ config S3C2412_PM_SLEEP
 
 if CPU_S3C2412
 
-config S3C2412_COMMON_CLK
-   bool
-   help
- Build the s3c2412 clock driver based on the common clock framework.
-
 config CPU_S3C2412_ONLY
bool
depends on !CPU_S3C2410 && !CPU_S3C2416 && !CPU_S3C2440 && \
@@ -651,12 +643,6 @@ endif  # CPU_S3C2442
 
 if CPU_S3C2443 || CPU_S3C2416
 
-config S3C2443_COMMON_CLK
-   bool
-   help
- Temporary symbol to build the clock driver based on the common clock
- framework.
-
 config S3C2443_DMA
bool
help
diff --git a/drivers/clk/samsung/Kconfig b/drivers/clk/samsung/Kconfig
index fc8696b..baf28cb 100644
--- a/drivers/clk/samsung/Kconfig
+++ b/drivers/clk/samsung/Kconfig
@@ -1,3 +1,12 @@
 config COMMON_CLK_SAMSUNG
bool
select COMMON_CLK
+
+config S3C2412_COMMON_CLK
+   bool
+   select COMMON_CLK_SAMSUNG
+
+config S3C2443_COMMON_CLK
+   bool
+   select COMMON_CLK_SAMSUNG
+
-- 
1.7.10.4


[PATCH v4 0/4] Introduce new Kconfig for Samsung common clock

2014-05-07 Thread Pankaj Dubey


Introduce a new Kconfig file for Samsung common clock infrastructure
related config options. The Samsung common clock code is currently compiled
based on PLAT_SAMSUNG, but moving ahead with ARM64 we cannot have any more
such config options, so this series introduces a new invisible
COMMON_CLK_SAMSUNG option. This option also selects COMMON_CLK, so an ARCH
Kconfig just needs to select COMMON_CLK_SAMSUNG if it wants to use the
Samsung common clock.

This series is based on Kukjin's for-next branch.

I am just respinning this patch after rebasing. I have already addressed
all review comments discussed here [1].

[1]: http://lkml.org/lkml/2014/3/19/216

V4:
 1) Rebased on latest Kukjin's for-next branch.

V3: 
 1) Re-organized patches for preventing bisect issues.
 2) Rebase on top of latest Kukjin's for-next branch.
 3) Compile tested exynos_defconfig, s3c6400_defconfig and s3c2410_defconfig
after each commit.

V2:
 1) Adding new Kconfig file for Samsung common clock.
 2) Make COMMON_CLK_SAMSUNG option invisible. (as suggested by Tomasz Figa)
 3) Let COMMON_CLK_SAMSUNG select COMMON_CLK. (as suggested by Tomasz Figa)
 4) Move S3C24XX clock config option in new Kconfig file.

Pankaj Dubey (4):
  clk: samsung: add new Kconfig for Samsung common clock option
  ARM: select COMMON_CLK_SAMSUNG for ARCH_EXYNOS and ARCH_S3C64XX
  ARM: S3C24XX: move S3C24XX clock Kconfig options to Samsung clock
Kconfig file
  drivers: clk: use COMMON_CLK_SAMSUNG for Samsung clock support

 arch/arm/Kconfig  |4 ++--
 arch/arm/mach-s3c24xx/Kconfig |   14 --
 drivers/clk/Kconfig   |2 ++
 drivers/clk/Makefile  |2 +-
 drivers/clk/samsung/Kconfig   |   12 
 5 files changed, 17 insertions(+), 17 deletions(-)
 create mode 100644 drivers/clk/samsung/Kconfig

-- 
1.7.10.4


Re: [PATCH v2 7/7] ARM: sunxi: dt: add PRCM clk and reset controller subdevices

2014-05-07 Thread Chen-Yu Tsai

On Thu, May 8, 2014 at 11:17 AM, Maxime Ripard
 wrote:
> On Wed, May 07, 2014 at 07:25:54PM +0200, Boris BREZILLON wrote:
>> Add DT definitions for PRCM (Power/Reset/Clock Management) clock and reset
>> controller subdevices.
>>
>> Signed-off-by: Boris BREZILLON 
>> ---
>>  arch/arm/boot/dts/sun6i-a31.dtsi | 39 
>> ++-
>>  1 file changed, 38 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/boot/dts/sun6i-a31.dtsi 
>> b/arch/arm/boot/dts/sun6i-a31.dtsi
>> index ec3253a..b69be0b 100644
>> --- a/arch/arm/boot/dts/sun6i-a31.dtsi
>> +++ b/arch/arm/boot/dts/sun6i-a31.dtsi
>> @@ -498,9 +498,46 @@
>>   reg = <0x01f01c00 0x300>;
>>   };
>>
>> - prcm@01f01c00 {
>> + prcm@01f01400 {
>
> This has already been fixed by Hans.
>
>>   compatible = "allwinner,sun6i-a31-prcm";
>>   reg = <0x01f01400 0x200>;
>> +
>> + ar100: ar100_clk {
>> + compatible = "allwinner,sun6i-a31-ar100-clk";
>> + #clock-cells = <0>;
>> + clocks = <>, <>, <>, 
>> <>;
>> + };
>> +
>> + ahb0: ahb0_clk {
>> + compatible = "fixed-factor-clock";
>> + #clock-cells = <0>;
>> + clock-div = <1>;
>> + clock-mult = <1>;
>> + clocks = <>;
>> + clock-output-names = "ahb0";
>> + };
>> +
>> + apb0: apb0_clk {
>> + compatible = "allwinner,sun6i-a31-apb0-clk";
>> + #clock-cells = <0>;
>> + clocks = <>;
>> + clock-output-names = "apb0";
>> + };
>> +
>> + apb0_gates: apb0_gates_clk {
>> + compatible = 
>> "allwinner,sun6i-a31-apb0-gates-clk";
>> + #clock-cells = <1>;
>> + clocks = <>;
>> + clock-output-names = "apb0_pio", "apb0_ir",
>> + "apb0_timer01", "apb0_p2wi",
>
> timer01 ? is this a typo?

A23 manual lists the clock gate as "r_timer0_1", so I put the name on the wiki.
Allwinner sun6i code uses "r_tmr" or just "tmr". I see no problem naming this
clock output as "apb0_timer" though.

>> + "apb0_uart", "apb0_1wire",
>> + "apb0_i2c";
>> + };
>> +
>> + apb0_rst: apb0_rst {
>> + compatible = "allwinner,sun6i-a31-clock-reset";
>> + #reset-cells = <1>;
>> + };
>>   };
>>   };
>>  };
>> --
>> 1.8.3.2
>>

Re: [PATCH] ARM: i.MX27 pca100: remove deprecated IRQF_DISABLED

2014-05-07 Thread Shawn Guo

On Wed, May 07, 2014 at 05:09:59PM +0200, Juan Solano wrote:
> This flag is a NOOP and can be removed now.
> 
> Signed-off-by: Juan Solano 

Applied, thanks.

Re: [PATCH 0/8] Enable dma driver for MIC X100 Coprocessors.

2014-05-07 Thread Dan Williams

On Wed, May 7, 2014 at 8:10 PM, Sudeep Dutt  wrote:
> On Thu, 2014-04-24 at 11:10 -0700, Siva Krishna Yerramreddy wrote:
>> On Mon, 2014-04-14 at 13:14 -0700, Siva Yerramreddy wrote:
>> > I am sending all these patches to char-misc because there is a dependency
>> > between the patches for dma driver and other drivers.
>> >
>> Greg, any feedback on the patches?
>
> Hi Greg,
> The primary author of this patch series Siva is no longer with Intel so
> we will be taking ownership of addressing review feedback.
>
> The patches have been applied to the MIC GITHUB tree which is registered
> with Fengguang Wu's 0-day infrastructure and no issues have been
> reported.
>
> We have not received any feedback on the patches yet and were wondering
> if you had a chance to review them?

Fwiw, I'm still planning on reviewing these.

Re: [PATCH v1 03/11] perf: Allow for multiple ring buffers per event

2014-05-07 Thread Alexander Shishkin

Peter Zijlstra  writes:

> On Wed, May 07, 2014 at 02:08:43PM -0700, Andi Kleen wrote:
>> > Then, when data inside that aux data store changes they should inject an
>> > PERF_RECORD_AUX to indicate this did happen, which ties it back into the
>> > normal event flow.
>> 
>> What happens when the aux buffer wraps? How would the client know
>> if the data belongs to this _AUX entry or some later one?
>
> It belongs to the last one. Rewind them from 'now' until you hit
> collisions in AUX space, then you're done.

I guess the point here is that if we don't want to lose any data in aux
space, we need to stop the perf_event when it fills up. There's also the
question of whether we need a separate wakeup watermark for the AUX buffer
or simply wake up the poller every time there's new data.

>> May need some extra sequence numbers in the mmap header and the aux
>> entry to handle this.
>
> You're thinking of overwrite mode, right? We should update the tail in
> that case, I've not thought about how to do that for the AUX buffer.

In the overwrite mode we don't have to write out AUX records at all
before we stop the trace; we don't care how many times data in the AUX
space wraps.

> There have been some patches for the normal buffer, but they stalled;
>
> https://lkml.org/lkml/2013/7/8/154
>
> I'm all for merging that patch (or a fixed one, since it has fail in it) if
> we can show the current !overwrite case doesn't regress.
>
> Also, would anybody want different mode for the data and aux parts? In
> that case we do need to add some extra state to the control page to
> indicate such.

For the decoder to make sense of the trace, it needs all the data in the
normal buffer (MMAPs, sched_switches), not just the latest bits, so it's
a good idea to have it.

Regards,
--
Alex
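
The rewind rule quoted above ("from 'now' until you hit collisions in AUX
space") can be illustrated with a small user-space sketch. This is not code
from the patch set: struct aux_record, its size field and
oldest_intact_record() are hypothetical names, and the sketch assumes that
each AUX record describes one contiguous chunk written in order into a
circular buffer of buf_size bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one chunk of AUX data, in the order it was written. */
struct aux_record {
    uint64_t size;  /* bytes written into the circular AUX buffer */
};

/*
 * Walk the records from newest to oldest, adding up chunk sizes. As soon
 * as the next (older) chunk no longer fits in the space that is left, its
 * data has been overwritten by newer chunks, and so has everything before
 * it. Returns the index of the oldest record whose data is still intact,
 * or n if even the newest chunk is larger than the buffer.
 */
static size_t oldest_intact_record(const struct aux_record *recs, size_t n,
                                   uint64_t buf_size)
{
    uint64_t covered = 0;   /* bytes claimed by newer records so far */
    size_t i = n;

    while (i > 0) {
        uint64_t sz = recs[i - 1].size;

        if (sz > buf_size - covered)
            break;          /* collision: older data is gone */
        covered += sz;
        i--;
    }
    return i;
}
```

With chunks of 40, 30, 50 and 20 bytes (oldest first) in a 100-byte buffer,
the walk accepts the three newest chunks and stops at the oldest one, so the
function returns index 1.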

Re: [PATCH v7] DMA: sun6i: Add driver for the Allwinner A31 DMA controller

2014-05-07 Thread Maxime Ripard

On Fri, May 02, 2014 at 10:04:29PM +0530, Vinod Koul wrote:
> On Wed, Apr 30, 2014 at 02:53:22PM -0700, Maxime Ripard wrote:
> > Hi Vinod,
> > 
> > On Wed, Apr 30, 2014 at 12:34:08PM +0530, Vinod Koul wrote:
> > > On Thu, Apr 24, 2014 at 04:22:44PM +0200, Maxime Ripard wrote:
> > > > +static inline void sun6i_dma_free(struct sun6i_dma_dev *sdc)
> > > > +{
> > > > +   int i;
> > > > +
> > > > +   for (i = 0; i < NR_MAX_VCHANS; i++) {
> > > > +   struct sun6i_vchan *vchan = &sdc->vchans[i];
> > > > +
> > > > +   list_del(&vchan->vc.chan.device_node);
> > > > +   tasklet_kill(&vchan->vc.task);
> > > > +   }
> > > > +
> > > > +   tasklet_kill(&sdc->task);
> > > This is again not good. see http://lwn.net/Articles/588457/
> > > At this point HW can still generate interrupts or you can have irq 
> > > running!
> > 
> > I'm not sure I fully understand the issue here, but what is not good?
> > the first or the second tasklet_kill calls, or both?
> > 
> > From what I understood, the issue is only there whenever you are
> > calling tasklet_disable without making sure that no one will schedule
> > your tasklet before disabling it.
> > 
> > But the point is I don't actually use either _enable/_disable. I might
> > be wrong in not using those functions, but I don't really see how I
> > can be impacted.
> 
> Well that was one part of it. How do you ensure the tasklet is not scheduled
> while and after you are killing it? You need to ensure irq is disabled and
> pending irqs have finished processing. I don't see that bit.

Ok. I'll change that.

Do you want me to use tasklet_enable and tasklet_disable as well?

Thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



Re: [PATCH v2 7/7] ARM: sunxi: dt: add PRCM clk and reset controller subdevices

2014-05-07 Thread Maxime Ripard

On Wed, May 07, 2014 at 07:25:54PM +0200, Boris BREZILLON wrote:
> Add DT definitions for PRCM (Power/Reset/Clock Management) clock and reset
> controller subdevices.
> 
> Signed-off-by: Boris BREZILLON 
> ---
>  arch/arm/boot/dts/sun6i-a31.dtsi | 39 ++-
>  1 file changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/sun6i-a31.dtsi 
> b/arch/arm/boot/dts/sun6i-a31.dtsi
> index ec3253a..b69be0b 100644
> --- a/arch/arm/boot/dts/sun6i-a31.dtsi
> +++ b/arch/arm/boot/dts/sun6i-a31.dtsi
> @@ -498,9 +498,46 @@
>   reg = <0x01f01c00 0x300>;
>   };
>  
> - prcm@01f01c00 {
> + prcm@01f01400 {

This has already been fixed by Hans.

>   compatible = "allwinner,sun6i-a31-prcm";
>   reg = <0x01f01400 0x200>;
> +
> + ar100: ar100_clk {
> + compatible = "allwinner,sun6i-a31-ar100-clk";
> + #clock-cells = <0>;
> + clocks = <>, <>, <>, <>;
> + };
> +
> + ahb0: ahb0_clk {
> + compatible = "fixed-factor-clock";
> + #clock-cells = <0>;
> + clock-div = <1>;
> + clock-mult = <1>;
> + clocks = <>;
> + clock-output-names = "ahb0";
> + };
> +
> + apb0: apb0_clk {
> + compatible = "allwinner,sun6i-a31-apb0-clk";
> + #clock-cells = <0>;
> + clocks = <>;
> + clock-output-names = "apb0";
> + };
> +
> + apb0_gates: apb0_gates_clk {
> + compatible = 
> "allwinner,sun6i-a31-apb0-gates-clk";
> + #clock-cells = <1>;
> + clocks = <>;
> + clock-output-names = "apb0_pio", "apb0_ir",
> + "apb0_timer01", "apb0_p2wi",

timer01 ? is this a typo?

> + "apb0_uart", "apb0_1wire",
> + "apb0_i2c";
> + };
> +
> + apb0_rst: apb0_rst {
> + compatible = "allwinner,sun6i-a31-clock-reset";
> + #reset-cells = <1>;
> + };
>   };
>   };
>  };
> -- 
> 1.8.3.2
> 

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



Re: [PATCH v2 3/7] mfd: add support for sun6i PRCM (Power/Reset/Clock Management) unit

2014-05-07 Thread Maxime Ripard

On Wed, May 07, 2014 at 07:25:50PM +0200, Boris BREZILLON wrote:
> The PRCM (Power/Reset/Clock Management) block exposes several subdevices
> in different subsystems (clk, reset ...)
> 
> Add basic support for the PRCM unit with clk (AR100, AHB0, and APB0 clks)
> and reset controller subdevices.
> 
> Other subdevices might be added later (if needed).
> 
> Signed-off-by: Boris BREZILLON 

Acked-by: Maxime Ripard 

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



Re: [PATCH v2 5/7] clk: sunxi: add PRCM (Power/Reset/Clock Management) clks support

2014-05-07 Thread Maxime Ripard

On Wed, May 07, 2014 at 07:25:52PM +0200, Boris BREZILLON wrote:
> The PRCM (Power/Reset/Clock Management) unit provides several clock
> devices:
> - AR100 clk: used to clock the Power Management co-processor
> - AHB0 clk: used to clock the AHB0 bus
> - APB0 clk and gates: used to clk peripherals connected to the APB0 bus
> 
> Add support for these clks in a separate driver so that they can be probed
> as platform devices instead of registered during early init.
> This is needed to be able to probe PRCM MFD subdevices.
> 
> Signed-off-by: Boris BREZILLON 

Acked-by: Maxime Ripard 

Thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



Re: [PATCH v2 2/7] reset: sunxi: allow MFD subdevices probe

2014-05-07 Thread Maxime Ripard

On Wed, May 07, 2014 at 07:25:49PM +0200, Boris BREZILLON wrote:
> The current implementation uses sunxi_reset_init function for both early
> init and platform device probe.
> 
> The sunxi_reset_init function uses DT to retrieve device resources, which
> will be an issue if reset controllers are registered from an MFD device
> that define resources from mfd_cell definition.
> 
> Moreover, we can make use of devm functions when we're in the probe context.
> 
> Signed-off-by: Boris BREZILLON 
> ---
>  drivers/reset/reset-sunxi.c | 21 ++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/reset/reset-sunxi.c b/drivers/reset/reset-sunxi.c
> index 695bd34..1b5fea6 100644
> --- a/drivers/reset/reset-sunxi.c
> +++ b/drivers/reset/reset-sunxi.c
> @@ -145,7 +145,24 @@ MODULE_DEVICE_TABLE(of, sunxi_reset_dt_ids);
>  
>  static int sunxi_reset_probe(struct platform_device *pdev)
>  {
> - return sunxi_reset_init(pdev->dev.of_node);
> + struct sunxi_reset_data *data;
> + struct resource *res;
> +
> + data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL);
> + if (!data)
> + return -ENOMEM;
> +
> + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + data->membase = devm_request_and_ioremap(&pdev->dev, res);
> + if (!data->membase)
> + return -ENOMEM;

You'd probably be better off using devm_ioremap_resource so that you
get a meaningful error code.

Apart from this, you have my 
Acked-by: Maxime Ripard 

Thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



Re: [PATCH v2 1/7] reset: sunxi: document sunxi's reset controllers bindings

2014-05-07 Thread Maxime Ripard

On Wed, May 07, 2014 at 07:25:48PM +0200, Boris BREZILLON wrote:
> Add DT bindings documentation for sunxi's reset controllers.
> 
> Signed-off-by: Boris BREZILLON 

Acked-by: Maxime Ripard 

Thanks,
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



[PATCH][v2] driver/memory:Add Kconfig help description for IFC

2014-05-07 Thread Prabhakar Kushwaha

Freescale's Integrated Flash Controller (IFC) module is used to handle
devices such as NOR, NAND, FPGA and ASIC.

Describe this in the Help section of the Kconfig entry for IFC.

Signed-off-by: Prabhakar Kushwaha 
---
Changes for v2: Resending again keeping maintainer in 'CC'

 drivers/memory/Kconfig |4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index c59e9c9..9a59581 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -64,5 +64,9 @@ config TEGRA30_MC
 config FSL_IFC
bool
depends on FSL_SOC
+   help
+ This driver is for the Integrated Flash Controller(IFC) module
+ available in Freescale SoCs. This controller allows to handle
+ devices such as NOR, NAND, FPGA and ASIC etc.
 
 endif
-- 
1.7.9.5



Re: [PATCH 0/8] Enable dma driver for MIC X100 Coprocessors.

2014-05-07 Thread Sudeep Dutt

On Thu, 2014-04-24 at 11:10 -0700, Siva Krishna Yerramreddy wrote:
> On Mon, 2014-04-14 at 13:14 -0700, Siva Yerramreddy wrote:
> > I am sending all these patches to char-misc because there is a dependency
> > between the patches for dma driver and other drivers.
> > 
> Greg, any feedback on the patches?

Hi Greg,
The primary author of this patch series Siva is no longer with Intel so
we will be taking ownership of addressing review feedback.

The patches have been applied to the MIC GITHUB tree which is registered
with Fengguang Wu's 0-day infrastructure and no issues have been
reported.

We have not received any feedback on the patches yet and were wondering
if you had a chance to review them?

Thanks,
Sudeep Dutt

> > Description:
> > 
> > This set of patches adds support for MIC X100 dma driver.
> > MIC PCIe card has a dma controller with 8 channels. These channels are
> > shared between the host s/w and the card s/w. 0 to 3 are used by host
> > and 4 to 7 by card. As the dma device doesn't show up as PCIe device,
> > a virtual bus called mic bus is created and virtual dma devices are
> > created on it by the host/card drivers. On host the channels are private
> > and used only by the host driver to transfer data for the virtio devices.
> > 
> > Here is a higher level block diagram.
> >                               |
> >        +----------+           |             +----------+
> >        | Card OS  |           |             | Host OS  |
> >        +----------+           |             +----------+
> >                               |
> > +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
> > | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
> > | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
> > | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
> > +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
> >     |         |         |     |      |            |         |
> >     |         |         |     |User  |            |         |
> >     |         |         |     |------|------------|---------|-------
> >     +-------------------+     |Kernel +--------------------------+
> >               |               |       | Virtio over PCIe IOCTLs  |
> >               |               |       +--------------------------+
> > +-----------+ |               |                   |  +-----------+
> > | MIC DMA   | |               |                   |  | MIC DMA   |
> > | Driver    | |               |                   |  | Driver    |
> > +-----------+ |               |                   |  +-----------+
> >       |       |               |                   |        |
> > +---------------+             |                   |  +----------------+
> > |MIC virtual Bus|             |                   |  |MIC virtual Bus |
> > +---------------+             |                   |  +----------------+
> >       |       |               |                   |              |
> >       |   +--------------+    |     +---------------+            |
> >       |   |Intel MIC     |    |     |Intel MIC      |            |
> >       +---|Card Driver   |    |     |Host Driver    |            |
> >           +--------------+    |     +---------------+------------+
> >               |               |                   |
> >          +-------------------------------------------------------+
> >          |                                                       |
> >          |                    PCIe Bus                           |
> >          +-------------------------------------------------------+
> > 
> > The following series of patches are partitioned as follows:
> > 
> > Patch 1: Add mic bus and dma driver documentation.
> >  Author: Siva Yerramreddy
> > Patch 2: Add a bus driver for virtual MIC devices.
> >  Authors: Siva Yerramreddy, Sudeep Dutt
> > Patch 3: MIC X100 DMA Driver.
> >  Author: Siva Yerramreddy
> > Patch 4: Add threaded irq support in host driver.
> >  This is needed as the dma driver uses threaded irq.
> >  Author: Siva Yerramreddy
> > Patch 5: Add dma support in host driver.
> >  Authors: Siva Yerramreddy, Ashutosh Dixit, Sudeep Dutt
> > Patch 6: Add threaded irq support in card driver.
> >  This is needed as the dma driver uses threaded irq.
> >  Author: Siva Yerramreddy
> > Patch 7: Add dma support in card driver.
> >  Author: Siva Yerramreddy
> > Patch 8: Add support for loading/unloading dma driver.
> >  Author: Siva Yerramreddy
> > 
> > The patches have been compiled/validated against v3.14.
> > Tested using dmatest module with module parameter "threads_per_chan=60".
> > 
> > Thanks to Dan Williams, Vinod Koul, Jon Mason, Dave Jiang for the

Re: [PATCH] ARM: at91: fix rtc irq mask for sam9x5 SoCs

2014-05-07 Thread Mark Roszko

Atmel actually has this issue in the Errata of the SAM9G25 and SAM9G35
datasheets which might be worth referencing in the description?

>49.7.1 RTC: Interrupt Mask Register cannot be used
>Interrupt Mask Register read always returns 0.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] mm: slub: fix ALLOC_SLOWPATH stat

2014-05-07 Thread David Rientjes

On Tue, 21 Jan 2014, David Rientjes wrote:

> On Wed, 8 Jan 2014, David Rientjes wrote:
> 
> > > There used to be only one path out of __slab_alloc(), and
> > > ALLOC_SLOWPATH got bumped in that exit path.  Now there are two,
> > > and a bunch of gotos.  ALLOC_SLOWPATH can now get set more than once
> > > during a single call to __slab_alloc() which is pretty bogus.
> > > Here's the sequence:
> > > 
> > > 1. Enter __slab_alloc(), fall through all the way to the
> > >stat(s, ALLOC_SLOWPATH);
> > > 2. hit 'if (!freelist)', and bump DEACTIVATE_BYPASS, jump to
> > >new_slab (goto #1)
> > > 3. Hit 'if (c->partial)', bump CPU_PARTIAL_ALLOC, goto redo
> > >(goto #2)
> > > 4. Fall through in the same path we did before all the way to
> > >stat(s, ALLOC_SLOWPATH)
> > > 5. bump ALLOC_REFILL stat, then return
> > > 
> > > Doing this is obviously bogus.  It keeps us from being able to
> > > accurately compare ALLOC_SLOWPATH vs. ALLOC_FASTPATH.  It also
> > > means that the total number of allocs always exceeds the total
> > > number of frees.
> > > 
> > > This patch moves stat(s, ALLOC_SLOWPATH) to be called from the
> > > same place that __slab_alloc() is.  This makes it much less
> > > likely that ALLOC_SLOWPATH will get botched again in the
> > > spaghetti-code inside __slab_alloc().
> > > 
> > > Signed-off-by: Dave Hansen 
> > 
> > Acked-by: David Rientjes 
> > 
> 
> Pekka, are you going to pick this up for linux-next?  I think it would be 
> nice to have for 3.14 for those of us who use the stats.
> 

Ping #2.  Pekka or Andrew, would you pick this up for linux-next?
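
For readers who use the stats, the double count described in the quoted
commit message can be reproduced with a small user-space model. This is only
a sketch: the counter names follow the quoted text, but slow_path_model(),
bump_stat() and the two alloc_*() wrappers are hypothetical and merely mimic
the goto structure listed in steps 1 to 5; they are not the SLUB code.

```c
/* Counter names follow the quoted description; the array is illustrative. */
enum stat_item {
    ALLOC_SLOWPATH,
    ALLOC_REFILL,
    CPU_PARTIAL_ALLOC,
    DEACTIVATE_BYPASS,
    NR_STAT_ITEMS
};

static unsigned long stats[NR_STAT_ITEMS];

static void bump_stat(enum stat_item item)
{
    stats[item]++;
}

static void reset_stats(void)
{
    int i;

    for (i = 0; i < NR_STAT_ITEMS; i++)
        stats[i] = 0;
}

/*
 * Hypothetical model of the slow path's control flow. When count_inside
 * is non-zero the ALLOC_SLOWPATH bump sits after the redo label, so every
 * pass through redo counts again.
 */
static void slow_path_model(int count_inside)
{
    int have_freelist = 0;  /* first pass finds no free objects */
    int have_partial = 1;   /* one per-cpu partial slab is available */

redo:
    if (count_inside)
        bump_stat(ALLOC_SLOWPATH);

    if (!have_freelist) {
        bump_stat(DEACTIVATE_BYPASS);
        goto new_slab;
    }

    bump_stat(ALLOC_REFILL);
    return;

new_slab:
    if (have_partial) {
        bump_stat(CPU_PARTIAL_ALLOC);
        have_partial = 0;
        have_freelist = 1;  /* the partial slab supplies objects */
        goto redo;
    }
    /* A real allocator would fall back to allocating a new slab here. */
}

/* Before the patch: the counter is bumped inside the slow path. */
static void alloc_before_patch(void)
{
    slow_path_model(1);
}

/* After the patch: the caller bumps the counter once per call. */
static void alloc_after_patch(void)
{
    bump_stat(ALLOC_SLOWPATH);
    slow_path_model(0);
}
```

In this model one call to alloc_before_patch() leaves ALLOC_SLOWPATH at 2,
while one call to alloc_after_patch() leaves it at 1; ALLOC_REFILL is 1 in
both cases.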

Re: [PATCH 0/4] DT platform device name collision fixes

2014-05-07 Thread Frank Rowand

On 5/7/2014 3:52 PM, Frank Rowand wrote:
> On 5/7/2014 2:48 PM, Rob Herring wrote:
>> From: Rob Herring 
>>
>> This series fixes the device naming collisions that can occur with 
>> multiple devices having the same name and non-translatable unit 
>> addresses. This issue was raised in this thread[1]. I intend to merge 
>> this regardless of whether or not some hierarchy in sysfs is created. 
>> That is really a separate issue independent of these fixes.
>>
>> I found and fixed a couple of other issues in the process of testing the 
>> fix.
>>
>> Rob
>>
>> [1] https://lkml.org/lkml/2014/4/23/312
>>
>> Rob Herring (4):
>>   of/selftest: add testcase for nodes with same name and address
>>   of/platform: return error on of_platform_device_create_pdata failure
>>   of/platform: fix device naming for non-translatable addresses
>>   of: kill off of_can_translate_address
> 
> My opinion is that this approach is not a good approach to solving the
> problem.  It is papering over a symptom, instead of dealing with the
> root cause.
> 
> But despite my opinion, you can add to patches 2-4 (I did not test
> the self-test added in patch 1):
> 
>Tested-by: frowand.l...@gmail.com 
> 
> The patches resolve the name conflict originally reported for the
> qcomm PMIC, tested on 3.15-rc1, with a bunch of out of tree
> patches.

And you can add to the 4 patches:

  Reviewed-by: Frank Rowand 

-Frank


Re: [PATCH] clk: divider: Fix table round up function

2014-05-07 Thread Shawn Guo

On Wed, May 07, 2014 at 06:48:52PM +0200, Maxime COQUELIN wrote:
> Commit 1d9fe6b97 ("clk: divider: Fix best div calculation for power-of-two and
> table dividers") introduces a regression in its _table_round_up function.
> 
> When the divider passed to this function is greater than the max divider
> available in the table, this function returns table's max divider.
> Problem is that it causes an infinite loop in clk_divider_bestdiv() because
> _next_div() will never return a value greater than maxdiv.
> 
> Instead of returning table's max divider, this patch returns INT_MAX.
> 
> Reported-by: Fabio Estevam 
> Reported-by: Shawn Guo 
> Tested-by: Fabio Estevam 
> Cc: Mike Turquette 
> Signed-off-by: Maxime Coquelin 

Tested-by: Shawn Guo 

Thanks for the fix.

Shawn

> ---
>  drivers/clk/clk-divider.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
> index b3c8396..cf9114a 100644
> --- a/drivers/clk/clk-divider.c
> +++ b/drivers/clk/clk-divider.c
> @@ -158,7 +158,7 @@ static bool _is_valid_div(struct clk_divider *divider, 
> unsigned int div)
>  static int _round_up_table(const struct clk_div_table *table, int div)
>  {
>   const struct clk_div_table *clkt;
> - int up = _get_table_maxdiv(table);
> + int up = INT_MAX;
>  
>   for (clkt = table; clkt->div; clkt++) {
>   if (clkt->div == div)
> -- 
> 1.9.1
> 
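
The effect of the one-line change can be seen in a user-space sketch of the
table walk. The names below (struct div_entry, round_up_table(),
table_maxdiv(), count_candidate_divs(), sample_table) are simplified
stand-ins, not the kernel's definitions, and the loop in
count_candidate_divs() is only assumed to resemble the iteration in
clk_divider_bestdiv() as described in the commit message.

```c
#include <limits.h>

/* Simplified stand-in for a divider table entry. */
struct div_entry {
    unsigned int val;   /* register value */
    unsigned int div;   /* divider it selects; 0 terminates the table */
};

/* Largest divider present in the table. */
static int table_maxdiv(const struct div_entry *table)
{
    const struct div_entry *e;
    int maxdiv = 0;

    for (e = table; e->div; e++)
        if ((int)e->div > maxdiv)
            maxdiv = (int)e->div;
    return maxdiv;
}

/*
 * Smallest table divider >= div. Returns INT_MAX when div exceeds every
 * entry, mirroring the "int up = INT_MAX" initialisation in the patch.
 */
static int round_up_table(const struct div_entry *table, int div)
{
    const struct div_entry *e;
    int up = INT_MAX;

    for (e = table; e->div; e++) {
        if ((int)e->div == div)
            return div;
        if ((int)e->div > div && (int)e->div < up)
            up = (int)e->div;
    }
    return up;
}

/*
 * Visit every table divider up to maxdiv and count them. If
 * round_up_table() returned maxdiv for an out-of-range request, the step
 * from maxdiv would yield maxdiv again and the loop would never end;
 * returning INT_MAX pushes i past maxdiv and terminates it.
 */
static int count_candidate_divs(const struct div_entry *table)
{
    int maxdiv = table_maxdiv(table);
    int n = 0;
    int i;

    for (i = round_up_table(table, 1); i <= maxdiv;
         i = round_up_table(table, i + 1))
        n++;
    return n;
}

/* Example table with dividers 1, 2, 4 and 8. */
static const struct div_entry sample_table[] = {
    { 0, 1 }, { 1, 2 }, { 2, 4 }, { 3, 8 }, { 0, 0 },
};
```

For sample_table, a request for 3 rounds up to 4, a request for 9 yields
INT_MAX, and the candidate loop visits the four dividers and then stops.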

Re: [PATCH 1/4] of/selftest: add testcase for nodes with same name and address

2014-05-07 Thread Frank Rowand

On 5/7/2014 2:48 PM, Rob Herring wrote:
> From: Rob Herring 
> 
> Add a test case for nodes which have the same name and same
> non-translatable unit address.

If I apply patch 1 and 2 without applying 3 and 4 then console
warnings are printed, but from a different area of code than
the original problem reported.  This probably is not a big deal,
but I'm trying to figure out if I can modify the test to also
show the original problem.

The test case also properly reports the failure.

Once all 4 patches are applied, then the test case passes.

Thus:

   Tested-by: Frank Rowand 

> 
> Signed-off-by: Rob Herring 
> ---
>  drivers/of/selftest.c                        | 23 ++
>  drivers/of/testcase-data/testcases.dtsi      |  1 +
>  drivers/of/testcase-data/tests-platform.dtsi | 35
>  3 files changed, 59 insertions(+)
>  create mode 100644 drivers/of/testcase-data/tests-platform.dtsi
> 

< snip >


Re: [PATCH 5/7] clk: sunxi: add PRCM (Power/Reset/Clock Management) clks support

2014-05-07 Thread Maxime Ripard

On Wed, May 07, 2014 at 07:12:30PM +0200, Boris BREZILLON wrote:
> 
> On 29/04/2014 01:40, Maxime Ripard wrote:
> > On Mon, Apr 28, 2014 at 04:58:48PM +0200, Boris BREZILLON wrote:
> >> The PRCM (Power/Reset/Clock Management) unit provides several clock
> >> devices:
> >> - AR100 clk: used to clock the Power Management co-processor
> >> - AHB0 clk: used to clock the AHB0 bus
> >> - APB0 clk and gates: used to clk
> > Used to clk?
> "Used to clk peripherals connected on the APB0 bus"
> 
> I'll add the missing words in the next version :-).
> 
> >
> [...]
> > Ditto.
> >
> > And you'll probably want to use devm_ioremap_resource when you'll have
> > a single clock for the AR100.
> 
> Absolutely.
> 
> >
> >> +
> >> +  clk_parent = of_clk_get_parent_name(np, 0);
> >> +  if (!clk_parent)
> >> +  return -EINVAL;
> [...]
> >> +
> >> +static struct platform_driver sun6i_a31_prcm_clk_driver = {
> >> +  .driver = {
> >> +  .name = "sun6i-a31-prcm-clk",
> >> +  .owner = THIS_MODULE,
> >> +  .of_match_table = sun6i_a31_prcm_clk_dt_ids,
> >> +  },
> >> +  .probe = sun6i_a31_prcm_clk_probe,
> > You're not calling the of_clk_del_provider, and you should probably
> > unregister your clocks too.
> 
> This driver cannot be compiled as a module, and as a result the probed
> clks will never be removed.
> 
> Do you really want to support clk removal for this HW block ?

Hmm, no, then it's fine.

Thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



Re: [PATCH tip/core/rcu 07/45] torture: Allow variations of "defconfig" to be specified

2014-05-07 Thread Paul E. McKenney

On Wed, May 07, 2014 at 06:54:08PM -0700, Josh Triplett wrote:
> On Wed, May 07, 2014 at 04:52:40PM -0700, Paul E. McKenney wrote:
> > On Wed, May 07, 2014 at 02:22:19PM -0700, j...@joshtriplett.org wrote:
> > > On Mon, Apr 28, 2014 at 05:24:55PM -0700, Paul E. McKenney wrote:
> > > > From: "Paul E. McKenney" 
> > > > 
> > > > Some environments require some variation on "make defconfig" to initialize
> > > > the .config file.  This commit therefore adds a --defconfig argument to
> > > > allow this to be specified.  The default value is of course "defconfig".
> > > > 
> > > > Signed-off-by: Paul E. McKenney 
> > > 
> > > 
> > > "--defconfig randconfig" or "--defconfig allyesconfig" or similar seems
> > > rather odd; how about calling it --kconfig or similar?
> > > 
> > 
> > Some day I am going to have to feed that to a browser and see what
> > happens.  ;-)
> > 
> > I must confess that I hadn't considered feeding randconfig or allyesconfig
> > to that argument, partly because I figured that I would have to also
> > supply Kconfig constraints in those cases in order to ensure that the
> > resulting kernel would actually run under qemu.  I was instead thinking
> > in terms of a --configs option beginning with "RAND", which would pick
> > up the Kconfig constraints from the appropriate configs directory,
> > for example:
> > 
> > tools/testing/selftests/rcutorture/configs/rcu/RAND1
> > 
> > That said, I haven't thought that far down that path.
> > 
> > So for the --defconfig argument, I was thinking more in terms of things
> > like pseries_defconfig or versatile_defconfig.
> 
> Ah, I see.  --defconfig specifies the base configuration, while
> --configs specifies the constraints.  In that case, how about
> --baseconfig?  It might still make sense to pass --baseconfig
> allnoconfig or --baseconfig allyesconfig or --baseconfig randconfig,
> given a sufficiently complete constraints file.

My choice of name was guided by the following:

$ find arch -name '*defconfig*' -print | wc -l
469

When I try baseconfig:

$ find arch -name '*baseconfig*' -print | wc -l
0

Don't get me wrong, you might be correct here.  But the thing is that
I like being able to specify all tests in one go.  That means that I
want to still be able to run my current defconfig-based config files
(e.g., TREE01) when I get a randconfig-based setup going.  So I would
like to be able to specify something like:

sh kvm.sh --defconfig pseries_defconfig --configs "TREE01 RAND01"

But perhaps there is a better way to do this.

Thanx, Paul


[PATCH net-next] vlan: rename __vlan_find_dev_deep() to __vlan_find_dev_deep_rcu()

2014-05-07 Thread Ding Tianhong

__vlan_find_dev_deep() should always be called under RCU. Following
David's suggestion, rename it to __vlan_find_dev_deep_rcu(), which
looks more reasonable.

Signed-off-by: Ding Tianhong 
---
 drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c |  2 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    |  2 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c   |  2 +-
 drivers/net/usb/cdc_mbim.c                         |  2 +-
 drivers/s390/net/qeth_l3_main.c                    | 10 +-
 include/linux/if_vlan.h                            |  4 ++--
 net/8021q/vlan_core.c                              |  6 +++---
 net/bridge/br_netfilter.c                          |  2 +-
 8 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c
index c0a9dd5..b0cbb2b 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c
@@ -185,7 +185,7 @@ static struct net_device *get_iff_from_mac(struct adapter *adapter,
if (ether_addr_equal(dev->dev_addr, mac)) {
rcu_read_lock();
if (vlan && vlan != VLAN_VID_MASK) {
-   dev = __vlan_find_dev_deep(dev, htons(ETH_P_8021Q), vlan);
+   dev = __vlan_find_dev_deep_rcu(dev, htons(ETH_P_8021Q), vlan);
} else if (netif_is_bond_slave(dev)) {
struct net_device *upper_dev;
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 24e16e3..05ce66e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -4061,7 +4061,7 @@ static int update_root_dev_clip(struct net_device *dev)
 
/* Parse all bond and vlan devices layered on top of the physical dev */
for (i = 0; i < VLAN_N_VID; i++) {
-   root_dev = __vlan_find_dev_deep(dev, htons(ETH_P_8021Q), i);
+   root_dev = __vlan_find_dev_deep_rcu(dev, htons(ETH_P_8021Q), i);
if (!root_dev)
continue;
 
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index 7e55e88..076aa5e 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -4122,7 +4122,7 @@ void qlcnic_restore_indev_addr(struct net_device *netdev, unsigned long event)
 
rcu_read_lock();
for_each_set_bit(vid, adapter->vlans, VLAN_N_VID) {
-   dev = __vlan_find_dev_deep(netdev, htons(ETH_P_8021Q), vid);
+   dev = __vlan_find_dev_deep_rcu(netdev, htons(ETH_P_8021Q), vid);
if (!dev)
continue;
qlcnic_config_indev_addr(adapter, dev, event);
diff --git a/drivers/net/usb/cdc_mbim.c b/drivers/net/usb/cdc_mbim.c
index 13f7705..5b5a0a4 100644
--- a/drivers/net/usb/cdc_mbim.c
+++ b/drivers/net/usb/cdc_mbim.c
@@ -206,7 +206,7 @@ static void do_neigh_solicit(struct usbnet *dev, u8 *buf, u16 tci)
/* need to send the NA on the VLAN dev, if any */
rcu_read_lock();
if (tci) {
-   netdev = __vlan_find_dev_deep(dev->net, htons(ETH_P_8021Q),
+   netdev = __vlan_find_dev_deep_rcu(dev->net, htons(ETH_P_8021Q),
  tci);
if (!netdev) {
rcu_read_unlock();
diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c
index 3524d34..403889a 100644
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -1659,7 +1659,7 @@ static void qeth_l3_add_vlan_mc(struct qeth_card *card)
for_each_set_bit(vid, card->active_vlans, VLAN_N_VID) {
struct net_device *netdev;
 
-   netdev = __vlan_find_dev_deep(card->dev, htons(ETH_P_8021Q),
+   netdev = __vlan_find_dev_deep_rcu(card->dev, htons(ETH_P_8021Q),
  vid);
if (netdev == NULL ||
!(netdev->flags & IFF_UP))
@@ -1721,7 +1721,7 @@ static void qeth_l3_add_vlan_mc6(struct qeth_card *card)
for_each_set_bit(vid, card->active_vlans, VLAN_N_VID) {
struct net_device *netdev;
 
-   netdev = __vlan_find_dev_deep(card->dev, htons(ETH_P_8021Q),
+   netdev = __vlan_find_dev_deep_rcu(card->dev, htons(ETH_P_8021Q),
  vid);
if (netdev == NULL ||
!(netdev->flags & IFF_UP))
@@ -1766,7 +1766,7 @@ static void qeth_l3_free_vlan_addresses4(struct qeth_card *card,
 
QETH_CARD_TEXT(card, 4, "frvaddr4");
 
-   netdev = __vlan_find_dev_deep(card->dev, htons(ETH_P_8021Q), vid);
+   netdev =

Re: [PATCH] spi: Force the registration of the spidev devices

2014-05-07 Thread Maxime Ripard

On Mon, May 05, 2014 at 12:17:23PM -0700, Mark Brown wrote:
> On Sun, May 04, 2014 at 11:21:47PM -0500, Maxime Ripard wrote:
> > On Fri, May 02, 2014 at 10:40:48AM -0700, Mark Brown wrote:
> > > > i2c-dev works great in these cases, because you always have access to
> > > > all the bus, and all the devices, except if the device is already used
> > > > by someone. The patch I suggested is an attempt to mimic this.
> 
> > > It seems better to implement something like this at the device model
> > > level, provide a way to have a default UIO driver for anything on a
> > > given bus.  I don't see anything bus specific apart from saying what the
> > > default driver to use is and it avoids the icky code fiddling about with
> > > what devices are bound and the races that might be involved duplicated
> > > in individual buses.
> 
> > Hmmm, yes, that's probably a great long-term way of dealing with this,
> > but I don't see it happening soon.
> 
> Isn't the code in the patch that started this thread roughly what's
> needed, just done in a SPI specific way instead of a generic way?

Hmmm, I think I get your point now. Yes, we could do it. Let me find
some time to actually write something :)

Thanks,
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



Re: Re: Re: [PATCH V6] serial/uart/8250: Add tunable RX interrupt trigger I/F of FIFO buffers

2014-05-07 Thread Yoshihiro YUNOMAE


Hi Greg,

Sorry for the late reply.

(2014/04/26 1:01), Greg Kroah-Hartman wrote:
> On Fri, Apr 25, 2014 at 05:53:02PM +0900, Yoshihiro YUNOMAE wrote:
>> Hi Greg,
>>
>> Thank you for your review.
>>
>> (2014/04/25 8:11), Greg Kroah-Hartman wrote:
>>> On Thu, Apr 17, 2014 at 03:06:44PM +0900, Yoshihiro YUNOMAE wrote:
>>
>> [snip]
>>
>>>> +static DEVICE_ATTR(rx_int_trig, S_IRUSR | S_IWUSR | S_IRGRP,
>>>> +  serial8250_get_attr_rx_int_trig,
>>>> +  serial8250_set_attr_rx_int_trig);
>>>> +
>>>
>>> As you are adding a new sysfs attribute, you have to add a
>>> Documentation/ABI/ entry as well.
>>
>> I added this attribute to /sys/dev/char/*
>
> What?  No.  That's not ok, why would it be?
>
> See, Documentation would have pointed that problem out very obviously :)
>
>> , so the documentation may be sysfs-dev. However, any other attributes
>> are not written at all. Should I add this description to it or is
>> there another file?
>
> It shouldn't be on a char device, that's not acceptable.

My reply that I added this attribute to /sys/dev/char/* was
inappropriate. Actually, I added it to a serial device(ttyS0) in
/sys/devices/ tree. /sys/dev/char/* stores symlinks of devices
in /sys/devices/ tree.

I think the documentation where I should add the description is
Documentation/ABI/testing/sysfs-tty. (/sys/class/* also stores
symlinks of devices in /sys/devices/ tree.)

>>>> +static struct attribute *serial8250_dev_attrs[] = {
>>>> +   &dev_attr_rx_int_trig.attr,
>>>> +   NULL,
>>>> +   };
>>>> +
>>>> +static struct attribute_group serial8250_dev_attr_group = {
>>>> +   .attrs = serial8250_dev_attrs,
>>>> +   };
>>>
>>> What's wrong with the macro to create a group?
>>
>> I'll explain about this below.
>>
>>>> +
>>>> +static void register_dev_spec_attr_grp(struct uart_8250_port *up)
>>>> +{
>>>> +   const struct serial8250_config *conf_type = &uart_config[up->port.type];
>>>> +
>>>> +   if (conf_type->rxtrig_bytes[0])
>>>> +   up->port.dev_spec_attr_group = &serial8250_dev_attr_group;
>>>> +}
>>>> +
>>>>   static void serial8250_config_port(struct uart_port *port, int flags)
>>>>   {
>>>> struct uart_8250_port *up =
>>>> @@ -2708,6 +2848,9 @@ static void serial8250_config_port(struct uart_port *port, int flags)
>>>> if ((port->type == PORT_XR17V35X) ||
>>>>    (port->type == PORT_XR17D15X))
>>>> port->handle_irq = exar_handle_irq;
>>>> +
>>>> +   register_dev_spec_attr_grp(up);
>>>> +   up->fcr = uart_config[up->port.type].fcr;
>>>>   }
>>>>
>>>>   static int
>>>> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
>>>> index 2cf5649..41ac44b 100644
>>>> --- a/drivers/tty/serial/serial_core.c
>>>> +++ b/drivers/tty/serial/serial_core.c
>>>> @@ -2548,15 +2548,16 @@ static struct attribute *tty_dev_attrs[] = {
>>>> NULL,
>>>> };
>>>>
>>>> -static const struct attribute_group tty_dev_attr_group = {
>>>> +static struct attribute_group tty_dev_attr_group = {
>>>> .attrs = tty_dev_attrs,
>>>> };
>>>>
>>>> -static const struct attribute_group *tty_dev_attr_groups[] = {
>>>> -   &tty_dev_attr_group,
>>>> -   NULL
>>>> -   };
>>>> -
>>>> +static void make_uport_attr_grps(struct uart_port *uport)
>>>> +{
>>>> +   uport->attr_grps[0] = &tty_dev_attr_group;
>>>> +   if (uport->dev_spec_attr_group)
>>>> +   uport->attr_grps[1] = uport->dev_spec_attr_group;
>>>> +}
>>>>
>>>>   /**
>>>>    *uart_add_one_port - attach a driver-defined port structure
>>>> @@ -2607,12 +2608,15 @@ int uart_add_one_port(struct uart_driver *drv, struct uart_port *uport)
>>>>
>>>> uart_configure_port(drv, state, uport);
>>>>
>>>> +   make_uport_attr_grps(uport);
>>>> +
>>>> /*
>>>>  * Register the port whether it's detected or not.  This allows
>>>>  * setserial to be used to alter this port's parameters.
>>>>  */
>>>> tty_dev = tty_port_register_device_attr(port, drv->tty_driver,
>>>> -   uport->line, uport->dev, port, tty_dev_attr_groups);
>>>> +   uport->line, uport->dev, port,
>>>> +   (const struct attribute_group **)uport->attr_grps);
>>>
>>> If you have to cast that hard, something is wrong here, why are you
>>> doing that?
>>
>> The attribute group in serial layer was defined as constant
>> because serial layer has only common sysfs I/F. However, I want to
>> change sysfs I/F for specific devices. So, I deleted 'const' from the
>> definition of the attribute group in serial layer in order to make the
>> attribute group be changeable. On the other hand, to pass the attribute
>> group to tty layer, the group must be const because the 5th variable of
>> tty_port_register_device_attr() is an attribute group with 'const', so
>> I implemented like this. Although I investigated again,
>> tty_port_register_device_attr() is used only here, and
>> tty_register_device_attr() called by the function is called from 2
>> locations (the one of them passes NULL in the 5th variable).
>> Therefore, we can delete 'const' for those functions, I think.
>> How do you think about this?
>
> I think you need to not be messing with the devices in /sys/dev/char/ at
> all...
>
> And why do you feel you need a sysfs attribute at all?  What is it going
> to be used for?  Who needs it?  Without knowing that, I can't really
> answer your questions...


In the

RE: [PATCH 1/3] PM / OPP: Add support for descending order for cpufreq table

2014-05-07 Thread Jonghwan Choi

I believe that three items are required for DVFS: frequency, voltage, and
divider value. Currently OPP only supports voltage and frequency, so some
cpufreq and devfreq drivers get the divider value from a separate divider
table struct.

How about adding that divider value to struct dev_pm_opp, like this:

struct dev_pm_opp {
	struct list_head node;

	bool available;
	unsigned long rate;
	unsigned long u_volt;
	unsigned int ctl[2]; // Added

	struct device_opp *dev_opp;
	struct rcu_head head;
};

In my test, it works very well.

I got this idea from _PCT in the ACPI spec.

Then we can remove a lot of code related to the divider table, and we can
also solve this problem.

Thanks

Best Regards.


> -Original Message-
> From: menon.nisha...@gmail.com [mailto:menon.nisha...@gmail.com] On Behalf
> Of Nishanth Menon
> Sent: Thursday, May 08, 2014 10:56 AM
> To: Jonghwan Choi
> Cc: Viresh Kumar; Linux PM list; open list; Rafael J. Wysocki; Len Brown;
> Amit Daniel Kachhap
> Subject: Re: [PATCH 1/3] PM / OPP: Add support for descending order for
> cpufreq table
> 
> On Wed, May 7, 2014 at 8:22 PM, Jonghwan Choi 
> wrote:
> >> @Jonghwan: Please consider doing this:
> >> - Don't play with the order of frequencies in table.
> >> - Instead initialize .driver_data filed with values that you need to
> >> write in the registers for all frequencies. i.e. 0 for highest
> >> frequency and
> >> FREQ_COUNT-1 for lowest one.
> >
> > -> For that, I changed like this.
> > For initializing .driver_data, I changed dev_pm_opp_init_cpufreq_table
> > function().
> >
> >
> > --- a/drivers/base/power/opp.c
> > +++ b/drivers/base/power/opp.c
> > @@ -622,12 +622,12 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_disable);
> >   * or in contexts where mutex locking cannot be used.
> >   */
> >  int dev_pm_opp_init_cpufreq_table(struct device *dev,
> > -   struct cpufreq_frequency_table **table)
> > +   struct cpufreq_frequency_table **table, int order)
> >  {
> > struct device_opp *dev_opp;
> > struct dev_pm_opp *opp;
> > struct cpufreq_frequency_table *freq_table;
> > -   int i = 0;
> > +   int i = 0, index = 0;
> >
> > /* Pretend as if I am an updater */
> > mutex_lock(&dev_opp_list_lock);
> > @@ -649,16 +649,22 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
> > return -ENOMEM;
> > }
> >
> > +   if (OPP_TABLE_ORDER_DESCENDING == order)
> > +   index = dev_pm_opp_get_opp_count(dev) - 1;
> > +
> > list_for_each_entry(opp, &dev_opp->opp_list, node) {
> > if (opp->available) {
> > -   freq_table[i].driver_data = i;
> > +   if (OPP_TABLE_ORDER_DESCENDING == order)
> > +   freq_table[i].driver_data = index--;
> > +   else
> > +   freq_table[i].driver_data = index++;
> > freq_table[i].frequency = opp->rate / 1000;
> > i++;
> > }
> > }
> > mutex_unlock(&dev_opp_list_lock);
> >
> > -   freq_table[i].driver_data = i;
> > +   freq_table[i].driver_data = index;
> > freq_table[i].frequency = CPUFREQ_TABLE_END;
> >
> > *table = &freq_table[0];
> >
> >
> > Is it acceptiable?
> 
> Personally, I feel that filling up driver_data should be left to the
> driver(caller of dev_pm_opp_init_cpufreq_table). for example providing a
> function pointer which decides what that value should be (be it index or
> some magical register value).. Viresh might have better opinions.
> 
> Regards,
> Nishanth Menon


mm,console: circular dependency between console_sem and zone lock

2014-05-07 Thread Sasha Levin

Hi all,

While fuzzing with trinity inside a KVM tools guest running the latest -next
kernel I've stumbled on the following spew:

[  262.793172] ==
[  262.794555] [ INFO: possible circular locking dependency detected ]
[  262.796110] 3.15.0-rc4-next-20140507-sasha-4-g14be78b-dirty #448 
Tainted: GW
[  262.798430] ---
[  262.799804] runtrin.sh/9791 is trying to acquire lock:
[  262.801168] ((console_sem).lock){-.-...}, at: down_trylock 
(kernel/locking/semaphore.c:137)
[  262.801216]
[  262.801216] but task is already holding lock:
[  262.801216] (&(&zone->lock)->rlock){-.-...}, at: __offline_isolated_pages (mm/page_alloc.c:6427)
[  262.801216]
[  262.801216] which lock already depends on the new lock.
[  262.801216]
[  262.801216]
[  262.801216] the existing dependency chain (in reverse order) is:
[  262.801216]
-> #3 (&(&zone->lock)->rlock){-.-...}:
[  262.801216] lock_acquire (arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:3602)
[  262.801216] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:117 
kernel/locking/spinlock.c:159)
[  262.801216] get_page_from_freelist (mm/page_alloc.c:1574 
mm/page_alloc.c:2033)
[  262.801216] __alloc_pages_nodemask (mm/page_alloc.c:2728)
[  262.801216] alloc_page_interleave (mm/mempolicy.c:1944)
[  262.801216] alloc_pages_current (mm/mempolicy.c:2041)
[  262.801216] new_slab (include/linux/gfp.h:337 mm/slub.c:1327 mm/slub.c:1356 
mm/slub.c:1418)
[  262.801216] __slab_alloc (mm/slub.c:2204 mm/slub.c:2364)
[  262.801216] kmem_cache_alloc (mm/slub.c:2470 mm/slub.c:2481 mm/slub.c:2486)
[  262.801216] __debug_object_init (lib/debugobjects.c:97 
lib/debugobjects.c:311)
[  262.801216] debug_object_init (lib/debugobjects.c:364)
[  262.801216] hrtimer_init (kernel/hrtimer.c:437 
include/linux/jump_label.h:105 include/trace/events/timer.h:130 
kernel/hrtimer.c:482 kernel/hrtimer.c:1222)
[  262.801216] __sched_fork (kernel/sched/core.c:1745)
[  262.801216] init_idle (kernel/sched/core.c:4460)
[  262.801216] fork_idle (kernel/fork.c:1565)
[  262.801216] idle_threads_init (kernel/smpboot.c:54 kernel/smpboot.c:72)
[  262.801216] smp_init (kernel/smp.c:535)
[  262.801216] kernel_init_freeable (init/main.c:854 init/main.c:1007)
[  262.801216] kernel_init (init/main.c:939)
[  262.801216] ret_from_fork (arch/x86/kernel/entry_64.S:553)
[  262.801216]
-> #2 (&rq->lock){-.-.-.}:
[  262.801216] lock_acquire (arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:3602)
[  262.801216] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 
kernel/locking/spinlock.c:151)
[  262.801216] wake_up_new_task (include/linux/sched.h:2873 
kernel/sched/core.c:329 kernel/sched/core.c:2027)
[  262.801216] do_fork (kernel/fork.c:1628)
[  262.801216] kernel_thread (kernel/fork.c:1650)
[  262.801216] rest_init (init/main.c:404)
[  262.801216] start_kernel (init/main.c:683)
[  262.801216] x86_64_start_reservations (arch/x86/kernel/head64.c:194)
[  262.801216] x86_64_start_kernel (arch/x86/kernel/head64.c:183)
[  262.801216]
-> #1 (&p->pi_lock){-.-.-.}:
[  262.801216] lock_acquire (arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:3602)
[  262.801216] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:117 
kernel/locking/spinlock.c:159)
[  262.801216] try_to_wake_up (kernel/sched/core.c:1605)
[  262.801216] wake_up_process (kernel/sched/core.c:1701 (discriminator 2))
[  262.801216] __up.isra.0 (kernel/locking/semaphore.c:263)
[  262.801216] up (kernel/locking/semaphore.c:186)
[  262.801216] console_unlock (kernel/printk/printk.c:2230)
[  262.801216] vprintk_emit (kernel/printk/printk.c:1746)
[  262.801216] dev_vprintk_emit (drivers/base/core.c:2053 (discriminator 3))
[  262.801216] dev_printk_emit (drivers/base/core.c:2068)
[  262.801216] __dynamic_dev_dbg (lib/dynamic_debug.c:593)
[  262.801216] pps_event (drivers/pps/kapi.c:204 (discriminator 1))
[  262.801216] pps_ktimer_event (drivers/pps/clients/pps-ktimer.c:51)
[  262.801216] call_timer_fn (kernel/timer.c:1140 
include/linux/jump_label.h:105 include/trace/events/timer.h:106 
kernel/timer.c:1141)
[  262.801216] run_timer_softirq (include/linux/spinlock.h:328 
kernel/timer.c:1213 kernel/timer.c:1403)
[  262.801216] __do_softirq (kernel/softirq.c:269 
include/linux/jump_label.h:105 include/trace/events/irq.h:126 
kernel/softirq.c:270)
[  262.801216] irq_exit (kernel/softirq.c:346 kernel/softirq.c:387)
[  262.801216] smp_apic_timer_interrupt (arch/x86/include/asm/irq_regs.h:26 
arch/x86/kernel/apic/apic.c:947)
[  262.801216] apic_timer_interrupt (arch/x86/kernel/entry_64.S:1225)
[  262.801216] default_idle (arch/x86/include/asm/paravirt.h:111 
arch/x86/kernel/process.c:310)
[  262.801216] arch_cpu_idle (arch/x86/kernel/process.c:302)
[  262.801216] cpu_idle_loop (kernel/sched/idle.c:173 kernel/sched/idle.c:220)
[  262.801216] cpu_startup_entry (??:?)
[  262.801216] start_secon

RE: i.MX28 based system losing eth0 on boot

2014-05-07 Thread fugang.d...@freescale.com

From: Brian Lilly
Date: Thursday, May 08, 2014 3:52 AM

>To: Florian Fainelli
>Cc: Uwe Kleine-König; David S. Miller; Estevam Fabio-R49496; Jim Baxter;
>Li Frank-B20596; Duan Fugang-B38611; netdev; linux-kernel@vger.kernel.org; kernel
>Subject: Re: i.MX28 based system losing eth0 on boot
>
>Florian:
>
>Thank you for your help.
>
>After doubling the timeout length it worked.
>
>I managed to get my hands on a imx28evk board and compared our component load
>versus theirs, to find they have a 1.5k pull-up on ENET_MDIO to +3.3v which
>wasn't present on our board.  Adding a 1.5k pull-up resistor on ENET_MDIO
>solves the problem, and boots as expected without patching anything.
>
>Sorry for the trouble on this.
>
>Apparently our EE had some question as to whether or not the pull-up was
>necessary, and put it in the schematic, and the footprint on the board, but
>marked it as a DNP, which of course left it off the board and out of the BOM.
[...]

Yes, a 1.5K pull-up on MDIO is necessary. Otherwise the data written to or
read from the PHY registers is not correct, because the drive strength is
not sufficient.

Thanks,
Andy

Re: perf_fuzzer crash on pentium 4

2014-05-07 Thread Don Zickus

On Wed, May 07, 2014 at 12:23:08AM +0400, Cyrill Gorcunov wrote:
> On Tue, May 06, 2014 at 11:42:58AM -0400, Vince Weaver wrote:
> > 
> > So just to be difficult I fired up the perf_fuzzer on a Pentium 4 machine.
> > 
> > It crashes more or less instantly (sorry for the line wrapping, 
> > just got the serial console hooked up and don't have minicom configured 
> > right yet).
> > 
> > this is 3.15-rc4 with the anti-memory corruption patch applied.
> > 
> > [   67.872274] BUG: unable to handle kernel NULL pointer dereference at 
> > 0004
> > [   67.876146] IP: [] p4_pmu_schedule_events+0xa5/0x331
> 
> This looks like
> 
> p4_pmu_schedule_events:
>   ...
>   bind = p4_config_get_bind(hwc->config);
>   returned bind = NULL;
>   escr_idx = p4_get_escr_idx(bind->escr_msr[thread]); NULL deref
> 
> If i'm right (btw it's possible to use addr2line helper?) then hwc->config
> is corrupted and p4_config_get_bind returned nil simply because proper event
> was not found. And I don't understand how it could happen because before
> configuration gets written into hwc->config it's validated once obtained
> from user-space as a raw event. Weird...

I think my commit 13beacee817d27a40ffc6f065ea0042685611dd5 explains this
corruption.  Though I have to admit I haven't looked through the problem
very closely yet.

IOW my lazy fix in that commit doesn't cover fuzzers and the real problem
in p4_pmu_schedule_events. :-)

Cheers,
Don
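
The failure mode Cyrill describes above (a lookup that finds no matching event and returns NULL, followed by an unchecked dereference) can be sketched in user-space C as follows. This is only an illustration under stated assumptions, not the perf code and not a proposed fix: struct event_bind, bind_table, lookup_bind(), get_escr_msr() and the MSR numbers are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a P4 event binding; the values are made up. */
struct event_bind {
	unsigned int opcode;
	unsigned int escr_msr[2];	/* one entry per hardware thread */
};

static const struct event_bind bind_table[] = {
	{ 0x01, { 0x3a0, 0x3a1 } },
	{ 0x02, { 0x3b0, 0x3b1 } },
};

/* Return the binding for opcode, or NULL when no entry matches. */
static const struct event_bind *lookup_bind(unsigned int opcode)
{
	size_t i;

	for (i = 0; i < sizeof(bind_table) / sizeof(bind_table[0]); i++)
		if (bind_table[i].opcode == opcode)
			return &bind_table[i];
	return NULL;
}

/*
 * Resolve the ESCR MSR for an event on the given thread.
 * Returns 0 on success and -1 when the event has no binding or the
 * thread index is out of range, instead of dereferencing a NULL pointer.
 */
static int get_escr_msr(unsigned int opcode, int thread, unsigned int *msr)
{
	const struct event_bind *bind = lookup_bind(opcode);

	assert(msr != NULL);
	if (!bind || thread < 0 || thread > 1)
		return -1;
	*msr = bind->escr_msr[thread];
	return 0;
}
```

A corrupted configuration word corresponds here to an opcode that is not in the table: the caller gets an error back and the output value is left untouched.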

Re: drivers/leds: Replace __get_cpu_var use through this_cpu_ptr

2014-05-07 Thread Bryan Wu

On Mon, May 5, 2014 at 9:48 AM, Christoph Lameter  wrote:
> Would you merge this patch please?
>
>

Sure, I merged it to my tree
Thanks,
-Bryan

> Use this_cpu_ptr for the address calculation instead of __get_cpu_var.
>
> Acked-by: Bryan Wu 
> Signed-off-by: Christoph Lameter 
>
> Index: linux/drivers/leds/trigger/ledtrig-cpu.c
> ===
> --- linux.orig/drivers/leds/trigger/ledtrig-cpu.c   2014-04-14 
> 13:24:54.833360823 -0500
> +++ linux/drivers/leds/trigger/ledtrig-cpu.c2014-04-14 13:24:54.825360977 
> -0500
> @@ -47,7 +47,7 @@
>   */
>  void ledtrig_cpu(enum cpu_led_event ledevt)
>  {
> -   struct led_trigger_cpu *trig = &__get_cpu_var(cpu_trig);
> +   struct led_trigger_cpu *trig = this_cpu_ptr(&cpu_trig);
>
> /* Locate the correct CPU LED */
> switch (ledevt) {

Re: [PATCH 1/3] PM / OPP: Add support for descending order for cpufreq table

2014-05-07 Thread Nishanth Menon

On Wed, May 7, 2014 at 8:22 PM, Jonghwan Choi  wrote:
>> @Jonghwan: Please consider doing this:
>> - Don't play with the order of frequencies in table.
>> - Instead initialize .driver_data filed with values that you need to write
>> in the registers for all frequencies. i.e. 0 for highest frequency and
>> FREQ_COUNT-1 for lowest one.
>
> -> For that, I changed like this.
> For initializing .driver_data, I changed dev_pm_opp_init_cpufreq_table 
> function().
>
>
> --- a/drivers/base/power/opp.c
> +++ b/drivers/base/power/opp.c
> @@ -622,12 +622,12 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_disable);
>   * or in contexts where mutex locking cannot be used.
>   */
>  int dev_pm_opp_init_cpufreq_table(struct device *dev,
> -   struct cpufreq_frequency_table **table)
> +   struct cpufreq_frequency_table **table, int order)
>  {
> struct device_opp *dev_opp;
> struct dev_pm_opp *opp;
> struct cpufreq_frequency_table *freq_table;
> -   int i = 0;
> +   int i = 0, index = 0;
>
> /* Pretend as if I am an updater */
> mutex_lock(&dev_opp_list_lock);
> @@ -649,16 +649,22 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
> return -ENOMEM;
> }
>
> +   if (OPP_TABLE_ORDER_DESCENDING == order)
> +   index = dev_pm_opp_get_opp_count(dev) - 1;
> +
> list_for_each_entry(opp, &dev_opp->opp_list, node) {
> if (opp->available) {
> -   freq_table[i].driver_data = i;
> +   if (OPP_TABLE_ORDER_DESCENDING == order)
> +   freq_table[i].driver_data = index--;
> +   else
> +   freq_table[i].driver_data = index++;
> freq_table[i].frequency = opp->rate / 1000;
> i++;
> }
> }
> mutex_unlock(&dev_opp_list_lock);
>
> -   freq_table[i].driver_data = i;
> +   freq_table[i].driver_data = index;
> freq_table[i].frequency = CPUFREQ_TABLE_END;
>
> *table = &freq_table[0];
>
>
> Is it acceptiable?

Personally, I feel that filling up driver_data should be left to the
driver(caller of dev_pm_opp_init_cpufreq_table). for example providing
a function pointer which decides what that value should be (be it
index or some magical register value).. Viresh might have better
opinions.

Regards,
Nishanth Menon
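
The function-pointer idea suggested above could look roughly like the user-space sketch below. It is a simplified illustration under stated assumptions, not the OPP API: struct freq_entry, FREQ_TABLE_END, driver_data_fn, fill_freq_table() and the two callbacks are hypothetical names. The frequency order in the table is never changed; only the value stored in driver_data depends on the callback the driver passes in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct cpufreq_frequency_table. */
struct freq_entry {
	unsigned int driver_data;
	unsigned int frequency;	/* kHz */
};

/* Hypothetical end-of-table marker for this sketch. */
#define FREQ_TABLE_END (~0u)

/* Callback type: maps a table position to the value the driver wants. */
typedef unsigned int (*driver_data_fn)(size_t index, size_t count);

/* driver_data equals the table position. */
static unsigned int ascending_index(size_t index, size_t count)
{
	(void)count;
	return (unsigned int)index;
}

/* driver_data counts down, so the last entry gets 0. */
static unsigned int descending_index(size_t index, size_t count)
{
	return (unsigned int)(count - 1 - index);
}

/*
 * Fill table[0..count-1] from rates_hz, keeping the given order, and
 * terminate it. The table must have room for count + 1 entries.
 */
static void fill_freq_table(struct freq_entry *table,
			    const unsigned long *rates_hz, size_t count,
			    driver_data_fn fn)
{
	size_t i;

	assert(table != NULL && rates_hz != NULL && fn != NULL);
	for (i = 0; i < count; i++) {
		table[i].driver_data = fn(i, count);
		table[i].frequency = (unsigned int)(rates_hz[i] / 1000);
	}
	table[count].driver_data = 0;
	table[count].frequency = FREQ_TABLE_END;
}
```

With rates listed from lowest to highest, descending_index() gives 0 to the highest frequency and count - 1 to the lowest, which is the mapping asked for earlier in the thread; a driver that needs a raw register value instead could supply its own callback.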

Re: [PATCH tip/core/rcu 07/45] torture: Allow variations of "defconfig" to be specified

2014-05-07 Thread Josh Triplett

On Wed, May 07, 2014 at 04:52:40PM -0700, Paul E. McKenney wrote:
> On Wed, May 07, 2014 at 02:22:19PM -0700, j...@joshtriplett.org wrote:
> > On Mon, Apr 28, 2014 at 05:24:55PM -0700, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" 
> > > 
> > > Some environments require some variation on "make defconfig" to initialize
> > > the .config file.  This commit therefore adds a --defconfig argument to
> > > allow this to be specified.  The default value is of course "defconfig".
> > > 
> > > Signed-off-by: Paul E. McKenney 
> > 
> > 
> > "--defconfig randconfig" or "--defconfig allyesconfig" or similar seems
> > rather odd; how about calling it --kconfig or similar?
> > 
> 
> Some day I am going to have to feed that to a browser and see what
> happens.  ;-)
> 
> I must confess that I hadn't considered feeding randconfig or allyesconfig
> to that argument, partly because I figured that I would have to also
> supply Kconfig constraints in those cases in order to ensure that the
> resulting kernel would actually run under qemu.  I was instead thinking
> in terms of a --configs option beginning with "RAND", which would pick
> up the Kconfig constraints from the appropriate configs directory,
> for example:
> 
>   tools/testing/selftests/rcutorture/configs/rcu/RAND1
> 
> That said, I haven't thought that far down that path.
> 
> So for the --defconfig argument, I was thinking more in terms of things
> like pseries_defconfig or versatile_defconfig.

Ah, I see.  --defconfig specifies the base configuration, while
--configs specifies the constraints.  In that case, how about
--baseconfig?  It might still make sense to pass --baseconfig
allnoconfig or --baseconfig allyesconfig or --baseconfig randconfig,
given a sufficiently complete constraints file.

- Josh Triplett

Re: [PATCH 3/4] MADV_VOLATILE: Add purged page detection on setting memory non-volatile

2014-05-07 Thread Minchan Kim

On Tue, Apr 29, 2014 at 02:21:22PM -0700, John Stultz wrote:
> Users of volatile ranges will need to know if memory was discarded.
> This patch adds the purged state tracking required to inform userland
> when it marks memory as non-volatile that some memory in that range
> was purged and needs to be regenerated.
> 
> This simplified implementation which uses some of the logic from
> Minchan's earlier efforts, so credit to Minchan for his work.
> 
> Cc: Andrew Morton 
> Cc: Android Kernel Team 
> Cc: Johannes Weiner 
> Cc: Robert Love 
> Cc: Mel Gorman 
> Cc: Hugh Dickins 
> Cc: Dave Hansen 
> Cc: Rik van Riel 
> Cc: Dmitry Adamushko 
> Cc: Neil Brown 
> Cc: Andrea Arcangeli 
> Cc: Mike Hommey 
> Cc: Taras Glek 
> Cc: Jan Kara 
> Cc: KOSAKI Motohiro 
> Cc: Michel Lespinasse 
> Cc: Minchan Kim 
> Cc: Keith Packard 
> Cc: linux...@kvack.org 
> Acked-by: Jan Kara 
> Signed-off-by: John Stultz 
> ---
>  include/linux/swap.h|  5 +++
>  include/linux/swapops.h | 10 ++
>  mm/mvolatile.c  | 87 
> +
>  3 files changed, 102 insertions(+)
> 
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index a32c3da..3abc977 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -55,6 +55,7 @@ enum {
>* 1<* new entries here at the top of the enum, not at the bottom
>*/
> + SWP_MVOLATILE_PURGED_NR,
>  #ifdef CONFIG_MEMORY_FAILURE
>   SWP_HWPOISON_NR,
>  #endif
> @@ -81,6 +82,10 @@ enum {
>  #define SWP_HWPOISON (MAX_SWAPFILES + SWP_HWPOISON_NR)
>  #endif
>  
> +/*
> + * Purged volatile range pages
> + */
> +#define SWP_MVOLATILE_PURGED (MAX_SWAPFILES + SWP_MVOLATILE_PURGED_NR)
>  
>  /*
>   * Magic header for a swap area. The first part of the union is
> diff --git a/include/linux/swapops.h b/include/linux/swapops.h
> index c0f7526..fe9c026 100644
> --- a/include/linux/swapops.h
> +++ b/include/linux/swapops.h
> @@ -161,6 +161,16 @@ static inline int is_write_migration_entry(swp_entry_t 
> entry)
>  
>  #endif
>  
> +static inline swp_entry_t make_purged_entry(void)
> +{
> + return swp_entry(SWP_MVOLATILE_PURGED, 0);
> +}
> +
> +static inline int is_purged_entry(swp_entry_t entry)
> +{
> + return swp_type(entry) == SWP_MVOLATILE_PURGED;
> +}
> +
>  #ifdef CONFIG_MEMORY_FAILURE
>  /*
>   * Support for hardware poisoned pages
> diff --git a/mm/mvolatile.c b/mm/mvolatile.c
> index edc5894..555d5c4 100644
> --- a/mm/mvolatile.c
> +++ b/mm/mvolatile.c
> @@ -13,8 +13,92 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include "internal.h"
>  
> +struct mvolatile_walker {
> + struct vm_area_struct *vma;
> + int page_was_purged;
> +};
> +
> +
> +/**
> + * mvolatile_check_purged_pte - Checks ptes for purged pages
> + * @pmd: pmd to walk
> + * @addr: starting address
> + * @end: end address
> + * @walk: mm_walk ptr (contains ptr to mvolatile_walker)
> + *
> + * Iterates over the ptes in the pmd checking if they have
> + * purged swap entries.
> + *
> + * Sets the mvolatile_walker.page_was_purged to 1 if any were purged,
> + * and clears the purged pte swp entries (since the pages are no
> + * longer volatile, we don't want future accesses to SIGBUS).
> + */
> +static int mvolatile_check_purged_pte(pmd_t *pmd, unsigned long addr,
> + unsigned long end, struct mm_walk *walk)
> +{
> + struct mvolatile_walker *vw = walk->private;
> + pte_t *pte;
> + spinlock_t *ptl;
> +
> + if (pmd_trans_huge(*pmd))
> + return 0;
> + if (pmd_trans_unstable(pmd))
> + return 0;
> +
> + pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
> + for (; addr != end; pte++, addr += PAGE_SIZE) {
> + if (!pte_present(*pte)) {
> + swp_entry_t mvolatile_entry = pte_to_swp_entry(*pte);
> +
> + if (unlikely(is_purged_entry(mvolatile_entry))) {
> +
> + vw->page_was_purged = 1;
> +
> + /* clear the pte swp entry */
> + flush_cache_page(vw->vma, addr, pte_pfn(*pte));

Maybe we don't need to flush the cache because there is no mapped page.

> + ptep_clear_flush(vw->vma, addr, pte);

Maybe we don't need this, either. We didn't set the present bit for the purged
page, and when I look at the internals of ptep_clear_flush, it checks the
present bit and skips the TLB flush, so it's okay for x86, but I'm not sure
about other architectures.
A clearer function for our purpose would be pte_clear_not_present_full.

And we are changing the page table, so at the least we need to handle
mmu_notifier to inform its clients of the change.

> + }
> + }
> + }
> + pte_unmap_unlock(pte - 1, ptl);
> + cond_resched();
> +
> + return 0;
> +}
> +
> +
> +/**
> + * mvolatile_check_purged - Sets up a mm_walk to check for purged pages
> + * @vma: ptr to vma we're starting

Re: [PATCH v2 03/10] slab: move up code to get kmem_cache_node in free_block()

2014-05-07 Thread George Spelvin

(Oops, previous e-mail was sent halfway through composition in error.)

> I'm not sure it's even correct since 
> you're now clearing after doing recheck_pfmemalloc_active().

I thought this through before rearranging the code.
recheck_pfmemalloc_active() checks global lists, but __ac_get_obj()
is doing clear_obj_pfmemalloc on a local variable.  So it
can't affect recheck_pfmemalloc_active().

> A function called clear_obj_pfmemalloc() doesn't indicate it's returning 
> anything, I think the vast majority of people would believe that it 
> returns void just as it does.

Perhaps the name needs to be modified, but it's still pretty clear.
It just clears the bit in its argument and returns it, as opposed to
operating in-place.

In particular, when reading the code that calls it, there is obviously
a return value.  What could it possibly be?
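
To illustrate the two calling styles being debated, here is a small userland
sketch of low-bit pointer tagging. It is not the slab code: OBJ_TAG_BIT,
set_tag, obj_is_tagged, clear_tag and clear_tag_inplace are hypothetical
names, and the sketch assumes object pointers are at least 2-byte aligned so
that bit 0 is free.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag stored in bit 0 of an aligned object pointer. */
#define OBJ_TAG_BIT ((uintptr_t)1)

static inline bool obj_is_tagged(const void *objp)
{
	return ((uintptr_t)objp & OBJ_TAG_BIT) != 0;
}

static inline void *set_tag(void *objp)
{
	return (void *)((uintptr_t)objp | OBJ_TAG_BIT);
}

/* In-place style: modifies the caller's pointer, returns nothing. */
static inline void clear_tag_inplace(void **objp)
{
	*objp = (void *)((uintptr_t)*objp & ~OBJ_TAG_BIT);
}

/* Value-returning style: leaves the argument alone, returns the clean pointer. */
static inline void *clear_tag(void *objp)
{
	return (void *)((uintptr_t)objp & ~OBJ_TAG_BIT);
}
```

The value-returning form works on a copy, so it cannot change any pointer
that is still stored in a shared structure; the caller decides where the
cleaned value goes.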

> There's no compiler generated code optimization with this patch

On that subject, you're absolutely correct:
   text    data     bss     dec     hex filename
  10635     939       4   11578    2d3a mm/slab.o.before
  10635     939       4   11578    2d3a mm/slab.o.after

If I don't have CONFIG_CC_OPTIMIZE_FOR_SIZE=y, the padding NOPs after
unconditional jumps get aligned a little differently and it actually
gets bigger.

   text    data     bss     dec     hex filename
  12958    1079       4   14041    36d9 mm/slab.o.before
  12990    1079       4   14073    36f9 mm/slab.o.after

__ac_get_obj actually spills one fewer register in this case, and
the code paths seem a little cleaner, but I haven't gone through
it completely.

Re: [PATCH 5/5] x86, nmi: Add better NMI stats to /proc/interrupts and show handlers

2014-05-07 Thread Don Zickus

On Wed, May 07, 2014 at 07:50:48PM +0000, Elliott, Robert (Server Storage) wrote:
> Don Zickus  wrote:
> > The main reason for this patch is because I have a hard time knowing
> > what NMI handlers are registered on the system when debugging NMI issues.
> > 
> > This info is provided in /proc/interrupts for interrupt handlers, so I
> > added support for NMI stuff too.  As a bonus it provides stat breakdowns
> > much like the interrupts.
> 
> /proc/interrupts only shows online CPUs, while /proc/softirqs shows 
> all possible CPUs.  Is there any value in this information for all 
> possible CPUs? Perhaps a /proc/hardirqs could be created alongside.

Well if they are not online, they probably won't be generating NMIs, so I
am not sure there is much value there.

> 
> > The only ugly issue is how to label NMI subtypes using only 3 letters
> > and still make it obvious it is part of the NMI.  Adding a /proc/nmi
> > seemed overkill, so I choose to indent things by one space.  
> 
> The list only shows the currently registered handlers, which may
> differ from the ones that were registered when the NMIs whose counts 
> are being displayed occurred. You might want to describe these new 
> rows and mention that in Documentation/filesystems/proc.txt and 
> the proc(5) manpage.

Ok, but that is a /proc/interrupts problem not one specific to NMI, no?

> 
> > Sample output is below:
> > 
> > [root@dhcp71-248 ~]# cat /proc/interrupts
> >CPU0   CPU1   CPU2   CPU3
> >   0: 29  0  0  0  IR-IO-APIC-edge  timer
> > 
> > NMI: 20774  10986   4227   Non-maskable interrupts
> >  LOC: 21775  10987   4228  Local PMI, arch_bt
> >  EXT:  0  0  0  0  External  plat
> >  UNK:  0  0  0  0  Unknown
> >  SWA:  0  0  0  0  Swallowed
> 
> Adding the list of NMI handlers in /proc/interrupts is a bit 
> inconsistent with the other interrupts, which don't describe their 
> handlers. It would be helpful to distinguish between a handler 
> list being present, being present but empty, or not being present.
> 
> Maybe use parenthesis like this (using Ingo's suggested format):
>  NMI: 20774  10986   4227   Non-maskable interrupts
>  NLC: 21775  10987   4228   NMI: Local (PMI, arch_bt)
>  NXT:  0  0  0  0   NMI: External (plat)
>  NUN:  0  0  0  0   NMI: Unknown ()
>  NSW:  0  0  0  0   NMI: Swallowed
>  LOC:  30374  24749  20795  15095   Local timer interrupts
> 

Hmm, looking at /proc/interrupts I see

  1: 858014  29054  23191   9337   IO-APIC-edge  i8042
  8:  3 24 10  2   IO-APIC-edge  rtc0
  9: 387555   9219   8308   7944   IO-APIC-fasteoi   acpi
 12:9251360 163811 158846 141916   IO-APIC-edge  i8042
 16:  0  0  0  0   IO-APIC-fasteoi   mmc0
 17: 14  5  7 10   IO-APIC-fasteoi 
 19:   6892367 13 10   IO-APIC-fasteoi   ehci_hcd:usb2, ips, firewire_ohci
 23:1363281753 94 94   IO-APIC-fasteoi ehci_hcd:usb1

Those may not be specific handlers, but they are registered irq names, no?
That basically matches what I was trying to accomplish with NMI. 

I guess I don't see how what I did is much different than what already
exists.


> > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> > index d99f31d..520359c 100644
> > --- a/arch/x86/kernel/irq.c
> > +++ b/arch/x86/kernel/irq.c
> ...
> > +void nmi_show_interrupts(struct seq_file *p, int prec)
> > +{
> > +   int j;
> > +   int indent = prec + 1;
> > +
> > +#define get_nmi_stats(j)   (&per_cpu(nmi_stats, j))
> > +
> > +   seq_printf(p, "%*s: ", indent, "LOC");
> > +   for_each_online_cpu(j)
> > +   seq_printf(p, "%10u ", get_nmi_stats(j)->normal);
> > +   seq_printf(p, " %-8s", "Local");
> > +
> > +   print_nmi_action_name(p, NMI_LOCAL);
> > +
> > +   seq_printf(p, "%*s: ", indent, "EXT");
> > +   for_each_online_cpu(j)
> > +   seq_printf(p, "%10u ", get_nmi_stats(j)->external);
> > +   seq_printf(p, " %-8s", "External");
> > +
> > +   print_nmi_action_name(p, NMI_EXT);
> > +
> > +   seq_printf(p, "%*s: ", indent, "UNK");
> > +   for_each_online_cpu(j)
> > +   seq_printf(p, "%10u ", get_nmi_stats(j)->unknown);
> > +   seq_printf(p, " %-8s", "Unknown");
> > +
> > +   print_nmi_action_name(p, NMI_UNKNOWN);
> > +
> 
> The NMI handler types are in arch/x86/include/asm/nmi.h:
> enum {
> NMI_LOCAL=0,
> NMI_UNKNOWN,
> NMI_SERR,
> NMI_IO_CHECK,
> NMI_MAX
> };
> 
> The new code only prints the registered handlers for NMI_LOCAL, 
> NMI_UNKNOWN, and the new NMI_EXT.

RE: [PATCH 1/3] PM / OPP: Add support for descending order for cpufreq table

2014-05-07 Thread Jonghwan Choi

> @Jonghwan: Please consider doing this:
> - Don't play with the order of frequencies in table.
> - Instead initialize .driver_data field with values that you need to write
> in the registers for all frequencies. i.e. 0 for highest frequency and
> FREQ_COUNT-1 for lowest one.

-> For that, I made the change below.
To initialize .driver_data, I changed the dev_pm_opp_init_cpufreq_table()
function.


--- a/drivers/base/power/opp.c
+++ b/drivers/base/power/opp.c
@@ -622,12 +622,12 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_disable);
  * or in contexts where mutex locking cannot be used.
  */
 int dev_pm_opp_init_cpufreq_table(struct device *dev,
-   struct cpufreq_frequency_table **table)
+   struct cpufreq_frequency_table **table, int order)
 {
struct device_opp *dev_opp;
struct dev_pm_opp *opp;
struct cpufreq_frequency_table *freq_table;
-   int i = 0;
+   int i = 0, index = 0;

/* Pretend as if I am an updater */
mutex_lock(&dev_opp_list_lock);
@@ -649,16 +649,22 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
return -ENOMEM;
}

+   if (OPP_TABLE_ORDER_DESCENDING == order)
+   index = dev_pm_opp_get_opp_count(dev) - 1;
+
list_for_each_entry(opp, &dev_opp->opp_list, node) {
if (opp->available) {
-   freq_table[i].driver_data = i;
+   if (OPP_TABLE_ORDER_DESCENDING == order)
+   freq_table[i].driver_data = index--;
+   else
+   freq_table[i].driver_data = index++;
freq_table[i].frequency = opp->rate / 1000;
i++;
}
}
mutex_unlock(_opp_list_lock);

-   freq_table[i].driver_data = i;
+   freq_table[i].driver_data = index;
freq_table[i].frequency = CPUFREQ_TABLE_END;

*table = &freq_table[0];


Is it acceptable?

Thanks

Best Regards


Re: [PATCH 2/4] MADV_VOLATILE: Add MADV_VOLATILE/NONVOLATILE hooks and handle marking vmas

2014-05-07 Thread Minchan Kim

Hey John,

On Tue, Apr 29, 2014 at 02:21:21PM -0700, John Stultz wrote:
> This patch introduces MADV_VOLATILE/NONVOLATILE flags to madvise(),
> which allows for specifying ranges of memory as volatile, and able
> to be discarded by the system.
> 
> This initial patch simply adds flag handling to madvise, and the
> vma handling, splitting and merging the vmas as needed, and marking
> them with VM_VOLATILE.
> 
> No purging or discarding of volatile ranges is done at this point.
> 
> This a simplified implementation which reuses some of the logic
> from Minchan's earlier efforts. So credit to Minchan for his work.

Removing the purged argument is a really good thing, but I'm not sure merging
the feature into the madvise syscall is a good idea.
My concern is how we support users who don't want SIGBUS.
I believe we should support them because some users (e.g., sanitizers) really
want to avoid the MADV_NONVOLATILE call right before overwriting their cache
(e.g., if there was a purged page in a cyclic cache, the user would have to
call NONVOLATILE right before overwriting to avoid SIGBUS).
Moreover, this change made the unmarking cost O(N), so I'd like to avoid the
NONVOLATILE syscall if possible.

For me, SIGBUS is a more special use case, for code pages, but I believe
both are reasonable for their respective use cases. So my preference is that
MADV_VOLATILE just gives zero-filled pages and MADV_VOLATILE_SIGBUS is another
new advice value, if you really want to merge the volatile range feature with
madvise.

> 
> Cc: Andrew Morton 
> Cc: Android Kernel Team 
> Cc: Johannes Weiner 
> Cc: Robert Love 
> Cc: Mel Gorman 
> Cc: Hugh Dickins 
> Cc: Dave Hansen 
> Cc: Rik van Riel 
> Cc: Dmitry Adamushko 
> Cc: Neil Brown 
> Cc: Andrea Arcangeli 
> Cc: Mike Hommey 
> Cc: Taras Glek 
> Cc: Jan Kara 
> Cc: KOSAKI Motohiro 
> Cc: Michel Lespinasse 
> Cc: Minchan Kim 
> Cc: Keith Packard 
> Cc: linux...@kvack.org 
> Signed-off-by: John Stultz 
> ---
>  include/linux/mm.h |   1 +
>  include/linux/mvolatile.h  |   6 ++
>  include/uapi/asm-generic/mman-common.h |   5 ++
>  mm/Makefile|   2 +-
>  mm/madvise.c   |  14 
>  mm/mvolatile.c | 147 
> +
>  6 files changed, 174 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/mvolatile.h
>  create mode 100644 mm/mvolatile.c
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index bf9811e..ea8b687 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -117,6 +117,7 @@ extern unsigned int kobjsize(const void *objp);
>  #define VM_IO   0x4000   /* Memory mapped I/O or similar */
>  
>   /* Used by sys_madvise() */
> +#define VM_VOLATILE  0x1000  /* VMA is volatile */
>  #define VM_SEQ_READ  0x8000  /* App will access data sequentially */
>  #define VM_RAND_READ 0x0001  /* App will not benefit from clustered 
> reads */
>  
> diff --git a/include/linux/mvolatile.h b/include/linux/mvolatile.h
> new file mode 100644
> index 000..f53396b
> --- /dev/null
> +++ b/include/linux/mvolatile.h
> @@ -0,0 +1,6 @@
> +#ifndef _LINUX_MVOLATILE_H
> +#define _LINUX_MVOLATILE_H
> +
> +int madvise_volatile(int bhv, unsigned long start, unsigned long end);
> +
> +#endif /* _LINUX_MVOLATILE_H */
> diff --git a/include/uapi/asm-generic/mman-common.h 
> b/include/uapi/asm-generic/mman-common.h
> index ddc3b36..b74d61d 100644
> --- a/include/uapi/asm-generic/mman-common.h
> +++ b/include/uapi/asm-generic/mman-common.h
> @@ -39,6 +39,7 @@
>  #define MADV_REMOVE  9   /* remove these pages & resources */
>  #define MADV_DONTFORK10  /* don't inherit across fork */
>  #define MADV_DOFORK  11  /* do inherit across fork */
> +
>  #define MADV_HWPOISON100 /* poison a page for testing */
>  #define MADV_SOFT_OFFLINE 101/* soft offline page for 
> testing */
>  
> @@ -52,6 +53,10 @@
>  overrides the coredump filter bits */
>  #define MADV_DODUMP  17  /* Clear the MADV_DONTDUMP flag */
>  
> +#define MADV_VOLATILE18  /* Mark pages as volatile */
> +#define MADV_NONVOLATILE 19  /* Mark pages non-volatile, return 1
> +if any pages were purged  */
> +
>  /* compatibility flags */
>  #define MAP_FILE 0
>  
> diff --git a/mm/Makefile b/mm/Makefile
> index b484452..9a3dc62 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -18,7 +18,7 @@ obj-y   := filemap.o mempool.o 
> oom_kill.o fadvise.o \
>  mm_init.o mmu_context.o percpu.o slab_common.o \
>  compaction.o balloon_compaction.o vmacache.o \
>  interval_tree.o list_lru.o workingset.o \
> -iov_iter.o $(mmu-y)
> +mvolatile.o iov_iter.o $(mmu-y)
>  
>  obj-y += init-mm.o
>  
>

Re: [PATCH v2 05/10] slab: factor out initialization of arracy cache

2014-05-07 Thread David Rientjes

On Wed, 7 May 2014, Joonsoo Kim wrote:

> Factor out initialization of array cache to use it in following patch.
> 
> Acked-by: Christoph Lameter 
> Signed-off-by: Joonsoo Kim 
> 

Acked-by: David Rientjes 

s/arracy/array/ in patch title.

> diff --git a/mm/slab.c b/mm/slab.c
> index 7647728..755fb57 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -741,13 +741,8 @@ static void start_cpu_timer(int cpu)
>   }
>  }
>  
> -static struct array_cache *alloc_arraycache(int node, int entries,
> - int batchcount, gfp_t gfp)
> +static void init_arraycache(struct array_cache *ac, int limit, int batch)
>  {
> - int memsize = sizeof(void *) * entries + sizeof(struct array_cache);
> - struct array_cache *nc = NULL;
> -
> - nc = kmalloc_node(memsize, gfp, node);
>   /*
>* The array_cache structures contain pointers to free object.
>* However, when such objects are allocated or transferred to another
> @@ -755,15 +750,25 @@ static struct array_cache *alloc_arraycache(int node, 
> int entries,
>* valid references during a kmemleak scan. Therefore, kmemleak must
>* not scan such objects.
>*/
> - kmemleak_no_scan(nc);
> - if (nc) {
> - nc->avail = 0;
> - nc->limit = entries;
> - nc->batchcount = batchcount;
> - nc->touched = 0;
> - spin_lock_init(&nc->lock);
> + kmemleak_no_scan(ac);
> + if (ac) {
> + ac->avail = 0;
> + ac->limit = limit;
> + ac->batchcount = batch;
> + ac->touched = 0;
> + spin_lock_init(&ac->lock);
>   }
> - return nc;
> +}
> +
> +static struct array_cache *alloc_arraycache(int node, int entries,
> + int batchcount, gfp_t gfp)
> +{
> + int memsize = sizeof(void *) * entries + sizeof(struct array_cache);

const?

> + struct array_cache *ac = NULL;
> +
> + ac = kmalloc_node(memsize, gfp, node);

I thought nc meant node cache, but I agree that ac is clearer.

> + init_arraycache(ac, entries, batchcount);
> + return ac;
>  }
>  
>  static inline bool is_slab_pfmemalloc(struct page *page)

Re: [Suggestion] unicore32: About toolchain issues and source code.

2014-05-07 Thread Chen Gang


On 05/07/2014 05:45 PM, 管雪涛 wrote:
> Cc: Greg Kroah-Hartman 
> 
>> On 05/01/2014 02:48 AM, Guenter Roeck wrote:
> 
>>> Build results are at http://server.roeck-us.net:8010/builders.
>>>
> 
> I'll fix it in this month.
> 
> Also, Cc greg
> 

Thank you for your work. It would be great if it can be finished within this
month (especially since you are busy, too).


Thanks.
-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

Re: [PATCH v2 04/10] slab: defer slab_destroy in free_block()

2014-05-07 Thread David Rientjes

On Wed, 7 May 2014, Joonsoo Kim wrote:

> In free_block(), if freeing an object makes a new free slab and the number of
> free_objects exceeds free_limit, we start to destroy this new free slab
> while holding the kmem_cache node lock. Holding the lock is useless and,
> generally, holding a lock as briefly as possible is a good thing. I never
> measured the performance effect of this, but we'd be better off not holding
> the lock longer than necessary.
> 
> Commented by Christoph:
>   This is also good because kmem_cache_free is no longer called while
>   holding the node lock. So we avoid one case of recursion.
> 
> Acked-by: Christoph Lameter 
> Signed-off-by: Joonsoo Kim 

Acked-by: David Rientjes 

Nice optimization.  I think it could have benefited from a comment 
describing what the free_block() list formal parameter is, though.

Re: mm: gpf in global_dirty_limits

2014-05-07 Thread Sasha Levin

On 05/07/2014 06:02 PM, Andrew Morton wrote:
> On Wed, 07 May 2014 12:59:28 -0400 Sasha Levin  wrote:
> 
>> Hi all,
>>
>> While fuzzing with trinity inside a KVM tools guest running the latest -next
>> kernel I've stumbled on the following spew:
>>
>> [ 1139.410483] general protection fault:  [#1] PREEMPT SMP 
>> DEBUG_PAGEALLOC
>> [ 1139.413202] Dumping ftrace buffer:
>> [ 1139.414152](ftrace buffer empty)
>> [ 1139.415069] Modules linked in:
>> [ 1139.415846] CPU: 10 PID: 39777 Comm: kworker/u115:2 Tainted: GW   
>>   3.15.0-rc4-next-20140506-sasha-00021-gc164334-dirty #447
>> [ 1139.418931] Workqueue: writeback bdi_writeback_workfn (flush-7:10)
>> [ 1139.420320] task: 880285848000 ti: 880282dbc000 task.ti: 
>> 880282dbc000
>> [ 1139.420320] RIP: global_dirty_limits 
>> (include/trace/events/writeback.h:308 mm/page-writeback.c:309)
>> [ 1139.420320] RSP: 0018:880282dbdc28  EFLAGS: 00010282
>> [ 1139.420320] RAX: 6b6b6b6b6b6b6b6b RBX: 00088034 RCX: 
>> 0001
>> [ 1139.420320] RDX: 00110068 RSI: 00088034 RDI: 
>> 6b6b6b6b6b6b6b6b
>> [ 1139.420320] RBP: 880282dbdc48 R08: 000abad6 R09: 
>> 880285848cf0
>> [ 1139.420320] R10: 0001 R11:  R12: 
>> 00110068
>> [ 1139.420320] R13: 8805bc5932a8 R14: 880282dbdc60 R15: 
>> 1cc0
>> [ 1139.420320] FS:  () GS:880292c0() 
>> knlGS:
>> [ 1139.420320] CS:  0010 DS:  ES:  CR0: 8005003b
>> [ 1139.420320] CR2: 0001 CR3: 25e2d000 CR4: 
>> 06a0
>> [ 1139.438692] Stack:
>> [ 1139.438692]  8804e3ed0278 8804e3ed0570  
>> 8804e3ed06c8
>> [ 1139.438692]  880282dbdc78 a1342180 00088034 
>> 00110068
>> [ 1139.438692]   8804e3ed0570 880282dbdd38 
>> a1346f3a
>> [ 1139.438692] Call Trace:
>> [ 1139.438692] over_bground_thresh (arch/x86/include/asm/atomic64_64.h:21 
>> include/asm-generic/atomic-long.h:31 include/linux/vmstat.h:122 
>> fs/fs-writeback.c:772)
>> [ 1139.438692] bdi_writeback_workfn (fs/fs-writeback.c:934 
>> fs/fs-writeback.c:1014 fs/fs-writeback.c:1043)
>> [ 1139.438692] process_one_work (kernel/workqueue.c:2227 
>> include/linux/jump_label.h:105 include/trace/events/workqueue.h:111 
>> kernel/workqueue.c:2232)
>> [ 1139.438692] ? process_one_work (include/linux/workqueue.h:186 
>> kernel/workqueue.c:611 kernel/workqueue.c:638 kernel/workqueue.c:2220)
>> [ 1139.438692] worker_thread (kernel/workqueue.c:2354)
>> [ 1139.438692] ? rescuer_thread (kernel/workqueue.c:2303)
>> [ 1139.438692] kthread (kernel/kthread.c:210)
>> [ 1139.438692] ? kthread_create_on_node (kernel/kthread.c:176)
>> [ 1139.438692] ret_from_fork (arch/x86/kernel/entry_64.S:553)
>> [ 1139.438692] ? kthread_create_on_node (kernel/kthread.c:176)
>> [ 1139.438692] Code: 25 a0 da 00 00 0f 84 82 00 00 00 66 90 eb 2e 66 0f 1f 
>> 44 00 00 49 8b 45 00 0f 1f 40 00 49 8b 7d 08 4c 89 e2 49 83 c5 10 48 89 de 
>>  d0 49 8b 45 00 48 85 c0 75 e7 eb c5 0f 1f 44 00 00 eb 53 66
>> [ 1139.438692] RIP global_dirty_limits (include/trace/events/writeback.h:308 
>> mm/page-writeback.c:309)
>> [ 1139.438692]  RSP 
> 
> Did this die somewhere within trace_global_dirty_state()?
> 

This turns out to be an issue with tracing and not mm/, sorry for the noise.


Thanks,
Sasha

Re: [PATCH v2 03/10] slab: move up code to get kmem_cache_node in free_block()

2014-05-07 Thread David Rientjes

On Wed, 7 May 2014, Joonsoo Kim wrote:

> node isn't changed, so we don't need to retrieve this structure
> every time we move the object. Maybe the compiler does this optimization,
> but making it explicit is better.
> 
> Acked-by: Christoph Lameter 
> Signed-off-by: Joonsoo Kim 

Acked-by: David Rientjes 

Re: [PATCH v2 03/10] slab: move up code to get kmem_cache_node in free_block()

2014-05-07 Thread George Spelvin

> A function called clear_obj_pfmemalloc() doesn't indicate it's returning 
> anything, I think the vast majority of people would believe that it 
> returns void just as it does.  There's no compiler generated code 
> optimization with this patch and

> I'm not sure it's even correct since 
> you're now clearing after doing recheck_pfmemalloc_active().

I thought this through before rearranging the code.
recheck_pfmemalloc_active() checks global lists, but __ac_get_obj()
is doing clear_obj_pfmemalloc on a local variable.

I think it does make sense to remove the pointless "return;" in 
set_obj_pfmemalloc(), however.  Not sure it's worth asking someone to 
merge it, though.


[PATCH] tpm: Fix tpm init with no ACPI entry

2014-05-07 Thread Derek Basehore

tpm_add_ppi fails without ACPI support now, but loading the tpm without an ACPI
entry is a valid use case. This changes the init of the tpm to not fail when
tpm_add_ppi fails.

Signed-off-by: Derek Basehore 
---
 drivers/char/tpm/tpm-interface.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 62e10fd..8bfa339 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -1094,8 +1094,7 @@ struct tpm_chip *tpm_register_hardware(struct device *dev,
if (tpm_sysfs_add_device(chip))
goto del_misc;
 
-   if (tpm_add_ppi(&dev->kobj))
-   goto del_misc;
+   tpm_add_ppi(&dev->kobj);
 
chip->bios_dir = tpm_bios_log_setup(chip->devname);
 
-- 
1.9.1.423.g4596e3a


[RFC PATCH 3/3] CMA: always treat free cma pages as non-free on watermark checking

2014-05-07 Thread Joonsoo Kim

commit d95ea5d1('cma: fix watermark checking') introduces the ALLOC_CMA flag
for alloc flags and treats free cma pages as free pages if this flag is
passed to watermark checking. The intention of that patch is that movable page
allocation can be handled from the cma reserved region without starting
kswapd. Now, the previous patch changes the behaviour of the allocator so that
movable allocation uses pages in the cma reserved region aggressively,
so this watermark hack isn't needed anymore. Therefore remove it.

Signed-off-by: Joonsoo Kim 

diff --git a/mm/compaction.c b/mm/compaction.c
index 627dc2e..36e2fcd 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1117,10 +1117,6 @@ unsigned long try_to_compact_pages(struct zonelist 
*zonelist,
 
count_compact_event(COMPACTSTALL);
 
-#ifdef CONFIG_CMA
-   if (allocflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
-   alloc_flags |= ALLOC_CMA;
-#endif
/* Compact each zone in the list */
for_each_zone_zonelist_nodemask(zone, z, zonelist, high_zoneidx,
nodemask) {
diff --git a/mm/internal.h b/mm/internal.h
index 07b6736..a121762 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -384,7 +384,6 @@ unsigned long reclaim_clean_pages_from_list(struct zone 
*zone,
 #define ALLOC_HARDER   0x10 /* try to alloc harder */
 #define ALLOC_HIGH 0x20 /* __GFP_HIGH set */
 #define ALLOC_CPUSET   0x40 /* check for correct cpuset */
-#define ALLOC_CMA  0x80 /* allow allocations from CMA areas */
-#define ALLOC_FAIR 0x100 /* fair zone allocation */
+#define ALLOC_FAIR 0x80 /* fair zone allocation */
 
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6f2b27b..6af2fa1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1757,20 +1757,22 @@ static bool __zone_watermark_ok(struct zone *z, int 
order, unsigned long mark,
long min = mark;
long lowmem_reserve = z->lowmem_reserve[classzone_idx];
int o;
-   long free_cma = 0;
 
free_pages -= (1 << order) - 1;
if (alloc_flags & ALLOC_HIGH)
min -= min / 2;
if (alloc_flags & ALLOC_HARDER)
min -= min / 4;
-#ifdef CONFIG_CMA
-   /* If allocation can't use CMA areas don't use free CMA pages */
-   if (!(alloc_flags & ALLOC_CMA))
-   free_cma = zone_page_state(z, NR_FREE_CMA_PAGES);
-#endif
+   /*
+* We don't want to regard the pages on CMA region as free
+* on watermark checking, since they cannot be used for
+* unmovable/reclaimable allocation and they can suddenly
+* vanish through CMA allocation
+*/
+   if (IS_ENABLED(CONFIG_CMA) && z->has_cma)
+   free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);
 
-   if (free_pages - free_cma <= min + lowmem_reserve)
+   if (free_pages <= min + lowmem_reserve)
return false;
for (o = 0; o < order; o++) {
/* At the next order, this order's pages become unavailable */
@@ -2538,10 +2540,6 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 unlikely(test_thread_flag(TIF_MEMDIE))))
alloc_flags |= ALLOC_NO_WATERMARKS;
}
-#ifdef CONFIG_CMA
-   if (allocflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
-   alloc_flags |= ALLOC_CMA;
-#endif
return alloc_flags;
 }
 
@@ -2811,10 +2809,6 @@ retry_cpuset:
if (!preferred_zone)
goto out;
 
-#ifdef CONFIG_CMA
-   if (allocflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
-   alloc_flags |= ALLOC_CMA;
-#endif
 retry:
/* First allocation attempt */
page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC PATCH 1/3] CMA: remove redundant retrying code in __alloc_contig_migrate_range

2014-05-07 Thread Joonsoo Kim

We already have retry logic in migrate_pages(), which retries 10 times.
If we keep the retry code in __alloc_contig_migrate_range() as well, we
would try to migrate an unmigratable page up to 50 times (5 x 10). There
is just one small difference, in the -ENOMEM case: migrate_pages() doesn't
retry in this case, whereas the current __alloc_contig_migrate_range()
does. But I think this isn't a problem, because in this case we would
probably fail again for the same reason.

Signed-off-by: Joonsoo Kim 

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5dba293..674ade7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6185,7 +6185,6 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
/* This function is based on compact_zone() from compaction.c. */
unsigned long nr_reclaimed;
unsigned long pfn = start;
-   unsigned int tries = 0;
int ret = 0;
 
migrate_prep();
@@ -6204,10 +6203,6 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
ret = -EINTR;
break;
}
-   tries = 0;
-   } else if (++tries == 5) {
-   ret = ret < 0 ? ret : -EBUSY;
-   break;
}
 
nr_reclaimed = reclaim_clean_pages_from_list(cc->zone,
@@ -6216,6 +6211,10 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 
ret = migrate_pages(&cc->migratepages, alloc_migrate_target,
0, MIGRATE_SYNC, MR_CMA);
+   if (ret) {
+   ret = ret < 0 ? ret : -EBUSY;
+   break;
+   }
}
if (ret < 0) {
putback_movable_pages(&cc->migratepages);
-- 
1.7.9.5


[RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-07 Thread Joonsoo Kim

CMA was introduced to provide physically contiguous pages at runtime.
For this purpose, it reserves memory at boot time. Although the memory is
reserved, it can still be used to satisfy movable memory allocation
requests. This use case benefits systems that need the CMA reserved
memory only infrequently, and it is one of the main purposes of
introducing CMA.

But there is a problem in the current implementation: it works just like
a plain reserved-memory approach. The pages on cma reserved memory are
hardly ever used for movable memory allocation. This is caused by the
combination of allocation and reclaim policy.

The pages on cma reserved memory are allocated only if there is no movable
memory, that is, as a fallback allocation. So this fallback allocation
starts only under heavy memory pressure. Even under memory pressure,
movable allocations easily succeed, since there are many pages on cma
reserved memory. But this is not the case for unmovable and reclaimable
allocations, because they can't use the pages on cma reserved memory.
On watermark checking, these allocations regard the system's free memory
as (free pages - free cma pages), that is, free unmovable pages + free
reclaimable pages + free movable pages. Because we have already exhausted
the movable pages, the only free pages we have are of the unmovable and
reclaimable types, and this is a really small amount. So the watermark
check fails. This wakes up kswapd to free enough memory for unmovable and
reclaimable allocations, and kswapd does so. So before we fully utilize
the pages on cma reserved memory, kswapd starts to reclaim memory and
tries to bring free memory over the high watermark. This watermark check
by kswapd doesn't count free cma pages, so many movable pages are
reclaimed. After that, we have a lot of free movable pages again, so the
fallback allocation doesn't happen again. To conclude, the amount of free
memory in meminfo, which includes free CMA pages, hovers around 512 MB if
I reserve 512 MB of memory for CMA.

I found this problem in the following experiment.

4 CPUs, 1024 MB, VIRTUAL MACHINE
make -j24

CMA reserve:            0 MB            512 MB
Elapsed-time:           234.8           361.8
Average-MemFree:        283880 KB       530851 KB

To solve this problem, I can think of the following 2 possible solutions.
1. allocate the pages on cma reserved memory first, and if they are
   exhausted, allocate movable pages.
2. interleaved allocation: try to allocate specific amounts of memory
   from cma reserved memory and then allocate from free movable memory.

I tested approach #1 and found a problem. Although free memory in
meminfo can hover around the low watermark, there is large fluctuation in
free memory, because too many pages are reclaimed when kswapd is invoked.
The reason for this behaviour is that successively allocated CMA pages are
on the LRU list in that order and kswapd reclaims them in the same order.
This memory doesn't help kswapd's watermark check, so too many
pages are reclaimed, I guess.

So, I implemented approach #2.
One thing I should note is that we should not change the allocation target
(movable list or cma) on each allocation attempt, since this prevents
allocated pages from being physically contiguous, which can hurt the
performance of some I/O devices. To solve this, I keep the allocation
target for at least pageblock_nr_pages attempts and make this number
reflect the ratio of free pages without free cma pages to free cma pages.
With this approach, the system works very smoothly and fully utilizes the
pages on cma reserved memory.
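
To make the interleaving idea concrete, here is a minimal userspace sketch. It is not the code from this patch: the struct, the function names, and the BATCH_MIN value (a stand-in for pageblock_nr_pages) are all assumptions made for illustration. The smaller side gets a budget of BATCH_MIN attempts and the larger side gets a budget scaled by the free-page ratio, so the target only switches after a whole batch.

```c
#include <stdbool.h>

#define BATCH_MIN 512L /* stand-in for pageblock_nr_pages; value is assumed */

/* Hypothetical per-zone interleaving state. */
struct interleave_state {
	long nr_try_movable; /* attempts left on the movable free list */
	long nr_try_cma;     /* attempts left on the CMA reserved region */
};

/* Refill both budgets so their ratio tracks non-CMA free : CMA free. */
static void refill(struct interleave_state *s, long free_other, long free_cma)
{
	if (free_cma <= 0) {
		s->nr_try_movable = BATCH_MIN;
		s->nr_try_cma = 0;
	} else if (free_other <= 0) {
		s->nr_try_movable = 0;
		s->nr_try_cma = BATCH_MIN;
	} else if (free_other >= free_cma) {
		s->nr_try_cma = BATCH_MIN;
		s->nr_try_movable = BATCH_MIN * free_other / free_cma;
	} else {
		s->nr_try_movable = BATCH_MIN;
		s->nr_try_cma = BATCH_MIN * free_cma / free_other;
	}
}

/* Returns true if the next movable allocation should be served from CMA. */
static bool next_from_cma(struct interleave_state *s, long free_other,
			  long free_cma)
{
	if (s->nr_try_movable == 0 && s->nr_try_cma == 0)
		refill(s, free_other, free_cma);

	if (s->nr_try_movable > 0) {
		s->nr_try_movable--;
		return false;
	}
	s->nr_try_cma--;
	return true;
}
```

In this model, equal amounts of free memory on both sides give alternating runs of 512 attempts each, which keeps consecutive allocations on the same target for at least one batch.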

Following is the experimental result of this patch.

4 CPUs, 1024 MB, VIRTUAL MACHINE
make -j24


Before:
CMA reserve:            0 MB            512 MB
Elapsed-time:           234.8           361.8
Average-MemFree:        283880 KB       530851 KB
pswpin:                 7               110064
pswpout:                452             767502


After:
CMA reserve:            0 MB            512 MB
Elapsed-time:           234.2           235.6
Average-MemFree:        281651 KB       290227 KB
pswpin:                 8               8
pswpout:                430             510

There is no difference if we don't have cma reserved memory (0 MB case).
But with cma reserved memory (512 MB case), we fully utilize the
reserved memory through this patch and the system behaves as if
it didn't reserve any memory.

With this patch, we aggressively allocate the pages on cma reserved memory,
so CMA allocation latency can increase. Below is the experimental result
for latency.

4 CPUs, 1024 MB, VIRTUAL MACHINE
CMA reserve: 512 MB
Background Workload: make -jN
Real Workload: 8 MB CMA allocation/free 20 times with 5 sec interval

N:                     1        4        8        16
Elapsed-time(Before):  4309.75  9511.09  12276.1  77103.5
Elapsed-time(After):   5391.69  16114.1  19380.3  34879.2

So generally we see a latency increase. The ratio of this increase
is rather big, up to 70%. But under the heavy workload, it shows
a latency decrease of up to 55%.
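
For reference, the percentages can be recomputed from the elapsed times in the table above. The helper below is only an illustrative sketch (the function name is made up); it computes the relative change from the "Before" value to the "After" value.

```c
/* Relative change from before to after, in percent (positive = slower). */
static double pct_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Applied to the second column (9511.09 to 16114.1) this gives an increase of a bit over 69%, and applied to the last column (77103.5 to 34879.2) a decrease of a bit under 55%, which matches the rounded figures quoted above.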

[RFC PATCH 0/3] Aggressively allocate the pages on cma reserved memory

2014-05-07 Thread Joonsoo Kim

Hello,

This series tries to improve CMA.

CMA was introduced to provide physically contiguous pages at runtime
without reserving a memory area. But the current implementation works like
a reserved-memory approach, because allocation from the cma reserved region
only occurs as a fallback of MIGRATE_MOVABLE allocation. We can allocate
from it only when there is no movable page. In that situation, kswapd is
easily invoked, since unmovable and reclaimable allocations consider
(free pages - free CMA pages) as the free memory on the system, and that
may be lower than the high watermark in that case. Once kswapd starts to
reclaim memory, fallback allocation doesn't occur much.

In my experiment, I found that if the system has 1024 MB of memory and
512 MB of it is reserved for CMA, kswapd is mostly invoked around
the 512 MB free memory boundary. The invoked kswapd then tries to free
memory until (free pages - free CMA pages) is higher than the high
watermark, so free memory in meminfo consistently hovers around the
512 MB boundary.

To fix this problem, we should allocate the pages on cma reserved memory
more aggressively and intelligently. Patch 2 implements the solution.
Patch 1 is a simple optimization which removes a useless retry, and patch 3
removes a useless alloc flag, so these are not important.
See patch 2 for a more detailed description.

This patchset is based on v3.15-rc4.

Thanks.
Joonsoo Kim (3):
  CMA: remove redundant retrying code in __alloc_contig_migrate_range
  CMA: aggressively allocate the pages on cma reserved memory when not
used
  CMA: always treat free cma pages as non-free on watermark checking

 include/linux/mmzone.h |6 +++
 mm/compaction.c|4 --
 mm/internal.h  |3 +-
 mm/page_alloc.c|  117 +++-
 4 files changed, 102 insertions(+), 28 deletions(-)

-- 
1.7.9.5


Re: [PATCH v2 4/5] iommu: Use dma_addr_t for IOVA arguments

2014-05-07 Thread Bjorn Helgaas

On Wed, May 07, 2014 at 09:58:58AM +0200, Arnd Bergmann wrote:
> On Tuesday 06 May 2014 16:48:40 Bjorn Helgaas wrote:
> > Convert the "iova" arguments of iommu_map(), iommu_unmap(), etc., from
> > "unsigned long" to dma_addr_t.
> > 
> > bb5547acfcd8 ("iommu/fsl: Make iova dma_addr_t in the iommu_iova_to_phys
> > API") did this for iommu_iova_to_phys(), but didn't fix the rest of the
> > IOMMU API.
> 
> This patch looks 100% correct, but I'm not convinced it's a good idea:
> On 32-bit platforms (i.e. most of the ones you change), doing 64-bit
> arithmetic has a noticeable overhead. I am not aware of an IOMMU that
> actually uses 64-bit DMA addresses on its slave side, usually they
> are used to translate addresses from 32-bit masters into 64-bit
> memory addresses, so using 'unsigned long' seems better from a practical
> point of view as opposed to the strict type correctness.

The current x86 IOMMUs support DMA addresses larger than 32 bits, but
obviously those platforms usually run 64-bit kernels so "unsigned
long" is already 64 bits.

I guess you're thinking about cases where "unsigned long" is 32 bits,
the IOMMU only supports 32 bit DMA addresses, and dma_addr_t is 64
bits.  If the IOMMU only supports 32-bit DMA addresses, is there any
value in having a 64-bit dma_addr_t?

Bjorn

Re: [PATCH tip/core/rcu 19/45] rcutorture: Print negatives for SRCU counter wraparound

2014-05-07 Thread Paul E. McKenney

On Wed, May 07, 2014 at 02:34:26PM -0700, j...@joshtriplett.org wrote:
> On Mon, Apr 28, 2014 at 05:25:07PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > The srcu_torture_stats() function prints SRCU's per-CPU c[] array with
> > an unsigned format, which means that the number one less than zero is
> > a very large number.  This commit therefore prints this array with a
> > signed format in order to improve readability of the rcutorture output.
> > 
> > Signed-off-by: Paul E. McKenney 
> 
> Nit below; with that:
> Reviewed-by: Josh Triplett 
> 
> >  kernel/rcu/rcutorture.c | 7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index 3845ea99ccd4..0141fcff6bb9 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -486,15 +486,16 @@ static void srcu_torture_barrier(void)
> >  
> >  static void srcu_torture_stats(char *page)
> >  {
> > +   long c0, c1;
> > int cpu;
> > int idx = srcu_ctl.completed & 0x1;
> >  
> > page += sprintf(page, "%s%s per-CPU(idx=%d):",
> >torture_type, TORTURE_FLAG, idx);
> > for_each_possible_cpu(cpu) {
> > -   page += sprintf(page, " %d(%lu,%lu)", cpu,
> > -  per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx],
> > -  per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx]);
> > +   c0 = (long)per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx];
> > +   c1 = (long)per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx];
> > +   page += sprintf(page, " %d(%ld,%ld)", cpu, c0, c1);
> 
> Nit: I'd suggest declaring the variables inside the loop, or not using
> intermediate variables at all.

Fair enough!  Can't see that I am gaining much by them.

Thanx, Paul
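
To illustrate the readability point from the commit message, here is a small userspace sketch. The helper name and its parameters are hypothetical and are not taken from rcutorture; it only formats the same pair of counters either with the old unsigned conversion or with the new signed one. It assumes the usual two's-complement behaviour when converting an out-of-range unsigned long to long, which the C standard leaves implementation-defined.

```c
#include <stdio.h>

/* Format one counter pair the old (unsigned) or the new (signed) way. */
static int format_pair(char *buf, size_t len, unsigned long c0,
		       unsigned long c1, int as_signed)
{
	if (as_signed)
		return snprintf(buf, len, "(%ld,%ld)", (long)c0, (long)c1);
	return snprintf(buf, len, "(%lu,%lu)", c0, c1);
}
```

A counter that has wrapped to one less than zero shows up as -1 with the signed format, while the unsigned format prints the largest value an unsigned long can hold.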


Re: [PATCH V4] gic: preserve gic V2 bypass bits in cpu ctrl register

2014-05-07 Thread Feng Kan

On Wed, May 7, 2014 at 2:37 AM, Marc Zyngier  wrote:
> On Wed, May 07 2014 at  1:53:45 am BST, Feng Kan  wrote:
>> This change is made to preserve the GIC v2 bypass bits in the
>> GIC_CPU_CTRL register (also known as the GICC_CTLR register in spec).
>> This code will preserve all bits configured by the bootload regarding
>  bootloader
>> v2 bypass group bits. In the X-Gene platform (as well others), the
>
> Which other platform? It'd be interesting to know which one, as you're
> implying they haven't managed to boot a mainline kernel so far...
>

Okay, the wording could use some refinement here. What I mean is that most
platforms would probably not use the bypass feature. Correct me if I am
wrong here. Not setting the bypass bits does not prevent the kernel from
booting; it only creates incorrect behavior when the impacted irqs are used.

>> bypass functionality is not generally used and bypass bits should not
>> be changed by the kernel gic code as it could lead to incorrect behavior.
>> Tested on X-Gene mustang board.
>>
>> Signed-off-by: Vinayak Kale 
>> Acked-by: Anup Patel 
>> Signed-off-by: Feng Kan 
>> ---
>>  V4: Change to use bypass mask, change to use more suitable variable name.
>>  V3: Fix code not touch other bits other than bypass bits.
>>
>>  drivers/irqchip/irq-gic.c |   20 +---
>>  1 files changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>> index 4300b66..50a7bb2 100644
>> --- a/drivers/irqchip/irq-gic.c
>> +++ b/drivers/irqchip/irq-gic.c
>> @@ -97,6 +97,8 @@ struct irq_chip gic_arch_extn = {
>>  #define MAX_GIC_NR   1
>>  #endif
>>
>> +#define GIC_BYPASS_MASK  0x1e0
>> +
>>  static struct gic_chip_data gic_data[MAX_GIC_NR] __read_mostly;
>>
>>  #ifdef CONFIG_GIC_NON_BANKED
>> @@ -418,6 +420,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
>>   void __iomem *dist_base = gic_data_dist_base(gic);
>>   void __iomem *base = gic_data_cpu_base(gic);
>>   unsigned int cpu_mask, cpu = smp_processor_id();
>> + unsigned int bypass;
>
> Please use a type that corresponds to the width of the access (u32 in
> this case).
>
>>   int i;
>>
>>   /*
>> @@ -449,13 +452,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
>>   writel_relaxed(0xa0a0a0a0, dist_base + GIC_DIST_PRI + i * 4 / 4);
>>
>>   writel_relaxed(0xf0, base + GIC_CPU_PRIMASK);
>> - writel_relaxed(1, base + GIC_CPU_CTRL);
>> +
>> + bypass = readl(base + GIC_CPU_CTRL);
>> + bypass &= GIC_BYPASS_MASK;
>> + writel_relaxed(bypass | 0x1, base + GIC_CPU_CTRL);
>>  }
>>
>>  void gic_cpu_if_down(void)
>>  {
>>   void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
>> - writel_relaxed(0, cpu_base + GIC_CPU_CTRL);
>> + unsigned int bypass;
>> +
>> + bypass = readl(cpu_base + GIC_CPU_CTRL);
>> + bypass &= GIC_BYPASS_MASK;
>> + writel_relaxed(bypass, cpu_base + GIC_CPU_CTRL);
>>  }
>>
>>  #ifdef CONFIG_CPU_PM
>> @@ -566,6 +576,7 @@ static void gic_cpu_restore(unsigned int gic_nr)
>>  {
>>   int i;
>>   u32 *ptr;
>> + unsigned int bypass;
>>   void __iomem *dist_base;
>>   void __iomem *cpu_base;
>>
>> @@ -590,7 +601,10 @@ static void gic_cpu_restore(unsigned int gic_nr)
>>   writel_relaxed(0xa0a0a0a0, dist_base + GIC_DIST_PRI + i * 4);
>>
>>   writel_relaxed(0xf0, cpu_base + GIC_CPU_PRIMASK);
>> - writel_relaxed(1, cpu_base + GIC_CPU_CTRL);
>> +
>> + bypass = readl(cpu_base + GIC_CPU_CTRL);
>> + bypass &= GIC_BYPASS_MASK;
>> + writel_relaxed(bypass | 0x1, cpu_base + GIC_CPU_CTRL);
>
> It would be good to turn these into a static function (gic_cpu_if_up,
> matching gic_cpu_if_down?), and use it in both gic_cpu_init and
> gic_cpu_restore.

Would a macro be better here? Please see my next patch. Two
additional static functions seem a bit heavy. Please let me know.
Thanks

>
> Thanks,
>
> M.
> --
> Jazz is not dead. It just smells funny.

Re: [PATCH tip/core/rcu 18/45] torture: Make "--dryrun script" use same environment as normal run

2014-05-07 Thread Paul E. McKenney

On Wed, May 07, 2014 at 02:33:27PM -0700, j...@joshtriplett.org wrote:
> On Mon, Apr 28, 2014 at 05:25:06PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > In a normal torture-test run, the script inherits its environment
> > variables, but this does not work when producing a script that is
> > to run later.  Therefore, definitions and exports are prepended to
> > a dryrun script but not to a script that is run immediately.  This
> > commit reconciles this by placing definitions and exports at the
> > beginning of the script in both cases.
> 
> Cleanup idea, not needed in this commit: This has gotten sufficiently
> long to warrant a loop over a list of environment variables, or possibly
> a loop over all environment variable starting with TORTURE_* .

Makes sense on both counts, thank you!  Give or take quoting, anyway!  ;-)

Thanx, Paul

> > Signed-off-by: Paul E. McKenney 
> Reviewed-by: Josh Triplett 
> 
> > ---
> >  tools/testing/selftests/rcutorture/bin/kvm.sh | 28 
> > ---
> >  1 file changed, 12 insertions(+), 16 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
> > index 8aa62a2dccb5..93a6c5a8517d 100644
> > --- a/tools/testing/selftests/rcutorture/bin/kvm.sh
> > +++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
> > @@ -241,8 +241,19 @@ END {
> >  
> >  # Generate a script to execute the tests in appropriate batches.
> >  cat << ___EOF___ > $T/script
> > -TORTURE_SUITE="$TORTURE_SUITE"; export TORTURE_SUITE
> > +CONFIGFRAG="$CONFIGFRAG"; export CONFIGFRAG
> > +KVM="$KVM"; export KVM
> > +KVPATH="$KVPATH"; export KVPATH
> > +PATH="$PATH"; export PATH
> > +TORTURE_BUILDONLY="$TORTURE_BUILDONLY"; export TORTURE_BUILDONLY
> >  TORTURE_DEFCONFIG="$TORTURE_DEFCONFIG"; export TORTURE_DEFCONFIG
> > +TORTURE_INITRD="$TORTURE_INITRD"; export TORTURE_INITRD
> > +TORTURE_KMAKE_ARG="$TORTURE_KMAKE_ARG"; export TORTURE_KMAKE_ARG
> > +TORTURE_QEMU_CMD="$TORTURE_QEMU_CMD"; export TORTURE_QEMU_CMD
> > +TORTURE_QEMU_INTERACTIVE="$TORTURE_QEMU_INTERACTIVE";
> > +   export TORTURE_QEMU_INTERACTIVE
> > +TORTURE_QEMU_MAC="$TORTURE_QEMU_MAC"; export TORTURE_QEMU_MAC
> > +TORTURE_SUITE="$TORTURE_SUITE"; export TORTURE_SUITE
> >  if ! test -e $resdir
> >  then
> > mkdir -p "$resdir" || :
> > @@ -371,21 +382,6 @@ ___EOF___
> >  
> >  if test "$dryrun" = script
> >  then
> > -   # Dump out the script, but define the environment variables that
> > -   # it needs to run standalone.
> > -   echo CONFIGFRAG="$CONFIGFRAG; export CONFIGFRAG"
> > -   echo KVM="$KVM; export KVM"
> > -   echo KVPATH="$KVPATH; export KVPATH"
> > -   echo PATH="$PATH; export PATH"
> > -   echo TORTURE_BUILDONLY="$TORTURE_BUILDONLY; export TORTURE_BUILDONLY"
> > -   echo TORTURE_INITRD="$TORTURE_INITRD; export TORTURE_INITRD"
> > -   echo TORTURE_KMAKE_ARG="$TORTURE_KMAKE_ARG; export TORTURE_KMAKE_ARG"
> > -   echo TORTURE_QEMU_CMD="$TORTURE_QEMU_CMD; export TORTURE_QEMU_CMD"
> > -   echo TORTURE_QEMU_INTERACTIVE="$TORTURE_QEMU_INTERACTIVE;
> > -   export TORTURE_QEMU_INTERACTIVE"
> > -   echo TORTURE_QEMU_MAC="$TORTURE_QEMU_MAC; export TORTURE_QEMU_MAC"
> > -   echo "mkdir -p "$resdir" || :"
> > -   echo "mkdir $resdir/$ds"
> > cat $T/script
> > exit 0
> >  elif test "$dryrun" = sched
> > -- 
> > 1.8.1.5
> > 
> 


[PATCH V5] gic: preserve gic V2 bypass bits in cpu ctrl register

2014-05-07 Thread Feng Kan

This change is made to preserve the GIC v2 bypass bits in the
GIC_CPU_CTRL register (also known as the GICC_CTLR register in spec).
This code will preserve all bits configured by the bootloader regarding
v2 bypass group bits. In the X-Gene platform, the bypass functionality
is not used and bypass bits should not be changed by the kernel gic
code as it could lead to incorrect behavior.
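
The read-modify-write described above can be sketched in plain C without any MMIO. This is only an illustration, not the driver code: the function and macro names are made up, the register is modelled as an ordinary 32-bit value, and the mask 0x1e0 (bits 5 through 8) is the one used in this patch.

```c
#include <stdint.h>

#define BYPASS_MASK_MODEL 0x1e0u /* bits 5..8, the mask used in the patch */

/*
 * Compute the value to write back to the control register: keep whatever
 * bypass bits are currently set and take everything else from val.
 */
static uint32_t cpuctrl_value(uint32_t old, uint32_t val)
{
	return (old & BYPASS_MASK_MODEL) | val;
}
```

Enabling the interface (val = 1) on a register that reads 0x1e0 yields 0x1e1, and disabling it (val = 0) leaves 0x1e0, so the bits set by the bootloader survive in both directions.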

Signed-off-by: Vinayak Kale 
Acked-by: Anup Patel 
Signed-off-by: Feng Kan 
---
V5: Use macro to replace read modify write of cpu_ctrl register.
V4: Change to use bypass mask, change to use more suitable variable name.
 drivers/irqchip/irq-gic.c |   16 +---
 1 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 4300b66..42e9bf4 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -97,6 +97,13 @@ struct irq_chip gic_arch_extn = {
 #define MAX_GIC_NR 1
 #endif
 
+#define set_cpuctrl(base, val) \
+   do {\
+   u32 bypass; \
+   bypass = readl(base + GIC_CPU_CTRL) & 0x1e0;\
+   writel_relaxed(bypass | val, base + GIC_CPU_CTRL);\
+   } while (0)
+
 static struct gic_chip_data gic_data[MAX_GIC_NR] __read_mostly;
 
 #ifdef CONFIG_GIC_NON_BANKED
@@ -449,13 +456,15 @@ static void gic_cpu_init(struct gic_chip_data *gic)
writel_relaxed(0xa0a0a0a0, dist_base + GIC_DIST_PRI + i * 4 / 4);
 
writel_relaxed(0xf0, base + GIC_CPU_PRIMASK);
-   writel_relaxed(1, base + GIC_CPU_CTRL);
+
+   set_cpuctrl(base, 1);
 }
 
 void gic_cpu_if_down(void)
 {
void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-   writel_relaxed(0, cpu_base + GIC_CPU_CTRL);
+
+   set_cpuctrl(cpu_base, 0);
 }
 
 #ifdef CONFIG_CPU_PM
@@ -590,7 +599,8 @@ static void gic_cpu_restore(unsigned int gic_nr)
writel_relaxed(0xa0a0a0a0, dist_base + GIC_DIST_PRI + i * 4);
 
writel_relaxed(0xf0, cpu_base + GIC_CPU_PRIMASK);
-   writel_relaxed(1, cpu_base + GIC_CPU_CTRL);
+   
+   set_cpuctrl(cpu_base, 1);
 }
 
 static int gic_notifier(struct notifier_block *self, unsigned long cmd,
void *v)
-- 
1.7.6.1


Re: [PATCH tip/core/rcu 15/45] torture: Make config-fragment filtering RCU-independent

2014-05-07 Thread Paul E. McKenney

On Wed, May 07, 2014 at 02:29:37PM -0700, j...@joshtriplett.org wrote:
> On Mon, Apr 28, 2014 at 05:25:03PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > The torture tests need to set specific values for their respective
> > Kconfig options (e.g., CONFIG_LOCK_TORTURE_TEST), and must therefore
> > filter any conflicting definitions from the Kconfig fragment
> > file.  Unfortunately, the code in kvm-build.sh was looking only for
> > CONFIG_RCU_TORTURE_TEST.  This commit therefore handles the general case
> > of CONFIG_[A-Z]*TORTURE_TEST.
> 
> This doesn't match your code below, which includes an _ after the * .
> 
> Also, one nit below.
> 
> > Signed-off-by: Paul E. McKenney 
> 
> With the commit message fixed:
> Reviewed-by: Josh Triplett 
> 
> > ---
> >  tools/testing/selftests/rcutorture/bin/kvm-build.sh | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/tools/testing/selftests/rcutorture/bin/kvm-build.sh b/tools/testing/selftests/rcutorture/bin/kvm-build.sh
> > index e838c775f709..6d0b76d918f4 100755
> > --- a/tools/testing/selftests/rcutorture/bin/kvm-build.sh
> > +++ b/tools/testing/selftests/rcutorture/bin/kvm-build.sh
> > @@ -45,7 +45,7 @@ T=/tmp/test-linux.sh.$$
> >  trap 'rm -rf $T' 0
> >  mkdir $T
> >  
> > -cat ${config_template} | grep -v CONFIG_RCU_TORTURE_TEST > $T/config
> > +cat ${config_template} | grep -v 'CONFIG_[A-Z]*_TORTURE_TEST' > $T/config
> 
> UUOC (useless use of cat): you can redirect from ${config_template}
> rather than catting it.

Both good points!  Back when I was first using UNIX, that extra "cat"
would have cost me something like three seconds.  How quickly we forget!  ;-)

Thanx, Paul

> >  cat << ___EOF___ >> $T/config
> >  CONFIG_INITRAMFS_SOURCE="$TORTURE_INITRD"
> >  CONFIG_VIRTIO_PCI=y
> > -- 
> > 1.8.1.5
> > 
> 


Re: [PATCH tip/core/rcu 11/45] torture: Rename RCU_QEMU_INTERACTIVE to TORTURE_QEMU_INTERACTIVE

2014-05-07 Thread Paul E. McKenney

On Wed, May 07, 2014 at 02:26:09PM -0700, j...@joshtriplett.org wrote:
> On Mon, Apr 28, 2014 at 05:24:59PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > This commit makes the torture scripts a bit more RCU-independent.
> > 
> > Signed-off-by: Paul E. McKenney 
> 
> Bug below; with that fixed,
> Reviewed-by: Josh Triplett 
> 
> > --- a/tools/testing/selftests/rcutorture/bin/kvm.sh
> > +++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
> [...]
> > @@ -378,7 +378,8 @@ then
> > echo TORTURE_INITRD="$TORTURE_INITRD; export TORTURE_INITRD"
> > echo TORTURE_KMAKE_ARG="$TORTURE_KMAKE_ARG; export TORTURE_KMAKE_ARG"
> > echo RCU_QEMU_CMD="$RCU_QEMU_CMD; export RCU_QEMU_CMD"
> > -   echo RCU_QEMU_INTERACTIVE="$RCU_QEMU_INTERACTIVE; export RCU_QEMU_INTERACTIVE"
> > +   echo TORTURE_QEMU_INTERACTIVE="$TORTURE_QEMU_INTERACTIVE;
> > +   export TORTURE_QEMU_INTERACTIVE"
> 
> Don't break this line in the middle of your quoted string.

Good point, will fix!

Thanx, Paul


Re: [PATCH tip/core/rcu 12/45] torture: Rename RCU_QEMU_MAC to TORTURE_QEMU_MAC

2014-05-07 Thread Paul E. McKenney

On Wed, May 07, 2014 at 02:27:56PM -0700, j...@joshtriplett.org wrote:
> On Mon, Apr 28, 2014 at 05:25:00PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > This commit makes the torture scripts a bit more RCU-independent.
> > 
> > Signed-off-by: Paul E. McKenney 
> 
> One comment below; with or without that change:
> Reviewed-by: Josh Triplett 
> 
> > ---
> >  tools/testing/selftests/rcutorture/bin/functions.sh | 6 +++---
> >  tools/testing/selftests/rcutorture/bin/kvm.sh   | 4 ++--
> >  2 files changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
> > index 623939cf814e..41fb52b805e4 100644
> > --- a/tools/testing/selftests/rcutorture/bin/functions.sh
> > +++ b/tools/testing/selftests/rcutorture/bin/functions.sh
> > @@ -124,7 +124,7 @@ identify_qemu_append () {
> >  
> >  # identify_qemu_args qemu-cmd serial-file
> >  #
> > -# Output arguments for qemu arguments based on the RCU_QEMU_MAC
> > +# Output arguments for qemu arguments based on the TORTURE_QEMU_MAC
> >  # and TORTURE_QEMU_INTERACTIVE environment variables.
> >  identify_qemu_args () {
> > case "$1" in
> > @@ -133,9 +133,9 @@ identify_qemu_args () {
> > qemu-system-ppc64)
> > echo -enable-kvm -M pseries -cpu POWER7 -nodefaults
> > echo -device spapr-vscsi
> > -   if test -n "$TORTURE_QEMU_INTERACTIVE" -a -n "$RCU_QEMU_MAC"
> > +   if test -n "$TORTURE_QEMU_INTERACTIVE" -a -n "$TORTURE_QEMU_MAC"
> > then
> > -   echo -device spapr-vlan,netdev=net0,mac=$RCU_QEMU_MAC
> > +   echo -device spapr-vlan,netdev=net0,mac=$TORTURE_QEMU_MAC
> > echo -netdev bridge,br=br0,id=net0
> > elif test -n "$TORTURE_QEMU_INTERACTIVE"
> > then
> > diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
> > index 2f9605ed5b58..1a4a68c76914 100644
> > --- a/tools/testing/selftests/rcutorture/bin/kvm.sh
> > +++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
> > @@ -128,7 +128,7 @@ do
> > ;;
> > --mac)
> > checkarg --mac "(MAC address)" $# "$2" 
> > '^\([0-9a-fA-F]\{2\}:\)\{5\}[0-9a-fA-F]\{2\}$' error
> > -   RCU_QEMU_MAC=$2; export RCU_QEMU_MAC
> > +   TORTURE_QEMU_MAC=$2; export TORTURE_QEMU_MAC
> 
> Can't you drop this export the same way you did previous exports?

Good point, will do!

Thanx, Paul

> > shift
> > ;;
> > --no-initrd)
> > @@ -380,7 +380,7 @@ then
> > echo RCU_QEMU_CMD="$RCU_QEMU_CMD; export RCU_QEMU_CMD"
> > echo TORTURE_QEMU_INTERACTIVE="$TORTURE_QEMU_INTERACTIVE;
> > export TORTURE_QEMU_INTERACTIVE"
> > -   echo RCU_QEMU_MAC="$RCU_QEMU_MAC; export RCU_QEMU_MAC"
> > +   echo TORTURE_QEMU_MAC="$TORTURE_QEMU_MAC; export TORTURE_QEMU_MAC"
> > echo "mkdir -p "$resdir" || :"
> > echo "mkdir $resdir/$ds"
> > cat $T/script
> > -- 
> > 1.8.1.5
> > 
> 


Re: [PATCH tip/core/rcu 11/45] torture: Rename RCU_QEMU_INTERACTIVE to TORTURE_QEMU_INTERACTIVE

2014-05-07 Thread Paul E. McKenney

On Wed, May 07, 2014 at 02:27:08PM -0700, j...@joshtriplett.org wrote:
> On Mon, Apr 28, 2014 at 05:24:59PM -0700, Paul E. McKenney wrote:
> > -   if test -n "$RCU_QEMU_INTERACTIVE" -a -n "$RCU_QEMU_MAC"
> > +   if test -n "$TORTURE_QEMU_INTERACTIVE" -a -n "$RCU_QEMU_MAC"
> > then
> > echo -device spapr-vlan,netdev=net0,mac=$RCU_QEMU_MAC
> > echo -netdev bridge,br=br0,id=net0
> > -   elif test -n "$RCU_QEMU_INTERACTIVE"
> > +   elif test -n "$TORTURE_QEMU_INTERACTIVE"
> > then
> > echo -net nic -net user
> 
> Not related to this patch, but: qemu defaults to -net nic -net user, so
> you don't need to specify it.

Good point, I wasn't aware of that.  Paranoia will probably cause me
to leave it explicitly specified, though.

And yes, I freely admit that my paranoia is not evenly distributed.  ;-)

Thanx, Paul


Re: [PATCH tip/core/rcu 06/45] torture: Intensify locking test

2014-05-07 Thread Paul E. McKenney

On Wed, May 07, 2014 at 02:20:15PM -0700, j...@joshtriplett.org wrote:
> On Mon, Apr 28, 2014 at 05:24:54PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > The current lock_torture_writer() spends too much time sleeping and not
> > enough time hammering locks, as in an eight-CPU test will often only be
> > utilizing a CPU or two.  This commit therefore makes lock_torture_writer()
> > sleep less and hammer more.
> > 
> > Signed-off-by: Paul E. McKenney 
> > ---
> >  kernel/locking/locktorture.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
> > index f26b1a18e34e..b0d3e3c50672 100644
> > --- a/kernel/locking/locktorture.c
> > +++ b/kernel/locking/locktorture.c
> > @@ -219,7 +219,8 @@ static int lock_torture_writer(void *arg)
> > set_user_nice(current, 19);
> >  
> > do {
> > -   schedule_timeout_uninterruptible(1);
> > +   if ((torture_random() & 0xfffff) == 0)
> > +   schedule_timeout_uninterruptible(1);
> 
> That's a one-in-1048576 chance of sleeping for a jiffy; is that frequent
> enough to even bother sleeping at all?

On large systems, maybe not.  Smallish systems should be able to get
through that loop a million times in a few hundreds of milliseconds,
though.  So longer term a smarter approach might be needed, but this
should be a good start.
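
To make the arithmetic behind the one-in-1048576 figure concrete, here is a
minimal userspace sketch.  It assumes a 20-bit mask (0xfffff), which is what
that figure implies; the helper names sleep_odds() and should_sleep() are
hypothetical and are not part of locktorture.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * For a mask of the form 2^k - 1, (r & mask) == 0 holds for exactly
 * one out of (mask + 1) uniformly distributed values of r.
 */
static uint64_t sleep_odds(uint32_t mask)
{
	assert((mask & (mask + 1)) == 0);	/* mask must be 2^k - 1 */
	return (uint64_t)mask + 1;
}

/* Same shape as the test in the patch: sleep only when the masked bits are all zero. */
static int should_sleep(uint32_t r, uint32_t mask)
{
	return (r & mask) == 0;
}
```

With a mask of 0xfffff, sleep_odds() gives 1048576, so a writer that gets
through the loop a million times in a few hundred milliseconds would still
sleep roughly a few times per second.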

Thanx, Paul

> > cur_ops->writelock();
> > if (WARN_ON_ONCE(lock_is_write_held))
> > lwsp->n_write_lock_fail++;
> > -- 
> > 1.8.1.5
> > 
> 


Re: [PATCH 0/2] Add test to validate udelay

2014-05-07 Thread Doug Anderson

John,

On Wed, May 7, 2014 at 3:46 PM, John Stultz  wrote:
> On 05/07/2014 11:32 AM, Doug Anderson wrote:
>> John,
>>
>> On Wed, May 7, 2014 at 11:10 AM, John Stultz  wrote:
>>> On 05/06/2014 09:19 PM, Doug Anderson wrote:
>>>> John,
>>>>
>>>> On Tue, May 6, 2014 at 5:25 PM, John Stultz  wrote:
>>>>> Really, I'm curious about the backstory that made you generate the test?
>>>>> I assume something bit you where udelay was way off? Or were you using
>>>>> udelay for some sort of accuracy sensitive use?
>>>> Several times we've seen cases where udelay() was pretty broken with
>>>> cpufreq if you were actually implementing udelay() with
>>>> loops_per_jiffy.  I believe it may also be broken upstream on
>>>> multicore systems, though now that ARM arch timers are there maybe we
>>>> don't care as much?
>>>>
>>>> Specifically, there is a lot of confusion between the global loops per
>>>> jiffy and the per CPU one.  On ARM I think we always use the global
>>>> one and we attempt to scale it as cpufreq changes.  ...but...
>>>>
>>>> * cores tend to scale together and there's a single global.  That means
>>>> you might have started the delay loop at one freq and ended it at
>>>> another (if another CPU changes the freq).
>>> Good point. The loops based delay would clearly be broken w/ ASMP unless
>>> we use per-cpu values that are scaled(and as you point out, we don't
>>> scale the value mid-delay). Time based counters for udelay() - like the
>>> arch timer you mention - are a much better way to work around this.
>> Locally we have this:
>> * https://chromium-review.googlesource.com/189885
>>   ARM: Don't ever downscale loops_per_jiffy in SMP systems
>>
>> I didn't think upstream would really want this given the move to arch
>> timers, but I'm happy to post it.
>
> Might be good just to get the discussion going. I agree it's probably not
> the best solution, and likely the timer delay is the right path, but
> I think we need to have some rails in place so that other folks don't
> trip over this.
>
> Maybe a combination of your change and something where on systems that
> see cpufreq changes (or cores with different frequencies) complain
> loudly if they're not configured to use the delay timer?

I threw the patch up there.  I didn't add any loud complaining, but I
certainly can if people think that the patches are useful and they'd
like the complaining.

https://patchwork.kernel.org/patch/4132521/


>>>> * I believe there's some strange issues in terms of how the loops per
>>>> jiffy variable is initialized and how the "original CPU freq" is.  I
>>>> know we ran into issues on big.LITTLE where the LITTLE cores came up
>>>> and clobbered the loops_per_jiffy variable but it was still doing math
>>>> based on the big cores.
>>> Hrm. I don't have a theory on this right now, but clearly there are
>>> issues to be resolved, so having your tests included would be a good
>>> thing to help find these issues.
>> Locally we added:
>> * https://chromium-review.googlesource.com/189725
>>   init: Don't decrease loops_per_jiffy when a CPU comes up
>>
>> ...and that "fixed" things for us.  Specifically what happened was:
>>
>> - A15s boot up at 1.8GHz (though they can actually go up to 1.9/2.0).
>>
>> - A7s boot up at ~600MHz and clobbered loops_per_jiffy with something tiny.
>>
>> - In the first "cpufreq_callback" in "arch/arm/kernel/smp.c", we
>> stored the current _A15_ frequency upon the first cpufreq transition
>> and the current _A7_ loops per jiffy (since the A7s clobbered it).
>>
>>
>> It seemed like our situation was so messed up that upstream probably
>> wouldn't want the fix (using loops_per_jiffy in an HMP system is
>> pretty insane), but again I'm happy to send it up.
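
The mismatch described above can be illustrated with a small sketch.  It
assumes the reference loops-per-jiffy value is rescaled linearly with
frequency (new = ref_lpj * new_freq / ref_freq); scale_lpj() and all of the
numbers are hypothetical, chosen only to match the rough 1.8GHz/600MHz ratio
mentioned in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Linear rescale of a reference loops-per-jiffy value to a new frequency. */
static uint64_t scale_lpj(uint64_t ref_lpj, uint64_t ref_khz, uint64_t new_khz)
{
	assert(ref_khz != 0);
	return ref_lpj * new_khz / ref_khz;
}

/* Hypothetical calibration results: A15 at 1.8GHz, A7 at 600MHz. */
#define A15_LPJ	9000000ULL
#define A15_KHZ	1800000ULL
#define A7_LPJ	3000000ULL

/* Consistent pair: A15 loops per jiffy with the A15 frequency. */
static uint64_t lpj_consistent(uint64_t new_khz)
{
	return scale_lpj(A15_LPJ, A15_KHZ, new_khz);
}

/* Mismatched pair from the thread: A7 loops per jiffy stored with the A15 frequency. */
static uint64_t lpj_mismatched(uint64_t new_khz)
{
	return scale_lpj(A7_LPJ, A15_KHZ, new_khz);
}
```

At 900MHz the mismatched pair yields a value three times too small, so a
loop-based udelay() on an A15 would return early by about the same factor.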
>
> Yea. Again, might be good to send it out just so more folks are aware
> (particularly outside the arm world) that ASMP systems are here and the
> generic delay infrastructure has assumptions that are not compatible.

Here ya go: 

-Doug

Re: [PATCH tip/core/rcu 07/45] torture: Allow variations of "defconfig" to be specified

2014-05-07 Thread Paul E. McKenney

On Wed, May 07, 2014 at 02:22:19PM -0700, j...@joshtriplett.org wrote:
> On Mon, Apr 28, 2014 at 05:24:55PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > Some environments require some variation on "make defconfig" to initialize
> > the .config file.  This commit therefore adds a --defconfig argument to
> > allow this to be specified.  The default value is of course "defconfig".
> > 
> > Signed-off-by: Paul E. McKenney 
> 
> 
> "--defconfig randconfig" or "--defconfig allyesconfig" or similar seems
> rather odd; how about calling it --kconfig or similar?
> 

Some day I am going to have to feed that to a browser and see what
happens.  ;-)

I must confess that I hadn't considered feeding randconfig or allyesconfig
to that argument, partly because I figured that I would have to also
supply Kconfig constraints in those cases in order to ensure that the
resulting kernel would actually run under qemu.  I was instead thinking
in terms of a --configs option beginning with "RAND", which would pick
up the Kconfig constraints from the appropriate configs directory,
for example:

tools/testing/selftests/rcutorture/configs/rcu/RAND1

That said, I haven't thought that far down that path.

So for the --defconfig argument, I was thinking more in terms of things
like pseries_defconfig or versatile_defconfig.
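
The patch quoted below checks the --defconfig value against the pattern
'^[^/][^/]*$' and also passes '^--' to checkarg.  A minimal sketch of that
check using POSIX regular expressions follows.  It assumes the first pattern
must match and the second must not; matches() and defconfig_arg_ok() are
hypothetical names, not part of kvm.sh.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX basic regular expression matches s, else 0. */
static int matches(const char *pattern, const char *s)
{
	regex_t re;
	int rc;

	assert(pattern && s);
	if (regcomp(&re, pattern, REG_NOSUB) != 0)
		return 0;
	rc = regexec(&re, s, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}

/* Accept a non-empty name with no '/' that does not look like another option. */
static int defconfig_arg_ok(const char *arg)
{
	return matches("^[^/][^/]*$", arg) && !matches("^--", arg);
}
```

Under these assumptions, values such as pseries_defconfig or
versatile_defconfig pass, while a path containing a slash or a following
option such as --dryrun is rejected.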

Thanx, Paul

> >  tools/testing/selftests/rcutorture/bin/configinit.sh | 2 +-
> >  tools/testing/selftests/rcutorture/bin/kvm.sh        | 8 ++++++++
> >  2 files changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/testing/selftests/rcutorture/bin/configinit.sh 
> > b/tools/testing/selftests/rcutorture/bin/configinit.sh
> > index a1be6e62add1..9c3f3d39b934 100755
> > --- a/tools/testing/selftests/rcutorture/bin/configinit.sh
> > +++ b/tools/testing/selftests/rcutorture/bin/configinit.sh
> > @@ -62,7 +62,7 @@ grep '^grep' < $T/u.sh > $T/upd.sh
> >  echo "cat - $c" >> $T/upd.sh
> >  make mrproper
> >  make $buildloc distclean > $builddir/Make.distclean 2>&1
> > -make $buildloc defconfig > $builddir/Make.defconfig.out 2>&1
> > +make $buildloc $TORTURE_DEFCONFIG > $builddir/Make.defconfig.out 2>&1
> >  mv $builddir/.config $builddir/.config.sav
> >  sh $T/upd.sh < $builddir/.config.sav > $builddir/.config
> >  cp $builddir/.config $builddir/.config.new
> > diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh 
> > b/tools/testing/selftests/rcutorture/bin/kvm.sh
> > index a52a077ee258..59945b7793d9 100644
> > --- a/tools/testing/selftests/rcutorture/bin/kvm.sh
> > +++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
> > @@ -38,6 +38,7 @@ dur=30
> >  dryrun=""
> >  KVM="`pwd`/tools/testing/selftests/rcutorture"; export KVM
> >  PATH=${KVM}/bin:$PATH; export PATH
> > +TORTURE_DEFCONFIG=defconfig
> >  TORTURE_INITRD="$KVM/initrd"; export TORTURE_INITRD
> >  RCU_KMAKE_ARG=""; export RCU_KMAKE_ARG
> >  TORTURE_SUITE=rcu
> > @@ -56,6 +57,7 @@ usage () {
> > echo "   --configs \"config-file list\""
> > echo "   --cpus N"
> > echo "   --datestamp string"
> > +   echo "   --defconfig string"
> > echo "   --dryrun sched|script"
> > echo "   --duration minutes"
> > echo "   --interactive"
> > @@ -96,6 +98,11 @@ do
> > ds=$2
> > shift
> > ;;
> > +   --defconfig)
> > +   checkarg --defconfig "defconfigtype" "$#" "$2" '^[^/][^/]*$' '^--'
> > +   TORTURE_DEFCONFIG=$2
> > +   shift
> > +   ;;
> > --dryrun)
> > checkarg --dryrun "sched|script" $# "$2" 'sched\|script' '^--'
> > dryrun=$2
> > @@ -259,6 +266,7 @@ END {
> >  # Generate a script to execute the tests in appropriate batches.
> >  cat << ___EOF___ > $T/script
> >  TORTURE_SUITE="$TORTURE_SUITE"; export TORTURE_SUITE
> > +TORTURE_DEFCONFIG="$TORTURE_DEFCONFIG"; export TORTURE_DEFCONFIG
> >  ___EOF___
> >  awk < $T/cfgcpu.pack \
> > -v CONFIGDIR="$CONFIGFRAG/$kversion/" \
> > -- 
> > 1.8.1.5
> > 
> 


[PATCH] init: Don't decrease loops_per_jiffy when a CPU comes up

2014-05-07 Thread Doug Anderson

The loops_per_jiffy count continues to be updated as each CPU is
brought up.  This causes problems when we've got an HMP system and
different CPUs have different loops per jiffy.  On exynos 542x
systems, for instance, the A7s will have significantly lower loops per
jiffy than their big brothers.

We should always set the loops_per_jiffy the first time through, then
use the max.

One could argue that complex HMP systems should really be completely
ignoring the global loops_per_jiffy variable anyway.  That's probably
why nobody has fixed this before.  With that argument you could say
that while this change isn't incorrect, it's a bit misguided.  Still,
it doesn't hurt and provides a better fallback than we had without
this.
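
The rule stated above (set the value the first time through, then use the
max) can be sketched in isolation as follows.  This is a userspace model, not
the kernel code; struct lpj_state and record_cpu_lpj() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

struct lpj_state {
	unsigned long loops_per_jiffy;	/* global value used by the delay loop */
	bool already_ran;		/* has any CPU been calibrated yet? */
};

/* Record one CPU's calibration result; after the first CPU the global never decreases. */
static void record_cpu_lpj(struct lpj_state *st, unsigned long lpj)
{
	assert(st);
	if (!st->already_ran || lpj > st->loops_per_jiffy)
		st->loops_per_jiffy = lpj;
	st->already_ran = true;
}
```

With this rule a slower CPU that comes up later leaves the global value
untouched, so loop-based delays can run long on the slow CPU but never run
short on the fast one.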

Signed-off-by: Doug Anderson 
---
 init/calibrate.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/init/calibrate.c b/init/calibrate.c
index 520702d..073bf9b 100644
--- a/init/calibrate.c
+++ b/init/calibrate.c
@@ -265,40 +265,44 @@ unsigned long __attribute__((weak)) calibrate_delay_is_known(void)
 void calibrate_delay(void)
 {
unsigned long lpj;
-   static bool printed;
+   static bool already_ran;
int this_cpu = smp_processor_id();
 
if (per_cpu(cpu_loops_per_jiffy, this_cpu)) {
lpj = per_cpu(cpu_loops_per_jiffy, this_cpu);
-   if (!printed)
+   if (!already_ran)
pr_info("Calibrating delay loop (skipped) "
"already calibrated this CPU");
} else if (preset_lpj) {
lpj = preset_lpj;
-   if (!printed)
+   if (!already_ran)
pr_info("Calibrating delay loop (skipped) "
"preset value.. ");
-   } else if ((!printed) && lpj_fine) {
+   } else if ((!already_ran) && lpj_fine) {
lpj = lpj_fine;
pr_info("Calibrating delay loop (skipped), "
"value calculated using timer frequency.. ");
} else if ((lpj = calibrate_delay_is_known())) {
;
} else if ((lpj = calibrate_delay_direct()) != 0) {
-   if (!printed)
+   if (!already_ran)
pr_info("Calibrating delay using timer "
"specific routine.. ");
} else {
-   if (!printed)
+   if (!already_ran)
pr_info("Calibrating delay loop... ");
lpj = calibrate_delay_converge();
}
per_cpu(cpu_loops_per_jiffy, this_cpu) = lpj;
-   if (!printed)
+   if (!already_ran) {
pr_cont("%lu.%02lu BogoMIPS (lpj=%lu)\n",
lpj/(500000/HZ),
(lpj/(5000/HZ)) % 100, lpj);
 
-   loops_per_jiffy = lpj;
-   printed = true;
+   loops_per_jiffy = lpj;
+   } else {
+   loops_per_jiffy = max(loops_per_jiffy, lpj);
+   }
+
+   already_ran = true;
 }
-- 
1.9.1.423.g4596e3a


Re: [PATCH tip/core/rcu 01/45] rcutorture: Add forward-progress checking for writer

2014-05-07 Thread Paul E. McKenney

On Wed, May 07, 2014 at 02:16:49PM -0700, j...@joshtriplett.org wrote:
> On Mon, Apr 28, 2014 at 05:24:49PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > The rcutorture output currently does not distinguish between stalls in
> > the RCU implementation and stalls in the rcu_torture_writer() kthreads.
> > This commit therefore adds some diagnostics to help distinguish between
> > these two conditions, at least for the non-SRCU implementations.  (SRCU
> > does not provide evidence of update-side forward progress by design.)
> > 
> > Signed-off-by: Paul E. McKenney 
> 
> The concept makes sense, and the writer state annotations seem like a
> useful debugging mechanism, but having RCU know about RCU torture types
> seems fundamentally wrong.  This mechanism accesses rcu_state, which is
> already implementation-specific, so why not just only define the
> function for the RCU implementations that support it, and then have a
> function pointer in the torture-test structure to report a stall?

Ouch.  It is worse than that!  When running RCU-bh or RCU-sched,
the current code incorrectly returns the statistics for RCU.
So I do need some way for rcutorture to tell RCU which flavor
it is testing.

One thing I could do would be to pass in a pointer to the call_rcu()
function (cur_ops->call from rcutorture's viewpoint), then scan the
rcu_state structures looking for the selected flavor (rsp->call from
tree.c's viewpoint).  In the SRCU and RCU-busted cases, the flavor would
not be found, and I could then just set everything to zero.

Does that seem reasonable, or is there a better way to do this?
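
A rough sketch of that lookup, to make the proposal concrete.  Everything
here is a hypothetical stand-in: struct flavor_state models the per-flavor
state in tree.c, and the *_model functions stand in for the real call_rcu()
variants.  It is not the eventual implementation.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*call_fn_t)(void);

/* Hypothetical stand-in for the per-flavor state kept in tree.c. */
struct flavor_state {
	call_fn_t call;		/* this flavor's call_rcu()-style function */
	int flags;
	unsigned long gpnum;
	unsigned long completed;
};

/* Distinct bodies so that each function has its own address. */
static int n_rcu, n_bh, n_srcu;
static void call_rcu_model(void) { n_rcu++; }
static void call_rcu_bh_model(void) { n_bh++; }
static void call_srcu_model(void) { n_srcu++; }

static struct flavor_state flavors[] = {
	{ call_rcu_model,    1, 100, 99 },
	{ call_rcu_bh_model, 0,  42, 42 },
};

/* Find the flavor by its call function; report zeros if it is not found. */
static void get_gp_data(call_fn_t call, int *flags,
			unsigned long *gpnum, unsigned long *completed)
{
	size_t i;

	assert(flags && gpnum && completed);
	*flags = 0;
	*gpnum = 0;
	*completed = 0;
	for (i = 0; i < sizeof(flavors) / sizeof(flavors[0]); i++) {
		if (flavors[i].call == call) {
			*flags = flavors[i].flags;
			*gpnum = flavors[i].gpnum;
			*completed = flavors[i].completed;
			return;
		}
	}
}
```

A flavor that has no entry in the table (SRCU in this model) simply gets
zeros back, which matches the fallback described above.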

Thanx, Paul

> - Josh Triplett
> 
> >  include/linux/rcupdate.h | 19 +++++++++++++++++++
> >  kernel/rcu/rcutorture.c  | 37 +++++++++++++++++++++++++++++++++++++
> >  kernel/rcu/tree.c        | 18 ++++++++++++++++++
> >  3 files changed, 74 insertions(+)
> > 
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 00a7fd61b3c6..a6c3898e141e 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -51,7 +51,17 @@ extern int rcu_expedited; /* for sysctl */
> >  extern int rcutorture_runnable; /* for sysctl */
> >  #endif /* #ifdef CONFIG_RCU_TORTURE_TEST */
> >  
> > +enum rcutorture_type {
> > +   RTORT_BUSTED,
> > +   RTORT_RCU,
> > +   RTORT_RCU_BH,
> > +   RTORT_RCU_SCHED,
> > +   RTORT_SRCU
> > +};
> > +
> >  #if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU)
> > +void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
> > +   unsigned long *gpnum, unsigned long *completed);
> >  void rcutorture_record_test_transition(void);
> >  void rcutorture_record_progress(unsigned long vernum);
> >  void do_trace_rcu_torture_read(const char *rcutorturename,
> > @@ -60,6 +70,15 @@ void do_trace_rcu_torture_read(const char 
> > *rcutorturename,
> >unsigned long c_old,
> >unsigned long c);
> >  #else
> > +static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
> > + int *flags,
> > + unsigned long *gpnum,
> > + unsigned long *completed)
> > +{
> > +   *flags = 0;
> > +   *gpnum = 0;
> > +   *completed = 0;
> > +}
> >  static inline void rcutorture_record_test_transition(void)
> >  {
> >  }
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index bd30bc61bc05..1110db210318 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -138,6 +138,15 @@ static long n_barrier_attempts;
> >  static long n_barrier_successes;
> >  static struct list_head rcu_torture_removed;
> >  
> > +static int rcu_torture_writer_state;
> > +#define RTWS_FIXED_DELAY   0
> > +#define RTWS_DELAY 1
> > +#define RTWS_REPLACE   2
> > +#define RTWS_DEF_FREE  3
> > +#define RTWS_EXP_SYNC  4
> > +#define RTWS_STUTTER   5
> > +#define RTWS_STOPPING  6
> > +
> >  #if defined(MODULE) || defined(CONFIG_RCU_TORTURE_TEST_RUNNABLE)
> >  #define RCUTORTURE_RUNNABLE_INIT 1
> >  #else
> > @@ -214,6 +223,7 @@ rcu_torture_free(struct rcu_torture *p)
> >   */
> >  
> >  struct rcu_torture_ops {
> > +   int ttype;
> > void (*init)(void);
> > int (*readlock)(void);
> > void (*read_delay)(struct torture_random_state *rrsp);
> > @@ -312,6 +322,7 @@ static void rcu_sync_torture_init(void)
> >  }
> >  
> >  static struct rcu_torture_ops rcu_ops = {
> > +   .ttype  = RTORT_RCU,
> > .init   = rcu_sync_torture_init,
> > .readlock   = rcu_torture_read_lock,
> > .read_delay = rcu_read_delay,
> > @@ -355,6 +366,7 @@ static void rcu_bh_torture_deferred_free(struct 
> > rcu_torture *p)
> >  }
> >  
> >  static struct rcu_torture_ops rcu_bh_ops = {
> > +   .ttype  = RTORT_RCU_BH,

Re: [PATCH] arm: memset: zero out upper bytes in r1

2014-05-07 Thread Afzal Mohammed

Hi Andrey,

On Mon, May 05, 2014 at 11:11:13AM +0400, Andrey Ryabinin wrote:

> memset doesn't work right for the following example:
> 
>   signed char c = 0xF0;
>   memset(addr, c, size);
> 
> Variable c is signed, so after typecasting to int the value will be 0xFFFFFFF0.
> This value will be passed through the r1 register to the memset function.
> memset doesn't zero out upper bytes in r1, so memory will be filled
> with 0xFFFFFFF0 instead of expected 0xF0F0F0F0.

> --- a/arch/arm/lib/memset.S
> +++ b/arch/arm/lib/memset.S
> @@ -22,7 +22,8 @@ ENTRY(memset)
>  /*
>   * we know that the pointer in ip is aligned to a word boundary.
>   */
> -1:   orr r1, r1, r1, lsl #8
> +1:   and r1, r1, #0xff
> + orr r1, r1, r1, lsl #8

memset is supposed to convert the int to unsigned char.  Would making the
above change immediately upon entry to memset, rather than at a place where
it won't always execute, make the intention clearer?  (Although it doesn't
make a difference.)

ubfx r1, r1, #0, #8 would have given the needed typecasting, but it seems
to be available only on ARMv6T2 and above.
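
For reference, the effect of the missing mask can be reproduced in plain C.
This is only a model of the register arithmetic: it assumes the fill word is
built by OR-ing r1 with itself shifted left by 8 and then by 16 (only the
first of those steps is visible in the quoted hunk), and the function names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Build the 32-bit fill word without clearing the upper bytes first. */
static uint32_t fill_word_unmasked(int c)
{
	uint32_t r1 = (uint32_t)c;

	r1 |= r1 << 8;
	r1 |= r1 << 16;
	return r1;
}

/* Same, but with the "and r1, r1, #0xff" step from the patch applied first. */
static uint32_t fill_word_masked(int c)
{
	uint32_t r1 = (uint32_t)c & 0xff;

	r1 |= r1 << 8;
	r1 |= r1 << 16;
	assert((r1 >> 24) == (r1 & 0xff));	/* all four bytes are equal */
	return r1;
}
```

Passing -16 (a signed char holding the bit pattern 0xF0, promoted to int)
gives 0xFFFFFFF0 from the unmasked version and 0xF0F0F0F0 from the masked
one.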

Regards
Afzal

Re: [PATCH] ARM: tegra: enable console framebuffer rotation

2014-05-07 Thread Stephen Warren

On 05/07/2014 05:40 PM, Alex Courbot wrote:
> On 05/08/2014 12:57 AM, Stephen Warren wrote:
>> On 05/06/2014 09:18 PM, Alexandre Courbot wrote:
>>> Console rotation is needed for devices like Tegra Note 7 and NVIDIA
>>> SHIELD to get the boot console in the expected orientation.
>>
>> I've squashed this into Tegra's for-3.16/defconfig branch.
>>
>> Can you please also update multi_v7_defconfig, and send that change to
>> arm-soc (a...@kernel.org) to be applied. Thanks.
> 
> I omitted doing this for now because the devices that require this
> option (TN7/SHIELD) need a custom build with appended DTB and/or
> command-line anyway. Therefore they cannot use a multi-mach kernel and
> might as well be built against Tegra's defconfig. Does your remark above
> still apply in spite of this?

Ah right, I guess we don't need it there then.

Re: [PATCH] ARM: tegra: enable console framebuffer rotation

2014-05-07 Thread Alex Courbot


On 05/08/2014 12:57 AM, Stephen Warren wrote:

On 05/06/2014 09:18 PM, Alexandre Courbot wrote:

Console rotation is needed for devices like Tegra Note 7 and NVIDIA
SHIELD to get the boot console in the expected orientation.


I've squashed this into Tegra's for-3.16/defconfig branch.

Can you please also update multi_v7_defconfig, and send that change to
arm-soc (a...@kernel.org) to be applied. Thanks.


I omitted doing this for now because the devices that require this 
option (TN7/SHIELD) need a custom build with appended DTB and/or 
command-line anyway. Therefore they cannot use a multi-mach kernel and 
might as well be built against Tegra's defconfig. Does your remark above 
still apply in spite of this?


Thanks,
Alex.

Re: [PATCH v5 2/4] acpi_processor: do not mark present at boot but not onlined CPU as onlined

2014-05-07 Thread Rafael J. Wysocki

On Monday, May 05, 2014 10:49:49 PM Igor Mammedov wrote:
> acpi_processor_add() assumes that present at boot CPUs
> are always onlined, it is not so if a CPU failed to become
> onlined. As result acpi_processor_add() will mark such CPU
> device as onlined in sysfs and following attempts to
> online/offline it using /sys/device/system/cpu/cpuX/online
> attribute will fail.
> 
> Do not poke into device internals in acpi_processor_add()
> and touch "struct device { .offline }" attribute, since
> for CPUs onlined at boot it's set by:
>   topology_init() -> arch_register_cpu() -> register_cpu()
> before ACPI device tree is parsed, and for hotplugged
> CPUs it's set when userspace onlines CPU via sysfs.
> 
> Signed-off-by: Igor Mammedov 
> Acked-by: Toshi Kani 

Would there be a problem if I applied this separately from the rest of the
series?

> ---
> v2:
>  - fix regression in v1 leading to NULL pointer dereference
>on CPU unplug, do not remove "pr->dev = dev;"
> ---
>  drivers/acpi/acpi_processor.c |1 -
>  1 files changed, 0 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index b06f5f5..52c81c4 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -405,7 +405,6 @@ static int acpi_processor_add(struct acpi_device *device,
>   goto err;
>  
>   pr->dev = dev;
> - dev->offline = pr->flags.need_hotplug_init;
>  
>   /* Trigger the processor driver's .probe() if present. */
>   if (device_attach(dev) >= 0)
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [PATCH v5] mm: support madvise(MADV_FREE)

2014-05-07 Thread Minchan Kim

bump

On Mon, Apr 21, 2014 at 10:56:08AM +0900, Minchan Kim wrote:
> Linux has no way to free pages lazily, while other OSes already support
> this via madvise(MADV_FREE).
> 
> The gain is clear: the kernel can discard freed pages rather than
> swapping them out or triggering OOM when memory pressure happens.
> 
> Without memory pressure, freed pages are reused by userspace
> without any additional overhead (e.g., page fault + allocation
> + zeroing).
> 
> It works as follows.
> 
> When the madvise syscall is called, the VM clears the dirty bit of the
> ptes in the range. If memory pressure happens, the VM checks the dirty
> bit in the page table; if it is still "clean", the page is a
> "lazyfree" page, so the VM can discard it instead of swapping it out.
> If there was a store to the page before the VM picked it for
> reclaim, the dirty bit is set, so the VM swaps the page out instead of
> discarding it.
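
The dirty-bit protocol described above can be modelled in a few lines.  This
is a userspace toy, not kernel code: struct page_model and the model_*()
helpers are hypothetical, and the sketch tracks only the one bit the
description relies on.

```c
#include <assert.h>
#include <stdbool.h>

enum reclaim_action { RECLAIM_DISCARD, RECLAIM_SWAP_OUT };

/* Hypothetical model of one anonymous page's pte state. */
struct page_model {
	bool pte_dirty;
};

/* madvise(MADV_FREE) on the range: clear the dirty bit. */
static void model_madvise_free(struct page_model *p)
{
	assert(p);
	p->pte_dirty = false;
}

/* A later store by userspace sets the dirty bit again. */
static void model_store(struct page_model *p)
{
	assert(p);
	p->pte_dirty = true;
}

/* Under memory pressure: still clean means lazyfree, so discard; dirty means swap out. */
static enum reclaim_action model_reclaim(const struct page_model *p)
{
	assert(p);
	return p->pte_dirty ? RECLAIM_SWAP_OUT : RECLAIM_DISCARD;
}
```

In this model a page that is reused (stored to) after the madvise call is
treated like any other dirty anonymous page, which is what lets userspace
keep using the range without a fresh page fault.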
> 
> The first heavy users would be general allocators (e.g., jemalloc and
> tcmalloc, and hopefully glibc will support it); jemalloc and tcmalloc
> already support the feature on other OSes (e.g., FreeBSD).
> 
> barrios@blaptop:~/benchmark/ebizzy$ lscpu
> Architecture:  x86_64
> CPU op-mode(s):32-bit, 64-bit
> Byte Order:Little Endian
> CPU(s):4
> On-line CPU(s) list:   0-3
> Thread(s) per core:2
> Core(s) per socket:2
> Socket(s): 1
> NUMA node(s):  1
> Vendor ID: GenuineIntel
> CPU family:6
> Model: 42
> Stepping:  7
> CPU MHz:   2801.000
> BogoMIPS:  5581.64
> Virtualization:VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache:  256K
> L3 cache:  4096K
> NUMA node0 CPU(s): 0-3
> 
> ebizzy benchmark(./ebizzy -S 10 -n 512)
> 
>  vanilla-jemalloc MADV_free-jemalloc
> 
> 1 thread
> records:  10  records:  10
> avg:  7436.70 avg:  15292.70
> std:  48.01(0.65%)std:  496.40(3.25%)
> max:  7542.00 max:  15944.00
> min:  7366.00 min:  14478.00
> 
> 2 thread
> records:  10  records:  10
> avg:  12190.50avg:  24975.50
> std:  1011.51(8.30%)  std:  1127.22(4.51%)
> max:  13012.00max:  26382.00
> min:  10192.00min:  23265.00
> 
> 4 thread
> records:  10  records:  10
> avg:  16875.30avg:  36320.90
> std:  562.59(3.33%)   std:  1503.75(4.14%)
> max:  17465.00max:  38314.00
> min:  15552.00min:  33863.00
> 
> 8 thread
> records:  10  records:  10
> avg:  16966.80avg:  35915.20
> std:  229.35(1.35%)   std:  2153.89(6.00%)
> max:  17456.00max:  37943.00
> min:  16742.00min:  29891.00
> 
> 16 thread
> records:  10  records:  10
> avg:  20590.90avg:  37388.40
> std:  362.33(1.76%)   std:  1282.59(3.43%)
> max:  20954.00max:  38911.00
> min:  19985.00min:  34928.00
> 
> 32 thread
> records:  10  records:  10
> avg:  22633.40avg:  37118.00
> std:  413.73(1.83%)   std:  766.36(2.06%)
> max:  23120.00max:  38328.00
> min:  22071.00min:  35557.00
> 
> In summary, MADV_FREE is about 2 times faster than MADV_DONTNEED.
> 
> Patchset is based on v3.15-rc1-mmotm-2014-04-15-16-14
> 
> * From v4
>  * Add Reviewed-by: Zhang Yanfei
>  * Rebase on v3.15-rc1-mmotm-2014-04-15-16-14
> 
> * From v3
>  * Add "how to work part" in description - Zhang
>  * Add page_discardable utility function - Zhang
>  * Clean up
> 
> * From v2
>  * Remove forceful dirty marking of swap-readed page - Johannes
>  * Remove deactivation logic of lazyfreed page
>  * Rebased on 3.14
>  * Remove RFC tag
> 
> * From v1
>  * Use custom page table walker for madvise_free - Johannes
>  * Remove PG_lazypage flag - Johannes
>  * Do madvise_dontneed instead of madvise_free in swapless system
> 
> Cc: Hugh Dickins 
> Cc: Johannes Weiner 
> Cc: Rik van Riel 
> Cc: KOSAKI Motohiro 
> Cc: Mel Gorman 
> Cc: Jason Evans 
> Reviewed-by: Zhang Yanfei 
> Signed-off-by: Minchan Kim 
> ---
>  include/linux/mm.h |   2 +
>  include/linux/rmap.h   |  21 -
>  include/linux/vm_event_item.h  |   1 +
>  include/uapi/asm-generic/mman-common.h |   1 +
>  mm/madvise.c   |  25 ++
>  mm/memory.c| 140 
> +
>  mm/rmap.c  |  83 +--
>  mm/vmscan.c|  29 ++-
>  mm/vmstat.c|   1 +
>  9 files changed, 291 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 9a3744d98b00..98f55ccef1a9 100644
> --- a/include/linux/mm.h
>

[PATCH] devicetree: bindings: separate CPU enable method descriptions

2014-05-07 Thread Alex Elder

The bindings for CPU enable methods are defined in ".../arm/cpus.txt".  As
additional 32-bit ARM CPUs are converted to use the "enable-method" CPU
property to imply a particular set of SMP operations to use, the list of these
methods is likely to become unwieldy.  The current documentation already
contains several property descriptions that are meaningful only for certain
enable methods.

This patch defines a new Documentation subdirectory whose purpose is to give
each CPU enable method its own place to define how and when it's used, as
well as what other properties (optional or required) are associated with
the method.  The existing enable method documentation is expanded and moved
from ".../arm/cpus.txt" into new files accordingly.

Signed-off-by: Alex Elder 
---
This series is available here:
http://git.linaro.org/landing-teams/working/broadcom/kernel.git
Branch review/enable-method-bindings

 .../bindings/arm/cpu-enable-method/README  | 20 +
 .../bindings/arm/cpu-enable-method/arm,psci.txt| 69 
 .../arm/cpu-enable-method/qcom,gcc-msm8660 | 30 +++
 .../arm/cpu-enable-method/qcom,kpss-acc-v1 | 56 +
 .../arm/cpu-enable-method/qcom,kpss-acc-v2 | 56 +
 .../bindings/arm/cpu-enable-method/spin-table.txt  | 96 ++
 Documentation/devicetree/bindings/arm/cpus.txt | 29 +--
 7 files changed, 330 insertions(+), 26 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/arm/cpu-enable-method/README
 create mode 100644 
Documentation/devicetree/bindings/arm/cpu-enable-method/arm,psci.txt
 create mode 100644 
Documentation/devicetree/bindings/arm/cpu-enable-method/qcom,gcc-msm8660
 create mode 100644 
Documentation/devicetree/bindings/arm/cpu-enable-method/qcom,kpss-acc-v1
 create mode 100644 
Documentation/devicetree/bindings/arm/cpu-enable-method/qcom,kpss-acc-v2
 create mode 100644 
Documentation/devicetree/bindings/arm/cpu-enable-method/spin-table.txt

diff --git a/Documentation/devicetree/bindings/arm/cpu-enable-method/README 
b/Documentation/devicetree/bindings/arm/cpu-enable-method/README
new file mode 100644
index 0000000..cc9431e
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/cpu-enable-method/README
@@ -0,0 +1,20 @@
+==========================
+CPU enable-method bindings
+==========================
+
+The device tree describes the layout of CPUs in a machine in a single "cpus"
+node, which in turn contains a number of "cpu" sub-nodes defining properties
+for each cpu.
+
+For multiprocessing configurations, CPU cores can be individually enabled
+and disabled.  The enabling capability is used for SMP startup as well as
+CPU hotplug.  A CPU enable method--normally specified in the device tree
+using an "enable-method" property--defines how cores are enabled.  If all
+CPUs in a machine use the same enable method and related property values,
+these properties should be defined in the "cpus" node, which associates the
+property values with all CPUs.  Alternatively, every "cpu" node can define
+its "enable-method" separately.
+
+Documents in this directory define how each of the CPU enable methods are to
+be used, as well as the names and possible values of related properties that
+are required by or affect each enable method.
diff --git 
a/Documentation/devicetree/bindings/arm/cpu-enable-method/arm,psci.txt 
b/Documentation/devicetree/bindings/arm/cpu-enable-method/arm,psci.txt
new file mode 100644
index 0000000..c80d68e
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/cpu-enable-method/arm,psci.txt
@@ -0,0 +1,69 @@
+====================================
+CPU enable-method "arm,psci" binding
+====================================
+
+This document describes the "arm,psci" method for enabling secondary CPUs.
+This is different from other CPU enable methods, in that CPU cores are
+enabled and disabled using the ARM PSCI interface, which is defined in the
+device tree independent of the CPUs.  Instead, a separate node compatible
+with "arm,psci" defines the PSCI functions supported; if a "cpu_on" function
+is defined, that is used for enabling a CPU core.
+
+Enable method: Distinct node with compatible = "arm,psci" property
+Compatible cpus:   (???)  (both 32- and 64-bit ARM have a hook)
+Properties:
+   - method
+   Usage:  required
+   Value type: 
+   Definition:
+   A string defining the specific instruction
+   used to enable the core.  The value must be
+   either "hvc" or "smc".
+   - cpu_suspend
+   Usage:  optional
+   Value type: 
+   Definition:
+   If present, this value defines the PSCI function id
+   used to suspend execution on a CPU core.
+   - cpu_off
+   Usage:  optional
+   Value type: 
+

[PATCH] ARM: Don't ever downscale loops_per_jiffy in SMP systems

2014-05-07 Thread Doug Anderson

Downscaling loops_per_jiffy on SMP ARM systems really doesn't work.
You could really only do this if:

* Each CPU has independent frequency changes (changing one CPU
  doesn't affect another).
* We change the generic ARM udelay() code to actually look at percpu
  loops_per_jiffy.

I don't know of any ARM CPUs that are totally independent that don't
just use a timer-based delay anyway.  For those that don't have a
timer-based delay, we should be conservative and overestimate
loops_per_jiffy.

Note that on some systems you might sometimes see (in the extreme case
when we're all the way downclocked) a udelay(100) become a
udelay(1000) now.

Signed-off-by: Doug Anderson 
---
Note that I don't have a board that has cpufreq enabled upstream so
I'm relying on the testing I did on our local kernel-3.8.  Hopefully
someone out there can test using David's nifty udelay tests.  In order
to see this you'd need to make sure that you _don't_ have arch timers
enabled.  See:
* https://patchwork.kernel.org/patch/4124721/
* https://patchwork.kernel.org/patch/4124731/
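
For readers who want to see the arithmetic, here is a minimal userspace
sketch of the upward-only scaling this patch describes.  It is not the
kernel code: scale_lpj() is a stand-in for cpufreq_scale(), the
on_freq_change() and actual_delay_us() helpers are hypothetical names, and
the starting loops_per_jiffy value is made up for illustration.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's cpufreq_scale(): old * mult / div. */
static unsigned long scale_lpj(unsigned long old, unsigned int div,
			       unsigned int mult)
{
	assert(div != 0);
	return (unsigned long)(((unsigned long long)old * mult) / div);
}

/* Hypothetical value; the real one comes from boot-time calibration. */
static unsigned long loops_per_jiffy = 1000000;

static unsigned long ref_lpj;	/* loops_per_jiffy at the reference frequency */
static unsigned int ref_khz;	/* reference frequency */
static unsigned int max_khz;	/* highest frequency seen so far */

/* Mirrors the patched callback: scale up on a new maximum, never down. */
static void on_freq_change(unsigned int old_khz, unsigned int new_khz)
{
	if (!ref_lpj) {
		ref_lpj = loops_per_jiffy;
		ref_khz = old_khz;
		max_khz = old_khz;
	}
	if (new_khz > max_khz) {
		loops_per_jiffy = scale_lpj(ref_lpj, ref_khz, new_khz);
		max_khz = new_khz;
	}
}

/* Real duration of a loop-based delay sized for max_khz but run at cur_khz. */
static unsigned long actual_delay_us(unsigned long requested_us,
				     unsigned int cur_khz)
{
	assert(cur_khz != 0);
	return (unsigned long)(((unsigned long long)requested_us * max_khz) /
			       cur_khz);
}
```

After on_freq_change() has seen a 2 GHz maximum, a CPU clocked down to
200 MHz spins ten times longer than requested, which is the udelay(100)
turning into udelay(1000) case mentioned in the commit message.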

 arch/arm/kernel/smp.c | 45 -
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 7c4fada..9d944f6 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -649,39 +649,50 @@ int setup_profiling_timer(unsigned int multiplier)
 
 #ifdef CONFIG_CPU_FREQ
 
-static DEFINE_PER_CPU(unsigned long, l_p_j_ref);
-static DEFINE_PER_CPU(unsigned long, l_p_j_ref_freq);
 static unsigned long global_l_p_j_ref;
 static unsigned long global_l_p_j_ref_freq;
+static unsigned long global_l_p_j_max_freq;
+
+/**
+ * cpufreq_callback - Adjust loops_per_jiffies when frequency changes
+ *
+ * When the CPU frequency changes we need to adjust loops_per_jiffies, which
+ * we assume scales linearly with frequency.
+ *
+ * This function is fairly castrated and only ever adjusts loops_per_jiffies
+ * upward.  It also doesn't adjust the PER_CPU loops_per_jiffies.  Here's why:
+ * 1. The ARM udelay only ever looks at the global loops_per_jiffy not the
+ *    percpu one.  If your CPUs _are not_ changed in lockstep you could run
+ *    into problems by decreasing loops_per_jiffies since one of the other
+ *    processors might still be running slower.
+ * 2. The ARM udelay reads the loops_per_jiffy at the beginning of its loop and
+ *    no other times.  If your CPUs _are_ changed in lockstep you could run
+ *    into a race where one CPU has started its loop with old (slower)
+ *    loops_per_jiffy and then suddenly is running faster.
+ *
+ * Anyone who wants a good udelay() should be using a timer-based solution
+ * anyway.  If you don't have a timer solution, you just gotta be conservative.
+ */
 
 static int cpufreq_callback(struct notifier_block *nb,
 					unsigned long val, void *data)
 {
 	struct cpufreq_freqs *freq = data;
-	int cpu = freq->cpu;
 
 	if (freq->flags & CPUFREQ_CONST_LOOPS)
 		return NOTIFY_OK;
 
-	if (!per_cpu(l_p_j_ref, cpu)) {
-		per_cpu(l_p_j_ref, cpu) =
-			per_cpu(cpu_data, cpu).loops_per_jiffy;
-		per_cpu(l_p_j_ref_freq, cpu) = freq->old;
-		if (!global_l_p_j_ref) {
-			global_l_p_j_ref = loops_per_jiffy;
-			global_l_p_j_ref_freq = freq->old;
-		}
+	if (!global_l_p_j_ref) {
+		global_l_p_j_ref = loops_per_jiffy;
+		global_l_p_j_ref_freq = freq->old;
+		global_l_p_j_max_freq = freq->old;
 	}
 
-	if ((val == CPUFREQ_PRECHANGE  && freq->old < freq->new) ||
-	    (val == CPUFREQ_POSTCHANGE && freq->old > freq->new)) {
+	if (freq->new > global_l_p_j_max_freq) {
 		loops_per_jiffy = cpufreq_scale(global_l_p_j_ref,
 						global_l_p_j_ref_freq,
 						freq->new);
-		per_cpu(cpu_data, cpu).loops_per_jiffy =
-			cpufreq_scale(per_cpu(l_p_j_ref, cpu),
-					per_cpu(l_p_j_ref_freq, cpu),
-					freq->new);
+		global_l_p_j_max_freq = freq->new;
 	}
 	return NOTIFY_OK;
 }
-- 
1.9.1.423.g4596e3a

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2 3/5] sh/PCI: Pass GAPSPCI_DMA_BASE CPU address to dma_declare_coherent_memory()

2014-05-07 Thread Bjorn Helgaas

[+cc Magnus, linux-sh (sorry, I forgot to tell stgit to cc you)]

On Wed, May 07, 2014 at 10:15:10AM +0200, Arnd Bergmann wrote:
> On Wednesday 07 May 2014 09:55:16 Arnd Bergmann wrote:
> > On Tuesday 06 May 2014 16:48:33 Bjorn Helgaas wrote:
> > > @@ -51,8 +53,12 @@ static void gapspci_fixup_resources(struct pci_dev *dev)
> > > /*
> > >  * Redirect dma memory allocations to special memory window.
> > >  */
> > > +   region.start = GAPSPCI_DMA_BASE;
> > > +   region.end = GAPSPCI_DMA_BASE + GAPSPCI_DMA_SIZE - 1;
> > > +   res.flags = IORESOURCE_MEM;
> > > +   pcibios_bus_to_resource(dev->bus, &res, &region);
> > > BUG_ON(!dma_declare_coherent_memory(&dev->dev,
> > > -   GAPSPCI_DMA_BASE,
> > > +   res->start,
> > > GAPSPCI_DMA_BASE,
> > > GAPSPCI_DMA_SIZE,
> > > DMA_MEMORY_MAP |
> > 
> > Not sure if this is an improvement. I understand the intention, but
> > it's actually possible for the offset to be different both ways:
> > 
> > Your patch applies the outbound mem_offset that was provided when
> > registering the MMIO resource for the PCI host bridge. What the driver
> > needs instead is the inbound DMA offset that gets applied by some
> > host bridges that don't have a 1:1 mapping but also don't have an
> > IOMMU.

I don't understand where the inbound DMA offset comes into play.  As I
understand it, we're talking about a region of memory on the PCI bus,
not in system RAM, and there are two ways to access it: (1) the CPU
makes MMIO accesses to it, and (2) the device makes DMA accesses to
it.

The CPU accesses go through the host bridge MMIO aperture and will
have the outbound mem_offset applied to them.  The DMA accesses are
handled by the device itself and there's no host bridge or IOMMU or
inbound DMA offset involved.

I think there are two reasonable ways a PCI driver could use
dma_declare_coherent_memory():

  1) The bus address of the region might be in a BAR.  In that case,
  it should pass pci_resource_start() (the CPU phys_addr_t) as the
  first address, and pci_bus_address(pci_resource_start()) (the bus
  dma_addr_t) as the second.

  2) The bus address of the region might be discovered from the device
  in some non-standard way.  In this case, the driver reads the bus
  address dma_addr_t directly from the device, and it should use
  something like the pcibios_bus_to_resource() code I proposed to find
  the CPU phys_addr_t.

For this platform, all the addresses are hard-coded and there is no
outbound MMIO offset, so we can't tell which strategy to use, and on
this system it doesn't matter whether we do anything at all.

The only reason I would make a change here is because the current code
cannot be safely copied (another driver might be used in systems where
the host bridge does have an outbound MMIO offset).  If we make this
code use either strategy, it would be a clue to future users that they
should not assume the physical address is the same as the bus address.
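
To make the CPU-address versus bus-address distinction concrete, here is a
toy userspace model of a host bridge window with an outbound MMIO offset.
It is only a sketch: struct bridge_window, cpu_to_bus(), bus_to_cpu() and
in_window() are hypothetical names, and the addresses in the usage note are
invented.  The kernel equivalents are pcibios_resource_to_bus() and
pcibios_bus_to_resource(), which work on struct resource and
struct pci_bus_region rather than plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one host bridge MMIO window; not a kernel type. */
struct bridge_window {
	uint64_t cpu_start;	/* CPU physical address where the window begins */
	uint64_t size;		/* window length in bytes */
	uint64_t offset;	/* CPU address minus bus address */
};

/* True if a CPU physical address falls inside the window. */
static int in_window(const struct bridge_window *w, uint64_t cpu_addr)
{
	assert(w != 0);
	return cpu_addr >= w->cpu_start && cpu_addr - w->cpu_start < w->size;
}

/* Strategy 1: start from the CPU address and derive the bus address. */
static uint64_t cpu_to_bus(const struct bridge_window *w, uint64_t cpu_addr)
{
	return cpu_addr - w->offset;
}

/* Strategy 2: start from the bus address and derive the CPU address. */
static uint64_t bus_to_cpu(const struct bridge_window *w, uint64_t bus_addr)
{
	return bus_addr + w->offset;
}
```

With a window whose offset is 0x80000000, bus address 0x10000000 appears
to the CPU at 0x90000000, so passing the same value for both arguments of
dma_declare_coherent_memory() would be wrong.  With an offset of zero, as
on this platform, both translations are the identity.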

I changed the patch (below) to add a comment and to use the first
strategy (though there isn't an actual BAR, so we can't do it exactly
that way).

> > We know that on this particular platform, both are zero, so
> > the original code works and it will keep working with your change,
> > but I think it's a mistake anyway. I have seen both kinds of offsets
> > in the past on real machines, but I am not aware of any where they
> > are both nonzero and have the same value.
> 
> Hmm, looking at it again, it seems even weirder: the address we register
> for DMA allocations is the same that gets passed into the PCI host
> for MMIO resources. Something is very strange here. Still, I'd rather
> not touch it at all, whatever it does.

It's definitely strange.  It looks to me like the memory on the device
takes up the entire 32KB host bridge MMIO aperture.  I don't know what
driver uses this, but the device must be programmed via I/O ports,
which sort of makes sense given the previous lines in this quirk that
set the BAR1 resource to be in the I/O aperture.

Bjorn

sh/PCI: Pass GAPSPCI_DMA_BASE CPU & bus address to dma_declare_coherent_memory()

From: Bjorn Helgaas 

dma_declare_coherent_memory() needs both the CPU physical address and the
bus address of the device memory.  They are the same on this platform, but
in general we should use pcibios_resource_to_bus() to account for any
address translation done by the PCI host bridge.

This makes no difference on Dreamcast, but is safer if the usage is copied
to future drivers.

Signed-off-by: Bjorn Helgaas 
CC: Magnus Damm 
CC: linux...@vger.kernel.org
---
 arch/sh/drivers/pci/fixups-dreamcast.c |   18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

[RFC PATCH 0/2] iommu: Expose IOMMU information in sysfs

2014-05-07 Thread Alex Williamson

Users want to know the features of their hardware and we need a better
way to get it than parsing it out of dmesg.  This series adds a simple
registration interface for IOMMUs and an example base implementation
for intel-iommu.

One key hardware feature for device assignment is the IOMMU support
for superpages.  In this example, we can parse it out of the "cap"
attribute exposed, but I expect we'll want to add more human readable
entries for such features.  I'd welcome suggestions on what features
we should pull out into human friendly attributes and how to format
the contents.
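
As a rough illustration of what "parsing it out of cap" could look like
from userspace, the sketch below decodes superpage support from a
capability value given as a hex string.  It rests on assumptions that are
not confirmed by this series: that the attribute holds the raw VT-d
capability register printed in hex, and that large-page support sits in
bits 37:34 (bit 34 for 2MB, bit 35 for 1GB), as in the kernel's
cap_super_page_val() macro.  The helper names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a hex string (with or without "0x") into a 64-bit cap value. */
static int parse_cap(const char *text, uint64_t *cap)
{
	char *end;
	unsigned long long v;

	assert(text != 0 && cap != 0);
	errno = 0;
	v = strtoull(text, &end, 16);
	if (errno != 0 || end == text)
		return -1;	/* overflow or no hex digits at all */
	*cap = (uint64_t)v;
	return 0;
}

/* Assumed layout: large-page support field in bits 37:34. */
static unsigned int cap_superpage_bits(uint64_t cap)
{
	return (unsigned int)((cap >> 34) & 0xf);
}

/* Assumed meaning: bit 0 of the field is 2MB pages, bit 1 is 1GB pages. */
static int supports_2m(uint64_t cap)
{
	return (cap_superpage_bits(cap) & 0x1) != 0;
}

static int supports_1g(uint64_t cap)
{
	return (cap_superpage_bits(cap) & 0x2) != 0;
}
```

A tool would read the attribute's contents into a buffer, call
parse_cap(), and then report supports_2m() and supports_1g().  Needing to
know this bit layout is exactly the burden a human readable attribute
would remove.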

I have not attempted to make a common, consistent interface for
attributes between various IOMMU types here.  I'm not entirely sure
such a thing is possible.  Perhaps instead we do like I show in the
intel-iommu example and provide IOMMU driver specific attribute
groups, clearly labeled so that we effectively give each a namespace.
We can promote consistency between drivers, but a common namespace
is probably best left to userspace tools.

Appreciate any thoughts and comments.  Thanks,

Alex

---

Alex Williamson (2):
  iommu: Add sysfs support for IOMMUs
  iommu/intel: Make use of IOMMU sysfs support


 drivers/iommu/Makefile      |    1
 drivers/iommu/dmar.c        |    8 ++
 drivers/iommu/intel-iommu.c |   75 ++
 drivers/iommu/iommu-sysfs.c |  147 +++
 include/linux/intel-iommu.h |    2 +
 include/linux/iommu.h       |   28
 6 files changed, 260 insertions(+), 1 deletion(-)
 create mode 100644 drivers/iommu/iommu-sysfs.c
