Re: [LEDE-DEV] DHCP via bridge in case of IPv4

2016-08-15 Thread Alexey Brodkin
Hello,

On Mon, 2016-07-11 at 06:15 +, Alexey Brodkin wrote:
> Hi Russell,
> 
> On Sun, 2016-07-10 at 00:19 -0700, Russell Senior wrote:
> > 
> > > > > > > "Alexey" == Alexey Brodkin  writes:
> > Alexey> Hi Aaron,
> > Alexey> On Sat, 2016-07-09 at 07:47 -0400, Aaron Z wrote:
> > > 
> > > > On Sat, Jul 9, 2016 at 4:37 AM, Alexey Brodkin
> > > >  wrote:
> > > > > 
> > > > > Hello,
> > > > > 
> > > > > I was playing with quite simple bridged setup on different boards
> > > > > with very recent kernels (4.6.3 as of this writing) and found one
> > > > > interesting behavior that I cannot yet understand and googling
> > > > > didn't help here as well.
> > > > > 
> > > > > My setup is pretty simple:
> > > > > 
> > > > > -------------   ------------------------   -------------------------
> > > > > | HOST      |   | "Dumb AP"            |   | Wireless client       |
> > > > > | with DHCP |<->(eth0)           (wlan0)<->| attempting to         |
> > > > > | server    |   |\        br0         / |   | get settings via DHCP |
> > > > > -------------   ------------------------   -------------------------
> > > > > 
> > > > > * HOST is my laptop with DHCP server that works for sure.
> > > > > * "Dumb AP" is a separate board (I tried ARM-based Wandboard and ARC-based
> > > > >   AXS10x boards but results are exactly the same) with wired (eth0) and wireless
> > > > >   (wlan0) network controllers bridged together (br0). That "br0" bridge flawlessly
> > > > >   gets its settings from DHCP server on host.
> > > > > * Wireless client could be either a smartphone or another laptop etc but
> > > > >   what's important is that it should be configured to get network settings
> > > > >   by DHCP as well.
> > > > > 
> > > > > So what happens is that "br0" always gets network settings from DHCP server
> > > > > on HOST. That's fine. But wireless client only reliably gets settings from
> > > > > DHCP server if IPv6 is enabled on "Dumb AP" board. If IPv6 is disabled I may
> > > > > see that wireless client sends "DHCP Discover" then server replies with
> > > > > "DHCP Offer" but that offer never reaches wireless client.
> > > > 
> > > > Do you have WDS enabled? If not, DHCP has issues in that scenario:
> > > > https://wiki.openwrt.org/doc/howto/clientmode
> > If the Dumb AP's wireless interface is in ap-mode, then this shouldn't
> > be an issue.  It's only client-mode interfaces that have trouble with 
> > bridging.
> > 
> > I'd suggest running tcpdump on the Dumb AP's wireless interface and the
> > client's wireless interface and see which of them sees the various parts
> > of the DHCP handshake.
> 
> So I did but for DHCP server and wireless client (had no tcpdump on Dumb AP
> at the moment).
> 
> That's what I see on the server:
> ->8---
> No. Time         Source     Destination      Protocol Length Info
>   3 0.151181000  0.0.0.0    255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
>  11 2.760796000  10.42.0.1  10.42.0.13       DHCP     342    DHCP Offer    - Transaction ID 0x31dc321f
>  14 5.220985000  0.0.0.0    255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
>  15 5.22115      10.42.0.1  10.42.0.13       DHCP     342    DHCP Offer    - Transaction ID 0x31dc321f
>  23 15.649835000 0.0.0.0    255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
>  24 15.650017000 10.42.0.1  10.42.0.13       DHCP     342    DHCP Offer    - Transaction ID 0x31dc321f
>  32 25.648589000 0.0.0.0    255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
>  33 25.648758000 10.42.0.1  10.42.0.13       DHCP     342    DHCP Offer    - Transaction ID 0x31dc321f
>  43 35.864567000 0.0.0.0    255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
>  48 38.832837000 10.42.0.1  10.42.0.13       DHCP     342    DHCP Offer    - Transaction ID 0x31dc321f
> ->8---
> 
> That's on the wireless client:
> ->8---
> No.  Time          Source   Destination      Protocol Length Info
> 1171 94.192971000  0.0.0.0  255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 1182 99.263686000  0.0.0.0  255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 1185 109.692642000 0.0.0.0  255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 1186 119.691474000 0.0.0.0  255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 1190 129.907507000 0.0.0.0  255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> ->8---
> 
> I'll try to capture data from Dumb AP sometime soon and will reply to the 
> thread.

So finally after quite some time I figured out what happens in my setup.
Basically it all boils down to the 

Re: [RFC][PATCHSET v2] allowing exports in *.S

2016-08-15 Thread Michal Marek
On 16.8.2016 at 07:48, Michal Marek wrote:
> On 2.8.2016 at 16:01, Michal Marek wrote:
>> On 2016-02-03 22:19, Al Viro wrote:
>>> Shortlog:
>>> Al Viro (13):
>>>   [kbuild] handle exports in lib-y objects reliably
>>>   EXPORT_SYMBOL() for asm
>>>   x86: move exports to actual definitions
>>>   alpha: move exports to actual definitions
>>>   m68k: move exports to definitions
>>>   s390: move exports to definitions
>>>   arm: move exports to definitions
>>>   ppc: move exports to definitions
>>>   ppc: get rid of unreachable abs() implementation
>>>   sparc: move exports to definitions
>>>   [sparc] unify 32bit and 64bit string.h
>>>   sparc32: debride memcpy.S a bit
>>>   ia64: move exports to definitions
>>
>> After several pings by Al (sorry about that!), I got around to review a
>> rebased version of this patchset at
>>
>>   git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git asm-exports
>>
>> The kbuild commits are good, but since we are close to the end of the
>> merge window, I will apply them to my kbuild branch after 4.8-rc1.
> 
> The rebased patchset is now in kbuild.git#kbuild. Before pushing, I
> noticed one issue: For some reason,
> drivers/firmware/efi/libstub/lib-ksyms.o is regenerated each time,
> leading to relink of vmlinux. I'm looking into this.

OK, it's the

$(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
	$(call if_changed_rule,cc_o_c)

rule in drivers/firmware/efi/libstub/Makefile file that conflicts with
the lib-ksyms.o rule. I need to find a better solution to this hack.

Michal


Re: [PATCH v6 05/11] mm, compaction: add the ultimate direct compaction priority

2016-08-15 Thread Joonsoo Kim
On Wed, Aug 10, 2016 at 11:12:20AM +0200, Vlastimil Babka wrote:
> During reclaim/compaction loop, it's desirable to get a final answer from
> unsuccessful compaction so we can either fail the allocation or invoke the OOM
> killer. However, heuristics such as deferred compaction or pageblock skip bits
> can cause compaction to skip parts or whole zones and lead to premature OOM's,
> failures or excessive reclaim/compaction retries.
> 
> To remedy this, we introduce a new direct compaction priority called
> COMPACT_PRIO_SYNC_FULL, which instructs direct compaction to:
> 
> - ignore deferred compaction status for a zone
> - ignore pageblock skip hints
> - ignore cached scanner positions and scan the whole zone
> 
> The new priority should get eventually picked up by should_compact_retry() and
> this should improve success rates for costly allocations using __GFP_REPEAT,
> such as hugetlbfs allocations, and reduce some corner-case OOM's for
> non-costly allocations.
> 
> Signed-off-by: Vlastimil Babka 
> Acked-by: Michal Hocko 
> ---
>  include/linux/compaction.h | 3 ++-
>  mm/compaction.c            | 5 ++++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
> index e88c037afe47..a1fba9994728 100644
> --- a/include/linux/compaction.h
> +++ b/include/linux/compaction.h
> @@ -6,8 +6,9 @@
>   * Lower value means higher priority, analogically to reclaim priority.
>   */
>  enum compact_priority {
> + COMPACT_PRIO_SYNC_FULL,
> + MIN_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_FULL,
>   COMPACT_PRIO_SYNC_LIGHT,
> - MIN_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_LIGHT,
>   DEF_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_LIGHT,
>   COMPACT_PRIO_ASYNC,
>   INIT_COMPACT_PRIORITY = COMPACT_PRIO_ASYNC
> diff --git a/mm/compaction.c b/mm/compaction.c
> index a144f58f7193..ae4f40afcca1 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1644,6 +1644,8 @@ static enum compact_result compact_zone_order(struct zone *zone, int order,
>   .alloc_flags = alloc_flags,
>   .classzone_idx = classzone_idx,
>   .direct_compaction = true,
> + .whole_zone = (prio == COMPACT_PRIO_SYNC_FULL),
> + .ignore_skip_hint = (prio == COMPACT_PRIO_SYNC_FULL)
>   };
>   INIT_LIST_HEAD();
>   INIT_LIST_HEAD();
> @@ -1689,7 +1691,8 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
>   ac->nodemask) {
>   enum compact_result status;
>  
> - if (compaction_deferred(zone, order)) {
> + if (prio > COMPACT_PRIO_SYNC_FULL
> + && compaction_deferred(zone, order)) {
>   rc = max_t(enum compact_result, COMPACT_DEFERRED, rc);
>   continue;

Could we pass prio to compaction_deferred() and make the decision in
that function?

BTW, in kcompactd, compaction_deferred() is checked but
.ignore_skip_hint=true. Is there any reason? If we can remove
compaction_deferred() for kcompactd, we can check .ignore_skip_hint
to determine if defer is needed or not.

Thanks.
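
The first suggestion above could look roughly like the following userspace sketch. It is only an illustration under stated assumptions: `struct zone_stub` and `compaction_deferred_prio()` are hypothetical names that do not exist in the kernel, and the real compaction_deferred() tracks considerably more state than a single flag. The enum is copied from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the quoted patch: lower value means higher priority. */
enum compact_priority {
	COMPACT_PRIO_SYNC_FULL,
	MIN_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_FULL,
	COMPACT_PRIO_SYNC_LIGHT,
	DEF_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_LIGHT,
	COMPACT_PRIO_ASYNC,
	INIT_COMPACT_PRIORITY = COMPACT_PRIO_ASYNC
};

/* Hypothetical stand-in for the per-zone deferral state. */
struct zone_stub {
	bool deferred;
};

/*
 * Hypothetical variant of compaction_deferred() that takes the priority,
 * so callers no longer open-code the "prio > COMPACT_PRIO_SYNC_FULL" test.
 */
static bool compaction_deferred_prio(const struct zone_stub *zone,
				     enum compact_priority prio)
{
	assert(prio <= INIT_COMPACT_PRIORITY);

	/* The ultimate priority ignores the deferred status entirely. */
	if (prio == COMPACT_PRIO_SYNC_FULL)
		return false;

	return zone->deferred;
}
```

With such a helper, the check in try_to_compact_pages() would shrink back to a single call, and the policy would live next to the deferral logic.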


Re: [Query] increased latency observed in cpu hotplug path

2016-08-15 Thread Khan, Imran
On 8/5/2016 12:49 PM, Khan, Imran wrote:
> On 8/1/2016 2:58 PM, Khan, Imran wrote:
>> On 7/30/2016 7:54 AM, Akinobu Mita wrote:
>>> 2016-07-28 22:18 GMT+09:00 Khan, Imran :
>>>>
>>>> Hi,
>>>>
>>>> Recently we have observed some increased latency in CPU hotplug
>>>> event in CPU online path. For online latency we see that block
>>>> layer is executing notification handler for CPU_UP_PREPARE event
>>>> and this in turn waits for RCU grace period resulting (sometimes)
>>>> in an execution time of 15-20 ms for this notification handler.
>>>> This change was not there in 3.18 kernel but is present in 4.4
>>>> kernel and was introduced by following commit:
>>>>
>>>>
>>>> commit 5778322e67ed34dc9f391a4a5cbcbb856071ceba
>>>> Author: Akinobu Mita 
>>>> Date:   Sun Sep 27 02:09:23 2015 +0900
>>>>
>>>> blk-mq: avoid inserting requests before establishing new mapping
>>>
>>> ...
>>>
>>>> Upon reverting this commit I could see an improvement of 15-20 ms in
>>>> online latency. So I am looking for some help in analyzing the effects
>>>> of reverting this, or whether some other approach to reduce the online
>>>> latency should be taken.
>>>
>>> Can you observe the difference in online latency by removing the
>>> get_online_cpus() and put_online_cpus() pair in blk_mq_init_allocated_queue()
>>> instead of fully reverting the commit?
>>>
>> Hi Akinobu,
>> I tried your suggestion but could not achieve any improvement. Actually the
>> snippet that is causing the change in latency is the following one:
>>
>>         list_for_each_entry(q, &all_q_list, all_q_node) {
>>                 blk_mq_freeze_queue_wait(q);
>>
>>                 /*
>>                  * timeout handler can't touch hw queue during the
>>                  * reinitialization
>>                  */
>>                 del_timer_sync(&q->timeout);
>>         }
>>
>> I understand that this is now executed for CPU_UP_PREPARE as well,
>> resulting in increased latency in the cpu online path. I am trying to
>> reduce this latency while keeping the purpose of this commit intact.
>> I would welcome further suggestions/feedback in this regard.
>>
> Hi Akinobu,
> 
> I am not able to reduce the cpu online latency with this patch. Could you
> please let me know what functionality will be broken if we avoid this patch
> in our kernel? Also, if you have some other suggestions towards improving
> this patch, please let me know.
> 
After moving the remapping of queues to the block layer's kworker, I see that
online latency has improved while offline latency remains the same. As the
freezing of queues happens in the context of the block layer's worker, I think
it would be better to do the remapping in the same context and then go ahead
with freezing. In this regard I have made the following change:

commit b2131b86eeef4c5b1f8adaf7a53606301aa6b624
Author: Imran Khan 
Date:   Fri Aug 12 19:59:47 2016 +0530

blk-mq: Move block queue remapping from cpu hotplug path

During a cpu hotplug, the hardware and software contexts mappings
need to be updated in order to take into account requests
submitted for the hotadded CPU. But if this mapping is done
in hotplug notifier, it deteriorates the hotplug latency.
So move the block queue remapping to block layer worker which
results in significant improvements in hotplug latency.

Change-Id: I01ac83178ce95c3a4e3b7b1b286eda65ff34e8c4
Signed-off-by: Imran Khan 

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6d6f8fe..06fcf89 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -22,7 +22,11 @@
 #include 
 #include 
 #include 
-
+#include 
+#include 
+#include 
+#include 
+#include 
 #include 

 #include 
@@ -32,10 +36,18 @@

 static DEFINE_MUTEX(all_q_mutex);
 static LIST_HEAD(all_q_list);
-
 static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx);

 /*
+ * New online cpumask which is going to be set in this hotplug event.
+ * Declare this cpumasks as global as cpu-hotplug operation is invoked
+ * one-by-one and dynamically allocating this could result in a failure.
+ */
+static struct cpumask online_new;
+
+static struct work_struct blk_mq_remap_work;
+
+/*
  * Check if any of the ctx's have pending work in this hardware queue
  */
 static bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx)
@@ -2125,14 +2137,7 @@ static void blk_mq_queue_reinit(struct request_queue *q,
 static int blk_mq_queue_reinit_notify(struct notifier_block *nb,
  unsigned long action, void *hcpu)
 {
-   struct request_queue *q;
int cpu = (unsigned long)hcpu;
-   /*
-* New online cpumask which is going to be set in this hotplug event.
-* Declare this cpumasks as global as cpu-hotplug operation is invoked
-* one-by-one and dynamically allocating this could result in a failure.
-*/
-   static struct cpumask online_new;

/*
 * Before hotadded cpu starts handling requests, new mappings must
@@ -2155,43 +2160,17 @@ static int 
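
The idea in the commit message above, where the hotplug callback only records the new CPU mask and the expensive remapping runs later in worker context, can be modelled in plain userspace C. This is a single-threaded sketch under stated assumptions, not the patch itself: `struct remap_state`, `hotplug_notify()` and `remap_worker()` are hypothetical names, the CPU masks are plain bitmasks, and no locking or real workqueue is involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
struct remap_state {
	unsigned long online_new;  /* bitmask of CPUs that will be online */
	unsigned long mapped;      /* bitmask the queues are currently mapped for */
	bool work_pending;         /* true while a remap is queued for the worker */
};

/*
 * Hotplug callback: record the new mask and queue the work, nothing slow.
 * cpu must be smaller than the number of bits in unsigned long.
 */
static void hotplug_notify(struct remap_state *s, unsigned int cpu, bool up)
{
	if (up)
		s->online_new |= 1UL << cpu;
	else
		s->online_new &= ~(1UL << cpu);
	s->work_pending = true;
}

/*
 * Worker context: the expensive freeze and remap would happen here.
 * Returns true if a remap was actually performed.
 */
static bool remap_worker(struct remap_state *s)
{
	if (!s->work_pending)
		return false;
	s->mapped = s->online_new;
	s->work_pending = false;
	return true;
}
```

Note that between hotplug_notify() and remap_worker() the queues are still mapped for the old mask. The sketch does not model what happens to requests submitted in that window, which is the case the commit titled "blk-mq: avoid inserting requests before establishing new mapping" is concerned with.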

Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-15 Thread Borislav Petkov
On Mon, Aug 15, 2016 at 01:47:31PM -0500, Alex Thorlton wrote:
> The only thing we're adding here is the physical mappings, to match
> what is available in the primary kernel.

I can see what it does - I am just questioning the reasoning for it, as we
made all that effort so that kexec can have stable virtual mappings.

I guess we still need a way to pass the virtual mappings to kexec,
as they're immutable because some "smartass" decided to allow calling
SetVirtualAddressMap only once.

> This is sort of a hand-wavy answer - I will investigate this further...

Yeah, it'll be interesting to know whether that is an issue because if
we do the 1:1 mappings in the kexec kernel too and there's an address
conflict, then we better know upfront.

> It's not that we need it all of a sudden, necessarily, it's just that
> we've had to make other changes to make things work with the new,
> (almost) completely isolated, EFI page tables.  We ended up choosing the
> lesser of two evils, and have decided to temporarily rely on the
> physical address of our runtime code, instead of continuing to rely on
> EFI_OLD_MEMMAP.

Well, if it starts to cause trouble, you probably will have to revert.

> If there are strong objections to this change, I won't pursue it
> further.

I don't really care all that much as long as it doesn't break the
existing situation. I've long given up on the hope that EFI and all its
incarnations will hold on to some spec... :-)

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCH 1/2] ARM: dts: imx7d: move CPU operating points to imx7d.dtsi

2016-08-15 Thread Shawn Guo
On Thu, Aug 11, 2016 at 05:11:06PM -0700, Stefan Agner wrote:
> Only i.MX 7Dual SoC supports CPU frequencies of up to 1GHz. The i.MX
> 7Solo can run with up to 800MHz and does so without making use of DVFS
> usually. While the device tree clearly specified a too fast operating
> point for i.MX 7Solo, the kernel did not use it in practice so far
> because the CPUfreq driver does not get loaded on i.MX 7Solo devices
> (since the fsl,imx7s compatible string is not in the list of devices
> making use of the cpufreq-dt driver...).
> 
> Signed-off-by: Stefan Agner 
> ---
> Hi Shawn,
> 
> This is based on my earlier patchset:
> ARM: dts: imx7d: move ARM platform peripherals inside soc
> 
> These are kind of fixes too, so if possible I would like to see them
> in v4.8, what do you think?

Patch "ARM: dts: imx7d: move ARM platform peripherals inside soc node"
is not really a fix, and the diffstat looks too dramatic to be -rc
material, so I queued it as a -next patch, and any patch based on it
will have to go through -next as well.

Applied for -next, thanks.

Shawn

> 
> --
> Stefan
> 
>  arch/arm/boot/dts/imx7d.dtsi | 8 ++++++++
>  arch/arm/boot/dts/imx7s.dtsi | 5 -----
>  2 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm/boot/dts/imx7d.dtsi b/arch/arm/boot/dts/imx7d.dtsi
> index 3d77d95..d0b199c 100644
> --- a/arch/arm/boot/dts/imx7d.dtsi
> +++ b/arch/arm/boot/dts/imx7d.dtsi
> @@ -45,6 +45,14 @@
>  
>  / {
>  	cpus {
> +		cpu0: cpu@0 {
> +			operating-points = <
> +				/* KHz	uV */
> +				996000	1075000
> +				792000	975000
> +			>;
> +		};
> +
>  		cpu1: cpu@1 {
>  			compatible = "arm,cortex-a7";
>  			device_type = "cpu";
> diff --git a/arch/arm/boot/dts/imx7s.dtsi b/arch/arm/boot/dts/imx7s.dtsi
> index c63591c..5132e2f 100644
> --- a/arch/arm/boot/dts/imx7s.dtsi
> +++ b/arch/arm/boot/dts/imx7s.dtsi
> @@ -85,11 +85,6 @@
>  			compatible = "arm,cortex-a7";
>  			device_type = "cpu";
>  			reg = <0>;
> -			operating-points = <
> -				/* KHz	uV */
> -				996000	1075000
> -				792000	975000
> -			>;
>  			clock-latency = <61036>; /* two CLK32 periods */
>  			clocks = < IMX7D_CLK_ARM>;
>  		};
> -- 
> 2.9.0
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
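
As a rough illustration of why the 996000 kHz entry only belongs in imx7d.dtsi, the sketch below picks the fastest operating point that fits under a given frequency ceiling. The table values are the ones quoted in the patch; `struct opp`, `fastest_opp()` and the ceilings used are hypothetical names for this example and do not reflect how cpufreq-dt itself selects operating points.

```c
#include <stddef.h>
#include <stdint.h>

/* One operating point: CPU frequency in kHz and supply voltage in uV. */
struct opp {
	uint32_t khz;
	uint32_t uv;
};

/* The two operating points quoted in the patch. */
static const struct opp imx7d_opps[] = {
	{ 996000, 1075000 },
	{ 792000,  975000 },
};

/*
 * Hypothetical helper: return the fastest entry whose frequency does not
 * exceed max_khz, or NULL if every entry is too fast.
 */
static const struct opp *fastest_opp(const struct opp *table, size_t n,
				     uint32_t max_khz)
{
	const struct opp *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (table[i].khz > max_khz)
			continue;	/* too fast for this SoC */
		if (!best || table[i].khz > best->khz)
			best = &table[i];
	}
	return best;
}
```

With a ceiling of 800000 kHz (the 7Solo limit mentioned in the commit message) only the 792000 kHz entry qualifies, while a 1 GHz ceiling admits the 996000 kHz entry as well.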


Re: [PATCH v6 0/2] Block layer support ZAC/ZBC commands

2016-08-15 Thread Shaun Tancheff
On Mon, Aug 15, 2016 at 11:00 PM, Damien Le Moal  wrote:
>
> Shaun,
>
>> On Aug 14, 2016, at 09:09, Shaun Tancheff  wrote:
> […]

>>> No, surely not.
>>> But one of the _big_ advantages for the RB tree is blkdev_discard().
>>> Without the RB tree any mkfs program will issue a 'discard' for every
>>> sector. We will be able to coalesce those into one discard per zone, but
>>> we still need to issue one for _every_ zone.
>>
>> How can you make coalesce work transparently in the
>> sd layer _without_ keeping some sort of a discard cache along
>> with the zone cache?
>>
>> Currently the block layer's blkdev_issue_discard() is breaking
>> large discards into nice granular and aligned chunks but it is
>> not preventing small discards nor coalescing them.
>>
>> In the sd layer would there be a way to persist or purge an
>> overly large discard cache? What about honoring
>> discard_zeroes_data? Once the discard is completed with
>> discard_zeroes_data you have to return zeroes whenever
>> a discarded sector is read. Isn't that a lot more than just
>> tracking a write pointer? Couldn't a zone have dozens of holes?
>
> My understanding of the standards regarding discard is that it is not
> mandatory and that it is a hint to the drive. The drive can completely
> ignore it if it thinks that is a better choice. I may be wrong on this
> though. Need to check again.

But you are currently setting discard_zeroes_data=1 in your
current patches. I believe that setting discard_zeroes_data=1
effectively promotes discards to being mandatory.

I have a follow on patch to my SCT Write Same series that
handles the CMR zone case in the sd_zbc_setup_discard() handler.

> For reset write pointer, the mapping to discard requires that the calls
> to blkdev_issue_discard be zone aligned for anything to happen. Specify
> less than a zone and nothing will be done. This, I think, preserves the
> discard semantics.

Oh. If that is the intent then there is just a bug in the handler.
I have pointed out where I believe it to be in my response to
the zone cache patch being posted.
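
The intended mapping described in the quote above can be sketched as a simple predicate. This is only an illustration under stated assumptions: every zone has the same size, and `discard_maps_to_reset_wp()` is a hypothetical name, not the handler from the patch set under discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: a discard of nr_sects sectors starting at 'sector'
 * is turned into Reset Write Pointer commands only when it starts on a
 * zone boundary and covers a whole number of zones. Anything smaller or
 * misaligned is treated as a no-op.
 *
 * Assumes fixed-size zones of zone_sects sectors.
 */
static bool discard_maps_to_reset_wp(uint64_t sector, uint64_t nr_sects,
				     uint64_t zone_sects)
{
	if (zone_sects == 0 || nr_sects == 0)
		return false;

	return sector % zone_sects == 0 && nr_sects % zone_sects == 0;
}
```

When the predicate holds, the number of zones to reset is simply `nr_sects / zone_sects`.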

> As for the “discard_zeroes_data” thing, I also think that is a drive
> feature not mandatory. Drives may have it or not, which is consistent
> with the ZBC/ZAC standards regarding reading after write pointer (nothing
> says that zeros have to be returned). In any case, discard of CMR zones
> will be a nop, so for SMR drives, discard_zeroes_data=0 may be a better
> choice.

However I am still curious about discards being coalesced.

>>> Which is (as indicated) really slow, and easily takes several minutes.
>>> With the RB tree we can short-circuit discards to empty zones, and speed
>>> up processing time dramatically.
>>> Sure we could be moving the logic into mkfs and friends, but that would
>>> require us to change the programs and agree on a library (libzbc?) which
>>> should be handling that.
>>
>> F2FS's mkfs.f2fs is already reading the zone topology via SG_IO ...
>> so I'm not sure your argument is valid here.
>
> This initial SMR support patch is just that: a first try. Jaegeuk
> used SG_IO (in fact copy-paste of parts of libzbc) because the current
> ZBC patch-set has no ioctl API for zone information manipulation. We
> will fix this mkfs.f2fs once we agree on an ioctl interface.

Which again is my point. If mkfs.f2fs wants to speed up its
discard pass by _not_ sending unnecessary
Reset WP for zones that are already empty it has all the
information it needs to do so.
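
A minimal sketch of that idea, assuming the tool already holds a zone report in memory: only sequential zones whose write pointer has moved need a Reset WP. The `struct zone_info` layout and `zones_needing_reset()` are hypothetical and do not correspond to libzbc or to any ioctl interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical zone descriptor; field names are illustrative only. */
struct zone_info {
	uint64_t start;		/* first sector of the zone */
	uint64_t len;		/* zone length in sectors */
	uint64_t wp;		/* current write pointer (sector) */
	int sequential;		/* non-zero for sequential-write zones */
};

/*
 * Collect the indices of the zones that actually need a Reset Write
 * Pointer. Returns the number of indices stored in 'out', which must
 * have room for 'nr' entries.
 */
static size_t zones_needing_reset(const struct zone_info *zones, size_t nr,
				  size_t *out)
{
	size_t i, n = 0;

	for (i = 0; i < nr; i++) {
		if (!zones[i].sequential)
			continue;	/* conventional zone: no write pointer */
		if (zones[i].wp == zones[i].start)
			continue;	/* already empty, nothing to reset */
		out[n++] = i;
	}
	return n;
}
```

The design point is that the decision uses only information the tool has already read from the drive, so no extra state is needed in the kernel for this particular optimisation.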

Here it seems to me that the zone cache is _at_best_
doing double work. At worst the zone cache could be
doing the wrong thing _if_ the zone cache got out of sync.
It is certainly possible (however unlikely) that someone was
doing some raw sg activity that is not seen by the sd path.

All I am trying to do is have a discussion about the reasons for
and against having a zone cache, where it works and where it breaks.
This should be entirely technical, but I understand that we have all
spent a lot of time _not_ discussing this for various non-technical
reasons.

So far the only reason I've been able to ascertain is that
Host Managed drives really don't like being stuck with the
URSWRZ and would like to have a software hack to return
MUD rather than ship drives with some weird out-of-the-box
config where the last zone is marked as FINISH'd thereby
returning MUD on reads as per spec.

I understand that it would be a strange state to see on first
boot and likely people would just do a ResetWP and have
weird boot errors, which would probably just make matters
worse.

I just would rather the workaround be a bit cleaner and/or
use less memory. I would also like a path available that
does not require SD_ZBC or BLK_ZONED for Host Aware
drives to work, hence this set of patches and me begging
for a single bit in struct bio.

>>
>> [..]
>>
> 3) Try to condense the blkzone data structure to save memory:
> I think that we can at the very least remove the zone length, and also
> may be 

Re: [RFC][PATCHSET v2] allowing exports in *.S

2016-08-15 Thread Michal Marek
On 2.8.2016 at 16:01, Michal Marek wrote:
> On 2016-02-03 22:19, Al Viro wrote:
>> Shortlog:
>> Al Viro (13):
>>   [kbuild] handle exports in lib-y objects reliably
>>   EXPORT_SYMBOL() for asm
>>   x86: move exports to actual definitions
>>   alpha: move exports to actual definitions
>>   m68k: move exports to definitions
>>   s390: move exports to definitions
>>   arm: move exports to definitions
>>   ppc: move exports to definitions
>>   ppc: get rid of unreachable abs() implementation
>>   sparc: move exports to definitions
>>   [sparc] unify 32bit and 64bit string.h
>>   sparc32: debride memcpy.S a bit
>>   ia64: move exports to definitions
> 
> After several pings by Al (sorry about that!), I got around to reviewing a
> rebased version of this patchset at
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git asm-exports
> 
> The kbuild commits are good, but since we are close to the end of the
> merge window, I will apply them to my kbuild branch after 4.8-rc1.

The rebased patchset is now in kbuild.git#kbuild. Before pushing, I
noticed one issue: For some reason,
drivers/firmware/efi/libstub/lib-ksyms.o is regenerated each time,
leading to relink of vmlinux. I'm looking into this.

Michal


Re: [PATCH] KEYS: fix big_key dependency

2016-08-15 Thread Stephan Mueller
On Tuesday, 16 August 2016, 00:45:39 CEST, Kirill Marinushkin wrote:

Hi Kirill,

> + select CRYPTO_ANSI_CPRNG

This change selects an RNG that no longer passes FIPS testing. Hence, 
this selection could cause an issue in FIPS mode (i.e. when booting the 
kernel with fips=1).

May I suggest CRYPTO_DRBG?

Ciao
Stephan


Re: [PATCH v6 0/5] /dev/random - a new approach

2016-08-15 Thread Stephan Mueller
On Monday, 15 August 2016, 13:42:54 CEST, H. Peter Anvin wrote:

Hi H,

> On 08/11/16 05:24, Stephan Mueller wrote:
> > * prevent fast noise sources from dominating slow noise sources
> >   in case of /dev/random
> 
> Can someone please explain if and why this is actually desirable, and if
> this assessment has been passed to someone who has actual experience
> with cryptography at the professional level?

There are two motivations for that:

- the current /dev/random is compliant with NTG.1 from AIS 20/31, which requires 
(in brief) that entropy comes from auditable noise sources. Currently in 
my LRNG only RDRAND is a fast noise source which is not auditable (and it is 
designed to cause a VM exit, making it even harder to assess). To make the 
LRNG comply with NTG.1, RDRAND can provide entropy but must not become the 
sole entropy provider, which that change now ensures.

- the current /dev/random implementation follows the same concept, with the 
exception of 3.15 and 3.16 where RDRAND was not rate-limited. In later 
versions, this was changed.

Ciao
Stephan


Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-15 Thread Borislav Petkov
On Mon, Aug 15, 2016 at 02:52:22PM -0700, H. Peter Anvin wrote:
> So to answer the implicit question: we have found UEFI stacks in the
> field which fail without the physical mappings present, and we have
> found stacks which fail without a nontrivial SetAddressMapping.

You mean SetVirtualAddressMap.

Oh well, it's not like it matters all that much as we have our own
pagetable for EFI so we can go nuts there. Apparently.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCH] perf/core: Fix the mask in perf_output_sample_regs

2016-08-15 Thread Madhavan Srinivasan



On Thursday 11 August 2016 05:57 PM, Peter Zijlstra wrote:
> Sorry, found it in my inbox while clearing out backlog..
>
> On Sun, Jul 03, 2016 at 11:31:58PM +0530, Madhavan Srinivasan wrote:
>> When decoding the perf_regs mask in perf_output_sample_regs(),
>> we loop through the mask using find_first_bit and find_next_bit functions.
>> While the existing code works fine in most cases,
>> the logic is broken for 32bit kernel (Big Endian).
>> When reading u64 mask using (u32 *)()[0], find_*_bit() assumes it gets
>> lower 32bits of u64 but instead gets upper 32bits which is wrong.
>> Proposed fix is to swap the words of the u64 to handle this case.
>> This is _not_ endianness swap.
>
> But it looks an awful lot like it..

Hit this issue when testing my perf_arch_regs patchset. Yep exactly
the reason for adding that comment in the commit message.

>> +++ b/kernel/events/core.c
>> @@ -5205,8 +5205,10 @@ perf_output_sample_regs(struct perf_output_handle *handle,
>>  			struct pt_regs *regs, u64 mask)
>>  {
>>  	int bit;
>> +	DECLARE_BITMAP(_mask, 64);
>>  
>> -	for_each_set_bit(bit, (const unsigned long *) &mask,
>> +	bitmap_from_u64(_mask, mask);
>> +	for_each_set_bit(bit, _mask,
>>  			 sizeof(mask) * BITS_PER_BYTE) {
>>  		u64 val;
>> +++ b/lib/bitmap.c
>> +void bitmap_from_u64(unsigned long *dst, u64 mask)
>> +{
>> +	dst[0] = mask & ULONG_MAX;
>> +
>> +	if (sizeof(mask) > sizeof(unsigned long))
>> +		dst[1] = mask >> 32;
>> +}
>> +EXPORT_SYMBOL(bitmap_from_u64);
>
> Looks small enough for an inline.
>
> Alternatively you can go all the way and add bitmap_from_u64array(), but
> that seems massive overkill.

Ok will make it inline and resend.

Maddy

> Tedious stuff.. I can't come up with anything prettier :/
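
To make the word-order problem concrete, here is a userspace sketch of the helper from the quoted patch together with a small bit test. On a 32-bit big-endian machine the first 32-bit word of a u64 in memory is its upper half, so casting a pointer to the u64 into an unsigned long pointer presents the words in the wrong order. Copying by value avoids that. `bitmap_test_bit()` is a hypothetical helper added for the example, and `uint64_t` stands in for the kernel's `u64`.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Copy a 64-bit mask into an array of unsigned long so that bit N of the
 * mask always lands in bit (N % BITS_PER_LONG) of word (N / BITS_PER_LONG),
 * whatever the byte order of the machine. 'dst' must have room for two
 * words so that the same caller works on 32-bit and 64-bit ABIs.
 */
static void bitmap_from_u64(unsigned long *dst, uint64_t mask)
{
	dst[0] = (unsigned long)(mask & ULONG_MAX);

	/* On 32-bit ABIs the upper half goes into the second word. */
	if (sizeof(mask) > sizeof(unsigned long))
		dst[1] = (unsigned long)(mask >> 32);
}

/* Hypothetical helper: test bit 'nr' of a bitmap laid out as above. */
static int bitmap_test_bit(const unsigned long *map, unsigned int nr)
{
	const unsigned int bits_per_long = sizeof(unsigned long) * CHAR_BIT;

	return (int)((map[nr / bits_per_long] >> (nr % bits_per_long)) & 1UL);
}
```

Because the conversion works on values rather than on the in-memory representation, it is a word-order fix and not a byte swap, which is the distinction the commit message tries to draw.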





[PATCH v2 8/8] power: ds2760_battery: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "monitor_wqueue" is used to monitor the battery
status. It has been identity converted.

It queues multiple work items viz &di->monitor_work,
&di->set_charged_work, which require execution ordering.
Hence, alloc_ordered_workqueue has been used to replace the
deprecated create_singlethread_workqueue instance.

WQ_MEM_RECLAIM flag has been set to ensure forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ds2760_battery.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/power/ds2760_battery.c b/drivers/power/ds2760_battery.c
index 80f73cc..ac92e80 100644
--- a/drivers/power/ds2760_battery.c
+++ b/drivers/power/ds2760_battery.c
@@ -566,7 +566,8 @@ static int ds2760_battery_probe(struct platform_device *pdev)
 	INIT_DELAYED_WORK(&di->monitor_work, ds2760_battery_work);
 	INIT_DELAYED_WORK(&di->set_charged_work,
 			  ds2760_battery_set_charged_work);
-	di->monitor_wqueue = create_singlethread_workqueue(dev_name(&pdev->dev));
+	di->monitor_wqueue = alloc_ordered_workqueue(dev_name(&pdev->dev),
+						     WQ_MEM_RECLAIM);
 	if (!di->monitor_wqueue) {
 		retval = -ESRCH;
 		goto workqueue_failed;
--
2.1.4
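
The ordering guarantee that motivates an ordered workqueue in this series can be illustrated with a toy userspace model: items run one at a time, in the order they were queued. This is not the kernel workqueue API and it does not model WQ_MEM_RECLAIM. `struct ordered_wq`, `owq_queue()`, `owq_drain()` and the trace helpers are hypothetical names, and the two work functions merely borrow their names from the work items mentioned above.

```c
#include <stddef.h>

#define WQ_DEPTH 16

typedef void (*work_fn)(void *data);

struct work_item {
	work_fn fn;
	void *data;
};

/* Toy ordered queue: a fixed-size FIFO of pending work items. */
struct ordered_wq {
	struct work_item items[WQ_DEPTH];
	size_t head;	/* next item to run */
	size_t tail;	/* next free slot */
};

/* Queue a work item; returns 0 on success, -1 if the queue is full. */
static int owq_queue(struct ordered_wq *wq, work_fn fn, void *data)
{
	if (wq->tail - wq->head == WQ_DEPTH)
		return -1;
	wq->items[wq->tail % WQ_DEPTH] = (struct work_item){ fn, data };
	wq->tail++;
	return 0;
}

/* Run every pending item, strictly in queueing order, one at a time. */
static size_t owq_drain(struct ordered_wq *wq)
{
	size_t ran = 0;

	while (wq->head != wq->tail) {
		struct work_item w = wq->items[wq->head % WQ_DEPTH];

		wq->head++;
		w.fn(w.data);
		ran++;
	}
	return ran;
}

/* Example work items: each appends a one-letter tag to a trace buffer. */
struct trace {
	char buf[8];
	size_t len;
};

static void trace_append(struct trace *t, char tag)
{
	if (t->len < sizeof(t->buf) - 1)
		t->buf[t->len++] = tag;
}

static void monitor_work(void *data)     { trace_append(data, 'M'); }
static void set_charged_work(void *data) { trace_append(data, 'S'); }
```

A queue with no ordering guarantee could run the two kinds of work concurrently or in a different order, which is why the conversions in this series distinguish drivers with several interdependent work items from drivers with a single one.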



[PATCH v2 5/8] power: ab8500_charger: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "charger_wq" is used for the IRQs and checking HW state of
the charger. It has been identity converted.

It has multiple work items viz usb_charger_attached_work, kick_wd_work,
check_vbat_work, check_hw_failure_work, usb_charger_attached_work,
ac_work, ac_charger_attached_work, attach_work and check_usbchgnotok_work,
which require execution ordering. Hence, a dedicated ordered workqueue
has been used here.

The WQ_MEM_RECLAIM flag has also been set to ensure
forward progress under memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ab8500_charger.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/power/ab8500_charger.c b/drivers/power/ab8500_charger.c
index 30de5d4..5cee9aa 100644
--- a/drivers/power/ab8500_charger.c
+++ b/drivers/power/ab8500_charger.c
@@ -3540,8 +3540,8 @@ static int ab8500_charger_probe(struct platform_device *pdev)
 	di->usb_state.usb_current = -1;
 
 	/* Create a work queue for the charger */
-	di->charger_wq =
-		create_singlethread_workqueue("ab8500_charger_wq");
+	di->charger_wq = alloc_ordered_workqueue("ab8500_charger_wq",
+						 WQ_MEM_RECLAIM);
 	if (di->charger_wq == NULL) {
 		dev_err(di->dev, "failed to create work queue\n");
 		return -ENOMEM;
--
2.1.4



[PATCH v2 7/8] power: ab8500_fg: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "fg_wq" is used for running the FG algorithm periodically.
It has been identity converted.

It has multiple work items viz fg_periodic_work, fg_low_bat_work,
fg_reinit_work, fg_work, fg_acc_cur_work and fg_check_hw_failure_work,
which require execution ordering. Hence, a dedicated ordered workqueue
has been used here.

The WQ_MEM_RECLAIM flag has been set to guarantee forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ab8500_fg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/ab8500_fg.c b/drivers/power/ab8500_fg.c
index 5a36cf8..199f2db 100644
--- a/drivers/power/ab8500_fg.c
+++ b/drivers/power/ab8500_fg.c
@@ -3096,7 +3096,7 @@ static int ab8500_fg_probe(struct platform_device *pdev)
 	ab8500_fg_discharge_state_to(di, AB8500_FG_DISCHARGE_INIT);
 
 	/* Create a work queue for running the FG algorithm */
-	di->fg_wq = create_singlethread_workqueue("ab8500_fg_wq");
+	di->fg_wq = alloc_ordered_workqueue("ab8500_fg_wq", WQ_MEM_RECLAIM);
 	if (di->fg_wq == NULL) {
 		dev_err(di->dev, "failed to create work queue\n");
 		return -ENOMEM;
--
2.1.4



[PATCH v2 6/8] power: ipaq_micro_battery: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
The workqueue "wq" is used for handling battery related tasks.

It has a single work item viz &mb->update and hence it doesn't require
execution ordering. Hence, alloc_workqueue has been used to replace the
deprecated create_singlethread_workqueue instance.

The WQ_MEM_RECLAIM flag has been set to ensure forward progress under
memory pressure.

Since there is a single work item, explicit concurrency
limit is unnecessary here.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ipaq_micro_battery.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/ipaq_micro_battery.c b/drivers/power/ipaq_micro_battery.c
index 35b01c7..4af7b77 100644
--- a/drivers/power/ipaq_micro_battery.c
+++ b/drivers/power/ipaq_micro_battery.c
@@ -235,7 +235,7 @@ static int micro_batt_probe(struct platform_device *pdev)
 		return -ENOMEM;
 
 	mb->micro = dev_get_drvdata(pdev->dev.parent);
-	mb->wq = create_singlethread_workqueue("ipaq-battery-wq");
+	mb->wq = alloc_workqueue("ipaq-battery-wq", WQ_MEM_RECLAIM, 0);
 	if (!mb->wq)
 		return -ENOMEM;

--
2.1.4



[PATCH v2 5/8] power: ab8500_charger: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "charger_wq" is used for the IRQs and checking HW state of
the charger. It has been identity converted.

It has multiple work items viz usb_charger_attached_work, kick_wd_work,
check_vbat_work, check_hw_failure_work, usb_charger_attached_work,
ac_work, ac_charger_attached_work, attach_work and check_usbchgnotok_work,
which require execution ordering. Hence, a dedicated ordered workqueue
has been used here.

The WQ_MEM_RECLAIM flag has also been set to ensure
forward progress under memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ab8500_charger.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/power/ab8500_charger.c b/drivers/power/ab8500_charger.c
index 30de5d4..5cee9aa 100644
--- a/drivers/power/ab8500_charger.c
+++ b/drivers/power/ab8500_charger.c
@@ -3540,8 +3540,8 @@ static int ab8500_charger_probe(struct platform_device *pdev)
 	di->usb_state.usb_current = -1;

 	/* Create a work queue for the charger */
-	di->charger_wq =
-		create_singlethread_workqueue("ab8500_charger_wq");
+	di->charger_wq = alloc_ordered_workqueue("ab8500_charger_wq",
+						 WQ_MEM_RECLAIM);
 	if (di->charger_wq == NULL) {
 		dev_err(di->dev, "failed to create work queue\n");
 		return -ENOMEM;
--
2.1.4
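
For readers unfamiliar with the term, "execution ordering" above means that
queued work items run one at a time, in the order they were queued. The sketch
below is a userspace toy model of that behaviour only. It is not the kernel
workqueue implementation, and every name in it (ordered_queue, oq_queue,
oq_drain, append_tag) is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

typedef void (*work_fn)(void *ctx);

struct work_item {
	work_fn fn;
	void *ctx;
};

/* Toy stand-in for an ordered workqueue: strict FIFO, one item at a time. */
struct ordered_queue {
	struct work_item pending[MAX_PENDING];
	size_t head;	/* index of the next item to run */
	size_t tail;	/* index of the next free slot */
};

/* Queue a work item; returns 0 on success, -1 if the queue is full. */
static int oq_queue(struct ordered_queue *q, work_fn fn, void *ctx)
{
	assert(q != NULL && fn != NULL);

	if (q->tail - q->head == MAX_PENDING)
		return -1;

	q->pending[q->tail % MAX_PENDING] = (struct work_item){ fn, ctx };
	q->tail++;
	return 0;
}

/* Run every pending item in queueing order; returns how many ran. */
static size_t oq_drain(struct ordered_queue *q)
{
	size_t ran = 0;

	while (q->head != q->tail) {
		struct work_item w = q->pending[q->head % MAX_PENDING];

		q->head++;
		w.fn(w.ctx);	/* the next item starts only after this returns */
		ran++;
	}
	return ran;
}

/* Example work: append a one-character tag to a shared trace. */
struct trace {
	char log[MAX_PENDING + 1];
	size_t len;
};

struct tagged {
	struct trace *t;
	char tag;
};

static void append_tag(void *ctx)
{
	struct tagged *w = ctx;

	if (w->t->len < MAX_PENDING)
		w->t->log[w->t->len++] = w->tag;
}
```

Queueing three items tagged 'a', 'b' and 'c' and then calling oq_drain()
leaves "abc" in the trace, which is the property the ordered conversion is
meant to preserve.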



[PATCH v2 7/8] power: ab8500_fg: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "fg_wq" is used for running the FG algorithm periodically.
It has been identity converted.

It has multiple work items viz fg_periodic_work, fg_low_bat_work,
fg_reinit_work, fg_work, fg_acc_cur_work and fg_check_hw_failure_work,
which require execution ordering. Hence, a dedicated ordered workqueue
has been used here.

The WQ_MEM_RECLAIM flag has been set to guarantee forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ab8500_fg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/ab8500_fg.c b/drivers/power/ab8500_fg.c
index 5a36cf8..199f2db 100644
--- a/drivers/power/ab8500_fg.c
+++ b/drivers/power/ab8500_fg.c
@@ -3096,7 +3096,7 @@ static int ab8500_fg_probe(struct platform_device *pdev)
 	ab8500_fg_discharge_state_to(di, AB8500_FG_DISCHARGE_INIT);

 	/* Create a work queue for running the FG algorithm */
-	di->fg_wq = create_singlethread_workqueue("ab8500_fg_wq");
+	di->fg_wq = alloc_ordered_workqueue("ab8500_fg_wq", WQ_MEM_RECLAIM);
 	if (di->fg_wq == NULL) {
 		dev_err(di->dev, "failed to create work queue\n");
 		return -ENOMEM;
--
2.1.4



[PATCH v2 4/8] power: intel_mid_battery: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
The workqueue "monitor_wqueue" is used to monitor the PMIC battery status.
It queues a single work item (pbi->monitor_battery) and hence doesn't
require ordering. Hence, alloc_workqueue has been used to replace the
deprecated create_singlethread_workqueue instance.

Since PMIC battery status needs to be monitored for any change, the
WQ_MEM_RECLAIM flag has been set to ensure forward progress under memory
pressure.

Since there is a single work item, explicit concurrency
limit is unnecessary here.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/intel_mid_battery.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/power/intel_mid_battery.c b/drivers/power/intel_mid_battery.c
index 9fa4acc..dc7feef 100644
--- a/drivers/power/intel_mid_battery.c
+++ b/drivers/power/intel_mid_battery.c
@@ -689,8 +689,7 @@ static int probe(int irq, struct device *dev)
 	/* initialize all required framework before enabling interrupts */
 	INIT_WORK(&pbi->handler, pmic_battery_handle_intrpt);
 	INIT_DELAYED_WORK(&pbi->monitor_battery, pmic_battery_monitor);
-	pbi->monitor_wqueue =
-			create_singlethread_workqueue(dev_name(dev));
+	pbi->monitor_wqueue = alloc_workqueue(dev_name(dev), WQ_MEM_RECLAIM, 0);
 	if (!pbi->monitor_wqueue) {
 		dev_err(dev, "%s(): wqueue init failed\n", __func__);
 		retval = -ESRCH;
--
2.1.4



[PATCH v2 3/8] power: pm2301_charger: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "charger_wq" is used for running all the charger related
tasks. This involves charger detection, checking for HW failure and HW
status. This workqueue has been identity converted.

It queues multiple work items, viz. &pm2->check_main_thermal_prot_work,
&pm2->check_hw_failure_work and &pm2->ac_work. Hence, the deprecated
create_singlethread_workqueue() instance has been replaced with a
dedicated ordered workqueue.

The WQ_MEM_RECLAIM flag has been set to ensure forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/pm2301_charger.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/power/pm2301_charger.c b/drivers/power/pm2301_charger.c
index fb62ed3..78561b6 100644
--- a/drivers/power/pm2301_charger.c
+++ b/drivers/power/pm2301_charger.c
@@ -1054,7 +1054,8 @@ static int pm2xxx_wall_charger_probe(struct i2c_client *i2c_client,
 	pm2->ac_chg.external = true;

 	/* Create a work queue for the charger */
-	pm2->charger_wq = create_singlethread_workqueue("pm2xxx_charger_wq");
+	pm2->charger_wq = alloc_ordered_workqueue("pm2xxx_charger_wq",
+						  WQ_MEM_RECLAIM);
 	if (pm2->charger_wq == NULL) {
 		ret = -ENOMEM;
 		dev_err(pm2->dev, "failed to create work queue\n");
--
2.1.4



[PATCH v2 2/8] power: ab8500_btemp: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
The workqueue "btemp_wq" is used for measuring the temperature
periodically. It queues a single workitem (btemp_periodic_work) and
hence doesn't require ordering. Thus, the deprecated
create_singlethread_workqueue() instance has been replaced with
alloc_workqueue().

The WQ_MEM_RECLAIM flag has been set to ensure forward progress under
memory pressure.

Since there is a single work item, explicit concurrency
limit is unnecessary here.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ab8500_btemp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/ab8500_btemp.c b/drivers/power/ab8500_btemp.c
index bf2e5dd..6ffdc18 100644
--- a/drivers/power/ab8500_btemp.c
+++ b/drivers/power/ab8500_btemp.c
@@ -1095,7 +1095,7 @@ static int ab8500_btemp_probe(struct platform_device *pdev)

 	/* Create a work queue for the btemp */
 	di->btemp_wq =
-		create_singlethread_workqueue("ab8500_btemp_wq");
+		alloc_workqueue("ab8500_btemp_wq", WQ_MEM_RECLAIM, 0);
 	if (di->btemp_wq == NULL) {
 		dev_err(di->dev, "failed to create work queue\n");
 		return -ENOMEM;
--
2.1.4



[PATCH v2 1/8] power: abx500_chargalg: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "chargalg_wq" is used for running the charging algorithm.
It has multiple work items, viz. &di->chargalg_periodic_work,
&di->chargalg_wd_work and &di->chargalg_work per abx500_chargalg, which
require ordering. It has been identity converted.

Also, WQ_MEM_RECLAIM has been set to ensure forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/abx500_chargalg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/power/abx500_chargalg.c b/drivers/power/abx500_chargalg.c
index d9104b1..a4411d6 100644
--- a/drivers/power/abx500_chargalg.c
+++ b/drivers/power/abx500_chargalg.c
@@ -2091,8 +2091,8 @@ static int abx500_chargalg_probe(struct platform_device *pdev)
 		abx500_chargalg_maintenance_timer_expired;

 	/* Create a work queue for the chargalg */
-	di->chargalg_wq =
-		create_singlethread_workqueue("abx500_chargalg_wq");
+	di->chargalg_wq = alloc_ordered_workqueue("abx500_chargalg_wq",
+						   WQ_MEM_RECLAIM);
 	if (di->chargalg_wq == NULL) {
 		dev_err(di->dev, "failed to create work queue\n");
 		return -ENOMEM;
--
2.1.4



[PATCH v2 0/8] power: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
This patch set removes the instances of the deprecated
create_singlethread_workqueue() in drivers/power by making the
appropriate conversions.

Bhaktipriya Shridhar (8):
  power: abx500_chargalg: Remove deprecated create_singlethread_workqueue
  power: ab8500_btemp: Remove deprecated create_singlethread_workqueue
  power: pm2301_charger: Remove deprecated create_singlethread_workqueue
  power: intel_mid_battery: Remove deprecated create_singlethread_workqueue
  power: ab8500_charger: Remove deprecated create_singlethread_workqueue
  power: ipaq_micro_battery: Remove deprecated create_singlethread_workqueue
  power: ab8500_fg: Remove deprecated create_singlethread_workqueue
  power: ds2760_battery: Remove deprecated create_singlethread_workqueue

 drivers/power/ab8500_btemp.c       | 2 +-
 drivers/power/ab8500_charger.c     | 4 ++--
 drivers/power/ab8500_fg.c          | 2 +-
 drivers/power/abx500_chargalg.c    | 4 ++--
 drivers/power/ds2760_battery.c     | 3 ++-
 drivers/power/intel_mid_battery.c  | 3 +--
 drivers/power/ipaq_micro_battery.c | 2 +-
 drivers/power/pm2301_charger.c     | 3 ++-
 8 files changed, 12 insertions(+), 11 deletions(-)

--
2.1.4
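
The individual patches in this series appear to follow one rule when picking
a replacement: a workqueue with several work items that depend on ordering
becomes alloc_ordered_workqueue(name, WQ_MEM_RECLAIM), while a workqueue
with a single work item becomes alloc_workqueue(name, WQ_MEM_RECLAIM, 0).
A minimal userspace sketch of that rule follows. The enum and the helper
choose_conversion() are hypothetical and exist only to spell the reasoning
out; nothing like them is part of the kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the two replacements used in this series. */
enum wq_conversion {
	WQ_CONV_ORDERED,	/* alloc_ordered_workqueue(name, WQ_MEM_RECLAIM) */
	WQ_CONV_UNORDERED,	/* alloc_workqueue(name, WQ_MEM_RECLAIM, 0) */
};

/*
 * Pick a replacement for create_singlethread_workqueue().
 *
 * nr_work_items:  number of distinct work items queued on the workqueue
 * needs_ordering: whether those items rely on running one at a time,
 *                 in queueing order
 */
static enum wq_conversion choose_conversion(size_t nr_work_items,
					    bool needs_ordering)
{
	assert(nr_work_items > 0);

	/* A single work item has no other item to be ordered against. */
	if (nr_work_items == 1)
		return WQ_CONV_UNORDERED;

	return needs_ordering ? WQ_CONV_ORDERED : WQ_CONV_UNORDERED;
}
```

Under this reading, a driver with one periodic work item maps to
WQ_CONV_UNORDERED, and a driver with several interdependent work items maps
to WQ_CONV_ORDERED.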



Re: [PATCH v1 3/3] PM / AVS: rockchip-cpu-avs: add driver handling Rockchip cpu avs

2016-08-15 Thread kbuild test robot
Hi Finley,

[auto build test ERROR on battery/master]
[also build test ERROR on v4.8-rc2 next-20160815]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:    https://github.com/0day-ci/linux/commits/Finlye-Xiao/PM-AVS-add-Rockchip-cpu-avs/20160816-105228
base:   git://git.infradead.org/battery-2.6.git master
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm

All error/warnings (new ones prefixed by >>):

   drivers/power/avs/rockchip-cpu-avs.c: In function 'rockchip_cpu_avs_notifier':
>> drivers/power/avs/rockchip-cpu-avs.c:230:10: error: implicit declaration of function 'cpufreq_frequency_get_table' [-Werror=implicit-function-declaration]
     table = cpufreq_frequency_get_table(policy->cpu);
             ^
>> drivers/power/avs/rockchip-cpu-avs.c:230:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
     table = cpufreq_frequency_get_table(policy->cpu);
           ^
   cc1: some warnings being treated as errors

vim +/cpufreq_frequency_get_table +230 drivers/power/avs/rockchip-cpu-avs.c

   224		dev = get_cpu_device(policy->cpu);
   225		if (!dev) {
   226			pr_err("cpu%d Failed to get device\n", policy->cpu);
   227			goto out;
   228		}
   229	
 > 230		table = cpufreq_frequency_get_table(policy->cpu);
   231		if (!table) {
   232			pr_err("cpu%d CPUFreq table not found\n", policy->cpu);
   233			goto out;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: Binary data


[PATCH] Bluetooth: btusb: Add support for 0cf3:e009

2016-08-15 Thread Kai-Heng Feng
Device 0cf3:e009 is a member of the QCA ROME family.

T:  Bus=01 Lev=01 Prnt=01 Port=07 Cnt=04 Dev#=  4 Spd=12  MxCh= 0
D:  Ver= 2.01 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=0cf3 ProdID=e009 Rev=00.01
C:  #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb

Signed-off-by: Kai-Heng Feng 
---
 drivers/bluetooth/btusb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index c58a00c..80ae854 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -248,6 +248,7 @@ static const struct usb_device_id blacklist_table[] = {

 	/* QCA ROME chipset */
 	{ USB_DEVICE(0x0cf3, 0xe007), .driver_info = BTUSB_QCA_ROME },
+	{ USB_DEVICE(0x0cf3, 0xe009), .driver_info = BTUSB_QCA_ROME },
 	{ USB_DEVICE(0x0cf3, 0xe300), .driver_info = BTUSB_QCA_ROME },
 	{ USB_DEVICE(0x0cf3, 0xe360), .driver_info = BTUSB_QCA_ROME },
 	{ USB_DEVICE(0x0489, 0xe092), .driver_info = BTUSB_QCA_ROME },
-- 
2.8.1
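
The patch works by adding one vendor/product pair to a table that maps
device IDs to driver flags. A minimal userspace sketch of such a lookup is
shown below. The struct, the helper lookup_driver_info() and the flag value
QUIRK_QCA_ROME are all hypothetical stand-ins; the real driver uses
struct usb_device_id, the USB core's matching code and BTUSB_QCA_ROME.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value standing in for BTUSB_QCA_ROME. */
#define QUIRK_QCA_ROME	0x1ul

struct id_entry {
	uint16_t vendor;
	uint16_t product;
	unsigned long driver_info;
};

/* The QCA ROME entries from the hunk above, including the new e009. */
static const struct id_entry qca_rome_ids[] = {
	{ 0x0cf3, 0xe007, QUIRK_QCA_ROME },
	{ 0x0cf3, 0xe009, QUIRK_QCA_ROME },
	{ 0x0cf3, 0xe300, QUIRK_QCA_ROME },
	{ 0x0cf3, 0xe360, QUIRK_QCA_ROME },
	{ 0x0489, 0xe092, QUIRK_QCA_ROME },
};

#define QCA_ROME_COUNT	(sizeof(qca_rome_ids) / sizeof(qca_rome_ids[0]))

/* Return the flags of the first matching entry, or 0 when none matches. */
static unsigned long lookup_driver_info(const struct id_entry *table,
					size_t count,
					uint16_t vendor, uint16_t product)
{
	size_t i;

	assert(table != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (table[i].vendor == vendor && table[i].product == product)
			return table[i].driver_info;
	}
	return 0;
}
```

With the new entry present, looking up 0cf3:e009 returns the ROME flag; an
ID that is not in the table returns 0 and would get no special handling in
this model.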



Re: ASoC: sun4i-codec: playback stall and I/O error with DAPM paths all disabled

2016-08-15 Thread wens Tsai
On Mon, Aug 15, 2016 at 7:42 PM, Mark Brown  wrote:
> On Mon, Aug 15, 2016 at 05:43:55PM +0800, wens Tsai wrote:
>
>> What is unexpected is any attempt to play anything under this state makes
>> the playback software (in my case mpg321) stall, and later report an I/O
>> error. My guess is that the DAC is still disabled by DAPM, so it doesn't
>> send any DRQs, and thus the DMA engine is not consuming any data from
>> userspace.
>
> This is normal for ASoC - like you say it'll be because the hardware
> isn't powered up.
>
>> I think we should just enable the digital bits of the DAC/ADC all the
>> time. Or maybe transfer and then discard data if the DAC is off. Not
>> sure if this is doable though. I expect playback software to work, and
>> not block, regardless of the hardware status.
>
> Powering things up all the time will have a major effect on battery life
> for systems that care about that.  The expectation is that systems with
> this sort of hardware won't normally be offering end users direct
> control of the routing, it'll be something that's handled during system
> integration.

Ok. So I guess one solution would be to move the mute controls out of
DAPM, and maybe change some other mux-like paths into actual muxes, so
there's at least one usable path at all times.

IIRC there was a patch doing something like this. I'll look into it.

Regards
ChenYu


Re: [lkp] [usb] ad05399d68: BUG: unable to handle kernel NULL pointer dereference at 0000000000000012

2016-08-15 Thread Ye Xiaolong
On 08/16, Peter Chen wrote:
>On Mon, Aug 15, 2016 at 10:49:55PM +0800, Ye Xiaolong wrote:
>> On 08/15, Peter Chen wrote:
>> > 
>> >>
>> >>
>> >>FYI, we noticed the following commit:
>> >>
>> >>https://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git testing/next
>> >>commit ad05399d68b6ae1649cdcfc82ce3ffea1a7c5104 ("usb: udc: core: fix error handling")
>> >>
>> >
>> >Hi Xiaolong,
>> >
>> >You reported it one month ago, and said it is a false report. see below.
>> >Would you please double confirm it?
>> 
>> Hi, peter
>> 
>> Last time I reported stat "WARNING: CPU: 0 PID: 1 at
>> lib/list_debug.c:36" and it showed both in this commit and its parent.
>> This time, the observed change stat is "BUG: unable to handle kernel NULL
>> pointer dereference at 0000000000000012" and it doesn't show in the parent
>> commit. However, the parent commit's dmesg would show a kernel panic log
>> as:
>> 
>> [   10.338487] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
>> [   10.338487] 
>> [   10.339911] CPU: 0 PID: 1 Comm: init Not tainted 4.8.0-rc1-00020-g0937a4d #1
>> [   10.341177] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
>> [   10.342798]   88001e53bc28 8168cf8a 88001e534000
>> [   10.345177]  8256ef20 88001e53bcb8 88001e50ca50 88001e53bca8
>> [   10.346739]  8114e062 8810 88001e53bcb8 88001e53bc50
>> [   10.347970] Call Trace:
>> [   10.348690]  [] dump_stack+0x83/0xb9
>> [   10.351592]  [] panic+0xf3/0x2a9
>> [   10.352386]  [] do_exit+0x601/0xde0
>> [   10.352879]  [] ? __sigqueue_free+0x43/0x50
>> [   10.353511]  [] ? __dequeue_signal+0x1f7/0x210
>> [   10.354483]  [] do_group_exit+0xa2/0x100
>> [   10.355324]  [] get_signal+0x68e/0x740
>> [   10.356155]  [] do_signal+0x23/0x670
>> [   10.356983]  [] ? do_syslog+0x2c0/0x6a0
>> [   10.357832]  [] ? bad_area_nosemaphore+0x33/0x40
>> [   10.358825]  [] ? __do_page_fault+0x407/0x4d0
>> [   10.359738]  [] exit_to_usermode_loop+0x69/0xc0
>> [   10.360680]  [] prepare_exit_to_usermode+0x3d/0x70
>> [   10.361725]  [] retint_user+0x8/0x10
>> [   10.362650] Kernel Offset: disabled
>> 
>> The whole parent dmesg is attached.
>> 
>
>Then, what's the conclusion? Is this one is detect one or not?
>

It seems the parent kernel lives longer than this commit, and the
sysfs_kf_write bug shows up consistently in 3 boot tests in the LKP
environment.

% compare -at ad05399d68b6ae1649cdcfc82ce3ffea1a7c5104
tests: 3
testcase/path_params/tbox_group/run: boot/1/vm-kbuild-yocto-x86_64

0937a4d787539e2f  ad05399d68b6ae1649cdcfc82c
----------------  --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          6:6         -100%            :4     kmsg.stc):gdata/new_proto/recv_or_reg_complete_cb_not_ready
          6:6         -100%            :4     kmsg.fmdrv:st_unregister_failed
           :6          100%           4:4     kmsg.list_del_corruption.prev->next_should_be#,but_was
           :6          100%           4:4     dmesg.WARNING:at_lib/list_debug.c:#__list_del_entry
           :6          100%           4:4     dmesg.BUG:unable_to_handle_kernel
           :6          100%           4:4     dmesg.Oops
           :6          100%           4:4     dmesg.RIP:sysfs_kf_write
           :6          100%           4:4     dmesg.Kernel_panic-not_syncing:Fatal_exception
          6:6         -100%            :4     dmesg.Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=

testcase/path_params/tbox_group/run: boot/1/vm-ivb41-yocto-ia32

0937a4d787539e2f  ad05399d68b6ae1649cdcfc82c
----------------  --------------------------
          2:2         -100%            :2     kmsg.stc):gdata/new_proto/recv_or_reg_complete_cb_not_ready
          2:2         -100%            :2     kmsg.fmdrv:st_unregister_failed
           :2          100%           2:2     dmesg.BUG:unable_to_handle_kernel
           :2          100%           2:2     dmesg.Oops
           :2          100%           2:2     dmesg.RIP:sysfs_kf_write
           :2          100%           2:2     dmesg.Kernel_panic-not_syncing:Fatal_exception
          2:2         -100%            :2     dmesg.BUG:kernel_test_hang

testcase/path_params/tbox_group/run: boot/1/vm-kbuild-1G

0937a4d787539e2f  ad05399d68b6ae1649cdcfc82c
----------------  --------------------------
           :4          100%           4:4     dmesg.WARNING:at_lib/list_debug.c:#__list_del_entry
           :4           75%           3:4     dmesg.BUG:unable_to_handle_kernel
           :4           75%           3:4     dmesg.Oops
           :4           75%           3:4     dmesg.RIP:sysfs_kf_write
           :4           75%           3:4     dmesg.Kernel_panic-not_syncing:Fatal_exception
          4:4         -100%            :4     dmesg.BUG:kernel_oversize_in_test_stage
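
The %reproduction column in the tables above appears to be the change in
failure rate between the two commits: the rate on the tested commit minus
the rate on its parent, where an empty fail count reads as zero. That
interpretation is inferred from the rows themselves, not from LKP
documentation. A small sketch of the arithmetic, with hypothetical helper
names:

```c
#include <assert.h>

/* Failure rate in percent for `fail` failures out of `runs` boots. */
static int fail_percent(int fail, int runs)
{
	assert(runs > 0 && fail >= 0 && fail <= runs);
	return (fail * 100) / runs;
}

/*
 * Change in failure rate, in percentage points: tested commit minus
 * parent commit. Positive values mean the tested commit fails more often.
 */
static int reproduction_delta(int parent_fail, int parent_runs,
			      int commit_fail, int commit_runs)
{
	return fail_percent(commit_fail, commit_runs) -
	       fail_percent(parent_fail, parent_runs);
}
```

For example, the sysfs_kf_write row of the vm-kbuild-1G table has 0 of 4
failures on the parent and 3 of 4 on the commit, which gives 75 under this
reading, and a row with 6:6 on the parent and :4 on the commit gives -100.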


>Peter
>
>> Thanks,
>> Xiaolong
>> 
>> >
>> >On Wed, Jul 13, 2016 at 

Re: [PATCH v2 2/3] ses: use scsi_is_sas_rphy instead of is_sas_attached

2016-08-15 Thread kbuild test robot
Hi Johannes,

[auto build test ERROR on scsi/for-next]
[also build test ERROR on v4.8-rc2 next-20160815]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Johannes-Thumshirn/Fix-panic-when-a-SES-device-is-attached-to-a-hpsa-logical-volume/20160815-231901
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
config: i386-randconfig-h0-08161012 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

>> ERROR: "scsi_is_sas_rphy" [drivers/scsi/ses.ko] undefined!

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation




Re: [PATCH] exynos-drm: Fix display manager failing to start without IOMMU problem

2016-08-15 Thread Inki Dae
Hi Shuah,

On 2016-08-13 at 02:52, Shuah Khan wrote:
> On 08/12/2016 11:28 AM, Shuah Khan wrote:
>> On 08/10/2016 05:05 PM, Shuah Khan wrote:
>>> On 08/10/2016 04:59 PM, Inki Dae wrote:
>>>> Hi Shuah,
>>>>
>>>> On 2016-08-11 at 02:30, Shuah Khan wrote:
>>>>> Fix exynos_drm_gem_create_ioctl() attempts to allocate non-contiguous GEM
>>>>> memory without IOMMU. In this case, there is no point in attempting to
>>>>
>>>> DRM gem can be used for non-DRM drivers such as GPU, V4L2 based multimedia
>>>> devices and other DMA devices.
>>>> Even though IOMMU support is disabled, other framework based DMA drivers
>>>> can use IOMMU - e.g., a GPU driver -
>>>> and they can use non-contiguous GEM buffers through UMM (DMABUF).
>>>>
>>>> So GEM allocation type is not dependent on IOMMU.
>>>
>>> Hi Inki,
>>>
>>> I am seeing the following failure without IOMMU and light dm fails
>>> to start:
>>>
>>> [drm:exynos_drm_framebuffer_init] *ERROR* Non-continguous GEM memory is not 
>>> supported.
>>>
>>> The change I made fixed that problem and light dm starts without IOMMU.
>>> Is there a better way to fix this problem? Currently without IOMMU,
>>> light dm doesn't start.
>>>
>>> This is on linux_next
>>
>> Hi Inki,
>>
>> I am looking into this further and I am finding inconsistent
>> commits with regards to GEM contiguous and non-contiguous
>> buffers.
>>
>> Okay what you said is that:
>>
>> exynos-drm should support non-contiguous and contiguous GEM memory
>> type with or without IOMMU

Right.

>>
>> However, the code currently isn't doing that. The following
>> commit allocates non-contiguous buffers when IOMMU is enabled
>> to handle contiguous allocation failures.
>>
>> There are other commits that removed checks for non-contig type.
>> Let's look at the following cases to see what should be the driver
>> behavior in these cases:
>>
>> IOMMU is disabled:
>>
>> exynos_drm_gem_create_ioctl() gets called with NONCONTIG
>> - driver should try to allocate non-contig
>> - if it can't allocate non-contig, allocate contig
>>   (this would avoid failures like the one I am seeing)
>>
>> exynos_drm_gem_create_ioctl() gets called with CONTIG
>> - driver should try to allocate contig
>> - if it can't allocate contig, allocate non-contig
>>
>> What is confusing is that several code paths in the
>> GEM allocation and memory type checking are enforcing
>> non-contig with IOMMU. Check this routine:
>>
>> exynos_drm_framebuffer_init() will reject non-contig
>> memory type when check_fb_gem_memory_type() rejects
>> non-contig GEM memory type without IOMMU.

The gem memory type only needs to be checked when the gem buffer is used as a 
framebuffer, because in that case the DMA of the display controller accesses the 
buffer, and without IOMMU a DMA device cannot access a non-contiguous memory 
region.
That is why exynos_drm_framebuffer_init() checks the gem memory type for the fb, 
rather than the check being done when the gem is created.
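
The distinction described above can be illustrated with a small user-space sketch. It is not the driver code: the struct, the flag and both function names are hypothetical stand-ins, and the only behaviour taken from this thread is that creation accepts any memory type while the framebuffer path rejects non-contiguous memory when there is no IOMMU.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: set when the buffer is backed by scattered pages. */
#define GEM_NONCONTIG 0x1u

/* Hypothetical stand-in for a GEM object; only the memory type matters here. */
struct gem_buffer {
	unsigned int flags;
};

/* Creation-time check: any memory type is acceptable, with or without IOMMU. */
static int gem_create_check(const struct gem_buffer *gem, bool iommu_enabled)
{
	(void)iommu_enabled; /* allocation type does not depend on IOMMU */
	assert(gem != NULL);
	return 0;
}

/* Framebuffer-time check: scanout DMA needs contiguous memory without IOMMU. */
static int fb_gem_memory_check(const struct gem_buffer *gem, bool iommu_enabled)
{
	assert(gem != NULL);
	if (iommu_enabled)
		return 0; /* the IOMMU can map scattered pages for the device */
	if (gem->flags & GEM_NONCONTIG)
		return -EINVAL; /* display DMA cannot reach scattered pages */
	return 0;
}
```

In this model a non-contiguous buffer can always be created, but attaching it to a framebuffer fails with -EINVAL when no IOMMU is available, which matches the error reported earlier in the thread.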

> 
> 
> okay the very first commit that added IOMMU support
> introduced the code that rejects non-contig gem memory
> type without IOMMU.
> 
> commit 0519f9a12d0113caab78980c48a7902d2bd40c2c
> Author: Inki Dae 
> Date:   Sat Oct 20 07:53:42 2012 -0700
> 
> drm/exynos: add iommu support for exynos drm framework
> 
> Anyway, if it is the right change to fix check_fb_gem_memory_type()
> to not reject NONCONTIG_BUFFER, then I can make that change

No, as I mentioned above, the gem buffer for fb is dependent on IOMMU because 
it is used by a DMA device - FIMD, DECON or Mixer.
Keep in mind that a gem buffer can also be used for other purposes, such as 
2D/3D or post-processing devices that don't use a framebuffer, and not only by 
the display controller, which uses the framebuffer for scanout.

Thanks,
Inki Dae

> instead of this patch I sent.
> 
>>
>> So there is inconsistency in the non-contig vs. contig
>> GEM support in exynos-drm. I think this needs to be cleaned
>> up to get the desired behavior.
>>
>> The following commit allocates non-contiguous buffers when IOMMU is
>> enabled to handle contiguous allocation failures.
>>
>> There are other commits that removed checks for non-contig type.
>> Let's look at the following cases to see what should be the driver
>> behavior in these cases:
>>
>> commit 122beea84bb90236b1ae545f08267af58591c21b
>> Author: Rahul Sharma 
>> Date:   Wed May 7 17:21:29 2014 +0530
>>
>> drm/exynos: allocate non-contigous buffers when iommu is enabled
>> 
>> Allow to allocate non-contigous buffers when iommu is enabled.
>> Currently, it tries to allocates contigous buffer which consistently
>> fail for large buffers and then fall back to non contigous. Apart
>> from being slow, this implementation is also very noisy and fills
>> the screen with alloc fail logs.
>> 
>> Signed-off-by: Rahul Sharma 
>> Reviewed-by: Sachin Kamat 
>> Signed-off-by: Inki Dae 
>>
>>
>> commit ea6d66c3a797376d21b23dc8261733ce35970014
>> Author: Inki Dae 
>> Date:   Fri Nov 2 16:10:39 2012 +0900
>>
>> 

[PATCH v2] arc: Add "model" properly in device tree description of all boards

2016-08-15 Thread Alexey Brodkin
As discussed quite some time ago (see
https://lkml.org/lkml/2015/11/5/862), it is good practice to add a
"model" property to each .dts. Moreover, as per ePAPR the "model" property
is required and should look like "manufacturer,model", so that is what we
do here.

Signed-off-by: Alexey Brodkin 
Cc: Vineet Gupta 
Cc: Jonas Gorski 
Cc: Arnd Bergmann 
Cc: Rob Herring 
Cc: Christian Ruppert 
---

Changes v1 -> v2:
 * Added "hs" postfix for boards based on ARC HS core
 * Added "archs" postfix in VDK's .dts to distinguish VDKs for
   ARC cores from those for ARM cores

 arch/arc/boot/dts/abilis_tb100_dvk.dts | 1 +
 arch/arc/boot/dts/abilis_tb101_dvk.dts | 1 +
 arch/arc/boot/dts/axs101.dts           | 1 +
 arch/arc/boot/dts/axs103.dts           | 1 +
 arch/arc/boot/dts/axs103_idu.dts       | 1 +
 arch/arc/boot/dts/nsim_700.dts         | 1 +
 arch/arc/boot/dts/nsim_hs.dts          | 1 +
 arch/arc/boot/dts/nsim_hs_idu.dts      | 1 +
 arch/arc/boot/dts/nsimosci.dts         | 1 +
 arch/arc/boot/dts/nsimosci_hs.dts      | 1 +
 arch/arc/boot/dts/nsimosci_hs_idu.dts  | 1 +
 arch/arc/boot/dts/vdk_hs38.dts         | 1 +
 arch/arc/boot/dts/vdk_hs38_smp.dts     | 1 +
 13 files changed, 13 insertions(+)

diff --git a/arch/arc/boot/dts/abilis_tb100_dvk.dts b/arch/arc/boot/dts/abilis_tb100_dvk.dts
index 3dd6ed9..3acf04d 100644
--- a/arch/arc/boot/dts/abilis_tb100_dvk.dts
+++ b/arch/arc/boot/dts/abilis_tb100_dvk.dts
@@ -24,6 +24,7 @@
 /include/ "abilis_tb100.dtsi"
 
 / {
+   model = "abilis,tb100";
    chosen {
        bootargs = "earlycon=uart8250,mmio32,0xff10,9600n8 console=ttyS0,9600n8";
    };
diff --git a/arch/arc/boot/dts/abilis_tb101_dvk.dts b/arch/arc/boot/dts/abilis_tb101_dvk.dts
index 1cf51c2..37d88c5 100644
--- a/arch/arc/boot/dts/abilis_tb101_dvk.dts
+++ b/arch/arc/boot/dts/abilis_tb101_dvk.dts
@@ -24,6 +24,7 @@
 /include/ "abilis_tb101.dtsi"
 
 / {
+   model = "abilis,tb101";
    chosen {
        bootargs = "earlycon=uart8250,mmio32,0xff10,9600n8 console=ttyS0,9600n8";
    };
diff --git a/arch/arc/boot/dts/axs101.dts b/arch/arc/boot/dts/axs101.dts
index 3f9b058..d9b9b9d 100644
--- a/arch/arc/boot/dts/axs101.dts
+++ b/arch/arc/boot/dts/axs101.dts
@@ -13,6 +13,7 @@
 /include/ "axs10x_mb.dtsi"
 
 / {
+   model = "snps,axs101";
    compatible = "snps,axs101", "snps,arc-sdp";
 
    chosen {
diff --git a/arch/arc/boot/dts/axs103.dts b/arch/arc/boot/dts/axs103.dts
index e6d0e31..ec7fb27 100644
--- a/arch/arc/boot/dts/axs103.dts
+++ b/arch/arc/boot/dts/axs103.dts
@@ -16,6 +16,7 @@
 /include/ "axs10x_mb.dtsi"
 
 / {
+   model = "snps,axs103";
    compatible = "snps,axs103", "snps,arc-sdp";
 
    chosen {
diff --git a/arch/arc/boot/dts/axs103_idu.dts b/arch/arc/boot/dts/axs103_idu.dts
index f999fef..070c297 100644
--- a/arch/arc/boot/dts/axs103_idu.dts
+++ b/arch/arc/boot/dts/axs103_idu.dts
@@ -16,6 +16,7 @@
 /include/ "axs10x_mb.dtsi"
 
 / {
+   model = "snps,axs103-smp";
    compatible = "snps,axs103", "snps,arc-sdp";
 
    chosen {
diff --git a/arch/arc/boot/dts/nsim_700.dts b/arch/arc/boot/dts/nsim_700.dts
index 6397051..ce0ccd20 100644
--- a/arch/arc/boot/dts/nsim_700.dts
+++ b/arch/arc/boot/dts/nsim_700.dts
@@ -10,6 +10,7 @@
 /include/ "skeleton.dtsi"
 
 / {
+   model = "snps,nsim";
    compatible = "snps,nsim";
    #address-cells = <1>;
    #size-cells = <1>;
diff --git a/arch/arc/boot/dts/nsim_hs.dts b/arch/arc/boot/dts/nsim_hs.dts
index bf05fe5..3772c40 100644
--- a/arch/arc/boot/dts/nsim_hs.dts
+++ b/arch/arc/boot/dts/nsim_hs.dts
@@ -10,6 +10,7 @@
 /include/ "skeleton_hs.dtsi"
 
 / {
+   model = "snps,nsim_hs";
    compatible = "snps,nsim_hs";
    #address-cells = <2>;
    #size-cells = <2>;
diff --git a/arch/arc/boot/dts/nsim_hs_idu.dts b/arch/arc/boot/dts/nsim_hs_idu.dts
index 99eabe1..48434d7c 100644
--- a/arch/arc/boot/dts/nsim_hs_idu.dts
+++ b/arch/arc/boot/dts/nsim_hs_idu.dts
@@ -10,6 +10,7 @@
 /include/ "skeleton_hs_idu.dtsi"
 
 / {
+   model = "snps,nsim_hs-smp";
    compatible = "snps,nsim_hs";
    interrupt-parent = <_intc>;
 
diff --git a/arch/arc/boot/dts/nsimosci.dts b/arch/arc/boot/dts/nsimosci.dts
index e659a34..bcf6031 100644
--- a/arch/arc/boot/dts/nsimosci.dts
+++ b/arch/arc/boot/dts/nsimosci.dts
@@ -10,6 +10,7 @@
 /include/ "skeleton.dtsi"
 
 / {
+   model = "snps,nsimosci";
    compatible = "snps,nsimosci";
    #address-cells = <1>;
    #size-cells = <1>;
diff --git a/arch/arc/boot/dts/nsimosci_hs.dts b/arch/arc/boot/dts/nsimosci_hs.dts
index 16ce5d6..14a727c 100644
--- a/arch/arc/boot/dts/nsimosci_hs.dts
+++ b/arch/arc/boot/dts/nsimosci_hs.dts
@@ -10,6 +10,7 @@
 /include/ "skeleton_hs.dtsi"
 
 / {
+   model = "snps,nsimosci_hs";
    compatible = "snps,nsimosci_hs";
    #address-cells = <1>;
    #size-cells = <1>;
diff --git a/arch/arc/boot/dts/nsimosci_hs_idu.dts b/arch/arc/boot/dts/nsimosci_hs_idu.dts
index ce8dfbc..cbf65b6 100644
--- 

Re: [PATCH] powerpc/powernv: Initialise nest mmu

2016-08-15 Thread Balbir Singh


On 16/08/16 10:37, Alistair Popple wrote:
> Balbir,
> 
> 
>  
>>> +   /* Update partition table control register on all Nest MMUs */
>>> +   opal_nmmu_set_ptcr(-1UL, __pa(partition_tb) | (PATB_SIZE_SHIFT - 12));
>>> +
>>
>> Just wondering if
>>
>> 1. Instead of using -1 for all cpus, we should do
>>      for_each_online_cpu() {
>>              opal_nmmu_set_ptcr(...)
>>      }
> 
> Good question, but I don't think it makes sense to do that. The NMMU is
> per-chip/socket rather than per-cpu so it shouldn't be tied to
> onlining/offlining of individual CPUs.
> 
>> 2. In cpu hotplug path do the same when onlining and set to NULL on
>> offlining?
> 
> Again, the nmmu isn't tied to a specific CPU but rather a chip/socket. So in
> theory at least it's possible that all CPUs in a chip could be offline but
> other units on the chip could still be using the nmmu so we wouldn't want to
> disable the nmmu at that point.

Fair enough

Balbir Singh.
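
For readers following the quoted hunk, here is a small user-space sketch of the two ideas discussed: how the value passed to opal_nmmu_set_ptcr() packs the table base and a size field, and why the setting is modelled per chip rather than per CPU. Everything in it is an assumption made for illustration: PATB_SIZE_SHIFT is taken to be 16, the base is assumed to be at least 4 KiB aligned, and make_ptcr(), set_nmmu_ptcr() and ALL_CHIPS are hypothetical names, not OPAL or kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: the partition table is taken to be 64 KiB (1 << 16 bytes). */
#define PATB_SIZE_SHIFT 16

/* Hypothetical stand-in for "every Nest MMU", mirroring the -1UL in the patch. */
#define ALL_CHIPS ((uint64_t)-1)

/* Pack an aligned physical base and the size field (log2(size) - 12). */
static uint64_t make_ptcr(uint64_t table_pa)
{
	assert((table_pa & 0xfffULL) == 0); /* low bits carry the size field */
	return table_pa | (uint64_t)(PATB_SIZE_SHIFT - 12);
}

/* Toy model: one PTCR per chip, not per CPU. Returns the number of chips updated. */
static size_t set_nmmu_ptcr(uint64_t *chip_ptcr, size_t nr_chips,
			    uint64_t chip_id, uint64_t ptcr)
{
	size_t i, updated = 0;

	for (i = 0; i < nr_chips; i++) {
		if (chip_id == ALL_CHIPS || chip_id == i) {
			chip_ptcr[i] = ptcr;
			updated++;
		}
	}
	return updated;
}
```

Because the state lives in the per-chip array, nothing in this model changes when an individual CPU goes online or offline, which is the point Alistair makes above.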


Re: [PATCH V5 3/4] drm/bridge: Add driver for GE B850v3 LVDS/DP++ Bridge

2016-08-15 Thread Archit Taneja

Hi,

On 08/09/2016 10:11 PM, Peter Senna Tschudin wrote:

Add a driver that creates a drm_bridge and a drm_connector for the LVDS
to DP++ display bridge of the GE B850v3.

There are two physical bridges on the video signal pipeline: a
STDP4028 (LVDS to DP) and a STDP2690 (DP to DP++). The hardware and
firmware made it complicated for this binding to comprise two device
tree nodes, as the design goal is to configure both bridges based on
the LVDS signal, which leaves the driver powerless to control the video
processing pipeline. The two bridges behave as a single bridge, and
the driver is only needed for telling the host about EDID / HPD, and
for giving the host powers to ack interrupts. The video signal pipeline
is as follows:

   Host -> LVDS|--(STDP4028)--|DP -> DP|--(STDP2690)--|DP++ -> Video output



I'd commented on an earlier revision (v2) of this patch, but hadn't got
a response on it. Pasting the query again:

Are these two chips always expected to be used together? I don't think
it's right to pair up two encoder chips into one driver just for one
board.

Is one device @0x72 and other @0x73? Or is only one of them an i2c
slave?

What's preventing us to create these as two different bridge drivers?
The drm framework allows us to daisy chain encoder bridges. The only
problem I see is that we don't have a clear-cut way to tell the bridge
driver whether we want it to create a connector for us or not. Because,
it looks like both can potentially create connectors. This isn't a big
problem either if we have DT. We just need to check whether our output
port is connected to another bridge or a connector.
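
A minimal sketch of that check, written as a plain C model rather than kernel code. The names here (`sink_kind`, `bridge_model`, `should_create_connector`) are hypothetical and are not part of the DRM or OF graph APIs; a real driver would have to derive the downstream type from the device tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: what a bridge's output port is wired to in the DT. */
enum sink_kind {
    SINK_CONNECTOR, /* output port leads to a connector node */
    SINK_BRIDGE     /* output port leads to another bridge */
};

struct bridge_model {
    const char *name;
    enum sink_kind output;
};

/* Only the last bridge in a daisy chain should create the connector. */
static bool should_create_connector(const struct bridge_model *b)
{
    assert(b != NULL);
    return b->output == SINK_CONNECTOR;
}
```

With the pipeline from the commit message, a model of the STDP4028 (output wired to the STDP2690) would return false, and a model of the STDP2690 (output wired to the DP++ connector) would return true.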

Thanks,
Archit


Cc: Martyn Welch 
Cc: Martin Donnelly 
Cc: Daniel Vetter 
Cc: Enric Balletbo i Serra 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Fabio Estevam 
CC: David Airlie 
CC: Thierry Reding 
CC: Thierry Reding 
Reviewed-by: Enric Balletbo 
Signed-off-by: Peter Senna Tschudin 
---
Changes from V4:
  - Check the output of the first call to i2c_smbus_write_word_data() and return
its error code for failing gracefully on i2c issues
  - Renamed the i2c_driver.name from "ge,b850v3-lvds-dp" to "b850v3-lvds-dp" to
remove the comma from the driver name

Changes from V3:
  - 3/4 instead of 4/5
  - Tested on next-20160804

Changes from V2:
  - Made it atomic to be applied on next-20160729 on top of Liu Ying changes
that made imx-ldb atomic

Changes from V1:
  - New commit message
  - Removed 3 empty entry points
  - Removed memory leak from ge_b850v3_lvds_dp_get_modes()
  - Added a lock for mode setting
  - Removed a few blank lines
  - Changed the order at Makefile and Kconfig

  MAINTAINERS|   8 +
  drivers/gpu/drm/bridge/Kconfig |  11 +
  drivers/gpu/drm/bridge/Makefile|   1 +
  drivers/gpu/drm/bridge/ge_b850v3_lvds_dp.c | 405 +
  4 files changed, 425 insertions(+)
  create mode 100644 drivers/gpu/drm/bridge/ge_b850v3_lvds_dp.c

diff --git a/MAINTAINERS b/MAINTAINERS
index a306795..e8d106a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5142,6 +5142,14 @@ W:   https://linuxtv.org
  S:	Maintained
  F:	drivers/media/radio/radio-gemtek*

+GENERAL ELECTRIC B850V3 LVDS/DP++ BRIDGE
+M: Peter Senna Tschudin 
+M: Martin Donnelly 
+M: Martyn Welch 
+S: Maintained
+F: drivers/gpu/drm/bridge/ge_b850v3_dp2.c
+F: Documentation/devicetree/bindings/ge/b850v3_dp2_bridge.txt
+
  GENERIC GPIO I2C DRIVER
  M:	Haavard Skinnemoen 
  S:	Supported
diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
index b590e67..b4b70fb 100644
--- a/drivers/gpu/drm/bridge/Kconfig
+++ b/drivers/gpu/drm/bridge/Kconfig
@@ -32,6 +32,17 @@ config DRM_DW_HDMI_AHB_AUDIO
  Designware HDMI block.  This is used in conjunction with
  the i.MX6 HDMI driver.

+config DRM_GE_B850V3_LVDS_DP
+   tristate "GE B850v3 LVDS to DP++ display bridge"
+   depends on OF
+   select DRM_KMS_HELPER
+   select DRM_PANEL
+   ---help---
+  This is a driver for the display bridge of
+  GE B850v3 that convert dual channel LVDS
+  to DP++. This is used with the i.MX6 imx-ldb
+  driver.
+
  config DRM_NXP_PTN3460
  	tristate "NXP PTN3460 DP/LVDS bridge"
  	depends on OF
diff --git a/drivers/gpu/drm/bridge/Makefile b/drivers/gpu/drm/bridge/Makefile
index efdb07e..b9606f3 100644
--- a/drivers/gpu/drm/bridge/Makefile
+++ b/drivers/gpu/drm/bridge/Makefile
@@ -3,6 +3,7 @@ ccflags-y := -Iinclude/drm
  obj-$(CONFIG_DRM_ANALOGIX_ANX78XX) += analogix-anx78xx.o
  obj-$(CONFIG_DRM_DW_HDMI) += dw-hdmi.o
  obj-$(CONFIG_DRM_DW_HDMI_AHB_AUDIO) += dw-hdmi-ahb-audio.o
+obj-$(CONFIG_DRM_GE_B850V3_LVDS_DP) += ge_b850v3_lvds_dp.o
  obj-$(CONFIG_DRM_NXP_PTN3460) += nxp-ptn3460.o
  obj-$(CONFIG_DRM_PARADE_PS8622) += parade-ps8622.o
  obj-$(CONFIG_DRM_SII902X) += sii902x.o
diff --git a/drivers/gpu/drm/bridge/ge_b850v3_lvds_dp.c 

Re: [RFC] can we use vmalloc to alloc thread stack if compaction failed

2016-08-15 Thread Joonsoo Kim
On Wed, Aug 10, 2016 at 04:59:39AM -0700, Andy Lutomirski wrote:
> On Sun, Jul 31, 2016 at 10:30 PM, Joonsoo Kim  wrote:
> > On Fri, Jul 29, 2016 at 12:47:38PM -0700, Andy Lutomirski wrote:
> >> -- Forwarded message --
> >> From: "Joonsoo Kim" 
> >> Date: Jul 28, 2016 7:57 PM
> >> Subject: Re: [RFC] can we use vmalloc to alloc thread stack if compaction 
> >> failed
> >> To: "Andy Lutomirski" 
> >> Cc: "Xishi Qiu" , "Michal Hocko"
> >> , "Tejun Heo" , "Ingo Molnar"
> >> , "Peter Zijlstra" , "LKML"
> >> , "Linux MM" ,
> >> "Yisheng Xie" 
> >>
> >> > On Thu, Jul 28, 2016 at 08:07:51AM -0700, Andy Lutomirski wrote:
> >> > > On Thu, Jul 28, 2016 at 3:51 AM, Xishi Qiu  wrote:
> >> > > > On 2016/7/28 17:43, Michal Hocko wrote:
> >> > > >
> >> > > >> On Thu 28-07-16 16:45:06, Xishi Qiu wrote:
> >> > > >>> On 2016/7/28 15:58, Michal Hocko wrote:
> >> > > >>>
> >> > >  On Thu 28-07-16 15:41:53, Xishi Qiu wrote:
> >> > > > On 2016/7/28 15:20, Michal Hocko wrote:
> >> > > >
> >> > > >> On Thu 28-07-16 15:08:26, Xishi Qiu wrote:
> >> > > >>> Usually THREAD_SIZE_ORDER is 2, it means we need to alloc 16kb 
> >> > > >>> continuous
> >> > > >>> physical memory during fork a new process.
> >> > > >>>
> >> > > >>> If the system's memory is very small, especially the smart 
> >> > > >>> phone, maybe there
> >> > > >>> is only 1G memory. So the free memory is very small and 
> >> > > >>> compaction is not
> >> > > >>> always success in slowpath(__alloc_pages_slowpath), then alloc 
> >> > > >>> thread stack
> >> > > >>> may be failed for memory fragment.
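
For reference, the 16kb figure follows directly from the allocation order. A small sketch, assuming 4 KB pages; the helper name `order_to_bytes` is made up for illustration and is not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of physically contiguous memory needed for an allocation of the
 * given order: page_size * 2^order. */
static size_t order_to_bytes(size_t page_size, unsigned int order)
{
    assert(order < 8 * sizeof(size_t));
    return page_size << order;
}
```

Calling `order_to_bytes(4096, 2)` yields 16384, i.e. four physically contiguous 4 KB pages per thread stack.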
> >> > > >>
> >> > > >> Well, with the current implementation of the page allocator 
> >> > > >> those
> >> > > >> requests will not fail in most cases. The oom killer would be 
> >> > > >> invoked in
> >> > > >> order to free up some memory.
> >> > > >>
> >> > > >
> >> > > > Hi Michal,
> >> > > >
> >> > > > Yes, it success in most cases, but I did have seen this problem 
> >> > > > in some
> >> > > > stress-test.
> >> > > >
> >> > > > DMA free:470628kB, but alloc 2 order block failed during fork a 
> >> > > > new process.
> >> > > > There are so many memory fragments and the large block may be 
> >> > > > soon taken by
> >> > > > others after compact because of stress-test.
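
The situation described above (plenty of free memory, yet no order-2 block) can be illustrated with a toy model. This is only a sketch: it scans a flat array of per-page free flags for a naturally aligned run, which is a simplification of what the buddy allocator tracks, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count the free pages in the map. */
static size_t count_free(const bool *free_map, size_t npages)
{
    size_t n = 0;

    assert(free_map != NULL);
    for (size_t i = 0; i < npages; i++)
        if (free_map[i])
            n++;
    return n;
}

/* True if the map holds a naturally aligned run of (1 << order) free pages. */
static bool has_free_block(const bool *free_map, size_t npages,
                           unsigned int order)
{
    size_t block = (size_t)1 << order;

    assert(free_map != NULL);
    for (size_t start = 0; start + block <= npages; start += block) {
        bool all_free = true;

        for (size_t i = 0; i < block; i++) {
            if (!free_map[start + i]) {
                all_free = false;
                break;
            }
        }
        if (all_free)
            return true;
    }
    return false;
}
```

If every fourth page of a 16-page map is in use, 12 of the 16 pages are free, but `has_free_block(map, 16, 2)` is false: no aligned group of four pages is entirely free, so an order-2 request cannot be satisfied without compaction.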
> >> > > >
> >> > > > --- dmesg messages ---
> >> > > > 07-13 08:41:51.341 
> >> > > > <4>[309805.658142s][pid:1361,cpu5,sManagerService]sManagerService:
> >> > > >  page allocation failure: order:2, mode:0x2000d1
> >> > > 
> >> > >  Yes but this is __GFP_DMA allocation. I guess you have already 
> >> > >  reported
> >> > >  this failure and you've been told that this is quite unexpected 
> >> > >  for the
> >> > >  kernel stack allocation. It is your out-of-tree patch which just 
> >> > >  makes
> >> > >  things worse because DMA restricted allocations are considered 
> >> > >  "lowmem"
> >> > >  and so they do not invoke OOM killer and do not retry like regular
> >> > >  GFP_KERNEL allocations.
> >> > > >>>
> >> > > >>> Hi Michal,
> >> > > >>>
> >> > > >>> Yes, we add GFP_DMA, but I don't think this is the key for the 
> >> > > >>> problem.
> >> > > >>
> >> > > >> You are restricting the allocation request to a single zone which is
> >> > > >> definitely not good. Look at how many larger order pages are 
> >> > > >> available
> >> > > >> in the Normal zone.
> >> > > >>
> >> > > >>> If we do oom-killer, maybe we will get a large block later, but 
> >> > > >>> there
> >> > > >>> is enough free memory before oom(although most of them are 
> >> > > >>> fragments).
> >> > > >>
> >> > > >> Killing a task is of course the last resort action. It would give 
> >> > > >> you
> >> > > >> larger order blocks used for the victims thread.
> >> > > >>
> >> > > >>> I wonder if we can alloc success without kill any process in this 
> >> > > >>> situation.
> >> > > >>
> >> > > >> Sure it would be preferable to compact that memory but that might be
> >> > > >> hard with your restriction in place. Consider that DMA zone would 
> >> > > >> tend
> >> > > >> to be less movable than normal zones as users would have to pin it 
> >> > > >> for
> >> > > >> DMA. Your DMA is really large so this might turn out to just happen 
> >> > > >> to
> >> > > >> work but note that the primary problem here is that you put a zone
> >> > > >> restriction for your allocations.
> >> > > >>
> >> > > >>> Maybe use vmalloc is a good way, but I don't know the influence.
> >> > > >>
> >> > > >> You can have a look at vmalloc patches posted by Andy. They are not 
> >> > > >> that
> >> > > >> trivial.
> >> > > >>
> >> > > >
> >> > > > Hi Michal,
> >> > > >
> >> > > > Thank you for your comment, could you give me the link?
> >> > > >
> >> > >
> >> > > I've been keeping it mostly up to date in this branch:
> >> > >
> >> > > 

Re: [PATCH 2/6] x86/boot: Move compressed kernel to end of decompression buffer

2016-08-15 Thread Matt Mullins
[added Simon Glass to CC in case there's some input from u-boot]

On Thu, Apr 28, 2016 at 05:09:04PM -0700, Kees Cook wrote:
> From: Yinghai Lu 
> 
> This patch adds BP_init_size (which is the INIT_SIZE as passed in from
> the boot_params) into asm-offsets.c to make it visible to the assembly
> code. Then when moving the ZO, it calculates the starting position of
> the copied ZO (via BP_init_size and the ZO run size) so that the VO__end
> will be at the end of the decompression buffer. To make the position
> calculation safe, the end of ZO is page aligned (and a comment is added
> to the existing VO alignment for good measure).

> diff --git a/arch/x86/boot/compressed/head_64.S 
> b/arch/x86/boot/compressed/head_64.S
> index d43c30ed89ed..09cdc0c3ee7e 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -338,7 +340,9 @@ preferred_addr:
>  1:
>  
>   /* Target address to relocate to for decompression */
> - leaq	z_extract_offset(%rbp), %rbx
> + movl	BP_init_size(%rsi), %ebx
> + subl	$_end, %ebx
> + addq	%rbp, %rbx
>  
>   /* Set up the stack */
>   leaq	boot_stack_end(%rbx), %rsp

This appears to have a negative effect on booting the Intel Edison platform, as
it uses u-boot as its bootloader.  u-boot does not copy the init_size parameter
when booting a bzImage: it copies a fixed-size setup_header [1], and its
definition of setup_header doesn't include the parameters beyond setup_data [2].

With a zero value for init_size, this calculates a %rsp value of 0x101ff9600.
This causes the boot process to hard-stop at the immediately-following pushq, as
this platform has no usable physical addresses above 4G.
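
The wrap-around can be reproduced in plain C. This is only a sketch of the movl/subl/addq sequence quoted above: the helper name is hypothetical, and the `_end` and load-address values used in the note below are made up for illustration, not the Edison's real ones. It therefore shows the overflow past 4G but not the exact 0x101ff9600 figure, which also includes the boot_stack_end offset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of:
 *     movl BP_init_size(%rsi), %ebx
 *     subl $_end, %ebx
 *     addq %rbp, %rbx
 * The 32-bit subtraction wraps modulo 2^32, and writing %ebx zero-extends
 * the result into %rbx before the 64-bit add.
 */
static uint64_t relocation_target(uint32_t init_size, uint32_t zo_end,
                                  uint64_t load_addr)
{
    uint32_t ebx = init_size - zo_end;  /* wraps if init_size < zo_end */

    assert(load_addr <= UINT64_MAX - ebx);
    return load_addr + (uint64_t)ebx;
}
```

With an assumed `_end` of 0x00600000 and a load address of 0x01000000, an init_size of 0 gives a target of 0x100a00000, which is above 4G, while an assumed init_size of 0x01500000 gives 0x01f00000.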

What are the options for getting this type of platform to function again?  For
now, kexec from a working Linux system does seem to be a work-around, but there
appears to be other x86 hardware using u-boot as well: the chromium.org folks
seem to be maintaining the u-boot x86 tree.

[1] 
http://git.denx.de/?p=u-boot.git;a=blob;f=arch/x86/lib/zimage.c;h=1b33c771391f49ffe82864ff1582bdfd07e5e97d;hb=HEAD#l156
[2] 
http://git.denx.de/?p=u-boot.git;a=blob;f=arch/x86/include/asm/bootparam.h;h=140095117e5a2daef0a097c55f0ed10e08acc781;hb=HEAD#l24


Re: [PATCH 4.7 00/41] 4.7.1-stable review

2016-08-15 Thread Shuah Khan
On 08/14/2016 02:38 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.7.1 release.
> There are 41 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue Aug 16 20:25:22 UTC 2016.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.7.1-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.7.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America(Silicon Valley)
shuah...@samsung.com


Re: [PATCH v6 0/2] Block layer support ZAC/ZBC commands

2016-08-15 Thread Damien Le Moal

Shaun,

> On Aug 14, 2016, at 09:09, Shaun Tancheff  wrote:
[…]
>>> 
>> No, surely not.
>> But one of the _big_ advantages for the RB tree is blkdev_discard().
>> Without the RB tree any mkfs program will issue a 'discard' for every
>> sector. We will be able to coalesce those into one discard per zone, but
>> we still need to issue one for _every_ zone.
> 
> How can you make coalesce work transparently in the
> sd layer _without_ keeping some sort of a discard cache along
> with the zone cache?
> 
> Currently the block layer's blkdev_issue_discard() is breaking
> large discards into nice granular and aligned chunks but it is
> not preventing small discards nor coalescing them.
> 
> In the sd layer would there be a way to persist or purge an
> overly large discard cache? What about honoring
> discard_zeroes_data? Once the discard is completed with
> discard_zeroes_data you have to return zeroes whenever
> a discarded sector is read. Isn't that a lot more than just
> tracking a write pointer? Couldn't a zone have dozens of holes?

My understanding of the standards regarding discard is that it is not
mandatory and that it is a hint to the drive. The drive can completely
ignore it if it thinks that is a better choice. I may be wrong on this
though. Need to check again.
For reset write pointer, the mapping to discard requires that the calls
to blkdev_issue_discard be zone aligned for anything to happen. Specify
less than a zone and nothing will be done. This, I think, preserves the
discard semantics.
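
A rough sketch of that alignment rule, assuming a constant zone size for simplicity. The function name `zones_to_reset` and its parameters are hypothetical and are not taken from the patch set.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of zones whose write pointer a discard would reset.
 * A discard that does not start on a zone boundary, or does not cover a
 * whole number of zones, resets nothing (it is treated as a no-op).
 */
static uint64_t zones_to_reset(uint64_t sector, uint64_t nr_sects,
                               uint64_t zone_sects)
{
    assert(zone_sects != 0);
    if (sector % zone_sects != 0 || nr_sects % zone_sects != 0)
        return 0;
    return nr_sects / zone_sects;
}
```

With 524288-sector zones, a discard of sectors [0, 524288) resets one zone, while a discard of 1000 sectors, or one starting at sector 8, resets nothing.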

As for the “discard_zeroes_data” thing, I also think that is a
non-mandatory drive feature. Drives may have it or not, which is consistent
with the ZBC/ZAC standards regarding reading after write pointer (nothing
says that zeros have to be returned). In any case, discard of CMR zones
will be a no-op, so for SMR drives, discard_zeroes_data=0 may be a better
choice.

> 
>> Which is (as indicated) really slow, and easily takes several minutes.
>> With the RB tree we can short-circuit discards to empty zones, and speed
>> up processing time dramatically.
>> Sure we could be moving the logic into mkfs and friends, but that would
>> require us to change the programs and agree on a library (libzbc?) which
>> should be handling that.
> 
> F2FS's mkfs.f2fs is already reading the zone topology via SG_IO ...
> so I'm not sure your argument is valid here.

This initial SMR support patch is just that: a first try. Jaegeuk
used SG_IO (in fact a copy-paste of parts of libzbc) because the current
ZBC patch-set has no ioctl API for zone information manipulation. We
will fix mkfs.f2fs once we agree on an ioctl interface.

> 
> [..]
> 
>>>> 3) Try to condense the blkzone data structure to save memory:
>>>> I think that we can at the very least remove the zone length, and also
>>>> may be the per zone spinlock too (a single spinlock and proper state flags
>>>> can be used).
>>> 
>>> I have a variant that is an array of descriptors that roughly mimics the
>>> api from blk-zoned.c that I did a few months ago as an example.
>>> I should be able to update that to the current kernel + patches.
>>> 
>> Okay. If we restrict the in-kernel SMR drive handling to devices with
>> identical zone sizes of course we can remove the zone length.
>> And we can do away with the per-zone spinlock and use a global one instead.
> 
> I don't think dropping the zone length is a reasonable thing to do.
> 
> What I propose is an array of _descriptors_ it doesn't drop _any_
> of the zone information that you are holding in an RB tree, it is
> just a condensed format that _mostly_ plugs into your existing
> API.

I do not agree. The Seagate drive already has one zone (the last one)
that is not the same length as the other zones. Sure, since it is the
last one, we can add “if (last zone)” all over the place and make it
work. But that is really ugly. Keeping the length field makes the code
generic and keeps it in line with the standard, which has no restriction
on the zone sizes. We could do some memory optimisation using different types
of blk_zone structs, the types mapping to the SAME value: drives with
constant zone size can use a blk_zone type without the length field,
others use a different type that includes the field. Accessor functions
can hide the different types in the zone manipulation code.
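
To make the idea concrete, here is a minimal user-space sketch of two zone descriptor types hidden behind accessors. All type, field and function names are hypothetical and do not come from the patch set; locking and zone state are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Descriptor for drives whose zones all have the same length. */
struct zone_fixed {
    uint64_t wp;        /* write pointer */
};

/* Descriptor for drives with differing zone lengths. */
struct zone_var {
    uint64_t wp;        /* write pointer */
    uint64_t len;       /* per-zone length */
};

struct zone_table {
    int same;            /* nonzero: all zones share common_len */
    uint64_t common_len; /* valid only when same is nonzero */
    size_t nr_zones;
    union {
        struct zone_fixed *fixed;
        struct zone_var *var;
    } z;
};

/* Callers never look at the descriptor type, only at the accessors. */
static uint64_t zone_len(const struct zone_table *t, size_t idx)
{
    assert(t != NULL && idx < t->nr_zones);
    return t->same ? t->common_len : t->z.var[idx].len;
}

static uint64_t zone_wp(const struct zone_table *t, size_t idx)
{
    assert(t != NULL && idx < t->nr_zones);
    return t->same ? t->z.fixed[idx].wp : t->z.var[idx].wp;
}
```

A drive with a shorter last zone would use the `zone_var` layout, so no “if (last zone)” special case is needed in the callers; a drive with identical zones saves one 64-bit field per zone.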

Best regards.


-- 
Damien Le Moal, Ph.D.
Sr. Manager, System Software Group, HGST Research,
HGST, a Western Digital brand
damien.lem...@hgst.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa, 
Kanagawa, 252-0888 Japan
www.hgst.com

Re: [PATCH 4.6 00/56] 4.6.7-stable review

2016-08-15 Thread Shuah Khan
On 08/14/2016 02:37 PM, Greg Kroah-Hartman wrote:
> *
> NOTE
>   This is the LAST 4.6.y kernel that will be released.  After this
>   release, it is end-of-life.  You should be moving on to 4.7.y at this
>   point in time.  You have been warned.
> *
> 
> This is the start of the stable review cycle for the 4.6.7 release.
> There are 56 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue Aug 16 20:24:52 UTC 2016.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.6.7-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.6.y
> and the diffstat can be found below.
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America(Silicon Valley)
shuah...@samsung.com


Re: [PATCH 4.4 00/49] 4.4.18-stable review

2016-08-15 Thread Shuah Khan
On 08/14/2016 02:23 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.18 release.
> There are 49 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue Aug 16 20:22:43 UTC 2016.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.18-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.4.y
> and the diffstat can be found below.
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America(Silicon Valley)
shuah...@samsung.com


Re: [PATCH v6 0/2] Block layer support ZAC/ZBC commands

2016-08-15 Thread Damien Le Moal

Shaun,

> On Aug 14, 2016, at 09:09, Shaun Tancheff  wrote:
[…]
>>> 
>> No, surely not.
>> But one of the _big_ advantages for the RB tree is blkdev_discard().
>> Without the RB tree any mkfs program will issue a 'discard' for every
>> sector. We will be able to coalesce those into one discard per zone, but
>> we still need to issue one for _every_ zone.
> 
> How can you make coalesce work transparently in the
> sd layer _without_ keeping some sort of a discard cache along
> with the zone cache?
> 
> Currently the block layer's blkdev_issue_discard() is breaking
> large discard's into nice granular and aligned chunks but it is
> not preventing small discards nor coalescing them.
> 
> In the sd layer would there be way to persist or purge an
> overly large discard cache? What about honoring
> discard_zeroes_data? Once the discard is completed with
> discard_zeroes_data you have to return zeroes whenever
> a discarded sector is read. Isn't that a log more than just
> tracking a write pointer? Couldn't a zone have dozens of holes?

My understanding of the standards regarding discard is that it is not
mandatory and that it is a hint to the drive. The drive can completely
ignore it if it thinks that is a better choice. I may be wrong on this
though. Need to check again.
For reset write pointer, the mapping to discard requires that the calls
to blkdev_issue_discard be zone aligned for anything to happen. Specify
less than a zone and nothing will be done. This I think preserve the
discard semantic.

As for the “discard_zeroes_data” thing, I also think that is a drive
feature not mandatory. Drives may have it or not, which is consistent
with the ZBC/ZAC standards regarding reading after write pointer (nothing
says that zeros have to be returned). In any case, discard of CMR zones
will be a nop, so for SMR drives, discard_zeroes_data=0 may be a better
choice.

> 
>> Which is (as indicated) really slow, and easily takes several minutes.
>> With the RB tree we can short-circuit discards to empty zones, and speed
>> up processing time dramatically.
>> Sure we could be moving the logic into mkfs and friends, but that would
>> require us to change the programs and agree on a library (libzbc?) which
>> should be handling that.
> 
> F2FS's mkfs.f2fs is already reading the zone topology via SG_IO ...
> so I'm not sure your argument is valid here.

This initial SMR support patch is just that: a first try. Jaegeuk
used SG_IO (in fact copy-paste of parts of libzbc) because the current
ZBC patch-set has no ioctl API for zone information manipulation. We
will fix this mkfs.f2fs once we agree on an ioctl interface.

> 
> [..]
> 
 3) Try to condense the blkzone data structure to save memory:
 I think that we can at the very least remove the zone length, and also
 may be the per zone spinlock too (a single spinlock and proper state flags 
 can
 be used).
>>> 
>>> I have a variant that is an array of descriptors that roughly mimics the
>>> api from blk-zoned.c that I did a few months ago as an example.
>>> I should be able to update that to the current kernel + patches.
>>> 
>> Okay. If we restrict the in-kernel SMR drive handling to devices with
>> identical zone sizes of course we can remove the zone length.
>> And we can do away with the per-zone spinlock and use a global one instead.
> 
> I don't think dropping the zone length is a reasonable thing to do.
> 
> What I propose is an array of _descriptors_ it doesn't drop _any_
> of the zone information that you are holding in an RB tree, it is
> just a condensed format that _mostly_ plugs into your existing
> API.

I do not agree. The Seagate drive already has one zone (the last one)
that is not the same length as the other zones. Sure, since it is the
last one, we can add “if (last zone)” all over the place and make it
work. But that is really ugly. Keeping the length field makes the code
generic and compliant with the standard, which has no restriction on the
zone sizes. We could do some memory optimisation using different types
of blk_zone structs, the types mapping to the SAME value: drives with
constant zone size can use a blk_zone type without the length field,
others use a different type that includes the field. Accessor functions
can hide the different types in the zone manipulation code.
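
A rough sketch of what such accessors could look like follows. All
struct and function names here are hypothetical and chosen for
illustration; they are not the blk_zone definitions from the patch set.

```c
#include <stdint.h>

/* Hypothetical compact descriptor: drive reports one constant zone size. */
struct zone_fixed {
	uint64_t start;
	uint64_t wp;
};

/* Hypothetical full descriptor: zone sizes may differ, so keep the length. */
struct zone_var {
	uint64_t start;
	uint64_t wp;
	uint64_t len;
};

struct zoned_dev {
	int same_size;		/* nonzero: every zone is zone_len long */
	uint64_t zone_len;	/* only meaningful when same_size is set */
	unsigned int nr_zones;
	union {
		struct zone_fixed *fixed;
		struct zone_var *var;
	} zones;
};

/* Accessor hiding which descriptor type the device uses. */
static uint64_t zone_length(const struct zoned_dev *dev, unsigned int idx)
{
	return dev->same_size ? dev->zone_len : dev->zones.var[idx].len;
}
```

Under this layout, a drive whose last zone is shorter than the others
would use the full descriptor, and callers of zone_length() would not
need any "if (last zone)" special case.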

Best regards.


-- 
Damien Le Moal, Ph.D.
Sr. Manager, System Software Group, HGST Research,
HGST, a Western Digital brand
damien.lem...@hgst.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa, 
Kanagawa, 252-0888 Japan
www.hgst.com


Re: [PATCH 3.14 00/29] 3.14.76-stable review

2016-08-15 Thread Shuah Khan
On 08/14/2016 02:07 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.14.76 release.
> There are 29 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue Aug 16 20:07:18 UTC 2016.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.14.76-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-3.14.y
> and the diffstat can be found below.
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America(Silicon Valley)
shuah...@samsung.com


linux-next: Tree for Aug 16

2016-08-15 Thread Stephen Rothwell
Hi all,

Changes since 20160815:

Non-merge commits (relative to Linus' tree): 2149
 2086 files changed, 87009 insertions(+), 30762 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
(this fails its final link) and pseries_le_defconfig and i386, sparc
and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 241 trees (counting Linus' and 35 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (3684b03d8e9a Merge tag 'iommu-fixes-v4.8-rc2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu)
Merging fixes/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging kbuild-current/rc-fixes (b36fad65d61f kbuild: Initialize exported 
variables)
Merging arc-current/for-curr (45c3b08a117e ARC: Elide redundant setup of DMA 
callbacks)
Merging arm-current/fixes (87eed3c74d7c ARM: fix address limit restoration for 
undefined instructions)
Merging m68k-current/for-linus (6bd80f372371 m68k/defconfig: Update defconfigs 
for v4.7-rc2)
Merging metag-fixes/fixes (97b1d23f7bcb metag: Drop show_mem() from mem_init())
Merging powerpc-fixes/fixes (ca49e64f0cb1 selftests/powerpc: Specify we expect 
to build with std=gnu99)
Merging powerpc-merge-mpe/fixes (bc0195aad0da Linux 4.2-rc2)
Merging sparc/master (4620a06e4b3c shmem: Fix link error if huge pages support 
is disabled)
Merging net/master (d2fbdf76b85b tipc: fix NULL pointer dereference in 
shutdown())
Merging ipsec/master (1625f4529957 net/xfrm_input: fix possible NULL deref of 
tunnel.ip6->parms.i_key)
Merging netfilter/master (4b5b9ba553f9 openvswitch: do not ignore netdev errors 
when creating tunnel vports)
Merging ipvs/master (ea43f860d984 Merge branch 'ethoc-fixes')
Merging wireless-drivers/master (034fdd4a17ff Merge ath-current from ath.git)
Merging mac80211/master (4d0bd46a4d55 Revert "wext: Fix 32 bit iwpriv 
compatibility issue with 64 bit Kernel")
Merging sound-current/for-linus (a52ff34e5ec6 ALSA: hda - Manage power well 
properly for resume)
Merging pci-current/for-linus (8b078c603249 PCI: Update 
"pci=resource_alignment" documentation)
Merging driver-core.current/driver-core-linus (694d0d0bb203 Linux 4.8-rc2)
Merging tty.current/tty-linus (29b4817d4018 Linux 4.8-rc1)
Merging usb.current/usb-linus (add125054b87 cdc-acm: fix wrong pipe type on rx 
interrupt xfers)
Merging usb-gadget-fixes/fixes (a0ad85ae866f usb: dwc3: gadget: stop processing 
on HWO set)
Merging usb-serial-fixes/usb-linus (3b7c7e52efda USB: serial: mos7840: fix 
non-atomic allocation in write path)
Merging usb-chipidea-fixes/ci-for-usb-stable (ea1d39a31d3b usb: common: 
otg-fsm: add license to usb-otg-fsm)
Merging staging.current/staging-linus (99f1c013194e staging/lustre/llite: Close 
atomic_open race with several openers)
Merging char-misc.current/char-misc-linus (7b142d8fd0bd android: binder: fix 
dangling pointer comparison)
Merging input-current/for-linus (22fe874f3803 Input: silead - remove some dead 
code)
Merging crypto-current/master (a0118c8b2be9 crypto: caam - fix non-hmac hashes)
Merging ide/master (797cee982eef Merge branch 'stable-4.8' of 
git://git.infradead.org/users/pcmoore/audit)
Merging rr-fixes/fixes (8244062ef1e5 modules: fix longstanding /proc/kallsyms 
vs module insertion race.)
Merging vfio-fixes/for-linus (c8952a707556 vfio/pci: Fix NULL pointer oops in 
error interrupt setup handling)
Merging kselftest-fixes/fixes (29b4817d4018 Linux 4.8-rc1)
Merging backli

[PATCH V3 2/2] rtc/rtc-cmos: Initialize software counters before irq is registered

2016-08-15 Thread Pratyush Anand
We have observed on a few x86 machines with an rtc-cmos device that
hpet_rtc_interrupt() is called just after irq registration and before
cmos_do_probe() could call hpet_rtc_timer_init().

So, neither hpet_default_delta nor hpet_t1_cmp is initialized by the time
the interrupt is raised in the given situation, and this results in an NMI
watchdog LOCKUP.

It has only been observed sporadically on kdump secondary kernels.

See the call trace:
---<-snip->---
[   27.913194] Kernel panic - not syncing: Watchdog detected hard LOCKUP on
cpu 0
[   27.915371] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
3.10.0-342.el7.x86_64 #1
[   27.917503] Hardware name: HP ProLiant DL160 Gen8, BIOS J03 02/10/2014
[   27.919455]  8186a728 59c82488 880034e05af0
81637bd4
[   27.921870]  880034e05b70 8163144a 0010
880034e05b80
[   27.924257]  880034e05b20 59c82488 

[   27.926599] Call Trace:
[   27.927352][] dump_stack+0x19/0x1b
[   27.929080]  [] panic+0xd8/0x1e7
[   27.930588]  [] ? restart_watchdog_hrtimer+0x50/0x50
[   27.932502]  [] watchdog_overflow_callback+0xc2/0xd0
[   27.934427]  [] __perf_event_overflow+0xa1/0x250
[   27.936232]  [] perf_event_overflow+0x14/0x20
[   27.937957]  [] intel_pmu_handle_irq+0x1e8/0x470
[   27.939799]  [] perf_event_nmi_handler+0x2b/0x50
[   27.941649]  [] nmi_handle.isra.0+0x69/0xb0
[   27.943348]  [] do_nmi+0x169/0x340
[   27.944802]  [] end_repeat_nmi+0x1e/0x2e
[   27.946424]  [] ? hpet_rtc_interrupt+0x85/0x380
[   27.948197]  [] ? hpet_rtc_interrupt+0x85/0x380
[   27.949992]  [] ? hpet_rtc_interrupt+0x85/0x380
[   27.951816]  <>[] ?
run_timer_softirq+0x43/0x340
[   27.954114]  [] handle_irq_event_percpu+0x3e/0x1e0
[   27.955962]  [] handle_irq_event+0x3d/0x60
[   27.957635]  [] handle_edge_irq+0x77/0x130
[   27.959332]  [] handle_irq+0xbf/0x150
[   27.960949]  [] do_IRQ+0x4f/0xf0
[   27.962434]  [] common_interrupt+0x6d/0x6d
[   27.964101][] ?
_raw_spin_unlock_irqrestore+0x1b/0x40
[   27.966308]  [] __setup_irq+0x2a7/0x570
[   28.067859]  [] ? hpet_cpuhp_notify+0x140/0x140
[   28.069709]  [] request_threaded_irq+0xcc/0x170
[   28.071585]  [] cmos_do_probe+0x1e6/0x450
[   28.073240]  [] ? cmos_do_probe+0x450/0x450
[   28.074911]  [] cmos_pnp_probe+0xbb/0xc0
[   28.076533]  [] pnp_device_probe+0x65/0xd0
[   28.078198]  [] driver_probe_device+0x87/0x390
[   28.079971]  [] __driver_attach+0x93/0xa0
[   28.081660]  [] ? __device_attach+0x40/0x40
[   28.083662]  [] bus_for_each_dev+0x73/0xc0
[   28.085370]  [] driver_attach+0x1e/0x20
[   28.086974]  [] bus_add_driver+0x200/0x2d0
[   28.088634]  [] ? rtc_sysfs_init+0xe/0xe
[   28.090349]  [] driver_register+0x64/0xf0
[   28.091989]  [] pnp_register_driver+0x20/0x30
[   28.093707]  [] cmos_init+0x11/0x71
---<-snip->---

The previous patch split hpet_rtc_timer_init into
hpet_rtc_timer_counter_init() and hpet_rtc_timer_enable().

Therefore, this patch moves hpet_rtc_timer_counter_init() before IRQ
registration, so that we can gracefully handle such spurious interrupts.

Without this patch, we were able to reproduce the problem within at most
15 trials of kdump secondary kernel boot on an hp-dl160gen8 machine.
With this patch applied, more than 35 trials went fine.

Signed-off-by: Pratyush Anand 
[dzic...@redhat.com: edited the patch's summary]
Signed-off-by: Don Zickus 
---
 drivers/rtc/rtc-cmos.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index 43745cac0141..089d987f2638 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -129,6 +129,16 @@ static inline int hpet_rtc_dropped_irq(void)
return 0;
 }
 
+static inline int hpet_rtc_timer_counter_init(void)
+{
+   return 0;
+}
+
+static inline int hpet_rtc_timer_enable(void)
+{
+   return 0;
+}
+
 static inline int hpet_rtc_timer_init(void)
 {
return 0;
@@ -707,6 +717,7 @@ cmos_do_probe(struct device *dev, struct resource *ports, 
int rtc_irq)
goto cleanup1;
}
 
+   hpet_rtc_timer_counter_init();
if (is_valid_irq(rtc_irq)) {
irq_handler_t rtc_cmos_int_handler;
 
@@ -729,7 +740,7 @@ cmos_do_probe(struct device *dev, struct resource *ports, 
int rtc_irq)
goto cleanup1;
}
}
-   hpet_rtc_timer_init();
+   hpet_rtc_timer_enable();
 
/* export at least the first block of NVRAM */
nvram.size = address_space - NVRAM_OFFSET;
-- 
2.5.5
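
For readers who want to see why the ordering matters, here is a small
user-space model. It is only a sketch: the names are hypothetical
stand-ins, the catch-up loop is my assumption about how a handler can
fail to make progress when the delta is still zero, and the step cap
exists only so the model terminates.

```c
#include <stdint.h>

#define MODEL_MAX_STEPS 1000u

static uint32_t model_delta;	/* stands in for hpet_default_delta */
static uint32_t model_cmp;	/* stands in for hpet_t1_cmp */

/* Models the counter setup that the patch moves ahead of IRQ registration. */
static void model_counter_init(uint32_t now)
{
	model_delta = 64;
	model_cmp = now + model_delta;
}

/*
 * Models an interrupt handler that advances the comparator until it is
 * ahead of the current counter value. Returns the number of steps taken,
 * or -1 if the cap is hit, which in this model represents a lockup.
 */
static int model_interrupt(uint32_t now)
{
	unsigned int steps = 0;

	while ((int32_t)(model_cmp - now) <= 0) {
		if (++steps > MODEL_MAX_STEPS)
			return -1;
		model_cmp += model_delta;	/* no progress when delta is 0 */
	}
	return (int)steps;
}
```

In this model, calling model_interrupt() before model_counter_init()
hits the cap because the comparator never advances, while calling it
afterwards returns after zero or a few steps.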



[PATCH V3 1/2] rtc/hpet: Factorize hpet_rtc_timer_init()

2016-08-15 Thread Pratyush Anand
We need the ability to support initialization of hpet_default_delta and
hpet_t1_cmp counters before irq can be enabled.

This patch splits hpet_rtc_timer_init() into two functions:
hpet_rtc_timer_counter_init() and hpet_rtc_timer_enable(), so that the
above functionality can be achieved.

The next patch explains its need in detail.

No functional change in this patch.

Signed-off-by: Pratyush Anand 
[dzic...@redhat.com: edited the patch's summary]
Signed-off-by: Don Zickus 
---
 arch/x86/include/asm/hpet.h |  2 ++
 arch/x86/kernel/hpet.c  | 41 +++--
 2 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index cc285ec4b2c1..8eecb31bebcb 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -96,6 +96,8 @@ extern int hpet_set_alarm_time(unsigned char hrs, unsigned 
char min,
   unsigned char sec);
 extern int hpet_set_periodic_freq(unsigned long freq);
 extern int hpet_rtc_dropped_irq(void);
+extern int hpet_rtc_timer_counter_init(void);
+extern int hpet_rtc_timer_enable(void);
 extern int hpet_rtc_timer_init(void);
 extern irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id);
 extern int hpet_register_irq_handler(rtc_irq_handler handler);
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index ed16e58658a4..6f6d21059b1b 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -1074,14 +1074,12 @@ void hpet_unregister_irq_handler(rtc_irq_handler 
handler)
 EXPORT_SYMBOL_GPL(hpet_unregister_irq_handler);
 
 /*
- * Timer 1 for RTC emulation. We use one shot mode, as periodic mode
- * is not supported by all HPET implementations for timer 1.
- *
- * hpet_rtc_timer_init() is called when the rtc is initialized.
+ * hpet_rtc_timer_counter_init() is called before interrupt can be
+ * registered
  */
-int hpet_rtc_timer_init(void)
+int hpet_rtc_timer_counter_init(void)
 {
-   unsigned int cfg, cnt, delta;
+   unsigned int cnt, delta;
unsigned long flags;
 
if (!is_hpet_enabled())
@@ -1106,6 +1104,22 @@ int hpet_rtc_timer_init(void)
hpet_writel(cnt, HPET_T1_CMP);
hpet_t1_cmp = cnt;
 
+   local_irq_restore(flags);
+
+   return 1;
+}
+EXPORT_SYMBOL_GPL(hpet_rtc_timer_counter_init);
+
+/*
+ * hpet_rtc_timer_enable() is called during RTC initialization
+ */
+int hpet_rtc_timer_enable(void)
+{
+   unsigned int cfg;
+   unsigned long flags;
+
+   local_irq_save(flags);
+
cfg = hpet_readl(HPET_T1_CFG);
cfg &= ~HPET_TN_PERIODIC;
cfg |= HPET_TN_ENABLE | HPET_TN_32BIT;
@@ -1115,6 +1129,21 @@ int hpet_rtc_timer_init(void)
 
return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_rtc_timer_enable);
+
+/*
+ * Timer 1 for RTC emulation. We use one shot mode, as periodic mode
+ * is not supported by all HPET implementations for timer 1.
+ *
+ * hpet_rtc_timer_init() is called when the rtc is initialized.
+ */
+int hpet_rtc_timer_init(void)
+{
+   if (!hpet_rtc_timer_counter_init())
+   return 0;
+
+   return hpet_rtc_timer_enable();
+}
 EXPORT_SYMBOL_GPL(hpet_rtc_timer_init);
 
 static void hpet_disable_rtc_channel(void)
-- 
2.5.5



[PATCH 6/7] dax: define a unified inode/address_space for device-dax mappings

2016-08-15 Thread Dan Williams
In support of enabling resize / truncate of device-dax instances, define
a pseudo-fs to provide a unified inode/address space for vm operations.

Cc: Al Viro 
Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c  |  150 +++-
 fs/char_dev.c  |1 
 include/uapi/linux/magic.h |1 
 3 files changed, 148 insertions(+), 4 deletions(-)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 17715773c097..e8b9319aeadb 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -13,7 +13,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -26,6 +28,9 @@ static struct class *dax_class;
 static DEFINE_IDA(dax_minor_ida);
 static int nr_dax = CONFIG_NR_DEV_DAX;
 module_param(nr_dax, int, S_IRUGO);
+static struct vfsmount *dax_mnt;
+static struct kmem_cache *dax_cache __read_mostly;
+static struct super_block *dax_superblock __read_mostly;
 MODULE_PARM_DESC(nr_dax, "max number of device-dax instances");
 
 /**
@@ -61,6 +66,7 @@ struct dax_region {
  */
 struct dax_dev {
struct dax_region *region;
+   struct inode *inode;
struct device dev;
struct cdev cdev;
bool alive;
@@ -69,6 +75,117 @@ struct dax_dev {
struct resource res[0];
 };
 
+static struct inode *dax_alloc_inode(struct super_block *sb)
+{
+   return kmem_cache_alloc(dax_cache, GFP_KERNEL);
+}
+
+static void dax_i_callback(struct rcu_head *head)
+{
+   struct inode *inode = container_of(head, struct inode, i_rcu);
+
+   kmem_cache_free(dax_cache, inode);
+}
+
+static void dax_destroy_inode(struct inode *inode)
+{
+   call_rcu(&inode->i_rcu, dax_i_callback);
+}
+
+static const struct super_operations dax_sops = {
+   .statfs = simple_statfs,
+   .alloc_inode = dax_alloc_inode,
+   .destroy_inode = dax_destroy_inode,
+   .drop_inode = generic_delete_inode,
+};
+
+static struct dentry *dax_mount(struct file_system_type *fs_type,
+   int flags, const char *dev_name, void *data)
+{
+   return mount_pseudo(fs_type, "dax:", &dax_sops, NULL, DAXFS_MAGIC);
+}
+
+static struct file_system_type dax_type = {
+   .name = "dax",
+   .mount = dax_mount,
+   .kill_sb = kill_anon_super,
+};
+
+static int dax_test(struct inode *inode, void *data)
+{
+   return inode->i_cdev == data;
+}
+
+static int dax_set(struct inode *inode, void *data)
+{
+   inode->i_cdev = data;
+   return 0;
+}
+
+static struct inode *dax_inode_get(struct cdev *cdev, dev_t devt)
+{
+   struct inode *inode;
+
+   inode = iget5_locked(dax_superblock, hash_32(devt + DAXFS_MAGIC, 31),
+   dax_test, dax_set, cdev);
+
+   if (!inode)
+   return NULL;
+
+   if (inode->i_state & I_NEW) {
+   inode->i_mode = S_IFCHR;
+   inode->i_flags = S_DAX;
+   inode->i_rdev = devt;
+   mapping_set_gfp_mask(&inode->i_data, GFP_USER);
+   unlock_new_inode(inode);
+   }
+   return inode;
+}
+
+static void init_once(void *inode)
+{
+   inode_init_once(inode);
+}
+
+static int dax_inode_init(void)
+{
+   int rc;
+
+   dax_cache = kmem_cache_create("dax_cache", sizeof(struct inode), 0,
+   (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
+SLAB_MEM_SPREAD|SLAB_ACCOUNT),
+   init_once);
+   if (!dax_cache)
+   return -ENOMEM;
+
+   rc = register_filesystem(&dax_type);
+   if (rc)
+   goto err_register_fs;
+
+   dax_mnt = kern_mount(&dax_type);
+   if (IS_ERR(dax_mnt)) {
+   rc = PTR_ERR(dax_mnt);
+   goto err_mount;
+   }
+   dax_superblock = dax_mnt->mnt_sb;
+
+   return 0;
+
+ err_mount:
+   unregister_filesystem(_type);
+ err_register_fs:
+   kmem_cache_destroy(dax_cache);
+
+   return rc;
+}
+
+static void dax_inode_exit(void)
+{
+   kern_unmount(dax_mnt);
+   unregister_filesystem(&dax_type);
+   kmem_cache_destroy(dax_cache);
+}
+
 static void dax_region_free(struct kref *kref)
 {
struct dax_region *dax_region;
@@ -379,6 +496,9 @@ static int dax_open(struct inode *inode, struct file *filp)
 
dax_dev = container_of(inode->i_cdev, struct dax_dev, cdev);
dev_dbg(&dax_dev->dev, "%s\n", __func__);
+   inode->i_mapping = dax_dev->inode->i_mapping;
+   inode->i_mapping->host = dax_dev->inode;
+   filp->f_mapping = inode->i_mapping;
filp->private_data = dax_dev;
inode->i_flags = S_DAX;
 
@@ -410,6 +530,7 @@ static void dax_dev_release(struct device *dev)
ida_simple_remove(&dax_region->ida, dax_dev->id);
ida_simple_remove(&dax_minor_ida, MINOR(dev->devt));
dax_region_put(dax_region);
+   iput(dax_dev->inode);
kfree(dax_dev);
 }
 
@@ -459,6 +580,12 @@ int devm_create_dax_dev(struct dax_region *dax_region, 
struct resource 

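The dax_inode_get() helper in the patch above relies on the find-or-create
contract of iget5_locked(): a "test" callback recognises an existing inode
for the cdev, and a "set" callback initialises a newly allocated one. A
minimal user-space sketch of that contract follows. It is only an analogy:
obj_get(), struct obj and the fixed-size table are hypothetical names, and
the kernel's hashing, locking and I_NEW handling are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode: an opaque key plus bookkeeping flags. */
struct obj {
	void *key;
	int is_new;	/* 1 only on the call that created the object */
	int in_use;
};

#define TABLE_SIZE 8
static struct obj table[TABLE_SIZE];

typedef int (*test_fn)(struct obj *o, void *data);
typedef int (*set_fn)(struct obj *o, void *data);

/* Return the object that matches data, or claim a free slot and set it up. */
static struct obj *obj_get(test_fn test, set_fn set, void *data)
{
	struct obj *free_slot = NULL;
	size_t i;

	assert(test && set);
	for (i = 0; i < TABLE_SIZE; i++) {
		if (table[i].in_use) {
			if (test(&table[i], data)) {
				table[i].is_new = 0;
				return &table[i];
			}
		} else if (!free_slot) {
			free_slot = &table[i];
		}
	}
	/* No match: initialise a free slot, failing if none is left. */
	if (!free_slot || set(free_slot, data))
		return NULL;
	free_slot->in_use = 1;
	free_slot->is_new = 1;
	return free_slot;
}

/* Callbacks in the spirit of dax_test()/dax_set(): compare or store the key. */
static int key_test(struct obj *o, void *data) { return o->key == data; }
static int key_set(struct obj *o, void *data) { o->key = data; return 0; }
```

Calling obj_get(key_test, key_set, &x) twice with the same pointer returns
the same slot, with is_new set only on the first call, which is roughly the
property the patch depends on to get one inode per cdev.
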
[RFC PATCH] mmc: dw_mmc: avoid race condition of cpu and IDMAC

2016-08-15 Thread Shawn Lin
Testing shows an obvious race condition: the IDMAC's
earlier write that clears the OWN bit can land right
after the CPU's later configuration of the same
descriptor. The IDMAC then enters the SUSPEND state,
because the OWN bit was cleared by its own
asynchronous write. The bug is very easy to reproduce
on RK3288 or similar SoCs by lowering the bus
bandwidth and adjusting the QoS so that a large
number of IP blocks fight for priority. One possible
alternative is to allocate dual buffers for the
descriptors, but in theory those could still race
with each other.

Signed-off-by: Shawn Lin 

---

 drivers/mmc/host/dw_mmc.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
index 32380d5..7b01fab 100644
--- a/drivers/mmc/host/dw_mmc.c
+++ b/drivers/mmc/host/dw_mmc.c
@@ -490,6 +490,23 @@ static void dw_mci_translate_sglist(struct dw_mci *host, struct mmc_data *data,
length -= desc_len;
 
/*
+* The OWN bit should be cleared by the IDMAC
+* once it finishes the transfer. Wait for
+* that asynchronous write to complete so
+* that we do not depend on the ordering
+* imposed by bus QoS or the architecture.
+* Otherwise there is a race: the IDMAC's
+* earlier write (clearing the OWN bit) can
+* land right after the later reconfiguration
+* of the descriptor and overwrite it, so
+* the IDMAC then fetches the wrong
+* descriptor and ends up in the
+* DMA_SUSPEND state.
+*/
+   while ((readl(&desc->des0) & IDMAC_DES0_OWN))
+   ;
+
+   /*
 * Set the OWN bit and disable interrupts
 * for this descriptor
 */
@@ -535,6 +552,23 @@ static void dw_mci_translate_sglist(struct dw_mci *host, struct mmc_data *data,
length -= desc_len;
 
/*
+* The OWN bit should be cleared by the IDMAC
+* once it finishes the transfer. Wait for
+* that asynchronous write to complete so
+* that we do not depend on the ordering
+* imposed by bus QoS or the architecture.
+* Otherwise there is a race: the IDMAC's
+* earlier write (clearing the OWN bit) can
+* land right after the later reconfiguration
+* of the descriptor and overwrite it, so
+* the IDMAC then fetches the wrong
+* descriptor and ends up in the
+* DMA_SUSPEND state.
+*/
+   while ((readl(&desc->des0) & IDMAC_DES0_OWN))
+   ;
+
+   /*
 * Set the OWN bit and disable interrupts
 * for this descriptor
 */
-- 
2.3.7

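The wait loops added in the patch above spin until the IDMAC clears OWN,
with no upper bound. For comparison, here is a minimal user-space sketch of
a bounded variant. It is not what the patch does: wait_own_clear() and
max_polls are hypothetical names, the descriptor word is modelled as a plain
volatile variable instead of being read with readl(), and IDMAC_DES0_OWN is
assumed to be bit 31.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: OWN is bit 31 of des0. */
#define IDMAC_DES0_OWN (UINT32_C(1) << 31)

/*
 * Poll des0 until OWN is clear or max_polls reads have been made.
 * Returns 0 once OWN is seen clear, -1 if it never cleared.
 */
static int wait_own_clear(const volatile uint32_t *des0, unsigned int max_polls)
{
	assert(des0);
	while (max_polls--) {
		if (!(*des0 & IDMAC_DES0_OWN))
			return 0;
	}
	return -1;
}
```

A caller would still have to decide what to do when -1 comes back; the
patch as posted has no such error path, so this only illustrates the
trade-off between an unbounded spin and a timeout.
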



[PATCH 7/7] dax: unmap/truncate on device shutdown

2016-08-15 Thread Dan Williams
Invalidate all mappings of a device-dax instance when the device is
unregistered.

Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index e8b9319aeadb..0a7899d5c65c 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -550,6 +550,7 @@ static void unregister_dax_dev(void *dev)
 */
dax_dev->alive = false;
synchronize_rcu();
+   unmap_mapping_range(dax_dev->inode->i_mapping, 0, 0, 1);
cdev_del(cdev);
device_unregister(dev);
 }



[PATCH 5/7] dax: convert to the cdev api

2016-08-15 Thread Dan Williams
A goal of the device-DAX interface is to be able to support many
exclusive allocations (partitions) of performance / feature
differentiated memory.  This count may exceed the default minors limit
of 256.

As a result of switching to an embedded cdev the inode-to-dax_dev
conversion is simplified, as well as reference counting which can switch
to the cdev kobject lifetime.

Cc: Al Viro 
Signed-off-by: Dan Williams 
---
 drivers/dax/Kconfig |5 +++
 drivers/dax/dax.c   |   82 ++-
 2 files changed, 46 insertions(+), 41 deletions(-)

diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig
index cedab7572de3..daadd20aa936 100644
--- a/drivers/dax/Kconfig
+++ b/drivers/dax/Kconfig
@@ -23,4 +23,9 @@ config DEV_DAX_PMEM
 
  Say Y if unsure
 
+config NR_DEV_DAX
+   int "Maximum number of Device-DAX instances"
+   default 32768
+   range 256 2147483647
+
 endif
diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 181d2a5a21e4..17715773c097 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -14,15 +14,19 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include "dax.h"
 
-static int dax_major;
+static dev_t dax_devt;
 static struct class *dax_class;
 static DEFINE_IDA(dax_minor_ida);
+static int nr_dax = CONFIG_NR_DEV_DAX;
+module_param(nr_dax, int, S_IRUGO);
+MODULE_PARM_DESC(nr_dax, "max number of device-dax instances");
 
 /**
  * struct dax_region - mapping infrastructure for dax devices
@@ -49,6 +53,7 @@ struct dax_region {
  * struct dax_dev - subdivision of a dax region
  * @region - parent region
  * @dev - device backing the character device
+ * @cdev - core chardev data
  * @alive - !alive + rcu grace period == no new mappings can be established
  * @id - child id in the region
  * @num_resources - number of physical address extents in this device
@@ -57,6 +62,7 @@ struct dax_region {
 struct dax_dev {
struct dax_region *region;
struct device dev;
+   struct cdev cdev;
bool alive;
int id;
int num_resources;
@@ -367,29 +373,12 @@ static unsigned long dax_get_unmapped_area(struct file *filp,
return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
 }
 
-static int __match_devt(struct device *dev, const void *data)
-{
-   const dev_t *devt = data;
-
-   return dev->devt == *devt;
-}
-
-static struct device *dax_dev_find(dev_t dev_t)
-{
-   return class_find_device(dax_class, NULL, &dev_t, __match_devt);
-}
-
 static int dax_open(struct inode *inode, struct file *filp)
 {
-   struct dax_dev *dax_dev = NULL;
-   struct device *dev;
-
-   dev = dax_dev_find(inode->i_rdev);
-   if (!dev)
-   return -ENXIO;
+   struct dax_dev *dax_dev;
 
-   dax_dev = to_dax_dev(dev);
-   dev_dbg(dev, "%s\n", __func__);
+   dax_dev = container_of(inode->i_cdev, struct dax_dev, cdev);
+   dev_dbg(&dax_dev->dev, "%s\n", __func__);
filp->private_data = dax_dev;
inode->i_flags = S_DAX;
 
@@ -399,11 +388,8 @@ static int dax_open(struct inode *inode, struct file *filp)
 static int dax_release(struct inode *inode, struct file *filp)
 {
struct dax_dev *dax_dev = filp->private_data;
-   struct device *dev = &dax_dev->dev;
-
-   dev_dbg(dev, "%s\n", __func__);
-   put_device(dev);
 
+   dev_dbg(&dax_dev->dev, "%s\n", __func__);
return 0;
 }
 
@@ -430,6 +416,7 @@ static void dax_dev_release(struct device *dev)
 static void unregister_dax_dev(void *dev)
 {
struct dax_dev *dax_dev = to_dax_dev(dev);
+   struct cdev *cdev = &dax_dev->cdev;
 
dev_dbg(dev, "%s\n", __func__);
 
@@ -442,6 +429,7 @@ static void unregister_dax_dev(void *dev)
 */
dax_dev->alive = false;
synchronize_rcu();
+   cdev_del(cdev);
device_unregister(dev);
 }
 
@@ -451,17 +439,13 @@ int devm_create_dax_dev(struct dax_region *dax_region, struct resource *res,
struct device *parent = dax_region->dev;
struct dax_dev *dax_dev;
struct device *dev;
+   struct cdev *cdev;
int rc, minor;
dev_t dev_t;
 
dax_dev = kzalloc(sizeof(*dax_dev) + sizeof(*res) * count, GFP_KERNEL);
if (!dax_dev)
return -ENOMEM;
-   memcpy(dax_dev->res, res, sizeof(*res) * count);
-   dax_dev->num_resources = count;
-   dax_dev->alive = true;
-   dax_dev->region = dax_region;
-   kref_get(&dax_region->kref);
 
dax_dev->id = ida_simple_get(&dax_region->ida, 0, 0, GFP_KERNEL);
if (dax_dev->id < 0) {
@@ -475,10 +459,26 @@ int devm_create_dax_dev(struct dax_region *dax_region, struct resource *res,
goto err_minor;
}
 
-   dev_t = MKDEV(dax_major, minor);
-
+   /* device_initialize() so cdev can reference kobj parent */
+   dev_t = MKDEV(MAJOR(dax_devt), minor);
dev = &dax_dev->dev;
device_initialize(dev);
+
+   cdev = &dax_dev->cdev;
+  

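The patch above builds each instance's device number with
MKDEV(MAJOR(dax_devt), minor), which is what lets the minor exceed 255. As
a rough user-space illustration, the macros below mirror the kernel's
12-bit major / 20-bit minor dev_t layout. The K_ prefix, instance_devt()
and the example major number used in any caller are hypothetical; the
prefix is there only to avoid clashing with libc's own major()/minor().

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the kernel-internal dev_t split: 20 low bits of minor. */
#define K_MINORBITS 20
#define K_MINORMASK ((UINT32_C(1) << K_MINORBITS) - 1)

#define K_MAJOR(dev) ((uint32_t)((dev) >> K_MINORBITS))
#define K_MINOR(dev) ((uint32_t)((dev) & K_MINORMASK))
#define K_MKDEV(ma, mi) ((((uint32_t)(ma)) << K_MINORBITS) | (uint32_t)(mi))

/* Build the device number for one instance from a base devt and a minor. */
static uint32_t instance_devt(uint32_t base_devt, uint32_t minor)
{
	assert(minor <= K_MINORMASK);
	return K_MKDEV(K_MAJOR(base_devt), minor);
}
```

With a base of K_MKDEV(250, 0), instance_devt(base, 32767) keeps major 250
and carries minor 32767 intact, a value that an 8-bit minor could not
represent.
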
[PATCH 4/7] dax: embed a struct device in dax_dev

2016-08-15 Thread Dan Williams
The kref in dax_dev can be made redundant if the final put_device() on
the device associated with the dax_dev frees the dax_dev.  This can be
accomplished by embedding a struct device in struct dax_dev, open coding
device_create() and specifying a custom release method.

Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c |  130 ++---
 1 file changed, 45 insertions(+), 85 deletions(-)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 994dfa507dfb..181d2a5a21e4 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -49,7 +49,6 @@ struct dax_region {
  * struct dax_dev - subdivision of a dax region
  * @region - parent region
  * @dev - device backing the character device
- * @kref - enable this data to be tracked in filp->private_data
  * @alive - !alive + rcu grace period == no new mappings can be established
  * @id - child id in the region
  * @num_resources - number of physical address extents in this device
@@ -57,8 +56,7 @@ struct dax_region {
  */
 struct dax_dev {
struct dax_region *region;
-   struct device *dev;
-   struct kref kref;
+   struct device dev;
bool alive;
int id;
int num_resources;
@@ -79,20 +77,6 @@ void dax_region_put(struct dax_region *dax_region)
 }
 EXPORT_SYMBOL_GPL(dax_region_put);
 
-static void dax_dev_free(struct kref *kref)
-{
-   struct dax_dev *dax_dev;
-
-   dax_dev = container_of(kref, struct dax_dev, kref);
-   dax_region_put(dax_dev->region);
-   kfree(dax_dev);
-}
-
-static void dax_dev_put(struct dax_dev *dax_dev)
-{
-   kref_put(&dax_dev->kref, dax_dev_free);
-}
-
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
struct resource *res, unsigned int align, void *addr,
unsigned long pfn_flags)
@@ -117,10 +101,15 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 }
 EXPORT_SYMBOL_GPL(alloc_dax_region);
 
+static struct dax_dev *to_dax_dev(struct device *dev)
+{
+   return container_of(dev, struct dax_dev, dev);
+}
+
 static ssize_t size_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
-   struct dax_dev *dax_dev = dev_get_drvdata(dev);
+   struct dax_dev *dax_dev = to_dax_dev(dev);
unsigned long long size = 0;
int i;
 
@@ -149,7 +138,7 @@ static int check_vma(struct dax_dev *dax_dev, struct vm_area_struct *vma,
const char *func)
 {
struct dax_region *dax_region = dax_dev->region;
-   struct device *dev = dax_dev->dev;
+   struct device *dev = &dax_dev->dev;
unsigned long mask;
 
if (!dax_dev->alive)
@@ -214,7 +203,7 @@ static int __dax_dev_fault(struct dax_dev *dax_dev, struct vm_area_struct *vma,
struct vm_fault *vmf)
 {
unsigned long vaddr = (unsigned long) vmf->virtual_address;
-   struct device *dev = dax_dev->dev;
+   struct device *dev = &dax_dev->dev;
struct dax_region *dax_region;
int rc = VM_FAULT_SIGBUS;
phys_addr_t phys;
@@ -254,7 +243,7 @@ static int dax_dev_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
struct file *filp = vma->vm_file;
struct dax_dev *dax_dev = filp->private_data;
 
-   dev_dbg(dax_dev->dev, "%s: %s: %s (%#lx - %#lx)\n", __func__,
+   dev_dbg(&dax_dev->dev, "%s: %s: %s (%#lx - %#lx)\n", __func__,
current->comm, (vmf->flags & FAULT_FLAG_WRITE)
? "write" : "read", vma->vm_start, vma->vm_end);
rcu_read_lock();
@@ -269,7 +258,7 @@ static int __dax_dev_pmd_fault(struct dax_dev *dax_dev,
unsigned int flags)
 {
unsigned long pmd_addr = addr & PMD_MASK;
-   struct device *dev = dax_dev->dev;
+   struct device *dev = &dax_dev->dev;
struct dax_region *dax_region;
phys_addr_t phys;
pgoff_t pgoff;
@@ -311,7 +300,7 @@ static int dax_dev_pmd_fault(struct vm_area_struct *vma, unsigned long addr,
struct file *filp = vma->vm_file;
struct dax_dev *dax_dev = filp->private_data;
 
-   dev_dbg(dax_dev->dev, "%s: %s: %s (%#lx - %#lx)\n", __func__,
+   dev_dbg(&dax_dev->dev, "%s: %s: %s (%#lx - %#lx)\n", __func__,
current->comm, (flags & FAULT_FLAG_WRITE)
? "write" : "read", vma->vm_start, vma->vm_end);
 
@@ -322,29 +311,9 @@ static int dax_dev_pmd_fault(struct vm_area_struct *vma, unsigned long addr,
return rc;
 }
 
-static void dax_dev_vm_open(struct vm_area_struct *vma)
-{
-   struct file *filp = vma->vm_file;
-   struct dax_dev *dax_dev = filp->private_data;
-
-   dev_dbg(dax_dev->dev, "%s\n", __func__);
-   kref_get(&dax_dev->kref);
-}
-
-static void dax_dev_vm_close(struct vm_area_struct *vma)
-{
-   struct file *filp = vma->vm_file;
-   struct dax_dev *dax_dev = filp->private_data;
-
-   dev_dbg(dax_dev->dev, "%s\n", __func__);
-   dax_dev_put(dax_dev);

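The lifetime rule described in the commit message above, where the final
put on the embedded device frees the containing dax_dev, can be sketched in
user space. Everything here is a hypothetical stand-in: struct mini_device,
mini_get()/mini_put() and a plain integer refcount take the place of struct
device, get_device()/put_device() and the kobject machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Recover the containing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mini_device {
	int refcount;
	void (*release)(struct mini_device *dev);
};

struct mini_dax_dev {
	int id;
	struct mini_device dev;	/* embedded, not a pointer */
};

static int released_count;	/* lets a caller observe that release ran */

/* Custom release: the last put on dev frees the whole container. */
static void mini_dax_release(struct mini_device *dev)
{
	struct mini_dax_dev *dax_dev = container_of(dev, struct mini_dax_dev, dev);

	free(dax_dev);
	released_count++;
}

static struct mini_dax_dev *mini_dax_create(int id)
{
	struct mini_dax_dev *dax_dev = calloc(1, sizeof(*dax_dev));

	if (!dax_dev)
		return NULL;
	dax_dev->id = id;
	dax_dev->dev.refcount = 1;	/* creator holds the first reference */
	dax_dev->dev.release = mini_dax_release;
	return dax_dev;
}

static void mini_get(struct mini_device *dev)
{
	dev->refcount++;
}

static void mini_put(struct mini_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}
```

Because the only counter lives in the embedded member, a separate kref on
the container has nothing left to track, which is the redundancy the patch
removes.
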
[PATCH 1/7] dax: cleanup needlessly global symbol warnings

2016-08-15 Thread Dan Williams
drivers/dax/dax.c:75:6: warning: symbol 'dax_region_put' was not declared.
drivers/dax/dax.c:95:19: warning: symbol 'alloc_dax_region' was not declared.
drivers/dax/dax.c:173:5: warning: symbol 'devm_create_dax_dev' was not declared.
drivers/dax/pmem.c:27:17: warning: symbol 'to_dax_pmem' was not declared.

Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c  |1 +
 drivers/dax/pmem.c |2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 803f3953b341..736c03830fd0 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include "dax.h"
 
 static int dax_major;
 static struct class *dax_class;
diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
index dfb168568af1..59b75c5972bb 100644
--- a/drivers/dax/pmem.c
+++ b/drivers/dax/pmem.c
@@ -24,7 +24,7 @@ struct dax_pmem {
struct completion cmp;
 };
 
-struct dax_pmem *to_dax_pmem(struct percpu_ref *ref)
+static struct dax_pmem *to_dax_pmem(struct percpu_ref *ref)
 {
return container_of(ref, struct dax_pmem, ref);
 }



[PATCH 3/7] dax: rename fops from dax_dev_ to dax_

2016-08-15 Thread Dan Williams
Shorten the prefix of the file operations to distinguish them from
operations on the struct device associated with the dax_dev.

Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c |   16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 3774fc9709bb..994dfa507dfb 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -347,7 +347,7 @@ static const struct vm_operations_struct dax_dev_vm_ops = {
.close = dax_dev_vm_close,
 };
 
-static int dax_dev_mmap(struct file *filp, struct vm_area_struct *vma)
+static int dax_mmap(struct file *filp, struct vm_area_struct *vma)
 {
struct dax_dev *dax_dev = filp->private_data;
int rc;
@@ -365,7 +365,7 @@ static int dax_dev_mmap(struct file *filp, struct vm_area_struct *vma)
 }
 
 /* return an unmapped area aligned to the dax region specified alignment */
-static unsigned long dax_dev_get_unmapped_area(struct file *filp,
+static unsigned long dax_get_unmapped_area(struct file *filp,
unsigned long addr, unsigned long len, unsigned long pgoff,
unsigned long flags)
 {
@@ -411,7 +411,7 @@ static struct device *dax_dev_find(dev_t dev_t)
return class_find_device(dax_class, NULL, &dev_t, __match_devt);
 }
 
-static int dax_dev_open(struct inode *inode, struct file *filp)
+static int dax_open(struct inode *inode, struct file *filp)
 {
struct dax_dev *dax_dev = NULL;
struct device *dev;
@@ -437,7 +437,7 @@ static int dax_dev_open(struct inode *inode, struct file *filp)
return 0;
 }
 
-static int dax_dev_release(struct inode *inode, struct file *filp)
+static int dax_release(struct inode *inode, struct file *filp)
 {
struct dax_dev *dax_dev = filp->private_data;
struct device *dev = dax_dev->dev;
@@ -452,10 +452,10 @@ static int dax_dev_release(struct inode *inode, struct file *filp)
 static const struct file_operations dax_fops = {
.llseek = noop_llseek,
.owner = THIS_MODULE,
-   .open = dax_dev_open,
-   .release = dax_dev_release,
-   .get_unmapped_area = dax_dev_get_unmapped_area,
-   .mmap = dax_dev_mmap,
+   .open = dax_open,
+   .release = dax_release,
+   .get_unmapped_area = dax_get_unmapped_area,
+   .mmap = dax_mmap,
 };
 
 static void unregister_dax_dev(void *_dev)



[PATCH 0/7] dax: unified host inode for device-dax mappings

2016-08-15 Thread Dan Williams
There are two scenarios where we need mappings of a /dev/dax device to
share a single host inode: invalidating mappings at device shutdown, and
coordinating resize of an actively mapped device.  This series addresses
the unmap-on-shutdown case and includes reworks, like the cdev api
conversion, to prepare for a dynamic resize / allocation capability.

Recall that device-DAX, introduced in v4.7 [1], is a mechanism to
provide deterministic mapping behavior for performance- /
feature-differentiated memory ranges.

[1]: 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ab68f2622136

---

Dan Williams (7):
  dax: cleanup needlessly global symbol warnings
  dax: reorder dax_fops function definitions
  dax: rename fops from dax_dev_ to dax_
  dax: embed a struct device in dax_dev
  dax: convert to the cdev api
  dax: define a unified inode/address_space for device-dax mappings
  dax: unmap/truncate on device shutdown


 drivers/dax/Kconfig|5 
 drivers/dax/dax.c  |  555 ++--
 drivers/dax/pmem.c |2 
 fs/char_dev.c  |1 
 include/uapi/linux/magic.h |1 
 5 files changed, 337 insertions(+), 227 deletions(-)


[PATCH 2/7] dax: reorder dax_fops function definitions

2016-08-15 Thread Dan Williams
In order to convert devm_create_dax_dev() to use cdev, it will need
access to dax_fops. Move dax_fops and related function definitions
before devm_create_dax_dev().

Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c |  337 ++---
 1 file changed, 168 insertions(+), 169 deletions(-)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 736c03830fd0..3774fc9709bb 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -145,175 +145,6 @@ static const struct attribute_group *dax_attribute_groups[] = {
NULL,
 };
 
-static void unregister_dax_dev(void *_dev)
-{
-   struct device *dev = _dev;
-   struct dax_dev *dax_dev = dev_get_drvdata(dev);
-   struct dax_region *dax_region = dax_dev->region;
-
-   dev_dbg(dev, "%s\n", __func__);
-
-   /*
-* Note, rcu is not protecting the liveness of dax_dev, rcu is
-* ensuring that any fault handlers that might have seen
-* dax_dev->alive == true, have completed.  Any fault handlers
-* that start after synchronize_rcu() has started will abort
-* upon seeing dax_dev->alive == false.
-*/
-   dax_dev->alive = false;
-   synchronize_rcu();
-
-   get_device(dev);
-   device_unregister(dev);
-   ida_simple_remove(&dax_region->ida, dax_dev->id);
-   ida_simple_remove(&dax_minor_ida, MINOR(dev->devt));
-   put_device(dev);
-   dax_dev_put(dax_dev);
-}
-
-int devm_create_dax_dev(struct dax_region *dax_region, struct resource *res,
-   int count)
-{
-   struct device *parent = dax_region->dev;
-   struct dax_dev *dax_dev;
-   struct device *dev;
-   int rc, minor;
-   dev_t dev_t;
-
-   dax_dev = kzalloc(sizeof(*dax_dev) + sizeof(*res) * count, GFP_KERNEL);
-   if (!dax_dev)
-   return -ENOMEM;
-   memcpy(dax_dev->res, res, sizeof(*res) * count);
-   dax_dev->num_resources = count;
-   kref_init(&dax_dev->kref);
-   dax_dev->alive = true;
-   dax_dev->region = dax_region;
-   kref_get(&dax_region->kref);
-
-   dax_dev->id = ida_simple_get(&dax_region->ida, 0, 0, GFP_KERNEL);
-   if (dax_dev->id < 0) {
-   rc = dax_dev->id;
-   goto err_id;
-   }
-
-   minor = ida_simple_get(&dax_minor_ida, 0, 0, GFP_KERNEL);
-   if (minor < 0) {
-   rc = minor;
-   goto err_minor;
-   }
-
-   dev_t = MKDEV(dax_major, minor);
-   dev = device_create_with_groups(dax_class, parent, dev_t, dax_dev,
-   dax_attribute_groups, "dax%d.%d", dax_region->id,
-   dax_dev->id);
-   if (IS_ERR(dev)) {
-   rc = PTR_ERR(dev);
-   goto err_create;
-   }
-   dax_dev->dev = dev;
-
-   rc = devm_add_action_or_reset(dax_region->dev, unregister_dax_dev, dev);
-   if (rc)
-   return rc;
-
-   return 0;
-
- err_create:
-   ida_simple_remove(&dax_minor_ida, minor);
- err_minor:
-   ida_simple_remove(&dax_region->ida, dax_dev->id);
- err_id:
-   dax_dev_put(dax_dev);
-
-   return rc;
-}
-EXPORT_SYMBOL_GPL(devm_create_dax_dev);
-
-/* return an unmapped area aligned to the dax region specified alignment */
-static unsigned long dax_dev_get_unmapped_area(struct file *filp,
-   unsigned long addr, unsigned long len, unsigned long pgoff,
-   unsigned long flags)
-{
-   unsigned long off, off_end, off_align, len_align, addr_align, align;
-   struct dax_dev *dax_dev = filp ? filp->private_data : NULL;
-   struct dax_region *dax_region;
-
-   if (!dax_dev || addr)
-   goto out;
-
-   dax_region = dax_dev->region;
-   align = dax_region->align;
-   off = pgoff << PAGE_SHIFT;
-   off_end = off + len;
-   off_align = round_up(off, align);
-
-   if ((off_end <= off_align) || ((off_end - off_align) < align))
-   goto out;
-
-   len_align = len + align;
-   if ((off + len_align) < off)
-   goto out;
-
-   addr_align = current->mm->get_unmapped_area(filp, addr, len_align,
-   pgoff, flags);
-   if (!IS_ERR_VALUE(addr_align)) {
-   addr_align += (off - addr_align) & (align - 1);
-   return addr_align;
-   }
- out:
-   return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
-}
-
-static int __match_devt(struct device *dev, const void *data)
-{
-   const dev_t *devt = data;
-
-   return dev->devt == *devt;
-}
-
-static struct device *dax_dev_find(dev_t dev_t)
-{
-   return class_find_device(dax_class, NULL, &dev_t, __match_devt);
-}
-
-static int dax_dev_open(struct inode *inode, struct file *filp)
-{
-   struct dax_dev *dax_dev = NULL;
-   struct device *dev;
-
-   dev = dax_dev_find(inode->i_rdev);
-   if (!dev)
-   return -ENXIO;
-
-   device_lock(dev);
-   dax_dev = dev_get_drvdata(dev);
-   if (dax_dev) {
-   

[PATCH V3 0/2] rtc-cmos: Workaround unwanted interrupt generation

2016-08-15 Thread Pratyush Anand
We have observed on a few machines with rtc-cmos devices that the device
generates an interrupt before the hpet_rtc_timer_init() call has finished.
This leads to hpet_rtc_interrupt() being called before the software
counters are fully initialized.

Therefore the while-loop of hpet_cnt_ahead() in hpet_rtc_timer_reinit()
never completes. This leads to "NMI watchdog: Watchdog detected hard LOCKUP
on cpu 0".

This patch set initializes hpet_default_delta and hpet_t1_cmp before an
interrupt can be raised.
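
To illustrate why the ordering matters, here is a minimal user-space
sketch of the failure mode. It is not the kernel code: struct sim_hpet,
sim_cnt_ahead() and sim_timer_reinit() are hypothetical stand-ins, and the
max_steps bound exists only so the model can report the hang instead of
spinning forever. The assumption is that the re-arm loop advances the
comparator by the delta until it is ahead of the counter, so a delta of
zero can never make progress.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the software counters named above. */
struct sim_hpet {
	uint32_t counter;	/* models the free-running HPET counter */
	uint32_t t1_cmp;	/* models hpet_t1_cmp (next compare value) */
	uint32_t delta;		/* models hpet_default_delta (period in ticks) */
};

/* True once c1 is ahead of c2; the signed cast keeps this wrap-safe. */
static bool sim_cnt_ahead(uint32_t c1, uint32_t c2)
{
	return (int32_t)(c2 - c1) < 0;
}

/*
 * Model of the re-arm loop: keep adding delta to the comparator until it
 * is ahead of the counter. Returns the number of steps taken, or -1 if
 * max_steps is reached without getting ahead. An unbounded loop would
 * spin forever in that case.
 */
static int sim_timer_reinit(struct sim_hpet *h, int max_steps)
{
	int steps = 0;

	do {
		if (steps == max_steps)
			return -1;
		h->t1_cmp += h->delta;
		steps++;
	} while (!sim_cnt_ahead(h->t1_cmp, h->counter));

	return steps;
}
```

With delta still zero (interrupt arrives before initialization) the model
never gets ahead of the counter and returns -1; with delta set first it
terminates after a couple of steps.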

Changes since V2:
  - Improved commit log further
Changes since RFC:
  - Commit log of patches has been improved.

Pratyush Anand (2):
  rtc/hpet: Factorize hpet_rtc_timer_init()
  rtc/rtc-cmos: Initialize software counters before irq is registered

 arch/x86/include/asm/hpet.h |  2 ++
 arch/x86/kernel/hpet.c  | 41 +++--
 drivers/rtc/rtc-cmos.c  | 13 -
 3 files changed, 49 insertions(+), 7 deletions(-)

-- 
2.5.5


