Re: [LEDE-DEV] DHCP via bridge in case of IPv4

2016-08-15 Thread Alexey Brodkin
Hello,

On Mon, 2016-07-11 at 06:15 +, Alexey Brodkin wrote:
> Hi Russell,
> 
> On Sun, 2016-07-10 at 00:19 -0700, Russell Senior wrote:
> > 
> > > > > > > "Alexey" == Alexey Brodkin  writes:
> > Alexey> Hi Aaron,
> > Alexey> On Sat, 2016-07-09 at 07:47 -0400, Aaron Z wrote:
> > > 
> > > > On Sat, Jul 9, 2016 at 4:37 AM, Alexey Brodkin
> > > >  wrote:
> > > > > 
> > > > > Hello,
> > > > > 
> > > > > I was playing with quite simple bridged setup on different boards
> > > > > with very recent kernels (4.6.3 as of this writing) and found one
> > > > > interesting behavior that I cannot yet understand and googling
> > > > > didn't help here as well.
> > > > >
> > > > > My setup is pretty simple:
> > > > >
> > > > >  -----------     --------------------     -----------------------
> > > > > | HOST      |   | "Dumb AP"          |   | Wireless client       |
> > > > > | with DHCP |<->(eth0)         (wlan0)<->| attempting to         |
> > > > > | server    |   |      \ br0 /       |   | get settings via DHCP |
> > > > >  -----------     --------------------     -----------------------
> > > > >
> > > > > * HOST is my laptop with DHCP server that works for sure.
> > > > > * "Dumb AP" is a separate board (I tried ARM-based Wandboard and
> > > > >   ARC-based AXS10x boards but results are exactly the same) with
> > > > >   wired (eth0) and wireless (wlan0) network controllers bridged
> > > > >   together (br0). That "br0" bridge flawlessly gets its settings
> > > > >   from DHCP server on host.
> > > > > * Wireless client could be either a smartphone or another laptop
> > > > >   etc. but what's important it should be configured to get network
> > > > >   settings by DHCP as well.
> > > > >
> > > > > So what happens "br0" always gets network settings from DHCP server
> > > > > on HOST. That's fine. But wireless client only reliably gets
> > > > > settings from DHCP server if IPv6 is enabled on "Dumb AP" board. If
> > > > > IPv6 is disabled I may see that wireless client sends "DHCP
> > > > > Discover" then server replies with "DHCP Offer" but that offer
> > > > > never reaches wireless client.
> > > > 
> > > > Do you have WDS enabled? If not, DHCP has issues in that scenario:
> > > > https://wiki.openwrt.org/doc/howto/clientmode
> > If the Dumb AP's wireless interface is in ap-mode, then this shouldn't
> > be an issue.  It's only client-mode interfaces that have trouble with 
> > bridging.
> > 
> > I'd suggest running tcpdump on the Dumb AP's wireless interface and the
> > client's wireless interface and see which of them sees the various parts
> > of the DHCP handshake.
> 
> So I did but for DHCP server and wireless client (had no tcpdump on Dumb AP
> at the moment).
> 
> That's what I see on the server:
> ->8---
> No. Time          Source     Destination      Protocol Length Info
>  3  0.151181000   0.0.0.0    255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 11  2.760796000   10.42.0.1  10.42.0.13       DHCP     342    DHCP Offer    - Transaction ID 0x31dc321f
> 14  5.220985000   0.0.0.0    255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 15  5.22115       10.42.0.1  10.42.0.13       DHCP     342    DHCP Offer    - Transaction ID 0x31dc321f
> 23  15.649835000  0.0.0.0    255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 24  15.650017000  10.42.0.1  10.42.0.13       DHCP     342    DHCP Offer    - Transaction ID 0x31dc321f
> 32  25.648589000  0.0.0.0    255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 33  25.648758000  10.42.0.1  10.42.0.13       DHCP     342    DHCP Offer    - Transaction ID 0x31dc321f
> 43  35.864567000  0.0.0.0    255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 48  38.832837000  10.42.0.1  10.42.0.13       DHCP     342    DHCP Offer    - Transaction ID 0x31dc321f
> ->8---
> 
> That's on the wireless client:
> ->8---
> No.  Time           Source   Destination      Protocol Length Info
> 1171 94.192971000   0.0.0.0  255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 1182 99.263686000   0.0.0.0  255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 1185 109.692642000  0.0.0.0  255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 1186 119.691474000  0.0.0.0  255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> 1190 129.907507000  0.0.0.0  255.255.255.255  DHCP     342    DHCP Discover - Transaction ID 0x31dc321f
> ->8---
> 
> I'll try to capture data from Dumb AP sometime soon and will reply to the 
> thread.

So finally after quite some time I figured out what 

Re: [RFC][PATCHSET v2] allowing exports in *.S

2016-08-15 Thread Michal Marek
Dne 16.8.2016 v 07:48 Michal Marek napsal(a):
> Dne 2.8.2016 v 16:01 Michal Marek napsal(a):
>> On 2016-02-03 22:19, Al Viro wrote:
>>> Shortlog:
>>> Al Viro (13):
>>>   [kbuild] handle exports in lib-y objects reliably
>>>   EXPORT_SYMBOL() for asm
>>>   x86: move exports to actual definitions
>>>   alpha: move exports to actual definitions
>>>   m68k: move exports to definitions
>>>   s390: move exports to definitions
>>>   arm: move exports to definitions
>>>   ppc: move exports to definitions
>>>   ppc: get rid of unreachable abs() implementation
>>>   sparc: move exports to definitions
>>>   [sparc] unify 32bit and 64bit string.h
>>>   sparc32: debride memcpy.S a bit
>>>   ia64: move exports to definitions
>>
>> After several pings by Al (sorry about that!), I got around to review a
>> rebased version of this patchset at
>>
>>   git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git asm-exports
>>
>> The kbuild commits are good, but since we are close to the end of the
>> merge window, I will apply them to my kbuild branch after 4.8-rc1.
> 
> The rebased patchset is now in kbuild.git#kbuild. Before pushing, I
> noticed one issue: For some reason,
> drivers/firmware/efi/libstub/lib-ksyms.o is regenerated each time,
> leading to relink of vmlinux. I'm looking into this.

OK, it's the

$(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
$(call if_changed_rule,cc_o_c)

rule in drivers/firmware/efi/libstub/Makefile file that conflicts with
the lib-ksyms.o rule. I need to find a better solution to this hack.

Michal


Re: [PATCH v6 05/11] mm, compaction: add the ultimate direct compaction priority

2016-08-15 Thread Joonsoo Kim
On Wed, Aug 10, 2016 at 11:12:20AM +0200, Vlastimil Babka wrote:
> During reclaim/compaction loop, it's desirable to get a final answer from
> unsuccessful compaction so we can either fail the allocation or invoke the OOM
> killer. However, heuristics such as deferred compaction or pageblock skip bits
> can cause compaction to skip parts or whole zones and lead to premature OOM's,
> failures or excessive reclaim/compaction retries.
> 
> To remedy this, we introduce a new direct compaction priority called
> COMPACT_PRIO_SYNC_FULL, which instructs direct compaction to:
> 
> - ignore deferred compaction status for a zone
> - ignore pageblock skip hints
> - ignore cached scanner positions and scan the whole zone
> 
> The new priority should get eventually picked up by should_compact_retry() and
> this should improve success rates for costly allocations using __GFP_REPEAT,
> such as hugetlbfs allocations, and reduce some corner-case OOM's for
> non-costly allocations.
> 
> Signed-off-by: Vlastimil Babka 
> Acked-by: Michal Hocko 
> ---
>  include/linux/compaction.h | 3 ++-
>  mm/compaction.c            | 5 ++++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
> index e88c037afe47..a1fba9994728 100644
> --- a/include/linux/compaction.h
> +++ b/include/linux/compaction.h
> @@ -6,8 +6,9 @@
>   * Lower value means higher priority, analogically to reclaim priority.
>   */
>  enum compact_priority {
> + COMPACT_PRIO_SYNC_FULL,
> + MIN_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_FULL,
>   COMPACT_PRIO_SYNC_LIGHT,
> - MIN_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_LIGHT,
>   DEF_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_LIGHT,
>   COMPACT_PRIO_ASYNC,
>   INIT_COMPACT_PRIORITY = COMPACT_PRIO_ASYNC
> diff --git a/mm/compaction.c b/mm/compaction.c
> index a144f58f7193..ae4f40afcca1 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1644,6 +1644,8 @@ static enum compact_result compact_zone_order(struct zone *zone, int order,
>   .alloc_flags = alloc_flags,
>   .classzone_idx = classzone_idx,
>   .direct_compaction = true,
> + .whole_zone = (prio == COMPACT_PRIO_SYNC_FULL),
> + .ignore_skip_hint = (prio == COMPACT_PRIO_SYNC_FULL)
>   };
>   INIT_LIST_HEAD();
>   INIT_LIST_HEAD();
> @@ -1689,7 +1691,8 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
>   ac->nodemask) {
>   enum compact_result status;
>  
> - if (compaction_deferred(zone, order)) {
> + if (prio > COMPACT_PRIO_SYNC_FULL
> + && compaction_deferred(zone, order)) {
>   rc = max_t(enum compact_result, COMPACT_DEFERRED, rc);
>   continue;

Could we provide prio to compaction_deferred() and make the decision in
that function?

BTW, in kcompactd, compaction_deferred() is checked but
.ignore_skip_hint=true. Is there any reason? If we can remove
compaction_deferred() for kcompactd, we can check .ignore_skip_hint
to determine if defer is needed or not.

Thanks.
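
A minimal userspace sketch of the suggestion above, folding the priority check into the deferral test. It is not the kernel implementation: struct zone_model and compaction_deferred_prio() are hypothetical names, and the counter logic is a simplified model of the deferral bookkeeping. Only the enum is taken from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the enum from the quoted patch: lower value means higher priority. */
enum compact_priority {
	COMPACT_PRIO_SYNC_FULL,
	MIN_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_FULL,
	COMPACT_PRIO_SYNC_LIGHT,
	DEF_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_LIGHT,
	COMPACT_PRIO_ASYNC,
	INIT_COMPACT_PRIORITY = COMPACT_PRIO_ASYNC
};

/* Hypothetical stand-in for the per-zone deferral state (not struct zone). */
struct zone_model {
	unsigned int compact_considered;  /* attempts seen since last deferral */
	unsigned int compact_defer_shift; /* defer limit is 1 << shift */
	int compact_order_failed;         /* lowest order that failed */
};

/* Simplified model of the deferral check with the priority folded in. */
static bool compaction_deferred_prio(struct zone_model *zone, int order,
				     enum compact_priority prio)
{
	unsigned int defer_limit;

	assert(zone->compact_defer_shift < 32);
	defer_limit = 1U << zone->compact_defer_shift;

	/* The highest priority ignores deferral and leaves the counters alone. */
	if (prio == COMPACT_PRIO_SYNC_FULL)
		return false;

	/* Orders below the last failed order are never deferred. */
	if (order < zone->compact_order_failed)
		return false;

	/* Count this attempt, clamping at the limit to avoid overflow. */
	if (++zone->compact_considered > defer_limit)
		zone->compact_considered = defer_limit;

	return zone->compact_considered < defer_limit;
}
```

With this shape the caller would not need its own "prio > COMPACT_PRIO_SYNC_FULL" test, and a full-priority attempt leaves the deferral counters untouched, the same as the short-circuit in the quoted hunk.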


Re: [Query] increased latency observed in cpu hotplug path

2016-08-15 Thread Khan, Imran
On 8/5/2016 12:49 PM, Khan, Imran wrote:
> On 8/1/2016 2:58 PM, Khan, Imran wrote:
>> On 7/30/2016 7:54 AM, Akinobu Mita wrote:
>>> 2016-07-28 22:18 GMT+09:00 Khan, Imran :

>>>> Hi,
>>>>
>>>> Recently we have observed some increased latency in CPU hotplug
>>>> event in CPU online path. For online latency we see that block
>>>> layer is executing notification handler for CPU_UP_PREPARE event
>>>> and this in turn waits for RCU grace period resulting (sometimes)
>>>> in an execution time of 15-20 ms for this notification handler.
>>>> This change was not there in 3.18 kernel but is present in 4.4
>>>> kernel and was introduced by following commit:
>>>>
>>>> commit 5778322e67ed34dc9f391a4a5cbcbb856071ceba
>>>> Author: Akinobu Mita 
>>>> Date:   Sun Sep 27 02:09:23 2015 +0900
>>>>
>>>> blk-mq: avoid inserting requests before establishing new mapping
>>>
>>> ...
>>>
>>>> Upon reverting this commit I could see an improvement of 15-20 ms in
>>>> online latency. So I am looking for some help in analyzing the effects
>>>> of reverting this, or whether some other approach to reduce the online
>>>> latency should be taken.
>>>
>>> Can you observe the difference in online latency by removing the
>>> get_online_cpus() and put_online_cpus() pair in
>>> blk_mq_init_allocated_queue() instead of fully reverting the commit?
>>>
>> Hi Akinobu,
>> I tried your suggestion but could not achieve any improvement. Actually the
>> snippet that is causing the change in latency is the following one:
>>
>>     list_for_each_entry(q, &all_q_list, all_q_node) {
>>             blk_mq_freeze_queue_wait(q);
>>
>>             /*
>>              * timeout handler can't touch hw queue during the
>>              * reinitialization
>>              */
>>             del_timer_sync(&q->timeout);
>>     }
>>
>> I understand that this is getting executed now for CPU_UP_PREPARE as well,
>> resulting in increased latency in the cpu online path. I am trying to
>> reduce this latency while keeping the purpose of this commit intact. I
>> would welcome further suggestions/feedback in this regard.
>>
> Hi Akinobu,
> 
> I am not able to reduce the cpu online latency with this patch. Could you
> please let me know what functionality will be broken if we avoid this
> patch in our kernel? Also, if you have some other suggestions towards
> improving this patch, please let me know.
> 
After moving the remapping of queues to block layer's kworker I see that
online latency has improved while offline latency remains the same. As the
freezing of queues happens in the context of block layer's worker, I think
it would be better to do the remapping in the same context and then go
ahead with freezing. In this regard I have made the following change:

commit b2131b86eeef4c5b1f8adaf7a53606301aa6b624
Author: Imran Khan 
Date:   Fri Aug 12 19:59:47 2016 +0530

blk-mq: Move block queue remapping from cpu hotplug path

During a cpu hotplug, the hardware and software contexts mappings
need to be updated in order to take into account requests
submitted for the hotadded CPU. But if this mapping is done
in hotplug notifier, it deteriorates the hotplug latency.
So move the block queue remapping to block layer worker which
results in significant improvements in hotplug latency.

Change-Id: I01ac83178ce95c3a4e3b7b1b286eda65ff34e8c4
Signed-off-by: Imran Khan 

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6d6f8fe..06fcf89 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -22,7 +22,11 @@
 #include 
 #include 
 #include 
-
+#include 
+#include 
+#include 
+#include 
+#include 
 #include 

 #include 
@@ -32,10 +36,18 @@

 static DEFINE_MUTEX(all_q_mutex);
 static LIST_HEAD(all_q_list);
-
 static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx);

 /*
+ * New online cpumask which is going to be set in this hotplug event.
+ * Declare this cpumasks as global as cpu-hotplug operation is invoked
+ * one-by-one and dynamically allocating this could result in a failure.
+ */
+static struct cpumask online_new;
+
+static struct work_struct blk_mq_remap_work;
+
+/*
  * Check if any of the ctx's have pending work in this hardware queue
  */
 static bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx)
@@ -2125,14 +2137,7 @@ static void blk_mq_queue_reinit(struct request_queue *q,
 static int blk_mq_queue_reinit_notify(struct notifier_block *nb,
  unsigned long action, void *hcpu)
 {
-   struct request_queue *q;
int cpu = (unsigned long)hcpu;
-   /*
-* New online cpumask which is going to be set in this hotplug event.
-* Declare this cpumasks as global as cpu-hotplug operation is invoked
-* one-by-one and dynamically allocating this could result in a failure.
-*/
-   static struct cpumask online_new;

/*
 * Before 

Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-15 Thread Borislav Petkov
On Mon, Aug 15, 2016 at 01:47:31PM -0500, Alex Thorlton wrote:
> The only thing we're adding here is the physical mappings, to match
> what is available in the primary kernel.

I can see what it does. I am just questioning the reasoning for it, as we
went to all that effort so that kexec can have stable virtual mappings.

I guess we still need a way to pass the virtual mappings to kexec
as they're immutable as some "smartass" decided to allow to call
SetVirtualAddressMap only once.

> This is sort of a hand-wavey answer - I will investigate this further...

Yeah, it'll be interesting to know whether that is an issue because if
we do the 1:1 mappings in the kexec kernel too and there's an address
conflict, then we better know upfront.

> It's not that we need it all of the sudden, necessarily, it's just that
> we've had to make other changes to make things work with the new,
> (almost) completely isolated, EFI page tables.  We ended up choosing the
> lesser of two evils, and have decided to temporarily rely on the
> physical address of our runtime code, instead of continuing to rely on
> EFI_OLD_MEMMAP.

Well, if it starts to cause trouble, you probably will have to revert.

> If there are strong objections to this change, I won't pursue it
> further.

I don't really care all that much as long as it doesn't break the
existing situation. I've long given up on the hope that EFI and all its
incarnations will hold on to some spec... :-)

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCH 1/2] ARM: dts: imx7d: move CPU operating points to imx7d.dtsi

2016-08-15 Thread Shawn Guo
On Thu, Aug 11, 2016 at 05:11:06PM -0700, Stefan Agner wrote:
> Only i.MX 7Dual SoC supports CPU frequencies of up to 1GHz. The i.MX
> 7Solo can run with up to 800MHz and does so without making use of DVFS
> usually. While the device tree clearly specified a too fast operating
> point for i.MX 7Solo, the kernel did not use it in practice so far
> because the CPUfreq driver does not get loaded on i.MX 7Solo devices
> (since the fsl,imx7s compatible string is not in the list of devices
> making use of the cpufreq-dt driver...).
> 
> Signed-off-by: Stefan Agner 
> ---
> Hi Shawn,
> 
> This is based on my earlier patchset:
> ARM: dts: imx7d: move ARM platform peripherals inside soc
> 
> These are kind of fixes too, so if possible I would like to see them
> in v4.8, what do you think?

Patch "ARM: dts: imx7d: move ARM platform peripherals inside soc node"
is not really a fix, and the diffstat looks too dramatic to be -rc
material, so I queued it as a -next patch, and any patch based on it
will have to go through -next as well.

Applied for -next, thanks.

Shawn

> 
> --
> Stefan
> 
>  arch/arm/boot/dts/imx7d.dtsi | 8 ++++++++
>  arch/arm/boot/dts/imx7s.dtsi | 5 -----
>  2 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm/boot/dts/imx7d.dtsi b/arch/arm/boot/dts/imx7d.dtsi
> index 3d77d95..d0b199c 100644
> --- a/arch/arm/boot/dts/imx7d.dtsi
> +++ b/arch/arm/boot/dts/imx7d.dtsi
> @@ -45,6 +45,14 @@
>  
>  / {
>   cpus {
> + cpu0: cpu@0 {
> + operating-points = <
> + /* KHz  uV */
> + 996000  1075000
> + 792000  975000
> + >;
> + };
> +
>   cpu1: cpu@1 {
>   compatible = "arm,cortex-a7";
>   device_type = "cpu";
> diff --git a/arch/arm/boot/dts/imx7s.dtsi b/arch/arm/boot/dts/imx7s.dtsi
> index c63591c..5132e2f 100644
> --- a/arch/arm/boot/dts/imx7s.dtsi
> +++ b/arch/arm/boot/dts/imx7s.dtsi
> @@ -85,11 +85,6 @@
>   compatible = "arm,cortex-a7";
>   device_type = "cpu";
>   reg = <0>;
> - operating-points = <
> - /* KHz  uV */
> - 996000  1075000
> - 792000  975000
> - >;
>   clock-latency = <61036>; /* two CLK32 periods */
>   clocks = < IMX7D_CLK_ARM>;
>   };
> -- 
> 2.9.0
> 
> 


Re: [PATCH v6 0/2] Block layer support ZAC/ZBC commands

2016-08-15 Thread Shaun Tancheff
On Mon, Aug 15, 2016 at 11:00 PM, Damien Le Moal  wrote:
>
> Shaun,
>
>> On Aug 14, 2016, at 09:09, Shaun Tancheff  wrote:
> […]

>>> No, surely not.
>>> But one of the _big_ advantages for the RB tree is blkdev_discard().
>>> Without the RB tree any mkfs program will issue a 'discard' for every
>>> sector. We will be able to coalesce those into one discard per zone, but
>>> we still need to issue one for _every_ zone.
>>
>> How can you make coalesce work transparently in the
>> sd layer _without_ keeping some sort of a discard cache along
>> with the zone cache?
>>
>> Currently the block layer's blkdev_issue_discard() is breaking
>> large discard's into nice granular and aligned chunks but it is
>> not preventing small discards nor coalescing them.
>>
>> In the sd layer would there be way to persist or purge an
>> overly large discard cache? What about honoring
>> discard_zeroes_data? Once the discard is completed with
>> discard_zeroes_data you have to return zeroes whenever
>> a discarded sector is read. Isn't that a lot more than just
>> tracking a write pointer? Couldn't a zone have dozens of holes?
>
> My understanding of the standards regarding discard is that it is not
> mandatory and that it is a hint to the drive. The drive can completely
> ignore it if it thinks that is a better choice. I may be wrong on this
> though. Need to check again.

But you are currently setting discard_zeroes_data=1 in your
current patches. I believe that setting discard_zeroes_data=1
effectively promotes discards to being mandatory.

I have a follow on patch to my SCT Write Same series that
handles the CMR zone case in the sd_zbc_setup_discard() handler.

> For reset write pointer, the mapping to discard requires that the calls
> to blkdev_issue_discard be zone aligned for anything to happen. Specify
> less than a zone and nothing will be done. This, I think, preserves the
> discard semantic.

Oh. If that is the intent then there is just a bug in the handler.
I have pointed out where I believe it to be in my response to
the zone cache patch being posted.
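
One possible reading of the zone-aligned behaviour described above, as a small userspace sketch: only zones completely covered by a discard are candidates for a Reset WP, and anything smaller than a zone is dropped. struct zone_span, discard_to_zones() and the equal-sized-zone layout are assumptions for illustration and are not taken from the posted patches.

```c
#include <assert.h>
#include <stdint.h>

/* Zones fully covered by a discard; count == 0 means nothing to reset. */
struct zone_span {
	uint64_t first; /* index of the first fully covered zone */
	uint64_t count; /* number of fully covered zones */
};

/*
 * Hypothetical helper: map a discard of nr_sects sectors starting at
 * sector onto whole zones of zone_sects sectors each. Partial zones at
 * either end are dropped, so a sub-zone discard becomes a no-op.
 * Assumes sector + nr_sects does not overflow.
 */
static struct zone_span discard_to_zones(uint64_t sector, uint64_t nr_sects,
					 uint64_t zone_sects)
{
	struct zone_span span = { 0, 0 };
	uint64_t end, first, last;

	assert(zone_sects > 0);

	end = sector + nr_sects;                        /* exclusive end */
	first = (sector + zone_sects - 1) / zone_sects; /* round start up */
	last = end / zone_sects;                        /* round end down */

	if (last > first) {
		span.first = first;
		span.count = last - first;
	}
	return span;
}
```

For example, with 524288-sector zones (an assumed size), a discard of exactly one zone starting at sector 0 yields one zone to reset, while the same length starting at sector 8 covers no whole zone and yields none.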

> As for the “discard_zeroes_data” thing, I also think that is a drive
> feature not mandatory. Drives may have it or not, which is consistent
> with the ZBC/ZAC standards regarding reading after write pointer (nothing
> says that zeros have to be returned). In any case, discard of CMR zones
> will be a nop, so for SMR drives, discard_zeroes_data=0 may be a better
> choice.

However I am still curious about discard's being coalesced.

>>> Which is (as indicated) really slow, and easily takes several minutes.
>>> With the RB tree we can short-circuit discards to empty zones, and speed
>>> up processing time dramatically.
>>> Sure we could be moving the logic into mkfs and friends, but that would
>>> require us to change the programs and agree on a library (libzbc?) which
>>> should be handling that.
>>
>> F2FS's mkfs.f2fs is already reading the zone topology via SG_IO ...
>> so I'm not sure your argument is valid here.
>
> This initial SMR support patch is just that: a first try. Jaegeuk
> used SG_IO (in fact copy-paste of parts of libzbc) because the current
> ZBC patch-set has no ioctl API for zone information manipulation. We
> will fix this mkfs.f2fs once we agree on an ioctl interface.

Which again is my point. If mkfs.f2fs wants to speed up its
discard pass by _not_ sending unnecessary Reset WP for zones
that are already empty, it has all the information it needs
to do so.

Here it seems to me that the zone cache is _at_best_
doing double work. At worst the zone cache could be
doing the wrong thing _if_ the zone cache got out of sync.
It is certainly possible (however unlikely) that someone was
doing some raw sg activity that is not seen by the sd path.

All I am trying to do is have a discussion about the reasons for
and against having a zone cache: where it works and where it breaks.
This should be entirely technical, but I understand that we have all
spent a lot of time _not_ discussing this for various non-technical
reasons.

So far the only reason I've been able to ascertain is that
Host Managed drives really don't like being stuck with the
URSWRZ and would like to have a software hack to return
MUD rather than ship drives with some weird out-of-the-box
config where the last zone is marked as FINISH'd thereby
returning MUD on reads as per spec.

I understand that it would be a strange state to see on first
boot and likely people would just do a ResetWP and have
weird boot errors, which would probably just make matters
worse.

I just would rather the work around be a bit cleaner and/or
use less memory. I would also like a path available that
does not require SD_ZBC or BLK_ZONED for Host Aware
drives to work, hence this set of patches and me begging
for a single bit in struct bio.

>>
>> [..]
>>
> 3) Try to condense the blkzone data structure to save memory:
> I think that we can at the very 

Re: [RFC][PATCHSET v2] allowing exports in *.S

2016-08-15 Thread Michal Marek
Dne 2.8.2016 v 16:01 Michal Marek napsal(a):
> On 2016-02-03 22:19, Al Viro wrote:
>> Shortlog:
>> Al Viro (13):
>>   [kbuild] handle exports in lib-y objects reliably
>>   EXPORT_SYMBOL() for asm
>>   x86: move exports to actual definitions
>>   alpha: move exports to actual definitions
>>   m68k: move exports to definitions
>>   s390: move exports to definitions
>>   arm: move exports to definitions
>>   ppc: move exports to definitions
>>   ppc: get rid of unreachable abs() implementation
>>   sparc: move exports to definitions
>>   [sparc] unify 32bit and 64bit string.h
>>   sparc32: debride memcpy.S a bit
>>   ia64: move exports to definitions
> 
> After several pings by Al (sorry about that!), I got around to review a
> rebased version of this patchset at
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git asm-exports
> 
> The kbuild commits are good, but since we are close to the end of the
> merge window, I will apply them to my kbuild branch after 4.8-rc1.

The rebased patchset is now in kbuild.git#kbuild. Before pushing, I
noticed one issue: For some reason,
drivers/firmware/efi/libstub/lib-ksyms.o is regenerated each time,
leading to relink of vmlinux. I'm looking into this.

Michal


Re: [PATCH] KEYS: fix big_key dependency

2016-08-15 Thread Stephan Mueller
Am Dienstag, 16. August 2016, 00:45:39 CEST schrieb Kirill Marinushkin:

Hi Kirill,

> + select CRYPTO_ANSI_CPRNG

This change enables the RNG which will not pass FIPS testing any more. Hence, 
this selection could cause an issue in FIPS mode (i.e. booting the kernel with 
fips=1).

May I suggest CRYPTO_DRBG?

Ciao
Stephan


Re: [PATCH v6 0/5] /dev/random - a new approach

2016-08-15 Thread Stephan Mueller
Am Montag, 15. August 2016, 13:42:54 CEST schrieb H. Peter Anvin:

Hi H,

> On 08/11/16 05:24, Stephan Mueller wrote:
> > * prevent fast noise sources from dominating slow noise sources
> > 
> >   in case of /dev/random
> 
> Can someone please explain if and why this is actually desirable, and if
> this assessment has been passed to someone who has actual experience
> with cryptography at the professional level?

There are two motivations for that:

- the current /dev/random is compliant with NTG.1 from AIS 20/31, which
requires (in brief words) that entropy comes from auditable noise sources.
Currently in my LRNG only RDRAND is a fast noise source which is not
auditable (and it is designed to cause a VM exit, making it even harder to
assess it). To make the LRNG comply with NTG.1, RDRAND can provide entropy
but must not become the sole entropy provider, which is the case now with
that change.

- the current /dev/random implementation follows the same concept with the
exception of 3.15 and 3.16 where RDRAND was not rate-limited. In later
versions, this was changed.

Ciao
Stephan


Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-15 Thread Borislav Petkov
On Mon, Aug 15, 2016 at 02:52:22PM -0700, H. Peter Anvin wrote:
> So to answer the implicit question: we have found UEFI stacks in the
> field which fail without the physical mappings present, and we have
> found stacks which fail without a nontrivial SetAddressMapping.

You mean SetVirtualAddressMap.

Oh well, it's not like it matters all that much as we have our own
pagetable for EFI so we can go nuts there. Apparently.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCH] perf/core: Fix the mask in perf_output_sample_regs

2016-08-15 Thread Madhavan Srinivasan



On Thursday 11 August 2016 05:57 PM, Peter Zijlstra wrote:

Sorry, found it in my inbox while clearing out backlog..

On Sun, Jul 03, 2016 at 11:31:58PM +0530, Madhavan Srinivasan wrote:

When decoding the perf_regs mask in perf_output_sample_regs(),
we loop through the mask using find_first_bit and find_next_bit functions.
While the existing code works fine in most cases,
the logic is broken for 32bit kernel (Big Endian).
When reading u64 mask using (u32 *)()[0], find_*_bit() assumes it gets
lower 32bits of u64 but instead gets upper 32bits which is wrong.
Proposed fix is to swap the words of the u64 to handle this case.
This is _not_ endianness swap.

But it looks an awful lot like it..

Hit this issue when testing my perf_arch_regs patchset. Yep exactly
the reason for adding that comment in the commit message.





+++ b/kernel/events/core.c
@@ -5205,8 +5205,10 @@ perf_output_sample_regs(struct perf_output_handle *handle,
struct pt_regs *regs, u64 mask)
  {
int bit;
+   DECLARE_BITMAP(_mask, 64);
  
-	for_each_set_bit(bit, (const unsigned long *) &mask,

+   bitmap_from_u64(_mask, mask);
+   for_each_set_bit(bit, _mask,
 sizeof(mask) * BITS_PER_BYTE) {
u64 val;
+++ b/lib/bitmap.c
+void bitmap_from_u64(unsigned long *dst, u64 mask)
+{
+   dst[0] = mask & ULONG_MAX;
+
+   if (sizeof(mask) > sizeof(unsigned long))
+   dst[1] = mask >> 32;
+}
+EXPORT_SYMBOL(bitmap_from_u64);

Looks small enough for an inline.

Alternatively you can go all the way and add bitmap_from_u64array(), but
that seems massive overkill.


Ok will make it inline and resend.

Maddy



Tedious stuff.. I can't come up with anything prettier :/
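
The word-order problem discussed above can be modelled in userspace. This is a sketch under assumptions, not the kernel code: word_t stands in for a 32-bit unsigned long, bitmap_from_u64_model() follows the helper in the quoted diff, and cast_on_big_endian32() is a hypothetical function that imitates what the pointer cast yields on a 32-bit big-endian machine.

```c
#include <assert.h>
#include <stdint.h>

/* Model a 32-bit kernel: each bitmap word holds 32 bits. */
typedef uint32_t word_t;
#define WORD_BITS 32

static_assert(sizeof(word_t) * 8 == WORD_BITS, "word_t must hold exactly 32 bits");

/* Userspace model of the helper from the quoted diff: low word first. */
static void bitmap_from_u64_model(word_t dst[2], uint64_t mask)
{
	dst[0] = (word_t)(mask & 0xffffffffu);
	dst[1] = (word_t)(mask >> 32);
}

/*
 * Imitates reinterpreting a u64 as two words on a 32-bit big-endian
 * machine: the high half sits at the lower address and becomes word 0.
 */
static void cast_on_big_endian32(word_t dst[2], uint64_t mask)
{
	dst[0] = (word_t)(mask >> 32);
	dst[1] = (word_t)(mask & 0xffffffffu);
}

/* Index of the lowest set bit in a two-word bitmap, or 64 if none is set. */
static int first_set_bit(const word_t map[2])
{
	for (int bit = 0; bit < 2 * WORD_BITS; bit++)
		if (map[bit / WORD_BITS] & ((word_t)1 << (bit % WORD_BITS)))
			return bit;
	return 2 * WORD_BITS;
}
```

For a mask with only bit 3 set, the explicit conversion reports bit 3, while the modelled cast reports bit 35, which is the wrong register index the commit message describes.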





[PATCH v2 8/8] power: ds2760_battery: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "monitor_wqueue" is used to monitor the battery
status. It has been identity converted.

It queues multiple work items viz &di->monitor_work,
&di->set_charged_work, which require execution ordering.
Hence, alloc_ordered_workqueue has been used to replace the
deprecated create_singlethread_workqueue instance.

WQ_MEM_RECLAIM flag has been set to ensure forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ds2760_battery.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/power/ds2760_battery.c b/drivers/power/ds2760_battery.c
index 80f73cc..ac92e80 100644
--- a/drivers/power/ds2760_battery.c
+++ b/drivers/power/ds2760_battery.c
@@ -566,7 +566,8 @@ static int ds2760_battery_probe(struct platform_device *pdev)
 	INIT_DELAYED_WORK(&di->monitor_work, ds2760_battery_work);
 	INIT_DELAYED_WORK(&di->set_charged_work,
 			  ds2760_battery_set_charged_work);
-	di->monitor_wqueue = create_singlethread_workqueue(dev_name(&pdev->dev));
+	di->monitor_wqueue = alloc_ordered_workqueue(dev_name(&pdev->dev),
+						     WQ_MEM_RECLAIM);
 	if (!di->monitor_wqueue) {
 		retval = -ESRCH;
 		goto workqueue_failed;
--
2.1.4



[PATCH v2 5/8] power: ab8500_charger: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "charger_wq" is used for the IRQs and checking HW state of
the charger. It has been identity converted.

It has multiple work items viz usb_charger_attached_work, kick_wd_work,
check_vbat_work, check_hw_failure_work, usb_charger_attached_work,
ac_work, ac_charger_attached_work, attach_work and check_usbchgnotok_work,
which require execution ordering. Hence, a dedicated ordered workqueue
has been used here.

The WQ_MEM_RECLAIM flag has also been set to ensure
forward progress under memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ab8500_charger.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/power/ab8500_charger.c b/drivers/power/ab8500_charger.c
index 30de5d4..5cee9aa 100644
--- a/drivers/power/ab8500_charger.c
+++ b/drivers/power/ab8500_charger.c
@@ -3540,8 +3540,8 @@ static int ab8500_charger_probe(struct platform_device *pdev)
di->usb_state.usb_current = -1;

/* Create a work queue for the charger */
-   di->charger_wq =
-   create_singlethread_workqueue("ab8500_charger_wq");
+   di->charger_wq = alloc_ordered_workqueue("ab8500_charger_wq",
+WQ_MEM_RECLAIM);
if (di->charger_wq == NULL) {
dev_err(di->dev, "failed to create work queue\n");
return -ENOMEM;
--
2.1.4



[PATCH v2 7/8] power: ab8500_fg: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "fg_wq" is used for running the FG algorithm periodically.
It has been identity converted.

It has multiple work items viz fg_periodic_work, fg_low_bat_work,
fg_reinit_work, fg_work, fg_acc_cur_work and fg_check_hw_failure_work,
which require execution ordering. Hence, a dedicated ordered workqueue
has been used here.

The WQ_MEM_RECLAIM flag has been set to guarantee forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ab8500_fg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/ab8500_fg.c b/drivers/power/ab8500_fg.c
index 5a36cf8..199f2db 100644
--- a/drivers/power/ab8500_fg.c
+++ b/drivers/power/ab8500_fg.c
@@ -3096,7 +3096,7 @@ static int ab8500_fg_probe(struct platform_device *pdev)
ab8500_fg_discharge_state_to(di, AB8500_FG_DISCHARGE_INIT);

/* Create a work queue for running the FG algorithm */
-   di->fg_wq = create_singlethread_workqueue("ab8500_fg_wq");
+   di->fg_wq = alloc_ordered_workqueue("ab8500_fg_wq", WQ_MEM_RECLAIM);
if (di->fg_wq == NULL) {
dev_err(di->dev, "failed to create work queue\n");
return -ENOMEM;
--
2.1.4



[PATCH v2 6/8] power: ipaq_micro_battery: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
The workqueue "wq" is used for handling battery related tasks.

It has a single work item viz &mb->update and hence it doesn't require
execution ordering. Hence, alloc_workqueue has been used to replace the
deprecated create_singlethread_workqueue instance.

The WQ_MEM_RECLAIM flag has been set to ensure forward progress under
memory pressure.

Since there is a single work item, explicit concurrency
limit is unnecessary here.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ipaq_micro_battery.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/ipaq_micro_battery.c b/drivers/power/ipaq_micro_battery.c
index 35b01c7..4af7b77 100644
--- a/drivers/power/ipaq_micro_battery.c
+++ b/drivers/power/ipaq_micro_battery.c
@@ -235,7 +235,7 @@ static int micro_batt_probe(struct platform_device *pdev)
return -ENOMEM;

mb->micro = dev_get_drvdata(pdev->dev.parent);
-   mb->wq = create_singlethread_workqueue("ipaq-battery-wq");
+   mb->wq = alloc_workqueue("ipaq-battery-wq", WQ_MEM_RECLAIM, 0);
if (!mb->wq)
return -ENOMEM;

--
2.1.4



[PATCH v2 4/8] power: intel_mid_battery: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
The workqueue "monitor_wqueue" is used to monitor the PMIC battery status.
It queues a single work item (pbi->monitor_battery) and hence doesn't
require ordering. Hence, alloc_workqueue has been used to replace the
deprecated create_singlethread_workqueue instance.

Since PMIC battery status needs to be monitored for any change, the
WQ_MEM_RECLAIM flag has been set to ensure forward progress under memory
pressure.

Since there is a single work item, explicit concurrency
limit is unnecessary here.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/intel_mid_battery.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/power/intel_mid_battery.c b/drivers/power/intel_mid_battery.c
index 9fa4acc..dc7feef 100644
--- a/drivers/power/intel_mid_battery.c
+++ b/drivers/power/intel_mid_battery.c
@@ -689,8 +689,7 @@ static int probe(int irq, struct device *dev)
/* initialize all required framework before enabling interrupts */
INIT_WORK(&pbi->handler, pmic_battery_handle_intrpt);
INIT_DELAYED_WORK(&pbi->monitor_battery, pmic_battery_monitor);
-   pbi->monitor_wqueue =
-   create_singlethread_workqueue(dev_name(dev));
+   pbi->monitor_wqueue = alloc_workqueue(dev_name(dev), WQ_MEM_RECLAIM, 0);
if (!pbi->monitor_wqueue) {
dev_err(dev, "%s(): wqueue init failed\n", __func__);
retval = -ESRCH;
--
2.1.4



[PATCH v2 3/8] power: pm2301_charger: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "charger_wq" is used for running all the charger related
tasks. This involves charger detection, checking for HW failure and HW
status. This workqueue has been identity converted.

It queues multiple workitems viz &pm2->check_main_thermal_prot_work,
&pm2->check_hw_failure_work, &pm2->ac_work. Hence, the deprecated
create_singlethread_workqueue() instance has been replaced with a
dedicated ordered workqueue.

The WQ_MEM_RECLAIM flag has been set to ensure forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/pm2301_charger.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/power/pm2301_charger.c b/drivers/power/pm2301_charger.c
index fb62ed3..78561b6 100644
--- a/drivers/power/pm2301_charger.c
+++ b/drivers/power/pm2301_charger.c
@@ -1054,7 +1054,8 @@ static int pm2xxx_wall_charger_probe(struct i2c_client *i2c_client,
pm2->ac_chg.external = true;

/* Create a work queue for the charger */
-   pm2->charger_wq = create_singlethread_workqueue("pm2xxx_charger_wq");
+   pm2->charger_wq = alloc_ordered_workqueue("pm2xxx_charger_wq",
+ WQ_MEM_RECLAIM);
if (pm2->charger_wq == NULL) {
ret = -ENOMEM;
dev_err(pm2->dev, "failed to create work queue\n");
--
2.1.4



[PATCH v2 2/8] power: ab8500_btemp: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
The workqueue "btemp_wq" is used for measuring the temperature
periodically. It queues a single workitem (btemp_periodic_work) and
hence doesn't require ordering. Thus, the deprecated
create_singlethread_workqueue() instance has been replaced with
alloc_workqueue().

The WQ_MEM_RECLAIM flag has been set to ensure forward progress under
memory pressure.

Since there is a single work item, explicit concurrency
limit is unnecessary here.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/ab8500_btemp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/power/ab8500_btemp.c b/drivers/power/ab8500_btemp.c
index bf2e5dd..6ffdc18 100644
--- a/drivers/power/ab8500_btemp.c
+++ b/drivers/power/ab8500_btemp.c
@@ -1095,7 +1095,7 @@ static int ab8500_btemp_probe(struct platform_device *pdev)

/* Create a work queue for the btemp */
di->btemp_wq =
-   create_singlethread_workqueue("ab8500_btemp_wq");
+   alloc_workqueue("ab8500_btemp_wq", WQ_MEM_RECLAIM, 0);
if (di->btemp_wq == NULL) {
dev_err(di->dev, "failed to create work queue\n");
return -ENOMEM;
--
2.1.4



[PATCH v2 1/8] power: abx500_chargalg: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "chargalg_wq" is used for running the charging algorithm.
It has multiple workitems viz &di->chargalg_periodic_work,
&di->chargalg_wd_work, &di->chargalg_work per abx500_chargalg, which
require ordering. It has been identity converted.

Also, WQ_MEM_RECLAIM has been set to ensure forward progress under
memory pressure.

Signed-off-by: Bhaktipriya Shridhar 
---
 drivers/power/abx500_chargalg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/power/abx500_chargalg.c b/drivers/power/abx500_chargalg.c
index d9104b1..a4411d6 100644
--- a/drivers/power/abx500_chargalg.c
+++ b/drivers/power/abx500_chargalg.c
@@ -2091,8 +2091,8 @@ static int abx500_chargalg_probe(struct platform_device *pdev)
abx500_chargalg_maintenance_timer_expired;

/* Create a work queue for the chargalg */
-   di->chargalg_wq =
-   create_singlethread_workqueue("abx500_chargalg_wq");
+   di->chargalg_wq = alloc_ordered_workqueue("abx500_chargalg_wq",
+  WQ_MEM_RECLAIM);
if (di->chargalg_wq == NULL) {
dev_err(di->dev, "failed to create work queue\n");
return -ENOMEM;
--
2.1.4



[PATCH v2 0/8] power: Remove deprecated create_singlethread_workqueue

2016-08-15 Thread Bhaktipriya Shridhar
This patch set removes the instances of the deprecated
create_singlethread_workqueue() in drivers/power by making the
appropriate conversions.

Bhaktipriya Shridhar (8):
  power: abx500_chargalg: Remove deprecated
create_singlethread_workqueue
  power: ab8500_btemp: Remove deprecated create_singlethread_workqueue
  power: pm2301_charger: Remove deprecated create_singlethread_workqueue
  power: intel_mid_battery: Remove deprecated
create_singlethread_workqueue
  power: ab8500_charger: Remove deprecated create_singlethread_workqueue
  power: ipaq_micro_battery: Remove deprecated
create_singlethread_workqueue
  power: ab8500_fg: Remove deprecated create_singlethread_workqueue
  power: ds2760_battery: Remove deprecated create_singlethread_workqueue

 drivers/power/ab8500_btemp.c   | 2 +-
 drivers/power/ab8500_charger.c | 4 ++--
 drivers/power/ab8500_fg.c  | 2 +-
 drivers/power/abx500_chargalg.c| 4 ++--
 drivers/power/ds2760_battery.c | 3 ++-
 drivers/power/intel_mid_battery.c  | 3 +--
 drivers/power/ipaq_micro_battery.c | 2 +-
 drivers/power/pm2301_charger.c | 3 ++-
 8 files changed, 12 insertions(+), 11 deletions(-)

--
2.1.4
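
The rule the commit messages in this series describe can be modelled in a few lines of plain C. This is only a userspace sketch: `enum wq_kind` and `pick_wq_kind()` are hypothetical names made up for illustration and are not kernel API. In the diffs, the two outcomes correspond to alloc_ordered_workqueue(name, WQ_MEM_RECLAIM) and alloc_workqueue(name, WQ_MEM_RECLAIM, 0).

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not part of the kernel API. */
enum wq_kind {
    WQ_KIND_ORDERED,   /* maps to alloc_ordered_workqueue(name, WQ_MEM_RECLAIM) */
    WQ_KIND_UNORDERED  /* maps to alloc_workqueue(name, WQ_MEM_RECLAIM, 0) */
};

/*
 * Model of the conversion rule stated in the commit messages:
 * several work items that require execution ordering keep an ordered
 * queue (the identity conversion); a single work item has nothing to
 * be ordered against, so a plain workqueue is enough.
 */
enum wq_kind pick_wq_kind(unsigned int nr_work_items, bool needs_ordering)
{
    if (nr_work_items > 1 && needs_ordering)
        return WQ_KIND_ORDERED;
    return WQ_KIND_UNORDERED;
}
```

For example, abx500_chargalg queues three work items that require ordering, so pick_wq_kind(3, true) gives the ordered variant, while ab8500_btemp queues only btemp_periodic_work and gets the plain one.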



Re: [PATCH v1 3/3] PM / AVS: rockchip-cpu-avs: add driver handling Rockchip cpu avs

2016-08-15 Thread kbuild test robot
Hi Finley,

[auto build test ERROR on battery/master]
[also build test ERROR on v4.8-rc2 next-20160815]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Finlye-Xiao/PM-AVS-add-Rockchip-cpu-avs/20160816-105228
base:   git://git.infradead.org/battery-2.6.git master
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm 

All error/warnings (new ones prefixed by >>):

   drivers/power/avs/rockchip-cpu-avs.c: In function 
'rockchip_cpu_avs_notifier':
>> drivers/power/avs/rockchip-cpu-avs.c:230:10: error: implicit declaration of function 'cpufreq_frequency_get_table' [-Werror=implicit-function-declaration]
 table = cpufreq_frequency_get_table(policy->cpu);
 ^
>> drivers/power/avs/rockchip-cpu-avs.c:230:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
 table = cpufreq_frequency_get_table(policy->cpu);
   ^
   cc1: some warnings being treated as errors

vim +/cpufreq_frequency_get_table +230 drivers/power/avs/rockchip-cpu-avs.c

   224  dev = get_cpu_device(policy->cpu);
   225  if (!dev) {
   226  pr_err("cpu%d Failed to get device\n", policy->cpu);
   227  goto out;
   228  }
   229  
 > 230  table = cpufreq_frequency_get_table(policy->cpu);
   231  if (!table) {
   232  pr_err("cpu%d CPUFreq table not found\n", policy->cpu);
   233  goto out;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: Binary data


[PATCH] Bluetooth: btusb: Add support for 0cf3:e009

2016-08-15 Thread Kai-Heng Feng
Device 0cf3:e009 is one of the QCA ROME family.

T:  Bus=01 Lev=01 Prnt=01 Port=07 Cnt=04 Dev#=  4 Spd=12  MxCh= 0
D:  Ver= 2.01 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=0cf3 ProdID=e009 Rev=00.01
C:  #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb

Signed-off-by: Kai-Heng Feng 
---
 drivers/bluetooth/btusb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index c58a00c..80ae854 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -248,6 +248,7 @@ static const struct usb_device_id blacklist_table[] = {
 
/* QCA ROME chipset */
{ USB_DEVICE(0x0cf3, 0xe007), .driver_info = BTUSB_QCA_ROME },
+   { USB_DEVICE(0x0cf3, 0xe009), .driver_info = BTUSB_QCA_ROME },
{ USB_DEVICE(0x0cf3, 0xe300), .driver_info = BTUSB_QCA_ROME },
{ USB_DEVICE(0x0cf3, 0xe360), .driver_info = BTUSB_QCA_ROME },
{ USB_DEVICE(0x0489, 0xe092), .driver_info = BTUSB_QCA_ROME },
-- 
2.8.1
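
The effect of the added table entry can be pictured with a small userspace model. This is a sketch under stated assumptions, not the driver's code: `struct sketch_usb_id`, `SKETCH_QCA_ROME` and `sketch_lookup()` are hypothetical names, and in the kernel the USB core does the real matching. Only the vendor/product pairs are taken from the hunk above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value standing in for BTUSB_QCA_ROME. */
#define SKETCH_QCA_ROME 0x1u

/* Hypothetical, simplified stand-in for a USB id table entry. */
struct sketch_usb_id {
    uint16_t vendor;
    uint16_t product;
    unsigned int driver_info;
};

/* The QCA ROME entries visible in the hunk, including the new 0cf3:e009. */
static const struct sketch_usb_id sketch_table[] = {
    { 0x0cf3, 0xe007, SKETCH_QCA_ROME },
    { 0x0cf3, 0xe009, SKETCH_QCA_ROME },
    { 0x0cf3, 0xe300, SKETCH_QCA_ROME },
    { 0x0cf3, 0xe360, SKETCH_QCA_ROME },
    { 0x0489, 0xe092, SKETCH_QCA_ROME },
};

/* Return the driver_info flags for a vendor:product pair, or 0 if unlisted. */
unsigned int sketch_lookup(uint16_t vendor, uint16_t product)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++) {
        if (sketch_table[i].vendor == vendor &&
            sketch_table[i].product == product)
            return sketch_table[i].driver_info;
    }
    return 0;
}
```

With the new entry present, sketch_lookup(0x0cf3, 0xe009) returns the ROME flag, so the device would take the same setup path as its siblings; without it the lookup returns 0.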



Re: ASoC: sun4i-codec: playback stall and I/O error with DAPM paths all disabled

2016-08-15 Thread wens Tsai
On Mon, Aug 15, 2016 at 7:42 PM, Mark Brown  wrote:
> On Mon, Aug 15, 2016 at 05:43:55PM +0800, wens Tsai wrote:
>
>> What is unexpected is any attempt to play anything under this state makes
>> the playback software (in my case mpg321) stall, and later report an I/O
>> error. My guess is that the DAC is still disabled by DAPM, so it doesn't
>> send any DRQs, and thus the DMA engine is not consuming any data from
>> userspace.
>
> This is normal for ASoC - like you say it'll be because the hardware
> isn't powered up.
>
>> I think we should just enable the digital bits of the DAC/ADC all the
>> time. Or maybe transfer and then discard data if the DAC is off. Not
>> sure if this is doable though. I expect playback software to work, and
>> not block, regardless of the hardware status.
>
> Powering things up all the time will have a major effect on battery life
> for systems that care about that.  The expectation is that systems with
> this sort of hardware won't normally be offering end users direct
> control of the routing, it'll be something that's handled during system
> integration.

Ok. So I guess one solution would be to move the mute controls out of
DAPM, and maybe change some other mux like paths into actual muxes, so
there's at least one usable path at all times.

IIRC there was a patch doing something like this. I'll look into it.

Regards
ChenYu


Re: [lkp] [usb] ad05399d68: BUG: unable to handle kernel NULL pointer dereference at 0000000000000012

2016-08-15 Thread Ye Xiaolong
On 08/16, Peter Chen wrote:
>On Mon, Aug 15, 2016 at 10:49:55PM +0800, Ye Xiaolong wrote:
>> On 08/15, Peter Chen wrote:
>> > 
>> >>
>> >>
>> >>FYI, we noticed the following commit:
>> >>
>> >>https://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git testing/next 
>> >>commit
>> >>ad05399d68b6ae1649cdcfc82ce3ffea1a7c5104 ("usb: udc: core: fix error 
>> >>handling")
>> >>
>> >
>> >Hi Xiaolong,
>> >
>> >You reported it one month ago, and said it is a false report. see below.
>> >Would you please double confirm it?
>> 
>> Hi, peter
>> 
>> Last time I reported stat "WARNING: CPU: 0 PID: 1 at
>> lib/list_debug.c:36" and it showed both in this commit and its parent,
>> this time, the observed change stat is "BUG: unable to handle kernel NULL
>> pointer dereference at 0000000000000012" and it doesn't show in parent
>> commit, however, the parent commit's dmesg would show kernel panic log
>> as:
>> 
>> [   10.338487] Kernel panic - not syncing: Attempted to kill init! 
>> exitcode=0x000b
>> [   10.338487] 
>> [   10.339911] CPU: 0 PID: 1 Comm: init Not tainted 4.8.0-rc1-00020-g0937a4d 
>> #1
>> [   10.341177] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
>> Debian-1.8.2-1 04/01/2014
>> [   10.342798]   88001e53bc28 8168cf8a 
>> 88001e534000
>> [   10.345177]  8256ef20 88001e53bcb8 88001e50ca50 
>> 88001e53bca8
>> [   10.346739]  8114e062 8810 88001e53bcb8 
>> 88001e53bc50
>> [   10.347970] Call Trace:
>> [   10.348690]  [] dump_stack+0x83/0xb9
>> [   10.351592]  [] panic+0xf3/0x2a9
>> [   10.352386]  [] do_exit+0x601/0xde0
>> [   10.352879]  [] ? __sigqueue_free+0x43/0x50
>> [   10.353511]  [] ? __dequeue_signal+0x1f7/0x210
>> [   10.354483]  [] do_group_exit+0xa2/0x100
>> [   10.355324]  [] get_signal+0x68e/0x740
>> [   10.356155]  [] do_signal+0x23/0x670
>> [   10.356983]  [] ? do_syslog+0x2c0/0x6a0
>> [   10.357832]  [] ? bad_area_nosemaphore+0x33/0x40
>> [   10.358825]  [] ? __do_page_fault+0x407/0x4d0
>> [   10.359738]  [] exit_to_usermode_loop+0x69/0xc0
>> [   10.360680]  [] prepare_exit_to_usermode+0x3d/0x70
>> [   10.361725]  [] retint_user+0x8/0x10
>> [   10.362650] Kernel Offset: disabled
>> 
>> The whole parent dmesg is attached.
>> 
>
>Then, what's the conclusion? Is this one a real defect or not?
>

It seems the parent kernel lives longer than this commit, and the
sysfs_kf_write bug shows up consistently in 3 boot tests in the LKP
environment.

% compare -at ad05399d68b6ae1649cdcfc82ce3ffea1a7c5104
tests: 3
testcase/path_params/tbox_group/run: boot/1/vm-kbuild-yocto-x86_64

0937a4d787539e2f  ad05399d68b6ae1649cdcfc82c
----------------  --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          6:6         -100%            :4     kmsg.stc):gdata/new_proto/recv_or_reg_complete_cb_not_ready
          6:6         -100%            :4     kmsg.fmdrv:st_unregister_failed
           :6          100%           4:4     kmsg.list_del_corruption.prev->next_should_be#,but_was
           :6          100%           4:4     dmesg.WARNING:at_lib/list_debug.c:#__list_del_entry
           :6          100%           4:4     dmesg.BUG:unable_to_handle_kernel
           :6          100%           4:4     dmesg.Oops
           :6          100%           4:4     dmesg.RIP:sysfs_kf_write
           :6          100%           4:4     dmesg.Kernel_panic-not_syncing:Fatal_exception
          6:6         -100%            :4     dmesg.Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=

testcase/path_params/tbox_group/run: boot/1/vm-ivb41-yocto-ia32

0937a4d787539e2f  ad05399d68b6ae1649cdcfc82c
----------------  --------------------------
          2:2         -100%            :2     kmsg.stc):gdata/new_proto/recv_or_reg_complete_cb_not_ready
          2:2         -100%            :2     kmsg.fmdrv:st_unregister_failed
           :2          100%           2:2     dmesg.BUG:unable_to_handle_kernel
           :2          100%           2:2     dmesg.Oops
           :2          100%           2:2     dmesg.RIP:sysfs_kf_write
           :2          100%           2:2     dmesg.Kernel_panic-not_syncing:Fatal_exception
          2:2         -100%            :2     dmesg.BUG:kernel_test_hang

testcase/path_params/tbox_group/run: boot/1/vm-kbuild-1G

0937a4d787539e2f  ad05399d68b6ae1649cdcfc82c
----------------  --------------------------
           :4          100%           4:4     dmesg.WARNING:at_lib/list_debug.c:#__list_del_entry
           :4           75%           3:4     dmesg.BUG:unable_to_handle_kernel
           :4           75%           3:4     dmesg.Oops
           :4           75%           3:4     dmesg.RIP:sysfs_kf_write
           :4           75%           3:4     dmesg.Kernel_panic-not_syncing:Fatal_exception
          4:4         -100%            :4     dmesg.BUG:kernel_oversize_in_test_stage


>Peter
>
>> Thanks,
>> Xiaolong
>> 
>> >
>> >On Wed, Jul 13, 2016 at 

Re: [PATCH v2 2/3] ses: use scsi_is_sas_rphy instead of is_sas_attached

2016-08-15 Thread kbuild test robot
Hi Johannes,

[auto build test ERROR on scsi/for-next]
[also build test ERROR on v4.8-rc2 next-20160815]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Johannes-Thumshirn/Fix-panic-when-a-SES-device-is-attached-to-a-hpsa-logical-volume/20160815-231901
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
config: i386-randconfig-h0-08161012 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

>> ERROR: "scsi_is_sas_rphy" [drivers/scsi/ses.ko] undefined!

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: Binary data


Re: [PATCH] exynos-drm: Fix display manager failing to start without IOMMU problem

2016-08-15 Thread Inki Dae
Hi Shuah,

On 2016-08-13 at 02:52, Shuah Khan wrote:
> On 08/12/2016 11:28 AM, Shuah Khan wrote:
>> On 08/10/2016 05:05 PM, Shuah Khan wrote:
>>> On 08/10/2016 04:59 PM, Inki Dae wrote:
>>>> Hi Shuah,
>>>>
>>>> On 2016-08-11 at 02:30, Shuah Khan wrote:
>>>>> Fix exynos_drm_gem_create_ioctl() attempts to allocate non-contiguous GEM
>>>>> memory without IOMMU. In this case, there is no point in attempting to
>>>>
>>>> DRM gem can be used for Non-DRM drivers such as GPU, V4L2 based Multimedia
>>>> device and other DMA devices.
>>>> Even though IOMMU support is disabled, other framework based DMA drivers
>>>> can use IOMMU - i.e., GPU driver -
>>>> and they can use non-contiguous GEM buffer through UMM. (DMABUF)
>>>>
>>>> So GEM allocation type is not dependent on IOMMU.
>>>
>>> Hi Inki,
>>>
>>> I am seeing the following failure without IOMMU and light dm fails
>>> to start:
>>>
>>> [drm:exynos_drm_framebuffer_init] *ERROR* Non-continguous GEM memory is not 
>>> supported.
>>>
>>> The change I made fixed that problem and light dm starts without IOMMU.
>>> Is there a better way to fix this problem? Currently without IOMMU,
>>> light dm doesn't start.
>>>
>>> This is on linux_next
>>
>> Hi Inki,
>>
>> I am looking into this further and I am finding inconsistent
>> commits with regards to GEM contiguous and non-contiguous
>> buffers.
>>
>> Okay what you said is that:
>>
>> exynos-drm should support non-contiguous and contiguous GEM memory
>> type with or without IOMMU

Right.

>>
>> However, the code currently isn't doing that. The following
>> commit allocates non-contiguous buffers when IOMMU is enabled
>> to handle contiguous allocation failures.
>>
>> There are other commits that removed checks for non-contig type.
>> Let's look at the following cases to see what should be the driver
>> behavior in these cases:
>>
>> IOMMU is disabled:
>>
>> exynos_drm_gem_create_ioctl() gets called with NONCONTIG
>> - driver should try to allocate non-contig
>> - if it can't allocate non-contig, allocate contig
>>   (this will avoid failures like the one I am seeing)
>>
>> exynos_drm_gem_create_ioctl() gets called with CONTIG
>> - driver should try to allocate contig
>> - if it can't allocate contig, allocate non-contig
>>
>> What is confusing is there are several code paths in the
>> GEM allocation and checking memory types are enforcing
>> non-contig with IOMMU. Check this routine:
>>
>> exynos_drm_framebuffer_init() will reject non-contig
>> memory type when check_fb_gem_memory_type() rejects
>> non-contig GEM memory type without IOMMU.

The gem memory type should be checked only when the gem buffer is used as a
framebuffer, because in that case the DMA of the display controller accesses
the gem buffer, and without IOMMU a DMA device cannot access a non-contiguous
memory region.
That is why exynos_drm_framebuffer_init checks the gem memory type for the fb,
rather than checking it when the gem is created.
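
The distinction drawn here can be sketched as two separate checks. This is a simplified userspace model, not the driver code: `SKETCH_GEM_NONCONTIG`, `sketch_gem_create_check()` and `sketch_fb_check()` are hypothetical names, and the real driver uses its own buffer flags and helper functions.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag for this sketch; the driver defines its own buffer flags. */
#define SKETCH_GEM_NONCONTIG 0x1u

/*
 * GEM creation: the buffer may be consumed by devices other than the
 * display controller, so the memory type is accepted regardless of
 * whether an IOMMU is available.
 */
int sketch_gem_create_check(bool iommu_enabled, unsigned int flags)
{
    (void)iommu_enabled;
    (void)flags;
    return 0;
}

/*
 * Framebuffer init: the display controller's DMA scans the buffer out,
 * so a non-contiguous buffer is only usable when an IOMMU can map it.
 */
int sketch_fb_check(bool iommu_enabled, unsigned int flags)
{
    if (!iommu_enabled && (flags & SKETCH_GEM_NONCONTIG))
        return -EINVAL;
    return 0;
}
```

In this model, creating a non-contiguous buffer without IOMMU succeeds, and only attaching it as a framebuffer fails, which matches the error reported in the thread.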

> 
> 
> okay the very first commit that added IOMMU support
> introduced the code that rejects non-contig gem memory
> type without IOMMU.
> 
> commit 0519f9a12d0113caab78980c48a7902d2bd40c2c
> Author: Inki Dae 
> Date:   Sat Oct 20 07:53:42 2012 -0700
> 
> drm/exynos: add iommu support for exynos drm framework
> 
> Anyway, if it is the right change to fix check_fb_gem_memory_type()
> to not reject NONCONTIG_BUFFER, then I can make that change

No, as I mentioned above, the gem buffer for fb depends on IOMMU because it is
used by a DMA device: FIMD, DECON or Mixer.
Note that a gem buffer can also be used for other purposes, such as 2D/3D or
post-processing devices that don't use a framebuffer, and not only by the
display controller, which uses the framebuffer for scanout.

Thanks,
Inki Dae

> instead of this patch I sent.
> 
>>
>> So there is inconsistency in the non-contig vs. contig
>> GEM support in exynos-drm. I think this needs to be cleaned
>> up to get the desired behavior.
>>
>> The following commit allocates non-contiguous buffers when IOMMU is
>> enabled to handle contiguous allocation failures.
>>
>> There are other commits that removed checks for non-contig type.
>> Let's look at the following cases to see what should be the driver
>> behavior in these cases:
>>
>> commit 122beea84bb90236b1ae545f08267af58591c21b
>> Author: Rahul Sharma 
>> Date:   Wed May 7 17:21:29 2014 +0530
>>
>> drm/exynos: allocate non-contigous buffers when iommu is enabled
>> 
>> Allow to allocate non-contigous buffers when iommu is enabled.
>> Currently, it tries to allocates contigous buffer which consistently
>> fail for large buffers and then fall back to non contigous. Apart
>> from being slow, this implementation is also very noisy and fills
>> the screen with alloc fail logs.
>> 
>> Signed-off-by: Rahul Sharma 
>> Reviewed-by: Sachin Kamat 
>> Signed-off-by: Inki Dae 
>>
>>
>> 

[PATCH v2] arc: Add "model" properly in device tree description of all boards

2016-08-15 Thread Alexey Brodkin
As it was discussed quite some time ago (see
https://lkml.org/lkml/2015/11/5/862) it's a good practice to add
"model" property in .dts. Moreover as per ePAPR "model" property is
required and should look like "manufacturer,model" so we do here.

Signed-off-by: Alexey Brodkin 
Cc: Vineet Gupta 
Cc: Jonas Gorski 
Cc: Arnd Bergmann 
Cc: Rob Herring 
Cc: Christian Ruppert 
---

Changes v1 -> v2:
 * Added "hs" postfix for boards based on ARC HS core
 * Added "archs" postfix in VDK's .dts to distinguish VDKs for
   ARC cores from those for ARM cores

 arch/arc/boot/dts/abilis_tb100_dvk.dts | 1 +
 arch/arc/boot/dts/abilis_tb101_dvk.dts | 1 +
 arch/arc/boot/dts/axs101.dts   | 1 +
 arch/arc/boot/dts/axs103.dts   | 1 +
 arch/arc/boot/dts/axs103_idu.dts   | 1 +
 arch/arc/boot/dts/nsim_700.dts | 1 +
 arch/arc/boot/dts/nsim_hs.dts  | 1 +
 arch/arc/boot/dts/nsim_hs_idu.dts  | 1 +
 arch/arc/boot/dts/nsimosci.dts | 1 +
 arch/arc/boot/dts/nsimosci_hs.dts  | 1 +
 arch/arc/boot/dts/nsimosci_hs_idu.dts  | 1 +
 arch/arc/boot/dts/vdk_hs38.dts | 1 +
 arch/arc/boot/dts/vdk_hs38_smp.dts | 1 +
 13 files changed, 13 insertions(+)

diff --git a/arch/arc/boot/dts/abilis_tb100_dvk.dts b/arch/arc/boot/dts/abilis_tb100_dvk.dts
index 3dd6ed9..3acf04d 100644
--- a/arch/arc/boot/dts/abilis_tb100_dvk.dts
+++ b/arch/arc/boot/dts/abilis_tb100_dvk.dts
@@ -24,6 +24,7 @@
 /include/ "abilis_tb100.dtsi"
 
 / {
+   model = "abilis,tb100";
chosen {
bootargs = "earlycon=uart8250,mmio32,0xff10,9600n8 console=ttyS0,9600n8";
};
diff --git a/arch/arc/boot/dts/abilis_tb101_dvk.dts b/arch/arc/boot/dts/abilis_tb101_dvk.dts
index 1cf51c2..37d88c5 100644
--- a/arch/arc/boot/dts/abilis_tb101_dvk.dts
+++ b/arch/arc/boot/dts/abilis_tb101_dvk.dts
@@ -24,6 +24,7 @@
 /include/ "abilis_tb101.dtsi"
 
 / {
+   model = "abilis,tb101";
chosen {
bootargs = "earlycon=uart8250,mmio32,0xff10,9600n8 console=ttyS0,9600n8";
};
diff --git a/arch/arc/boot/dts/axs101.dts b/arch/arc/boot/dts/axs101.dts
index 3f9b058..d9b9b9d 100644
--- a/arch/arc/boot/dts/axs101.dts
+++ b/arch/arc/boot/dts/axs101.dts
@@ -13,6 +13,7 @@
 /include/ "axs10x_mb.dtsi"
 
 / {
+   model = "snps,axs101";
compatible = "snps,axs101", "snps,arc-sdp";
 
chosen {
diff --git a/arch/arc/boot/dts/axs103.dts b/arch/arc/boot/dts/axs103.dts
index e6d0e31..ec7fb27 100644
--- a/arch/arc/boot/dts/axs103.dts
+++ b/arch/arc/boot/dts/axs103.dts
@@ -16,6 +16,7 @@
 /include/ "axs10x_mb.dtsi"
 
 / {
+   model = "snps,axs103";
compatible = "snps,axs103", "snps,arc-sdp";
 
chosen {
diff --git a/arch/arc/boot/dts/axs103_idu.dts b/arch/arc/boot/dts/axs103_idu.dts
index f999fef..070c297 100644
--- a/arch/arc/boot/dts/axs103_idu.dts
+++ b/arch/arc/boot/dts/axs103_idu.dts
@@ -16,6 +16,7 @@
 /include/ "axs10x_mb.dtsi"
 
 / {
+   model = "snps,axs103-smp";
compatible = "snps,axs103", "snps,arc-sdp";
 
chosen {
diff --git a/arch/arc/boot/dts/nsim_700.dts b/arch/arc/boot/dts/nsim_700.dts
index 6397051..ce0ccd20 100644
--- a/arch/arc/boot/dts/nsim_700.dts
+++ b/arch/arc/boot/dts/nsim_700.dts
@@ -10,6 +10,7 @@
 /include/ "skeleton.dtsi"
 
 / {
+   model = "snps,nsim";
compatible = "snps,nsim";
#address-cells = <1>;
#size-cells = <1>;
diff --git a/arch/arc/boot/dts/nsim_hs.dts b/arch/arc/boot/dts/nsim_hs.dts
index bf05fe5..3772c40 100644
--- a/arch/arc/boot/dts/nsim_hs.dts
+++ b/arch/arc/boot/dts/nsim_hs.dts
@@ -10,6 +10,7 @@
 /include/ "skeleton_hs.dtsi"
 
 / {
+   model = "snps,nsim_hs";
compatible = "snps,nsim_hs";
#address-cells = <2>;
#size-cells = <2>;
diff --git a/arch/arc/boot/dts/nsim_hs_idu.dts b/arch/arc/boot/dts/nsim_hs_idu.dts
index 99eabe1..48434d7c 100644
--- a/arch/arc/boot/dts/nsim_hs_idu.dts
+++ b/arch/arc/boot/dts/nsim_hs_idu.dts
@@ -10,6 +10,7 @@
 /include/ "skeleton_hs_idu.dtsi"
 
 / {
+   model = "snps,nsim_hs-smp";
compatible = "snps,nsim_hs";
interrupt-parent = <&core_intc>;
 
diff --git a/arch/arc/boot/dts/nsimosci.dts b/arch/arc/boot/dts/nsimosci.dts
index e659a34..bcf6031 100644
--- a/arch/arc/boot/dts/nsimosci.dts
+++ b/arch/arc/boot/dts/nsimosci.dts
@@ -10,6 +10,7 @@
 /include/ "skeleton.dtsi"
 
 / {
+   model = "snps,nsimosci";
compatible = "snps,nsimosci";
#address-cells = <1>;
#size-cells = <1>;
diff --git a/arch/arc/boot/dts/nsimosci_hs.dts b/arch/arc/boot/dts/nsimosci_hs.dts
index 16ce5d6..14a727c 100644
--- a/arch/arc/boot/dts/nsimosci_hs.dts
+++ b/arch/arc/boot/dts/nsimosci_hs.dts
@@ -10,6 +10,7 @@
 /include/ "skeleton_hs.dtsi"
 
 / {
+   model = "snps,nsimosci_hs";
compatible = "snps,nsimosci_hs";
#address-cells = <1>;
#size-cells = <1>;
diff 

Re: [PATCH] powerpc/powernv: Initialise nest mmu

2016-08-15 Thread Balbir Singh


On 16/08/16 10:37, Alistair Popple wrote:
> Balbir,
> 
> 
>  
>>> +   /* Update partition table control register on all Nest MMUs */
>>> +   opal_nmmu_set_ptcr(-1UL, __pa(partition_tb) | (PATB_SIZE_SHIFT - 12));
>>> +
>>
>> Just wondering if
>>
>> 1. Instead of using -1 for all cpus, we should do
>>  for_each_online_cpu() {
>>  opal_nmmu_set_ptcr(...)
>>  }
> 
> Good question, but I don't think it makes sense to do that. The NMMU is
> per-chip/socket rather than per-cpu so it shouldn't be tied to
> onlining/offlining of individual CPUs.
> 
>> 2. In cpu hotplug path do the same when onlining and set to NULL on
>> offlining?
> 
> Again, the nmmu isn't tied to a specific CPU but rather a chip/socket. So in
> theory at least it's possible that all CPUs in a chip could be offline but
> other units on the chip could still be using the nmmu so we wouldn't want to
> disable the nmmu at that point.

Fair enough

Balbir Singh.
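
For reference, the value passed in the quoted hunk, `__pa(partition_tb) | (PATB_SIZE_SHIFT - 12)`, packs a size field into the low bits of the table's physical address. The sketch below only reproduces that arithmetic in userspace: `sketch_ptcr_value()` is a hypothetical name, and it assumes the table address is aligned so that the low bits are free.

```c
#include <stdint.h>

/*
 * Combine the partition table's physical address with its size field,
 * mirroring the expression in the quoted hunk:
 *   __pa(partition_tb) | (PATB_SIZE_SHIFT - 12)
 * table_pa:   physical address of the table (assumed suitably aligned)
 * size_shift: the shift value used for the table size
 */
uint64_t sketch_ptcr_value(uint64_t table_pa, unsigned int size_shift)
{
    return table_pa | (uint64_t)(size_shift - 12);
}
```

As an illustration with made-up inputs, a table at 0x10000000 with a shift of 16 gives 0x10000004. The shift value 16 is an assumption for the example only; the thread does not state what PATB_SIZE_SHIFT is.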


Re: [PATCH V5 3/4] drm/bridge: Add driver for GE B850v3 LVDS/DP++ Bridge

2016-08-15 Thread Archit Taneja

Hi,

On 08/09/2016 10:11 PM, Peter Senna Tschudin wrote:

Add a driver that creates a drm_bridge and a drm_connector for the LVDS
to DP++ display bridge of the GE B850v3.

There are two physical bridges on the video signal pipeline: a
STDP4028(LVDS to DP) and a STDP2690(DP to DP++).  The hardware and
firmware made it complicated for this binding to comprise two device
tree nodes, as the design goal is to configure both bridges based on
the LVDS signal, which leaves the driver powerless to control the video
processing pipeline. The two bridges behave as a single bridge, and
the driver is only needed for telling the host about EDID / HPD, and
for giving the host powers to ack interrupts. The video signal pipeline
is as follows:

   Host -> LVDS|--(STDP4028)--|DP -> DP|--(STDP2690)--|DP++ -> Video output



I'd commented on an earlier revision (v2) of this patch, but hadn't got
a response on it. Pasting the query again:

Are these two chips always expected to be used together? I don't think
it's right to pair up two encoder chips into one driver just for one
board.

Is one device @0x72 and other @0x73? Or is only one of them an i2c
slave?

What's preventing us from creating these as two different bridge drivers?
The drm framework allows us to daisy chain encoder bridges. The only
problem I see is that we don't have a clear-cut way to tell the bridge
driver whether we want it to create a connector for us or not. Because,
it looks like both can potentially create connectors. This isn't a big
problem either if we have DT. We just need to check whether our output
port is connected to another bridge or a connector.

Thanks,
Archit


Cc: Martyn Welch 
Cc: Martin Donnelly 
Cc: Daniel Vetter 
Cc: Enric Balletbo i Serra 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Fabio Estevam 
CC: David Airlie 
CC: Thierry Reding 
CC: Thierry Reding 
Reviewed-by: Enric Balletbo 
Signed-off-by: Peter Senna Tschudin 
---
Changes from V4:
  - Check the output of the first call to i2c_smbus_write_word_data() and return
its error code for failing gracefully on i2c issues
  - Renamed the i2c_driver.name from "ge,b850v3-lvds-dp" to "b850v3-lvds-dp" to
remove the comma from the driver name

Changes from V3:
  - 3/4 instead of 4/5
  - Tested on next-20160804

Changes from V2:
  - Made it atomic to be applied on next-20160729 on top of Liu Ying changes
that made imx-ldb atomic

Changes from V1:
  - New commit message
  - Removed 3 empty entry points
  - Removed memory leak from ge_b850v3_lvds_dp_get_modes()
  - Added a lock for mode setting
  - Removed a few blank lines
  - Changed the order at Makefile and Kconfig

  MAINTAINERS|   8 +
  drivers/gpu/drm/bridge/Kconfig |  11 +
  drivers/gpu/drm/bridge/Makefile|   1 +
  drivers/gpu/drm/bridge/ge_b850v3_lvds_dp.c | 405 +
  4 files changed, 425 insertions(+)
  create mode 100644 drivers/gpu/drm/bridge/ge_b850v3_lvds_dp.c

diff --git a/MAINTAINERS b/MAINTAINERS
index a306795..e8d106a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5142,6 +5142,14 @@ W:   https://linuxtv.org
  S:Maintained
  F:drivers/media/radio/radio-gemtek*

+GENERAL ELECTRIC B850V3 LVDS/DP++ BRIDGE
+M: Peter Senna Tschudin 
+M: Martin Donnelly 
+M: Martyn Welch 
+S: Maintained
+F: drivers/gpu/drm/bridge/ge_b850v3_lvds_dp.c
+F: Documentation/devicetree/bindings/ge/b850v3_dp2_bridge.txt
+
  GENERIC GPIO I2C DRIVER
  M:Haavard Skinnemoen 
  S:Supported
diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
index b590e67..b4b70fb 100644
--- a/drivers/gpu/drm/bridge/Kconfig
+++ b/drivers/gpu/drm/bridge/Kconfig
@@ -32,6 +32,17 @@ config DRM_DW_HDMI_AHB_AUDIO
  Designware HDMI block.  This is used in conjunction with
  the i.MX6 HDMI driver.

+config DRM_GE_B850V3_LVDS_DP
+   tristate "GE B850v3 LVDS to DP++ display bridge"
+   depends on OF
+   select DRM_KMS_HELPER
+   select DRM_PANEL
+   ---help---
+  This is a driver for the display bridge of
+  GE B850v3 that converts dual channel LVDS
+  to DP++. This is used with the i.MX6 imx-ldb
+  driver.
+
  config DRM_NXP_PTN3460
tristate "NXP PTN3460 DP/LVDS bridge"
depends on OF
diff --git a/drivers/gpu/drm/bridge/Makefile b/drivers/gpu/drm/bridge/Makefile
index efdb07e..b9606f3 100644
--- a/drivers/gpu/drm/bridge/Makefile
+++ b/drivers/gpu/drm/bridge/Makefile
@@ -3,6 +3,7 @@ ccflags-y := -Iinclude/drm
  obj-$(CONFIG_DRM_ANALOGIX_ANX78XX) += analogix-anx78xx.o
  

Re: [RFC] can we use vmalloc to alloc thread stack if compaction failed

2016-08-15 Thread Joonsoo Kim
On Wed, Aug 10, 2016 at 04:59:39AM -0700, Andy Lutomirski wrote:
> On Sun, Jul 31, 2016 at 10:30 PM, Joonsoo Kim  wrote:
> > On Fri, Jul 29, 2016 at 12:47:38PM -0700, Andy Lutomirski wrote:
> >> -- Forwarded message --
> >> From: "Joonsoo Kim" 
> >> Date: Jul 28, 2016 7:57 PM
> >> Subject: Re: [RFC] can we use vmalloc to alloc thread stack if compaction 
> >> failed
> >> To: "Andy Lutomirski" 
> >> Cc: "Xishi Qiu" , "Michal Hocko"
> >> , "Tejun Heo" , "Ingo Molnar"
> >> , "Peter Zijlstra" , "LKML"
> >> , "Linux MM" ,
> >> "Yisheng Xie" 
> >>
> >> > On Thu, Jul 28, 2016 at 08:07:51AM -0700, Andy Lutomirski wrote:
> >> > > On Thu, Jul 28, 2016 at 3:51 AM, Xishi Qiu  wrote:
> >> > > > On 2016/7/28 17:43, Michal Hocko wrote:
> >> > > >
> >> > > >> On Thu 28-07-16 16:45:06, Xishi Qiu wrote:
> >> > > >>> On 2016/7/28 15:58, Michal Hocko wrote:
> >> > > >>>
> >> > > >>>> On Thu 28-07-16 15:41:53, Xishi Qiu wrote:
> >> > > >>>>> On 2016/7/28 15:20, Michal Hocko wrote:
> >> > > >>>>>
> >> > > >>>>>> On Thu 28-07-16 15:08:26, Xishi Qiu wrote:
> >> > > >>>>>>> Usually THREAD_SIZE_ORDER is 2, it means we need to alloc 16kb 
> >> > > >>>>>>> continuous
> >> > > >>>>>>> physical memory during fork a new process.
> >> > > >>>>>>>
> >> > > >>>>>>> If the system's memory is very small, especially the smart 
> >> > > >>>>>>> phone, maybe there
> >> > > >>>>>>> is only 1G memory. So the free memory is very small and 
> >> > > >>>>>>> compaction is not
> >> > > >>>>>>> always success in slowpath(__alloc_pages_slowpath), then alloc 
> >> > > >>>>>>> thread stack
> >> > > >>>>>>> may be failed for memory fragment.
> >> > > >>>>>>
> >> > > >>>>>> Well, with the current implementation of the page allocator 
> >> > > >>>>>> those
> >> > > >>>>>> requests will not fail in most cases. The oom killer would be 
> >> > > >>>>>> invoked in
> >> > > >>>>>> order to free up some memory.
> >> > > >>>>>>
> >> > > >>>>>
> >> > > >>>>> Hi Michal,
> >> > > >>>>>
> >> > > >>>>> Yes, it success in most cases, but I did have seen this problem 
> >> > > >>>>> in some
> >> > > >>>>> stress-test.
> >> > > >>>>>
> >> > > >>>>> DMA free:470628kB, but alloc 2 order block failed during fork a 
> >> > > >>>>> new process.
> >> > > >>>>> There are so many memory fragments and the large block may be 
> >> > > >>>>> soon taken by
> >> > > >>>>> others after compact because of stress-test.
> >> > > >>>>>
> >> > > >>>>> --- dmesg messages ---
> >> > > >>>>> 07-13 08:41:51.341 
> >> > > >>>>> <4>[309805.658142s][pid:1361,cpu5,sManagerService]sManagerService:
> >> > > >>>>>  page allocation failure: order:2, mode:0x2000d1
> >> > > >>>>
> >> > > >>>> Yes but this is __GFP_DMA allocation. I guess you have already 
> >> > > >>>> reported
> >> > > >>>> this failure and you've been told that this is quite unexpected 
> >> > > >>>> for the
> >> > > >>>> kernel stack allocation. It is your out-of-tree patch which just 
> >> > > >>>> makes
> >> > > >>>> things worse because DMA restricted allocations are considered 
> >> > > >>>> "lowmem"
> >> > > >>>> and so they do not invoke OOM killer and do not retry like regular
> >> > > >>>> GFP_KERNEL allocations.
> >> > > >>>
> >> > > >>> Hi Michal,
> >> > > >>>
> >> > > >>> Yes, we add GFP_DMA, but I don't think this is the key for the 
> >> > > >>> problem.
> >> > > >>
> >> > > >> You are restricting the allocation request to a single zone which is
> >> > > >> definitely not good. Look at how many larger order pages are 
> >> > > >> available
> >> > > >> in the Normal zone.
> >> > > >>
> >> > > >>> If we do oom-killer, maybe we will get a large block later, but 
> >> > > >>> there
> >> > > >>> is enough free memory before oom(although most of them are 
> >> > > >>> fragments).
> >> > > >>
> >> > > >> Killing a task is of course the last resort action. It would give 
> >> > > >> you
> >> > > >> larger order blocks used for the victims thread.
> >> > > >>
> >> > > >>> I wonder if we can alloc success without kill any process in this 
> >> > > >>> situation.
> >> > > >>
> >> > > >> Sure it would be preferable to compact that memory but that might be
> >> > > >> hard with your restriction in place. Consider that DMA zone would 
> >> > > >> tend
> >> > > >> to be less movable than normal zones as users would have to pin it 
> >> > > >> for
> >> > > >> DMA. Your DMA is really large so this might turn out to just happen 
> >> > > >> to
> >> > > >> work but note that the primary problem here is that you put a zone
> >> > > >> restriction for your allocations.
> >> > > >>
> >> > > >>> Maybe use vmalloc is a good way, but I don't know the influence.
> >> > > >>
> >> > > >> You can have a look at vmalloc patches posted by Andy. They are not 
> >> > > >> that
> >> > > 

Re: [PATCH 2/6] x86/boot: Move compressed kernel to end of decompression buffer

2016-08-15 Thread Matt Mullins
[added Simon Glass to CC in case there's some input from u-boot]

On Thu, Apr 28, 2016 at 05:09:04PM -0700, Kees Cook wrote:
> From: Yinghai Lu 
> 
> This patch adds BP_init_size (which is the INIT_SIZE as passed in from
> the boot_params) into asm-offsets.c to make it visible to the assembly
> code. Then when moving the ZO, it calculates the starting position of
> the copied ZO (via BP_init_size and the ZO run size) so that the VO__end
> will be at the end of the decompression buffer. To make the position
> calculation safe, the end of ZO is page aligned (and a comment is added
> to the existing VO alignment for good measure).

> diff --git a/arch/x86/boot/compressed/head_64.S 
> b/arch/x86/boot/compressed/head_64.S
> index d43c30ed89ed..09cdc0c3ee7e 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -338,7 +340,9 @@ preferred_addr:
>  1:
>  
>   /* Target address to relocate to for decompression */
> - leaq z_extract_offset(%rbp), %rbx
> + movl BP_init_size(%rsi), %ebx
> + subl $_end, %ebx
> + addq %rbp, %rbx
>  
>   /* Set up the stack */
>   leaq boot_stack_end(%rbx), %rsp

This appears to have a negative effect on booting the Intel Edison platform, as
it uses u-boot as its bootloader.  u-boot does not copy the init_size parameter
when booting a bzImage: it copies a fixed-size setup_header [1], and its
definition of setup_header doesn't include the parameters beyond setup_data [2].

With a zero value for init_size, this calculates a %rsp value of 0x101ff9600.
This causes the boot process to hard-stop at the immediately-following pushq, as
this platform has no usable physical addresses above 4G.
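
The arithmetic described above can be checked with a small user-space model.
This is only a sketch: relocation_target(), end_off and load_addr are
hypothetical names standing in for the three-instruction sequence, $_end and
%rbp, and the numbers in the note below are made up for illustration, not
taken from the Edison boot.

```c
#include <stdint.h>

/*
 * Models the relocation sequence from the quoted hunk:
 *   movl BP_init_size(%rsi), %ebx
 *   subl $_end, %ebx
 *   addq %rbp, %rbx
 * On x86-64, 32-bit operations zero-extend into the full 64-bit
 * register, so a wrapped 32-bit difference turns into a large
 * positive 64-bit value before the load address is added.
 *
 * end_off and load_addr are hypothetical stand-ins for $_end and %rbp.
 */
uint64_t relocation_target(uint32_t init_size, uint32_t end_off,
                           uint64_t load_addr)
{
    uint32_t ebx = init_size - end_off;  /* wraps modulo 2^32 */

    return (uint64_t)ebx + load_addr;    /* zero-extended, then addq */
}
```

With an init_size of 0x01400000, an end offset of 0x00a00000 and a load
address of 0x01000000, the model returns 0x01a00000. With init_size left at
zero, the 32-bit subtraction wraps to 0xff600000 and the result is
0x100600000, which lies above 4G, the same kind of failure as the
0x101ff9600 stack pointer reported above.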

What are the options for getting this type of platform to function again?  For
now, kexec from a working Linux system does seem to be a work-around, but there
appear to be other x86 hardware platforms using u-boot: the chromium.org folks seem to be
maintaining the u-boot x86 tree.

[1] 
http://git.denx.de/?p=u-boot.git;a=blob;f=arch/x86/lib/zimage.c;h=1b33c771391f49ffe82864ff1582bdfd07e5e97d;hb=HEAD#l156
[2] 
http://git.denx.de/?p=u-boot.git;a=blob;f=arch/x86/include/asm/bootparam.h;h=140095117e5a2daef0a097c55f0ed10e08acc781;hb=HEAD#l24


Re: [PATCH 4.7 00/41] 4.7.1-stable review

2016-08-15 Thread Shuah Khan
On 08/14/2016 02:38 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.7.1 release.
> There are 41 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue Aug 16 20:25:22 UTC 2016.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.7.1-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.7.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America(Silicon Valley)
shuah...@samsung.com


Re: [PATCH v6 0/2] Block layer support ZAC/ZBC commands

2016-08-15 Thread Damien Le Moal

Shaun,

> On Aug 14, 2016, at 09:09, Shaun Tancheff  wrote:
[…]
>>> 
>> No, surely not.
>> But one of the _big_ advantages for the RB tree is blkdev_discard().
>> Without the RB tree any mkfs program will issue a 'discard' for every
>> sector. We will be able to coalesce those into one discard per zone, but
>> we still need to issue one for _every_ zone.
> 
> How can you make coalesce work transparently in the
> sd layer _without_ keeping some sort of a discard cache along
> with the zone cache?
> 
> Currently the block layer's blkdev_issue_discard() is breaking
> large discard's into nice granular and aligned chunks but it is
> not preventing small discards nor coalescing them.
> 
> In the sd layer would there be way to persist or purge an
> overly large discard cache? What about honoring
> discard_zeroes_data? Once the discard is completed with
> discard_zeroes_data you have to return zeroes whenever
> a discarded sector is read. Isn't that a log more than just
> tracking a write pointer? Couldn't a zone have dozens of holes?

My understanding of the standards regarding discard is that it is not
mandatory and that it is a hint to the drive. The drive can completely
ignore it if it thinks that is a better choice. I may be wrong on this
though; I need to check again.
For reset write pointer, the mapping to discard requires that the calls
to blkdev_issue_discard() be zone aligned for anything to happen. Specify
less than a zone and nothing will be done. This, I think, preserves the
discard semantics.
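
As a rough illustration of the alignment rule above, a check along these
lines could decide whether a discard maps to a reset write pointer. It is a
sketch only: discard_covers_whole_zones() is a hypothetical helper, not part
of the patch set, and it assumes a constant zone size expressed in sectors.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true only when the discard range starts
 * on a zone boundary and spans a whole number of zones. Anything
 * smaller than a zone, or misaligned, is treated as a no-op.
 * Assumes every zone is zone_sects sectors long.
 */
bool discard_covers_whole_zones(uint64_t sector, uint64_t nr_sects,
                                uint64_t zone_sects)
{
    if (zone_sects == 0 || nr_sects == 0)
        return false;

    return (sector % zone_sects) == 0 && (nr_sects % zone_sects) == 0;
}
```

With 524288-sector zones (256 MiB at 512 bytes per sector), a discard of
sectors 0..524287 qualifies, while an 8-sector discard or one starting at
sector 4096 does not.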

As for the “discard_zeroes_data” thing, I also think that it is an optional
drive feature. Drives may have it or not, which is consistent
with the ZBC/ZAC standards regarding reading after the write pointer (nothing
says that zeros have to be returned). In any case, discard of CMR zones
will be a nop, so for SMR drives, discard_zeroes_data=0 may be a better
choice.

> 
>> Which is (as indicated) really slow, and easily takes several minutes.
>> With the RB tree we can short-circuit discards to empty zones, and speed
>> up processing time dramatically.
>> Sure we could be moving the logic into mkfs and friends, but that would
>> require us to change the programs and agree on a library (libzbc?) which
>> should be handling that.
> 
> F2FS's mkfs.f2fs is already reading the zone topology via SG_IO ...
> so I'm not sure your argument is valid here.

This initial SMR support patch is just that: a first try. Jaegeuk
used SG_IO (in fact copy-paste of parts of libzbc) because the current
ZBC patch-set has no ioctl API for zone information manipulation. We
will fix this mkfs.f2fs once we agree on an ioctl interface.

> 
> [..]
> 
>>>> 3) Try to condense the blkzone data structure to save memory:
>>>> I think that we can at the very least remove the zone length, and also
>>>> may be the per zone spinlock too (a single spinlock and proper state flags 
>>>> can
>>>> be used).
>>> 
>>> I have a variant that is an array of descriptors that roughly mimics the
>>> api from blk-zoned.c that I did a few months ago as an example.
>>> I should be able to update that to the current kernel + patches.
>>> 
>> Okay. If we restrict the in-kernel SMR drive handling to devices with
>> identical zone sizes of course we can remove the zone length.
>> And we can do away with the per-zone spinlock and use a global one instead.
> 
> I don't think dropping the zone length is a reasonable thing to do.
> 
> What I propose is an array of _descriptors_ it doesn't drop _any_
> of the zone information that you are holding in an RB tree, it is
> just a condensed format that _mostly_ plugs into your existing
> API.

I do not agree. The Seagate drive already has one zone (the last one)
that is not the same length as the other zones. Sure, since it is the
last one, we can add “if (last zone)” all over the place and make it
work. But that is really ugly. Keeping the length field keeps the code
generic and in line with the standard, which has no restriction on the
zone sizes. We could do some memory optimisation using different types
of blk_zone structs, the types mapping to the SAME value: drives with
constant zone size can use a blk_zone type without the length field,
others use a different type that includes the field. Accessor functions
can hide the different types in the zone manipulation code.
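
The two-type idea can be sketched as follows. All names here (zone_compact,
zone_full, zone_table, zone_len(), zone_start()) are hypothetical and are not
the structures from the patch set; the same_len flag is an assumption meant
to mirror the SAME value mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compact descriptor for drives with a constant zone size. */
struct zone_compact {
    uint64_t start;  /* first sector of the zone */
    uint64_t wp;     /* write pointer position */
};

/* Hypothetical full descriptor carrying a per-zone length. */
struct zone_full {
    uint64_t start;
    uint64_t wp;
    uint64_t len;    /* zone length in sectors */
};

struct zone_table {
    bool same_len;        /* assumed to be derived from the SAME value */
    uint64_t common_len;  /* valid only when same_len is true */
    size_t nr_zones;
    union {
        struct zone_compact *compact;
        struct zone_full *full;
    } z;
};

/* Accessors hide which descriptor type backs the table. */
uint64_t zone_len(const struct zone_table *t, size_t i)
{
    return t->same_len ? t->common_len : t->z.full[i].len;
}

uint64_t zone_start(const struct zone_table *t, size_t i)
{
    return t->same_len ? t->z.compact[i].start : t->z.full[i].start;
}
```

Callers only ever use zone_len() and zone_start(), so a drive whose last zone
is shorter simply uses the full descriptor and needs no special casing in the
zone manipulation code.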

Best regards.


-- 
Damien Le Moal, Ph.D.
Sr. Manager, System Software Group, HGST Research,
HGST, a Western Digital brand
damien.lem...@hgst.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa, 
Kanagawa, 252-0888 Japan
www.hgst.com


Re: [PATCH 4.6 00/56] 4.6.7-stable review

2016-08-15 Thread Shuah Khan
On 08/14/2016 02:37 PM, Greg Kroah-Hartman wrote:
> *
> NOTE
>   This is the LAST 4.6.y kernel that will be released.  After this
>   release, it is end-of-life.  You should be moving on to 4.7.y at this
>   point in time.  You have been warned.
> *
> 
> This is the start of the stable review cycle for the 4.6.7 release.
> There are 56 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue Aug 16 20:24:52 UTC 2016.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.6.7-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.6.y
> and the diffstat can be found below.
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America(Silicon Valley)
shuah...@samsung.com


Re: [PATCH 4.4 00/49] 4.4.18-stable review

2016-08-15 Thread Shuah Khan
On 08/14/2016 02:23 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.18 release.
> There are 49 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue Aug 16 20:22:43 UTC 2016.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.18-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.4.y
> and the diffstat can be found below.
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America(Silicon Valley)
shuah...@samsung.com


Re: [PATCH 3.14 00/29] 3.14.76-stable review

2016-08-15 Thread Shuah Khan
On 08/14/2016 02:07 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.14.76 release.
> There are 29 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue Aug 16 20:07:18 UTC 2016.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.14.76-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-3.14.y
> and the diffstat can be found below.
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America(Silicon Valley)
shuah...@samsung.com


linux-next: Tree for Aug 16

2016-08-15 Thread Stephen Rothwell
Hi all,

Changes since 20160815:

Non-merge commits (relative to Linus' tree): 2149
 2086 files changed, 87009 insertions(+), 30762 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
(this fails its final link) and pseries_le_defconfig and i386, sparc
and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 241 trees (counting Linus' and 35 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (3684b03d8e9a Merge tag 'iommu-fixes-v4.8-rc2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu)
Merging fixes/master (d3396e1e4ec4 Merge tag 'fixes-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging kbuild-current/rc-fixes (b36fad65d61f kbuild: Initialize exported 
variables)
Merging arc-current/for-curr (45c3b08a117e ARC: Elide redundant setup of DMA 
callbacks)
Merging arm-current/fixes (87eed3c74d7c ARM: fix address limit restoration for 
undefined instructions)
Merging m68k-current/for-linus (6bd80f372371 m68k/defconfig: Update defconfigs 
for v4.7-rc2)
Merging metag-fixes/fixes (97b1d23f7bcb metag: Drop show_mem() from mem_init())
Merging powerpc-fixes/fixes (ca49e64f0cb1 selftests/powerpc: Specify we expect 
to build with std=gnu99)
Merging powerpc-merge-mpe/fixes (bc0195aad0da Linux 4.2-rc2)
Merging sparc/master (4620a06e4b3c shmem: Fix link error if huge pages support 
is disabled)
Merging net/master (d2fbdf76b85b tipc: fix NULL pointer dereference in 
shutdown())
Merging ipsec/master (1625f4529957 net/xfrm_input: fix possible NULL deref of 
tunnel.ip6->parms.i_key)
Merging netfilter/master (4b5b9ba553f9 openvswitch: do not ignore netdev errors 
when creating tunnel vports)
Merging ipvs/master (ea43f860d984 Merge branch 'ethoc-fixes')
Merging wireless-drivers/master (034fdd4a17ff Merge ath-current from ath.git)
Merging mac80211/master (4d0bd46a4d55 Revert "wext: Fix 32 bit iwpriv 
compatibility issue with 64 bit Kernel")
Merging sound-current/for-linus (a52ff34e5ec6 ALSA: hda - Manage power well 
properly for resume)
Merging pci-current/for-linus (8b078c603249 PCI: Update 
"pci=resource_alignment" documentation)
Merging driver-core.current/driver-core-linus (694d0d0bb203 Linux 4.8-rc2)
Merging tty.current/tty-linus (29b4817d4018 Linux 4.8-rc1)
Merging usb.current/usb-linus (add125054b87 cdc-acm: fix wrong pipe type on rx 
interrupt xfers)
Merging usb-gadget-fixes/fixes (a0ad85ae866f usb: dwc3: gadget: stop processing 
on HWO set)
Merging usb-serial-fixes/usb-linus (3b7c7e52efda USB: serial: mos7840: fix 
non-atomic allocation in write path)
Merging usb-chipidea-fixes/ci-for-usb-stable (ea1d39a31d3b usb: common: 
otg-fsm: add license to usb-otg-fsm)
Merging staging.current/staging-linus (99f1c013194e staging/lustre/llite: Close 
atomic_open race with several openers)
Merging char-misc.current/char-misc-linus (7b142d8fd0bd android: binder: fix 
dangling pointer comparison)
Merging input-current/for-linus (22fe874f3803 Input: silead - remove some dead 
code)
Merging crypto-current/master (a0118c8b2be9 crypto: caam - fix non-hmac hashes)
Merging ide/master (797cee982eef Merge branch 'stable-4.8' of 
git://git.infradead.org/users/pcmoore/audit)
Merging rr-fixes/fixes (8244062ef1e5 modules: fix longstanding /proc/kallsyms 
vs module insertion race.)
Merging vfio-fixes/for-linus (c8952a707556 vfio/pci: Fix NULL pointer oops in 
error interrupt setup handling)
Merging kselftest-fixes/fixes (29b4817d4018 Linux 4.8-rc1)
Merging backli

[PATCH V3 2/2] rtc/rtc-cmos: Initialize software counters before irq is registered

2016-08-15 Thread Pratyush Anand
We have observed on a few x86 machines with rtc-cmos device that
hpet_rtc_interrupt() is called just after irq registration and before
cmos_do_probe() could call hpet_rtc_timer_init().

So, neither hpet_default_delta nor hpet_t1_cmp is initialized by the time
interrupt is raised in the given situation, and this results in NMI
watchdog LOCKUP.

It has only been observed sporadically on kdump secondary kernels.

See the call trace:
---<-snip->---
[   27.913194] Kernel panic - not syncing: Watchdog detected hard LOCKUP on
cpu 0
[   27.915371] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
3.10.0-342.el7.x86_64 #1
[   27.917503] Hardware name: HP ProLiant DL160 Gen8, BIOS J03 02/10/2014
[   27.919455]  8186a728 59c82488 880034e05af0
81637bd4
[   27.921870]  880034e05b70 8163144a 0010
880034e05b80
[   27.924257]  880034e05b20 59c82488 

[   27.926599] Call Trace:
[   27.927352][] dump_stack+0x19/0x1b
[   27.929080]  [] panic+0xd8/0x1e7
[   27.930588]  [] ? restart_watchdog_hrtimer+0x50/0x50
[   27.932502]  [] watchdog_overflow_callback+0xc2/0xd0
[   27.934427]  [] __perf_event_overflow+0xa1/0x250
[   27.936232]  [] perf_event_overflow+0x14/0x20
[   27.937957]  [] intel_pmu_handle_irq+0x1e8/0x470
[   27.939799]  [] perf_event_nmi_handler+0x2b/0x50
[   27.941649]  [] nmi_handle.isra.0+0x69/0xb0
[   27.943348]  [] do_nmi+0x169/0x340
[   27.944802]  [] end_repeat_nmi+0x1e/0x2e
[   27.946424]  [] ? hpet_rtc_interrupt+0x85/0x380
[   27.948197]  [] ? hpet_rtc_interrupt+0x85/0x380
[   27.949992]  [] ? hpet_rtc_interrupt+0x85/0x380
[   27.951816]  <>[] ?
run_timer_softirq+0x43/0x340
[   27.954114]  [] handle_irq_event_percpu+0x3e/0x1e0
[   27.955962]  [] handle_irq_event+0x3d/0x60
[   27.957635]  [] handle_edge_irq+0x77/0x130
[   27.959332]  [] handle_irq+0xbf/0x150
[   27.960949]  [] do_IRQ+0x4f/0xf0
[   27.962434]  [] common_interrupt+0x6d/0x6d
[   27.964101][] ?
_raw_spin_unlock_irqrestore+0x1b/0x40
[   27.966308]  [] __setup_irq+0x2a7/0x570
[   28.067859]  [] ? hpet_cpuhp_notify+0x140/0x140
[   28.069709]  [] request_threaded_irq+0xcc/0x170
[   28.071585]  [] cmos_do_probe+0x1e6/0x450
[   28.073240]  [] ? cmos_do_probe+0x450/0x450
[   28.074911]  [] cmos_pnp_probe+0xbb/0xc0
[   28.076533]  [] pnp_device_probe+0x65/0xd0
[   28.078198]  [] driver_probe_device+0x87/0x390
[   28.079971]  [] __driver_attach+0x93/0xa0
[   28.081660]  [] ? __device_attach+0x40/0x40
[   28.083662]  [] bus_for_each_dev+0x73/0xc0
[   28.085370]  [] driver_attach+0x1e/0x20
[   28.086974]  [] bus_add_driver+0x200/0x2d0
[   28.088634]  [] ? rtc_sysfs_init+0xe/0xe
[   28.090349]  [] driver_register+0x64/0xf0
[   28.091989]  [] pnp_register_driver+0x20/0x30
[   28.093707]  [] cmos_init+0x11/0x71
---<-snip->---

The previous patch split hpet_rtc_timer_init into
hpet_rtc_timer_counter_init() and hpet_rtc_timer_enable().

Therefore, this patch moves hpet_rtc_timer_counter_init() before IRQ
registration, so that we can gracefully handle such spurious interrupts.

We were able to reproduce the problem within at most 15 trials of kdump
secondary kernel boot on an hp-dl160gen8 machine without this patch.
However, more than 35 trials went fine after applying this patch.
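
Why the ordering matters can be shown with a toy user-space model. This is
not the kernel code, and the message above does not spell out the exact
mechanism: the model assumes an interrupt handler that catches up by
advancing a comparator in steps of a delta, so a zero delta never makes
progress. toy_timer, toy_counter_init() and toy_interrupt() are hypothetical
names.

```c
#include <stdint.h>

/* Toy stand-ins for hpet_default_delta and hpet_t1_cmp. */
struct toy_timer {
    uint32_t delta;
    uint32_t cmp;
};

/* Plays the role of the counter initialization done before IRQ setup. */
void toy_counter_init(struct toy_timer *t, uint32_t now, uint32_t delta)
{
    t->delta = delta;
    t->cmp = now + delta;
}

/*
 * Toy interrupt handler: advances cmp until it is ahead of now.
 * Returns the number of catch-up steps, or -1 when delta is zero,
 * where a real catch-up loop of this shape would spin forever.
 */
int toy_interrupt(struct toy_timer *t, uint32_t now)
{
    int steps = 0;

    if (t->delta == 0)
        return -1;

    while ((int32_t)(t->cmp - now) <= 0) {
        t->cmp += t->delta;
        steps++;
    }
    return steps;
}
```

An interrupt that arrives on a zeroed toy_timer reports -1, which corresponds
to the old ordering. Calling toy_counter_init() first lets the same interrupt
return normally, which is the ordering this patch establishes.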

Signed-off-by: Pratyush Anand 
[dzic...@redhat.com: edited the patch's summary]
Signed-off-by: Don Zickus 
---
 drivers/rtc/rtc-cmos.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index 43745cac0141..089d987f2638 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -129,6 +129,16 @@ static inline int hpet_rtc_dropped_irq(void)
return 0;
 }
 
+static inline int hpet_rtc_timer_counter_init(void)
+{
+   return 0;
+}
+
+static inline int hpet_rtc_timer_enable(void)
+{
+   return 0;
+}
+
 static inline int hpet_rtc_timer_init(void)
 {
return 0;
@@ -707,6 +717,7 @@ cmos_do_probe(struct device *dev, struct resource *ports, 
int rtc_irq)
goto cleanup1;
}
 
+   hpet_rtc_timer_counter_init();
if (is_valid_irq(rtc_irq)) {
irq_handler_t rtc_cmos_int_handler;
 
@@ -729,7 +740,7 @@ cmos_do_probe(struct device *dev, struct resource *ports, 
int rtc_irq)
goto cleanup1;
}
}
-   hpet_rtc_timer_init();
+   hpet_rtc_timer_enable();
 
/* export at least the first block of NVRAM */
nvram.size = address_space - NVRAM_OFFSET;
-- 
2.5.5



[PATCH V3 1/2] rtc/hpet: Factorize hpet_rtc_timer_init()

2016-08-15 Thread Pratyush Anand
We need the ability to support initialization of hpet_default_delta and
hpet_t1_cmp counters before irq can be enabled.

This patch splits hpet_rtc_timer_init() into two functions:
hpet_rtc_timer_counter_init() and hpet_rtc_timer_enable(), so that the above
functionality can be achieved.

The next patch explains the need for this in detail.

No functional change in this patch.

Signed-off-by: Pratyush Anand 
[dzic...@redhat.com: edited the patch's summary]
Signed-off-by: Don Zickus 
---
 arch/x86/include/asm/hpet.h |  2 ++
 arch/x86/kernel/hpet.c  | 41 +++--
 2 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index cc285ec4b2c1..8eecb31bebcb 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -96,6 +96,8 @@ extern int hpet_set_alarm_time(unsigned char hrs, unsigned 
char min,
   unsigned char sec);
 extern int hpet_set_periodic_freq(unsigned long freq);
 extern int hpet_rtc_dropped_irq(void);
+extern int hpet_rtc_timer_counter_init(void);
+extern int hpet_rtc_timer_enable(void);
 extern int hpet_rtc_timer_init(void);
 extern irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id);
 extern int hpet_register_irq_handler(rtc_irq_handler handler);
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index ed16e58658a4..6f6d21059b1b 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -1074,14 +1074,12 @@ void hpet_unregister_irq_handler(rtc_irq_handler 
handler)
 EXPORT_SYMBOL_GPL(hpet_unregister_irq_handler);
 
 /*
- * Timer 1 for RTC emulation. We use one shot mode, as periodic mode
- * is not supported by all HPET implementations for timer 1.
- *
- * hpet_rtc_timer_init() is called when the rtc is initialized.
+ * hpet_rtc_timer_counter_init() is called before interrupt can be
+ * registered
  */
-int hpet_rtc_timer_init(void)
+int hpet_rtc_timer_counter_init(void)
 {
-   unsigned int cfg, cnt, delta;
+   unsigned int cnt, delta;
unsigned long flags;
 
if (!is_hpet_enabled())
@@ -1106,6 +1104,22 @@ int hpet_rtc_timer_init(void)
hpet_writel(cnt, HPET_T1_CMP);
hpet_t1_cmp = cnt;
 
+   local_irq_restore(flags);
+
+   return 1;
+}
+EXPORT_SYMBOL_GPL(hpet_rtc_timer_counter_init);
+
+/*
+ * hpet_rtc_timer_enable() is called during RTC initialization
+ */
+int hpet_rtc_timer_enable(void)
+{
+   unsigned int cfg;
+   unsigned long flags;
+
+   local_irq_save(flags);
+
cfg = hpet_readl(HPET_T1_CFG);
cfg &= ~HPET_TN_PERIODIC;
cfg |= HPET_TN_ENABLE | HPET_TN_32BIT;
@@ -1115,6 +1129,21 @@ int hpet_rtc_timer_init(void)
 
return 1;
 }
+EXPORT_SYMBOL_GPL(hpet_rtc_timer_enable);
+
+/*
+ * Timer 1 for RTC emulation. We use one shot mode, as periodic mode
+ * is not supported by all HPET implementations for timer 1.
+ *
+ * hpet_rtc_timer_init() is called when the rtc is initialized.
+ */
+int hpet_rtc_timer_init(void)
+{
+   if (!hpet_rtc_timer_counter_init())
+   return 0;
+
+   return hpet_rtc_timer_enable();
+}
 EXPORT_SYMBOL_GPL(hpet_rtc_timer_init);
 
 static void hpet_disable_rtc_channel(void)
-- 
2.5.5



[PATCH 6/7] dax: define a unified inode/address_space for device-dax mappings

2016-08-15 Thread Dan Williams
In support of enabling resize / truncate of device-dax instances, define
a pseudo-fs to provide a unified inode/address space for vm operations.

Cc: Al Viro 
Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c  |  150 +++-
 fs/char_dev.c  |1 
 include/uapi/linux/magic.h |1 
 3 files changed, 148 insertions(+), 4 deletions(-)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 17715773c097..e8b9319aeadb 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -13,7 +13,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -26,6 +28,9 @@ static struct class *dax_class;
 static DEFINE_IDA(dax_minor_ida);
 static int nr_dax = CONFIG_NR_DEV_DAX;
 module_param(nr_dax, int, S_IRUGO);
+static struct vfsmount *dax_mnt;
+static struct kmem_cache *dax_cache __read_mostly;
+static struct super_block *dax_superblock __read_mostly;
 MODULE_PARM_DESC(nr_dax, "max number of device-dax instances");
 
 /**
@@ -61,6 +66,7 @@ struct dax_region {
  */
 struct dax_dev {
struct dax_region *region;
+   struct inode *inode;
struct device dev;
struct cdev cdev;
bool alive;
@@ -69,6 +75,117 @@ struct dax_dev {
struct resource res[0];
 };
 
+static struct inode *dax_alloc_inode(struct super_block *sb)
+{
+   return kmem_cache_alloc(dax_cache, GFP_KERNEL);
+}
+
+static void dax_i_callback(struct rcu_head *head)
+{
+   struct inode *inode = container_of(head, struct inode, i_rcu);
+
+   kmem_cache_free(dax_cache, inode);
+}
+
+static void dax_destroy_inode(struct inode *inode)
+{
+   call_rcu(&inode->i_rcu, dax_i_callback);
+}
+
+static const struct super_operations dax_sops = {
+   .statfs = simple_statfs,
+   .alloc_inode = dax_alloc_inode,
+   .destroy_inode = dax_destroy_inode,
+   .drop_inode = generic_delete_inode,
+};
+
+static struct dentry *dax_mount(struct file_system_type *fs_type,
+   int flags, const char *dev_name, void *data)
+{
+   return mount_pseudo(fs_type, "dax:", &dax_sops, NULL, DAXFS_MAGIC);
+}
+
+static struct file_system_type dax_type = {
+   .name = "dax",
+   .mount = dax_mount,
+   .kill_sb = kill_anon_super,
+};
+
+static int dax_test(struct inode *inode, void *data)
+{
+   return inode->i_cdev == data;
+}
+
+static int dax_set(struct inode *inode, void *data)
+{
+   inode->i_cdev = data;
+   return 0;
+}
+
+static struct inode *dax_inode_get(struct cdev *cdev, dev_t devt)
+{
+   struct inode *inode;
+
+   inode = iget5_locked(dax_superblock, hash_32(devt + DAXFS_MAGIC, 31),
+   dax_test, dax_set, cdev);
+
+   if (!inode)
+   return NULL;
+
+   if (inode->i_state & I_NEW) {
+   inode->i_mode = S_IFCHR;
+   inode->i_flags = S_DAX;
+   inode->i_rdev = devt;
+   mapping_set_gfp_mask(&inode->i_data, GFP_USER);
+   unlock_new_inode(inode);
+   }
+   return inode;
+}
+
+static void init_once(void *inode)
+{
+   inode_init_once(inode);
+}
+
+static int dax_inode_init(void)
+{
+   int rc;
+
+   dax_cache = kmem_cache_create("dax_cache", sizeof(struct inode), 0,
+   (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
+SLAB_MEM_SPREAD|SLAB_ACCOUNT),
+   init_once);
+   if (!dax_cache)
+   return -ENOMEM;
+
+   rc = register_filesystem(&dax_type);
+   if (rc)
+   goto err_register_fs;
+
+   dax_mnt = kern_mount(&dax_type);
+   if (IS_ERR(dax_mnt)) {
+   rc = PTR_ERR(dax_mnt);
+   goto err_mount;
+   }
+   dax_superblock = dax_mnt->mnt_sb;
+
+   return 0;
+
+ err_mount:
+   unregister_filesystem(&dax_type);
+ err_register_fs:
+   kmem_cache_destroy(dax_cache);
+
+   return rc;
+}
+
+static void dax_inode_exit(void)
+{
+   kern_unmount(dax_mnt);
+   unregister_filesystem(&dax_type);
+   kmem_cache_destroy(dax_cache);
+}
+
 static void dax_region_free(struct kref *kref)
 {
struct dax_region *dax_region;
@@ -379,6 +496,9 @@ static int dax_open(struct inode *inode, struct file *filp)
 
dax_dev = container_of(inode->i_cdev, struct dax_dev, cdev);
dev_dbg(&dax_dev->dev, "%s\n", __func__);
+   inode->i_mapping = dax_dev->inode->i_mapping;
+   inode->i_mapping->host = dax_dev->inode;
+   filp->f_mapping = inode->i_mapping;
filp->private_data = dax_dev;
inode->i_flags = S_DAX;
 
@@ -410,6 +530,7 @@ static void dax_dev_release(struct device *dev)
ida_simple_remove(&dax_region->ida, dax_dev->id);
ida_simple_remove(&dax_minor_ida, MINOR(dev->devt));
dax_region_put(dax_region);
+   iput(dax_dev->inode);
kfree(dax_dev);
 }
 
@@ -459,6 +580,12 @@ int devm_create_dax_dev(struct dax_region *dax_region, 
struct resource 

[PATCH 5/7] dax: convert to the cdev api

2016-08-15 Thread Dan Williams
A goal of the device-DAX interface is to be able to support many
exclusive allocations (partitions) of performance / feature
differentiated memory.  This count may exceed the default minors limit
of 256.

As a result of switching to an embedded cdev the inode-to-dax_dev
conversion is simplified, as well as reference counting which can switch
to the cdev kobject lifetime.
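
For readers following along, the simplified inode-to-dax_dev conversion
can be illustrated in user space. This is only a sketch: the struct
definitions are heavily reduced stand-ins (the kernel's struct cdev and
struct inode carry far more state), the container_of() macro below omits
the kernel's type checking, and inode_to_dax_dev() is a hypothetical
helper name, not something the patch adds.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's container_of() (no type checking). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct cdev { int dummy; };
struct inode { struct cdev *i_cdev; };

struct dax_dev {
	int id;
	struct cdev cdev;	/* embedded object, not a pointer */
};

/* Recover the dax_dev that embeds the cdev the inode points at. */
static struct dax_dev *inode_to_dax_dev(struct inode *inode)
{
	return container_of(inode->i_cdev, struct dax_dev, cdev);
}
```

Because the cdev lives inside the dax_dev, the lookup is plain pointer
arithmetic and needs no class_find_device() walk.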

Cc: Al Viro 
Signed-off-by: Dan Williams 
---
 drivers/dax/Kconfig |5 +++
 drivers/dax/dax.c   |   82 ++-
 2 files changed, 46 insertions(+), 41 deletions(-)

diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig
index cedab7572de3..daadd20aa936 100644
--- a/drivers/dax/Kconfig
+++ b/drivers/dax/Kconfig
@@ -23,4 +23,9 @@ config DEV_DAX_PMEM
 
  Say Y if unsure
 
+config NR_DEV_DAX
+   int "Maximum number of Device-DAX instances"
+   default 32768
+   range 256 2147483647
+
 endif
diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 181d2a5a21e4..17715773c097 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -14,15 +14,19 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include "dax.h"
 
-static int dax_major;
+static dev_t dax_devt;
 static struct class *dax_class;
 static DEFINE_IDA(dax_minor_ida);
+static int nr_dax = CONFIG_NR_DEV_DAX;
+module_param(nr_dax, int, S_IRUGO);
+MODULE_PARM_DESC(nr_dax, "max number of device-dax instances");
 
 /**
  * struct dax_region - mapping infrastructure for dax devices
@@ -49,6 +53,7 @@ struct dax_region {
  * struct dax_dev - subdivision of a dax region
  * @region - parent region
  * @dev - device backing the character device
+ * @cdev - core chardev data
  * @alive - !alive + rcu grace period == no new mappings can be established
  * @id - child id in the region
  * @num_resources - number of physical address extents in this device
@@ -57,6 +62,7 @@ struct dax_region {
 struct dax_dev {
struct dax_region *region;
struct device dev;
+   struct cdev cdev;
bool alive;
int id;
int num_resources;
@@ -367,29 +373,12 @@ static unsigned long dax_get_unmapped_area(struct file 
*filp,
return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
 }
 
-static int __match_devt(struct device *dev, const void *data)
-{
-   const dev_t *devt = data;
-
-   return dev->devt == *devt;
-}
-
-static struct device *dax_dev_find(dev_t dev_t)
-{
-   return class_find_device(dax_class, NULL, &dev_t, __match_devt);
-}
-
 static int dax_open(struct inode *inode, struct file *filp)
 {
-   struct dax_dev *dax_dev = NULL;
-   struct device *dev;
-
-   dev = dax_dev_find(inode->i_rdev);
-   if (!dev)
-   return -ENXIO;
+   struct dax_dev *dax_dev;
 
-   dax_dev = to_dax_dev(dev);
-   dev_dbg(dev, "%s\n", __func__);
+   dax_dev = container_of(inode->i_cdev, struct dax_dev, cdev);
+   dev_dbg(&dax_dev->dev, "%s\n", __func__);
filp->private_data = dax_dev;
inode->i_flags = S_DAX;
 
@@ -399,11 +388,8 @@ static int dax_open(struct inode *inode, struct file *filp)
 static int dax_release(struct inode *inode, struct file *filp)
 {
struct dax_dev *dax_dev = filp->private_data;
-   struct device *dev = &dax_dev->dev;
-
-   dev_dbg(dev, "%s\n", __func__);
-   put_device(dev);
 
+   dev_dbg(&dax_dev->dev, "%s\n", __func__);
return 0;
 }
 
@@ -430,6 +416,7 @@ static void dax_dev_release(struct device *dev)
 static void unregister_dax_dev(void *dev)
 {
struct dax_dev *dax_dev = to_dax_dev(dev);
+   struct cdev *cdev = &dax_dev->cdev;
 
dev_dbg(dev, "%s\n", __func__);
 
@@ -442,6 +429,7 @@ static void unregister_dax_dev(void *dev)
 */
dax_dev->alive = false;
synchronize_rcu();
+   cdev_del(cdev);
device_unregister(dev);
 }
 
@@ -451,17 +439,13 @@ int devm_create_dax_dev(struct dax_region *dax_region, 
struct resource *res,
struct device *parent = dax_region->dev;
struct dax_dev *dax_dev;
struct device *dev;
+   struct cdev *cdev;
int rc, minor;
dev_t dev_t;
 
dax_dev = kzalloc(sizeof(*dax_dev) + sizeof(*res) * count, GFP_KERNEL);
if (!dax_dev)
return -ENOMEM;
-   memcpy(dax_dev->res, res, sizeof(*res) * count);
-   dax_dev->num_resources = count;
-   dax_dev->alive = true;
-   dax_dev->region = dax_region;
-   kref_get(&dax_region->kref);
 
dax_dev->id = ida_simple_get(&dax_region->ida, 0, 0, GFP_KERNEL);
if (dax_dev->id < 0) {
@@ -475,10 +459,26 @@ int devm_create_dax_dev(struct dax_region *dax_region, 
struct resource *res,
goto err_minor;
}
 
-   dev_t = MKDEV(dax_major, minor);
-
+   /* device_initialize() so cdev can reference kobj parent */
+   dev_t = MKDEV(MAJOR(dax_devt), minor);
dev = &dax_dev->dev;

[PATCH 4/7] dax: embed a struct device in dax_dev

2016-08-15 Thread Dan Williams
The kref in dax_dev can be made redundant if the final put_device() on
the device associated with the dax_dev frees the dax_dev.  This can be
accomplished by embedding a struct device in struct dax_dev, open coding
device_create() and specifying a custom release method.
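
A minimal user-space sketch of the lifetime rule described above, under
stated assumptions: struct device here is a toy with only a refcount and
a release hook, get_device()/put_device() are simplified stand-ins for
the kernel functions of the same name, and the "released" counter and
dax_dev_create() exist only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Toy stand-in for struct device: a refcount plus a release hook. */
struct device {
	int refcount;
	void (*release)(struct device *dev);
};

struct dax_dev {
	struct device dev;	/* embedded: device lifetime == dax_dev lifetime */
	int id;
};

static int released;		/* instrumentation for this example only */

/* Custom release method: runs on the final put and frees the container. */
static void dax_dev_release(struct device *dev)
{
	struct dax_dev *dax_dev = container_of(dev, struct dax_dev, dev);

	released++;
	free(dax_dev);
}

static void get_device(struct device *dev)
{
	dev->refcount++;
}

static void put_device(struct device *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* Hypothetical constructor: starts with one reference held by the caller. */
static struct dax_dev *dax_dev_create(int id)
{
	struct dax_dev *dax_dev = calloc(1, sizeof(*dax_dev));

	if (!dax_dev)
		return NULL;
	dax_dev->id = id;
	dax_dev->dev.refcount = 1;
	dax_dev->dev.release = dax_dev_release;
	return dax_dev;
}
```

With the device embedded, a separate kref on dax_dev has nothing left to
protect: whoever drops the last device reference frees the whole object.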

Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c |  130 ++---
 1 file changed, 45 insertions(+), 85 deletions(-)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 994dfa507dfb..181d2a5a21e4 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -49,7 +49,6 @@ struct dax_region {
  * struct dax_dev - subdivision of a dax region
  * @region - parent region
  * @dev - device backing the character device
- * @kref - enable this data to be tracked in filp->private_data
  * @alive - !alive + rcu grace period == no new mappings can be established
  * @id - child id in the region
  * @num_resources - number of physical address extents in this device
@@ -57,8 +56,7 @@ struct dax_region {
  */
 struct dax_dev {
struct dax_region *region;
-   struct device *dev;
-   struct kref kref;
+   struct device dev;
bool alive;
int id;
int num_resources;
@@ -79,20 +77,6 @@ void dax_region_put(struct dax_region *dax_region)
 }
 EXPORT_SYMBOL_GPL(dax_region_put);
 
-static void dax_dev_free(struct kref *kref)
-{
-   struct dax_dev *dax_dev;
-
-   dax_dev = container_of(kref, struct dax_dev, kref);
-   dax_region_put(dax_dev->region);
-   kfree(dax_dev);
-}
-
-static void dax_dev_put(struct dax_dev *dax_dev)
-{
-   kref_put(&dax_dev->kref, dax_dev_free);
-}
-
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
struct resource *res, unsigned int align, void *addr,
unsigned long pfn_flags)
@@ -117,10 +101,15 @@ struct dax_region *alloc_dax_region(struct device 
*parent, int region_id,
 }
 EXPORT_SYMBOL_GPL(alloc_dax_region);
 
+static struct dax_dev *to_dax_dev(struct device *dev)
+{
+   return container_of(dev, struct dax_dev, dev);
+}
+
 static ssize_t size_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
-   struct dax_dev *dax_dev = dev_get_drvdata(dev);
+   struct dax_dev *dax_dev = to_dax_dev(dev);
unsigned long long size = 0;
int i;
 
@@ -149,7 +138,7 @@ static int check_vma(struct dax_dev *dax_dev, struct 
vm_area_struct *vma,
const char *func)
 {
struct dax_region *dax_region = dax_dev->region;
-   struct device *dev = dax_dev->dev;
+   struct device *dev = &dax_dev->dev;
unsigned long mask;
 
if (!dax_dev->alive)
@@ -214,7 +203,7 @@ static int __dax_dev_fault(struct dax_dev *dax_dev, struct 
vm_area_struct *vma,
struct vm_fault *vmf)
 {
unsigned long vaddr = (unsigned long) vmf->virtual_address;
-   struct device *dev = dax_dev->dev;
+   struct device *dev = &dax_dev->dev;
struct dax_region *dax_region;
int rc = VM_FAULT_SIGBUS;
phys_addr_t phys;
@@ -254,7 +243,7 @@ static int dax_dev_fault(struct vm_area_struct *vma, struct 
vm_fault *vmf)
struct file *filp = vma->vm_file;
struct dax_dev *dax_dev = filp->private_data;
 
-   dev_dbg(dax_dev->dev, "%s: %s: %s (%#lx - %#lx)\n", __func__,
+   dev_dbg(&dax_dev->dev, "%s: %s: %s (%#lx - %#lx)\n", __func__,
current->comm, (vmf->flags & FAULT_FLAG_WRITE)
? "write" : "read", vma->vm_start, vma->vm_end);
rcu_read_lock();
@@ -269,7 +258,7 @@ static int __dax_dev_pmd_fault(struct dax_dev *dax_dev,
unsigned int flags)
 {
unsigned long pmd_addr = addr & PMD_MASK;
-   struct device *dev = dax_dev->dev;
+   struct device *dev = &dax_dev->dev;
struct dax_region *dax_region;
phys_addr_t phys;
pgoff_t pgoff;
@@ -311,7 +300,7 @@ static int dax_dev_pmd_fault(struct vm_area_struct *vma, 
unsigned long addr,
struct file *filp = vma->vm_file;
struct dax_dev *dax_dev = filp->private_data;
 
-   dev_dbg(dax_dev->dev, "%s: %s: %s (%#lx - %#lx)\n", __func__,
+   dev_dbg(&dax_dev->dev, "%s: %s: %s (%#lx - %#lx)\n", __func__,
current->comm, (flags & FAULT_FLAG_WRITE)
? "write" : "read", vma->vm_start, vma->vm_end);
 
@@ -322,29 +311,9 @@ static int dax_dev_pmd_fault(struct vm_area_struct *vma, 
unsigned long addr,
return rc;
 }
 
-static void dax_dev_vm_open(struct vm_area_struct *vma)
-{
-   struct file *filp = vma->vm_file;
-   struct dax_dev *dax_dev = filp->private_data;
-
-   dev_dbg(dax_dev->dev, "%s\n", __func__);
-   kref_get(&dax_dev->kref);
-}
-
-static void dax_dev_vm_close(struct vm_area_struct *vma)
-{
-   struct file *filp = vma->vm_file;
-   struct dax_dev *dax_dev = filp->private_data;
-
-   dev_dbg(dax_dev->dev, "%s\n", __func__);
-

[RFC PATCH] mmc: dw_mmc: avoid race condition of cpu and IDMAC

2016-08-15 Thread Shawn Lin
Testing shows an obvious race condition: the IDMAC's earlier
write that clears the OWN bit can land right after the CPU's
later reconfiguration of the same desc. The IDMAC then
enters the SUSPEND state, because the OWN bit was cleared
by its own asynchronous write. The bug is very easy to
reproduce on RK3288 or similar SoCs by lowering the bus
bandwidth and tightening the QoS so that a large number of
IPs fight for priority. One possible alternative would be
to allocate dual buffers for the desc, but the two could
still race with each other in theory.

Signed-off-by: Shawn Lin 

---

 drivers/mmc/host/dw_mmc.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
index 32380d5..7b01fab 100644
--- a/drivers/mmc/host/dw_mmc.c
+++ b/drivers/mmc/host/dw_mmc.c
@@ -490,6 +490,23 @@ static void dw_mci_translate_sglist(struct dw_mci *host, 
struct mmc_data *data,
length -= desc_len;
 
/*
+* OWN bit should be clear by IDMAC after
+* finishing transfer. Let's wait for the
+* asynchronous operation of IDMAC and cpu
+* to make sure that we do not rely on the
+* order of Qos of bus and architecture.
+* Otherwise we could see a race condition
+* here that the former write operation of
+* IDMAC(to clear the OWN bit) reach right
+* after the later new configuration of desc
+* which overwrites the value of desc,
+* leading to DMA_SUSPEND state as IDMAC fetches
+* the wrong desc then.
+*/
+   while ((readl(&desc->des0) & IDMAC_DES0_OWN))
+   ;
+
+   /*
 * Set the OWN bit and disable interrupts
 * for this descriptor
 */
@@ -535,6 +552,23 @@ static void dw_mci_translate_sglist(struct dw_mci *host, 
struct mmc_data *data,
length -= desc_len;
 
/*
+* OWN bit should be clear by IDMAC after
+* finishing transfer. Let's wait for the
+* asynchronous operation of IDMAC and cpu
+* to make sure that we do not rely on the
+* order of Qos of bus and architecture.
+* Otherwise we could see a race condition
+* here that the former write operation of
+* IDMAC(to clear the OWN bit) reach right
+* after the later new configuration of desc
+* which overwrites the value of desc,
+* leading to DMA_SUSPEND state as IDMAC fetches
+* the wrong desc then.
+*/
+   while ((readl(&desc->des0) & IDMAC_DES0_OWN))
+   ;
+
+   /*
 * Set the OWN bit and disable interrupts
 * for this descriptor
 */
-- 
2.3.7
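
The wait added by the patch above spins until the IDMAC clears the OWN
bit, with no upper bound. As a rough illustration of the same polling
idea with an iteration limit, here is a user-space sketch. It is not
part of the patch: wait_for_own_clear() and max_polls are hypothetical,
the descriptor word is modelled as a plain volatile variable instead of
being read through readl(), and the bit position used for
IDMAC_DES0_OWN is an assumption made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define IDMAC_DES0_OWN (1u << 31)	/* bit position assumed for this sketch */

/*
 * Poll the descriptor word until OWN is clear.
 * Returns 0 once the bit is seen clear, or -ETIMEDOUT after
 * max_polls reads that all found it still set.
 */
static int wait_for_own_clear(volatile uint32_t *des0, unsigned int max_polls)
{
	while (max_polls--) {
		if (!(*des0 & IDMAC_DES0_OWN))
			return 0;
	}
	return -ETIMEDOUT;
}
```

A caller would only reprogram the descriptor after a 0 return; what to
do on timeout is a policy question this sketch leaves open.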




[PATCH 7/7] dax: unmap/truncate on device shutdown

2016-08-15 Thread Dan Williams
Invalidate all mappings of a device-dax instance when the device is
unregistered.

Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index e8b9319aeadb..0a7899d5c65c 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -550,6 +550,7 @@ static void unregister_dax_dev(void *dev)
 */
dax_dev->alive = false;
synchronize_rcu();
+   unmap_mapping_range(dax_dev->inode->i_mapping, 0, 0, 1);
cdev_del(cdev);
device_unregister(dev);
 }



[PATCH 1/7] dax: cleanup needlessly global symbol warnings

2016-08-15 Thread Dan Williams
drivers/dax/dax.c:75:6: warning: symbol 'dax_region_put' was not declared.
drivers/dax/dax.c:95:19: warning: symbol 'alloc_dax_region' was not declared.
drivers/dax/dax.c:173:5: warning: symbol 'devm_create_dax_dev' was not declared.
drivers/dax/pmem.c:27:17: warning: symbol 'to_dax_pmem' was not declared.

Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c  |1 +
 drivers/dax/pmem.c |2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 803f3953b341..736c03830fd0 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include "dax.h"
 
 static int dax_major;
 static struct class *dax_class;
diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
index dfb168568af1..59b75c5972bb 100644
--- a/drivers/dax/pmem.c
+++ b/drivers/dax/pmem.c
@@ -24,7 +24,7 @@ struct dax_pmem {
struct completion cmp;
 };
 
-struct dax_pmem *to_dax_pmem(struct percpu_ref *ref)
+static struct dax_pmem *to_dax_pmem(struct percpu_ref *ref)
 {
return container_of(ref, struct dax_pmem, ref);
 }



[PATCH 3/7] dax: rename fops from dax_dev_ to dax_

2016-08-15 Thread Dan Williams
Shorten the prefix of the file operations to distinguish them from
operations on the struct device associated with the dax_dev.

Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c |   16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 3774fc9709bb..994dfa507dfb 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -347,7 +347,7 @@ static const struct vm_operations_struct dax_dev_vm_ops = {
.close = dax_dev_vm_close,
 };
 
-static int dax_dev_mmap(struct file *filp, struct vm_area_struct *vma)
+static int dax_mmap(struct file *filp, struct vm_area_struct *vma)
 {
struct dax_dev *dax_dev = filp->private_data;
int rc;
@@ -365,7 +365,7 @@ static int dax_dev_mmap(struct file *filp, struct 
vm_area_struct *vma)
 }
 
 /* return an unmapped area aligned to the dax region specified alignment */
-static unsigned long dax_dev_get_unmapped_area(struct file *filp,
+static unsigned long dax_get_unmapped_area(struct file *filp,
unsigned long addr, unsigned long len, unsigned long pgoff,
unsigned long flags)
 {
@@ -411,7 +411,7 @@ static struct device *dax_dev_find(dev_t dev_t)
return class_find_device(dax_class, NULL, &dev_t, __match_devt);
 }
 
-static int dax_dev_open(struct inode *inode, struct file *filp)
+static int dax_open(struct inode *inode, struct file *filp)
 {
struct dax_dev *dax_dev = NULL;
struct device *dev;
@@ -437,7 +437,7 @@ static int dax_dev_open(struct inode *inode, struct file 
*filp)
return 0;
 }
 
-static int dax_dev_release(struct inode *inode, struct file *filp)
+static int dax_release(struct inode *inode, struct file *filp)
 {
struct dax_dev *dax_dev = filp->private_data;
struct device *dev = dax_dev->dev;
@@ -452,10 +452,10 @@ static int dax_dev_release(struct inode *inode, struct 
file *filp)
 static const struct file_operations dax_fops = {
.llseek = noop_llseek,
.owner = THIS_MODULE,
-   .open = dax_dev_open,
-   .release = dax_dev_release,
-   .get_unmapped_area = dax_dev_get_unmapped_area,
-   .mmap = dax_dev_mmap,
+   .open = dax_open,
+   .release = dax_release,
+   .get_unmapped_area = dax_get_unmapped_area,
+   .mmap = dax_mmap,
 };
 
 static void unregister_dax_dev(void *_dev)



[PATCH 0/7] dax: unified host inode for device-dax mappings

2016-08-15 Thread Dan Williams
There are two scenarios where we need mappings of a /dev/dax device to
share a single host inode: invalidating mappings at device shutdown, and
coordinating resize of an actively mapped device.  This series addresses
the unmap-on-shutdown case and includes reworks, like the cdev api
conversion, to prepare for a dynamic resize / allocation capability.

Recall that device-DAX, introduced in v4.7 [1], is a mechanism to
provide deterministic mapping behavior for performance- /
feature-differentiated memory ranges.

[1]: 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ab68f2622136

---

Dan Williams (7):
  dax: cleanup needlessly global symbol warnings
  dax: reorder dax_fops function definitions
  dax: rename fops from dax_dev_ to dax_
  dax: embed a struct device in dax_dev
  dax: convert to the cdev api
  dax: define a unified inode/address_space for device-dax mappings
  dax: unmap/truncate on device shutdown


 drivers/dax/Kconfig|5 
 drivers/dax/dax.c  |  555 ++--
 drivers/dax/pmem.c |2 
 fs/char_dev.c  |1 
 include/uapi/linux/magic.h |1 
 5 files changed, 337 insertions(+), 227 deletions(-)


[PATCH 2/7] dax: reorder dax_fops function definitions

2016-08-15 Thread Dan Williams
In order to convert devm_create_dax_dev() to use cdev, it will need
access to dax_fops. Move dax_fops and related function definitions
before devm_create_dax_dev().

Signed-off-by: Dan Williams 
---
 drivers/dax/dax.c |  337 ++---
 1 file changed, 168 insertions(+), 169 deletions(-)

diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c
index 736c03830fd0..3774fc9709bb 100644
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -145,175 +145,6 @@ static const struct attribute_group 
*dax_attribute_groups[] = {
NULL,
 };
 
-static void unregister_dax_dev(void *_dev)
-{
-   struct device *dev = _dev;
-   struct dax_dev *dax_dev = dev_get_drvdata(dev);
-   struct dax_region *dax_region = dax_dev->region;
-
-   dev_dbg(dev, "%s\n", __func__);
-
-   /*
-* Note, rcu is not protecting the liveness of dax_dev, rcu is
-* ensuring that any fault handlers that might have seen
-* dax_dev->alive == true, have completed.  Any fault handlers
-* that start after synchronize_rcu() has started will abort
-* upon seeing dax_dev->alive == false.
-*/
-   dax_dev->alive = false;
-   synchronize_rcu();
-
-   get_device(dev);
-   device_unregister(dev);
-   ida_simple_remove(&dax_region->ida, dax_dev->id);
-   ida_simple_remove(&dax_minor_ida, MINOR(dev->devt));
-   put_device(dev);
-   dax_dev_put(dax_dev);
-}
-
-int devm_create_dax_dev(struct dax_region *dax_region, struct resource *res,
-   int count)
-{
-   struct device *parent = dax_region->dev;
-   struct dax_dev *dax_dev;
-   struct device *dev;
-   int rc, minor;
-   dev_t dev_t;
-
-   dax_dev = kzalloc(sizeof(*dax_dev) + sizeof(*res) * count, GFP_KERNEL);
-   if (!dax_dev)
-   return -ENOMEM;
-   memcpy(dax_dev->res, res, sizeof(*res) * count);
-   dax_dev->num_resources = count;
-   kref_init(&dax_dev->kref);
-   dax_dev->alive = true;
-   dax_dev->region = dax_region;
-   kref_get(&dax_region->kref);
-
-   dax_dev->id = ida_simple_get(&dax_region->ida, 0, 0, GFP_KERNEL);
-   if (dax_dev->id < 0) {
-   rc = dax_dev->id;
-   goto err_id;
-   }
-
-   minor = ida_simple_get(&dax_minor_ida, 0, 0, GFP_KERNEL);
-   if (minor < 0) {
-   rc = minor;
-   goto err_minor;
-   }
-
-   dev_t = MKDEV(dax_major, minor);
-   dev = device_create_with_groups(dax_class, parent, dev_t, dax_dev,
-   dax_attribute_groups, "dax%d.%d", dax_region->id,
-   dax_dev->id);
-   if (IS_ERR(dev)) {
-   rc = PTR_ERR(dev);
-   goto err_create;
-   }
-   dax_dev->dev = dev;
-
-   rc = devm_add_action_or_reset(dax_region->dev, unregister_dax_dev, dev);
-   if (rc)
-   return rc;
-
-   return 0;
-
- err_create:
-   ida_simple_remove(&dax_minor_ida, minor);
- err_minor:
-   ida_simple_remove(&dax_region->ida, dax_dev->id);
- err_id:
-   dax_dev_put(dax_dev);
-
-   return rc;
-}
-EXPORT_SYMBOL_GPL(devm_create_dax_dev);
-
-/* return an unmapped area aligned to the dax region specified alignment */
-static unsigned long dax_dev_get_unmapped_area(struct file *filp,
-   unsigned long addr, unsigned long len, unsigned long pgoff,
-   unsigned long flags)
-{
-   unsigned long off, off_end, off_align, len_align, addr_align, align;
-   struct dax_dev *dax_dev = filp ? filp->private_data : NULL;
-   struct dax_region *dax_region;
-
-   if (!dax_dev || addr)
-   goto out;
-
-   dax_region = dax_dev->region;
-   align = dax_region->align;
-   off = pgoff << PAGE_SHIFT;
-   off_end = off + len;
-   off_align = round_up(off, align);
-
-   if ((off_end <= off_align) || ((off_end - off_align) < align))
-   goto out;
-
-   len_align = len + align;
-   if ((off + len_align) < off)
-   goto out;
-
-   addr_align = current->mm->get_unmapped_area(filp, addr, len_align,
-   pgoff, flags);
-   if (!IS_ERR_VALUE(addr_align)) {
-   addr_align += (off - addr_align) & (align - 1);
-   return addr_align;
-   }
- out:
-   return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
-}
-
-static int __match_devt(struct device *dev, const void *data)
-{
-   const dev_t *devt = data;
-
-   return dev->devt == *devt;
-}
-
-static struct device *dax_dev_find(dev_t dev_t)
-{
-   return class_find_device(dax_class, NULL, &dev_t, __match_devt);
-}
-
-static int dax_dev_open(struct inode *inode, struct file *filp)
-{
-   struct dax_dev *dax_dev = NULL;
-   struct device *dev;
-
-   dev = dax_dev_find(inode->i_rdev);
-   if (!dev)
-   return -ENXIO;
-
-   device_lock(dev);
-   dax_dev = dev_get_drvdata(dev);
-   
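
The alignment step in dax_dev_get_unmapped_area(), which the patch above
moves, is compact enough to be worth a worked example. The line
"addr_align += (off - addr_align) & (align - 1)" shifts the candidate
address up by less than one alignment unit so that the address and the
file offset become congruent modulo align. The user-space sketch below
isolates that arithmetic; align_to_offset() is a hypothetical helper
name, and align is assumed to be a power of two.

```c
#include <assert.h>

/*
 * Shift addr up by less than align so that (result - off) is a
 * multiple of align. align must be a power of two.
 */
static unsigned long align_to_offset(unsigned long addr, unsigned long off,
				     unsigned long align)
{
	return addr + ((off - addr) & (align - 1));
}
```

For example, with a 2MiB alignment (0x200000), off = 0 and a candidate
address of 0x40123000, the adjustment is 0xdd000 and the result is
0x40200000. This is why the function asks for len + align bytes first:
the shifted mapping still fits inside the oversized area.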

[PATCH V3 0/2] rtc-cmos: Workaround unwanted interrupt generation

2016-08-15 Thread Pratyush Anand
We have observed on a few machines with rtc-cmos devices that the device
generates an interrupt before the hpet_rtc_timer_init() call is finished.
This leads to hpet_rtc_interrupt() being called before it is fully initialized.

Therefore the while-loop of hpet_cnt_ahead() in hpet_rtc_timer_reinit()
never completes. This leads to "NMI watchdog: Watchdog detected hard LOCKUP
on cpu 0".

This patch set initializes hpet_default_delta and hpet_t1_cmp before
an interrupt can be raised.
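
To make the failure mode concrete, here is a small user-space model of a
"step the comparator until it is ahead of the counter" loop. It is a
sketch under assumptions, not the kernel code: it assumes the
uninitialized delta is zero, it freezes the counter (the real HPET
counter keeps running), and it adds a max_steps limit that the real loop
does not have, purely so the example can terminate. The function names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* True when comparator value c1 is still ahead of counter value c2 (wrap-safe). */
static int cnt_ahead(uint32_t c1, uint32_t c2)
{
	return (int32_t)(c2 - c1) < 0;
}

/*
 * Advance the comparator by delta until it is ahead of the counter.
 * Returns the number of steps taken, or -1 once max_steps is exhausted.
 */
static int advance_comparator(uint32_t cmp, uint32_t counter,
			      uint32_t delta, int max_steps)
{
	int steps = 0;

	do {
		if (steps == max_steps)
			return -1;
		cmp += delta;
		steps++;
	} while (!cnt_ahead(cmp, counter));

	return steps;
}
```

With a nonzero delta the comparator overtakes the counter after a few
steps. With delta == 0 it never moves, so without the artificial limit
the loop would spin forever, which matches the hard LOCKUP report.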

Changes since V2:
  - Improved commit log further
Changes since RFC:
  - Commit log of patches has been improved.

Pratyush Anand (2):
  rtc/hpet: Factorize hpet_rtc_timer_init()
  rtc/rtc-cmos: Initialize software counters before irq is registered

 arch/x86/include/asm/hpet.h |  2 ++
 arch/x86/kernel/hpet.c  | 41 +++--
 drivers/rtc/rtc-cmos.c  | 13 -
 3 files changed, 49 insertions(+), 7 deletions(-)

-- 
2.5.5



Re: [PATCH v2 3/3] usb: gadget: add f_uac1 variant based on new u_audio api

2016-08-15 Thread Peter Chen
On Sun, Aug 14, 2016 at 01:21:24AM +0300, Ruslan Bilovol wrote:
> This patch adds new function f_uac1_newapi that
> uses recently created u_audio api. This makes
> f_uac1_newapi implementation much simpler by
> reusing existing u_audio core utilities.
> 
> This also drops previous f_uac1 approach (write
> audio samples directly to existing ALSA sound
> card) and moves to more generic/flexible
> one - create an f_uac1 ALSA sound card that
> represents USB Audio function and allows to
> be used by userspace tools.
> 
> f_uac1_newapi also has capture support (gadget->host).
> By default, the capture interface has a 48000Hz/2ch
> configuration, the same as the playback channel.
> 
> f_uac1_newapi descriptor naming follows the
> f_uac2 driver naming convention, which
> makes it more common and meaningful.
> 
> Comparing to f_uac1, the f_uac1_newapi doesn't
> have volume/mute functionality. This is because
> the volume/mute feature unit was dummy
> implementation since that driver creation (2009)
> and never had real volume control or mute
> functionality.
> 
> g_audio can be built using one of existing
> uac functions (f_uac1, f_uac1_newapi or f_uac2)
> 
> Signed-off-by: Ruslan Bilovol 
> ---
>  .../ABI/testing/configfs-usb-gadget-uac1_newapi|  12 +
>  Documentation/usb/gadget-testing.txt   |  41 ++
>  drivers/usb/gadget/Kconfig |  21 +
>  drivers/usb/gadget/function/Makefile   |   2 +
>  drivers/usb/gadget/function/f_uac1_newapi.c| 795 
> +
>  drivers/usb/gadget/function/u_uac1_newapi.h|  39 +
>  drivers/usb/gadget/legacy/Kconfig  |  15 +-
>  drivers/usb/gadget/legacy/audio.c  |  52 ++
>  8 files changed, 975 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/ABI/testing/configfs-usb-gadget-uac1_newapi
>  create mode 100644 drivers/usb/gadget/function/f_uac1_newapi.c
>  create mode 100644 drivers/usb/gadget/function/u_uac1_newapi.h
> 
> diff --git a/Documentation/ABI/testing/configfs-usb-gadget-uac1_newapi 
> b/Documentation/ABI/testing/configfs-usb-gadget-uac1_newapi
> new file mode 100644
> index 000..d355275
> --- /dev/null
> +++ b/Documentation/ABI/testing/configfs-usb-gadget-uac1_newapi
> @@ -0,0 +1,12 @@
> +What:/config/usb-gadget/gadget/functions/uac1_newapi.name
> +Date:Aug 2016
> +KernelVersion:   4.9
> +Description:
> + The attributes:
> +
> + c_chmask - capture channel mask
> + c_srate - capture sampling rate
> + c_ssize - capture sample size (bytes)
> + p_chmask - playback channel mask
> + p_srate - playback sampling rate
> + p_ssize - playback sample size (bytes)
> diff --git a/Documentation/usb/gadget-testing.txt 
> b/Documentation/usb/gadget-testing.txt
> index 5819605..4598d7f 100644
> --- a/Documentation/usb/gadget-testing.txt
> +++ b/Documentation/usb/gadget-testing.txt
> @@ -20,6 +20,7 @@ provided by gadgets.
>  17. UAC2 function
>  18. UVC function
>  19. PRINTER function
> +20. UAC1 function (new API)
>  
>  
>  1. ACM function
> @@ -770,3 +771,43 @@ host:
>  
>  More advanced testing can be done with the prn_example
>  described in Documentation/usb/gadget-printer.txt.
> +
> +
> +20. UAC1 function (new API, using u_audio)
> +=
> +
> +The function is provided by usb_f_uac1_newapi.ko module.
> +
> +Function-specific configfs interface
> +
> +
> +The function name to use when creating the function directory
> +is "uac1_newapi". The uac1_newapi function provides these attributes
> +in its function directory:
> +
> + c_chmask - capture channel mask
> + c_srate - capture sampling rate
> + c_ssize - capture sample size (bytes)
> + p_chmask - playback channel mask
> + p_srate - playback sampling rate
> + p_ssize - playback sample size (bytes)
> +
> +The attributes have sane default values.
> +
> +Testing the UAC1 function
> +-
> +
> +device: run the gadget
> +host: aplay -l # should list our USB Audio Gadget
> +
> +This function does not require real hardware support, it just
> +sends a stream of audio data to/from the host. In order to
> +actually hear something at the device side, a command similar
> +to this must be used at the device side:
> +
> +$ arecord -f dat -t wav -D hw:2,0 | aplay -D hw:0,0 &
> +
> +e.g.:
> +
> +$ arecord -f dat -t wav -D hw:CARD=UAC1Gadget,DEV=0 | \
> +aplay -D default:CARD=OdroidU3
> diff --git a/drivers/usb/gadget/Kconfig b/drivers/usb/gadget/Kconfig
> index a25afd8..abcb539 100644
> --- a/drivers/usb/gadget/Kconfig
> +++ b/drivers/usb/gadget/Kconfig
> @@ -194,6 +194,9 @@ config USB_F_FS
>  config USB_F_UAC1
>   tristate
>  
> +config USB_F_UAC1_NEWAPI
> + tristate
> +
>  config USB_F_UAC2
>   tristate
>  
> @@ -397,6 +400,24 @@ config USB_CONFIGFS_F_UAC1
> This driver 

[PATCH] Input: tegra-kbc: fix inverted reset logic

2016-08-15 Thread Masahiro Yamada
Commit fe6b0dfaba68 ("Input: tegra-kbc - use reset framework")
accidentally converted _deassert to _assert, so there is no code
to wake up this hardware.

Fixes: fe6b0dfaba68 ("Input: tegra-kbc - use reset framework")
Signed-off-by: Masahiro Yamada 
---

 drivers/input/keyboard/tegra-kbc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/keyboard/tegra-kbc.c 
b/drivers/input/keyboard/tegra-kbc.c
index 7d61439..0c07e10 100644
--- a/drivers/input/keyboard/tegra-kbc.c
+++ b/drivers/input/keyboard/tegra-kbc.c
@@ -376,7 +376,7 @@ static int tegra_kbc_start(struct tegra_kbc *kbc)
/* Reset the KBC controller to clear all previous status.*/
reset_control_assert(kbc->rst);
udelay(100);
-   reset_control_assert(kbc->rst);
+   reset_control_deassert(kbc->rst);
udelay(100);
 
tegra_kbc_config_pins(kbc);
-- 
1.9.1
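
The effect of the one-word fix above can be shown with a toy model of a
reset line. This is only a sketch: struct reset_line and the helpers
below are hypothetical stand-ins and do not use the kernel's reset
controller API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a reset line; not the kernel's struct reset_control. */
struct reset_line {
	bool asserted;
};

static void reset_assert(struct reset_line *rst)
{
	rst->asserted = true;
}

static void reset_deassert(struct reset_line *rst)
{
	rst->asserted = false;
}

/* Intended sequence: hold the block in reset, then release it. */
static bool pulse_reset(struct reset_line *rst)
{
	reset_assert(rst);
	/* the driver delays 100us here */
	reset_deassert(rst);
	return !rst->asserted;	/* true: block is out of reset */
}

/* Sequence before the fix: two asserts leave the block held in reset. */
static bool buggy_reset(struct reset_line *rst)
{
	reset_assert(rst);
	reset_assert(rst);
	return !rst->asserted;
}
```

Only pulse_reset() leaves the block usable; buggy_reset() returns with
the line still asserted, so nothing ever wakes the hardware.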



Re: [PATCH v2] mm/slab: Improve performance of gathering slabinfo stats

2016-08-15 Thread Joonsoo Kim
On Fri, Aug 05, 2016 at 09:21:56AM -0500, Christoph Lameter wrote:
> On Fri, 5 Aug 2016, Joonsoo Kim wrote:
> 
> > If my comments above are addressed, all counting would be done while
> > holding a lock. So, an atomic definition isn't needed for SLAB.
> 
> Ditto for slub. struct kmem_cache_node is already defined in mm/slab.h.
> Thus it is a common definition already and can be used by both.
> 
> Making nr_slabs and total_objects unsigned long would be great.

In SLUB, nr_slabs is manipulated without holding a lock, so atomic
operations should be used.

Anyway, Aruna, could you address my comments?

Thanks.


Re: [PATCH 1/2] KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC

2016-08-15 Thread Wanpeng Li
2016-08-09 2:16 GMT+08:00 Radim Krčmář :
> msr bitmap can be used to avoid a VM exit (interception) on guest MSR
> accesses.  In some configurations of VMX controls, the guest can even
> directly access host's x2APIC MSRs.  See SDM 29.5 VIRTUALIZING MSR-BASED
> APIC ACCESSES.
>
> L2 could read all L0's x2APIC MSRs and write TPR, EOI, and SELF_IPI.
> To do so, L1 would first trick KVM to disable all possible interceptions
> by enabling APICv features and then would turn those features off;
> nested_vmx_merge_msr_bitmap() only disabled interceptions, so VMX would
> not intercept previously enabled MSRs even though they were not safe
> with the new configuration.
>
> Correctly re-enabling interceptions is not enough as a second bug would
> still allow L1+L2 to access host's MSRs: msr bitmap was shared for all
> VMCSs, so L1 could trigger a race to get the desired combination of msr
> bitmap and VMX controls.
>
> This fix allocates a msr bitmap for every L1 VCPU, allows only safe
> x2APIC MSRs from L1's msr bitmap, and disables msr bitmaps if they would
> have to intercept everything anyway.
>
> Fixes: 3af18d9c5fe9 ("KVM: nVMX: Prepare for using hardware MSR bitmap")
> Reported-by: Jim Mattson 
> Suggested-by: Wincy Van 
> Signed-off-by: Radim Krčmář 

Reviewed-by: Wanpeng Li 
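
For readers following along, the merge rule described in the commit
message can be sketched in user space roughly as below. This is an
illustration under stated assumptions, not the code from the patch: the
demo_* names are made up, only the write bitmap for low MSRs is
modelled (it sits at byte offset 2048 of the 4 KiB bitmap page), and
the "safe" list is passed in by the caller. The important property is
that the merged bitmap is rebuilt from "intercept everything" on every
merge, so an interception that was dropped earlier cannot linger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BITMAP_BYTES 4096
#define DEMO_WRITE_LOW    2048 /* write bitmap for MSRs 0x0..0x1fff */

/* A set bit means "intercept", i.e. the access causes a VM exit. */
static bool demo_write_intercepted(const uint8_t *bm, uint32_t msr)
{
	assert(msr <= 0x1fff);
	return (bm[DEMO_WRITE_LOW + msr / 8] >> (msr % 8)) & 1;
}

static void demo_allow_write(uint8_t *bm, uint32_t msr)
{
	assert(msr <= 0x1fff);
	bm[DEMO_WRITE_LOW + msr / 8] &= (uint8_t)~(1u << (msr % 8));
}

/*
 * Start from "intercept everything", then pass through only MSRs that
 * are on the safe list AND that L1 itself does not want intercepted.
 */
static void demo_merge(uint8_t *merged, const uint8_t *l1,
		       const uint32_t *safe, size_t nsafe)
{
	memset(merged, 0xff, DEMO_BITMAP_BYTES);
	for (size_t i = 0; i < nsafe; i++)
		if (!demo_write_intercepted(l1, safe[i]))
			demo_allow_write(merged, safe[i]);
}
```

With a safe list of 0x808, 0x80b and 0x83f (the x2APIC TPR, EOI and
SELF_IPI MSRs), a write to any other x2APIC MSR stays intercepted no
matter what L1's bitmap says. Giving each vCPU its own merged buffer,
as the patch does, is what closes the race on the shared bitmap.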

> ---
>  arch/x86/kvm/vmx.c | 107 ++---
>  1 file changed, 44 insertions(+), 63 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index a45d8580f91e..c66ac2c70d22 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -435,6 +435,8 @@ struct nested_vmx {
> bool pi_pending;
> u16 posted_intr_nv;
>
> +   unsigned long *msr_bitmap;
> +
> struct hrtimer preemption_timer;
> bool preemption_timer_expired;
>
> @@ -924,7 +926,6 @@ static unsigned long *vmx_msr_bitmap_legacy;
>  static unsigned long *vmx_msr_bitmap_longmode;
>  static unsigned long *vmx_msr_bitmap_legacy_x2apic;
>  static unsigned long *vmx_msr_bitmap_longmode_x2apic;
> -static unsigned long *vmx_msr_bitmap_nested;
>  static unsigned long *vmx_vmread_bitmap;
>  static unsigned long *vmx_vmwrite_bitmap;
>
> @@ -2508,7 +2509,7 @@ static void vmx_set_msr_bitmap(struct kvm_vcpu *vcpu)
> unsigned long *msr_bitmap;
>
> if (is_guest_mode(vcpu))
> -   msr_bitmap = vmx_msr_bitmap_nested;
> +   msr_bitmap = to_vmx(vcpu)->nested.msr_bitmap;
> else if (cpu_has_secondary_exec_ctrls() &&
>  (vmcs_read32(SECONDARY_VM_EXEC_CONTROL) &
>   SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)) {
> @@ -6363,13 +6364,6 @@ static __init int hardware_setup(void)
> if (!vmx_msr_bitmap_longmode_x2apic)
> goto out4;
>
> -   if (nested) {
> -   vmx_msr_bitmap_nested =
> -   (unsigned long *)__get_free_page(GFP_KERNEL);
> -   if (!vmx_msr_bitmap_nested)
> -   goto out5;
> -   }
> -
> vmx_vmread_bitmap = (unsigned long *)__get_free_page(GFP_KERNEL);
> if (!vmx_vmread_bitmap)
> goto out6;
> @@ -6392,8 +6386,6 @@ static __init int hardware_setup(void)
>
> memset(vmx_msr_bitmap_legacy, 0xff, PAGE_SIZE);
> memset(vmx_msr_bitmap_longmode, 0xff, PAGE_SIZE);
> -   if (nested)
> -   memset(vmx_msr_bitmap_nested, 0xff, PAGE_SIZE);
>
> if (setup_vmcs_config(&vmcs_config) < 0) {
> r = -EIO;
> @@ -6529,9 +6521,6 @@ out8:
>  out7:
> free_page((unsigned long)vmx_vmread_bitmap);
>  out6:
> -   if (nested)
> -   free_page((unsigned long)vmx_msr_bitmap_nested);
> -out5:
> free_page((unsigned long)vmx_msr_bitmap_longmode_x2apic);
>  out4:
> free_page((unsigned long)vmx_msr_bitmap_longmode);
> @@ -6557,8 +6546,6 @@ static __exit void hardware_unsetup(void)
> free_page((unsigned long)vmx_io_bitmap_a);
> free_page((unsigned long)vmx_vmwrite_bitmap);
> free_page((unsigned long)vmx_vmread_bitmap);
> -   if (nested)
> -   free_page((unsigned long)vmx_msr_bitmap_nested);
>
> free_kvm_area();
>  }
> @@ -6995,16 +6982,21 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
> return 1;
> }
>
> +   if (cpu_has_vmx_msr_bitmap()) {
> +   vmx->nested.msr_bitmap =
> +   (unsigned long *)__get_free_page(GFP_KERNEL);
> +   if (!vmx->nested.msr_bitmap)
> +   goto out_msr_bitmap;
> +   }
> +
> vmx->nested.cached_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL);
> if (!vmx->nested.cached_vmcs12)
> -   return -ENOMEM;
> +   goto out_cached_vmcs12;
>
> if (enable_shadow_vmcs) {
> shadow_vmcs = alloc_vmcs();
> -   if 

[PATCH v2 4/6] mm/page_ext: rename offset to index

2016-08-15 Thread js1304
From: Joonsoo Kim 

Here, 'offset' means the entry index in the page_ext array. A following
patch will use 'offset' for the field offset within each entry, so rename
the current 'offset' to 'index' to prevent confusion.

Signed-off-by: Joonsoo Kim 
---
 mm/page_ext.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/page_ext.c b/mm/page_ext.c
index 44a4c02..1629282 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -102,7 +102,7 @@ void __meminit pgdat_page_ext_init(struct pglist_data 
*pgdat)
 struct page_ext *lookup_page_ext(struct page *page)
 {
unsigned long pfn = page_to_pfn(page);
-   unsigned long offset;
+   unsigned long index;
struct page_ext *base;
 
base = NODE_DATA(page_to_nid(page))->node_page_ext;
@@ -119,9 +119,9 @@ struct page_ext *lookup_page_ext(struct page *page)
if (unlikely(!base))
return NULL;
 #endif
-   offset = pfn - round_down(node_start_pfn(page_to_nid(page)),
+   index = pfn - round_down(node_start_pfn(page_to_nid(page)),
MAX_ORDER_NR_PAGES);
-   return base + offset;
+   return base + index;
 }
 
 static int __init alloc_node_page_ext(int nid)
-- 
1.9.1



Re: [PATCH 1/5] mm/debug_pagealloc: clean-up guard page handling code

2016-08-15 Thread Joonsoo Kim
On Fri, Aug 12, 2016 at 09:25:37PM +0900, Sergey Senozhatsky wrote:
> On (08/11/16 11:41), Vlastimil Babka wrote:
> > On 08/10/2016 10:14 AM, Sergey Senozhatsky wrote:
> > > > @@ -1650,18 +1655,15 @@ static inline void expand(struct zone *zone, 
> > > > struct page *page,
> > > > size >>= 1;
> > > > VM_BUG_ON_PAGE(bad_range(zone, &page[size]), 
> > > > &page[size]);
> > > > 
> > > > -   if (IS_ENABLED(CONFIG_DEBUG_PAGEALLOC) &&
> > > > -   debug_guardpage_enabled() &&
> > > > -   high < debug_guardpage_minorder()) {
> > > > -   /*
> > > > -* Mark as guard pages (or page), that will 
> > > > allow to
> > > > -* merge back to allocator when buddy will be 
> > > > freed.
> > > > -* Corresponding page table entries will not be 
> > > > touched,
> > > > -* pages will stay not present in virtual 
> > > > address space
> > > > -*/
> > > > -   set_page_guard(zone, &page[size], high, 
> > > > migratetype);
> > > > +   /*
> > > > +* Mark as guard pages (or page), that will allow to
> > > > +* merge back to allocator when buddy will be freed.
> > > > +* Corresponding page table entries will not be touched,
> > > > +* pages will stay not present in virtual address space
> > > > +*/
> > > > +   if (set_page_guard(zone, &page[size], high, 
> > > > migratetype))
> > > > continue;
> > > > -   }
> > > 
> > > so previously IS_ENABLED(CONFIG_DEBUG_PAGEALLOC) could have optimized out
> > > the entire branch -- no set_page_guard() invocation and checks, right? but
> > > now we would call set_page_guard() every time?
> > 
> > No, there's a !CONFIG_DEBUG_PAGEALLOC version of set_page_guard() that
> > returns false (static inline), so this whole if will be eliminated by the
> > compiler, same as before.
> 
> ah, indeed. didn't notice it.

Hello, Sergey and Vlastimil.

I have addressed all of your comments and sent v2.

Thanks.
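
For the archive, the point Vlastimil makes above can be sketched in
plain C roughly like this. It is a user-space illustration only:
DEMO_DEBUG_PAGEALLOC stands in for CONFIG_DEBUG_PAGEALLOC and all
demo_* names are made up. When the switch is off, the stub is a static
inline that always returns false, so the compiler can drop the whole
"if (...) continue;" from the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical switch standing in for CONFIG_DEBUG_PAGEALLOC. */
#ifndef DEMO_DEBUG_PAGEALLOC
#define DEMO_DEBUG_PAGEALLOC 0
#endif

#if DEMO_DEBUG_PAGEALLOC
static unsigned int demo_minorder = 2;

/* Real version: the decision lives inside the helper itself. */
static inline bool demo_set_page_guard(unsigned int order)
{
	if (order >= demo_minorder)
		return false;
	return true;
}
#else
/* Stub version: constant false, so the caller's branch is dead code. */
static inline bool demo_set_page_guard(unsigned int order)
{
	(void)order;
	return false;
}
#endif

/* Counts how many orders in [low, high) would go onto the free list. */
static unsigned int demo_expand(unsigned int low, unsigned int high)
{
	unsigned int added = 0;

	while (high > low) {
		high--;
		if (demo_set_page_guard(high))
			continue; /* guarded, not added to the free list */
		added++;
	}
	return added;
}
```

With the default DEMO_DEBUG_PAGEALLOC of 0, demo_expand(0, 4) counts
all four orders, which matches the behaviour before the clean-up.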


[PATCH v2 1/6] mm/debug_pagealloc: clean-up guard page handling code

2016-08-15 Thread js1304
From: Joonsoo Kim 

We can make the code cleaner by moving the condition that decides
whether to call set_page_guard() into set_page_guard() itself. This
improves readability. There is no functional change.

Acked-by: Vlastimil Babka 
Signed-off-by: Joonsoo Kim 
---
 mm/page_alloc.c | 34 ++
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 277c3d0..5e7944b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -638,17 +638,20 @@ static int __init debug_guardpage_minorder_setup(char 
*buf)
 }
 __setup("debug_guardpage_minorder=", debug_guardpage_minorder_setup);
 
-static inline void set_page_guard(struct zone *zone, struct page *page,
+static inline bool set_page_guard(struct zone *zone, struct page *page,
unsigned int order, int migratetype)
 {
struct page_ext *page_ext;
 
if (!debug_guardpage_enabled())
-   return;
+   return false;
+
+   if (order >= debug_guardpage_minorder())
+   return false;
 
page_ext = lookup_page_ext(page);
if (unlikely(!page_ext))
-   return;
+   return false;
 
__set_bit(PAGE_EXT_DEBUG_GUARD, &page_ext->flags);
 
@@ -656,6 +659,8 @@ static inline void set_page_guard(struct zone *zone, struct 
page *page,
set_page_private(page, order);
/* Guard pages are not available for any usage */
__mod_zone_freepage_state(zone, -(1 << order), migratetype);
+
+   return true;
 }
 
 static inline void clear_page_guard(struct zone *zone, struct page *page,
@@ -678,8 +683,8 @@ static inline void clear_page_guard(struct zone *zone, 
struct page *page,
 }
 #else
 struct page_ext_operations debug_guardpage_ops = { NULL, };
-static inline void set_page_guard(struct zone *zone, struct page *page,
-   unsigned int order, int migratetype) {}
+static inline bool set_page_guard(struct zone *zone, struct page *page,
+   unsigned int order, int migratetype) { return false; }
 static inline void clear_page_guard(struct zone *zone, struct page *page,
unsigned int order, int migratetype) {}
 #endif
@@ -1650,18 +1655,15 @@ static inline void expand(struct zone *zone, struct 
page *page,
size >>= 1;
VM_BUG_ON_PAGE(bad_range(zone, &page[size]), &page[size]);
 
-   if (IS_ENABLED(CONFIG_DEBUG_PAGEALLOC) &&
-   debug_guardpage_enabled() &&
-   high < debug_guardpage_minorder()) {
-   /*
-* Mark as guard pages (or page), that will allow to
-* merge back to allocator when buddy will be freed.
-* Corresponding page table entries will not be touched,
-* pages will stay not present in virtual address space
-*/
-   set_page_guard(zone, &page[size], high, migratetype);
+   /*
+* Mark as guard pages (or page), that will allow to
+* merge back to allocator when buddy will be freed.
+* Corresponding page table entries will not be touched,
+* pages will stay not present in virtual address space
+*/
+   if (set_page_guard(zone, &page[size], high, migratetype))
continue;
-   }
+
list_add(&page[size].lru, &area->free_list[migratetype]);
area->nr_free++;
set_page_order(&page[size], high);
-- 
1.9.1



[PATCH v2 3/6] mm/page_owner: move page_owner specific function to page_owner.c

2016-08-15 Thread js1304
From: Joonsoo Kim 

There is no reason for a page_owner-specific function to reside in vmstat.c.

Reviewed-by: Sergey Senozhatsky 
Signed-off-by: Joonsoo Kim 
---
 include/linux/page_owner.h |  2 ++
 mm/page_owner.c| 77 
 mm/vmstat.c| 79 --
 3 files changed, 79 insertions(+), 79 deletions(-)

diff --git a/include/linux/page_owner.h b/include/linux/page_owner.h
index 30583ab..2be728d 100644
--- a/include/linux/page_owner.h
+++ b/include/linux/page_owner.h
@@ -14,6 +14,8 @@ extern void __split_page_owner(struct page *page, unsigned 
int order);
 extern void __copy_page_owner(struct page *oldpage, struct page *newpage);
 extern void __set_page_owner_migrate_reason(struct page *page, int reason);
 extern void __dump_page_owner(struct page *page);
+extern void pagetypeinfo_showmixedcount_print(struct seq_file *m,
+   pg_data_t *pgdat, struct zone *zone);
 
 static inline void reset_page_owner(struct page *page, unsigned int order)
 {
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 3b241f5..2cae0b2 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "internal.h"
 
@@ -214,6 +215,82 @@ void __copy_page_owner(struct page *oldpage, struct page 
*newpage)
__set_bit(PAGE_EXT_OWNER, &new_ext->flags);
 }
 
+void pagetypeinfo_showmixedcount_print(struct seq_file *m, pg_data_t *pgdat,
+   struct zone *zone)
+{
+   struct page *page;
+   struct page_ext *page_ext;
+   unsigned long pfn = zone->zone_start_pfn, block_end_pfn;
+   unsigned long end_pfn = pfn + zone->spanned_pages;
+   unsigned long count[MIGRATE_TYPES] = { 0, };
+   int pageblock_mt, page_mt;
+   int i;
+
+   /* Scan block by block. First and last block may be incomplete */
+   pfn = zone->zone_start_pfn;
+
+   /*
+* Walk the zone in pageblock_nr_pages steps. If a page block spans
+* a zone boundary, it will be double counted between zones. This does
+* not matter as the mixed block count will still be correct
+*/
+   for (; pfn < end_pfn; ) {
+   if (!pfn_valid(pfn)) {
+   pfn = ALIGN(pfn + 1, pageblock_nr_pages);
+   continue;
+   }
+
+   block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
+   block_end_pfn = min(block_end_pfn, end_pfn);
+
+   page = pfn_to_page(pfn);
+   pageblock_mt = get_pageblock_migratetype(page);
+
+   for (; pfn < block_end_pfn; pfn++) {
+   if (!pfn_valid_within(pfn))
+   continue;
+
+   page = pfn_to_page(pfn);
+
+   if (page_zone(page) != zone)
+   continue;
+
+   if (PageBuddy(page)) {
+   pfn += (1UL << page_order(page)) - 1;
+   continue;
+   }
+
+   if (PageReserved(page))
+   continue;
+
+   page_ext = lookup_page_ext(page);
+   if (unlikely(!page_ext))
+   continue;
+
+   if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags))
+   continue;
+
+   page_mt = gfpflags_to_migratetype(page_ext->gfp_mask);
+   if (pageblock_mt != page_mt) {
+   if (is_migrate_cma(pageblock_mt))
+   count[MIGRATE_MOVABLE]++;
+   else
+   count[pageblock_mt]++;
+
+   pfn = block_end_pfn;
+   break;
+   }
+   pfn += (1UL << page_ext->order) - 1;
+   }
+   }
+
+   /* Print counts */
+   seq_printf(m, "Node %d, zone %8s ", pgdat->node_id, zone->name);
+   for (i = 0; i < MIGRATE_TYPES; i++)
+   seq_printf(m, "%12lu ", count[i]);
+   seq_putc(m, '\n');
+}
+
 static ssize_t
 print_page_owner(char __user *buf, size_t count, unsigned long pfn,
struct page *page, struct page_ext *page_ext,
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 84397e8..dc04e76 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1254,85 +1254,6 @@ static int pagetypeinfo_showblockcount(struct seq_file 
*m, void *arg)
return 0;
 }
 
-#ifdef CONFIG_PAGE_OWNER
-static void pagetypeinfo_showmixedcount_print(struct seq_file *m,
-   pg_data_t *pgdat,
-   struct zone *zone)
-{

[PATCH v2 0/6] Reduce memory waste by page extension user

2016-08-15 Thread js1304
From: Joonsoo Kim 

v2:
Fix rebase mistake (per Vlastimil)
Rename some variable/function to prevent confusion (per Vlastimil)
Fix header dependency (per Sergey)

This patchset tries to reduce the memory wasted by page extension users.

The first case is architecture-supported debug_pagealloc. It doesn't
require additional memory if guard pages aren't used, so 8 bytes per
page will be saved in this case.

The second case is related to the page owner feature. Until now, if a
page_ext user wants its own fields in page_ext, the fields have to be
hard-coded in struct page_ext. This has the following problem.

struct page_ext {
 #ifdef CONFIG_A
int a;
 #endif
 #ifdef CONFIG_B
int b;
 #endif
};

Assume that the kernel is built with both CONFIG_A and CONFIG_B.
Even if we enable feature A and don't enable feature B at runtime,
each entry of struct page_ext takes two ints rather than one.
That is undesirable waste, so this patchset tries to reduce it. With this
patchset, we can save 20 bytes per page dedicated to the page owner
feature in some configurations.

Thanks.

Joonsoo Kim (6):
  mm/debug_pagealloc: clean-up guard page handling code
  mm/debug_pagealloc: don't allocate page_ext if we don't use guard page
  mm/page_owner: move page_owner specific function to page_owner.c
  mm/page_ext: rename offset to index
  mm/page_ext: support extra space allocation by page_ext user
  mm/page_owner: don't define fields on struct page_ext by hard-coding

 include/linux/page_ext.h   |   8 +--
 include/linux/page_owner.h |   2 +
 mm/page_alloc.c|  44 +++--
 mm/page_ext.c  |  45 +
 mm/page_owner.c| 156 ++---
 mm/vmstat.c|  79 ---
 6 files changed, 196 insertions(+), 138 deletions(-)

-- 
1.9.1



[PATCH v2 5/6] mm/page_ext: support extra space allocation by page_ext user

2016-08-15 Thread js1304
From: Joonsoo Kim 

Until now, if a page_ext user wants its own field in page_ext, the field
has to be hard-coded in struct page_ext. This wastes memory in the
following situation.

struct page_ext {
 #ifdef CONFIG_A
int a;
 #endif
 #ifdef CONFIG_B
int b;
 #endif
};

Assume that the kernel is built with both CONFIG_A and CONFIG_B.
Even if we enable feature A and don't enable feature B at runtime,
each entry of struct page_ext takes two ints rather than one.
That is an undesirable result, so this patch tries to fix it.

To solve the above problem, this patch adds support for extra space
allocation at runtime. When the need() callback returns true, its extra
memory requirement is added to the entry size of page_ext, and the
offset of each user's extra memory space is recorded. With this offset,
the user can reach its extra space, and there is no need to hard-code
the needed field in page_ext.
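
As a rough user-space illustration of that scheme (a sketch with
made-up demo_* names, not the code in the diff below): each user that
reports need() gets the current end of the entry as its offset, and its
size is added to the per-entry total. Note that the sketch, like the
description above, does no alignment padding, so the example sizes are
chosen as multiples of sizeof(unsigned long).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_page_ext {
	unsigned long flags;
};

struct demo_ext_ops {
	size_t offset; /* filled in by the core when need() returns true */
	size_t size;   /* extra bytes this user wants in every entry */
	bool (*need)(void);
};

static size_t demo_extra_mem;

/* Assign an offset to every user that needs space; sum up the extra bytes. */
static bool demo_invoke_need(struct demo_ext_ops **ops, size_t n)
{
	bool need = false;

	demo_extra_mem = 0;
	for (size_t i = 0; i < n; i++) {
		if (ops[i]->need && ops[i]->need()) {
			ops[i]->offset = sizeof(struct demo_page_ext) + demo_extra_mem;
			demo_extra_mem += ops[i]->size;
			need = true;
		}
	}
	return need;
}

static size_t demo_entry_size(void)
{
	return sizeof(struct demo_page_ext) + demo_extra_mem;
}

/* Entries are no longer sizeof(struct) apart, so index by byte size. */
static struct demo_page_ext *demo_get_entry(void *base, size_t index)
{
	return (struct demo_page_ext *)((char *)base + demo_entry_size() * index);
}

/* A user reaches its private area through the offset it was given. */
static void *demo_user_data(struct demo_page_ext *ext,
			    const struct demo_ext_ops *ops)
{
	return (char *)ext + ops->offset;
}

static bool demo_yes(void) { return true; }
static bool demo_no(void) { return false; }
```

A user whose need() returns false contributes nothing to the entry
size, which is exactly the saving the CONFIG_A/CONFIG_B example is
after.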

This patch only implements the infrastructure. A following patch will use
it for page_owner, which is the only user having its own fields in page_ext.

Signed-off-by: Joonsoo Kim 
---
 include/linux/page_ext.h |  2 ++
 mm/page_alloc.c  |  2 +-
 mm/page_ext.c| 41 +++--
 3 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
index 03f2a3e..179bdc4 100644
--- a/include/linux/page_ext.h
+++ b/include/linux/page_ext.h
@@ -7,6 +7,8 @@
 
 struct pglist_data;
 struct page_ext_operations {
+   size_t offset;
+   size_t size;
bool (*need)(void);
void (*init)(void);
 };
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 45cb021..d2e365c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -688,7 +688,7 @@ static inline void clear_page_guard(struct zone *zone, 
struct page *page,
__mod_zone_freepage_state(zone, (1 << order), migratetype);
 }
 #else
-struct page_ext_operations debug_guardpage_ops = { NULL, };
+struct page_ext_operations debug_guardpage_ops;
 static inline bool set_page_guard(struct zone *zone, struct page *page,
unsigned int order, int migratetype) { return false; }
 static inline void clear_page_guard(struct zone *zone, struct page *page,
diff --git a/mm/page_ext.c b/mm/page_ext.c
index 1629282..121dcff 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -42,6 +42,11 @@
  * and page extension core can skip to allocate memory. As result,
  * none of memory is wasted.
  *
+ * When need callback returns true, page_ext checks if there is a request for
+ * extra memory through size in struct page_ext_operations. If it is non-zero,
+ * extra space is allocated for each page_ext entry and offset is returned to
+ * user through offset in struct page_ext_operations.
+ *
  * The init callback is used to do proper initialization after page extension
  * is completely initialized. In sparse memory system, extra memory is
  * allocated some time later than memmap is allocated. In other words, lifetime
@@ -66,18 +71,24 @@ static struct page_ext_operations *page_ext_ops[] = {
 };
 
 static unsigned long total_usage;
+static unsigned long extra_mem;
 
 static bool __init invoke_need_callbacks(void)
 {
int i;
int entries = ARRAY_SIZE(page_ext_ops);
+   bool need = false;
 
for (i = 0; i < entries; i++) {
-   if (page_ext_ops[i]->need && page_ext_ops[i]->need())
-   return true;
+   if (page_ext_ops[i]->need && page_ext_ops[i]->need()) {
+   page_ext_ops[i]->offset = sizeof(struct page_ext) +
+   extra_mem;
+   extra_mem += page_ext_ops[i]->size;
+   need = true;
+   }
}
 
-   return false;
+   return need;
 }
 
 static void __init invoke_init_callbacks(void)
@@ -91,6 +102,16 @@ static void __init invoke_init_callbacks(void)
}
 }
 
+static unsigned long get_entry_size(void)
+{
+   return sizeof(struct page_ext) + extra_mem;
+}
+
+static inline struct page_ext *get_entry(void *base, unsigned long index)
+{
+   return base + get_entry_size() * index;
+}
+
 #if !defined(CONFIG_SPARSEMEM)
 
 
@@ -121,7 +142,7 @@ struct page_ext *lookup_page_ext(struct page *page)
 #endif
index = pfn - round_down(node_start_pfn(page_to_nid(page)),
MAX_ORDER_NR_PAGES);
-   return base + index;
+   return get_entry(base, index);
 }
 
 static int __init alloc_node_page_ext(int nid)
@@ -143,7 +164,7 @@ static int __init alloc_node_page_ext(int nid)
!IS_ALIGNED(node_end_pfn(nid), MAX_ORDER_NR_PAGES))
nr_pages += MAX_ORDER_NR_PAGES;
 
-   table_size = sizeof(struct page_ext) * nr_pages;
+   table_size = get_entry_size() * nr_pages;
 
base = memblock_virt_alloc_try_nid_nopanic(
  

[PATCH v2 6/6] mm/page_owner: don't define fields on struct page_ext by hard-coding

2016-08-15 Thread js1304
From: Joonsoo Kim 

There is a memory waste problem if we hard-code fields in struct
page_ext. The entry size of struct page_ext includes the size of those
fields even if the feature is disabled at runtime. Now that extra memory
can be requested at runtime, page_owner doesn't need to hard-code its
own fields.

This patch removes the hard-coded definition and uses the extra memory
to store page_owner information. Most of the changes are mechanical.

Acked-by: Vlastimil Babka 
Signed-off-by: Joonsoo Kim 
---
 include/linux/page_ext.h |  6 
 mm/page_owner.c  | 83 +---
 2 files changed, 58 insertions(+), 31 deletions(-)

diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
index 179bdc4..9298c39 100644
--- a/include/linux/page_ext.h
+++ b/include/linux/page_ext.h
@@ -44,12 +44,6 @@ enum page_ext_flags {
  */
 struct page_ext {
unsigned long flags;
-#ifdef CONFIG_PAGE_OWNER
-   unsigned int order;
-   gfp_t gfp_mask;
-   int last_migrate_reason;
-   depot_stack_handle_t handle;
-#endif
 };
 
 extern void pgdat_page_ext_init(struct pglist_data *pgdat);
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 2cae0b2..0537d15 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -18,6 +18,13 @@
  */
 #define PAGE_OWNER_STACK_DEPTH (16)
 
+struct page_owner {
+   unsigned int order;
+   gfp_t gfp_mask;
+   int last_migrate_reason;
+   depot_stack_handle_t handle;
+};
+
 static bool page_owner_disabled = true;
 DEFINE_STATIC_KEY_FALSE(page_owner_inited);
 
@@ -86,10 +93,16 @@ static void init_page_owner(void)
 }
 
 struct page_ext_operations page_owner_ops = {
+   .size = sizeof(struct page_owner),
.need = need_page_owner,
.init = init_page_owner,
 };
 
+static inline struct page_owner *get_page_owner(struct page_ext *page_ext)
+{
+   return (void *)page_ext + page_owner_ops.offset;
+}
+
 void __reset_page_owner(struct page *page, unsigned int order)
 {
int i;
@@ -156,14 +169,16 @@ noinline void __set_page_owner(struct page *page, 
unsigned int order,
gfp_t gfp_mask)
 {
struct page_ext *page_ext = lookup_page_ext(page);
+   struct page_owner *page_owner;
 
if (unlikely(!page_ext))
return;
 
-   page_ext->handle = save_stack(gfp_mask);
-   page_ext->order = order;
-   page_ext->gfp_mask = gfp_mask;
-   page_ext->last_migrate_reason = -1;
+   page_owner = get_page_owner(page_ext);
+   page_owner->handle = save_stack(gfp_mask);
+   page_owner->order = order;
+   page_owner->gfp_mask = gfp_mask;
+   page_owner->last_migrate_reason = -1;
 
__set_bit(PAGE_EXT_OWNER, &page_ext->flags);
 }
@@ -171,21 +186,26 @@ noinline void __set_page_owner(struct page *page, 
unsigned int order,
 void __set_page_owner_migrate_reason(struct page *page, int reason)
 {
struct page_ext *page_ext = lookup_page_ext(page);
+   struct page_owner *page_owner;
+
if (unlikely(!page_ext))
return;
 
-   page_ext->last_migrate_reason = reason;
+   page_owner = get_page_owner(page_ext);
+   page_owner->last_migrate_reason = reason;
 }
 
 void __split_page_owner(struct page *page, unsigned int order)
 {
int i;
struct page_ext *page_ext = lookup_page_ext(page);
+   struct page_owner *page_owner;
 
if (unlikely(!page_ext))
return;
 
-   page_ext->order = 0;
+   page_owner = get_page_owner(page_ext);
+   page_owner->order = 0;
for (i = 1; i < (1 << order); i++)
__copy_page_owner(page, page + i);
 }
@@ -194,14 +214,18 @@ void __copy_page_owner(struct page *oldpage, struct page 
*newpage)
 {
struct page_ext *old_ext = lookup_page_ext(oldpage);
struct page_ext *new_ext = lookup_page_ext(newpage);
+   struct page_owner *old_page_owner, *new_page_owner;
 
if (unlikely(!old_ext || !new_ext))
return;
 
-   new_ext->order = old_ext->order;
-   new_ext->gfp_mask = old_ext->gfp_mask;
-   new_ext->last_migrate_reason = old_ext->last_migrate_reason;
-   new_ext->handle = old_ext->handle;
+   old_page_owner = get_page_owner(old_ext);
+   new_page_owner = get_page_owner(new_ext);
+   new_page_owner->order = old_page_owner->order;
+   new_page_owner->gfp_mask = old_page_owner->gfp_mask;
+   new_page_owner->last_migrate_reason =
+   old_page_owner->last_migrate_reason;
+   new_page_owner->handle = old_page_owner->handle;
 
/*
 * We don't clear the bit on the oldpage as it's going to be freed
@@ -220,6 +244,7 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m, 
pg_data_t *pgdat,
 {
struct page *page;
struct page_ext *page_ext;
+   struct page_owner *page_owner;
unsigned 

[PATCH v2 2/6] mm/debug_pagealloc: don't allocate page_ext if we don't use guard page

2016-08-15 Thread js1304
From: Joonsoo Kim 

What debug_pagealloc does is just map and unmap page table entries.
Basically, it doesn't need additional memory to remember anything.
But with the guard page feature, it requires additional memory to tell
whether a page is a guard page or not. Guard pages are only used when
debug_guardpage_minorder is non-zero, so this patch skips the additional
memory allocation (page_ext) if debug_guardpage_minorder is zero.

This saves memory if we use just debug_pagealloc without guard pages.

Acked-by: Vlastimil Babka 
Reviewed-by: Sergey Senozhatsky 
Signed-off-by: Joonsoo Kim 
---
 mm/page_alloc.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5e7944b..45cb021 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -608,6 +608,9 @@ static bool need_debug_guardpage(void)
if (!debug_pagealloc_enabled())
return false;
 
+   if (!debug_guardpage_minorder())
+   return false;
+
return true;
 }
 
@@ -616,6 +619,9 @@ static void init_debug_guardpage(void)
if (!debug_pagealloc_enabled())
return;
 
+   if (!debug_guardpage_minorder())
+   return;
+
_debug_guardpage_enabled = true;
 }
 
@@ -636,7 +642,7 @@ static int __init debug_guardpage_minorder_setup(char *buf)
pr_info("Setting debug_guardpage_minorder to %lu\n", res);
return 0;
 }
-__setup("debug_guardpage_minorder=", debug_guardpage_minorder_setup);
+early_param("debug_guardpage_minorder", debug_guardpage_minorder_setup);
 
 static inline bool set_page_guard(struct zone *zone, struct page *page,
unsigned int order, int migratetype)
-- 
1.9.1



[PATCH v1 2/3] of: Add support for reading a s32 from a multi-value property.

2016-08-15 Thread Finlye Xiao
From: Finley Xiao 

This patch adds an of_property_read_s32_index() function to allow
reading a single indexed s32 value from a property containing multiple
s32 values.
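
To show what "reading an s32" amounts to, here is a small user-space
sketch. It is not the kernel helper: the demo_* functions are made-up
names, the property is modelled as a raw byte buffer of big-endian
32-bit cells (the device tree cell encoding), and the error code is a
plain -1 rather than the kernel's errno values. The signed variant just
reinterprets the 32 bits that the unsigned reader decoded, which is
also what the pointer cast in the patch relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode the index-th big-endian cell; 0 on success, -1 if out of range. */
static int demo_read_u32_index(const uint8_t *prop, size_t len,
			       size_t index, uint32_t *out)
{
	assert(prop && out);
	if (index >= len / 4)
		return -1;

	const uint8_t *p = prop + index * 4;
	*out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	return 0;
}

/* Same bits, read back as a signed value; out is untouched on error. */
static int demo_read_s32_index(const uint8_t *prop, size_t len,
			       size_t index, int32_t *out)
{
	uint32_t raw;
	int ret = demo_read_u32_index(prop, len, index, &raw);

	if (ret)
		return ret;
	memcpy(out, &raw, sizeof(*out)); /* bit-for-bit reinterpretation */
	return 0;
}
```

A cell holding the bytes ff ff 9e 58 therefore comes back as -25000
from the signed reader, while the unsigned reader reports 4294942296
for the same cell.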

Signed-off-by: Finley Xiao 
---
 drivers/of/base.c  | 23 +++
 include/linux/of.h | 10 ++
 2 files changed, 33 insertions(+)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 7792266..346457d 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1200,6 +1200,29 @@ int of_property_read_u32_index(const struct device_node 
*np,
 EXPORT_SYMBOL_GPL(of_property_read_u32_index);
 
 /**
+ * of_property_read_s32_index - Find and read a s32 from a multi-value 
property.
+ *
+ * @np:device node from which the property value is to be read.
+ * @propname:  name of the property to be searched.
+ * @index: index of the s32 in the list of values
+ * @out_value: pointer to return value, modified only if no error.
+ *
+ * Search for a property in a device node and read nth 32-bit value from
+ * it. Returns 0 on success, -EINVAL if the property does not exist,
+ * -ENODATA if property does not have a value, and -EOVERFLOW if the
+ * property data isn't large enough.
+ *
+ * The out_value is modified only if a valid s32 value can be decoded.
+ */
+int of_property_read_s32_index(const struct device_node *np,
+  const char *propname, u32 index, s32 *out_value)
+{
+   return of_property_read_u32_index(np, propname, index,
+   (u32 *)out_value);
+}
+EXPORT_SYMBOL_GPL(of_property_read_s32_index);
+
+/**
  * of_property_read_u8_array - Find and read an array of u8 from a property.
  *
  * @np:device node from which the property value is to be read.
diff --git a/include/linux/of.h b/include/linux/of.h
index 3d9ff8e..cb9a627 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -291,6 +291,9 @@ extern int of_property_count_elems_of_size(const struct 
device_node *np,
 extern int of_property_read_u32_index(const struct device_node *np,
   const char *propname,
   u32 index, u32 *out_value);
+extern int of_property_read_s32_index(const struct device_node *np,
+ const char *propname,
+ u32 index, s32 *out_value);
 extern int of_property_read_u8_array(const struct device_node *np,
const char *propname, u8 *out_values, size_t sz);
 extern int of_property_read_u16_array(const struct device_node *np,
@@ -536,6 +539,13 @@ static inline int of_property_read_u32_index(const struct 
device_node *np,
return -ENOSYS;
 }
 
+static inline int of_property_read_s32_index(const struct device_node *np,
+const char *propname,
+u32 index, s32 *out_value)
+{
+   return -ENOSYS;
+}
+
 static inline int of_property_read_u8_array(const struct device_node *np,
const char *propname, u8 *out_values, size_t sz)
 {
-- 
1.9.1




[PATCH v1 1/3] nvmem: rockchip-efuse: Change initcall to subsys

2016-08-15 Thread Finlye Xiao
From: Finley Xiao 

We will register a cpufreq notifier for adjusting the OPP voltage, and
it needs to fetch the CPU's leakage from the efuse in the notifier_call,
so the efuse driver should probe before the cpufreq driver.

Signed-off-by: Finley Xiao 
---
 drivers/nvmem/rockchip-efuse.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvmem/rockchip-efuse.c b/drivers/nvmem/rockchip-efuse.c
index 4d3f391..378993d 100644
--- a/drivers/nvmem/rockchip-efuse.c
+++ b/drivers/nvmem/rockchip-efuse.c
@@ -144,6 +144,13 @@ static struct platform_driver rockchip_efuse_driver = {
},
 };
 
-module_platform_driver(rockchip_efuse_driver);
+static int __init rockchip_efuse_module_init(void)
+{
+   return platform_driver_probe(&rockchip_efuse_driver,
+rockchip_efuse_probe);
+}
+
+subsys_initcall(rockchip_efuse_module_init);
+
 MODULE_DESCRIPTION("rockchip_efuse driver");
 MODULE_LICENSE("GPL v2");
-- 
1.9.1




[PATCH v1 0/3] PM / AVS: add Rockchip cpu avs

2016-08-15 Thread Finlye Xiao
From: Finley Xiao 

At a given frequency, the required operating voltage tends to decrease
as leakage increases, so the OPP voltage should be adjusted according
to leakage to save power.
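
The lookup this implies can be sketched in user space as follows. This
is only an illustration, not the driver from patch 3/3: the demo_*
names are made up, and the sketch assumes that the third cell of each
leakage-volt tuple is a signed adjustment in microvolts, which the
negative values in the binding example suggest but the posting does not
state outright.

```c
#include <assert.h>
#include <stddef.h>

/* One tuple of a leakage-volt table, as the sketch understands it. */
struct demo_leakage_volt {
	int min_leakage; /* mA, inclusive */
	int max_leakage; /* mA, inclusive */
	int adjust_uv;   /* microvolts, assumed to be a signed adjustment */
};

/*
 * Find the range that contains the measured leakage.
 * Returns 0 and stores the adjustment on a match, -1 if no range matches.
 */
static int demo_leakage_to_adjust(const struct demo_leakage_volt *tbl,
				  size_t n, int leakage, int *adjust_uv)
{
	assert(tbl && adjust_uv);
	for (size_t i = 0; i < n; i++) {
		if (leakage >= tbl[i].min_leakage &&
		    leakage <= tbl[i].max_leakage) {
			*adjust_uv = tbl[i].adjust_uv;
			return 0;
		}
	}
	return -1;
}
```

With a hypothetical table of { 0, 100, 0 }, { 101, 200, -25000 } and
{ 201, 300, -50000 }, a chip that reads 150 mA from the efuse would get
25 mV taken off each OPP voltage, and a reading outside every range
would leave the OPP table untouched.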

Finley Xiao (3):
  nvmem: rockchip-efuse: Change initcall to subsys
  of: Add support for reading a s32 from a multi-value property.
  PM / AVS: rockchip-cpu-avs: add driver handling Rockchip cpu avs

 .../devicetree/bindings/power/rockchip-cpu-avs.txt |  37 +++
 drivers/nvmem/rockchip-efuse.c |   9 +-
 drivers/of/base.c  |  23 ++
 drivers/power/avs/Kconfig  |   8 +
 drivers/power/avs/Makefile |   1 +
 drivers/power/avs/rockchip-cpu-avs.c   | 314 +
 include/linux/of.h |  10 +
 7 files changed, 401 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/power/rockchip-cpu-avs.txt
 create mode 100644 drivers/power/avs/rockchip-cpu-avs.c

-- 
1.9.1




[PATCH v1 3/3] PM / AVS: rockchip-cpu-avs: add driver handling Rockchip cpu avs

2016-08-15 Thread Finlye Xiao
From: Finley Xiao 

This patch adds support for adjusting the OPP voltage according to leakage.

Signed-off-by: Finley Xiao 
---
 .../devicetree/bindings/power/rockchip-cpu-avs.txt |  37 +++
 drivers/power/avs/Kconfig  |   8 +
 drivers/power/avs/Makefile |   1 +
 drivers/power/avs/rockchip-cpu-avs.c   | 314 +
 4 files changed, 360 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/power/rockchip-cpu-avs.txt
 create mode 100644 drivers/power/avs/rockchip-cpu-avs.c

diff --git a/Documentation/devicetree/bindings/power/rockchip-cpu-avs.txt 
b/Documentation/devicetree/bindings/power/rockchip-cpu-avs.txt
new file mode 100644
index 000..90f6b08
--- /dev/null
+++ b/Documentation/devicetree/bindings/power/rockchip-cpu-avs.txt
@@ -0,0 +1,37 @@
+Rockchip cpu avs device tree bindings
+-
+
+Under the same frequency, the operating voltage tends to decrease with
+increasing leakage. so it is necessary to adjust opp's voltage according
+to leakage for power.
+
+
+Required properties:
+- compatible: Should be one of the following.
+  - "rockchip,rk3399-cpu-avs" - for RK3399 SoCs.
+- leakage-volt-: Named leakage-volt property. At runtime, the
+  platform can find a cpu's cluster_id according to it's cpu_id and match
+  leakage-volt- property. The property is an array of 3-tuples
+  items, and each item consists of leakage and voltage like
+  .
+   min-leakage: minimum leakage in mA.
+   max-leakage: maximum leakage in mA.
+   vol: voltage in microvolt.
+
+Example:
+
+   cpu_avs: cpu-avs {
+   compatible = "rockchip,rk3399-cpu-avs";
+   leakage-volt-cluster0 = <
+   /*  mAmA uV*/
+   0 1000
+   101   200(-25000)
+   201   300(-5)
+   >;
+   leakage-volt-cluster1 = <
+   /*  mAmA uV*/
+   0 1000
+   101   200(-25000)
+   201   300(-5)
+   >;
+   };
diff --git a/drivers/power/avs/Kconfig b/drivers/power/avs/Kconfig
index a67eeac..c8f2d09 100644
--- a/drivers/power/avs/Kconfig
+++ b/drivers/power/avs/Kconfig
@@ -18,3 +18,11 @@ config ROCKCHIP_IODOMAIN
   Say y here to enable support io domains on Rockchip SoCs. It is
   necessary for the io domain setting of the SoC to match the
   voltage supplied by the regulators.
+
+config ROCKCHIP_CPU_AVS
+bool "Rockchip CPU AVS support"
+depends on POWER_AVS && ARCH_ROCKCHIP && OF
+help
+  Say y here to enable support CPU AVS on Rockchip SoCs.
+  The cpu's operating voltage is adapted depending on leakage
+  or pvtm.
diff --git a/drivers/power/avs/Makefile b/drivers/power/avs/Makefile
index ba4c7bc..11ce242 100644
--- a/drivers/power/avs/Makefile
+++ b/drivers/power/avs/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_POWER_AVS_OMAP)   += smartreflex.o
 obj-$(CONFIG_ROCKCHIP_IODOMAIN)+= rockchip-io-domain.o
+obj-$(CONFIG_ROCKCHIP_CPU_AVS) += rockchip-cpu-avs.o
diff --git a/drivers/power/avs/rockchip-cpu-avs.c 
b/drivers/power/avs/rockchip-cpu-avs.c
new file mode 100644
index 000..8266c02
--- /dev/null
+++ b/drivers/power/avs/rockchip-cpu-avs.c
@@ -0,0 +1,314 @@
+/*
+ * Rockchip CPU AVS support.
+ *
+ * Copyright (c) 2016 ROCKCHIP, Co. Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../../base/power/opp/opp.h"
+
+#define MAX_NAME_LEN   22
+#define LEAKAGE_TABLE_END  ~1
+#define INVALID_VALUE  0xff
+
+struct leakage_volt_table {
+   intmin;
+   intmax;
+   intvolt;
+};
+
+struct leakage_volt_table *leakage_volt_table;
+
+struct rockchip_cpu_avs {
+   struct leakage_volt_table **volt_table;
+   struct notifier_block   cpufreq_notify;
+};
+
+#define notifier_to_avs(_n) container_of(_n, struct rockchip_cpu_avs, \
+   cpufreq_notify)
+
+static unsigned char rockchip_fetch_leakage(struct device *dev)
+{
+   struct nvmem_cell *cell;
+   unsigned char *buf;
+   size_t len;
+   unsigned char leakage = INVALID_VALUE;
+
+   cell = nvmem_cell_get(dev, "cpu_leakage");
+   if (IS_ERR(cell)) {
+   pr_err("failed to get cpu_leakage cell\n");
+   return INVALID_VALUE;
+   }
+
+   buf = (unsigned char *)nvmem_cell_read(cell, &len);
+
+   
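For illustration only, here is a minimal sketch of the lookup the binding above describes: given a leakage reading, find the tuple whose [min, max] range contains it and return the associated voltage adjustment. The table values, `example_table`, and the function name `leakage_to_volt` are assumptions made for this sketch and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* One <min-leakage max-leakage volt> tuple, mirroring the patch's struct. */
struct leakage_volt_entry {
	int min;   /* minimum leakage in mA */
	int max;   /* maximum leakage in mA */
	int volt;  /* voltage adjustment in microvolts */
};

/* Hypothetical table: these values are illustrative, not from the patch. */
static const struct leakage_volt_entry example_table[] = {
	{   0, 100,      0 },
	{ 101, 200, -25000 },
	{ 201, 300, -50000 },
};

#define EXAMPLE_TABLE_LEN (sizeof(example_table) / sizeof(example_table[0]))

/*
 * Store the adjustment for 'leakage' in *volt and return 0,
 * or return -1 if no range in the table contains the reading.
 */
static int leakage_to_volt(const struct leakage_volt_entry *tbl, size_t n,
			   int leakage, int *volt)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (leakage >= tbl[i].min && leakage <= tbl[i].max) {
			*volt = tbl[i].volt;
			return 0;
		}
	}
	return -1;
}

/* Small self-check showing the intended use. */
static void leakage_example(void)
{
	int volt = 0;

	assert(leakage_to_volt(example_table, EXAMPLE_TABLE_LEN, 150, &volt) == 0);
	assert(volt == -25000);
	assert(leakage_to_volt(example_table, EXAMPLE_TABLE_LEN, 999, &volt) == -1);
}
```

Calling `leakage_example()` from a test harness or `main()` exercises both the matching and the non-matching path.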

Re: [PATCH 2/2] KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write

2016-08-15 Thread Wanpeng Li
2016-08-09 2:16 GMT+08:00 Radim Krčmář :
> If vmcs12 does not intercept APIC_BASE writes, then KVM will handle the
> write with vmcs02 as the current VMCS.
> This will incorrectly apply modifications intended for vmcs01 to vmcs02
> and L2 can use it to gain access to L0's x2APIC registers by disabling
> virtualized x2APIC while using msr bitmap that assumes enabled.
>
> Postpone execution of vmx_set_virtual_x2apic_mode until vmcs01 is the
> current VMCS.  An alternative solution would temporarily make vmcs01 the
> current VMCS, but it requires more care.
>
> Fixes: 8d14695f9542 ("x86, apicv: add virtual x2apic support")
> Reported-by: Jim Mattson 
> Signed-off-by: Radim Krčmář 

Reviewed-by: Wanpeng Li 

> ---
>  arch/x86/kvm/vmx.c | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index c66ac2c70d22..ae111a07acc4 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -422,6 +422,7 @@ struct nested_vmx {
> struct list_head vmcs02_pool;
> int vmcs02_num;
> u64 vmcs01_tsc_offset;
> +   bool change_vmcs01_virtual_x2apic_mode;
> /* L2 must run next, and mustn't decide to exit to L1. */
> bool nested_run_pending;
> /*
> @@ -8424,6 +8425,12 @@ static void vmx_set_virtual_x2apic_mode(struct 
> kvm_vcpu *vcpu, bool set)
>  {
> u32 sec_exec_control;
>
> +   /* Postpone execution until vmcs01 is the current VMCS. */
> +   if (is_guest_mode(vcpu)) {
> +   to_vmx(vcpu)->nested.change_vmcs01_virtual_x2apic_mode = true;
> +   return;
> +   }
> +
> /*
>  * There is not point to enable virtualize x2apic without enable
>  * apicv
> @@ -10749,6 +10756,12 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, 
> u32 exit_reason,
> vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
>   PIN_BASED_VMX_PREEMPTION_TIMER);
>
> +   if (vmx->nested.change_vmcs01_virtual_x2apic_mode) {
> +   vmx->nested.change_vmcs01_virtual_x2apic_mode = false;
> +   vmx_set_virtual_x2apic_mode(vcpu,
> +   vcpu->arch.apic_base & X2APIC_ENABLE);
> +   }
> +
> /* This is needed for same reason as it was needed in prepare_vmcs02 
> */
> vmx->host_rsp = 0;
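The postponement described in the patch can be pictured with a small user-space model: a request that arrives while the "wrong" context is current only sets a flag, and the flag is consumed once the right context is current again. This is a sketch under stated assumptions; `struct toy_vcpu` and every `toy_*` function are hypothetical names for illustration and do not correspond to real KVM interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a vCPU; all names here are hypothetical. */
struct toy_vcpu {
	bool guest_mode;      /* true while L2 runs (vmcs02 would be current) */
	bool pending_update;  /* an update meant for vmcs01 was postponed */
	bool x2apic_enabled;  /* latest mode requested through APIC_BASE */
	bool vmcs01_x2apic;   /* state actually programmed into "vmcs01" */
};

/* Apply the mode now, or remember to do it later if L2 is running. */
static void toy_set_x2apic_mode(struct toy_vcpu *v, bool set)
{
	if (v->guest_mode) {
		v->pending_update = true;
		return;
	}
	v->vmcs01_x2apic = set;
}

/* Models an APIC_BASE write that KVM handles itself. */
static void toy_apicbase_write(struct toy_vcpu *v, bool enable)
{
	v->x2apic_enabled = enable;
	toy_set_x2apic_mode(v, enable);
}

/* Models the nested VM exit: vmcs01 becomes current, then replay. */
static void toy_nested_vmexit(struct toy_vcpu *v)
{
	v->guest_mode = false;
	if (v->pending_update) {
		v->pending_update = false;
		toy_set_x2apic_mode(v, v->x2apic_enabled);
	}
}

/* Small self-check showing the intended sequence of events. */
static void toy_deferred_example(void)
{
	struct toy_vcpu v = { true, false, false, false };

	toy_apicbase_write(&v, true);
	assert(v.pending_update && !v.vmcs01_x2apic);
	toy_nested_vmexit(&v);
	assert(!v.pending_update && v.vmcs01_x2apic);
}
```

The replay reads the latest requested mode rather than the value seen when the flag was set, so several writes made while L2 runs collapse into a single update.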


Re: [PART2 PATCH v5 06/12] iommu/amd: Adding GALOG interrupt handler

2016-08-15 Thread Suravee Suthikulpanit

Hi Joerg,

On 8/9/16 21:43, Joerg Roedel wrote:

On Mon, Jul 25, 2016 at 04:32:05AM -0500, Suthikulpanit, Suravee wrote:

From: Suravee Suthikulpanit 

This patch adds AMD IOMMU guest virtual APIC log (GALOG) handler.
When IOMMU hardware receives an interrupt targeting a blocking vcpu,
it creates an entry in the GALOG, and generates an interrupt to notify
the AMD IOMMU driver.

At this point, the driver processes the log entry and notifies the SVM
driver via the registered iommu_ga_log_notifier function.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd_iommu.c | 77 +--
 include/linux/amd-iommu.h | 20 ++--
 2 files changed, 91 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index abfb2b7..861d723 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -741,14 +741,78 @@ static void iommu_poll_ppr_log(struct amd_iommu *iommu)
}
 }

+#ifdef CONFIG_IRQ_REMAP
+static int (*iommu_ga_log_notifier)(u32);
+
+int amd_iommu_register_ga_log_notifier(int (*notifier)(u32))
+{
+   iommu_ga_log_notifier = notifier;
+
+   return 0;
+}
+EXPORT_SYMBOL(amd_iommu_register_ga_log_notifier);
+
+static void iommu_poll_ga_log(struct amd_iommu *iommu)
+{
+   u32 head, tail, cnt = 0;
+
+   if (iommu->ga_log == NULL)
+   return;
+
+   head = readl(iommu->mmio_base + MMIO_GA_HEAD_OFFSET);
+   tail = readl(iommu->mmio_base + MMIO_GA_TAIL_OFFSET);
+
+   while (head != tail) {
+   volatile u64 *raw;
+   u64 log_entry;
+
+   raw = (u64 *)(iommu->ga_log + head);
+   cnt++;
+
+   /* Avoid memcpy function-call overhead */
+   log_entry = *raw;
+
+   /* Update head pointer of hardware ring-buffer */
+   head = (head + GA_ENTRY_SIZE) % GA_LOG_SIZE;
+   writel(head, iommu->mmio_base + MMIO_GA_HEAD_OFFSET);
+
+   /* Handle GA entry */
+   switch (GA_REQ_TYPE(log_entry)) {
+   case GA_GUEST_NR:
+   if (!iommu_ga_log_notifier)
+   break;
+
+   pr_debug("AMD-Vi: %s: devid=%#x, ga_tag=%#x\n",
+__func__, GA_DEVID(log_entry),
+GA_TAG(log_entry));
+
+   if (iommu_ga_log_notifier(GA_TAG(log_entry)) != 0)
+   pr_err("AMD-Vi: GA log notifier failed.\n");
+   break;
+   default:
+   break;
+   }
+
+   /* Refresh ring-buffer information */
+   head = readl(iommu->mmio_base + MMIO_GA_HEAD_OFFSET);
+   tail = readl(iommu->mmio_base + MMIO_GA_TAIL_OFFSET);


Couldn't that cause an endless-loop in case of an interrupt storm from a
device? I think it is better to just read head and tail once before the
loop and update head after we get out of the loop. Any new entries
could be handled by the next iommu interrupt. This avoids any
soft-lockups that might happen when the loop runs for too long.



Sure. Also, we might need to start handling GALogOverflow as well. 
However, let's put that in a separate patch series. What do you think?


S
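A rough user-space sketch of the approach suggested in the review: sample head and tail once, process only the entries that were present at that moment, and publish the new head after the loop. It assumes a ring indexed by entry number rather than by byte offset as in the patch, and `struct toy_ring`, `toy_ring_drain`, and `toy_handle` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define RING_ENTRIES 8u

/* Hypothetical ring buffer; the real GA log lives in IOMMU MMIO space. */
struct toy_ring {
	uint64_t entries[RING_ENTRIES];
	uint32_t head;  /* consumer index, advanced by software */
	uint32_t tail;  /* producer index, advanced by hardware */
};

static uint64_t toy_sum;  /* accumulates handled entries for the demo */

static void toy_handle(uint64_t entry)
{
	toy_sum += entry;
}

/*
 * Process the entries in [head, tail) as sampled on entry and return how
 * many were handled. Entries that arrive later wait for the next call,
 * so the loop runs at most RING_ENTRIES - 1 times.
 */
static unsigned int toy_ring_drain(struct toy_ring *r,
				   void (*handle)(uint64_t))
{
	uint32_t head = r->head;
	uint32_t tail = r->tail;  /* snapshot taken once */
	unsigned int cnt = 0;

	while (head != tail) {
		handle(r->entries[head]);
		head = (head + 1) % RING_ENTRIES;
		cnt++;
	}
	r->head = head;  /* publish the consumer index once, after the loop */
	return cnt;
}
```

One trade-off of publishing head only once is that the producer sees no free slots until the whole batch is done, which is part of why overflow handling comes up in the reply above.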


mipsel-linux-gnu-gcc: error: unrecognized command line option '-mcompact-branches=optimal'

2016-08-15 Thread kbuild test robot
Hi Paul,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   3684b03d8e9a889eda94ee74421959a9d55e5e19
commit: c1a0e9bc885d46e519fd87d35af6a7937abfb986 MIPS: Allow compact branch 
policy to be changed
date:   10 months ago
config: mips-malta_qemu_32r6_defconfig (attached as .config)
compiler: mipsel-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout c1a0e9bc885d46e519fd87d35af6a7937abfb986
# save the attached .config to linux build tree
make.cross ARCH=mips 

All errors (new ones prefixed by >>):

>> mipsel-linux-gnu-gcc: error: unrecognized command line option 
>> '-mcompact-branches=optimal'
>> mipsel-linux-gnu-gcc: error: unrecognized command line option 
>> '-mcompact-branches=optimal'
   make[2]: *** [kernel/bounds.s] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [sub-make] Error 2

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: Binary data


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-15 Thread Aaron Lu

Any update on this, Long?

Regards,
Aaron

On 08/08/2016 10:10 AM, Aaron Lu wrote:
> On Fri, Aug 05, 2016 at 07:53:38PM +0800, Xin Long wrote:
>>>> It doesn't make much sense to me. The code I added cannot be
>>>> triggered without enabling any pr policies, and I also did the tests in
>>>
>>> It seems these pr policies have to be turned on by user space, i.e.
>>> netperf in this case?
>>>
>>> I checked netperf's source code; it doesn't seem to set any option
>>> related to SCTP PR POLICY, but I'm new to network code so I could be
>>> wrong or missing something.
>>>
>>>> my local environment; the result looks normal to me compared to the
>>>> prior version.
>>>
>>> Can you share your number?
>>> We run netperf like this:
>>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>> The full log of the run is attached for your reference.
>>
>> Now I also changed to linux-net.git
>>
>> commit 96b585267f552d4b6a28ea8bd75e5ed03deb6e71
>> [root@hp-dl388g8-08 ~]# uname -r
>> 4.7.0.new
>> [root@hp-dl388g8-08 ~]# netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 --
>> -m 10K -H 127.0.0.1
>> SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
>> 127.0.0.1 () port 0 AF_INET
>> Recv   SendSend  Utilization   Service Demand
>> Socket Socket  Message  Elapsed  Send Recv SendRecv
>> Size   SizeSize Time Throughput  localremote   local   remote
>> bytes  bytes   bytessecs.10^6bits/s  % S  % S  us/KB   us/KB
>>
>> 212992 212992  10240300.00 11814.56   4.65 4.65 0.775   0.774
>>
>>
>> commit f959fb442c35f4b61fea341401b8463dd0a1b959 (just before the buggy 
>> patch)
> 
> I'm testing on Linus' master, can we all use that please?
> 
>> [root@localhost ~]# netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m
>> 10K -H 127.0.0.1
>> SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
>> 127.0.0.1 () port 0 AF_INET
>> Recv   SendSend  Utilization   Service Demand
>> Socket Socket  Message  Elapsed  Send Recv SendRecv
>> Size   SizeSize Time Throughput  localremote   local   remote
>> bytes  bytes   bytessecs.10^6bits/s  % S  % S  us/KB   us/KB
>>
>> 212992 212992  10240300.00 9454.90   5.22 5.22 1.086   1.085
>>
>>
>> I did the tests on a physical machine.
>> Did you run yours in a guest?
> 
> The test is done on an Ivy Bridge desktop with 8G memory:
> # cpudesc : Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz
> # total memory : 8058152 kB
> 
>>
>>>

>>>> Recently the sctp performance has not been stable: during these patches,
>>>> netperf could not get a result and returned ENOTCONN instead, which may
>>>> also affect the testing. Anyway, we've already fixed the -ENOTCONN issue
>>>> in the latest version.
>>>
>>> I tested commit 96b585267f55, which is Linus' git tree HEAD on 08/03, I
>>> guess the fix you mentioned should already be in there? But
>>> unfortunately, the throughput of netperf is still low (we ran
>>> the test 5 times):
>>> $ cat */netperf.json
>>> {
>>>   "netperf.Throughput_Mbps": [
>>> 2470.69748
>>>   ]
>>> }{
>>>   "netperf.Throughput_Mbps": [
>>> 2486.7675
>>>   ]
>>> }{
>>>   "netperf.Throughput_Mbps": [
>>> 2478.945
>>>   ]
>>> }{
>>>   "netperf.Throughput_Mbps": [
>>> 2429.465
>>>   ]
>>> }{
>>>   "netperf.Throughput_Mbps": [
>>> 2476.91504
>>>   ]
>>>
>>> Considering what you have said that the patch shouldn't make a
>>> difference, the performance drop is really confusing. Any idea what
>>> could be the cause? Thanks.
>> Now I see your test results against the new kernel.
>>
>> Could you do the same test on the kernel before the problematic commit ?
> 
> Yes, the throughput of its parent commit is high enough to trigger the
> automatic bisect, and then we sent out the report.
> 
> Throughput of its parent commit 826d253d57b1("sctp: add SCTP_PR_ASSOC_STATUS
> on sctp sockopt"):
> Average:
> "netperf.Throughput_Mbps": 3923.84375,
> 
> $ cat */netperf.json
> {
>   "netperf.Throughput_Mbps": [
> 3869.25375
>   ]
> }{
>   "netperf.Throughput_Mbps": [
> 3952.58875
>   ]
> }{
>   "netperf.Throughput_Mbps": [
> 3936.89625
>   ]
> }{
>   "netperf.Throughput_Mbps": [
> 3936.63625
>   ]
> }
> 
> Feel free to let me know if you need any more information or you want me
> to do more tests on other commits/machines, thanks.
> 
> Regards,
> Aaron
> 
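As a worked check of the numbers quoted above, the following sketch averages the per-run throughput values listed for commit 96b585267f55 and for the parent commit 826d253d57b1, then computes the relative change. The run values are copied from the message; the helper names `mean`, `percent_change`, and `regression_percent` are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Throughput in Mbps, copied from the netperf.json output quoted above. */
static const double head_runs[] = {
	2470.69748, 2486.7675, 2478.945, 2429.465, 2476.91504,
};
static const double parent_runs[] = {
	3869.25375, 3952.58875, 3936.89625, 3936.63625,
};

static double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative change of 'now' against 'base', in percent. */
static double percent_change(double base, double now)
{
	return (now - base) / base * 100.0;
}

/* Change of the HEAD average relative to the parent-commit average. */
static double regression_percent(void)
{
	double parent = mean(parent_runs,
			     sizeof(parent_runs) / sizeof(parent_runs[0]));
	double head = mean(head_runs,
			   sizeof(head_runs) / sizeof(head_runs[0]));
	double pct = percent_change(parent, head);

	/* Sanity check: the drop lands close to the reported figure. */
	assert(pct < -37.0 && pct > -37.2);
	return pct;
}
```

With these inputs the parent average is 3923.84375 Mbps, matching the figure in the message, and the drop comes out at roughly 37%, in line with the regression named in the subject.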



Re: [PATCH v2] Bluetooth: Add LED triggers for HCI frames tx and rx

2016-08-15 Thread Guodong Xu
Ping. :)  Would you give me some review opinions on this?

In this revision, I chose to limit LED blinking to tx/rx traffic
packets only. Other types of over-the-air packets, such as scanning,
are not covered.

Thank you.

-Guodong


On 31 July 2016 at 12:24, Guodong Xu  wrote:
> Two LED triggers are added into hci_dev: tx_led and rx_led. Upon ACL/SCO
> packets available in tx or rx, the LEDs will blink.
>
> For each hci registration, two triggers are added into LED subsystem:
> [hdev->name]-tx and [hdev->name]-rx.
> Refer to Documentation/leds/leds-class.txt for usage.
>
> Verified on HiKey 96boards, which uses HiSilicon hi6220 SoC and TI
> WL1835 WiFi/BT combo chip.
>
> Signed-off-by: Guodong Xu 
> ---
>  include/net/bluetooth/hci_core.h |  1 +
>  net/bluetooth/hci_core.c |  6 ++
>  net/bluetooth/leds.c | 17 +
>  net/bluetooth/leds.h |  2 ++
>  4 files changed, 26 insertions(+)
>
> diff --git a/include/net/bluetooth/hci_core.h 
> b/include/net/bluetooth/hci_core.h
> index dc71473..37b8dd9 100644
> --- a/include/net/bluetooth/hci_core.h
> +++ b/include/net/bluetooth/hci_core.h
> @@ -398,6 +398,7 @@ struct hci_dev {
> bdaddr_trpa;
>
> struct led_trigger  *power_led;
> +   struct led_trigger  *tx_led, *rx_led;
>
> int (*open)(struct hci_dev *hdev);
> int (*close)(struct hci_dev *hdev);
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index 45a9fc6..956dce1 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -3833,6 +3833,7 @@ static void hci_sched_acl(struct hci_dev *hdev)
> if (!hci_conn_num(hdev, AMP_LINK) && hdev->dev_type == HCI_AMP)
> return;
>
> +   hci_leds_blink_oneshot(hdev->tx_led);
> switch (hdev->flow_ctl_mode) {
> case HCI_FLOW_CTL_MODE_PACKET_BASED:
> hci_sched_acl_pkt(hdev);
> @@ -3856,6 +3857,7 @@ static void hci_sched_sco(struct hci_dev *hdev)
> if (!hci_conn_num(hdev, SCO_LINK))
> return;
>
> +   hci_leds_blink_oneshot(hdev->tx_led);
> while (hdev->sco_cnt && (conn = hci_low_sent(hdev, SCO_LINK, 
> &quote))) {
> while (quote-- && (skb = skb_dequeue(&conn->data_q))) {
> BT_DBG("skb %p len %d", skb, skb->len);
> @@ -3879,6 +3881,7 @@ static void hci_sched_esco(struct hci_dev *hdev)
> if (!hci_conn_num(hdev, ESCO_LINK))
> return;
>
> +   hci_leds_blink_oneshot(hdev->tx_led);
> while (hdev->sco_cnt && (conn = hci_low_sent(hdev, ESCO_LINK,
>  &quote))) {
> while (quote-- && (skb = skb_dequeue(&conn->data_q))) {
> @@ -3911,6 +3914,7 @@ static void hci_sched_le(struct hci_dev *hdev)
> hci_link_tx_to(hdev, LE_LINK);
> }
>
> +   hci_leds_blink_oneshot(hdev->tx_led);
> cnt = hdev->le_pkts ? hdev->le_cnt : hdev->acl_cnt;
> tmp = cnt;
> while (cnt && (chan = hci_chan_sent(hdev, LE_LINK, &quote))) {
> @@ -3990,6 +3994,7 @@ static void hci_acldata_packet(struct hci_dev *hdev, 
> struct sk_buff *skb)
>
> if (conn) {
> hci_conn_enter_active_mode(conn, BT_POWER_FORCE_ACTIVE_OFF);
> +   hci_leds_blink_oneshot(hdev->rx_led);
>
> /* Send to upper protocol */
> l2cap_recv_acldata(conn, skb, flags);
> @@ -4022,6 +4027,7 @@ static void hci_scodata_packet(struct hci_dev *hdev, 
> struct sk_buff *skb)
> hci_dev_unlock(hdev);
>
> if (conn) {
> +   hci_leds_blink_oneshot(hdev->rx_led);
> /* Send to upper protocol */
> sco_recv_scodata(conn, skb);
> return;
> diff --git a/net/bluetooth/leds.c b/net/bluetooth/leds.c
> index 8319c84..ae10c5d 100644
> --- a/net/bluetooth/leds.c
> +++ b/net/bluetooth/leds.c
> @@ -19,6 +19,8 @@ struct hci_basic_led_trigger {
>  #define to_hci_basic_led_trigger(arg) container_of(arg, \
> struct hci_basic_led_trigger, led_trigger)
>
> +#define BLUETOOTH_BLINK_DELAY  50 /* ms */
> +
>  void hci_leds_update_powered(struct hci_dev *hdev, bool enabled)
>  {
> if (hdev->power_led)
> @@ -37,6 +39,17 @@ static void power_activate(struct led_classdev *led_cdev)
> led_trigger_event(led_cdev->trigger, powered ? LED_FULL : LED_OFF);
>  }
>
> +void hci_leds_blink_oneshot(struct led_trigger *trig)
> +{
> +   unsigned long led_delay = BLUETOOTH_BLINK_DELAY;
> +
> +   if (!trig)
> +   return;
> +
> +   BT_DBG("led_trig %p", trig);
> +   led_trigger_blink_oneshot(trig, &led_delay, &led_delay, 0);
> +}
> +
>  static struct led_trigger *led_allocate_basic(struct hci_dev *hdev,
> void (*activate)(struct led_classdev *led_cdev),
> const char *name)
> @@ -71,4 +84,8 @@ void hci_leds_init(struct 

Re: [PATCH 1/4] net: hix5hd2_gmac: add tx scatter-gather feature

2016-08-15 Thread Dongpo Li


On 2016/8/16 0:18, Rob Herring wrote:
> On Mon, Aug 15, 2016 at 1:50 AM, Dongpo Li  wrote:
>> Hi Rob,
>> Many thanks for your review.
>>
>> On 2016/8/13 2:43, Rob Herring wrote:
>>> On Thu, Aug 11, 2016 at 05:01:52PM +0800, Dongpo Li wrote:
>>>> From: Li Dongpo 
>>>>
>>>> The "hix5hd2" is SoC name, add the generic ethernet driver name.
>>>> The "hisi-gemac-v1" is the basic version and "hisi-gemac-v2" adds
>>>> the SG/TXCSUM/TSO/UFO features.
>>>> This patch only adds the SG(scatter-gather) driver for transmitting,
>>>> the drivers of other features will be submitted later.
>>>
>>> The compatible string changes should probably be a separate patch.
>>>
>> ok, I will split this patch into two patches, one for compatible string 
>> changes,
>> and one for driver feature implementation.
>>
>>>> Signed-off-by: Dongpo Li 
>>>> ---
>>>>  .../bindings/net/hisilicon-hix5hd2-gmac.txt|   9 +-
>>>>  drivers/net/ethernet/hisilicon/hix5hd2_gmac.c  | 213 
>>>> +++--
>>>>  2 files changed, 205 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git 
>>>> a/Documentation/devicetree/bindings/net/hisilicon-hix5hd2-gmac.txt 
>>>> b/Documentation/devicetree/bindings/net/hisilicon-hix5hd2-gmac.txt
>>>> index 75d398b..3c02fac 100644
>>>> --- a/Documentation/devicetree/bindings/net/hisilicon-hix5hd2-gmac.txt
>>>> +++ b/Documentation/devicetree/bindings/net/hisilicon-hix5hd2-gmac.txt
>>>> @@ -1,7 +1,12 @@
>>>>  Hisilicon hix5hd2 gmac controller
>>>>
>>>>  Required properties:
>>>> -- compatible: should be "hisilicon,hix5hd2-gmac".
>>>> +- compatible: should contain one of the following version strings:
>>>> +* "hisilicon,hisi-gemac-v1"
>>>> +* "hisilicon,hisi-gemac-v2"
>>>> +and one of the following SoC string:
>>>> +* "hisilicon,hix5hd2-gemac"
>>>> +* "hisilicon,hi3798cv200-gemac"
>>>
>>> Make it clear what the order should be.
>>>
>> ok, I will put the SoC strings in alphabetical order.
> 
> No, I mean the most specific string comes first.
> 
ok, I will fix it in next patch version. Thank you.

> Rob
> 
> .
> 


Regards,
Dongpo

.



Re: [kbuild-all] make[2]: *** No rule to make target 'tools/testing/nvdimm//config_check.o', needed by 'tools/testing/nvdimm//dax.o'.

2016-08-15 Thread Yilong Ren
On Tue, Aug 16, 2016 at 10:30:14AM +0800, Fengguang Wu wrote:
> >How about adding a function "exact_enable_module()" to ensure "m" is used?
> >
> >diff --git a/lib/kconfig.sh b/lib/kconfig.sh
> >index 595dbfd..1502ce9 100644
> >--- a/lib/kconfig.sh
> >+++ b/lib/kconfig.sh
> >@@ -102,6 +102,7 @@ enable_testcase_config()
> >   do
> >   [[ $CONFIG =~ ^CONFIG_[A-Z0-9_]+=y$ ]] && enable_config 
> > ${CONFIG%=y}
> >+   [[ $CONFIG =~ ^CONFIG_[A-Z0-9_]+=m$ ]] && 
> >exact_enable_module ${CONFIG%=y}
> 
> s/y/m/
> 
> Otherwise looks good, thanks!

Got it, thanks. I will do the patch.

-- 
Thanks
Ren Yilong

> 
> >   [[ $CONFIG =~ ^CONFIG_[A-Z0-9_]+[A-Z0-9]$ ]] && enable_module 
> > $CONFIG
> >   [[ $CONFIG =~ ^(CONFIG_[A-Z0-9_]+)=([0-9]+)$ ]] && 
> > set_config_to_value ${BASH_REMATCH[1]} ${BASH_REMATCH[2]}
> >   done
> >}



Re: [kbuild-all] make[2]: *** No rule to make target 'tools/testing/nvdimm//config_check.o', needed by 'tools/testing/nvdimm//dax.o'.

2016-08-15 Thread Fengguang Wu

How about adding a function "exact_enable_module()" to ensure "m" is used?

diff --git a/lib/kconfig.sh b/lib/kconfig.sh
index 595dbfd..1502ce9 100644
--- a/lib/kconfig.sh
+++ b/lib/kconfig.sh
@@ -102,6 +102,7 @@ enable_testcase_config()
   do
   [[ $CONFIG =~ ^CONFIG_[A-Z0-9_]+=y$ ]] && enable_config 
${CONFIG%=y}
+   [[ $CONFIG =~ ^CONFIG_[A-Z0-9_]+=m$ ]] && exact_enable_module 
${CONFIG%=y}


s/y/m/

Otherwise looks good, thanks!


   [[ $CONFIG =~ ^CONFIG_[A-Z0-9_]+[A-Z0-9]$ ]] && enable_module 
$CONFIG
   [[ $CONFIG =~ ^(CONFIG_[A-Z0-9_]+)=([0-9]+)$ ]] && 
set_config_to_value ${BASH_REMATCH[1]} ${BASH_REMATCH[2]}
   done
}


RE: [PATCH v2 1/2] ACPI/tables: Correct the wrong count increasing

2016-08-15 Thread Zheng, Lv
Hi,

> From: linux-acpi-ow...@vger.kernel.org 
> [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Baoquan
> He
> Subject: [PATCH v2 1/2] ACPI/tables: Correct the wrong count increasing
> 
> The current code always increases the count in the 1st element of
> array proc[].
> 
> Signed-off-by: Baoquan He 
> Cc: Rafael J. Wysocki 
> Cc: Len Brown 
> Cc: linux-a...@vger.kernel.org
> ---
> 
> v1->v2:
> V1 is a wrong post because I didn't update the tested code to my
> local laptop. Repost with a correct v2.
> 
>  drivers/acpi/tables.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
> index 9f0ad6e..34d45bb 100644
> --- a/drivers/acpi/tables.c
> +++ b/drivers/acpi/tables.c
> @@ -281,7 +281,7 @@ acpi_parse_entries_array(char *id, unsigned long 
> table_size,
>proc[i].handler(entry, table_end))
>   return -EINVAL;
> 
> - proc->count++;
> + proc[i].count++;

Do we have code using acpi_subtable_proc.count?
I think the answer is yes because of:
[Patch] x86, ACPI: Fix the wrong assignment when Handle apic/x2apic entries

So why don't you put these 2 patches together into a single series?
And please help validate whether other acpi_subtable_proc.count users 
have problems.

Thanks
Lv

>   break;
>   }
>   if (i != proc_num)
> --
> 2.5.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
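To make the one-line fix concrete, here is a small user-space sketch of the same loop shape: each entry is matched against an array of handlers, and the matching handler's counter is bumped. With `proc->count++` every match is credited to element 0; with `proc[i].count++` it goes to the handler that matched. `struct toy_proc` and `toy_count_entries` are hypothetical stand-ins for illustration, not the kernel's types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-subtable handler descriptor. */
struct toy_proc {
	int id;     /* entry type this handler accepts */
	int count;  /* number of entries it has handled */
};

/*
 * Credit each entry to the handler whose id matches it.
 * With buggy != 0 the pre-fix behaviour is reproduced: the count of the
 * first array element is incremented regardless of which one matched.
 */
static void toy_count_entries(struct toy_proc *proc, size_t proc_num,
			      const int *entry_ids, size_t n, int buggy)
{
	size_t e, i;

	for (e = 0; e < n; e++) {
		for (i = 0; i < proc_num; i++) {
			if (proc[i].id != entry_ids[e])
				continue;
			if (buggy)
				proc->count++;     /* always proc[0] */
			else
				proc[i].count++;   /* the matching handler */
			break;
		}
	}
}

/* Small self-check contrasting the two behaviours. */
static void toy_count_example(void)
{
	struct toy_proc fixed[2] = { { 0, 0 }, { 1, 0 } };
	struct toy_proc buggy[2] = { { 0, 0 }, { 1, 0 } };
	const int ids[3] = { 0, 1, 1 };

	toy_count_entries(fixed, 2, ids, 3, 0);
	toy_count_entries(buggy, 2, ids, 3, 1);
	assert(fixed[0].count == 1 && fixed[1].count == 2);
	assert(buggy[0].count == 3 && buggy[1].count == 0);
}
```

Any caller that reads the count of a handler other than the first one would see zero under the old behaviour, which is the situation the review question above is probing.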


Re: [BUGFIX PATCH 1/2] brcmfmac: Check rtnl_lock is locked when removing interface

2016-08-15 Thread Masami Hiramatsu
On Mon, 15 Aug 2016 23:44:05 +0200
Arend Van Spriel  wrote:

> 
> 
> On 15-8-2016 13:52, Rafał Miłecki wrote:
> > On 15 August 2016 at 12:57, Kalle Valo  wrote:
> >> Rafał Miłecki  writes:
> >>
> >>>> Signed-off-by: Masami Hiramatsu 
> >>>
> >>> Fixes: a63b09872c1d ("brcmfmac: delete interface directly in code that 
> >>> sent fw request")
> >>> Acked-by: Rafał Miłecki 
> >>>
> >>> Kalle: I'm acking this as bugfix for 4.8 release.
> >>
> >> Ok. I'll wait few days for more comments before I apply this.

Thanks!

> > 
> > Sure.
> > 
> > 
> >> (I assume you are talking only about patch 1)
> > 
> > Yes, I'll leave mutex vs. spinlock to the experts :)
> 
> Don't know who the experts are. Surely not me :-p
> 
> I made an uneducated design decision using a mutex for this. The
> reasoning for using a regular spinlock make sense. So I will go and ack
> that patch.

As far as I can see, that change is very local, and
at least in my environment it works well :)

Regards,

> 
> Regards,
> Arend


-- 
Masami Hiramatsu 


Re: [kbuild-all] make[2]: *** No rule to make target 'tools/testing/nvdimm//config_check.o', needed by 'tools/testing/nvdimm//dax.o'.

2016-08-15 Thread Yilong Ren
On Tue, Aug 16, 2016 at 09:58:43AM +0800, Fengguang Wu wrote:
> On Tue, Aug 16, 2016 at 09:47:52AM +0800, Yilong Ren wrote:
> >On Tue, Aug 16, 2016 at 09:41:02AM +0800, Fengguang Wu wrote:
> >>On Mon, Aug 15, 2016 at 06:30:48PM -0700, Dan Williams wrote:
> >>>On Mon, Aug 15, 2016 at 6:26 PM, Fengguang Wu  
> >>>wrote:
> On Mon, Aug 15, 2016 at 05:58:36PM -0700, Dan Williams wrote:
> >
> >On Mon, Aug 15, 2016 at 3:03 AM, kbuild test robot
> > wrote:
> >>
> >>tree:
> >>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> >>master
> >>head:   694d0d0bb2030d2e36df73e2d23d5770511dbc8d
> >>commit: ab68f26221366f92611650e8470e6a926801c7d4 /dev/dax, pmem: direct
> >>access to persistent memory
> >>date:   3 months ago
> >>config: i386-randconfig-i1-201633 (attached as .config)
> >>compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
> >>reproduce:
> >>git checkout ab68f26221366f92611650e8470e6a926801c7d4
> >># save the attached .config to linux build tree
> >>make ARCH=i386
> >>
> >>All errors (new ones prefixed by >>):
> >>
> make[2]: *** No rule to make target
> 'tools/testing/nvdimm//config_check.o', needed by
> 'tools/testing/nvdimm//dax.o'.
> >>
> >>   make[2]: Target '__build' not remade because of errors.
> >
> >
> >I think this is an invalid build test.  tools/testing/nvdimm/ uses a
> >external module Kbuild environment, not Kconfig.  So, there's nothing
> >I can do to prevent this compile error, unless there's some other way
> >0-day could determine the configuration dependencies?
> 
> 
> Yeah if you can offer a concrete rule for the dependency, we'll add
> it to 0-day.
> >>>
> >>>Sounds good.  The config_check.c file itself lists the dependencies:
> >>>
> >>>void check(void)
> >>>{
> >>>   /*
> >>>* These kconfig symbols must be set to "m" for nfit_test to
> >>
> >>If "y" is not a valid option, we'll need to adjust 0-day's dependency
> >>specification for ndctl test:
> >>
> >>wfg /c/lkp-tests% cat include/ndctl
> >>need_kconfig:
> >>- CONFIG_HAVE_DMA_CONTIGUOUS=y
> >>- CONFIG_CMA=y
> >>- CONFIG_DMA_CMA=y
> >>- CONFIG_CMA_SIZE_MBYTES=200
> >>- CONFIG_LIBNVDIMM
> >>- CONFIG_BLK_DEV_PMEM
> >>- CONFIG_ND_BLK
> >>- CONFIG_BTT=y
> >>- CONFIG_NVDIMM_PFN=y
> >>- CONFIG_NVDIMM_DAX=y
> >>- CONFIG_ZONE_DEVICE=y
> >>
> >>In the above list, a bare "CONFIG_BLK_DEV_PMEM" means "y" or "m" are
> >>both acceptable.
> >
> >Yes, this is because enable_module() accepts both "y" and "m".
> >How about forcing enable_module() to accept only "m"?
> 
> I think we could change
> 
>- CONFIG_BLK_DEV_PMEM
> to
>- CONFIG_BLK_DEV_PMEM=m
> 
> The former will correspond to kernel's
> 
>#define IS_ENABLED(option) __or(IS_BUILTIN(option), IS_MODULE(option))
> 
> while the latter corresponds to
> 
>#define IS_MODULE(option) config_enabled(option##_MODULE)
> 
> And add logic to handle the =m case. Currently we only have
> enable_module() which corresponds to kernel's IS_ENABLED().

How about adding a function "exact_enable_module()" to ensure "m" is used?

diff --git a/lib/kconfig.sh b/lib/kconfig.sh
index 595dbfd..1502ce9 100644
--- a/lib/kconfig.sh
+++ b/lib/kconfig.sh
@@ -102,6 +102,7 @@ enable_testcase_config()
do
[[ $CONFIG =~ ^CONFIG_[A-Z0-9_]+=y$ ]] && enable_config 
${CONFIG%=y}
+   [[ $CONFIG =~ ^CONFIG_[A-Z0-9_]+=m$ ]] && exact_enable_module 
${CONFIG%=y}
[[ $CONFIG =~ ^CONFIG_[A-Z0-9_]+[A-Z0-9]$ ]] && enable_module 
$CONFIG
[[ $CONFIG =~ ^(CONFIG_[A-Z0-9_]+)=([0-9]+)$ ]] && 
set_config_to_value ${BASH_REMATCH[1]} ${BASH_REMATCH[2]}
done
 }

-- 
Thanks
Ren Yilong

> 
> ># CONFIG_XXX=m => unchanged
> ># CONFIG_XXX=y => unchanged
> ># CONFIG_XXX is not set => CONFIG_XXX=m
> >enable_module()
> 
> The behavior here is good for its current callers, except the "module"
> in the function name might be a bit misleading.
> 
> Thanks,
> Fengguang


RE: [PATCH V8 2/8] ACPI: Add new IORT functions to support MSI domain handling

2016-08-15 Thread Zheng, Lv
Hi,

> From: linux-acpi-ow...@vger.kernel.org 
> [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Lorenzo
> Pieralisi
> Subject: Re: [PATCH V8 2/8] ACPI: Add new IORT functions to support MSI 
> domain handling
> 
> On Thu, Aug 11, 2016 at 12:06:32PM +0200, Tomasz Nowicki wrote:
> 
> [...]
> 
> > +/**
> > + * iort_register_domain_token() - register domain token and related ITS ID
> > + * to the list from where we can get it back later on.
> > + * @trans_id: ITS ID.
> > + * @fw_node: Domain token.
> > + *
> > + * Returns: 0 on success, -ENOMEM if no memory when allocating list element
> > + */
> > +int iort_register_domain_token(int trans_id, struct fwnode_handle *fw_node)
> > +{
> > +   struct iort_its_msi_chip *its_msi_chip;
> > +
> > +   its_msi_chip = kzalloc(sizeof(*its_msi_chip), GFP_KERNEL);
> 
> I spotted this while reworking my ARM SMMU series, this may sleep
> and that's no good given that we call it within the acpi_probe_lock.
> 
> Same goes for irq_domain_alloc_fwnode() (that we call in
> gic_v2_acpi_init()), we have got to fix this usage, I will see with
> Marc what's the best way to do it.

If we can ensure that all table device probe entries are created during the link 
stage or an early stage, I think you can safely unlock the probe lock before invoking 
acpi_table_parse() in __acpi_probe_device_table().

Thanks
Lv

> 
> Lorenzo
> 
> > +   if (!its_msi_chip)
> > +   return -ENOMEM;
> > +
> > +   its_msi_chip->fw_node = fw_node;
> > +   its_msi_chip->translation_id = trans_id;
> > +
> > +   spin_lock(&iort_msi_chip_lock);
> > +   list_add(&its_msi_chip->list, &iort_msi_chip_list);
> > +   spin_unlock(&iort_msi_chip_lock);
> > +
> > +   return 0;
> > +}
> > +
> > +/**
> > + * iort_deregister_domain_token() - Deregister domain token based on ITS ID
> > + * @trans_id: ITS ID.
> > + *
> > + * Returns: none.
> > + */
> > +void iort_deregister_domain_token(int trans_id)
> > +{
> > +   struct iort_its_msi_chip *its_msi_chip, *t;
> > +
> > +   spin_lock(&iort_msi_chip_lock);
> > +   list_for_each_entry_safe(its_msi_chip, t, &iort_msi_chip_list, list) {
> > +   if (its_msi_chip->translation_id == trans_id) {
> > +   list_del(&its_msi_chip->list);
> > +   kfree(its_msi_chip);
> > +   break;
> > +   }
> > +   }
> > +   spin_unlock(&iort_msi_chip_lock);
> > +}
> > +
> > +/**
> > + * iort_find_domain_token() - Find domain token based on given ITS ID
> > + * @trans_id: ITS ID.
> > + *
> > + * Returns: domain token when find on the list, NULL otherwise
> > + */
> > +struct fwnode_handle *iort_find_domain_token(int trans_id)
> > +{
> > +   struct fwnode_handle *fw_node = NULL;
> > +   struct iort_its_msi_chip *its_msi_chip;
> > +
> > +   spin_lock(&iort_msi_chip_lock);
> > +   list_for_each_entry(its_msi_chip, &iort_msi_chip_list, list) {
> > +   if (its_msi_chip->translation_id == trans_id) {
> > +   fw_node = its_msi_chip->fw_node;
> > +   break;
> > +   }
> > +   }
> > +   spin_unlock(&iort_msi_chip_lock);
> > +
> > +   return fw_node;
> > +}
> > +
> >  static struct acpi_iort_node *
> >  iort_scan_node(enum acpi_iort_node_type type,
> >iort_find_node_callback callback, void *context)
> > @@ -206,6 +285,96 @@ iort_find_dev_node(struct device *dev)
> >   iort_match_node_callback, >dev);
> >  }
> >
> > +/**
> > + * iort_msi_map_rid() - Map a MSI requester ID for a device
> > + * @dev: The device for which the mapping is to be done.
> > + * @req_id: The device requester ID.
> > + *
> > + * Returns: mapped MSI RID on success, input requester ID otherwise
> > + */
> > +u32 iort_msi_map_rid(struct device *dev, u32 req_id)
> > +{
> > +   struct acpi_iort_node *node;
> > +   u32 dev_id;
> > +
> > +   if (!iort_table)
> > +   return req_id;
> > +
> > +   node = iort_find_dev_node(dev);
> > +   if (!node) {
> > +   dev_err(dev, "can't find related IORT node\n");
> > +   return req_id;
> > +   }
> > +
> > +   iort_node_map_rid(node, req_id, &dev_id, ACPI_IORT_NODE_ITS_GROUP);
> > +   return dev_id;
> > +}
> > +
> > +/**
> > + * iort_dev_find_its_id() - Find the ITS identifier for a device
> > + * @dev: The device.
> > + * @idx: Index of the ITS identifier list.
> > + * @its_id: ITS identifier.
> > + *
> > + * Returns: 0 on success, appropriate error value otherwise
> > + */
> > +static int
> > +iort_dev_find_its_id(struct device *dev, u32 req_id, unsigned int idx,
> > +int *its_id)
> > +{
> > +   struct acpi_iort_its_group *its;
> > +   struct acpi_iort_node *node;
> > +
> > +   node = iort_find_dev_node(dev);
> > +   if (!node) {
> > +   dev_err(dev, "can't find related IORT node\n");
> > +   return -ENXIO;
> > +   }
> > +
> > +   node = iort_node_map_rid(node, req_id, NULL, ACPI_IORT_NODE_ITS_GROUP);
> > +   if (!node) {
> > +   dev_err(dev, "can't find related ITS node\n");
> > +   return -ENXIO;
> > +   }
> > +
> > +   /* Move to ITS specific data 

Re: [PATCH] time,virt: resync steal time when guest & host lose sync

2016-08-15 Thread Rik van Riel
On Tue, 2016-08-16 at 09:31 +0800, Wanpeng Li wrote:
> 2016-08-15 23:00 GMT+08:00 Rik van Riel :
> > On Mon, 2016-08-15 at 16:53 +0800, Wanpeng Li wrote:
> > > 2016-08-12 23:58 GMT+08:00 Rik van Riel :
> > > [...]
> > > > Wanpeng, does the patch below work for you?
> > > 
> > > It will break steal time for full dynticks guest, and there is a
> > > calltrace of thread_group_cputime_adjusted call stack, RIP is
> > > cputime_adjust+0xff/0x130.
> > 
> > How?  This patch is equivalent to passing ULONG_MAX to
> > steal_account_process_time, which you tried to no ill
> > effect before.
> 
> https://lkml.org/lkml/2016/6/8/404/ Paolo original suggested to add
> the max cputime limit to the vtime, when the cpu is running in nohz
> full mode and stop the tick, jiffies will be updated depends on clock
> source instead of clock event device in
> guest(tick_nohz_update_jiffies() callsite, ktime_get()), so it will
> not be affected by lost clock ticks, my patch keeps the limit for
> vtime and remove the limit to non-vtime. However, your patch removes
> the limit for both scenarios and results in the below calltrace for
> vtime.

I understand what it does.

What I would like to understand is WHY enforcing the limit
is the right thing when using vtime, and the wrong thing
in all other scenarios.

Can you explain why you change the limit to ULONG_MAX in
three call sites, but not in the last one?

What is different about the first three, versus the last
one?

Are you sure it should be changed in three places, and
not in eg. two?

This seems like something we should try to understand,
rather than patch up haphazardly.

The changelog of your patch could use an explanation of
why the change is the correct way to go.

> > 
> > Do you have the full call trace?

OK, so you are seeing a divide by zero in cputime_adjust.

Specifically, this would be scale_stime getting passed
a (utime + stime) that adds up to 0. Stranger still, that
only gets called if neither utime nor stime is 0, meaning
that one of utime or stime is negative, at the exact same
magnitude as the other.
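
The control flow described above can be sketched in user space. This is a simplified model and not the kernel code: `model_cputime_adjust()` and `struct adjusted` are hypothetical names, the types are plain `uint64_t`, and the real function's other checks are left out. It only illustrates that, once both inputs are known to be non-zero, the sum can be zero only if one value is the wrapped negation of the other.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not a kernel structure. */
struct adjusted {
    uint64_t utime;
    uint64_t stime;
    bool would_divide_by_zero;
};

/*
 * Simplified model of the path discussed above: the scaling step is
 * only reached when neither utime nor stime is zero.
 */
struct adjusted model_cputime_adjust(uint64_t utime, uint64_t stime,
                                     uint64_t rtime)
{
    struct adjusted out = { 0, 0, false };
    uint64_t total;

    if (stime == 0) {           /* all runtime is attributed to utime */
        out.utime = rtime;
        return out;
    }
    if (utime == 0) {           /* all runtime is attributed to stime */
        out.stime = rtime;
        return out;
    }

    total = utime + stime;      /* unsigned addition wraps modulo 2^64 */
    if (total == 0) {
        /* Here the real code would divide by zero. */
        out.would_divide_by_zero = true;
        return out;
    }

    /* Multiplication overflow is ignored in this sketch. */
    out.stime = stime * rtime / total;
    out.utime = rtime - out.stime;
    return out;
}
```

For example, `model_cputime_adjust((uint64_t)-5, 5, 8)` reaches the flagged branch, while ordinary inputs such as `(3, 1, 8)` are scaled normally.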

Looking at thread_group_cputime(), I see some room for
rounding errors.

        do {
                seq = nextseq;
                flags = read_seqbegin_or_lock_irqsave(&sig->stats_lock, &seq);
                times->utime = sig->utime;
                times->stime = sig->stime;
                times->sum_exec_runtime = sig->sum_sched_runtime;

                for_each_thread(tsk, t) {
                        task_cputime(t, &utime, &stime);
                        times->utime += utime;
                        times->stime += stime;
                        times->sum_exec_runtime += task_sched_runtime(t);
                }
                /* If lockless access failed, take the lock. */
                nextseq = 1;
        } while (need_seqretry(&sig->stats_lock, seq));

Specifically, task_cputime calls vtime_delta, which works
off jiffies, while task_sched_runtime works straight off
the sched clock.

I can see how this would lead to a non-zero ->sum_exec_runtime,
while ->utime and/or ->stime are still at zero. This is fine.

What I do not see is how ->utime or ->stime could end up negative,
which is what would be needed to hit that divide by zero.

Unless I am overlooking something...

This would be a good thing to debug.

> [6.929856] divide error:  [#1] SMP
> [6.934217] Modules linked in:
> [6.937759] CPU: 3 PID: 57 Comm: kworker/u8:1 Not tainted 4.7.0+
> #36
> [6.946105] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS Bochs 01/01/2011
> [6.953951] Workqueue: events_unbound
> call_usermodehelper_exec_work
> [6.965726] task: 8e22b9785040 task.stack: 8e22b8b64000
> [6.970820] RIP: 0010:[]  []
> cputime_adjust+0xff/0x130
> [6.981841] RSP: :8e22b8b67b78  EFLAGS: 00010887
> [6.985946] RAX: a528afff5ad75000 RBX: 8e222e243c18 RCX:
> 8e22b8b67c28
> [7.001166] RDX:  RSI: 0296 RDI:
> 
> [7.008758] RBP: 8e22b8b67ba8 R08:  R09:
> a528b000
> [7.015653] R10:  R11:  R12:
> 0014a516
> [7.021376] R13: 8e22b8b67bb8 R14: 8e222e243c28 R15:
> 8e22b8b67c20
> [7.035498] FS:  () GS:8e22bac0()
> knlGS:
> [7.054809] CS:  0010 DS:  ES:  CR0: 80050033
> [7.066571] CR2:  CR3: 7ae06000 CR4:
> 001406e0
> [7.075162] Stack:
> [7.090141]  8e22b8b67c28 8e222e371ac0 8e22b8b67c20
> 8e22b8b67c28
> [7.108512]  8e222e371ac0 8e22b8b67cc0 8e22b8b67be8
> 870c8c01
> [7.123025]  000e0471 fffdcf32 0014a516
> 8e22b9785040
> [7.140622] Call Trace:
> [7.153076]  []
> thread_group_cputime_adjusted+0x41/0x50
> [7.160807]  [] wait_consider_task+0xa4f/0xff0
> [7.176449]  [] ? 

Re: [PATCH v6] x86/hpet: Reduce HPET counter read contention

2016-08-15 Thread Waiman Long

On 08/12/2016 10:30 PM, kbuild test robot wrote:

Hi Waiman,

[auto build test ERROR on tip/auto-latest]
[also build test ERROR on v4.8-rc1 next-20160812]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Waiman-Long/x86-hpet-Reduce-HPET-counter-read-contention/20160813-090247
config: x86_64-randconfig-s0-08131002 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
 # save the attached .config to linux build tree
 make ARCH=x86_64

All error/warnings (new ones prefixed by>>):


arch/x86/kernel/hpet.c:791: error: unknown field 'lock' specified in initializer
arch/x86/kernel/hpet.c:791: warning: missing braces around initializer

arch/x86/kernel/hpet.c:791: warning: (near initialization for
'hpet.<anonymous>.lock.val')

vim +/lock +791 arch/x86/kernel/hpet.c

785 u32 value;
786 };
787 u64 lockval;
788 };
789 
790 static union hpet_lock hpet __cacheline_aligned = {
  >  791 .lock = __ARCH_SPIN_LOCK_UNLOCKED,
792 };
793 
794 static cycle_t read_hpet(struct clocksource *cs)

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


The following additional patch should fix the build error. The error 
wasn't produced when I did my test build with the gcc 4.8.5 compiler. 
That was why I missed it.


Cheers,
Longman

--

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 71127fe..0822688 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -788,7 +788,7 @@ union hpet_lock {
 };

 static union hpet_lock hpet __cacheline_aligned = {
-   .lock = __ARCH_SPIN_LOCK_UNLOCKED,
+   { .lock = __ARCH_SPIN_LOCK_UNLOCKED, },
 };

 static cycle_t read_hpet(struct clocksource *cs)
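
The effect of the extra braces can be seen in a stand-alone sketch. `union demo_lock` below is a hypothetical stand-in for the union in the patch (a plain `int` replaces the arch spinlock type), so it only illustrates the initializer shape: the inner braces initialize the anonymous struct as the union's first member, and the designator then names a field inside that struct instead of reaching through the anonymous member from the outside, which is what the gcc 4.4 build in the report rejected.

```c
#include <stdint.h>

/* Hypothetical stand-in for the union in the patch; field types are simplified. */
union demo_lock {
    struct {
        int      lock;   /* stands in for the arch spinlock */
        uint32_t value;
    };
    uint64_t lockval;
};

/*
 * Inner braces: initialize the anonymous struct (the first union member),
 * then designate .lock inside it. Unnamed fields are zero-initialized.
 */
static union demo_lock demo = {
    { .lock = 1, },
};

int demo_lock_field(void)
{
    return demo.lock;
}

uint32_t demo_value_field(void)
{
    return demo.value;
}
```

Newer compilers accept both spellings, which would be consistent with the error not showing up in a gcc 4.8.5 build.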




Re: [PATCH v3] sched/deadline: Fix the intention to re-evalute tick dependency for offline cpu

2016-08-15 Thread Wanpeng Li
Ping Juri, Frederic: could I get your Acked-by?
2016-08-12 17:24 GMT+08:00 Wanpeng Li :
> From: Wanpeng Li 
>
> The dl task will be replenished after the dl task timer fires and starts a
> new period. It will be enqueued, and its dependency on the tick will be
> re-evaluated in order to restart it. However, if the cpu is hot-unplugged,
> irq_work_queue will splat since the target cpu is offline.
>
> As a result:
>
> WARNING: CPU: 2 PID: 0 at kernel/irq_work.c:69 irq_work_queue_on+0xad/0xe0
> Call Trace:
>  dump_stack+0x99/0xd0
>  __warn+0xd1/0xf0
>  warn_slowpath_null+0x1d/0x20
>  irq_work_queue_on+0xad/0xe0
>  tick_nohz_full_kick_cpu+0x44/0x50
>  tick_nohz_dep_set_cpu+0x74/0xb0
>  enqueue_task_dl+0x226/0x480
>  activate_task+0x5c/0xa0
>  dl_task_timer+0x19b/0x2c0
>  ? push_dl_task.part.31+0x190/0x190
>
> This can be triggered by hot-unplugging the full dynticks cpu which the dl
> task is running on.
>
> We enqueue the dl task on the offline CPU because we need to do the
> replenish for start_dl_timer(). So, as Juri pointed out, what we need
> to do is call replenish_dl_entity() directly, instead of
> enqueue_task_dl(). pi_se shouldn't be a problem as the task shouldn't
> be boosted if it was throttled.
>
> This patch fixes it by just replenishing the dl entity, which avoids
> re-evaluating the tick dependency if the cpu is offline.
>
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Cc: Juri Lelli 
> Cc: Luca Abeni 
> Cc: Frederic Weisbecker 
> Signed-off-by: Wanpeng Li 
> ---
> v2 -> v3:
>  * move rq->online check under CONFIG_SMP
> v1 -> v2:
>  * replenish dl entity
>
>  kernel/sched/deadline.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index d091f4a..ce0fb00 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -641,6 +641,11 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer 
> *timer)
> goto unlock;
> }
>
> +#ifdef CONFIG_SMP
> +   if (unlikely(!rq->online))
> +   goto offline;
> +#endif
> +
> enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
> if (dl_task(rq->curr))
> check_preempt_curr_dl(rq, p, 0);
> @@ -648,6 +653,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer 
> *timer)
> resched_curr(rq);
>
>  #ifdef CONFIG_SMP
> +offline:
> /*
>  * Perform balancing operations here; after the replenishments.  We
>  * cannot drop rq->lock before this, otherwise the assertion in
> @@ -659,6 +665,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer 
> *timer)
>  * XXX figure out if select_task_rq_dl() deals with offline cpus.
>  */
> if (unlikely(!rq->online)) {
> +   replenish_dl_entity(dl_se, dl_se);
> lockdep_unpin_lock(&rq->lock, rf.cookie);
> rq = dl_task_offline_migration(rq, p);
> rf.cookie = lockdep_pin_lock(&rq->lock);
> --
> 1.9.1
>



-- 
Regards,
Wanpeng Li


Re: [kbuild-all] make[2]: *** No rule to make target 'tools/testing/nvdimm//config_check.o', needed by 'tools/testing/nvdimm//dax.o'.

2016-08-15 Thread Fengguang Wu

On Tue, Aug 16, 2016 at 09:47:52AM +0800, Yilong Ren wrote:

On Tue, Aug 16, 2016 at 09:41:02AM +0800, Fengguang Wu wrote:

On Mon, Aug 15, 2016 at 06:30:48PM -0700, Dan Williams wrote:
>On Mon, Aug 15, 2016 at 6:26 PM, Fengguang Wu  wrote:
>>On Mon, Aug 15, 2016 at 05:58:36PM -0700, Dan Williams wrote:
>>>
>>>On Mon, Aug 15, 2016 at 3:03 AM, kbuild test robot
>>> wrote:

tree:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   694d0d0bb2030d2e36df73e2d23d5770511dbc8d
commit: ab68f26221366f92611650e8470e6a926801c7d4 /dev/dax, pmem: direct
access to persistent memory
date:   3 months ago
config: i386-randconfig-i1-201633 (attached as .config)
compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
reproduce:
git checkout ab68f26221366f92611650e8470e6a926801c7d4
# save the attached .config to linux build tree
make ARCH=i386

All errors (new ones prefixed by >>):

>>make[2]: *** No rule to make target
>>'tools/testing/nvdimm//config_check.o', needed by
>>'tools/testing/nvdimm//dax.o'.

   make[2]: Target '__build' not remade because of errors.
>>>
>>>
>>>I think this is an invalid build test.  tools/testing/nvdimm/ uses a
>>>external module Kbuild environment, not Kconfig.  So, there's nothing
>>>I can do to prevent this compile error, unless there's some other way
>>>0-day could determine the configuration dependencies?
>>
>>
>>Yeah if you can offer a concrete rule for the dependency, we'll add
>>it to 0-day.
>
>Sounds good.  The config_check.c file itself lists the dependencies:
>
>void check(void)
>{
>   /*
>* These kconfig symbols must be set to "m" for nfit_test to

If "y" is not a valid option, we'll need to adjust 0-day's dependency
specification for ndctl test:

wfg /c/lkp-tests% cat include/ndctl
need_kconfig:
- CONFIG_HAVE_DMA_CONTIGUOUS=y
- CONFIG_CMA=y
- CONFIG_DMA_CMA=y
- CONFIG_CMA_SIZE_MBYTES=200
- CONFIG_LIBNVDIMM
- CONFIG_BLK_DEV_PMEM
- CONFIG_ND_BLK
- CONFIG_BTT=y
- CONFIG_NVDIMM_PFN=y
- CONFIG_NVDIMM_DAX=y
- CONFIG_ZONE_DEVICE=y

In the above list, a bare "CONFIG_BLK_DEV_PMEM" means "y" or "m" are
both acceptable.


Yes, this is because enable_module() accepts both "y" and "m".
How about forcing enable_module() to accept only "m"?


I think we could change 


   - CONFIG_BLK_DEV_PMEM
to
   - CONFIG_BLK_DEV_PMEM=m

The former corresponds to the kernel's

   #define IS_ENABLED(option) __or(IS_BUILTIN(option), IS_MODULE(option))

while the latter corresponds to

   #define IS_MODULE(option) config_enabled(option##_MODULE)

And add logic to handle the =m case. Currently we only have
enable_module() which corresponds to kernel's IS_ENABLED().
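
The difference between the two checks can be reproduced outside the kernel tree. The sketch below re-creates the preprocessor trick with hypothetical `DEMO_` names; it is not the kernel's kconfig.h, and it uses a plain `||` where the kernel uses `__or()`. The two `#define` lines near the end stand for what the generated config header would contain for `CONFIG_BTT=y` and `CONFIG_BLK_DEV_PMEM=m`.

```c
/* Hypothetical DEMO_ macros: evaluate to 1 if the symbol is defined to 1, else 0. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED_1(x)
#define DEMO_IS_DEFINED_1(val) DEMO_IS_DEFINED_2(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED_2(arg1_or_junk) DEMO_TAKE_SECOND(arg1_or_junk 1, 0)

/* "=y" defines CONFIG_FOO; "=m" defines CONFIG_FOO_MODULE instead. */
#define DEMO_IS_BUILTIN(option) DEMO_IS_DEFINED(option)
#define DEMO_IS_MODULE(option)  DEMO_IS_DEFINED(option##_MODULE)
#define DEMO_IS_ENABLED(option) \
    (DEMO_IS_BUILTIN(option) || DEMO_IS_MODULE(option))

/* Stand-in for the generated config: CONFIG_BTT=y, CONFIG_BLK_DEV_PMEM=m. */
#define CONFIG_BTT 1
#define CONFIG_BLK_DEV_PMEM_MODULE 1

/* A bare "CONFIG_BLK_DEV_PMEM" entry behaves like the _is_enabled check. */
int pmem_is_enabled(void) { return DEMO_IS_ENABLED(CONFIG_BLK_DEV_PMEM); }
/* A "CONFIG_BLK_DEV_PMEM=m" entry behaves like the _is_module check. */
int pmem_is_module(void)  { return DEMO_IS_MODULE(CONFIG_BLK_DEV_PMEM); }

int btt_is_enabled(void)  { return DEMO_IS_ENABLED(CONFIG_BTT); }
int btt_is_module(void)   { return DEMO_IS_MODULE(CONFIG_BTT); }
```

A built-in option passes the enabled check but fails the module check, which is the case the `=m` handling would need to catch.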


# CONFIG_XXX=m => unchange
# CONFIG_XXX=y => unchange
# CONFIG_XXX is not set => CONFIG_XXX=m
enable_module()


The behavior here is good for its current callers, except the "module"
in the function name might be a bit misleading.

Thanks,
Fengguang


Re: [PATCH v3 3/3] ARM64: dts: amlogic: meson-gxbb: Add watchdog node

2016-08-15 Thread Guenter Roeck

On 08/15/2016 01:40 PM, Kevin Hilman wrote:

Kevin Hilman  writes:


On Wed, Aug 3, 2016 at 5:27 PM, Kevin Hilman  wrote:

Hi Guenter,

Kevin Hilman  writes:


Guenter Roeck  writes:


On 07/10/2016 02:11 AM, Neil Armstrong wrote:

Signed-off-by: Neil Armstrong 


Reviewed-by: Guenter Roeck 

Would this go in through one of the arm trees ?


You can take the driver through the watchdog tree, but I'll take the DT
stuff through the amlogic tree (and submit via arm-soc).

However, with your ack, and since this is a brand new driver, I could
take the driver as well.  Just let me know your preference.


ping... do you have a preference here?



Oops, just noticed Wim already picked these up.



Strange, seems that Wim picked up patches 1 & 3, but this one was missed.  I'll
queue this for the arm-soc tree.



I got a number of requests from arm maintainers to not pick up devicetree
changes for arch/arm, so for my part I usually leave those alone.

Guenter



Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-15 Thread Linus Torvalds
On Mon, Aug 15, 2016 at 5:19 PM, Dave Chinner  wrote:
>
>> None of this code is all that new, which is annoying. This must have
>> gone on forever,
>
> Yes, it has been. Just worse than I've noticed before, probably
> because of all the stuff put under the tree lock in the past couple
> of years.

So this is where a good profile can matter.

Particularly if it's all about kswapd, and all the contention is just
from __remove_mapping(), what should matter is the "all the stuff"
added *there* and absolutely nowhere else.

Sadly (well, not for me), in my profiles I have

 --3.37%--kswapd
   |
--3.36%--shrink_node
  |
  |--2.88%--shrink_node_memcg
  |  |
  |   --2.87%--shrink_inactive_list
  | |
  | |--2.55%--shrink_page_list
  | |  |
  | |  |--0.84%--__remove_mapping
  | |  |  |
  | |  |  |--0.37%--__delete_from_page_cache
  | |  |  |  |
  | |  |  |   --0.21%--radix_tree_replace_clear_tags
  | |  |  | |
  | |  |  |  --0.12%--__radix_tree_lookup
  | |  |  |
  | |  |   --0.23%--_raw_spin_lock_irqsave
  | |  | |
  | |  |  --0.11%--queued_spin_lock_slowpath
  | |  |
   


which is rather different from your 22% spin-lock overhead.

Anyway, including the direct reclaim call paths gets
__remove_mapping() a bit higher, and _raw_spin_lock_irqsave climbs to
0.26%. But perhaps more importantly, looking at what __remove_mapping
actually *does* (apart from the spinlock) gives us:

 - inside remove_mapping itself (0.11% on its own - flat cost, no
child accounting)

48.50 │   lock   cmpxchg %edx,0x1c(%rbx)

so that's about 0.05%

 - 0.40% __delete_from_page_cache (0.22%
radix_tree_replace_clear_tags, 0.13% __radix_tree_lookup)

 - 0.06% workingset_eviction()

so I'm not actually seeing anything *new* expensive in there. The
__delete_from_page_cache() overhead may have changed a bit with the
tagged tree changes, but this doesn't look like memcg.

But we clearly have very different situations.

What does your profile show for when you actually dig into
__remove_mapping() itself? Looking at your flat profile, I'm assuming
you get

   1.31%  [kernel]  [k] __radix_tree_lookup
   1.22%  [kernel]  [k] radix_tree_tag_set
   1.14%  [kernel]  [k] __remove_mapping

which is higher (but part of why my percentages are lower is that I
have that "50% CPU used for encryption" on my machine).

But I'm not seeing anything I'd attribute to "all the stuff added".
For example, originally I would have blamed memcg, but that's not
actually in this path at all.

I come back to wondering whether maybe you're hitting some PV-lock problem.

I know queued_spin_lock_slowpath() is ok. I'm not entirely sure
__pv_queued_spin_lock_slowpath() is.

So I'd love to see you try the non-PV case, but I also think it might
be interesting to see what the instruction profile for
__pv_queued_spin_lock_slowpath() itself is. They share a lot of code
(there's some interesting #include games going on to make
queued_spin_lock_slowpath() actually *be*
__pv_queued_spin_lock_slowpath() with some magic hooks), but there
might be issues.

For example, if you run a virtual 16-core system on a physical machine
that then doesn't consistently give 16 cores to the virtual machine,
you'll get no end of hiccups.

Because as mentioned, we've had bugs ("performance anomalies") there before.

   Linus


Re: [kbuild-all] make[2]: *** No rule to make target 'tools/testing/nvdimm//config_check.o', needed by 'tools/testing/nvdimm//dax.o'.

2016-08-15 Thread Dan Williams
On Mon, Aug 15, 2016 at 6:41 PM, Fengguang Wu  wrote:
> On Mon, Aug 15, 2016 at 06:30:48PM -0700, Dan Williams wrote:
>>
>> On Mon, Aug 15, 2016 at 6:26 PM, Fengguang Wu 
>> wrote:
>>>
>>> On Mon, Aug 15, 2016 at 05:58:36PM -0700, Dan Williams wrote:


 On Mon, Aug 15, 2016 at 3:03 AM, kbuild test robot
  wrote:
>
>
> tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> master
> head:   694d0d0bb2030d2e36df73e2d23d5770511dbc8d
> commit: ab68f26221366f92611650e8470e6a926801c7d4 /dev/dax, pmem: direct
> access to persistent memory
> date:   3 months ago
> config: i386-randconfig-i1-201633 (attached as .config)
> compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
> reproduce:
> git checkout ab68f26221366f92611650e8470e6a926801c7d4
> # save the attached .config to linux build tree
> make ARCH=i386
>
> All errors (new ones prefixed by >>):
>
>>> make[2]: *** No rule to make target
>>> 'tools/testing/nvdimm//config_check.o', needed by
>>> 'tools/testing/nvdimm//dax.o'.
>
>
>make[2]: Target '__build' not remade because of errors.



 I think this is an invalid build test.  tools/testing/nvdimm/ uses a
 external module Kbuild environment, not Kconfig.  So, there's nothing
 I can do to prevent this compile error, unless there's some other way
 0-day could determine the configuration dependencies?
>>>
>>>
>>>
>>> Yeah if you can offer a concrete rule for the dependency, we'll add
>>> it to 0-day.
>>
>>
>> Sounds good.  The config_check.c file itself lists the dependencies:
>>
>> void check(void)
>> {
>>/*
>> * These kconfig symbols must be set to "m" for nfit_test to
>
>
> If "y" is not a valid option, we'll need to adjust 0-day's dependency
> specification for ndctl test:

Unfortunately, "y" is not valid because the unit tests use the
"--wrap=" linker feature to redirect calls to exported symbols to mock
versions provided by the nfit_test.ko module.  When the code is
compiled with "y" instead of "m" the linker will use the real symbol
rather than the mock/test version.

> wfg /c/lkp-tests% cat include/ndctl
> need_kconfig:
> - CONFIG_HAVE_DMA_CONTIGUOUS=y
> - CONFIG_CMA=y
> - CONFIG_DMA_CMA=y
> - CONFIG_CMA_SIZE_MBYTES=200

I recently, last week in the 4.8 merge window, removed the dependency
on CMA.  The ndctl README.md for version 54 notes the change.

> - CONFIG_LIBNVDIMM
> - CONFIG_BLK_DEV_PMEM
> - CONFIG_ND_BLK
> - CONFIG_BTT=y
> - CONFIG_NVDIMM_PFN=y
> - CONFIG_NVDIMM_DAX=y
> - CONFIG_ZONE_DEVICE=y
>
> In the above list, a bare "CONFIG_BLK_DEV_PMEM" means "y" or "m" are
> both acceptable.

...only "m".


Re: [kbuild-all] make[2]: *** No rule to make target 'tools/testing/nvdimm//config_check.o', needed by 'tools/testing/nvdimm//dax.o'.

2016-08-15 Thread Yilong Ren
On Tue, Aug 16, 2016 at 09:41:02AM +0800, Fengguang Wu wrote:
> On Mon, Aug 15, 2016 at 06:30:48PM -0700, Dan Williams wrote:
> >On Mon, Aug 15, 2016 at 6:26 PM, Fengguang Wu  wrote:
> >>On Mon, Aug 15, 2016 at 05:58:36PM -0700, Dan Williams wrote:
> >>>
> >>>On Mon, Aug 15, 2016 at 3:03 AM, kbuild test robot
> >>> wrote:
> 
> tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> head:   694d0d0bb2030d2e36df73e2d23d5770511dbc8d
> commit: ab68f26221366f92611650e8470e6a926801c7d4 /dev/dax, pmem: direct
> access to persistent memory
> date:   3 months ago
> config: i386-randconfig-i1-201633 (attached as .config)
> compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
> reproduce:
> git checkout ab68f26221366f92611650e8470e6a926801c7d4
> # save the attached .config to linux build tree
> make ARCH=i386
> 
> All errors (new ones prefixed by >>):
> 
> >>make[2]: *** No rule to make target
> >>'tools/testing/nvdimm//config_check.o', needed by
> >>'tools/testing/nvdimm//dax.o'.
> 
>    make[2]: Target '__build' not remade because of errors.
> >>>
> >>>
> >>>I think this is an invalid build test.  tools/testing/nvdimm/ uses a
> >>>external module Kbuild environment, not Kconfig.  So, there's nothing
> >>>I can do to prevent this compile error, unless there's some other way
> >>>0-day could determine the configuration dependencies?
> >>
> >>
> >>Yeah if you can offer a concrete rule for the dependency, we'll add
> >>it to 0-day.
> >
> >Sounds good.  The config_check.c file itself lists the dependencies:
> >
> >void check(void)
> >{
> >   /*
> >* These kconfig symbols must be set to "m" for nfit_test to
> 
> If "y" is not a valid option, we'll need to adjust 0-day's dependency
> specification for ndctl test:
> 
> wfg /c/lkp-tests% cat include/ndctl
> need_kconfig:
> - CONFIG_HAVE_DMA_CONTIGUOUS=y
> - CONFIG_CMA=y
> - CONFIG_DMA_CMA=y
> - CONFIG_CMA_SIZE_MBYTES=200
> - CONFIG_LIBNVDIMM
> - CONFIG_BLK_DEV_PMEM
> - CONFIG_ND_BLK
> - CONFIG_BTT=y
> - CONFIG_NVDIMM_PFN=y
> - CONFIG_NVDIMM_DAX=y
> - CONFIG_ZONE_DEVICE=y
> 
> In the above list, a bare "CONFIG_BLK_DEV_PMEM" means "y" or "m" are
> both acceptable.

Yes, this is because enable_module() accepts both "y" and "m".
How about forcing enable_module() to accept only "m"?

    # CONFIG_XXX=m => unchange
    # CONFIG_XXX=y => unchange
    # CONFIG_XXX is not set => CONFIG_XXX=m
    enable_module()
    {
        grep -q -F -e "$1=m" -e "$1=y" .config && return  <==

        if [ -x source/scripts/config ]; then
            source/scripts/config --file .config --module $1
        else
            /kbuild/src/linux/scripts/config --file .config --module $1
        fi
    }

-- 
Thanks
Ren Yilong

> 
> >* load and operate.
> >*/
> >   BUILD_BUG_ON(!IS_MODULE(CONFIG_LIBNVDIMM));
> >   BUILD_BUG_ON(!IS_MODULE(CONFIG_BLK_DEV_PMEM));
> >   BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_BTT));
> >   BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_PFN));
> >   BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_BLK));
> >   BUILD_BUG_ON(!IS_MODULE(CONFIG_ACPI_NFIT));
> >   BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX));
> >   BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX_PMEM));
> 
> Regards,
> Fengguang


Re: [lkp] [usb] ad05399d68: BUG: unable to handle kernel NULL pointer dereference at 0000000000000012

2016-08-15 Thread Peter Chen
On Mon, Aug 15, 2016 at 10:49:55PM +0800, Ye Xiaolong wrote:
> On 08/15, Peter Chen wrote:
> > 
> >>
> >>
> >>FYI, we noticed the following commit:
> >>
> >>https://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git testing/next 
> >>commit
> >>ad05399d68b6ae1649cdcfc82ce3ffea1a7c5104 ("usb: udc: core: fix error 
> >>handling")
> >>
> >
> >Hi Xiaolong,
> >
> >You reported it one month ago, and said it is a false report. see below.
> >Would you please double confirm it?
> 
> Hi, peter
> 
> Last time I reported stat "WARNING: CPU: 0 PID: 1 at
> lib/list_debug.c:36" and it showed both in this commit and its parent,
> this time, the observed change stat is "BUG: unable to handle kernel NULL
> pointer dereference at 0012" and it doesn't show in parent
> commit, however, the parent commit's dmesg would show kernel panic log
> as:
> 
> [   10.338487] Kernel panic - not syncing: Attempted to kill init! 
> exitcode=0x000b
> [   10.338487] 
> [   10.339911] CPU: 0 PID: 1 Comm: init Not tainted 4.8.0-rc1-00020-g0937a4d 
> #1
> [   10.341177] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> Debian-1.8.2-1 04/01/2014
> [   10.342798]   88001e53bc28 8168cf8a 
> 88001e534000
> [   10.345177]  8256ef20 88001e53bcb8 88001e50ca50 
> 88001e53bca8
> [   10.346739]  8114e062 8810 88001e53bcb8 
> 88001e53bc50
> [   10.347970] Call Trace:
> [   10.348690]  [] dump_stack+0x83/0xb9
> [   10.351592]  [] panic+0xf3/0x2a9
> [   10.352386]  [] do_exit+0x601/0xde0
> [   10.352879]  [] ? __sigqueue_free+0x43/0x50
> [   10.353511]  [] ? __dequeue_signal+0x1f7/0x210
> [   10.354483]  [] do_group_exit+0xa2/0x100
> [   10.355324]  [] get_signal+0x68e/0x740
> [   10.356155]  [] do_signal+0x23/0x670
> [   10.356983]  [] ? do_syslog+0x2c0/0x6a0
> [   10.357832]  [] ? bad_area_nosemaphore+0x33/0x40
> [   10.358825]  [] ? __do_page_fault+0x407/0x4d0
> [   10.359738]  [] exit_to_usermode_loop+0x69/0xc0
> [   10.360680]  [] prepare_exit_to_usermode+0x3d/0x70
> [   10.361725]  [] retint_user+0x8/0x10
> [   10.362650] Kernel Offset: disabled
> 
> The whole parent dmesg is attached.
> 

Then what's the conclusion? Is this one a real defect or not?

Peter

> Thanks,
> Xiaolong
> 
> >
> >On Wed, Jul 13, 2016 at 01:55:26AM +, Peter Chen wrote:
> >> 
> >>
> >>>-Original Message-
> >>>From: lkp-requ...@eclists.intel.com 
> >>>[mailto:lkp-requ...@eclists.intel.com] On Behalf Of kernel test robot
> >>>Sent: Wednesday, July 13, 2016 9:28 AM
> >>>To: Peter Chen 
> >>>Cc: 0day robot ; LKML 
> >>>; l...@01.org
> >>>Subject: [lkp] [usb] 9696ef14de: WARNING: CPU: 0 PID: 1 at 
> >>>lib/list_debug.c:36
> >>>__list_add+0x104/0x188
> >>>
> >>>
> >>>FYI, we noticed the following commit:
> >>>
> >>>https://github.com/0day-ci/linux Peter-Chen/usb-udc-core-fix-error-
> >>>handling/20160711-100832
> >>>commit 9696ef14ded07fb0847f8e1cdda6d98a89ecd4f2 ("usb: udc: core: fix 
> >>>error
> >>>handling")
> >>>
> >>
> >>Thanks,  but I really can't find the relationship between my patch and dump.
> >>Can you reproduce it after running again or without my patch?
> >>
> >
> >Sorry, it's a false report, the error dump also showed in parent commit, 
> >please ignore the report and sorry for the noise.
> >
> >Thanks,
> >Xiaolong
> >
> >
> >
> >
> >Peter
> >
> >>in testcase: boot
> >>
> >>on test machine: 1 threads qemu-system-x86_64 -enable-kvm -cpu SandyBridge
> >>with 512M memory
> >>
> >>caused below changes:
> >>
> >>
> >>+-----------------------------------------------------------+------------+------------+
> >>|                                                           | 0937a4d787 | ad05399d68 |
> >>+-----------------------------------------------------------+------------+------------+
> >>| boot_successes                                            | 0          | 0          |
> >>| boot_failures                                             | 12         | 12         |
> >>| WARNING:at_lib/list_debug.c:#__list_del_entry             | 2          | 12         |
> >>| BUG:kernel_test_hang                                      | 2          |            |
> >>| backtrace:kernel_restart                                  | 2          |            |
> >>| backtrace:SyS_reboot                                      | 2          |            |
> >>| BUG:kernel_oversize_in_test_stage                         | 4          |            |
> >>| Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode= | 6          |            |
> >>| BUG:unable_to_handle_kernel                               | 0          | 11         |
> >>| Oops                                                      | 0          | 11         |
> >>| RIP:sysfs_kf_write                                        | 0          | 11         |
> >>| 

Re: [PATCH] extcon: Introduce EXTCON_PROP_USB_SUPERSPEED property

2016-08-15 Thread Guenter Roeck
On Mon, Aug 15, 2016 at 5:55 PM, Chanwoo Choi  wrote:
> Hi Guenter,
>
> Looks good to me.
> I'll add the reference [1] to the patch description and apply it.
> [1] https://en.wikipedia.org/wiki/USB#Overview
>

Thanks!

Guenter

> Thanks,
> Chanwoo Choi
>
> On 2016년 08월 15일 22:15, Guenter Roeck wrote:
>> From: Guenter Roeck 
>>
>> EXTCON_PROP_USB_SUPERSPEED is necessary to distinguish between USB/USB2
>> and USB3 connections on USB Type-C cables.
>>
>> Cc: Chris Zhong 
>> Signed-off-by: Guenter Roeck 
>> ---
>> Applies on top of extcon-next.
>>
>>  include/linux/extcon.h | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/extcon.h b/include/linux/extcon.h
>> index ad7a1606a7f3..38d2c0dec2c1 100644
>> --- a/include/linux/extcon.h
>> +++ b/include/linux/extcon.h
>> @@ -107,12 +107,18 @@
>>   * @type:integer (intval)
>>   * @value:   0 (normal) or 1 (flip)
>>   * @default: 0 (normal)
>> + * - EXTCON_PROP_USB_SUPERSPEED
>> + * @type:   integer (intval)
>> + * @value:  0 (USB/USB2) or 1 (USB3)
>> + * @default:0 (USB/USB2)
>> + *
>>   */
>>  #define EXTCON_PROP_USB_VBUS 0
>>  #define EXTCON_PROP_USB_TYPEC_POLARITY   1
>> +#define EXTCON_PROP_USB_SUPERSPEED   2
>>
>>  #define EXTCON_PROP_USB_MIN  0
>> -#define EXTCON_PROP_USB_MAX  1
>> +#define EXTCON_PROP_USB_MAX  2
>>  #define EXTCON_PROP_USB_CNT  (EXTCON_PROP_USB_MAX - EXTCON_PROP_USB_MIN + 1)
>>
>>  /* Properties of EXTCON_TYPE_CHG. */
>>
>
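
The reason the patch bumps EXTCON_PROP_USB_MAX along with adding the new property can be shown with a small sketch. The macro values are the ones from the patch quoted above; `demo_usb_prop_slot()` and `demo_usb_prop_count()` are hypothetical helpers, assuming a consumer that sizes a per-property array with EXTCON_PROP_USB_CNT and indexes it relative to EXTCON_PROP_USB_MIN.

```c
/* Values as they stand after the patch quoted above. */
#define EXTCON_PROP_USB_VBUS            0
#define EXTCON_PROP_USB_TYPEC_POLARITY  1
#define EXTCON_PROP_USB_SUPERSPEED      2

#define EXTCON_PROP_USB_MIN             0
#define EXTCON_PROP_USB_MAX             2
#define EXTCON_PROP_USB_CNT (EXTCON_PROP_USB_MAX - EXTCON_PROP_USB_MIN + 1)

/*
 * Hypothetical helper: map a USB property id to its array slot,
 * or -1 if the id lies outside the [MIN, MAX] range.
 */
int demo_usb_prop_slot(int prop)
{
    if (prop < EXTCON_PROP_USB_MIN || prop > EXTCON_PROP_USB_MAX)
        return -1;
    return prop - EXTCON_PROP_USB_MIN;
}

/* Hypothetical helper: number of slots such an array would need. */
int demo_usb_prop_count(void)
{
    return EXTCON_PROP_USB_CNT;
}
```

With MAX left at 1, the count would stay at 2 and a range check of this kind would reject the new property.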


Re: [kbuild-all] make[2]: *** No rule to make target 'tools/testing/nvdimm//config_check.o', needed by 'tools/testing/nvdimm//dax.o'.

2016-08-15 Thread Yilong Ren
On Mon, Aug 15, 2016 at 06:30:48PM -0700, Dan Williams wrote:
> On Mon, Aug 15, 2016 at 6:26 PM, Fengguang Wu  wrote:
> > On Mon, Aug 15, 2016 at 05:58:36PM -0700, Dan Williams wrote:
> >>
> >> On Mon, Aug 15, 2016 at 3:03 AM, kbuild test robot
> >>  wrote:
> >>>
> >>> tree:
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> >>> head:   694d0d0bb2030d2e36df73e2d23d5770511dbc8d
> >>> commit: ab68f26221366f92611650e8470e6a926801c7d4 /dev/dax, pmem: direct
> >>> access to persistent memory
> >>> date:   3 months ago
> >>> config: i386-randconfig-i1-201633 (attached as .config)
> >>> compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
> >>> reproduce:
> >>> git checkout ab68f26221366f92611650e8470e6a926801c7d4
> >>> # save the attached .config to linux build tree
> >>> make ARCH=i386
> >>>
> >>> All errors (new ones prefixed by >>):
> >>>
> > make[2]: *** No rule to make target
> > 'tools/testing/nvdimm//config_check.o', needed by
> > 'tools/testing/nvdimm//dax.o'.
> >>>
> >>>make[2]: Target '__build' not remade because of errors.
> >>
> >>
> >> I think this is an invalid build test.  tools/testing/nvdimm/ uses a
> >> external module Kbuild environment, not Kconfig.  So, there's nothing
> >> I can do to prevent this compile error, unless there's some other way
> >> 0-day could determine the configuration dependencies?
> >
> >
> > Yeah if you can offer a concrete rule for the dependency, we'll add
> > it to 0-day.
> 
> Sounds good.  The config_check.c file itself lists the dependencies:
> 
> void check(void)
> {
> /*
>  * These kconfig symbols must be set to "m" for nfit_test to
>  * load and operate.
>  */
> BUILD_BUG_ON(!IS_MODULE(CONFIG_LIBNVDIMM));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_BLK_DEV_PMEM));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_BTT));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_PFN));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_BLK));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ACPI_NFIT));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX_PMEM));
> }

I checked the documentation at https://github.com/pmem/ndctl, which says the following:

Compile the libnvdimm sub-system as a module, make sure "zone device"
memory is enabled, and enable the btt, pfn, and dax features of the
sub-system:
CONFIG_ZONE_DEVICE=y
CONFIG_LIBNVDIMM=m
CONFIG_BLK_DEV_PMEM=m
CONFIG_ND_BLK=m
CONFIG_BTT=y
CONFIG_NVDIMM_PFN=y
CONFIG_NVDIMM_DAX=y 

This differs slightly from the list in config_check.c.

-- 
Thanks
Ren Yilong


Re: [PATCH] USB: core: of: Check device_node before parsing in usb_of_get_child_node()

2016-08-15 Thread Peter Chen
On Mon, Aug 15, 2016 at 11:31:10AM -0700, Vaibhav Hiremath wrote:
> In case of HUB devices connected to USB ports, we may not have DT
> node representing it inside USB, and when devices connected to hub
> gets enumerated, call to usb_of_get_child_node() leads to NULL pointer
> dereference.
> 
> In the usecase we have, where EHCI port is connected to USB HUB
> device, and downward ports of HUB are connected to further USB
> devices. When those devices gets enumerated, in order,
>  1. USB HUB ->
>   -> Call to usb_of_get_child_node() is OK, as
>   parent->dev.of_node is pointing to host node.
>  2. Devices connected to downward port of USB HUB
>   -> Call to usb_of_get_child_node() leads to NULL
>   pointer dereference as parent->dev.of_node = NULL,
>   as USB HUB DTS node may be empty.
> 
> Fix this NULL pointer dereference by adding check for pointer
> device_node inside usb_of_get_child_node() fn.
> 
> Signed-off-by: Vaibhav Hiremath 
> ---
> Testing: I have build tested it against mainline.
> 
>  drivers/usb/core/of.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/usb/core/of.c b/drivers/usb/core/of.c
> index 2289700..dc667a3 100644
> --- a/drivers/usb/core/of.c
> +++ b/drivers/usb/core/of.c
> @@ -34,6 +34,9 @@ struct device_node *usb_of_get_child_node(struct 
> device_node *parent,
>   struct device_node *node;
>   u32 port;
>  
> + if (!parent)
> + return NULL;
> +
>   for_each_child_of_node(parent, node) {
>   if (!of_property_read_u32(node, "reg", )) {
>   if (port == portnum)

I am afraid I can't reproduce it. Would you please show me the dump from
when the NULL pointer dereference occurs? As far as I can see,
__of_get_next_child() already checks the parent node for a NULL pointer.
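
To make the point about the parent check concrete, here is a simplified
user-space model. It is a sketch only, not the kernel source: struct node,
next_child() and find_child_by_reg() are illustrative stand-ins for
struct device_node, __of_get_next_child() and usb_of_get_child_node(), and
reference counting and locking are left out. It shows that when the
"next child" helper returns NULL for a NULL parent, the lookup loop simply
finds nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct device_node (assumption, not kernel code). */
struct node {
	struct node *child;	/* first child */
	struct node *sibling;	/* next sibling */
	int reg;		/* models the "reg" property */
};

/* Models a NULL-safe "next child" helper: a NULL parent has no children. */
static struct node *next_child(const struct node *parent, struct node *prev)
{
	if (!parent)
		return NULL;
	return prev ? prev->sibling : parent->child;
}

/* Models the lookup loop: return the child whose reg matches portnum. */
static struct node *find_child_by_reg(const struct node *parent, int portnum)
{
	struct node *n;

	for (n = next_child(parent, NULL); n; n = next_child(parent, n))
		if (n->reg == portnum)
			return n;
	return NULL;
}

/* Usage sketch: a hub node with two children, then a NULL parent. */
void child_lookup_example(void)
{
	struct node port2 = { NULL, NULL, 2 };
	struct node port1 = { NULL, &port2, 1 };
	struct node hub = { &port1, NULL, 0 };

	assert(find_child_by_reg(&hub, 2) == &port2);
	assert(find_child_by_reg(NULL, 2) == NULL);	/* no crash, just NULL */
}
```

If the real iterator behaves like this model, the extra check in the patch
would be redundant, which is why a dump of the actual crash would help.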

-- 

Best Regards,
Peter Chen


Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access

2016-08-15 Thread Alexey Kardashevskiy
On 12/08/16 04:52, Alexander Duyck wrote:
> On Wed, Aug 10, 2016 at 4:54 PM, Benjamin Herrenschmidt
>  wrote:
>> On Wed, 2016-08-10 at 08:47 -0700, Alexander Duyck wrote:
>>>
>>> The problem is if we don't do this it becomes possible for a guest to
>>> essentially cripple a device on the host by just accessing VPD
>>> regions that aren't actually viable on many devices.
>>
>> And ? We already can cripple the device in so many different ways
>> simpy because we have pretty much full BAR access to it...
>>
>>>  We are much better off
>>> in terms of security and stability if we restrict access to what
>>> should be accessible.
>>
>> Bollox. I've heard that argument over and over again, it never stood
>> and still doesn't.
>>
>> We have full BAR access for god sake. We can already destroy the device
>> in many cases (think: reflashing microcode, internal debug bus access
>> with a route to the config space, voltage/freq control ).
>>
>> We aren't protecting anything more here, we are just adding layers of
>> bloat, complication and bugs.
> 
> To some extent I can agree with you.  I don't know if we should be
> restricting the VFIO based interface the same way we restrict systemd
> from accessing this region.  In the case of VFIO maybe we need to look
> at a different approach for accessing this.  Perhaps we need a
> privileged version of the VPD accessors that could be used by things
> like VFIO and the cxgb3 driver since they are assumed to be a bit
> smarter than those interfaces that were just trying to slurp up
> something like 4K of VPD data.
> 
>>>  In this case what has happened is that the
>>> vendor threw in an extra out-of-spec block and just expected it to
>>> work.
>>
>> Like vendors do all the time in all sort of places
>>
>> I still completely fail to see the point in acting as a filtering
>> middle man.
> 
> The problem is we are having to do some filtering because things like
> systemd were using dumb accessors that were trying to suck down 4K of
> VPD data instead of trying to parse through and read it a field at a
> time.
> 
>>> In order to work around it we just need to add a small function
>>> to drivers/pci/quirks.c that would update the VPD size reported so
>>> that it matches what the hardware is actually providing instead of
>>> what we can determine based on the VPD layout.
>>>
>>> Really working around something like this is not much different than
>>> what we would have to do if the vendor had stuffed the data in some
>>> reserved section of their PCI configuration space.
>>
>> It is, in both cases we shouldn't have VFIO or the host involved. We
>> should just let the guest config space accesses go through.
>>
>>>   We end up needing
>>> to add special quirks any time a vendor goes out-of-spec for some
>>> one-off configuration interface that only they are ever going to use.
>>
>> Cheers,
>> Ben.
> 
> If you have a suggestion on how to resolve this patches are always
> welcome.  Otherwise I think the simpler approach to fixing this
> without re-introducing the existing bugs is to just add the quirk.  I
> will try to get to it sometime this weekend if nobody else does.  It
> should be pretty straight foward, but I just don't have the time to
> pull up a kernel and generate a patch right now.


How exactly is mine - https://lkml.org/lkml/2016/8/11/200 - bad (except
missing rb/ab from Chelsio folks)? Thanks.
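
For readers following the "size determined from the VPD layout" part of
this thread, the idea can be sketched in user space. This is a model under
stated assumptions, not the kernel code: it assumes the standard PCI VPD
resource tag encoding (large tags 0x82, 0x90 and 0x91 with a 16-bit
little-endian length, and the small end tag 0x78), and vpd_layout_size() is
a hypothetical name. It walks the tags and reports the offset just past the
end tag, so an out-of-spec block placed after the end tag falls outside the
computed size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VPD_LARGE_FLAG	0x80	/* bit 7 set: large resource tag */
#define VPD_TAG_ID	0x82	/* identifier string */
#define VPD_TAG_RO	0x90	/* read-only data (VPD-R) */
#define VPD_TAG_RW	0x91	/* read-write data (VPD-W) */
#define VPD_SMALL_END	0x0f	/* small resource item name: end tag */

/*
 * Walk the resource tags in buf and return the size up to and including
 * the end tag, or 0 if the layout is invalid or no end tag is found.
 */
size_t vpd_layout_size(const uint8_t *buf, size_t max)
{
	size_t off = 0;

	while (off < max) {
		uint8_t tag = buf[off];

		if (tag & VPD_LARGE_FLAG) {
			size_t len;

			if (tag != VPD_TAG_ID && tag != VPD_TAG_RO &&
			    tag != VPD_TAG_RW)
				return 0;	/* unknown large tag */
			if (off + 3 > max)
				return 0;	/* length field truncated */
			len = buf[off + 1] | ((size_t)buf[off + 2] << 8);
			off += 3 + len;
		} else {
			/* Small tag: bits 6..3 item name, bits 2..0 length. */
			off += 1 + (size_t)(tag & 0x07);
			if ((tag >> 3) != VPD_SMALL_END)
				return 0;	/* unexpected small tag */
			return off <= max ? off : 0;
		}
	}
	return 0;	/* ran off the end without seeing an end tag */
}

/* Usage sketch: two bytes after the end tag are not counted. */
void vpd_size_example(void)
{
	static const uint8_t vpd[] = {
		0x82, 0x02, 0x00, 'A', 'B',	/* ID string, 2 bytes */
		0x90, 0x01, 0x00, 0x00,		/* VPD-R, 1 byte */
		0x78,				/* end tag */
		0xff, 0xff,			/* data past the end tag */
	};

	assert(vpd_layout_size(vpd, sizeof(vpd)) == 10);
}
```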


-- 
Alexey


Re: [kbuild-all] make[2]: *** No rule to make target 'tools/testing/nvdimm//config_check.o', needed by 'tools/testing/nvdimm//dax.o'.

2016-08-15 Thread Fengguang Wu

On Mon, Aug 15, 2016 at 06:30:48PM -0700, Dan Williams wrote:

> On Mon, Aug 15, 2016 at 6:26 PM, Fengguang Wu  wrote:
>> On Mon, Aug 15, 2016 at 05:58:36PM -0700, Dan Williams wrote:
>>>
>>> On Mon, Aug 15, 2016 at 3:03 AM, kbuild test robot
>>>  wrote:
>>>>
>>>> tree:
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>>> head:   694d0d0bb2030d2e36df73e2d23d5770511dbc8d
>>>> commit: ab68f26221366f92611650e8470e6a926801c7d4 /dev/dax, pmem: direct
>>>> access to persistent memory
>>>> date:   3 months ago
>>>> config: i386-randconfig-i1-201633 (attached as .config)
>>>> compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
>>>> reproduce:
>>>> git checkout ab68f26221366f92611650e8470e6a926801c7d4
>>>> # save the attached .config to linux build tree
>>>> make ARCH=i386
>>>>
>>>> All errors (new ones prefixed by >>):
>>>>
>>>> >> make[2]: *** No rule to make target
>>>> >> 'tools/testing/nvdimm//config_check.o', needed by
>>>> >> 'tools/testing/nvdimm//dax.o'.
>>>>
>>>>    make[2]: Target '__build' not remade because of errors.
>>>
>>>
>>> I think this is an invalid build test.  tools/testing/nvdimm/ uses a
>>> external module Kbuild environment, not Kconfig.  So, there's nothing
>>> I can do to prevent this compile error, unless there's some other way
>>> 0-day could determine the configuration dependencies?
>>
>>
>> Yeah if you can offer a concrete rule for the dependency, we'll add
>> it to 0-day.
>
> Sounds good.  The config_check.c file itself lists the dependencies:
>
> void check(void)
> {
> /*
>  * These kconfig symbols must be set to "m" for nfit_test to

If "y" is not a valid option, we'll need to adjust 0-day's dependency
specification for ndctl test:

wfg /c/lkp-tests% cat include/ndctl
need_kconfig:
- CONFIG_HAVE_DMA_CONTIGUOUS=y
- CONFIG_CMA=y
- CONFIG_DMA_CMA=y
- CONFIG_CMA_SIZE_MBYTES=200
- CONFIG_LIBNVDIMM
- CONFIG_BLK_DEV_PMEM
- CONFIG_ND_BLK
- CONFIG_BTT=y
- CONFIG_NVDIMM_PFN=y
- CONFIG_NVDIMM_DAX=y
- CONFIG_ZONE_DEVICE=y

In the above list, a bare "CONFIG_BLK_DEV_PMEM" means "y" or "m" are
both acceptable.

>  * load and operate.
>  */
> BUILD_BUG_ON(!IS_MODULE(CONFIG_LIBNVDIMM));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_BLK_DEV_PMEM));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_BTT));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_PFN));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_BLK));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ACPI_NFIT));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX_PMEM));


Regards,
Fengguang


[PATCH v3 RFC 2/2] nvme: improve performance for virtual NVMe devices

2016-08-15 Thread Helen Koike
From: Rob Nelson 

This change provides a mechanism to reduce the number of MMIO doorbell
writes for the NVMe driver. When running in a virtualized environment
like QEMU, the cost of an MMIO is quite hefty. The main idea of
the patch is to provide the device with two memory locations:
 1) one to store the doorbell values, so they can be looked up without
the doorbell MMIO write
 2) one to store an event index.
I believe the doorbell value is obvious, the event index not so much.
Similar to the virtio specification, the virtual device can tell the
driver (guest OS) not to write MMIO unless it is writing past this
value.

FYI: doorbell values are written by the nvme driver (guest OS) and the
event index is written by the virtual device (host OS).

The patch implements a new admin command that communicates where
these two memory locations reside. If the command fails, the nvme
driver will work as before without any optimizations.
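
The doorbell/event-index interplay described above can be sketched in user
space as follows. This is a minimal model, not the code in vdb.c or vdb.h:
struct vdb_queue_model, vdb_need_event() and vdb_ring() are hypothetical
names, the MMIO write is replaced by a counter, and the comparison is the
wraparound-safe one used by virtio's event index, on the assumption that
the extension follows the same rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue state; names are illustrative, not from the patch. */
struct vdb_queue_model {
	uint16_t shadow_db;		/* written by the driver (guest) */
	uint16_t event_idx;		/* written by the virtual device (host) */
	unsigned int mmio_writes;	/* counts simulated MMIO doorbell writes */
};

/*
 * virtio-style check: true when moving the doorbell from old to new_idx
 * steps past event_idx. Correct across 16-bit wraparound.
 */
static bool vdb_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

/* Update the shadow doorbell; do the expensive MMIO only when required. */
static void vdb_ring(struct vdb_queue_model *q, uint16_t new_value)
{
	uint16_t old = q->shadow_db;

	q->shadow_db = new_value;
	/* A real driver needs a memory barrier here before reading event_idx. */
	if (vdb_need_event(q->event_idx, new_value, old))
		q->mmio_writes++;	/* stands in for the doorbell MMIO write */
}

/* Usage sketch: only the update that passes event_idx costs an MMIO. */
void vdb_example(void)
{
	struct vdb_queue_model q = { 4, 5, 0 };

	vdb_ring(&q, 6);	/* 4 -> 6 passes event_idx 5: MMIO needed */
	assert(q.mmio_writes == 1);

	q.event_idx = 10;	/* device: "no need to notify before 10" */
	vdb_ring(&q, 8);	/* 6 -> 8 stays below 10: no MMIO */
	assert(q.mmio_writes == 1);
}
```

In this model the device keeps polling the shadow doorbell while it is
busy and moves the event index forward, so the guest only pays for an MMIO
when the device has asked to be woken.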

Contributions:
  Eric Northup 
  Frank Swiderski 
  Ted Tso 
  Keith Busch 

Just to give an idea of the performance boost with the vendor
extension: running fio [1] with a stock NVMe driver I get about 200K read
IOPs; with my vendor patch I get about 1000K read IOPs. This was
running with a null device, i.e. the backing device simply returned
success on every read IO request.

[1] Running on a 4 core machine:
  fio --time_based --name=benchmark --runtime=30
  --filename=/dev/nvme0n1 --nrfiles=1 --ioengine=libaio --iodepth=32
  --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4
  --rw=randread --blocksize=4k --randrepeat=false

Signed-off-by: Rob Nelson 
[mlin: port for upstream]
Signed-off-by: Ming Lin 
[koike: updated for upstream]
Signed-off-by: Helen Koike 
---

Changes since v2:
- Add vdb.c and vdb.h; the idea is to keep the code in pci.c clean and to
make it easier to integrate with the official nvme extension when the nvme
consortium publishes it
- Remove rmb (I couldn't see why they were necessary here, please let me
know if I am wrong)
- Reposition wmb
- Move specific code into helper functions
- Coding style (checkpatch, remove unnecessary goto, change if statement
logic to decrease indentation)
- Rename feature to CONFIG_NVME_VDB
- Remove some PCI_VENDOR_ID_GOOGLE checks

 drivers/nvme/host/Kconfig  |  11 
 drivers/nvme/host/Makefile |   1 +
 drivers/nvme/host/pci.c|  29 ++-
 drivers/nvme/host/vdb.c| 125 +
 drivers/nvme/host/vdb.h| 118 ++
 include/linux/nvme.h   |  17 ++
 6 files changed, 299 insertions(+), 2 deletions(-)
 create mode 100644 drivers/nvme/host/vdb.c
 create mode 100644 drivers/nvme/host/vdb.h

diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
index db39d53..d3f4da9 100644
--- a/drivers/nvme/host/Kconfig
+++ b/drivers/nvme/host/Kconfig
@@ -43,3 +43,14 @@ config NVME_RDMA
  from https://github.com/linux-nvme/nvme-cli.
 
  If unsure, say N.
+
+config NVME_VDB
+   bool "NVMe Virtual Doorbell Extension for Improved Virtualization"
+   depends on NVME_CORE
+   ---help---
+ This provides support for the Virtual Doorbell Extension which
+ reduces the number of required MMIOs to ring doorbells, improving
+ performance in virtualized environments where MMIO causes a high
+ overhead.
+
+ If unsure, say N.
diff --git a/drivers/nvme/host/Makefile b/drivers/nvme/host/Makefile
index 47abcec..d4d0e3d 100644
--- a/drivers/nvme/host/Makefile
+++ b/drivers/nvme/host/Makefile
@@ -8,6 +8,7 @@ nvme-core-$(CONFIG_BLK_DEV_NVME_SCSI)   += scsi.o
 nvme-core-$(CONFIG_NVM)+= lightnvm.o
 
 nvme-y += pci.o
+nvme-$(CONFIG_NVME_VDB)+= vdb.o
 
 nvme-fabrics-y += fabrics.o
 
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index cf8b3d7..20bbc33 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -44,6 +44,7 @@
 #include 
 
 #include "nvme.h"
+#include "vdb.h"
 
 #define NVME_Q_DEPTH   1024
 #define NVME_AQ_DEPTH  256
@@ -99,6 +100,7 @@ struct nvme_dev {
dma_addr_t cmb_dma_addr;
u64 cmb_size;
u32 cmbsz;
+   struct nvme_vdb_dev vdb_d;
struct nvme_ctrl ctrl;
struct completion ioq_wait;
 };
@@ -131,6 +133,7 @@ struct nvme_queue {
u16 qid;
u8 cq_phase;
u8 cqe_seen;
+   struct nvme_vdb_queue vdb_q;
 };
 
 /*
@@ -171,6 +174,7 @@ static inline void _nvme_check_size(void)
BUILD_BUG_ON(sizeof(struct nvme_id_ns) != 4096);
BUILD_BUG_ON(sizeof(struct nvme_lba_range_type) != 64);
BUILD_BUG_ON(sizeof(struct nvme_smart_log) != 512);
+   BUILD_BUG_ON(sizeof(struct nvme_doorbell_memory) 

[PATCH v3 RFC 1/2] PCI: Add Google device ID

2016-08-15 Thread Helen Koike
Add device ID for the local SSDs (NVMe) in Google Cloud Engine

Signed-off-by: Helen Koike 
---

Changes since v2:
- This is a new patch in the series

 include/linux/pci_ids.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index c58752f..d422afc 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -948,6 +948,8 @@
 #define PCI_DEVICE_ID_APPLE_IPID2_GMAC 0x006b
 #define PCI_DEVICE_ID_APPLE_TIGON3 0x1645
 
+#define PCI_VENDOR_ID_GOOGLE   0x1AE0
+
 #define PCI_VENDOR_ID_YAMAHA   0x1073
 #define PCI_DEVICE_ID_YAMAHA_724   0x0004
 #define PCI_DEVICE_ID_YAMAHA_724F  0x000d
-- 
1.9.1



[PATCH v3 RFC 0/2] Virtual NVMe device optimization

2016-08-15 Thread Helen Koike
Please check commit "nvme: improve performance for virtual NVMe devices" for
more details.
Commit "PCI: Add Google device ID" only adds a new ID in pci_ids.h.

Patches are based on the linux-block/for-next branch and available here:
https://github.com/helen-fornazier/opw-staging/commits/nvme/dev

Helen Koike (1):
  PCI: Add Google device ID

Rob Nelson (1):
  nvme: improve performance for virtual NVMe devices

 drivers/nvme/host/Kconfig  |  11 
 drivers/nvme/host/Makefile |   1 +
 drivers/nvme/host/pci.c|  29 ++-
 drivers/nvme/host/vdb.c| 125 +
 drivers/nvme/host/vdb.h| 118 ++
 include/linux/nvme.h   |  17 ++
 include/linux/pci_ids.h|   2 +
 7 files changed, 301 insertions(+), 2 deletions(-)
 create mode 100644 drivers/nvme/host/vdb.c
 create mode 100644 drivers/nvme/host/vdb.h

-- 
1.9.1



Re: [kbuild-all] make[2]: *** No rule to make target 'tools/testing/nvdimm//config_check.o', needed by 'tools/testing/nvdimm//dax.o'.

2016-08-15 Thread Fengguang Wu

On Mon, Aug 15, 2016 at 06:30:48PM -0700, Dan Williams wrote:

> On Mon, Aug 15, 2016 at 6:26 PM, Fengguang Wu  wrote:
>> On Mon, Aug 15, 2016 at 05:58:36PM -0700, Dan Williams wrote:
>>>
>>> On Mon, Aug 15, 2016 at 3:03 AM, kbuild test robot
>>>  wrote:
>>>>
>>>> tree:
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>>> head:   694d0d0bb2030d2e36df73e2d23d5770511dbc8d
>>>> commit: ab68f26221366f92611650e8470e6a926801c7d4 /dev/dax, pmem: direct
>>>> access to persistent memory
>>>> date:   3 months ago
>>>> config: i386-randconfig-i1-201633 (attached as .config)
>>>> compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
>>>> reproduce:
>>>> git checkout ab68f26221366f92611650e8470e6a926801c7d4
>>>> # save the attached .config to linux build tree
>>>> make ARCH=i386
>>>>
>>>> All errors (new ones prefixed by >>):
>>>>
>>>> >> make[2]: *** No rule to make target
>>>> >> 'tools/testing/nvdimm//config_check.o', needed by
>>>> >> 'tools/testing/nvdimm//dax.o'.
>>>>
>>>>    make[2]: Target '__build' not remade because of errors.
>>>
>>>
>>> I think this is an invalid build test.  tools/testing/nvdimm/ uses a
>>> external module Kbuild environment, not Kconfig.  So, there's nothing
>>> I can do to prevent this compile error, unless there's some other way
>>> 0-day could determine the configuration dependencies?
>>
>>
>> Yeah if you can offer a concrete rule for the dependency, we'll add
>> it to 0-day.
>
> Sounds good.  The config_check.c file itself lists the dependencies:
>
> void check(void)
> {
> /*
>  * These kconfig symbols must be set to "m" for nfit_test to
>  * load and operate.
>  */
> BUILD_BUG_ON(!IS_MODULE(CONFIG_LIBNVDIMM));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_BLK_DEV_PMEM));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_BTT));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_PFN));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_BLK));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_ACPI_NFIT));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX));
> BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX_PMEM));
> }


Great, those look like good rules that are easy to follow!

If that list is subject to change in the future, we can even grep that
file and use the results to check .config.
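
A rough sketch of that idea, with both files passed in as strings to keep
it self-contained. config_modules_satisfied() is a hypothetical helper, not
part of 0-day or lkp-tests; it assumes every dependency appears in the
source as IS_MODULE(CONFIG_...) and that a satisfied dependency shows up in
.config as a whole line of the form CONFIG_...=m.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if every symbol wrapped in IS_MODULE(...) in check_src
 * appears as a whole line "SYMBOL=m" in dotconfig.
 */
bool config_modules_satisfied(const char *check_src, const char *dotconfig)
{
	static const char marker[] = "IS_MODULE(";
	const char *p = check_src;

	while ((p = strstr(p, marker)) != NULL) {
		char want[128];
		const char *sym = p + strlen(marker);
		const char *end = strchr(sym, ')');
		const char *hit;
		size_t wlen;
		bool found = false;

		if (!end || (size_t)(end - sym) + 3 > sizeof(want))
			return false;	/* malformed or oversized symbol */
		snprintf(want, sizeof(want), "%.*s=m", (int)(end - sym), sym);
		wlen = strlen(want);

		/* Accept only matches that span a whole line. */
		for (hit = dotconfig; (hit = strstr(hit, want)) != NULL;
		     hit += wlen) {
			bool at_start = hit == dotconfig || hit[-1] == '\n';
			bool at_end = hit[wlen] == '\n' || hit[wlen] == '\0';

			if (at_start && at_end) {
				found = true;
				break;
			}
		}
		if (!found)
			return false;
		p = end;
	}
	return true;
}

/* Usage sketch: "=y" does not satisfy an IS_MODULE() requirement. */
void config_check_example(void)
{
	const char *src = "BUILD_BUG_ON(!IS_MODULE(CONFIG_LIBNVDIMM));\n";

	assert(config_modules_satisfied(src, "CONFIG_LIBNVDIMM=m\n"));
	assert(!config_modules_satisfied(src, "CONFIG_LIBNVDIMM=y\n"));
}
```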

Regards,
Fengguang


linux-next: build warnings after merge of the sound-asoc tree

2016-08-15 Thread Stephen Rothwell
Hi all,

After merging the sound-asoc tree, today's linux-next build (x86_64
allmodconfig) produced these warnings:

WARNING: sound/soc/intel/boards/snd-soc-sst-bytcr-rt5640.o(.text+0x7e7): 
Section mismatch in reference from the function snd_byt_rt5640_mc_probe() to 
the variable .init.rodata:cpu_ids.43814
The function snd_byt_rt5640_mc_probe() references
the variable __initconst cpu_ids.43814.
This is often because snd_byt_rt5640_mc_probe lacks a __initconst 
annotation or the annotation of cpu_ids.43814 is wrong.

WARNING: sound/soc/intel/boards/snd-soc-sst-bytcr-rt5640.o(.text+0x9d5): 
Section mismatch in reference from the function snd_byt_rt5640_mc_probe() to 
the variable .init.rodata:cpu_ids.43814
The function snd_byt_rt5640_mc_probe() references
the variable __initconst cpu_ids.43814.
This is often because snd_byt_rt5640_mc_probe lacks a __initconst 
annotation or the annotation of cpu_ids.43814 is wrong.

WARNING: sound/soc/intel/atom/sst/snd-intel-sst-acpi.o(.text+0x20f): Section 
mismatch in reference from the function sst_acpi_probe() to the variable 
.init.rodata:cpu_ids.44453
The function sst_acpi_probe() references
the variable __initconst cpu_ids.44453.
This is often because sst_acpi_probe lacks a __initconst 
annotation or the annotation of cpu_ids.44453 is wrong.

WARNING: sound/soc/intel/atom/sst/snd-intel-sst-acpi.o(.text+0x20f): Section 
mismatch in reference from the function sst_acpi_probe() to the variable 
.init.rodata:cpu_ids.44453
The function sst_acpi_probe() references
the variable __initconst cpu_ids.44453.
This is often because sst_acpi_probe lacks a __initconst 
annotation or the annotation of cpu_ids.44453 is wrong.

WARNING: sound/soc/intel/boards/snd-soc-sst-bytcr-rt5640.o(.text+0x7e7): 
Section mismatch in reference from the function snd_byt_rt5640_mc_probe() to 
the variable .init.rodata:cpu_ids.43814
The function snd_byt_rt5640_mc_probe() references
the variable __initconst cpu_ids.43814.
This is often because snd_byt_rt5640_mc_probe lacks a __initconst 
annotation or the annotation of cpu_ids.43814 is wrong.

WARNING: sound/soc/intel/boards/snd-soc-sst-bytcr-rt5640.o(.text+0x9d5): 
Section mismatch in reference from the function snd_byt_rt5640_mc_probe() to 
the variable .init.rodata:cpu_ids.43814
The function snd_byt_rt5640_mc_probe() references
the variable __initconst cpu_ids.43814.
This is often because snd_byt_rt5640_mc_probe lacks a __initconst 
annotation or the annotation of cpu_ids.43814 is wrong.

I am not sure which commit(s) introduced these.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH 2/2] pinctrl: sunxi: Add A64 R_PIO controller support

2016-08-15 Thread Chen-Yu Tsai
On Tue, Aug 16, 2016 at 8:06 AM, Icenowy Zheng  wrote:
>
>
> 15.08.2016, 21:56, "Chen-Yu Tsai" :
>> Hi,
>>
>> On Mon, Aug 1, 2016 at 10:59 PM, Icenowy Zheng  wrote:
>>>  The A64 has a R_PIO pin controller, similar to the one found on the H3 
>>> SoCs.
>>>  Add support for the pins controlled by the R_PIO controller.
>>>
>>>  Signed-off-by: Icenowy Zheng 
>>>  ---
>>>   drivers/pinctrl/sunxi/Kconfig | 5 +
>>>   drivers/pinctrl/sunxi/Makefile | 1 +
>>>   drivers/pinctrl/sunxi/pinctrl-sun50i-a64-r.c | 148 
>>> +++
>>>   3 files changed, 154 insertions(+)
>>>   create mode 100644 drivers/pinctrl/sunxi/pinctrl-sun50i-a64-r.c
>>>
>>>  diff --git a/drivers/pinctrl/sunxi/Kconfig b/drivers/pinctrl/sunxi/Kconfig
>>>  index aaf075b..c4b476f 100644
>>>  --- a/drivers/pinctrl/sunxi/Kconfig
>>>  +++ b/drivers/pinctrl/sunxi/Kconfig
>>>  @@ -72,4 +72,9 @@ config PINCTRL_SUN50I_A64
>>>  bool
>>>  select PINCTRL_SUNXI
>>>
>>>  +config PINCTRL_SUN50I_A64_R
>>>  + bool
>>>  + depends on RESET_CONTROLLER
>>>  + select PINCTRL_SUNXI
>>>  +
>>>   endif
>>>  diff --git a/drivers/pinctrl/sunxi/Makefile 
>>> b/drivers/pinctrl/sunxi/Makefile
>>>  index 2d8b64e..d6eabdd 100644
>>>  --- a/drivers/pinctrl/sunxi/Makefile
>>>  +++ b/drivers/pinctrl/sunxi/Makefile
>>>  @@ -13,6 +13,7 @@ obj-$(CONFIG_PINCTRL_SUN8I_A23) += pinctrl-sun8i-a23.o
>>>   obj-$(CONFIG_PINCTRL_SUN8I_A23_R) += pinctrl-sun8i-a23-r.o
>>>   obj-$(CONFIG_PINCTRL_SUN8I_A33) += pinctrl-sun8i-a33.o
>>>   obj-$(CONFIG_PINCTRL_SUN50I_A64) += pinctrl-sun50i-a64.o
>>>  +obj-$(CONFIG_PINCTRL_SUN50I_A64_R) += pinctrl-sun50i-a64-r.o
>>>   obj-$(CONFIG_PINCTRL_SUN8I_A83T) += pinctrl-sun8i-a83t.o
>>>   obj-$(CONFIG_PINCTRL_SUN8I_H3) += pinctrl-sun8i-h3.o
>>>   obj-$(CONFIG_PINCTRL_SUN8I_H3_R) += pinctrl-sun8i-h3-r.o
>>>  diff --git a/drivers/pinctrl/sunxi/pinctrl-sun50i-a64-r.c 
>>> b/drivers/pinctrl/sunxi/pinctrl-sun50i-a64-r.c
>>>  new file mode 100644
>>>  index 000..b836264
>>>  --- /dev/null
>>>  +++ b/drivers/pinctrl/sunxi/pinctrl-sun50i-a64-r.c
>>>  @@ -0,0 +1,148 @@
>>>  +/*
>>>  + * Allwinner A64 SoCs special pins pinctrl driver.
>>>  + *
>>>  + * Based on pinctrl-sun8i-a23-r.c
>>>  + *
>>>  + * Copyright (C) 2016 Icenowy Zheng
>>>  + * Icenowy Zheng 
>>>  + *
>>>  + * Copyright (C) 2014 Chen-Yu Tsai
>>>  + * Chen-Yu Tsai 
>>>  + *
>>>  + * Copyright (C) 2014 Boris Brezillon
>>>  + * Boris Brezillon 
>>>  + *
>>>  + * Copyright (C) 2014 Maxime Ripard
>>>  + * Maxime Ripard 
>>>  + *
>>>  + * This file is licensed under the terms of the GNU General Public
>>>  + * License version 2. This program is licensed "as is" without any
>>>  + * warranty of any kind, whether express or implied.
>>>  + */
>>>  +
>>>  +#include 
>>>  +#include 
>>>  +#include 
>>>  +#include 
>>
>> Please sort the headers.
> This file is based on pinctrl-sun8i-a23-r.c and pinctrl-sun8i-h3-r.c, which
> both have #include in this sequence.

And we should probably fix them too.

>>
>>>  +#include 
>>>  +#include 
>>>  +
>>>  +#include "pinctrl-sunxi.h"
>>>  +
>>>  +static const struct sunxi_desc_pin sun50i_a64_r_pins[] = {
>>>  + SUNXI_PIN(SUNXI_PINCTRL_PIN(L, 0),
>>>  + SUNXI_FUNCTION(0x0, "gpio_in"),
>>>  + SUNXI_FUNCTION(0x1, "gpio_out"),
>>>  + SUNXI_FUNCTION(0x2, "s_rsb"), /* SCK */
>>>  + SUNXI_FUNCTION(0x3, "s_twi"), /* SCK */
>>
>> We use "i2c" instead of "twi".
> So as above, in A23 / H3 R_PIO driver, they're called as "s_twi" other than 
> "s_i2c"

These as well.

ChenYu

>>
>>>  + SUNXI_FUNCTION_IRQ_BANK(0x6, 0, 0)), /* PL_EINT0 */
>>>  + SUNXI_PIN(SUNXI_PINCTRL_PIN(L, 1),
>>>  + SUNXI_FUNCTION(0x0, "gpio_in"),
>>>  + SUNXI_FUNCTION(0x1, "gpio_out"),
>>>  + SUNXI_FUNCTION(0x2, "s_rsb"), /* SDA */
>>>  + SUNXI_FUNCTION(0x3, "s_twi"), /* SDA */
>>
>> Same here.
>>
>>>  + SUNXI_FUNCTION_IRQ_BANK(0x6, 0, 1)), /* PL_EINT1 */
>>>  + SUNXI_PIN(SUNXI_PINCTRL_PIN(L, 2),
>>>  + SUNXI_FUNCTION(0x0, "gpio_in"),
>>>  + SUNXI_FUNCTION(0x1, "gpio_out"),
>>>  + SUNXI_FUNCTION(0x2, "s_uart"), /* TX */
>>>  + SUNXI_FUNCTION_IRQ_BANK(0x6, 0, 2)), /* PL_EINT2 */
>>>  + SUNXI_PIN(SUNXI_PINCTRL_PIN(L, 3),
>>>  + SUNXI_FUNCTION(0x0, "gpio_in"),
>>>  + SUNXI_FUNCTION(0x1, "gpio_out"),
>>>  + SUNXI_FUNCTION(0x2, "s_uart"), /* RX */
>>>  + SUNXI_FUNCTION_IRQ_BANK(0x6, 0, 3)), /* PL_EINT3 */
>>>  + SUNXI_PIN(SUNXI_PINCTRL_PIN(L, 4),
>>>  + SUNXI_FUNCTION(0x0, "gpio_in"),
>>>  + SUNXI_FUNCTION(0x1, "gpio_out"),
>>>  + SUNXI_FUNCTION(0x3, "s_jtag"), /* MS */
>>>  + SUNXI_FUNCTION_IRQ_BANK(0x6, 0, 4)), /* PL_EINT4 */
>>>  + SUNXI_PIN(SUNXI_PINCTRL_PIN(L, 5),
>>>  + SUNXI_FUNCTION(0x0, "gpio_in"),
>>>  + SUNXI_FUNCTION(0x1, "gpio_out"),
>>>  + SUNXI_FUNCTION(0x3, "s_jtag"), /* CK */
>>>  + SUNXI_FUNCTION_IRQ_BANK(0x6, 0, 5)), /* PL_EINT5 */
>>>  + SUNXI_PIN(SUNXI_PINCTRL_PIN(L, 6),
>>>  + SUNXI_FUNCTION(0x0, 

Re: [PATCH] time,virt: resync steal time when guest & host lose sync

2016-08-15 Thread Wanpeng Li
2016-08-15 23:00 GMT+08:00 Rik van Riel :
> On Mon, 2016-08-15 at 16:53 +0800, Wanpeng Li wrote:
>> 2016-08-12 23:58 GMT+08:00 Rik van Riel :
>> [...]
>> > Wanpeng, does the patch below work for you?
>>
>> It will break steal time for full dynticks guest, and there is a
>> calltrace of thread_group_cputime_adjusted call stack, RIP is
>> cputime_adjust+0xff/0x130.
>
> How?  This patch is equivalent to passing ULONG_MAX to
> steal_account_process_time, which you tried to no ill
> effect before.

In https://lkml.org/lkml/2016/6/8/404/ Paolo originally suggested adding
the max cputime limit to vtime. When a CPU runs in nohz full mode with the
tick stopped, jiffies in the guest are updated from the clock source
rather than the clock event device (see the tick_nohz_update_jiffies()
call site, ktime_get()), so they are not affected by lost clock ticks. My
patch keeps the limit for vtime and removes it for non-vtime. Your patch,
however, removes the limit in both cases, which results in the call trace
below for vtime.

>
> Do you have the full call trace?

[6.929856] divide error:  [#1] SMP
[6.934217] Modules linked in:
[6.937759] CPU: 3 PID: 57 Comm: kworker/u8:1 Not tainted 4.7.0+ #36
[6.946105] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[6.953951] Workqueue: events_unbound call_usermodehelper_exec_work
[6.965726] task: 8e22b9785040 task.stack: 8e22b8b64000
[6.970820] RIP: 0010:[]  []
cputime_adjust+0xff/0x130
[6.981841] RSP: :8e22b8b67b78  EFLAGS: 00010887
[6.985946] RAX: a528afff5ad75000 RBX: 8e222e243c18 RCX: 8e22b8b67c28
[7.001166] RDX:  RSI: 0296 RDI: 
[7.008758] RBP: 8e22b8b67ba8 R08:  R09: a528b000
[7.015653] R10:  R11:  R12: 0014a516
[7.021376] R13: 8e22b8b67bb8 R14: 8e222e243c28 R15: 8e22b8b67c20
[7.035498] FS:  () GS:8e22bac0()
knlGS:
[7.054809] CS:  0010 DS:  ES:  CR0: 80050033
[7.066571] CR2:  CR3: 7ae06000 CR4: 001406e0
[7.075162] Stack:
[7.090141]  8e22b8b67c28 8e222e371ac0 8e22b8b67c20
8e22b8b67c28
[7.108512]  8e222e371ac0 8e22b8b67cc0 8e22b8b67be8
870c8c01
[7.123025]  000e0471 fffdcf32 0014a516
8e22b9785040
[7.140622] Call Trace:
[7.153076]  [] thread_group_cputime_adjusted+0x41/0x50
[7.160807]  [] wait_consider_task+0xa4f/0xff0
[7.176449]  [] ? wait_consider_task+0x651/0xff0
[7.186281]  [] ? do_wait+0xdf/0x320
[7.226606]  [] do_wait+0x11b/0x320
[7.239670]  [] SyS_wait4+0x64/0xc0
[7.245385]  [] ? task_stopped_code+0x50/0x50
[7.255924]  [] call_usermodehelper_exec_work+0x70/0xb0
[7.263011]  [] process_one_work+0x1e0/0x670
[7.273051]  [] ? process_one_work+0x161/0x670
[7.277991]  [] worker_thread+0x12b/0x4a0
[7.286920]  [] ? process_one_work+0x670/0x670
[7.291745]  [] kthread+0x101/0x120
[7.296878]  [] ret_from_fork+0x1f/0x40
[7.306511]  [] ? kthread_create_on_node+0x250/0x250
[7.311985] Code: 4d 39 c8 76 c1 4c 89 d0 48 c1 e8 20 48 85 c0 74
ca 4c 89 c0 49 d1 ea 4d 89 c8 48 d1 e8 49 89 c1 eb 9f 44 89 c8 31 d2
49 0f af c0 <49> f7 f2 4d 89 e2 48 39 f8 48 0f 42 c7 49 29 c2 4d 39 d3
76 0b
[7.357565] RIP  [] cputime_adjust+0xff/0x130
[7.364633]  RSP 
[7.373247] ---[ end trace 76ca7475a22c5d43 ]---


Re: [kbuild-all] make[2]: *** No rule to make target 'tools/testing/nvdimm//config_check.o', needed by 'tools/testing/nvdimm//dax.o'.

2016-08-15 Thread Dan Williams
On Mon, Aug 15, 2016 at 6:26 PM, Fengguang Wu  wrote:
> On Mon, Aug 15, 2016 at 05:58:36PM -0700, Dan Williams wrote:
>>
>> On Mon, Aug 15, 2016 at 3:03 AM, kbuild test robot
>>  wrote:
>>>
>>> tree:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>> head:   694d0d0bb2030d2e36df73e2d23d5770511dbc8d
>>> commit: ab68f26221366f92611650e8470e6a926801c7d4 /dev/dax, pmem: direct
>>> access to persistent memory
>>> date:   3 months ago
>>> config: i386-randconfig-i1-201633 (attached as .config)
>>> compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
>>> reproduce:
>>> git checkout ab68f26221366f92611650e8470e6a926801c7d4
>>> # save the attached .config to linux build tree
>>> make ARCH=i386
>>>
>>> All errors (new ones prefixed by >>):
>>>
>>> >> make[2]: *** No rule to make target
>>> >> 'tools/testing/nvdimm//config_check.o', needed by
>>> >> 'tools/testing/nvdimm//dax.o'.
>>>
>>>make[2]: Target '__build' not remade because of errors.
>>
>>
>> I think this is an invalid build test.  tools/testing/nvdimm/ uses a
>> external module Kbuild environment, not Kconfig.  So, there's nothing
>> I can do to prevent this compile error, unless there's some other way
>> 0-day could determine the configuration dependencies?
>
>
> Yeah if you can offer a concrete rule for the dependency, we'll add
> it to 0-day.

Sounds good.  The config_check.c file itself lists the dependencies:

void check(void)
{
/*
 * These kconfig symbols must be set to "m" for nfit_test to
 * load and operate.
 */
BUILD_BUG_ON(!IS_MODULE(CONFIG_LIBNVDIMM));
BUILD_BUG_ON(!IS_MODULE(CONFIG_BLK_DEV_PMEM));
BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_BTT));
BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_PFN));
BUILD_BUG_ON(!IS_MODULE(CONFIG_ND_BLK));
BUILD_BUG_ON(!IS_MODULE(CONFIG_ACPI_NFIT));
BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX));
BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX_PMEM));
}

