Re: [PATCH] time: Avoid signed overflow in timekeeping_delta_to_ns()

2016-11-14 Thread Laurent Vivier


On 14/11/2016 20:42, Chris Metcalf wrote:
> This bugfix was originally made in commit 35a4933a8959 ("time:
> Avoid signed overflow in timekeeping_get_ns()").  When the code was
> refactored in commit 6bd58f09e1d8 ("time: Add cycles to nanoseconds
> translation") the signed overflow fix was lost.  Re-introduce it.
> 
> Signed-off-by: Chris Metcalf 
> ---
> I happened to be looking for an unrelated fix, found this code,
> realized the tip code didn't match the fixed code, and
> backtracked to where it had gone away.
> 
>  kernel/time/timekeeping.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 37dec7e3db43..57926bc7b7f3 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -304,8 +304,7 @@ static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
>  {
>   s64 nsec;
>  
> - nsec = delta * tkr->mult + tkr->xtime_nsec;
> - nsec >>= tkr->shift;
> + nsec = (delta * tkr->mult + tkr->xtime_nsec) >> tkr->shift;
>  
>   /* If arch requires, add in get_arch_timeoffset() */
>   return nsec + arch_gettimeoffset();
> 
Reviewed-by: Laurent Vivier 
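
For illustration only (not part of the patch): a minimal userspace sketch of
why the one-line form matters. It assumes the operands are unsigned 64-bit
values, as with the kernel's cycle count, and a compiler such as GCC or Clang
that shifts negative values arithmetically. The function names are made up
for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pre-patch shape: the unsigned 64-bit sum is stored in a signed
 * variable and shifted afterwards. Once bit 63 of the sum is set the
 * stored value is negative, and the shift drags the sign bits along.
 * (Both steps are implementation-defined in ISO C; GCC and Clang wrap
 * on conversion and shift arithmetically.)
 */
static int64_t ns_store_then_shift(uint64_t delta, uint32_t mult,
				   uint32_t shift, uint64_t xtime_nsec)
{
	int64_t nsec;

	nsec = (int64_t)(delta * mult + xtime_nsec);
	nsec >>= shift;
	return nsec;
}

/* Patched shape: shift while the value is still unsigned, then store. */
static int64_t ns_shift_then_store(uint64_t delta, uint32_t mult,
				   uint32_t shift, uint64_t xtime_nsec)
{
	return (int64_t)((delta * mult + xtime_nsec) >> shift);
}

static void delta_to_ns_example(void)
{
	/* Small inputs: both shapes agree, (1000 * 3 + 1) >> 1 == 1500. */
	assert(ns_store_then_shift(1000, 3, 1, 1) == 1500);
	assert(ns_shift_then_store(1000, 3, 1, 1) == 1500);

	/* 2^62 * 2 == 2^63 sets bit 63; only the patched shape stays positive. */
	assert(ns_store_then_shift(1ULL << 62, 2, 8, 0) < 0);
	assert(ns_shift_then_store(1ULL << 62, 2, 8, 0) == ((int64_t)1 << 55));
}
```

The two shapes only diverge once the intermediate sum reaches 2^63, which is
why the regression is easy to miss in normal testing.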


Re: [PATCH 1/2] Staging: fsl-mc: include: mc: Kernel type 's16' preferred over 'int16_t'

2016-11-14 Thread Shiva Kerdel



-Original Message-
From: Dan Carpenter [mailto:dan.carpen...@oracle.com]
Sent: Monday, November 14, 2016 4:06 AM
To: Stuart Yoder 
Cc: Shiva Kerdel ; de...@driverdev.osuosl.org; gre...@linuxfoundation.org;
linux-ker...@vger.kernel.org; Nipun Gupta ; tred...@nvidia.com; Laurentiu Tudor

Subject: Re: [PATCH 1/2] Staging: fsl-mc: include: mc: Kernel type 's16' preferred over 'int16_t'

On Fri, Nov 11, 2016 at 02:52:31PM +, Stuart Yoder wrote:

diff --git a/drivers/staging/fsl-mc/include/mc-bus.h b/drivers/staging/fsl-mc/include/mc-bus.h
index e915574..c7cad87 100644
--- a/drivers/staging/fsl-mc/include/mc-bus.h
+++ b/drivers/staging/fsl-mc/include/mc-bus.h
@@ -42,8 +42,8 @@ struct msi_domain_info;
   */
  struct fsl_mc_resource_pool {
enum fsl_mc_pool_type type;
-   int16_t max_count;
-   int16_t free_count;
+   s16 max_count;

My understanding is that this has to be signed because the design of
this driver is that we keep adding devices until the counter
overflows.  After that there are a couple tests for
"if (WARN_ON(res_pool->max_count < 0)) " which prevent the driver from
working again.

This all seems pretty horrible.

Can you elaborate?

The resource pools managed by this driver are populated by hardware objects
discovered when the fsl-mc bus probes a DPRC/container.

The number of potential objects discovered of a given type is in the hundreds,
so a signed 16-bit number is orders of magnitude larger than anything we will
ever encounter.

Would you feel better about this if max_count was an int?

Yeah.


The max_count reflects the total number of objects discovered.  If that is
exceeded we display a warning, because something is horribly wrong.  Nothing
stops working, the allocator simply refuses to add anything else to the
free list.

I didn't look at this carefully...  Anyway we can't remove devices
either.  If we just had an upper bound instead of overflowing the s16
then we could still remove devices.


The only reason max_count is there at all is as an internal check against
bugs and resource leaks.  If the driver is being removed and a resource
pool is being freed, max_count must be zero...i.e. all objects should have
been removed.  If not, there is a leak somewhere.  So, it's a sanity check.


Just use a normal upper bound with a #define instead of a magic number
hidden and then disguised as an integer overflow.

Ok, agree that it would be clearer like that.

Shiva, can you respin this patch and just make both max_count and free_count
of type "int"?

I will get Dan's suggestion sent as a separate patch...to #define the upper
bound instead of relying on integer overflow.

Thanks,
Stuart

I will do that, thank you for the clarification of what I should do.

Thanks,
Shiva Kerdel
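
For illustration only: a minimal userspace sketch of the explicit upper bound
Dan suggests, with both counters as plain int. POOL_MAX_COUNT, struct
res_pool and the helper names are hypothetical; the thread only says real
object counts are "in the hundreds".

```c
#include <assert.h>

/* Hypothetical bound, chosen only for this sketch. */
#define POOL_MAX_COUNT 4096

struct res_pool {
	int max_count;	/* objects added and not yet removed */
	int free_count;	/* objects currently on the free list */
};

/* Returns 0 on success, -1 if the pool is already at its bound. */
static int pool_add(struct res_pool *pool)
{
	if (pool->max_count >= POOL_MAX_COUNT)
		return -1;
	pool->max_count++;
	pool->free_count++;
	return 0;
}

/* Returns 0 on success, -1 if there is nothing free to remove. */
static int pool_remove(struct res_pool *pool)
{
	if (pool->free_count <= 0)
		return -1;
	pool->free_count--;
	pool->max_count--;
	return 0;
}

static void pool_example(void)
{
	struct res_pool pool = { 0, 0 };
	int i;

	for (i = 0; i < POOL_MAX_COUNT; i++)
		assert(pool_add(&pool) == 0);

	/* Bound reached: the add is refused and the counters stay valid. */
	assert(pool_add(&pool) == -1);
	assert(pool.max_count == POOL_MAX_COUNT);

	/* Unlike the overflow scheme, removal keeps working afterwards. */
	assert(pool_remove(&pool) == 0);
	assert(pool_add(&pool) == 0);
}
```

Because the counters never go negative, hitting the limit does not leave the
pool in a state where devices can no longer be removed.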


Re: [PATCH RFC tip/core/rcu] SRCU rewrite

2016-11-14 Thread Peter Zijlstra
On Mon, Nov 14, 2016 at 10:36:36AM -0800, Paul E. McKenney wrote:
> SRCU uses two per-cpu counters: a nesting counter to count the number of
> active critical sections, and a sequence counter to ensure that the nesting
> counters don't change while they are being added together in
> srcu_readers_active_idx_check().
> 
> This patch instead uses per-cpu lock and unlock counters. Because both
> counters only increase and srcu_readers_active_idx_check() reads the unlock
> counter before the lock counter, this achieves the same end without having
> to increment two different counters in srcu_read_lock(). This also saves a
> smp_mb() in srcu_readers_active_idx_check().

A very small improvement... I feel SRCU has much bigger issues :/
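
For illustration only: a single-threaded sketch of the lock/unlock counter
scheme described in the quoted changelog. All names here are hypothetical,
the per-index arrays of real SRCU are collapsed into one set of counters, and
memory ordering between the two passes is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count */

struct sketch_srcu {
	unsigned long lock_count[SKETCH_NR_CPUS];
	unsigned long unlock_count[SKETCH_NR_CPUS];
};

/* Each reader-side operation touches exactly one counter. */
static void sketch_read_lock(struct sketch_srcu *sp, int cpu)
{
	sp->lock_count[cpu]++;
}

static void sketch_read_unlock(struct sketch_srcu *sp, int cpu)
{
	sp->unlock_count[cpu]++;
}

/*
 * Sum the unlock counters first, then the lock counters. Both only
 * increase, so every unlock seen in the first pass has its lock
 * included in the second pass; equal sums mean no reader was active.
 */
static bool sketch_readers_idle(const struct sketch_srcu *sp)
{
	unsigned long unlocks = 0, locks = 0;
	int cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		unlocks += sp->unlock_count[cpu];
	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		locks += sp->lock_count[cpu];

	return locks == unlocks;
}

static void srcu_counter_example(void)
{
	struct sketch_srcu sp = { { 0 }, { 0 } };

	assert(sketch_readers_idle(&sp));
	sketch_read_lock(&sp, 0);
	assert(!sketch_readers_idle(&sp));
	/* The reader may finish on another CPU; only the sums matter. */
	sketch_read_unlock(&sp, 2);
	assert(sketch_readers_idle(&sp));
}
```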


[PATCH] [media] ir-hix5hd2: make hisilicon,power-syscon property deprecated

2016-11-14 Thread Jiancheng Xue
From: Ruqiang Ju 

The clock of IR can be provided by the clock provider and controlled
by common clock framework APIs.

Signed-off-by: Ruqiang Ju 
Signed-off-by: Jiancheng Xue 
---
 .../devicetree/bindings/media/hix5hd2-ir.txt   |  6 +++---
 drivers/media/rc/ir-hix5hd2.c  | 25 ++
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/Documentation/devicetree/bindings/media/hix5hd2-ir.txt 
b/Documentation/devicetree/bindings/media/hix5hd2-ir.txt
index fb5e760..54e1bed 100644
--- a/Documentation/devicetree/bindings/media/hix5hd2-ir.txt
+++ b/Documentation/devicetree/bindings/media/hix5hd2-ir.txt
@@ -8,10 +8,11 @@ Required properties:
  the device. The interrupt specifier format depends on the interrupt
  controller parent.
- clocks: clock phandle and specifier pair.
-   - hisilicon,power-syscon: phandle of syscon used to control power.

 Optional properties:
- linux,rc-map-name : Remote control map name.
+   - hisilicon,power-syscon: DEPRECATED. Don't use this in new dts files.
+   Provide correct clocks instead.

 Example node:

@@ -19,7 +20,6 @@ Example node:
compatible = "hisilicon,hix5hd2-ir";
reg = <0xf8001000 0x1000>;
interrupts = <0 47 4>;
-   clocks = <&clock HIX5HD2_FIXED_24M>;
-   hisilicon,power-syscon = <&sysctrl>;
+   clocks = <&clock HIX5HD2_IR_CLOCK>;
linux,rc-map-name = "rc-tivo";
};
diff --git a/drivers/media/rc/ir-hix5hd2.c b/drivers/media/rc/ir-hix5hd2.c
index d0549fb..d26907e 100644
--- a/drivers/media/rc/ir-hix5hd2.c
+++ b/drivers/media/rc/ir-hix5hd2.c
@@ -75,15 +75,22 @@ static void hix5hd2_ir_enable(struct hix5hd2_ir_priv *dev, bool on)
 {
u32 val;

-   regmap_read(dev->regmap, IR_CLK, &val);
-   if (on) {
-   val &= ~IR_CLK_RESET;
-   val |= IR_CLK_ENABLE;
+   if (dev->regmap) {
+   regmap_read(dev->regmap, IR_CLK, &val);
+   if (on) {
+   val &= ~IR_CLK_RESET;
+   val |= IR_CLK_ENABLE;
+   } else {
+   val &= ~IR_CLK_ENABLE;
+   val |= IR_CLK_RESET;
+   }
+   regmap_write(dev->regmap, IR_CLK, val);
} else {
-   val &= ~IR_CLK_ENABLE;
-   val |= IR_CLK_RESET;
+   if (on)
+   clk_prepare_enable(dev->clock);
+   else
+   clk_disable_unprepare(dev->clock);
}
-   regmap_write(dev->regmap, IR_CLK, val);
 }

 static int hix5hd2_ir_config(struct hix5hd2_ir_priv *priv)
@@ -207,8 +214,8 @@ static int hix5hd2_ir_probe(struct platform_device *pdev)
priv->regmap = syscon_regmap_lookup_by_phandle(node,
   
"hisilicon,power-syscon");
if (IS_ERR(priv->regmap)) {
-   dev_err(dev, "no power-reg\n");
-   return -EINVAL;
+   dev_info(dev, "no power-reg\n");
+   priv->regmap = NULL;
}

res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
--
1.9.1



Re: [RFC][PATCH 5/7] kref: Implement kref_put_lock()

2016-11-14 Thread Peter Zijlstra
On Mon, Nov 14, 2016 at 12:35:48PM -0800, Kees Cook wrote:
> On Mon, Nov 14, 2016 at 9:39 AM, Peter Zijlstra  wrote:
> > Because home-rolling your own is _awesome_, stop doing it. Provide
> > kref_put_lock(), just like kref_put_mutex() but for a spinlock.
> >
> > Signed-off-by: Peter Zijlstra (Intel) 
> > ---
> >  include/linux/kref.h |   21 +++--
> >  net/sunrpc/svcauth.c |   15 ++-
> >  2 files changed, 25 insertions(+), 11 deletions(-)
> >
> > --- a/include/linux/kref.h
> > +++ b/include/linux/kref.h
> > @@ -86,12 +86,21 @@ static inline int kref_put_mutex(struct
> >  struct mutex *lock)
> >  {
> > WARN_ON(release == NULL);
> 
> This WARN_ON makes sense, yes, though it seems like it should be dealt
> with differently. If it's NULL, we'll just Oops when we call release()
> later... Seems like this should saturate the kref or something else
> similar.

So I simply took the pattern from the existing kref_put().

But I like it more in these kref_put_{lock,mutex}() variants, because
someone will need to unlock. If we simply crash/bug without unlock we'll
have broken state the rest of the kernel cannot fix up.
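
For illustration only: a single-threaded sketch of the contract discussed
here, where the final put takes the lock and calls release() with it held, so
release() is the one that has to unlock. The types and names are hypothetical
stand-ins, not the kernel's kref or spinlock API, and plain ints replace the
atomic operations.

```c
#include <assert.h>

/* Stand-in for a spinlock; only tracks whether it is held. */
struct sketch_lock {
	int held;
};

static void sketch_lock_acquire(struct sketch_lock *lock)
{
	assert(!lock->held);
	lock->held = 1;
}

static void sketch_lock_release(struct sketch_lock *lock)
{
	assert(lock->held);
	lock->held = 0;
}

struct sketch_kref {
	int refcount;
};

/*
 * Drop one reference. On the final put, take the lock, call release()
 * with the lock held and return 1; release() must drop the lock.
 * Otherwise return 0 without touching the lock.
 */
static int sketch_kref_put_lock(struct sketch_kref *kref,
				void (*release)(struct sketch_kref *kref),
				struct sketch_lock *lock)
{
	if (kref->refcount > 1) {
		kref->refcount--;
		return 0;
	}
	sketch_lock_acquire(lock);
	kref->refcount--;
	release(kref);
	return 1;
}

static struct sketch_lock example_lock;
static int example_released;

static void example_release(struct sketch_kref *kref)
{
	(void)kref;
	example_released = 1;
	sketch_lock_release(&example_lock);	/* "someone will need to unlock" */
}

static void kref_put_lock_example(void)
{
	struct sketch_kref kref = { 2 };

	example_released = 0;
	assert(sketch_kref_put_lock(&kref, example_release, &example_lock) == 0);
	assert(!example_released && !example_lock.held);
	assert(sketch_kref_put_lock(&kref, example_release, &example_lock) == 1);
	assert(example_released && !example_lock.held);
}
```

If the final put crashed before calling release(), the lock would stay held,
which is the broken state the reply above wants to avoid.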



Re: [RFC][PATCH 0/7] kref improvements

2016-11-14 Thread Peter Zijlstra
On Tue, Nov 15, 2016 at 08:27:42AM +0100, Greg KH wrote:
> On Mon, Nov 14, 2016 at 06:39:46PM +0100, Peter Zijlstra wrote:
> > This series unfscks kref and then implements it in terms of refcount_t.
> > 
> > x86_64-allyesconfig compile tested and boot tested with my regular config.
> > 
> > refcount_t is as per the previous thread, it BUGs on over-/underflow and
> > > saturates at UINT_MAX, such that if we ever overflow, we'll never free
> > > again.
> > 
> > 
> 
> Thanks so much for doing these, at the very least, I want to take the
> kref-abuse-fixes now as those users shouldn't be doing those foolish
> things.  Any objection for me taking some of them through my tree now?

None at all, but please double check at least the 'kill kref_sub()' one,
I might have messed up drbd or something, that code isn't entirely
transparent.
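
For illustration only: a minimal sketch of the "saturates at UINT_MAX, never
free again" behaviour quoted from the cover letter. The names are
hypothetical, plain unsigned arithmetic replaces the atomics, and an assert
stands in for the BUG on underflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

struct sketch_refcount {
	unsigned int refs;
};

static void sketch_refcount_inc(struct sketch_refcount *r)
{
	if (r->refs == UINT_MAX)
		return;		/* saturated: stay pinned at the maximum */
	r->refs++;
}

/* Returns true when the last reference was dropped. */
static bool sketch_refcount_dec_and_test(struct sketch_refcount *r)
{
	if (r->refs == UINT_MAX)
		return false;	/* saturated: the object is never freed */
	assert(r->refs != 0);	/* stands in for the underflow BUG */
	return --r->refs == 0;
}

static void refcount_example(void)
{
	struct sketch_refcount normal = { 1 };
	struct sketch_refcount big = { UINT_MAX - 1 };

	/* Ordinary lifetime: the last put reports zero. */
	assert(sketch_refcount_dec_and_test(&normal));

	/* Reaching UINT_MAX pins the counter for good. */
	sketch_refcount_inc(&big);
	sketch_refcount_inc(&big);
	assert(big.refs == UINT_MAX);
	assert(!sketch_refcount_dec_and_test(&big));
	assert(big.refs == UINT_MAX);
}
```

Leaking the object once the counter saturates trades a bounded memory leak
for the guarantee that a wrapped counter can never trigger a premature free.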


Re: [RFC][PATCH 2/7] kref: Add kref_read()

2016-11-14 Thread Peter Zijlstra
On Tue, Nov 15, 2016 at 08:28:55AM +0100, Greg KH wrote:
> On Mon, Nov 14, 2016 at 10:16:55AM -0800, Christoph Hellwig wrote:
> > On Mon, Nov 14, 2016 at 06:39:48PM +0100, Peter Zijlstra wrote:
> > > Since we need to change the implementation, stop exposing internals.
> > > 
> > > Provide kref_read() to read the current reference count; typically
> > > used for debug messages.
> > 
> > Can we just provide a printk specifier for a kref value instead as
> > that is the only valid use case for reading the value?
> 
> Yeah, that would be great as no one should be doing anything
> logic-related based on the kref value.

IIRC there are a few users that WARN_ON() the value with a minimum bound
or somesuch. Those would be left in the cold, but yes I too like the
idea of a printk() format specifier.


Re: [RFC][PATCH 0/7] kref improvements

2016-11-14 Thread Ingo Molnar

* Greg KH  wrote:

> On Mon, Nov 14, 2016 at 06:39:46PM +0100, Peter Zijlstra wrote:
> > This series unfscks kref and then implements it in terms of refcount_t.
> > 
> > x86_64-allyesconfig compile tested and boot tested with my regular config.
> > 
> > refcount_t is as per the previous thread, it BUGs on over-/underflow and
> > saturates at UINT_MAX, such that if we ever overflow, we'll never free
> > again.
> > 
> > 
> 
> Thanks so much for doing these, at the very least, I want to take the
> kref-abuse-fixes now as those users shouldn't be doing those foolish
> things.  Any objection for me taking some of them through my tree now?

Very nice series indeed!

We normally route atomics related patches through tip:locking/core (there's
also tip:atomic/core), but this is a special case I think, given how broadly it
interacts with driver code.

So both would work I think: we could concentrate these and only these patches
into tip:atomic/core as an append-only tree, or you can carry them in the driver
tree - whichever variant you prefer!

Thanks,

Ingo


Re: [PATCH v2] tile: handle __ro_after_init like parisc does

2016-11-14 Thread Heiko Carstens
On Mon, Nov 14, 2016 at 01:12:05PM -0800, Kees Cook wrote:
> At some point here, I want to collect all the arch maintainers and
> discuss the options for correctly reflecting the three data
> memory-protection needs we have:
> 
> - always read-only
> - read-only after init
> - read-only except during rare updates
> 
> (The latter one doesn't exist at all yet...)
> 
> x86, arm, and arm64 use mark_rodata_ro() after init finishes, so they
> don't technically implement "always read-only". parisc, tile, powerpc,
> others have "always read-only", but disable read-only-after-init since
> they don't use mark_rodata_ro(). I think s390 has recently implemented
> both, but I have to double-check...

Yes, s390 has both: an early always read-only support, which is effective
as soon as paging_init() has set up and enabled page tables.
Our mark_rodata_ro() implementation only makes the ro_after_init section
read-only.



Re: [RFC PATCH] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-14 Thread Juergen Gross
On 15/11/16 08:15, Jan Beulich wrote:
 On 15.11.16 at 07:33,  wrote:
>> On 15/11/16 01:11, Alex Thorlton wrote:
>>> Hey everyone,
>>>
>>> We're having problems with large systems hitting a BUG in
>>> xen_memory_setup, due to extra e820 entries created in the
>>> XENMEM_machine_memory_map callback.  The change in the patch gets things
>>> working, but Boris and I wanted to get opinions on whether or not this
>>> is the appropriate/entire solution, which is why I've sent it as an RFC
>>> for now.
>>>
>>> Boris pointed out to me that E820_X_MAX is only large when CONFIG_EFI=y,
>>> which is a detail worth discussing.  He proposed possibly adding
>>> CONFIG_XEN to the conditions under which we set E820_X_MAX to a larger
>>> value than E820MAX, since the Xen e820 table isn't bound by the
>>> zero-page memory limitations.
>>>
>>> I do *slightly* question the use of E820_X_MAX here, only from a
>>> cosmetic perspective, as I believe this macro is intended to describe
>>> the maximum size of the extended e820 table, which, AFAIK, is not used
>>> by the Xen HV.  That being said, there isn't exactly a "more
>>> appropriate" macro/variable to use, so this may not really be an issue.
>>>
>>> Any input on the patch, or the questions I've raised above is greatly
>>> appreciated!
>>
>> While I think extending the e820 table is the right thing to do I'm
>> questioning the assumptions here.
>>
>> Looking briefly through the Xen hypervisor sources I think it isn't
>> yet ready for such large machines: the hypervisor's e820 map seems to
>> be still limited to 128 e820 entries. Jan, did I overlook an EFI
>> specific path extending this limitation?
> 
> No, you didn't. I do question the correlation with "large machines"
> here though: The issue isn't with large machines afaict, but with
> ones having very many entries (i.e. heavily fragmented).

Fact is: non-EFI machines are limited to 128 entries.

One reason for fragmentation is NUMA, which tends to be present on large
machines and adds many entries mainly when there are lots of nodes.

So while you are of course right that the problem isn't the size of
the machine, but the memory fragmentation, the chances to show up
are much higher on large machines.

So I'd rephrase:

Looking briefly through the Xen hypervisor sources I think it isn't yet
ready for machines with many e820 entries due to memory fragmentation
to be found e.g. on very large NUMA machines with lots of nodes: ...

>> In case I'm right the Xen hypervisor should be prepared for a larger
>> e820 map, but this won't help alone as there would still be additional
>> entries for the IOAPICs created.
>>
>> So I think we need something like:
>>
>> #define E820_XEN_MAX (E820_X_MAX + MAX_IO_APICS)
>>
>> and use this for sizing xen_e820_map[].
> 
> I would say that if any change gets done here, there shouldn't be
> any static upper limit at all. That could even be viewed as in line
> with recent e820.c changes moving to dynamic allocations. In
> particular I don't see why MAX_IO_APICS would need adding in
> here, but not other (current and future) factors determining the
> (pseudo) E820 map Xen presents to Dom0.

The hypervisor interface of XENMEM_machine_memory_map requires specifying
the size of the e820 map where the hypervisor can supply
entries. The alternative would be trial and error: start with a small
e820 map and in case of error increase the size until success. I
don't like this approach. Especially as dynamic memory allocations
are not possible at this stage (using RESERVE_BRK() isn't any better
than a static __initdata array IMO).

Which other current factors do you see determining the number of
e820 entries presented to Dom0?


Juergen
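
For illustration only: a userspace sketch of the static sizing argued for
above, where the caller states the capacity of its map and the call fails if
that is too small. The numeric values are placeholders (the real E820_X_MAX
and MAX_IO_APICS depend on the kernel configuration), and
sketch_machine_memory_map() is a made-up stand-in for the hypervisor
interface, not its real signature.

```c
#include <assert.h>

/* Placeholder values for this sketch only. */
#define SKETCH_E820_X_MAX	320
#define SKETCH_MAX_IO_APICS	128
#define SKETCH_E820_XEN_MAX	(SKETCH_E820_X_MAX + SKETCH_MAX_IO_APICS)

struct sketch_e820entry {
	unsigned long long addr;
	unsigned long long size;
	unsigned int type;
};

/*
 * Stand-in for the hypervisor call. The caller passes the capacity of
 * buf in *nr_entries; on success the buffer is filled and *nr_entries
 * is set to the number of entries written. A too-small buffer fails.
 */
static int sketch_machine_memory_map(struct sketch_e820entry *buf,
				     unsigned int *nr_entries,
				     unsigned int entries_available)
{
	unsigned int i;

	if (*nr_entries < entries_available)
		return -1;
	for (i = 0; i < entries_available; i++) {
		buf[i].addr = (unsigned long long)i << 20;
		buf[i].size = 1ULL << 20;
		buf[i].type = 1;
	}
	*nr_entries = entries_available;
	return 0;
}

/* Statically sized map, as no dynamic allocation is available this early. */
static struct sketch_e820entry sketch_xen_e820_map[SKETCH_E820_XEN_MAX];

static void e820_sizing_example(void)
{
	unsigned int nr = 128;

	/* A 128-entry capacity is not enough for a 200-entry machine map. */
	assert(sketch_machine_memory_map(sketch_xen_e820_map, &nr, 200) == -1);

	/* The larger static bound succeeds in a single call. */
	nr = SKETCH_E820_XEN_MAX;
	assert(sketch_machine_memory_map(sketch_xen_e820_map, &nr, 200) == 0);
	assert(nr == 200);
}
```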


Re: Linux 4.4.32

2016-11-14 Thread Greg KH
diff --git a/Makefile b/Makefile
index 7c6f28e7a2f6..fba9b09a1330 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 4
 PATCHLEVEL = 4
-SUBLEVEL = 31
+SUBLEVEL = 32
 EXTRAVERSION =
 NAME = Blurry Fish Butt
 
diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c
index bbe56871245c..4298aeb1e20f 100644
--- a/arch/mips/kvm/emulate.c
+++ b/arch/mips/kvm/emulate.c
@@ -822,7 +822,7 @@ static void kvm_mips_invalidate_guest_tlb(struct kvm_vcpu *vcpu,
bool user;
 
/* No need to flush for entries which are already invalid */
-   if (!((tlb->tlb_lo[0] | tlb->tlb_lo[1]) & ENTRYLO_V))
+   if (!((tlb->tlb_lo0 | tlb->tlb_lo1) & MIPS3_PG_V))
return;
/* User address space doesn't need flushing for KSeg2/3 changes */
user = tlb->tlb_hi < KVM_GUEST_KSEG0;
diff --git a/drivers/gpu/drm/amd/amdgpu/atombios_dp.c b/drivers/gpu/drm/amd/amdgpu/atombios_dp.c
index 21aacc1f45c1..7f85c2c1d681 100644
--- a/drivers/gpu/drm/amd/amdgpu/atombios_dp.c
+++ b/drivers/gpu/drm/amd/amdgpu/atombios_dp.c
@@ -265,15 +265,27 @@ static int amdgpu_atombios_dp_get_dp_link_config(struct drm_connector *connector
unsigned max_lane_num = drm_dp_max_lane_count(dpcd);
unsigned lane_num, i, max_pix_clock;
 
-   for (lane_num = 1; lane_num <= max_lane_num; lane_num <<= 1) {
-   for (i = 0; i < ARRAY_SIZE(link_rates) && link_rates[i] <= max_link_rate; i++) {
-   max_pix_clock = (lane_num * link_rates[i] * 8) / bpp;
+   if (amdgpu_connector_encoder_get_dp_bridge_encoder_id(connector) ==
+   ENCODER_OBJECT_ID_NUTMEG) {
+   for (lane_num = 1; lane_num <= max_lane_num; lane_num <<= 1) {
+   max_pix_clock = (lane_num * 27 * 8) / bpp;
if (max_pix_clock >= pix_clock) {
*dp_lanes = lane_num;
-   *dp_rate = link_rates[i];
+   *dp_rate = 27;
return 0;
}
}
+   } else {
+   for (i = 0; i < ARRAY_SIZE(link_rates) && link_rates[i] <= max_link_rate; i++) {
+   for (lane_num = 1; lane_num <= max_lane_num; lane_num <<= 1) {
+   max_pix_clock = (lane_num * link_rates[i] * 8) / bpp;
+   if (max_pix_clock >= pix_clock) {
+   *dp_lanes = lane_num;
+   *dp_rate = link_rates[i];
+   return 0;
+   }
+   }
+   }
}
 
return -EINVAL;
diff --git a/drivers/gpu/drm/radeon/atombios_dp.c b/drivers/gpu/drm/radeon/atombios_dp.c
index 44ee72e04df9..b5760851195c 100644
--- a/drivers/gpu/drm/radeon/atombios_dp.c
+++ b/drivers/gpu/drm/radeon/atombios_dp.c
@@ -315,15 +315,27 @@ int radeon_dp_get_dp_link_config(struct drm_connector *connector,
unsigned max_lane_num = drm_dp_max_lane_count(dpcd);
unsigned lane_num, i, max_pix_clock;
 
-   for (lane_num = 1; lane_num <= max_lane_num; lane_num <<= 1) {
-   for (i = 0; i < ARRAY_SIZE(link_rates) && link_rates[i] <= max_link_rate; i++) {
-   max_pix_clock = (lane_num * link_rates[i] * 8) / bpp;
+   if (radeon_connector_encoder_get_dp_bridge_encoder_id(connector) ==
+   ENCODER_OBJECT_ID_NUTMEG) {
+   for (lane_num = 1; lane_num <= max_lane_num; lane_num <<= 1) {
+   max_pix_clock = (lane_num * 27 * 8) / bpp;
if (max_pix_clock >= pix_clock) {
*dp_lanes = lane_num;
-   *dp_rate = link_rates[i];
+   *dp_rate = 27;
return 0;
}
}
+   } else {
+   for (i = 0; i < ARRAY_SIZE(link_rates) && link_rates[i] <= max_link_rate; i++) {
+   for (lane_num = 1; lane_num <= max_lane_num; lane_num <<= 1) {
+   max_pix_clock = (lane_num * link_rates[i] * 8) / bpp;
+   if (max_pix_clock >= pix_clock) {
+   *dp_lanes = lane_num;
+   *dp_rate = link_rates[i];
+   return 0;
+   }
+   }
+   }
}
 
return -EINVAL;
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index ca5ac5d6f4e6..49056c33be74 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -18142,14 +18142,14 @@ static pci_ers_result_t tg3_io_error_detected(struct pci_dev *pdev,
 
rtnl_lock();
 
-   /* We needn't recover from permanent err

Linux 4.8.8

2016-11-14 Thread Greg KH
I'm announcing the release of the 4.8.8 kernel.

All users of the 4.8 kernel series must upgrade.

The updated 4.8.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.8.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile   |2 
 arch/powerpc/include/asm/checksum.h|   12 +--
 drivers/infiniband/ulp/ipoib/ipoib.h   |   20 --
 drivers/infiniband/ulp/ipoib/ipoib_cm.c|   15 ++--
 drivers/infiniband/ulp/ipoib/ipoib_ib.c|   12 +--
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |   54 ++--
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |6 +
 drivers/net/ethernet/freescale/fec_main.c  |   18 ++---
 drivers/net/ethernet/mellanox/mlx4/en_cq.c |   10 ++-
 drivers/net/geneve.c   |2 
 drivers/net/hyperv/netvsc_drv.c|   19 +++--
 drivers/net/macsec.c   |   26 +---
 drivers/net/phy/phy.c  |   22 ++
 drivers/net/vxlan.c|2 
 drivers/ptp/ptp_chardev.c  |1 
 drivers/scsi/megaraid/megaraid_sas.h   |2 
 drivers/scsi/megaraid/megaraid_sas_base.c  |   13 +---
 drivers/usb/dwc3/gadget.c  |7 +-
 include/linux/netdevice.h  |   41 
 include/net/ip.h   |4 -
 include/net/ip6_route.h|1 
 include/uapi/linux/rtnetlink.h |2 
 net/8021q/vlan.c   |2 
 net/bridge/br_multicast.c  |   23 ---
 net/core/dev.c |   80 ++---
 net/core/pktgen.c  |   38 ++-
 net/ethernet/eth.c |2 
 net/ipv4/af_inet.c |2 
 net/ipv4/fou.c |4 -
 net/ipv4/gre_offload.c |2 
 net/ipv4/ip_sockglue.c |   11 +--
 net/ipv4/sysctl_net_ipv4.c |8 +-
 net/ipv4/udp.c |2 
 net/ipv4/udp_offload.c |2 
 net/ipv6/addrconf.c|2 
 net/ipv6/ip6_offload.c |2 
 net/ipv6/ip6_tunnel.c  |3 
 net/ipv6/route.c   |6 +
 net/ipv6/tcp_ipv6.c|   20 +++---
 net/ipv6/udp.c |3 
 net/netlink/af_netlink.c   |7 +-
 net/packet/af_packet.c |   10 +--
 net/sched/act_api.c|   19 +++--
 net/sched/act_vlan.c   |9 ++
 net/sched/cls_api.c|3 
 net/sctp/output.c  |8 ++
 net/sctp/sm_statefuns.c|   12 +--
 net/sctp/socket.c  |5 +
 net/switchdev/switchdev.c  |9 ++
 49 files changed, 368 insertions(+), 217 deletions(-)

Andrew Collins (1):
  net: Add netdev all_adj_list refcnt propagation to fix panic

Andrew Lunn (1):
  net: phy: Trigger state machine on state change and not polling.

Anoob Soman (1):
  packet: call fanout_release, while UNREGISTERING a netdev

Brenden Blanco (1):
  net/mlx4_en: fixup xdp tx irq to match rx

David Ahern (1):
  net: ipv6: Do not consider link state for nexthop validation

Eli Cooper (1):
  ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()

Eric Dumazet (5):
  netlink: do not enter direct reclaim from netlink_dump()
  ipv6: tcp: restore IP6CB for pktoptions skbs
  net: pktgen: remove rcu locking in pktgen_change_name()
  ipv4: disable BH in set_ping_group_range()
  udp: fix IP_CHECKSUM handling

Fabio Estevam (1):
  net: fec: Call swap_buffer() prior to IP header alignment

Felipe Balbi (1):
  usb: dwc3: gadget: properly account queued requests

Gavin Schenk (1):
  net: fec: set mac address unconditionally

Greg Kroah-Hartman (1):
  Linux 4.8.8

Ido Schimmel (2):
  switchdev: Execute bridge ndos only for bridge ports
  net: core: Correctly iterate over lower adjacency list

Ivan Vecera (1):
  arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofold

Jamal Hadi Salim (1):
  net sched filters: fix notification of filter delete with proper handle

Jiri Pirko (1):
  rtnetlink: Add rtnexthop offload flag to compare mask

Jiri Slaby (1):
  net: sctp, forbid negative length

Kashyap Desai (1):
  scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices


Linux 4.4.32

2016-11-14 Thread Greg KH
I'm announcing the release of the 4.4.32 kernel.

All users of the 4.4 kernel series must upgrade.

The updated 4.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.4.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile  |2 
 arch/mips/kvm/emulate.c   |2 
 drivers/gpu/drm/amd/amdgpu/atombios_dp.c  |   20 ++--
 drivers/gpu/drm/radeon/atombios_dp.c  |   20 ++--
 drivers/net/ethernet/broadcom/tg3.c   |   10 ++--
 drivers/net/ethernet/freescale/fec_main.c |   10 ++--
 drivers/net/geneve.c  |2 
 drivers/net/vxlan.c   |2 
 drivers/of/of_reserved_mem.c  |8 ++-
 drivers/scsi/megaraid/megaraid_sas.h  |2 
 include/linux/mroute.h|2 
 include/linux/mroute6.h   |2 
 include/linux/netdevice.h |   40 -
 include/net/ip.h  |4 -
 include/net/sch_generic.h |9 +++
 include/net/sock.h|   10 
 include/uapi/linux/rtnetlink.h|2 
 net/8021q/vlan.c  |2 
 net/bridge/br_multicast.c |   23 ++---
 net/core/dev.c|   70 --
 net/core/pktgen.c |   38 
 net/ethernet/eth.c|2 
 net/ipv4/af_inet.c|2 
 net/ipv4/fou.c|4 -
 net/ipv4/gre_offload.c|2 
 net/ipv4/ip_sockglue.c|   10 ++--
 net/ipv4/ipmr.c   |3 -
 net/ipv4/route.c  |3 -
 net/ipv4/sysctl_net_ipv4.c|8 +--
 net/ipv4/tcp_input.c  |3 -
 net/ipv4/tcp_output.c |   15 +++---
 net/ipv4/udp.c|2 
 net/ipv4/udp_offload.c|4 -
 net/ipv6/addrconf.c   |2 
 net/ipv6/ip6_gre.c|1 
 net/ipv6/ip6_offload.c|2 
 net/ipv6/ip6_tunnel.c |2 
 net/ipv6/ip6mr.c  |5 +-
 net/ipv6/route.c  |4 +
 net/ipv6/tcp_ipv6.c   |   20 
 net/ipv6/udp.c|3 -
 net/netlink/af_netlink.c  |9 +--
 net/packet/af_packet.c|   10 ++--
 net/sched/act_vlan.c  |9 +++
 net/sched/cls_api.c   |3 -
 net/sctp/sm_statefuns.c   |   12 ++---
 net/sctp/socket.c |5 +-
 47 files changed, 275 insertions(+), 150 deletions(-)

Alex Deucher (4):
  drm/amdgpu/dp: add back special handling for NUTMEG
  drm/amdgpu: fix DP mode validation
  drm/radeon/dp: add back special handling for NUTMEG
  drm/radeon: fix DP mode validation

Andrew Collins (1):
  net: Add netdev all_adj_list refcnt propagation to fix panic

Anoob Soman (1):
  packet: call fanout_release, while UNREGISTERING a netdev

Douglas Caetano dos Santos (1):
  tcp: fix wrong checksum calculation on MTU probing

Eric Dumazet (8):
  tcp: fix overflow in __tcp_retransmit_skb()
  net: avoid sk_forward_alloc overflows
  tcp: fix a compile error in DBGUNDO()
  netlink: do not enter direct reclaim from netlink_dump()
  ipv6: tcp: restore IP6CB for pktoptions skbs
  net: pktgen: remove rcu locking in pktgen_change_name()
  ipv4: disable BH in set_ping_group_range()
  udp: fix IP_CHECKSUM handling

Gavin Schenk (1):
  net: fec: set mac address unconditionally

Greg Kroah-Hartman (2):
  Revert KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  Linux 4.4.32

Jamal Hadi Salim (1):
  net sched filters: fix notification of filter delete with proper handle

James Hogan (1):
  KVM: MIPS: Drop other CPU ASIDs on guest MMU changes

Jiri Pirko (1):
  rtnetlink: Add rtnexthop offload flag to compare mask

Jiri Slaby (1):
  net: sctp, forbid negative length

Lance Richardson (1):
  ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()

Marcelo Ricardo Leitner (1):
  sctp: validate chunk len before actually using it

Milton Miller (1):
  tg3: Avoid NULL pointer dereference in tg3_io_error_detected()

Nicolas Dichtel (1):
  ipv6: correctly add local routes when lo goes up

Nikolay Aleksandrov (2):
  ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route
  bridge: multicast: restore perm router ports on multicast enable

Paolo Abeni (1):
  net: pktgen: fix pkt_size

Sabrina Dubroca (1):
 

Re: Linux 4.8.8

2016-11-14 Thread Greg KH
diff --git a/Makefile b/Makefile
index 4d0f28cb481d..8f18daa2c76a 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 4
 PATCHLEVEL = 8
-SUBLEVEL = 7
+SUBLEVEL = 8
 EXTRAVERSION =
 NAME = Psychotic Stoned Sheep
 
diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index ee655ed1ff1b..1e8fceb308a5 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -53,10 +53,8 @@ static inline __sum16 csum_fold(__wsum sum)
return (__force __sum16)(~((__force u32)sum + tmp) >> 16);
 }
 
-static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
- unsigned short len,
- unsigned short proto,
- __wsum sum)
+static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr, __u32 len,
+   __u8 proto, __wsum sum)
 {
 #ifdef __powerpc64__
unsigned long s = (__force u32)sum;
@@ -83,10 +81,8 @@ static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
  * computes the checksum of the TCP/UDP pseudo-header
  * returns a 16-bit checksum, already complemented
  */
-static inline __sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr,
-   unsigned short len,
-   unsigned short proto,
-   __wsum sum)
+static inline __sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr, __u32 len,
+   __u8 proto, __wsum sum)
 {
return csum_fold(csum_tcpudp_nofold(saddr, daddr, len, proto, sum));
 }
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 9dbfcc0ab577..5ff64afd69f9 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -63,6 +63,8 @@ enum ipoib_flush_level {
 
 enum {
IPOIB_ENCAP_LEN   = 4,
+   IPOIB_PSEUDO_LEN  = 20,
+   IPOIB_HARD_LEN= IPOIB_ENCAP_LEN + IPOIB_PSEUDO_LEN,
 
IPOIB_UD_HEAD_SIZE= IB_GRH_BYTES + IPOIB_ENCAP_LEN,
IPOIB_UD_RX_SG= 2, /* max buffer needed for 4K mtu */
@@ -134,15 +136,21 @@ struct ipoib_header {
u16 reserved;
 };
 
-struct ipoib_cb {
-   struct qdisc_skb_cb qdisc_cb;
-   u8  hwaddr[INFINIBAND_ALEN];
+struct ipoib_pseudo_header {
+   u8  hwaddr[INFINIBAND_ALEN];
 };
 
-static inline struct ipoib_cb *ipoib_skb_cb(const struct sk_buff *skb)
+static inline void skb_add_pseudo_hdr(struct sk_buff *skb)
 {
-   BUILD_BUG_ON(sizeof(skb->cb) < sizeof(struct ipoib_cb));
-   return (struct ipoib_cb *)skb->cb;
+   char *data = skb_push(skb, IPOIB_PSEUDO_LEN);
+
+   /*
+* only the ipoib header is present now, make room for a dummy
+* pseudo header and set skb field accordingly
+*/
+   memset(data, 0, IPOIB_PSEUDO_LEN);
+   skb_reset_mac_header(skb);
+   skb_pull(skb, IPOIB_HARD_LEN);
 }
 
 /* Used for all multicast joins (broadcast, IPv4 mcast and IPv6 mcast) */
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 4ad297d3de89..339a1eecdfe3 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -63,6 +63,8 @@ MODULE_PARM_DESC(cm_data_debug_level,
 #define IPOIB_CM_RX_DELAY   (3 * 256 * HZ)
 #define IPOIB_CM_RX_UPDATE_MASK (0x3)
 
+#define IPOIB_CM_RX_RESERVE (ALIGN(IPOIB_HARD_LEN, 16) - IPOIB_ENCAP_LEN)
+
 static struct ib_qp_attr ipoib_cm_err_attr = {
.qp_state = IB_QPS_ERR
 };
@@ -146,15 +148,15 @@ static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev,
struct sk_buff *skb;
int i;
 
-   skb = dev_alloc_skb(IPOIB_CM_HEAD_SIZE + 12);
+   skb = dev_alloc_skb(ALIGN(IPOIB_CM_HEAD_SIZE + IPOIB_PSEUDO_LEN, 16));
if (unlikely(!skb))
return NULL;
 
/*
-* IPoIB adds a 4 byte header. So we need 12 more bytes to align the
+* IPoIB adds a IPOIB_ENCAP_LEN byte header, this will align the
 * IP header to a multiple of 16.
 */
-   skb_reserve(skb, 12);
+   skb_reserve(skb, IPOIB_CM_RX_RESERVE);
 
mapping[0] = ib_dma_map_single(priv->ca, skb->data, IPOIB_CM_HEAD_SIZE,
   DMA_FROM_DEVICE);
@@ -624,9 +626,9 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
if (wc->byte_len < IPOIB_CM_COPYBREAK) {
int dlen = wc->byte_len;
 
-   small_skb = dev_alloc_skb(dlen + 12);
+   small_skb = dev_alloc_skb(dlen + IPOIB_CM_RX_RESERVE);
if (small_skb) {
-   skb_reserve(small_skb, 12);
+   skb_reserve(small_skb, IPOIB_CM_RX_RESERVE);
ib_dma_sync_single_for_cpu(priv->ca, 
rx_rin

Re: [PATCHSET 0/7] perf sched: Introduce timehist command, again (v1)

2016-11-14 Thread Ingo Molnar

* Namhyung Kim  wrote:

> > > By default it shows the individual schedule events, including the time 
> > > between
> > > sched-in events for the task, the task scheduling delay (time between 
> > > wakeup
> > > and actually running) and run time for the task:
> > > 
> > >            time  cpu   task name[tid/pid]  b/n time  sch delay  run time
> > >   -------------  ----  ------------------  --------  ---------  --------
> > >    79371.874569  [11]  gcc[31949]             0.014      0.000     1.148
> > >    79371.874591  [10]  gcc[31951]             0.000      0.000     0.024
> > >    79371.874603  [10]  migration/10[59]       3.350      0.004     0.011
> > >    79371.874604  [11]                         1.148      0.000     0.035
> > >    79371.874723  [05]                         0.016      0.000     1.383
> > >    79371.874746  [05]  gcc[31949]             0.153      0.078     0.022
> > > ...
> > 
> > What does the 'b/n' abbreviation stand for? 'Between'? Could we call the 
> > column 
> > 'sch wait' instead, or so?
> 
> Looks better, or what about 'wait time'?

Works for me!

> I'd go with the first option - simply adding arrows.  It's good enough to 
> identify each function IMHO.

Ok!

Thanks,

Ingo


Re: [RFC][PATCH 2/7] kref: Add kref_read()

2016-11-14 Thread Greg KH
On Mon, Nov 14, 2016 at 06:39:48PM +0100, Peter Zijlstra wrote:
> Since we need to change the implementation, stop exposing internals.
> 
> Provide kref_read() to read the current reference count; typically
> used for debug messages.
> 
> Kills two anti-patterns:
> 
>   atomic_read(&kref->refcount)
>   kref->refcount.counter
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  drivers/block/drbd/drbd_req.c|2 -
>  drivers/block/rbd.c  |8 ++---
>  drivers/block/virtio_blk.c   |2 -
>  drivers/gpu/drm/drm_gem_cma_helper.c |2 -
>  drivers/gpu/drm/drm_info.c   |2 -
>  drivers/gpu/drm/drm_mode_object.c|4 +-
>  drivers/gpu/drm/etnaviv/etnaviv_gem.c|2 -
>  drivers/gpu/drm/msm/msm_gem.c|2 -
>  drivers/gpu/drm/nouveau/nouveau_fence.c  |2 -
>  drivers/gpu/drm/omapdrm/omap_gem.c   |2 -
>  drivers/gpu/drm/ttm/ttm_bo.c |4 +-
>  drivers/gpu/drm/ttm/ttm_object.c |2 -
>  drivers/infiniband/hw/cxgb3/iwch_cm.h|6 ++--
>  drivers/infiniband/hw/cxgb3/iwch_qp.c|2 -
>  drivers/infiniband/hw/cxgb4/iw_cxgb4.h   |6 ++--
>  drivers/infiniband/hw/cxgb4/qp.c |2 -
>  drivers/infiniband/hw/usnic/usnic_ib_sysfs.c |6 ++--
>  drivers/infiniband/hw/usnic/usnic_ib_verbs.c |4 +-
>  drivers/misc/genwqe/card_dev.c   |2 -
>  drivers/misc/mei/debugfs.c   |2 -
>  drivers/pci/hotplug/pnv_php.c|2 -
>  drivers/pci/slot.c   |2 -
>  drivers/scsi/bnx2fc/bnx2fc_io.c  |8 ++---
>  drivers/scsi/cxgbi/libcxgbi.h|4 +-
>  drivers/scsi/lpfc/lpfc_debugfs.c |2 -
>  drivers/scsi/lpfc/lpfc_els.c |2 -
>  drivers/scsi/lpfc/lpfc_hbadisc.c |   40 
> +--
>  drivers/scsi/lpfc/lpfc_init.c|3 --
>  drivers/scsi/qla2xxx/tcm_qla2xxx.c   |4 +-
>  drivers/staging/android/ion/ion.c|2 -
>  drivers/staging/comedi/comedi_buf.c  |2 -
>  drivers/target/target_core_pr.c  |   10 +++---
>  drivers/target/tcm_fc/tfc_sess.c |2 -
>  drivers/usb/gadget/function/f_fs.c   |2 -
>  fs/exofs/sys.c   |2 -
>  fs/ocfs2/cluster/netdebug.c  |2 -
>  fs/ocfs2/cluster/tcp.c   |2 -
>  fs/ocfs2/dlm/dlmdebug.c  |   12 
>  fs/ocfs2/dlm/dlmdomain.c |2 -
>  fs/ocfs2/dlm/dlmmaster.c |8 ++---
>  fs/ocfs2/dlm/dlmunlock.c |2 -
>  include/drm/drm_framebuffer.h|2 -
>  include/drm/ttm/ttm_bo_driver.h  |4 +-
>  include/linux/kref.h |5 +++
>  include/linux/sunrpc/cache.h |2 -
>  include/net/bluetooth/hci_core.h |4 +-
>  net/bluetooth/6lowpan.c  |2 -
>  net/bluetooth/a2mp.c |4 +-
>  net/bluetooth/amp.c  |4 +-
>  net/bluetooth/l2cap_core.c   |4 +-
>  net/ceph/messenger.c |4 +-
>  net/ceph/osd_client.c|   10 +++---
>  net/sunrpc/cache.c   |2 -
>  net/sunrpc/svc_xprt.c|6 ++--
>  net/sunrpc/xprtrdma/svc_rdma_transport.c |4 +-
>  55 files changed, 120 insertions(+), 116 deletions(-)
> 
> --- a/drivers/block/drbd/drbd_req.c
> +++ b/drivers/block/drbd/drbd_req.c
> @@ -520,7 +520,7 @@ static void mod_rq_state(struct drbd_req
>   /* Completion does it's own kref_put.  If we are going to
>* kref_sub below, we need req to be still around then. */
>   int at_least = k_put + !!c_put;
> - int refcount = atomic_read(&req->kref.refcount);
> + int refcount = kref_read(&req->kref);
>   if (refcount < at_least)
>   drbd_err(device,
>   "mod_rq_state: Logic BUG: %x -> %x: refcount = 
> %d, should be >= %d\n",

As proof of "things you should never do", here is one such example.

ugh.


> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -767,7 +767,7 @@ static void virtblk_remove(struct virtio
>   /* Stop all the virtqueues. */
>   vdev->config->reset(vdev);
>  
> - refc = atomic_read(&disk_to_dev(vblk->disk)->kobj.kref.refcount);
> + refc = kref_read(&disk_to_dev(vblk->disk)->kobj.kref);
>   put_disk(vblk->disk);
>   vdev->config->del_vqs(vdev);
>   kfree(vblk->vqs);

And this too, ugh, that's a huge abuse and is probably totally wrong...

thanks again for digging through this crap.  I wonder if we need to name
the kref reference variabl

Re: [RFC][PATCH 0/7] kref improvements

2016-11-14 Thread Greg KH
On Mon, Nov 14, 2016 at 06:39:46PM +0100, Peter Zijlstra wrote:
> This series unfscks kref and then implements it in terms of refcount_t.
> 
> x86_64-allyesconfig compile tested and boot tested with my regular config.
> 
> refcount_t is as per the previous thread, it BUGs on over-/underflow and
> saturates at UINT_MAX, such that if we ever overflow, we'll never free again.
> 
> 

Thanks so much for doing these.  At the very least, I want to take the
kref-abuse-fixes now, as those users shouldn't be doing those foolish
things.  Any objection to my taking some of them through my tree now?

thanks,

greg k-h
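
The saturation behaviour described in the cover letter ("saturates at
UINT_MAX, such that if we ever overflow, we'll never free again") can be
illustrated with a small user-space sketch. This is not the kernel's
refcount_t; the sat_refcount_* names are hypothetical, C11 atomics stand in
for the kernel's atomic_t, and error cases are reported through return
values instead of BUG().

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in; NOT the kernel's refcount_t. */
struct sat_refcount {
	atomic_uint refs;
};

static void sat_refcount_set(struct sat_refcount *r, unsigned int n)
{
	atomic_store(&r->refs, n);
}

static unsigned int sat_refcount_read(struct sat_refcount *r)
{
	return atomic_load(&r->refs);
}

/*
 * Take a reference. Returns false for an increment from zero (the object
 * is already dead). A saturated counter stays pinned at UINT_MAX.
 */
static bool sat_refcount_inc(struct sat_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0)
			return false;
		if (old == UINT_MAX)
			return true;	/* saturated: leave untouched */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));

	return true;
}

/*
 * Drop a reference. Returns true only when the count reaches zero.
 * A saturated or already-zero counter is never decremented, so a
 * saturated object is never reported as free.
 */
static bool sat_refcount_dec_and_test(struct sat_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0 || old == UINT_MAX)
			return false;
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old - 1));

	return old == 1;
}
```

The design choice here is that a leaked object (never freed once saturated)
is preferred over a use-after-free caused by the counter wrapping to zero.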


[PATCH] reset: hisilicon: add a polarity cell for reset line specifier

2016-11-14 Thread Jiancheng Xue
Add a polarity cell for reset line specifier. If the reset line
is asserted when the register bit is 1, the polarity is
normal. Otherwise, it is inverted.

Signed-off-by: Jiancheng Xue 
---
 .../devicetree/bindings/clock/hisi-crg.txt | 11 ---
 arch/arm/boot/dts/hi3519.dtsi  |  2 +-
 drivers/clk/hisilicon/reset.c  | 36 --
 3 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/Documentation/devicetree/bindings/clock/hisi-crg.txt 
b/Documentation/devicetree/bindings/clock/hisi-crg.txt
index e3919b6..fcbb4f3 100644
--- a/Documentation/devicetree/bindings/clock/hisi-crg.txt
+++ b/Documentation/devicetree/bindings/clock/hisi-crg.txt
@@ -25,19 +25,20 @@ to specify the clock which they consume.
 
 All these identifier could be found in .
 
-- #reset-cells: should be 2.
+- #reset-cells: should be 3.
 
 A reset signal can be controlled by writing a bit register in the CRG module.
-The reset specifier consists of two cells. The first cell represents the
+The reset specifier consists of three cells. The first cell represents the
 register offset relative to the base address. The second cell represents the
-bit index in the register.
+bit index in the register. The third cell represents the polarity of the reset
+line (0 for normal, 1 for inverted).
 
 Example: CRG nodes
 CRG: clock-reset-controller@1201 {
compatible = "hisilicon,hi3519-crg";
reg = <0x1201 0x1>;
#clock-cells = <1>;
-   #reset-cells = <2>;
+   #reset-cells = <3>;
 };
 
 Example: consumer nodes
@@ -45,5 +46,5 @@ i2c0: i2c@1211 {
compatible = "hisilicon,hi3519-i2c";
reg = <0x1211 0x1000>;
clocks = <&CRG HI3519_I2C0_RST>;
-   resets = <&CRG 0xe4 0>;
+   resets = <&CRG 0xe4 0 0>;
 };
diff --git a/arch/arm/boot/dts/hi3519.dtsi b/arch/arm/boot/dts/hi3519.dtsi
index 5729ecf..b7cb182 100644
--- a/arch/arm/boot/dts/hi3519.dtsi
+++ b/arch/arm/boot/dts/hi3519.dtsi
@@ -50,7 +50,7 @@
crg: clock-reset-controller@1201 {
compatible = "hisilicon,hi3519-crg";
#clock-cells = <1>;
-   #reset-cells = <2>;
+   #reset-cells = <3>;
reg = <0x1201 0x1>;
};
 
diff --git a/drivers/clk/hisilicon/reset.c b/drivers/clk/hisilicon/reset.c
index 2a5015c..c0ab0b6 100644
--- a/drivers/clk/hisilicon/reset.c
+++ b/drivers/clk/hisilicon/reset.c
@@ -17,6 +17,7 @@
  * along with this program. If not, see .
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -25,9 +26,11 @@
 #include 
 #include "reset.h"
 
-#define HISI_RESET_BIT_MASK 0x1f
-#define HISI_RESET_OFFSET_SHIFT 8
-#define HISI_RESET_OFFSET_MASK  0x00
+#define HISI_RESET_POLARITY_MASK   BIT(0)
+#define HISI_RESET_BIT_SHIFT       1
+#define HISI_RESET_BIT_MASK        GENMASK(6, 1)
+#define HISI_RESET_OFFSET_SHIFT    8
+#define HISI_RESET_OFFSET_MASK     GENMASK(23, 8)
 
 struct hisi_reset_controller {
spinlock_t  lock;
@@ -44,12 +47,15 @@ static int hisi_reset_of_xlate(struct reset_controller_dev 
*rcdev,
 {
u32 offset;
u8 bit;
+   bool polarity;
 
offset = (reset_spec->args[0] << HISI_RESET_OFFSET_SHIFT)
& HISI_RESET_OFFSET_MASK;
-   bit = reset_spec->args[1] & HISI_RESET_BIT_MASK;
+   bit = (reset_spec->args[1] << HISI_RESET_BIT_SHIFT)
+   & HISI_RESET_BIT_MASK;
+   polarity = reset_spec->args[2] & HISI_RESET_POLARITY_MASK;
 
-   return (offset | bit);
+   return (offset | bit | polarity);
 }
 
 static int hisi_reset_assert(struct reset_controller_dev *rcdev,
@@ -59,14 +65,19 @@ static int hisi_reset_assert(struct reset_controller_dev 
*rcdev,
unsigned long flags;
u32 offset, reg;
u8 bit;
+   bool polarity;
 
offset = (id & HISI_RESET_OFFSET_MASK) >> HISI_RESET_OFFSET_SHIFT;
-   bit = id & HISI_RESET_BIT_MASK;
+   bit = (id & HISI_RESET_BIT_MASK) >> HISI_RESET_BIT_SHIFT;
+   polarity = id & HISI_RESET_POLARITY_MASK;
 
spin_lock_irqsave(&rstc->lock, flags);
 
reg = readl(rstc->membase + offset);
-   writel(reg | BIT(bit), rstc->membase + offset);
+   if (polarity)
+   writel(reg & ~BIT(bit), rstc->membase + offset);
+   else
+   writel(reg | BIT(bit), rstc->membase + offset);
 
spin_unlock_irqrestore(&rstc->lock, flags);
 
@@ -80,14 +91,19 @@ static int hisi_reset_deassert(struct reset_controller_dev 
*rcdev,
unsigned long flags;
u32 offset, reg;
u8 bit;
+   bool polarity;
 
offset = (id & HISI_RESET_OFFSET_MASK) >> HISI_RESET_OFFSET_SHIFT;
-   bit = id & HISI_RESET_BIT_MASK;
+   bit = (id & HISI_RESET_BIT_MASK) >> HISI_RESET_BIT_SHIFT;
+   polarity = id & HISI_RESET_POLARITY_MASK;
 
spin_lock_irqsave(&rstc->lock, flags);
 
reg = readl(rstc->membase + offset);
-
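
The three-cell specifier in the patch above is folded into a single reset
id: register offset in bits 23:8, bit index in bits 6:1, polarity in bit 0.
The following user-space sketch shows that packing and the polarity-aware
assert path. The rst_* names are hypothetical, the masks are the GENMASK()
values written out by hand, and the bit index is assumed to be below 32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hand-expanded equivalents of BIT(0), GENMASK(6, 1) and GENMASK(23, 8). */
#define RST_POLARITY_MASK	UINT32_C(0x00000001)
#define RST_BIT_SHIFT		1
#define RST_BIT_MASK		UINT32_C(0x0000007e)
#define RST_OFFSET_SHIFT	8
#define RST_OFFSET_MASK		UINT32_C(0x00ffff00)

/* Pack <offset, bit, polarity> into one id, as the xlate hook does. */
static uint32_t rst_pack(uint32_t offset, uint32_t bit, uint32_t inverted)
{
	return ((offset << RST_OFFSET_SHIFT) & RST_OFFSET_MASK) |
	       ((bit << RST_BIT_SHIFT) & RST_BIT_MASK) |
	       (inverted & RST_POLARITY_MASK);
}

static uint32_t rst_offset(uint32_t id)
{
	return (id & RST_OFFSET_MASK) >> RST_OFFSET_SHIFT;
}

static uint32_t rst_bit(uint32_t id)
{
	return (id & RST_BIT_MASK) >> RST_BIT_SHIFT;
}

static bool rst_inverted(uint32_t id)
{
	return (id & RST_POLARITY_MASK) != 0;
}

/*
 * Register value to write in order to assert the reset line:
 * normal polarity sets the bit, inverted polarity clears it.
 */
static uint32_t rst_assert_value(uint32_t reg, uint32_t id)
{
	/* "& 31" keeps the shift defined even for an out-of-range index. */
	uint32_t mask = UINT32_C(1) << (rst_bit(id) & 31);

	return rst_inverted(id) ? (reg & ~mask) : (reg | mask);
}
```

For the consumer example `resets = <&CRG 0xe4 0 0>`, rst_pack(0xe4, 0, 0)
yields 0xe400, and asserting the line sets bit 0 of the register at
offset 0xe4.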

Re: [RFC][PATCH 2/7] kref: Add kref_read()

2016-11-14 Thread Greg KH
On Mon, Nov 14, 2016 at 10:16:55AM -0800, Christoph Hellwig wrote:
> On Mon, Nov 14, 2016 at 06:39:48PM +0100, Peter Zijlstra wrote:
> > Since we need to change the implementation, stop exposing internals.
> > 
> > Provide kref_read() to read the current reference count; typically
> > used for debug messages.
> 
> Can we just provide a printk specifier for a kref value instead as
> that is the only valid use case for reading the value?

Yeah, that would be great as no one should be doing anything
logic-related based on the kref value.

thanks,

greg k-h


Re: [PATCH v3] mmc: sdhci-of-esdhc: fixup PRESENT_STATE read

2016-11-14 Thread Alexander Stein
On Monday 14 November 2016 16:12:27, Michael Walle wrote:
> Since commit 87a18a6a5652 ("mmc: mmc: Use ->card_busy() to detect busy
> cards in __mmc_switch()") the ESDHC driver is broken:
>   mmc0: Card stuck in programming state! __mmc_switch
>   mmc0: error -110 whilst initialising MMC card
> 
> Since this commit __mmc_switch() uses ->card_busy(), which is
> sdhci_card_busy() for the esdhc driver. sdhci_card_busy() uses the
> PRESENT_STATE register, specifically the DAT0 signal level bit. But the
> ESDHC uses a non-conformant PRESENT_STATE register, thus a read fixup is
> required to make the driver work again.
> 
> Signed-off-by: Michael Walle 
> Fixes: 87a18a6a5652 ("mmc: mmc: Use ->card_busy() to detect busy cards in
> __mmc_switch()") ---
> v3:
>  - explain the bits in the comments
>  - use bits[19:0] from the original value, all other will be taken from the
>fixup value.
> 
> v2:
>  - use lower bits of the original value (that was actually a typo)
>  - add fixes tag
>  - fix typo
> 
>  drivers/mmc/host/sdhci-of-esdhc.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/mmc/host/sdhci-of-esdhc.c
> b/drivers/mmc/host/sdhci-of-esdhc.c index fb71c86..74cf3b1 100644
> --- a/drivers/mmc/host/sdhci-of-esdhc.c
> +++ b/drivers/mmc/host/sdhci-of-esdhc.c
> @@ -66,6 +66,19 @@ static u32 esdhc_readl_fixup(struct sdhci_host *host,
>   return ret;
>   }
>   }
> + /*
> +  * The DAT[3:0] line signal levels and the CMD line signal level are
> +  * not compatible with standard SDHC register. The line signal levels
> +  * DAT[7:0] are at bits 31:24 and the line signal level is at bit 23.
  ^
I guess there is a "command" missing, no?

Best regards,
Alexander



[PATCH] perf/ring_buffer: Fix invalid page order

2016-11-14 Thread Takao Indoh
In rb_alloc_aux_page(), the page order is clamped to MAX_ORDER when it is
greater than MAX_ORDER. However, a valid page order must be strictly less
than MAX_ORDER, so alloc_pages_node() fails at least once. Clamp the order
to MAX_ORDER - 1 so that it is always less than MAX_ORDER.

Signed-off-by: Takao Indoh 
---
 kernel/events/ring_buffer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 257fa46..3f76fdd 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -502,8 +502,8 @@ static struct page *rb_alloc_aux_page(int node, int order)
 {
struct page *page;
 
-   if (order > MAX_ORDER)
-   order = MAX_ORDER;
+   if (order >= MAX_ORDER)
+   order = MAX_ORDER - 1;
 
do {
page = alloc_pages_node(node, PERF_AUX_GFP, order);
-- 
1.8.3.1
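
The off-by-one fixed above can be seen in isolation with a small sketch.
SKETCH_MAX_ORDER and clamp_alloc_order() are hypothetical names, and the
value 11 is only an illustrative assumption; the point is that valid
orders run from 0 to MAX_ORDER - 1, so the clamp must never return
MAX_ORDER itself.

```c
#include <assert.h>

/* Illustrative stand-in for MAX_ORDER; the real value is config-dependent. */
#define SKETCH_MAX_ORDER 11

/* Clamp a requested order into the valid range [0, SKETCH_MAX_ORDER - 1]. */
static int clamp_alloc_order(int order)
{
	if (order >= SKETCH_MAX_ORDER)
		order = SKETCH_MAX_ORDER - 1;

	return order;
}
```

With the old test (order > MAX_ORDER), a request for exactly MAX_ORDER
passed through unchanged and larger requests were clamped to MAX_ORDER,
both of which are out of range.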



Re: [PATCH v11 10/22] vfio iommu type1: Add support for mediated devices

2016-11-14 Thread Alexey Kardashevskiy
On 15/11/16 17:33, Kirti Wankhede wrote:
> 
> 
> On 11/15/2016 10:47 AM, Alexey Kardashevskiy wrote:
>> On 08/11/16 17:52, Alexey Kardashevskiy wrote:
>>> On 05/11/16 08:10, Kirti Wankhede wrote:
>>>> VFIO IOMMU drivers are designed for the devices which are IOMMU capable.
>>>> Mediated device only uses IOMMU APIs, the underlying hardware can be
>>>> managed by an IOMMU domain.
>>>>
>>>> Aim of this change is:
>>>> - To use most of the code of TYPE1 IOMMU driver for mediated devices
>>>> - To support direct assigned device and mediated device in single module
>>>>
>>>> This change adds pin and unpin support for mediated device to TYPE1 IOMMU
>>>> backend module. More details:
>>>> - vfio_pin_pages() callback here uses task and address space of vfio_dma,
>>>>   that is, of the process who mapped that iova range.
>>>> - Added pfn_list tracking logic to address space structure. All pages
>>>>   pinned through this interface are tracked in its address space.
>>>> - Pinned pages list is used to verify unpinning request and to unpin
>>>>   remaining pages while detaching the group for that device.
>>>> - Page accounting is updated to account in its address space where the
>>>>   pages are pinned/unpinned.
>>>> - Accounting for mdev device is only done if there is no iommu capable
>>>>   domain in the container. When there is a direct device assigned to the
>>>>   container and that domain is iommu capable, all pages are already pinned
>>>>   during DMA_MAP.
>>>> - Page accounting is updated on hot plug and unplug of mdev device and
>>>>   pass through device.
>>>>
>>>> Tested by assigning below combinations of devices to a single VM:
>>>> - GPU pass through only
>>>
>>> This does not require this patchset, right?
>>>
> 
> Sorry I missed this earlier.
> This testing is required for this patch, because this patch touches code
> that is used for direct device assignment. Also for page accounting, all
> cases are considered i.e. when there is only pass through device in a
> container, when there is pass through device + vGPU device in a
> container. Also have to test that pages are pinned properly when device
> is hotplugged. In that case vfio_iommu_replay() is called to take
> necessary action.

So in this particular test you are only testing that the patchset did not
break the already existing functionality, is that correct?


> 
>>>> - vGPU device only
>>>
>>> Out of curiosity - how exactly did you test this? The exact GPU, how to
>>> create vGPU, what was the QEMU command line and the guest does with this
>>> passed device? Thanks.
>>
>> ping?
>>
> 
> I'm testing this code with M60, with custom changes in our driver.


Is this shared anywhere? What does the mediated driver do? Can Tesla K80 do
the same thing, or [10de:15fe] (whatever its name is)?


> Steps how to create mediated device are listed in
> Documentation/vfio-mediated-device.txt for sample mtty driver. Same
> steps I'm following for GPU. Quoting those steps here for you:


Nah, I saw this, I was wondering about actual hardware :) Like when you say
"tested with vGPU" - I am wondering what is passed to the guest and how the
guest is actually using it.


> 
> 2. Create a mediated device by using the dummy device that you created
> in the
>previous step.
> 
># echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" >  \
> 
> /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
> 
> 3. Add parameters to qemu-kvm.
> 
>-device vfio-pci,\
> sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001




-- 
Alexey


Re: Patch procedure

2016-11-14 Thread Greg KH
On Mon, Nov 14, 2016 at 12:16:08PM -0500, feas wrote:
> Here is how I am going about making the patches. It is basically
> what I have picked up from kernel newbies among other sites
> and videos on making patches. I would be grateful for any
> pointers on what seems to be the problem(s) with why it does
> not produce a proper patch.



Honestly, it's not our job to review someone's patch creation
procedures and notes, as everyone does it differently.  We will be glad
to review the patches that you create.

I think the review that I previously provided should be sufficient to
start with.  Again, send a short patch series to verify this, do not
send 100+ patches without getting feedback on them.

thanks,

greg k-h


Re: [PATCH v3] mmc: sdhci-of-esdhc: fixup PRESENT_STATE read

2016-11-14 Thread Adrian Hunter
On 14/11/16 17:12, Michael Walle wrote:
> Since commit 87a18a6a5652 ("mmc: mmc: Use ->card_busy() to detect busy
> cards in __mmc_switch()") the ESDHC driver is broken:
>   mmc0: Card stuck in programming state! __mmc_switch
>   mmc0: error -110 whilst initialising MMC card
> 
> Since this commit __mmc_switch() uses ->card_busy(), which is
> sdhci_card_busy() for the esdhc driver. sdhci_card_busy() uses the
> PRESENT_STATE register, specifically the DAT0 signal level bit. But the
> ESDHC uses a non-conformant PRESENT_STATE register, thus a read fixup is
> required to make the driver work again.
> 
> Signed-off-by: Michael Walle 
> Fixes: 87a18a6a5652 ("mmc: mmc: Use ->card_busy() to detect busy cards in 
> __mmc_switch()")
> ---
> v3:
>  - explain the bits in the comments
>  - use bits[19:0] from the original value, all other will be taken from the
>fixup value.
> 
> v2:
>  - use lower bits of the original value (that was actually a typo)
>  - add fixes tag
>  - fix typo
> 
>  drivers/mmc/host/sdhci-of-esdhc.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/mmc/host/sdhci-of-esdhc.c 
> b/drivers/mmc/host/sdhci-of-esdhc.c
> index fb71c86..74cf3b1 100644
> --- a/drivers/mmc/host/sdhci-of-esdhc.c
> +++ b/drivers/mmc/host/sdhci-of-esdhc.c
> @@ -66,6 +66,19 @@ static u32 esdhc_readl_fixup(struct sdhci_host *host,
>   return ret;
>   }
>   }
> + /*
> +  * The DAT[3:0] line signal levels and the CMD line signal level are
> +  * not compatible with standard SDHC register. The line signal levels
> +  * DAT[7:0] are at bits 31:24 and the line signal level is at bit 23.
> +  * All other bits are the same as in the standard SDHC register.
> +  */
> + if (spec_reg == SDHCI_PRESENT_STATE) {
> + ret = value & 0x000f;
> + ret |= (value >> 4) & SDHCI_DATA_LVL_MASK;
> + ret |= (value << 1) & 0x0100;

Please define the command line level bit in sdhci.h and use that here.
e.g.

diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 766df17fb7eb..2570455b219a 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -73,6 +73,7 @@
 #define  SDHCI_DATA_LVL_MASK   0x00F0
 #define   SDHCI_DATA_LVL_SHIFT 20
 #define   SDHCI_DATA_0_LVL_MASK0x0010
+#define  SDHCI_CMD_LVL 0x0100
 
 #define SDHCI_HOST_CONTROL 0x28
 #define  SDHCI_CTRL_LED0x01


> + return ret;
> + }
> +
>   ret = value;
>   return ret;
>  }
> 
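
The bit shuffling in the fixup quoted above can be checked in isolation.
The sketch below follows the layout stated in the patch comment: bits 19:0
are kept as they are, the eSDHC DAT[3:0] levels at bits 27:24 move to the
standard position at bits 23:20, and the eSDHC CMD level at bit 23 moves
to bit 24. The function and macro names are hypothetical, and the mask
values are written out in full from those bit positions, so treat them as
assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define STD_LOW_BITS_MASK	UINT32_C(0x000fffff)	/* bits 19:0, unchanged */
#define STD_DATA_LVL_MASK	UINT32_C(0x00f00000)	/* DAT[3:0] at bits 23:20 */
#define STD_CMD_LVL		UINT32_C(0x01000000)	/* CMD level at bit 24 */

/* Convert an eSDHC-style PRESENT_STATE value to the standard layout. */
static uint32_t present_state_fixup(uint32_t value)
{
	uint32_t ret;

	ret = value & STD_LOW_BITS_MASK;		/* keep bits 19:0 */
	ret |= (value >> 4) & STD_DATA_LVL_MASK;	/* 27:24 -> 23:20 */
	ret |= (value << 1) & STD_CMD_LVL;		/* 23 -> 24 */

	return ret;
}
```

For example, an eSDHC value with only DAT0 high (bit 24) maps to the
standard DAT0 level bit (bit 20), which is the bit that a card-busy check
on the DAT0 line looks at.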



Re: [Intel-gfx] [PATCH v11 3/4] drm/i915: Use new CRC debugfs API

2016-11-14 Thread David Weinehall
On Mon, Nov 14, 2016 at 12:44:25PM +0200, Jani Nikula wrote:
> On Thu, 06 Oct 2016, Tomeu Vizoso  wrote:
> > diff --git a/drivers/gpu/drm/i915/intel_display.c 
> > b/drivers/gpu/drm/i915/intel_display.c
> > index 23a6c7213eca..7412a05fa5d9 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -14636,6 +14636,7 @@ static const struct drm_crtc_funcs intel_crtc_funcs 
> > = {
> > .page_flip = intel_crtc_page_flip,
> > .atomic_duplicate_state = intel_crtc_duplicate_state,
> > .atomic_destroy_state = intel_crtc_destroy_state,
> > +   .set_crc_source = intel_crtc_set_crc_source,
> >  };
> >  
> >  /**
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h 
> > b/drivers/gpu/drm/i915/intel_drv.h
> > index 737261b09110..31894b7c6517 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -1844,6 +1844,14 @@ void intel_color_load_luts(struct drm_crtc_state 
> > *crtc_state);
> >  /* intel_pipe_crc.c */
> >  int intel_pipe_crc_create(struct drm_minor *minor);
> >  void intel_pipe_crc_cleanup(struct drm_minor *minor);
> > +#ifdef CONFIG_DEBUG_FS
> > +int intel_crtc_set_crc_source(struct drm_crtc *crtc, const char 
> > *source_name,
> > + size_t *values_cnt);
> > +#else
> > +static inline int intel_crtc_set_crc_source(struct drm_crtc *crtc,
> > +   const char *source_name,
> > +   size_t *values_cnt) { return 0; }
> > +#endif
> 
> "inline" here doesn't work because it's used as a function pointer.
> 
> Is it better to have a function that returns 0 for .set_crc_source, or
> to set .set_crc_source to NULL when CONFIG_DEBUG_FS=n?

I'd say that whenever we have a function pointer we should have a dummy
function without side-effects for this kind of things.


Kind regards, David


Re: [RFC PATCH] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-14 Thread Jan Beulich
>>> On 15.11.16 at 07:33,  wrote:
> On 15/11/16 01:11, Alex Thorlton wrote:
>> Hey everyone,
>> 
>> We're having problems with large systems hitting a BUG in
>> xen_memory_setup, due to extra e820 entries created in the
>> XENMEM_machine_memory_map callback.  The change in the patch gets things
>> working, but Boris and I wanted to get opinions on whether or not this
>> is the appropriate/entire solution, which is why I've sent it as an RFC
>> for now.
>> 
>> Boris pointed out to me that E820_X_MAX is only large when CONFIG_EFI=y,
>> which is a detail worth discussing.  He proposed possibly adding
>> CONFIG_XEN to the conditions under which we set E820_X_MAX to a larger
>> value than E820MAX, since the Xen e820 table isn't bound by the
>> zero-page memory limitations.
>> 
>> I do *slightly* question the use of E820_X_MAX here, only from a
>> cosmetic perspective, as I believe this macro is intended to describe
>> the maximum size of the extended e820 table, which, AFAIK, is not used
>> by the Xen HV.  That being said, there isn't exactly a "more
>> appropriate" macro/variable to use, so this may not really be an issue.
>> 
>> Any input on the patch, or the questions I've raised above is greatly
>> appreciated!
> 
> While I think extending the e820 table is the right thing to do I'm
> questioning the assumptions here.
> 
> Looking briefly through the Xen hypervisor sources I think it isn't
> yet ready for such large machines: the hypervisor's e820 map seems to
> be still limited to 128 e820 entries. Jan, did I overlook an EFI
> specific path extending this limitation?

No, you didn't. I do question the correlation with "large machines"
here though: The issue isn't with large machines afaict, but with
ones having very many entries (i.e. heavily fragmented).

> In case I'm right the Xen hypervisor should be prepared for a larger
> e820 map, but this won't help alone as there would still be additional
> entries for the IOAPICs created.
> 
> So I think we need something like:
> 
> #define E820_XEN_MAX (E820_X_MAX + MAX_IO_APICS)
> 
> and use this for sizing xen_e820_map[].

I would say that if any change gets done here, there shouldn't be
any static upper limit at all. That could even be viewed as in line
with recent e820.c changes moving to dynamic allocations. In
particular I don't see why MAX_IO_APICS would need adding in
here, but not other (current and future) factors determining the
(pseudo) E820 map Xen presents to Dom0.

Jan



Re: [PATHCv10 1/2] usb: USB Type-C connector class

2016-11-14 Thread Greg KH
On Mon, Nov 14, 2016 at 12:46:50PM -0800, Guenter Roeck wrote:
> On Mon, Nov 14, 2016 at 02:32:35PM +0200, Heikki Krogerus wrote:
> > Hi Greg,
> > 
> > On Mon, Nov 14, 2016 at 10:51:48AM +0100, Greg KH wrote:
> > > On Mon, Sep 19, 2016 at 02:16:56PM +0300, Heikki Krogerus wrote:
> > > > The purpose of USB Type-C connector class is to provide
> > > > unified interface for the user space to get the status and
> > > > basic information about USB Type-C connectors on a system,
> > > > control over data role swapping, and when the port supports
> > > > USB Power Delivery, also control over power role swapping
> > > > and Alternate Modes.
> > > > 
> > > > Reviewed-by: Guenter Roeck 
> > > > Tested-by: Guenter Roeck 
> > > > Signed-off-by: Heikki Krogerus 
> > > > ---
> > > >  Documentation/ABI/testing/sysfs-class-typec |  218 ++
> > > >  Documentation/usb/typec.txt |  103 +++
> > > >  MAINTAINERS |9 +
> > > >  drivers/usb/Kconfig |2 +
> > > >  drivers/usb/Makefile|2 +
> > > >  drivers/usb/typec/Kconfig   |7 +
> > > >  drivers/usb/typec/Makefile  |1 +
> > > >  drivers/usb/typec/typec.c   | 1075 
> > > > +++
> > > >  include/linux/usb/typec.h   |  252 +++
> > > >  9 files changed, 1669 insertions(+)
> > > >  create mode 100644 Documentation/ABI/testing/sysfs-class-typec
> > > >  create mode 100644 Documentation/usb/typec.txt
> > > >  create mode 100644 drivers/usb/typec/Kconfig
> > > >  create mode 100644 drivers/usb/typec/Makefile
> > > >  create mode 100644 drivers/usb/typec/typec.c
> > > >  create mode 100644 include/linux/usb/typec.h
> > > 
> [ ... ]
> 
> > > > +
> > > > +int typec_connect(struct typec_port *port, struct typec_connection 
> > > > *con)
> > > > +{
> > > > +   int ret;
> > > > +
> > > > +   if (!con->partner && !con->cable)
> > > > +   return -EINVAL;
> > > > +
> > > > +   port->connected = 1;
> > > > +   port->data_role = con->data_role;
> > > > +   port->pwr_role = con->pwr_role;
> > > > +   port->vconn_role = con->vconn_role;
> > > > +   port->pwr_opmode = con->pwr_opmode;
> > > > +
> > > > +   kobject_uevent(&port->dev.kobj, KOBJ_CHANGE);
> > > 
> > > This worries me.  Who is listening for it?  What will you do with it?
> > > Shouldn't you just poll on an attribute file instead?
> > 
> > Oliver! Did you need this or can we remove it?
> > 
> > I remember I removed the "connected" attribute because you did not see
> > any use for it at one point. I don't remember the reason exactly why?
> > 
> 
> The Android team tells me that they are currently using the udev events
> to track port role changes, and to detect presence of port partner.
> 
> Also, there are plans to track changes on usbc*cable to differentiate
> between cable attach vs. device being attached on the remote end. 
> 
> What is the problem with using kobject_uevent() and thus presumably
> udev events ?

It's not a "normal" thing to do and is pretty "heavy" to do.  What does
userspace do with that change event?  Does it read specific attributes?
What causes the event to happen in the kernel, is it really just a
change in the specific object, or do new ones get added/removed?

In short, document the heck out of this please so people know how to use
it, and what is happening when the event happens.

thanks,

greg k-h


RE: [PATCH net-next v5] cadence: Add LSO support.

2016-11-14 Thread Rafal Ozieblo
> > > If UFO is in use it should not silently disable UDP checksums.
> > > 
> > > If you cannot support UFO with proper checksumming, then you cannot 
> > > enable support for that feature.
> > 
> > According to the Cadence Gigabit Ethernet MAC documentation:
> > 
> > "Hardware will not calculate the UDP checksum or modify the UDP 
> > checksum field. Therefore software must set a value of zero in the 
> > checksum field in the UDP header (in the first payload buffer) to indicate 
> > to the receiver that the UDP datagram does not include a checksum."
> > 
> > It is hardware requirement.
>
> I do not doubt that it is a hardware restriction.
>
> But I am saying that you cannot enable this feature under Linux if this is 
> how it operates on your hardware.

Would it be good to enable UFO conditionally with some internal define? Ex.:

+#ifdef MACB_ENABLE_UFO
+#define MACB_NETIF_LSO (NETIF_F_TSO | NETIF_F_UFO)
+#else
+#define MACB_NETIF_LSO (NETIF_F_TSO)
+#endif

I could add a precise comment here stating that UFO is possible only without a checksum.

Or maybe I could enable it from module_params or device-tree (like: 
drivers/net/ethernet/neterion/s2io.c).


Re: [PATCH v7 2/5] mm: remove x86-only restriction of movable_node

2016-11-14 Thread Aneesh Kumar K.V
Reza Arbab  writes:

> In commit c5320926e370 ("mem-hotplug: introduce movable_node boot
> option"), the memblock allocation direction is changed to bottom-up and
> then back to top-down like this:
>
> 1. memblock_set_bottom_up(true), called by cmdline_parse_movable_node().
> 2. memblock_set_bottom_up(false), called by x86's numa_init().
>
> Even though (1) occurs in generic mm code, it is wrapped by #ifdef
> CONFIG_MOVABLE_NODE, which depends on X86_64.
>
> This means that when we extend CONFIG_MOVABLE_NODE to non-x86 arches,
> things will be unbalanced. (1) will happen for them, but (2) will not.
>
> This toggle was added in the first place because x86 has a delay between
> adding memblocks and marking them as hotpluggable. Since other arches do
> this marking either immediately or not at all, they do not require the
> bottom-up toggle.
>
> So, resolve things by moving (1) from cmdline_parse_movable_node() to
> x86's setup_arch(), immediately after the movable_node parameter has
> been parsed.


Considering that we now can mark memblock hotpluggable, do we need to
enable the bottom up allocation for ppc64 also ?


>
> Signed-off-by: Reza Arbab 
> ---
>  Documentation/kernel-parameters.txt |  2 +-
>  arch/x86/kernel/setup.c | 24 

-aneesh



[PATCH] lkdtm: Prevent the compiler from optimising lkdtm_CORRUPT_STACK()

2016-11-14 Thread Michael Ellerman
At least on powerpc with GCC 6, the compiler is smart enough to optimise
lkdtm_CORRUPT_STACK() into an empty function that just returns.

If we print the buffer after we've written to it that prevents the
compiler from optimising away data and the memset().

Signed-off-by: Michael Ellerman 
---
 drivers/misc/lkdtm_bugs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/lkdtm_bugs.c b/drivers/misc/lkdtm_bugs.c
index 182ae1894b32..30e62dd7e7ca 100644
--- a/drivers/misc/lkdtm_bugs.c
+++ b/drivers/misc/lkdtm_bugs.c
@@ -80,7 +80,8 @@ noinline void lkdtm_CORRUPT_STACK(void)
/* Use default char array length that triggers stack protection. */
char data[8];
 
-   memset((void *)data, 0, 64);
+   memset((void *)data, 'a', 64);
+   pr_info("Corrupted stack with '%16s'...\n", data);
 }
 
 void lkdtm_UNALIGNED_LOAD_STORE_WRITE(void)
-- 
2.7.4



Re: [PATCH] thermal/powerclamp: add back module device table

2016-11-14 Thread Greg Kroah-Hartman
On Mon, Nov 14, 2016 at 11:08:45AM -0800, Jacob Pan wrote:
> Commit 3105f234e0aba43e44e277c20f9b32ee8add43d4 replaced module
> cpu id table with a cpu feature check, which is logically correct.
> But we need the module device table to allow module auto loading.
> 
> Fixes:3105f234 thermal/powerclamp: correct cpu support check
> Signed-off-by: Jacob Pan 
> ---
>  drivers/thermal/intel_powerclamp.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
for how to do this properly.




Re: [PATCHSET 0/7] perf sched: Introduce timehist command, again (v1)

2016-11-14 Thread Namhyung Kim
Hi Ingo,

On Tue, Nov 15, 2016 at 07:42:14AM +0100, Ingo Molnar wrote:
> 
> * Namhyung Kim  wrote:
> 
> > Hello,
> > 
> > This patchset is a rebased version of David's sched timehist work [1].
> > I plan to improve perf sched command more and think that having
> > timehist command before the work looks good.  It seems David is busy
> > these days, so I'm retrying it by myself.
> > 
> > This implements only basic feature and a few options.  I just split
> > the patch to make it easier to review and did some cosmetic changes.
> > More patches will come later.
> > 
> > The below is from the David's original description:
> > 
> > 8<-
> > 'perf sched timehist' provides an analysis of scheduling events.
> > 
> > Example usage:
> > perf sched record -- sleep 1
> > perf sched timehist
> 
> 
> Cool, very nice!

:)

> 
> > By default it shows the individual schedule events, including the time between
> > sched-in events for the task, the task scheduling delay (time between wakeup
> > and actually running) and run time for the task:
> > 
> >         time cpu  task name[tid/pid]  b/n time  sch delay  run time
> > ------------ ---- ------------------  --------  ---------  --------
> > 79371.874569 [11] gcc[31949]             0.014      0.000     1.148
> > 79371.874591 [10] gcc[31951]             0.000      0.000     0.024
> > 79371.874603 [10] migration/10[59]       3.350      0.004     0.011
> > 79371.874604 [11]                        1.148      0.000     0.035
> > 79371.874723 [05]                        0.016      0.000     1.383
> > 79371.874746 [05] gcc[31949]             0.153      0.078     0.022
> > ...
> 
> What does the 'b/n' abbreviation stand for? 'Between'? Could we call the column
> 'sch wait' instead, or so?

Looks better, or what about 'wait time'?

> 
> 
> > Times are in msec.usec.
> > 
> > If callchains were recorded they are appended to the line with a default 
> > stack depth of 5:
> > 
> >79371.874569 [11] gcc[31949]  0.14  0.00  
> > 0.001148  wait_for_completion_killable do_fork sys_vfork stub_vfork __vfork
> >79371.874591 [10] gcc[31951]  0.00  0.00  
> > 0.24  __cond_resched _cond_resched wait_for_completion stop_one_cpu 
> > sched_exec
> >79371.874603 [10] migration/10[59]0.003350  0.04  
> > 0.11  smpboot_thread_fn kthread ret_from_fork
> >79371.874604 [11]   0.001148  0.00  
> > 0.35  cpu_startup_entry start_secondary
> >79371.874723 [05]   0.16  0.00  
> > 0.001383  cpu_startup_entry start_secondary
> >79371.874746 [05] gcc[31949]  0.000153  0.78  
> > 0.22  do_wait sys_wait4 system_call_fastpath __GI___waitpid
> 
> So when I first saw this it was hard for me to disambiguate individual function
> names. Wouldn't this be a bit more readable:
> 
> >79371.874569 [11] gcc[31949]  0.14  0.00  
> > 0.001148  wait_for_completion_killable() <- do_fork sys_vfork stub_vfork() 
> > <- __vfork()
> >79371.874591 [10] gcc[31951]  0.00  0.00  
> > 0.24  __cond_resched() <- _cond_resched() <- wait_for_completion() <- 
> > stop_one_cpu() <- sched_exec()
> >79371.874603 [10] migration/10[59]0.003350  0.04  
> > 0.11  smpboot_thread_fn() <- kthread() <- ret_from_fork()
> >79371.874604 [11]   0.001148  0.00  
> > 0.35  cpu_startup_entry() <- start_secondary()
> >79371.874723 [05]   0.16  0.00  
> > 0.001383  cpu_startup_entry() <- start_secondary()
> >79371.874746 [05] gcc[31949]  0.000153  0.78  
> > 0.22  do_wait() <- sys_wait4() <- system_call_fastpath() <- 
> > __GI___waitpid()
> 
> Or:
> 
> >79371.874569 [11] gcc[31949]  0.14  0.00  
> > 0.001148  wait_for_completion_killable()   <- do_fork sys_vfork 
> > stub_vfork() <- __vfork()
> >79371.874591 [10] gcc[31951]  0.00  0.00  
> > 0.24  __cond_resched() <- _cond_resched() <- 
> > wait_for_completion() <- stop_one_cpu() <- sched_exec()
> >79371.874603 [10] migration/10[59]0.003350  0.04  
> > 0.11  smpboot_thread_fn()  <- kthread() <- 
> > ret_from_fork()
> >79371.874604 [11]   0.001148  0.00  
> > 0.35  cpu_startup_entry()  <- start_secondary()
> >79371.874723 [05]   0.16  0.00  
> > 0.001383  cpu_startup_entry()  <- start_secondary()
> >79371.874746 [05] gcc[31949]  0.000153  0.78  
> > 0.22  do_wait()<- sys_wait4() <- 
> > system_call_fastpath() <- __GI___waitpid()
> 
> (i.e. visually separate the first entry - and list the rest.)
> 
> Or may

[PATCH] clk: qcom: smd-rpm: Add msm8974 clocks

2016-11-14 Thread Bjorn Andersson
This adds all RPM based clocks for msm8974 except cxo and gfx3d_clk_src.

Signed-off-by: Bjorn Andersson 
---
 .../devicetree/bindings/clock/qcom,rpmcc.txt   |  1 +
 drivers/clk/qcom/clk-smd-rpm.c | 71 ++
 include/dt-bindings/clock/qcom,rpmcc.h | 40 +++-
 3 files changed, 110 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/clock/qcom,rpmcc.txt b/Documentation/devicetree/bindings/clock/qcom,rpmcc.txt
index 87d3714b956a..a7235e9e1c97 100644
--- a/Documentation/devicetree/bindings/clock/qcom,rpmcc.txt
+++ b/Documentation/devicetree/bindings/clock/qcom,rpmcc.txt
@@ -11,6 +11,7 @@ Required properties :
compatible "qcom,rpmcc" should be also included.
 
"qcom,rpmcc-msm8916", "qcom,rpmcc"
+   "qcom,rpmcc-msm8974", "qcom,rpmcc"
"qcom,rpmcc-apq8064", "qcom,rpmcc"
 
 - #clock-cells : shall contain 1
diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
index a27013dbc0aa..b8fcac6f2f87 100644
--- a/drivers/clk/qcom/clk-smd-rpm.c
+++ b/drivers/clk/qcom/clk-smd-rpm.c
@@ -462,8 +462,79 @@ static const struct rpm_smd_clk_desc rpm_clk_msm8916 = {
.num_clks = ARRAY_SIZE(msm8916_clks),
 };
 
+/* msm8974 */
+DEFINE_CLK_SMD_RPM(msm8974, pnoc_clk, pnoc_a_clk, QCOM_SMD_RPM_BUS_CLK, 0);
+DEFINE_CLK_SMD_RPM(msm8974, snoc_clk, snoc_a_clk, QCOM_SMD_RPM_BUS_CLK, 1);
+DEFINE_CLK_SMD_RPM(msm8974, cnoc_clk, cnoc_a_clk, QCOM_SMD_RPM_BUS_CLK, 2);
+DEFINE_CLK_SMD_RPM(msm8974, mmssnoc_ahb_clk, mmssnoc_ahb_a_clk, QCOM_SMD_RPM_BUS_CLK, 3);
+DEFINE_CLK_SMD_RPM(msm8974, bimc_clk, bimc_a_clk, QCOM_SMD_RPM_MEM_CLK, 0);
+DEFINE_CLK_SMD_RPM(msm8974, gfx3d_clk_src, gfx3d_a_clk_src, QCOM_SMD_RPM_MEM_CLK, 1);
+DEFINE_CLK_SMD_RPM(msm8974, ocmemgx_clk, ocmemgx_a_clk, QCOM_SMD_RPM_MEM_CLK, 2);
+DEFINE_CLK_SMD_RPM_QDSS(msm8974, qdss_clk, qdss_a_clk, QCOM_SMD_RPM_MISC_CLK, 1);
+DEFINE_CLK_SMD_RPM_XO_BUFFER(msm8974, cxo_d0, cxo_d0_a, 1);
+DEFINE_CLK_SMD_RPM_XO_BUFFER(msm8974, cxo_d1, cxo_d1_a, 2);
+DEFINE_CLK_SMD_RPM_XO_BUFFER(msm8974, cxo_a0, cxo_a0_a, 4);
+DEFINE_CLK_SMD_RPM_XO_BUFFER(msm8974, cxo_a1, cxo_a1_a, 5);
+DEFINE_CLK_SMD_RPM_XO_BUFFER(msm8974, cxo_a2, cxo_a2_a, 6);
+DEFINE_CLK_SMD_RPM_XO_BUFFER(msm8974, diff_clk, diff_a_clk, 7);
+DEFINE_CLK_SMD_RPM_XO_BUFFER(msm8974, div_clk1, div_a_clk1, 11);
+DEFINE_CLK_SMD_RPM_XO_BUFFER(msm8974, div_clk2, div_a_clk2, 12);
+DEFINE_CLK_SMD_RPM_XO_BUFFER_PINCTRL(msm8974, cxo_d0_pin, cxo_d0_a_pin, 1);
+DEFINE_CLK_SMD_RPM_XO_BUFFER_PINCTRL(msm8974, cxo_d1_pin, cxo_d1_a_pin, 2);
+DEFINE_CLK_SMD_RPM_XO_BUFFER_PINCTRL(msm8974, cxo_a0_pin, cxo_a0_a_pin, 4);
+DEFINE_CLK_SMD_RPM_XO_BUFFER_PINCTRL(msm8974, cxo_a1_pin, cxo_a1_a_pin, 5);
+DEFINE_CLK_SMD_RPM_XO_BUFFER_PINCTRL(msm8974, cxo_a2_pin, cxo_a2_a_pin, 6);
+
+static struct clk_smd_rpm *msm8974_clks[] = {
+   [RPM_SMD_PNOC_CLK]  = &msm8974_pnoc_clk,
+   [RPM_SMD_PNOC_A_CLK]= &msm8974_pnoc_a_clk,
+   [RPM_SMD_SNOC_CLK]  = &msm8974_snoc_clk,
+   [RPM_SMD_SNOC_A_CLK]= &msm8974_snoc_a_clk,
+   [RPM_SMD_CNOC_CLK]  = &msm8974_cnoc_clk,
+   [RPM_SMD_CNOC_A_CLK]= &msm8974_cnoc_a_clk,
+   [RPM_SMD_MMSSNOC_AHB_CLK]   = &msm8974_mmssnoc_ahb_clk,
+   [RPM_SMD_MMSSNOC_AHB_A_CLK] = &msm8974_mmssnoc_ahb_a_clk,
+   [RPM_SMD_BIMC_CLK]  = &msm8974_bimc_clk,
+   [RPM_SMD_BIMC_A_CLK]= &msm8974_bimc_a_clk,
+   [RPM_SMD_OCMEMGX_CLK]   = &msm8974_ocmemgx_clk,
+   [RPM_SMD_OCMEMGX_A_CLK] = &msm8974_ocmemgx_a_clk,
+   [RPM_SMD_QDSS_CLK]  = &msm8974_qdss_clk,
+   [RPM_SMD_QDSS_A_CLK]= &msm8974_qdss_a_clk,
+   [RPM_SMD_CXO_D0]= &msm8974_cxo_d0,
+   [RPM_SMD_CXO_D0_A]  = &msm8974_cxo_d0_a,
+   [RPM_SMD_CXO_D1]= &msm8974_cxo_d1,
+   [RPM_SMD_CXO_D1_A]  = &msm8974_cxo_d1_a,
+   [RPM_SMD_CXO_A0]= &msm8974_cxo_a0,
+   [RPM_SMD_CXO_A0_A]  = &msm8974_cxo_a0_a,
+   [RPM_SMD_CXO_A1]= &msm8974_cxo_a1,
+   [RPM_SMD_CXO_A1_A]  = &msm8974_cxo_a1_a,
+   [RPM_SMD_CXO_A2]= &msm8974_cxo_a2,
+   [RPM_SMD_CXO_A2_A]  = &msm8974_cxo_a2_a,
+   [RPM_SMD_DIFF_CLK]  = &msm8974_diff_clk,
+   [RPM_SMD_DIFF_A_CLK]= &msm8974_diff_a_clk,
+   [RPM_SMD_DIV_CLK1]  = &msm8974_div_clk1,
+   [RPM_SMD_DIV_A_CLK1]= &msm8974_div_a_clk1,
+   [RPM_SMD_DIV_CLK2]  = &msm8974_div_clk2,
+   [RPM_SMD_DIV_A_CLK2]= &msm8974_div_a_clk2,
+   [RPM_SMD_CXO_D0_PIN]= &msm8974_cxo_d0_pin,
+   [RPM_SMD_CXO_D0_A_PIN]  = &msm8974_cxo_d0_a_pin,
+   [RPM_SMD_CXO_D1_PIN]= &msm8974_cxo_d1_pin,

Re: [PATCH 1/5] pinctrl: core: Use delayed work for hogs

2016-11-14 Thread Linus Walleij
On Tue, Nov 15, 2016 at 1:47 AM, Tony Lindgren  wrote:

> 8< 
> From tony Mon Sep 17 00:00:00 2001
> From: Tony Lindgren 
> Date: Tue, 25 Oct 2016 08:33:35 -0700
> Subject: [PATCH] pinctrl: core: Use delayed work for hogs
>
> Having the pin control framework call pin controller functions
> before its probe has finished is not nice as the pin controller
> device driver does not yet have struct pinctrl_dev handle.
>
> Let's fix this issue by adding deferred work for late init. This is
> needed to be able to add pinctrl generic helper functions that expect
> to know struct pinctrl_dev handle. Note that we now need to call
> create_pinctrl() directly as we don't want to add the pin controller
> to the list of controllers until the hogs are claimed. We also need
> to pass the pinctrl_dev to the device tree parser functions as they
> otherwise won't find the right controller at this point.
>
> Signed-off-by: Tony Lindgren 

This looks a lot better!

So if I understand correctly, we can guarantee that the delayed
work will not execute until the device driver probe() has finished,
and it *will* execute immediately after that?

So:
- Device driver probes
- Delayed work is called
- Next initcall

I'm not 100% familiar with how delayed work works... :/

Yours,
Linus Walleij


Re: [PATCH v12 12/22] vfio: Add notifier callback to parent's ops structure of mdev

2016-11-14 Thread Jike Song
On 11/14/2016 11:42 PM, Kirti Wankhede wrote:
> Add a notifier callback to parent's ops structure of mdev device so that per
> device notifier for vfio module is registered through vfio_mdev module.
> 
> Signed-off-by: Kirti Wankhede 
> Signed-off-by: Neo Jia 
> Change-Id: Iafa6f1721aecdd6e50eb93b153b5621e6d29b637
> ---
>  drivers/vfio/mdev/vfio_mdev.c | 19 +++
>  include/linux/mdev.h  |  9 +
>  2 files changed, 28 insertions(+)
> 
> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> index ffc36758cb84..1694b1635607 100644
> --- a/drivers/vfio/mdev/vfio_mdev.c
> +++ b/drivers/vfio/mdev/vfio_mdev.c
> @@ -24,6 +24,15 @@
>  #define DRIVER_AUTHOR   "NVIDIA Corporation"
>  #define DRIVER_DESC "VFIO based driver for Mediated device"
>  
> +static int vfio_mdev_notifier(struct notifier_block *nb, unsigned long 
> action,
> +   void *data)
> +{
> + struct mdev_device *mdev = container_of(nb, struct mdev_device, nb);
> + struct parent_device *parent = mdev->parent;
> +
> + return parent->ops->notifier(mdev, action, data);
> +}
> +
>  static int vfio_mdev_open(void *device_data)
>  {
>   struct mdev_device *mdev = device_data;
> @@ -40,6 +49,11 @@ static int vfio_mdev_open(void *device_data)
>   if (ret)
>   module_put(THIS_MODULE);
>  
> + if (likely(parent->ops->notifier)) {
> + mdev->nb.notifier_call = vfio_mdev_notifier;
> + if (vfio_register_notifier(&mdev->dev, &mdev->nb))
> + pr_err("Failed to register notifier for mdev\n");
> + }

Hi Kirti,

Could you please move the notifier registration before parent->ops->open()?
As you might know, I'm extending your vfio_register_notifier to also include
the attaching/detaching events of vfio_group and kvm.  Basically, if the
vfio_group is not attached to any kvm instance, parent->ops->open() should
return -ENODEV to indicate the failure, but to know whether kvm is available
in open(), the notifier registration should happen earlier.

Of course I can call vfio_register_notifier() from an earlier place to
work around it, but that doesn't seem like the canonical way.

--
Thanks,
Jike

>   return ret;
>  }
>  
> @@ -48,6 +62,11 @@ static void vfio_mdev_release(void *device_data)
>   struct mdev_device *mdev = device_data;
>   struct parent_device *parent = mdev->parent;
>  
> + if (likely(parent->ops->notifier)) {
> + if (vfio_unregister_notifier(&mdev->dev, &mdev->nb))
> + pr_err("Failed to unregister notifier for mdev\n");
> + }
> +
>   if (likely(parent->ops->release))
>   parent->ops->release(mdev);
>  
> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
> index 4900cc472364..665afe0a4c31 100644
> --- a/include/linux/mdev.h
> +++ b/include/linux/mdev.h
> @@ -37,6 +37,7 @@ struct mdev_device {
>   struct kref ref;
>   struct list_headnext;
>   struct kobject  *type_kobj;
> + struct notifier_block   nb;
>  };
>  
>  /**
> @@ -85,6 +86,12 @@ struct mdev_device {
>   * @mmap:mmap callback
>   *   @mdev: mediated device structure
>   *   @vma: vma structure
> + * @notifier: Notifier callback, currently only for
> + *   VFIO_IOMMU_NOTIFY_DMA_UNMAP action notified during
> + *   DMA_UNMAP call on mapped iova range.
> + *   @mdev: mediated device structure
> + *   @action: Action for which notifier is called
> + *   @data: Data associated with the notifier
>   * Parent device that support mediated device should be registered with mdev
>   * module with parent_ops structure.
>   **/
> @@ -106,6 +113,8 @@ struct parent_ops {
>   ssize_t (*ioctl)(struct mdev_device *mdev, unsigned int cmd,
>unsigned long arg);
>   int (*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
> + int (*notifier)(struct mdev_device *mdev, unsigned long action,
> + void *data);
>  };
>  
>  /* interface for exporting mdev supported type attributes */
> 


Re: [kbuild-all] [Patch v6.1] x86/kvm: Add AVX512_4VNNIW and AVX512_4FMAPS support

2016-11-14 Thread Fengguang Wu

Hi He Chen,

On Tue, Nov 15, 2016 at 02:02:23PM +0800, He Chen wrote:

On Tue, Nov 15, 2016 at 04:24:39AM +0800, kbuild test robot wrote:

Hi He,

[auto build test ERROR on kvm/linux-next]
[also build test ERROR on v4.9-rc5]
[cannot apply to next-20161114]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/He-Chen/x86-kvm-Add-AVX512_4VNNIW-and-AVX512_4FMAPS-support/20161114-170941
base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next



I have downloaded .config.gz in attachment and use the .config in it
to build kernel in my local branch again, and I don't see any warn or
error message.

I wonder whether the previous 0001 and 0002 patches have applied to run
this test? Or is there something wrong with my compiler or patches?


Sorry the robot is not smart enough to see the 0001/0002 patches.
As you may see from the above url, only this patch is applied on top
of the KVM linux-next branch.

Thanks,
Fengguang


Re: [PATCHSET 0/7] perf sched: Introduce timehist command, again (v1)

2016-11-14 Thread Ingo Molnar

* Namhyung Kim  wrote:

> Hello,
> 
> This patchset is a rebased version of David's sched timehist work [1].
> I plan to improve perf sched command more and think that having
> timehist command before the work looks good.  It seems David is busy
> these days, so I'm retrying it by myself.
> 
> This implements only basic feature and a few options.  I just split
> the patch to make it easier to review and did some cosmetic changes.
> More patches will come later.
> 
> The below is from the David's original description:
> 
> 8<-
> 'perf sched timehist' provides an analysis of scheduling events.
> 
> Example usage:
> perf sched record -- sleep 1
> perf sched timehist


Cool, very nice!

> By default it shows the individual schedule events, including the time between
> sched-in events for the task, the task scheduling delay (time between wakeup
> and actually running) and run time for the task:
> 
>         time cpu  task name[tid/pid]  b/n time  sch delay  run time
> ------------ ---- ------------------  --------  ---------  --------
> 79371.874569 [11] gcc[31949]             0.014      0.000     1.148
> 79371.874591 [10] gcc[31951]             0.000      0.000     0.024
> 79371.874603 [10] migration/10[59]       3.350      0.004     0.011
> 79371.874604 [11]                        1.148      0.000     0.035
> 79371.874723 [05]                        0.016      0.000     1.383
> 79371.874746 [05] gcc[31949]             0.153      0.078     0.022
> ...

What does the 'b/n' abbreviation stand for? 'Between'? Could we call the column 
'sch wait' instead, or so?


> Times are in msec.usec.
> 
> If callchains were recorded they are appended to the line with a default 
> stack depth of 5:
> 
>79371.874569 [11] gcc[31949]  0.14  0.00  0.001148 
>  wait_for_completion_killable do_fork sys_vfork stub_vfork __vfork
>79371.874591 [10] gcc[31951]  0.00  0.00  0.24 
>  __cond_resched _cond_resched wait_for_completion stop_one_cpu sched_exec
>79371.874603 [10] migration/10[59]0.003350  0.04  0.11 
>  smpboot_thread_fn kthread ret_from_fork
>79371.874604 [11]   0.001148  0.00  0.35 
>  cpu_startup_entry start_secondary
>79371.874723 [05]   0.16  0.00  0.001383 
>  cpu_startup_entry start_secondary
>79371.874746 [05] gcc[31949]  0.000153  0.78  0.22 
>  do_wait sys_wait4 system_call_fastpath __GI___waitpid

So when I first saw this it was hard for me to disambiguate individual function 
names. Wouldn't this be a bit more readable:

>79371.874569 [11] gcc[31949]  0.14  0.00  0.001148 
>  wait_for_completion_killable() <- do_fork sys_vfork stub_vfork() <- __vfork()
>79371.874591 [10] gcc[31951]  0.00  0.00  0.24 
>  __cond_resched() <- _cond_resched() <- wait_for_completion() <- 
> stop_one_cpu() <- sched_exec()
>79371.874603 [10] migration/10[59]0.003350  0.04  0.11 
>  smpboot_thread_fn() <- kthread() <- ret_from_fork()
>79371.874604 [11]   0.001148  0.00  0.35 
>  cpu_startup_entry() <- start_secondary()
>79371.874723 [05]   0.16  0.00  0.001383 
>  cpu_startup_entry() <- start_secondary()
>79371.874746 [05] gcc[31949]  0.000153  0.78  0.22 
>  do_wait() <- sys_wait4() <- system_call_fastpath() <- __GI___waitpid()

Or:

>79371.874569 [11] gcc[31949]  0.14  0.00  0.001148 
>  wait_for_completion_killable() <- do_fork sys_vfork stub_vfork() <- 
> __vfork()
>79371.874591 [10] gcc[31951]  0.00  0.00  0.24 
>  __cond_resched()   <- _cond_resched() <- 
> wait_for_completion() <- stop_one_cpu() <- sched_exec()
>79371.874603 [10] migration/10[59]0.003350  0.04  0.11 
>  smpboot_thread_fn()<- kthread() <- ret_from_fork()
>79371.874604 [11]   0.001148  0.00  0.35 
>  cpu_startup_entry()<- start_secondary()
>79371.874723 [05]   0.16  0.00  0.001383 
>  cpu_startup_entry()<- start_secondary()
>79371.874746 [05] gcc[31949]  0.000153  0.78  0.22 
>  do_wait()  <- sys_wait4() <- 
> system_call_fastpath() <- __GI___waitpid()

(i.e. visually separate the first entry - and list the rest.)

Or maybe it could be ASCII color coded so that the different entries are easier
to separate: for example the functions could be printed in alternating white/grey
color?

Thanks,

Ingo


Re: [PATCH] xen-platform: use builtin_pci_driver

2016-11-14 Thread Juergen Gross
On 14/11/16 13:52, Geliang Tang wrote:
> Use builtin_pci_driver() helper to simplify the code.
> 
> Signed-off-by: Geliang Tang 

Reviewed-by: Juergen Gross 

> ---
>  drivers/xen/platform-pci.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
> index b59c9455..112ce42 100644
> --- a/drivers/xen/platform-pci.c
> +++ b/drivers/xen/platform-pci.c
> @@ -125,8 +125,4 @@ static struct pci_driver platform_driver = {
>   .id_table =   platform_pci_tbl,
>  };
>  
> -static int __init platform_pci_init(void)
> -{
> - return pci_register_driver(&platform_driver);
> -}
> -device_initcall(platform_pci_init);
> +builtin_pci_driver(platform_driver);
> 



Re: [PATCH v2] f2fs: don't wait writeback for datas during checkpoint

2016-11-14 Thread Chao Yu
Hi Jaegeuk,

On 2016/11/15 7:32, Jaegeuk Kim wrote:
> Hi Chao,
> 
> On Mon, Nov 14, 2016 at 07:04:12PM +0800, Chao Yu wrote:
>> Normally, while committing checkpoint, we will wait on all pages to be
>> written back no matter whether the page is data or metadata, so in a
>> scenario where there are lots of data IO being submitted with metadata,
>> we may suffer long latency for waiting writeback during checkpoint.
>>
>> Indeed, we only care about persistence for pages with metadata, but not
>> pages with data, as file system consistency is only related to metadata,
>> so in order to avoid encountering long latency in above scenario, let's
>> recognize and reference metadata in submitted IOs, wait writeback only
>> for metadata.
> 
> Hmm, another concern comes up, related to GCed data as in the scenario below.
> 
> 1. Write data X
> 2. Sync
> 3. Move data X by GC
> 4. Checkpoint
> 5. Power-cut
> 
> In this case, we should guarantee data X which was migrated by GC during #3.
> If we don't care about end_io in #4 Checkpoint, we can lose the data after
> #5 Power-cut.
> 
> Any idea?

Yes, good catch. :)

What about tagging these GCed pages as cold data through set_cold_data and
clearing the tag in end_io, so that we can keep a reference count and wait on
writeback for them?
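
A rough userspace sketch of that idea, assuming two writeback counters like
the F2FS_WB_META/F2FS_WB_DATA pair in the quoted patch. Every name below
(sketch_page, gc_cold, sketch_classify() and so on) is hypothetical and only
illustrates the accounting, not the real f2fs code: a data page moved by GC
carries a tag, is counted in the class the checkpoint waits on, and loses
the tag in end_io.

```c
#include <stdbool.h>

/* Hypothetical counter classes, modelled loosely on the quoted patch. */
enum sketch_wb_type { SKETCH_WB_DATA, SKETCH_WB_META, SKETCH_WB_NR };

/* Hypothetical page state: gc_cold stands in for the cold-data tag. */
struct sketch_page {
	bool is_meta;
	bool gc_cold;
};

static int sketch_wb_count[SKETCH_WB_NR];

/* Pages moved by GC are accounted together with metadata. */
static enum sketch_wb_type sketch_classify(const struct sketch_page *p)
{
	return (p->is_meta || p->gc_cold) ? SKETCH_WB_META : SKETCH_WB_DATA;
}

/* Submission side: bump the counter for the page's class. */
static void sketch_submit(const struct sketch_page *p)
{
	sketch_wb_count[sketch_classify(p)]++;
}

/* Completion side: drop the counter first, then clear the tag. */
static void sketch_end_io(struct sketch_page *p)
{
	sketch_wb_count[sketch_classify(p)]--;
	p->gc_cold = false;
}

/* Checkpoint only waits while pages in the waited class are in flight. */
static bool sketch_checkpoint_must_wait(void)
{
	return sketch_wb_count[SKETCH_WB_META] > 0;
}
```

With this accounting an ordinary data page in flight does not hold up the
checkpoint, while a GC-migrated page does until its end_io has run.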

Thanks,

> 
> Thanks,
> 
>>
>> Signed-off-by: Chao Yu 
>> ---
>>  fs/f2fs/checkpoint.c |  2 +-
>>  fs/f2fs/data.c   | 36 
>>  fs/f2fs/debug.c  |  7 ---
>>  fs/f2fs/f2fs.h   |  8 +---
>>  4 files changed, 42 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>> index 7bece59..bdf8a50 100644
>> --- a/fs/f2fs/checkpoint.c
>> +++ b/fs/f2fs/checkpoint.c
>> @@ -1003,7 +1003,7 @@ static void wait_on_all_pages_writeback(struct 
>> f2fs_sb_info *sbi)
>>  for (;;) {
>>  prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
>>  
>> -if (!atomic_read(&sbi->nr_wb_bios))
>> +if (!get_pages(sbi, F2FS_WB_META))
>>  break;
>>  
>>  io_schedule_timeout(5*HZ);
>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>> index 66d2aee..f52cec3 100644
>> --- a/fs/f2fs/data.c
>> +++ b/fs/f2fs/data.c
>> @@ -29,6 +29,26 @@
>>  #include "trace.h"
>>  #include 
>>  
>> +static bool f2fs_is_meta_data(struct page *page)
>> +{
>> +struct address_space *mapping = page->mapping;
>> +struct f2fs_sb_info *sbi;
>> +struct inode *inode;
>> +
>> +/* it is bounce page of encrypted regular inode */
>> +if (!mapping)
>> +return false;
>> +
>> +inode = mapping->host;
>> +sbi = F2FS_I_SB(inode);
>> +
>> +if ((inode->i_ino == F2FS_META_INO(sbi) &&
>> +page->index < MAIN_BLKADDR(sbi)) ||
>> +inode->i_ino ==  F2FS_NODE_INO(sbi) ||
>> +S_ISDIR(inode->i_mode))
>> +return true;
>> +return false;
>> +}
>>  static void f2fs_read_end_io(struct bio *bio)
>>  {
>>  struct bio_vec *bvec;
>> @@ -73,6 +93,7 @@ static void f2fs_write_end_io(struct bio *bio)
>>  
>>  bio_for_each_segment_all(bvec, bio, i) {
>>  struct page *page = bvec->bv_page;
>> +bool is_meta = f2fs_is_meta_data(page);
>>  
>>  fscrypt_pullback_bio_page(&page, true);
>>  
>> @@ -80,9 +101,10 @@ static void f2fs_write_end_io(struct bio *bio)
>>  mapping_set_error(page->mapping, -EIO);
>>  f2fs_stop_checkpoint(sbi, true);
>>  }
>> +dec_page_count(sbi, is_meta ? F2FS_WB_META : F2FS_WB_DATA);
>>  end_page_writeback(page);
>>  }
>> -if (atomic_dec_and_test(&sbi->nr_wb_bios) &&
>> +if (!get_pages(sbi, F2FS_WB_META) &&
>>  wq_has_sleeper(&sbi->cp_wait))
>>  wake_up(&sbi->cp_wait);
>>  
>> @@ -111,7 +133,6 @@ static inline void __submit_bio(struct f2fs_sb_info *sbi,
>>  struct bio *bio, enum page_type type)
>>  {
>>  if (!is_read_io(bio_op(bio))) {
>> -atomic_inc(&sbi->nr_wb_bios);
>>  if (f2fs_sb_mounted_blkzoned(sbi->sb) &&
>>  current->plug && (type == DATA || type == NODE))
>>  blk_finish_plug(current->plug);
>> @@ -272,6 +293,15 @@ void f2fs_submit_page_mbio(struct f2fs_io_info *fio)
>>  verify_block_addr(sbi, fio->old_blkaddr);
>>  verify_block_addr(sbi, fio->new_blkaddr);
>>  
>> +bio_page = fio->encrypted_page ? fio->encrypted_page : fio->page;
>> +
>> +if (!is_read) {
>> +bool is_meta;
>> +
>> +is_meta = f2fs_is_meta_data(bio_page);
>> +inc_page_count(sbi, is_meta ? F2FS_WB_META : F2FS_WB_DATA);
>> +}
>> +
>>  down_write(&io->io_rwsem);
>>  
>>  if (io->bio && (io->last_block_in_bio != fio->new_blkaddr - 1 ||
>> @@ -284,8 +314,6 @@ void f2fs_submit_page_mbio(struct f2fs_io_info *fio)
>>  io->fio = *fio;
>>  }

[PATCH v2] kvm: x86: don't print warning messages for unimplemented msrs

2016-11-14 Thread Bandan Das

Change unimplemented msrs messages to use pr_debug.
If CONFIG_DYNAMIC_DEBUG is set, then these messages can be
enabled at run time or else -DDEBUG can be used at compile
time to enable them. These messages will still be printed if
ignore_msrs=1.

Signed-off-by: Bandan Das 
---
v2:
use kvm_debug_ratelimited for vcpu_debug_ratelimited

This is a follow up to RFC posted by Dave at
https://patchwork.kernel.org/patch/9238227/ which uses pr_debug_ratelimited
when ignore_msrs is not set.

 arch/x86/kvm/mmu.c   | 2 +-
 arch/x86/kvm/x86.c   | 5 +++--
 include/linux/kvm_host.h | 6 ++
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d9c7e98..1b3f241 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4958,7 +4958,7 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, 
struct kvm_memslots *slots)
 * zap all shadow pages.
 */
if (unlikely((slots->generation & MMIO_GEN_MASK) == 0)) {
-   printk_ratelimited(KERN_DEBUG "kvm: zapping shadow pages for 
mmio generation wraparound\n");
+   kvm_debug_ratelimited("kvm: zapping shadow pages for mmio 
generation wraparound\n");
kvm_mmu_invalidate_zap_all_pages(kvm);
}
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3017de0..5d50403 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2280,7 +2280,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
if (kvm_pmu_is_valid_msr(vcpu, msr))
return kvm_pmu_set_msr(vcpu, msr_info);
if (!ignore_msrs) {
-   vcpu_unimpl(vcpu, "unhandled wrmsr: 0x%x data 0x%llx\n",
+   vcpu_debug_ratelimited(vcpu, "unhandled wrmsr: 0x%x 
data 0x%llx\n",
msr, data);
return 1;
} else {
@@ -2492,7 +2492,8 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
return kvm_pmu_get_msr(vcpu, msr_info->index, 
&msr_info->data);
if (!ignore_msrs) {
-   vcpu_unimpl(vcpu, "unhandled rdmsr: 0x%x\n", 
msr_info->index);
+   vcpu_debug_ratelimited(vcpu, "unhandled rdmsr: 0x%x\n",
+  msr_info->index);
return 1;
} else {
vcpu_unimpl(vcpu, "ignored rdmsr: 0x%x\n", 
msr_info->index);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 01c0b9c..274bf34 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -439,6 +439,9 @@ struct kvm {
pr_info("kvm [%i]: " fmt, task_pid_nr(current), ## __VA_ARGS__)
 #define kvm_debug(fmt, ...) \
pr_debug("kvm [%i]: " fmt, task_pid_nr(current), ## __VA_ARGS__)
+#define kvm_debug_ratelimited(fmt, ...) \
+   pr_debug_ratelimited("kvm [%i]: " fmt, task_pid_nr(current), \
+## __VA_ARGS__)
 #define kvm_pr_unimpl(fmt, ...) \
pr_err_ratelimited("kvm [%i]: " fmt, \
   task_tgid_nr(current), ## __VA_ARGS__)
@@ -450,6 +453,9 @@ struct kvm {
 
 #define vcpu_debug(vcpu, fmt, ...) \
kvm_debug("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
+#define vcpu_debug_ratelimited(vcpu, fmt, ...) \
+   kvm_debug_ratelimited("vcpu%i " fmt, (vcpu)->vcpu_id,   \
+ ## __VA_ARGS__)
 #define vcpu_err(vcpu, fmt, ...)   \
kvm_err("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
 
-- 
2.9.3
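
The macro layering added above can be tried out in userspace with a small
sketch. It assumes the GNU `## __VA_ARGS__` extension that the kernel macros
also rely on. All sketch_* names are hypothetical, the fixed burst is a
simplification of the kernel's interval-based rate limiting, and the sketch
does not model pr_debug being compiled out or toggled at run time.

```c
#include <stdio.h>

/* Hypothetical stand-in for rate limiting: allow a fixed burst of messages. */
#define SKETCH_BURST 2

static char sketch_log[256];    /* last message that was emitted */
static int sketch_emitted;      /* messages actually formatted */
static int sketch_suppressed;   /* messages dropped by the limiter */
static int sketch_pid = 1234;   /* placeholder for task_pid_nr(current) */

/* Bottom layer: format the message unless the burst is used up. */
#define sketch_debug_ratelimited(fmt, ...)                              \
	do {                                                            \
		if (sketch_emitted < SKETCH_BURST) {                    \
			snprintf(sketch_log, sizeof(sketch_log), fmt,   \
				 ## __VA_ARGS__);                       \
			sketch_emitted++;                               \
		} else {                                                \
			sketch_suppressed++;                            \
		}                                                       \
	} while (0)

/* Middle layer: prepends the "kvm [pid]: " prefix. */
#define sketch_kvm_debug_ratelimited(fmt, ...)                          \
	sketch_debug_ratelimited("kvm [%i]: " fmt, sketch_pid,          \
				 ## __VA_ARGS__)

/* Top layer: prepends the vcpu id and forwards to the middle layer. */
#define sketch_vcpu_debug_ratelimited(vcpu_id, fmt, ...)                \
	sketch_kvm_debug_ratelimited("vcpu%i " fmt, (vcpu_id),          \
				     ## __VA_ARGS__)
```

Each layer only concatenates a prefix onto the format string and passes the
extra argument along, so a call such as
sketch_vcpu_debug_ratelimited(0, "unhandled rdmsr: 0x%x\n", 0x123u) ends up
as a single snprintf() with the combined format.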



Re: [PATCH v11 10/22] vfio iommu type1: Add support for mediated devices

2016-11-14 Thread Kirti Wankhede


On 11/15/2016 10:47 AM, Alexey Kardashevskiy wrote:
> On 08/11/16 17:52, Alexey Kardashevskiy wrote:
>> On 05/11/16 08:10, Kirti Wankhede wrote:
>>> VFIO IOMMU drivers are designed for the devices which are IOMMU capable.
>>> Mediated device only uses IOMMU APIs, the underlying hardware can be
>>> managed by an IOMMU domain.
>>>
>>> Aim of this change is:
>>> - To use most of the code of TYPE1 IOMMU driver for mediated devices
>>> - To support direct assigned device and mediated device in single module
>>>
>>> This change adds pin and unpin support for mediated device to TYPE1 IOMMU
>>> backend module. More details:
>>> - vfio_pin_pages() callback here uses task and address space of vfio_dma,
>>>   that is, of the process who mapped that iova range.
>>> - Added pfn_list tracking logic to address space structure. All pages
>>>   pinned through this interface are tracked in its address space.
>>> - Pinned pages list is used to verify unpinning request and to unpin
>>>   remaining pages while detaching the group for that device.
>>> - Page accounting is updated to account in its address space where the
>>>   pages are pinned/unpinned.
>>> - Accounting for mdev device is only done if there is no iommu capable
>>>   domain in the container. When there is a direct device assigned to the
>>>   container and that domain is iommu capable, all pages are already pinned
>>>   during DMA_MAP.
>>> - Page accounting is updated on hot plug and unplug mdev device and pass
>>>   through device.
>>>
>>> Tested by assigning below combinations of devices to a single VM:
>>> - GPU pass through only
>>
>> This does not require this patchset, right?
>>

Sorry I missed this earlier.
This testing is required for this patch, because this patch touches code
that is used for direct device assignment. Also for page accounting, all
cases are considered i.e. when there is only pass through device in a
container, when there is pass through device + vGPU device in a
container. Also have to test that pages are pinned properly when device
is hotplugged. In that case vfio_iommu_replay() is called to take
necessary action.

>>> - vGPU device only
>>
>> Out of curiosity - how exactly did you test this? The exact GPU, how to
>> create vGPU, what was the QEMU command line and the guest does with this
>> passed device? Thanks.
> 
> ping?
> 

I'm testing this code with M60, with custom changes in our driver.
The steps to create a mediated device are listed in
Documentation/vfio-mediated-device.txt for the sample mtty driver. I'm
following the same steps for GPU. Quoting those steps here for you:

2. Create a mediated device by using the dummy device that you created in the
   previous step.

   # echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" >  \
         /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create

3. Add parameters to qemu-kvm.

   -device vfio-pci,\
    sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001


Thanks,
Kirti



Re: [RFC PATCH] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-14 Thread Juergen Gross
On 15/11/16 01:11, Alex Thorlton wrote:
> Hey everyone,
> 
> We're having problems with large systems hitting a BUG in
> xen_memory_setup, due to extra e820 entries created in the
> XENMEM_machine_memory_map callback.  The change in the patch gets things
> working, but Boris and I wanted to get opinions on whether or not this
> is the appropriate/entire solution, which is why I've sent it as an RFC
> for now.
> 
> Boris pointed out to me that E820_X_MAX is only large when CONFIG_EFI=y,
> which is a detail worth discussing.  He proposed possibly adding
> CONFIG_XEN to the conditions under which we set E820_X_MAX to a larger
> value than E820MAX, since the Xen e820 table isn't bound by the
> zero-page memory limitations.
> 
> I do *slightly* question the use of E820_X_MAX here, only from a
> cosmetic perspective, as I believe this macro is intended to describe
> the maximum size of the extended e820 table, which, AFAIK, is not used
> by the Xen HV.  That being said, there isn't exactly a "more
> appropriate" macro/variable to use, so this may not really be an issue.
> 
> Any input on the patch, or the questions I've raised above is greatly
> appreciated!

While I think extending the e820 table is the right thing to do I'm
questioning the assumptions here.

Looking briefly through the Xen hypervisor sources I think it isn't
yet ready for such large machines: the hypervisor's e820 map seems to
be still limited to 128 e820 entries. Jan, did I overlook an EFI
specific path extending this limitation?

If I'm right, the Xen hypervisor should be prepared for a larger
e820 map, but that alone won't help, as additional entries would
still be created for the IOAPICs.

So I think we need something like:

#define E820_XEN_MAX (E820_X_MAX + MAX_IO_APICS)

and use this for sizing xen_e820_map[].
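
A minimal userspace sketch of that sizing, to make the arithmetic concrete.
The numeric values, the struct and e820_sketch_add() are placeholders made up
for illustration; they are not the kernel's definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values, for illustration only; the real ones come from
 * the kernel headers and depend on the configuration. */
#define E820_X_MAX	128
#define MAX_IO_APICS	128

/* Room for the regular map plus one extra entry per IOAPIC. */
#define E820_XEN_MAX	(E820_X_MAX + MAX_IO_APICS)

/* Simplified stand-in for the kernel's e820 entry. */
struct e820_entry_sketch {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

static struct e820_entry_sketch xen_e820_map_sketch[E820_XEN_MAX];

/*
 * Append one entry after the first 'nr' entries.
 * Returns the new entry count, or -1 if the map is already full.
 */
static int e820_sketch_add(struct e820_entry_sketch *map, int nr,
			   uint64_t addr, uint64_t size, uint32_t type)
{
	assert(nr >= 0);

	if (nr >= E820_XEN_MAX)
		return -1;

	map[nr].addr = addr;
	map[nr].size = size;
	map[nr].type = type;

	return nr + 1;
}
```

With a map sized this way, a full regular table can still take the per-IOAPIC
entries without running past the end of the array.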


Juergen


Re: [PATCH -tip v2 2/6] selftests: ftrace: Initialize ftrace before each test

2016-11-14 Thread Masami Hiramatsu
On Mon, 14 Nov 2016 13:12:00 -0500
Steven Rostedt  wrote:

> On Sun, 30 Oct 2016 15:54:10 +0900
> Masami Hiramatsu  wrote:
> 
> > Reset ftrace to initial state before running each test.
> > This fixes some test cases to enable tracing before starting
> > trace test. This can avoid false-positive failure when
> > previous testcase fails while disabling tracing.
> > 
> > Signed-off-by: Masami Hiramatsu 
> > Suggested-by: Steven Rostedt 
> > ---
> >  tools/testing/selftests/ftrace/ftracetest   |2 +-
> >  tools/testing/selftests/ftrace/test.d/functions |   25 +++
> >  2 files changed, 26 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
> > index 4c6a0bf..a03d366 100755
> > --- a/tools/testing/selftests/ftrace/ftracetest
> > +++ b/tools/testing/selftests/ftrace/ftracetest
> > @@ -228,7 +228,7 @@ trap 'SIG_RESULT=$XFAIL' $SIG_XFAIL
> >  
> >  __run_test() { # testfile
> >    # setup PID and PPID, $$ is not updated.
> > -  (cd $TRACING_DIR; read PID _ < /proc/self/stat ; set -e; set -x; . $1)
> > +  (cd $TRACING_DIR; read PID _ < /proc/self/stat; set -e; set -x; initialize_ftrace; . $1)
> >    [ $? -ne 0 ] && kill -s $SIG_FAIL $SIG_PID
> >  }
> >  
> > diff --git a/tools/testing/selftests/ftrace/test.d/functions b/tools/testing/selftests/ftrace/test.d/functions
> > index c37262f..fbaf565 100644
> > --- a/tools/testing/selftests/ftrace/test.d/functions
> > +++ b/tools/testing/selftests/ftrace/test.d/functions
> > @@ -23,3 +23,28 @@ reset_trigger() { # reset all current setting triggers
> >  done
> >  }
> >  
> > +reset_events_filter() { # reset all current setting filters
> > +grep -v ^none events/*/*/filter |
> > +while read line; do
> > +   echo 0 > `echo $line | cut -f1 -d:`
> > +done
> > +}
> > +
> > +disable_events() {
> > +echo 0 > events/enable
> > +}
> > +
> > +initialize_ftrace() { # Reset ftrace to initial-state
> > +# As the initial state, ftrace will be set to nop tracer,
> > +# no events, no triggers, no filters, no function filters,
> > +# no probes, and tracing on.
> > +disable_tracing
> > +reset_tracer
> > +reset_trigger
> > +reset_events_filter
> > +disable_events
> > +echo | tee set_ftrace_* set_graph_* stack_trace_filter set_event_pid
> 
> I just disabled function graph tracing, and this causes every test to
> fail.
> 
>tee: set_graph_*: Permission denied

Oops, right. OK, I'll fix that.

Thanks!

> 
> -- Steve
> 
> > +echo > kprobe_events
> > +echo > uprobe_events
> > +enable_tracing
> > +}
> 


-- 
Masami Hiramatsu 


Re: kvm: deadlock between kvm_vm_ioctl_get_dirty_log/kvm_hv_set_msr_common/kvm_create_pit

2016-11-14 Thread Dmitry Vyukov
On Tue, Nov 15, 2016 at 7:27 AM, Dmitry Vyukov  wrote:
> Hello,
>
> The following program produces a deadlocked, unkillable process:
> https://gist.githubusercontent.com/dvyukov/fb7e93f6618f4eccb84d419ea6cec491/raw/a14b60250e593eb1b61f50cead41059dc49ceff2/gistfile1.txt
>
>
> # cat /proc/9362/task/*/stack
> [] __synchronize_srcu+0x2f8/0x4a0 kernel/rcu/srcu.c:448
> [] synchronize_srcu_expedited+0x13/0x20 
> kernel/rcu/srcu.c:510
> [] kvm_io_bus_register_dev+0x2ab/0x3e0
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:3559
> [] kvm_create_pit+0x5c6/0x8c0 arch/x86/kvm/i8254.c:694
> [] kvm_arch_vm_ioctl+0x1406/0x23c0 arch/x86/kvm/x86.c:3956
> [] kvm_vm_ioctl+0x1fa/0x1a70
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:3099
> [< inline >] vfs_ioctl fs/ioctl.c:43
> [] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
> [< inline >] SYSC_ioctl fs/ioctl.c:694
> [] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
> [] entry_SYSCALL_64_fastpath+0x23/0xc6
> arch/x86/entry/entry_64.S:209
> [] 0x
>
> [] kvm_hv_set_msr_common+0x163/0x2a30
> arch/x86/kvm/hyperv.c:1145
> [] kvm_set_msr_common+0xb0b/0x23a0 arch/x86/kvm/x86.c:2261
> [] vmx_set_msr+0x27d/0xcb0 arch/x86/kvm/vmx.c:3149
> [] kvm_set_msr+0xd9/0x170 arch/x86/kvm/x86.c:1084
> [] do_set_msr+0x123/0x1a0 arch/x86/kvm/x86.c:1113
> [< inline >] __msr_io arch/x86/kvm/x86.c:2523
> [] msr_io+0x250/0x460 arch/x86/kvm/x86.c:2560
> [] kvm_arch_vcpu_ioctl+0x360/0x44a0 arch/x86/kvm/x86.c:3401
> [] kvm_vcpu_ioctl+0x237/0x11c0
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2710
> [< inline >] vfs_ioctl fs/ioctl.c:43
> [] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
> [< inline >] SYSC_ioctl fs/ioctl.c:694
> [] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
> [] entry_SYSCALL_64_fastpath+0x23/0xc6
> arch/x86/entry/entry_64.S:209
>
> [] 0x
> [] kvm_vm_ioctl_get_dirty_log+0x8f/0x210
> arch/x86/kvm/x86.c:3779
> [] kvm_vm_ioctl+0x11e4/0x1a70
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2969
> [< inline >] vfs_ioctl fs/ioctl.c:43
> [] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
> [< inline >] SYSC_ioctl fs/ioctl.c:694
> [] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
> [] entry_SYSCALL_64_fastpath+0x23/0xc6
> arch/x86/entry/entry_64.S:209
> [] 0x
>
>
> INFO: task syz-executor:5833 blocked for more than 120 seconds.
>   Not tainted 4.9.0-rc5+ #28
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executorD17872  5833   4082 0x0004
>  880033944780 8800602f5100 8800652b0c80 8800391a2380
>  88006d122cd8 8800368763a8 8812c15c 41b58ab3
>  88006d123668 88006d123640 110006d0ec5c 88006d122cd8
> Call Trace:
>  [] schedule+0x10d/0x460 kernel/sched/core.c:3457
>  [] schedule_preempt_disabled+0x15/0x20
> kernel/sched/core.c:3490
>  [< inline >] __mutex_lock_common kernel/locking/mutex.c:582
>  [] mutex_lock_nested+0x686/0xf20 kernel/locking/mutex.c:621
>  [] kvm_hv_set_msr_common+0x163/0x2a30
> arch/x86/kvm/hyperv.c:1145
>  [] kvm_set_msr_common+0xb0b/0x23a0 arch/x86/kvm/x86.c:2261
>  [] vmx_set_msr+0x27d/0xcb0 arch/x86/kvm/vmx.c:3149
>  [] kvm_set_msr+0xd9/0x170 arch/x86/kvm/x86.c:1084
>  [] do_set_msr+0x123/0x1a0 arch/x86/kvm/x86.c:1113
>  [< inline >] __msr_io arch/x86/kvm/x86.c:2523
>  [] msr_io+0x250/0x460 arch/x86/kvm/x86.c:2560
>  [] kvm_arch_vcpu_ioctl+0x360/0x44a0 arch/x86/kvm/x86.c:3401
>  [] kvm_vcpu_ioctl+0x237/0x11c0
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2708
>  [< inline >] vfs_ioctl fs/ioctl.c:43
>  [] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
>  [< inline >] SYSC_ioctl fs/ioctl.c:694
>  [] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
>  [] entry_SYSCALL_64_fastpath+0x23/0xc6
>
> [ 3319.345108] Showing all locks held in the system:
> [ 3319.349897] 2 locks held by khungtaskd/1328:
> [ 3319.352888]  #0: [ 3319.354562]  (
> rcu_read_lock[ 3319.358168] ){..}
> , at: [ 3319.360511] [] watchdog+0x1cc/0xd70
> [ 3319.363841]  #1: [ 3319.364761]  (
> tasklist_lock[ 3319.367215] ){.+.+..}
> , at: [ 3319.369197] [] debug_show_all_locks+0xd2/0x420
> [ 3319.374809] 3 locks held by syz-executor/5833:
> [ 3319.388745]  #0: [ 3319.390145]  (
> &vcpu->mutex[ 3319.391749] ){+.+.+.}
> , at: [ 3319.392313] [] vcpu_load+0x21/0x70
> [ 3319.396281]  #1: [ 3319.398802]  (
> &kvm->srcu[ 3319.399431] ){..}
> , at: [ 3319.399883] [] msr_io+0x148/0x460
> [ 3319.403905]  #2: [ 3319.404639]  (
> &kvm->lock[ 3319.406582] ){+.+.+.}
> , at: [ 3319.409670] [] kvm_hv_set_msr_common+0x163/0x2a30
> [ 3319.422421] 2 locks held by syz-executor/5849:
> [ 3319.425646]  #0: [ 3319.426948]  (
> &kvm->lock[ 3319.427747] ){+.+.+.}
> , at: [ 3319.428368] [] kvm_arch_vm_ioctl+0xb4e/0x23c0
> [ 3319.429594]  #1: [ 3319.429942]  (
> &kvm->slots_lock[ 3319.430881] ){+.+.+.}
> , at: [ 3319.431631] [] kvm_create_pit+0x589/0x8c0
>
>
> On commit a25f0944ba9b1d8a6813fd6f1a86f1bd59ac25a6 (Nov 13)


kvm_vm_ioctl_get_dirty_log is probably unrelated because I also see
fol

kvm: deadlock between kvm_vm_ioctl_get_dirty_log/kvm_hv_set_msr_common/kvm_create_pit

2016-11-14 Thread Dmitry Vyukov
Hello,

The following program produces a deadlocked, unkillable process:
https://gist.githubusercontent.com/dvyukov/fb7e93f6618f4eccb84d419ea6cec491/raw/a14b60250e593eb1b61f50cead41059dc49ceff2/gistfile1.txt


# cat /proc/9362/task/*/stack
[] __synchronize_srcu+0x2f8/0x4a0 kernel/rcu/srcu.c:448
[] synchronize_srcu_expedited+0x13/0x20 kernel/rcu/srcu.c:510
[] kvm_io_bus_register_dev+0x2ab/0x3e0
arch/x86/kvm/../../../virt/kvm/kvm_main.c:3559
[] kvm_create_pit+0x5c6/0x8c0 arch/x86/kvm/i8254.c:694
[] kvm_arch_vm_ioctl+0x1406/0x23c0 arch/x86/kvm/x86.c:3956
[] kvm_vm_ioctl+0x1fa/0x1a70
arch/x86/kvm/../../../virt/kvm/kvm_main.c:3099
[< inline >] vfs_ioctl fs/ioctl.c:43
[] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
[< inline >] SYSC_ioctl fs/ioctl.c:694
[] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
[] entry_SYSCALL_64_fastpath+0x23/0xc6
arch/x86/entry/entry_64.S:209
[] 0x

[] kvm_hv_set_msr_common+0x163/0x2a30
arch/x86/kvm/hyperv.c:1145
[] kvm_set_msr_common+0xb0b/0x23a0 arch/x86/kvm/x86.c:2261
[] vmx_set_msr+0x27d/0xcb0 arch/x86/kvm/vmx.c:3149
[] kvm_set_msr+0xd9/0x170 arch/x86/kvm/x86.c:1084
[] do_set_msr+0x123/0x1a0 arch/x86/kvm/x86.c:1113
[< inline >] __msr_io arch/x86/kvm/x86.c:2523
[] msr_io+0x250/0x460 arch/x86/kvm/x86.c:2560
[] kvm_arch_vcpu_ioctl+0x360/0x44a0 arch/x86/kvm/x86.c:3401
[] kvm_vcpu_ioctl+0x237/0x11c0
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2710
[< inline >] vfs_ioctl fs/ioctl.c:43
[] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
[< inline >] SYSC_ioctl fs/ioctl.c:694
[] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
[] entry_SYSCALL_64_fastpath+0x23/0xc6
arch/x86/entry/entry_64.S:209

[] 0x
[] kvm_vm_ioctl_get_dirty_log+0x8f/0x210
arch/x86/kvm/x86.c:3779
[] kvm_vm_ioctl+0x11e4/0x1a70
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2969
[< inline >] vfs_ioctl fs/ioctl.c:43
[] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
[< inline >] SYSC_ioctl fs/ioctl.c:694
[] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
[] entry_SYSCALL_64_fastpath+0x23/0xc6
arch/x86/entry/entry_64.S:209
[] 0x


INFO: task syz-executor:5833 blocked for more than 120 seconds.
  Not tainted 4.9.0-rc5+ #28
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executorD17872  5833   4082 0x0004
 880033944780 8800602f5100 8800652b0c80 8800391a2380
 88006d122cd8 8800368763a8 8812c15c 41b58ab3
 88006d123668 88006d123640 110006d0ec5c 88006d122cd8
Call Trace:
 [] schedule+0x10d/0x460 kernel/sched/core.c:3457
 [] schedule_preempt_disabled+0x15/0x20
kernel/sched/core.c:3490
 [< inline >] __mutex_lock_common kernel/locking/mutex.c:582
 [] mutex_lock_nested+0x686/0xf20 kernel/locking/mutex.c:621
 [] kvm_hv_set_msr_common+0x163/0x2a30
arch/x86/kvm/hyperv.c:1145
 [] kvm_set_msr_common+0xb0b/0x23a0 arch/x86/kvm/x86.c:2261
 [] vmx_set_msr+0x27d/0xcb0 arch/x86/kvm/vmx.c:3149
 [] kvm_set_msr+0xd9/0x170 arch/x86/kvm/x86.c:1084
 [] do_set_msr+0x123/0x1a0 arch/x86/kvm/x86.c:1113
 [< inline >] __msr_io arch/x86/kvm/x86.c:2523
 [] msr_io+0x250/0x460 arch/x86/kvm/x86.c:2560
 [] kvm_arch_vcpu_ioctl+0x360/0x44a0 arch/x86/kvm/x86.c:3401
 [] kvm_vcpu_ioctl+0x237/0x11c0
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2708
 [< inline >] vfs_ioctl fs/ioctl.c:43
 [] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
 [< inline >] SYSC_ioctl fs/ioctl.c:694
 [] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
 [] entry_SYSCALL_64_fastpath+0x23/0xc6

[ 3319.345108] Showing all locks held in the system:
[ 3319.349897] 2 locks held by khungtaskd/1328:
[ 3319.352888]  #0: [ 3319.354562]  (
rcu_read_lock[ 3319.358168] ){..}
, at: [ 3319.360511] [] watchdog+0x1cc/0xd70
[ 3319.363841]  #1: [ 3319.364761]  (
tasklist_lock[ 3319.367215] ){.+.+..}
, at: [ 3319.369197] [] debug_show_all_locks+0xd2/0x420
[ 3319.374809] 3 locks held by syz-executor/5833:
[ 3319.388745]  #0: [ 3319.390145]  (
&vcpu->mutex[ 3319.391749] ){+.+.+.}
, at: [ 3319.392313] [] vcpu_load+0x21/0x70
[ 3319.396281]  #1: [ 3319.398802]  (
&kvm->srcu[ 3319.399431] ){..}
, at: [ 3319.399883] [] msr_io+0x148/0x460
[ 3319.403905]  #2: [ 3319.404639]  (
&kvm->lock[ 3319.406582] ){+.+.+.}
, at: [ 3319.409670] [] kvm_hv_set_msr_common+0x163/0x2a30
[ 3319.422421] 2 locks held by syz-executor/5849:
[ 3319.425646]  #0: [ 3319.426948]  (
&kvm->lock[ 3319.427747] ){+.+.+.}
, at: [ 3319.428368] [] kvm_arch_vm_ioctl+0xb4e/0x23c0
[ 3319.429594]  #1: [ 3319.429942]  (
&kvm->slots_lock[ 3319.430881] ){+.+.+.}
, at: [ 3319.431631] [] kvm_create_pit+0x589/0x8c0


On commit a25f0944ba9b1d8a6813fd6f1a86f1bd59ac25a6 (Nov 13)


[PATCH] kvm: x86: don't print warning messages for unimplemented msrs

2016-11-14 Thread Bandan Das

Change unimplemented msrs messages to use pr_debug.
If CONFIG_DYNAMIC_DEBUG is set, then these messages can be
enabled at run time or else -DDEBUG can be used at compile
time to enable them. These messages will still be printed if
ignore_msrs=1.

Signed-off-by: Bandan Das 
---
This is a follow up to RFC posted by Dave at
https://patchwork.kernel.org/patch/9238227/ which uses pr_debug_ratelimited
when ignore_msrs is not set.

 arch/x86/kvm/mmu.c   | 2 +-
 arch/x86/kvm/x86.c   | 5 +++--
 include/linux/kvm_host.h | 5 +
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d9c7e98..1b3f241 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4958,7 +4958,7 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, struct kvm_memslots *slots)
 * zap all shadow pages.
 */
if (unlikely((slots->generation & MMIO_GEN_MASK) == 0)) {
-   printk_ratelimited(KERN_DEBUG "kvm: zapping shadow pages for mmio generation wraparound\n");
+   kvm_debug_ratelimited("kvm: zapping shadow pages for mmio generation wraparound\n");
kvm_mmu_invalidate_zap_all_pages(kvm);
}
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3017de0..5d50403 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2280,7 +2280,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (kvm_pmu_is_valid_msr(vcpu, msr))
return kvm_pmu_set_msr(vcpu, msr_info);
if (!ignore_msrs) {
-   vcpu_unimpl(vcpu, "unhandled wrmsr: 0x%x data 0x%llx\n",
+   vcpu_debug_ratelimited(vcpu, "unhandled wrmsr: 0x%x data 0x%llx\n",
msr, data);
return 1;
} else {
@@ -2492,7 +2492,8 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
return kvm_pmu_get_msr(vcpu, msr_info->index, &msr_info->data);
if (!ignore_msrs) {
-   vcpu_unimpl(vcpu, "unhandled rdmsr: 0x%x\n", msr_info->index);
+   vcpu_debug_ratelimited(vcpu, "unhandled rdmsr: 0x%x\n",
+  msr_info->index);
return 1;
} else {
vcpu_unimpl(vcpu, "ignored rdmsr: 0x%x\n", msr_info->index);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 01c0b9c..e4c0980 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -439,6 +439,9 @@ struct kvm {
pr_info("kvm [%i]: " fmt, task_pid_nr(current), ## __VA_ARGS__)
 #define kvm_debug(fmt, ...) \
pr_debug("kvm [%i]: " fmt, task_pid_nr(current), ## __VA_ARGS__)
+#define kvm_debug_ratelimited(fmt, ...) \
+   pr_debug_ratelimited("kvm [%i]: " fmt, task_pid_nr(current), \
+## __VA_ARGS__)
 #define kvm_pr_unimpl(fmt, ...) \
pr_err_ratelimited("kvm [%i]: " fmt, \
   task_tgid_nr(current), ## __VA_ARGS__)
@@ -450,6 +453,8 @@ struct kvm {
 
 #define vcpu_debug(vcpu, fmt, ...) \
kvm_debug("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
+#define vcpu_debug_ratelimited(vcpu, fmt, ...) \
+   kvm_debug_ratelimited("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
 #define vcpu_err(vcpu, fmt, ...)   \
kvm_err("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
 
-- 
2.5.5



[GIT PULL] arch/tile bugfix for 4.9-rc6

2016-11-14 Thread Chris Metcalf

Linus,

Please pull the following change for 4.9-rc6 from:

git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git stable

This just fixes an incompatibility with tile __ro_after_init.

Chris Metcalf (1):
  tile: handle __ro_after_init like parisc does

 arch/tile/include/asm/cache.h | 3 +++
 1 file changed, 3 insertions(+)

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com




Re: [PATCH 2/3] qemu: Implement virtio-pstore device

2016-11-14 Thread Namhyung Kim
On Fri, Nov 11, 2016 at 12:50:03AM +0200, Michael S. Tsirkin wrote:
> On Fri, Sep 16, 2016 at 07:05:47PM +0900, Namhyung Kim wrote:
> > On Tue, Sep 13, 2016 at 06:57:10PM +0300, Michael S. Tsirkin wrote:
> > > On Sat, Aug 20, 2016 at 05:07:43PM +0900, Namhyung Kim wrote:
> > > > +
> > > > +/* the index should match to the type value */
> > > > +static const char *virtio_pstore_file_prefix[] = {
> > > > +"unknown-",/* VIRTIO_PSTORE_TYPE_UNKNOWN */
> > > 
> > > Is there value in treating everything unexpected as "unknown"
> > > and rotating them as if they were logs?
> > > It might be better to treat everything that's not known
> > > as guest error.
> > 
> > I was thinking about the version mismatch between the kernel and qemu.
> > I'd like the device to be able to deal with a new kernel version which
> > might implement a new pstore message type.  It will be saved as
> > unknown but the kernel can read it properly later.
> 
> Well it'll have a different prefix. E.g. if kernel has
> two different types they will end up in the same
> file, hardly what was wanted.

Right, I think the 'type' info needs to be added to the filename for
unknown types.
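
Roughly what I have in mind, as a standalone sketch. The table contents other
than "unknown-" and the helper name pstore_prefix_sketch() are made up for
illustration, and the final naming scheme may well differ:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of the prefix table; the index matches the type. */
static const char *prefix_sketch[] = {
	"unknown-",	/* type 0 */
	"dmesg-",	/* type 1, assumed for illustration */
	"console-",	/* type 2, assumed for illustration */
};

#define PREFIX_COUNT (sizeof(prefix_sketch) / sizeof(prefix_sketch[0]))

/*
 * Write the file name prefix for 'type' into buf.
 * Known types use their table entry; any other value gets
 * "unknown-<type>-", so two different unknown types never share a file.
 * Returns the snprintf() result.
 */
static int pstore_prefix_sketch(unsigned int type, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	if (type > 0 && type < PREFIX_COUNT)
		return snprintf(buf, len, "%s", prefix_sketch[type]);

	return snprintf(buf, len, "unknown-%u-", type);
}
```

A newer kernel that introduces, say, types 7 and 9 would then get
"unknown-7-" and "unknown-9-" files instead of one shared "unknown-" file.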

Thanks,
Namhyung


Re: [RFC PATCH] x86/debug: Dump more detailed segfault info

2016-11-14 Thread Ingo Molnar

* Borislav Petkov  wrote:

> On Sun, Nov 13, 2016 at 12:25:52PM +0100, Borislav Petkov wrote:
> > Hmm, enabling all *PRINTK* options from your .config doesn't change
> > anything for my qemu guest here. Lemme try with your full config.
> 
> Same with your .config:
> 
> [  115.694717] strsep[3027]: segfault at 40066b ip 77abe22b sp 
> 7fffe990 error 7 in libc-2.19.so[77a33000+19f000]
> [  115.700181] RIP: 0033:[<77abe22b>]  [<77abe22b>] 
> 0x77abe22b
> [  115.704843] RSP: 002b:7fffe990  EFLAGS: 00010202
> [  115.707183] RAX: 0040066b RBX: 00400664 RCX: 
> 
> [  115.709189] RDX:  RSI: 003d RDI: 
> 00400665
> [  115.711207] RBP: 7fffe9b0 R08: 77dd7c60 R09: 
> 77deae20
> [  115.713630] R10: 7fffe770 R11: 77abe200 R12: 
> 00400460
> [  115.715653] R13: 7fffeaa0 R14:  R15: 
> 
> [  115.717651] FS:  77fdc700() GS:88007ed0() 
> knlGS:
> [  115.719554] CS:  0010 DS:  ES:  CR0: 80050033
> [  115.720393] CR2: 0040066b CR3: 79f4f000 CR4: 
> 000406e0
> [  115.721409] Code: [  115.721692] 74 33 80 7e 01 00 74 22 48 89 df e8 5a 8a 
> ff ff 48 85 c0 74 20  00 00 48 83 c0 01 48 89 45 00 48 89 d8 48 83 c4 08 
> 5b 5d c3 0f b6 13 38 d0 74 29 84 d2 75 15 48 c7 45 00 00 00 00 00 48 83 c4
> 
> Is this a real hw issue? I.e., maybe I should not be doing this in a
> guest?

So I think the line breaking artifact might be due to the following commit:

  bfd8d3f23b51 ("printk: make reading the kernel log flush pending lines")

... which Linus reverted upstream a few hours ago:

 commit f5c9f9c72395c3291c2e35c905dedae2b98475a4
 Author: Linus Torvalds 
 Date:   Mon Nov 14 09:31:52 2016 -0800

Revert "printk: make reading the kernel log flush pending lines"

This reverts commit bfd8d3f23b51018388be0411ccbc2d56277fe294.

It turns out that this flushes things much too aggressively, and causes
lines to break up when the system logger races with new continuation
lines being printed.
...

Thanks,

Ingo


Re: perf: fuzzer KASAN slab-out-of-bounds in snb_uncore_imc_event_del

2016-11-14 Thread Dmitry Vyukov
On Tue, Nov 15, 2016 at 6:57 AM, Vince Weaver  wrote:
> On Mon, 14 Nov 2016, Vince Weaver wrote:
>
>> Anyway as per the suggestion at Linux Plumbers I enabled KASAN and on my
>> haswell machine it falls over in a few minutes of running the perf_fuzzer.
>>
>> [  205.740194] 
>> ==
>> [  205.748005] BUG: KASAN: slab-out-of-bounds in 
>> snb_uncore_imc_event_del+0x6c/0xa0 at addr 8800caa43768
>> [  205.758324] Read of size 8 by task perf_fuzzer/6618
>> [  205.763589] CPU: 0 PID: 6618 Comm: perf_fuzzer Not tainted 4.9.0-rc5 #4
>> [  205.770721] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
>> 01/26/2014
>> [  205.778689]  8800c3c479b8 816bb796 88011ec00600 
>> 8800caa43580
>> [  205.786759]  8800c3c479e0 812fb961 8800c3c47a78 
>> 8800caa43580
>> [  205.794850]  8800caa43580 8800c3c47a68 812fbbd8 
>> 8800c3c47a28
>> [  205.802911] Call Trace:
>> [  205.805559]  [] dump_stack+0x63/0x8d
>> [  205.811135]  [] kasan_object_err+0x21/0x70
>> [  205.817267]  [] kasan_report_error+0x1d8/0x4c0
>> [  205.823752]  [] ? __lock_is_held+0x75/0xc0
>> [  205.829868]  [] ? snb_uncore_imc_read_counter+0x42/0x50
>> [  205.837198]  [] ? uncore_perf_event_update+0xe2/0x160
>> [  205.844337]  [] kasan_report+0x39/0x40
>> [  205.850085]  [] ? snb_uncore_imc_event_del+0x6c/0xa0


If you pipe the report through
https://github.com/google/sanitizers/blob/master/address-sanitizer/tools/kasan_symbolize.py
it will give you line numbers and inlined frames.

> The best I can tell this maps to:
>
> static void snb_uncore_imc_event_del(struct perf_event *event, int flags)
> {
> struct intel_uncore_box *box = uncore_event_to_box(event);
> int i;
>
> snb_uncore_imc_event_stop(event, PERF_EF_UPDATE);
>
> for (i = 0; i < box->n_events; i++) {
>  if (event == box->event_list[i]) {
> --box->n_events;
> break;
> }
> }
> }
>
> Can this code be right?  Does it actually remove the event?
> The similar code in
>
> static void uncore_pmu_event_del(struct perf_event *event, int flags)
>
> 
>
> for (i = 0; i < box->n_events; i++) {
> if (event == box->event_list[i]) {
> uncore_put_event_constraint(box, event);
>
> for (++i; i < box->n_events; i++)
> box->event_list[i - 1] = box->event_list[i];
>
> --box->n_events;
> break;
> }
> }
>
>
> seems like it is more likely to be correct.
>
> Vince
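
To illustrate the difference between the two loops quoted above, here is a
userspace sketch of the shifting variant. struct box_sketch and
box_del_event() are simplified stand-ins invented for this example, not the
kernel's types:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 8

/* Simplified stand-in for the event list kept in the uncore box. */
struct box_sketch {
	int n_events;
	const void *event_list[MAX_EVENTS];
};

/*
 * Remove 'event' from the list by moving the later entries down one
 * slot before decrementing the count, in the style of
 * uncore_pmu_event_del().
 * Returns 1 if the event was found and removed, 0 otherwise.
 */
static int box_del_event(struct box_sketch *box, const void *event)
{
	int i;

	assert(box->n_events <= MAX_EVENTS);

	for (i = 0; i < box->n_events; i++) {
		if (event == box->event_list[i]) {
			for (++i; i < box->n_events; i++)
				box->event_list[i - 1] = box->event_list[i];

			--box->n_events;
			return 1;
		}
	}

	return 0;
}
```

If the inner loop is left out, decrementing n_events only hides the last
slot: deleting the first of several events leaves the deleted pointer at
index 0 and drops a live event from the end.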


[PATCH v5 2/5] wcn36xx: Transition driver to SMD client

2016-11-14 Thread Bjorn Andersson
The wcn36xx wifi driver follows the life cycle of the WLAN_CTRL SMD
channel; as such, it should be an SMD client. This patch makes this
transition, now that we have the necessary frameworks available.

Signed-off-by: Bjorn Andersson 
---

Changes since v4:
- Added Kconfig dependency to handle dependencies compiled as modules

 drivers/net/wireless/ath/wcn36xx/Kconfig   |  2 +
 drivers/net/wireless/ath/wcn36xx/dxe.c | 16 +++---
 drivers/net/wireless/ath/wcn36xx/main.c| 79 --
 drivers/net/wireless/ath/wcn36xx/smd.c | 31 +---
 drivers/net/wireless/ath/wcn36xx/smd.h |  5 ++
 drivers/net/wireless/ath/wcn36xx/wcn36xx.h | 21 +++-
 6 files changed, 88 insertions(+), 66 deletions(-)

diff --git a/drivers/net/wireless/ath/wcn36xx/Kconfig b/drivers/net/wireless/ath/wcn36xx/Kconfig
index 591ebaea8265..4b83e87f0b94 100644
--- a/drivers/net/wireless/ath/wcn36xx/Kconfig
+++ b/drivers/net/wireless/ath/wcn36xx/Kconfig
@@ -1,6 +1,8 @@
 config WCN36XX
tristate "Qualcomm Atheros WCN3660/3680 support"
depends on MAC80211 && HAS_DMA
+   depends on QCOM_WCNSS_CTRL || QCOM_WCNSS_CTRL=n
+   depends on QCOM_SMD || QCOM_SMD=n
---help---
  This module adds support for wireless adapters based on
  Qualcomm Atheros WCN3660 and WCN3680 mobile chipsets.
diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c
index 231fd022f0f5..87dfdaf9044c 100644
--- a/drivers/net/wireless/ath/wcn36xx/dxe.c
+++ b/drivers/net/wireless/ath/wcn36xx/dxe.c
@@ -23,6 +23,7 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include 
+#include 
 #include "wcn36xx.h"
 #include "txrx.h"
 
@@ -151,9 +152,12 @@ int wcn36xx_dxe_alloc_ctl_blks(struct wcn36xx *wcn)
goto out_err;
 
/* Initialize SMSM state  Clear TX Enable RING EMPTY STATE */
-   ret = wcn->ctrl_ops->smsm_change_state(
-   WCN36XX_SMSM_WLAN_TX_ENABLE,
-   WCN36XX_SMSM_WLAN_TX_RINGS_EMPTY);
+   ret = qcom_smem_state_update_bits(wcn->tx_enable_state,
+ WCN36XX_SMSM_WLAN_TX_ENABLE |
+ WCN36XX_SMSM_WLAN_TX_RINGS_EMPTY,
+ WCN36XX_SMSM_WLAN_TX_RINGS_EMPTY);
+   if (ret)
+   goto out_err;
 
return 0;
 
@@ -678,9 +682,9 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn,
 * notify chip about new frame through SMSM bus.
 */
if (is_low &&  vif_priv->pw_state == WCN36XX_BMPS) {
-   wcn->ctrl_ops->smsm_change_state(
- 0,
- WCN36XX_SMSM_WLAN_TX_ENABLE);
+   qcom_smem_state_update_bits(wcn->tx_rings_empty_state,
+   WCN36XX_SMSM_WLAN_TX_ENABLE,
+   WCN36XX_SMSM_WLAN_TX_ENABLE);
} else {
/* indicate End Of Packet and generate interrupt on descriptor
 * done.
diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c
index e1d59da2ad20..3c2522b07c90 100644
--- a/drivers/net/wireless/ath/wcn36xx/main.c
+++ b/drivers/net/wireless/ath/wcn36xx/main.c
@@ -21,6 +21,10 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
 #include "wcn36xx.h"
 
 unsigned int wcn36xx_dbg_mask;
@@ -1058,8 +1062,7 @@ static int wcn36xx_platform_get_resources(struct wcn36xx *wcn,
int ret;
 
/* Set TX IRQ */
-   res = platform_get_resource_byname(pdev, IORESOURCE_IRQ,
-  "wcnss_wlantx_irq");
+   res = platform_get_resource_byname(pdev, IORESOURCE_IRQ, "tx");
if (!res) {
wcn36xx_err("failed to get tx_irq\n");
return -ENOENT;
@@ -1067,14 +1070,29 @@ static int wcn36xx_platform_get_resources(struct wcn36xx *wcn,
wcn->tx_irq = res->start;
 
/* Set RX IRQ */
-   res = platform_get_resource_byname(pdev, IORESOURCE_IRQ,
-  "wcnss_wlanrx_irq");
+   res = platform_get_resource_byname(pdev, IORESOURCE_IRQ, "rx");
if (!res) {
wcn36xx_err("failed to get rx_irq\n");
return -ENOENT;
}
wcn->rx_irq = res->start;
 
+   /* Acquire SMSM tx enable handle */
+   wcn->tx_enable_state = qcom_smem_state_get(&pdev->dev,
+   "tx-enable", &wcn->tx_enable_state_bit);
+   if (IS_ERR(wcn->tx_enable_state)) {
+   wcn36xx_err("failed to get tx-enable state\n");
+   return PTR_ERR(wcn->tx_enable_state);
+   }
+
+   /* Acquire SMSM tx rings empty handle */
+   wcn->tx_rings_empty_state = qcom_smem_state_get(&pdev->dev,
+   "tx-rings-empty", &wcn->tx_rings_empty_state_bit);
+   if (IS_ERR(wcn->tx_rings_empty_state)) {
+   wcn36xx_err("

[PATCH v5 4/5] wcn36xx: Implement print_reg indication

2016-11-14 Thread Bjorn Andersson
Some firmware versions send a "print register indication"; handle this
by printing out the content.

Cc: Nicolas Dechesne 
Signed-off-by: Bjorn Andersson 
---

Changes since v4:
- None

 drivers/net/wireless/ath/wcn36xx/hal.h | 16 
 drivers/net/wireless/ath/wcn36xx/smd.c | 30 ++
 2 files changed, 46 insertions(+)

diff --git a/drivers/net/wireless/ath/wcn36xx/hal.h b/drivers/net/wireless/ath/wcn36xx/hal.h
index 4f87ef1e1eb8..b765c647319d 100644
--- a/drivers/net/wireless/ath/wcn36xx/hal.h
+++ b/drivers/net/wireless/ath/wcn36xx/hal.h
@@ -350,6 +350,8 @@ enum wcn36xx_hal_host_msg_type {
 
WCN36XX_HAL_AVOID_FREQ_RANGE_IND = 233,
 
+   WCN36XX_HAL_PRINT_REG_INFO_IND = 259,
+
WCN36XX_HAL_MSG_MAX = WCN36XX_HAL_MSG_TYPE_MAX_ENUM_SIZE
 };
 
@@ -4703,4 +4705,18 @@ struct stats_class_b_ind {
u32 rx_time_total;
 };
 
+/* WCN36XX_HAL_PRINT_REG_INFO_IND */
+struct wcn36xx_hal_print_reg_info_ind {
+   struct wcn36xx_hal_msg_header header;
+
+   u32 count;
+   u32 scenario;
+   u32 reason;
+
+   struct {
+   u32 addr;
+   u32 value;
+   } regs[];
+} __packed;
+
 #endif /* _HAL_H_ */
diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c
index be5e5ea1e5c3..1c2966f7db7a 100644
--- a/drivers/net/wireless/ath/wcn36xx/smd.c
+++ b/drivers/net/wireless/ath/wcn36xx/smd.c
@@ -2109,6 +2109,30 @@ static int wcn36xx_smd_delete_sta_context_ind(struct wcn36xx *wcn,
return -ENOENT;
 }
 
+static int wcn36xx_smd_print_reg_info_ind(struct wcn36xx *wcn,
+ void *buf,
+ size_t len)
+{
+   struct wcn36xx_hal_print_reg_info_ind *rsp = buf;
+   int i;
+
+   if (len < sizeof(*rsp)) {
+   wcn36xx_warn("Corrupted print reg info indication\n");
+   return -EIO;
+   }
+
+   wcn36xx_dbg(WCN36XX_DBG_HAL,
+   "reginfo indication, scenario: 0x%x reason: 0x%x\n",
+   rsp->scenario, rsp->reason);
+
+   for (i = 0; i < rsp->count; i++) {
+   wcn36xx_dbg(WCN36XX_DBG_HAL, "\t0x%x: 0x%x\n",
+   rsp->regs[i].addr, rsp->regs[i].value);
+   }
+
+   return 0;
+}
+
 int wcn36xx_smd_update_cfg(struct wcn36xx *wcn, u32 cfg_id, u32 value)
 {
struct wcn36xx_hal_update_cfg_req_msg msg_body, *body;
@@ -2237,6 +2261,7 @@ int wcn36xx_smd_rsp_process(struct qcom_smd_channel *channel,
case WCN36XX_HAL_OTA_TX_COMPL_IND:
case WCN36XX_HAL_MISSED_BEACON_IND:
case WCN36XX_HAL_DELETE_STA_CONTEXT_IND:
+   case WCN36XX_HAL_PRINT_REG_INFO_IND:
msg_ind = kmalloc(sizeof(*msg_ind) + len, GFP_ATOMIC);
if (!msg_ind) {
wcn36xx_err("Run out of memory while handling SMD_EVENT (%d)\n",
@@ -2296,6 +2321,11 @@ static void wcn36xx_ind_smd_work(struct work_struct *work)
   hal_ind_msg->msg,
   hal_ind_msg->msg_len);
break;
+   case WCN36XX_HAL_PRINT_REG_INFO_IND:
+   wcn36xx_smd_print_reg_info_ind(wcn,
+  hal_ind_msg->msg,
+  hal_ind_msg->msg_len);
+   break;
default:
wcn36xx_err("SMD_EVENT (%d) not supported\n",
  msg_header->msg_type);
-- 
2.5.0
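
A note on the handler in this patch: the length check covers only the fixed
part of the indication, while the print loop trusts rsp->count. A minimal
userspace sketch of a stricter check follows. The struct is a simplified
stand-in without the HAL message header, and reg_info_len_ok() is a
hypothetical helper, not part of the driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct wcn36xx_hal_print_reg_info_ind,
 * leaving out the HAL message header. */
struct reg_info_sketch {
	uint32_t count;
	uint32_t scenario;
	uint32_t reason;
	struct {
		uint32_t addr;
		uint32_t value;
	} regs[];
};

/*
 * Return 1 if a buffer of 'len' bytes holds the fixed part of the
 * indication plus all 'count' register entries, 0 otherwise.
 */
static int reg_info_len_ok(const struct reg_info_sketch *rsp, size_t len)
{
	size_t avail;

	assert(rsp != NULL);

	if (len < sizeof(*rsp))
		return 0;

	/* Number of whole entries that fit after the fixed part. */
	avail = (len - sizeof(*rsp)) / sizeof(rsp->regs[0]);

	return rsp->count <= avail;
}
```

A check of this kind would reject a message whose count claims more register
entries than the received buffer can hold.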



[PATCH v5 3/5] wcn36xx: Implement firmware assisted scan

2016-11-14 Thread Bjorn Andersson
Using the software-based channel scan mechanism from mac80211 keeps us
offline for 10-15 seconds; we should instead issue a start_scan/end_scan
on each channel, reducing this time.

Signed-off-by: Bjorn Andersson 
---

Changes since v4:
- None

 drivers/net/wireless/ath/wcn36xx/main.c| 64 +-
 drivers/net/wireless/ath/wcn36xx/smd.c |  8 ++--
 drivers/net/wireless/ath/wcn36xx/smd.h |  4 +-
 drivers/net/wireless/ath/wcn36xx/txrx.c| 19 ++---
 drivers/net/wireless/ath/wcn36xx/wcn36xx.h |  9 +
 5 files changed, 81 insertions(+), 23 deletions(-)

diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c
index 3c2522b07c90..96a9584edcbb 100644
--- a/drivers/net/wireless/ath/wcn36xx/main.c
+++ b/drivers/net/wireless/ath/wcn36xx/main.c
@@ -568,23 +568,59 @@ static int wcn36xx_set_key(struct ieee80211_hw *hw, enum set_key_cmd cmd,
return ret;
 }
 
-static void wcn36xx_sw_scan_start(struct ieee80211_hw *hw,
- struct ieee80211_vif *vif,
- const u8 *mac_addr)
+static void wcn36xx_hw_scan_worker(struct work_struct *work)
 {
-   struct wcn36xx *wcn = hw->priv;
+   struct wcn36xx *wcn = container_of(work, struct wcn36xx, scan_work);
+   struct cfg80211_scan_request *req = wcn->scan_req;
+   u8 channels[WCN36XX_HAL_PNO_MAX_NETW_CHANNELS_EX];
+   struct cfg80211_scan_info scan_info = {};
+   int i;
+
+   wcn36xx_dbg(WCN36XX_DBG_MAC, "mac80211 scan %d channels worker\n", req->n_channels);
+
+   for (i = 0; i < req->n_channels; i++)
+   channels[i] = req->channels[i]->hw_value;
+
+   wcn36xx_smd_update_scan_params(wcn, channels, req->n_channels);
 
wcn36xx_smd_init_scan(wcn, HAL_SYS_MODE_SCAN);
-   wcn36xx_smd_start_scan(wcn);
+   for (i = 0; i < req->n_channels; i++) {
+   wcn->scan_freq = req->channels[i]->center_freq;
+   wcn->scan_band = req->channels[i]->band;
+
+   wcn36xx_smd_start_scan(wcn, req->channels[i]->hw_value);
+   msleep(30);
+   wcn36xx_smd_end_scan(wcn, req->channels[i]->hw_value);
+
+   wcn->scan_freq = 0;
+   }
+   wcn36xx_smd_finish_scan(wcn, HAL_SYS_MODE_SCAN);
+
+   scan_info.aborted = false;
+   ieee80211_scan_completed(wcn->hw, &scan_info);
+
+   mutex_lock(&wcn->scan_lock);
+   wcn->scan_req = NULL;
+   mutex_unlock(&wcn->scan_lock);
 }
 
-static void wcn36xx_sw_scan_complete(struct ieee80211_hw *hw,
-struct ieee80211_vif *vif)
+static int wcn36xx_hw_scan(struct ieee80211_hw *hw,
+  struct ieee80211_vif *vif,
+  struct ieee80211_scan_request *hw_req)
 {
struct wcn36xx *wcn = hw->priv;
 
-   wcn36xx_smd_end_scan(wcn);
-   wcn36xx_smd_finish_scan(wcn, HAL_SYS_MODE_SCAN);
+   mutex_lock(&wcn->scan_lock);
+   if (wcn->scan_req) {
+   mutex_unlock(&wcn->scan_lock);
+   return -EBUSY;
+   }
+   wcn->scan_req = &hw_req->req;
+   mutex_unlock(&wcn->scan_lock);
+
+   schedule_work(&wcn->scan_work);
+
+   return 0;
 }
 
 static void wcn36xx_update_allowed_rates(struct ieee80211_sta *sta,
@@ -997,8 +1033,7 @@ static const struct ieee80211_ops wcn36xx_ops = {
.configure_filter   = wcn36xx_configure_filter,
.tx = wcn36xx_tx,
.set_key= wcn36xx_set_key,
-   .sw_scan_start  = wcn36xx_sw_scan_start,
-   .sw_scan_complete   = wcn36xx_sw_scan_complete,
+   .hw_scan= wcn36xx_hw_scan,
.bss_info_changed   = wcn36xx_bss_info_changed,
.set_rts_threshold  = wcn36xx_set_rts_threshold,
.sta_add= wcn36xx_sta_add,
@@ -1023,6 +1058,7 @@ static int wcn36xx_init_ieee80211(struct wcn36xx *wcn)
ieee80211_hw_set(wcn->hw, SUPPORTS_PS);
ieee80211_hw_set(wcn->hw, SIGNAL_DBM);
ieee80211_hw_set(wcn->hw, HAS_RATE_CONTROL);
+   ieee80211_hw_set(wcn->hw, SINGLE_SCAN_ON_ALL_BANDS);
 
wcn->hw->wiphy->interface_modes = BIT(NL80211_IFTYPE_STATION) |
BIT(NL80211_IFTYPE_AP) |
@@ -1032,6 +1068,9 @@ static int wcn36xx_init_ieee80211(struct wcn36xx *wcn)
wcn->hw->wiphy->bands[NL80211_BAND_2GHZ] = &wcn_band_2ghz;
wcn->hw->wiphy->bands[NL80211_BAND_5GHZ] = &wcn_band_5ghz;
 
+   wcn->hw->wiphy->max_scan_ssids = WCN36XX_MAX_SCAN_SSIDS;
+   wcn->hw->wiphy->max_scan_ie_len = WCN36XX_MAX_SCAN_IE_LEN;
+
wcn->hw->wiphy->cipher_suites = cipher_suites;
wcn->hw->wiphy->n_cipher_suites = ARRAY_SIZE(cipher_suites);
 
@@ -1152,6 +1191,9 @@ static int wcn36xx_probe(struct platform_device *pdev)
wcn->hw = hw;
wcn->dev = &pdev->dev;
mutex_init(&wcn->hal_mutex);
+   mutex_init(&wcn->scan_lock);
+
+   INIT_WORK(&wcn->scan_work, wcn

[PATCH v5 1/5] soc: qcom: smem_state: Fix include for ERR_PTR()

2016-11-14 Thread Bjorn Andersson
The correct include file for getting errno constants and ERR_PTR() is
linux/err.h, rather than linux/errno.h, so fix the include.

Fixes: e8b123e60084 ("soc: qcom: smem_state: Add stubs for disabled smem_state")
Acked-by: Andy Gross 
Signed-off-by: Bjorn Andersson 
---

Kalle, please merge this patch through your tree.

Changes since v4:
- New patch

 include/linux/soc/qcom/smem_state.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/soc/qcom/smem_state.h 
b/include/linux/soc/qcom/smem_state.h
index 7b88697929e9..b8478ee7a71f 100644
--- a/include/linux/soc/qcom/smem_state.h
+++ b/include/linux/soc/qcom/smem_state.h
@@ -1,7 +1,7 @@
 #ifndef __QCOM_SMEM_STATE__
 #define __QCOM_SMEM_STATE__
 
-#include 
+#include 
 
 struct device_node;
 struct qcom_smem_state;
-- 
2.5.0



[PATCH v5 5/5] wcn36xx: Don't use the destroyed hal_mutex

2016-11-14 Thread Bjorn Andersson
ieee80211_unregister_hw() might invoke operations to stop the interface,
and those operations use the hal_mutex. So don't destroy it until after
we're done using it.

Signed-off-by: Bjorn Andersson 
---

With this patch I can successfully (although with a SMD send timeout in the
shutdown path) start and stop the WCNSS PIL/remoteproc multiple times and the
wlan0 interface will come and go accordingly.

Will submit the necessary DT patches soon as well.

Changes since v4:
- New patch

 drivers/net/wireless/ath/wcn36xx/main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/wcn36xx/main.c 
b/drivers/net/wireless/ath/wcn36xx/main.c
index 96a9584edcbb..0002190c9041 100644
--- a/drivers/net/wireless/ath/wcn36xx/main.c
+++ b/drivers/net/wireless/ath/wcn36xx/main.c
@@ -1241,7 +1241,6 @@ static int wcn36xx_remove(struct platform_device *pdev)
wcn36xx_dbg(WCN36XX_DBG_MAC, "platform remove\n");
 
release_firmware(wcn->nv);
-   mutex_destroy(&wcn->hal_mutex);
 
ieee80211_unregister_hw(hw);
 
@@ -1250,6 +1249,8 @@ static int wcn36xx_remove(struct platform_device *pdev)
 
iounmap(wcn->dxe_base);
iounmap(wcn->ccu_base);
+
+   mutex_destroy(&wcn->hal_mutex);
ieee80211_free_hw(hw);
 
return 0;
-- 
2.5.0
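
For illustration only, the ordering rule in this patch can be sketched in userspace C with pthreads. This is not the wcn36xx code: fake_dev, fake_stop(), fake_unregister(), fake_probe() and fake_remove() are hypothetical names, and the sketch assumes that unregistering may call back into a stop routine that takes the lock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver's private data. */
struct fake_dev {
	pthread_mutex_t hal_mutex;
	bool registered;
	int stop_calls;
};

/* Stop routine that needs the lock, like an interface-stop operation. */
static void fake_stop(struct fake_dev *dev)
{
	pthread_mutex_lock(&dev->hal_mutex);
	dev->stop_calls++;
	pthread_mutex_unlock(&dev->hal_mutex);
}

/* Unregistering may still invoke the stop routine. */
static void fake_unregister(struct fake_dev *dev)
{
	if (dev->registered) {
		fake_stop(dev);
		dev->registered = false;
	}
}

int fake_probe(struct fake_dev *dev)
{
	dev->registered = true;
	dev->stop_calls = 0;
	return pthread_mutex_init(&dev->hal_mutex, NULL);
}

/* Teardown: finish every user of the lock first, destroy it last. */
int fake_remove(struct fake_dev *dev)
{
	fake_unregister(dev);
	return pthread_mutex_destroy(&dev->hal_mutex);
}
```

Swapping the two calls in fake_remove() would make fake_stop() lock a destroyed mutex, which is undefined behaviour for pthreads and is the same class of problem the patch avoids.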



[PATCH v5 2/4] x86: add support for earlyprintk via USB3 debug port

2016-11-14 Thread Lu Baolu
Add support for early printk by writing debug messages to the
USB3 debug port. Users can enable this type of early printk by
specifying the kernel parameter "earlyprintk=xdbc". This gives
users a way to get debug output early in boot.

The hardware for the USB3 debug port requires DMA memory blocks.
This means that setting up the debugging hardware and registering
the boot console must be delayed until the memblocks are filled.

Cc: Ingo Molnar 
Cc: x...@kernel.org
Signed-off-by: Lu Baolu 
---
 Documentation/kernel-parameters.txt | 1 +
 arch/x86/kernel/early_printk.c  | 5 +
 arch/x86/kernel/setup.c | 7 +++
 3 files changed, 13 insertions(+)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 37babf9..99b64b3 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1178,6 +1178,7 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
earlyprintk=ttySn[,baudrate]
earlyprintk=dbgp[debugController#]
earlyprintk=pciserial,bus:device.function[,baudrate]
+   earlyprintk=xdbc[xhciController#]
 
earlyprintk is useful when the kernel crashes before
the normal console is initialized. It is not enabled by
diff --git a/arch/x86/kernel/early_printk.c b/arch/x86/kernel/early_printk.c
index 8a12199..c4031b9 100644
--- a/arch/x86/kernel/early_printk.c
+++ b/arch/x86/kernel/early_printk.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -381,6 +382,10 @@ static int __init setup_early_printk(char *buf)
if (!strncmp(buf, "efi", 3))
early_console_register(&early_efi_console, keep);
 #endif
+#ifdef CONFIG_EARLY_PRINTK_XDBC
+   if (!strncmp(buf, "xdbc", 4))
+   early_xdbc_parse_parameter(buf + 4);
+#endif
 
buf++;
}
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 9c337b0..09d4a56 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -70,6 +70,8 @@
 #include 
 #include 
 
+#include 
+
 #include 
 
 #include 
@@ -1096,6 +1098,11 @@ void __init setup_arch(char **cmdline_p)
memblock_set_current_limit(ISA_END_ADDRESS);
memblock_x86_fill();
 
+#ifdef CONFIG_EARLY_PRINTK_XDBC
+   if (!early_xdbc_setup_hardware())
+   early_xdbc_register_console();
+#endif
+
reserve_bios_regions();
 
if (efi_enabled(EFI_MEMMAP)) {
-- 
2.1.4
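
As a rough illustration of the "earlyprintk=xdbc[xhciController#]" syntax, here is a minimal userspace sketch of how such an option string might be matched and the optional controller index extracted. It is not the early_xdbc_parse_parameter() implementation, which is not shown in this patch; parse_xdbc_option() is a hypothetical helper, and falling back to index 0 on a missing or malformed number is an assumption.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 and stores the controller index if
 * opt starts with "xdbc", returns 0 otherwise.
 * Assumption: a missing or malformed index means controller 0.
 */
int parse_xdbc_option(const char *opt, unsigned long *index)
{
	char *end;
	unsigned long n;

	if (strncmp(opt, "xdbc", 4) != 0)
		return 0;

	opt += 4;		/* skip the "xdbc" prefix */
	*index = 0;

	if (!isdigit((unsigned char)*opt))
		return 1;	/* no index given */

	errno = 0;
	n = strtoul(opt, &end, 10);
	if (errno == 0 && end != opt)
		*index = n;

	return 1;
}
```

With this sketch, "xdbc" and "xdbc0" both select controller 0, and "xdbc2,keep" selects controller 2 and leaves the ",keep" suffix for the caller.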



Re: [PATCH v18 2/2] drm/bridge: Add I2C based driver for ps8640 bridge

2016-11-14 Thread Archit Taneja

Hi,

On 11/14/2016 07:11 PM, Jitao Shi wrote:

This patch adds drm_bridge driver for parade DSI to eDP bridge chip.


Thanks for incorporating the fixes. I have commented on one issue
below.


The only thing that seems to be left now is the firmware update bits, right?

Can we get the firmware pushed on the linux-firmware git repo [1]?

Or

Remove the firmware update parts for now (including the SPI stuff,
since that seems to be only used for writing fw)?

[1] http://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmware.git/

Thanks,
Archit



Signed-off-by: Jitao Shi 
Reviewed-by: Daniel Kurtz 
Reviewed-by: Enric Balletbo i Serra 
---
Changes since v17:
 - remove some unused head files.
 - add macros for ps8640 pages.
 - remove ddc_i2c client
 - add mipi_dsi_device_register_full
 - remove the manufacturer from the name and i2c_device_id

Changes since v16:
 - Disable ps8640 DSI MCS Function.
 - Rename gpios name more clearly.
 - Tune the ps8640 power on sequence.

Changes since v15:
 - Drop drm_connector_(un)register calls from parade ps8640.
   The main DRM driver mtk_drm_drv now calls
   drm_connector_register_all() after drm_dev_register() in the
   mtk_drm_bind() function. That function should iterate over all
   connectors and call drm_connector_register() for each of them.
   So, remove drm_connector_(un)register calls from parade ps8640.

Changes since v14:
 - update copyright info.
 - change bridge_to_ps8640 and connector_to_ps8640 to inline function.
 - fix some coding style.
 - use sizeof as array counter.
 - use drm_get_edid when read edid.
 - add mutex when firmware updating.

Changes since v13:
 - add const on data, ps8640_write_bytes(struct i2c_client *client, const u8 
*data, u16 data_len)
 - fix PAGE2_SW_REST typo.
 - move the buf[3] init to entrance of the function.

Changes since v12:
 - fix hw_chip_id build warning

Changes since v11:
 - Remove depends on I2C, add DRM depends
 - Reuse ps8640_write_bytes() in ps8640_write_byte()
 - Use timer check for polling like the routines in 
 - Fix no drm_connector_unregister/drm_connector_cleanup when 
ps8640_bridge_attach fail
 - Check the ps8640 hardware id in ps8640_validate_firmware
 - Remove fw_version check
 - Move ps8640_validate_firmware before ps8640_enter_bl
 - Add ddc_i2c unregister when probe fail and ps8640_remove
---
 drivers/gpu/drm/bridge/Kconfig |   12 +
 drivers/gpu/drm/bridge/Makefile|1 +
 drivers/gpu/drm/bridge/parade-ps8640.c | 1079 
 3 files changed, 1092 insertions(+)
 create mode 100644 drivers/gpu/drm/bridge/parade-ps8640.c

diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
index 10e12e7..7f41bbc 100644
--- a/drivers/gpu/drm/bridge/Kconfig
+++ b/drivers/gpu/drm/bridge/Kconfig
@@ -57,6 +57,18 @@ config DRM_PARADE_PS8622
---help---
  Parade eDP-LVDS bridge chip driver.

+config DRM_PARADE_PS8640
+   tristate "Parade PS8640 MIPI DSI to eDP Converter"
+   depends on DRM
+   depends on OF
+   select DRM_KMS_HELPER
+   select DRM_MIPI_DSI
+   select DRM_PANEL
+   ---help---
+ Choose this option if you have PS8640 for display
+ The PS8640 is a high-performance and low-power
+ MIPI DSI to eDP converter
+
 config DRM_SII902X
tristate "Silicon Image sii902x RGB/HDMI bridge"
depends on OF
diff --git a/drivers/gpu/drm/bridge/Makefile b/drivers/gpu/drm/bridge/Makefile
index cdf3a3c..7d93d40 100644
--- a/drivers/gpu/drm/bridge/Makefile
+++ b/drivers/gpu/drm/bridge/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_DRM_DW_HDMI) += dw-hdmi.o
 obj-$(CONFIG_DRM_DW_HDMI_AHB_AUDIO) += dw-hdmi-ahb-audio.o
 obj-$(CONFIG_DRM_NXP_PTN3460) += nxp-ptn3460.o
 obj-$(CONFIG_DRM_PARADE_PS8622) += parade-ps8622.o
+obj-$(CONFIG_DRM_PARADE_PS8640) += parade-ps8640.o
 obj-$(CONFIG_DRM_SII902X) += sii902x.o
 obj-$(CONFIG_DRM_TOSHIBA_TC358767) += tc358767.o
 obj-$(CONFIG_DRM_ANALOGIX_DP) += analogix/
diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c 
b/drivers/gpu/drm/bridge/parade-ps8640.c
new file mode 100644
index 000..2d9c337
--- /dev/null
+++ b/drivers/gpu/drm/bridge/parade-ps8640.c
@@ -0,0 +1,1079 @@
+/*
+ * Copyright (c) 2016 MediaTek Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PAGE1_VSTART   0x6b
+#define PAGE2_SPI_CFG3 0x82
+#define I2C_TO_SPI_RESET   0x20
+#define PAGE2_ROMADD_BYTE

Re: [Patch v6.1] x86/kvm: Add AVX512_4VNNIW and AVX512_4FMAPS support

2016-11-14 Thread He Chen
On Tue, Nov 15, 2016 at 04:24:39AM +0800, kbuild test robot wrote:
> Hi He,
> 
> [auto build test ERROR on kvm/linux-next]
> [also build test ERROR on v4.9-rc5]
> [cannot apply to next-20161114]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
> 
> url:
> https://github.com/0day-ci/linux/commits/He-Chen/x86-kvm-Add-AVX512_4VNNIW-and-AVX512_4FMAPS-support/20161114-170941
> base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
> config: x86_64-kexec (attached as .config)
> compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=x86_64 
> 
> All errors (new ones prefixed by >>):
> 
>arch/x86/kvm/cpuid.c: In function '__do_cpuid_ent':
> >> arch/x86/kvm/cpuid.c:472:18: error: implicit declaration of function 
> >> 'get_scattered_cpuid_leaf' [-Werror=implicit-function-declaration]
>entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
>  ^~~~
> >> arch/x86/kvm/cpuid.c:472:49: error: 'CPUID_EDX' undeclared (first use in 
> >> this function)
>entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
> ^
>arch/x86/kvm/cpuid.c:472:49: note: each undeclared identifier is reported 
> only once for each function it appears in
>cc1: some warnings being treated as errors
>
I downloaded the attached .config.gz and used the .config in it to
build the kernel on my local branch again, and I don't see any warning
or error message.

I wonder whether the previous 0001 and 0002 patches had been applied
when this test was run? Or is there something wrong with my compiler or
patches?

Thanks,
-He


[PATCH v5 0/4] usb: early: add support for early printk through USB3 debug port

2016-11-14 Thread Lu Baolu
xHCI debug capability (DbC) is an optional but standalone
functionality provided by an xHCI host controller. With DbC
hardware initialized, the system will present a debug device
through the USB3 debug port (normally the first USB3 port).
The debug device is fully compliant with the USB framework
and provides the equivalent of a very high performance (USB3)
full-duplex serial link between the debug host and target.
The DbC functionality is independent of xHCI host. There
isn't any precondition from xHCI host side for DbC to work.

This patch set adds support for early printk functionality
through a USB3 debug port by 1) initializing and enabling
the DbC hardware during early boot; 2) registering a boot
console to the system so that early printk messages can go
through the USB3 debug port. It also includes some lines
of changes in usb_debug driver so that it can be bound when
a USB3 debug device is enumerated.

This code is designed to be used only for kernel debugging
when machine crashes very early before the console code is
initialized. It makes the life of kernel debugging easier
when people work with a modern machine without any legacy
serial ports.

---
Change log:
v4->v5:
  - add raw_spin_lock to make xdbc_bulk_write() reentrant. 

v3->v4:
  - Rename the document with .rst suffix.
  - Add the list of hardware that has been successfully
tested on in the document.

v2->v3:
  - Removed spinlock usage.
  - Removed work queue usage.
  - Refined the user guide document.

v1->v2:
  - Refactor the duplicate code in xdbc_early_start() and
xdbc_handle_external_reset().
  - Free resources when hardware not used any more.
  - Refine the user guide document.

Lu Baolu (4):
  usb: dbc: early driver for xhci debug capability
  x86: add support for earlyprintk via USB3 debug port
  usb: serial: usb_debug: add support for dbc debug device
  usb: doc: add document for USB3 debug port usage

 Documentation/kernel-parameters.txt   |1 +
 Documentation/usb/usb3-debug-port.rst |   95 +++
 arch/x86/Kconfig.debug|   14 +
 arch/x86/kernel/early_printk.c|5 +
 arch/x86/kernel/setup.c   |7 +
 drivers/usb/Kconfig   |3 +
 drivers/usb/Makefile  |2 +-
 drivers/usb/early/Makefile|1 +
 drivers/usb/early/xhci-dbc.c  | 1068 +
 drivers/usb/early/xhci-dbc.h  |  205 +++
 drivers/usb/serial/usb_debug.c|   28 +-
 include/linux/usb/xhci-dbgp.h |   22 +
 12 files changed, 1447 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/usb/usb3-debug-port.rst
 create mode 100644 drivers/usb/early/xhci-dbc.c
 create mode 100644 drivers/usb/early/xhci-dbc.h
 create mode 100644 include/linux/usb/xhci-dbgp.h

-- 
2.1.4



[PATCH v5 1/4] usb: dbc: early driver for xhci debug capability

2016-11-14 Thread Lu Baolu
xHCI debug capability (DbC) is an optional but standalone
functionality provided by an xHCI host controller. Software
learns this capability by walking through the extended
capability list of the host. xHCI specification describes
DbC in section 7.6.

This patch introduces the code to probe and initialize the
debug capability hardware during early boot. With hardware
initialized, the debug target (system on which this code is
running) will present a debug device through the debug port
(normally the first USB3 port). The debug device is fully
compliant with the USB framework and provides the equivalent
of a very high performance (USB3) full-duplex serial link
between the debug host and target. The DbC functionality is
independent of xHCI host. There isn't any precondition from
xHCI host side for DbC to work.

This patch also includes bulk out and bulk in interfaces.
These interfaces could be used to implement early printk
bootconsole or hook to various system debuggers.

This code is designed to be only used for kernel debugging
when machine crashes very early before the console code is
initialized. For normal operation it is not recommended.

Cc: Mathias Nyman 
Signed-off-by: Lu Baolu 
---
 arch/x86/Kconfig.debug|   14 +
 drivers/usb/Kconfig   |3 +
 drivers/usb/Makefile  |2 +-
 drivers/usb/early/Makefile|1 +
 drivers/usb/early/xhci-dbc.c  | 1068 +
 drivers/usb/early/xhci-dbc.h  |  205 
 include/linux/usb/xhci-dbgp.h |   22 +
 7 files changed, 1314 insertions(+), 1 deletion(-)
 create mode 100644 drivers/usb/early/xhci-dbc.c
 create mode 100644 drivers/usb/early/xhci-dbc.h
 create mode 100644 include/linux/usb/xhci-dbgp.h

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 67eec55..13e85b7 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -29,6 +29,7 @@ config EARLY_PRINTK
 config EARLY_PRINTK_DBGP
bool "Early printk via EHCI debug port"
depends on EARLY_PRINTK && PCI
+   select USB_EARLY_PRINTK
---help---
  Write kernel log output directly into the EHCI debug port.
 
@@ -48,6 +49,19 @@ config EARLY_PRINTK_EFI
  This is useful for kernel debugging when your machine crashes very
  early before the console code is initialized.
 
+config EARLY_PRINTK_XDBC
+   bool "Early printk via xHCI debug port"
+   depends on EARLY_PRINTK && PCI
+   select USB_EARLY_PRINTK
+   ---help---
+ Write kernel log output directly into the xHCI debug port.
+
+ This is useful for kernel debugging when your machine crashes very
+ early before the console code is initialized. For normal operation
+ it is not recommended because it looks ugly and doesn't cooperate
+ with klogd/syslogd or the X server. You should normally say N here,
+ unless you want to debug such a crash.
+
 config X86_PTDUMP_CORE
def_bool n
 
diff --git a/drivers/usb/Kconfig b/drivers/usb/Kconfig
index fbe493d..9313fff 100644
--- a/drivers/usb/Kconfig
+++ b/drivers/usb/Kconfig
@@ -19,6 +19,9 @@ config USB_EHCI_BIG_ENDIAN_MMIO
 config USB_EHCI_BIG_ENDIAN_DESC
bool
 
+config USB_EARLY_PRINTK
+   bool
+
 menuconfig USB_SUPPORT
bool "USB support"
depends on HAS_IOMEM
diff --git a/drivers/usb/Makefile b/drivers/usb/Makefile
index 7791af6..0c37838 100644
--- a/drivers/usb/Makefile
+++ b/drivers/usb/Makefile
@@ -49,7 +49,7 @@ obj-$(CONFIG_USB_MICROTEK)+= image/
 obj-$(CONFIG_USB_SERIAL)   += serial/
 
 obj-$(CONFIG_USB)  += misc/
-obj-$(CONFIG_EARLY_PRINTK_DBGP)+= early/
+obj-$(CONFIG_USB_EARLY_PRINTK) += early/
 
 obj-$(CONFIG_USB_ATM)  += atm/
 obj-$(CONFIG_USB_SPEEDTOUCH)   += atm/
diff --git a/drivers/usb/early/Makefile b/drivers/usb/early/Makefile
index 24bbe51..2db5906 100644
--- a/drivers/usb/early/Makefile
+++ b/drivers/usb/early/Makefile
@@ -3,3 +3,4 @@
 #
 
 obj-$(CONFIG_EARLY_PRINTK_DBGP) += ehci-dbgp.o
+obj-$(CONFIG_EARLY_PRINTK_XDBC) += xhci-dbc.o
diff --git a/drivers/usb/early/xhci-dbc.c b/drivers/usb/early/xhci-dbc.c
new file mode 100644
index 000..5ac4223
--- /dev/null
+++ b/drivers/usb/early/xhci-dbc.c
@@ -0,0 +1,1068 @@
+/**
+ * xhci-dbc.c - xHCI debug capability early driver
+ *
+ * Copyright (C) 2016 Intel Corporation
+ *
+ * Author: Lu Baolu 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt)KBUILD_MODNAME ":%s: " fmt, __func__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../host/xhci.h"
+#include "xhci-dbc.h"
+
+static struct xdbc_state xdbc;
+static int early_console_keep;
+
+#ifdef XDBC_TRACE
+#definexdbc_trace  trace_printk
+#else
+static inline void xdbc_trace(

[PATCH v5 4/4] usb: doc: add document for USB3 debug port usage

2016-11-14 Thread Lu Baolu
Add Documentation/usb/usb3-debug-port.rst. This document includes
the user guide for USB3 debug port.

Cc: linux-...@vger.kernel.org
Signed-off-by: Lu Baolu 
---
 Documentation/usb/usb3-debug-port.rst | 95 +++
 1 file changed, 95 insertions(+)
 create mode 100644 Documentation/usb/usb3-debug-port.rst

diff --git a/Documentation/usb/usb3-debug-port.rst 
b/Documentation/usb/usb3-debug-port.rst
new file mode 100644
index 000..70eabe4
--- /dev/null
+++ b/Documentation/usb/usb3-debug-port.rst
@@ -0,0 +1,95 @@
+===
+USB3 debug port
+===
+
+:Author: Lu Baolu 
+:Date: October 2016
+
+GENERAL
+===
+
+This is a HOWTO for using USB3 debug port on x86 systems.
+
+Before using any kernel debugging functionalities based on USB3
+debug port, you need to check 1) whether debug port is supported
+by the xHCI host, 2) which port is used for debugging purpose
+(normally the first USB3 root port). You must have a USB 3.0
+super-speed A-to-A debugging cable to connect the debug target
+with a debug host. In this document, a debug target stands for
+the system under debugging; while, a debug host stands for a
+stand-alone system that is able to talk to the debugging target
+through the USB3 debug port.
+
+EARLY PRINTK
+
+
+On debug target system, you need to customize a debugging kernel
+with CONFIG_EARLY_PRINTK_XDBC enabled. And add below kernel boot
+parameter::
+
+   "earlyprintk=xdbc"
+
+If there are multiple xHCI controllers in the system, you can
+append a host controller index to this kernel parameter. This
+index is started from 0.
+
+If you are going to leverage the keep option defined by the
+early printk framework to keep the boot console alive after
+early boot, you'd better add below kernel boot parameter::
+
+   "usbcore.autosuspend=-1"
+
+On debug host side, you don't need to customize the kernel, but
+you need to disable usb subsystem runtime power management by
+adding below kernel boot parameter::
+
+   "usbcore.autosuspend=-1"
+
+Before starting the debug target, you should connect the debug
+port on debug target with a root port or port of any external hub
+on the debug host. The cable used to connect these two ports
+should be a USB 3.0 super-speed A-to-A debugging cable.
+
+During early boot of debug target, DbC (the debug engine for USB3
+debug port) hardware gets initialized. Debug host should be able
+to enumerate the debug target as a debug device. Debug host will
+then bind the debug device with the usb_debug driver module and
+create the /dev/ttyUSB0 device.
+
+If device enumeration goes smoothly, you should be able to see
+below kernel messages on debug host::
+
+   # tail -f /var/log/kern.log
+   [ 1815.983374] usb 4-3: new SuperSpeed USB device number 4 using 
xhci_hcd
+   [ 1815.999595] usb 4-3: LPM exit latency is zeroed, disabling LPM.
+   [ 1815.999899] usb 4-3: New USB device found, idVendor=1d6b, 
idProduct=0004
+   [ 1815.02] usb 4-3: New USB device strings: Mfr=1, Product=2, 
SerialNumber=3
+   [ 1815.03] usb 4-3: Product: Remote GDB
+   [ 1815.04] usb 4-3: Manufacturer: Linux
+   [ 1815.05] usb 4-3: SerialNumber: 0001
+   [ 1816.000240] usb_debug 4-3:1.0: xhci_dbc converter detected
+   [ 1816.000360] usb 4-3: xhci_dbc converter now attached to ttyUSB0
+
+You can run below bash scripts on debug host to read the kernel
+log sent from debug target.
+
+.. code-block:: sh
+
+   = start of bash scripts =
+   #!/bin/bash
+
+   while true ; do
+   while [ ! -d /sys/class/tty/ttyUSB0 ] ; do
+   :
+   done
+   cat /dev/ttyUSB0 >> xdbc.log
+   done
+   = end of bash scripts ===
+
+You should be able to see the early boot message in xdbc.log.
+
+If it doesn't work, please ask it on the 
+mailing list. Below USB hosts have been verified to work::
+
+   Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
+   Intel Corporation Wildcat Point-LP USB xHCI Controller
-- 
2.1.4



[PATCH v5 3/4] usb: serial: usb_debug: add support for dbc debug device

2016-11-14 Thread Lu Baolu
This patch adds DbC debug device support to the usb_debug driver.

Signed-off-by: Lu Baolu 
Acked-by: Johan Hovold 
---
 drivers/usb/serial/usb_debug.c | 28 +---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/serial/usb_debug.c b/drivers/usb/serial/usb_debug.c
index ca2fa5b..92f7e5c 100644
--- a/drivers/usb/serial/usb_debug.c
+++ b/drivers/usb/serial/usb_debug.c
@@ -32,7 +32,18 @@ static const struct usb_device_id id_table[] = {
{ USB_DEVICE(0x0525, 0x127a) },
{ },
 };
-MODULE_DEVICE_TABLE(usb, id_table);
+
+static const struct usb_device_id dbc_id_table[] = {
+   { USB_DEVICE(0x1d6b, 0x0004) },
+   { },
+};
+
+static const struct usb_device_id id_table_combined[] = {
+   { USB_DEVICE(0x0525, 0x127a) },
+   { USB_DEVICE(0x1d6b, 0x0004) },
+   { },
+};
+MODULE_DEVICE_TABLE(usb, id_table_combined);
 
 /* This HW really does not support a serial break, so one will be
  * emulated when ever the break state is set to true.
@@ -71,9 +82,20 @@ static struct usb_serial_driver debug_device = {
.process_read_urb = usb_debug_process_read_urb,
 };
 
+static struct usb_serial_driver dbc_device = {
+   .driver = {
+   .owner =THIS_MODULE,
+   .name = "xhci_dbc",
+   },
+   .id_table = dbc_id_table,
+   .num_ports =1,
+   .break_ctl =usb_debug_break_ctl,
+   .process_read_urb = usb_debug_process_read_urb,
+};
+
 static struct usb_serial_driver * const serial_drivers[] = {
-   &debug_device, NULL
+   &debug_device, &dbc_device, NULL
 };
 
-module_usb_serial_driver(serial_drivers, id_table);
+module_usb_serial_driver(serial_drivers, id_table_combined);
 MODULE_LICENSE("GPL");
-- 
2.1.4
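
The split into per-driver ID tables can be pictured with a small userspace sketch. It only models the matching idea and is not the usb-serial core: struct usb_id, id_matches() and pick_driver() are hypothetical, and the name "usb_debug" for the first driver is a placeholder, since the diff context above does not show it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device ID entry. */
struct usb_id {
	uint16_t vendor;
	uint16_t product;
};

/* Tables end with an all-zero sentinel, as in the patch. */
static const struct usb_id debug_ids[] = {
	{ 0x0525, 0x127a },
	{ 0, 0 },
};

static const struct usb_id dbc_ids[] = {
	{ 0x1d6b, 0x0004 },
	{ 0, 0 },
};

/* Walk a table until the sentinel; return 1 on a match. */
static int id_matches(const struct usb_id *table,
		      uint16_t vendor, uint16_t product)
{
	for (; table->vendor || table->product; table++) {
		if (table->vendor == vendor && table->product == product)
			return 1;
	}
	return 0;
}

/* Return the driver whose own table matches, or NULL if none does. */
const char *pick_driver(uint16_t vendor, uint16_t product)
{
	if (id_matches(debug_ids, vendor, product))
		return "usb_debug";	/* placeholder name */
	if (id_matches(dbc_ids, vendor, product))
		return "xhci_dbc";
	return NULL;
}
```

In the patch, the combined table serves module autoloading and registration, while each driver's own id_table decides which driver a given device ends up bound to.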



Re: perf: fuzzer KASAN slab-out-of-bounds in snb_uncore_imc_event_del

2016-11-14 Thread Vince Weaver
On Mon, 14 Nov 2016, Vince Weaver wrote:

> Anyway as per the suggestion at Linux Plumbers I enabled KASAN and on my 
> haswell machine it falls over in a few minutes of running the perf_fuzzer.
> 
> [  205.740194] 
> ==
> [  205.748005] BUG: KASAN: slab-out-of-bounds in 
> snb_uncore_imc_event_del+0x6c/0xa0 at addr 8800caa43768
> [  205.758324] Read of size 8 by task perf_fuzzer/6618
> [  205.763589] CPU: 0 PID: 6618 Comm: perf_fuzzer Not tainted 4.9.0-rc5 #4
> [  205.770721] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
> 01/26/2014
> [  205.778689]  8800c3c479b8 816bb796 88011ec00600 
> 8800caa43580
> [  205.786759]  8800c3c479e0 812fb961 8800c3c47a78 
> 8800caa43580
> [  205.794850]  8800caa43580 8800c3c47a68 812fbbd8 
> 8800c3c47a28
> [  205.802911] Call Trace:
> [  205.805559]  [] dump_stack+0x63/0x8d
> [  205.811135]  [] kasan_object_err+0x21/0x70
> [  205.817267]  [] kasan_report_error+0x1d8/0x4c0
> [  205.823752]  [] ? __lock_is_held+0x75/0xc0
> [  205.829868]  [] ? snb_uncore_imc_read_counter+0x42/0x50
> [  205.837198]  [] ? uncore_perf_event_update+0xe2/0x160
> [  205.844337]  [] kasan_report+0x39/0x40
> [  205.850085]  [] ? snb_uncore_imc_event_del+0x6c/0xa0

The best I can tell this maps to:

static void snb_uncore_imc_event_del(struct perf_event *event, int flags)
{
struct intel_uncore_box *box = uncore_event_to_box(event);
int i;

snb_uncore_imc_event_stop(event, PERF_EF_UPDATE);

for (i = 0; i < box->n_events; i++) {
>>> if (event == box->event_list[i]) {
--box->n_events;
break;
}
}
}

Can this code be right?  Does it actually remove the event?
The similar code in 

static void uncore_pmu_event_del(struct perf_event *event, int flags)



for (i = 0; i < box->n_events; i++) {
if (event == box->event_list[i]) {
uncore_put_event_constraint(box, event);

for (++i; i < box->n_events; i++)
box->event_list[i - 1] = box->event_list[i];

--box->n_events;
break;
}
}


seems like it is more likely to be correct.
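
To make the difference concrete, here is a minimal userspace sketch of the shifting removal used in the second loop. It is not the uncore code: struct box, MAX_EVENTS and box_del_event() are hypothetical stand-ins, and clearing the freed tail slot is an extra step that the kernel loop quoted above does not do.

```c
#include <stddef.h>

#define MAX_EVENTS 8	/* hypothetical capacity */

/* Hypothetical stand-in for the box's event list. */
struct box {
	int n_events;
	const void *event_list[MAX_EVENTS];
};

/*
 * Remove event from the list, shifting later entries down by one so
 * the first n_events slots stay densely packed. Returns 1 if the
 * event was found and removed, 0 otherwise.
 */
int box_del_event(struct box *box, const void *event)
{
	int i;

	for (i = 0; i < box->n_events; i++) {
		if (box->event_list[i] == event) {
			for (++i; i < box->n_events; i++)
				box->event_list[i - 1] = box->event_list[i];

			--box->n_events;
			/* Extra step in this sketch: clear the freed slot. */
			box->event_list[box->n_events] = NULL;
			return 1;
		}
	}
	return 0;
}
```

With only the --box->n_events step and no shift, deleting anything other than the last entry would leave the deleted event in the list and drop the last one instead.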

Vince


Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver

2016-11-14 Thread Namhyung Kim
On Tue, Nov 15, 2016 at 07:06:28AM +0200, Michael S. Tsirkin wrote:
> On Tue, Nov 15, 2016 at 01:50:21PM +0900, Namhyung Kim wrote:
> > On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
> > [SNIP]
> > > > +struct virtio_pstore_fileinfo {
> > > > +   __virtio64  id;
> > > > +   __virtio32  count;
> > > > +   __virtio16  type;
> > > > +   __virtio16  unused;
> > > > +   __virtio32  flags;
> > > > +   __virtio32  len;
> > > > +   __virtio64  time_sec;
> > > > +   __virtio32  time_nsec;
> > > > +   __virtio32  reserved;
> > > > +};
> > > > +
> > > > +struct virtio_pstore_config {
> > > > +   __virtio32  bufsize;
> > > > +};
> > > > +
> > > 
> > > What exactly does each field mean? I'm especially
> > > interested in time fields - maintaining a consistent
> > > time between host and guest is not a simple problem.
> > 
> > These are required by pstore and will be used to create corresponding
> > files in the pstore filesystem.  The time fields are for mtime and
> > ctime and, I think, it's just a hint for user and doesn't require
> > strict consistency.
> 
> Pls add documentation. I would just drop hints for now.

Well, I'll add documentation.  But I think just dropping them might not
be good, since they all carry the host time and it's helpful to know
their relative difference in the guest.

Thanks,
Namhyung


Re: powerpc64: Enable CONFIG_E500 and CONFIG_PPC_E500MC for e5500/e6500

2016-11-14 Thread Scott Wood
On Fri, 2016-10-07 at 11:00 +0200, David Engraf wrote:
> Am 27.09.2016 um 01:08 schrieb Scott Wood:
> > 
> > On Mon, 2016-09-26 at 10:48 +0200, David Engraf wrote:
> > > 
> > > Am 25.09.2016 um 08:20 schrieb Scott Wood:
> > > > 
> > > > 
> > > > On Mon, Aug 22, 2016 at 04:46:43PM +0200, David Engraf wrote:
> > > > > 
> > > > > 
> > > > > The PowerPC e5500/e6500 architecture is based on the e500mc core.
> > > > > Enable
> > > > > CONFIG_E500 and CONFIG_PPC_E500MC when e5500/e6500 is used.
> > > > > 
> > > > > This will also fix using CONFIG_PPC_QEMU_E500 on PPC64.
> > > > > 
> > > > > Signed-off-by: David Engraf 
> > > > > ---
> > > > >  arch/powerpc/platforms/Kconfig.cputype | 6 --
> > > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/arch/powerpc/platforms/Kconfig.cputype
> > > > > b/arch/powerpc/platforms/Kconfig.cputype
> > > > > index f32edec..0382da7 100644
> > > > > --- a/arch/powerpc/platforms/Kconfig.cputype
> > > > > +++ b/arch/powerpc/platforms/Kconfig.cputype
> > > > > @@ -125,11 +125,13 @@ config POWER8_CPU
> > > > > 
> > > > >  config E5500_CPU
> > > > >   bool "Freescale e5500"
> > > > > - depends on E500
> > > > > + select E500
> > > > > + select PPC_E500MC
> > > > > 
> > > > >  config E6500_CPU
> > > > >   bool "Freescale e6500"
> > > > > - depends on E500
> > > > > + select E500
> > > > > + select PPC_E500MC
> > > > These config symbols are for setting -mcpu.  Kernels built with
> > > > CONFIG_GENERIC_CPU should also work on e5500/e6500.
> > > I don't think so.
> > I do think so.  It's what you get when you run "make
> > corenet64_smp_defconfig"
> > and that kernel works on e5500/e6500.
> > 
> > > 
> > >  At least on QEMU it is not working because e5500/e6500
> > > is based on the e500mc core and the option CONFIG_PPC_E500MC also
> > > controls the cpu features (check cputable.h).
> > Again, this is only a problem when you have CONFIG_PPC_QEMU_E500 without
> > CONFIG_CORENET_GENERIC, and the fix for that is to have
> > CONFIG_PPC_QEMU_E500
> > select CONFIG_E500 (and you need to manually turn on CONFIG_PPC_E500MC if
> > applicable, since CONFIG_PPC_QEMU_E500 can also be used with e500v2).
> > 
> > I wouldn't be opposed to also adding "select PPC_E500MC if PPC64" to
> > CONFIG_PPC_QEMU_E500.
> Please find attached the new version, setting E500 and PPC_E500MC on 64 
> bit for review.

Could you send as a standalone patch (not an attachment) with changelog and
signoff so I can apply it?

-Scott



[PATCH v2 1/2] Input: synaptics-rmi4 - add support for F55 sensor tuning

2016-11-14 Thread Guenter Roeck
Sensor tuning support is needed to determine the number of enabled
tx and rx electrodes for use in F54 functions.

The number of enabled electrodes is not identical to the total number
of electrodes as reported with F55:Query0 and F55:Query1. It has to be
calculated by analyzing F55:Ctrl1 (sensor receiver assignment) and
F55:Ctrl2 (sensor transmitter assignment).
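
As a rough illustration of the calculation described above, the sketch below counts the enabled entries in an assignment table such as the one read from F55:Ctrl1 or F55:Ctrl2. It assumes that an unassigned slot is marked with 0xff; that marker value and the example_* names are assumptions made for illustration and are not taken from the text of this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed marker for a slot with no electrode assigned (illustrative). */
#define EXAMPLE_UNASSIGNED 0xff

/*
 * Count the entries of an assignment table that refer to an electrode.
 * 'table' stands in for the bytes read from F55:Ctrl1 or F55:Ctrl2 and
 * 'num_slots' for the total reported by F55:Query0 or F55:Query1.
 */
static unsigned int example_count_enabled(const uint8_t *table,
					  size_t num_slots)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < num_slots; i++) {
		if (table[i] != EXAMPLE_UNASSIGNED)
			total++;
	}

	return total;
}
```

With a five-slot table in which two slots carry the marker, the helper reports three enabled electrodes, which is the value F54 would need instead of five.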

Support for additional sensor tuning functions may be added later.

Fixes: 3a762dbd5347 ("[media] Input: synaptics-rmi4 - add support for F54 ...")
Signed-off-by: Guenter Roeck 
---
v2: Drop unnecessary include files
Only read required number of query elements
Added Fixes: tag (both patch 1 and 2 are needed)

 drivers/input/rmi4/Kconfig      |   9 +++
 drivers/input/rmi4/Makefile     |   1 +
 drivers/input/rmi4/rmi_bus.c    |   3 +
 drivers/input/rmi4/rmi_driver.h |   1 +
 drivers/input/rmi4/rmi_f55.c    | 124 
 5 files changed, 138 insertions(+)
 create mode 100644 drivers/input/rmi4/rmi_f55.c

diff --git a/drivers/input/rmi4/Kconfig b/drivers/input/rmi4/Kconfig
index 4c8a55857e00..11ede43c9936 100644
--- a/drivers/input/rmi4/Kconfig
+++ b/drivers/input/rmi4/Kconfig
@@ -72,3 +72,12 @@ config RMI4_F54
 
  Function 54 provides access to various diagnostic features in certain
  RMI4 touch sensors.
+
+config RMI4_F55
+   bool "RMI4 Function 55 (Sensor tuning)"
+   depends on RMI4_CORE
+   help
+ Say Y here if you want to add support for RMI4 function 55
+
+ Function 55 provides access to the RMI4 touch sensor tuning
+ mechanism.
diff --git a/drivers/input/rmi4/Makefile b/drivers/input/rmi4/Makefile
index 0bafc8502c4b..96f8e0c21e3b 100644
--- a/drivers/input/rmi4/Makefile
+++ b/drivers/input/rmi4/Makefile
@@ -8,6 +8,7 @@ rmi_core-$(CONFIG_RMI4_F11) += rmi_f11.o
 rmi_core-$(CONFIG_RMI4_F12) += rmi_f12.o
 rmi_core-$(CONFIG_RMI4_F30) += rmi_f30.o
 rmi_core-$(CONFIG_RMI4_F54) += rmi_f54.o
+rmi_core-$(CONFIG_RMI4_F55) += rmi_f55.o
 
 # Transports
 obj-$(CONFIG_RMI4_I2C) += rmi_i2c.o
diff --git a/drivers/input/rmi4/rmi_bus.c b/drivers/input/rmi4/rmi_bus.c
index ef8c747c35e7..82b7d4960858 100644
--- a/drivers/input/rmi4/rmi_bus.c
+++ b/drivers/input/rmi4/rmi_bus.c
@@ -314,6 +314,9 @@ static struct rmi_function_handler *fn_handlers[] = {
 #ifdef CONFIG_RMI4_F54
&rmi_f54_handler,
 #endif
+#ifdef CONFIG_RMI4_F55
+   &rmi_f55_handler,
+#endif
 };
 
 static void __rmi_unregister_function_handlers(int start_idx)
diff --git a/drivers/input/rmi4/rmi_driver.h b/drivers/input/rmi4/rmi_driver.h
index 8dfbebe9bf86..a65cf70f61e2 100644
--- a/drivers/input/rmi4/rmi_driver.h
+++ b/drivers/input/rmi4/rmi_driver.h
@@ -103,4 +103,5 @@ extern struct rmi_function_handler rmi_f11_handler;
 extern struct rmi_function_handler rmi_f12_handler;
 extern struct rmi_function_handler rmi_f30_handler;
 extern struct rmi_function_handler rmi_f54_handler;
+extern struct rmi_function_handler rmi_f55_handler;
 #endif
diff --git a/drivers/input/rmi4/rmi_f55.c b/drivers/input/rmi4/rmi_f55.c
new file mode 100644
index 000000000000..2d221cc97391
--- /dev/null
+++ b/drivers/input/rmi4/rmi_f55.c
@@ -0,0 +1,124 @@
+/*
+ * Copyright (c) 2012-2015 Synaptics Incorporated
+ * Copyright (C) 2016 Zodiac Inflight Innovations
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include "rmi_driver.h"
+
+#define F55_NAME   "rmi4_f55"
+
+/* F55 data offsets */
+#define F55_NUM_RX_OFFSET  0
+#define F55_NUM_TX_OFFSET  1
+#define F55_PHYS_CHAR_OFFSET   2
+
+/* Only read required query registers */
+#define F55_QUERY_LEN  3
+
+/* F55 capabilities */
+#define F55_CAP_SENSOR_ASSIGN  BIT(0)
+
+struct f55_data {
+   struct rmi_function *fn;
+
+   u8 qry[F55_QUERY_LEN];
+   u8 num_rx_electrodes;
+   u8 cfg_num_rx_electrodes;
+   u8 num_tx_electrodes;
+   u8 cfg_num_tx_electrodes;
+};
+
+static int rmi_f55_detect(struct rmi_function *fn)
+{
+   struct f55_data *f55;
+   int error;
+
+   f55 = dev_get_drvdata(&fn->dev);
+
+   error = rmi_read_block(fn->rmi_dev, fn->fd.query_base_addr,
+  &f55->qry, sizeof(f55->qry));
+   if (error) {
+   dev_err(&fn->dev, "%s: Failed to query F55 properties\n",
+   __func__);
+   return error;
+   }
+
+   f55->num_rx_electrodes = f55->qry[F55_NUM_RX_OFFSET];
+   f55->num_tx_electrodes = f55->qry[F55_NUM_TX_OFFSET];
+
+   f55->cfg_num_rx_electrodes = f55->num_rx_electrodes;
+   f55->cfg_num_tx_electrodes = f55->num_tx_electrodes;
+
+   if (f55->qry[F55_PHYS_CHAR_OFFSET] & F55_CAP_SENSOR_ASSIGN) {
+   int i, total;
+   u8 buf[256];
+
+   /*
+* Calculate the number of en

[PATCH v2 2/2] Input: synaptics-rmi4 - Propagate correct number of rx and tx electrodes to F54

2016-11-14 Thread Guenter Roeck
F54 diagnostics report functions provide data based on the number of
enabled rx and tx electrodes, which is not identical to the number of
electrodes reported with F54:Query0 and F54:Query1. Those values report
the number of supported electrodes, not the number of enabled electrodes.
The number of enabled electrodes can be determined by analyzing F55:Ctrl1
(sensor receiver assignment) and F55:Ctrl2 (sensor transmitter assignment).

Propagate the number of enabled electrodes from F55 to F54 to avoid
corrupted output if not all electrodes are enabled.
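
A minimal sketch of the fallback this patch introduces: use the enabled count propagated from F55 when it is non-zero, otherwise fall back to the count F54 read from its own query registers. The patch itself uses the GNU `a ? : b` shorthand; the sketch spells the condition out portably. The example_* names are hypothetical, and the frame-size helper assumes a report made of one 16-bit sample per rx/tx crossing.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Prefer the enabled count from F55; fall back to the supported count
 * from the F54 query registers when F55 reported nothing (zero).
 */
static uint8_t example_pick_count(uint8_t enabled, uint8_t supported)
{
	return enabled ? enabled : supported;
}

/* Size in bytes of one frame of 16-bit samples, one per rx/tx crossing. */
static size_t example_frame_size(uint8_t rx, uint8_t tx)
{
	return (size_t)rx * tx * sizeof(uint16_t);
}
```

If the frame size were computed from the supported counts while the device only reports data for the enabled electrodes, the two sizes would disagree, which is presumably the corrupted output the commit message refers to.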

Fixes: 3a762dbd5347 ("[media] Input: synaptics-rmi4 - add support for F54 ...")
Cc: Nick Dyer 
Cc: Andrew Duggan 
Cc: Chris Healy 
Signed-off-by: Guenter Roeck 
---
v2: Update Fixes: sha

 drivers/input/rmi4/Kconfig   |  1 +
 drivers/input/rmi4/rmi_f54.c | 14 ++++++++++----
 drivers/input/rmi4/rmi_f55.c |  7 +++++++
 include/linux/rmi.h          |  3 +++
 4 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/input/rmi4/Kconfig b/drivers/input/rmi4/Kconfig
index 11ede43c9936..d7129928cde6 100644
--- a/drivers/input/rmi4/Kconfig
+++ b/drivers/input/rmi4/Kconfig
@@ -67,6 +67,7 @@ config RMI4_F54
depends on RMI4_CORE
depends on VIDEO_V4L2=y || (RMI4_CORE=m && VIDEO_V4L2=m)
select VIDEOBUF2_VMALLOC
+   select RMI4_F55
help
  Say Y here if you want to add support for RMI4 function 54
 
diff --git a/drivers/input/rmi4/rmi_f54.c b/drivers/input/rmi4/rmi_f54.c
index cf805b960866..9cb3aa733f0f 100644
--- a/drivers/input/rmi4/rmi_f54.c
+++ b/drivers/input/rmi4/rmi_f54.c
@@ -216,8 +216,10 @@ static int rmi_f54_request_report(struct rmi_function *fn, u8 report_type)
 
 static size_t rmi_f54_get_report_size(struct f54_data *f54)
 {
-   u8 rx = f54->num_rx_electrodes ? : f54->num_rx_electrodes;
-   u8 tx = f54->num_tx_electrodes ? : f54->num_tx_electrodes;
+   struct rmi_device *rmi_dev = f54->fn->rmi_dev;
+   struct rmi_driver_data *drv_data = dev_get_drvdata(&rmi_dev->dev);
+   u8 rx = drv_data->num_rx_electrodes ? : f54->num_rx_electrodes;
+   u8 tx = drv_data->num_tx_electrodes ? : f54->num_tx_electrodes;
size_t size;
 
switch (rmi_f54_get_reptype(f54, f54->input)) {
@@ -401,6 +403,10 @@ static int rmi_f54_vidioc_enum_input(struct file *file, void *priv,
 
 static int rmi_f54_set_input(struct f54_data *f54, unsigned int i)
 {
+   struct rmi_device *rmi_dev = f54->fn->rmi_dev;
+   struct rmi_driver_data *drv_data = dev_get_drvdata(&rmi_dev->dev);
+   u8 rx = drv_data->num_rx_electrodes ? : f54->num_rx_electrodes;
+   u8 tx = drv_data->num_tx_electrodes ? : f54->num_tx_electrodes;
struct v4l2_pix_format *f = &f54->format;
enum rmi_f54_report_type reptype;
int ret;
@@ -415,8 +421,8 @@ static int rmi_f54_set_input(struct f54_data *f54, unsigned int i)
 
f54->input = i;
 
-   f->width = f54->num_rx_electrodes;
-   f->height = f54->num_tx_electrodes;
+   f->width = rx;
+   f->height = tx;
f->field = V4L2_FIELD_NONE;
f->colorspace = V4L2_COLORSPACE_RAW;
f->bytesperline = f->width * sizeof(u16);
diff --git a/drivers/input/rmi4/rmi_f55.c b/drivers/input/rmi4/rmi_f55.c
index 2d221cc97391..37390ca6a924 100644
--- a/drivers/input/rmi4/rmi_f55.c
+++ b/drivers/input/rmi4/rmi_f55.c
@@ -38,6 +38,8 @@ struct f55_data {
 
 static int rmi_f55_detect(struct rmi_function *fn)
 {
+   struct rmi_device *rmi_dev = fn->rmi_dev;
+   struct rmi_driver_data *drv_data = dev_get_drvdata(&rmi_dev->dev);
struct f55_data *f55;
int error;
 
@@ -57,6 +59,9 @@ static int rmi_f55_detect(struct rmi_function *fn)
f55->cfg_num_rx_electrodes = f55->num_rx_electrodes;
f55->cfg_num_tx_electrodes = f55->num_tx_electrodes;
 
+   drv_data->num_rx_electrodes = f55->cfg_num_rx_electrodes;
+   drv_data->num_tx_electrodes = f55->cfg_num_tx_electrodes;
+
if (f55->qry[F55_PHYS_CHAR_OFFSET] & F55_CAP_SENSOR_ASSIGN) {
int i, total;
u8 buf[256];
@@ -78,6 +83,7 @@ static int rmi_f55_detect(struct rmi_function *fn)
total++;
}
f55->cfg_num_rx_electrodes = total;
+   drv_data->num_rx_electrodes = total;
}
 
error = rmi_read_block(fn->rmi_dev,
@@ -90,6 +96,7 @@ static int rmi_f55_detect(struct rmi_function *fn)
total++;
}
f55->cfg_num_tx_electrodes = total;
+   drv_data->num_tx_electrodes = total;
}
}
 
diff --git a/include/linux/rmi.h b/include/linux/rmi.h
index e0aca1476001..45734f1343b3 100644
--- a/include/linux/rmi.h
+++ b/include/linux/rmi.h
@@ -345,6 +345,9 @@ struct rmi_driver_data {
u8 pdt_props;
u8 bsr;
 
+   u8 num_rx_electrodes;
+   u8 num_tx_electrode

Re: kvm: WARNING in em_jmp_far

2016-11-14 Thread Nadav Amit

> On Nov 14, 2016, at 9:30 PM, Dmitry Vyukov  wrote:
> 
> On Tue, Nov 15, 2016 at 6:24 AM, Nadav Amit  wrote:
>> 
>>> On Nov 14, 2016, at 9:06 PM, Dmitry Vyukov  wrote:
>>> 
>>> Hello,
>>> 
>>> The following program triggers WARNING in em_jmp_far:
>>> https://gist.githubusercontent.com/dvyukov/16bfd3d68fa7d5461101ef74e07796e4/raw/e6d663980681f2c5838ff6cd361cede7d3204838/gistfile1.txt
>>> 
>>> 
>>> WARNING: CPU: 1 PID: 15748 at arch/x86/kvm/emulate.c:2128 em_jmp_far+0x4a7/0x530
>> 
>> I don’t know how to “read” the test, but it seems that this warning
>> can be triggered if CS base/limit cause a #GP exception when EIP
>> is loaded.
>> 
>> I think it safe to remove this warning (which I introduced) as well as
>> the redundant “return rc” that follows it. The code should handle the
>> emulation correctly regardless of the warning.
> 
> There was also a similar WARNING in em_ret_far:
> https://groups.google.com/forum/#!msg/syzkaller/o5ZftARBhrs/r1ivQ-HtBgAJ
> 
> Please mail a fix and add a test.

I am sorry, but I don’t think my current employer allows me to contribute
to KVM in such a manner.

Regards,
Nadav

Re: [PATCH 00/34] perf clang: Builtin clang and perfhook support

2016-11-14 Thread Wangnan (F)



On 2016/11/15 13:21, Alexei Starovoitov wrote:
> On Mon, Nov 14, 2016 at 9:03 PM, Wangnan (F)  wrote:
>>
>> On 2016/11/15 12:57, Alexei Starovoitov wrote:
>>> On Mon, Nov 14, 2016 at 8:05 PM, Wang Nan  wrote:
>>>> This is version 2 of perf builtin clang patch series. Compare to v1,
>>>> add an exciting feature: jit compiling perf hook functions. This
>>>> features allows script writer report result through BPF map in a
>>>> customized way.
>>>
>>> looks great.
>>>
>>>> SEC("perfhook:record_start")
>>>> void record_start(void *ctx)
>>>> {
>>>>   int perf_pid = getpid(), key = G_perf_pid;
>>>>   printf("Start count, perfpid=%d\n", perf_pid);
>>>>   jit_helper__map_update_elem(ctx, &GVALS, &key, &perf_pid, 0);
>>>
>>> the name, I think, is too verbose.
>>> Why not to keep them as bpf_map_update_elem
>>> even for user space programs?
>>
>> I can make it shorter by give it a better name or use a wrapper like
>>
>> BPF_MAP(update_elem)
>
> the macro isn't pretty, since function calls won't look like calls.
>
>> but the only thing I can't do is to make perfhook and in-kernel script
>> use a uniform name for these bpf_map functions, because
>> bpf_map_update_elem is already defined:
>>
>> "static long (*bpf_map_update_elem)(void *, void *, void *, unsigned long) =
>> (void *)2;\n"
>
> right. i guess you could have #ifdef it, so it's different for bpf backend
> and for native.

Then the '.c' -> LLVM IR compiling should be done twice for BPF
and for JIT to make the macro work. In current implementation
we have only one LLVM IR. It is faster and can make sure the data
layout ("maps" section) is identical.

> Another alternative is to call it map_update_elem or map_update
> or bpf_map_update. Something shorter is already a win.
> 'jit_helper__' prefix is an implementation detail. The users don't
> need to know and don't need to spell it out everywhere.

Good. Let choose a better name for them.

Thank you.



Re: [PATCH v6 0/9] tpm: cleanup/fixes in existing event log support

2016-11-14 Thread Nayna



On 11/15/2016 07:45 AM, Jarkko Sakkinen wrote:
> On Mon, Nov 14, 2016 at 04:25:14PM -0800, Jarkko Sakkinen wrote:
>> On Mon, Nov 14, 2016 at 02:33:23PM -0800, Jarkko Sakkinen wrote:
>>> On Mon, Nov 14, 2016 at 05:00:47AM -0500, Nayna Jain wrote:
>>>> This patch set includes the cleanup and bug fixes patches, previously
>>>> part of the "tpm: add the securityfs pseudo files support for TPM 2.0
>>>> firmware event log" patch set, in order to upstream them more quickly.
>>>
>>> I applied the patches. I'm not yet sure whether these are part of the
>>> 4.10 pull request or whether I postpone to 4.11 (my preference would be
>>> 4.10 but I do not want to close that right now). I'll do testing next
>>> week before doing pull request.
>>>
>>> I hope that the commits gets some reviews and testing now that they are
>>> easily testable in my master branch.
>>
>> Event log still works and they do not seem to break TPM 2.0 (tried both
>> machine with tpm_crb and tpm_tis).
>>
>> Stefan: would you mind check that these do not break your TPM 1.2
>> environment? I already tried with TPM 1.2 machine but probably would
>> make sense to peer test.
>
> I'm dropping commits 8/9 and 9/9 from my tree and *will not* include
> them to my 4.10 pull request.

Will fix this and resend the patch 8/9 and 9/9 again.

Thanks & Regards,
   - Nayna

> /Jarkko





kvm: WARNING in rtc_status_pending_eoi_check_valid

2016-11-14 Thread Dmitry Vyukov
Hello,

The following program triggers WARNING in rtc_status_pending_eoi_check_valid:
https://gist.githubusercontent.com/dvyukov/1bd04c1b36a0c2da13c6da386e1e8c08/raw/c22c7dfa28604bd2920e1c135cfff2cb2acf8bed/gistfile1.txt

On commit a25f0944ba9b1d8a6813fd6f1a86f1bd59ac25a6 (Nov 13)


Disabled LAPIC found during irq injection
------------[ cut here ]------------
WARNING: CPU: 1 PID: 6812 at arch/x86/kvm/ioapic.c:104[] rtc_status_pending_eoi_check_valid+0x5e/0x80
arch/x86/kvm/ioapic.c:104
Modules linked in:[ 1566.655501] CPU: 1 PID: 6812 Comm: a.out Tainted:
GW   4.9.0-rc5+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 880038367128 834c2959 0001 11000706cdb8
 ed000706cdb0 41b58ab3 89575430 834c266b
 815efeb7 88003db8cc58 0082 88003db8cc60
Call Trace:
 [< inline >] __dump_stack lib/dump_stack.c:15
 [] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
 [] __warn+0x1a4/0x1e0 kernel/panic.c:550
 [] warn_slowpath_null+0x31/0x40 kernel/panic.c:585
 [] rtc_status_pending_eoi_check_valid+0x5e/0x80
arch/x86/kvm/ioapic.c:104
 [] __rtc_irq_eoi_tracking_restore_one+0x2e5/0x350
arch/x86/kvm/ioapic.c:135
 [] kvm_rtc_eoi_tracking_restore_one+0x6b/0x90
arch/x86/kvm/ioapic.c:144
 [] kvm_apic_set_state+0x97e/0xdc0 arch/x86/kvm/lapic.c:2091
 [< inline >] kvm_vcpu_ioctl_set_lapic arch/x86/kvm/x86.c:2836
 [] kvm_arch_vcpu_ioctl+0x1ae3/0x44a0 arch/x86/kvm/x86.c:3339
 [] kvm_vcpu_ioctl+0x237/0x11c0
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2708
 [< inline >] vfs_ioctl fs/ioctl.c:43
 [] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
 [< inline >] SYSC_ioctl fs/ioctl.c:694
 [] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
 [] entry_SYSCALL_64_fastpath+0x23/0xc6
arch/x86/entry/entry_64.S:209
---[ end trace f9208bd27a680718 ]---


Re: kvm: WARNING in em_jmp_far

2016-11-14 Thread Nadav Amit
 
> On Nov 14, 2016, at 9:06 PM, Dmitry Vyukov  wrote:
> 
> Hello,
> 
> The following program triggers WARNING in em_jmp_far:
> https://gist.githubusercontent.com/dvyukov/16bfd3d68fa7d5461101ef74e07796e4/raw/e6d663980681f2c5838ff6cd361cede7d3204838/gistfile1.txt
> 
> 
> WARNING: CPU: 1 PID: 15748 at arch/x86/kvm/emulate.c:2128 em_jmp_far+0x4a7/0x530

I don’t know how to “read” the test, but it seems that this warning
can be triggered if CS base/limit cause a #GP exception when EIP
is loaded.

I think it safe to remove this warning (which I introduced) as well as
the redundant “return rc” that follows it. The code should handle the
emulation correctly regardless of the warning.

Regards,
Nadav

Re: kvm: WARNING in em_jmp_far

2016-11-14 Thread Dmitry Vyukov
On Tue, Nov 15, 2016 at 6:24 AM, Nadav Amit  wrote:
>
>> On Nov 14, 2016, at 9:06 PM, Dmitry Vyukov  wrote:
>>
>> Hello,
>>
>> The following program triggers WARNING in em_jmp_far:
>> https://gist.githubusercontent.com/dvyukov/16bfd3d68fa7d5461101ef74e07796e4/raw/e6d663980681f2c5838ff6cd361cede7d3204838/gistfile1.txt
>>
>>
>> WARNING: CPU: 1 PID: 15748 at arch/x86/kvm/emulate.c:2128 em_jmp_far+0x4a7/0x530
>
> I don’t know how to “read” the test, but it seems that this warning
> can be triggered if CS base/limit cause a #GP exception when EIP
> is loaded.
>
> I think it safe to remove this warning (which I introduced) as well as
> the redundant “return rc” that follows it. The code should handle the
> emulation correctly regardless of the warning.

There was also a similar WARNING in em_ret_far:
https://groups.google.com/forum/#!msg/syzkaller/o5ZftARBhrs/r1ivQ-HtBgAJ

Please mail a fix and add a test.

Thanks


Re: [PATCH v3] soc: qcom: Add SoC info driver

2016-11-14 Thread Bjorn Andersson
On Mon 14 Nov 06:30 PST 2016, Imran Khan wrote:

> On 11/8/2016 1:05 AM, Bjorn Andersson wrote:
> > On Mon 07 Nov 06:35 PST 2016, Imran Khan wrote:
> > 
> > 
> > 
> > [..]
> > 
>  +static void socinfo_populate(struct soc_device_attribute *soc_dev_attr)
>  +{
>  +u32 soc_version = socinfo_get_version();
>  +
>  +soc_dev_attr->soc_id   = kasprintf(GFP_KERNEL, "%d", socinfo_get_id());
> >>>
> >>> I believe soc_id is supposed to be a human readable name; e.g. "MSM8996"
> >>> not "246".
> >>>
> >>
> >> I am not sure about this. I see other vendors also exposing soc_id as 
> >> numeric value
> >> and machine is perhaps used for a human readable name. Please let me know
> >> if I am getting something wrong here.
> >>
> > 
> > I'm slightly confused as to what these various properties are supposed to
> > contain. According to Documentation/ABI/testing/sysfs-devices-soc, soc_id
> > should contain the SoC serial number, while most implementations do
> > as you did and put something telling which SoC it is.
> > 
> > 246 is however not a useful number, as everyone reading it - be it human
> > or computer - will have to carry the translation table to figure out
> > what it actually says.
> >
> 
> Yeah. I agree on this point. I was just following the lead of other SoCs here.
> Just worried if having a string here breaks the convention. At least having
> a numeric number is more in line with the documentation which expects a 
> serial number. May be here by serial number the documentation means numeric
> id itself. Can someone please provide some feedback? 
>  

Yeah, the more I look at this the more puzzled I become about what
should go where.

>  +soc_dev_attr->family  =  "Snapdragon";
> > 
> > I think family should be e.g. "MSM8996" and then machine should be e.g.
> > "MSM8996AU".
> > 
> 
> I think here family should be Snapdragon.The following site also mentions
> the SoCs as Snapdragon family of processors.
> 
> https://www.qualcomm.com/products/snapdragon/processors/comparison
> 
> Could you please confirm if it's okay?
> 

In our previous technical discussions regarding Qualcomm platforms the
possible values for "family" would be U, A and B (maybe something new
these days?).

But I don't think we gain anything from having the kernel tell us this.

So I'm fine with you reporting "Snapdragon" as family and I guess
machine would then get e.g. "APQ8096". I don't know what to put in
soc_id.

I think this would be sufficient for user space's needs.

Regards,
Bjorn
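
To make the point about the translation table concrete, here is a small userspace sketch in which the table lives in one place, so that whoever reads soc_id gets a name rather than a bare number. The struct, the table and the helper are hypothetical; only the pairing of 246 with "MSM8996" comes from the example given earlier in this thread.

```c
#include <stddef.h>
#include <stdio.h>

/* One row of the id-to-name translation table (hypothetical layout). */
struct example_soc_name {
	unsigned int id;
	const char *name;
};

/* Only the 246 -> "MSM8996" pairing is mentioned in this thread. */
static const struct example_soc_name example_soc_names[] = {
	{ 246, "MSM8996" },
};

/*
 * Write a human readable name for 'id' into 'buf', falling back to the
 * decimal id when no name is known.  Returns what snprintf() returns.
 */
static int example_format_soc_id(unsigned int id, char *buf, size_t len)
{
	size_t n = sizeof(example_soc_names) / sizeof(example_soc_names[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (example_soc_names[i].id == id)
			return snprintf(buf, len, "%s",
					example_soc_names[i].name);
	}

	return snprintf(buf, len, "%u", id);
}
```

Whether such a table belongs in the kernel or in user space is exactly the open question above; the sketch only shows that the lookup itself is small.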


Re: [PATCH v12 09/22] vfio iommu type1: Add task structure to vfio_dma

2016-11-14 Thread Alexey Kardashevskiy
On 15/11/16 02:42, Kirti Wankhede wrote:
> Add task structure to vfio_dma structure.
> During DMA_UNMAP, same task who mapped it or other task who shares same
> address space is allowed to unmap, otherwise unmap fails.
> QEMU maps few iova ranges initially, then fork threads and from the child
> thread calls DMA_UNMAP on previously mapped iova. Since child shares same
> address space, DMA_UNMAP is successful.

Please add a few words on why you reference the task instead of the mm.
AFAICT you only use the mm. Thanks.


> 
> Signed-off-by: Kirti Wankhede 
> Signed-off-by: Neo Jia 
> Change-Id: I7600f1bea6b384fd589fa72421ccf031bcfd9ac5
> ---
>  drivers/vfio/vfio_iommu_type1.c | 137 
> +---
>  1 file changed, 86 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index ffe2026f1341..50aca95cf61e 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -36,6 +36,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define DRIVER_VERSION  "0.2"
>  #define DRIVER_AUTHOR   "Alex Williamson "
> @@ -75,6 +76,7 @@ struct vfio_dma {
>   unsigned long   vaddr;  /* Process virtual addr */
>   size_t  size;   /* Map size (bytes) */
>   int prot;   /* IOMMU_READ/WRITE */
> + struct task_struct  *task;
>  };
>  
>  struct vfio_group {
> @@ -277,41 +279,47 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
>   * the iommu can only map chunks of consecutive pfns anyway, so get the
>   * first page and all consecutive pages with the same locking.
>   */
> -static long vfio_pin_pages_remote(unsigned long vaddr, long npage,
> -   int prot, unsigned long *pfn_base)
> +static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
> +   long npage, int prot, unsigned long *pfn_base)
>  {
> - unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> - bool lock_cap = capable(CAP_IPC_LOCK);
> + unsigned long limit;
> + bool lock_cap = ns_capable(task_active_pid_ns(dma->task)->user_ns,
> +CAP_IPC_LOCK);
> + struct mm_struct *mm;
>   long ret, i;
>   bool rsvd;
>  
> - if (!current->mm)
> + mm = get_task_mm(dma->task);
> + if (!mm)
>   return -ENODEV;
>  
> - ret = vaddr_get_pfn(current->mm, vaddr, prot, pfn_base);
> + ret = vaddr_get_pfn(mm, vaddr, prot, pfn_base);
>   if (ret)
> - return ret;
> + goto pin_pg_remote_exit;
>  
>   rsvd = is_invalid_reserved_pfn(*pfn_base);
> + limit = task_rlimit(dma->task, RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>  
> - if (!rsvd && !lock_cap && current->mm->locked_vm + 1 > limit) {
> + if (!rsvd && !lock_cap && mm->locked_vm + 1 > limit) {
>   put_pfn(*pfn_base, prot);
>   pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n", __func__,
>   limit << PAGE_SHIFT);
> - return -ENOMEM;
> + ret = -ENOMEM;
> + goto pin_pg_remote_exit;
>   }
>  
>   if (unlikely(disable_hugepages)) {
>   if (!rsvd)
> - vfio_lock_acct(current, 1);
> - return 1;
> + vfio_lock_acct(dma->task, 1);
> + ret = 1;
> + goto pin_pg_remote_exit;
>   }
>  
>   /* Lock all the consecutive pages from pfn_base */
>   for (i = 1, vaddr += PAGE_SIZE; i < npage; i++, vaddr += PAGE_SIZE) {
>   unsigned long pfn = 0;
>  
> - ret = vaddr_get_pfn(current->mm, vaddr, prot, &pfn);
> + ret = vaddr_get_pfn(mm, vaddr, prot, &pfn);
>   if (ret)
>   break;
>  
> @@ -321,8 +329,7 @@ static long vfio_pin_pages_remote(unsigned long vaddr, long npage,
>   break;
>   }
>  
> - if (!rsvd && !lock_cap &&
> - current->mm->locked_vm + i + 1 > limit) {
> + if (!rsvd && !lock_cap && mm->locked_vm + i + 1 > limit) {
>   put_pfn(pfn, prot);
>   pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n",
>   __func__, limit << PAGE_SHIFT);
> @@ -331,13 +338,16 @@ static long vfio_pin_pages_remote(unsigned long vaddr, long npage,
>   }
>  
>   if (!rsvd)
> - vfio_lock_acct(current, i);
> + vfio_lock_acct(dma->task, i);
> + ret = i;
>  
> - return i;
> +pin_pg_remote_exit:
> + mmput(mm);
> + return ret;
>  }
>  
> -static long vfio_unpin_pages_remote(unsigned long pfn, long npage,
> - int prot, bool do_accounting)
> +static long vfio_unpin_pages_remote(struct vfio_dma *dma, unsigned long pfn,
> + long npage, int prot, bool do_accounting)
>  {
> 
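
The reworked vfio_pin_pages_remote() quoted above takes a reference with get_task_mm() and then routes every later return through a single label that calls mmput(). Below is a minimal userspace sketch of that single-exit pattern. It uses a stand-in reference counter instead of a real mm_struct, and all example_* names are hypothetical.

```c
#include <errno.h>

/* Stand-in for a reference-counted object such as an mm_struct. */
struct example_ref {
	int count;
};

static struct example_ref *example_get(struct example_ref *r)
{
	r->count++;
	return r;
}

static void example_put(struct example_ref *r)
{
	r->count--;
}

/*
 * Pin up to 'npage' pages (assumed >= 1) against a budget of 'limit'
 * pages, of which 'locked' are already used.  Returns the number of
 * pages pinned, or -ENOMEM if not even the first page fits.  Every path
 * after example_get() leaves through the 'out' label, so the reference
 * is dropped exactly once.
 */
static long example_pin(struct example_ref *r, long npage, long locked,
			long limit)
{
	long ret, i;

	example_get(r);

	if (locked + 1 > limit) {
		ret = -ENOMEM;
		goto out;
	}

	/* Extend the run page by page until the budget is exhausted. */
	for (i = 1; i < npage; i++) {
		if (locked + i + 1 > limit)
			break;
	}
	ret = i;

out:
	example_put(r);
	return ret;
}
```

The benefit over returning early is that a new error path added later cannot forget the release call, which matters once the function holds a reference it did not hold before.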

Re: [PATCH 00/34] perf clang: Builtin clang and perfhook support

2016-11-14 Thread Alexei Starovoitov
On Mon, Nov 14, 2016 at 9:03 PM, Wangnan (F)  wrote:
>
>
> On 2016/11/15 12:57, Alexei Starovoitov wrote:
>>
>> On Mon, Nov 14, 2016 at 8:05 PM, Wang Nan  wrote:
>>>
>>> This is version 2 of perf builtin clang patch series. Compare to v1,
>>> add an exciting feature: jit compiling perf hook functions. This
>>> features allows script writer report result through BPF map in a
>>> customized way.
>>
>> looks great.
>>
>>>SEC("perfhook:record_start")
>>>void record_start(void *ctx)
>>>{
>>>  int perf_pid = getpid(), key = G_perf_pid;
>>>  printf("Start count, perfpid=%d\n", perf_pid);
>>>  jit_helper__map_update_elem(ctx, &GVALS, &key, &perf_pid, 0);
>>
>> the name, I think, is too verbose.
>> Why not to keep them as bpf_map_update_elem
>> even for user space programs?
>
>
> I can make it shorter by give it a better name or use a wrapper like
>
> BPF_MAP(update_elem)

the macro isn't pretty, since function calls won't look like calls.

> but the only thing I can't do is to make perfhook and in-kernel script
> use a uniform name for these bpf_map functions, because
> bpf_map_update_elem is already defined:
>
> "static long (*bpf_map_update_elem)(void *, void *, void *, unsigned long) =
> (void *)2;\n"

right. i guess you could have #ifdef it, so it's different for bpf backend
and for native.
Another alternative is to call it map_update_elem or map_update
or bpf_map_update. Something shorter is already a win.
'jit_helper__' prefix is an implementation detail. The users don't
need to know and don't need to spell it out everywhere.
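
A sketch of the #ifdef idea, under the assumption that the source can be compiled separately for each target: one short name is used at every call site and the preprocessor decides what it resolves to. EXAMPLE_BPF_BACKEND and the example_* names are hypothetical, and the native helper here is only a stub that stores into a small array.

```c
#include <stddef.h>

/* Stand-in storage for the native (JIT side) helper; hypothetical. */
static int example_store[4];

/* Stub with the same argument list as the jit_helper__ call above. */
static int example_native_map_update(void *ctx, void *map, void *key,
				     void *value, unsigned long flags)
{
	int idx = *(int *)key;

	(void)ctx;
	(void)map;
	(void)flags;

	if (idx < 0 || idx >= 4)
		return -1;

	example_store[idx] = *(int *)value;
	return 0;
}

#ifdef EXAMPLE_BPF_BACKEND
/* BPF side: the short name would resolve to the in-kernel helper. */
#define map_update_elem bpf_map_update_elem
#else
/* Native side: the short name resolves to the JIT helper stub. */
#define map_update_elem example_native_map_update
#endif
```

Because the macro only renames the function, a call such as `map_update_elem(ctx, &map, &key, &value, 0)` still reads like an ordinary call, unlike a wrapper of the BPF_MAP(update_elem) kind. The argument lists of the two targets would also have to agree, which this sketch does not address.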


Re: [PATCH v2 2/2] ARM: dts: apq8064: add support to pm8821

2016-11-14 Thread Bjorn Andersson
On Mon 14 Nov 09:52 PST 2016, Srinivas Kandagatla wrote:

Acked-by: Bjorn Andersson 

Regards,
Bjorn

> Signed-off-by: Srinivas Kandagatla 
> ---
>  arch/arm/boot/dts/qcom-apq8064.dtsi | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi b/arch/arm/boot/dts/qcom-apq8064.dtsi
> index 268bd47..c61ba32 100644
> --- a/arch/arm/boot/dts/qcom-apq8064.dtsi
> +++ b/arch/arm/boot/dts/qcom-apq8064.dtsi
> @@ -627,6 +627,33 @@
>   clock-names = "core";
>   };
>  
> + ssbi@c0 {
> + compatible = "qcom,ssbi";
> + reg = <0x00c0 0x1000>;
> + qcom,controller-type = "pmic-arbiter";
> +
> + pm8821: pmic@1 {
> + compatible = "qcom,pm8821";
> + interrupt-parent = <&tlmm_pinmux>;
> + interrupts = <76 IRQ_TYPE_LEVEL_LOW>;
> + #interrupt-cells = <2>;
> + interrupt-controller;
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + pm8821_mpps: mpps@50 {
> + compatible = "qcom,pm8821-mpp", "qcom,ssbi-mpp";
> + reg = <0x50>;
> + interrupts = <24 IRQ_TYPE_NONE>,
> +  <25 IRQ_TYPE_NONE>,
> +  <26 IRQ_TYPE_NONE>,
> +  <27 IRQ_TYPE_NONE>;
> + gpio-controller;
> + #gpio-cells = <2>;
> + };
> + };
> + };
> +
>   qcom,ssbi@50 {
>   compatible = "qcom,ssbi";
>   reg = <0x0050 0x1000>;
> -- 
> 2.10.1
> 


Re: [PATCH v11 10/22] vfio iommu type1: Add support for mediated devices

2016-11-14 Thread Alexey Kardashevskiy
On 08/11/16 17:52, Alexey Kardashevskiy wrote:
> On 05/11/16 08:10, Kirti Wankhede wrote:
>> VFIO IOMMU drivers are designed for the devices which are IOMMU capable.
>> Mediated device only uses IOMMU APIs, the underlying hardware can be
>> managed by an IOMMU domain.
>>
>> Aim of this change is:
>> - To use most of the code of TYPE1 IOMMU driver for mediated devices
>> - To support direct assigned device and mediated device in single module
>>
>> This change adds pin and unpin support for mediated device to TYPE1 IOMMU
>> backend module. More details:
>> - vfio_pin_pages() callback here uses task and address space of vfio_dma,
>>   that is, of the process who mapped that iova range.
>> - Added pfn_list tracking logic to address space structure. All pages
>>   pinned through this interface are tracked in its address space.
>> - Pinned pages list is used to verify unpinning request and to unpin
>>   remaining pages while detaching the group for that device.
>> - Page accounting is updated to account in its address space where the
>>   pages are pinned/unpinned.
>> - Accounting for mdev device is only done if there is no iommu capable
>>   domain in the container. When there is a direct device assigned to the
>>   container and that domain is iommu capable, all pages are already pinned
>>   during DMA_MAP.
>> - Page accounting is updated on hot plug and unplug mdev device and pass
>>   through device.
>>
>> Tested by assigning below combinations of devices to a single VM:
>> - GPU pass through only
> 
> This does not require this patchset, right?
> 
>> - vGPU device only
> 
> Out of curiosity - how exactly did you test this? The exact GPU, how to
> create vGPU, what was the QEMU command line and the guest does with this
> passed device? Thanks.


ping?


-- 
Alexey


Re: [PATCH] tpm: drop chip->is_open and chip->duration_adjusted

2016-11-14 Thread Jarkko Sakkinen
On Mon, Nov 14, 2016 at 09:30:01PM -0700, Jason Gunthorpe wrote:
> On Mon, Nov 14, 2016 at 03:44:58PM -0800, Jarkko Sakkinen wrote:
> > Use atomic bitops for chip->flags so that we do not need chip->is_open
> > and chip->duration_adjusted anymore.
> 
> I don't know if it's a really great idea to use atomic bit ops for
> things that do not need to be atomic. It makes the locking scheme
> less clear. is_open is genuinely different since it relies on the
> atomic for correctness.

The way I see it, it is one of the status flags bound to the chip among the
others. I do not see this causing too much harm to clarity. It eases
debugging the driver a bit because you get more state out of 'flags'.

It also makes the code a little more robust as flags is independent of
locks.

How strong is your opposition here? I do not see any exceptional damage
done but see some subtle but still significant benefits.

> Merging is_duration makes lots of sense though

Also timeout_adjusted should be merged (for some reason missed it).

> Jason

/Jarkko
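
For readers following the is_open discussion, here is a userspace sketch of keeping several status bits in one atomically updated word, written with C11 atomics. In the kernel the corresponding operations would be test_and_set_bit() and clear_bit(); the flag names and the example_* helpers below are hypothetical and are not taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits kept in one atomic word. */
enum {
	EXAMPLE_FLAG_IS_OPEN = 1u << 0,
	EXAMPLE_FLAG_DURATION_ADJUSTED = 1u << 1,
};

struct example_chip {
	atomic_uint flags;
};

/* Atomically set 'bit'; return true if it was already set. */
static bool example_test_and_set(struct example_chip *chip, unsigned int bit)
{
	return (atomic_fetch_or(&chip->flags, bit) & bit) != 0;
}

/* Atomically clear 'bit'. */
static void example_clear(struct example_chip *chip, unsigned int bit)
{
	atomic_fetch_and(&chip->flags, ~bit);
}

/* Only one opener at a time: fails if the chip is already open. */
static bool example_open(struct example_chip *chip)
{
	return !example_test_and_set(chip, EXAMPLE_FLAG_IS_OPEN);
}
```

The open path is the case where atomicity is required for correctness, as Jason notes; a bit such as the duration one would work equally well under an ordinary lock, which is the crux of the disagreement.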


Re: [PATCH 3/3] soc: fsl: make guts driver explicitly non-modular

2016-11-14 Thread Scott Wood
On Sun, 2016-11-13 at 14:03 -0500, Paul Gortmaker wrote:
> The Kconfig currently controlling compilation of this code is:
> 
> drivers/soc/fsl/Kconfig:config FSL_GUTS
> drivers/soc/fsl/Kconfig:bool
> 
> ...meaning that it currently is not being built as a module by anyone.
> 
> Let's remove the modular code that is essentially orphaned, so that
> when reading the driver there is no doubt it is builtin-only.
> 
> We explicitly disallow a driver unbind, since that doesn't have a
> sensible use case anyway, and it allows us to drop the ".remove"
> code for non-modular drivers.
> 
> Since the code was already not using module_init, the init ordering
> remains unchanged with this commit.
> 
> Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code.
> 
> Cc: Scott Wood 
> Cc: Yangbo Lu 
> Cc: Arnd Bergmann 
> Cc: Ulf Hansson 
> Cc: linuxppc-...@lists.ozlabs.org
> Cc: linux-arm-ker...@lists.infradead.org
> Signed-off-by: Paul Gortmaker 

Acked-by: Scott Wood 

-Scott



kvm: WARNING in em_jmp_far

2016-11-14 Thread Dmitry Vyukov
Hello,

The following program triggers WARNING in em_jmp_far:
https://gist.githubusercontent.com/dvyukov/16bfd3d68fa7d5461101ef74e07796e4/raw/e6d663980681f2c5838ff6cd361cede7d3204838/gistfile1.txt


WARNING: CPU: 1 PID: 15748 at arch/x86/kvm/emulate.c:2128 em_jmp_far+0x4a7/0x530
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 15748 Comm: syz-executor Not tainted 4.9.0-rc5+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 880033986ec8 834c2959 0001 110006730d6c
 ed0006730d64 41b58ab3 89575430 834c266b
 41b58ab3 894d1810 8158f020 811ac787
Call Trace:
 [< inline >] __dump_stack lib/dump_stack.c:15
 [] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
 [] panic+0x200/0x425 kernel/panic.c:179
 [] __warn+0x1c9/0x1e0 kernel/panic.c:542
 [] warn_slowpath_null+0x31/0x40 kernel/panic.c:585
 [] em_jmp_far+0x4a7/0x530 arch/x86/kvm/emulate.c:2128
 [] x86_emulate_insn+0x43f/0x4090 arch/x86/kvm/emulate.c:5294
 [] x86_emulate_instruction+0x43e/0x2300 arch/x86/kvm/x86.c:5547
 [< inline >] emulate_instruction arch/x86/include/asm/kvm_host.h:1116
 [< inline >] complete_emulated_io arch/x86/kvm/x86.c:6872
 [] complete_emulated_mmio+0x76e/0xb70 arch/x86/kvm/x86.c:6936
 [] kvm_arch_vcpu_ioctl_run+0x3562/0x4eb0 arch/x86/kvm/x86.c:6980
 [] kvm_vcpu_ioctl+0x678/0x11c0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2557
 [< inline >] vfs_ioctl fs/ioctl.c:43
 [] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
 [< inline >] SYSC_ioctl fs/ioctl.c:694
 [] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
 [] entry_SYSCALL_64_fastpath+0x23/0xc6
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
reboot: cpu_has_vmx: ecx=80a02021 1

On commit a25f0944ba9b1d8a6813fd6f1a86f1bd59ac25a6 (Nov 13).


Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver

2016-11-14 Thread Michael S. Tsirkin
On Tue, Nov 15, 2016 at 01:50:21PM +0900, Namhyung Kim wrote:
> Hi Michael,
> 
> On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
> > On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
> > > The virtio pstore driver provides interface to the pstore subsystem so
> > > that the guest kernel's log/dump message can be saved on the host
> > > machine.  Users can access the log file directly on the host, or on the
> > > guest at the next boot using pstore filesystem.  It currently deals with
> > > kernel log (printk) buffer only, but we can extend it to have other
> > > information (like ftrace dump) later.
> > > 
> > > It supports legacy PCI device using single order-2 page buffer.
> > 
> > Do you mean a legacy virtio device? I don't see why
> > you would want to support pre-1.0 mode.
> > If you drop that, you can drop all cpu_to_virtio things
> > and just use __le accessors.
> 
> I was thinking about the kvmtools which lacks 1.0 support AFAIK.

Unless kvmtools wants to be left behind it has to go 1.0.

>  But
> I think it'd be better to always use __le type anyway.  Will change.
> 
> 
> > 
> > > It uses
> > > two virtqueues - one for (sync) read and another for (async) write.
> > > Since it cannot wait for write finished, it supports up to 128
> > > concurrent IO.  The buffer size is configurable now.
> > > 
> > > Cc: Paolo Bonzini 
> > > Cc: Radim Krčmář 
> > > Cc: "Michael S. Tsirkin" 
> > > Cc: Anthony Liguori 
> > > Cc: Anton Vorontsov 
> > > Cc: Colin Cross 
> > > Cc: Kees Cook 
> > > Cc: Tony Luck 
> > > Cc: Steven Rostedt 
> > > Cc: Ingo Molnar 
> > > Cc: Minchan Kim 
> > > Cc: k...@vger.kernel.org
> > > Cc: qemu-de...@nongnu.org
> > > Cc: virtualizat...@lists.linux-foundation.org
> > > Signed-off-by: Namhyung Kim 
> > > ---
> > >  drivers/virtio/Kconfig |  10 +
> > >  drivers/virtio/Makefile|   1 +
> > >  drivers/virtio/virtio_pstore.c | 417 
> > > +
> > >  include/uapi/linux/Kbuild  |   1 +
> > >  include/uapi/linux/virtio_ids.h|   1 +
> > >  include/uapi/linux/virtio_pstore.h |  74 +++
> > >  6 files changed, 504 insertions(+)
> > >  create mode 100644 drivers/virtio/virtio_pstore.c
> > >  create mode 100644 include/uapi/linux/virtio_pstore.h
> > > 
> > > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> > > index 77590320d44c..8f0e6c796c12 100644
> > > --- a/drivers/virtio/Kconfig
> > > +++ b/drivers/virtio/Kconfig
> > > @@ -58,6 +58,16 @@ config VIRTIO_INPUT
> > >  
> > >If unsure, say M.
> > >  
> > > +config VIRTIO_PSTORE
> > > + tristate "Virtio pstore driver"
> > > + depends on VIRTIO
> > > + depends on PSTORE
> > > + ---help---
> > > +  This driver supports virtio pstore devices to save/restore
> > > +  panic and oops messages on the host.
> > > +
> > > +  If unsure, say M.
> > > +
> > >   config VIRTIO_MMIO
> > >   tristate "Platform bus driver for memory mapped virtio devices"
> > >   depends on HAS_IOMEM && HAS_DMA
> > > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> > > index 41e30e3dc842..bee68cb26d48 100644
> > > --- a/drivers/virtio/Makefile
> > > +++ b/drivers/virtio/Makefile
> > > @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
> > >  virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
> > >  obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
> > >  obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
> > > +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
> > > diff --git a/drivers/virtio/virtio_pstore.c 
> > > b/drivers/virtio/virtio_pstore.c
> > > new file mode 100644
> > > index ..0a63c7db4278
> > > --- /dev/null
> > > +++ b/drivers/virtio/virtio_pstore.c
> > > @@ -0,0 +1,417 @@
> > > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +
> > > +#define VIRT_PSTORE_ORDER    2
> > > +#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
> > > +#define VIRT_PSTORE_NR_REQ   128
> > > +
> > > +struct virtio_pstore {
> > > + struct virtio_device*vdev;
> > > + struct virtqueue*vq[2];
> > 
> > I'd add named fields instead of an array here, vq[0]
> > vq[1] all over the place is hard to read.
> 
> Will change.
> 
> > 
> > > + struct pstore_info   pstore;
> > > + struct virtio_pstore_req req[VIRT_PSTORE_NR_REQ];
> > > + struct virtio_pstore_res res[VIRT_PSTORE_NR_REQ];
> > > + unsigned int req_id;
> > > +
> > > + /* Waiting for host to ack */
> > > + wait_queue_head_t   acked;
> > > + int failed;
> > > +};
> > > +
> > > +#define TYPE_TABLE_ENTRY(_entry) \
> > > + { PSTORE_TYPE_##_entry, VIRTIO_PSTORE_TYPE_##_entry }
> > > +
> > > +struct type_table {
> > > + int pstore;
> > > + u16 virtio;
> > > +} type_table[] = {
> > > + TYPE_TABLE_ENTRY(DMESG),
> > > +};
> > > +
> > > +#undef TYPE_TABLE_ENTRY
> > 
> > let's avoi

Re: [PATCH 00/34] perf clang: Builtin clang and perfhook support

2016-11-14 Thread Wangnan (F)



On 2016/11/15 12:57, Alexei Starovoitov wrote:
> On Mon, Nov 14, 2016 at 8:05 PM, Wang Nan  wrote:
> > This is version 2 of perf builtin clang patch series. Compare to v1,
> > add an exciting feature: jit compiling perf hook functions. This
> > features allows script writer report result through BPF map in a
> > customized way.
>
> looks great.
>
> >   SEC("perfhook:record_start")
> >   void record_start(void *ctx)
> >   {
> > int perf_pid = getpid(), key = G_perf_pid;
> > printf("Start count, perfpid=%d\n", perf_pid);
> > jit_helper__map_update_elem(ctx, &GVALS, &key, &perf_pid, 0);
>
> the name, I think, is too verbose.
> Why not to keep them as bpf_map_update_elem
> even for user space programs?


I can make it shorter by giving it a better name or by using a wrapper like

BPF_MAP(update_elem)

but the one thing I can't do is make perfhook and in-kernel scripts
use a uniform name for these bpf_map functions, because
bpf_map_update_elem is already defined:

"static long (*bpf_map_update_elem)(void *, void *, void *, unsigned long) = (void *)2;\n"
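
A wrapper along these lines could be sketched as below. This is only an
illustration: the two jit_helper__* functions here are toy stand-ins
operating on a plain int array, not the real perf helpers, and
demo_roundtrip() is a made-up name. The point is just that token pasting
lets BPF_MAP(update_elem) resolve to jit_helper__map_update_elem without
touching the existing bpf_map_update_elem definition.

```c
#include <stddef.h>

/* Toy stand-in for the perfhook helper; the "map" is a plain int array. */
static int jit_helper__map_update_elem(void *ctx, void *map, void *key,
				       void *value, unsigned long flags)
{
	(void)ctx;
	(void)flags;
	((int *)map)[*(int *)key] = *(int *)value;
	return 0;
}

/* Toy stand-in for the lookup helper, reading from the same int array. */
static int jit_helper__map_lookup_elem(void *ctx, void *map, void *key,
				       void *value)
{
	(void)ctx;
	*(int *)value = ((int *)map)[*(int *)key];
	return 0;
}

/* Token pasting: BPF_MAP(update_elem) names jit_helper__map_update_elem. */
#define BPF_MAP(op) jit_helper__map_##op

/* Hypothetical demo: store a value under key (0..3) and read it back. */
static int demo_roundtrip(int key, int value)
{
	int map[4] = { 0 };
	int out = -1;

	BPF_MAP(update_elem)(NULL, map, &key, &value, 0);
	BPF_MAP(lookup_elem)(NULL, map, &key, &out);
	return out;
}
```

With such a macro a hook body would call BPF_MAP(update_elem)(ctx, ...)
instead of spelling out the long helper name.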




> >   SEC("perfhook:record_end")
> >   void record_end(void *ctx)
> >   {
> > u64 key = -1, value;
> > while (!jit_helper__map_get_next_key(ctx, &syscall_counter, &key, &key)) {
> > jit_helper__map_lookup_elem(ctx, &syscall_counter, &key, &value);
> > printf("syscall %ld\tcount: %ld\n", (long)key, (long)value);
>
> this loop will be less verbose as well.





Re: [PATCH v18 0/4] Introduce usb charger framework to deal with the usb gadget power negotation

2016-11-14 Thread Peter Chen
On Tue, Nov 15, 2016 at 08:35:13AM +1100, NeilBrown wrote:
> On Mon, Nov 14 2016, Mark Brown wrote:
> 
> > On Mon, Nov 14, 2016 at 03:21:13PM +1100, NeilBrown wrote:
> >> On Thu, Nov 10 2016, Baolin Wang wrote:
> >
> >> > Fourth, we need integrate all charger plugin/out
> >> > event in one framework, not from extcon, maybe type-c in future.
> >
> >> Why not extcon?  Given that a charger is connected by an external
> >> connector, extcon seems like exactly the right thing to use.
> >
> >> Obviously extcon doesn't report the current that was negotiated, but
> >> that is best kept separate.  The battery charger can be advised of the
> >> available current either via extcon or separately via the usb
> >> subsystem.  Don't conflate the two.
> >
> > Conflating the two seems like the whole point here.  We're looking for
> > something that sits between the power supply code and the USB code and
> > tells the power supply code what it's allowed to do which is the result
> > of a combination of physical cable detection and USB protocol.  It seems
> > reasonable that extcon drivers ought to be part of this but it doesn't
> > seem like they are the whole story.
> 
> I don't think "between the power supply code and the USB code" is where
> this thing sits. I think it sits inside the power-supply driver.
> We already have extcon which sits between the phy and the power_supply
> code, and the usb_notifier which sits between the USB code and the
> power supply code.  We don't need another go-between.
> 
> If we have extcon able to deliver reliable information about cable type,
> and if with have the usb notifier able to deliver reliable information
> about negotiated current, and if the power supply manager is able to
> register with the correct extcon and the correct usb notifier, then the
> power supply manager *could* handle all the notifications and make the
> correct determinations and set the current limits itself.  All this
> could be done entirely internally, without the help of any new
> subsystem.
> Do you agree?

Through the USB gadget/PHY framework (usb_gadget.vbus_draw->usb_phy.set_power)
we can get the USB bus information when the device is connected to an SDP, but
enum usb_phy_events lacks some events, such as bus suspend (2mA) and bus
speed (high/super speed, 500mA vs 900mA). Besides, many USB PHYs use the
generic PHY driver now, which lacks the above events and the related notifier.

As for getting the cable type, the key points are detecting vbus and
negotiating the charger type, and these two things differ a lot among
platforms. Extcon has a charger type definition, which is good, and we can
use it. But it requires the device that has the charger detection function
to be an extcon device too, and that device also needs a vbus detection
function. Most PMIC devices are suitable for that, but USB PHYs are not.

Assume wm831x is a power client. According to your suggestion, would its
design look like the following?
In the dts, it would need to be described like this:
&wm831x {
	...
	phy-dev = <&usb_phy>;
	extcon-dev = <&extcon>;
	...
};
And the wm831x driver would get its information through the extcon-dev and
phy-dev notifiers, and it would need knowledge of the current limit for
each specific cable type, but this information comes from the USB (Charger)
specification.
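
To make that concrete, the per-cable-type knowledge each power client
would have to carry looks roughly like the sketch below. All names are
hypothetical (this is neither the extcon nor the usb_phy API). The
500mA/900mA and 2mA figures are the ones mentioned above; the 1500mA
limit for CDP/DCP is my assumption, taken from the Battery Charging 1.2
specification.

```c
/* Hypothetical charger types, not the extcon cable identifiers. */
enum demo_charger_type {
	DEMO_CHARGER_NONE,
	DEMO_CHARGER_SDP,	/* standard downstream port */
	DEMO_CHARGER_CDP,	/* charging downstream port */
	DEMO_CHARGER_DCP,	/* dedicated charging port */
};

/* Hypothetical bus speeds, only the two cases discussed in this mail. */
enum demo_bus_speed {
	DEMO_SPEED_HIGH,
	DEMO_SPEED_SUPER,
};

/*
 * Return the maximum current in mA that a power client may draw.
 * For an SDP the limit depends on bus state (suspend) and bus speed,
 * which is exactly the information that needs to come from the USB side.
 */
static unsigned int demo_current_limit_ma(enum demo_charger_type type,
					  enum demo_bus_speed speed,
					  int suspended)
{
	switch (type) {
	case DEMO_CHARGER_SDP:
		if (suspended)
			return 2;	/* bus suspend figure used above */
		return speed == DEMO_SPEED_SUPER ? 900 : 500;
	case DEMO_CHARGER_CDP:
	case DEMO_CHARGER_DCP:
		return 1500;	/* assumption: BC 1.2 charging port limit */
	default:
		return 0;	/* nothing attached, draw nothing */
	}
}
```

Duplicating a table like this in every power supply driver is what the
refined notification is meant to avoid.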

Your suggestion tries to use the current notifications to get the
information to the power client; this patch set tries to keep
these two notifications in a new framework, and the power client
gets a refined notification from this new framework.

My biggest concern with your solution is the extcon device: it may
not be a universal solution. Do the current frameworks have a way to
get the cable type (USB charger type)? If not, we may need a new
framework.

-- 

Best Regards,
Peter Chen


Re: [PATCH v2 05/12] mm: thp: add core routines for thp/pmd migration

2016-11-14 Thread Naoya Horiguchi
On Mon, Nov 14, 2016 at 02:45:03PM +0300, Kirill A. Shutemov wrote:
> On Tue, Nov 08, 2016 at 08:31:50AM +0900, Naoya Horiguchi wrote:
> > This patch prepares thp migration's core code. These code will be open when
> > unmap_and_move() stops unconditionally splitting thp and get_new_page() 
> > starts
> > to allocate destination thps.
> > 
> > Signed-off-by: Naoya Horiguchi 
> > ---
> > ChangeLog v1 -> v2:
> > - support pte-mapped thp, doubly-mapped thp
> > ---
> >  arch/x86/include/asm/pgtable_64.h |   2 +
> >  include/linux/swapops.h   |  61 +++
> >  mm/huge_memory.c  | 154 
> > ++
> >  mm/migrate.c  |  44 ++-
> >  mm/pgtable-generic.c  |   3 +-
> >  5 files changed, 262 insertions(+), 2 deletions(-)
> > 
> > diff --git 
> > v4.9-rc2-mmotm-2016-10-27-18-27/arch/x86/include/asm/pgtable_64.h 
> > v4.9-rc2-mmotm-2016-10-27-18-27_patched/arch/x86/include/asm/pgtable_64.h
> > index 1cc82ec..3a1b48e 100644
> > --- v4.9-rc2-mmotm-2016-10-27-18-27/arch/x86/include/asm/pgtable_64.h
> > +++ 
> > v4.9-rc2-mmotm-2016-10-27-18-27_patched/arch/x86/include/asm/pgtable_64.h
> > @@ -167,7 +167,9 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
> >  ((type) << (SWP_TYPE_FIRST_BIT)) \
> >  | ((offset) << SWP_OFFSET_FIRST_BIT) })
> >  #define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val((pte)) 
> > })
> > +#define __pmd_to_swp_entry(pte)((swp_entry_t) { pmd_val((pmd)) 
> > })
> >  #define __swp_entry_to_pte(x)  ((pte_t) { .pte = (x).val })
> > +#define __swp_entry_to_pmd(x)  ((pmd_t) { .pmd = (x).val })
> >  
> >  extern int kern_addr_valid(unsigned long addr);
> >  extern void cleanup_highmap(void);
> > diff --git v4.9-rc2-mmotm-2016-10-27-18-27/include/linux/swapops.h 
> > v4.9-rc2-mmotm-2016-10-27-18-27_patched/include/linux/swapops.h
> > index 5c3a5f3..b6b22a2 100644
> > --- v4.9-rc2-mmotm-2016-10-27-18-27/include/linux/swapops.h
> > +++ v4.9-rc2-mmotm-2016-10-27-18-27_patched/include/linux/swapops.h
> > @@ -163,6 +163,67 @@ static inline int is_write_migration_entry(swp_entry_t 
> > entry)
> >  
> >  #endif
> >  
> > +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> > +extern void set_pmd_migration_entry(struct page *page,
> > +   struct vm_area_struct *vma, unsigned long address);
> > +
> > +extern int remove_migration_pmd(struct page *new, pmd_t *pmd,
> > +   struct vm_area_struct *vma, unsigned long addr, void *old);
> > +
> > +extern void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t *pmd);
> > +
> > +static inline swp_entry_t pmd_to_swp_entry(pmd_t pmd)
> > +{
> > +   swp_entry_t arch_entry;
> > +
> > +   arch_entry = __pmd_to_swp_entry(pmd);
> > +   return swp_entry(__swp_type(arch_entry), __swp_offset(arch_entry));
> > +}
> > +
> > +static inline pmd_t swp_entry_to_pmd(swp_entry_t entry)
> > +{
> > +   swp_entry_t arch_entry;
> > +
> > +   arch_entry = __swp_entry(swp_type(entry), swp_offset(entry));
> > +   return __swp_entry_to_pmd(arch_entry);
> > +}
> > +
> > +static inline int is_pmd_migration_entry(pmd_t pmd)
> > +{
> > +   return !pmd_present(pmd) && is_migration_entry(pmd_to_swp_entry(pmd));
> > +}
> > +#else
> > +static inline void set_pmd_migration_entry(struct page *page,
> > +   struct vm_area_struct *vma, unsigned long address)
> > +{
> 
> VM_BUG()? Or BUILD_BUG()?

These should be compiled out, so BUILD_BUG() seems better to me.
The 3 routines below will be done in the same manner.

> > +}
> > +
> > +static inline int remove_migration_pmd(struct page *new, pmd_t *pmd,
> > +   struct vm_area_struct *vma, unsigned long addr, void *old)
> > +{
> > +   return 0;
> 
> Ditto.
> 
> > +}
> > +
> > +static inline void pmd_migration_entry_wait(struct mm_struct *m, pmd_t *p) 
> > { }
> > +
> > +static inline swp_entry_t pmd_to_swp_entry(pmd_t pmd)
> > +{
> > +   return swp_entry(0, 0);
> 
> Ditto.
> 
> > +}
> > +
> > +static inline pmd_t swp_entry_to_pmd(swp_entry_t entry)
> > +{
> > +   pmd_t pmd = {};
> 
> Ditto.
> 
> > +   return pmd;
> > +}
> > +
> > +static inline int is_pmd_migration_entry(pmd_t pmd)
> > +{
> > +   return 0;
> > +}
> > +#endif
> > +
> >  #ifdef CONFIG_MEMORY_FAILURE
> >  
> >  extern atomic_long_t num_poisoned_pages __read_mostly;
> > diff --git v4.9-rc2-mmotm-2016-10-27-18-27/mm/huge_memory.c 
> > v4.9-rc2-mmotm-2016-10-27-18-27_patched/mm/huge_memory.c
> > index 0509d17..b3022b3 100644
> > --- v4.9-rc2-mmotm-2016-10-27-18-27/mm/huge_memory.c
> > +++ v4.9-rc2-mmotm-2016-10-27-18-27_patched/mm/huge_memory.c
> > @@ -2310,3 +2310,157 @@ static int __init split_huge_pages_debugfs(void)
> >  }
> >  late_initcall(split_huge_pages_debugfs);
> >  #endif
> > +
> > +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> > +void set_pmd_migration_entry(struct page *page, struct vm_area_struct *vma,
> > +   unsign

Re: [PATCH 00/34] perf clang: Builtin clang and perfhook support

2016-11-14 Thread Alexei Starovoitov
On Mon, Nov 14, 2016 at 8:05 PM, Wang Nan  wrote:
> This is version 2 of perf builtin clang patch series. Compare to v1,
> add an exciting feature: jit compiling perf hook functions. This
> features allows script writer report result through BPF map in a
> customized way.

looks great.

>   SEC("perfhook:record_start")
>   void record_start(void *ctx)
>   {
> int perf_pid = getpid(), key = G_perf_pid;
> printf("Start count, perfpid=%d\n", perf_pid);
> jit_helper__map_update_elem(ctx, &GVALS, &key, &perf_pid, 0);

the name, I think, is too verbose.
Why not to keep them as bpf_map_update_elem
even for user space programs?

>   SEC("perfhook:record_end")
>   void record_end(void *ctx)
>   {
> u64 key = -1, value;
> while (!jit_helper__map_get_next_key(ctx, &syscall_counter, &key, 
> &key)) {
> jit_helper__map_lookup_elem(ctx, &syscall_counter, &key, 
> &value);
> printf("syscall %ld\tcount: %ld\n", (long)key, (long)value);

this loop will be less verbose as well.


kvm: GPF in kvm_ioapic_set_irq

2016-11-14 Thread Dmitry Vyukov
Hello,

The following program triggers GPF in kvm_ioapic_set_irq:
https://gist.githubusercontent.com/dvyukov/9070377a9fdd685d4496972363b35ec5/raw/bfa9fc85736b6d4993138b945ddfa1b7f3432afc/gistfile1.txt


general protection fault:  [#1] SMP DEBUG_PAGEALLOC KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 3 PID: 11923 Comm: kworker/3:2 Not tainted 4.9.0-rc5+ #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: events irqfd_inject
task: 88006a06c7c0 task.stack: 880068638000
RIP: 0010:[]  [] __lock_acquire+0xb35/0x3380 kernel/locking/lockdep.c:3221
RSP: :88006863ea20  EFLAGS: 00010006
RAX: dc00 RBX: dc00 RCX: 
RDX: 0039 RSI:  RDI: 11000d0c7d9e
RBP: 88006863ef58 R08: 0001 R09: 
R10: 01c8 R11:  R12: 88006a06c7c0
R13: 0001 R14: 8baab1a0 R15: 0001
FS:  () GS:88006d10() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 004abdd0 CR3: 3e2f2000 CR4: 26e0
Stack:
 894d0098 11000d0c7d56 88006863ecd0 dc00
 88006a06c7c0  88006863ecf8 0082
  815dd7c1  
Call Trace:
 [] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3746
 [< inline >] __raw_spin_lock include/linux/spinlock_api_smp.h:144
 [] _raw_spin_lock+0x38/0x50 kernel/locking/spinlock.c:151
 [< inline >] spin_lock include/linux/spinlock.h:302
 [] kvm_ioapic_set_irq+0x4c/0x100 arch/x86/kvm/ioapic.c:379
 [] kvm_set_ioapic_irq+0x8f/0xc0 arch/x86/kvm/irq_comm.c:52
 [] kvm_set_irq+0x239/0x640 arch/x86/kvm/../../../virt/kvm/irqchip.c:101
 [] irqfd_inject+0xb4/0x150 arch/x86/kvm/../../../virt/kvm/eventfd.c:60
 [] process_one_work+0xb40/0x1ba0 kernel/workqueue.c:2096
 [] worker_thread+0x214/0x18a0 kernel/workqueue.c:2230
 [] kthread+0x328/0x3e0 kernel/kthread.c:209
 [] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
Code: e9 03 f3 48 ab 48 81 c4 10 05 00 00 44 89 e8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 4c 89 d2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 00 26 00 00 49 81 3a c0 74 e1 8a 41 bf 00 00
RIP  [] __lock_acquire+0xb35/0x3380 kernel/locking/lockdep.c:3221
 RSP 
---[ end trace 9e23ceae3896b509 ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
reboot: cpu_has_vmx: ecx=80a02021 1

On commit a25f0944ba9b1d8a6813fd6f1a86f1bd59ac25a6 (Nov 13).


Re: [PATCH v3 05/46] perf/x86/intel/cmt: add per-package locks

2016-11-14 Thread David Carrillo-Cisneros
> Also, "monr" is a horribly 'word'.

What makes it so bad? (honest question). Some alternatives:

- res_mon, resm, rmon (Resource Monitor)
- rmnode, rnode, rmon_node (Resource Monitoring node, similar to
Resource Monitor ID, but to reflect that it's a node in a
tree/hierarchy)
 - rdt_mon, rdtm (something with RDT + Monitoring)
 - ment, rdt_ment (Monitoring Entity)

Other suggestions?

Thanks,
David


Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver

2016-11-14 Thread Namhyung Kim
Hi Michael,

On Thu, Nov 10, 2016 at 06:39:55PM +0200, Michael S. Tsirkin wrote:
> On Sat, Aug 20, 2016 at 05:07:42PM +0900, Namhyung Kim wrote:
> > The virtio pstore driver provides interface to the pstore subsystem so
> > that the guest kernel's log/dump message can be saved on the host
> > machine.  Users can access the log file directly on the host, or on the
> > guest at the next boot using pstore filesystem.  It currently deals with
> > kernel log (printk) buffer only, but we can extend it to have other
> > information (like ftrace dump) later.
> > 
> > It supports legacy PCI device using single order-2 page buffer.
> 
> Do you mean a legacy virtio device? I don't see why
> you would want to support pre-1.0 mode.
> If you drop that, you can drop all cpu_to_virtio things
> and just use __le accessors.

I was thinking about the kvmtools which lacks 1.0 support AFAIK.  But
I think it'd be better to always use __le type anyway.  Will change.
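
For reference, "always little-endian" means the layout below, no matter
what the guest byte order is. This is a user-space sketch with made-up
helper names, not the kernel's __le32 type or its cpu_to_le32() /
le32_to_cpu() accessors, which would be used in the driver itself.

```c
#include <stdint.h>

/* Store a 32-bit value as little-endian bytes, independent of host order. */
static void demo_put_le32(uint8_t *buf, uint32_t v)
{
	buf[0] = (uint8_t)(v & 0xff);
	buf[1] = (uint8_t)((v >> 8) & 0xff);
	buf[2] = (uint8_t)((v >> 16) & 0xff);
	buf[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit value from little-endian bytes. */
static uint32_t demo_get_le32(const uint8_t *buf)
{
	return (uint32_t)buf[0] |
	       ((uint32_t)buf[1] << 8) |
	       ((uint32_t)buf[2] << 16) |
	       ((uint32_t)buf[3] << 24);
}

/* Hypothetical helper: first byte on the wire for a given value. */
static uint8_t demo_first_wire_byte(uint32_t v)
{
	uint8_t buf[4];

	demo_put_le32(buf, v);
	return buf[0];
}

/* Hypothetical helper: encode then decode, should return v unchanged. */
static uint32_t demo_le32_roundtrip(uint32_t v)
{
	uint8_t buf[4];

	demo_put_le32(buf, v);
	return demo_get_le32(buf);
}
```

The least significant byte always goes first on the wire, so the host
side never has to guess the guest's endianness.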


> 
> > It uses
> > two virtqueues - one for (sync) read and another for (async) write.
> > Since it cannot wait for write finished, it supports up to 128
> > concurrent IO.  The buffer size is configurable now.
> > 
> > Cc: Paolo Bonzini 
> > Cc: Radim Krčmář 
> > Cc: "Michael S. Tsirkin" 
> > Cc: Anthony Liguori 
> > Cc: Anton Vorontsov 
> > Cc: Colin Cross 
> > Cc: Kees Cook 
> > Cc: Tony Luck 
> > Cc: Steven Rostedt 
> > Cc: Ingo Molnar 
> > Cc: Minchan Kim 
> > Cc: k...@vger.kernel.org
> > Cc: qemu-de...@nongnu.org
> > Cc: virtualizat...@lists.linux-foundation.org
> > Signed-off-by: Namhyung Kim 
> > ---
> >  drivers/virtio/Kconfig |  10 +
> >  drivers/virtio/Makefile|   1 +
> >  drivers/virtio/virtio_pstore.c | 417 
> > +
> >  include/uapi/linux/Kbuild  |   1 +
> >  include/uapi/linux/virtio_ids.h|   1 +
> >  include/uapi/linux/virtio_pstore.h |  74 +++
> >  6 files changed, 504 insertions(+)
> >  create mode 100644 drivers/virtio/virtio_pstore.c
> >  create mode 100644 include/uapi/linux/virtio_pstore.h
> > 
> > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> > index 77590320d44c..8f0e6c796c12 100644
> > --- a/drivers/virtio/Kconfig
> > +++ b/drivers/virtio/Kconfig
> > @@ -58,6 +58,16 @@ config VIRTIO_INPUT
> >  
> >  If unsure, say M.
> >  
> > +config VIRTIO_PSTORE
> > +   tristate "Virtio pstore driver"
> > +   depends on VIRTIO
> > +   depends on PSTORE
> > +   ---help---
> > +This driver supports virtio pstore devices to save/restore
> > +panic and oops messages on the host.
> > +
> > +If unsure, say M.
> > +
> >   config VIRTIO_MMIO
> > tristate "Platform bus driver for memory mapped virtio devices"
> > depends on HAS_IOMEM && HAS_DMA
> > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> > index 41e30e3dc842..bee68cb26d48 100644
> > --- a/drivers/virtio/Makefile
> > +++ b/drivers/virtio/Makefile
> > @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
> >  virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
> >  obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
> >  obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
> > +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
> > diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c
> > new file mode 100644
> > index ..0a63c7db4278
> > --- /dev/null
> > +++ b/drivers/virtio/virtio_pstore.c
> > @@ -0,0 +1,417 @@
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#define VIRT_PSTORE_ORDER    2
> > +#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
> > +#define VIRT_PSTORE_NR_REQ   128
> > +
> > +struct virtio_pstore {
> > +   struct virtio_device*vdev;
> > +   struct virtqueue*vq[2];
> 
> I'd add named fields instead of an array here, vq[0]
> vq[1] all over the place is hard to read.

Will change.

> 
> > +   struct pstore_info   pstore;
> > +   struct virtio_pstore_req req[VIRT_PSTORE_NR_REQ];
> > +   struct virtio_pstore_res res[VIRT_PSTORE_NR_REQ];
> > +   unsigned int req_id;
> > +
> > +   /* Waiting for host to ack */
> > +   wait_queue_head_t   acked;
> > +   int failed;
> > +};
> > +
> > +#define TYPE_TABLE_ENTRY(_entry)   \
> > +   { PSTORE_TYPE_##_entry, VIRTIO_PSTORE_TYPE_##_entry }
> > +
> > +struct type_table {
> > +   int pstore;
> > +   u16 virtio;
> > +} type_table[] = {
> > +   TYPE_TABLE_ENTRY(DMESG),
> > +};
> > +
> > +#undef TYPE_TABLE_ENTRY
> 
> let's avoid macros for now pls. In fact, I would just open-code this
> in to_virtio_type below. We can always change our minds later if
> lots of types are added.

Yep.
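
An open-coded version could look roughly like the sketch below. The
DEMO_* constants and their values are placeholders for illustration
only; they are not the PSTORE_TYPE_* or VIRTIO_PSTORE_TYPE_* values from
the patch, and the vps argument is dropped to keep the sketch
self-contained.

```c
#include <stdint.h>

/* Placeholder pstore-side type ids (values are assumptions). */
enum demo_pstore_type {
	DEMO_PSTORE_TYPE_DMESG = 0,
	DEMO_PSTORE_TYPE_OTHER = 1,
};

/* Placeholder virtio-side type ids (values are assumptions). */
#define DEMO_VIRTIO_PSTORE_TYPE_UNKNOWN	0
#define DEMO_VIRTIO_PSTORE_TYPE_DMESG	1

/* Open-coded mapping: no table and no macro, one case per supported type. */
static uint16_t demo_to_virtio_type(enum demo_pstore_type type)
{
	switch (type) {
	case DEMO_PSTORE_TYPE_DMESG:
		return DEMO_VIRTIO_PSTORE_TYPE_DMESG;
	default:
		return DEMO_VIRTIO_PSTORE_TYPE_UNKNOWN;
	}
}
```

If more types get added later, the switch can be turned back into a
table.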

> 
> > +
> > +
> 
> single empty line pls

Ok.

> 
> > +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id 
> > type)
> > +{
> > +   unsigned int i;
> > +
> > +   for (i = 0; i 

Disable all network protocols on an interface?

2016-11-14 Thread Ed Swierk
I have a Linux kernel 4.4 system hosting a number of kvm VMs. Physical 
interface eth0 connects to an 802.1Q trunk port on an external switch. Each VM 
has a virtual interface (e1000 or virtio-net) connected to the physical NIC 
through a macvtap interface and a VLAN interface; traffic between the external 
switch and the host is tagged with a per-VM tag. The only logic is 
demultiplexing incoming traffic by VLAN tag and stripping the tag, and adding 
the tag for outgoing traffic. Other than that, the eth0-VM datapath is a dumb 
pipe.

eth0 is assigned an IP address for host applications to send and receive 
untagged packets. For example, here's the setup with 2 VMs.

       +- (untagged) 192.168.0.2
  eth0 -+- (tag 1) --- eth0.1 --- macvtap1 --- VM1
       +- (tag 2) --- eth0.2 --- macvtap2 --- VM2

Various iptables rules filter the untagged packets received for host 
applications. The last rule in the INPUT chain logs incoming packets that don't 
match earlier rules:

  -A INPUT -m limit --limit 10/min -j LOG --log-prefix FilterInput

This all works, but I see occasional FilterInput messages for traffic received 
on eth0.1 and eth0.2: so far, only DHCP packets with destination MAC address 
ff:ff:ff:ff:ff:ff.

  FilterInput IN=eth0.1 OUT= MAC=ff:ff:ff:ff:ff:ff:00:01:02:03:04:05:08:00 
SRC=0.0.0.0 DST=255.255.255.255 LEN=328 TOS=0x10 PREC=0x00 TTL=128 ID=0 
PROTO=UDP SPT=68 DPT=67 LEN=308

Even though these are IP packets, I naively expect packets received on a VLAN
interface lacking an IP address to be either consumed by the attached macvtap or
dropped before they trigger an iptables filter INPUT rule. It's a bit alarming
to see packets destined for a VM being processed at all by the host IP stack.

Digging through the code, I find that the core packet receive function 
__netif_receive_skb_core() first gives master devices like bridges and 
macvlans/macvtaps a chance to consume the packet; otherwise the packet gets 
handled by all installed protocols like IPv4. The packet gets pretty far down 
the IP receive process before it's discovered that there's nowhere to route it 
to, and no local sockets to deliver it to. The iptables INPUT chain is invoked 
well before that happens. (As far as I can tell, there's no explicit check in 
the IP receive code whether a local interface has an IP address.)

The macvlan's rx_handler definitively consumes or drops unicast packets,
depending on the destination MAC address. But for broadcast packets, it passes
the packet to the attached VM interface and also tells the core receive
function to continue processing it. Presumably this is to allow a macvlan to 
attach to one or more VMs as well as have a local IP address.

The logic in the bridge driver is a bit different: it consumes all packets from 
the slave interface. This makes sense as only the bridge master interface can 
be assigned a local IP address.

However in my application, I'm setting up the macvtap interfaces in passthrough 
mode, which precludes assigning a local IP address, just like a bridge slave. 
So it stands to reason that for a macvlan in passthrough mode, its rx_handler 
should consume or drop all packets, and not allow broadcast packets to also be 
handled locally.
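
To spell out the behavior I'm after, here is a small user-space model of
the decision. It only mirrors the description above, with made-up names
(demo_*); it is not the kernel's macvlan_handle_frame(), and "consumed"
here lumps together delivery to the VM and dropping.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the rx_handler outcomes discussed above. */
enum demo_rx_result {
	DEMO_RX_CONSUMED,	/* handled or dropped, host protocols never see it */
	DEMO_RX_PASS,		/* core receive path continues, e.g. into IPv4 */
};

/* Multicast/broadcast MACs have the low bit of the first octet set. */
static bool demo_is_multicast(unsigned char first_octet)
{
	return (first_octet & 0x01) != 0;
}

/*
 * Model of the desired behavior: multicast frames continue to the host
 * stack only when the port is NOT in passthrough mode. Everything else
 * is consumed (or dropped) by the handler.
 */
static enum demo_rx_result demo_macvlan_rx(unsigned char first_octet,
					   bool passthru)
{
	if (demo_is_multicast(first_octet) && !passthru)
		return DEMO_RX_PASS;
	return DEMO_RX_CONSUMED;
}

/* Convenience wrapper: does the host IP stack get to see this frame? */
static bool demo_host_stack_sees(unsigned char first_octet, bool passthru)
{
	return demo_macvlan_rx(first_octet, passthru) == DEMO_RX_PASS;
}
```

In this model a broadcast frame (first octet 0xff) reaches the host
stack on a regular macvlan port but not on a passthrough one.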

This one-line change seems to do the trick:

--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -411,7 +411,7 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 	rx_handler_result_t handle_res;
 
 	port = macvlan_port_get_rcu(skb->dev);
-	if (is_multicast_ether_addr(eth->h_dest)) {
+	if (is_multicast_ether_addr(eth->h_dest) && !port->passthru) {
 		skb = ip_check_defrag(dev_net(skb->dev), skb, IP_DEFRAG_MACVLAN);
 		if (!skb)
 			return RX_HANDLER_CONSUMED;

Well, mostly. I still see FilterInput log messages in the brief window between 
creating the VLAN interface and attaching the macvtap to it, since there's no 
rx_handler to consume them. Hooking the VLAN interface to a bridge rather than 
a macvtap suppresses local IP processing on the slave but enables it on the 
bridge master interface. Apparently any non-slave interface can handle IP 
traffic to some extent, even if it doesn't have an IP address.

I worry that allowing any IP processing at all on eth0-VM traffic is a
potential security hole, and that I'm one configuration typo away from letting
one VM's traffic leak into another VM or a host application, and vice versa.
And logging those FilterInput messages for non-local traffic just looks like
sloppy security.

Is there some way to stop all local protocols from handling packets received on 
an interface--a protocol-agnostic equivalent of 
net.ipv6.conf.INTF.disable_ipv6? Would it be reasonable to implement one?

--Ed


mm: BUG in munlock_vma_pages_range

2016-11-14 Thread Dmitry Vyukov
Hello,

The following program triggers BUG in munlock_vma_pages_range:

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#define _GNU_SOURCE
#include <sys/mman.h>

int main()
{
  mmap((void*)0x20105000ul, 0xc0ul, 0x2ul, 0x2172ul, -1, 0);
  mremap((void*)0x201fd000ul, 0x4000ul, 0xc0ul, 0x3ul, 0x203ful);
  return 0;
}


page:ea0001847cc0 count:0 mapcount:1 mapping:dead0400
index:0x20400 compound_mapcount: 1
flags: 0x5fffc00()
page dumped because: VM_BUG_ON_PAGE(PageMlocked(page))
[ cut here ]
kernel BUG at mm/mlock.c:460!
invalid opcode:  [#1] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 2 PID: 6526 Comm: a.out Not tainted 4.9.0-rc5+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: 8800681ca0c0 task.stack: 8800637f
RIP: 0010:[]  [] munlock_vma_pages_range+0xcc2/0x1010 mm/mlock.c:460
RSP: 0018:8800637f7178  EFLAGS: 00010292
RAX:  RBX: ea0001847cc0 RCX: 
RDX:  RSI: 88006d016e08 RDI: ed000c6fee20
RBP: 8800637f7638 R08: 0001 R09: 
R10: dc00 R11: 0001 R12: ea0001847ce0
R13: ea000184 R14: dc00 R15: 8800637f7610
FS:  () GS:88006d00() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 004b2080 CR3: 64f87000 CR4: 06e0
Stack:
 41b58ab3 894cfa90 8800637f74f0 8800637f7578
 0005 88007fff8000 11000c6fee42 8800637f74b0
 2040 00050c6fee41 ea000196c55c ed000c6feeaf
Call Trace:
 [< inline >] munlock_vma_pages_all mm/internal.h:277
 [] exit_mmap+0x1bb/0x4e0 mm/mmap.c:2924
 [< inline >] __mmput kernel/fork.c:866
 [] mmput+0x20e/0x4c0 kernel/fork.c:888
 [< inline >] exit_mm kernel/exit.c:512
 [] do_exit+0x960/0x2640 kernel/exit.c:815
 [] do_group_exit+0x14e/0x420 kernel/exit.c:931
 [< inline >] SYSC_exit_group kernel/exit.c:942
 [] SyS_exit_group+0x22/0x30 kernel/exit.c:940
 [] entry_SYSCALL_64_fastpath+0x23/0xc6 arch/x86/entry/entry_64.S:209
Code: 0b e8 53 2e d8 ff 48 c7 c6 c0 32 31 88 48 89 df e8 54 2e fd ff 0f 0b e8 3d 2e d8 ff 48 c7 c6 80 35 31 88 48 89 df e8 3e 2e fd ff <0f> 0b 48 89 85 a8 fb ff ff e8 20 2e d8 ff 48 8b 85 a8 fb ff ff
RIP  [] munlock_vma_pages_range+0xcc2/0x1010 mm/mlock.c:460
 RSP 
---[ end trace 694dc6462f524cf9 ]---
Fixing recursive fault but reboot is needed!


On commit a25f0944ba9b1d8a6813fd6f1a86f1bd59ac25a6 (Nov 13).


Re: [PATCH 00/34] perf clang: Builtin clang and perfhook support

2016-11-14 Thread Wangnan (F)



On 2016/11/15 12:05, Wang Nan wrote:

   $ sudo -s
   # ulimit -l unlimited
   # perf record -e ./count_syscalls.c echo "Haha"
   Start count, perfpid=25209
   Haha
   [ perf record: Woken up 1 times to write data ]
   syscall 8    count: 6
   syscall 11   count: 1
   syscall 4    count: 6
   syscall 21   count: 1
   syscall 5    count: 3
   syscall 231  count: 1
   syscall 45   count: 3
   syscall 0    count: 24
   syscall 257  count: 1
   syscall 59   count: 4
   syscall 23   count: 9
   syscall 78   count: 2
   syscall 41   count: 4
   syscall 72   count: 8
   syscall 10   count: 3
   syscall 321  count: 1
   syscall 298  count: 7
   syscall 16   count: 21
   syscall 9    count: 16
   syscall 1    count: 114
   syscall 12   count: 3
   syscall 14   count: 35
   syscall 158  count: 1
   syscall 2    count: 15
   syscall 7    count: 18
   syscall 3    count: 11
   [ perf record: Captured and wrote 0.011 MB perf.data ]


Note that this example counts a system-wide syscall histogram, not
only the 'echo' process. The in-kernel BPF script doesn't know the pid
of 'echo', so it can't filter based on it. I'm planning to add more perf
hook points to pass information like this.

Thank you.



Re: [PATCH] tpm: drop chip->is_open and chip->duration_adjusted

2016-11-14 Thread Jason Gunthorpe
On Mon, Nov 14, 2016 at 03:44:58PM -0800, Jarkko Sakkinen wrote:
> Use atomic bitops for chip->flags so that we do not need chip->is_open
> and chip->duration_adjusted anymore.

I don't know if it's a really great idea to use atomic bit ops for
things that do not need to be atomic. It makes the locking scheme
less clear. is_open is genuinely different since it relies on the
atomic for correctness.

Merging is_duration makes lots of sense though

Jason


[PATCH 06/34] tools lib bpf: Add private field for bpf_object

2016-11-14 Thread Wang Nan
Like other classes defined in libbpf.h (like map and program), allow
the 'object' class to have its own private data.

Signed-off-by: Wang Nan 
Cc: Alexei Starovoitov 
Cc: Arnaldo Carvalho de Melo 
Cc: Li Zefan 
---
 tools/lib/bpf/libbpf.c | 23 +++
 tools/lib/bpf/libbpf.h |  5 +
 2 files changed, 28 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 96a2b2f..866d5cd 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -229,6 +229,10 @@ struct bpf_object {
 * all objects.
 */
struct list_head list;
+
+   void *priv;
+   bpf_object_clear_priv_t clear_priv;
+
char path[];
 };
 #define obj_elf_valid(o)   ((o)->efile.elf)
@@ -1229,6 +1233,9 @@ void bpf_object__close(struct bpf_object *obj)
if (!obj)
return;
 
+   if (obj->clear_priv)
+   obj->clear_priv(obj, obj->priv);
+
bpf_object__elf_finish(obj);
bpf_object__unload(obj);
 
@@ -1282,6 +1289,22 @@ unsigned int bpf_object__kversion(struct bpf_object *obj)
return obj ? obj->kern_version : 0;
 }
 
+int bpf_object__set_priv(struct bpf_object *obj, void *priv,
+bpf_object_clear_priv_t clear_priv)
+{
+   if (obj->priv && obj->clear_priv)
+   obj->clear_priv(obj, obj->priv);
+
+   obj->priv = priv;
+   obj->clear_priv = clear_priv;
+   return 0;
+}
+
+void *bpf_object__priv(struct bpf_object *obj)
+{
+   return obj ? obj->priv : ERR_PTR(-EINVAL);
+}
+
 struct bpf_program *
 bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
 {
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index dd7a513..0c0b012 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -79,6 +79,11 @@ struct bpf_object *bpf_object__next(struct bpf_object *prev);
 (pos) != NULL; \
 (pos) = (tmp), (tmp) = bpf_object__next(tmp))
 
+typedef void (*bpf_object_clear_priv_t)(struct bpf_object *, void *);
+int bpf_object__set_priv(struct bpf_object *obj, void *priv,
+bpf_object_clear_priv_t clear_priv);
+void *bpf_object__priv(struct bpf_object *prog);
+
 /* Accessors of bpf_program. */
 struct bpf_program;
 struct bpf_program *bpf_program__next(struct bpf_program *prog,
-- 
2.10.1
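
For readers who want to try the ownership pattern from this patch in isolation: below is a minimal sketch of a private pointer paired with a clear callback. It does not use libbpf; 'struct demo_object' and the demo_* functions are hypothetical stand-ins for illustration, and the NULL check in demo_object__set_priv() is an addition that the patch itself does not have.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bpf_object; not the libbpf type. */
struct demo_object;
typedef void (*demo_clear_priv_t)(struct demo_object *, void *);

struct demo_object {
	void *priv;
	demo_clear_priv_t clear_priv;
};

/* Install new private data, releasing any previous data first. */
static int demo_object__set_priv(struct demo_object *obj, void *priv,
				 demo_clear_priv_t clear_priv)
{
	if (!obj)
		return -1;	/* extra check, not in the patch */

	if (obj->priv && obj->clear_priv)
		obj->clear_priv(obj, obj->priv);

	obj->priv = priv;
	obj->clear_priv = clear_priv;
	return 0;
}

/* Mirror of the close path: give the owner a chance to free its data. */
static void demo_object__close(struct demo_object *obj)
{
	if (!obj)
		return;

	if (obj->clear_priv)
		obj->clear_priv(obj, obj->priv);

	obj->priv = NULL;
	obj->clear_priv = NULL;
}

/* Example callback: frees the data and counts how often it ran. */
static int cleared_count;

static void demo_free_priv(struct demo_object *obj, void *priv)
{
	assert(obj != NULL);
	free(priv);
	cleared_count++;
}
```

The point of the callback is that the object never needs to know how the private data was allocated: whoever installs the pointer also supplies the function that releases it, both when the data is replaced and when the object is closed.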



[PATCH 11/34] tools build: Add feature detection for LLVM

2016-11-14 Thread Wang Nan
Check if a basic LLVM compiling environment is ready.

Use llvm-config to detect include and library directories. Avoid using
'llvm-config --cxxflags' because its result contains some unwanted flags
like --sysroot (if LLVM is built by yocto).

Use '?=' to set LLVM_CONFIG, so explicitly passing LLVM_CONFIG to make
overrides it.

Use 'llvm-config --libs BPF' to check if the BPF backend is compiled in.
Since BPF bytecode is now the only required backend, there is no need to
waste time linking llvm and clang if the BPF backend is missing. This also
introduces an implicit requirement that LLVM be new enough: old
LLVM doesn't support the BPF backend.

Signed-off-by: Wang Nan 
Cc: Alexei Starovoitov 
Cc: He Kuang 
Cc: Jiri Olsa 
Cc: Zefan Li 
Cc: pi3or...@163.com
Link: http://lkml.kernel.org/r/1474874832-134786-4-git-send-email-wangn...@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/build/feature/Makefile  | 8 
 tools/build/feature/test-llvm.cpp | 8 
 2 files changed, 16 insertions(+)
 create mode 100644 tools/build/feature/test-llvm.cpp

diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 8f668bc..c09de59 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -55,6 +55,7 @@ FILES := $(addprefix $(OUTPUT),$(FILES))
 CC := $(CROSS_COMPILE)gcc -MD
 CXX := $(CROSS_COMPILE)g++ -MD
 PKG_CONFIG := $(CROSS_COMPILE)pkg-config
+LLVM_CONFIG ?= llvm-config
 
 all: $(FILES)
 
@@ -229,6 +230,13 @@ $(OUTPUT)test-cxx.bin:
 $(OUTPUT)test-jvmti.bin:
$(BUILD)
 
+$(OUTPUT)test-llvm.bin:
+   $(BUILDXX) -std=gnu++11 \
+   -I$(shell $(LLVM_CONFIG) --includedir)  \
+   -L$(shell $(LLVM_CONFIG) --libdir)  \
+   $(shell $(LLVM_CONFIG) --libs Core BPF) \
+   $(shell $(LLVM_CONFIG) --system-libs)
+
 -include $(OUTPUT)*.d
 
 ###
diff --git a/tools/build/feature/test-llvm.cpp b/tools/build/feature/test-llvm.cpp
new file mode 100644
index 000..d8d2cee
--- /dev/null
+++ b/tools/build/feature/test-llvm.cpp
@@ -0,0 +1,8 @@
+#include "llvm/Support/ManagedStatic.h"
+#include "llvm/Support/raw_ostream.h"
+int main()
+{
+   llvm::errs() << "Hello World!\n";
+   llvm::llvm_shutdown();
+   return 0;
+}
-- 
2.10.1



[PATCH 01/34] perf tools: Fix kernel version error in ubuntu

2016-11-14 Thread Wang Nan
On Ubuntu the internal kernel version code is different from what can
be retrieved from uname:

 $ uname -r
 4.4.0-47-generic
 $ cat /lib/modules/`uname -r`/build/include/generated/uapi/linux/version.h
 #define LINUX_VERSION_CODE 263192
 #define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))
 $ cat /lib/modules/`uname -r`/build/include/generated/utsrelease.h
 #define UTS_RELEASE "4.4.0-47-generic"
 #define UTS_UBUNTU_RELEASE_ABI 47
 $ cat /proc/version_signature
 Ubuntu 4.4.0-47.68-generic 4.4.24

The macro LINUX_VERSION_CODE is set to 4.4.24 (263192 == 0x40418), but
`uname -r` reports 4.4.0.

This mismatch causes the LINUX_VERSION_CODE macro passed to the BPF script
to have an incorrect value, which results in an unexplained failure in BPF
loading:

 $ sudo ./buildperf/perf record -e ./tools/perf/tests/bpf-script-example.c ls
 event syntax error: './tools/perf/tests/bpf-script-example.c'
  \___ Failed to load program for unknown reason

According to the Ubuntu documentation (https://wiki.ubuntu.com/Kernel/FAQ),
the correct kernel version can be retrieved through /proc/version_signature,
which is Ubuntu specific.

This patch checks for the existence of /proc/version_signature, and returns
the version number by parsing this file instead of uname. The version string
is untouched (the value returned from uname) because `uname -r` is required
to be consistent with the path of the kbuild directory in /lib/modules.

Signed-off-by: Wang Nan 
Cc: Arnaldo Carvalho de Melo 
---
 tools/perf/util/util.c | 55 --
 1 file changed, 53 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 5bbd1f6..67ac765 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -637,12 +637,63 @@ bool find_process(const char *name)
return ret ? false : true;
 }
 
+static int
+fetch_ubuntu_kernel_version(unsigned int *puint)
+{
+   ssize_t len;
+   size_t line_len = 0;
+   char *ptr, *line = NULL;
+   int version, patchlevel, sublevel, err;
+   FILE *vsig = fopen("/proc/version_signature", "r");
+
+   if (!vsig) {
+   pr_debug("Open /proc/version_signature failed: %s\n",
+strerror(errno));
+   return -1;
+   }
+
+   len = getline(&line, &line_len, vsig);
+   fclose(vsig);
+   err = -1;
+   if (len <= 0) {
+   pr_debug("Reading from /proc/version_signature failed: %s\n",
+strerror(errno));
+   goto errout;
+   }
+
+   ptr = strrchr(line, ' ');
+   if (!ptr) {
+   pr_debug("Parsing /proc/version_signature failed: %s\n", line);
+   goto errout;
+   }
+
+   err = sscanf(ptr + 1, "%d.%d.%d",
+&version, &patchlevel, &sublevel);
+   if (err != 3) {
+   pr_debug("Unable to get kernel version from /proc/version_signature '%s'\n",
+line);
+   goto errout;
+   }
+
+   if (puint)
+   *puint = (version << 16) + (patchlevel << 8) + sublevel;
+   err = 0;
+errout:
+   free(line);
+   return err;
+}
+
 int
 fetch_kernel_version(unsigned int *puint, char *str,
 size_t str_size)
 {
struct utsname utsname;
int version, patchlevel, sublevel, err;
+   bool int_ver_ready = false;
+
+   if (access("/proc/version_signature", R_OK) == 0)
+   if (!fetch_ubuntu_kernel_version(puint))
+   int_ver_ready = true;
 
if (uname(&utsname))
return -1;
@@ -656,12 +707,12 @@ fetch_kernel_version(unsigned int *puint, char *str,
 &version, &patchlevel, &sublevel);
 
if (err != 3) {
-   pr_debug("Unablt to get kernel version from uname '%s'\n",
+   pr_debug("Unable to get kernel version from uname '%s'\n",
 utsname.release);
return -1;
}
 
-   if (puint)
+   if (puint && !int_ver_ready)
*puint = (version << 16) + (patchlevel << 8) + sublevel;
return 0;
 }
-- 
2.10.1
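
The version arithmetic quoted in the commit message (263192 == 0x40418 for 4.4.24) can be checked with a small standalone sketch. It follows the same idea as the patch, taking the last space-separated field of the signature line and packing it the way KERNEL_VERSION() does, but the names DEMO_KERNEL_VERSION and demo_parse_version_signature are made up for this illustration and it works on a string rather than on /proc/version_signature.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same packing as the KERNEL_VERSION() macro quoted in the commit message. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Parse a line in the format of /proc/version_signature, e.g.
 * "Ubuntu 4.4.0-47.68-generic 4.4.24", and pack the trailing
 * "x.y.z" field into a version code. Returns 0 on success, -1 on error.
 */
static int demo_parse_version_signature(const char *line, unsigned int *code)
{
	const char *last;
	int version, patchlevel, sublevel;

	assert(line != NULL && code != NULL);

	/* The upstream version is the last space-separated field. */
	last = strrchr(line, ' ');
	if (!last)
		return -1;

	if (sscanf(last + 1, "%d.%d.%d", &version, &patchlevel, &sublevel) != 3)
		return -1;

	*code = DEMO_KERNEL_VERSION(version, patchlevel, sublevel);
	return 0;
}
```

With the example line from the commit message this yields 263192, whereas packing the 4.4.0 reported by `uname -r` gives 263168, which is the mismatch the patch works around.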



[PATCH 08/34] perf tools: Introduce perf hooks

2016-11-14 Thread Wang Nan
Perf hooks allow user code to be hooked at perf events. They can be used for
manipulating BPF maps, taking snapshots and reporting results. In this
patch two perf hook points are introduced: record_start and record_end.

To guard against buggy user actions, a SIGSEGV signal handler is introduced
into 'perf record'. It turns off a perf hook if the hook causes a segfault
and reports an error to help debugging.

A test case for perf hooks is introduced.

Test result:
  $ ./buildperf/perf test -v hook
  50: Test perf hooks  :
  --- start ---
  test child forked, pid 10311
  SIGSEGV is observed as expected, try to recover.
  Fatal error (SEGFAULT) in perf hook 'test'
  test child finished with 0
  ---- end ----
  Test perf hooks: Ok

Signed-off-by: Wang Nan 
Cc: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: He Kuang 
Cc: Jiri Olsa 
---
 tools/perf/builtin-record.c   | 11 +
 tools/perf/tests/Build|  1 +
 tools/perf/tests/builtin-test.c   |  4 ++
 tools/perf/tests/perf-hooks.c | 44 
 tools/perf/tests/tests.h  |  1 +
 tools/perf/util/Build |  2 +
 tools/perf/util/perf-hooks-list.h |  3 ++
 tools/perf/util/perf-hooks.c  | 84 +++
 tools/perf/util/perf-hooks.h  | 37 +
 9 files changed, 187 insertions(+)
 create mode 100644 tools/perf/tests/perf-hooks.c
 create mode 100644 tools/perf/util/perf-hooks-list.h
 create mode 100644 tools/perf/util/perf-hooks.c
 create mode 100644 tools/perf/util/perf-hooks.h

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 67d2a90..fa26865 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -37,6 +37,7 @@
 #include "util/llvm-utils.h"
 #include "util/bpf-loader.h"
 #include "util/trigger.h"
+#include "util/perf-hooks.h"
 #include "asm/bug.h"
 
 #include 
@@ -206,6 +207,12 @@ static void sig_handler(int sig)
done = 1;
 }
 
+static void sigsegv_handler(int sig)
+{
+   perf_hooks__recover();
+   sighandler_dump_stack(sig);
+}
+
 static void record__sig_exit(void)
 {
if (signr == -1)
@@ -833,6 +840,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
signal(SIGCHLD, sig_handler);
signal(SIGINT, sig_handler);
signal(SIGTERM, sig_handler);
+   signal(SIGSEGV, sigsegv_handler);
 
if (rec->opts.auxtrace_snapshot_mode || rec->switch_output) {
signal(SIGUSR2, snapshot_sig_handler);
@@ -970,6 +978,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
trigger_ready(&auxtrace_snapshot_trigger);
trigger_ready(&switch_output_trigger);
+   perf_hooks__invoke_record_start();
for (;;) {
unsigned long long hits = rec->samples;
 
@@ -1114,6 +1123,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
}
}
 
+   perf_hooks__invoke_record_end();
+
if (!err && !quiet) {
char samples[128];
const char *postfix = rec->timestamp_filename ?
diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 8a4ce49..af3ec94 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -42,6 +42,7 @@ perf-y += backward-ring-buffer.o
 perf-y += sdt.o
 perf-y += is_printable_array.o
 perf-y += bitmap.o
+perf-y += perf-hooks.o
 
 $(OUTPUT)tests/llvm-src-base.c: tests/bpf-script-example.c tests/Build
$(call rule_mkdir)
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 778668a..dab83f7 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -230,6 +230,10 @@ static struct test generic_tests[] = {
.func = test__bitmap_print,
},
{
+   .desc = "Test perf hooks",
+   .func = test__perf_hooks,
+   },
+   {
.func = NULL,
},
 };
diff --git a/tools/perf/tests/perf-hooks.c b/tools/perf/tests/perf-hooks.c
new file mode 100644
index 000..9338cb2
--- /dev/null
+++ b/tools/perf/tests/perf-hooks.c
@@ -0,0 +1,44 @@
+#include 
+#include 
+
+#include "tests.h"
+#include "debug.h"
+#include "util.h"
+#include "perf-hooks.h"
+
+static void sigsegv_handler(int sig __maybe_unused)
+{
+   pr_debug("SIGSEGV is observed as expected, try to recover.\n");
+   perf_hooks__recover();
+   signal(SIGSEGV, SIG_DFL);
+   raise(SIGSEGV);
+   exit(-1);
+}
+
+static int hook_flags;
+
+static void the_hook(void)
+{
+   int *p = NULL;
+
+   hook_flags = 1234;
+
+   /* Generate a segfault, test perf_hooks__recover */
+   *p = 0;
+}
+
+int test__perf_hooks(int subtest __maybe_unused)
+{
+   signal(SIGSEGV, sigsegv_handler);
+   perf_hooks__set_hook("test", the_hook);
+   perf_hooks__invoke_test();
+
+   /* hook is triggered? */
+   if (hook_flags != 1234)
+   return T

[PATCH 15/34] perf clang: Use real file system for #include

2016-11-14 Thread Wang Nan
Utilize clang's OverlayFileSystem facility to allow CompilerInstance to
access the real file system.

With this patch the '#include' directive can be used.

Add a new getModuleFromSource() overload for real files.

Signed-off-by: Wang Nan 
Cc: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: He Kuang 
Cc: Jiri Olsa 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
 tools/perf/util/c++/clang.cpp | 44 +++
 tools/perf/util/c++/clang.h   |  3 +++
 2 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/c++/clang.cpp b/tools/perf/util/c++/clang.cpp
index c17b117..cf96199 100644
--- a/tools/perf/util/c++/clang.cpp
+++ b/tools/perf/util/c++/clang.cpp
@@ -15,6 +15,7 @@
 #include "clang/Tooling/Tooling.h"
 #include "llvm/IR/Module.h"
 #include "llvm/Option/Option.h"
+#include "llvm/Support/FileSystem.h"
 #include "llvm/Support/ManagedStatic.h"
 #include 
 
@@ -27,14 +28,6 @@ static std::unique_ptr LLVMCtx;
 
 using namespace clang;
 
-static vfs::InMemoryFileSystem *
-buildVFS(StringRef& Name, StringRef& Content)
-{
-   vfs::InMemoryFileSystem *VFS = new vfs::InMemoryFileSystem(true);
-   VFS->addFile(Twine(Name), 0, llvm::MemoryBuffer::getMemBuffer(Content));
-   return VFS;
-}
-
 static CompilerInvocation *
 createCompilerInvocation(StringRef& Path, DiagnosticsEngine& Diags)
 {
@@ -60,17 +53,17 @@ createCompilerInvocation(StringRef& Path, DiagnosticsEngine& Diags)
return CI;
 }
 
-std::unique_ptr
-getModuleFromSource(StringRef Name, StringRef Content)
+static std::unique_ptr
+getModuleFromSource(StringRef Path,
+   IntrusiveRefCntPtr VFS)
 {
CompilerInstance Clang;
Clang.createDiagnostics();
 
-   IntrusiveRefCntPtr VFS = buildVFS(Name, Content);
Clang.setVirtualFileSystem(&*VFS);
 
IntrusiveRefCntPtr CI =
-   createCompilerInvocation(Name, Clang.getDiagnostics());
+   createCompilerInvocation(Path, Clang.getDiagnostics());
Clang.setInvocation(&*CI);
 
std::unique_ptr Act(new EmitLLVMOnlyAction(&*LLVMCtx));
@@ -80,6 +73,33 @@ getModuleFromSource(StringRef Name, StringRef Content)
return Act->takeModule();
 }
 
+std::unique_ptr
+getModuleFromSource(StringRef Name, StringRef Content)
+{
+   using namespace vfs;
+
+   llvm::IntrusiveRefCntPtr OverlayFS(
+   new OverlayFileSystem(getRealFileSystem()));
+   llvm::IntrusiveRefCntPtr MemFS(
+   new InMemoryFileSystem(true));
+
+   /*
+* pushOverlay helps setting working dir for MemFS. Must call
+* before addFile.
+*/
+   OverlayFS->pushOverlay(MemFS);
+   MemFS->addFile(Twine(Name), 0, llvm::MemoryBuffer::getMemBuffer(Content));
+
+   return getModuleFromSource(Name, OverlayFS);
+}
+
+std::unique_ptr
+getModuleFromSource(StringRef Path)
+{
+   IntrusiveRefCntPtr VFS(vfs::getRealFileSystem());
+   return getModuleFromSource(Path, VFS);
+}
+
 }
 
 extern "C" {
diff --git a/tools/perf/util/c++/clang.h b/tools/perf/util/c++/clang.h
index f64483b..90aff01 100644
--- a/tools/perf/util/c++/clang.h
+++ b/tools/perf/util/c++/clang.h
@@ -12,5 +12,8 @@ using namespace llvm;
 std::unique_ptr
 getModuleFromSource(StringRef Name, StringRef Content);
 
+std::unique_ptr
+getModuleFromSource(StringRef Path);
+
 }
 #endif
-- 
2.10.1



[PATCH 05/34] tools lib bpf: Add missing bpf map functions

2016-11-14 Thread Wang Nan
Add more BPF map operations to libbpf.

Signed-off-by: Wang Nan 
Cc: Alexei Starovoitov 
Cc: Arnaldo Carvalho de Melo 
Cc: Li Zefan 
---
 tools/lib/bpf/bpf.c | 56 +
 tools/lib/bpf/bpf.h |  7 +++
 2 files changed, 63 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 4212ed6..e966248 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -110,3 +110,59 @@ int bpf_map_update_elem(int fd, void *key, void *value,
 
return sys_bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
 }
+
+int bpf_map_lookup_elem(int fd, void *key, void *value)
+{
+   union bpf_attr attr;
+
+   bzero(&attr, sizeof(attr));
+   attr.map_fd = fd;
+   attr.key = ptr_to_u64(key);
+   attr.value = ptr_to_u64(value);
+
+   return sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
+}
+
+int bpf_map_delete_elem(int fd, void *key)
+{
+   union bpf_attr attr;
+
+   bzero(&attr, sizeof(attr));
+   attr.map_fd = fd;
+   attr.key = ptr_to_u64(key);
+
+   return sys_bpf(BPF_MAP_DELETE_ELEM, &attr, sizeof(attr));
+}
+
+int bpf_map_get_next_key(int fd, void *key, void *next_key)
+{
+   union bpf_attr attr;
+
+   bzero(&attr, sizeof(attr));
+   attr.map_fd = fd;
+   attr.key = ptr_to_u64(key);
+   attr.next_key = ptr_to_u64(next_key);
+
+   return sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr));
+}
+
+int bpf_map_pin(int fd, const char *pathname)
+{
+   union bpf_attr attr;
+
+   bzero(&attr, sizeof(attr));
+   attr.pathname = ptr_to_u64((void *)pathname);
+   attr.bpf_fd = fd;
+
+   return sys_bpf(BPF_OBJ_PIN, &attr, sizeof(attr));
+}
+
+int bpf_map_get(const char *pathname)
+{
+   union bpf_attr attr;
+
+   bzero(&attr, sizeof(attr));
+   attr.pathname = ptr_to_u64((void *)pathname);
+
+   return sys_bpf(BPF_OBJ_GET, &attr, sizeof(attr));
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index e8ba540..5b3e52b 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -35,4 +35,11 @@ int bpf_load_program(enum bpf_prog_type type, struct bpf_insn *insns,
 
 int bpf_map_update_elem(int fd, void *key, void *value,
u64 flags);
+
+int bpf_map_lookup_elem(int fd, void *key, void *value);
+int bpf_map_delete_elem(int fd, void *key);
+int bpf_map_get_next_key(int fd, void *key, void *next_key);
+int bpf_map_pin(int fd, const char *pathname);
+int bpf_map_get(const char *pathname);
+
 #endif
-- 
2.10.1
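
All of the wrappers in this patch follow one pattern: zero a 'union bpf_attr', fill in the fields the command needs, and pass it to the bpf(2) syscall. The sketch below shows that pattern for a lookup without depending on libbpf. It assumes a Linux system with <linux/bpf.h> installed; demo_ptr_to_u64 and demo_map_lookup are hypothetical names, and memset() is used where the patch uses bzero().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* The kernel ABI carries user pointers as 64-bit integers. */
static uint64_t demo_ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}

/*
 * Look up 'key' in the map referred to by 'fd' and copy the result
 * into 'value'. Returns 0 on success, -1 with errno set on failure.
 */
static int demo_map_lookup(int fd, const void *key, void *value)
{
	union bpf_attr attr;

	assert(key != NULL && value != NULL);

	/* Unused fields must be zero, so clear the whole union first. */
	memset(&attr, 0, sizeof(attr));
	attr.map_fd = (uint32_t)fd;
	attr.key = demo_ptr_to_u64(key);
	attr.value = demo_ptr_to_u64(value);

	return (int)syscall(__NR_bpf, BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
}
```

Called with a file descriptor that does not refer to a BPF map, the function is expected to fail with -1; with a real map fd, 'value' has to point at a buffer of at least the map's value size.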



[PATCH 10/34] perf llvm: Extract helpers in llvm-utils.c

2016-11-14 Thread Wang Nan
The following commits will use builtin clang to compile BPF scripts.
llvm__get_kbuild_opts() and llvm__get_nr_cpus() are extracted to help
build the '-DKERNEL_VERSION_CODE' and '-D__NR_CPUS__' macros.

Do object dumping in the bpf loader, so further builtin clang compiling
needn't consider it.

Signed-off-by: Wang Nan 
Cc: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: He Kuang 
Cc: Jiri Olsa 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
 tools/perf/util/bpf-loader.c |  4 +++
 tools/perf/util/llvm-utils.c | 76 +---
 tools/perf/util/llvm-utils.h |  6 
 3 files changed, 68 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index a5fd275..cf16b941 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -90,6 +90,10 @@ struct bpf_object *bpf__prepare_load(const char *filename, bool source)
if (err)
return ERR_PTR(-BPF_LOADER_ERRNO__COMPILE);
obj = bpf_object__open_buffer(obj_buf, obj_buf_sz, filename);
+
+   if (!IS_ERR(obj) && llvm_param.dump_obj)
+   llvm__dump_obj(filename, obj_buf, obj_buf_sz);
+
free(obj_buf);
} else
obj = bpf_object__open(filename);
diff --git a/tools/perf/util/llvm-utils.c b/tools/perf/util/llvm-utils.c
index 27b6f303..b23ff44 100644
--- a/tools/perf/util/llvm-utils.c
+++ b/tools/perf/util/llvm-utils.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "debug.h"
 #include "llvm-utils.h"
 #include "config.h"
@@ -282,9 +283,10 @@ static const char *kinc_fetch_script =
 "rm -rf $TMPDIR\n"
 "exit $RET\n";
 
-static inline void
-get_kbuild_opts(char **kbuild_dir, char **kbuild_include_opts)
+void llvm__get_kbuild_opts(char **kbuild_dir, char **kbuild_include_opts)
 {
+   static char *saved_kbuild_dir;
+   static char *saved_kbuild_include_opts;
int err;
 
if (!kbuild_dir || !kbuild_include_opts)
@@ -293,10 +295,28 @@ get_kbuild_opts(char **kbuild_dir, char **kbuild_include_opts)
*kbuild_dir = NULL;
*kbuild_include_opts = NULL;
 
+   if (saved_kbuild_dir && saved_kbuild_include_opts &&
+   !IS_ERR(saved_kbuild_dir) && !IS_ERR(saved_kbuild_include_opts)) {
+   *kbuild_dir = strdup(saved_kbuild_dir);
+   *kbuild_include_opts = strdup(saved_kbuild_include_opts);
+
+   if (*kbuild_dir && *kbuild_include_opts)
+   return;
+
+   zfree(kbuild_dir);
+   zfree(kbuild_include_opts);
+   /*
+* Don't fall through: it may breaks saved_kbuild_dir and
+* saved_kbuild_include_opts if detect them again when
+* memory is low.
+*/
+   return;
+   }
+
if (llvm_param.kbuild_dir && !llvm_param.kbuild_dir[0]) {
pr_debug("[llvm.kbuild-dir] is set to \"\" deliberately.\n");
pr_debug("Skip kbuild options detection.\n");
-   return;
+   goto errout;
}
 
err = detect_kbuild_dir(kbuild_dir);
@@ -306,7 +326,7 @@ get_kbuild_opts(char **kbuild_dir, char **kbuild_include_opts)
 "Hint:\tSet correct kbuild directory using 'kbuild-dir' option in [llvm]\n"
 " \tsection of ~/.perfconfig or set it to \"\" to suppress kbuild\n"
 " \tdetection.\n\n");
-   return;
+   goto errout;
}
 
pr_debug("Kernel build dir is set to %s\n", *kbuild_dir);
@@ -325,14 +345,43 @@ get_kbuild_opts(char **kbuild_dir, char **kbuild_include_opts)
 
free(*kbuild_dir);
*kbuild_dir = NULL;
-   return;
+   goto errout;
}
 
pr_debug("include option is set to %s\n", *kbuild_include_opts);
+
+   saved_kbuild_dir = strdup(*kbuild_dir);
+   saved_kbuild_include_opts = strdup(*kbuild_include_opts);
+
+   if (!saved_kbuild_dir || !saved_kbuild_include_opts) {
+   zfree(&saved_kbuild_dir);
+   zfree(&saved_kbuild_include_opts);
+   }
+   return;
+errout:
+   saved_kbuild_dir = ERR_PTR(-EINVAL);
+   saved_kbuild_include_opts = ERR_PTR(-EINVAL);
 }
 
-static void
-dump_obj(const char *path, void *obj_buf, size_t size)
+int llvm__get_nr_cpus(void)
+{
+   static int nr_cpus_avail = 0;
+   char serr[STRERR_BUFSIZE];
+
+   if (nr_cpus_avail > 0)
+   return nr_cpus_avail;
+
+   nr_cpus_avail = sysconf(_SC_NPROCESSORS_CONF);
+   if (nr_cpus_avail <= 0) {
+   pr_err(
+"WARNING:\tunable to get available CPUs in this system: %s\n"
+"\tUse 128 instead.\n", str_error_r(errno, serr, sizeof(serr)));
+   nr_cpus_avail = 128;
+   }
+   return nr_cpus_avail;
+}
+
+void llvm__dump_obj(const char *path, void *obj_buf, size_t size)
 {
char *obj_path = strdup(path);
FILE *fp
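
The caching logic in llvm__get_nr_cpus() above can be tried outside of perf with a small sketch like the following. It keeps the same shape (query once via sysconf(), fall back to 128 on failure, return the cached value afterwards), but demo_get_nr_cpus and DEMO_FALLBACK_NR_CPUS are illustrative names and fprintf() stands in for pr_err().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define DEMO_FALLBACK_NR_CPUS 128

/* Return the number of configured CPUs, querying the system only once. */
static int demo_get_nr_cpus(void)
{
	static int cached;	/* 0 means "not queried yet" */
	long n;

	if (cached > 0)
		return cached;

	n = sysconf(_SC_NPROCESSORS_CONF);
	if (n <= 0) {
		fprintf(stderr,
			"WARNING: unable to get available CPUs: %s, using %d\n",
			strerror(errno), DEMO_FALLBACK_NR_CPUS);
		n = DEMO_FALLBACK_NR_CPUS;
	}

	cached = (int)n;
	assert(cached > 0);
	return cached;
}
```

Because the result is cached in a function-local static, repeated calls are cheap and always agree with each other, which matters when the value is baked into a '-D__NR_CPUS__' style compiler option.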
