RE: [PATCH v5 13/27] IB/Verbs: Reserve legacy transport type in 'dev_addr'

2015-04-20 Thread Devesh Sharma
> -----Original Message-----
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of Michael Wang
> Sent: Monday, April 20, 2015 2:08 PM
> To: Roland Dreier; Sean Hefty; linux-r...@vger.kernel.org; linux-
> ker...@vger.kernel.org; h...@dev.mellanox.co.il
> Cc: Michael Wang; Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph
> Raisch; Mike Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or 
> Gerlitz;
> Haggai Eran; Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford
> Subject: [PATCH v5 13/27] IB/Verbs: Reserve legacy transport type in
> 'dev_addr'
> 
> 
> Reserve the legacy transport type for the 'transport' member of 'struct
> rdma_dev_addr' until we make sure this is no longer needed.
> 
> Cc: Hal Rosenstock 
> Cc: Steve Wise 
> Cc: Tom Talpey 
> Cc: Jason Gunthorpe 
> Cc: Doug Ledford 
> Cc: Ira Weiny 
> Cc: Sean Hefty 
> Signed-off-by: Michael Wang 
> ---
>  drivers/infiniband/core/cma.c | 25 +++--
>  1 file changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index ebac646..6195bf6 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -244,14 +244,35 @@ static inline void cma_set_ip_ver(struct cma_hdr *hdr, u8 ip_ver)
>   hdr->ip_version = (ip_ver << 4) | (hdr->ip_version & 0xF);
>  }
> 
> +static inline void cma_set_legacy_transport(struct rdma_cm_id *id)
> +{
> + switch (id->device->node_type) {
> + case RDMA_NODE_IB_CA:
> + case RDMA_NODE_IB_SWITCH:
> + case RDMA_NODE_IB_ROUTER:
> + id->route.addr.dev_addr.transport = RDMA_TRANSPORT_IB;

What about the IBOE transport, am I missing something here? As of today ocrdma
exports node_type as RDMA_NODE_IB_CA, so here the transport will be set to
RDMA_TRANSPORT_IB.
Should it be RDMA_TRANSPORT_IBOE?
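
To illustrate the question, here is a minimal user-space sketch, not the patch's code. The enum names and values are hypothetical stand-ins for the kernel definitions. It only shows that a switch on node_type alone cannot tell IB from IBoE, while one that also looks at the link layer can:

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel enums; values are illustrative only. */
enum node_type { NODE_IB_CA, NODE_RNIC, NODE_USNIC_UDP };
enum link_layer { LINK_IB, LINK_ETH };
enum transport {
	TRANSPORT_IB,
	TRANSPORT_IBOE,
	TRANSPORT_IWARP,
	TRANSPORT_USNIC_UDP
};

/* Legacy view: only node_type is consulted, so an IB_CA on Ethernet reports IB. */
static enum transport legacy_transport(enum node_type nt)
{
	switch (nt) {
	case NODE_IB_CA:
		return TRANSPORT_IB;
	case NODE_RNIC:
		return TRANSPORT_IWARP;
	case NODE_USNIC_UDP:
		return TRANSPORT_USNIC_UDP;
	}
	assert(0 && "unknown node type");
	return TRANSPORT_IB;
}

/* Link-layer-aware view: an IB_CA node on an Ethernet link is IBoE. */
static enum transport new_transport(enum node_type nt, enum link_layer ll)
{
	if (nt == NODE_IB_CA && ll == LINK_ETH)
		return TRANSPORT_IBOE;
	return legacy_transport(nt);
}
```

With these stand-ins, an ocrdma-like device (IB_CA node type, Ethernet link) gets TRANSPORT_IB from the first function and TRANSPORT_IBOE from the second.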

> + break;
> + case RDMA_NODE_RNIC:
> + id->route.addr.dev_addr.transport = RDMA_TRANSPORT_IWARP;
> + break;
> + case RDMA_NODE_USNIC:
> + id->route.addr.dev_addr.transport = RDMA_TRANSPORT_USNIC;
> + break;
> + case RDMA_NODE_USNIC_UDP:
> + id->route.addr.dev_addr.transport = RDMA_TRANSPORT_USNIC_UDP;
> + break;
> + default:
> + BUG();
> + }
> +}
> +
>  static void cma_attach_to_dev(struct rdma_id_private *id_priv,
>  			      struct cma_device *cma_dev)
>  {
>   atomic_inc(&cma_dev->refcount);
>   id_priv->cma_dev = cma_dev;
>   id_priv->id.device = cma_dev->device;
> - id_priv->id.route.addr.dev_addr.transport =
> - rdma_node_get_transport(cma_dev->device->node_type);
> + cma_set_legacy_transport(&id_priv->id);
>   list_add_tail(&id_priv->list, &cma_dev->id_list);
>  }
> 
> --
> 2.1.0
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the 
> body
> of a message to majord...@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html


Re: [RESEND RFC PATCH 2/3] ASoC: mediatek: Add AFE connection control

2015-04-20 Thread Sascha Hauer
On Mon, Apr 20, 2015 at 09:52:30PM +0100, Mark Brown wrote:
> On Mon, Apr 20, 2015 at 06:50:17AM +0200, Sascha Hauer wrote:
> > On Sat, Apr 18, 2015 at 06:37:40PM +0100, Mark Brown wrote:
> > > On Fri, Apr 10, 2015 at 04:14:08PM +0800, Koro Chen wrote:
> 
> > > > +   [0][0] =   { .creg = 0x020, .cshift =  0, .sreg = 0x020, 
> > > > .sshift = 10},
> > > > +   [0][1] =   { .creg = 0x020, .cshift = 16, .sreg = 0x020, 
> > > > .sshift = 26},
> 
> > > It'd also be nice to have less magic numbers in the table, at least for
> > > the indexes (which I guess correspond to some of the defines in the
> > > headers)?
> 
> > With defines the above two lines would become something like:
> 
> > [0][0] = { .creg = AFE_CONN0, .cshift = CONN0_I00_O00_S, .sreg = AFE_CONN0, 
> > .sshift = CONN0_I00_O00_R },
> > [0][1] = { .creg = AFE_CONN0, .cshift = CONN0_I00_O01_S, .sreg = AFE_CONN0, 
> > .sshift = CONN0_I00_O01_S },
> 
> > For the registers we could use defines, but I think using defines for
> > the shifts doesn't add much value given they are only used once.
> 
> By indexes I actually meant the [0][0] and so on - they seem the more
> magic bit.

Oh, that's not magic at all, the crossbar switch has inputs and outputs
numbered from 0 to MTK_AFE_INTERCONN_NUM_INPUT /
MTK_AFE_INTERCONN_NUM_OUTPUT (they have the same numbers in the
datasheet). To connect input x with output y, look at index [x][y] in the
table and write the register values found at that place. If .creg is 0x0
then it's not possible to connect the given input with the given output.
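
A minimal sketch of that lookup, assuming a tiny 2x2 crossbar for illustration. The struct name, the array name, the NUM_* sizes and lookup_conn() are hypothetical; only the two table entries come from the quoted patch:

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_INPUT  2	/* hypothetical size, stands in for MTK_AFE_INTERCONN_NUM_INPUT */
#define NUM_OUTPUT 2	/* hypothetical size, stands in for MTK_AFE_INTERCONN_NUM_OUTPUT */

struct conn {
	uint32_t creg;		/* register of the connect bit */
	uint32_t cshift;	/* shift of the connect bit */
	uint32_t sreg;		/* register of the second (.sreg/.sshift) bit */
	uint32_t sshift;	/* shift of that bit */
};

static const struct conn connections[NUM_INPUT][NUM_OUTPUT] = {
	[0][0] = { .creg = 0x020, .cshift = 0,  .sreg = 0x020, .sshift = 10 },
	[0][1] = { .creg = 0x020, .cshift = 16, .sreg = 0x020, .sshift = 26 },
	/* [1][*] is left zeroed: .creg == 0 means the pair cannot be connected */
};

/* Return the table entry for input x / output y, or NULL if not connectable. */
static const struct conn *lookup_conn(unsigned int x, unsigned int y)
{
	if (x >= NUM_INPUT || y >= NUM_OUTPUT)
		return NULL;
	if (connections[x][y].creg == 0)
		return NULL;
	return &connections[x][y];
}
```

A caller would then write the bit at cshift in creg for the entry returned by lookup_conn(x, y), and treat a NULL result as "no such route".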

Sascha



-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] dmaengine: dw: add Intel Broxton LPSS Integrated DMA support

2015-04-20 Thread Zha, Qipeng
+ dma maillist




Best wishes
Qipeng

-----Original Message-----
From: Zha, Qipeng 
Sent: Tuesday, April 21, 2015 7:34 AM
To: linux-kernel@vger.kernel.org
Cc: viresh.li...@gmail.com; andriy.shevche...@linux.intel.com; Westerberg, 
Mika; Chen, Jason CJ; Zheng, Qi; Zha, Qipeng; Zhong, Huiquan
Subject: [PATCH] dmaengine: dw: add Intel Broxton LPSS Integrated DMA support

From: Huiquan Zhong 

Add Broxton Low Power Sub System (LPSS) Integrated DMA support, since the DMA 
register space is very similar to the DesignWare DMA register space.

Add DW_DMAC_TYPE_IDMA type to distinguish LPSS iDMA register.

Broxton LPSS iDMA's maximum block size is 0x1ffff (128KB - 1).
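
The diff in this mail sets width_trf to 0 for the iDMA type, so the block count is taken in bytes, and to the transfer width otherwise, so the count is in items of that width. A user-space sketch of that split, where next_chunk() and its parameters are hypothetical names rather than driver code:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the byte length of the next descriptor and store the value that
 * would go into the block transfer size field in *block_ts.
 * width is log2 of the transfer width in bytes.
 */
static size_t next_chunk(size_t len, unsigned int width, int is_idma,
			 uint32_t block_size, uint32_t *block_ts)
{
	/* iDMA counts bytes, the DesignWare type counts items of 1 << width */
	unsigned int width_trf = is_idma ? 0 : width;
	size_t dlen;

	if ((len >> width_trf) > block_size)
		dlen = (size_t)block_size << width_trf;
	else
		dlen = len;

	*block_ts = (uint32_t)(dlen >> width_trf);
	return dlen;
}
```

For a 256KB buffer with a block size of 128KB - 1, the iDMA case yields a first chunk of 128KB - 1 bytes. With a 4-byte width and a (hypothetical) block size of 4095 items, the non-iDMA case yields 16380 bytes.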

Signed-off-by: Huiquan Zhong 
Signed-off-by: qipeng.zha 
---
 drivers/dma/dw/core.c| 64 +---
 drivers/dma/dw/internal.h|  3 --
 drivers/dma/dw/regs.h| 14 
 include/linux/dma/dw.h   |  8 +
 include/linux/platform_data/dma-dw.h |  2 +-
 5 files changed, 75 insertions(+), 16 deletions(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index a8ad052..1d198c9 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -144,8 +144,19 @@ static void dwc_initialize(struct dw_dma_chan *dwc)
 */
BUG_ON(!dws->dma_dev || dws->dma_dev != dw->dma.dev);
 
-   cfghi |= DWC_CFGH_DST_PER(dws->dst_id);
-   cfghi |= DWC_CFGH_SRC_PER(dws->src_id);
+   if (dw->type == DW_DMAC_TYPE_IDMA) {
+   /* Forces channel FIFO to drain while in suspension */
+   cfglo = IDMA_CFGL_CH_DRAIN;
+   /* Burst length aligned */
+   cfglo |= IDMA_CFGL_SRC_BURST_ALIGN
+   | IDMA_CFGL_DST_BURST_ALIGN;
+
+   cfghi |= IDMA_CFGH_DST_PER(dws->dst_id);
+   cfghi |= IDMA_CFGH_SRC_PER(dws->src_id);
+   } else {
+   cfghi |= DWC_CFGH_DST_PER(dws->dst_id);
+   cfghi |= DWC_CFGH_SRC_PER(dws->src_id);
+   }
} else {
cfghi |= DWC_CFGH_DST_PER(dwc->dst_id);
cfghi |= DWC_CFGH_SRC_PER(dwc->src_id);
@@ -346,9 +357,14 @@ static void dwc_complete_all(struct dw_dma *dw, struct dw_dma_chan *dwc)
 /* Returns how many bytes were already received from source */
 static inline u32 dwc_get_sent(struct dw_dma_chan *dwc)
 {
+   struct dw_dma *dw = to_dw_dma(dwc->chan.device);
+
u32 ctlhi = channel_readl(dwc, CTL_HI);
u32 ctllo = channel_readl(dwc, CTL_LO);
 
+   if (dw->type == DW_DMAC_TYPE_IDMA)
+   return ctlhi & IDMA_CTLH_BLOCK_TS_MASK;
+
return (ctlhi & DWC_CTLH_BLOCK_TS_MASK) * (1 << (ctllo >> 4 & 7));
 }
 
@@ -775,6 +791,7 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
unsigned intreg_width;
unsigned intmem_width;
unsigned intdata_width;
+   unsigned intwidth_trf;
unsigned inti;
struct scatterlist  *sg;
size_t  total_len = 0;
@@ -823,8 +840,14 @@ slave_sg_todev_fill_desc:
desc->lli.sar = mem;
desc->lli.dar = reg;
desc->lli.ctllo = ctllo | DWC_CTLL_SRC_WIDTH(mem_width);
-   if ((len >> mem_width) > dwc->block_size) {
-   dlen = dwc->block_size << mem_width;
+
+   if (dw->type == DW_DMAC_TYPE_IDMA)
+   width_trf = 0;
+   else
+   width_trf = mem_width;
+
+   if ((len >> width_trf) > dwc->block_size) {
+   dlen = dwc->block_size << width_trf;
mem += dlen;
len -= dlen;
} else {
@@ -832,7 +855,7 @@ slave_sg_todev_fill_desc:
len = 0;
}
 
-   desc->lli.ctlhi = dlen >> mem_width;
+   desc->lli.ctlhi = dlen >> width_trf;
desc->len = dlen;
 
if (!first) {
@@ -883,15 +906,20 @@ slave_sg_fromdev_fill_desc:
desc->lli.sar = reg;
desc->lli.dar = mem;
desc->lli.ctllo = ctllo | DWC_CTLL_DST_WIDTH(mem_width);
-   if ((len >> reg_width) > dwc->block_size) {
-   dlen = dwc->block_size << reg_width;
+   if (dw->type == DW_DMAC_TYPE_IDMA)
+   width_trf = 0;
+   else
+   width_trf = reg_width;
+
+   if ((len >> width_trf) > dwc->block_size) {
+   dlen = dwc->block_size << 

RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

2015-04-20 Thread Devesh Sharma
Hi Michael,

is there a specific git branch available to pull out all the patches?

-Regards
Devesh

> -----Original Message-----
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of Michael Wang
> Sent: Monday, April 20, 2015 1:59 PM
> To: Roland Dreier; Sean Hefty; Hal Rosenstock; linux-r...@vger.kernel.org;
> linux-kernel@vger.kernel.org; h...@dev.mellanox.co.il
> Cc: Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph Raisch; Mike
> Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz; Haggai 
> Eran;
> Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford; Michael Wang
> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
> 
> 
> Since v4:
>   * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
> Roland, Ira and Steve :-) Please remind me if anything missed :-P
>   * Fix logical issue inside 3#, 14#
>   * Refine 3#, 4#, 5# with label 'free'
>   * Rework 10# to stop using port 1 when port already assigned
> 
> There is plenty of lengthy code that checks the transport type of an IB device,
> or the link layer type of its port, but actually we are just speculating whether
> a particular management/feature is supported by the device/port.
> 
> Thus instead of inferring, we should have our own mechanism for IB
> management capability/protocol/feature checking, several proposals below.
> 
> This patch set reforms the method of getting the transport type: we now use
> query_transport() instead of inferring from transport and link layer
> respectively. We also define the new transport type to make the concept more
> reasonable.
> 
> Mapping List:
>            node-type   link-layer  old-transport   new-transport
> nes        RNIC        ETH         IWARP           IWARP
> amso1100   RNIC        ETH         IWARP           IWARP
> cxgb3      RNIC        ETH         IWARP           IWARP
> cxgb4      RNIC        ETH         IWARP           IWARP
> usnic      USNIC_UDP   ETH         USNIC_UDP       USNIC_UDP
> ocrdma     IB_CA       ETH         IB              IBOE
> mlx4       IB_CA       IB/ETH      IB              IB/IBOE
> mlx5       IB_CA       IB          IB              IB
> ehca       IB_CA       IB          IB              IB
> ipath      IB_CA       IB          IB              IB
> mthca      IB_CA       IB          IB              IB
> qib        IB_CA       IB          IB              IB
> 
> For example:
>   if (transport == IB) && (link-layer == ETH) will now become:
>   if (query_transport() == IBOE)
> 
> Thus we can get rid of the separate transport and link-layer checks, which
> makes it easier to add a new protocol/technology (like OPA). With the
> management helpers introduced here, the IB management logic also becomes
> clearer and easier to extend.
> 
> Highlights:
> The patch set covered a wide range of IB stuff, thus for those who are
> familiar with the particular part, your suggestion would be invaluable ;-)
> 
> Patch 1#~15# included all the logical reform, 16#~25# introduced the
> management helpers, 26#~27# do clean up.
> 
> The patches haven't been tested yet; we would appreciate it if anyone who has
> this HW is willing to provide a Tested-by :-)
> 
> Doug suggested the bitmask mechanism:
>   https://www.mail-archive.com/linux-
> r...@vger.kernel.org/msg23765.html
> which could be the plan for future reforming, we prefer that to be another
> series which focus on semantic and performance.
> 
> This patch set is somewhat 'bloated' now and it may be a good time for
> staging. I'd like to suggest we focus on improving the existing helpers and
> push all further reforms into the next series ;-)
> 
> Proposals:
> Sean:
>   https://www.mail-archive.com/linux-
> r...@vger.kernel.org/msg23339.html
> Doug:
>   https://www.mail-archive.com/linux-
> r...@vger.kernel.org/msg23418.html
>   https://www.mail-archive.com/linux-
> r...@vger.kernel.org/msg23765.html
> Jason:
>   https://www.mail-archive.com/linux-
> r...@vger.kernel.org/msg23425.html
> 
> Michael Wang (27):
> IB/Verbs: Implement new callback query_transport()
> IB/Verbs: Implement raw management helpers
> IB/Verbs: Reform IB-core mad/agent/user_mad
> IB/Verbs: Reform IB-core cm
> IB/Verbs: Reform IB-core sa_query
> IB/Verbs: Reform IB-core multicast
> IB/Verbs: Reform IB-ulp ipoib
> IB/Verbs: Reform IB-ulp xprtrdma
> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> IB/Verbs: Reform cm related part in IB-core cma/ucm
> IB/Verbs: Reform route related part in IB-core cma
> IB/Verbs: Reform mcast related part in IB-core cma
> IB/Verbs: Reserve legacy transport type in 'dev_addr'
> IB/Verbs: Reform cma_acquire_dev()
> IB/Verbs: Reform rest part in 

Re: [PATCHv2 1/3] phy: core: Add devm_of_phy_get_by_index to phy-core

2015-04-20 Thread Kishon Vijay Abraham I

Hi,

On Tuesday 21 April 2015 01:49 AM, Arun Ramamurthy wrote:



On 15-04-15 02:59 AM, Kishon Vijay Abraham I wrote:

Hi,

On Tuesday 14 April 2015 03:40 AM, Arun Ramamurthy wrote:

Some generic drivers, such as ehci, may use multiple phys and for such
drivers referencing phy(s) by name(s) does not make sense. Instead of
inventing new naming schemes and using custom code to iterate through them,
such drivers are better off using nameless phy bindings and using this newly
introduced API to iterate through them.

Signed-off-by: Arun Ramamurthy 
Reviewed-by: Ray Jui 
Reviewed-by: Scott Branden 
---
   drivers/phy/phy-core.c  | 32 
   include/linux/phy/phy.h |  2 ++
   2 files changed, 34 insertions(+)

diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
index 3791838..964a84d 100644
--- a/drivers/phy/phy-core.c
+++ b/drivers/phy/phy-core.c
@@ -623,6 +623,38 @@ struct phy *devm_of_phy_get(struct device *dev,
struct device_node *np,
   EXPORT_SYMBOL_GPL(devm_of_phy_get);

   /**
+ * devm_of_phy_get_by_index() - lookup and obtain a reference to a
phy by index.
+ * @dev: device that requests this phy
+ * @np: node containing the phy
+ * @index: index of the phy
+ *
+ * Gets the phy using _of_phy_get(), and associates a device with it
using
+ * devres. On driver detach, release function is invoked on the
devres data,
+ * then, devres data is freed.
+ *
+ */
+struct phy *devm_of_phy_get_by_index(struct device *dev, struct
device_node *np,
+ int index)
+{
+struct phy **ptr, *phy;
+
+ptr = devres_alloc(devm_phy_release, sizeof(*ptr), GFP_KERNEL);
+if (!ptr)
+return ERR_PTR(-ENOMEM);
+
+phy = _of_phy_get(np, index);
+if (!IS_ERR(phy)) {
+*ptr = phy;
+devres_add(dev, ptr);
+} else {
+devres_free(ptr);
+}
+
+return phy;
+}
+EXPORT_SYMBOL_GPL(devm_of_phy_get_by_index);
+
+/**
* phy_create() - create a new phy
* @dev: device that is creating the new phy
* @node: device node of the phy
diff --git a/include/linux/phy/phy.h b/include/linux/phy/phy.h
index a0197fa..ae2ffaf 100644
--- a/include/linux/phy/phy.h
+++ b/include/linux/phy/phy.h
@@ -133,6 +133,8 @@ struct phy *devm_phy_get(struct device *dev, const
char *string);
   struct phy *devm_phy_optional_get(struct device *dev, const char
*string);
   struct phy *devm_of_phy_get(struct device *dev, struct device_node *np,
   const char *con_id);
+struct phy *devm_of_phy_get_by_index(struct device *dev, struct
device_node *np,
+ int index);


Add stubs for this function too. Also update the Documentation/phy.txt.
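
For reference, a user-space sketch of what such a stub could look like. The ERR_PTR helpers below are simplified stand-ins for the kernel ones, and returning -ENOSYS is an assumption made for illustration, not something stated in this thread:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct device;
struct device_node;
struct phy;

/* Stub for builds without the PHY framework: always reports an error. */
static inline struct phy *devm_of_phy_get_by_index(struct device *dev,
						   struct device_node *np,
						   int index)
{
	(void)dev;
	(void)np;
	(void)index;
	return ERR_PTR(-ENOSYS);	/* assumed error code */
}
```

Callers can then keep a single code path and check the result with IS_ERR() whether or not the framework is built in.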


Kishon, I have added stubs for this function in my next patch set.
However, I am still unclear on whether I need to make GENERIC_PHY an
invisible option or change my "select" to "depends on". It seems like there
was no consensus on this. Do you have any final thoughts before I send
out the next patch set? Thanks


You can follow Arnd's suggestion. You can have a separate patch to make 
GENERIC_PHY an invisible option and change the existing PHY drivers to select 
GENERIC_PHY.


Non-PHY drivers can either use 'depends on' or have no explicit dependency if the 
PHY is optional for that controller.


Thanks
Kishon


Re: [PATCHv2 2/3] usb: ehci-platform: Use devm_of_phy_get_by_index

2015-04-20 Thread Kishon Vijay Abraham I

Arnd,

On Wednesday 15 April 2015 03:17 AM, Arnd Bergmann wrote:

On Tuesday 14 April 2015 11:05:35 Arun Ramamurthy wrote:


[1] ->
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/kbuild/kconfig-language.txt#n111


Kishon, removing select GENERIC_PHY also breaks the builds for certain
architectures (i386 and x86_64). Is the consensus to leave the select
but make GENERIC_PHY an invisible option? Thanks


I think the best solution is

- make GENERIC_PHY a silent option
- change PHY_RCAR_GEN2 to use 'select' instead of 'depends on', so
   it will still work when all other phy drivers are disabled
- change the non-phy drivers that select GENERIC_PHY to either
   use 'depends on' or no explicit dependency in case they are
   still functional without the API. Note that
   drivers/pinctrl/pinctrl-tegra-xusb.c is a phy provider as well,
   not a consumer, despite being outside of drivers/phy.


makes sense to me.

Thanks
Kishon


Re: [RFC] i2c-tools: i2ctransfer: add new tool

2015-04-20 Thread Jean Delvare
Hi Wolfram,

On Mon, 20 Apr 2015 19:36:38 +0200, Wolfram Sang wrote:
> On Fri, Feb 27, 2015 at 05:16:56PM +0100, Wolfram Sang wrote:
> > This tool allows to construct and concat multiple I2C messages into one
> > single transfer. Its aim is to test I2C master controllers, and so there
> > is no SMBus fallback.
> > 
> > Signed-off-by: Wolfram Sang 
> > ---
> > 
> > I've been missing such a tool a number of times now, so I finally got around to
> > writing it myself. As with all I2C tools, it can be dangerous, but it can also
> > be very useful when developing. I am not sure if distros should supply it, I'll
> > leave that to Jean's experience. For embedded build systems, I think this
> > should be selectable. It is RFC for now because it needs broader testing and some
> > more beautification. However, I've been using it already to test the i2c_quirk
> > infrastructure and Renesas I2C controllers.
> 
> Jean, my tests went well and so I want to brush it up for inclusion into
> i2c-tools upstream. Any show-stoppers you see from a high-level point of
> view?

I think it is a good idea, just I couldn't find the time to review it,
sorry :(
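
For readers unfamiliar with the idea of several I2C messages in one transfer, here is a minimal sketch using the Linux i2c-dev I2C_RDWR ioctl. It is not taken from the proposed tool; build_write_read() and submit() are hypothetical helper names:

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/i2c.h>
#include <linux/i2c-dev.h>

/*
 * Fill two messages: write wlen bytes, then read rlen bytes, both addressed
 * to addr. Submitted together they form a single combined transfer.
 */
static void build_write_read(struct i2c_msg msgs[2], uint16_t addr,
			     uint8_t *wbuf, uint16_t wlen,
			     uint8_t *rbuf, uint16_t rlen)
{
	msgs[0].addr = addr;
	msgs[0].flags = 0;		/* write */
	msgs[0].len = wlen;
	msgs[0].buf = wbuf;

	msgs[1].addr = addr;
	msgs[1].flags = I2C_M_RD;	/* read */
	msgs[1].len = rlen;
	msgs[1].buf = rbuf;
}

/* Submit the messages on an already opened /dev/i2c-N file descriptor. */
static int submit(int fd, struct i2c_msg *msgs, uint32_t nmsgs)
{
	struct i2c_rdwr_ioctl_data xfer = { .msgs = msgs, .nmsgs = nmsgs };

	return ioctl(fd, I2C_RDWR, &xfer);
}
```

Because the messages go to the adapter as one ioctl, there is no SMBus emulation in between, which matches the stated aim of testing I2C master controllers.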

-- 
Jean Delvare
SUSE L3 Support


Re: [PATCH v6] perf: __kmod_path__parse: deal with kernel module names in '[]' correctly.

2015-04-20 Thread Namhyung Kim
Hi Wang,

On Tue, Apr 21, 2015 at 03:33:10AM +, Wang Nan wrote:
> Before patch ba92732e9808df679ddf75c5ea1c0caae6d7dce2 ('perf kmaps:
> Check kmaps to make code more robust'), perf report and perf annotate
> will segfault if trace data contains kernel module information like
> this:
> 
>  # perf report -D -i ./perf.data
>  ...
>  0 0 0x188 [0x50]: PERF_RECORD_MMAP -1/0: [0xffbff1018000(0xf068000) @ 
> 0]: x [test_module]
>  ...
> 
>  # perf report -i ./perf.data --objdump=/path/to/objdump 
> --kallsyms=/path/to/kallsyms
> 
>  perf: Segmentation fault
>   backtrace 
>  /path/to/perf[0x503478]
>  /lib64/libc.so.6(+0x3545f)[0x7fb201f3745f]
>  /path/to/perf[0x499b56]
>  /path/to/perf(dso__load_kallsyms+0x13c)[0x49b56c]
>  /path/to/perf(dso__load+0x72e)[0x49c21e]
>  /path/to/perf(map__load+0x6e)[0x4ae9ee]
>  /path/to/perf(thread__find_addr_map+0x24c)[0x47deec]
>  /path/to/perf(perf_event__preprocess_sample+0x88)[0x47e238]
>  /path/to/perf[0x43ad02]
>  /path/to/perf[0x4b55bc]
>  /path/to/perf(ordered_events__flush+0xca)[0x4b57ea]
>  /path/to/perf[0x4b1a01]
>  /path/to/perf(perf_session__process_events+0x3be)[0x4b428e]
>  /path/to/perf(cmd_report+0xf11)[0x43bfc1]
>  /path/to/perf[0x474702]
>  /path/to/perf(main+0x5f5)[0x42de95]
>  /lib64/libc.so.6(__libc_start_main+0xf4)[0x7fb201f23bd4]
>  /path/to/perf[0x42dfc4]
> 
> This is because __kmod_path__parse treats '[' leading names as kernel
> name instead of names of kernel module. If perf.data contains build
> information and the buildid of such modules can be found, the DSO of
> it will be treated as kernel, not kernel module.

Sorry if I missed some prior discussion on it, but any chance to treat
them as modules instead of kernel binaries?
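
To make the question concrete, here is a simplified sketch of such a classification. It is a hypothetical helper, not perf's __kmod_path__parse(), and it ignores compressed modules and guest kernels:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Decide whether a map name looks like a kernel module.
 * kernel_mode says whether the record came from kernel space.
 */
static bool looks_like_kmod(const char *name, bool kernel_mode)
{
	size_t len = strlen(name);

	/* Nothing seen in user space is treated as a kernel module. */
	if (!kernel_mode)
		return false;

	/* Bracketed names are modules, except the kernel image itself. */
	if (name[0] == '[')
		return strcmp(name, "[kernel.kallsyms]") != 0;

	/* Otherwise fall back to the ".ko" suffix. */
	return len > 3 && strcmp(name + len - 3, ".ko") == 0;
}
```

Under these assumptions "[test_module]" seen in kernel mode counts as a module, while "[kernel.kallsyms]" does not.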

Thanks,
Namhyung


> It will then be passed to
> dso__load_kernel_sym() -> dso__load_kcore() because of --kallsyms
> argument.
> 
> The referred patch adds a NULL pointer check to avoid the segfault. However,
> such kernel modules are still processed incorrectly.
> 
> This patch fixes __kmod_path__parse, making it treat names like
> '[test_module]' as kernel modules.
> 
> kmod-path.c is also updated to reflect the above changes.
> 
> Signed-off-by: Wang Nan 
> ---
> Improves commit messages.
> 
> Since ba92732e9808df679ddf75c5ea1c0caae6d7dce2 is already in -tip tree,
> segfault will not be triggered even without this patch.
> ---
>  tools/perf/tests/kmod-path.c | 72 
> 
>  tools/perf/util/dso.c| 42 +++---
>  tools/perf/util/dso.h|  2 +-
>  tools/perf/util/header.c |  8 ++---
>  tools/perf/util/machine.c| 16 +-
>  5 files changed, 130 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/perf/tests/kmod-path.c b/tools/perf/tests/kmod-path.c
> index e8d7cbb..08c433b 100644
> --- a/tools/perf/tests/kmod-path.c
> +++ b/tools/perf/tests/kmod-path.c
> @@ -34,9 +34,21 @@ static int test(const char *path, bool alloc_name, bool 
> alloc_ext,
>   return 0;
>  }
>  
> +static int test_is_kernel_module(const char *path, int cpumode, bool expect)
> +{
> + TEST_ASSERT_VAL("is_kernel_module",
> + (!!is_kernel_module(path, cpumode)) == (!!expect));
> + pr_debug("%s (cpumode: %d) - is_kernel_module: %s\n",
> + path, cpumode, expect ? "true" : "false");
> + return 0;
> +}
> +
>  #define T(path, an, ae, k, c, n, e) \
>   TEST_ASSERT_VAL("failed", !test(path, an, ae, k, c, n, e))
>  
> +#define M(path, c, e) \
> + TEST_ASSERT_VAL("failed", !test_is_kernel_module(path, c, e))
> +
>  int test__kmod_path__parse(void)
>  {
>   /* pathalloc_name  alloc_ext   kmod  comp   name 
> ext */
> @@ -44,30 +56,90 @@ int test__kmod_path__parse(void)
>   T("///x-x.ko", false , true  , true, false, NULL   , 
> NULL);
>   T("///x-x.ko", true  , false , true, false, "[x_x]", 
> NULL);
>   T("///x-x.ko", false , false , true, false, NULL   , 
> NULL);
> + M("///x-x.ko", PERF_RECORD_MISC_CPUMODE_UNKNOWN, true);
> + M("///x-x.ko", PERF_RECORD_MISC_KERNEL, true);
> + M("///x-x.ko", PERF_RECORD_MISC_USER, false);
>  
>   /* pathalloc_name  alloc_ext   kmod  comp  name   ext */
>   T("///x.ko.gz", true , true  , true, true, "[x]", "gz");
>   T("///x.ko.gz", false, true  , true, true, NULL , "gz");
>   T("///x.ko.gz", true , false , true, true, "[x]", NULL);
>   T("///x.ko.gz", false, false , true, true, NULL , NULL);
> + M("///x.ko.gz", PERF_RECORD_MISC_CPUMODE_UNKNOWN, true);
> + M("///x.ko.gz", PERF_RECORD_MISC_KERNEL, true);
> + M("///x.ko.gz", PERF_RECORD_MISC_USER, false);
>  
>   /* path  alloc_name  alloc_ext  kmod   comp  nameext */
>   T("///x.gz", true  , true , false, true, "x.gz" ,"gz");
>  

[PATCH 3/6] perf kmem: Add --live option for current allocation stat

2015-04-20 Thread Namhyung Kim
Currently perf kmem shows total (page) allocation stat by default, but
sometimes one might want to see live (total alloc-only) requests/pages
only.  The new --live option does this by subtracting freed allocation
from the stat.
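
A rough sketch of the accounting difference, using a hypothetical page_counts struct rather than the real struct page_stat:

```c
#include <stdbool.h>
#include <stdint.h>

struct page_counts {
	uint64_t nr_alloc;
	uint64_t alloc_bytes;
};

/* Every allocation is counted in both modes. */
static void on_alloc(struct page_counts *s, uint64_t bytes)
{
	s->nr_alloc++;
	s->alloc_bytes += bytes;
}

/* Only the live mode subtracts freed allocations from the stat. */
static void on_free(struct page_counts *s, uint64_t bytes, bool live)
{
	if (!live)
		return;
	s->nr_alloc--;
	s->alloc_bytes -= bytes;
}
```

After two 4096-byte allocations and one free, the total view still reports 2 requests / 8192 bytes, while the live view reports 1 request / 4096 bytes.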

Acked-by: Pekka Enberg 
Signed-off-by: Namhyung Kim 
---
 tools/perf/Documentation/perf-kmem.txt |   5 ++
 tools/perf/builtin-kmem.c  | 110 -
 2 files changed, 73 insertions(+), 42 deletions(-)

diff --git a/tools/perf/Documentation/perf-kmem.txt 
b/tools/perf/Documentation/perf-kmem.txt
index 69e181272c51..ff0f433b3fce 100644
--- a/tools/perf/Documentation/perf-kmem.txt
+++ b/tools/perf/Documentation/perf-kmem.txt
@@ -56,6 +56,11 @@ OPTIONS
 --page::
Analyze page allocator events
 
+--live::
+   Show live page stat.  The perf kmem shows total allocation stat by
+   default, but this option shows live (currently allocated) pages
+   instead.  (This option works with --page option only)
+
 SEE ALSO
 
 linkperf:perf-record[1]
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 0393a7f3fa35..7ead9423fd7a 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -244,6 +244,7 @@ static unsigned long nr_page_fails;
 static unsigned long nr_page_nomatch;
 
 static bool use_pfn;
+static bool live_page;
 static struct perf_session *kmem_session;
 
 #define MAX_MIGRATE_TYPES  6
@@ -264,7 +265,7 @@ struct page_stat {
int nr_free;
 };
 
-static struct rb_root page_tree;
+static struct rb_root page_live_tree;
 static struct rb_root page_alloc_tree;
 static struct rb_root page_alloc_sorted;
 static struct rb_root page_caller_tree;
@@ -403,10 +404,19 @@ static u64 find_callsite(struct perf_evsel *evsel, struct 
perf_sample *sample)
return sample->ip;
 }
 
+struct sort_dimension {
+   const char  name[20];
+   sort_fn_t   cmp;
+   struct list_headlist;
+};
+
+static LIST_HEAD(page_alloc_sort_input);
+static LIST_HEAD(page_caller_sort_input);
+
 static struct page_stat *
-__page_stat__findnew_page(u64 page, bool create)
+__page_stat__findnew_page(struct page_stat *pstat, bool create)
 {
-   struct rb_node **node = &page_tree.rb_node;
+   struct rb_node **node = &page_live_tree.rb_node;
struct rb_node *parent = NULL;
struct page_stat *data;
 
@@ -416,7 +426,7 @@ __page_stat__findnew_page(u64 page, bool create)
parent = *node;
data = rb_entry(*node, struct page_stat, node);
 
-   cmp = data->page - page;
+   cmp = data->page - pstat->page;
if (cmp < 0)
node = &parent->rb_left;
else if (cmp > 0)
@@ -430,34 +440,28 @@ __page_stat__findnew_page(u64 page, bool create)
 
data = zalloc(sizeof(*data));
if (data != NULL) {
-   data->page = page;
+   data->page = pstat->page;
+   data->order = pstat->order;
+   data->gfp_flags = pstat->gfp_flags;
+   data->migrate_type = pstat->migrate_type;
 
rb_link_node(&data->node, parent, node);
-   rb_insert_color(&data->node, &page_tree);
+   rb_insert_color(&data->node, &page_live_tree);
}
 
return data;
 }
 
-static struct page_stat *page_stat__find_page(u64 page)
+static struct page_stat *page_stat__find_page(struct page_stat *pstat)
 {
-   return __page_stat__findnew_page(page, false);
+   return __page_stat__findnew_page(pstat, false);
 }
 
-static struct page_stat *page_stat__findnew_page(u64 page)
+static struct page_stat *page_stat__findnew_page(struct page_stat *pstat)
 {
-   return __page_stat__findnew_page(page, true);
+   return __page_stat__findnew_page(pstat, true);
 }
 
-struct sort_dimension {
-   const char  name[20];
-   sort_fn_t   cmp;
-   struct list_headlist;
-};
-
-static LIST_HEAD(page_alloc_sort_input);
-static LIST_HEAD(page_caller_sort_input);
-
 static struct page_stat *
 __page_stat__findnew_alloc(struct page_stat *pstat, bool create)
 {
@@ -615,17 +619,8 @@ static int perf_evsel__process_page_alloc_event(struct 
perf_evsel *evsel,
 * This is to find the current page (with correct gfp flags and
 * migrate type) at free event.
 */
-   pstat = page_stat__findnew_page(page);
-   if (pstat == NULL)
-   return -ENOMEM;
-
-   pstat->order = order;
-   pstat->gfp_flags = gfp_flags;
-   pstat->migrate_type = migrate_type;
-   pstat->callsite = callsite;
-
this.page = page;
-   pstat = page_stat__findnew_alloc(&this);
+   pstat = page_stat__findnew_page(&this);
if (pstat == NULL)
return -ENOMEM;
 
@@ -633,6 +628,16 @@ static int perf_evsel__process_page_alloc_event(struct 
perf_evsel *evsel,
pstat->alloc_bytes += bytes;
pstat->callsite = callsite;
 
+   if (!live_page) {
+  

[PATCH 1/6] perf kmem: Implement stat --page --caller

2015-04-20 Thread Namhyung Kim
It perf kmem support caller statistics for page.  Unlike slab case,
the tracepoints in page allocator don't provide callsite info.  So
it records with callchain and extracts callsite info.

Note that the callchain contains several memory allocation functions
which has no meaning for users.  So skip those functions to get proper
callsites.  I used following regex pattern to skip the allocator
functions:

  ^_?_?(alloc|get_free|get_zeroed)_pages?

This gave me a following list of functions:

  # perf kmem record --page sleep 3
  # perf kmem stat --page -v
  ...
  alloc func: __get_free_pages
  alloc func: get_zeroed_page
  alloc func: alloc_pages_exact
  alloc func: __alloc_pages_direct_compact
  alloc func: __alloc_pages_nodemask
  alloc func: alloc_page_interleave
  alloc func: alloc_pages_current
  alloc func: alloc_pages_vma
  alloc func: alloc_page_buffers
  alloc func: alloc_pages_exact_nid
  ...
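
A small stand-alone sketch of matching function names against that pattern with POSIX regcomp()/regexec(). The helper name is_alloc_func() is hypothetical, and recompiling the regex on every call is done only to keep the example short:

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if name matches the allocator-function pattern. */
static bool is_alloc_func(const char *name)
{
	static const char pattern[] = "^_?_?(alloc|get_free|get_zeroed)_pages?";
	regex_t re;
	bool match;

	/* The alternation and '?' need extended regex syntax. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;

	match = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}
```

Names such as __get_free_pages and alloc_page_interleave match, while callsites like handle_mm_fault or pte_alloc_one do not.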

The output looks mostly the same as --alloc (I also added a callsite column
to that) but groups entries by callsite.  Currently, the order,
migrate type and GFP flag info is for the last allocation and is not
guaranteed to be the same for all allocations from the callsite.

  
  ----------------------------------------------------------------------------
   Total_alloc (KB) | Hits  | Order | Mig.type | GFP flags | Callsite
  ----------------------------------------------------------------------------
              1,064 |   266 |     0 | UNMOVABL |      00d0 | __pollwait
                 52 |    13 |     0 | UNMOVABL |  002084d0 | pte_alloc_one
                 44 |    11 |     0 |  MOVABLE |  000280da | handle_mm_fault
                 20 |     5 |     0 |  MOVABLE |  000200da | do_cow_fault
                 20 |     5 |     0 |  MOVABLE |  000200da | do_wp_page
                 16 |     4 |     0 | UNMOVABL |      84d0 | __pmd_alloc
                 16 |     4 |     0 | UNMOVABL |      0200 | __tlb_remove_page
                 12 |     3 |     0 | UNMOVABL |      84d0 | __pud_alloc
                  8 |     2 |     0 | UNMOVABL |      0010 | bio_copy_user_iov
                  4 |     1 |     0 | UNMOVABL |  000200d2 | pipe_write
                  4 |     1 |     0 |  MOVABLE |  000280da | do_wp_page
                  4 |     1 |     0 | UNMOVABL |  002084d0 | pgd_alloc
  ----------------------------------------------------------------------------

Acked-by: Pekka Enberg 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-kmem.c | 327 +++---
 1 file changed, 306 insertions(+), 21 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 4f0f38462d97..3649eec6807f 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -10,6 +10,7 @@
 #include "util/header.h"
 #include "util/session.h"
 #include "util/tool.h"
+#include "util/callchain.h"
 
 #include "util/parse-options.h"
 #include "util/trace-event.h"
@@ -21,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static int kmem_slab;
 static int kmem_page;
@@ -241,6 +243,7 @@ static unsigned long nr_page_fails;
 static unsigned long nr_page_nomatch;
 
 static bool use_pfn;
+static struct perf_session *kmem_session;
 
 #define MAX_MIGRATE_TYPES  6
 #define MAX_PAGE_ORDER 11
@@ -250,6 +253,7 @@ static int order_stats[MAX_PAGE_ORDER][MAX_MIGRATE_TYPES];
 struct page_stat {
struct rb_node  node;
u64 page;
+   u64 callsite;
int order;
unsignedgfp_flags;
unsignedmigrate_type;
@@ -262,8 +266,144 @@ struct page_stat {
 static struct rb_root page_tree;
 static struct rb_root page_alloc_tree;
 static struct rb_root page_alloc_sorted;
+static struct rb_root page_caller_tree;
+static struct rb_root page_caller_sorted;
 
-static struct page_stat *search_page(unsigned long page, bool create)
+struct alloc_func {
+   u64 start;
+   u64 end;
+   char *name;
+};
+
+static int nr_alloc_funcs;
+static struct alloc_func *alloc_func_list;
+
+static int funcmp(const void *a, const void *b)
+{
+   const struct alloc_func *fa = a;
+   const struct alloc_func *fb = b;
+
+   if (fa->start > fb->start)
+   return 1;
+   else
+   return -1;
+}
+
+static int callcmp(const void *a, const void *b)
+{
+   const struct alloc_func *fa = a;
+   const struct alloc_func *fb = b;
+
+   if (fb->start <= fa->start && fa->end < fb->end)
+   return 0;
+
+   if (fa->start > fb->start)
+   return 1;
+   else
+   return -1;
+}
+
+static int build_alloc_func_list(void)
+{
+   int ret;
+   struct map *kernel_map;
+   struct symbol *sym;
+   struct rb_node *node;
+   struct alloc_func *func;
+   struct machine *machine = &kmem_session->machines.host;
+   regex_t alloc_func_regex;
+   

[PATCH 2/6] perf kmem: Support sort keys on page analysis

2015-04-20 Thread Namhyung Kim
Add new sort keys for page analysis: page, order, migtype and gfp.  The
existing 'bytes', 'hit' and 'callsite' sort keys also work for page.
Note that the -s/--sort option should be preceded by either the --slab
or the --page option to determine which mode the sort keys apply to.

Now it properly groups and sorts allocation stats, so the same
page/caller with a different order/migtype/gfp will be printed on a
separate line.
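
A minimal sketch of how a chain of sort keys such as 'order,hit' can be
applied: each key has its own comparator, and the next key is consulted
only when the previous one reports a tie.  The struct and function names
here are illustrative assumptions, not the ones used in builtin-kmem.c.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified per-callsite page statistics. */
struct page_stat_sketch {
	unsigned int order;
	unsigned long hits;
};

typedef int (*sort_fn_sketch)(const void *, const void *);

/* Larger values sort first (descending order). */
static int order_cmp_sketch(const void *a, const void *b)
{
	const struct page_stat_sketch *l = a, *r = b;

	if (l->order > r->order)
		return -1;
	if (l->order < r->order)
		return 1;
	return 0;
}

static int hit_cmp_sketch(const void *a, const void *b)
{
	const struct page_stat_sketch *l = a, *r = b;

	if (l->hits > r->hits)
		return -1;
	if (l->hits < r->hits)
		return 1;
	return 0;
}

/* The key chain corresponding to "-s order,hit". */
static const sort_fn_sketch sort_keys_sketch[] = {
	order_cmp_sketch,
	hit_cmp_sketch,
};

/* Walk the chain; the first key that sees a difference decides. */
static int chain_cmp_sketch(const void *a, const void *b)
{
	size_t i;

	for (i = 0; i < sizeof(sort_keys_sketch) / sizeof(sort_keys_sketch[0]); i++) {
		int ret = sort_keys_sketch[i](a, b);

		if (ret)
			return ret;
	}
	return 0;
}

static void sort_page_stats_sketch(struct page_stat_sketch *stats, size_t n)
{
	qsort(stats, n, sizeof(*stats), chain_cmp_sketch);
}
```

Two entries that compare equal on every key in the chain are the ones
that would be merged into a single output line.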

  # perf kmem stat --page --caller -l 10 -s order,hit

  ----------------------------------------------------------------------------
   Total alloc (KB) |   Hits | Order | Mig.type | GFP flags | Callsite
  ----------------------------------------------------------------------------
                 64 |      4 |     2 |  RECLAIM |  00285250 | new_slab
             50,144 | 12,536 |     0 |  MOVABLE |  0102005a | __page_cache_alloc
                 52 |     13 |     0 | UNMOVABL |  002084d0 | pte_alloc_one
                 40 |     10 |     0 |  MOVABLE |  000280da | handle_mm_fault
                 28 |      7 |     0 | UNMOVABL |  000000d0 | __pollwait
                 20 |      5 |     0 |  MOVABLE |  000200da | do_wp_page
                 20 |      5 |     0 |  MOVABLE |  000200da | do_cow_fault
                 16 |      4 |     0 | UNMOVABL |  00000200 | __tlb_remove_page
                 16 |      4 |     0 | UNMOVABL |  000084d0 | __pmd_alloc
                  8 |      2 |     0 | UNMOVABL |  000084d0 | __pud_alloc
                ... |    ... |   ... |      ... |       ... | ...
  ----------------------------------------------------------------------------


Acked-by: Pekka Enberg 
Signed-off-by: Namhyung Kim 
---
 tools/perf/Documentation/perf-kmem.txt |   6 +-
 tools/perf/builtin-kmem.c  | 393 ++---
 2 files changed, 313 insertions(+), 86 deletions(-)

diff --git a/tools/perf/Documentation/perf-kmem.txt b/tools/perf/Documentation/perf-kmem.txt
index 23219c65c16f..69e181272c51 100644
--- a/tools/perf/Documentation/perf-kmem.txt
+++ b/tools/perf/Documentation/perf-kmem.txt
@@ -37,7 +37,11 @@ OPTIONS
 
 -s ::
 --sort=::
-   Sort the output (default: frag,hit,bytes)
+   Sort the output (default: 'frag,hit,bytes' for slab and 'bytes,hit'
+   for page).  Available sort keys are 'ptr, callsite, bytes, hit,
+   pingpong, frag' for slab and 'page, callsite, bytes, hit, order,
+   migtype, gfp' for page.  This option should be preceded by one of the
+   mode selection options - i.e. --slab, --page, --alloc and/or --caller.
 
 -l ::
 --line=::
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 3649eec6807f..0393a7f3fa35 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -30,7 +30,7 @@ static intkmem_page;
 static longkmem_page_size;
 
 struct alloc_stat;
-typedef int (*sort_fn_t)(struct alloc_stat *, struct alloc_stat *);
+typedef int (*sort_fn_t)(void *, void *);
 
 static int alloc_flag;
 static int caller_flag;
@@ -181,8 +181,8 @@ static int perf_evsel__process_alloc_node_event(struct perf_evsel *evsel,
return ret;
 }
 
-static int ptr_cmp(struct alloc_stat *, struct alloc_stat *);
-static int callsite_cmp(struct alloc_stat *, struct alloc_stat *);
+static int ptr_cmp(void *, void *);
+static int slab_callsite_cmp(void *, void *);
 
 static struct alloc_stat *search_alloc_stat(unsigned long ptr,
unsigned long call_site,
@@ -223,7 +223,8 @@ static int perf_evsel__process_free_event(struct perf_evsel *evsel,
s_alloc->pingpong++;
 
s_caller = search_alloc_stat(0, s_alloc->call_site,
-_caller_stat, callsite_cmp);
+_caller_stat,
+slab_callsite_cmp);
if (!s_caller)
return -1;
s_caller->pingpong++;
@@ -448,26 +449,14 @@ static struct page_stat *page_stat__findnew_page(u64 page)
return __page_stat__findnew_page(page, true);
 }
 
-static int page_stat_cmp(struct page_stat *a, struct page_stat *b)
-{
-   if (a->page > b->page)
-   return -1;
-   if (a->page < b->page)
-   return 1;
-   if (a->order > b->order)
-   return -1;
-   if (a->order < b->order)
-   return 1;
-   if (a->migrate_type > b->migrate_type)
-   return -1;
-   if (a->migrate_type < b->migrate_type)
-   return 1;
-   if (a->gfp_flags > b->gfp_flags)
-   return -1;
-   if (a->gfp_flags < b->gfp_flags)
-   return 1;
-   return 0;
-}
+struct sort_dimension {
+   const char  name[20];
+   sort_fn_t   cmp;
+   struct list_headlist;
+};
+
+static 

[PATCH 4/6] perf kmem: Print gfp flags in human readable string

2015-04-20 Thread Namhyung Kim
Save libtraceevent output and print it in the header.

  # perf kmem stat --page --caller
  #
  # GFP flags
  # ---------
  # 00000010:       NI: GFP_NOIO
  # 000000d0:        K: GFP_KERNEL
  # 00000200:      NWR: GFP_NOWARN
  # 000084d0:    K|R|Z: GFP_KERNEL|GFP_REPEAT|GFP_ZERO
  # 000200d2:       HU: GFP_HIGHUSER
  # 000200da:      HUM: GFP_HIGHUSER_MOVABLE
  # 000280da:    HUM|Z: GFP_HIGHUSER_MOVABLE|GFP_ZERO
  # 002084d0: K|R|Z|NT: GFP_KERNEL|GFP_REPEAT|GFP_ZERO|GFP_NOTRACK
  # 0102005a:  NF|HW|M: GFP_NOFS|GFP_HARDWALL|GFP_MOVABLE

  ---------------------------------------------------------------------------
   Total alloc (KB) | Hits  | Order | Mig.type | GFP flags | Callsite
  ---------------------------------------------------------------------------
                 60 |    15 |     0 | UNMOVABL | K|R|Z|NT  | pte_alloc_one
                 40 |    10 |     0 |  MOVABLE | HUM|Z     | handle_mm_fault
                 24 |     6 |     0 |  MOVABLE | HUM       | do_wp_page
                 24 |     6 |     0 | UNMOVABL | K         | __pollwait
   ...
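
The compaction step can be sketched in isolation as follows: split a
flag string such as "GFP_KERNEL|GFP_REPEAT|GFP_ZERO" on '|' and map each
name through a lookup table.  This is only a sketch under assumptions:
the table is a small subset of the one in the patch, and the function
name and the caller-supplied fixed-size output buffer are hypothetical
(the patch itself grows a heap buffer with realloc()).

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Subset of the name -> compact name mapping, for illustration only. */
static const struct {
	const char *original;
	const char *compact;
} compact_table_sketch[] = {
	{ "GFP_HIGHUSER_MOVABLE", "HUM" },
	{ "GFP_KERNEL",           "K"   },
	{ "GFP_REPEAT",           "R"   },
	{ "GFP_ZERO",             "Z"   },
	{ "GFP_NOTRACK",          "NT"  },
};

/*
 * Write the compact form of 'flags' into 'out' (capacity 'outsz').
 * Returns 0 on success, -1 if the result does not fit.
 * Names that are not in the table are skipped.
 */
static int compact_flags_sketch(const char *flags, char *out, size_t outsz)
{
	size_t used = 0;
	const char *p = flags;

	if (outsz == 0)
		return -1;
	out[0] = '\0';

	while (*p) {
		size_t toklen = strcspn(p, "|");	/* length of this name */
		size_t i;

		for (i = 0; i < sizeof(compact_table_sketch) /
				sizeof(compact_table_sketch[0]); i++) {
			const char *orig = compact_table_sketch[i].original;
			const char *cpt = compact_table_sketch[i].compact;
			int n;

			if (strlen(orig) != toklen || strncmp(orig, p, toklen))
				continue;

			/* Prepend '|' for every entry after the first one. */
			n = snprintf(out + used, outsz - used, "%s%s",
				     used ? "|" : "", cpt);
			if (n < 0 || (size_t)n >= outsz - used)
				return -1;
			used += (size_t)n;
			break;
		}

		p += toklen;
		if (*p == '|')
			p++;
	}
	return 0;
}
```

With this subset, "GFP_KERNEL|GFP_REPEAT|GFP_ZERO|GFP_NOTRACK" maps to
"K|R|Z|NT", matching the header entry for 002084d0.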

Requested-by: Joonsoo Kim 
Suggested-by: Minchan Kim 
Acked-by: Pekka Enberg 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-kmem.c | 222 +++---
 1 file changed, 209 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 7ead9423fd7a..1c668953c7ec 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -581,6 +581,176 @@ static bool valid_page(u64 pfn_or_page)
return true;
 }
 
+struct gfp_flag {
+   unsigned int flags;
+   char *compact_str;
+   char *human_readable;
+};
+
+static struct gfp_flag *gfps;
+static int nr_gfps;
+
+static int gfpcmp(const void *a, const void *b)
+{
+   const struct gfp_flag *fa = a;
+   const struct gfp_flag *fb = b;
+
+   return fa->flags - fb->flags;
+}
+
+/* see include/trace/events/gfpflags.h */
+static const struct {
+   const char *original;
+   const char *compact;
+} gfp_compact_table[] = {
+   { "GFP_TRANSHUGE",  "THP" },
+   { "GFP_HIGHUSER_MOVABLE",   "HUM" },
+   { "GFP_HIGHUSER",   "HU" },
+   { "GFP_USER",   "U" },
+   { "GFP_TEMPORARY",  "TMP" },
+   { "GFP_KERNEL", "K" },
+   { "GFP_NOFS",   "NF" },
+   { "GFP_ATOMIC", "A" },
+   { "GFP_NOIO",   "NI" },
+   { "GFP_HIGH",   "H" },
+   { "GFP_WAIT",   "W" },
+   { "GFP_IO", "I" },
+   { "GFP_COLD",   "CO" },
+   { "GFP_NOWARN", "NWR" },
+   { "GFP_REPEAT", "R" },
+   { "GFP_NOFAIL", "NF" },
+   { "GFP_NORETRY","NR" },
+   { "GFP_COMP",   "C" },
+   { "GFP_ZERO",   "Z" },
+   { "GFP_NOMEMALLOC", "NMA" },
+   { "GFP_MEMALLOC",   "MA" },
+   { "GFP_HARDWALL",   "HW" },
+   { "GFP_THISNODE",   "TN" },
+   { "GFP_RECLAIMABLE","RC" },
+   { "GFP_MOVABLE","M" },
+   { "GFP_NOTRACK","NT" },
+   { "GFP_NO_KSWAPD",  "NK" },
+   { "GFP_OTHER_NODE", "ON" },
+   { "GFP_NOWAIT", "NW" },
+};
+
+static size_t max_gfp_len;
+
+static char *compact_gfp_flags(char *gfp_flags)
+{
+   char *orig_flags = strdup(gfp_flags);
+   char *new_flags = NULL;
+   char *str, *pos;
+   size_t len = 0;
+
+   if (orig_flags == NULL)
+   return NULL;
+
+   str = strtok_r(orig_flags, "|", &pos);
+   while (str) {
+   size_t i;
+   char *new;
+   const char *cpt;
+
+   for (i = 0; i < ARRAY_SIZE(gfp_compact_table); i++) {
+   if (strcmp(gfp_compact_table[i].original, str))
+   continue;
+
+   cpt = gfp_compact_table[i].compact;
+   new = realloc(new_flags, len + strlen(cpt) + 2);
+   if (new == NULL) {
+   free(new_flags);
+   return NULL;
+   }
+
+   new_flags = new;
+
+   if (!len) {
+   strcpy(new_flags, cpt);
+   } else {
+   strcat(new_flags, "|");
+   strcat(new_flags, cpt);
+   len++;
+   }
+
+   len += strlen(cpt);
+   }
+
+   str = strtok_r(NULL, "|", &pos);
+   }
+
+   if (max_gfp_len < len)
+   max_gfp_len = len;
+
+   

[PATCH 5/6] perf kmem: Add kmem.default config option

2015-04-20 Thread Namhyung Kim
Currently the perf kmem command selects --slab if neither --slab nor
--page is given, for backward compatibility.  Add a kmem.default config
option to select the default mode ('page' or 'slab').
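
The intended precedence can be sketched as below: an explicit --slab or
--page always wins, and the configured default is applied only when
neither was given.  The enum and function names are hypothetical and the
logic is reduced to plain integers, so treat it as an illustration of
the rule rather than the code in builtin-kmem.c.

```c
#include <string.h>

enum kmem_mode_sketch { MODE_SLAB_SKETCH, MODE_PAGE_SKETCH };

/*
 * Parse a kmem.default value.  Returns 0 on success and -1 for an
 * unknown value, in which case '*def' is left untouched.
 */
static int parse_default_sketch(const char *value, enum kmem_mode_sketch *def)
{
	if (!strcmp(value, "slab"))
		*def = MODE_SLAB_SKETCH;
	else if (!strcmp(value, "page"))
		*def = MODE_PAGE_SKETCH;
	else
		return -1;
	return 0;
}

/* Apply the configured default only if no mode option was given. */
static void resolve_mode_sketch(int *slab, int *page, enum kmem_mode_sketch def)
{
	if (*slab == 0 && *page == 0) {
		if (def == MODE_SLAB_SKETCH)
			*slab = 1;
		else
			*page = 1;
	}
}
```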

  # cat ~/.perfconfig
  [kmem]
default = page

  # perf kmem stat

  SUMMARY (page allocator)
  ========================
  Total allocation requests     :            1,518   [            6,096 KB ]
  Total free requests           :            1,431   [            5,748 KB ]

  Total alloc+freed requests    :            1,330   [            5,344 KB ]
  Total alloc-only requests     :              188   [              752 KB ]
  Total free-only requests      :              101   [              404 KB ]

  Total allocation failures     :                0   [                0 KB ]
  ...

Acked-by: Pekka Enberg 
Cc: Taeung Song 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-kmem.c | 32 +---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 1c668953c7ec..828b7284e547 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -28,6 +28,10 @@ static int   kmem_slab;
 static int kmem_page;
 
 static longkmem_page_size;
+static enum {
+   KMEM_SLAB,
+   KMEM_PAGE,
+} kmem_default = KMEM_SLAB;  /* for backward compatibility */
 
 struct alloc_stat;
 typedef int (*sort_fn_t)(void *, void *);
@@ -1710,7 +1714,8 @@ static int parse_sort_opt(const struct option *opt __maybe_unused,
if (!arg)
return -1;
 
-   if (kmem_page > kmem_slab) {
+   if (kmem_page > kmem_slab ||
+   (kmem_page == 0 && kmem_slab == 0 && kmem_default == KMEM_PAGE)) {
if (caller_flag > alloc_flag)
return setup_page_sorting(_caller_sort, arg);
else
@@ -1826,6 +1831,22 @@ static int __cmd_record(int argc, const char **argv)
return cmd_record(i, rec_argv, NULL);
 }
 
+static int kmem_config(const char *var, const char *value, void *cb)
+{
+   if (!strcmp(var, "kmem.default")) {
+   if (!strcmp(value, "slab"))
+   kmem_default = KMEM_SLAB;
+   else if (!strcmp(value, "page"))
+   kmem_default = KMEM_PAGE;
+   else
+   pr_err("invalid default value ('slab' or 'page' required): %s\n",
+  value);
+   return 0;
+   }
+
+   return perf_default_config(var, value, cb);
+}
+
 int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
 {
const char * const default_slab_sort = "frag,hit,bytes";
@@ -1862,14 +1883,19 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
struct perf_session *session;
int ret = -1;
 
+   perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
kmem_subcommands, kmem_usage, 0);
 
if (!argc)
usage_with_options(kmem_usage, kmem_options);
 
-   if (kmem_slab == 0 && kmem_page == 0)
-   kmem_slab = 1;  /* for backward compatibility */
+   if (kmem_slab == 0 && kmem_page == 0) {
+   if (kmem_default == KMEM_SLAB)
+   kmem_slab = 1;
+   else
+   kmem_page = 1;
+   }
 
if (!strncmp(argv[0], "rec", 3)) {
symbol__init(NULL);
-- 
2.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCHSET 0/6] perf kmem: Implement page allocation analysis (v8)

2015-04-20 Thread Namhyung Kim
Hello,

Currently the perf kmem command only analyzes SLAB memory allocation, and
I'd like to introduce page allocation analysis as well.  Users can use the
--slab and/or --page options to select the mode.  If neither option is
given, it does slab allocation analysis for backward compatibility.

 * changes in v8)
   - rename 'stat' to 'pstat' due to build error
   - add Acked-by from Pekka

 * changes in v7)
   - drop already merged patches
   - check return value of map__load()  (Arnaldo)
   - rename to page_stat__findnew_*() functions  (Arnaldo)
   - show warning when trying to run stat before record

 * changes in v6)
   - add -i option fix  (Jiri)
   - libtraceevent operator priority fix

 * changes in v5)
   - print migration type and gfp flags in more compact form  (Arnaldo)
   - add kmem.default config option

 * changes in v4)
   - use pfn instead of struct page * in tracepoints  (Joonsoo, Ingo)
   - print gfp flags in human readable string  (Joonsoo, Minchan)

 * changes in v3)
   - add live page statistics

 * changes in v2)
   - Use thousand grouping for big numbers - i.e. 12345 -> 12,345  (Ingo)
   - Improve output stat readability  (Ingo)
   - Remove alloc size column as it can be calculated from hits and order

In this patchset, I used two kmem events, kmem:mm_page_alloc and
kmem:mm_page_free, for the analysis as they can track almost all memory
allocation/free paths AFAIK.  However, unlike slab tracepoint events,
those page allocation events don't provide callsite info directly.  So
I recorded callchains and extracted callsites as described below.

Normal page allocation callchains look like this:

  360a7e __alloc_pages_nodemask
  3a711c alloc_pages_current
  357bc7 __page_cache_alloc   <-- callsite
  357cf6 pagecache_get_page
   48b0a prepare_pages
   494d3 __btrfs_buffered_write
   49cdf btrfs_file_write_iter
  3ceb6e new_sync_write
  3cf447 vfs_write
  3cff99 sys_write
  7556e9 system_call
    f880 __write_nocancel
   33eb9 cmd_record
   4b38e cmd_kmem
   7aa23 run_builtin
   27a9a main
   20800 __libc_start_main

But the first two are internal page allocation functions, so they should
be skipped.  To identify such allocation functions, I used the following
regex:

  ^_?_?(alloc|get_free|get_zeroed)_pages?

This gave me the following list of functions (you can see this with -v):

  alloc func: __get_free_pages
  alloc func: get_zeroed_page
  alloc func: alloc_pages_exact
  alloc func: __alloc_pages_direct_compact
  alloc func: __alloc_pages_nodemask
  alloc func: alloc_page_interleave
  alloc func: alloc_pages_current
  alloc func: alloc_pages_vma
  alloc func: alloc_page_buffers
  alloc func: alloc_pages_exact_nid

After skipping those functions, the callsite is '__page_cache_alloc'.
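
The skipping step can be sketched with POSIX regcomp()/regexec().  Note
the simplifying assumption: this sketch matches symbol names in the
callchain directly, whereas the patch builds a list of address ranges
for the matching kernel symbols and compares addresses.  The function
name find_callsite_sketch() is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Return the first symbol in 'chain' (innermost frame first) that does
 * not match the allocator regex, or NULL if every frame matches or the
 * regex fails to compile.
 */
static const char *find_callsite_sketch(const char *const *chain, size_t n)
{
	regex_t re;
	const char *found = NULL;
	size_t i;

	if (regcomp(&re, "^_?_?(alloc|get_free|get_zeroed)_pages?",
		    REG_EXTENDED | REG_NOSUB))
		return NULL;

	for (i = 0; i < n; i++) {
		/* The first frame that is not an allocator is the callsite. */
		if (regexec(&re, chain[i], 0, NULL, 0) == REG_NOMATCH) {
			found = chain[i];
			break;
		}
	}

	regfree(&re);
	return found;
}
```

Given the first four frames of the callchain above, this returns
"__page_cache_alloc".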

Other information such as allocation order, migration type and gfp
flags are provided by tracepoint events.

By default the output is sorted by total allocation bytes, but you
can change this with the -s/--sort option.  The following sort keys are
added to support page analysis: page, order, migtype, gfp.  The existing
'callsite', 'bytes' and 'hit' sort keys can also be used.

An example follows:

  # perf kmem record --page sleep 5
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 1.065 MB perf.data (2949 samples) ]

  # perf kmem stat --page --caller -s order,hit -l 10
  #
  # GFP flags
  # ---------
  # 00000010:         NI: GFP_NOIO
  # 000000d0:          K: GFP_KERNEL
  # 00000200:        NWR: GFP_NOWARN
  # 000052d0: K|NWR|NR|C: GFP_KERNEL|GFP_NOWARN|GFP_NORETRY|GFP_COMP
  # 000084d0:      K|R|Z: GFP_KERNEL|GFP_REPEAT|GFP_ZERO
  # 000200d0:          U: GFP_USER
  # 000200d2:         HU: GFP_HIGHUSER
  # 000200da:        HUM: GFP_HIGHUSER_MOVABLE
  # 000280da:      HUM|Z: GFP_HIGHUSER_MOVABLE|GFP_ZERO
  # 002084d0:   K|R|Z|NT: GFP_KERNEL|GFP_REPEAT|GFP_ZERO|GFP_NOTRACK
  # 0102005a:    NF|HW|M: GFP_NOFS|GFP_HARDWALL|GFP_MOVABLE

  ----------------------------------------------------------------------------------
   Total alloc (KB) | Hits  | Order | Mig.type | GFP flags  | Callsite
  ----------------------------------------------------------------------------------
                 16 |     1 |     2 | UNMOVABL | K|NWR|NR|C | alloc_skb_with_frags
                 24 |     3 |     1 | UNMOVABL | K|NWR|NR|C | alloc_skb_with_frags
              3,876 |   969 |     0 |  MOVABLE | HUM        | shmem_alloc_page
                972 |   243 |     0 | UNMOVABL | K          | __pollwait
                624 |   156 |     0 |  MOVABLE | NF|HW|M    | __page_cache_alloc
                304 |    76 |     0 | UNMOVABL | U          | dma_generic_alloc_coherent
                108 |    27 |     0 |  MOVABLE | HUM|Z      | handle_mm_fault
                 56 |    14 |     0 | UNMOVABL | K|R|Z|NT   | pte_alloc_one
                 24 |     6 |     0 |  MOVABLE | HUM        | do_wp_page
                 16 |     4 |     0 | UNMOVABL |

[PATCH 6/6] perf kmem: Show warning when trying to run stat without record

2015-04-20 Thread Namhyung Kim
Sometimes one can mistakenly run perf kmem stat without running perf
kmem record first, or with a different configuration, such as recording
with --slab and running stat with --page.  Show a warning message like
the one below to inform the user:

  # perf kmem stat --page --caller
  Not found page events.  Have you run 'perf kmem record --page' before?
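
The check amounts to scanning the recorded event names for the one the
selected mode depends on.  A minimal sketch, assuming the event names
are available as a plain array of strings (in perf they come from the
session's evlist); the function names are hypothetical, while the event
names and the message format are the ones used in this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return true if 'wanted' is among the recorded event names. */
static bool has_event_sketch(const char *const *names, size_t n,
			     const char *wanted)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!strcmp(names[i], wanted))
			return true;
	}
	return false;
}

/*
 * 'mode' is "slab" or "page".  Returns 0 if the required event was
 * recorded, otherwise -1 with a warning written to 'err'.
 */
static int check_recorded_sketch(const char *const *names, size_t n,
				 const char *mode, char *err, size_t errsz)
{
	const char *wanted = !strcmp(mode, "page") ?
			     "kmem:mm_page_alloc" : "kmem:kmalloc";

	if (has_event_sketch(names, n, wanted))
		return 0;

	snprintf(err, errsz,
		 "Not found %s events.  Have you run 'perf kmem record --%s' before?\n",
		 mode, mode);
	return -1;
}
```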

Acked-by: Pekka Enberg 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-kmem.c | 31 ---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 828b7284e547..f29a766f18f8 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -1882,6 +1882,7 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
};
struct perf_session *session;
int ret = -1;
+   const char errmsg[] = "Not found %s events.  Have you run 'perf kmem record --%s' before?\n";
 
perf_config(kmem_config, NULL);
argc = parse_options_subcommand(argc, argv, kmem_options,
@@ -1908,11 +1909,35 @@ int cmd_kmem(int argc, const char **argv, const char *prefix __maybe_unused)
if (session == NULL)
return -1;
 
+   if (kmem_slab) {
+   struct perf_evsel *evsel;
+   bool found = false;
+
+   evlist__for_each(session->evlist, evsel) {
+   if (!strcmp(perf_evsel__name(evsel), "kmem:kmalloc")) {
+   found = true;
+   break;
+   }
+   }
+   if (!found) {
+   pr_err(errmsg, "slab", "slab");
+   return -1;
+   }
+   }
+
if (kmem_page) {
-   struct perf_evsel *evsel = perf_evlist__first(session->evlist);
+   struct perf_evsel *evsel;
+   bool found = false;
 
-   if (evsel == NULL || evsel->tp_format == NULL) {
-   pr_err("invalid event found.. aborting\n");
+   evlist__for_each(session->evlist, evsel) {
+   if (!strcmp(perf_evsel__name(evsel),
+   "kmem:mm_page_alloc")) {
+   found = true;
+   break;
+   }
+   }
+   if (!found) {
+   pr_err(errmsg, "page", "page");
return -1;
}
 
-- 
2.3.4



Re: [V6,1/9] elf: Add new powerpc specifc core note sections

2015-04-20 Thread Anshuman Khandual
On 04/20/2015 05:57 PM, Ulrich Weigand wrote:
> Anshuman Khandual wrote on 13.04.2015 10:48:57:
>> On 04/10/2015 04:03 PM, Ulrich Weigand wrote:
>>> - You provide checkpointed FPR and VMX registers, but there doesn't seem
>>>   to be any way to get at the checkpointed *VSX* registers (i.e. the part
>>>   that is neither covered by FPR or VMX, corresponding to NT_PPC_VSX).
>>
>> Will change vsr_get, vsr_set functions as we have done for fpr_get and
>> fpr_set functions. Also will add one more ELF core note NT_PPC_TM_CVSX to
>> fetch the checkpointed state of VSX register while inside the transaction.
> 
> OK.
> 
>>>   I would much prefer three separate regsets (e.g. NT_PPC_DSCR, NT_PPC_PPR,
>>>   and NT_PPC_TAR), each of which is available and valid if and only if the
>>>   current processor actually has the register in question.
>>
>> That's like adding one ELF core note for every single register because we
>> cannot put them in any category. Then as Michael Ellerman had pointed out
>> to include a lot more registers in this MISC category (which we are not
>> doing right now in the interest of having minimum support available before
>> we look at the full possible list of MISC registers), we should add one ELF
>> core note section for each of those individual registers ? I am not sure.
> 
> This confuses me a bit.  My understanding was that ptrace regsets, once
> defined, should never change in the future.  (GDB will only check whether
> or not a regset is supported; if it is, it will expect the contents to be
> as it expects them to be.)  "Including a lot more registers" would
> therefore seem to require adding new regsets anyway, which is one of the
> reasons why I disagree a "MISC" regset is particularly useful.

Yeah right. I have started thinking that combinations like
(NT_PPC_TAR, NT_PPC_CTAR), (NT_PPC_PPR, NT_PPC_CPPR) and
(NT_PPC_DSCR, NT_PPC_CDSCR) make more sense!

> 
>>> - Similarly, the NT_PPC_TM_SPR regset as currently defined mixes and
>>>   matches registers with different "lifetimes".  The transactional memory
>>>   registers (TFHAR, TEXASR, TFIAR) are available *always* on machines that
>>>   support transactions.  But the other registers in that regset are
>>>   checkpointed versions that are only available/valid within a
>>>   transaction.  I think a better way to faithfully represent this would
>>>   be to have the NT_PPC_TM_SPR regset only contain the transactional
>>>   memory registers, and use separate regsets for the checkpointed
>>>   registers -- those should parallel the non-checkpointed register regset.
>>
>> Right now, we support NT_PPC_TM_SPR only inside the transaction, so we
>> don't have the problem with different "lifetimes" registers accessed
>> together. But yes, I get your point.
> 
> Since the transactional SPRs are accessible from user space outside of a
> transaction, it would make sense for them to be accessible from ptrace as
> well.  If the current patch set doesn't do that, I guess it would be better
> to change that.

Yeah I agree. Will change it.

> 
>>> - Particularly confusing to me is the "checkpointed original MSR" which
>>>   currently also resides in NT_PPC_TM_SPR.  What exactly is this?  How
>>>   does that differ from the MSR slot in the NT_PPC_TM_CGPR regset?
>>
>> I believed it stores the checkpointed MSR value which was in the register
>> before the transaction started. But then how it is different from the
>> ckpt_regs.msr, I am not sure. Mikey or Michael should be able to clarify
>> more on this. I can see "orig_msr" getting used in many places to hold the
>> checkpointed value of MSR.
> 
> Your other mail states that the orig_msr may be irrelevant for ptrace
> anyway ... that would be OK with me as well.

Yeah. The variable tm_orig_msr is used to compute the MSR state inside
the kernel, or what would be passed to user space while returning
at various stages of the transaction, whereas ckpt_regs.msr contains
the latest checkpointed MSR value to be fetched by ptrace. That's my
understanding as of now.

> 
>>>   In any case, it seems the ptrace set-register case currently allows user
>>>   space to restore *any* arbitrary value into the checkpointed MSR, which
>>>   would presumably get restored into the real MSR at some point, unless
>>>   I'm missing something here.  Do we not need a check that only safe bits
>>>   are modified, just like with ptrace access to the real MSR?
>>
>> Where and which safe bits do we check before writing any value into the
>> MSR register from ptrace interface ? May be I am missing something here.
> 
> All ptrace accesses to *set* the regular msr go via this routine:
> 
> static int set_user_msr(struct task_struct *task, unsigned long msr)
> {
> task->thread.regs->msr &= ~MSR_DEBUGCHANGE;
> task->thread.regs->msr |= msr & MSR_DEBUGCHANGE;
> return 0;
> }
> 
> I think we'd need to do the equivalent whenever changing the checkpointed
> 
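
The masking idiom in the quoted set_user_msr() can be shown in isolation.
This is only a sketch: the mask value below is a placeholder chosen for
illustration (the kernel uses MSR_DEBUGCHANGE), and the function name is
hypothetical.  The point is that bits outside the mask keep their
current value no matter what the debugger requests.

```c
/* Placeholder mask of debugger-writable bits, for illustration only. */
#define DEBUGCHANGE_SKETCH 0x600UL

/*
 * Return 'current' with only the bits covered by the mask replaced by
 * the corresponding bits of 'requested'.
 */
static unsigned long masked_msr_update_sketch(unsigned long current,
					       unsigned long requested)
{
	current &= ~DEBUGCHANGE_SKETCH;			/* clear writable bits */
	current |= requested & DEBUGCHANGE_SKETCH;	/* copy only those bits */
	return current;
}
```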

Re: [PATCH V6 00/10] namespaces: log namespaces per task

2015-04-20 Thread Eric W. Biederman
Richard Guy Briggs  writes:

> The purpose is to track namespace instances in use by logged processes from
> the perspective of init_*_ns by logging the namespace IDs (device ID and
> namespace inode - offset).

In broad strokes the user interface appears correct.

Things that I see that concern me:

- After Al's most recent changes these inodes no longer live in the proc
  superblock so the device number reported in these patches is
  incorrect.

- I am nervous about audit logs being flooded with users creating lots
  of namespaces.  But that is more your lookout than mine.

- unshare is not logging when it creates new namespaces.

As small numbers are nice and these inodes all live in their own
superblock now we should be able to remove the games with
PROC_DYNAMIC_FIRST and just use small numbers for these inodes
everywhere.

I have answered your comments below.

> 1/10 exposes proc's ns entries structure which lists a number of useful
> operations per namespace type for other subsystems to use.
>
> 2/10  proc_ns: define PROC_*_INIT_INO in terms of PROC_DYNAMIC_FIRST
>
> 3/10 provides an example of usage for audit_log_task_info() which is used by
> syscall audits, among others.  audit_log_task() and
> audit_common_recv_message() would be other potential use cases.
>
> Proposed output format:
> This differs slightly from Aristeu's patch because of the label conflict with
> "pid=" due to including it in existing records rather than it being a separate
> record.  It has now returned to being a separate record.  The proc device
> major/minor are listed in hexadecimal and namespace IDs are the proc inode
> minus the base offset.
>   type=NS_INFO msg=audit(1408577535.306:82): dev=00:03 netns=3 utsns=-3 ipcns=-4 pidns=-1 userns=-2 mntns=0
>
> 4/10 change audit startup from __initcall to subsys_initcall to get it started
> earlier to be able to receive initial namespace log messages.
>
> 5/10 tracks the creation and deletion of namespaces, listing the type of
> namespace instance, proc device ID, related namespace id if there is one and
> the newly minted namespace ID.
>
> Proposed output format for initial namespace creation:
>   type=AUDIT_NS_INIT_UTS msg=audit(1408577534.868:5): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel dev=00:03 old_utsns=(none) utsns=-3 res=1
>   type=AUDIT_NS_INIT_USER msg=audit(1408577534.868:6): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel dev=00:03 old_userns=(none) userns=-2 res=1
>   type=AUDIT_NS_INIT_PID msg=audit(1408577534.868:7): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel dev=00:03 old_pidns=(none) pidns=-1 res=1
>   type=AUDIT_NS_INIT_MNT msg=audit(1408577534.868:8): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel dev=00:03 old_mntns=(none) mntns=0 res=1
>   type=AUDIT_NS_INIT_IPC msg=audit(1408577534.868:9): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel dev=00:03 old_ipcns=(none) ipcns=-4 res=1
>   type=AUDIT_NS_INIT_NET msg=audit(1408577533.500:10): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel dev=00:03 old_netns=(none) netns=2 res=1
>
> And a CLONE action would result in:
>   type=type=AUDIT_NS_INIT_NET msg=audit(1408577535.306:81): pid=481 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 dev=00:03 old_netns=2 netns=3 res=1
>
> While deleting a namespace would result in:
>   type=type=AUDIT_NS_DEL_MNT msg=audit(1408577552.221:85): pid=481 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 dev=00:03 mntns=4 res=1
>
> 6/10 accepts a PID from userspace and requests logging an AUDIT_NS_INFO record
> type (CAP_AUDIT_CONTROL required).
>
> 7/10 is a macro for CLONE_NEW_* flags.
>
> 8/10 adds auditing on creation of namespace(s) in fork.
>
> 9/10 adds auditing a change of namespace on setns.
>
> 10/10 attaches a AUDIT_NS_INFO record to AUDIT_VIRT_CONTROL records
> (CAP_AUDIT_WRITE required).
>
>
> v5 -> v6:
>   Switch to using namespace ID based on namespace proc inode minus base offset
>   Added proc device ID to qualify proc inode reference
>   Eliminate exposed /proc interface
>
> v4 -> v5:
>   Clean up prototypes for dependencies on CONFIG_NAMESPACES.
>   Add AUDIT_NS_INFO record type to AUDIT_VIRT_CONTROL record.
>   Log AUDIT_NS_INFO with PID.
>   Move /proc//ns_* patches to end of patchset to deprecate them.
>   Log on changing ns (setns).
>   Log on creating new namespaces when forking.
>   Added a macro for CLONE_NEW*.
>
> v3 -> v4:
>   Separate out the NS_INFO message from the SYSCALL message.
>   Moved audit_log_namespace_info() out of audit_log_task_info().
>   Use a separate message type per namespace type for each of INIT/DEL.
>   Make ns= easier to search across NS_INFO and NS_INIT/DEL_XXX msg types.
>   Add /proc//ns/ documentation.
>   Fix dynamic initial ns logging.
>
> v2 -> v3:
>   Use atomic64_t in 

performance changes on 78373b73: -46.6% fsmark.files_per_sec, and few more

2015-04-20 Thread Yuanhan Liu
FYI, we found changes in `fsmark.files_per_sec' caused by commit
78373b7319abdf15050af5b1632c4c8b8b398f33:

> commit 78373b7319abdf15050af5b1632c4c8b8b398f33
> Author: Jaegeuk Kim 
> AuthorDate: Fri Mar 13 21:44:36 2015 -0700
> Commit: Jaegeuk Kim 
> CommitDate: Fri Apr 10 15:08:45 2015 -0700
> 
> f2fs: enhance multi-threads performance

3402e87cfb5e762f9c95071bf4a2ad65fd9392a2    78373b7319abdf15050af5b1632c4c8b8b398f33
----------------------------------------    ----------------------------------------
run  time(m)  metric_value  ±stddev         run  time(m)  metric_value  ±stddev         change   testbox/benchmark/testcase-params
---  -------  ------------  -------         ---  -------  ------------  -------         ------   ---------------------------------
3    0.3      |490.800|     ±5.7            3    0.5      |262.067|     ±0.4            -46.6%   ivb44/fsmark/1x-64t-4BRD_12G-RAID0-f2fs-4M-30G-fsyncBeforeClose
3    0.3      |468.367|     ±3.5            3    0.5      |264.467|     ±0.2            -43.5%   ivb44/fsmark/1x-64t-9BRD_6G-RAID0-f2fs-4M-30G-fsyncBeforeClose
3    0.6      |211.867|     ±0.7            3    0.7      |191.067|     ±0.5             -9.8%   ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose
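
For reference, the 'change' column is consistent with the relative
difference between the two metric values.  A minimal sketch, assuming
that is how the report computes it; the function name is hypothetical.

```c
/* Relative change from 'base' to 'head', in percent. */
static double percent_change_sketch(double base, double head)
{
	return (head - base) / base * 100.0;
}
```

For the first row, percent_change_sketch(490.800, 262.067) is about
-46.6.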

NOTE: here is some more info about the test parameters, to help you
      understand what the testcase does:

  1x: 'x' means iterations (loops), corresponding to the '-L' option of
      fsmark

  1t, 64t: 't' means threads

  4M: the size of a single file, corresponding to the '-s' option of fsmark
  40G, 30G, 120G: the total test size

  4BRD_12G: BRD is the ramdisk; '4' means 4 ramdisks and '12G' is the size
            of one ramdisk, so it is 48G in total.  We made a RAID on
            those ramdisks.


The change is a bit interesting, as you stated clearly that this patch is
for a performance gain.  The patch itself is clear, too: it removes a
mutex lock.  So the only reasonable cause I can think of, without digging
too deep, is that removing this lock reduces sleep time and lets more
processes run, but somehow increases context switches and CPU usage
somewhere in the meantime.  I guess this is what the following changes
are trying to tell us:

     29708 ±  2%   +5720.0%     1729051 ±  1%  fsmark.time.voluntary_context_switches
       302 ±  0%    +113.8%         647 ±  0%  fsmark.time.percent_of_cpu_this_job_got
     61.05 ±  0%    +214.0%      191.70 ±  0%  fsmark.time.system_time


FYI, here are all the changes for the most significant case:

3    0.3      |490.800|     ±5.7            3    0.5      |262.067|     ±0.4            -46.6%   ivb44/fsmark/1x-64t-4BRD_12G-RAID0-f2fs-4M-30G-fsyncBeforeClose

3402e87cfb5e762f  78373b7319abdf15050af5b163
----------------  --------------------------
         %stddev    %change          %stddev
             \         |                 \
     29708 ±  2%   +5720.0%     1729051 ±  1%  fsmark.time.voluntary_context_switches
     61.05 ±  0%    +214.0%      191.70 ±  0%  fsmark.time.system_time
       302 ±  0%    +113.8%         647 ±  0%  fsmark.time.percent_of_cpu_this_job_got
     10476 ±  0%     +95.4%       20467 ±  5%  fsmark.time.minor_page_faults
       490 ±  5%     -46.6%         262 ±  0%  fsmark.files_per_sec
     20.21 ±  0%     +46.7%       29.65 ±  0%  fsmark.time.elapsed_time
     20.21 ±  0%     +46.7%       29.65 ±  0%  fsmark.time.elapsed_time.max
    226379 ±  0%     +32.5%      299882 ±  0%  fsmark.app_overhead
         0 ±  0%      +Inf%        1045 ±  2%  proc-vmstat.numa_pages_migrated
       209 ± 26%   +3272.3%        7059 ±  3%  cpuidle.C1E-IVT.usage
       228 ± 42%    +686.7%        1799 ± 14%  numa-meminfo.node0.Writeback
     14633 ±  5%   +7573.2%     1122849 ±  1%  cpuidle.C1-IVT.usage
         0 ±  0%      +Inf%        1045 ±  2%  proc-vmstat.pgmigrate_success
     29708 ±  2%   +5720.0%     1729051 ±  1%  time.voluntary_context_switches
     55663 ±  0%    +776.9%      488081 ±  0%  cpuidle.C6-IVT.usage
        56 ± 42%    +718.8%         464 ± 11%  numa-vmstat.node0.nr_writeback
       535 ± 29%    +334.4%        2325 ± 10%  meminfo.Writeback
       129 ± 30%    +295.6%         511 ±  4%  proc-vmstat.nr_writeback
     59.25 ±  5%     -74.2%       15.26 ±  3%  turbostat.CPU%c6
      2.58 ±  8%     -74.5%        0.66 ± 11%  turbostat.Pkg%pc2
 1.551e+08 ± 14%    +233.4%   5.171e+08 ±  4%  cpuidle.C1-IVT.time
     32564 ± 24%    +208.1%      100330 ±  5%  softirqs.RCU
     61.05 ±  0%    +214.0%      191.70 ±  0%  time.system_time
        60 ± 32%    +165.7%         160 ± 16%  numa-vmstat.node1.nr_writeback
         2 ±  0%    +200.0%           6 ±  0%  vmstat.procs.r
      3057 ±  2%    +166.1%        8136 ± 22%  numa-vmstat.node0.nr_mapped
     12240 ±  2%    +165.9%       32547 ± 22%  numa-meminfo.node0.Mapped
      6324 ±  3%    +148.4%       15709 ±  0%  proc-vmstat.nr_mapped
   

[PATCH] mm: soft-offline: fix num_poisoned_pages counting on concurrent events

2015-04-20 Thread Naoya Horiguchi
If multiple soft offline events hit one free page/hugepage concurrently,
soft_offline_page() can handle the free page/hugepage multiple times,
which increments the num_poisoned_pages counter more than once.
This patch fixes the wrong counting by checking TestSetPageHWPoison for
normal pages and by checking the return value of dequeue_hwpoisoned_huge_page()
for hugepages.

Signed-off-by: Naoya Horiguchi 
Cc: sta...@vger.kernel.org  # v3.14+
---
# This problem might happen before 3.14, but it's rare and non-critical,
# so I want this patch to be backported to stable trees only if the patch
# cleanly applies (i.e. v3.14+).
---
 mm/memory-failure.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git v4.0.orig/mm/memory-failure.c v4.0/mm/memory-failure.c
index 2cc1d578144b..72a5224c8084 100644
--- v4.0.orig/mm/memory-failure.c
+++ v4.0/mm/memory-failure.c
@@ -1721,12 +1721,12 @@ int soft_offline_page(struct page *page, int flags)
 	} else if (ret == 0) { /* for free pages */
 		if (PageHuge(page)) {
 			set_page_hwpoison_huge_page(hpage);
-			dequeue_hwpoisoned_huge_page(hpage);
-			atomic_long_add(1 << compound_order(hpage),
+			if (!dequeue_hwpoisoned_huge_page(hpage))
+				atomic_long_add(1 << compound_order(hpage),
 					&num_poisoned_pages);
 		} else {
-			SetPageHWPoison(page);
-			atomic_long_inc(&num_poisoned_pages);
+			if (!TestSetPageHWPoison(page))
+				atomic_long_inc(&num_poisoned_pages);
 		}
 	}
 	unset_migratetype_isolate(page, MIGRATE_MOVABLE);
-- 
2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
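
For readers following the soft-offline fix above: the idea of counting a page only when the caller is the one that actually flipped the poison flag can be sketched in user space with C11 atomics. This is only an illustration under assumptions; `fake_page`, `num_poisoned` and `poison_once()` are hypothetical stand-ins and not kernel APIs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page with a HWPoison flag. */
struct fake_page {
	atomic_flag hwpoison;	/* set once the page is marked poisoned */
};

/* Hypothetical stand-in for the global num_poisoned_pages counter. */
static atomic_long num_poisoned;

/*
 * Mark the page poisoned and count it only if this caller flipped the
 * flag. atomic_flag_test_and_set() returns the previous value, so a
 * second concurrent event sees "true" and skips the increment.
 */
static bool poison_once(struct fake_page *p)
{
	if (atomic_flag_test_and_set(&p->hwpoison))
		return false;	/* already poisoned by an earlier event */
	atomic_fetch_add(&num_poisoned, 1);
	return true;
}
```

Calling `poison_once()` twice on the same page leaves the counter at one, which is the behaviour the patch description asks for.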


[alsa-devel] [PATCH v2 0/2] Add support for select accessory detect mode to HPDETL or HPDETR

2015-04-20 Thread Inha Song
This set of patches adds support for selecting HPDETL or HPDETR as the
accessory detect mode.

Changes in v2:
  - Use the value in pdata instead of hpdet_channel in extcon_info.
  - Wrap arizona_extcon_of_get in IS_ENABLED(CONFIG_OF).
  - Change hpdet_channel type to unsigned from signed in pdata.
  - Move the ARIZONA_ACCDET_MODE_* defines to the dt-binding header and set
    them directly in pdata.

Inha Song (2):
  extcon: arizona: Add support for select accessory detect mode when
headphone detection
  mfd: arizona: Update DT binding to support hpdet channel

 Documentation/devicetree/bindings/mfd/arizona.txt |  6 +
 drivers/extcon/extcon-arizona.c   | 28 ---
 include/dt-bindings/mfd/arizona.h |  8 +++
 include/linux/mfd/arizona/pdata.h |  3 +++
 4 files changed, 37 insertions(+), 8 deletions(-)
 create mode 100644 include/dt-bindings/mfd/arizona.h

-- 
2.0.0.390.gcb682f8



[alsa-devel] [PATCH v2 2/2] mfd: arizona: Update DT binding to support hpdet channel

2015-04-20 Thread Inha Song
This patch adds device tree bindings for the pdata needed to configure
the accessory detect mode used during headphone detection.

Signed-off-by: Inha Song 
---
 Documentation/devicetree/bindings/mfd/arizona.txt | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/mfd/arizona.txt 
b/Documentation/devicetree/bindings/mfd/arizona.txt
index 7bd1273..3529592 100644
--- a/Documentation/devicetree/bindings/mfd/arizona.txt
+++ b/Documentation/devicetree/bindings/mfd/arizona.txt
@@ -49,6 +49,12 @@ Optional properties:
 input singals. If values less than the number of input signals, elements
 that has not been specifed are set to 0 by default.
 
+  - wlf,hpdet-channel : Headphone detection channel.
+   1 or ARIZONA_ACCDET_MODE_HPL - Headphone detect mode is set to HPDETL
+   2 or ARIZONA_ACCDET_MODE_HPR - Headphone detect mode is set to HPDETR
+   If this node is not mentioned or if the value is unknown, then
+   headphone detection mode is set to MICDET.
+
   - DCVDD-supply, MICVDD-supply : Power supplies, only need to be specified if
 they are being externally supplied. As covered in
 Documentation/devicetree/bindings/regulator/regulator.txt
-- 
2.0.0.390.gcb682f8
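
The fallback rule described in the binding text above (1 selects HPDETL, 2 selects HPDETR, anything else or a missing property means MICDET) could be expressed roughly as follows. This is a sketch under assumptions: `hpdet_mode_from_dt()` and its `present` argument are hypothetical and are not part of the posted patch, which stores the raw property value in pdata.

```c
#include <stdbool.h>

#define ARIZONA_ACCDET_MODE_MIC 0
#define ARIZONA_ACCDET_MODE_HPL 1
#define ARIZONA_ACCDET_MODE_HPR 2

/*
 * Hypothetical helper: map a raw "wlf,hpdet-channel" value to a detect
 * mode. "present" models whether the property was found in the node.
 */
static unsigned int hpdet_mode_from_dt(bool present, unsigned int raw)
{
	if (!present)
		return ARIZONA_ACCDET_MODE_MIC;

	switch (raw) {
	case ARIZONA_ACCDET_MODE_HPL:
	case ARIZONA_ACCDET_MODE_HPR:
		return raw;
	default:
		/* Unknown values fall back to MICDET, as the binding states. */
		return ARIZONA_ACCDET_MODE_MIC;
	}
}
```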



[alsa-devel] [PATCH v2 1/2] extcon: arizona: Add support for select accessory detect mode when headphone detection

2015-04-20 Thread Inha Song
This patch adds support for selecting HPDETL or HPDETR as the accessory
detect mode. Arizona provides a headphone detection circuit on the HPDETL and
HPDETR pins to measure the impedance of an external load connected to the
headphone.

Depending on the board design, the headphone detect pin can be either HPDETR
or HPDETL.

Signed-off-by: Inha Song 
---
 drivers/extcon/extcon-arizona.c   | 28 
 include/dt-bindings/mfd/arizona.h |  8 
 include/linux/mfd/arizona/pdata.h |  3 +++
 3 files changed, 31 insertions(+), 8 deletions(-)
 create mode 100644 include/dt-bindings/mfd/arizona.h

diff --git a/drivers/extcon/extcon-arizona.c b/drivers/extcon/extcon-arizona.c
index 63f01c4..c827342 100644
--- a/drivers/extcon/extcon-arizona.c
+++ b/drivers/extcon/extcon-arizona.c
@@ -32,13 +32,10 @@
 #include 
 #include 
 #include 
+#include 
 
 #define ARIZONA_MAX_MICD_RANGE 8
 
-#define ARIZONA_ACCDET_MODE_MIC 0
-#define ARIZONA_ACCDET_MODE_HPL 1
-#define ARIZONA_ACCDET_MODE_HPR 2
-
 #define ARIZONA_MICD_CLAMP_MODE_JDL  0x4
 #define ARIZONA_MICD_CLAMP_MODE_JDH  0x5
 #define ARIZONA_MICD_CLAMP_MODE_JDL_GP5H 0x9
@@ -653,9 +650,9 @@ static void arizona_identify_headphone(struct 
arizona_extcon_info *info)
ret = regmap_update_bits(arizona->regmap,
 ARIZONA_ACCESSORY_DETECT_MODE_1,
 ARIZONA_ACCDET_MODE_MASK,
-ARIZONA_ACCDET_MODE_HPL);
+arizona->pdata.hpdet_channel);
if (ret != 0) {
-   dev_err(arizona->dev, "Failed to set HPDETL mode: %d\n", ret);
+   dev_err(arizona->dev, "Failed to set HPDET mode: %d\n", ret);
goto err;
}
 
@@ -705,9 +702,9 @@ static void arizona_start_hpdet_acc_id(struct 
arizona_extcon_info *info)
 ARIZONA_ACCESSORY_DETECT_MODE_1,
 ARIZONA_ACCDET_SRC | ARIZONA_ACCDET_MODE_MASK,
 info->micd_modes[0].src |
-ARIZONA_ACCDET_MODE_HPL);
+arizona->pdata.hpdet_channel);
if (ret != 0) {
-   dev_err(arizona->dev, "Failed to set HPDETL mode: %d\n", ret);
+   dev_err(arizona->dev, "Failed to set HPDET mode: %d\n", ret);
goto err;
}
 
@@ -1103,6 +1100,16 @@ static void arizona_micd_set_level(struct arizona 
*arizona, int index,
regmap_update_bits(arizona->regmap, reg, mask, level);
 }
 
+static int arizona_extcon_of_get_pdata(struct arizona *arizona)
+{
+   struct arizona_pdata *pdata = &arizona->pdata;
+
+   of_property_read_u32(arizona->dev->of_node, "wlf,hpdet-channel",
+&pdata->hpdet_channel);
+
+   return 0;
+}
+
 static int arizona_extcon_probe(struct platform_device *pdev)
 {
struct arizona *arizona = dev_get_drvdata(pdev->dev.parent);
@@ -1120,6 +1127,11 @@ static int arizona_extcon_probe(struct platform_device 
*pdev)
if (!info)
return -ENOMEM;
 
+   if (IS_ENABLED(CONFIG_OF)) {
+   if (!dev_get_platdata(arizona->dev))
+   arizona_extcon_of_get_pdata(arizona);
+   }
+
info->micvdd = devm_regulator_get(&pdev->dev, "MICVDD");
if (IS_ERR(info->micvdd)) {
ret = PTR_ERR(info->micvdd);
diff --git a/include/dt-bindings/mfd/arizona.h 
b/include/dt-bindings/mfd/arizona.h
new file mode 100644
index 000..9ecff78
--- /dev/null
+++ b/include/dt-bindings/mfd/arizona.h
@@ -0,0 +1,8 @@
+#ifndef __DT_BINDINGS_ARIZONA_H__
+#define __DT_BINDINGS_ARIZONA_H__
+
+#define ARIZONA_ACCDET_MODE_MIC 0
+#define ARIZONA_ACCDET_MODE_HPL 1
+#define ARIZONA_ACCDET_MODE_HPR 2
+
+#endif /* __DT_BINDINGS_ARIZONA_H__ */
diff --git a/include/linux/mfd/arizona/pdata.h 
b/include/linux/mfd/arizona/pdata.h
index 4578c72..2473a67 100644
--- a/include/linux/mfd/arizona/pdata.h
+++ b/include/linux/mfd/arizona/pdata.h
@@ -139,6 +139,9 @@ struct arizona_pdata {
/** GPIO used for mic isolation with HPDET */
int hpdet_id_gpio;
 
+   /** Channel to use for headphone detection */
+   unsigned int hpdet_channel;
+
/** Extra debounce timeout used during initial mic detection (ms) */
int micd_detect_debounce;
 
-- 
2.0.0.390.gcb682f8



Re: mips build failures due to commit 8dd928915a73 (mips: fix up obsolete cpu function usage)

2015-04-20 Thread Guenter Roeck

On 04/20/2015 02:09 PM, Aaro Koskinen wrote:

Hi,

On Mon, Apr 20, 2015 at 12:40:28PM -0700, Guenter Roeck wrote:

the upstream kernel fails to build mips:nlm_xlp_defconfig,
mips:nlm_xlp_defconfig, mips:cavium_octeon_defconfig, and possibly
other targets, with errors such as

arch/mips/kernel/smp.c:211:2: error:
passing argument 2 of 'cpumask_set_cpu' discards 'volatile' qualifier
from pointer target type
arch/mips/kernel/process.c:52:2: error:
passing argument 2 of 'cpumask_test_cpu' discards 'volatile' qualifier
from pointer target type
arch/mips/cavium-octeon/smp.c:242:2: error:
passing argument 2 of 'cpumask_clear_cpu' discards 'volatile' qualifier
from pointer target type

The problem was introduced with commit 8dd928915a73 ("mips: fix up
obsolete cpu function usage"). I would send a patch to fix it, but I
am not sure if removing 'volatile' from the variable declaration(s)
would be a good idea.


I think removing volatile from cpu_callin_map declaration should be OK,
since test_cpu (only reader) uses test_bit which takes care of it:

static inline int test_bit(int nr, const volatile unsigned long *addr)



I ran two tests with nlm_xlp_defconfig:

- add volatile to the second argument of cpumask_set_cpu() and 
cpumask_test_cpu():
  vmlinux image size 194664946
- remove volatile from cpu_callin_map:
  vmlinux image size 194664066

Given that, I am not sure I understand the impact of removing volatile
from cpu_callin_map. Maybe it is just better optimization without
volatile. Maybe there is no impact. Maybe the use of volatile is wrong
to start with ('Volatile Considered Harmful' comes to mind).
Maybe the use of volatile for cpu_callin_map is wrong for other architectures
as well (powerpc, ia64). Either way, I don't know the code well enough to
make this call.

Guenter
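
To make the qualifier issue in this thread concrete, here is a small stand-alone sketch. It assumes nothing about the MIPS code beyond the test_bit() prototype quoted above; `demo_test_bit()` and `demo_set_bit()` are hypothetical names. A reader declared with `const volatile` accepts both plain and volatile bitmaps, while a writer taking a plain pointer would trigger the "discards 'volatile' qualifier" diagnostic if handed a volatile bitmap.

```c
#include <limits.h>

#define DEMO_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Reader modelled on the quoted test_bit() prototype: the parameter is
 * "const volatile", so plain and volatile bitmaps may both be passed.
 */
static int demo_test_bit(int nr, const volatile unsigned long *addr)
{
	return (int)(1UL & (addr[nr / DEMO_BITS_PER_LONG] >>
			    (nr % DEMO_BITS_PER_LONG)));
}

/*
 * Writer taking a plain pointer: passing a volatile bitmap here would
 * drop the qualifier, which is what the compiler complains about.
 */
static void demo_set_bit(int nr, unsigned long *addr)
{
	addr[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}
```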



[PATCH v5 1/4] PCI: X-Gene: Add the APM X-Gene v1 PCIe MSI/MSIX termination driver

2015-04-20 Thread Duc Dang
X-Gene v1 SoC supports a total of 256 MSI/MSIX vectors coalesced into
16 HW IRQ lines.

Signed-off-by: Duc Dang 
Signed-off-by: Tanmay Inamdar 
---
 drivers/pci/host/Kconfig |   6 +
 drivers/pci/host/Makefile|   1 +
 drivers/pci/host/pci-xgene-msi.c | 477 +++
 drivers/pci/host/pci-xgene.c |  21 ++
 4 files changed, 505 insertions(+)
 create mode 100644 drivers/pci/host/pci-xgene-msi.c

diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index 7b892a9..c9b61fa 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -89,11 +89,17 @@ config PCI_XGENE
depends on ARCH_XGENE
depends on OF
select PCIEPORTBUS
+   select PCI_MSI_IRQ_DOMAIN if PCI_MSI
+   select PCI_XGENE_MSI if PCI_MSI
help
  Say Y here if you want internal PCI support on APM X-Gene SoC.
  There are 5 internal PCIe ports available. Each port is GEN3 capable
  and have varied lanes from x1 to x8.
 
+config PCI_XGENE_MSI
+   bool "X-Gene v1 PCIe MSI feature"
+   depends on PCI_XGENE && PCI_MSI
+
 config PCI_LAYERSCAPE
bool "Freescale Layerscape PCIe controller"
depends on OF && ARM
diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile
index e61d91c..f39bde3 100644
--- a/drivers/pci/host/Makefile
+++ b/drivers/pci/host/Makefile
@@ -11,5 +11,6 @@ obj-$(CONFIG_PCIE_SPEAR13XX) += pcie-spear13xx.o
 obj-$(CONFIG_PCI_KEYSTONE) += pci-keystone-dw.o pci-keystone.o
 obj-$(CONFIG_PCIE_XILINX) += pcie-xilinx.o
 obj-$(CONFIG_PCI_XGENE) += pci-xgene.o
+obj-$(CONFIG_PCI_XGENE_MSI) += pci-xgene-msi.o
 obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o
 obj-$(CONFIG_PCI_VERSATILE) += pci-versatile.o
diff --git a/drivers/pci/host/pci-xgene-msi.c b/drivers/pci/host/pci-xgene-msi.c
new file mode 100644
index 000..910f5db
--- /dev/null
+++ b/drivers/pci/host/pci-xgene-msi.c
@@ -0,0 +1,477 @@
+/*
+ * APM X-Gene MSI Driver
+ *
+ * Copyright (c) 2014, Applied Micro Circuits Corporation
+ * Author: Tanmay Inamdar 
+ *Duc Dang 
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MSI_IR0    0x00
+#define MSI_INT0   0x80
+#define IDX_PER_GROUP  8
+#define IRQS_PER_IDX   16
+#define NR_HW_IRQS 16
+#define NR_MSI_VEC (IDX_PER_GROUP * IRQS_PER_IDX * NR_HW_IRQS)
+
+struct xgene_msi {
+   struct device_node      *node;
+   struct msi_controller   mchip;
+   struct irq_domain       *domain;
+   u64                     msi_addr;
+   void __iomem            *msi_regs;
+   unsigned long           *bitmap;
+   struct mutex            bitmap_lock;
+   int                     *msi_virqs;
+   int                     num_cpus;
+};
+
+static struct irq_chip xgene_msi_top_irq_chip = {
+   .name   = "X-Gene1 MSI",
+   .irq_enable = pci_msi_unmask_irq,
+   .irq_disable= pci_msi_mask_irq,
+   .irq_mask   = pci_msi_mask_irq,
+   .irq_unmask = pci_msi_unmask_irq,
+};
+
+static struct  msi_domain_info xgene_msi_domain_info = {
+   .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+ MSI_FLAG_PCI_MSIX),
+   .chip   = &xgene_msi_top_irq_chip,
+};
+
+/*
+ * X-Gene v1 has 16 groups of MSI termination registers MSInIRx, where
+ * n is group number (0..F), x is index of registers in each group (0..7)
+ * The registers layout is like following:
+ * MSI0IR0 base_addr
+ * MSI0IR1 base_addr +  0x1
+ * ... ...
+ * MSI0IR6 base_addr +  0x6
+ * MSI0IR7 base_addr +  0x7
+ * MSI1IR0 base_addr +  0x8
+ * MSI1IR1 base_addr +  0x9
+ * ... ...
+ * MSI1IR7 base_addr +  0xF
+ * MSI2IR0 base_addr + 0x10
+ * ... ...
+ * MSIFIR0 base_addr + 0x78
+ * MSIFIR1 base_addr + 0x79
+ * ... ...
+ * MSIFIR7 base_addr + 0x7F
+ *
+ * Each index register support 16 MSI vectors (0..15) to generate interrupt.
+ * There are total 16 GIC IRQs assigned for these 16 groups of MSI 

[PATCH v5 2/4] arm64: dts: Add the device tree entry for the APM X-Gene PCIe MSI node

2015-04-20 Thread Duc Dang
There is a single MSI block in the X-Gene v1 SoC which serves all 5 PCIe ports.

Signed-off-by: Duc Dang 
Signed-off-by: Tanmay Inamdar 
---
 arch/arm64/boot/dts/apm/apm-storm.dtsi | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/arch/arm64/boot/dts/apm/apm-storm.dtsi 
b/arch/arm64/boot/dts/apm/apm-storm.dtsi
index f1ad9c2..4b719c9 100644
--- a/arch/arm64/boot/dts/apm/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm/apm-storm.dtsi
@@ -354,6 +354,28 @@
};
};
 
+   msi: msi@7900 {
+   compatible = "apm,xgene1-msi";
+   msi-controller;
+   reg = <0x00 0x7900 0x0 0x90>;
+   interrupts = <  0x0 0x10 0x4
+   0x0 0x11 0x4
+   0x0 0x12 0x4
+   0x0 0x13 0x4
+   0x0 0x14 0x4
+   0x0 0x15 0x4
+   0x0 0x16 0x4
+   0x0 0x17 0x4
+   0x0 0x18 0x4
+   0x0 0x19 0x4
+   0x0 0x1a 0x4
+   0x0 0x1b 0x4
+   0x0 0x1c 0x4
+   0x0 0x1d 0x4
+   0x0 0x1e 0x4
+   0x0 0x1f 0x4>;
+   };
+
pcie0: pcie@1f2b {
status = "disabled";
device_type = "pci";
@@ -375,6 +397,7 @@
 0x0 0x0 0x0 0x4  0x0 0xc5 0x1>;
dma-coherent;
clocks = < 0>;
+   msi-parent = <&msi>;
};
 
pcie1: pcie@1f2c {
@@ -398,6 +421,7 @@
 0x0 0x0 0x0 0x4  0x0 0xcb 0x1>;
dma-coherent;
clocks = < 0>;
+   msi-parent = <&msi>;
};
 
pcie2: pcie@1f2d {
@@ -421,6 +445,7 @@
 0x0 0x0 0x0 0x4  0x0 0xd1 0x1>;
dma-coherent;
clocks = < 0>;
+   msi-parent = <&msi>;
};
 
pcie3: pcie@1f50 {
@@ -444,6 +469,7 @@
 0x0 0x0 0x0 0x4  0x0 0xd7 0x1>;
dma-coherent;
clocks = < 0>;
+   msi-parent = <&msi>;
};
 
pcie4: pcie@1f51 {
@@ -467,6 +493,7 @@
 0x0 0x0 0x0 0x4  0x0 0xdd 0x1>;
dma-coherent;
clocks = < 0>;
+   msi-parent = <&msi>;
};
 
serial0: serial@1c02 {
-- 
1.9.1



[PATCH v5 4/4] PCI: X-Gene: Add the MAINTAINERS entry for APM X-Gene v1 PCIe MSI driver

2015-04-20 Thread Duc Dang
This patch adds the MAINTAINERS entry for the APM X-Gene v1 PCIe
MSI/MSIX termination driver.

Signed-off-by: Duc Dang 
Signed-off-by: Tanmay Inamdar 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index ddc5a8c..a1b119b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7490,6 +7490,14 @@ L:   linux-...@vger.kernel.org
 S: Maintained
 F: drivers/pci/host/*spear*
 
+PCI MSI DRIVER FOR APPLIEDMICRO XGENE
+M: Duc Dang 
+L: linux-...@vger.kernel.org
+L: linux-arm-ker...@lists.infradead.org
+S: Maintained
+F: Documentation/devicetree/bindings/pci/xgene-pci-msi.txt
+F: drivers/pci/host/pci-xgene-msi.c
+
 PCMCIA SUBSYSTEM
 P: Linux PCMCIA Team
 L: linux-pcm...@lists.infradead.org
-- 
1.9.1



[PATCH v5 3/4] documentation: dts: Add the device tree binding for APM X-Gene v1 PCIe MSI device tree node

2015-04-20 Thread Duc Dang
The driver for this binding is under 'drivers/pci/host/pci-xgene-msi.c'

Signed-off-by: Duc Dang 
Signed-off-by: Tanmay Inamdar 
---
 .../devicetree/bindings/pci/xgene-pci-msi.txt  | 63 ++
 1 file changed, 63 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/xgene-pci-msi.txt

diff --git a/Documentation/devicetree/bindings/pci/xgene-pci-msi.txt 
b/Documentation/devicetree/bindings/pci/xgene-pci-msi.txt
new file mode 100644
index 000..0ffdcb3
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/xgene-pci-msi.txt
@@ -0,0 +1,63 @@
+* AppliedMicro X-Gene PCIe MSI interface
+
+Required properties:
+
+- compatible: should contain "apm,xgene1-msi" to identify the core.
+- msi-controller: indicates that this is X-Gene1 PCIe MSI controller node
+- reg: A list of physical base address and length for each set of controller
+   registers.
+- interrupts: A list of interrupt outputs of the controller.
+
+Each PCIe node needs to have property msi-parent that points to msi controller node
+
+Examples:
+
+SoC DTSI:
+
+   + MSI node:
+   msi@7900 {
+   compatible = "apm,xgene1-msi";
+   msi-controller;
+   reg = <0x00 0x7900 0x0 0x90>;
+   interrupts = <  0x0 0x10 0x4
+   0x0 0x11 0x4
+   0x0 0x12 0x4
+   0x0 0x13 0x4
+   0x0 0x14 0x4
+   0x0 0x15 0x4
+   0x0 0x16 0x4
+   0x0 0x17 0x4
+   0x0 0x18 0x4
+   0x0 0x19 0x4
+   0x0 0x1a 0x4
+   0x0 0x1b 0x4
+   0x0 0x1c 0x4
+   0x0 0x1d 0x4
+   0x0 0x1e 0x4
+   0x0 0x1f 0x4>;
+   };
+
+   + PCIe controller node with msi-parent property pointing to MSI node:
+   pcie0: pcie@1f2b {
+   status = "disabled";
+   device_type = "pci";
+   compatible = "apm,xgene-storm-pcie", "apm,xgene-pcie";
+   #interrupt-cells = <1>;
+   #size-cells = <2>;
+   #address-cells = <3>;
+   reg = < 0x00 0x1f2b 0x0 0x0001   /* Controller 
registers */
+   0xe0 0xd000 0x0 0x0004>; /* PCI config space */
+   reg-names = "csr", "cfg";
+   ranges = <0x0100 0x00 0x 0xe0 0x1000 0x00 
0x0001   /* io */
+ 0x0200 0x00 0x8000 0xe1 0x8000 0x00 
0x8000>; /* mem */
+   dma-ranges = <0x4200 0x80 0x 0x80 0x 0x00 
0x8000
+ 0x4200 0x00 0x 0x00 0x 0x80 
0x>;
+   interrupt-map-mask = <0x0 0x0 0x0 0x7>;
+   interrupt-map = <0x0 0x0 0x0 0x1  0x0 0xc2 0x1
+0x0 0x0 0x0 0x2  0x0 0xc3 0x1
+0x0 0x0 0x0 0x3  0x0 0xc4 0x1
+0x0 0x0 0x0 0x4  0x0 0xc5 0x1>;
+   dma-coherent;
+   clocks = < 0>;
+   msi-parent= <>;
+   };
-- 
1.9.1



[PATCH v5 0/4]PCI: X-Gene: Add APM X-Gene v1 MSI/MSIX termination driver

2015-04-20 Thread Duc Dang
This patch set adds MSI/MSIX termination driver support for the APM X-Gene v1
SoC. The APM X-Gene v1 SoC has its own implementation of MSI, which is not
compliant with the GIC V2M specification for MSI termination.

There is a single MSI block in the X-Gene v1 SoC which serves all 5 PCIe ports.
This MSI block supports 2048 MSI termination ports coalesced into 16 physical
HW IRQ lines and shared across all 5 PCIe ports. As of version 5 of this patch,
the total number of MSI vectors this driver supports is reduced to 256 to
maintain the correct set_affinity
behavior for each MSI.
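
The 2048 figure can be checked against the defines in patch 1/4 of this series (8 index registers per group, 16 vectors per index register, 16 groups). The sketch below only illustrates that arithmetic; `struct msi_pos`, `msi_decode()` and the linear group/index/bit numbering are assumptions for illustration and may not match how the hardware or the driver encodes a vector.

```c
#define IDX_PER_GROUP	8	/* index registers per group */
#define IRQS_PER_IDX	16	/* MSI vectors per index register */
#define NR_HW_IRQS	16	/* groups, one HW IRQ line each */
#define NR_MSI_VEC	(IDX_PER_GROUP * IRQS_PER_IDX * NR_HW_IRQS)

/* Hypothetical position of one vector within the register file. */
struct msi_pos {
	unsigned int group;	/* 0..15, which HW IRQ line */
	unsigned int index;	/* 0..7, register within the group */
	unsigned int bit;	/* 0..15, vector within the register */
};

/*
 * Assumed linear numbering: group-major, then index, then bit.
 * The real encoding may differ.
 */
static struct msi_pos msi_decode(unsigned int vec)
{
	struct msi_pos p;

	p.bit = vec % IRQS_PER_IDX;
	p.index = (vec / IRQS_PER_IDX) % IDX_PER_GROUP;
	p.group = vec / (IRQS_PER_IDX * IDX_PER_GROUP);
	return p;
}
```

Under this numbering each HW IRQ line covers 128 vectors, so 16 lines give 2048 in total.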

v5 changes:
1. Implement set_affinity for each MSI by statically allocating 2 MSI GIC IRQs
   for each X-Gene CPU core and moving MSI vectors around these GIC IRQs to
   steer them to target CPU core. As a consequence, the total MSI vectors that
   X-Gene v1 supports is reduced to 256.

v4 changes:
1. Remove affinity setting for each MSI
2. Add description about register layout, MSI termination address and data
3. Correct total number of MSI vectors to 2048
4. Clean up error messages
5. Remove unused module code

v3 changes:
1. Implement MSI support using PCI MSI IRQ domain
2. Only use msi_controller to store IRQ domain

v2 changes:
1. Use msi_controller structure
2. Remove arch hooks arch_teardown_msi_irqs and arch_setup_msi_irqs

 .../devicetree/bindings/pci/xgene-pci-msi.txt  |  63 +++
 MAINTAINERS|   8 +
 arch/arm64/boot/dts/apm/apm-storm.dtsi |  27 ++
 drivers/pci/host/Kconfig   |   6 +
 drivers/pci/host/Makefile  |   1 +
 drivers/pci/host/pci-xgene-msi.c   | 477 +
 drivers/pci/host/pci-xgene.c   |  21 +
 7 files changed, 603 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/xgene-pci-msi.txt
 create mode 100644 drivers/pci/host/pci-xgene-msi.c

-- 
1.9.1



Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

2015-04-20 Thread Mike Galbraith
On Mon, 2015-04-20 at 14:21 -0400, Steven Rostedt wrote:
> 
> I would argue that every case is different, and only the sysadmin would
> know the right value. Thus, just set it to one, and if that's not good
> enough, then the sysadmins can change it to their needs.

Agreed.  I don't have it turned on in my -rt kernels, because I don't 
want to force a knight in shining (priority x) armor on users.

-Mike



[PATCH v6] perf: __kmod_path__parse: deal with kernel module names in '[]' correctly.

2015-04-20 Thread Wang Nan
Before patch ba92732e9808df679ddf75c5ea1c0caae6d7dce2 ('perf kmaps:
Check kmaps to make code more robust'), perf report and perf annotate
would segfault if the trace data contained kernel module information like
this:

 # perf report -D -i ./perf.data
 ...
 0 0 0x188 [0x50]: PERF_RECORD_MMAP -1/0: [0xffbff1018000(0xf068000) @ 0]: x [test_module]
 ...

 # perf report -i ./perf.data --objdump=/path/to/objdump --kallsyms=/path/to/kallsyms

 perf: Segmentation fault
  backtrace 
 /path/to/perf[0x503478]
 /lib64/libc.so.6(+0x3545f)[0x7fb201f3745f]
 /path/to/perf[0x499b56]
 /path/to/perf(dso__load_kallsyms+0x13c)[0x49b56c]
 /path/to/perf(dso__load+0x72e)[0x49c21e]
 /path/to/perf(map__load+0x6e)[0x4ae9ee]
 /path/to/perf(thread__find_addr_map+0x24c)[0x47deec]
 /path/to/perf(perf_event__preprocess_sample+0x88)[0x47e238]
 /path/to/perf[0x43ad02]
 /path/to/perf[0x4b55bc]
 /path/to/perf(ordered_events__flush+0xca)[0x4b57ea]
 /path/to/perf[0x4b1a01]
 /path/to/perf(perf_session__process_events+0x3be)[0x4b428e]
 /path/to/perf(cmd_report+0xf11)[0x43bfc1]
 /path/to/perf[0x474702]
 /path/to/perf(main+0x5f5)[0x42de95]
 /lib64/libc.so.6(__libc_start_main+0xf4)[0x7fb201f23bd4]
 /path/to/perf[0x42dfc4]

This is because __kmod_path__parse treats names with a leading '[' as
the kernel rather than as kernel modules. If perf.data contains build
information and the buildid of such a module can be found, its DSO
will be treated as the kernel, not as a kernel module. It will then be passed
to dso__load_kernel_sym() -> dso__load_kcore() because of the --kallsyms
argument.

The referenced patch adds a NULL pointer check to avoid the segfault. However,
such kernel modules are still processed incorrectly.

This patch fixes __kmod_path__parse so that it treats names like
'[test_module]' as kernel modules.

kmod-path.c is also updated to reflect the above changes.
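
As a rough illustration of the distinction described above, a bracketed name can be classified as a module unless it is one of the well-known non-module names. This is a sketch under assumptions and not the code from the patch: `bracket_name_is_module()` is a hypothetical helper, and the exclusion list is chosen only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether a bracketed map name such as
 * "[test_module]" should be treated as a kernel module. The exclusion
 * list below is an assumption for illustration.
 */
static bool bracket_name_is_module(const char *name)
{
	static const char *const not_modules[] = {
		"[kernel.kallsyms]", "[vdso]", "[vsyscall]",
	};
	size_t len = strlen(name);
	size_t i;

	/* The name must look like "[x]" at minimum. */
	if (len < 3 || name[0] != '[' || name[len - 1] != ']')
		return false;

	for (i = 0; i < sizeof(not_modules) / sizeof(not_modules[0]); i++)
		if (strcmp(name, not_modules[i]) == 0)
			return false;

	return true;
}
```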

Signed-off-by: Wang Nan 
---
Improves commit messages.

Since ba92732e9808df679ddf75c5ea1c0caae6d7dce2 is already in -tip tree,
segfault will not be triggered even without this patch.
---
 tools/perf/tests/kmod-path.c | 72 
 tools/perf/util/dso.c| 42 +++---
 tools/perf/util/dso.h|  2 +-
 tools/perf/util/header.c |  8 ++---
 tools/perf/util/machine.c| 16 +-
 5 files changed, 130 insertions(+), 10 deletions(-)

diff --git a/tools/perf/tests/kmod-path.c b/tools/perf/tests/kmod-path.c
index e8d7cbb..08c433b 100644
--- a/tools/perf/tests/kmod-path.c
+++ b/tools/perf/tests/kmod-path.c
@@ -34,9 +34,21 @@ static int test(const char *path, bool alloc_name, bool 
alloc_ext,
return 0;
 }
 
+static int test_is_kernel_module(const char *path, int cpumode, bool expect)
+{
+   TEST_ASSERT_VAL("is_kernel_module",
+   (!!is_kernel_module(path, cpumode)) == (!!expect));
+   pr_debug("%s (cpumode: %d) - is_kernel_module: %s\n",
+   path, cpumode, expect ? "true" : "false");
+   return 0;
+}
+
 #define T(path, an, ae, k, c, n, e) \
TEST_ASSERT_VAL("failed", !test(path, an, ae, k, c, n, e))
 
+#define M(path, c, e) \
+   TEST_ASSERT_VAL("failed", !test_is_kernel_module(path, c, e))
+
 int test__kmod_path__parse(void)
 {
/* pathalloc_name  alloc_ext   kmod  comp   name 
ext */
@@ -44,30 +56,90 @@ int test__kmod_path__parse(void)
T("///x-x.ko", false , true  , true, false, NULL   , 
NULL);
T("///x-x.ko", true  , false , true, false, "[x_x]", 
NULL);
T("///x-x.ko", false , false , true, false, NULL   , 
NULL);
+   M("///x-x.ko", PERF_RECORD_MISC_CPUMODE_UNKNOWN, true);
+   M("///x-x.ko", PERF_RECORD_MISC_KERNEL, true);
+   M("///x-x.ko", PERF_RECORD_MISC_USER, false);
 
/* pathalloc_name  alloc_ext   kmod  comp  name   ext */
T("///x.ko.gz", true , true  , true, true, "[x]", "gz");
T("///x.ko.gz", false, true  , true, true, NULL , "gz");
T("///x.ko.gz", true , false , true, true, "[x]", NULL);
T("///x.ko.gz", false, false , true, true, NULL , NULL);
+   M("///x.ko.gz", PERF_RECORD_MISC_CPUMODE_UNKNOWN, true);
+   M("///x.ko.gz", PERF_RECORD_MISC_KERNEL, true);
+   M("///x.ko.gz", PERF_RECORD_MISC_USER, false);
 
/* path  alloc_name  alloc_ext  kmod   comp  nameext */
T("///x.gz", true  , true , false, true, "x.gz" ,"gz");
T("///x.gz", false , true , false, true, NULL   ,"gz");
T("///x.gz", true  , false, false, true, "x.gz" , NULL);
T("///x.gz", false , false, false, true, NULL   , NULL);
+   M("///x.gz", PERF_RECORD_MISC_CPUMODE_UNKNOWN, false);
+   M("///x.gz", PERF_RECORD_MISC_KERNEL, false);
+   

[PATCH] staging: gdm72xx: enclose complex define statement

2015-04-20 Thread Jaime Arrocha
This patch fixes the warning found by checkpatch.pl:
ERROR: Macros with complex values should be enclosed in parentheses

Signed-off-by: Jaime Arrocha 
---
 drivers/staging/gdm72xx/usb_ids.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/gdm72xx/usb_ids.h 
b/drivers/staging/gdm72xx/usb_ids.h
index 8ce544d..2b50ac6 100644
--- a/drivers/staging/gdm72xx/usb_ids.h
+++ b/drivers/staging/gdm72xx/usb_ids.h
@@ -32,8 +32,8 @@
 #define BL_PID_MASK0xffc0
 
 #define USB_DEVICE_BOOTLOADER(vid, pid)\
-   {USB_DEVICE((vid), ((pid)&BL_PID_MASK)|B_DOWNLOAD)},\
-   {USB_DEVICE((vid), ((pid)&BL_PID_MASK)|B_DOWNLOAD|B_DIFF_DL_DRV)}
+   ({USB_DEVICE((vid), ((pid)&BL_PID_MASK)|B_DOWNLOAD)},   \
+   {USB_DEVICE((vid), ((pid)&BL_PID_MASK)|B_DOWNLOAD|B_DIFF_DL_DRV)})
 
 #define USB_DEVICE_CDC_DATA(vid, pid)  \
{USB_DEVICE_INTF((vid), (pid), USB_CLASS_CDC_DATA)}
-- 
1.7.10.4
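
Some background on the checkpatch rule quoted above may help. For macros that expand to an expression, missing outer parentheses let a neighbouring operator split the expansion. The sketch below shows that case only; `SUM_BAD`, `SUM_GOOD` and the two wrapper functions are hypothetical names. Note that the macro touched by this patch appears to expand to a pair of brace initializers rather than an expression, so the rule may not map onto it cleanly.

```c
/* Unparenthesised: the expansion can be split by a neighbouring operator. */
#define SUM_BAD(a, b)	(a) + (b)
/* Parenthesised: the expansion always evaluates as one unit. */
#define SUM_GOOD(a, b)	((a) + (b))

/* Expands to 2 * (a) + (b), so the multiplication binds to a only. */
static int twice_bad(int a, int b)
{
	return 2 * SUM_BAD(a, b);
}

/* Expands to 2 * ((a) + (b)), which is what the caller meant. */
static int twice_good(int a, int b)
{
	return 2 * SUM_GOOD(a, b);
}
```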




Re: sparc64: Build failure due to commit f1600e549b94 (sparc: Make sparc64 use scalable lib/iommu-common.c functions)

2015-04-20 Thread Michael Ellerman
On Mon, 2015-04-20 at 19:32 -0700, Guenter Roeck wrote:
> On 04/20/2015 06:54 PM, Michael Ellerman wrote:
> > On Mon, 2015-04-20 at 12:50 -0400, David Miller wrote:
> >> From: Guenter Roeck 
> >> Date: Mon, 20 Apr 2015 09:44:31 -0700
> >>
> >>> On Mon, Apr 20, 2015 at 12:25:19PM -0400, David Miller wrote:
>  From: Guenter Roeck 
>  Date: Sun, 19 Apr 2015 22:17:21 -0700
> 
> > The debug option is intended for all _other_ architectures, to
> > ensure that changes made for those don't break alpha/s390
> > builds. alpha/s390 have ARCH_NEEDS_WEAK_PER_CPU and don't need the
> > debug option.
> 
>  Ironically this would not create a build failure for the architectures
>  where this matters, because only powerpc has the like named percpu
>  symbol.
> 
>  So it's not really meeting the stated objective in this case.
> >>>
> >>> Yes, that is correct; it can only find problems in non-architecture
> >>> code, and on the downside produces false positives and thus build errors
> >>> like this one.
> >>>
> >>> Which makes the fix a bit philosophical. Rename iommu_pool_hash in
> >>> iommu-common, or drop DEBUG_FORCE_WEAK_PER_CPU. I would rename
> >>> iommu_pool_hash, but that is just me. Ultimately, I don't really
> >>> care one way or another, as long as the problem gets fixed.
> >>
> >> If nightly builds of s390 and alpha, the two platforms where this
> >> matters, are being done as reported in this thread, then I really
> >> don't see the value in DEBUG_FORCE_WEAK_PER_CPU.
> 
> Me not either, but, as you say, that is a different discussion.
> 
> >
> > We do an s390 allmodconfig for every linux-next release:
> >
> >http://kisskb.ellerman.id.au/kisskb/target/573/
> >
> > And also for Linus' tree:
> >
> >http://kisskb.ellerman.id.au/kisskb/target/568/
> >
> > We don't have alpha allmodconfig enabled, though we could, but we do build
> > the defconfig:
> >
> >http://kisskb.ellerman.id.au/kisskb/target/2499/
> >http://kisskb.ellerman.id.au/kisskb/target/2494/
>
> I cover alpha:allmodconfig in my builds for -next, mainline, as well as all
> kernel.org stable releases and release candidates. This discussion is a good
> argument for enabling s390:allmodconfig as well.
> 
> > So I think that should be sufficient to catch any percpus that are introduced
> > in generic code with the same name as s390/alpha variables.
> 
> Yes, but unfortunately only after the fact, though I don't see a means
> to avoid that.

Yeah after the merge into linux-next, which I think is probably good enough for
something like this.

cheers


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v4 2/2] efi: an sysfs interface for user to update efi firmware

2015-04-20 Thread Kweh, Hock Leong
> -----Original Message-----
> From: Greg Kroah-Hartman [mailto:gre...@linuxfoundation.org]
> Sent: Monday, April 20, 2015 10:43 PM
> 
> On Mon, Apr 20, 2015 at 03:28:32AM +, Kweh, Hock Leong wrote:
> > Regarding the 'reboot require' status, is it critical to have a 1 to 1 status match
> > with the capsule upload binary? Is it okay to have one sysfs file note to tell the
> > overall status (for example: 10 capsule binaries uploaded but one require
> > reboot, so the status shows reboot require is yes)? I am not here trying to argue
> > anything. I am just trying to find out what kind of info is needed but the sysfs
> > could not provide.
> >
> > Please imagine if your whole Linux system (kernel + rootfs) has to fit into 6MB
> > space and you don't even have the gcc compiler included into the package.
> > I believe in this environment, kernel interface + shell command is the only
> > interaction that user could work with.
> 
> Why would you have to have gcc on such a system?  Why is that a
> requirement for having an ioctl/char interface?

This is my logic:
- As far as I know, a shell script cannot perform an ioctl call; that takes a
  compiled program (a C program, for example). So if I don't already have an
  executable binary, I need a compiler to build one from source.

- For an embedded product like the one mentioned above, not all vendors are
  willing to carry a userland tool when they are struggling to fit into a small
  memory space. You may say this tool would not take up much space compared to
  others. But once the source of this tool is upstreamed into the kernel tools/
  tree, we cannot stop people from contributing and adding more features to it,
  and eventually the embedded product may need to drop the tool.

> 
> And if you only have 6Mb of space, you don't have UEFI, sorry, there's
> no way that firmware can get that small.

Actually there is. Quark is one example. The kernel + rootfs take up 6MB
and the UEFI consumes only 2MB, for a total of 8MB in the SPI chip.
If you have an Intel Galileo board, you don't need any mass storage (SD or USB)
to boot to a Linux console.


Thanks & Regards,
Wilson


[GIT PULL] clk: changes for 4.1

2015-04-20 Thread Michael Turquette
The following changes since commit c517d838eb7d07bbe9507871fab3931deccff539:

  Linux 4.0-rc1 (2015-02-22 18:21:14 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git tags/clk-for-linus-4.1

for you to fetch changes up to 03bc10ab5b0f9b8f81bffbe6e40c944f9d3dbcc5:

  clk: check ->determine/round_rate() return value in clk_calc_new_rates (2015-04-12 21:09:49 -0700)


The changes to the common clock framework for 4.1 are mostly new clock
drivers and updates to existing ones for feature enhancements and bug
fixes. There is more churn than usual in the framework core due to the
change to introduce per-user unique struct clk pointers in 4.0. This
caused several regressions to surface, some of which were sent as fixes
to 4.0. New generic clock drivers were added for GPIO- and PWM-based
clock controllers. Additionally the common clk-divider code received
several fixes to the way it rounds rates.


Archit Taneja (2):
  clk: qcom: fix RCG M/N counter configuration
  clk: qcom: Add EBI2 clocks for IPQ806x

Bartlomiej Zolnierkiewicz (2):
  clk: qcom: fix driver dependencies
  clk: samsung: exynos4: Disable ARMCLK down feature on Exynos4210 SoC

Ben Dooks (1):
  clk: at91: change to using endian agnositc IO

Boris Brezillon (2):
  clk: at91: usb: propagate rate modification to the parent clk
  clk: check ->determine/round_rate() return value in clk_calc_new_rates

Chanwoo Choi (22):
  clk: samsung: exynos5433: Add binding document for Exynos5433 clock domains
  clk: samsung: exynos5433: Add clocks using common clock framework
  clk: samsung: exynos5433: Add MUX clocks of CMU_TOP domain
  clk: samsung: exynos5433: Add clocks for CMU_PERIC domain
  clk: samsung: exynos5433: Add clocks for CMU_PERIS domain
  clk: samsung: exynos5433: Add clocks for CMU_G2D domain
  clk: samsung: exynos5433: Add clocks for CMU_MIF domain
  clk: samsung: exynos5433: Add clocks for CMU_DISP domain
  clk: samsung: exynos5433: Add clocks for CMU_AUD domain
  clk: samsung: exynos5433: Add clocks for CMU_BUS{0|1|2} domains
  clk: samsung: exynos5433: Add missing clocks for CMU_FSYS domain
  clk: samsung: exynos5433: Add clocks for CMU_G3D domain
  clk: samsung: exynos5433: Add clocks for CMU_GSCL domain
  clk: samsung: exynos5433: Add clocks for CMU_APOLLO domain
  clk: samsung: exynos5433: Add clocks for CMU_ATLAS domain
  clk: samsung: exynos5433: Add clocks for CMU_MSCL domain
  clk: samsung: exynos5433: Add clocks for CMU_MFC domain
  clk: samsung: exynos5433: Add clocks for CMU_HEVC domain
  clk: samsung: exynos5433: Add clocks for CMU_ISP domain
  clk: samsung: exynos5433: Add clocks for CMU_CAM0 domain
  clk: samsung: exynos5433: Add clocks for CMU_CAM1 domain
  clk: samsung: exynos5433: Move CLK_SCLK_HDMI_SPDIF_DISP clock to CMU_TOP domain

Chen-Yu Tsai (7):
  clk: sunxi: Move USB clocks to separate file
  clk: sunxi: Add support for sun9i A80 USB clocks and resets
  clk: sunxi: Add muxable ahb factors clock for sun5i and sun7i
  clk: sunxi: Add "cpu" to list of protected clocks for sun5i
  clk: sunxi: Register divs clocks before factor clocks
  clk: sunxi: Make divs clocks specify which output is the base factor clock
  clk: sunxi: Add pll6 / 4 clock output to sun4i-a10-pll6

Dylan Reid (1):
  clk: tegra: Enable HDA to HDMI clocks on Tegra124

Fabian Frederick (1):
  clk: constify of_device_id array

Fengguang Wu (1):
  clk: qcom: fix simple_return.cocci warnings

Georgi Djakov (6):
  clk: qcom: Fix clk_get_parent function return value
  clk: qcom: Do some error handling in configure_bank()
  clk: qcom: Introduce parent_map tables
  dt-bindings: Add #defines for MSM8916 clocks and resets
  clk: qcom: Add MSM8916 Global Clock Controller support
  clk: qcom: Fix parent_map translations

Heikki Krogerus (1):
  clk: fractional-divider: support for divider bypassing

Heiko Stübner (1):
  clk: divider: return real rate instead of divider value

Inha Song (2):
  clk: samsung: Add CLKOUT driver support for Exynos5433 SoC
  clk: samsung: Add CLKOUT driver support for Exynos3250 SoC

Jassi Brar (1):
  clk: Add clock driver for mb86s7x

Julia Lawall (2):
  clk: don't export static symbol
  clk: versatile: test returned value

Krzysztof Kozlowski (4):
  clk: Use lockdep asserts to find missing hold of prepare_lock
  clk: si5351: Constify clock names and struct regmap_config
  clk: si570: Constify struct regmap_config
  clk: cdce706: Constify struct regmap_config

Martin Fuzzey (1):
  clk: clk-gpio-gate: Fix active low

Michael Turquette (6):
  clk: introduce clk_is_match
  Merge tag 'v3.20-exynos5433-clk' of 

Re: [PATCH] perf: annotate: make it respect -i option.

2015-04-20 Thread Wang Nan
On 2015/4/2 16:12, Namhyung Kim wrote:
> On Thu, Apr 02, 2015 at 06:04:52AM +, Wang Nan wrote:
>> There is a bug in perf annotate that it doesn't respect user provided
>> '-i'/'--input' option:
>>
>>  # perf record ls
>>[ perf record: Woken up 1 times to write data ]
>>[ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
>>  # mv ./perf.data ./perf.data.new
>>  # perf annotate -i ./perf.data.new  --stdio
>>failed to open perf.data: No such file or directory  (try 'perf record' first)
>>
>> This patch fix it by setting file path after option parsing, like
>> what 'perf report' does.
>>
>> Signed-off-by: Wang Nan 
> 
> I guess other commands are also suffered from this bug.. anyway,
> 
> Acked-by: Namhyung Kim 
> 
> Thanks,
> Namhyung
> 

Hi,

It looks like the next patch, 'perf kmem: Respect -i option', has already been
picked up by tip/master, but this one seems to have been missed. Is there any
problem with it?

Thank you!




Re: [PATCH RESEND] perf tools: introduce arm64 support unwind test.

2015-04-20 Thread Wang Nan
On 2015/3/30 15:03, Jiri Olsa wrote:
> On Mon, Mar 30, 2015 at 02:04:08AM +, Wang Nan wrote:
>> Newest libunwind does support ARM64, and perf is able to utilize it
>> also. This patch enables the missing perf test dwarf unwind for arm64.
>>
>>  Test result:
>>   # ./perf test unwind
>>   25: Test dwarf unwind  : Ok
>>
>> Signed-off-by: Wang Nan 
> 
> cannot try, but looks ok and match closely x86 pattern ;-)
> 
> Acked-by: Jiri Olsa 
> 
> thanks,
> jirka
> 

Sorry, I'm unable to find this patch in git repo (both tip/master and mainline).
Is there any problem?




Re: [PATCH] Staging: comedi: fix coding style errors in daqboard2000.c

2015-04-20 Thread Gbenga Adalumo
This patch fixes the trailing whitespace and code indentation coding style
errors reported by the checkpatch.pl tool.
The lines where the fixed errors were reported are as follows:

drivers/staging/comedi/drivers/daqboard2000.c:43: ERROR: trailing whitespace
drivers/staging/comedi/drivers/daqboard2000.c:46: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:55: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:58: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:59: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:60: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:61: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:63: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:64: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:65: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:66: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:68: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:75: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:76: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:77: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:78: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:79: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:80: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:81: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:83: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:86: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:87: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:88: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:89: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:90: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:91: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:92: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:93: ERROR: code indent
should use tabs where possible
drivers/staging/comedi/drivers/daqboard2000.c:100: ERROR: code indent
should use tabs where possible


--Gbenga Adalumo

On Mon, Apr 20, 2015 at 7:43 AM, Greg KH  wrote:
> On Sun, Apr 19, 2015 at 07:59:31PM -0700, Gbenga Adalumo wrote:
>> Fix coding style errors found by checkpatch.pl tool
>
> What errors?  Be specific.


Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-20 Thread Dave Young
Hi,

On 04/21/15 at 09:39am, Li, ZhenHua wrote:
> Hi Dave,
> I found the old mail:
> http://lkml.iu.edu/hypermail/linux/kernel/1410.2/03584.html

I know and I have read it before.

==  quote  ===
> > > So with this in mind I would prefer initially taking over the
> > > page-tables from the old kernel before the device drivers re-initialize
> > > the devices.
> >
> > This makes the dump kernel more dependent on data from the old kernel,
> > which we obviously want to avoid when possible.

> Sure, but this is not really possible here (unless we have a generic and
> reliable way to reset all PCI endpoint devices and cancel all in-flight
> DMA before we disable the IOMMU in the kdump kernel).
> Otherwise we always risk data corruption somewhere, in system memory or
> on disk.
=  quote  

What I understand from the above is that it is not really possible to avoid
the problem.

But IMHO we should avoid it, or we will have problems in the future. If we
really cannot avoid it, I would say switching to the pci reset approach is better.

> 
> Please check this and you will find the discussion.
> 
> Regards
> Zhenhua
> 
> On 04/15/2015 02:48 PM, Dave Young wrote:
> >On 04/15/15 at 01:47pm, Li, ZhenHua wrote:
> >>On 04/15/2015 08:57 AM, Dave Young wrote:
> >>>Again, I think it is bad to use old page table, below issues need consider:
> >>>1) make sure old page table are reliable across crash
> >>>2) do not allow writing oldmem after crash
> >>>
> >>>Please correct me if I'm wrong, or if above is not doable I think I will vote for
> >>>resetting pci bus.
> >>>
> >>>Thanks
> >>>Dave
> >>>
> >>Hi Dave,
> >>
> >>When updating the context tables, we have to write their address to root
> >>tables, this will cause writing to old mem.
> >>
> >>Resetting the pci bus has been discussed, please check this:
> >>http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
> >>https://lkml.org/lkml/2014/10/21/890
> >
> >I know one reason to use old pgtable is this looks better because it fixes the
> >real problem, but it is not a good way if it introduce more problems because of
> >it have to use oldmem. I will be glad if this is not a problem but I have not
> >been convinced.
> >
> >OTOH, there's many types of iommu, intel, amd, a lot of other types. They need
> >their own fixes, so it looks not that elegant.
> >
> >For pci reset, it is not perfect, but it has another advantage, the patch is
> >simpler. The problem I see from the old discussion is, reset bus in 2nd kernel
> >is acceptable but it does not fix things on sparc platform. AFAIK current reported
> >problems are intel and amd iommu, at least pci reset stuff does not make it worse.
> >
> >Thanks
> >Dave
> >
> 


Re: [PATCH v3 6/8] selftest/x86: have no dependency on all when cross building

2015-04-20 Thread Tyler Baker
On 20 April 2015 at 16:47, Andy Lutomirski  wrote:
> On Mon, Apr 20, 2015 at 4:34 PM, Tyler Baker  wrote:
>> On 20 April 2015 at 16:22, Andy Lutomirski  wrote:
>>> On Mon, Apr 20, 2015 at 4:15 PM, Tyler Baker  wrote:
>>>> If the CROSS_COMPILE is set remove all's dependency on all_32 and all_64.
>>>>
>>>> Cc: Andy Lutomirski 
>>>> Signed-off-by: Tyler Baker 
>>>> ---
>>>>  tools/testing/selftests/x86/Makefile | 8 +++-
>>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
>>>> index be93945..a5ca38b 100644
>>>> --- a/tools/testing/selftests/x86/Makefile
>>>> +++ b/tools/testing/selftests/x86/Makefile
>>>> @@ -7,15 +7,21 @@ BINARIES_64 := $(TARGETS_C_BOTHBITS:%=%_64)
>>>>
>>>>  CFLAGS := -O2 -g -std=gnu99 -pthread -Wall
>>>>
>>>> +all:
>>>> +
>>>
>>> This...
>>>
>>>>  UNAME_M := $(shell uname -m)
>>>>
>>>> +ifeq ($(CROSS_COMPILE),)
>>>>  # Always build 32-bit tests
>>>>  all: all_32
>>>> -
>>>>  # If we're on a 64-bit host, build 64-bit tests as well
>>>>  ifeq ($(UNAME_M),x86_64)
>>>>  all: all_64
>>>>  endif
>>>> +else
>>>> +# No dependency on all when cross building
>>>> +all:
>>>
>>> ...is redundant with this.  If you delete the "else" and "all:" here, then:
>>
>> Ok, I will remove these bits from this patch. However, the else will
>> need to be added back in the next patch of the series to override the
>> default behavior of EMIT_TESTS and INSTALL_RULE, if you are ok
>> with that.
>
> I'm fine with that, unless you or Shuah want to fix lib.mk.

I will send a follow up series to address this issue.

>
> --Andy

Tyler


[PATCH] Docs: proc: fix kernel version

2015-04-20 Thread Chen Hanxiao
Change kernel version from 3.20 to 4.1

Signed-off-by: Chen Hanxiao 
---
 Documentation/filesystems/proc.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index c3b6b30..1cc7155 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -205,7 +205,7 @@ asynchronous manner and the value may not be very precise. To see a precise
 snapshot of a moment, you can see /proc/<pid>/smaps file and scan page table.
 It's slow but very precise.
 
-Table 1-2: Contents of the status files (as of 3.20.0)
+Table 1-2: Contents of the status files (as of 4.1)
 ..
  Field   Content
  Namefilename of the executable
-- 
2.1.0



Re: [PATCH 7/7] perf data: Fix signess of value

2015-04-20 Thread Wang Nan
On 2015/4/21 5:23, Arnaldo Carvalho de Melo wrote:
> Em Sat, Apr 18, 2015 at 05:50:20PM +0200, Jiri Olsa escreveu:
>> From: Wang Nan 
>>
>> When converting int values, perf first extracts it to a ulonglong, then
>> feeds it to babeltrace as a signed value. For negative 32 bit values
>> (for example, return values of failed syscalls), the extracted data
>> should be something like 0xfffffffe (-2). It becomes a large int64
>> value. Babeltrace denies to insert it with
>> bt_ctf_field_signed_integer_set_value() because it is larger than
>> 0x7fffffff, the largest positive value a signed 32 bit int can be.
> 
> There is no such word "signess", it is "signedness", fixing this up.
> Humm, it seems there is such a word indeed:
> 
> http://www.urbandictionary.com/define.php?term=Signess
> 
> But I bet this is the one we want:
> 
> http://en.wikipedia.org/wiki/Signedness
> 
> Right? :-)
> 
> - Arnaldo
>  

Sorry for the bad English. Please help me to fix it. Thank you.
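
For readers following the quoted description, here is a minimal sketch of the
sign-extension step it implies. This is not the code from the patch:
sign_extend() is a hypothetical helper, and the field width is assumed to be
known from the event format. It shows why a negative 32-bit value that was
extracted into a 64-bit unsigned variable has to be sign-extended before it is
handed to an API that expects a signed integer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: interpret the low `bits` bits of `raw` as a
 * two's complement signed integer and widen it to 64 bits.
 */
static int64_t sign_extend(uint64_t raw, unsigned int bits)
{
	uint64_t mask;
	uint64_t sign;

	assert(bits >= 1 && bits <= 64);

	/* A full-width value only needs reinterpreting. */
	if (bits == 64)
		return (int64_t)raw;

	mask = (UINT64_C(1) << bits) - 1;	/* keep only the field bits */
	sign = UINT64_C(1) << (bits - 1);	/* position of the sign bit */
	raw &= mask;

	/* Flipping the sign bit and subtracting it maps the field to its signed value. */
	return (int64_t)(raw ^ sign) - (int64_t)sign;
}
```

With this, sign_extend(0xfffffffe, 32) evaluates to -2, whereas casting the
raw 64-bit value directly gives 4294967294, which is above the 0x7fffffff
limit mentioned in the quoted text.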




Re: sparc64: Build failure due to commit f1600e549b94 (sparc: Make sparc64 use scalable lib/iommu-common.c functions)

2015-04-20 Thread Guenter Roeck

On 04/20/2015 06:54 PM, Michael Ellerman wrote:
> On Mon, 2015-04-20 at 12:50 -0400, David Miller wrote:
>> From: Guenter Roeck 
>> Date: Mon, 20 Apr 2015 09:44:31 -0700
>>
>>> On Mon, Apr 20, 2015 at 12:25:19PM -0400, David Miller wrote:
>>>> From: Guenter Roeck 
>>>> Date: Sun, 19 Apr 2015 22:17:21 -0700
>>>>
>>>>> The debug option is intended for all _other_ architectures, to
>>>>> ensure that changes made for those don't break alpha/s390
>>>>> builds. alpha/s390 have ARCH_NEEDS_WEAK_PER_CPU and don't need the
>>>>> debug option.
>>>>
>>>> Ironically this would not create a build failure for the architectures
>>>> where this matters, because only powerpc has the like named percpu
>>>> symbol.
>>>>
>>>> So it's not really meeting the stated objective in this case.
>>>
>>> Yes, that is correct; it can only find problems in non-architecture
>>> code, and on the downside produces false positives and thus build errors
>>> like this one.
>>>
>>> Which makes the fix a bit philosophical. Rename iommu_pool_hash in
>>> iommu-common, or drop DEBUG_FORCE_WEAK_PER_CPU. I would rename
>>> iommu_pool_hash, but that is just me. Ultimately, I don't really
>>> care one way or another, as long as the problem gets fixed.
>>
>> If nightly builds of s390 and alpha, the two platforms where this
>> matters, are being done as reported in this thread, then I really
>> don't see the value in DEBUG_FORCE_WEAK_PER_CPU.

Me not either, but, as you say, that is a different discussion.

>
> We do an s390 allmodconfig for every linux-next release:
>
>   http://kisskb.ellerman.id.au/kisskb/target/573/
>
> And also for Linus' tree:
>
>   http://kisskb.ellerman.id.au/kisskb/target/568/
>
> We don't have alpha allmodconfig enabled, though we could, but we do build the
> defconfig:
>
>   http://kisskb.ellerman.id.au/kisskb/target/2499/
>   http://kisskb.ellerman.id.au/kisskb/target/2494/

I cover alpha:allmodconfig in my builds for -next, mainline, as well as all
kernel.org stable releases and release candidates. This discussion is a good
argument for enabling s390:allmodconfig as well.

> So I think that should be sufficient to catch any percpus that are introduced
> in generic code with the same name as s390/alpha variables.

Yes, but unfortunately only after the fact, though I don't see a means
to avoid that.

>> But I guess that's a more involved longer-term discussion and I guess
>> I'll apply Sowmini's patches for now.

Thanks!

Guenter



Re: [RESEND RFC PATCH 3/3] ASoC: mediatek: Add AFE platform driver

2015-04-20 Thread Koro Chen
On Mon, 2015-04-20 at 21:55 +0100, Mark Brown wrote:
> On Mon, Apr 20, 2015 at 02:22:24PM +0800, Koro Chen wrote:
> > On Sat, 2015-04-18 at 18:51 +0100, Mark Brown wrote:
> > > On Fri, Apr 10, 2015 at 04:14:09PM +0800, Koro Chen wrote:
> 
> > > Ah, so the SRAM is directly memory mappable.  Nice.  But we have a
> > > limited amount of it so need to allocate it to a device somehow based on
> > > some factor I guess?
> 
> > Yes, actually SRAM is only used for the main playback path (which is
> > memif "DL1") to achieve low power in real use case. Maybe you think it's
> > better to not describe this in the device tree, but to choose SRAM
> > automatically if memif "DL1" is chosen?
> 
> Since it's directly memory mappable is there actually any cost in
> latency terms from using the SRAM in low latency cases (or did I misread
> what the code was doing there)?  If it can only be used with one
> interface and there's no downside from using it...
The SRAM size to be used is defined by params_buffer_bytes(params), not
fixed (of course limited by the actual available SRAM size on HW), so
the latency should be the same compared to a DRAM having the same size. 

The SRAM can be used by any memif, and that's why the plan was to let DT
make the decision.




Re: [PATCH] mm/slab_common: Support the slub_debug boot option on specific object size

2015-04-20 Thread Gavin Guo
Hi Christoph,

On Mon, Apr 20, 2015 at 11:40 PM, Christoph Lameter  wrote:
> On Sat, 18 Apr 2015, Gavin Guo wrote:
>
>> The slub_debug=PU,kmalloc-xx cannot work because in the
>> create_kmalloc_caches() the s->name is created after the
>> create_kmalloc_cache() is called. The name is NULL in the
>> create_kmalloc_cache() so the kmem_cache_flags() would not set the
>> slub_debug flags to the s->flags. The fix here set up a temporary
>> kmalloc_names string array for the initialization purpose. After the
>> kmalloc_caches are already it can be used to create s->name in the
>> kasprintf.
>
> Ok if you do that then the dynamic creation of the kmalloc hostname can
> also be removed. This patch should do that as well.

Thanks for your reply. I put the kmalloc_names in the __initdata
section. And it will be cleaned. Do you think the kmalloc_names should
be put in the global data section to avoid the dynamic creation of the
kmalloc hostname again?
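
To make the ordering problem from the quoted patch description concrete, here
is a small userspace model, not kernel code. cache_flags() and DEBUG_FLAGS are
hypothetical stand-ins for the name-filter check done when a cache is created;
the real check lives in the slab code and differs in detail. The point is only
that a name filter cannot match while the cache name is still NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEBUG_FLAGS 0x1u	/* hypothetical stand-in for the slub_debug flag bits */

/*
 * Hypothetical model: return the debug flags for a cache if its name
 * starts with the filter given on the command line, otherwise 0.
 */
static unsigned int cache_flags(const char *name, const char *filter)
{
	assert(filter != NULL);

	/* A cache whose name is not set yet can never match the filter. */
	if (name == NULL)
		return 0;

	if (strncmp(filter, name, strlen(filter)) == 0)
		return DEBUG_FLAGS;

	return 0;
}
```

In this model cache_flags(NULL, "kmalloc-64") returns 0 while
cache_flags("kmalloc-64", "kmalloc-64") returns DEBUG_FLAGS, which is why the
name has to be available before the flags are computed.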

Gavin


Re: [PATCH v2 0/7] block: reread partitions changes and fix for loop

2015-04-20 Thread Jens Axboe

On 04/20/2015 08:15 PM, Ming Lei wrote:
> On Mon, Apr 13, 2015 at 5:22 PM, Christoph Hellwig  wrote:
>> The series looks fine to me:
>>
>> Reviewed-by: Christoph Hellwig 
>
> Jens, could you share us if you are OK with this patchset?

Looks good to me, I'll queue it up.

--
Jens Axboe



Re: [PATCH v2 0/7] block: reread partitions changes and fix for loop

2015-04-20 Thread Ming Lei
On Mon, Apr 13, 2015 at 5:22 PM, Christoph Hellwig  wrote:
> The series looks fine to me:
>
> Reviewed-by: Christoph Hellwig 

Jens, could you share us if you are OK with this patchset?

Thanks,


Re: [PATCH] net: dsa: mv88e6xxx: use PORT_DEFAULT_VLAN

2015-04-20 Thread David Miller
From: Vivien Didelot 
Date: Mon, 20 Apr 2015 17:43:26 -0400

> Minor, use the explicit PORT_DEFAULT_VLAN define instead of 0x07.
> 
> Signed-off-by: Vivien Didelot 

Applied.


Re: [PATCH] net: dsa: mv88e6xxx: fix setup of port control 1

2015-04-20 Thread David Miller
From: Andrew Lunn 
Date: Tue, 21 Apr 2015 01:05:07 +0200

> On Mon, Apr 20, 2015 at 05:19:23PM -0400, Vivien Didelot wrote:
>> mv88e6xxx_setup_port_common was writing to PORT_DEFAULT_VLAN (port
>> offset 0x07) instead of PORT_CONTROL_1 (port offset 0x05).
> 
> Hi Vivien
> 
> Good catch.
>  
>> Signed-off-by: Vivien Didelot 
> 
> Fixes: cca8b1337541 ("net: dsa: Use mnemonics rather than register numbers")
> Acked-by: Andrew Lunn 

Applied.


[PATCH v1 1/2] blk-mq: fix race between timeout and CPU hotplug

2015-04-20 Thread Ming Lei
Firstly, during CPU hotplug, even when the queue is frozen, the timeout
handler may still run and access hctx->tags, which may cause a
use-after-free, so this patch deactivates the timeout handler
inside the CPU hotplug notifier.

Secondly, tags can be shared by more than one queue, so we
have to check whether the hctx has been unmapped; otherwise
a use-after-free on tags can still be triggered.

Cc: 
Reported-by: Dongsu Park 
Tested-by: Dongsu Park 
Signed-off-by: Ming Lei 
---
 block/blk-mq.c |   16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ade8a2d..1fccb98 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -677,8 +677,11 @@ static void blk_mq_rq_timer(unsigned long priv)
data.next = blk_rq_timeout(round_jiffies_up(data.next));
mod_timer(&q->timeout, data.next);
} else {
-   queue_for_each_hw_ctx(q, hctx, i)
-   blk_mq_tag_idle(hctx);
+   queue_for_each_hw_ctx(q, hctx, i) {
+   /* the hctx may be unmapped, so check it here */
+   if (blk_mq_hw_queue_mapped(hctx))
+   blk_mq_tag_idle(hctx);
+   }
}
 }
 
@@ -2090,9 +2093,16 @@ static int blk_mq_queue_reinit_notify(struct notifier_block *nb,
 */
list_for_each_entry(q, &all_q_list, all_q_node)
blk_mq_freeze_queue_start(q);
-   list_for_each_entry(q, &all_q_list, all_q_node)
+   list_for_each_entry(q, &all_q_list, all_q_node) {
blk_mq_freeze_queue_wait(q);
 
+   /*
+* timeout handler can't touch hw queue during the
+* reinitialization
+*/
+   del_timer_sync(&q->timeout);
+   }
+
list_for_each_entry(q, &all_q_list, all_q_node)
blk_mq_queue_reinit(q);
 
-- 
1.7.9.5



[PATCH v1 2/2] blk-mq: fix CPU hotplug handling

2015-04-20 Thread Ming Lei
hctx->tags has to be set to NULL when the hw queue is to be unmapped, no
matter whether set->tags[hctx->queue_num] is NULL or not in
blk_mq_map_swqueue(), because shared tags may already have been freed via
another request queue.

The same situation has to be considered when handling CPU online too.
An unmapped hw queue can be remapped after the CPU topology changes, so we
need to allocate tags for the hw queue in blk_mq_map_swqueue(). The tags
allocation for the hw queue can then be removed from the hctx cpu online
notifier, and it is reasonable to do it after the mapping is updated.

Cc: 
Reported-by: Dongsu Park 
Tested-by: Dongsu Park 
Signed-off-by: Ming Lei 
---
 block/blk-mq.c |   34 +-
 1 file changed, 13 insertions(+), 21 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1fccb98..76f460e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1574,22 +1574,6 @@ static int blk_mq_hctx_cpu_offline(struct blk_mq_hw_ctx 
*hctx, int cpu)
return NOTIFY_OK;
 }
 
-static int blk_mq_hctx_cpu_online(struct blk_mq_hw_ctx *hctx, int cpu)
-{
-   struct request_queue *q = hctx->queue;
-   struct blk_mq_tag_set *set = q->tag_set;
-
-   if (set->tags[hctx->queue_num])
-   return NOTIFY_OK;
-
-   set->tags[hctx->queue_num] = blk_mq_init_rq_map(set, hctx->queue_num);
-   if (!set->tags[hctx->queue_num])
-   return NOTIFY_STOP;
-
-   hctx->tags = set->tags[hctx->queue_num];
-   return NOTIFY_OK;
-}
-
 static int blk_mq_hctx_notify(void *data, unsigned long action,
  unsigned int cpu)
 {
@@ -1597,8 +1581,11 @@ static int blk_mq_hctx_notify(void *data, unsigned long 
action,
 
if (action == CPU_DEAD || action == CPU_DEAD_FROZEN)
return blk_mq_hctx_cpu_offline(hctx, cpu);
-   else if (action == CPU_ONLINE || action == CPU_ONLINE_FROZEN)
-   return blk_mq_hctx_cpu_online(hctx, cpu);
+
+   /*
+* In case of CPU online, tags may be reallocated
+* in blk_mq_map_swqueue() after mapping is updated.
+*/
 
return NOTIFY_OK;
 }
@@ -1778,6 +1765,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
unsigned int i;
struct blk_mq_hw_ctx *hctx;
struct blk_mq_ctx *ctx;
+   struct blk_mq_tag_set *set = q->tag_set;
 
queue_for_each_hw_ctx(q, hctx, i) {
cpumask_clear(hctx->cpumask);
@@ -1806,16 +1794,20 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 * disable it and free the request entries.
 */
if (!hctx->nr_ctx) {
-   struct blk_mq_tag_set *set = q->tag_set;
-
if (set->tags[i]) {
blk_mq_free_rq_map(set, set->tags[i], i);
set->tags[i] = NULL;
-   hctx->tags = NULL;
}
+   hctx->tags = NULL;
continue;
}
 
+   /* unmapped hw queue can be remapped after CPU topo changed */
+   if (!set->tags[i])
+   set->tags[i] = blk_mq_init_rq_map(set, i);
+   hctx->tags = set->tags[i];
+   WARN_ON(!hctx->tags);
+
/*
 * Set the map size to the number of mapped software queues.
 * This is more accurate and more efficient than looping
-- 
1.7.9.5



[PATCH v1 0/2] blk-mq: fix oops caused by CPU hotplug

2015-04-20 Thread Ming Lei
Hi Guys,

Dongsu Park reported[1] that kernel oops can be triggered easily by
CPU plug when there is pending I/O on virtio-scsi.

Turns out two problems exist in blk-mq core code and both can trigger
an oops on CPU plug:
- timeout handling vs CPU hotplug: in particular, the tags of an unmapped
hw queue are still touched by the timeout handler
- in case of shared tags, there is a bug in setting and checking
hctx->tags during CPU hotplug

The two patches fix the two problems, and Dongsu has verified that
the oops is fixed with the two patches too.

[1], http://marc.info/?t=14292638928=1=2

V1:
- update comment
- moving tags allocation in blk_mq_map_swqueue() because
unmapped hw queue can be remapped after CPU topo is changed

 block/blk-mq.c |   50 ++
 1 file changed, 26 insertions(+), 24 deletions(-)


Thanks,
Ming Lei



linux-next: manual merge of the v4l-dvb tree with Linus' tree

2015-04-20 Thread Stephen Rothwell
Hi Mauro,

Today's linux-next merge of the v4l-dvb tree got a conflict in
include/uapi/linux/media-bus-format.h between various commits from
Linus' tree and various commits from the v4l-dvb tree.

I reported this previously against the drm tree but some of the numbers
have changed.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au

diff --cc include/uapi/linux/media-bus-format.h
index 73c78f18a328,d391893064a0..
--- a/include/uapi/linux/media-bus-format.h
+++ b/include/uapi/linux/media-bus-format.h
@@@ -45,18 -43,14 +45,20 @@@
  #define MEDIA_BUS_FMT_RGB565_2X8_BE   0x1007
  #define MEDIA_BUS_FMT_RGB565_2X8_LE   0x1008
  #define MEDIA_BUS_FMT_RGB666_1X18 0x1009
 +#define MEDIA_BUS_FMT_RGB666_1X24_CPADHI  0x1015
 +#define MEDIA_BUS_FMT_RGB666_1X7X3_SPWG   0x1010
 +#define MEDIA_BUS_FMT_BGR888_1X24 0x1013
 +#define MEDIA_BUS_FMT_GBR888_1X24 0x1014
+ #define MEDIA_BUS_FMT_RBG888_1X24 0x100e
  #define MEDIA_BUS_FMT_RGB888_1X24 0x100a
  #define MEDIA_BUS_FMT_RGB888_2X12_BE  0x100b
  #define MEDIA_BUS_FMT_RGB888_2X12_LE  0x100c
 +#define MEDIA_BUS_FMT_RGB888_1X7X4_SPWG   0x1011
 +#define MEDIA_BUS_FMT_RGB888_1X7X4_JEIDA  0x1012
  #define MEDIA_BUS_FMT_ARGB_1X32   0x100d
+ #define MEDIA_BUS_FMT_RGB888_1X32_PADHI   0x100f
  
 -/* YUV (including grey) - next is 0x2025 */
 +/* YUV (including grey) - next is 0x2026 */
  #define MEDIA_BUS_FMT_Y8_1X8  0x2001
  #define MEDIA_BUS_FMT_UV8_1X8 0x2015
  #define MEDIA_BUS_FMT_UYVY8_1_5X8 0x2002
@@@ -82,13 -80,7 +88,14 @@@
  #define MEDIA_BUS_FMT_VYUY10_1X20 0x201b
  #define MEDIA_BUS_FMT_YUYV10_1X20 0x200d
  #define MEDIA_BUS_FMT_YVYU10_1X20 0x200e
 +#define MEDIA_BUS_FMT_YUV8_1X24   0x2025
 +#define MEDIA_BUS_FMT_YUV10_1X30  0x2016
 +#define MEDIA_BUS_FMT_AYUV8_1X32  0x2017
 +#define MEDIA_BUS_FMT_UYVY12_2X12 0x201c
 +#define MEDIA_BUS_FMT_VYUY12_2X12 0x201d
 +#define MEDIA_BUS_FMT_YUYV12_2X12 0x201e
 +#define MEDIA_BUS_FMT_YVYU12_2X12 0x201f
+ #define MEDIA_BUS_FMT_VUY8_1X24   0x2024
  #define MEDIA_BUS_FMT_UYVY12_1X24 0x2020
  #define MEDIA_BUS_FMT_VYUY12_1X24 0x2021
  #define MEDIA_BUS_FMT_YUYV12_1X24 0x2022




Re: sparc64: Build failure due to commit f1600e549b94 (sparc: Make sparc64 use scalable lib/iommu-common.c functions)

2015-04-20 Thread Michael Ellerman
On Mon, 2015-04-20 at 12:50 -0400, David Miller wrote:
> From: Guenter Roeck 
> Date: Mon, 20 Apr 2015 09:44:31 -0700
> 
> > On Mon, Apr 20, 2015 at 12:25:19PM -0400, David Miller wrote:
> >> From: Guenter Roeck 
> >> Date: Sun, 19 Apr 2015 22:17:21 -0700
> >> 
> >> > The debug option is intended for all _other_ architectures, to
> >> > ensure that changes made for those don't break alpha/s390
> >> > builds. alpha/s390 have ARCH_NEEDS_WEAK_PER_CPU and don't need the
> >> > debug option.
> >> 
> >> Ironically this would not create a build failure for the architectures
> >> where this matters, because only powerpc has the like named percpu
> >> symbol.
> >> 
> >> So it's not really meeting the stated objective in this case.
> > 
> > Yes, that is correct; it can only find problems in non-architecture
> > code, and on the downside produces false positives and thus build errors
> > like this one.
> > 
> > Which makes the fix a bit philosophical. Rename iommu_pool_hash in
> > iommu-common, or drop DEBUG_FORCE_WEAK_PER_CPU. I would rename
> > iommu_pool_hash, but that is just me. Ultimately, I don't really
> > care one way or another, as long as the problem gets fixed.
> 
> If nightly builds of s390 and alpha, the two platforms where this
> matters, are being done as reported in this thread, then I really
> don't see the value in DEBUG_FORCE_WEAK_PER_CPU.

We do an s390 allmodconfig for every linux-next release:

  http://kisskb.ellerman.id.au/kisskb/target/573/

And also for Linus' tree:

  http://kisskb.ellerman.id.au/kisskb/target/568/

We don't have alpha allmodconfig enabled, though we could, but we do build the
defconfig:

  http://kisskb.ellerman.id.au/kisskb/target/2499/
  http://kisskb.ellerman.id.au/kisskb/target/2494/

So I think that should be sufficient to catch any percpus that are introduced
in generic code with the same name as s390/alpha variables.


> But I guess that's a more involved longer-term discussion and I guess
> I'll apply Sowmini's patches for now.

Yeah I guess it is. Thanks for merging the fix.

cheers




linux-next: manual merge of the v4l-dvb tree with Linus' tree

2015-04-20 Thread Stephen Rothwell
Hi Mauro,

Today's linux-next merge of the v4l-dvb tree got a conflict in
Documentation/DocBook/media/v4l/subdev-formats.xml between various
commits from Linus' tree and various commits from the v4l-dvb tree.

I reported this previously against the drm tree, but some of the
numbers have changed now.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au

diff --cc Documentation/DocBook/media/v4l/subdev-formats.xml
index 553a38024745,bc8d3fb9e4a9..
--- a/Documentation/DocBook/media/v4l/subdev-formats.xml
+++ b/Documentation/DocBook/media/v4l/subdev-formats.xml
@@@ -482,96 -440,36 +482,126 @@@ see .b1
  b0

 +  
 +MEDIA_BUS_FMT_RGB666_1X24_CPADHI
 +0x1015
 +
 +
 +0
 +0
 +r5
 +r4
 +r3
 +r2
 +r1
 +r0
 +0
 +0
 +g5
 +g4
 +g3
 +g2
 +g1
 +g0
 +0
 +0
 +b5
 +b4
 +b3
 +b2
 +b1
 +b0
 +  
 +  
 +MEDIA_BUS_FMT_BGR888_1X24
 +0x1013
 +
 +
 +b7
 +b6
 +b5
 +b4
 +b3
 +b2
 +b1
 +b0
 +g7
 +g6
 +g5
 +g4
 +g3
 +g2
 +g1
 +g0
 +r7
 +r6
 +r5
 +r4
 +r3
 +r2
 +r1
 +r0
 +  
 +  
 +MEDIA_BUS_FMT_GBR888_1X24
 +0x1014
 +
 +
 +g7
 +g6
 +g5
 +g4
 +g3
 +g2
 +g1
 +g0
 +b7
 +b6
 +b5
 +b4
 +b3
 +b2
 +b1
 +b0
 +r7
 +r6
 +r5
 +r4
 +r3
 +r2
 +r1
 +r0
 +  
+   
+ MEDIA_BUS_FMT_RBG888_1X24
+ 0x100e
+ 
+ 
+ r7
+ r6
+ r5
+ r4
+ r3
+ r2
+ r1
+ r0
+ b7
+ b6
+ b5
+ b4
+ b3
+ b2
+ b1
+ b0
+ g7
+ g6
+ g5
+ g4
+ g3
+ g2
+ g1
+ g0
+   

  MEDIA_BUS_FMT_RGB888_1X24
  0x100a
@@@ -3047,92 -2719,33 +3106,92 @@@
  u1
  u0

 +  
 +MEDIA_BUS_FMT_YUV8_1X24
 +0x2025
 +
 +-
 +-
 +-
 +-
 +-
 +-
 +-
 +-
 +y7
 +y6
 +y5
 +y4
 +y3
 +y2
 +y1
 +y0
 +u7
 +u6
 +u5
 +u4
 +u3
 +u2
 +u1
 +u0
 +v7
 +v6
 +v5
 +v4
 +v3
 +v2
 +v1
 +v0
 +  
 +  
 +MEDIA_BUS_FMT_YUV10_1X30
 +0x2016
 +
- -
- -
- y9
- y8
++
 +y7
 +y6
 +y5
 +y4
 +y3
 +y2
 +y1
 +y0
- u9
- u8
- u7
- u6
- u5
- u4
- u3
- u2
- u1
- u0
- v9
- v8
- v7
- v6
- v5
- v4
- v3
- v2
- v1
- v0
++d
++d
++d
++d
++d
++d
++d
++d
 +  
-   
- MEDIA_BUS_FMT_AYUV8_1X32
- 0x2017
+   
+ MEDIA_BUS_FMT_YDYUYDYV8_1X16
+ 0x2014
  
- a7
- a6
- a5
- a4
- a3
- a2
- a1
- a0
+ 
+ y7
+ y6
+ y5
+ y4
+ y3
+ y2
+ y1
+ y0
+ d
+ d
+ d
+ d
+ d
+ d
+ d
+   

[PATCH 2/2] added two nvme commands for open/close streams and garbage collection

2015-04-20 Thread kwan.huen
---
 include/uapi/linux/nvme.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/nvme.h b/include/uapi/linux/nvme.h
index aef9a81..5025610 100644
--- a/include/uapi/linux/nvme.h
+++ b/include/uapi/linux/nvme.h
@@ -229,6 +229,8 @@ enum nvme_opcode {
nvme_cmd_resv_report= 0x0e,
nvme_cmd_resv_acquire   = 0x11,
nvme_cmd_resv_release   = 0x15,
+   nvme_cmd_stream_ctrl= 0x18,
+   nvme_cmd_gc_ctrl= 0x1c,
 };
 
 struct nvme_common_command {
-- 
1.9.1



[PATCH 1/2] added stream id write support

2015-04-20 Thread kwan.huen
---
 drivers/block/nvme-core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 85b8036..332341a 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -769,6 +769,9 @@ static int nvme_submit_iod(struct nvme_queue *nvmeq, struct 
nvme_iod *iod,
if (req->cmd_flags & REQ_RAHEAD)
dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;
 
+   if (rq_data_dir(req))
+   dsmgmt |= bio_get_streamid(req->bio) << 8;
+
cmnd = >sq_cmds[nvmeq->sq_tail];
memset(cmnd, 0, sizeof(*cmnd));
 
-- 
1.9.1



Write with Stream ID Support

2015-04-20 Thread kwan.huen

The attached patch set enables basic write with stream ID support.
The first patch reads the stream id embedded in the bio and passes it to the
device along with the write command.
The second patch adds two new nvme commands to be used with ioctl
so that an application can open/close streams and trigger host
initiated garbage collection.
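
A rough userspace sketch of the bit packing done by the first patch, where the stream id is shifted left by 8 and OR-ed into dsmgmt for writes only. The helper build_dsmgmt(), the DSM_FREQ_PREFETCH placeholder value and the 24-bit bound on the stream id are assumptions made for illustration; they are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative placeholder only, standing in for NVME_RW_DSM_FREQ_PREFETCH. */
#define DSM_FREQ_PREFETCH 7u

/*
 * Hypothetical helper: build the dsmgmt field the way the patch does,
 * with the stream id placed above the low 8 bits for writes only.
 */
static uint32_t build_dsmgmt(int is_write, int readahead, uint32_t stream_id)
{
	uint32_t dsmgmt = 0;

	/* Assumed bound so that the shifted value stays within 32 bits. */
	assert((stream_id >> 24) == 0);

	if (readahead)
		dsmgmt |= DSM_FREQ_PREFETCH;

	if (is_write)
		dsmgmt |= stream_id << 8;

	return dsmgmt;
}
```

For example, a write with stream id 5 and no readahead hint yields 0x500, while a read leaves the stream bits clear.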


Re: [PATCH 1/2 V2] memory-hotplug: fix BUG_ON in move_freepages()

2015-04-20 Thread Xishi Qiu
On 2015/4/21 2:23, Yasuaki Ishimatsu wrote:

> 
> On Mon, 20 Apr 2015 11:42:10 +0800
> Xishi Qiu  wrote:
> 
>> On 2015/4/20 11:29, Yasuaki Ishimatsu wrote:
>>
>>>
>>> On Mon, 20 Apr 2015 10:45:45 +0800
>>> Xishi Qiu  wrote:
>>>
 On 2015/4/20 9:42, Gu Zheng wrote:

> Hi Xishi,
> On 04/18/2015 04:05 AM, Yasuaki Ishimatsu wrote:
>
>>
>> Your patches will fix your issue.
>> But, if BIOS reports memory first at node hot add, pgdat can
>> not be initialized.
>>
>> Memory hot add flows are as follows:
>>
>> add_memory
>>   ...
>>   -> hotadd_new_pgdat()
>>   ...
>>   -> node_set_online(nid)
>>
>> When calling hotadd_new_pgdat() for a hot added node, the node is
>> offline because node_set_online() is not called yet. So if applying
>> your patches, the pgdat is not initialized in this case.
>
> Ishimatsu's worry is reasonable. And I am afraid the fix here is a bit
> overkill.
>
>>
>> Thanks,
>> Yasuaki Ishimatsu
>>
>> On Fri, 17 Apr 2015 18:50:32 +0800
>> Xishi Qiu  wrote:
>>
>>> Hot remove nodeXX, then hot add nodeXX. If BIOS report cpu first, it 
>>> will call
>>> hotadd_new_pgdat(nid, 0), this will set pgdat->node_start_pfn to 0. As 
>>> nodeXX
>>> exists at boot time, so pgdat->node_spanned_pages is the same as 
>>> original. Then
>>> free_area_init_core()->memmap_init() will pass a wrong start and a 
>>> nonzero size.
>
> As your analysis said the root cause here is passing a *0* as the 
> node_start_pfn,
> then the chaos occurred when init the zones. And this only happens to the 
> re-hotadd
> node, so how about using the saved *node_start_pfn* (via 
> get_pfn_range_for_nid(nid, _pfn, _pfn))
> instead if we find "pgdat->node_start_pfn == 0 && !node_online(XXX)"?
>
> Thanks,
> Gu
>

 Hi Gu,

 I first considered this method, but if the hot added node's start and size 
 are different
 from before, it makes the chaos.

>>>
 e.g.
 nodeXX (8-16G)
 remove nodeXX 
 BIOS report cpu first and online it
 hotadd nodeXX
 use the original value, so pgdat->node_start_pfn is set to 8G, and size is 
 8G
 BIOS report mem(10-12G)
 call add_memory()->__add_zone()->grow_zone_span()/grow_pgdat_span()
 the start is still 8G, not 10G, this is chaos!
>>>
>>> If you set CONFIG_HAVE_MEMBLOCK_NODE_MAP, kernel shows the following
>>> pr_info()'s message.
>>>
>>> void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
>>> unsigned long node_start_pfn, unsigned long *zholes_size)
>>> {
>>> ...
>>> #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
>>> get_pfn_range_for_nid(nid, _pfn, _pfn);
>>> pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid,
>>> (u64)start_pfn << PAGE_SHIFT, ((u64)end_pfn << PAGE_SHIFT) 
>>> - 1);
>>> #endif
>>> }
>>>
>>> Is the memory range of the message "8G - 16G"?
>>> If so, the reason is that memblk is not deleted at memory hot remove.
>>>
>>> Thanks,
>>> Yasuaki Ishimatsu
>>>
>>
>> Hi Yasuaki,
>>
> 
>> By reading the code, I find memblk is not deleted at memory hot remove.
>> I am not sure whether we should remove it. If remove it, we should also reset
>> "arch_zone_lowest_possible_pfn", right? It seems a little complicated.
> 
> I think memblk should be added/removed by hot adding/removing memory.
> But, arch_zone_lowest_possible_pfn should not be changed.
> 

Ok, thanks for your suggestion.

> Thanks,
> Yasuaki Ishimatsu
> 
>>
>> Thanks,
>> Xishi Qiu
>>
>>>
>>>

 Thanks,
 Xishi Qiu

>>>
>>> .
>>>
>>
>>
>>
> 
> .
> 





Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-04-20 Thread Li, ZhenHua

Hi Dave,
I found the old mail:
http://lkml.iu.edu/hypermail/linux/kernel/1410.2/03584.html

Please check this and you will find the discussion.

Regards
Zhenhua

On 04/15/2015 02:48 PM, Dave Young wrote:

On 04/15/15 at 01:47pm, Li, ZhenHua wrote:

On 04/15/2015 08:57 AM, Dave Young wrote:

Again, I think it is bad to use the old page table; the issues below need
to be considered:
1) make sure the old page tables are reliable across a crash
2) do not allow writing to oldmem after a crash

Please correct me if I'm wrong, or if the above is not doable I think I will
vote for resetting the pci bus.

Thanks
Dave


Hi Dave,

When updating the context tables, we have to write their address to root
tables, this will cause writing to old mem.

Resetting the pci bus has been discussed, please check this:
http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
https://lkml.org/lkml/2014/10/21/890


I know one reason to use the old pgtable is that it looks better because it
fixes the real problem, but it is not a good way if it introduces more
problems because it has to use oldmem. I will be glad if this is not a
problem but I have not been convinced.

OTOH, there are many types of iommu: intel, amd, and a lot of others. They need
their own fixes, so it does not look that elegant.

For pci reset, it is not perfect, but it has another advantage: the patch is
simpler. The problem I see from the old discussion is that resetting the bus
in the 2nd kernel is acceptable but it does not fix things on the sparc
platform. AFAIK the currently reported problems are with intel and amd iommu,
so at least the pci reset does not make it worse.

Thanks
Dave





Re: [PATCH 1/2] sched: lockless wake-queues

2015-04-20 Thread George Spelvin
>> Is there some reason you don't use the simpler singly-linked list
>> construction with the tail being a pointer to a pointer:

> Sure, that would also work.

It's just a convenient simplification, already used in struct hlist_node.
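
For readers unfamiliar with the construction, here is a minimal single-threaded sketch of a singly-linked list whose tail is a pointer to a pointer. The names struct node, struct queue, queue_init(), queue_add() and queue_len() are made up for illustration, and the sketch leaves out the cmpxchg-based "already queued" check discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node and head types, for illustration only. */
struct node {
	struct node *next;
	int id;
};

struct queue {
	struct node *first;
	struct node **lastp;	/* address of the 'next' slot to fill next */
};

static void queue_init(struct queue *q)
{
	q->first = NULL;
	q->lastp = &q->first;	/* empty list: the slot is the head itself */
}

/* Append in O(1) with no special case for the empty list. */
static void queue_add(struct queue *q, struct node *n)
{
	assert(n != NULL);
	n->next = NULL;
	*q->lastp = n;
	q->lastp = &n->next;
}

/* Walk the list in insertion order and return its length. */
static int queue_len(const struct queue *q)
{
	int len = 0;

	for (const struct node *n = q->first; n; n = n->next)
		len++;
	return len;
}
```

Because lastp always points at the slot to be filled, appending never has to test whether the list is empty, which is the simplification referred to above.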

>> +/*
>> + * Queue a task for later wake-up by wake_up_q().  If the task is already
>> + * queued by someone else, leave it to them to deliver the wakeup.
>
> This is already commented in the cmpxchg.
>
>> + *
>> + * This property makes it impossible to guarantee the order of wakeups,
>> + * but for efficiency we try to deliver wakeups in the order tasks
>> + * are added.  
>
> Ok.

This is just me thinking "out loud" about the semantics.

>> It may also be worth commenting the fact that wake_up_q() leaves the
>> struct wake_q_head in a corrupt state, so don't try to do it again.

> Right, we could re-init the list once the loop is complete, yes. But it
> shouldn't matter due to how we use wake-queues.

Oh, indeed, there's no point.  Unless it's worth a debugging option,
but as you say the usage patterns are such that I don't expect it's
needed.

It just seemed worth commenting explicitly.


If I were going to comment it, here's what I'd write.  Feel free
to copy any or none of this:

/*
 * Wake-queues are lists of tasks about to be woken up.
 * Deferring the wakeup is useful when the waker is waking up multiple
 * tasks while holding a lock which the woken tasks will need, so they'd
 * go straight into a wait queue anyway.
 *
 * So instead, the waker can wake_q_add(, task) under the lock,
 * and then wake_up_q() afterward.
 *
 * The list head is allocated on the waker's stack, and the queue nodes
 * are preallocated as part of the task struct.
 *
 * A reference to each task (get_task_struct()) is held during the wait,
 * so the list will remain valid through wake_up_q().
 *
 * One per task suffices, because there's never a need for a task to be
 * in two wake queues simultaneously; it is forbidden to abandon a task
 * in a wake queue (a call to wake_up_q() _must_ follow), so if a task is
 * already in a wake queue, the wakeup will happen soon and the second
 * waker can just skip it.
 *
 * As with all Linux wakeup primitives, there is no guarantee about the
 * order, but this code tries to wake tasks in wake_q_add order.
 *
 * The WAKE_Q macro declares and initializes the list head.
 * wake_up_q() does NOT reinitialize the list; it's expected to be
 * called near the end of a function, where the fact that the queue is
 * not used again will be easy to see by inspection.
 */
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] drivers/usb/chipidea/debuc.c: avoid out of bound read

2015-04-20 Thread Peter Chen
On Fri, Apr 17, 2015 at 08:04:13AM +0200, Heinrich Schuchardt wrote:
> A string written by the user may not be zero terminated.
> 
> sscanf may read memory beyond the buffer if no zero byte
> is found.
> 
> Signed-off-by: Heinrich Schuchardt 
> ---
>  drivers/usb/chipidea/debug.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/chipidea/debug.c b/drivers/usb/chipidea/debug.c
> index dfb05ed..ef08af3 100644
> --- a/drivers/usb/chipidea/debug.c
> +++ b/drivers/usb/chipidea/debug.c
> @@ -88,9 +88,13 @@ static ssize_t ci_port_test_write(struct file *file, const 
> char __user *ubuf,
>   char buf[32];
>   int ret;
>  
> - if (copy_from_user(buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
> + count = min_t(size_t, sizeof(buf) - 1, count);
> + if (copy_from_user(buf, ubuf, count))
>   return -EFAULT;

Any reasons to change above?
>  
> + /* sscanf requires a zero terminated string */
> + buf[count] = 0;
> +

I prefer using '\0'

>   if (sscanf(buf, "%u", ) != 1)
>   return -EINVAL;
>  
> -- 
> 2.1.4
> 
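
One possible reading of why the patch clamps count before the copy: the clamped value is then a safe index for the terminator. A userspace sketch under that assumption follows; parse_mode() is a hypothetical stand-in for the debugfs write handler, and memcpy() replaces copy_from_user().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the write handler.
 * Returns 0 on success, -1 if no unsigned number could be parsed.
 */
static int parse_mode(const char *ubuf, size_t count, unsigned int *mode)
{
	char buf[32];

	assert(ubuf != NULL && mode != NULL);

	/* Clamp first, so that buf[count] below is always inside buf. */
	if (count > sizeof(buf) - 1)
		count = sizeof(buf) - 1;

	memcpy(buf, ubuf, count);

	/* sscanf() needs a terminated string; user data may lack the '\0'. */
	buf[count] = '\0';

	if (sscanf(buf, "%u", mode) != 1)
		return -1;

	return 0;
}
```

The input passed in does not have to be terminated; only the first count bytes are read from it.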

-- 

Best Regards,
Peter Chen


Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

2015-04-20 Thread Paul E. McKenney
On Mon, Apr 20, 2015 at 04:50:07PM -0500, Clark Williams wrote:
> On Mon, 20 Apr 2015 14:15:04 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Mon, Apr 20, 2015 at 04:40:49PM -0400, Steven Rostedt wrote:
> > > On Mon, Apr 20, 2015 at 10:09:03AM -0700, Paul E. McKenney wrote:
> > > > 
> > > > The sysfs knob might be nice, but as far as I know nobody has been
> > > > complaining about it.
> > > > 
> > > > Besides, we already have the rcutree.kthread_prio= kernel-boot 
> > > > parameter.
> > > > So how about if the Kconfig parameter selects either SCHED_OTHER
> > > > (the default) or SCHED_FIFO:1, and then the boot parameter can be used
> > > > to select other values.
> > > 
> > > Hmm, what priority is this for anyway. To change the priority of the boost
> > > value at run time, do we only need to change the priority of the rcub 
> > > threads?
> > > 
> > > And the priority of the other rcu threads can change as well with a simple
> > > chrt?
> > > 
> > > If that's the case, then we don't need a sysctl knob at all.
> > 
> > For the grace-period kthreads and the boost kthread, that is the case.
> > It is also the case for the per-CPU kthreads that invoke RCU callbacks
> > for the non-offloaded RCU_BOOST configuration (and that replace all
> > softirq RCU work in -rt).
> > 
> > So, should I just ditch all of the priority-setting within RCU and tell
> > users to just use chrt?
> 
> Looks to me like all we need to do is tell people if they need a boost
> higher than the compiled in default (RCU_KTHREAD_PRIO), then chrt the
> priority of the rcub thread to the desired priority. 

There's the rub.  They also need to chrt the RCU grace-period kthreads
as well as the per-CPU kthreads (rcuc).  Which is a pain and easy to
get wrong.

So at this point, I am leaning towards keeping RCU_KTHREAD_PRIO, but
hiding it behind RCU_EXPERT.  Someone in an emergency situation can use
chrt to get RCU going, at least assuming that they had the foresight to
leave a prio-99 shell running somewhere and assuming that they do the
chrt before the system hits OOM.  But they have to do all that anyway
if they were to use a sysfs or similar interface.  And it is easy to
tell when you have boosted all the necessary kthreads because RCU
grace periods start advancing once again.  You don't get that feedback
when you set things up at boot time.  ;-)

So again, at least for the moment, I believe that RCU need not provide
a run-time interface for changing RCU kthread priorities, that the
RCU_KTHREAD_PRIO Kconfig parameter should remain, except that it needs
to be hidden behind RCU_EXPERT, and that the rcutree.kthread_prio=
kernel-boot parameter should also remain.

Seem reasonable?

Thanx, Paul



[RESEND PATCH] arm64: kgdb: fix single stepping

2015-04-20 Thread AKASHI Takahiro
Jason,

Could you please review my patch below?
See also arm64 maintainer's comment:
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-January/313712.html

Thanks,
-Takahiro AKASHI

I tried to verify kgdb in vanilla kernel on fast model, but it seems that
the single stepping with kgdb doesn't work correctly since its first
appearance at v3.15.

On v3.15, 'stepi' command after breaking the kernel at some breakpoint
steps forward to the next instruction, but the succeeding 'stepi' never
goes beyond that.
On v3.16, 'stepi' moves forward and stops at the next instruction just
after enable_dbg in el1_dbg, and never goes beyond that. This variance of
behavior seems to come in with the following patch in v3.16:

commit 2a2830703a23 ("arm64: debug: avoid accessing mdscr_el1 on fault
paths where possible")

This patch
(1) moves kgdb_disable_single_step() from 'c' command handling to single
step handler.
This makes sure that single stepping gets effective at every 's' command.
Please note that, under the current implementation, single step bit in
spsr, which is cleared by the first single stepping, will not be set
again for the consecutive 's' commands because single step bit in mdscr
is still kept on (that is, kernel_active_single_step() in
kgdb_arch_handle_exception() is true).
(2) re-implements kgdb_roundup_cpus() because the current implementation
enabled interrupts naively. See below.
(3) removes 'enable_dbg' in el1_dbg.
Single step bit in mdscr is turned on in do_handle_exception()->
kgdb_handle_exception() before returning to debugged context, and if
debug exception is enabled in el1_dbg, we will see unexpected single-
stepping in el1_dbg.
Since v3.18, the following patch does the same:
  commit 1059c6bf8534 ("arm64: debug: don't re-enable debug exceptions
  on return from el1_dbg")
(4) masks interrupts while single-stepping one instruction.
If an interrupt is caught during processing a single-stepping, debug
exception is unintentionally enabled by el1_irq's 'enable_dbg' before
returning to debugged context.
Thus, like in (2), we will see unexpected single-stepping in el1_irq.

Basically (1) and (2) are for v3.15, (3) and (4) for v3.1[67].

* issue fixed by (2):
Without (2), we would see another problem if a breakpoint is set at
interrupt-sensible places, like gic_handle_irq():

KGDB: re-enter error: breakpoint removed ffc81258
[ cut here ]
WARNING: CPU: 0 PID: 650 at kernel/debug/debug_core.c:435
kgdb_handle_exception+0x1dc/0x1f4()
Modules linked in:
CPU: 0 PID: 650 Comm: sh Not tainted 3.17.0-rc2+ #177
Call trace:
[] dump_backtrace+0x0/0x130
[] show_stack+0x10/0x1c
[] dump_stack+0x74/0xb8
[] warn_slowpath_common+0x8c/0xb4
[] warn_slowpath_null+0x14/0x20
[] kgdb_handle_exception+0x1d8/0x1f4
[] kgdb_brk_fn+0x18/0x28
[] brk_handler+0x9c/0xe8
[] do_debug_exception+0x3c/0xac
Exception stack(0xffc07e027650 to 0xffc07e027770)
...
[] el1_dbg+0x14/0x68
[] kgdb_cpu_enter+0x464/0x5c0
[] kgdb_handle_exception+0x190/0x1f4
[] kgdb_brk_fn+0x18/0x28
[] brk_handler+0x9c/0xe8
[] do_debug_exception+0x3c/0xac
Exception stack(0xffc07e027ac0 to 0xffc07e027be0)
...
[] el1_dbg+0x14/0x68
[] __handle_sysrq+0x11c/0x190
[] write_sysrq_trigger+0x4c/0x60
[] proc_reg_write+0x54/0x84
[] vfs_write+0x98/0x1c8
[] SyS_write+0x40/0xa0

Once some interrupt occurs, a breakpoint at gic_handle_irq() triggers kgdb.
Kgdb then calls kgdb_roundup_cpus() to sync with other cpus.
Current kgdb_roundup_cpus() unmasks interrupts temporarily to
use smp_call_function().
This eventually allows another interrupt to occur and likely results in
hitting a breakpoint at gic_handle_irq() again since debug exception is
always enabled in el1_irq.

We can avoid this issue by specifying "nokgdbroundup" as a kernel parameter,
but this will also leave other cpus in an unknown state in terms of kgdb,
and may result in interference with kgdb activity.

Signed-off-by: AKASHI Takahiro 
---
 arch/arm64/kernel/kgdb.c |   60 +++---
 1 file changed, 46 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index a0d10c5..81b5910 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -19,9 +19,13 @@
  * along with this program.  If not, see .
  */
 
+#include 
 #include 
+#include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
@@ -95,6 +99,9 @@ struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
{ "fpcr", 4, -1 },
 };
 
+static DEFINE_PER_CPU(unsigned int, kgdb_pstate);
+static DEFINE_PER_CPU(struct irq_work, kgdb_irq_work);
+
 char *dbg_get_reg(int regno, void *mem, struct pt_regs 

Re: [PATCH v2 1/6] arm64: Enable Hisilicon ARMv8 SoC family in Kconfig and defconfig

2015-04-20 Thread Bintian

Hello Kevin,

On 2015/4/21 5:10, Kevin Hilman wrote:

Bintian Wang  writes:


This patch introduces ARCH_HISI to enable Hisilicon SoC family in
Kconfig and defconfig.

Signed-off-by: Bintian Wang 
Reviewed-by: Haojian Zhuang 
Reviewed-by: Wei Xu 


[...]


diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index be1f12a..36ebd9b 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -36,6 +36,7 @@ CONFIG_ARCH_MEDIATEK=y
  CONFIG_ARCH_THUNDER=y
  CONFIG_ARCH_VEXPRESS=y
  CONFIG_ARCH_XGENE=y
+CONFIG_ARCH_HISI=y


nit: please keep CONFIG_ARCH_* sorted alphabetically.

Will fix in next version, thanks for helping review.

BR,

Bintian


Thanks,

Kevin

.





[PATCHv2] ibmveth: Fix off-by-one error in ibmveth_change_mtu()

2015-04-20 Thread David Gibson
AFAIK the PAPR document which defines the virtual device interface used by
the ibmveth driver doesn't specify a specific maximum MTU.  So, in the
ibmveth driver, the maximum allowed MTU is determined by the maximum
allocated buffer size of 64k (corresponding to one page in the common case)
minus the per-buffer overhead IBMVETH_BUFF_OH (which has value 22 for 14
bytes of ethernet header, plus 8 bytes for an opaque handle).

This suggests a maximum allowable MTU of 65514 bytes, but in fact the
driver only permits a maximum MTU of 65513.  This is because there is a <
instead of an <= in ibmveth_change_mtu(), which only permits an MTU which
is strictly smaller than the buffer size, rather than allowing the buffer
to be completely filled.

This patch fixes the buglet.
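
For illustration only (not part of the patch), here is a minimal userspace
sketch of the boundary condition described above. find_pool() and the pool
sizes in the usage note are hypothetical; only the overhead of 22 bytes and
the 64k buffer size come from the description.

```c
#include <assert.h>
#include <stddef.h>

#define BUFF_OH 22	/* 14-byte Ethernet header + 8-byte opaque handle */

/*
 * Hypothetical helper: return the index of the first pool whose buffers
 * can hold an MTU-sized frame plus the per-buffer overhead, or -1 if no
 * pool is large enough.
 */
static int find_pool(const int *buff_size, size_t npools, int new_mtu)
{
	int new_mtu_oh = new_mtu + BUFF_OH;
	size_t i;

	assert(buff_size != NULL);
	for (i = 0; i < npools; i++)
		if (new_mtu_oh <= buff_size[i])	/* "<" would reject an exact fit */
			return (int)i;
	return -1;
}
```

With a largest pool of 65536 bytes, an MTU of 65514 gives 65514 + 22 = 65536,
which fits exactly; with "<" the same request would be refused.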

Signed-off-by: David Gibson
---
 drivers/net/ethernet/ibm/ibmveth.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Changes since v1:
 * Fixed a second instance of the same off-by-one error.  Thanks to
   Thomas Falcon for spotting this.

diff --git a/drivers/net/ethernet/ibm/ibmveth.c 
b/drivers/net/ethernet/ibm/ibmveth.c
index cd7675a..1813476 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1238,7 +1238,7 @@ static int ibmveth_change_mtu(struct net_device *dev, int 
new_mtu)
return -EINVAL;
 
for (i = 0; i < IBMVETH_NUM_BUFF_POOLS; i++)
-   if (new_mtu_oh < adapter->rx_buff_pool[i].buff_size)
+   if (new_mtu_oh <= adapter->rx_buff_pool[i].buff_size)
break;
 
if (i == IBMVETH_NUM_BUFF_POOLS)
@@ -1257,7 +1257,7 @@ static int ibmveth_change_mtu(struct net_device *dev, int 
new_mtu)
for (i = 0; i < IBMVETH_NUM_BUFF_POOLS; i++) {
adapter->rx_buff_pool[i].active = 1;
 
-   if (new_mtu_oh < adapter->rx_buff_pool[i].buff_size) {
+   if (new_mtu_oh <= adapter->rx_buff_pool[i].buff_size) {
dev->mtu = new_mtu;
vio_cmo_set_dev_desired(viodev,
ibmveth_get_desired_dma
-- 
2.1.0



Re: [PATCH 1/2] firewire: firewire is a big-endian bus

2015-04-20 Thread Joe Perches
On Tue, 2015-04-21 at 02:36 +0200, Laurent Vivier wrote:
> So, dump config_rom data as big-endian values.
> 
> The value given by /sys/bus/firewire/devices/fw0 were correctly
> given on a big-endian host (like powermac) not on a little-endian host
> (like PC), for instance:
[]
> diff --git a/drivers/firewire/core-device.c b/drivers/firewire/core-device.c
[]
> @@ -399,14 +399,14 @@ static ssize_t config_rom_show(struct device *dev,
>  struct device_attribute *attr, char *buf)
>  {
>   struct fw_device *device = fw_device(dev);
> - size_t length;
> + size_t i;
>  
>   down_read(&fw_device_rwsem);
> - length = device->config_rom_length * 4;
> - memcpy(buf, device->config_rom, length);
> + for (i = 0; i < device->config_rom_length; i++)
> + ((u32 *)buf)[i] = be32_to_cpu(device->config_rom[i]);

Is buf guaranteed to be appropriately aligned on a u32?
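
Whatever the answer turns out to be, one way to sidestep the question is to
store bytes instead of casting buf to u32 *. The following is a userspace
sketch only, not a proposed replacement; dump_be32() is a hypothetical name
and uint32_t stands in for the kernel's u32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store each 32-bit word as four bytes, most significant first.
 * Byte stores have no alignment requirement, and the output is the
 * same regardless of host endianness.
 */
static size_t dump_be32(char *buf, const uint32_t *rom, size_t nwords)
{
	unsigned char *p = (unsigned char *)buf;
	size_t i;

	assert(buf != NULL && rom != NULL);
	for (i = 0; i < nwords; i++) {
		p[4 * i + 0] = (unsigned char)(rom[i] >> 24);
		p[4 * i + 1] = (unsigned char)(rom[i] >> 16);
		p[4 * i + 2] = (unsigned char)(rom[i] >> 8);
		p[4 * i + 3] = (unsigned char)rom[i];
	}
	return nwords * 4;
}
```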




Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux

2015-04-20 Thread Scott Wood
On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote:
> On 10.04.2015 02:53, Scott Wood wrote:
> > On Thu, 2015-04-09 at 10:44 +0300, Purcareata Bogdan wrote:
> >> So at this point I was getting kinda frustrated so I decided to measure
> >> the time spend in kvm_mpic_write and kvm_mpic_read. I assumed these were
> >> the main entry points in the in-kernel MPIC and were basically executed
> >> while holding the spinlock. The scenario was the same - 24 VCPUs guest,
> >> with 24 virtio+vhost interfaces, only this time I ran 24 ping flood
> >> threads to another board instead of netperf. I assumed this would impose
> >> a heavier stress.
> >>
> >> The latencies look pretty ok, around 1-2 us on average, with the max
> >> shown below:
> >>
> >> .kvm_mpic_read   14.560
> >> .kvm_mpic_write  12.608
> >>
> >> Those are also microseconds. This was run for about 15 mins.
> >
> > What about other entry points such as kvm_set_msi() and
> > kvmppc_mpic_set_epr()?
> 
> Thanks for the pointers! I redid the measurements, this time for the functions
> run with the openpic lock down:
> 
> .kvm_mpic_read_internal (.kvm_mpic_read)     1.664
> .kvmppc_mpic_set_epr                         6.880
> .kvm_mpic_write_internal (.kvm_mpic_write)   7.840
> .openpic_msi_write (.kvm_set_msi)           10.560
> 
> Same scenario, 15 mins, numbers are microseconds.
> 
> There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner
> function is kvmppc_set_epr, which is a static inline. Removing the static inline
> yields a compiler crash (Segmentation fault (core dumped) -
> scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed),
> but that's a different story, so I just let it be for now. Point is the time may
> include other work after the lock has been released, but before the function
> actually returned. I noticed this was the case for .kvm_set_msi, which could
> work up to 90 ms, not actually under the lock. This made me change what I'm
> looking at.

kvm_set_msi does pretty much nothing outside the lock -- I suspect
you're measuring an interrupt that happened as soon as the lock was
released.

> So far it looks pretty decent. Are there any other MPIC entry points worthy of
> investigation?

I don't think so.

>  Or perhaps a different stress scenario involving a lot of VCPUs 
> and external interrupts?

You could instrument the MPIC code to find out how many loop iterations
you maxed out on, and compare that to the theoretical maximum.

-Scott




[PATCH V2] drivers/rtc/rtc-ds1307.c: Enable the mcp794xx alarm after programming time

2015-04-20 Thread Nishanth Menon
Alarm interrupt enable register is at offset 0x7, while the time
registers for the alarm follow that. When we program Alarm interrupt
enable prior to programming the time, it is possible that previous
time value could be close or match at the time of alarm enable
resulting in interrupt trigger which is unexpected (and does not match
the time we expect it to trigger).

To prevent this scenario from occurring, program the ALM0_EN bit only
after the alarm time is appropriately programmed.

Of course, I2C programming is non-atomic, so there are loopholes where
the interrupt won't trigger if the time requested is in the past at
the time of programming the ALM0_EN bit. However, we will no longer get
unexpected interrupts caused by programming the time after the interrupt
is enabled.
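
To illustrate the ordering argument (not driver code), here is a toy
userspace model. struct fake_rtc, block_write() and set_alarm() are
hypothetical, the register layout is simplified so that index 0 is the
control register, and the ALM0_EN bit value is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ALM0_EN 0x10	/* assumed bit position, for illustration only */
#define NREGS   10

/*
 * Toy device model. stale_arm records whether the alarm was ever enabled
 * while some later registers still held their previous contents.
 */
struct fake_rtc {
	uint8_t regs[NREGS];
	int stale_arm;
};

/* Registers land one at a time in ascending order, control register first. */
static void block_write(struct fake_rtc *rtc, const uint8_t *src)
{
	size_t i;

	assert(rtc != NULL && src != NULL);
	for (i = 0; i < NREGS; i++) {
		rtc->regs[i] = src[i];
		if ((rtc->regs[0] & ALM0_EN) &&
		    memcmp(rtc->regs + i + 1, src + i + 1, NREGS - i - 1) != 0)
			rtc->stale_arm = 1;
	}
}

/* Ordering from the patch: disable, program everything, then enable. */
static void set_alarm(struct fake_rtc *rtc, uint8_t *regs, int enabled)
{
	regs[0] &= (uint8_t)~ALM0_EN;	/* keep the alarm off during the block write */
	block_write(rtc, regs);
	if (!enabled)
		return;
	regs[0] |= ALM0_EN;		/* alarm time is in place; now arm it */
	rtc->regs[0] = regs[0];		/* models the single-byte control write */
}
```

Calling block_write() directly with ALM0_EN already set in regs[0] models the
old ordering and sets stale_arm; set_alarm() never does.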

Signed-off-by: Nishanth Menon 
---
Changes in v2:
- minor typo fix in comments
- merged up code that I missed committing in

V1: https://patchwork.kernel.org/patch/6245041/

 drivers/rtc/rtc-ds1307.c |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/rtc/rtc-ds1307.c b/drivers/rtc/rtc-ds1307.c
index 4ffabb322a9a..3cd4783375a5 100644
--- a/drivers/rtc/rtc-ds1307.c
+++ b/drivers/rtc/rtc-ds1307.c
@@ -742,17 +742,17 @@ static int mcp794xx_set_alarm(struct device *dev, struct 
rtc_wkalrm *t)
regs[6] &= ~MCP794XX_BIT_ALMX_IF;
/* Set alarm match: second, minute, hour, day, date, month. */
regs[6] |= MCP794XX_MSK_ALMX_MATCH;
-
-   if (t->enabled)
-   regs[0] |= MCP794XX_BIT_ALM0_EN;
-   else
-   regs[0] &= ~MCP794XX_BIT_ALM0_EN;
+   /* Disable interrupt. We will not enable until completely programmed */
+   regs[0] &= ~MCP794XX_BIT_ALM0_EN;
 
ret = ds1307->write_block_data(client, MCP794XX_REG_CONTROL, 10, regs);
if (ret < 0)
return ret;
 
-   return 0;
+   if (!t->enabled)
+   return 0;
+   regs[0] |= MCP794XX_BIT_ALM0_EN;
+   return i2c_smbus_write_byte_data(client, MCP794XX_REG_CONTROL, regs[0]);
 }
 
 static int mcp794xx_alarm_irq_enable(struct device *dev, unsigned int enabled)
-- 
1.7.9.5



Re: [PATCH 5/5] Input: elan_i2c - Correct the x and y trace number.

2015-04-20 Thread duson
Hi Dmitry,

How about the description? Does it look good to you?
Please let me know if you have any concerns.

--
Thank you,
ELAN Duson
✉ Email: duson...@emc.com.tw
--





> duson  wrote on April 16, 2015, at 9:37 AM:
> 
> Hi Dmitry,
> 
> I double-checked with our firmware team and the spec; it looks like
> subtracting 1 was just a misunderstanding.
> So the correct behavior is not to subtract 1. For example, if the touchpad x
> resolution is 2800 and the x trace number is 20,
> the pitch size of x should be 2800/20 = 140, not 2800/19 = 147.36.
> 
> --
> Thanks,
> ELAN Duson
> ✉ Email: duson...@emc.com.tw
> --
> 
> 
> 
> 
> 
>> Dmitry Torokhov  wrote on April 16, 2015, at 1:47 AM:
>> 
>> On Wed, Apr 15, 2015 at 09:55:43AM +0800, DusonLin wrote:
>>> The trace number does not need to subtract 1 now.
>> 
>> Could you provide a bit more of background for this change? Why don't we
>> need to decrement the number returned by the firmware anymore? We have
>> been running with the old numbers for many years...
>> 
>> Thanks!
>> 
>>> 
>>> Signed-off-by: Duson Lin 
>>> ---
>>> drivers/input/mouse/elan_i2c_i2c.c   |4 ++--
>>> drivers/input/mouse/elan_i2c_smbus.c |4 ++--
>>> 2 files changed, 4 insertions(+), 4 deletions(-)
>>> 
>>> diff --git a/drivers/input/mouse/elan_i2c_i2c.c
>>> b/drivers/input/mouse/elan_i2c_i2c.c
>>> index 029941f..550f905 100644
>>> --- a/drivers/input/mouse/elan_i2c_i2c.c
>>> +++ b/drivers/input/mouse/elan_i2c_i2c.c
>>> @@ -356,8 +356,8 @@ static int elan_i2c_get_num_traces(struct i2c_client
>>> *client,
>>> return error;
>>> }
>>> 
>>> -   *x_traces = val[0] - 1;
>>> -   *y_traces = val[1] - 1;
>>> +   *x_traces = val[0];
>>> +   *y_traces = val[1];
>>> 
>>> return 0;
>>> }
>>> diff --git a/drivers/input/mouse/elan_i2c_smbus.c
>>> b/drivers/input/mouse/elan_i2c_smbus.c
>>> index 06a2bcd..0b04151 100644
>>> --- a/drivers/input/mouse/elan_i2c_smbus.c
>>> +++ b/drivers/input/mouse/elan_i2c_smbus.c
>>> @@ -268,8 +268,8 @@ static int elan_smbus_get_num_traces(struct i2c_client
>>> *client,
>>> return error;
>>> }
>>> 
>>> -   *x_traces = val[1] - 1;
>>> -   *y_traces = val[2] - 1;
>>> +   *x_traces = val[1];
>>> +   *y_traces = val[2];
>>> 
>>> return 0;
>>> }
>>> 
>> 
>> -- 
>> Dmitry
>> 
> 



Re: [PATCH 6/6] Input: elan_i2c - Add hover detection flag

2015-04-20 Thread duson
Hi Dmitry,

Does this patch look good to you?
If you have any advice, please let me know.

--
Thank you
ELAN Duson
✉ Email: duson...@emc.com.tw
--





> duson  wrote on April 17, 2015, at 9:56 AM:
> 
> When a hover event arrives, set ABS_MT_DISTANCE to 1; otherwise clear it to 0.
> 
> Signed-off-by: Duson Lin 
> ---
> drivers/input/mouse/elan_i2c_core.c |   15 +++
> 1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/input/mouse/elan_i2c_core.c 
> b/drivers/input/mouse/elan_i2c_core.c
> index 6333ba6..da7893f 100644
> --- a/drivers/input/mouse/elan_i2c_core.c
> +++ b/drivers/input/mouse/elan_i2c_core.c
> @@ -52,6 +52,7 @@
> #define ETP_REPORT_ID_OFFSET  2
> #define ETP_TOUCH_INFO_OFFSET 3
> #define ETP_FINGER_DATA_OFFSET        4
> +#define ETP_HOVER_INFO_OFFSET        30
> #define ETP_MAX_REPORT_LEN            34
> 
> /* The main device structure */
> @@ -725,7 +726,7 @@ static const struct attribute_group *elan_sysfs_groups[] 
> = {
>  */
> static void elan_report_contact(struct elan_tp_data *data,
>   int contact_num, bool contact_valid,
> - u8 *finger_data)
> + bool hover_event, u8 *finger_data)
> {
>   struct input_dev *input = data->input;
>   unsigned int pos_x, pos_y;
> @@ -769,7 +770,9 @@ static void elan_report_contact(struct elan_tp_data *data,
>   input_mt_report_slot_state(input, MT_TOOL_FINGER, true);
>   input_report_abs(input, ABS_MT_POSITION_X, pos_x);
>   input_report_abs(input, ABS_MT_POSITION_Y, data->max_y - pos_y);
> - input_report_abs(input, ABS_MT_PRESSURE, scaled_pressure);
> + input_report_abs(input, ABS_MT_DISTANCE, hover_event);
> + input_report_abs(input, ABS_MT_PRESSURE,
> +  hover_event ? 0 : scaled_pressure);
>   input_report_abs(input, ABS_TOOL_WIDTH, mk_x);
>   input_report_abs(input, ABS_MT_TOUCH_MAJOR, major);
>   input_report_abs(input, ABS_MT_TOUCH_MINOR, minor);
> @@ -785,11 +788,14 @@ static void elan_report_absolute(struct elan_tp_data 
> *data, u8 *packet)
>   u8 *finger_data = &packet[ETP_FINGER_DATA_OFFSET];
>   int i;
>   u8 tp_info = packet[ETP_TOUCH_INFO_OFFSET];
> - bool contact_valid;
> + u8 hover_info = packet[ETP_HOVER_INFO_OFFSET];
> + bool contact_valid, hover_event;
> 
> + hover_event = hover_info & 0x40;
>   for (i = 0; i < ETP_MAX_FINGERS; i++) {
>   contact_valid = tp_info & (1U << (3 + i));
> - elan_report_contact(data, i, contact_valid, finger_data);
> + elan_report_contact(data, i, contact_valid, hover_event,
> + finger_data);
> 
>   if (contact_valid)
>   finger_data += ETP_FINGER_DATA_LEN;
> @@ -883,6 +889,7 @@ static int elan_setup_input_device(struct elan_tp_data 
> *data)
>ETP_FINGER_WIDTH * max_width, 0, 0);
>   input_set_abs_params(input, ABS_MT_TOUCH_MINOR, 0,
>ETP_FINGER_WIDTH * min_width, 0, 0);
> + input_set_abs_params(input, ABS_MT_DISTANCE, 0, 1, 0, 0);
> 
>   data->input = input;
> 
> 
> 
> 
> 
> 
> 



[PATCHv4] kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

2015-04-20 Thread David Gibson
On POWER, storage caching is usually configured via the MMU - attributes
such as cache-inhibited are stored in the TLB and the hashed page table.

This makes correctly performing cache inhibited IO accesses awkward when
the MMU is turned off (real mode).  Some CPU models provide special
registers to control the cache attributes of real mode load and stores but
this is not at all consistent.  This is a problem in particular for SLOF,
the firmware used on KVM guests, which runs entirely in real mode, but
which needs to do IO to load the kernel.

To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
a logical address (aka guest physical address).  SLOF uses these for IO.

However, because these are implemented within qemu, not the host kernel,
these bypass any IO devices emulated within KVM itself.  The simplest way
to see this problem is to attempt to boot a KVM guest from a virtio-blk
device with iothread / dataplane enabled.  The iothread code relies on an
in kernel implementation of the virtio queue notification, which is not
triggered by the IO hcalls, and so the guest will stall in SLOF unable to
load the guest OS.

This patch addresses this by providing in-kernel implementations of the
2 hypercalls, which correctly scan the KVM IO bus.  Any access to an
address not handled by the KVM IO bus will cause a VM exit, hitting the
qemu implementation as before.

Note that a userspace change is also required, in order to enable these
new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.

Signed-off-by: David Gibson 
---
 arch/powerpc/include/asm/kvm_book3s.h |  3 ++
 arch/powerpc/kvm/book3s.c | 76 +++
 arch/powerpc/kvm/book3s_hv.c  | 12 ++
 arch/powerpc/kvm/book3s_pr_papr.c | 28 +
 4 files changed, 119 insertions(+)

Changes in v4:
 * Rebase onto 4.0+, correct for changed signature of kvm_io_bus_{read,write}

Alex, I saw from some build system notifications that you seemed to
hit some troubles compiling the last version of this patch. This
should fix it - hope it's not too late to get into 4.1.

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 9930904..b91e74a 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -288,6 +288,9 @@ static inline bool kvmppc_supports_magic_page(struct 
kvm_vcpu *vcpu)
return !is_kvmppc_hv_enabled(vcpu->kvm);
 }
 
+extern int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu);
+extern int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu);
+
 /* Magic register values loaded into r3 and r4 before the 'sc' assembly
  * instruction for the OSI hypercalls */
 #define OSI_SC_MAGIC_R3 0x113724FA
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index cfbcdc6..453a8a4 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -821,6 +821,82 @@ void kvmppc_core_destroy_vm(struct kvm *kvm)
 #endif
 }
 
+int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu)
+{
+   unsigned long size = kvmppc_get_gpr(vcpu, 4);
+   unsigned long addr = kvmppc_get_gpr(vcpu, 5);
+   u64 buf;
+   int ret;
+
+   if (!is_power_of_2(size) || (size > sizeof(buf)))
+   return H_TOO_HARD;
+
+   ret = kvm_io_bus_read(vcpu, KVM_MMIO_BUS, addr, size, &buf);
+   if (ret != 0)
+   return H_TOO_HARD;
+
+   switch (size) {
+   case 1:
+   kvmppc_set_gpr(vcpu, 4, *(u8 *)&buf);
+   break;
+
+   case 2:
+   kvmppc_set_gpr(vcpu, 4, be16_to_cpu(*(__be16 *)&buf));
+   break;
+
+   case 4:
+   kvmppc_set_gpr(vcpu, 4, be32_to_cpu(*(__be32 *)&buf));
+   break;
+
+   case 8:
+   kvmppc_set_gpr(vcpu, 4, be64_to_cpu(*(__be64 *)&buf));
+   break;
+
+   default:
+   BUG();
+   }
+
+   return H_SUCCESS;
+}
+EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_load);
+
+int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu)
+{
+   unsigned long size = kvmppc_get_gpr(vcpu, 4);
+   unsigned long addr = kvmppc_get_gpr(vcpu, 5);
+   unsigned long val = kvmppc_get_gpr(vcpu, 6);
+   u64 buf;
+   int ret;
+
+   switch (size) {
+   case 1:
+   *(u8 *)&buf = val;
+   break;
+
+   case 2:
+   *(__be16 *)&buf = cpu_to_be16(val);
+   break;
+
+   case 4:
+   *(__be32 *)&buf = cpu_to_be32(val);
+   break;
+
+   case 8:
+   *(__be64 *)&buf = cpu_to_be64(val);
+   break;
+
+   default:
+   return H_TOO_HARD;
+   }
+
+   ret = kvm_io_bus_write(vcpu, KVM_MMIO_BUS, addr, size, &buf);
+   if (ret != 0)
+   return H_TOO_HARD;
+
+   return H_SUCCESS;
+}
+EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_store);
+
 int 

[PATCH 0/2] Some firewire minor patches

2015-04-20 Thread Laurent Vivier
I wrote these two patches while investigating why my old iPod
cannot be mounted on my PC.

During my investigation, I compared the fw0 config_rom I can read on
a powermac with the config_rom I get on a PC. It appears that the big-endian
property of the firewire bus is not respected here.

Then I was able to see that the first 5 words of the iPod config_rom were
read correctly, but the following ones were not: the reads fail with an ack
timeout. In fact, after the first 5 words, the max speed of the device is
changed from the lowest value to the real max speed of the device, and this
does not work with my iPod. I don't know why; I suspect an incompatibility
between my firewire card and the iPod. As I'm not an expert, the only solution
I found is to allow the user to force the max device speed at the firewire-core
level (in my case, force_speed is 0 -> FW100).

[PATCH 1/2] firewire: firewire is a big-endian bus
[PATCH 2/2] firewire: add a parameter to force the speed of the


[PATCH 1/2] firewire: firewire is a big-endian bus

2015-04-20 Thread Laurent Vivier
So, dump config_rom data as big-endian values.

The values given by /sys/bus/firewire/devices/fw0 were correct
on a big-endian host (like powermac) but not on a little-endian host
(like PC), which produced, for instance:

00000000  87 a4 04 04 34 39 33 31  22 a2 00 f0 33 22 11 00  |....4931"...3"..|
00000010  66 66 66 33 0b dd 05 00  c0 83 00 0c 1e 0d d0 03  |fff3............|
00000020  03 00 00 81 01 00 00 17  08 00 00 81 b7 4c 06 00  |.............L..|
00000030  00 00 00 00 00 00 00 00  75 6e 69 4c 69 46 20 78  |........uniLiF x|
00000040  69 77 65 72 00 00 65 72  1c ff 03 00 00 00 00 00  |iwer..er........|
00000050  00 00 00 00 75 6a 75 4a                           |....ujuJ|
00000058

instead of:

00000000  04 04 a4 87 31 33 39 34  f0 00 a2 22 00 11 22 33  |....1394...".."3|
00000010  33 66 66 66 00 05 dd 0b  0c 00 83 c0 03 d0 0d 1e  |3fff............|
00000020  81 00 00 03 17 00 00 01  81 00 00 08 00 06 4c b7  |..............L.|
00000030  00 00 00 00 00 00 00 00  4c 69 6e 75 78 20 46 69  |........Linux Fi|
00000040  72 65 77 69 72 65 00 00  00 03 ff 1c 00 00 00 00  |rewire..........|
00000050  00 00 00 00 4a 75 6a 75                           |....Juju|
00000058

This patch corrects this.

Signed-off-by: Laurent Vivier 
---
 drivers/firewire/core-device.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/firewire/core-device.c b/drivers/firewire/core-device.c
index f9e3aee..5245567 100644
--- a/drivers/firewire/core-device.c
+++ b/drivers/firewire/core-device.c
@@ -399,14 +399,14 @@ static ssize_t config_rom_show(struct device *dev,
   struct device_attribute *attr, char *buf)
 {
struct fw_device *device = fw_device(dev);
-   size_t length;
+   size_t i;
 
    down_read(&fw_device_rwsem);
-   length = device->config_rom_length * 4;
-   memcpy(buf, device->config_rom, length);
+   for (i = 0; i < device->config_rom_length; i++)
+   ((u32 *)buf)[i] = be32_to_cpu(device->config_rom[i]);
    up_read(&fw_device_rwsem);
 
-   return length;
+   return i * 4;
 }
 
 static ssize_t guid_show(struct device *dev,
-- 
1.9.1



[PATCH 2/2] firewire: add a parameter to force the speed of the devices.

2015-04-20 Thread Laurent Vivier
I was trying to use my old iPod mini firewire (first generation) with
a new firewire card I put in my PC (VIA Technologies, Inc. VT6306/7/8),
but the iPod was not mounted and failed with the following error:
reading config rom failed: no ack
It appears that the configuration rom cannot be read after the
device max speed is set to something other than SCODE_100.

According to the iPod configuration ROM, it should support SCODE_400.

This patch adds a parameter (force_speed) to the firewire-core module
to set the max speed to use with firewire devices.

Signed-off-by: Laurent Vivier 
---
 drivers/firewire/core-device.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/firewire/core-device.c b/drivers/firewire/core-device.c
index 5245567..a075827 100644
--- a/drivers/firewire/core-device.c
+++ b/drivers/firewire/core-device.c
@@ -44,6 +44,17 @@
 
 #include "core.h"
 
+static int force_speed = -1;
+module_param_named(force_speed, force_speed, int, 0644);
+MODULE_PARM_DESC(force_speed, "Force device speed (default = -1"
+   ", FW100 = " __stringify(SCODE_100)
+   ", FW200 = " __stringify(SCODE_200)
+   ", FW400 = " __stringify(SCODE_400)
+   ", FW800 = " __stringify(SCODE_800)
+   ", FW1600 = " __stringify(SCODE_1600)
+   ", FW3200 = " __stringify(SCODE_3200)
+   ", FWBETA = " __stringify(SCODE_BETA));
+
 void fw_csr_iterator_init(struct fw_csr_iterator *ci, const u32 *p)
 {
ci->p = p + 1;
@@ -555,6 +566,8 @@ static int read_config_rom(struct fw_device *device, int 
generation)
}
 
device->max_speed = device->node->max_speed;
+   if (force_speed != -1)
+   device->max_speed = force_speed & 0xf;
 
/*
 * Determine the speed of
-- 
1.9.1



Re: [PATCH] [RFC] x86/cpu: Fix SMAP check in PVOPS environments

2015-04-20 Thread Andy Lutomirski

On 04/20/2015 10:09 AM, Andrew Cooper wrote:

There appears to be no formal statement of what pv_irq_ops.save_fl() is
supposed to return precisely.  Native returns the full flags, while lguest and
Xen only return the Interrupt Flag, and both have comments by the
implementations stating that only the Interrupt Flag is looked at.  This may
have been true when initially implemented, but no longer is.

To make matters worse, the Xen PVOP leaves the upper bits undefined, making
the BUG_ON() undefined behaviour.  Experimentally, this now trips for 32bit PV
guests on Broadwell hardware.  The BUG_ON() is consistent for an individual
build, but not consistent for all builds.  It has also been a sitting timebomb
since SMAP support was introduced.

Use native_save_fl() instead, which will obtain an accurate view of the AC
flag.

Signed-off-by: Andrew Cooper 
CC: Thomas Gleixner 
CC: Ingo Molnar 
CC: H. Peter Anvin 
CC: x...@kernel.org
CC: linux-kernel@vger.kernel.org
CC: Konrad Rzeszutek Wilk 
CC: Boris Ostrovsky 
CC: David Vrabel 
CC: xen-devel 
CC: Rusty Russell 
CC: lgu...@lists.ozlabs.org

---
This patch is RFC because I am not certain that native_save_fl() is
necessarily the correct solution on lguest, but it does seem that setup_smap()
wants to check the actual AC bit, rather than an idealised value.

A different approach, given the dual nature of the AC flag now is to gate
setup_smap() on a kernel rpl of 0.  SMAP necessarily can't be used in a
paravirtual situation where the kernel runs in cpl > 0.

Another different approach would be to formally state that
pv_irq_ops.save_fl() needs to return all the flags, which would make
local_irq_save() safe to use in this circumstance, but that makes a hotpath
longer for the sake of a single boot time check.


...which reminds me:

Why does native_restore_fl restore anything other than IF?  A branch and 
sti should be considerably faster than popf.


Also, if we did this, could Xen use PVI and then use native_restore_fl 
and avoid lots of pvops?


--Andy


Re: [PATCH] drivers/rtc/rtc-ds1307.c: Enable the mcp794xx alarm after programming time

2015-04-20 Thread Nishanth Menon
On 04/20/2015 06:55 PM, Nishanth Menon wrote:
> Alarm interrupt enable register is at offset 0x7, while the time
> registers for the alarm follow that. When we program Alarm interrupt
> enable prior to programming the time, it is possible that previous
> time value could be close or match at the time of alarm enable
> resulting in interrupt trigger which is unexpected (and does not match
> the time we expect it to trigger).
> 
> To prevent this scenario from occurring, program the ALM0_EN bit only
> after the alarm time is appropriately programmed.
> 
> Of course, I2C programming is non-atomic, so there are loopholes where
> the interrupt won't trigger if the time requested is in the past at
> the time of programming the ALM0_EN bit. However, we will no longer get
> unexpected interrupts caused by programming the time after the interrupt
> is enabled.
> 
> Signed-off-by: Nishanth Menon 
> ---
>  drivers/rtc/rtc-ds1307.c |   12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/rtc/rtc-ds1307.c b/drivers/rtc/rtc-ds1307.c
> index 4ffabb322a9a..59f9ecf323d5 100644
> --- a/drivers/rtc/rtc-ds1307.c
> +++ b/drivers/rtc/rtc-ds1307.c
> @@ -742,17 +742,17 @@ static int mcp794xx_set_alarm(struct device *dev, 
> struct rtc_wkalrm *t)
>   regs[6] &= ~MCP794XX_BIT_ALMX_IF;
>   /* Set alarm match: second, minute, hour, day, date, month. */
>   regs[6] |= MCP794XX_MSK_ALMX_MATCH;
> -
> - if (t->enabled)
> - regs[0] |= MCP794XX_BIT_ALM0_EN;
> - else
> - regs[0] &= ~MCP794XX_BIT_ALM0_EN;
> + /* Disable interrupt. We will not enable until completely programed */
^^ typo here. s/programed/programmed
> + regs[0] &= ~MCP794XX_BIT_ALM0_EN;
>  
>   ret = ds1307->write_block_data(client, MCP794XX_REG_CONTROL, 10, regs);
>   if (ret < 0)
>   return ret;
>  
> - return 0;
> + if (!t->enabled)
> + return 0;
> + regs[0] |= MCP7941X_BIT_ALM0_EN;

^^ messed up git commit --amend :(

> + return i2c_smbus_write_byte_data(client, MCP794XX_REG_CONTROL, regs[0]);
>  }
>  
>  static int mcp794xx_alarm_irq_enable(struct device *dev, unsigned int 
> enabled)
> 

will repost a rev2 - apologies on the noise.
-- 
Regards,
Nishanth Menon


Re: [RFC,1/8] soc/fman: Add FMan MURAM support

2015-04-20 Thread Scott Wood
On Mon, 2015-04-20 at 03:58 -0500, Liberman Igal-B31950 wrote:
> 
> Regards,
> Igal Liberman.
> 
> > -Original Message-
> > From: Kumar Gala [mailto:ga...@kernel.crashing.org]
> > Sent: Thursday, March 12, 2015 5:57 PM
> > To: Liberman Igal-B31950
> > Cc: linuxppc-...@lists.ozlabs.org; net...@vger.kernel.org; linux-
> > ker...@vger.kernel.org; Wood Scott-B07421
> > Subject: Re: [RFC,1/8] soc/fman: Add FMan MURAM support
> > 
> > 
> > On Mar 11, 2015, at 12:07 AM, Igal.Liberman 
> > wrote:
> > 
> > > From: Igal Liberman 
> > >
> > > Add Frame Manager Multi-User RAM support.
> > >
> > > Signed-off-by: Igal Liberman 
> > > ---
> > > drivers/soc/fsl/fman/Kconfig|1 +
> > > drivers/soc/fsl/fman/Makefile   |5 +-
> > > drivers/soc/fsl/fman/fm_muram.c |  174
> > +++
> > > drivers/soc/fsl/fman/inc/fm_muram_ext.h |   98 +
> > > 4 files changed, 276 insertions(+), 2 deletions(-) create mode 100644
> > > drivers/soc/fsl/fman/fm_muram.c create mode 100644
> > > drivers/soc/fsl/fman/inc/fm_muram_ext.h
> > >
> > 
> > use lib/genalloc instead of rheap
> > 
> 
> Hi Kumar,
> I looked into lib/genalloc allocator.
> As far as I see, the genalloc allocator doesn't allow to control the memory 
> alignment when you allocate a chunk of memory.
> Two important notes regarding MURAM memory:
> - The allocated memory chunks should have specific alignment (might be 
> different in each chunk).
> - The allocations must be efficient, we don't want to "waste" MURAM due to 
> alignment issues.

If the requirement is that allocations must be size-aligned, use
gen_pool_first_fit_order_align.  Otherwise, improve genalloc to do what
you need.
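
To make "size-aligned" concrete, here is a small userspace sketch of the
arithmetic only, assuming it means aligning each allocation to its size
rounded up to a power of two. roundup_pow2() and order_align() are
hypothetical helpers, not genalloc functions.

```c
#include <assert.h>

/* Smallest power of two that is >= n (n must be non-zero). */
static unsigned long roundup_pow2(unsigned long n)
{
	unsigned long p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}

/* First offset >= start that is aligned to the rounded-up size. */
static unsigned long order_align(unsigned long start, unsigned long size)
{
	unsigned long a = roundup_pow2(size);

	return (start + a - 1) & ~(a - 1);
}
```

Under that assumption a 0x30-byte request starting from offset 0x10 would be
placed at 0x40. If MURAM needs a per-allocation alignment unrelated to the
size, this policy would waste space, which is the concern raised above.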

-Scott



[PATCH] drivers/rtc/rtc-ds1307.c: Enable the mcp794xx alarm after programming time

2015-04-20 Thread Nishanth Menon
Alarm interrupt enable register is at offset 0x7, while the time
registers for the alarm follow that. When we program Alarm interrupt
enable prior to programming the time, it is possible that previous
time value could be close or match at the time of alarm enable
resulting in interrupt trigger which is unexpected (and does not match
the time we expect it to trigger).

To prevent this scenario from occurring, program the ALM0_EN bit only
after the alarm time is appropriately programmed.

Of course, I2C programming is non-atomic, so there are loopholes where
the interrupt won't trigger if the time requested is in the past at
the time of programming the ALM0_EN bit. However, we will no longer get
unexpected interrupts caused by programming the time after the interrupt
is enabled.

Signed-off-by: Nishanth Menon 
---
 drivers/rtc/rtc-ds1307.c |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/rtc/rtc-ds1307.c b/drivers/rtc/rtc-ds1307.c
index 4ffabb322a9a..59f9ecf323d5 100644
--- a/drivers/rtc/rtc-ds1307.c
+++ b/drivers/rtc/rtc-ds1307.c
@@ -742,17 +742,17 @@ static int mcp794xx_set_alarm(struct device *dev, struct 
rtc_wkalrm *t)
regs[6] &= ~MCP794XX_BIT_ALMX_IF;
/* Set alarm match: second, minute, hour, day, date, month. */
regs[6] |= MCP794XX_MSK_ALMX_MATCH;
-
-   if (t->enabled)
-   regs[0] |= MCP794XX_BIT_ALM0_EN;
-   else
-   regs[0] &= ~MCP794XX_BIT_ALM0_EN;
+   /* Disable interrupt. We will not enable until completely programmed */
+   regs[0] &= ~MCP794XX_BIT_ALM0_EN;
 
ret = ds1307->write_block_data(client, MCP794XX_REG_CONTROL, 10, regs);
if (ret < 0)
return ret;
 
-   return 0;
+   if (!t->enabled)
+   return 0;
+   regs[0] |= MCP794XX_BIT_ALM0_EN;
+   return i2c_smbus_write_byte_data(client, MCP794XX_REG_CONTROL, regs[0]);
 }
 
 static int mcp794xx_alarm_irq_enable(struct device *dev, unsigned int enabled)
-- 
1.7.9.5



Re: [PATCH] __bitmap_parselist: fix bug in empty string handling

2015-04-20 Thread Chris Metcalf

> On Apr 20, 2015, at 1:17 PM, Andrew Morton  wrote:
> 
>> On Fri, 17 Apr 2015 14:00:04 -0400 Chris Metcalf  wrote:
>> 
>> bitmap_parselist("", &mask, nmaskbits) will erroneously set bit
>> zero in the mask.  The same bug is visible in cpumask_parselist()
>> since it is layered on top of the bitmask code, e.g. if you boot with
>> "isolcpus=", you will actually end up with cpu zero isolated.
>> 
>> The bug was introduced in commit 4b060420a596 ("bitmap, irq: add
>> smp_affinity_list interface to /proc/irq") when bitmap_parselist()
>> was generalized to support userspace as well as kernelspace.
>> 
>> Signed-off-by: Chris Metcalf 
>> Cc: sta...@vger.kernel.org
> 
> I don't think we need to backport a fix for a 4 year old bug which has
> very minor consequences.  Am I wrong?

I don't have a strong feeling on this one. My guess is it's trivial to backport 
but also very low impact so either way is pretty reasonable.
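
For reference, the behaviour the quoted fix aims for (an empty list sets no bits) can be sketched in userspace. This is a toy parser written for illustration, not the kernel's `bitmap_parselist()`; `toy_parselist` and its 64-bit limit are assumptions of the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

/*
 * Toy parser for lists such as "0,2-4" into a 64-bit mask.
 * Returns 0 on success, -1 on malformed input.
 * An empty string is valid and leaves the mask empty.
 */
static int toy_parselist(const char *s, uint64_t *mask)
{
	assert(s != 0 && mask != 0);
	*mask = 0;

	while (*s) {
		unsigned int a = 0, b;

		if (!isdigit((unsigned char)*s))
			return -1;
		while (isdigit((unsigned char)*s)) {
			a = a * 10 + (unsigned int)(*s++ - '0');
			if (a > 63)
				return -1;
		}
		b = a;
		if (*s == '-') {
			s++;
			if (!isdigit((unsigned char)*s))
				return -1;
			b = 0;
			while (isdigit((unsigned char)*s)) {
				b = b * 10 + (unsigned int)(*s++ - '0');
				if (b > 63)
					return -1;
			}
		}
		if (a > b)
			return -1;
		/* Bits are only set once at least one number was parsed. */
		for (; a <= b; a++)
			*mask |= UINT64_C(1) << a;
		if (*s == ',')
			s++;
		else if (*s)
			return -1;
	}
	return 0;
}
```

The point of the sketch is that the bit-setting loop sits behind the digit check, so input with no numbers never reaches it.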



Re: [PATCH v3 6/8] selftest/x86: have no dependency on all when cross building

2015-04-20 Thread Andy Lutomirski
On Mon, Apr 20, 2015 at 4:34 PM, Tyler Baker  wrote:
> On 20 April 2015 at 16:22, Andy Lutomirski  wrote:
>> On Mon, Apr 20, 2015 at 4:15 PM, Tyler Baker  wrote:
>>> If the CROSS_COMPILE is set remove all's dependency on all_32 and all_64.
>>>
>>> Cc: Andy Lutomirski 
>>> Signed-off-by: Tyler Baker 
>>> ---
>>>  tools/testing/selftests/x86/Makefile | 8 +++-
>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/testing/selftests/x86/Makefile 
>>> b/tools/testing/selftests/x86/Makefile
>>> index be93945..a5ca38b 100644
>>> --- a/tools/testing/selftests/x86/Makefile
>>> +++ b/tools/testing/selftests/x86/Makefile
>>> @@ -7,15 +7,21 @@ BINARIES_64 := $(TARGETS_C_BOTHBITS:%=%_64)
>>>
>>>  CFLAGS := -O2 -g -std=gnu99 -pthread -Wall
>>>
>>> +all:
>>> +
>>
>> This...
>>
>>>  UNAME_M := $(shell uname -m)
>>>
>>> +ifeq ($(CROSS_COMPILE),)
>>>  # Always build 32-bit tests
>>>  all: all_32
>>> -
>>>  # If we're on a 64-bit host, build 64-bit tests as well
>>>  ifeq ($(UNAME_M),x86_64)
>>>  all: all_64
>>>  endif
>>> +else
>>> +# No dependency on all when cross building
>>> +all:
>>
>> ...is redundant with this.  If you delete the "else" and "all:" here, then:
>
> Ok, I will remove these bits from this patch. However, the else will
> need to be added back in the next patch of the series to override the
> default behavior of EMIT_TESTS and INSTALL_RULE, if you are ok
> with that.

I'm fine with that, unless you or Shuah want to fix lib.mk.

--Andy


[PATCH] IB/qib: Remove EOL whitespaces

2015-04-20 Thread Sergei Zviagintsev
Remove EOL whitespaces added by commit 4961772560d2
("infinibad: weird APIs switched to ->write_iter()")

Signed-off-by: Sergei Zviagintsev 
---
 drivers/infiniband/hw/qib/qib_file_ops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/qib/qib_file_ops.c 
b/drivers/infiniband/hw/qib/qib_file_ops.c
index 9ea6c440a00c..14414bc5a34d 100644
--- a/drivers/infiniband/hw/qib/qib_file_ops.c
+++ b/drivers/infiniband/hw/qib/qib_file_ops.c
@@ -2261,7 +2261,7 @@ static ssize_t qib_write_iter(struct kiocb *iocb, struct 
iov_iter *from)
 
if (!iter_is_iovec(from) || !from->nr_segs || !pq)
return -EINVAL;
-
+
return qib_user_sdma_writev(rcd, pq, from->iov, from->nr_segs);
 }
 
-- 
1.8.3.1



Re: [PATCH RFC] x86: enforce inlining for atomics

2015-04-20 Thread Peter Zijlstra
On Mon, Apr 20, 2015 at 11:27:11PM +0200, Hagen Paul Pfeifer wrote:
> During some code analysis I realized that atomic_add, atomic_sub and
> friends are not necessarily inlined AND that each function is defined
> multiple times:
> 
> atomic_inc:  544 duplicates
> atomic_dec:  215 duplicates
> atomic_dec_and_test: 107 duplicates
> atomic64_inc: 38 duplicates
> [...]
> 
> Each definition is exactly the same, e.g.:
> 
> 813171b8 :
> 55         push   %rbp
> 48 89 e5   mov    %rsp,%rbp
> f0 01 3e   lock add %edi,(%rsi)
> 5d         pop    %rbp
> c3         retq
> 

Urgh, that's a GCC fail, what version and compile flags?
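
As a rough illustration of what forcing inlining looks like, here is a userspace sketch. It assumes GCC or Clang (the `always_inline` attribute and the `__atomic` builtins); `toy_always_inline`, `toy_atomic_add` and `toy_atomic_read` are hypothetical names and not the x86 kernel implementation.

```c
#include <assert.h>

/* Assumes GCC or Clang: always_inline attribute and __atomic builtins. */
#define toy_always_inline inline __attribute__((__always_inline__))

/* With plain "static inline" the compiler may still emit an out-of-line copy. */
static toy_always_inline void toy_atomic_add(int i, int *v)
{
	assert(v != 0);
	__atomic_fetch_add(v, i, __ATOMIC_SEQ_CST);
}

static toy_always_inline int toy_atomic_read(int *v)
{
	assert(v != 0);
	return __atomic_load_n(v, __ATOMIC_SEQ_CST);
}
```

Whether the out-of-line copies reported above appear depends on compiler version and flags, which is why those details matter here.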


Re: [PATCH v6 2/3] sched/rt: Fix wrong SMP scheduler behavior for equal prio cases

2015-04-20 Thread Peter Zijlstra
On Mon, Apr 20, 2015 at 01:48:03PM -0400, Steven Rostedt wrote:
> On Mon, 20 Apr 2015 19:20:48 +0200
> Peter Zijlstra  wrote:
> 
> > > > +*/
> > > > +   if (preempt_count() & PREEMPT_ACTIVE)
> > > > +   enqueue_pushable_task_preempted(rq, p);
> > > > +   else
> > > > +   enqueue_pushable_task(rq, p);
> > > > +   }
> > > >  }
> > 
> > This looks wrong, what do you want to find? _any_ preemption? In that
> > case PREEMPT_ACTIVE is wrong. What you need to check is if the task is
> > still on the RQ or not.
> > 
> > If the task was put to sleep it got dequeued, if it was not dequeued, it
> > got preempted.
> > 
> > PREEMPT_ACTIVE is only ever set for forced kernel preemption, which is a
> > special sub case only ever triggered with CONFIG_PREEMPT=y.
> 
> Ah, you're right. I was thinking of just forced preemption, but, I
> wasn't thinking about voluntary preemption (preemption points). We want
> this behavior for that too (for kernel).
> 
> And yes, if we preempt in user space, this isn't enough either.
> 
> Actually, I think we only care if the state of the task is
> TASK_RUNNING, if it is anything else, the task is probably going to
> sleep anyway and we don't care about FIFO order then.

Please don't try and be clever there :-) Task state can be misleading,
you might get a wakeup before you're running again, in which case you
never went to sleep.

Please use task_on_rq_queued(p) like all other sites.


Re: [PATCH v1 4/6] moduleparam.h: add module_param_config_*() helpers

2015-04-20 Thread Julian Calaby
Hi Luis,

You made a spelling mistake:

On Tue, Apr 21, 2015 at 9:30 AM, Luis R. Rodriguez
 wrote:
> From: "Luis R. Rodriguez" 
>
> This adds a couple of bool module_param_config_*() helpers
> which are designed to let us easily associate a booloean
> module parameter with an associated kernel configuration
> option, and to help us remove #ifdef'ery eyesores.
>
> Cc: Rusty Russell 
> Cc: Jani Nikula 
> Cc: Christoph Hellwig 
> Cc: Andrew Morton 
> Cc: Geert Uytterhoeven 
> Cc: Hannes Reinecke 
> Cc: Kees Cook 
> Cc: Tejun Heo 
> Cc: Ingo Molnar 
> Cc: linux-kernel@vger.kernel.org
> Cc: co...@systeme.lip6.fr
> Signed-off-by: Luis R. Rodriguez 
> ---
>  include/linux/moduleparam.h | 37 +
>  1 file changed, 37 insertions(+)
>
> diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
> index 7e00799..fdf7b87 100644
> --- a/include/linux/moduleparam.h
> +++ b/include/linux/moduleparam.h
> @@ -155,6 +155,43 @@ struct kparam_array
> __MODULE_PARM_TYPE(name, #type)
>
>  /**
> + * module_param_config_on_off - bool parameter with run time override
> + * @name: a valid C identifier which is the parameter name.
> + * @value: the actual lvalue to alter.
> + * @perm: visibility in sysfs.
> + * @config: kernel parameter which will enable this option if this
> + * kernel configuration option has been enabled.
> + *
> + * This lets you define a bool module paramter which by default will be

s/paramter/parameter/

> + * set to true if the config option has been set on your kernel's
> + * configuration, otherwise it is set to false.
> + */
> +#define module_param_config_on_off(name, var, perm, config)\
> +   static bool var = IS_ENABLED(config);   \
> +   module_param_named(name, var, bool, perm);
> +
> +/**
> + * module_param_config_on - bool parameter with run time enablement override
> + * @name: a valid C identifier which is the parameter name.
> + * @value: the actual lvalue to alter.
> + * @perm: visibility in sysfs.
> + * @config: kernel parameter which will enable this option if this
> + * kernel configuration option has been enabled.
> + *
> + * This lets you define a bool module paramter which by default will be

Here too.

Thanks,

-- 
Julian Calaby

Email: julian.cal...@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/


Re: [PATCH v3 6/8] selftest/x86: have no dependency on all when cross building

2015-04-20 Thread Tyler Baker
On 20 April 2015 at 16:22, Andy Lutomirski  wrote:
> On Mon, Apr 20, 2015 at 4:15 PM, Tyler Baker  wrote:
>> If the CROSS_COMPILE is set remove all's dependency on all_32 and all_64.
>>
>> Cc: Andy Lutomirski 
>> Signed-off-by: Tyler Baker 
>> ---
>>  tools/testing/selftests/x86/Makefile | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/testing/selftests/x86/Makefile 
>> b/tools/testing/selftests/x86/Makefile
>> index be93945..a5ca38b 100644
>> --- a/tools/testing/selftests/x86/Makefile
>> +++ b/tools/testing/selftests/x86/Makefile
>> @@ -7,15 +7,21 @@ BINARIES_64 := $(TARGETS_C_BOTHBITS:%=%_64)
>>
>>  CFLAGS := -O2 -g -std=gnu99 -pthread -Wall
>>
>> +all:
>> +
>
> This...
>
>>  UNAME_M := $(shell uname -m)
>>
>> +ifeq ($(CROSS_COMPILE),)
>>  # Always build 32-bit tests
>>  all: all_32
>> -
>>  # If we're on a 64-bit host, build 64-bit tests as well
>>  ifeq ($(UNAME_M),x86_64)
>>  all: all_64
>>  endif
>> +else
>> +# No dependency on all when cross building
>> +all:
>
> ...is redundant with this.  If you delete the "else" and "all:" here, then:

Ok, I will remove these bits from this patch. However, the else will
need to be added back in the next patch of the series to override the
default behavior of EMIT_TESTS and INSTALL_RULE, if you are ok
with that.

>
> Acked-by: Andy Lutomirski 
>
>> +endif
>>
>>  all_32: check_build32 $(BINARIES_32)
>>
>> --
>> 2.1.4
>>
>
>
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC

Tyler


[PATCH v1 3/6] kernel/params.c: generalize bool_enable_only

2015-04-20 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This takes out the bool_enable_only implementation from
the module loading code and generalizes it so that others
can make use of it.
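
The enable-only semantics can be sketched in userspace as follows. This is a simplified model, not the kernel helper: `toy_parse_bool` and `toy_set_bool_enable_only` are hypothetical names, and the accepted spellings of true and false are an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Toy bool parser: accepts "1"/"y"/"Y" and "0"/"n"/"N" only. */
static int toy_parse_bool(const char *val, bool *out)
{
	assert(val != 0 && out != 0);
	if (!strcmp(val, "1") || !strcmp(val, "y") || !strcmp(val, "Y")) {
		*out = true;
		return 0;
	}
	if (!strcmp(val, "0") || !strcmp(val, "n") || !strcmp(val, "N")) {
		*out = false;
		return 0;
	}
	return -EINVAL;
}

/*
 * Enable-only setter: parse into a scratch value first, refuse to
 * clear a flag that is already set, and only write when enabling.
 */
static int toy_set_bool_enable_only(const char *val, bool *flag)
{
	bool new_value;
	int err;

	assert(flag != 0);
	err = toy_parse_bool(val, &new_value);
	if (err)
		return err;

	/* Don't let callers unset it once it's set. */
	if (!new_value && *flag)
		return -EROFS;

	if (new_value)
		*flag = true;
	return 0;
}
```

Parsing into a scratch variable first means a malformed or rejected value never touches the real flag.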

Cc: Rusty Russell 
Cc: Jani Nikula 
Cc: Christoph Hellwig 
Cc: Andrew Morton 
Cc: Geert Uytterhoeven 
Cc: Hannes Reinecke 
Cc: Kees Cook 
Cc: Tejun Heo 
Cc: Ingo Molnar 
Cc: linux-kernel@vger.kernel.org
Cc: co...@systeme.lip6.fr
Signed-off-by: Luis R. Rodriguez 
---
 include/linux/moduleparam.h |  6 ++
 kernel/module.c | 31 ---
 kernel/params.c | 30 ++
 3 files changed, 36 insertions(+), 31 deletions(-)

diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index 5d0f4d9..7e00799 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -427,6 +427,12 @@ extern int param_set_bool(const char *val, const struct 
kernel_param *kp);
 extern int param_get_bool(char *buffer, const struct kernel_param *kp);
 #define param_check_bool(name, p) __param_check(name, p, bool)
 
+extern const struct kernel_param_ops param_ops_bool_enable_only;
+extern int param_set_bool_enable_only(const char *val,
+ const struct kernel_param *kp);
+/* getter is the same as for the regular bool */
+#define param_check_bool_enable_only param_check_bool
+
 extern const struct kernel_param_ops param_ops_invbool;
 extern int param_set_invbool(const char *val, const struct kernel_param *kp);
 extern int param_get_invbool(char *buffer, const struct kernel_param *kp);
diff --git a/kernel/module.c b/kernel/module.c
index de12c4a..43a1ef3 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -111,37 +111,6 @@ static bool sig_enforce = true;
 #else
 static bool sig_enforce = false;
 
-static int param_set_bool_enable_only(const char *val,
- const struct kernel_param *kp)
-{
-   int err = 0;
-   bool new_value;
-   bool orig_value = *(bool *)kp->arg;
-   struct kernel_param dummy_kp = *kp;
-
-   dummy_kp.arg = &new_value;
-
-   err = param_set_bool(val, &dummy_kp);
-   if (err)
-   return err;
-
-   /* Don't let them unset it once it's set! */
-   if (!new_value && orig_value)
-   return -EROFS;
-
-   if (new_value)
-   err = param_set_bool(val, kp);
-
-   return err;
-}
-
-static const struct kernel_param_ops param_ops_bool_enable_only = {
-   .flags = KERNEL_PARAM_OPS_FL_NOARG,
-   .set = param_set_bool_enable_only,
-   .get = param_get_bool,
-};
-#define param_check_bool_enable_only param_check_bool
-
 module_param(sig_enforce, bool_enable_only, 0644);
 #endif /* !CONFIG_MODULE_SIG_FORCE */
 #endif /* CONFIG_MODULE_SIG */
diff --git a/kernel/params.c b/kernel/params.c
index b7635c0..324624e 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -335,6 +335,36 @@ const struct kernel_param_ops param_ops_bool = {
 };
 EXPORT_SYMBOL(param_ops_bool);
 
+int param_set_bool_enable_only(const char *val, const struct kernel_param *kp)
+{
+   int err = 0;
+   bool new_value;
+   bool orig_value = *(bool *)kp->arg;
+   struct kernel_param dummy_kp = *kp;
+
+   dummy_kp.arg = &new_value;
+
+   err = param_set_bool(val, &dummy_kp);
+   if (err)
+   return err;
+
+   /* Don't let them unset it once it's set! */
+   if (!new_value && orig_value)
+   return -EROFS;
+
+   if (new_value)
+   err = param_set_bool(val, kp);
+
+   return err;
+}
+EXPORT_SYMBOL_GPL(param_set_bool_enable_only);
+
+const struct kernel_param_ops param_ops_bool_enable_only = {
+   .flags = KERNEL_PARAM_OPS_FL_NOARG,
+   .set = param_set_bool_enable_only,
+   .get = param_get_bool,
+};
+
 /* This one must be bool. */
 int param_set_invbool(const char *val, const struct kernel_param *kp)
 {
-- 
2.3.2.209.gd67f9d5.dirty



[PATCH v1 2/6] kernel/module.c: use generic module param operators for sig_enforce

2015-04-20 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

We're directly checking and modifying sig_enforce when needed instead
of using the generic helpers. This prevents us from generalizing this
helper so that others can use it. Use indirect helpers to allow us
to generalize this code a bit and to make it a bit more clear what
this is doing.

Cc: Rusty Russell 
Cc: Jani Nikula 
Cc: Christoph Hellwig 
Cc: Andrew Morton 
Cc: Geert Uytterhoeven 
Cc: Hannes Reinecke 
Cc: Kees Cook 
Cc: Tejun Heo 
Cc: Ingo Molnar 
Cc: co...@systeme.lip6.fr
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 kernel/module.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 42a1d2a..de12c4a 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -114,23 +114,25 @@ static bool sig_enforce = false;
 static int param_set_bool_enable_only(const char *val,
  const struct kernel_param *kp)
 {
-   int err;
-   bool test;
+   int err = 0;
+   bool new_value;
+   bool orig_value = *(bool *)kp->arg;
struct kernel_param dummy_kp = *kp;
 
-   dummy_kp.arg = &test;
+   dummy_kp.arg = &new_value;
 
err = param_set_bool(val, &dummy_kp);
if (err)
return err;
 
/* Don't let them unset it once it's set! */
-   if (!test && sig_enforce)
+   if (!new_value && orig_value)
return -EROFS;
 
-   if (test)
-   sig_enforce = true;
-   return 0;
+   if (new_value)
+   err = param_set_bool(val, kp);
+
+   return err;
 }
 
 static const struct kernel_param_ops param_ops_bool_enable_only = {
-- 
2.3.2.209.gd67f9d5.dirty



[PATCH v1 6/6] kernel/module.c: use module_param_config_on() for sig_enforce

2015-04-20 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Cc: Rusty Russell 
Cc: Jani Nikula 
Cc: Christoph Hellwig 
Cc: Andrew Morton 
Cc: Geert Uytterhoeven 
Cc: Hannes Reinecke 
Cc: Kees Cook 
Cc: Tejun Heo 
Cc: Ingo Molnar 
Cc: linux-kernel@vger.kernel.org
Cc: co...@systeme.lip6.fr
Signed-off-by: Luis R. Rodriguez 
---
 kernel/module.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 43a1ef3..e63bbd2 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -106,13 +106,7 @@ struct list_head *kdb_modules = &modules; /* kdb needs the 
list of modules */
 #endif /* CONFIG_KGDB_KDB */
 
 #ifdef CONFIG_MODULE_SIG
-#ifdef CONFIG_MODULE_SIG_FORCE
-static bool sig_enforce = true;
-#else
-static bool sig_enforce = false;
-
-module_param(sig_enforce, bool_enable_only, 0644);
-#endif /* !CONFIG_MODULE_SIG_FORCE */
+module_param_config_on(sig_enforce, sig_enforce, 0644, 
CONFIG_MODULE_SIG_FORCE);
 #endif /* CONFIG_MODULE_SIG */
 
 /* Block module loading/unloading? */
-- 
2.3.2.209.gd67f9d5.dirty



[PATCH v1 4/6] moduleparam.h: add module_param_config_*() helpers

2015-04-20 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This adds a couple of bool module_param_config_*() helpers
which are designed to let us easily associate a boolean
module parameter with an associated kernel configuration
option, and to help us remove #ifdef'ery eyesores.
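
The #ifdef removal relies on turning "is this option defined to 1?" into a constant expression. A userspace sketch in the spirit of the kernel's IS_ENABLED() follows; every `TOY_` name is hypothetical, and the macro chain is an imitation written for illustration, not a copy of the kernel header.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * If the option expands to 1, the placeholder below injects an extra
 * "0," argument, which shifts a 1 into the second position. Otherwise
 * the pasted token is junk and the second argument stays 0.
 */
#define TOY_ARG_PLACEHOLDER_1 0,
#define TOY_TAKE_SECOND(ignored, val, ...) val
#define TOY_IS_SET(x) TOY_IS_SET_1(x)
#define TOY_IS_SET_1(val) TOY_IS_SET_2(TOY_ARG_PLACEHOLDER_##val)
#define TOY_IS_SET_2(arg1_or_junk) TOY_TAKE_SECOND(arg1_or_junk 1, 0, 0)

#define TOY_CONFIG_FEATURE_A 1	/* pretend this option is enabled */
/* TOY_CONFIG_FEATURE_B is deliberately left undefined */

/* One line per parameter default, with no #ifdef/#else/#endif block. */
static bool feature_a = TOY_IS_SET(TOY_CONFIG_FEATURE_A);
static bool feature_b = TOY_IS_SET(TOY_CONFIG_FEATURE_B);
```

Because the result is an ordinary constant, it can initialize a static variable directly, which is what lets a helper macro declare the variable and register the parameter in one place.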

Cc: Rusty Russell 
Cc: Jani Nikula 
Cc: Christoph Hellwig 
Cc: Andrew Morton 
Cc: Geert Uytterhoeven 
Cc: Hannes Reinecke 
Cc: Kees Cook 
Cc: Tejun Heo 
Cc: Ingo Molnar 
Cc: linux-kernel@vger.kernel.org
Cc: co...@systeme.lip6.fr
Signed-off-by: Luis R. Rodriguez 
---
 include/linux/moduleparam.h | 37 +
 1 file changed, 37 insertions(+)

diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index 7e00799..fdf7b87 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -155,6 +155,43 @@ struct kparam_array
__MODULE_PARM_TYPE(name, #type)
 
 /**
+ * module_param_config_on_off - bool parameter with run time override
+ * @name: a valid C identifier which is the parameter name.
+ * @value: the actual lvalue to alter.
+ * @perm: visibility in sysfs.
+ * @config: kernel parameter which will enable this option if this
+ * kernel configuration option has been enabled.
+ *
+ * This lets you define a bool module parameter which by default will be
+ * set to true if the config option has been set on your kernel's
+ * configuration, otherwise it is set to false.
+ */
+#define module_param_config_on_off(name, var, perm, config)\
+   static bool var = IS_ENABLED(config);   \
+   module_param_named(name, var, bool, perm);
+
+/**
+ * module_param_config_on - bool parameter with run time enablement override
+ * @name: a valid C identifier which is the parameter name.
+ * @value: the actual lvalue to alter.
+ * @perm: visibility in sysfs.
+ * @config: kernel parameter which will enable this option if this
+ * kernel configuration option has been enabled.
+ *
+ * This lets you define a bool module parameter which by default will be
+ * set to true if the config option has been set on your kernel's
+ * configuration, otherwise it is set to false. This particular helper
+ * will ensure that if the kernel configuration has been set you will not
+ * be able to disable this kernel parameter. You can only use this to let
+ * an option that was disabled on your kernel configuration be enabled
+ * at run time.
+ */
+#define module_param_config_on(name, var, perm, config)\
+   static bool var = IS_ENABLED(config);   \
+   module_param_named(name, var, bool_enable_only, perm);
+
+
+/**
  * module_param_cb - general callback for a module/cmdline parameter
  * @name: a valid C identifier which is the parameter name.
  * @ops: the set & get operations for this parameter.
-- 
2.3.2.209.gd67f9d5.dirty



[PATCH v1 0/6] module params: few simplifications

2015-04-20 Thread Luis R. Rodriguez
Most code already uses consts for the struct kernel_param_ops,
sweep the kernel for the last offending stragglers. Other than
include/linux/moduleparam.h and kernel/params.c all other changes
were generated with the following Coccinelle SmPL patch. Merge
conflicts between trees can be handled with Coccinelle.

In the future git could get Coccinelle merge support to deal with
patch --> fail --> grammar --> Coccinelle --> new patch conflicts
automatically for us on patches where the grammar is available and
the patch is of high confidence. Consider this a feature request.

Test compiled on x86_64 against:

  * allnoconfig
  * allmodconfig
  * allyesconfig

@ const_found @
identifier ops;
@@

const struct kernel_param_ops ops = {
};

@ const_not_found depends on !const_found @
identifier ops;
@@

-struct kernel_param_ops ops = {
+const struct kernel_param_ops ops = {
};

Generated-by: Coccinelle SmPL
Cc: co...@systeme.lip6.fr
Cc: Rusty Russell 
Cc: Junio C Hamano 
Cc: Jani Nikula 
Cc: Christoph Hellwig 
Cc: Andrew Morton 
Cc: Geert Uytterhoeven 
Cc: Hannes Reinecke 
Cc: Kees Cook 
Cc: Tejun Heo 
Cc: Ingo Molnar 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 arch/s390/kernel/perf_cpum_sf.c |  2 +-
 arch/x86/kvm/mmu_audit.c|  2 +-
 arch/x86/platform/uv/uv_nmi.c   |  2 +-
 drivers/block/null_blk.c|  4 ++--
 drivers/char/ipmi/ipmi_watchdog.c   |  6 +++---
 drivers/dma/dmatest.c   |  4 ++--
 drivers/ide/ide.c   |  2 +-
 drivers/infiniband/ulp/srp/ib_srp.c |  4 ++--
 drivers/input/misc/ati_remote2.c|  4 ++--
 drivers/input/mouse/psmouse-base.c  |  2 +-
 drivers/misc/lis3lv02d/lis3lv02d.c  |  2 +-
 drivers/mtd/ubi/block.c |  2 +-
 drivers/net/wireless/ath/wil6210/main.c |  4 ++--
 drivers/power/test_power.c  | 16 
 drivers/thermal/intel_powerclamp.c  |  4 ++--
 drivers/tty/hvc/hvc_iucv.c  |  2 +-
 drivers/tty/sysrq.c |  2 +-
 drivers/video/fbdev/uvesafb.c   |  2 +-
 drivers/virtio/virtio_mmio.c|  2 +-
 fs/nfs/super.c  |  2 +-
 include/linux/moduleparam.h | 30 +++---
 kernel/params.c | 14 +++---
 net/sunrpc/auth.c   |  2 +-
 net/sunrpc/xprtsock.c   |  6 +++---
 security/apparmor/lsm.c |  6 +++---
 security/integrity/ima/ima_crypto.c |  2 +-
 sound/pci/hda/hda_intel.c   |  2 +-
 27 files changed, 66 insertions(+), 66 deletions(-)

diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c
index e6a1578..afe05bf 100644
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -1572,7 +1572,7 @@ static int param_set_sfb_size(const char *val, const 
struct kernel_param *kp)
 }
 
 #define param_check_sfb_size(name, p) __param_check(name, p, void)
-static struct kernel_param_ops param_ops_sfb_size = {
+static const struct kernel_param_ops param_ops_sfb_size = {
.set = param_set_sfb_size,
.get = param_get_sfb_size,
 };
diff --git a/arch/x86/kvm/mmu_audit.c b/arch/x86/kvm/mmu_audit.c
index 9ade5cf..87393e3 100644
--- a/arch/x86/kvm/mmu_audit.c
+++ b/arch/x86/kvm/mmu_audit.c
@@ -291,7 +291,7 @@ static int mmu_audit_set(const char *val, const struct 
kernel_param *kp)
return 0;
 }
 
-static struct kernel_param_ops audit_param_ops = {
+static const struct kernel_param_ops audit_param_ops = {
.set = mmu_audit_set,
.get = param_get_bool,
 };
diff --git a/arch/x86/platform/uv/uv_nmi.c b/arch/x86/platform/uv/uv_nmi.c
index 7488caf..020c101 100644
--- a/arch/x86/platform/uv/uv_nmi.c
+++ b/arch/x86/platform/uv/uv_nmi.c
@@ -104,7 +104,7 @@ static int param_set_local64(const char *val, const struct 
kernel_param *kp)
return 0;
 }
 
-static struct kernel_param_ops param_ops_local64 = {
+static const struct kernel_param_ops param_ops_local64 = {
.get = param_get_local64,
.set = param_set_local64,
 };
diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index 65cd61a..0b4b256 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
@@ -99,7 +99,7 @@ static int null_set_queue_mode(const char *str, const struct 
kernel_param *kp)
return null_param_store_val(str, &queue_mode, NULL_Q_BIO, NULL_Q_MQ);
 }
 
-static struct kernel_param_ops null_queue_mode_param_ops = {
+static const struct kernel_param_ops null_queue_mode_param_ops = {
.set= null_set_queue_mode,
.get= param_get_int,
 };
@@ -127,7 +127,7 @@ static int null_set_irqmode(const char *str, const struct 
kernel_param *kp)
NULL_IRQ_TIMER);
 }
 
-static struct kernel_param_ops null_irqmode_param_ops = {
+static const struct kernel_param_ops null_irqmode_param_ops = {
.set= null_set_irqmode,
   

[PATCH v1 5/6] kernel/workqueue.c: use module_param_config_on_off() for power_efficient

2015-04-20 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Use the new module_param_config_on_off() for setting / disabling of
CONFIG_WQ_POWER_EFFICIENT_DEFAULT.

Cc: Rusty Russell 
Cc: Jani Nikula 
Cc: Christoph Hellwig 
Cc: Andrew Morton 
Cc: Geert Uytterhoeven 
Cc: Hannes Reinecke 
Cc: Kees Cook 
Cc: Tejun Heo 
Cc: Ingo Molnar 
Cc: linux-kernel@vger.kernel.org
Cc: co...@systeme.lip6.fr
Signed-off-by: Luis R. Rodriguez 
---
 kernel/workqueue.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 586ad91..cf6c2f1 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -280,13 +280,7 @@ static bool wq_disable_numa;
 module_param_named(disable_numa, wq_disable_numa, bool, 0444);
 
 /* see the comment above the definition of WQ_POWER_EFFICIENT */
-#ifdef CONFIG_WQ_POWER_EFFICIENT_DEFAULT
-static bool wq_power_efficient = true;
-#else
-static bool wq_power_efficient;
-#endif
-
-module_param_named(power_efficient, wq_power_efficient, bool, 0444);
+module_param_config_on_off(power_efficient, wq_power_efficient, 0444, 
CONFIG_WQ_POWER_EFFICIENT_DEFAULT);
 
 static bool wq_numa_enabled;   /* unbound NUMA affinity enabled */
 
-- 
2.3.2.209.gd67f9d5.dirty



[PATCH v1 0/6] module params: few simplifications

2015-04-20 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Here are a few simplifications to the sig_force module parameter code.
I'm digging through this because, long term, I'd like to enable standard
use of the module loading crypto code for firmware loading, and later for
any requested file (non-firmware), to replace udev daemons such as
CRDA, which should then no longer be needed.

Short term this means seeing what code we could reuse. The option
to either force-enable or passively enable signing is one of the options
I'd like to see kept for firmware signing. The same grammar as used
for module signing can be used, but instead of copy+pasting code I
decided to generalize the sig_force feature option, learn from its
implementation's use of const and make that generic too, and lastly
simplify this even further to one line of code,
as I had done for the early_param_on_off() stuff recently. Since I'm
also adding an on_off() case for module parameters I had to find a
simple example use case for that, and picked workqueue.

We might later be able to use SmPL grammar to replace a lot of old code
with these helpers (including early_param_on_off) but will let others look
into that as I'd like to complete other tasks.

All of this was test compiled on x86_64 with:

  * allnoconfig
  * allmodconfig
  * allyesconfig

This series was based on top of linux-next next-20150420.

Luis R. Rodriguez (6):
  kernel/params: constify struct kernel_param_ops uses
  kernel/module.c: use generic module param operators for sig_enforce
  kernel/params.c: generalize bool_enable_only
  moduleparam.h: add module_param_config_*() helpers
  kernel/workqueue.c: use module_param_config_on_off() for
power_efficient
  kernel/module.c: use module_param_config_on() for sig_enforce

 arch/s390/kernel/perf_cpum_sf.c |  2 +-
 arch/x86/kvm/mmu_audit.c|  2 +-
 arch/x86/platform/uv/uv_nmi.c   |  2 +-
 drivers/block/null_blk.c|  4 +-
 drivers/char/ipmi/ipmi_watchdog.c   |  6 +--
 drivers/dma/dmatest.c   |  4 +-
 drivers/ide/ide.c   |  2 +-
 drivers/infiniband/ulp/srp/ib_srp.c |  4 +-
 drivers/input/misc/ati_remote2.c|  4 +-
 drivers/input/mouse/psmouse-base.c  |  2 +-
 drivers/misc/lis3lv02d/lis3lv02d.c  |  2 +-
 drivers/mtd/ubi/block.c |  2 +-
 drivers/net/wireless/ath/wil6210/main.c |  4 +-
 drivers/power/test_power.c  | 16 
 drivers/thermal/intel_powerclamp.c  |  4 +-
 drivers/tty/hvc/hvc_iucv.c  |  2 +-
 drivers/tty/sysrq.c |  2 +-
 drivers/video/fbdev/uvesafb.c   |  2 +-
 drivers/virtio/virtio_mmio.c|  2 +-
 fs/nfs/super.c  |  2 +-
 include/linux/moduleparam.h | 73 ++---
 kernel/module.c | 37 +
 kernel/params.c | 44 
 kernel/workqueue.c  |  8 +---
 net/sunrpc/auth.c   |  2 +-
 net/sunrpc/xprtsock.c   |  6 +--
 security/apparmor/lsm.c |  6 +--
 security/integrity/ima/ima_crypto.c |  2 +-
 sound/pci/hda/hda_intel.c   |  2 +-
 29 files changed, 141 insertions(+), 109 deletions(-)

-- 
2.3.2.209.gd67f9d5.dirty



Re: [PATCH v3 6/8] selftest/x86: have no dependency on all when cross building

2015-04-20 Thread Andy Lutomirski
On Mon, Apr 20, 2015 at 4:15 PM, Tyler Baker  wrote:
> If CROSS_COMPILE is set, remove all's dependency on all_32 and all_64.
>
> Cc: Andy Lutomirski 
> Signed-off-by: Tyler Baker 
> ---
>  tools/testing/selftests/x86/Makefile | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/x86/Makefile 
> b/tools/testing/selftests/x86/Makefile
> index be93945..a5ca38b 100644
> --- a/tools/testing/selftests/x86/Makefile
> +++ b/tools/testing/selftests/x86/Makefile
> @@ -7,15 +7,21 @@ BINARIES_64 := $(TARGETS_C_BOTHBITS:%=%_64)
>
>  CFLAGS := -O2 -g -std=gnu99 -pthread -Wall
>
> +all:
> +

This...

>  UNAME_M := $(shell uname -m)
>
> +ifeq ($(CROSS_COMPILE),)
>  # Always build 32-bit tests
>  all: all_32
> -
>  # If we're on a 64-bit host, build 64-bit tests as well
>  ifeq ($(UNAME_M),x86_64)
>  all: all_64
>  endif
> +else
> +# No dependency on all when cross building
> +all:

...is redundant with this.  If you delete the "else" and "all:" here, then:

Acked-by: Andy Lutomirski 

> +endif
>
>  all_32: check_build32 $(BINARIES_32)
>
> --
> 2.1.4
>



-- 
Andy Lutomirski
AMA Capital Management, LLC


Re: [PATCH v3 5/8] selftest/x86: build both bitnesses

2015-04-20 Thread Andy Lutomirski
On Mon, Apr 20, 2015 at 4:15 PM, Tyler Baker  wrote:
> Using uname with the processor flag option can yield 'unknown' in some cases,
> so let's use the machine flag option, as it is deterministic. Add a dependency
> on all_32 when building on an x86 64-bit host so that both bitnesses are
> built in this case.
>

Acked-by: Andy Lutomirski 
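
As an aside, the same host check can be sketched in C. This is only an illustration, assuming a POSIX system: `utsname.machine` from uname(2) is the field that `uname -m` prints, and the `demo_` function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/utsname.h>

/* True when the machine string is the one the Makefile tests for. */
bool demo_is_x86_64(const char *machine)
{
	assert(machine != NULL);
	return strcmp(machine, "x86_64") == 0;
}

/*
 * Query the running system via uname(2).
 * Returns -1 if the call fails, otherwise 1 for an x86_64 host and 0 for
 * anything else.
 */
int demo_host_is_x86_64(void)
{
	struct utsname u;

	if (uname(&u) < 0)
		return -1;
	return demo_is_x86_64(u.machine) ? 1 : 0;
}
```

A string such as 'unknown', which the commit message says the processor flag can produce, simply fails the comparison, so the 64-bit tests would be skipped.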

> Cc: Andy Lutomirski 
> Signed-off-by: Tyler Baker 
> ---
>  tools/testing/selftests/x86/Makefile | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/x86/Makefile 
> b/tools/testing/selftests/x86/Makefile
> index ddf6356..be93945 100644
> --- a/tools/testing/selftests/x86/Makefile
> +++ b/tools/testing/selftests/x86/Makefile
> @@ -7,13 +7,13 @@ BINARIES_64 := $(TARGETS_C_BOTHBITS:%=%_64)
>
>  CFLAGS := -O2 -g -std=gnu99 -pthread -Wall
>
> -UNAME_P := $(shell uname -p)
> +UNAME_M := $(shell uname -m)
>
>  # Always build 32-bit tests
>  all: all_32
>
>  # If we're on a 64-bit host, build 64-bit tests as well
> -ifeq ($(shell uname -p),x86_64)
> +ifeq ($(UNAME_M),x86_64)
>  all: all_64
>  endif
>
> --
> 2.1.4
>



-- 
Andy Lutomirski
AMA Capital Management, LLC


Re: [PATCH 4/7] sched/deadline: reschedule if stop task slip in after pull operations

2015-04-20 Thread Wanpeng Li
On Tue, Apr 21, 2015 at 06:59:02AM +0800, Wanpeng Li wrote:
>Hi Juri,
>On Mon, Apr 20, 2015 at 11:27:22AM +0100, Juri Lelli wrote:
>>Hi,
>>
>>On 06/04/2015 09:53, Wanpeng Li wrote:
>>> pull_dl_task() can drop (and re-acquire) rq->lock, which means a stop task 
>>> can slip in, in which case we need to reschedule. This patch adds the 
>>> reschedule for that scenario.
>>> 
>>
>>Ok, I guess it can happen. Doesn't RT have the same problem? It seems that
>>it also has to deal with DL tasks slipping in, right?
>
>Yeah, I will send another patch to handle RT class in the v2 patchset. :)

Oh, I just found that I have already done this in patch 7/7.

Regards,
Wanpeng Li 

>
>Regards,
>Wanpeng Li 
>
>>
>>Thanks,
>>
>>- Juri
>>
>>> Signed-off-by: Wanpeng Li 
>>> ---
>>>  kernel/sched/deadline.c | 16 +++-
>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>>> index b8b9355..844da0f 100644
>>> --- a/kernel/sched/deadline.c
>>> +++ b/kernel/sched/deadline.c
>>> @@ -1739,7 +1739,13 @@ static void switched_from_dl(struct rq *rq, struct 
>>> task_struct *p)
>>> if (!task_on_rq_queued(p) || rq->dl.dl_nr_running)
>>> return;
>>>  
>>> -   if (pull_dl_task(rq))
>>> +   /*
>>> +* pull_dl_task() can drop (and re-acquire) rq->lock; this
>>> +* means a stop task can slip in, in which case we need to
>>> +* reschedule.
>>> +*/
>>> +   if (pull_dl_task(rq) ||
>>> +   (rq->stop && task_on_rq_queued(rq->stop)))
>>> resched_curr(rq);
>>>  }
>>>  
>>> @@ -1786,6 +1792,14 @@ static void prio_changed_dl(struct rq *rq, struct 
>>> task_struct *p,
>>> pull_dl_task(rq);
>>>  
>>> /*
>>> +* pull_dl_task() can drop (and re-acquire) rq->lock; this
>>> +* means a stop task can slip in, in which case we need to
>>> +* reschedule.
>>> +*/
>>> +   if (rq->stop && task_on_rq_queued(rq->stop))
>>> +   resched_curr(rq);
>>> +
>>> +   /*
>>>  * If we now have a earlier deadline task than p,
>>>  * then reschedule, provided p is still on this
>>>  * runqueue.
>>> 


Re: [PATCH 04/44] perf tools: Add support for AUX area recording

2015-04-20 Thread Arnaldo Carvalho de Melo
On Thu, Apr 09, 2015 at 06:53:44PM +0300, Adrian Hunter wrote:
> Add support for reading from the AUX area
> tracing mmap and synthesizing AUX area
> tracing events.
> 
> This patch introduces an abstraction for recording
> AUX area data.  Recording is initialized
> by auxtrace_record__init() which is a weak function
> to be implemented by the architecture to provide
> recording callbacks.  Recording is mainly handled
> by auxtrace_mmap__read() and
> perf_event__synthesize_auxtrace() but there are
> callbacks for miscellaneous needs including
> validating and processing user options, populating
> private data in auxtrace_info_event, and freeing
> the structure when finished.
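
The pattern described above (a weak default init plus NULL-tolerant wrappers around optional callbacks) can be sketched in plain C roughly as follows. This is a userspace illustration, not the perf code: the struct is cut down to a single callback, the `demo_` names are hypothetical, and `__attribute__((weak))` assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for a recorder with one callback. */
struct demo_record {
	size_t (*info_priv_size)(struct demo_record *itr);
};

/*
 * Weak default: used only when no other object file supplies a strong
 * definition of the same symbol. It reports "no error, no recorder".
 */
__attribute__((weak)) struct demo_record *demo_record__init(int *err)
{
	assert(err != NULL);
	*err = 0;
	return NULL;
}

/* NULL-safe wrapper: callers need not know whether a recorder exists. */
size_t demo_record__info_priv_size(struct demo_record *itr)
{
	if (itr)
		return itr->info_priv_size(itr);
	return 0;
}
```

An architecture that supports recording would provide its own non-weak `demo_record__init()` returning a populated struct; everything else falls back to the NULL path.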
> 
> Signed-off-by: Adrian Hunter 
> Acked-by: Jiri Olsa 
> ---
>  tools/perf/perf.h          |   2 +
>  tools/perf/util/auxtrace.c | 176 +
>  tools/perf/util/auxtrace.h |  56 ++-
>  tools/perf/util/record.c   |  11 ++-
>  4 files changed, 243 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index e14bb63..5042093 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -54,8 +54,10 @@ struct record_opts {
>   bool period;
>   bool sample_intr_regs;
>   bool running_time;
> + bool full_auxtrace;
>   unsigned int freq;
>   unsigned int mmap_pages;
> + unsigned int auxtrace_mmap_pages;
>   unsigned int user_freq;
>   u64  branch_stack;
>   u64  default_interval;
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index dedb646..2cafea2 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -23,6 +23,10 @@
>  #include 
>  #include 
>  
> +#include 
> +#include 
> +#include 
> +
>  #include "../perf.h"
>  #include "util.h"
>  #include "evlist.h"
> @@ -31,6 +35,9 @@
>  #include "asm/bug.h"
>  #include "auxtrace.h"
>  
> +#include "event.h"
> +#include "debug.h"
> +
>  int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
>   struct auxtrace_mmap_params *mp,
>   void *userpg, int fd)
> @@ -111,3 +118,172 @@ void auxtrace_mmap_params__set_idx(struct 
> auxtrace_mmap_params *mp,
>   mp->tid = evlist->threads->map[idx];
>   }
>  }
> +
> +size_t auxtrace_record__info_priv_size(struct auxtrace_record *itr)
> +{
> + if (itr)
> + return itr->info_priv_size(itr);
> + return 0;
> +}


return itr ? itr->info_priv_size(itr) : 0;

is more compact; applying anyway. There is one other comment below,
but it is cosmetic as well.
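
To make the equivalence concrete, here is a small standalone sketch comparing the two forms. The struct and the `demo_`/`priv_size_` names are hypothetical stand-ins, not the perf definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a recorder with a single callback. */
struct demo_itr {
	size_t (*info_priv_size)(struct demo_itr *itr);
};

/* Example callback that reports a fixed private-data size. */
size_t demo_fixed_size(struct demo_itr *itr)
{
	assert(itr != NULL);
	return 16;
}

/* Form used in the patch: explicit if/return. */
size_t priv_size_if(struct demo_itr *itr)
{
	if (itr)
		return itr->info_priv_size(itr);
	return 0;
}

/* Compact form suggested in the review: conditional expression. */
size_t priv_size_ternary(struct demo_itr *itr)
{
	return itr ? itr->info_priv_size(itr) : 0;
}

/* Returns 1 when both forms give the same answer for this recorder. */
int forms_agree(struct demo_itr *itr)
{
	return priv_size_if(itr) == priv_size_ternary(itr);
}
```

Both forms call the callback only when the pointer is non-NULL, so the difference is purely stylistic.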

- Arnaldo

> +
> +static int auxtrace_not_supported(void)
> +{
> + pr_err("AUX area tracing is not supported on this architecture\n");
> + return -EINVAL;
> +}
> +
> +int auxtrace_record__info_fill(struct auxtrace_record *itr,
> +struct perf_session *session,
> +struct auxtrace_info_event *auxtrace_info,
> +size_t priv_size)
> +{
> + if (itr)
> + return itr->info_fill(itr, session, auxtrace_info, priv_size);
> + return auxtrace_not_supported();
> +}
> +
> +void auxtrace_record__free(struct auxtrace_record *itr)
> +{
> + if (itr)
> + itr->free(itr);
> +}
> +
> +int auxtrace_record__options(struct auxtrace_record *itr,
> +  struct perf_evlist *evlist,
> +  struct record_opts *opts)
> +{
> + if (itr)
> + return itr->recording_options(itr, evlist, opts);
> + return 0;
> +}
> +
> +u64 auxtrace_record__reference(struct auxtrace_record *itr)
> +{
> + if (itr)
> + return itr->reference(itr);
> + return 0;
> +}
> +
> +struct auxtrace_record *__weak
> +auxtrace_record__init(struct perf_evlist *evlist __maybe_unused, int *err)
> +{
> + *err = 0;
> + return NULL;
> +}
> +
> +int perf_event__synthesize_auxtrace_info(struct auxtrace_record *itr,
> +  struct perf_tool *tool,
> +  struct perf_session *session,
> +  perf_event__handler_t process)
> +{
> + union perf_event *ev;
> + size_t priv_size;
> + int err;
> +
> + pr_debug2("Synthesizing auxtrace information\n");
> + priv_size = auxtrace_record__info_priv_size(itr);
> + ev = zalloc(sizeof(struct auxtrace_info_event) + priv_size);
> + if (!ev)
> + return -ENOMEM;
> +
> + ev->auxtrace_info.header.type = PERF_RECORD_AUXTRACE_INFO;
> + ev->auxtrace_info.header.size = sizeof(struct auxtrace_info_event) +
> + priv_size;
> + err = auxtrace_record__info_fill(itr, session, &ev->auxtrace_info,
> +  priv_size);
> + if (err)
> + goto out_free;
> +
> + err = process(tool, ev, NULL, NULL);
> +out_free:

if (!err)
   

Re: [PATCH 4/7] sched/deadline: reschedule if stop task slip in after pull operations

2015-04-20 Thread Wanpeng Li
Hi Juri,
On Mon, Apr 20, 2015 at 11:27:22AM +0100, Juri Lelli wrote:
>Hi,
>
>On 06/04/2015 09:53, Wanpeng Li wrote:
>> pull_dl_task() can drop (and re-acquire) rq->lock, which means a stop task 
>> can slip in, in which case we need to reschedule. This patch adds the 
>> reschedule for that scenario.
>> 
>
>Ok, I guess it can happen. Doesn't RT have the same problem? It seems that
>it also has to deal with DL tasks slipping in, right?

Yeah, I will send another patch to handle RT class in the v2 patchset. :)

Regards,
Wanpeng Li 

>
>Thanks,
>
>- Juri
>
>> Signed-off-by: Wanpeng Li 
>> ---
>>  kernel/sched/deadline.c | 16 +++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>> 
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index b8b9355..844da0f 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -1739,7 +1739,13 @@ static void switched_from_dl(struct rq *rq, struct 
>> task_struct *p)
>>  if (!task_on_rq_queued(p) || rq->dl.dl_nr_running)
>>  return;
>>  
>> -if (pull_dl_task(rq))
>> +/*
>> + * pull_dl_task() can drop (and re-acquire) rq->lock; this
>> + * means a stop task can slip in, in which case we need to
>> + * reschedule.
>> + */
>> +if (pull_dl_task(rq) ||
>> +(rq->stop && task_on_rq_queued(rq->stop)))
>>  resched_curr(rq);
>>  }
>>  
>> @@ -1786,6 +1792,14 @@ static void prio_changed_dl(struct rq *rq, struct 
>> task_struct *p,
>>  pull_dl_task(rq);
>>  
>>  /*
>> + * pull_dl_task() can drop (and re-acquire) rq->lock; this
>> + * means a stop task can slip in, in which case we need to
>> + * reschedule.
>> + */
>> +if (rq->stop && task_on_rq_queued(rq->stop))
>> +resched_curr(rq);
>> +
>> +/*
>>   * If we now have a earlier deadline task than p,
>>   * then reschedule, provided p is still on this
>>   * runqueue.
>> 
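
The condition the quoted patch adds to switched_from_dl() can be modelled in userspace roughly as below. This is only a sketch: the `demo_` types are hypothetical stand-ins for the scheduler structures, and the dropped lock is modelled by a flag that lets the stop task become queued during the pull.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the scheduler types. */
struct demo_task {
	bool on_rq;			/* models task_on_rq_queued() */
};

struct demo_rq {
	struct demo_task *stop;		/* stop task, may be NULL */
	bool pulled;			/* what the modelled pull returns */
	bool stop_wakes_in_pull;	/* stop task queued while lock is dropped */
};

/*
 * Models pull_dl_task(): while the lock is conceptually dropped, the
 * stop task may be queued behind the caller's back.
 */
bool demo_pull_dl_task(struct demo_rq *rq)
{
	assert(rq != NULL);
	if (rq->stop && rq->stop_wakes_in_pull)
		rq->stop->on_rq = true;
	return rq->pulled;
}

/* Mirrors the shape of the condition added to switched_from_dl(). */
bool demo_need_resched(struct demo_rq *rq)
{
	return demo_pull_dl_task(rq) ||
	       (rq->stop && rq->stop->on_rq);
}
```

The case of interest is a pull that moves nothing while the stop task is queued in the meantime: checking only the pull's return value would miss it, and the extra test catches it.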


[PATCH v3 8/8] selftests/exec: do not install subdir as it is already created

2015-04-20 Thread Tyler Baker
Remove subdir from DEPS as it is already created at runtime. Without this,
make install fails.

Acked-by: Michael Ellerman 
Signed-off-by: Tyler Baker 
---
 tools/testing/selftests/exec/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/exec/Makefile 
b/tools/testing/selftests/exec/Makefile
index 4edb7d0..6b76bfd 100644
--- a/tools/testing/selftests/exec/Makefile
+++ b/tools/testing/selftests/exec/Makefile
@@ -1,6 +1,6 @@
 CFLAGS = -Wall
 BINARIES = execveat
-DEPS = execveat.symlink execveat.denatured script subdir
+DEPS = execveat.symlink execveat.denatured script
 all: $(BINARIES) $(DEPS)
 
 subdir:
-- 
2.1.4



[PATCH v3 7/8] selftest/x86: install tests

2015-04-20 Thread Tyler Baker
Include lib.mk and set TEST_PROGS where appropriate. Skip the install and test
case when CROSS_COMPILE is set.

Cc: Andy Lutomirski 
Signed-off-by: Tyler Baker 
---
 tools/testing/selftests/x86/Makefile | 9 +
 1 file changed, 9 insertions(+)

diff --git a/tools/testing/selftests/x86/Makefile 
b/tools/testing/selftests/x86/Makefile
index a5ca38b..d539b44 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -14,19 +14,28 @@ UNAME_M := $(shell uname -m)
 ifeq ($(CROSS_COMPILE),)
 # Always build 32-bit tests
 all: all_32
+# Install 32-bit tests
+TEST_PROGS += $(BINARIES_32) run_x86_tests.sh
 # If we're on a 64-bit host, build 64-bit tests as well
 ifeq ($(UNAME_M),x86_64)
 all: all_64
+# Install 64-bit tests
+TEST_PROGS += $(BINARIES_64)
 endif
 else
 # No dependency on all when cross building
 all:
+# Skip install and test case when not built
+override INSTALL_RULE :=
+override EMIT_TESTS :=  echo "echo \"selftests: run_x86_tests.sh [SKIP]\""
 endif
 
 all_32: check_build32 $(BINARIES_32)
 
 all_64: $(BINARIES_64)
 
+include ../lib.mk
+
 clean:
 	$(RM) $(BINARIES_32) $(BINARIES_64)
 
-- 
2.1.4



[PATCH v3 6/8] selftest/x86: have no dependency on all when cross building

2015-04-20 Thread Tyler Baker
If CROSS_COMPILE is set, remove all's dependency on all_32 and all_64.

Cc: Andy Lutomirski 
Signed-off-by: Tyler Baker 
---
 tools/testing/selftests/x86/Makefile | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/Makefile 
b/tools/testing/selftests/x86/Makefile
index be93945..a5ca38b 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -7,15 +7,21 @@ BINARIES_64 := $(TARGETS_C_BOTHBITS:%=%_64)
 
 CFLAGS := -O2 -g -std=gnu99 -pthread -Wall
 
+all:
+
 UNAME_M := $(shell uname -m)
 
+ifeq ($(CROSS_COMPILE),)
 # Always build 32-bit tests
 all: all_32
-
 # If we're on a 64-bit host, build 64-bit tests as well
 ifeq ($(UNAME_M),x86_64)
 all: all_64
 endif
+else
+# No dependency on all when cross building
+all:
+endif
 
 all_32: check_build32 $(BINARIES_32)
 
-- 
2.1.4


