date:20150510

Re: [PATCH V5 1/1] hv_netvsc: Use the xmit_more skb flag to optimize signaling the host

2015-05-10 Thread Joe Perches

On Sun, 2015-05-10 at 21:08 -0700, K. Y. Srinivasan wrote:
> Based on the information given to this driver (via the xmit_more skb flag),
> we can defer signaling the host if more packets are on the way. This will help
> make the host more efficient since it can potentially process a larger batch
> of packets. Implement this optimization.

trivia:

I think that indirecting VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED
into a non-const temporary isn't very useful.

Whenever overly long identifiers like the VMBUS_ ones are used,
I think that it'd be better to use them directly and ignore
80 column warnings.

Same with the "sizeof(struct nvsp_message)" on two lines.

> diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
[]
> @@ -743,6 +743,8 @@ static inline int netvsc_send_pkt(
>   u64 req_id;
>   int ret;
>   struct hv_page_buffer *pgbuf;
> + u32 vmbus_flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED;
> + u32 ring_avail = hv_ringbuf_avail_percent(&out_channel->outbound);
[]
> @@ -769,30 +771,41 @@ static inline int netvsc_send_pkt(
[]
> + ret = vmbus_sendpacket_pagebuffer_ctl(out_channel,
> +   pgbuf,
> +   packet->page_buf_cnt,
> +   ,
> +   sizeof(struct
> +  nvsp_message),
> +   req_id,
> +   vmbus_flags,
> +   !packet->xmit_more);

[]

>   netif_tx_stop_queue(netdev_get_tx_queue(
>   ndev, q_idx));

This could be on one line too.
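
For readers following the commit message quoted at the top, the deferral idea can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct toy_ring, toy_send() and toy_send_burst() are hypothetical names, not the VMBus or netvsc API, and the ring-availability check that the patch also computes is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a host-visible ring; not the VMBus API. */
struct toy_ring {
	unsigned int queued;    /* packets written since the last signal */
	unsigned int signals;   /* number of times the host was signalled */
	unsigned int delivered; /* packets the host has been told about */
};

/* Queue one packet; signal the host only when kick is true. */
static void toy_send(struct toy_ring *ring, bool kick)
{
	ring->queued++;
	if (kick) {
		ring->signals++;
		ring->delivered += ring->queued;
		ring->queued = 0;
	}
}

/* Send a burst of n packets, with xmit_more set on all but the last. */
static void toy_send_burst(struct toy_ring *ring, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++) {
		bool xmit_more = (i + 1 < n);

		/* Mirrors the "!packet->xmit_more" argument in the patch. */
		toy_send(ring, !xmit_more);
	}
}
```

With this model a burst of eight packets results in a single signal, which is the batching effect the commit message is after.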


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Crypto Fixes for 4.1

2015-05-10 Thread Herbert Xu

Hi Linus:

This push fixes the implementation of CRC32 on arm64 where it
incorrectly applied negation on the result.  It also fixes the
arm64 implementations of SHA/SHA256 where in some cases it may
end up finalising the result twice.
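
As a rough illustration of why an extra negation matters, here is a minimal userspace sketch. It assumes the "negation" refers to the final bit inversion that many CRC32 users apply on top of the raw update step; crc32_raw() and crc32_zlib_style() are hypothetical names and this is not the arm64 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32 (polynomial 0xEDB88320), no pre/post inversion. */
static uint32_t crc32_raw(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0);
	}
	return crc;
}

/* zlib-style checksum: the caller seeds with ~0 and inverts the result. */
static uint32_t crc32_zlib_style(const unsigned char *p, size_t len)
{
	return ~crc32_raw(0xFFFFFFFFu, p, len);
}
```

If one implementation inverts the result internally and another leaves that to the caller, the two produce different values for the same seed and data, so they cannot be used interchangeably.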

Please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git

or

master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6.git


Ard Biesheuvel (3):
  crypto: arm64/crc32 - bring in line with generic CRC32
  crypto: arm64/sha1-ce - prevent asm code finalization in final() path
  crypto: arm64/sha2-ce - prevent asm code finalization in final() path

 arch/arm64/crypto/crc32-arm64.c  |   22 +++---
 arch/arm64/crypto/sha1-ce-glue.c |3 +++
 arch/arm64/crypto/sha2-ce-glue.c |3 +++
 3 files changed, 25 insertions(+), 3 deletions(-)

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[LKP] [sched] eae3e9e8843: +36.8% pigz.throughput

2015-05-10 Thread Huang Ying

FYI, we noticed the below changes on

git://bee.sh.intel.com/git/ydu19/linux rewrite-v7-on-4.1-rc1
commit eae3e9e8843146e7e1cc77bd943e5f8138b61314 ("sched: Rewrite per entity runnable load average tracking")

testcase/path_params/tbox_group: pigz/performance-100%-128K/lkp-nex06

40fa32019d8574cb  eae3e9e8843146e7e1cc77bd94
----------------  --------------------------
 2.683e+08 ±  0%    +36.8%    3.67e+08 ±  0%  pigz.throughput
     18661 ±  0%    -21.6%       14625 ±  0%  pigz.time.user_time
   2046066 ±  0%   +176.7%     5661440 ±  0%  pigz.time.voluntary_context_switches
       174 ±  0%    +40.9%         245 ±  0%  pigz.time.system_time
    904986 ±  1%    -43.8%      508883 ±  0%  pigz.time.involuntary_context_switches
      6268 ±  0%    -21.0%        4952 ±  0%  pigz.time.percent_of_cpu_this_job_got
      3382 ±  8%   +118.6%        7392 ±  3%  uptime.idle
     33576 ±  0%    -12.3%       29441 ±  1%  meminfo.Shmem
   1241653 ±  6%    -10.0%     1117099 ±  0%  softirqs.RCU
    201843 ±  0%    +97.2%      397952 ±  0%  softirqs.SCHED
   9772559 ±  0%    -20.6%     7762431 ±  0%  softirqs.TIMER
        63 ±  0%    -17.7%          52 ±  1%  vmstat.procs.r
     84867 ±  0%    -20.0%       67874 ±  0%  vmstat.system.in
     13057 ±  0%   +183.4%       37007 ±  0%  vmstat.system.cs
    904986 ±  1%    -43.8%      508883 ±  0%  time.involuntary_context_switches
      6268 ±  0%    -21.0%        4952 ±  0%  time.percent_of_cpu_this_job_got
       174 ±  0%    +40.9%         245 ±  0%  time.system_time
     18661 ±  0%    -21.6%       14625 ±  0%  time.user_time
   2046066 ±  0%   +176.7%     5661440 ±  0%  time.voluntary_context_switches
     69.56 ±  0%     +7.4%       74.67 ±  0%  turbostat.%Busy
      1335 ±  0%    +24.6%        1663 ±  0%  turbostat.Avg_MHz
      1920 ±  0%    +16.0%        2227 ±  0%  turbostat.Bzy_MHz
     29.76 ±  0%    -44.5%       16.50 ±  1%  turbostat.CPU%c1
      0.69 ±  5%  +1182.5%        8.82 ±  1%  turbostat.CPU%c3
      0.28 ±  9%    +83.0%        0.51 ±  2%  turbostat.Pkg%pc3
      8379 ±  0%    -12.2%        7360 ±  1%  proc-vmstat.nr_shmem
  26862047 ±  0%    +36.0%    36537881 ±  0%  proc-vmstat.numa_hit
  26861982 ±  0%    +36.0%    36537861 ±  0%  proc-vmstat.numa_local
      9403 ±  0%    -12.6%        8221 ±  2%  proc-vmstat.pgactivate
   2116622 ±  7%    +28.9%     2727789 ±  2%  proc-vmstat.pgalloc_dma32
  24828395 ±  0%    +36.6%    33904973 ±  0%  proc-vmstat.pgalloc_normal
  26931165 ±  0%    +36.0%    36619978 ±  0%  proc-vmstat.pgfree
   6608532 ±  7%    +29.1%     8533790 ±  2%  numa-numastat.node0.numa_hit
   6607488 ±  7%    +29.1%     8533243 ±  2%  numa-numastat.node0.local_node
   6949144 ± 10%    +32.9%     9232380 ± 10%  numa-numastat.node1.local_node
   6950185 ± 10%    +32.8%     9232396 ± 10%  numa-numastat.node1.numa_hit
   6887396 ± 17%    +50.7%    10377491 ± 10%  numa-numastat.node2.local_node
   6889467 ± 17%    +50.6%    10378080 ± 10%  numa-numastat.node2.numa_hit
   6427416 ± 16%    +30.8%     8408092 ±  1%  numa-numastat.node3.local_node
   6429490 ± 16%    +30.8%     8408663 ±  1%  numa-numastat.node3.numa_hit
     23149 ±  4%   +100.3%       46357 ±  1%  cpuidle.C1-NHM.usage
   7161061 ±  6%   +326.7%    30557423 ±  1%  cpuidle.C1-NHM.time
    100163 ±  4%   +263.5%      364095 ±  1%  cpuidle.C1E-NHM.usage
  22529441 ±  3%   +390.6%   1.105e+08 ±  1%  cpuidle.C1E-NHM.time
 3.774e+08 ±  3%  +1024.6%   4.244e+09 ±  0%  cpuidle.C3-NHM.time
    669049 ±  2%   +556.3%     4391062 ±  0%  cpuidle.C3-NHM.usage
        16 ± 15%   +506.1%         100 ± 11%  cpuidle.POLL.usage
       770 ± 40%   +544.8%        4966 ± 35%  cpuidle.POLL.time
     34535 ± 36%    -41.8%       20095 ±  8%  numa-meminfo.node0.Active(anon)
     23133 ±  7%    -12.8%       20180 ±  8%  numa-meminfo.node0.AnonPages
     13351 ±  4%    -20.6%       10594 ±  9%  numa-meminfo.node0.SReclaimable
     28535 ±  5%    +15.7%       33003 ±  9%  numa-meminfo.node2.Slab
    218579 ±  2%    +13.3%      247589 ±  1%  numa-meminfo.node2.MemUsed
    118187 ±  2%     +8.3%      127984 ±  9%  numa-meminfo.node3.FilePages
    226067 ±  4%    +10.6%      249934 ±  5%  numa-meminfo.node3.MemUsed
      2447 ±  9%    +18.8%        2908 ±  3%  numa-meminfo.node3.KernelStack
      5775 ±  7%    -12.6%        5047 ±  8%  numa-vmstat.node0.nr_anon_pages
   3498916 ±  9%    +24.1%     4343868 ±  2%  numa-vmstat.node0.numa_local
   3532759 ±  9%    +24.0%     4379945 ±  2%  numa-vmstat.node0.numa_hit
      8629 ± 37%    -41.8%        5025 ±  8%  numa-vmstat.node0.nr_active_anon
      3337 ±  4%    -20.7%        2648 ±  9%  numa-vmstat.node0.nr_slab_reclaimable
       150 ± 20%    -22.2%         116 ±  0%  numa-vmstat.node0.nr_unevictable
       150 ± 20%    -22.2%         116 ±  0%  numa-vmstat.node0.nr_mlock
      6417 ± 32%   +363.7%       29753 ± 41%  numa-vmstat.node1.numa_other
   3525752 ± 11%    +32.4%     4669694 ±  8%  numa-vmstat.node1.numa_hit
   3519334 ± 11%    +31.8%

linux-next: manual merge of the staging tree with the v4l-dvb tree

2015-05-10 Thread Stephen Rothwell

Hi Greg,

Today's linux-next merge of the staging tree got a conflict in
drivers/staging/media/dt3155v4l/dt3155v4l.c between commit cc11b140c3f1
("[media] dt3155: move out of staging into drivers/media/pci") from the
v4l-dvb tree and commit ec80e2428046 ("staging: dt3155v4l: remove
unused including <linux/config.h>") from the staging tree (which moved
the file to drivers/media/pci/dt3155/dt3155.c).

I fixed it up (I just deleted the file from staging - the
linux/config.h patch needs to be applied to the new file) and can carry
the fix as necessary (no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au



Re: Pointer Cast to Different Type Warnings in sh_mmcif.c

2015-05-10 Thread Kuninori Morimoto


Hi Nick,

Thank you for your feedback.

> > I guess this warning happen from
> > 5f48dd0690cbcea3f35b9ef2f05d5468dedc80b0
> > (mmc: sh_mmcif: remove slave_id settings for DMAEngine)
> > 
> > I didn't check, but is the cast issue solved by this?
> > 
> > -   (void *)pdata->slave_id_tx :
> > -   (void *)pdata->slave_id_rx;
> > +   (void *)(unsigned long)pdata->slave_id_tx :
> > +   (void *)(unsigned long)pdata->slave_id_rx;
> > 
> > 
> > Best regards
> > ---
> > Kuninori Morimoto
> > 
> Kuninori,
> After running an allmodconfig build on my machine, the warnings are no longer
> there. Either I can send in a patch fixing this, or you can send one with my
> name on a Suggested-by line in the sign-off part of your patch. The choice is
> up to you; I don't care either way.

Thank you.
Can you please send this patch?
I can give my Acked-by to your patch.
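
For reference, the double cast suggested in the quoted diff can be tried out in isolation. This is only a sketch: struct toy_pdata and the helper names are hypothetical, the field type is assumed to be unsigned int, and it relies on the usual kernel assumption that unsigned long is as wide as a pointer.

```c
#include <assert.h>

/* Hypothetical stand-in for the platform data in the quoted diff. */
struct toy_pdata {
	unsigned int slave_id_tx;
	unsigned int slave_id_rx;
};

/*
 * Widen the integer to unsigned long first, then convert to a pointer.
 * A direct (void *) cast from a narrower integer is what triggers the
 * "cast to pointer from integer of different size" warning on 64-bit.
 */
static void *toy_id_to_ptr(unsigned int id)
{
	return (void *)(unsigned long)id;
}

/* Recover the ID on the receiving side. */
static unsigned int toy_ptr_to_id(void *p)
{
	return (unsigned int)(unsigned long)p;
}
```

The value survives the round trip unchanged, so the extra cast only changes how the conversion is spelled, not what is passed.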

Re: [PATCH 0/3] cpuidle: updates related to tick_broadcast_enter() failures

2015-05-10 Thread Preeti U Murthy

On 05/10/2015 04:45 AM, Rafael J. Wysocki wrote:
> On Saturday, May 09, 2015 10:33:05 PM Rafael J. Wysocki wrote:
>> On Saturday, May 09, 2015 10:11:41 PM Rafael J. Wysocki wrote:
>>> On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
>>>> Hi Rafael,
>>>>
>>>> On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:
>>>
>>> [cut]
>>>
>>>>>>  
>>>>>> +/* Take note of the planned idle state. */
>>>>>> +idle_set_state(smp_processor_id(), target_state);
>>>>>
>>>>> And I wouldn't do this either.
>>>>>
>>>>> The behavior here is pretty much as though the driver demoted the state
>>>>> chosen by the governor and we don't call idle_set_state() again in those
>>>>> cases.
>>>>
>>>> Why is this wrong?
>>>
>>> It is not "wrong", but incomplete, because demotions done by the cpuidle
>>> driver should also be taken into account in the same way.
>>>
>>> But I'm seeing that the recent patch of mine that made cpuidle_enter_state()
>>> call default_idle_call() was a mistake, because it might confuse
>>> find_idlest_cpu() significantly as to what state the CPU is in.  I'll drop
>>> that one for now.
>>
>> OK, done.
>>
>> So after I've dropped it I think we need to do three things:
>> (1) Move the idle_set_state() calls to cpuidle_enter_state().
>> (2) Make cpuidle_enter_state() call default_idle_call() again, but this time
>> do that *before* it has called idle_set_state() for target_state.
>> (3) Introduce demotion as per my last patch.
>>
>> Let me cut patches for that.
> 
> Done as per the above and the patches follow in replies to this message.
> 
> All on top of the current linux-next branch of the linux-pm.git tree.

The patches look good. I applied and tested them on top of
linux-pm/linux-next (they are not yet in the branch as far as I can see).

All patches in this series
Reviewed and Tested-by: Preeti U Murthy 
> 
> 


Re: [GIT PULL 00/30] perf/core improvements and fixes

2015-05-10 Thread Namhyung Kim

Hi Arnaldo,

On Fri, May 08, 2015 at 05:56:12PM -0300, Arnaldo Carvalho de Melo wrote:
> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> 
> 
> The following changes since commit cb307113746b4d184155d2c412e8069aeaa60d42:
> 
>   perf_event: Don't allow vmalloc() backed perf on powerpc (2015-05-08 12:26:01 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo
> 
> for you to fetch changes up to 76d408498b08447e0f61dfdd611aeb6e8e61ce80:
> 
>   perf build: Disable libdw DWARF unwind when built with NO_DWARF (2015-05-08 16:43:14 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> User visible:
> 
> - 'perf probe' improvements (Masami Hiramatsu)
> 
>   - Support glob wildcards for function name
>   - Support $params special probe argument: Collect all function arguments
>   - Make --line checks validate C-style function name.
>   - Add --no-inlines option to avoid searching inline functions
> 
> - Introduce new 'perf bench futex' benchmark: 'wake-parallel', to
>   measure parallel waker threads generating contention for kernel
>   locks (hb->lock) (Davidlohr Bueso)
> 
> Bug fixes:
> 
> - 'perf top' survives much longer on high core count machines, more work
>   needed to refcount more data structures besides 'struct thread' and fix
>   more races (Arnaldo Carvalho de Melo)

I'm seeing a segfault on 'perf report' with a large data file after
applying thread refcount change - it happens regardless of the atomic
operation.

Thanks,
Namhyung

RE: [PATCH 02/10] drivers:host:fsl: Use ehci_overrides structure for EHCI drv

2015-05-10 Thread Ramneek Mehresh



> -----Original Message-----
> From: Alan Stern [mailto:st...@rowland.harvard.edu]
> Sent: Friday, May 08, 2015 7:56 PM
> To: Mehresh Ramneek-B31383
> Cc: linux-kernel@vger.kernel.org; ba...@ti.com; linux-...@vger.kernel.org;
> gre...@linuxfoundation.org
> Subject: RE: [PATCH 02/10] drivers:host:fsl: Use ehci_overrides structure for
> EHCI drv
> 
> On Fri, 8 May 2015, Ramneek Mehresh wrote:
> 
> > > On Thu, 7 May 2015, Ramneek Mehresh wrote:
> > >
> > > > Make use of ehci_driver_overrides structure for ehci-fsl driver
> > > >
> > > > Signed-off-by: Ramneek Mehresh
> 
> > >
> > > > You need to change a lot more than this.  See commit a76dd463c58e
> > > > (USB: EHCI: make ehci-orion a separate driver) as an example of what
> > > > is needed.  In the end, ehci-fsl.ko should be a new driver module, not
> > > > compiled into ehci-hcd.ko.
> > >
> > I can definitely make this change, but this patch set is about OTG
> > functionality fix for all FSL QorIQ socs. Changes you are asking are
> > for FSL Host driver. For that I can float separate patch/patch set.
> > > Hence, I would request you to please accept the Patch series in context
> > of OTG functionality fix
> 
> Accept which patch series?  You have posted several different versions (and
> you failed to put the version numbers in the email Subject:
> lines, so I can't tell which patches belong to which version!).
> 
> I am not going to accept something that does a partial job of converting
> ehci-fsl into a separate module.  Either do the entire conversion first, as a
> separate patch that is prerequisite to the OTG fix, or else fix the OTG stuff
> in a way that doesn't require ehci-fsl to be a separate module.
> 
Ok, then let me first fix the USB host driver as per expectation.
I'll then re-send the OTG patch set on top of that. I don't want to intermingle
these two separate functionalities. Thanks.

> Alan Stern


linux-next: build failure after merge of the tty tree

2015-05-10 Thread Stephen Rothwell

Hi Greg,

After merging the tty tree, today's linux-next build (arm multi_v7_defconfig)
failed like this:

drivers/tty/serial/amba-pl011.c: In function 'pl011_startup':
drivers/tty/serial/amba-pl011.c:1582:5: error: 'struct uart_amba_port' has no member named 'tx_irq_seen'
  uap->tx_irq_seen = 0;
 ^

Caused by a mismerge between patch 'Revert "serial/amba-pl011: Leave
the TX IRQ alone when the UART is not open"' (that appears in Linus' tree
and the tty tree) and commit 1e84d22322ce ("serial/amba-pl011: Refactor
and simplify TX FIFO handling") from the tty tree.

I applied this merge fix patch:

From: Stephen Rothwell 
Date: Mon, 11 May 2015 15:12:50 +1000
Subject: [PATCH] serial/amba-pl011: fix mismerge

Signed-off-by: Stephen Rothwell 
---
 drivers/tty/serial/amba-pl011.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 6fabc059efed..f5bd8426fd75 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -1578,9 +1578,6 @@ static int pl011_startup(struct uart_port *port)
 
writew(uap->vendor->ifls, uap->port.membase + UART011_IFLS);
 
-   /* Assume that TX IRQ doesn't work until we see one: */
-   uap->tx_irq_seen = 0;
-
spin_lock_irq(&uap->port.lock);
 
/* restore RTS and DTR */
-- 
2.1.4

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au



Re: [PATCH 16/18] f2fs crypto: add symlink encryption

2015-05-10 Thread Jaegeuk Kim

On Sat, May 09, 2015 at 05:25:54AM +0100, Al Viro wrote:
> On Fri, May 08, 2015 at 09:20:51PM -0700, Jaegeuk Kim wrote:
> > This patch implements encryption support for symlink.
> > 
> > The code follows the ext4 symlink path.
> 
> ext4 symlink patches are seriously misguided - don't mix encrypted and
> unencrypted cases in the same inode_operations.

Ok. Let me split them into separate inode_operations.

Thanks,

> 
> NAK.

[RFC 2/2] ARM: vf610: Add SoC bus support for Vybrid

2015-05-10 Thread Sanchayan Maity

Implements SoC bus support to export SoC specific information. Read
the unique SoC ID from the Vybrid On Chip One Time Programmable (OCOTP)
controller, SoC specific information from the Miscellaneous System
Control Module (MSCM), revision from the ROM revision register and
expose it via the SoC bus infrastructure.

Signed-off-by: Sanchayan Maity 
---
 arch/arm/mach-imx/mach-vf610.c | 76 +-
 1 file changed, 75 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mach-imx/mach-vf610.c b/arch/arm/mach-imx/mach-vf610.c
index 1ba7738..74681f1 100644
--- a/arch/arm/mach-imx/mach-vf610.c
+++ b/arch/arm/mach-imx/mach-vf610.c
@@ -9,13 +9,87 @@
 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include "common.h"
 
+#define OCOTP_CFG0_OFFSET  0x0410
+#define OCOTP_CFG1_OFFSET  0x0420
+#define MSCM_CPxCOUNT_OFFSET   0x002C
+#define MSCM_CPxCFG1_OFFSET    0x0014
+#define ROM_REVISION_REGISTER  0x0080
+
 static void __init vf610_init_machine(void)
 {
-   of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
+   struct regmap *ocotp_regmap, *mscm_regmap;
+   struct soc_device_attribute *soc_dev_attr;
+   struct device *parent = NULL;
+   struct soc_device *soc_dev;
+   char soc_type[] = "xx0";
+   void __iomem *rom_rev;
+   u32 cpxcount, cpxcfg1;
+   u32 soc_id1, soc_id2;
+   u64 soc_id;
+
+   ocotp_regmap = syscon_regmap_lookup_by_compatible("fsl,vf610-ocotp");
+   if (IS_ERR(ocotp_regmap)) {
+   pr_err("regmap lookup for octop failed\n");
+   goto out;
+   }
+
+   mscm_regmap = syscon_regmap_lookup_by_compatible("fsl,vf610-mscm-cpucfg");
+   if (IS_ERR(mscm_regmap)) {
+   pr_err("regmap lookup for mscm failed");
+   goto out;
+   }
+
+   regmap_read(ocotp_regmap, OCOTP_CFG0_OFFSET, &soc_id1);
+   regmap_read(ocotp_regmap, OCOTP_CFG1_OFFSET, &soc_id2);
+
+   soc_id = (u64) soc_id1 << 32 | soc_id2;
+   add_device_randomness(&soc_id, sizeof(soc_id));
+
+   regmap_read(mscm_regmap, MSCM_CPxCOUNT_OFFSET, &cpxcount);
+   regmap_read(mscm_regmap, MSCM_CPxCFG1_OFFSET, &cpxcfg1);
+
+   soc_type[0] = cpxcount ? '6' : '5'; /* Dual Core => VF6x0 */
+   soc_type[1] = cpxcfg1 ? '1' : '0'; /* L2 Cache => VFx10 */
+
+   soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL);
+   if (!soc_dev_attr)
+   goto out;
+
+   soc_dev_attr->machine = kasprintf(GFP_KERNEL, "Freescale Vybrid");
+   soc_dev_attr->soc_id = kasprintf(GFP_KERNEL, "%llx", soc_id);
+   soc_dev_attr->family = kasprintf(GFP_KERNEL, "Freescale Vybrid VF%s",
+soc_type);
+
+   rom_rev = ioremap(ROM_REVISION_REGISTER, SZ_1);
+   if (rom_rev)
+   soc_dev_attr->revision = kasprintf(GFP_KERNEL, "%08x",
+   readl(rom_rev));
+
+   soc_dev = soc_device_register(soc_dev_attr);
+   if (IS_ERR(soc_dev)) {
+   if (!rom_rev)
+   kfree(soc_dev_attr->revision);
+   kfree(soc_dev_attr->family);
+   kfree(soc_dev_attr->soc_id);
+   kfree(soc_dev_attr->machine);
+   kfree(soc_dev_attr);
+   goto out;
+   }
+
+   parent = soc_device_to_device(soc_dev);
+
+out:
+   of_platform_populate(NULL, of_default_bus_match_table, NULL, parent);
vf610_pm_init();
 }
 
-- 
2.4.0
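
The ID and family-string logic in the patch above can be exercised outside the kernel. The following is a userspace sketch only: toy_soc_id() and toy_family() are hypothetical helpers, the register values passed in are illustrative inputs rather than readings from real hardware, and snprintf() stands in for kasprintf().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Combine the two 32-bit OCOTP words the same way the patch does. */
static uint64_t toy_soc_id(uint32_t cfg0, uint32_t cfg1)
{
	return (uint64_t)cfg0 << 32 | cfg1;
}

/* Build the family string from the two MSCM register values. */
static void toy_family(uint32_t cpxcount, uint32_t cpxcfg1,
		       char *buf, size_t len)
{
	char soc_type[] = "xx0";

	soc_type[0] = cpxcount ? '6' : '5'; /* Dual Core => VF6x0 */
	soc_type[1] = cpxcfg1 ? '1' : '0';  /* L2 Cache => VFx10 */
	snprintf(buf, len, "Freescale Vybrid VF%s", soc_type);
}
```

Note that the cast to a 64-bit type has to happen before the shift; shifting a 32-bit value left by 32 would not give the intended result.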


[RFC 1/2] ARM: dts: vfxxx: Add OCOTP node

2015-05-10 Thread Sanchayan Maity

Add a device tree node for the On-Chip One Time Programmable
Controller (OCOTP).

Signed-off-by: Sanchayan Maity 
---
 arch/arm/boot/dts/vfxxx.dtsi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/vfxxx.dtsi b/arch/arm/boot/dts/vfxxx.dtsi
index 5857f60..a79776e 100644
--- a/arch/arm/boot/dts/vfxxx.dtsi
+++ b/arch/arm/boot/dts/vfxxx.dtsi
@@ -487,6 +487,11 @@
status = "disabled";
};
 
+   ocotp: ocotp@400a5000 {
+  compatible = "fsl,vf610-ocotp", "syscon";
+  reg = <0x400a5000 0x1000>;
+   };
+
snvs0: snvs@400a7000 {
compatible = "fsl,sec-v4.0-mon", "simple-bus";
#address-cells = <1>;
-- 
2.4.0


[RFC 0/2] Implement SoC bus support for Vybrid

2015-05-10 Thread Sanchayan Maity

Hello,

Currently this patchset is based on our local branch, but I would like
some comments before I push this to mainline through Shawn's tree.

This patchset implements the following
https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-soc

Currently the required information is more or less read across the whole 
SoC, but I guess we cannot change that since these are the locations 
with the required information.

There seem to be three options for the revision field:
- ROM revision (see https://community.freescale.com/docs/DOC-94802)
- ANADIG revision (ANADIG_DIGIPROC, as used for the i.MX SoCs)
- OCOTP revision

Some numbers:

Colibri VF61 1.1A (2N02G)
- 0x0013
- 0x0061
- 0x0100
- 0x41c8

Colibri VF61 V1.0B (1N02G)
- 0x0011
- 0x0061
- 0x0100
- 0x41c8

Colibri VF61 V1.0A (which is actually a VF600 SoC, no L2 cache, since
that was the only one we could buy back then, 1N02G printed on it)
- 0x0011
- 0x0061
- 0x0100
- none...

Colibri VF50 V1.0A (1N02G)
- 0x0011
- 0x0061
- 0x0100
- none...

Vybrid Tower Rev J (1N02G)
- 0x0011
- 0x0061
- 0x0100
- 0x41c8

Read from u-boot
md.l 0x80 1
md.l 0x40050260 1
md.l 0x400A5090 1


The ROM revision seems to differ most. So we would like to go with the 
revision from the ROM register 0x80.

Now coming to the primary question. This ROM revision register is not 
really within any of the peripheral maps and I would like to access it 
for the versioning information. Currently, I used ioremap like below

ioremap(ROM_REVISION_REGISTER, SZ_1);

which I guess probably is not the right way to do it. What would be the 
correct or better way to do this? 

Comments or feedback on any of the other parts of the patch are
also welcome.

Some Sample outputs are below:
On Colibri VF61 V1.1A:
root@colibri-vf:/sys/devices/soc0# ls
backlight  fxosc      regulators  sound          uevent
bl_on      gpio-keys  revision    subsystem
clk16m     machine    soc         sxosc
family     power      soc_id      syscon-reboot
root@colibri-vf:/sys/devices/soc0# cat revision
0013
root@colibri-vf:/sys/devices/soc0# cat soc_id
dbc8435c211629d4
root@colibri-vf:/sys/devices/soc0# cat family
Freescale Vybrid VF610

On Colibri VF50 V1.1A:
root@colibri-vf:/sys/devices/soc0# ls
backlight   machine     subsystem
bl_on       power       sxosc
clk16m      regulators  syscon-reboot
family      revision    toradex,vf50_touchctrl
fxosc       soc         uevent
gpio-keys   soc_id
root@colibri-vf:/sys/devices/soc0# cat revision
0013
root@colibri-vf:/sys/devices/soc0# cat soc_id
df63c12a2e2161d4
root@colibri-vf:/sys/devices/soc0# cat family
Freescale Vybrid VF500
root@colibri-vf:/sys/devices/soc0# cat machine
Freescale Vybrid

Thanks & Regards,
Sanchayan Maity.

Sanchayan Maity (2):
  ARM: dts: vfxxx: Add OCOTP node
  ARM: vf610: Add SoC bus support for Vybrid

 arch/arm/boot/dts/vfxxx.dtsi   |  5 +++
 arch/arm/mach-imx/mach-vf610.c | 76 +-
 2 files changed, 80 insertions(+), 1 deletion(-)

-- 
2.4.0


Re: [f2fs-dev] [PATCH 14/18] f2fs crypto: add filename encryption for f2fs_lookup

2015-05-10 Thread Jaegeuk Kim

Hi Hujianyang,

I just fixed it up, and rebased the dev branch.
Could you check them out?

Thanks,

On Mon, May 11, 2015 at 10:52:48AM +0800, hujianyang wrote:
> Hi Jaegeuk,
> 
> While compiling git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git
> branch dev (commit 5af6e74892a f2fs crypto: remove checking key context during lookup),
> I saw an error:
> 
> fs/f2fs/dir.c: In function ‘find_in_level’:
> fs/f2fs/dir.c:163: error: unknown field ‘len’ specified in initializer
> fs/f2fs/dir.c:163: warning: excess elements in struct initializer
> fs/f2fs/dir.c:163: warning: (near initialization for ‘name’)
> 
> I think it's related to this patch.
> If there is anything wrong in my configuration, please let me know.
> 
> Thanks,
> Hu
> 
> 
> 
> On 2015/5/9 12:20, Jaegeuk Kim wrote:
> > This patch implements filename encryption support for f2fs_lookup.
> > 
> > Note that, f2fs_find_entry should be outside of f2fs_(un)lock_op().
> > 
> > Signed-off-by: Jaegeuk Kim 
> > ---
> >  fs/f2fs/dir.c| 79 
> > 
> >  fs/f2fs/f2fs.h   |  9 ---
> >  fs/f2fs/inline.c |  9 ---
> >  3 files changed, 56 insertions(+), 41 deletions(-)
> > 
> > diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
> > index ab6455d..5e10d9d 100644
> > --- a/fs/f2fs/dir.c
> > +++ b/fs/f2fs/dir.c
> > @@ -76,20 +76,10 @@ static unsigned long dir_block_index(unsigned int level,
> > return bidx;
> >  }
> >  
> > -static bool early_match_name(size_t namelen, f2fs_hash_t namehash,
> > -   struct f2fs_dir_entry *de)
> > -{
> > -   if (le16_to_cpu(de->name_len) != namelen)
> > -   return false;
> > -
> > -   if (de->hash_code != namehash)
> > -   return false;
> > -
> > -   return true;
> > -}
> > -
> >  static struct f2fs_dir_entry *find_in_block(struct page *dentry_page,
> > -   struct qstr *name, int *max_slots,
> > +   struct f2fs_filename *fname,
> > +   f2fs_hash_t namehash,
> > +   int *max_slots,
> > struct page **res_page)
> >  {
> > struct f2fs_dentry_block *dentry_blk;
> > @@ -99,8 +89,7 @@ static struct f2fs_dir_entry *find_in_block(struct page *dentry_page,
> > dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);
> >  
> > make_dentry_ptr(NULL, &d, (void *)dentry_blk, 1);
> > -   de = find_target_dentry(name, max_slots, &d);
> > -
> > +   de = find_target_dentry(fname, namehash, max_slots, &d);
> > if (de)
> > *res_page = dentry_page;
> > else
> > @@ -114,13 +103,15 @@ static struct f2fs_dir_entry *find_in_block(struct page *dentry_page,
> > return de;
> >  }
> >  
> > -struct f2fs_dir_entry *find_target_dentry(struct qstr *name, int *max_slots,
> > -   struct f2fs_dentry_ptr *d)
> > +struct f2fs_dir_entry *find_target_dentry(struct f2fs_filename *fname,
> > +   f2fs_hash_t namehash, int *max_slots,
> > +   struct f2fs_dentry_ptr *d)
> >  {
> > struct f2fs_dir_entry *de;
> > unsigned long bit_pos = 0;
> > -   f2fs_hash_t namehash = f2fs_dentry_hash(name);
> > int max_len = 0;
> > +   struct f2fs_str de_name = FSTR_INIT(NULL, 0);
> > +   struct f2fs_str *name = &fname->disk_name;
> >  
> > if (max_slots)
> > *max_slots = 0;
> > @@ -132,8 +123,18 @@ struct f2fs_dir_entry *find_target_dentry(struct qstr *name, int *max_slots,
> > }
> >  
> > de = &d->dentry[bit_pos];
> > -   if (early_match_name(name->len, namehash, de) &&
> > -   !memcmp(d->filename[bit_pos], name->name, name->len))
> > +
> > +   /* encrypted case */
> > +   de_name.name = d->filename[bit_pos];
> > +   de_name.len = le16_to_cpu(de->name_len);
> > +
> > +   /* show encrypted name */
> > +   if (fname->hash) {
> > +   if (de->hash_code == fname->hash)
> > +   goto found;
> > +   } else if (de_name.len == name->len &&
> > +   de->hash_code == namehash &&
> > +   !memcmp(de_name.name, name->name, name->len))
> > goto found;
> >  
> > if (max_slots && max_len > *max_slots)
> > @@ -155,16 +156,21 @@ found:
> >  }
> >  
> >  static struct f2fs_dir_entry *find_in_level(struct inode *dir,
> > -   unsigned int level, struct qstr *name,
> > -   f2fs_hash_t namehash, struct page **res_page)
> > +   unsigned int level,
> > +   struct f2fs_filename *fname,
> > +   struct page **res_page)
> >  {
> > -   int s = GET_DENTRY_SLOTS(name->len);
> > +   struct qstr name = FSTR_TO_QSTR(&fname->disk_name);
> > +   int s = GET_DENTRY_SLOTS(name.len);
> > unsigned int nbucket, nblock;
> > unsigned int

Re: [PATCH v2] dmaengine: bcm2835: Add slave dma support

2015-05-10 Thread Martin Sperl

> On 08.05.2015, at 13:20, Jonathan Bell  wrote:
>> 
> I agree that the interrupt generated would be spurious - in the case where it
> is not required.
>
> However if you do && (flags & DMA_PREP_INTERRUPT) then all users of this
> driver need to explicitly set interrupt flags when doing a scatter-gather
> transfer. As I understand it, currently the only upstream client of this
> driver is the I2S driver which only uses cyclic anyway.
>
> Not requiring an interrupt on completion is a bit of an edge case - the
> default among other dmaengine drivers appears to be to enable interrupts
> unconditionally.

I have now submitted a patch for spi-bcm2835 to make use of dma,
so there is one candidate for this kind of behavior.
So please go forward with the merge.

Also note that with the spi-HW dma support of the bcm2835
it is necessary to do a RX transfer even if the data is not
used (similar for TX).

Right now we have to allocate a dummy buffer to run this kind
of “one-way” transfer where we need 2 DMA channels.

The bcm2835 dma hw supports such dummy-transfer modes via 
BCM2835_DMA_S_IGNORE and BCM2835_DMA_D_IGNORE.

So maybe we can add a “flag” to the dmaengine_prep_slave_sg
that will allow this kind of behavior to be implemented?

That is not a necessity, but would be a welcome improvement.
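
To make the two interrupt policies discussed in the quoted text concrete, here is a small userspace sketch. The TOY_* constants and function names are hypothetical and are not taken from the bcm2835 driver or the dmaengine headers; the sketch only contrasts "always interrupt on the last block" with "interrupt only when the client asked for it".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch; not taken from the driver. */
#define TOY_PREP_INTERRUPT  (1u << 0)  /* client asks for a completion IRQ */
#define TOY_CB_INT_EN       (1u << 0)  /* control-block bit: raise an IRQ */

/* Policy A: always interrupt on the last control block of a transfer. */
static uint32_t toy_cb_info_always(bool last, uint32_t client_flags)
{
	(void)client_flags;
	return last ? TOY_CB_INT_EN : 0;
}

/* Policy B: interrupt on the last block only if the client asked for it. */
static uint32_t toy_cb_info_on_request(bool last, uint32_t client_flags)
{
	return (last && (client_flags & TOY_PREP_INTERRUPT)) ?
		TOY_CB_INT_EN : 0;
}
```

Under policy B a client that forgets to pass the flag never gets a completion interrupt, which is the concern raised in the quoted reply.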

Tested-by: Martin Sperl 


Re: [PATCH kernel v9 26/32] powerpc/iommu: Add userspace view of TCE table

2015-05-10 Thread Alexey Kardashevskiy


On 05/11/2015 12:11 PM, Alexey Kardashevskiy wrote:

On 05/05/2015 10:02 PM, David Gibson wrote:

On Fri, May 01, 2015 at 05:12:45PM +1000, Alexey Kardashevskiy wrote:

On 05/01/2015 02:23 PM, David Gibson wrote:

On Fri, May 01, 2015 at 02:01:17PM +1000, Alexey Kardashevskiy wrote:

On 04/29/2015 04:31 PM, David Gibson wrote:

On Sat, Apr 25, 2015 at 10:14:50PM +1000, Alexey Kardashevskiy wrote:

In order to support memory pre-registration, we need a way to track
the use of every registered memory region and only allow unregistration
if a region is not in use anymore. So we need a way to tell from what
region the just cleared TCE was from.

This adds a userspace view of the TCE table into iommu_table struct.
It contains userspace address, one per TCE entry. The table is only
allocated when the ownership over an IOMMU group is taken which means
it is only used from outside of the powernv code (such as VFIO).

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v9:
* fixed code flow in error cases added in v8

v8:
* added ENOMEM on failed vzalloc()
---
  arch/powerpc/include/asm/iommu.h  |  6 ++
  arch/powerpc/kernel/iommu.c   | 18 ++
  arch/powerpc/platforms/powernv/pci-ioda.c | 22 --
  3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 7694546..1472de3 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -111,9 +111,15 @@ struct iommu_table {
  unsigned long *it_map;   /* A simple allocation bitmap for now */
  unsigned long  it_page_shift;/* table iommu page size */
  struct iommu_table_group *it_table_group;
+unsigned long *it_userspace; /* userspace view of the table */


A single unsigned long doesn't seem like enough.


Why single? This is an array.


As in single per page.



Sorry, I am not following you here.
It is per IOMMU page. MAP/UNMAP work with IOMMU pages which are fully backed
with either a system page or a huge page.





How do you know
which process's address space this address refers to?


It is a current task. Multiple userspaces cannot use the same
container/tables.


Where is that enforced?



It is accessed from VFIO DMA map/unmap which are ioctls() to a container's
fd which is per a process.


Usually, but what enforces that.  If you open a container fd, then
fork(), and attempt to map from both parent and child, what happens?



vfio_group_fops::open() checks if the group is already opened, and I want
to believe open() is called from fork() for new fd so no mapping can happen
later.


I am wrong here. Nothing prevents multiple userspace processes from using the
same container. It still does not seem really dangerous: in order to use VFIO,
someone with root privileges has to set the right permissions on /dev/vfio*
first anyway, and that person knows what QEMU does and what QEMU does not :)


I could add a pid to iommu_table, next to it_userspace, and fail when another
pid tries to change the it_userspace table. I am not sure I want to do this
check in realmode though (performance). Or make sure somehow that fork()
closes the container and group fds (but how?). In the worst case, the wrong
userspace page will be put and there will be random backtraces in the host
kernel. What would you do?
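
The pid idea above can be modelled in a few lines. This is only a userspace
sketch, assuming a per-table owner field: struct table_model, owner and
table_set_entry() are hypothetical names, and only it_userspace comes from the
patch. A real check would also have to decide what identifies the owner (pid,
tgid or mm), which the sketch does not address.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the relevant part of struct iommu_table. */
struct table_model {
	unsigned long *it_userspace;	/* one userspace address per TCE entry */
	size_t nr_entries;
	pid_t owner;			/* 0 means "no owner recorded yet" */
};

/*
 * Record a userspace address for one entry, but only on behalf of the
 * process that first touched the table. Returns 0 or a negative errno.
 */
static inline int table_set_entry(struct table_model *tbl, pid_t caller,
				  size_t index, unsigned long uaddr)
{
	assert(tbl && tbl->it_userspace);

	if (index >= tbl->nr_entries)
		return -EINVAL;
	if (!tbl->owner)
		tbl->owner = caller;	/* first user claims the table */
	else if (tbl->owner != caller)
		return -EPERM;		/* e.g. a forked child */

	tbl->it_userspace[index] = uaddr;
	return 0;
}
```

With this shape, a second process is rejected before it can overwrite an
entry, at the cost of one extra comparison per update.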



--
Alexey

Re: [RFC PATCH] PM, freezer: Don't thaw when it's intended frozen processes

2015-05-10 Thread Kyungmin Park

On Sat, May 9, 2015 at 12:25 AM, Tejun Heo  wrote:
> Hello, Kyungmin.
>
> On Fri, May 08, 2015 at 09:04:26AM +0900, Kyungmin Park wrote:
>> > I need to think more about it but as an *optimization* we can add
>> > freezing() test before actually waking tasks up during resume, but can
>> > you please clarify what you're seeing?
>>
>> The mobile application has life cycle and one of them is 'suspend'
>> state. it's different from 'pause' or 'background'.
>> if there are some application and enter go 'suspend' state. all
>> behaviors are stopped and can't do anything. right it's suspended. but
>> after system suspend & resume, these application is thawed and
>> running. even though system know it's suspended.
>>
>> We made some test application, print out some message within infinite
>> loop. when it goes 'suspend' state. nothing is print out. but after
>> system suspend & resume, it prints out again. that's not desired
>> behavior. and want to address it.
>>
>> frozen user processes should be remained as frozen while system
>> suspend & resume.
>
> Yes, they should and I'm not sure why what you're saying is happening
> because freezing() test done from the frozen tasks themselves should
> keep them in the freezer.  Which kernel version did you test?  Can you
> please verify it against a recent kernel?

Kernel 3.10 does not work as expected, but you are right, the latest
kernel works correctly.

I see. I'll check what is different and what was modified.

Thank you,
Kyungmin Park

[PATCH v3 4/7] ACPI / processor: remove cpu_index in acpi_processor_get_info()

2015-05-10 Thread Hanjun Guo

Just use pr->id instead of cpu_index to simplify the code.

Signed-off-by: Hanjun Guo 
---
 drivers/acpi/acpi_processor.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index ac6bda0..0676b50 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -216,7 +216,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
struct acpi_processor *pr = acpi_driver_data(device);
phys_cpuid_t phys_id;
-   int cpu_index, device_declaration = 0;
+   int device_declaration = 0;
acpi_status status = AE_OK;
static int cpu0_initialized;
unsigned long long value;
@@ -268,18 +268,16 @@ static int acpi_processor_get_info(struct acpi_device *device)
acpi_handle_debug(pr->handle, "failed to get CPU physical ID.\n");
pr->phys_id = phys_id;
 
-   cpu_index = acpi_map_cpuid(pr->phys_id, pr->acpi_id);
+   pr->id = acpi_map_cpuid(pr->phys_id, pr->acpi_id);
if (!cpu0_initialized && !acpi_has_cpu_in_madt()) {
cpu0_initialized = 1;
/*
 * Handle UP system running SMP kernel, with no CPU
 * entry in MADT
 */
-   if (invalid_logical_cpuid(cpu_index)
-   && (num_online_cpus() == 1))
-   cpu_index = 0;
+   if (invalid_logical_cpuid(pr->id) && (num_online_cpus() == 1))
+   pr->id = 0;
}
-   pr->id = cpu_index;
 
/*
 *  Extra Processor objects may be enumerated on MP systems with
-- 
1.9.1


[PATCH v3 5/7] ACPI / processor: remove phys_id in acpi_processor_get_info()

2015-05-10 Thread Hanjun Guo

Use pr->phys_id to replace phys_id to simplify the code.

Signed-off-by: Hanjun Guo 
---
 drivers/acpi/acpi_processor.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 0676b50..62c846b 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -215,7 +215,6 @@ static int acpi_processor_get_info(struct acpi_device *device)
union acpi_object object = { 0 };
struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
struct acpi_processor *pr = acpi_driver_data(device);
-   phys_cpuid_t phys_id;
int device_declaration = 0;
acpi_status status = AE_OK;
static int cpu0_initialized;
@@ -263,10 +262,10 @@ static int acpi_processor_get_info(struct acpi_device *device)
pr->acpi_id = value;
}
 
-   phys_id = acpi_get_phys_id(pr->handle, device_declaration, pr->acpi_id);
-   if (phys_id == PHYS_CPUID_INVALID)
+   pr->phys_id = acpi_get_phys_id(pr->handle, device_declaration,
+   pr->acpi_id);
+   if (pr->phys_id == PHYS_CPUID_INVALID)
acpi_handle_debug(pr->handle, "failed to get CPU physical ID.\n");
-   pr->phys_id = phys_id;
 
pr->id = acpi_map_cpuid(pr->phys_id, pr->acpi_id);
if (!cpu0_initialized && !acpi_has_cpu_in_madt()) {
-- 
1.9.1


[PATCH v3 7/7] ACPI / processor: Introduce invalid_phys_cpuid()

2015-05-10 Thread Hanjun Guo

Introduce invalid_phys_cpuid() to identify a cpu with an invalid
physical ID, then use it as a replacement for the direct comparisons
with PHYS_CPUID_INVALID.

Signed-off-by: Hanjun Guo 
---
 drivers/acpi/acpi_processor.c | 4 ++--
 drivers/acpi/processor_core.c | 4 ++--
 include/linux/acpi.h          | 5 +++++
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 62c846b..92a5f73 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -170,7 +170,7 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
acpi_status status;
int ret;
 
-   if (pr->phys_id == PHYS_CPUID_INVALID)
+   if (invalid_phys_cpuid(pr->phys_id))
return -ENODEV;
 
status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
@@ -264,7 +264,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
 
pr->phys_id = acpi_get_phys_id(pr->handle, device_declaration,
pr->acpi_id);
-   if (pr->phys_id == PHYS_CPUID_INVALID)
+   if (invalid_phys_cpuid(pr->phys_id))
acpi_handle_debug(pr->handle, "failed to get CPU physical ID.\n");
 
pr->id = acpi_map_cpuid(pr->phys_id, pr->acpi_id);
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index fd4140d..33a38d6 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -184,7 +184,7 @@ phys_cpuid_t acpi_get_phys_id(acpi_handle handle, int type, u32 acpi_id)
phys_cpuid_t phys_id;
 
phys_id = map_mat_entry(handle, type, acpi_id);
-   if (phys_id == PHYS_CPUID_INVALID)
+   if (invalid_phys_cpuid(phys_id))
phys_id = map_madt_entry(type, acpi_id);
 
return phys_id;
@@ -196,7 +196,7 @@ int acpi_map_cpuid(phys_cpuid_t phys_id, u32 acpi_id)
int i;
 #endif
 
-   if (phys_id == PHYS_CPUID_INVALID) {
+   if (invalid_phys_cpuid(phys_id)) {
/*
 * On UP processor, there is no _MAT or MADT table.
 * So above phys_id is always set to PHYS_CPUID_INVALID.
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 913b49f..cc82ff3 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -163,6 +163,11 @@ static inline bool invalid_logical_cpuid(u32 cpuid)
return (int)cpuid < 0;
 }
 
+static inline bool invalid_phys_cpuid(phys_cpuid_t phys_id)
+{
+   return (int)phys_id < 0;
+}
+
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 /* Arch dependent functions for cpu hotplug support */
 int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, int *pcpu);
-- 
1.9.1


[PATCH v3 6/7] ACPI / processor: return specific error instead of -1

2015-05-10 Thread Hanjun Guo

Since invalid_logical_cpuid() can check for error values,
return a specific error code instead of -1 from acpi_map_cpuid().

Signed-off-by: Hanjun Guo 
---
 drivers/acpi/processor_core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index b1ec78b..fd4140d 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -215,12 +215,12 @@ int acpi_map_cpuid(phys_cpuid_t phys_id, u32 acpi_id)
 * Ignores phys_id and always returns 0 for the processor
 * handle with acpi id 0 if nr_cpu_ids is 1.
 * This should be the case if SMP tables are not found.
-* Return -1 for other CPU's handle.
+* Return -EINVAL for other CPU's handle.
 */
if (nr_cpu_ids <= 1 && acpi_id == 0)
return acpi_id;
else
-   return -1;
+   return -EINVAL;
}
 
 #ifdef CONFIG_SMP
@@ -233,7 +233,7 @@ int acpi_map_cpuid(phys_cpuid_t phys_id, u32 acpi_id)
if (phys_id == 0)
return phys_id;
 #endif
-   return -1;
+   return -ENODEV;
 }
 
 int acpi_get_cpuid(acpi_handle handle, int type, u32 acpi_id)
-- 
1.9.1
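
To illustrate the point of this change: once callers test the result with a
sign check, a mapping function is free to return different negative errno
values for different failures. The following is a self-contained userspace
sketch, not the kernel function. map_cpuid_model() and its parameters are
hypothetical and only loosely mirror the control flow of acpi_map_cpuid().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same sign test as the helper introduced in patch 1/7. */
static inline bool invalid_logical_cpuid(uint32_t cpuid)
{
	return (int)cpuid < 0;
}

/*
 * Hypothetical model: map a physical id to a logical cpu number.
 * phys_valid says whether firmware supplied a usable physical id,
 * map[] holds the physical ids of the nr_cpus logical cpus.
 */
static inline int map_cpuid_model(bool phys_valid, uint32_t phys_id,
				  uint32_t acpi_id, const uint32_t *map,
				  int nr_cpus)
{
	assert(nr_cpus >= 0);

	if (!phys_valid) {
		/* UP case: only the handle with acpi id 0 maps to cpu 0. */
		if (nr_cpus <= 1 && acpi_id == 0)
			return 0;
		return -EINVAL;
	}

	for (int i = 0; i < nr_cpus; i++)
		if (map[i] == phys_id)
			return i;

	return -ENODEV;		/* valid id, but no matching cpu */
}
```

A caller can still treat every failure alike through invalid_logical_cpuid(),
or look at the specific value when it wants to report why the lookup failed.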


[PATCH v3 3/7] Xen / ACPI / processor: Remove unneeded NULL check

2015-05-10 Thread Hanjun Guo

Before xen_acpi_processor_enable() is called, struct acpi_processor *pr is
allocated in xen_acpi_processor_add() and checked for NULL, so there is no
need to check it again in xen_acpi_processor_enable(). Just remove the check.

Signed-off-by: Hanjun Guo 
CC: Boris Ostrovsky 
CC: Stefano Stabellini 
---
 drivers/xen/xen-acpi-cpuhotplug.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/xen/xen-acpi-cpuhotplug.c b/drivers/xen/xen-acpi-cpuhotplug.c
index 5a62aa0..f4a3694 100644
--- a/drivers/xen/xen-acpi-cpuhotplug.c
+++ b/drivers/xen/xen-acpi-cpuhotplug.c
@@ -46,13 +46,7 @@ static int xen_acpi_processor_enable(struct acpi_device *device)
unsigned long long value;
union acpi_object object = { 0 };
struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
-   struct acpi_processor *pr;
-
-   pr = acpi_driver_data(device);
-   if (!pr) {
-   pr_err(PREFIX "Cannot find driver data\n");
-   return -EINVAL;
-   }
+   struct acpi_processor *pr = acpi_driver_data(device);
 
if (!strcmp(acpi_device_hid(device), ACPI_PROCESSOR_OBJECT_HID)) {
/* Declared with "Processor" statement; match ProcessorID */
-- 
1.9.1


[PATCH v3 1/7] ACPI / processor: Introduce invalid_logical_cpuid()

2015-05-10 Thread Hanjun Guo

In ACPI processor drivers, we use direct comparisons of the cpu logical
id with -1. These are error prone if the logical cpuid is accidentally
assigned an error code, and they prevent us from returning an error-encoding
cpuid directly in some cases.

So introduce invalid_logical_cpuid() to identify a cpu with an invalid
logical cpu number; it will then be used to replace the direct comparisons
with -1.

Signed-off-by: Hanjun Guo 
---
 drivers/acpi/acpi_processor.c | 5 +++--
 drivers/acpi/processor_pdc.c  | 5 +----
 include/linux/acpi.h          | 5 +++++
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 58f335c..ac6bda0 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -275,7 +275,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
 * Handle UP system running SMP kernel, with no CPU
 * entry in MADT
 */
-   if ((cpu_index == -1) && (num_online_cpus() == 1))
+   if (invalid_logical_cpuid(cpu_index)
+   && (num_online_cpus() == 1))
cpu_index = 0;
}
pr->id = cpu_index;
@@ -285,7 +286,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
 *  less than the max # of CPUs. They should be ignored _iff
 *  they are physically not present.
 */
-   if (pr->id == -1) {
+   if (invalid_logical_cpuid(pr->id)) {
int ret = acpi_processor_hotadd_init(pr);
if (ret)
return ret;
diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c
index e5dd808..7cfbda4 100644
--- a/drivers/acpi/processor_pdc.c
+++ b/drivers/acpi/processor_pdc.c
@@ -52,10 +52,7 @@ static bool __init processor_physically_present(acpi_handle handle)
type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0;
cpuid = acpi_get_cpuid(handle, type, acpi_id);
 
-   if (cpuid == -1)
-   return false;
-
-   return true;
+   return !invalid_logical_cpuid(cpuid);
 }
 
 static void acpi_set_pdc_bits(u32 *buf)
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index e4da5e3..913b49f 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -158,6 +158,11 @@ typedef u32 phys_cpuid_t;
 #define PHYS_CPUID_INVALID (phys_cpuid_t)(-1)
 #endif
 
+static inline bool invalid_logical_cpuid(u32 cpuid)
+{
+   return (int)cpuid < 0;
+}
+
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 /* Arch dependent functions for cpu hotplug support */
 int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, int *pcpu);
-- 
1.9.1


[PATCH v3 2/7] Xen / ACPI / processor: use invalid_logical_cpuid()

2015-05-10 Thread Hanjun Guo

Use invalid_logical_cpuid(pr->id) instead of direct comparison.

Signed-off-by: Hanjun Guo 
CC: Boris Ostrovsky 
CC: Stefano Stabellini 
---
 drivers/xen/xen-acpi-cpuhotplug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/xen-acpi-cpuhotplug.c b/drivers/xen/xen-acpi-cpuhotplug.c
index 3e62ee4..5a62aa0 100644
--- a/drivers/xen/xen-acpi-cpuhotplug.c
+++ b/drivers/xen/xen-acpi-cpuhotplug.c
@@ -77,7 +77,7 @@ static int xen_acpi_processor_enable(struct acpi_device *device)
 
pr->id = xen_pcpu_id(pr->acpi_id);
 
-   if ((int)pr->id < 0)
+   if (invalid_logical_cpuid(pr->id))
/* This cpu is not presented at hypervisor, try to hotadd it */
if (ACPI_FAILURE(xen_acpi_cpu_hotadd(pr))) {
pr_err(PREFIX "Hotadd CPU (acpi_id = %d) failed.\n",
@@ -226,7 +226,7 @@ static acpi_status xen_acpi_cpu_hotadd(struct acpi_processor *pr)
return AE_ERROR;
 
pr->id = xen_hotadd_cpu(pr);
-   if ((int)pr->id < 0)
+   if (invalid_logical_cpuid(pr->id))
return AE_ERROR;
 
/*
-- 
1.9.1


[PATCH v3 0/7] minor cleanups for ACPI processor driver

2015-05-10 Thread Hanjun Guo

This patch set contains some minor cleanups for the ACPI processor driver
to address the comments raised by Rafael on the ARM64 ACPI core
patches.

Boris and Konrad already reviewed the XEN/ACPI part, and they seem to be
OK with the changes for XEN [1] [2].

Sudeep, I didn't change (int)phys_id < 0 to phys_id == PHYS_CPUID_INVALID,
as they have the same effect and Rafael prefers the former,
as he mentioned in the ACPI core patch set comments. PHYS_CPUID_INVALID
is still needed for the phys_cpuid_t typedef in the ACPI core, so if you are
not OK with it, please let me know.
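
For reference, the two checks can be compared side by side. This is a
userspace sketch assuming the 32-bit phys_cpuid_t definition visible in the
context of patch 1/7; the helper names invalid_by_sign() and
invalid_by_sentinel() are made up for the comparison. For the sentinel value
itself both checks agree. They would only differ if some other value with the
top bit set, such as a negative errno, were ever stored in a phys_cpuid_t.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t phys_cpuid_t;	/* the 32-bit definition quoted in patch 1/7 */
#define PHYS_CPUID_INVALID ((phys_cpuid_t)(-1))

/* Sign test, the form kept in this series. */
static inline bool invalid_by_sign(phys_cpuid_t id)
{
	return (int)id < 0;
}

/* Sentinel comparison, the alternative that was suggested. */
static inline bool invalid_by_sentinel(phys_cpuid_t id)
{
	return id == PHYS_CPUID_INVALID;
}
```

The conversion of an out-of-range unsigned value to int is
implementation-defined in ISO C. The sketch relies on the wrap-around
behaviour that GCC documents, which is also what the kernel assumes.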

v3:
 - Reorder the patches so that "Introduce invalid_logical_cpuid()"
   comes first, to avoid confusion.
 - Replace "return invalid_logical_cpuid(cpuid) ? false : true;"
   with "return !invalid_logical_cpuid(cpuid);"
 - Some updates to the patch subject and changelog
 - Rebase on top of 4.1-rc3

v2:
 - rebased on top of 4.1-rc2

[1]: https://lkml.org/lkml/2015/5/5/733
[2]: https://lkml.org/lkml/2015/5/9/222

Hanjun Guo (7):
  ACPI / processor: Introduce invalid_logical_cpuid()
  Xen / ACPI / processor: use invalid_logical_cpuid()
  Xen / ACPI / processor: Remove unneeded NULL check
  ACPI / processor: remove cpu_index in acpi_processor_get_info()
  ACPI / processor: remove phys_id in acpi_processor_get_info()
  ACPI / processor: return specific error instead of -1
  ACPI / processor: Introduce invalid_phys_cpuid()

 drivers/acpi/acpi_processor.c | 20 +---
 drivers/acpi/processor_core.c | 10 +-
 drivers/acpi/processor_pdc.c  |  5 +
 drivers/xen/xen-acpi-cpuhotplug.c | 12 +++-
 include/linux/acpi.h  | 10 ++
 5 files changed, 28 insertions(+), 29 deletions(-)

-- 
1.9.1


Re: [PATCH 0/2] mmc: cast to avoid unexpected error

2015-05-10 Thread Kuninori Morimoto


Hi Ulf

> > Kuninori Morimoto (2):
> >   mmc: cast u8 to unsigned long long to avoid unexpected error
> >   mmc: cast unsigned int to typeof(sector_t) to avoid unexpected error
(snip)
> Please run another round of checkpatch for these.

Thank you. Will do in v2.


Re: Pointer Cast to Different Type Warnings in sh_mmcif.c

2015-05-10 Thread Kuninori Morimoto


Hi Nick

Thank you for your feedback.

> Greetings Ulf and others,
> I am getting the below build warnings on the latest version of Linus's tree:
> drivers/mmc/host/sh_mmcif.c: In function ‘sh_mmcif_request_dma_one’:
> drivers/mmc/host/sh_mmcif.c:401:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
>  (void *)pdata->slave_id_tx :
>  ^
> drivers/mmc/host/sh_mmcif.c:402:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
>  (void *)pdata->slave_id_rx;
> After reading and tracing this issue it seems to be a cast for a u64 pointer
> to an unsigned in the following statement:
> 
>  if (pdata)
> slave_data = direction == DMA_MEM_TO_DEV ?
>  (void *)pdata->slave_id_tx :
> (void *)pdata->slave_id_rx;
> I am asking you as the maintainer whether this warning requires refactoring
> the structure definition of sh_mmcif_plat_data to use a u64 data type.
> Please let me know if I should consider refactoring the structure
> to clean up these warnings.

I guess this warning comes from
5f48dd0690cbcea3f35b9ef2f05d5468dedc80b0
(mmc: sh_mmcif: remove slave_id settings for DMAEngine)

I didn't check, but is the cast issue solved by this?

-   (void *)pdata->slave_id_tx :
-   (void *)pdata->slave_id_rx;
+   (void *)(unsigned long)pdata->slave_id_tx :
+   (void *)(unsigned long)pdata->slave_id_rx;
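
The reason the intermediate cast helps can be shown outside the driver. This
is a userspace sketch, assuming the slave id fields are 32-bit integers and
pointers are 64 bits wide, which is the combination that triggers the warning.
struct plat_data_model and the helper names are hypothetical. uintptr_t is
used here as the portable spelling of what the patch writes as unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical platform data with integer slave ids. */
struct plat_data_model {
	unsigned int slave_id_tx;
	unsigned int slave_id_rx;
};

/*
 * Widen to a pointer-sized integer first, then convert to a pointer.
 * The direct cast from a narrower integer is what the compiler flags.
 */
static inline void *slave_id_to_ptr(unsigned int id)
{
	return (void *)(uintptr_t)id;
}

/* Reverse direction, as the code receiving the pointer would do. */
static inline unsigned int ptr_to_slave_id(void *p)
{
	return (unsigned int)(uintptr_t)p;
}

/* Mirrors the ternary from the warning: tx id for memory-to-device. */
static inline void *pick_slave_data(const struct plat_data_model *pdata,
				    int to_device)
{
	return to_device ? slave_id_to_ptr(pdata->slave_id_tx)
			 : slave_id_to_ptr(pdata->slave_id_rx);
}
```

The value carried in the pointer is unchanged; only the explicit widening step
is new, so the compiler no longer sees a size mismatch.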


Best regards
---
Kuninori Morimoto

Re: [PATCH 0/3] cpuidle: updates related to tick_broadcast_enter() failures

2015-05-10 Thread Preeti U Murthy

On 05/10/2015 04:45 AM, Rafael J. Wysocki wrote:
> On Saturday, May 09, 2015 10:33:05 PM Rafael J. Wysocki wrote:
>> On Saturday, May 09, 2015 10:11:41 PM Rafael J. Wysocki wrote:
>>> On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
>>>> Hi Rafael,
>>>>
>>>> On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:
>>>
>>> [cut]
>>>
>>>>>>  
>>>>>> +/* Take note of the planned idle state. */
>>>>>> +idle_set_state(smp_processor_id(), target_state);
>>>>>
>>>>> And I wouldn't do this either.
>>>>>
>>>>> The behavior here is pretty much as though the driver demoted the state chosen
>>>>> by the governor and we don't call idle_set_state() again in those cases.
>>>>
>>>> Why is this wrong?
>>>
>>> It is not "wrong", but incomplete, because demotions done by the cpuidle driver
>>> should also be taken into account in the same way.
>>>
>>> But I'm seeing that the recent patch of mine that made cpuidle_enter_state()
>>> call default_idle_call() was a mistake, because it might confuse find_idlest_cpu()
>>> significantly as to what state the CPU is in.  I'll drop that one for now.
>>
>> OK, done.
>>
>> So after I've dropped it I think we need to do three things:
>> (1) Move the idle_set_state() calls to cpuidle_enter_state().
>> (2) Make cpuidle_enter_state() call default_idle_call() again, but this time
>> do that *before* it has called idle_set_state() for target_state.
>> (3) Introduce demotion as per my last patch.
>>
>> Let me cut patches for that.
> 
> Done as per the above and the patches follow in replies to this message.
> 
> All on top of the current linux-next branch of the linux-pm.git tree.

I don't see the patches on linux-pm/linux-next.

Regards
Preeti U Murthy
> 
> 


Re: [PATCH 1/2 V3] memory-hotplug: fix BUG_ON in move_freepages()

2015-05-10 Thread Gu Zheng

Hi Xishi,

What is the status of this series?

Thanks,
Gu
On 04/22/2015 02:26 PM, Xishi Qiu wrote:

> add CC: Tejun Heo 
> 
> On 2015/4/21 18:15, Xishi Qiu wrote:
> 
>> Hot remove nodeXX, then hot add nodeXX. If BIOS report cpu first, it will call
>> hotadd_new_pgdat(nid, 0), this will set pgdat->node_start_pfn to 0. As nodeXX
>> exists at boot time, so pgdat->node_spanned_pages is the same as original. Then
>> free_area_init_core()->memmap_init() will pass a wrong start and a nonzero size.
>>
>> free_area_init_core()
>>  memmap_init()
>>  memmap_init_zone()
>>  early_pfn_in_nid()
>>  set_page_links()
>>
>> "if (!early_pfn_in_nid(pfn, nid))" will skip the pfn(memory in section), but it
>> will not skip the pfn(hole in section), this will cover and relink the page to
>> zone/nid, so page_zone() from memory and hole in the same section are different.
>>
>> The following call trace shows the bug. This patch add/remove memblk when hot
>> adding/removing memory, so it will set the node size to 0 when hotadd a new node
>> (original or new). init_currently_empty_zone() and memmap_init() will be called
>> in add_zone(), so need not to change them.
>>
>> [90476.077469] kernel BUG at mm/page_alloc.c:1042!  // move_freepages() -> BUG_ON(page_zone(start_page) != page_zone(end_page));
>> [90476.077469] invalid opcode:  [#1] SMP 
>> [90476.077469] Modules linked in: iptable_nat nf_conntrack_ipv4 
>> nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack fuse btrfs zlib_deflate 
>> raid6_pq xor msdos ext4 mbcache jbd2 binfmt_misc bridge stp llc 
>> ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables 
>> cfg80211 rfkill sg iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp 
>> intel_rapl kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel 
>> ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd 
>> pcspkr igb vfat i2c_algo_bit dca fat sb_edac edac_core i2c_i801 lpc_ich 
>> i2c_core mfd_core shpchp acpi_pad ipmi_si ipmi_msghandler uinput nfsd 
>> auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod crc_t10dif 
>> crct10dif_common ahci libahci megaraid_sas tg3 ptp libata pps_core dm_mirror 
>> dm_region_hash dm_log dm_mod [last unloaded: rasf]
>> [90476.157382] CPU: 2 PID: 322803 Comm: updatedb Tainted: GF   W  
>> O--   3.10.0-229.1.2.5.hulk.rc14.x86_64 #1
>> [90476.157382] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Huawei N1/Huawei 
>> N1, BIOS V100R001 04/13/2015
>> [90476.157382] task: 88006a6d5b00 ti: 880068eb8000 task.ti: 
>> 880068eb8000
>> [90476.157382] RIP: 0010:[]  [] 
>> move_freepages+0x12f/0x140
>> [90476.157382] RSP: 0018:880068ebb640  EFLAGS: 00010002
>> [90476.157382] RAX: 880002316cc0 RBX: ea0001bd RCX: 
>> 0001
>> [90476.157382] RDX: 880002476e40 RSI:  RDI: 
>> 880002316cc0
>> [90476.157382] RBP: 880068ebb690 R08: 0010 R09: 
>> ea0001bd7fc0
>> [90476.157382] R10: 0006f5ff R11:  R12: 
>> 0001
>> [90476.157382] R13: 0003 R14: 880002316eb8 R15: 
>> ea0001bd7fc0
>> [90476.157382] FS:  7f4d3ab95740() GS:880033a0() 
>> knlGS:
>> [90476.157382] CS:  0010 DS:  ES:  CR0: 80050033
>> [90476.157382] CR2: 7f4d3ae1a808 CR3: 00018907a000 CR4: 
>> 001407e0
>> [90476.157382] DR0:  DR1:  DR2: 
>> 
>> [90476.157382] DR3:  DR6: fffe0ff0 DR7: 
>> 0400
>> [90476.157382] Stack:
>> [90476.157382]  880068ebb698 880002316cc0 a800b5378098 
>> 880068ebb698
>> [90476.157382]  810b11dc 880002316cc0 0001 
>> 0003
>> [90476.157382]  880002316eb8 ea0001bd6420 880068ebb6a0 
>> 8115a003
>> [90476.157382] Call Trace:
>> [90476.157382]  [] ? update_curr+0xcc/0x150
>> [90476.157382]  [] move_freepages_block+0x73/0x80
>> [90476.157382]  [] __rmqueue+0x26a/0x460
>> [90476.157382]  [] ? native_sched_clock+0x13/0x80
>> [90476.157382]  [] get_page_from_freelist+0x7f2/0xd30
>> [90476.157382]  [] ? __switch_to+0x179/0x4a0
>> [90476.157382]  [] ? xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]
>> [90476.157382]  [] __alloc_pages_nodemask+0x1c1/0xc90
>> [90476.157382]  [] ? _xfs_buf_ioapply+0x31c/0x420 [xfs]
>> [90476.157382]  [] ? down_trylock+0x2d/0x40
>> [90476.157382]  [] ? xfs_buf_trylock+0x1f/0x80 [xfs]
>> [90476.157382]  [] alloc_pages_current+0xa9/0x170
>> [90476.157382]  [] new_slab+0x275/0x300
>> [90476.157382]  [] __slab_alloc+0x315/0x48f
>> [90476.157382]  [] ? kmem_zone_alloc+0x77/0x100 [xfs]
>> [90476.157382]  [] ? xfs_bmap_search_extents+0x5c/0xc0 
>> [xfs]
>> [90476.157382]  [] kmem_cache_alloc+0x193/0x1d0
>> [90476.157382]  [] ? kmem_zone_alloc+0x77/0x100 [xfs]
>> [90476.157382]  [] kmem_zone_alloc+0x77/0x100 [xfs]
>>

[PATCH] serial: 8250: remove return statements from void function

2015-05-10 Thread Masahiro Yamada

serial8250_set_mctrl() is a void type function.  Returning something
does not look nice.

Signed-off-by: Masahiro Yamada 
---

 drivers/tty/serial/8250/8250_core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index 4882027..146f56c 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -2019,8 +2019,9 @@ EXPORT_SYMBOL_GPL(serial8250_do_set_mctrl);
 static void serial8250_set_mctrl(struct uart_port *port, unsigned int mctrl)
 {
if (port->set_mctrl)
-   return port->set_mctrl(port, mctrl);
-   return serial8250_do_set_mctrl(port, mctrl);
+   port->set_mctrl(port, mctrl);
+   else
+   serial8250_do_set_mctrl(port, mctrl);
 }
 
 static void serial8250_break_ctl(struct uart_port *port, int break_state)
-- 
1.9.1
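
As background for the change above: ISO C says a return statement with an
expression shall not appear in a function whose return type is void (C99
6.8.6.4). GCC accepts "return f();" for a void f() as an extension and warns
about it under -Wpedantic. The if/else form needs no extension. Below is a
self-contained model of the resulting dispatch. struct port_model and the
function names are hypothetical stand-ins for the uart_port code.

```c
#include <assert.h>
#include <stddef.h>

struct port_model {
	/* Optional override, like port->set_mctrl in the patch. */
	void (*set_mctrl)(struct port_model *port, unsigned int mctrl);
	unsigned int last_mctrl;
	int used_override;
};

/* Default implementation, used when no override is installed. */
static inline void model_do_set_mctrl(struct port_model *port,
				      unsigned int mctrl)
{
	port->last_mctrl = mctrl;
}

/* Example override that records that it ran. */
static inline void custom_set_mctrl(struct port_model *port,
				    unsigned int mctrl)
{
	port->used_override = 1;
	port->last_mctrl = mctrl;
}

/* Dispatch without returning a void expression. */
static inline void model_set_mctrl(struct port_model *port,
				   unsigned int mctrl)
{
	if (port->set_mctrl)
		port->set_mctrl(port, mctrl);
	else
		model_do_set_mctrl(port, mctrl);
}
```

Behaviour is identical to the original two return statements; exactly one of
the two callees runs for each call.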


Re: [PATCH v11 06/19] h8300: Assembly headers

2015-05-10 Thread Yoshinori Sato

At Sun, 10 May 2015 12:46:14 +0200,
Richard Weinberger wrote:
> 
> On Fri, May 8, 2015 at 5:04 PM, Yoshinori Sato
>  wrote:
> > Signed-off-by: Yoshinori Sato 
> > diff --git a/arch/h8300/include/asm/uaccess.h b/arch/h8300/include/asm/uaccess.h
> > new file mode 100644
> > index 000..582af79
> > --- /dev/null
> > +++ b/arch/h8300/include/asm/uaccess.h
> > @@ -0,0 +1,131 @@
> > +#ifndef __H8300_UACCESS_H
> > +#define __H8300_UACCESS_H
> > +
> > +/*
> > + * User space memory access functions
> > + */
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include 
> > +
> > +#define VERIFY_READ0
> > +#define VERIFY_WRITE   1
> > +
> > +/* We let the MMU do all checking */
> > +#define access_ok(type, addr, size) __access_ok((unsigned long)addr, size)
> > +static inline int __access_ok(unsigned long addr, unsigned long size)
> > +{
> > +   return 1;
> > +}
> > +
> > +/*
> > + * The exception table consists of pairs of addresses: the first is the
> > + * address of an instruction that is allowed to fault, and the second is
> > + * the address at which the program should continue.  No registers are
> > + * modified, so it is entirely up to the continuation code to figure out
> > + * what to do.
> > + *
> > + * All the routines below use bits of fixup code that are out of line
> > + * with the main instruction path.  This means when everything is well,
> > + * we don't even have to jump over them.  Further, they do not intrude
> > + * on our cache or tlb entries.
> > + */
> > +
> > +struct exception_table_entry {
> > +   unsigned long insn, fixup;
> > +};
> > +
> > +/* Returns 0 if exception not found and fixup otherwise.  */
> > +extern unsigned long search_exception_table(unsigned long);
> > +
> > +
> > +/*
> > + * These are the main single-value transfer routines.  They automatically
> > + * use the right size if we just have the right pointer type.
> > + */
> > +
> > +#define put_user(x, ptr)   \
> > +({ \
> > +   int __pu_err = 0;   \
> > +   typeof(*(ptr)) __pu_val = (x);  \
> > +   switch (sizeof(*(ptr))) {   \
> > +   case 1: \
> > +   /* fall through */ \
> > +   case 2: \
> > +   /* fall through */ \
> > +   case 4: \
> > +   *(ptr) = x; \
> > +   break;  \
> > +   case 8: \
> > +   memcpy(ptr, &__pu_val, sizeof(*(ptr))); \
> > +   break;  \
> > +   default:\
> > +   __pu_err = __put_user_bad();\
> > +   break;  \
> > +   }   \
> > +   __pu_err;   \
> > +})
> > +
> > +#define __put_user(x, ptr) put_user(x, ptr)
> > +
> > +extern int __put_user_bad(void);
> > +
> > +/*
> > + * Tell gcc we read from memory instead of writing: this is because
> > + * we do not write to any memory gcc knows about, so there are no
> > + * aliasing issues.
> > + */
> > +
> > +#define __ptr(x) ((unsigned long *)(x))
> > +
> > +/*
> > + * Tell gcc we read from memory instead of writing: this is because
> > + * we do not write to any memory gcc knows about, so there are no
> > + * aliasing issues.
> > + */
> > +
> > +#define get_user(x, ptr)   \
> > +({ \
> > +   typeof(*(ptr)) __gu_val;\
> > +   int __gu_err = 0;   \
> > +   switch (sizeof(*(ptr))) {   \
> > +   case 1: \
> > +   *(u8 *)&__gu_val = *((u8 *)(ptr));  \
> > +   break;  \
> > +   case 2: \
> > +   *(u16 *)&__gu_val = *((u16 *)ptr);  \
> > +   break;  \
> > +   case 4: \
> > +   *(u32 *)&__gu_val = *((u32 *)ptr);  \
> > +   break;  \
> > +   case 8: \
> > +   memcpy((void *)&__gu_val, ptr, sizeof(*(ptr))); \
> > +   break;  \
> > +   default:\
> > +   __gu_err = __get_user_bad();\
> > +

[GIT PULL REQUEST]

2015-05-10 Thread NeilBrown


Hi Linus,
 Please pull this collection of bugfixes.

Thanks,
NeilBrown


The following changes since commit 5ebe6afaf0057ac3eaeb98defd5456894b446d22:

  Linux 4.1-rc2 (2015-05-03 19:22:23 -0700)

are available in the git repository at:

  git://neil.brown.name/md/ tags/md/4.1-rc3-fixes

for you to fetch changes up to bb27051f9fd7643f05d8f0babce3337f0b9b3087:

  md/raid5: fix handling of degraded stripes in batches. (2015-05-08 18:47:57 +1000)


A few fixes for md.

Most of these are related to the new "batched stripe writeout",
but there are a few others.


Heinz Mauelshagen (1):
  md-raid0: conditional mddev->queue access to suit dm-raid

NeilBrown (6):
  md/raid5: new alloc_stripe() to allocate an initialize a stripe.
  md/raid5: more incorrect BUG_ON in handle_stripe_fill.
  md/raid5: avoid reading parity blocks for full-stripe write to degraded array
  md/raid5: don't record new size if resize_stripes fails.
  md/raid5: fix allocation of 'scribble' array.
  md/raid5: fix handling of degraded stripes in batches.

 drivers/md/raid0.c |   5 ++-
 drivers/md/raid5.c | 123 ++---
 2 files changed, 73 insertions(+), 55 deletions(-)



Re: [EDT] oom_killer: find bulkiest task based on pss value

2015-05-10 Thread David Rientjes

On Fri, 8 May 2015, Yogesh Narayan Gaur wrote:

> Presently in oom_kill.c we calculate the badness score of the victim task from
> the current RSS counter value of the task.
> The RSS counter value for any task is usually '[Private (Dirty/Clean)] + [Shared
> (Dirty/Clean)]' of the task.
> We have encountered a situation where the values of the Private fields are small
> but the values of the Shared fields are large, which makes the total RSS counter
> value large. In an OOM situation the task with the highest RSS value is then
> killed, but since its Private values are not large, the memory gained by killing
> this process is less than expected.
> 
> For example, take the use-case scenario below, in which 3 processes are running
> in the system.
> Each process mmaps a file that exists in the current directory and then copies
> data from this file to locally allocated buffers in a while(1) loop with some
> sleep. Of the 3 processes, 2 have mapped the file with MAP_SHARED and one has
> mapped it with MAP_PRIVATE.
> I run all 3 processes in the background and check their RSS/PSS values with a
> user space utility (a utility over cat /proc/pid/smaps).
> Before OOM, the memory consumed by these 3 processes is as shown below (all
> processes run with oom_score_adj = 0).
> 
> Comm : 1prg,  Pid : 213 (values in kB)
>               Rss     Shared   Private   Pss
>   Process :   375764  194596   181168    278460
> 
> Comm : 3prg,  Pid : 217 (values in kB)
>               Rss     Shared   Private   Pss
>   Process :   305760  32       305728    305738
> 
> Comm : 2prg,  Pid : 218 (values in kB)
>               Rss     Shared   Private   Pss
>   Process :   389980  194596   195384    292676
> 
> 
> Thus as per the present code design, it would first select process [2prg : 218]
> as the bulkiest process to kill, as its RSS value is the highest. But if we kill
> this process then only ~195MB would be freed, as compared to the expected ~389MB.
> Thus identifying the task based on its RSS value is not an accurate design, and
> killing the identified process doesn't release the expected memory back to the
> system.
> 
> We need to select the victim task based on PSS instead of RSS, as the PSS value
> is calculated as
> PSS value = [Private (Dirty/Clean)] + [Shared (Dirty/Clean) / no. of sharing
> tasks]
> For the above scenario, it can be checked that process [3prg : 217] has the
> largest PSS value, and by killing this process we can gain the maximum memory
> (~305MB) as compared to killing the process identified based on RSS value.
> 

The oom killer doesn't expect to necessarily be able to free all memory 
that is represented by the rss of a process.  In fact, after it selects a 
process it will happily kill a child process in favor of its parent if 
they don't share the same memory.

There're a few problems with using pss and the proposed patch that 
follows:

 - it's less predictable since it depends on the number of times the 
   memory is mapped, which may change during the process's lifetime,

 - it requires mm->mmap_sem to do, which is not possible to do because
   it may be held and thus reverting back to rss in situations where
   the trylock fails makes it even less predictable and reliable, and

 - all users who currently tune /proc/pid/oom_score_adj or
   /proc/pid/oom_adj are doing so based on the current heuristic, which
   is rss; if we switched to pss and all a process's memory is shared
   then their oom_score_adj or oom_adj is now severely broken (and as a
   result of the first problem above, defining oom_score_adj is near
   impossible).

We don't have the expectation of freeing the entire rss, the best we can 
do is use a heuristic which is reliable, consistent, and cheap to check.  
We can then ask users who desire a process to have a different oom kill 
priority to use oom_score_adj and they may do so in a reliable way without 
having the fallback behavior that your trylock does.
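
For reference, the RSS and PSS figures quoted above can be reproduced with a few lines of C. This is only an illustrative userspace sketch, not kernel code and not the proposed patch: `struct task_mem`, `rss_kb()`, `pss_kb()` and `bulkiest()` are hypothetical names, and the sharer counts (2 for the MAP_SHARED file, 3 for the 32 kB in 3prg) are assumptions, so the results may differ from the smaps totals by a few kB.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-process summary, values in kB as in the report above. */
struct task_mem {
	const char *comm;
	unsigned long shared_kb;
	unsigned long private_kb;
	unsigned int sharers;	/* assumed number of processes mapping the shared pages */
};

/* RSS counts every resident page in full, shared or not. */
static unsigned long rss_kb(const struct task_mem *t)
{
	return t->private_kb + t->shared_kb;
}

/* PSS charges each process only its proportional part of the shared pages. */
static unsigned long pss_kb(const struct task_mem *t)
{
	assert(t->sharers > 0);
	return t->private_kb + t->shared_kb / t->sharers;
}

/* Return the index of the task with the largest value of metric(). */
static size_t bulkiest(const struct task_mem *tasks, size_t n,
		       unsigned long (*metric)(const struct task_mem *))
{
	size_t i, best = 0;

	for (i = 1; i < n; i++)
		if (metric(&tasks[i]) > metric(&tasks[best]))
			best = i;
	return best;
}

/* The three processes from the report above. */
static const struct task_mem example_tasks[] = {
	{ "1prg", 194596, 181168, 2 },
	{ "3prg",     32, 305728, 3 },
	{ "2prg", 194596, 195384, 2 },
};
```

With these numbers, ranking by `rss_kb` selects 2prg while ranking by `pss_kb` selects 3prg. Note that the PSS result depends on the `sharers` field, which can change over a process's lifetime; that is the predictability concern raised in the first point above.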

[GIT PULL] ARM: EXYNOS: Improvements for 4.2, second try

2015-05-10 Thread Krzysztof Kozlowski

Dear Kukjin,

Updated pull request, replacing also the usage of soc_is_exynos4()
with of_machine_is_compatible(). You requested this in comments
for "ARM: EXYNOS: Fix failed second suspend on Exynos4".

This adds coupled cpuidle for Exynos3250 and improves the Exynos
code in a few places. Everything is for the upcoming 4.2 merge window.
The description is included with the tag.

Best regards,
Krzysztof


The following changes since commit b82f3a05ff0b5eaf2c9900eeb34e58a6624db8d9:

  ARM: EXYNOS: Fix failed second suspend on Exynos4 (2015-05-11 11:03:09 +0900)

are available in the git repository at:

  https://github.com/krzk/linux.git tags/samsung-for-next-4.2-2

for you to fetch changes up to c91889378098ff0bb5fe6f422a3c0eb554b34930:

  ARM: plat-samsung: Constify platform_device_id (2015-05-11 11:05:31 +0900)


Extending cpuidle driver and improvements for Exynos based boards:
1. Replace soc_is_exynos4() with of_machine_is_compatible().
2. Add missing return-value checks and of_node_put() for power domain
   driver.
3. Fix missing clk_prepare in S3C24XX ADC driver.
4. Rework clock handling when switching power domains on/off. Instead
   of setting a fixed parent in DTS we grab the parent clock before
   turning the domain off.
5. Add coupled cpuidle support for Exynos3250 to an existing
   cpuidle-exynos driver. As a result it enables AFTR mode
   (ARM-Off Top-Running) to be used by default on Exynos3250
   without the need to hot unplug CPU1 first.
6. Constify irq_domain_ops and platform_device_id.


Bartlomiej Zolnierkiewicz (5):
  ARM: EXYNOS: fix exynos_boot_secondary() return value on timeout
  ARM: EXYNOS: make exynos_core_restart() less verbose
  ARM: EXYNOS: add exynos_set_boot_addr() helper
  ARM: EXYNOS: add exynos_get_boot_addr() helper
  cpuidle: exynos: add coupled cpuidle support for Exynos3250

Krzysztof Kozlowski (8):
  ARM: EXYNOS: Use of_machine_is_compatible instead of soc_is_exynos4
  ARM: EXYNOS: Handle of of_iomap() failure
  ARM: EXYNOS: Handle of_find_device_by_node and kstrdup failures
  ARM: EXYNOS: Add missing of_node_put() when parsing power domains
  ARM: EXYNOS: Get current parent clock for power domain on/off
  ARM: dts: Use last parent for clocks during power domain on/off
  ARM: EXYNOS: Constify irq_domain_ops
  ARM: plat-samsung: Constify platform_device_id

Sergiy Kibrik (1):
  ARM: SAMSUNG: fix clk_enable() WARNing in S3C24XX ADC

 .../bindings/arm/exynos/power_domain.txt   |  7 +-
 arch/arm/boot/dts/exynos5420.dtsi  | 13 +---
 arch/arm/include/asm/firmware.h|  4 +
 arch/arm/mach-exynos/common.h  |  4 +-
 arch/arm/mach-exynos/exynos.c  |  5 +-
 arch/arm/mach-exynos/firmware.c| 18 +
 arch/arm/mach-exynos/platsmp.c | 86 +++---
 arch/arm/mach-exynos/pm.c  | 51 +++--
 arch/arm/mach-exynos/pm_domains.c  | 44 ---
 arch/arm/mach-exynos/suspend.c |  2 +-
 arch/arm/plat-samsung/adc.c|  6 +-
 11 files changed, 176 insertions(+), 64 deletions(-)

[GIT PULL] ARM: EXYNOS: dts: Improvements for 4.2, second try

2015-05-10 Thread Krzysztof Kozlowski

Dear Kukjin,

I gathered various improvements for the upcoming 4.2 merge window.
The description is included with the tag.

Best regards,
Krzysztof


The following changes since commit 2495ae559826c60e3ccde9850e3b38815725b9c9:

  ARM: dts: Add keep-power-in-suspend to WiFi SDIO node for Peach Boards (2015-05-11 11:00:00 +0900)

are available in the git repository at:

  https://github.com/krzk/linux.git tags/samsung-dt-for-next-4.2-2

for you to fetch changes up to c5f41228c9fad4c7343df8547b870766163b76dc:

  ARM: dts: exynos3250-rinato: add support for JPEG codec (2015-05-11 11:01:57 +0900)


Device Tree improvements for Exynos based boards:
1. Fix PMIC's RTC alarm on Arndale Octa (S2MPS11 PMIC) and SMDK5250 (MAXIM77686
   PMIC).
2. Enable the S3C RTC (the clock present on SoC) on various Exynos boards.
3. Minor improvements to S3C RTC driver.
4. Add nodes for JPEG codec on Exynos4 and enable it for Rinato board.
5. Add audio on Odroid-XU3 board.


Inha Song (1):
  ARM: dts: Support audio on Exynos5422-odroidxu3 using simple-audio-card

Jacek Anaszewski (1):
  ARM: dts: exynos3250-rinato: add support for JPEG codec

Krzysztof Kozlowski (5):
  ARM: dts: Fix pinctrl settings for S2MPS11 RTC alarm IRQ on Arndale Octa
  ARM: dts: s3c-rtc: Use s3c6410-rtc instead of exynos3250-rtc
  ARM: dts: Use define for s3c-rtc clock id
  ARM: dts: Use define for s3c-rtc clock id
  ARM: dts: Enable S3C RTC on Trats2 and Arndale Octa

Marek Szyprowski (1):
  ARM: dts: exynos4: add nodes for jpeg codec

Markus Reichl (2):
  ARM: dts: Add bindings for 32kHz clocks from s2mps11
  ARM: dts: exynos5422-odroidxu3: add 'rtc_src' clock to rtc node

 Documentation/devicetree/bindings/rtc/s3c-rtc.txt |  3 +-
 arch/arm/boot/dts/exynos3250-monk.dts |  3 +-
 arch/arm/boot/dts/exynos3250-rinato.dts   |  7 ++-
 arch/arm/boot/dts/exynos3250.dtsi |  2 +-
 arch/arm/boot/dts/exynos4.dtsi| 11 +++-
 arch/arm/boot/dts/exynos4412-trats2.dts   |  9 ++-
 arch/arm/boot/dts/exynos4415.dtsi |  2 +-
 arch/arm/boot/dts/exynos4x12.dtsi |  4 ++
 arch/arm/boot/dts/exynos5420-arndale-octa.dts | 24 ++--
 arch/arm/boot/dts/exynos5420.dtsi |  9 +++
 arch/arm/boot/dts/exynos5422-odroidxu3.dts| 70 +--
 include/dt-bindings/clock/samsung,s2mps11.h   | 23 
 12 files changed, 151 insertions(+), 16 deletions(-)
 create mode 100644 include/dt-bindings/clock/samsung,s2mps11.h

[PATCH] Rename RECLAIM_SWAP to RECLAIM_UNMAP.

2015-05-10 Thread Zhihui Zhang

The name SWAP implies that we are dealing with anonymous pages only.
In fact, the original patch that introduced the min_unmapped_ratio
logic was to fix an issue related to file pages. Rename it to
RECLAIM_UNMAP to match what it does.

Historically, commit  renamed .may_swap to .may_unmap,
leaving RECLAIM_SWAP behind.  commit <2e2e42598908> reintroduced .may_swap
for memory controller.

Signed-off-by: Zhihui Zhang 
---
 mm/vmscan.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5e8eadd..15328de 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3596,7 +3596,7 @@ int zone_reclaim_mode __read_mostly;
 #define RECLAIM_OFF 0
 #define RECLAIM_ZONE (1<<0)/* Run shrink_inactive_list on the zone */
 #define RECLAIM_WRITE (1<<1)   /* Writeout pages during reclaim */
-#define RECLAIM_SWAP (1<<2)/* Swap pages out during reclaim */
+#define RECLAIM_UNMAP (1<<2)   /* Unmap pages during reclaim */
 
 /*
  * Priority for ZONE_RECLAIM. This determines the fraction of pages
@@ -3638,12 +3638,12 @@ static long zone_pagecache_reclaimable(struct zone *zone)
long delta = 0;
 
/*
-* If RECLAIM_SWAP is set, then all file pages are considered
+* If RECLAIM_UNMAP is set, then all file pages are considered
 * potentially reclaimable. Otherwise, we have to worry about
 * pages like swapcache and zone_unmapped_file_pages() provides
 * a better estimate
 */
-   if (zone_reclaim_mode & RECLAIM_SWAP)
+   if (zone_reclaim_mode & RECLAIM_UNMAP)
nr_pagecache_reclaimable = zone_page_state(zone, NR_FILE_PAGES);
else
nr_pagecache_reclaimable = zone_unmapped_file_pages(zone);
@@ -3674,15 +3674,15 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
.order = order,
.priority = ZONE_RECLAIM_PRIORITY,
.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
-   .may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
+   .may_unmap = !!(zone_reclaim_mode & RECLAIM_UNMAP),
.may_swap = 1,
};
 
cond_resched();
/*
-* We need to be able to allocate from the reserves for RECLAIM_SWAP
+* We need to be able to allocate from the reserves for RECLAIM_UNMAP
 * and we also need to be able to write out pages for RECLAIM_WRITE
-* and RECLAIM_SWAP.
+* and RECLAIM_UNMAP.
 */
p->flags |= PF_MEMALLOC | PF_SWAPWRITE;
lockdep_set_current_reclaim_state(gfp_mask);
-- 
2.1.4
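
To make the effect of the rename easier to follow, here is a small standalone sketch of how the zone_reclaim_mode bits map onto the scan control fields. It is not the kernel code: `struct scan_control_sketch` and `sc_from_mode()` are hypothetical names, the fields are plain bools rather than the kernel's own layout, and the assert on unknown bits exists only in this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values as in the hunk above; RECLAIM_UNMAP is the renamed RECLAIM_SWAP. */
#define RECLAIM_OFF	0
#define RECLAIM_ZONE	(1 << 0)	/* Run shrink_inactive_list on the zone */
#define RECLAIM_WRITE	(1 << 1)	/* Writeout pages during reclaim */
#define RECLAIM_UNMAP	(1 << 2)	/* Unmap pages during reclaim */

/* Hypothetical, cut-down stand-in for the kernel's struct scan_control. */
struct scan_control_sketch {
	bool may_writepage;
	bool may_unmap;
	bool may_swap;
};

static struct scan_control_sketch sc_from_mode(int zone_reclaim_mode)
{
	const int known = RECLAIM_ZONE | RECLAIM_WRITE | RECLAIM_UNMAP;
	struct scan_control_sketch sc = {
		.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
		/* The bit controls unmapping, not swapping, hence the new name. */
		.may_unmap = !!(zone_reclaim_mode & RECLAIM_UNMAP),
		.may_swap = true,
	};

	/* Sketch-only sanity check; the kernel does not do this. */
	assert((zone_reclaim_mode & ~known) == 0);
	return sc;
}
```

The bit value stays at (1<<2), so a zone_reclaim_mode of 4 behaves the same before and after the rename; only the macro name changes.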


[GIT PULL] ARM: EXYNOS: dts: DTS fixes for 4.1, second try

2015-05-10 Thread Krzysztof Kozlowski

Dear Kukjin,

This groups one fix missed in your last pull request. This fixes suspend
on Peach boards related to the Marvell mwifiex driver. The driver was
enabled for the Exynos defconfig during the merge window, so currently
suspend fails on these boards with this config.

Best regards,
Krzysztof


The following changes since commit 77b0ffdc4bbd267e9589aa025047e512322fc643:

  Merge branch 'v4.2-next/dt-samsung' into for-next (2015-05-09 04:03:46 +0900)

are available in the git repository at:


  https://github.com/krzk/linux.git tags/samsung-dt-fixes-4.1-2

for you to fetch changes up to 2495ae559826c60e3ccde9850e3b38815725b9c9:

  ARM: dts: Add keep-power-in-suspend to WiFi SDIO node for Peach Boards (2015-05-11 11:00:00 +0900)


Fix for suspend on Peach boards related to Marvell mwifiex driver.
The driver was enabled for Exynos defconfig during 4.1 merge window
so currently the suspend on these boards on this config fails.


Javier Martinez Canillas (1):
  ARM: dts: Add keep-power-in-suspend to WiFi SDIO node for Peach Boards

 arch/arm/boot/dts/exynos5420-peach-pit.dts | 1 +
 arch/arm/boot/dts/exynos5800-peach-pi.dts  | 1 +
 2 files changed, 2 insertions(+)

[GIT PULL] ARM: EXYNOS: Fixes for 4.1, second try

2015-05-10 Thread Krzysztof Kozlowski

Dear Kukjin,

This pull contains an important fix for the current RC cycle. The patch has
been on the mailing list for quite a long time and it was tested.

I responded to all your questions without any follow-ups from your side.
The only issue left, replacing soc_is_exynos4() with
of_machine_is_compatible(), is done in a separate patch because the fix
should not contain unrelated changes.

Best regards,
Krzysztof


The following changes since commit 77b0ffdc4bbd267e9589aa025047e512322fc643:

  Merge branch 'v4.2-next/dt-samsung' into for-next (2015-05-09 04:03:46 +0900)

are available in the git repository at:


  https://github.com/krzk/linux.git tags/samsung-fixes-4.1-2

for you to fetch changes up to b82f3a05ff0b5eaf2c9900eeb34e58a6624db8d9:

  ARM: EXYNOS: Fix failed second suspend on Exynos4 (2015-05-11 11:03:09 +0900)


Fixes failed second suspend to RAM on Exynos4412 based Trats2 board. This has
shown up after enabling L2 cache but actually the "use delayed reset assertion"
feature is to be blamed.


Krzysztof Kozlowski (1):
  ARM: EXYNOS: Fix failed second suspend on Exynos4

 arch/arm/mach-exynos/common.h  |  2 ++
 arch/arm/mach-exynos/exynos.c  | 27 +++
 arch/arm/mach-exynos/platsmp.c | 39 ++-
 arch/arm/mach-exynos/suspend.c |  3 +++
 4 files changed, 34 insertions(+), 37 deletions(-)

linux-next: manual merge of the drm tree with Linus' tree

2015-05-10 Thread Stephen Rothwell

Hi Dave,

Today's linux-next merge of the drm tree got a conflict in 
drivers/gpu/drm/drm_irq.c between commit fdb68e09bbb1 ("drm: Zero out invalid 
vblank timestamp in drm_update_vblank_count") from Linus' tree and commit 
d66a1e38280c ("drm: Zero out invalid vblank timestamp in 
drm_update_vblank_count. (v2)") from the drm tree.

I fixed it up (a rebased version of a patch that is already in Linus'
tree :-( - I used the version from the drm tree) and can carry the fix
as necessary (no action is required).



-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/



[PATCH] livepatch: Prevent to enable uninitialized patch

2015-05-10 Thread Minfei Huang

From: Minfei Huang 

Previously registered patches are applied when the corresponding module
is loaded. The current code does not correctly handle the case where the
patch fails to be initialized while the module is being loaded.

In general, the patch does relocation (if necessary) and obtains/verifies
function addresses before we start to enable it. But we can still trigger
enabling the patch (by disabling it first and then enabling it again),
even though it failed to be initialized in the function
klp_module_notify_coming.

To fix this, set obj->mod to NULL if the object fails to be
initialized.

Signed-off-by: Minfei Huang 
---
 kernel/livepatch/core.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 284e269..4bbcdda 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -883,30 +883,30 @@ int klp_register_patch(struct klp_patch *patch)
 }
 EXPORT_SYMBOL_GPL(klp_register_patch);
 
-static void klp_module_notify_coming(struct klp_patch *patch,
+static int klp_module_notify_coming(struct klp_patch *patch,
 struct klp_object *obj)
 {
struct module *pmod = patch->mod;
struct module *mod = obj->mod;
-   int ret;
+   int ret = 0;
 
ret = klp_init_object_loaded(patch, obj);
if (ret)
-   goto err;
+   goto out;
 
if (patch->state == KLP_DISABLED)
-   return;
+   goto out;
 
pr_notice("applying patch '%s' to loading module '%s'\n",
  pmod->name, mod->name);
 
ret = klp_enable_object(obj);
-   if (!ret)
-   return;
 
-err:
-   pr_warn("failed to apply patch '%s' to module '%s' (%d)\n",
-   pmod->name, mod->name, ret);
+out:
+   if (ret)
+   pr_warn("failed to apply patch '%s' to module '%s' (%d)\n",
+   pmod->name, mod->name, ret);
+   return ret;
 }
 
 static void klp_module_notify_going(struct klp_patch *patch,
@@ -930,6 +930,7 @@ disabled:
 static int klp_module_notify(struct notifier_block *nb, unsigned long action,
 void *data)
 {
+   int ret = 0;
struct module *mod = data;
struct klp_patch *patch;
struct klp_object *obj;
@@ -955,7 +956,9 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
 
if (action == MODULE_STATE_COMING) {
obj->mod = mod;
-   klp_module_notify_coming(patch, obj);
+   ret = klp_module_notify_coming(patch, obj);
+   if (ret)
+   obj->mod = NULL;
} else /* MODULE_STATE_GOING */
klp_module_notify_going(patch, obj);
 
-- 
2.2.2
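
The control flow the patch aims for can be sketched outside the kernel as follows. This is a simplified model under stated assumptions, not the livepatch code: `struct module_sketch`, `struct object_sketch` and both functions are hypothetical, and object initialization is simulated by a stored result code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct module and struct klp_object. */
struct module_sketch { const char *name; };

struct object_sketch {
	struct module_sketch *mod;	/* non-NULL only while the object is usable */
	int init_result;		/* simulated result of object initialization */
};

/* Models klp_module_notify_coming() after the patch: it reports failure. */
static int notify_coming_sketch(struct object_sketch *obj)
{
	int ret;

	assert(obj->mod != NULL);
	ret = obj->init_result;
	if (ret)
		goto out;

	/* ... enabling the object would happen here ... */
out:
	return ret;
}

/* Caller side: forget the module again if the object could not be set up. */
static int module_coming_sketch(struct object_sketch *obj,
				struct module_sketch *mod)
{
	int ret;

	obj->mod = mod;
	ret = notify_coming_sketch(obj);
	if (ret)
		obj->mod = NULL;
	return ret;
}
```

Because obj->mod is reset to NULL on failure, a later disable/enable cycle sees the object as not loaded instead of treating a half-initialized object as ready.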


Re: [f2fs-dev] [PATCH 14/18] f2fs crypto: add filename encryption for f2fs_lookup

2015-05-10 Thread hujianyang

Hi Jaegeuk,

While compiling the dev branch of
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git
(commit 5af6e74892a, "f2fs crypto: remove checking key context during lookup"),
I saw an error:

fs/f2fs/dir.c: In function ‘find_in_level’:
fs/f2fs/dir.c:163: error: unknown field ‘len’ specified in initializer
fs/f2fs/dir.c:163: warning: excess elements in struct initializer
fs/f2fs/dir.c:163: warning: (near initialization for ‘name’)

I think it's related to this patch.
If there is anything wrong in my configuration, please let me know.

Thanks,
Hu



On 2015/5/9 12:20, Jaegeuk Kim wrote:
> This patch implements filename encryption support for f2fs_lookup.
> 
> Note that, f2fs_find_entry should be outside of f2fs_(un)lock_op().
> 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/dir.c| 79 
> 
>  fs/f2fs/f2fs.h   |  9 ---
>  fs/f2fs/inline.c |  9 ---
>  3 files changed, 56 insertions(+), 41 deletions(-)
> 
> diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
> index ab6455d..5e10d9d 100644
> --- a/fs/f2fs/dir.c
> +++ b/fs/f2fs/dir.c
> @@ -76,20 +76,10 @@ static unsigned long dir_block_index(unsigned int level,
>   return bidx;
>  }
>  
> -static bool early_match_name(size_t namelen, f2fs_hash_t namehash,
> - struct f2fs_dir_entry *de)
> -{
> - if (le16_to_cpu(de->name_len) != namelen)
> - return false;
> -
> - if (de->hash_code != namehash)
> - return false;
> -
> - return true;
> -}
> -
>  static struct f2fs_dir_entry *find_in_block(struct page *dentry_page,
> - struct qstr *name, int *max_slots,
> + struct f2fs_filename *fname,
> + f2fs_hash_t namehash,
> + int *max_slots,
>   struct page **res_page)
>  {
>   struct f2fs_dentry_block *dentry_blk;
> @@ -99,8 +89,7 @@ static struct f2fs_dir_entry *find_in_block(struct page *dentry_page,
>   dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);
>  
>   make_dentry_ptr(NULL, , (void *)dentry_blk, 1);
> - de = find_target_dentry(name, max_slots, );
> -
> + de = find_target_dentry(fname, namehash, max_slots, );
>   if (de)
>   *res_page = dentry_page;
>   else
> @@ -114,13 +103,15 @@ static struct f2fs_dir_entry *find_in_block(struct page *dentry_page,
>   return de;
>  }
>  
> -struct f2fs_dir_entry *find_target_dentry(struct qstr *name, int *max_slots,
> - struct f2fs_dentry_ptr *d)
> +struct f2fs_dir_entry *find_target_dentry(struct f2fs_filename *fname,
> + f2fs_hash_t namehash, int *max_slots,
> + struct f2fs_dentry_ptr *d)
>  {
>   struct f2fs_dir_entry *de;
>   unsigned long bit_pos = 0;
> - f2fs_hash_t namehash = f2fs_dentry_hash(name);
>   int max_len = 0;
> + struct f2fs_str de_name = FSTR_INIT(NULL, 0);
> + struct f2fs_str *name = >disk_name;
>  
>   if (max_slots)
>   *max_slots = 0;
> @@ -132,8 +123,18 @@ struct f2fs_dir_entry *find_target_dentry(struct qstr *name, int *max_slots,
>   }
>  
>   de = >dentry[bit_pos];
> - if (early_match_name(name->len, namehash, de) &&
> - !memcmp(d->filename[bit_pos], name->name, name->len))
> +
> + /* encrypted case */
> + de_name.name = d->filename[bit_pos];
> + de_name.len = le16_to_cpu(de->name_len);
> +
> + /* show encrypted name */
> + if (fname->hash) {
> + if (de->hash_code == fname->hash)
> + goto found;
> + } else if (de_name.len == name->len &&
> + de->hash_code == namehash &&
> + !memcmp(de_name.name, name->name, name->len))
>   goto found;
>  
>   if (max_slots && max_len > *max_slots)
> @@ -155,16 +156,21 @@ found:
>  }
>  
>  static struct f2fs_dir_entry *find_in_level(struct inode *dir,
> - unsigned int level, struct qstr *name,
> - f2fs_hash_t namehash, struct page **res_page)
> + unsigned int level,
> + struct f2fs_filename *fname,
> + struct page **res_page)
>  {
> - int s = GET_DENTRY_SLOTS(name->len);
> + struct qstr name = FSTR_TO_QSTR(>disk_name);
> + int s = GET_DENTRY_SLOTS(name.len);
>   unsigned int nbucket, nblock;
>   unsigned int bidx, end_block;
>   struct page *dentry_page;
>   struct f2fs_dir_entry *de = NULL;
>   bool room = false;
>   int max_slots;
> + f2fs_hash_t namehash;
> +
> + namehash = f2fs_dentry_hash();
>  
>   f2fs_bug_on(F2FS_I_SB(dir), level > MAX_DIR_HASH_DEPTH);
>  
> @@ -183,7 +189,8 @@

RE: [PATCH V4 net-next 1/1] hv_netvsc: Use the xmit_more skb flag to optimize signaling the host

2015-05-10 Thread KY Srinivasan



> -Original Message-
> From: David Miller [mailto:da...@davemloft.net]
> Sent: Sunday, May 10, 2015 2:51 PM
> To: KY Srinivasan
> Cc: net...@vger.kernel.org; linux-kernel@vger.kernel.org;
> de...@linuxdriverproject.org; o...@aepfle.de; a...@canonical.com;
> jasow...@redhat.com
> Subject: Re: [PATCH V4 net-next 1/1] hv_netvsc: Use the xmit_more skb flag
> to optimize signaling the host
> 
> From: "K. Y. Srinivasan" 
> Date: Wed,  6 May 2015 15:29:05 -0700
> 
> > -   ret = vmbus_sendpacket_pagebuffer(out_channel,
> > - pgbuf,
> > - packet->page_buf_cnt,
> > - ,
> > - sizeof(struct
> nvsp_message),
> > - req_id);
> > +   ret = vmbus_sendpacket_pagebuffer_ctl(out_channel,
> > + pgbuf,
> > + packet->page_buf_cnt,
> > + ,
> > + sizeof(struct
> > +nvsp_message),
> > + req_id,
> > + vmbus_flags,
> > + !packet->xmit_more);
> > } else {
> > -   ret = vmbus_sendpacket(
> > +   ret = vmbus_sendpacket_ctl(
> > out_channel, ,
> > sizeof(struct nvsp_message),
> > req_id,
> > VM_PKT_DATA_INBAND,
> > -
>   VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
> > +   vmbus_flags, !packet->xmit_more);
> 
> Just as you did for the vmbus_sendpacket_pagebuffer_ctl() call above,
> you'll need to reindent the arguments for the vmbus_sendpacket_ctl()
> call since the opening parenthesis is now at a different column.

Done.

Regards,

K. Y
> 
> Thanks.

[PATCH V5 1/1] hv_netvsc: Use the xmit_more skb flag to optimize signaling the host

2015-05-10 Thread K. Y. Srinivasan

Based on the information given to this driver (via the xmit_more skb flag),
we can defer signaling the host if more packets are on the way. This will help
make the host more efficient since it can potentially process a larger batch of
packets. Implement this optimization.

Signed-off-by: K. Y. Srinivasan 
---
v2: Fixed up indentation based on feedback from David Miller.
v3,v4: If the queue could be stopped, deal with that condition: Eric Dumazet
v5: Fixed up indentation based on feedback from David Miller.

 drivers/net/hyperv/netvsc.c |   41 +++--
 1 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index ea091bc..7fdf3e8 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -743,6 +743,8 @@ static inline int netvsc_send_pkt(
u64 req_id;
int ret;
struct hv_page_buffer *pgbuf;
+   u32 vmbus_flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED;
+   u32 ring_avail = hv_ringbuf_avail_percent(_channel->outbound);
 
nvmsg.hdr.msg_type = NVSP_MSG1_TYPE_SEND_RNDIS_PKT;
if (packet->is_data_pkt) {
@@ -769,30 +771,41 @@ static inline int netvsc_send_pkt(
if (out_channel->rescind)
return -ENODEV;
 
+   /*
+* It is possible that once we successfully place this packet
+* on the ringbuffer, we may stop the queue. In that case, we want
+* to notify the host independent of the xmit_more flag. We don't
+* need to be precise here; in the worst case we may signal the host
+* unnecessarily.
+*/
+   if (ring_avail < (RING_AVAIL_PERCENT_LOWATER + 1))
+   packet->xmit_more = false;
+
if (packet->page_buf_cnt) {
pgbuf = packet->cp_partial ? packet->page_buf +
packet->rmsg_pgcnt : packet->page_buf;
-   ret = vmbus_sendpacket_pagebuffer(out_channel,
- pgbuf,
- packet->page_buf_cnt,
- ,
- sizeof(struct nvsp_message),
- req_id);
+   ret = vmbus_sendpacket_pagebuffer_ctl(out_channel,
+ pgbuf,
+ packet->page_buf_cnt,
+ ,
+ sizeof(struct
+nvsp_message),
+ req_id,
+ vmbus_flags,
+ !packet->xmit_more);
} else {
-   ret = vmbus_sendpacket(
-   out_channel, ,
-   sizeof(struct nvsp_message),
-   req_id,
-   VM_PKT_DATA_INBAND,
-   VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
+   ret = vmbus_sendpacket_ctl(out_channel, ,
+  sizeof(struct nvsp_message),
+  req_id,
+  VM_PKT_DATA_INBAND,
+  vmbus_flags, !packet->xmit_more);
}
 
if (ret == 0) {
atomic_inc(_device->num_outstanding_sends);
atomic_inc(_device->queue_sends[q_idx]);
 
-   if (hv_ringbuf_avail_percent(_channel->outbound) <
-   RING_AVAIL_PERCENT_LOWATER) {
+   if (ring_avail < RING_AVAIL_PERCENT_LOWATER) {
netif_tx_stop_queue(netdev_get_tx_queue(
ndev, q_idx));
 
-- 
1.7.4.1
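
The batching decision made by this patch can be modelled in a few lines of standalone C. This is a sketch under stated assumptions rather than driver code: `struct channel_sketch` and `send_pkt_sketch()` are hypothetical, the ring is reduced to two counters, and the value 10 for RING_AVAIL_PERCENT_LOWATER is assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_AVAIL_PERCENT_LOWATER 10	/* assumed value for this sketch */

/* Hypothetical model of a channel: counts queued packets and host signals. */
struct channel_sketch {
	unsigned int pending;	/* packets placed on the ring, not yet signaled */
	unsigned int signals;	/* number of times the host was signaled */
};

/*
 * Queue one packet. The host is signaled only when no further packet is
 * expected, or when the ring is nearly full and the queue may be stopped.
 * Returns true if the host was signaled.
 */
static bool send_pkt_sketch(struct channel_sketch *ch, bool xmit_more,
			    unsigned int ring_avail_percent)
{
	assert(ring_avail_percent <= 100);

	/* Near the low-water mark: signal regardless of xmit_more. */
	if (ring_avail_percent < RING_AVAIL_PERCENT_LOWATER + 1)
		xmit_more = false;

	ch->pending++;
	if (xmit_more)
		return false;

	ch->signals++;
	ch->pending = 0;
	return true;
}
```

Sending three packets with xmit_more set on the first two produces a single signal for the whole batch, while a packet sent when the ring is at or below the low-water mark is signaled immediately even if xmit_more is set.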


Re: [PATCH v2] brcmfmac: prohibit ACPI power management for brcmfmac driver

2015-05-10 Thread Fu, Zhonghui



On 2015/5/7 16:30, Arend van Spriel wrote:
> On 05/07/15 04:07, Fu, Zhonghui wrote:
>>
>> Hi,
>>
>> Any comments are welcome.
>
> Having some time to spare while spending my vacation so here it is.
>
>> Thanks,
>> Zhonghui
>>
>> On 2015/5/3 23:26, Fu, Zhonghui wrote:
>>> ACPI will manage WiFi chip's power state during suspend/resume
>>> process on some tablet platforms(such as ASUS T100TA). This is
>>> not supported by brcmfmac driver now, and the context of WiFi
>>> chip will be damaged after resume. This patch informs ACPI not
>>> to manage WiFi chip's power state.
>>>
>>> Signed-off-by: Zhonghui Fu
>>> ---
>>> Changes in v2:
>>> - Another implementation.
>>>
>>>   drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c |8 
>>>   1 files changed, 8 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c 
>>> b/drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c
>>> index 9b508bd..6c519e3 100644
>>> --- a/drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c
>>> +++ b/drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c
>>> @@ -33,6 +33,7 @@
>>>   #include
>>>   #include
>>>   #include
>>> +#include
>>>   #include
>>>
>>>   #include
>>> @@ -1114,6 +1115,8 @@ static int brcmf_ops_sdio_probe(struct sdio_func 
>>> *func,
>>>   int err;
>>>   struct brcmf_sdio_dev *sdiodev;
>>>   struct brcmf_bus *bus_if;
>>> +struct device *dev;
>>> +struct acpi_device *adev;
>>>
>>>   brcmf_dbg(SDIO, "Enter\n");
>>>   brcmf_dbg(SDIO, "Class=%x\n", func->class);
>>> @@ -1121,6 +1124,11 @@ static int brcmf_ops_sdio_probe(struct sdio_func 
>>> *func,
>>>   brcmf_dbg(SDIO, "sdio device ID: 0x%04x\n", func->device);
>>>   brcmf_dbg(SDIO, "Function#: %d\n", func->num);
>>>
>>> +/* prohibit ACPI power management for this device */
>>> +dev = &func->dev;
>>> +if (adev = ACPI_COMPANION(dev))
>
> While I understand what you are doing here it makes someone reading the code 
> wonder whether a mistake has been made. So I would prefer to have the 
> assignment separate from the if statement.
>
> For the update patch you may add:
>
> Acked-by: Arend van Spriel 
Many thanks for your comments. I have sent out the v3 patch.


Thanks,
Zhonghui
>
> Regards,
> Arend
>>> +adev->flags.power_manageable = 0;
>>> +
>>>   /* Consume func num 1 but dont do anything with it. */
>>>   if (func->num == 1)
>>>   return 0;
>>> -- 1.7.1
>>>
>>
>
> -- 
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3] brcmfmac: prohibit ACPI power management for brcmfmac driver

2015-05-10 Thread Fu, Zhonghui

ACPI will manage WiFi chip's power state during suspend/resume
process on some tablet platforms(such as ASUS T100TA). This is
not supported by brcmfmac driver now, and the context of WiFi
chip will be damaged after resume. This patch informs ACPI not
to manage WiFi chip's power state.

Signed-off-by: Zhonghui Fu 
Acked-by: Arend van Spriel 
---
Changes in v3:
- Separate the assignment from the if statement.

 drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c 
b/drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c
index 9b508bd..c960a12 100644
--- a/drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c
+++ b/drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -1114,6 +1115,8 @@ static int brcmf_ops_sdio_probe(struct sdio_func *func,
int err;
struct brcmf_sdio_dev *sdiodev;
struct brcmf_bus *bus_if;
+   struct device *dev;
+   struct acpi_device *adev;
 
brcmf_dbg(SDIO, "Enter\n");
brcmf_dbg(SDIO, "Class=%x\n", func->class);
@@ -1121,6 +1124,12 @@ static int brcmf_ops_sdio_probe(struct sdio_func *func,
brcmf_dbg(SDIO, "sdio device ID: 0x%04x\n", func->device);
brcmf_dbg(SDIO, "Function#: %d\n", func->num);
 
+   /* prohibit ACPI power management for this device */
+   dev = &func->dev;
+   adev = ACPI_COMPANION(dev);
+   if (adev)
+   adev->flags.power_manageable = 0;
+
/* Consume func num 1 but dont do anything with it. */
if (func->num == 1)
return 0;
-- 1.7.1
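
The v3 change keeps the lookup and the NULL test as separate statements. A minimal userspace sketch of that pattern follows; fake_dev, fake_acpi_dev and fake_companion are hypothetical stand-ins for struct device, struct acpi_device and ACPI_COMPANION(), used only so the example compiles on its own.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct acpi_device. */
struct fake_acpi_dev {
	unsigned int power_manageable;
};

/* Hypothetical stand-in for struct device. */
struct fake_dev {
	struct fake_acpi_dev *companion;	/* NULL when there is no ACPI node */
};

/* Hypothetical stand-in for ACPI_COMPANION(). */
static struct fake_acpi_dev *fake_companion(struct fake_dev *dev)
{
	return dev->companion;
}

/* Returns 1 if the flag was cleared, 0 if the device has no companion. */
static int prohibit_acpi_pm(struct fake_dev *dev)
{
	struct fake_acpi_dev *adev;

	/* Assignment kept separate from the test, as requested in review. */
	adev = fake_companion(dev);
	if (!adev)
		return 0;

	adev->power_manageable = 0;
	return 1;
}
```

Writing it this way avoids the `if (adev = ...)` form, which a reader can mistake for a mistyped comparison.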


[PATCH] ASoC: dapm: Modify widget stream name according to prefix

2015-05-10 Thread Koro Chen

When a prefix is specified, we currently add it to widget->name,
but not to widget->sname.
This causes a failure in snd_soc_dapm_link_dai_widgets at:

if (!w->sname || !strstr(w->sname, dai_w->name))

because dai_w->name has the prefix added, but w->sname does not.
We should also add the prefix to the stream name.

Signed-off-by: Koro Chen 
---
 sound/soc/soc-dapm.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c
index defe0f0..158204d 100644
--- a/sound/soc/soc-dapm.c
+++ b/sound/soc/soc-dapm.c
@@ -3100,11 +3100,16 @@ snd_soc_dapm_new_control(struct snd_soc_dapm_context 
*dapm,
}
 
prefix = soc_dapm_prefix(dapm);
-   if (prefix)
+   if (prefix) {
w->name = kasprintf(GFP_KERNEL, "%s %s", prefix, widget->name);
-   else
+   if (widget->sname)
+   w->sname = kasprintf(GFP_KERNEL, "%s %s", prefix,
+widget->sname);
+   } else {
w->name = kasprintf(GFP_KERNEL, "%s", widget->name);
-
+   if (widget->sname)
+   w->sname = kasprintf(GFP_KERNEL, "%s", widget->sname);
+   }
if (w->name == NULL) {
kfree(w);
return NULL;
-- 
1.8.1.1.dirty
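
The mismatch described in the commit message can be reproduced in userspace with plain string handling. The sketch below is only an illustration: build_name and stream_matches are hypothetical helpers, and the "AMP" prefix and "Playback" stream name are made-up values.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<prefix> <name>" when a prefix is given, otherwise copy the name.
 * Returns false if the result did not fit in the buffer.
 */
static bool build_name(char *buf, size_t len, const char *prefix,
		       const char *name)
{
	int n;

	if (prefix)
		n = snprintf(buf, len, "%s %s", prefix, name);
	else
		n = snprintf(buf, len, "%s", name);

	return n >= 0 && (size_t)n < len;
}

/* Same shape as the test quoted from snd_soc_dapm_link_dai_widgets(). */
static bool stream_matches(const char *sname, const char *dai_name)
{
	return sname != NULL && strstr(sname, dai_name) != NULL;
}
```

With a prefixed DAI widget name such as "AMP Playback", an unprefixed stream name "Playback" does not match, while a stream name built with the same prefix does.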


[PATCH] aoe: Use 64-bit timestamp in frame

2015-05-10 Thread Tina Ruchandani

'struct frame' uses two variables to store the sent timestamp: 'struct
timeval' and jiffies. jiffies is used to avoid discrepancies caused by
updates to system time. 'struct timeval' uses a 32-bit representation for
seconds, which will overflow in year 2038.
This patch does the following:
- Replace the use of 'struct timeval' and jiffies with ktime_t, which
is a 64-bit timestamp and is year 2038 safe.
- ktime_t provides both long range (like jiffies) and high resolution
(like timeval). Using ktime_get (monotonic time) instead of wall-clock
time prevents any discrepancies caused by updates to system time.

Signed-off-by: Tina Ruchandani 
---
 drivers/block/aoe/aoe.h|  3 +--
 drivers/block/aoe/aoecmd.c | 36 +++-
 2 files changed, 8 insertions(+), 31 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 9220f8e..4582b3c 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -112,8 +112,7 @@ enum frame_flags {
 struct frame {
struct list_head head;
u32 tag;
-   struct timeval sent;/* high-res time packet was sent */
-   u32 sent_jiffs; /* low-res jiffies-based sent time */
+   ktime_t sent;
ulong waited;
ulong waited_total;
struct aoetgt *t;   /* parent target I belong to */
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 422b7d8..7f78780 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -398,8 +398,7 @@ aoecmd_ata_rw(struct aoedev *d)
 
skb = skb_clone(f->skb, GFP_ATOMIC);
if (skb) {
-   do_gettimeofday(&f->sent);
-   f->sent_jiffs = (u32) jiffies;
+   f->sent = ktime_get();
__skb_queue_head_init(&queue);
__skb_queue_tail(&queue, skb);
aoenet_xmit(&queue);
@@ -489,8 +488,7 @@ resend(struct aoedev *d, struct frame *f)
skb = skb_clone(skb, GFP_ATOMIC);
if (skb == NULL)
return;
-   do_gettimeofday(&f->sent);
-   f->sent_jiffs = (u32) jiffies;
+   f->sent = ktime_get();
__skb_queue_head_init(&queue);
__skb_queue_tail(&queue, skb);
aoenet_xmit(&queue);
@@ -499,32 +497,15 @@ resend(struct aoedev *d, struct frame *f)
 static int
 tsince_hr(struct frame *f)
 {
-   struct timeval now;
+   ktime_t now;
int n;
 
-   do_gettimeofday(&now);
-   n = now.tv_usec - f->sent.tv_usec;
-   n += (now.tv_sec - f->sent.tv_sec) * USEC_PER_SEC;
+   now = ktime_get();
+   n = ktime_to_us(ktime_sub(now, f->sent));
 
if (n < 0)
n = -n;
 
-   /* For relatively long periods, use jiffies to avoid
-* discrepancies caused by updates to the system time.
-*
-* On system with HZ of 1000, 32-bits is over 49 days
-* worth of jiffies, or over 71 minutes worth of usecs.
-*
-* Jiffies overflow is handled by subtraction of unsigned ints:
-* (gdb) print (unsigned) 2 - (unsigned) 0xfffffffe
-* $3 = 4
-* (gdb)
-*/
-   if (n > USEC_PER_SEC / 4) {
-   n = ((u32) jiffies) - f->sent_jiffs;
-   n *= USEC_PER_SEC / HZ;
-   }
-
return n;
 }
 
@@ -589,7 +570,6 @@ reassign_frame(struct frame *f)
nf->waited = 0;
nf->waited_total = f->waited_total;
nf->sent = f->sent;
-   nf->sent_jiffs = f->sent_jiffs;
f->skb = skb;
 
return nf;
@@ -633,8 +613,7 @@ probe(struct aoetgt *t)
 
skb = skb_clone(f->skb, GFP_ATOMIC);
if (skb) {
-   do_gettimeofday(&f->sent);
-   f->sent_jiffs = (u32) jiffies;
+   f->sent = ktime_get();
__skb_queue_head_init(&queue);
__skb_queue_tail(&queue, skb);
aoenet_xmit(&queue);
@@ -1474,8 +1453,7 @@ aoecmd_ata_id(struct aoedev *d)
 
skb = skb_clone(skb, GFP_ATOMIC);
if (skb) {
-   do_gettimeofday(&f->sent);
-   f->sent_jiffs = (u32) jiffies;
+   f->sent = ktime_get();
}
 
return skb;
-- 
2.2.0.rc0.207.ga3a616c
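
To make the before/after arithmetic concrete, here is a small standalone sketch. It does not use kernel APIs: jiffies_delta mirrors the unsigned 32-bit subtraction described in the removed comment, and ns_delta_to_us assumes timestamps are signed 64-bit nanosecond counts, which is how the example models a ktime_t value. Both function names are hypothetical.

```c
#include <stdint.h>

/* Old scheme: 32-bit jiffies difference; unsigned subtraction absorbs one wrap. */
static uint32_t jiffies_delta(uint32_t now, uint32_t sent)
{
	return now - sent;
}

/* New scheme: 64-bit nanosecond timestamps, difference reported in usecs. */
static int64_t ns_delta_to_us(int64_t now_ns, int64_t sent_ns)
{
	return (now_ns - sent_ns) / 1000;
}
```

A signed 64-bit nanosecond counter covers roughly 292 years, so a single monotonic timestamp gives both the range and the resolution that previously needed two fields.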


linux-next: manual merge of the wireless-drivers-next tree with the net-next tree

2015-05-10 Thread Stephen Rothwell

Hi Kalle,

Today's linux-next merge of the wireless-drivers-next tree got a
conflict in drivers/net/wireless/ath/ath10k/mac.c between commits
41fbf6e4f317 ("ath10k: enable IEEE80211_HW_SUPPORT_FAST_XMIT") and
df1404650ccb ("mac80211: remove support for IFF_PROMISC") from the
net-next tree and commits 548462133d98 ("ath10k: fix interrupt storm"),
cc9904e694fa ("ath10k: add hw connection monitor support") and
500ff9f9389d ("ath10k: implement chanctx API") (and a few others) from
the wireless-drivers-next tree.

I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au

diff --cc drivers/net/wireless/ath/ath10k/mac.c
index fcd08b2f8d26,eaa0182e001d..
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@@ -766,9 -1031,68 +1031,48 @@@ static int ath10k_monitor_stop(struct a
return 0;
  }
  
 -static bool ath10k_mac_should_disable_promisc(struct ath10k *ar)
 -{
 -  struct ath10k_vif *arvif;
 -
 -  if (!(ar->filter_flags & FIF_PROMISC_IN_BSS))
 -  return true;
 -
 -  if (!ar->num_started_vdevs)
 -  return false;
 -
 -  list_for_each_entry(arvif, &ar->arvifs, list)
 -  if (arvif->vdev_type != WMI_VDEV_TYPE_AP)
 -  return false;
 -
 -  ath10k_dbg(ar, ATH10K_DBG_MAC,
 - "mac disabling promiscuous mode because vdev is started\n");
 -  return true;
 -}
 -
+ static bool ath10k_mac_monitor_vdev_is_needed(struct ath10k *ar)
+ {
+   int num_ctx;
+ 
+   /* At least one chanctx is required to derive a channel to start
+* monitor vdev on.
+*/
+   num_ctx = ath10k_mac_num_chanctxs(ar);
+   if (num_ctx == 0)
+   return false;
+ 
+   /* If there's already an existing special monitor interface then don't
+* bother creating another monitor vdev.
+*/
+   if (ar->monitor_arvif)
+   return false;
+ 
+   return ar->monitor ||
 - !ath10k_mac_should_disable_promisc(ar) ||
+  test_bit(ATH10K_CAC_RUNNING, &ar->dev_flags);
+ }
+ 
+ static bool ath10k_mac_monitor_vdev_is_allowed(struct ath10k *ar)
+ {
+   int num_ctx;
+ 
+   num_ctx = ath10k_mac_num_chanctxs(ar);
+ 
+   /* FIXME: Current interface combinations and cfg80211/mac80211 code
+* shouldn't allow this but make sure to prevent handling the following
+* case anyway since multi-channel DFS hasn't been tested at all.
+*/
+   if (test_bit(ATH10K_CAC_RUNNING, &ar->dev_flags) && num_ctx > 1)
+   return false;
+ 
+   return true;
+ }
+ 
  static int ath10k_monitor_recalc(struct ath10k *ar)
  {
-   bool should_start;
+   bool needed;
+   bool allowed;
+   int ret;
  
lockdep_assert_held(&ar->conf_mutex);
  
@@@ -5499,9 -6915,14 +6894,15 @@@ int ath10k_mac_register(struct ath10k *
IEEE80211_HW_AP_LINK_PS |
IEEE80211_HW_SPECTRUM_MGMT |
IEEE80211_HW_SW_CRYPTO_CONTROL |
-   IEEE80211_HW_SUPPORT_FAST_XMIT;
++  IEEE80211_HW_SUPPORT_FAST_XMIT |
+   IEEE80211_HW_CONNECTION_MONITOR |
+   IEEE80211_HW_SUPPORTS_PER_STA_GTK |
+   IEEE80211_HW_WANT_MONITOR_VIF |
+   IEEE80211_HW_CHANCTX_STA_CSA |
+   IEEE80211_HW_QUEUE_CONTROL;
  
ar->hw->wiphy->features |= NL80211_FEATURE_STATIC_SMPS;
+   ar->hw->wiphy->flags |= WIPHY_FLAG_IBSS_RSN;
  
if (ar->ht_cap_info & WMI_HT_CAP_DYNAMIC_SMPS)
ar->hw->wiphy->features |= NL80211_FEATURE_DYNAMIC_SMPS;



[PATCH V9 1/8] perf, x86: use the PEBS auto reload mechanism when possible

2015-05-10 Thread Kan Liang

From: Yan, Zheng 

When a fixed period is specified, this patch makes perf use the PEBS
auto reload mechanism. This makes normal profiling faster, because
it avoids one costly MSR write in the PMI handler.
However, the reset value will be loaded by hardware assist. There is a
small delay compared to the previous non-auto-reload mechanism. The
delay time is arbitrary, but very small. The assist cost is 400-800
cycles, assuming common cases with everything cached. The minimum period
the patch currently uses is 1. In that extreme case the overhead can be
~10% if cycles are used.

Signed-off-by: Yan, Zheng 
Signed-off-by: Kan Liang 
---
 arch/x86/kernel/cpu/perf_event.c  | 15 +--
 arch/x86/kernel/cpu/perf_event.h  |  1 +
 arch/x86/kernel/cpu/perf_event_intel.c|  8 ++--
 arch/x86/kernel/cpu/perf_event_intel_ds.c |  7 +++
 4 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 87848eb..8cc1153 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1058,13 +1058,16 @@ int x86_perf_event_set_period(struct perf_event *event)
 
per_cpu(pmc_prev_left[idx], smp_processor_id()) = left;
 
-   /*
-* The hw event starts counting from this event offset,
-* mark it to be able to extra future deltas:
-*/
-   local64_set(&hwc->prev_count, (u64)-left);
+   if (!(hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) ||
+   local64_read(&hwc->prev_count) != (u64)-left) {
+   /*
+* The hw event starts counting from this event offset,
+* mark it to be able to extra future deltas:
+*/
+   local64_set(&hwc->prev_count, (u64)-left);
 
-   wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);
+   wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);
+   }
 
/*
 * Due to erratum on certan cpu we need
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 6ac5cb7..1cb5859 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -74,6 +74,7 @@ struct event_constraint {
 #define PERF_X86_EVENT_EXCL0x0040 /* HT exclusivity on counter */
 #define PERF_X86_EVENT_DYNAMIC 0x0080 /* dynamic alloc'd constraint */
 #define PERF_X86_EVENT_RDPMC_ALLOWED   0x0100 /* grant rdpmc permission */
+#define PERF_X86_EVENT_AUTO_RELOAD 0x0200 /* use PEBS auto-reload */
 
 
 struct amd_nb {
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c 
b/arch/x86/kernel/cpu/perf_event_intel.c
index 960e85d..3119071 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2305,8 +2305,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
if (ret)
return ret;
 
-   if (event->attr.precise_ip && x86_pmu.pebs_aliases)
-   x86_pmu.pebs_aliases(event);
+   if (event->attr.precise_ip) {
+   if (!event->attr.freq)
+   event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
+   if (x86_pmu.pebs_aliases)
+   x86_pmu.pebs_aliases(event);
+   }
 
if (needs_branch_stack(event)) {
ret = intel_pmu_setup_lbr_filter(event);
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 813f75d..f856f73 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -688,6 +688,7 @@ void intel_pmu_pebs_enable(struct perf_event *event)
 {
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
+   struct debug_store *ds = cpuc->ds;
 
hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
 
@@ -697,6 +698,12 @@ void intel_pmu_pebs_enable(struct perf_event *event)
cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
cpuc->pebs_enabled |= 1ULL << 63;
+
+   /* Use auto-reload if possible to save a MSR write in the PMI */
+   if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) {
+   ds->pebs_event_reset[hwc->idx] =
+   (u64)(-hwc->sample_period) & x86_pmu.cntval_mask;
+   }
 }
 
 void intel_pmu_pebs_disable(struct perf_event *event)
-- 
1.8.3.1
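
The reset value written into the DS area in the last hunk is the negated sample period truncated to the counter width. A minimal standalone sketch of that computation follows; the 48-bit CNTVAL_MASK is an assumption made for the example (the kernel reads the real mask from x86_pmu.cntval_mask), and pebs_reset_value is a hypothetical helper name.

```c
#include <stdint.h>

/* Assumed 48-bit counter width; the kernel uses x86_pmu.cntval_mask instead. */
#define CNTVAL_MASK ((1ULL << 48) - 1)

/*
 * Value the hardware reloads into the counter after each PEBS record:
 * the negated period truncated to the counter width, so the counter
 * overflows again after 'period' further events.
 */
static uint64_t pebs_reset_value(uint64_t period)
{
	return (uint64_t)(-period) & CNTVAL_MASK;
}
```

With a period of 1 the reset value is the all-ones mask, so the counter overflows on the very next event; for any period, adding the period back to the reset value wraps the masked counter to zero.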


[PATCH V9 2/8] perf, x86: introduce setup_pebs_sample_data()

2015-05-10 Thread Kan Liang

From: Yan, Zheng 

Move the code that sets up PEBS sample data into a separate function.

Signed-off-by: Yan, Zheng 
Signed-off-by: Kan Liang 
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 95 +--
 1 file changed, 52 insertions(+), 43 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index f856f73..f26c8b4 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -853,8 +853,10 @@ static inline u64 intel_hsw_transaction(struct 
pebs_record_hsw *pebs)
return txn;
 }
 
-static void __intel_pmu_pebs_event(struct perf_event *event,
-  struct pt_regs *iregs, void *__pebs)
+static void setup_pebs_sample_data(struct perf_event *event,
+  struct pt_regs *iregs, void *__pebs,
+  struct perf_sample_data *data,
+  struct pt_regs *regs)
 {
 #define PERF_X86_EVENT_PEBS_HSW_PREC \
(PERF_X86_EVENT_PEBS_ST_HSW | \
@@ -866,30 +868,25 @@ static void __intel_pmu_pebs_event(struct perf_event 
*event,
 */
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct pebs_record_hsw *pebs = __pebs;
-   struct perf_sample_data data;
-   struct pt_regs regs;
u64 sample_type;
int fll, fst, dsrc;
int fl = event->hw.flags;
 
-   if (!intel_pmu_save_and_restart(event))
-   return;
-
sample_type = event->attr.sample_type;
dsrc = sample_type & PERF_SAMPLE_DATA_SRC;
 
fll = fl & PERF_X86_EVENT_PEBS_LDLAT;
fst = fl & (PERF_X86_EVENT_PEBS_ST | PERF_X86_EVENT_PEBS_HSW_PREC);
 
-   perf_sample_data_init(&data, 0, event->hw.last_period);
+   perf_sample_data_init(data, 0, event->hw.last_period);
 
-   data.period = event->hw.last_period;
+   data->period = event->hw.last_period;
 
/*
 * Use latency for weight (only avail with PEBS-LL)
 */
if (fll && (sample_type & PERF_SAMPLE_WEIGHT))
-   data.weight = pebs->lat;
+   data->weight = pebs->lat;
 
/*
 * data.data_src encodes the data source
@@ -902,7 +899,7 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
val = precise_datala_hsw(event, pebs->dse);
else if (fst)
val = precise_store_data(pebs->dse);
-   data.data_src.val = val;
+   data->data_src.val = val;
}
 
/*
@@ -915,58 +912,70 @@ static void __intel_pmu_pebs_event(struct perf_event 
*event,
 * PERF_SAMPLE_IP and PERF_SAMPLE_CALLCHAIN to function properly.
 * A possible PERF_SAMPLE_REGS will have to transfer all regs.
 */
-   regs = *iregs;
-   regs.flags = pebs->flags;
-   set_linear_ip(&regs, pebs->ip);
-   regs.bp = pebs->bp;
-   regs.sp = pebs->sp;
+   *regs = *iregs;
+   regs->flags = pebs->flags;
+   set_linear_ip(regs, pebs->ip);
+   regs->bp = pebs->bp;
+   regs->sp = pebs->sp;
 
if (sample_type & PERF_SAMPLE_REGS_INTR) {
-   regs.ax = pebs->ax;
-   regs.bx = pebs->bx;
-   regs.cx = pebs->cx;
-   regs.dx = pebs->dx;
-   regs.si = pebs->si;
-   regs.di = pebs->di;
-   regs.bp = pebs->bp;
-   regs.sp = pebs->sp;
-
-   regs.flags = pebs->flags;
+   regs->ax = pebs->ax;
+   regs->bx = pebs->bx;
+   regs->cx = pebs->cx;
+   regs->dx = pebs->dx;
+   regs->si = pebs->si;
+   regs->di = pebs->di;
+   regs->bp = pebs->bp;
+   regs->sp = pebs->sp;
+
+   regs->flags = pebs->flags;
 #ifndef CONFIG_X86_32
-   regs.r8 = pebs->r8;
-   regs.r9 = pebs->r9;
-   regs.r10 = pebs->r10;
-   regs.r11 = pebs->r11;
-   regs.r12 = pebs->r12;
-   regs.r13 = pebs->r13;
-   regs.r14 = pebs->r14;
-   regs.r15 = pebs->r15;
+   regs->r8 = pebs->r8;
+   regs->r9 = pebs->r9;
+   regs->r10 = pebs->r10;
+   regs->r11 = pebs->r11;
+   regs->r12 = pebs->r12;
+   regs->r13 = pebs->r13;
+   regs->r14 = pebs->r14;
+   regs->r15 = pebs->r15;
 #endif
}
 
if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format >= 2) {
-   regs.ip = pebs->real_ip;
-   regs.flags |= PERF_EFLAGS_EXACT;
-   } else if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(&regs))
-   regs.flags |= PERF_EFLAGS_EXACT;
+   regs->ip = pebs->real_ip;
+   regs->flags |= PERF_EFLAGS_EXACT;
+   } else if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(regs))
+

[PATCH V9 5/8] perf, x86: drain PEBS buffer during context switch

2015-05-10 Thread Kan Liang

From: Yan, Zheng 

Flush the PEBS buffer during context switch if the PEBS interrupt threshold
is larger than one. This allows perf to supply TID for sample outputs.

Signed-off-by: Yan, Zheng 
Signed-off-by: Kan Liang 
---
 arch/x86/kernel/cpu/perf_event.h   |  6 +-
 arch/x86/kernel/cpu/perf_event_intel.c | 11 +-
 arch/x86/kernel/cpu/perf_event_intel_ds.c  | 32 ++
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |  3 ---
 4 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 626ded3..8746c61 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -91,9 +91,11 @@ struct amd_nb {
 /*
  * Flags PEBS can handle without an PMI.
  *
+ * TID can only be handled by flushing at context switch.
+ *
  */
 #define PEBS_FREERUNNING_FLAGS \
-   (PERF_SAMPLE_IP | PERF_SAMPLE_ADDR | \
+   (PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR | \
PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID | \
PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER | \
PERF_SAMPLE_TRANSACTION)
@@ -872,6 +874,8 @@ void intel_pmu_pebs_enable_all(void);
 
 void intel_pmu_pebs_disable_all(void);
 
+void intel_pmu_pebs_sched_task(struct perf_event_context *ctx, bool sched_in);
+
 void intel_ds_init(void);
 
 void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in);
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c 
b/arch/x86/kernel/cpu/perf_event_intel.c
index fdf818a..c4c5e1f 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2702,6 +2702,15 @@ static void intel_pmu_cpu_dying(int cpu)
fini_debug_store_on_cpu(cpu);
 }
 
+static void intel_pmu_sched_task(struct perf_event_context *ctx,
+bool sched_in)
+{
+   if (x86_pmu.pebs_active)
+   intel_pmu_pebs_sched_task(ctx, sched_in);
+   if (x86_pmu.lbr_nr)
+   intel_pmu_lbr_sched_task(ctx, sched_in);
+}
+
 PMU_FORMAT_ATTR(offcore_rsp, "config1:0-63");
 
 PMU_FORMAT_ATTR(ldlat, "config1:0-15");
@@ -2791,7 +2800,7 @@ static __initconst const struct x86_pmu intel_pmu = {
.cpu_starting   = intel_pmu_cpu_starting,
.cpu_dying  = intel_pmu_cpu_dying,
.guest_get_msrs = intel_guest_get_msrs,
-   .sched_task = intel_pmu_lbr_sched_task,
+   .sched_task = intel_pmu_sched_task,
 };
 
 static __init void intel_clovertown_quirk(void)
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 0b727d1..ba02783 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -546,6 +546,19 @@ int intel_pmu_drain_bts_buffer(void)
return 1;
 }
 
+static inline void intel_pmu_drain_pebs_buffer(void)
+{
+   struct pt_regs regs;
+
+   x86_pmu.drain_pebs(&regs);
+}
+
+void intel_pmu_pebs_sched_task(struct perf_event_context *ctx, bool sched_in)
+{
+   if (!sched_in)
+   intel_pmu_drain_pebs_buffer();
+}
+
 /*
  * PEBS
  */
@@ -711,8 +724,19 @@ void intel_pmu_pebs_enable(struct perf_event *event)
if (hwc->flags & PERF_X86_EVENT_FREERUNNING) {
threshold = ds->pebs_absolute_maximum -
x86_pmu.max_pebs_events * x86_pmu.pebs_record_size;
+
+   if (first_pebs)
+   perf_sched_cb_inc(event->ctx->pmu);
} else {
threshold = ds->pebs_buffer_base + x86_pmu.pebs_record_size;
+
+   /*
+* If not all events can use larger buffer,
+* roll back to threshold = 1
+*/
+   if (!first_pebs &&
+   (ds->pebs_interrupt_threshold > threshold))
+   perf_sched_cb_dec(event->ctx->pmu);
}
 
/* Use auto-reload if possible to save a MSR write in the PMI */
@@ -729,6 +753,7 @@ void intel_pmu_pebs_disable(struct perf_event *event)
 {
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
+   struct debug_store *ds = cpuc->ds;
 
cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
 
@@ -737,6 +762,13 @@ void intel_pmu_pebs_disable(struct perf_event *event)
else if (event->hw.constraint->flags & PERF_X86_EVENT_PEBS_ST)
cpuc->pebs_enabled &= ~(1ULL << 63);
 
+   if (ds->pebs_interrupt_threshold >
+   ds->pebs_buffer_base + x86_pmu.pebs_record_size) {
+   intel_pmu_drain_pebs_buffer();
+   if (!pebs_is_enabled(cpuc))
+   perf_sched_cb_dec(event->ctx->pmu);
+   }
+
if (cpuc->enabled)
wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
 
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c 
b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index

Re: [PATCH v8 2/3] leds: ktd2692: add device tree bindings for ktd2692

2015-05-10 Thread Ingi Kim

Hi Jacek,

On 2015-05-08 17:33, Jacek Anaszewski wrote:
> Hi Ingi,
> 
> On 05/08/2015 05:03 AM, Ingi Kim wrote:
>> This patch adds the device tree bindings for ktd2692 flash LEDs.
>> Add Optional properties of child node for Flash LED
>>
>> Signed-off-by: Ingi Kim 
>> Acked-by: Seung-Woo Kim 
>> Reviewed-by: Varka Bhadram 
>> ---
>>   .../devicetree/bindings/leds/leds-ktd2692.txt  | 50 
>> ++
>>   1 file changed, 50 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/leds/leds-ktd2692.txt
>>
>> diff --git a/Documentation/devicetree/bindings/leds/leds-ktd2692.txt 
>> b/Documentation/devicetree/bindings/leds/leds-ktd2692.txt
>> new file mode 100644
>> index 000..cf45492
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/leds/leds-ktd2692.txt
>> @@ -0,0 +1,50 @@
>> +* Kinetic Technologies - KTD2692 Flash LED Driver
>> +
>> +KTD2692 is the ideal power solution for high-power flash LEDs.
>> +It uses ExpressWire single-wire programming for maximum flexibility.
>> +
>> +The ExpressWire interface through CTRL pin can control LED on/off and
>> +enable/disable the IC, Movie(max 1/3 of Flash current) / Flash mode current,
>> +Flash timeout, LVP(low voltage protection).
>> +
>> +Also, When the AUX pin is pulled high while CTRL pin is high,
>> +LED current will be ramped up to the flash-mode current level.
>> +
>> +Required properties:
>> +- compatible: "kinetic,ktd2692"
> 
> How about:
> 
> compatible : Should be "kinetic,ktd2692".
> 
> Please put a space between the property name and the colon consistently.
> 
>> +- ctrl-gpio : gpio pin in order control CTRL pin
>> +- aux-gpio : gpio pin in order control AUX pin
> 
> Documentation/devicetree/bindings/gpio/gpio.txt states:
> 
> "GPIO properties should be named "[<name>-]gpios".
> 
> Please adhere to this. Also please begin description after colon
> with a capital letter and put a dot at the end of sentence, e.g.:
> 
> ctrl-gpios : Specifier of the GPIO connected to CTRL pin.
> aux-gpios : Specifier of the GPIO connected to AUX pin.
> 

I should have read the binding documentation more carefully.
I'll fix it.

>> +Optional properties:
>> +- vin-supply : "vin" LED supply (2.7V to 5.5V)
>> +  See Documentation/devicetree/bindings/regulator/regulator.txt
>> +
>> +A discrete LED element connected to the device must be represented by a 
>> child
>> +node - see Documentation/devicetree/bindings/leds/common.txt.
>> +
>> +Required properties for flash LED child nodes:
>> +  See Documentation/devicetree/bindings/leds/common.txt
>> +- led-max-microamp : Minimum Threshold for Timer protection
>> +  is defined internally (Maximum 300mA)
>> +- flash-max-microamp : Flash LED maximum current
>> +  Formula : I(mA) = 15000 / Rset
>> +- flash-max-timeout-us : Flash LED maximum timeout
>> +
>> +Optional properties for flash LED child nodes:
>> +- label : see Documentation/devicetree/bindings/leds/common.txt
>> +
>> +Example:
>> +
>> +ktd2692 {
>> +compatible = "kinetic,ktd2692";
>> +ctrl-gpio = < 1 0>;
>> +aux-gpio = < 2 0>;
> 
> ctrl-gpios = < 1 0>;
> aux-gpios = < 2 0>;
> 

Check it!

>> +vin-supply = <>;
>> +
>> +flash-led {
>> +label = "ktd2692-flash";
>> +led-max-microamp = <30>;
>> +flash-max-microamp = <150>;
>> +flash-max-timeout-us = <1835000>;
>> +};
>> +};
>>
> 
> 

[PATCH V9 6/8] perf, x86: enlarge PEBS buffer

2015-05-10 Thread Kan Liang

From: Yan, Zheng 

Currently the PEBS buffer size is 4k, so it can only hold about 21
PEBS records. This patch enlarges the PEBS buffer size to 64k
(the same as the BTS buffer); 64k of memory can hold about 330 PEBS
records. This will significantly reduce the number of PMIs when a
large PEBS interrupt threshold is used.

Signed-off-by: Yan, Zheng 
Signed-off-by: Kan Liang 
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index ba02783..328b10c 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -11,7 +11,7 @@
 #define BTS_RECORD_SIZE24
 
 #define BTS_BUFFER_SIZE(PAGE_SIZE << 4)
-#define PEBS_BUFFER_SIZE   PAGE_SIZE
+#define PEBS_BUFFER_SIZE   (PAGE_SIZE << 4)
 #define PEBS_FIXUP_SIZEPAGE_SIZE
 
 /*
-- 
1.8.3.1
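
The record counts quoted in the commit message follow from dividing the buffer size by the size of one PEBS record. A minimal sketch of that arithmetic is below; pebs_records_per_buffer is a hypothetical helper, and the 192-byte record size used in the note after the code is an assumption picked because it is consistent with the "about 21" figure for a 4k buffer. The real record size depends on the PEBS format in use.

```c
#include <stddef.h>

/* Number of whole records that fit in a buffer of the given size. */
static size_t pebs_records_per_buffer(size_t buffer_size, size_t record_size)
{
	/* Guard against a zero record size rather than dividing by zero. */
	return record_size ? buffer_size / record_size : 0;
}
```

Assuming 4096-byte pages and 192-byte records, a one-page buffer holds 21 whole records and a PAGE_SIZE << 4 buffer holds a few hundred, which is the order of magnitude the message describes.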


[PATCH V9 7/8] perf, x86: introduce PERF_RECORD_LOST_SAMPLES

2015-05-10 Thread Kan Liang

From: Kan Liang 

After enlarging the PEBS interrupt threshold, there may be some mixed up
PEBS samples which are discarded by the kernel. This patch makes the kernel
emit a PERF_RECORD_LOST_SAMPLES record with the number of possible
discards when it is impossible to demux the samples. It makes sure the
user is not left in the dark about such discards.

Signed-off-by: Kan Liang 
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 20 +++
 include/linux/perf_event.h|  3 +++
 include/uapi/linux/perf_event.h   | 12 +++
 kernel/events/core.c  | 33 +++
 4 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 328b10c..18afea0b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -1127,6 +1127,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs 
*iregs)
void *base, *at, *top;
int bit;
short counts[MAX_PEBS_EVENTS] = {};
+   short error[MAX_PEBS_EVENTS] = {};
 
if (!x86_pmu.pebs_active)
return;
@@ -1170,21 +1171,32 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs 
*iregs)
/* slow path */
pebs_status = p->status & cpuc->pebs_enabled;
pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
-   if (pebs_status != (1 << bit))
+   if (pebs_status != (1 << bit)) {
+   u8 i;
+
+   for_each_set_bit(i, (unsigned long 
*)&pebs_status,
+MAX_PEBS_EVENTS)
+   error[i]++;
continue;
+   }
}
counts[bit]++;
}
 
for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
-   if (counts[bit] == 0)
+   if ((counts[bit] == 0) && (error[bit] == 0))
continue;
event = cpuc->events[bit];
WARN_ON_ONCE(!event);
WARN_ON_ONCE(!event->attr.precise_ip);
 
-   __intel_pmu_pebs_event(event, iregs, base,
-  top, bit, counts[bit]);
+   /* log dropped samples number */
+   if (error[bit])
+   perf_log_lost_samples(event, error[bit]);
+
+   if (counts[bit])
+   __intel_pmu_pebs_event(event, iregs, base,
+  top, bit, counts[bit]);
}
 }
 
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index bed1b6f..d47d792 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -747,6 +747,9 @@ perf_event__output_id_sample(struct perf_event *event,
 struct perf_output_handle *handle,
 struct perf_sample_data *sample);
 
+extern void
+perf_log_lost_samples(struct perf_event *event, u64 lost);
+
 static inline bool is_sampling_event(struct perf_event *event)
 {
return event->attr.sample_period != 0;
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 309211b..bab1938 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -800,6 +800,18 @@ enum perf_event_type {
 */
PERF_RECORD_ITRACE_START= 12,
 
+   /*
+* Records the dropped/lost sample number.
+*
+* struct {
+*  struct perf_event_header    header;
+*
+*  u64                         lost;
+*  struct sample_id            sample_id;
+* };
+*/
+   PERF_RECORD_LOST_SAMPLES= 13,
+
PERF_RECORD_MAX,/* non-ABI */
 };
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4d221a4..42f82c5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5927,6 +5927,39 @@ void perf_event_aux_event(struct perf_event *event, unsigned long head,
 }
 
 /*
+ * Lost/dropped samples logging
+ */
+void perf_log_lost_samples(struct perf_event *event, u64 lost)
+{
+   struct perf_output_handle handle;
+   struct perf_sample_data sample;
+   int ret;
+
+   struct {
+   struct perf_event_header    header;
+   u64 lost;
+   } lost_samples_event = {
+   .header = {
+   .type = PERF_RECORD_LOST_SAMPLES,
+   .misc = 0,
+   .size = sizeof(lost_samples_event),
+   },
+   .lost   = lost,
+   };
+
+   perf_event_header__init_id(&lost_samples_event.header, &sample, event);
+
+   ret = perf_output_begin(&handle, event,
+

[PATCH V9 8/8] perf tools: handle PERF_RECORD_LOST_SAMPLES

2015-05-10 Thread Kan Liang

From: Kan Liang 

This patch modifies the perf tool to handle the new RECORD type,
PERF_RECORD_LOST_SAMPLES.
The number of lost-sample events is stored in
.nr_events[PERF_RECORD_LOST_SAMPLES], while the exact number of samples
which the kernel dropped is stored in total_lost_samples.
When the percentage of dropped samples is greater than 5%, a warning
is printed.

Here are some examples:

Eg 1, Recording different frequently-occurring events is safe with the
  patch. Only a very low drop rate is associated with such actions.

$ perf record -e '{cycles:p,instructions:p}' -c 20003 --no-time ~/tchain
~/tchain
[perf record: Woken up 148 times to write data]
[perf record: Captured and wrote 36.922 MB perf.data (1206322 samples)]

$ perf report -D | tail
  SAMPLE events: 120243
   MMAP2 events:  5
LOST_SAMPLES events: 24
  FINISHED_ROUND events: 15
cycles:p stats:
   TOTAL events:  59348
  SAMPLE events:  59348
instructions:p stats:
   TOTAL events:  60895
  SAMPLE events:  60895

$ perf report --stdio --group
 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 24
 #
 # Samples: 120K of event 'anon group { cycles:p, instructions:p }'
 # Event count (approx.): 2404860
 #
 # Overhead  Command  Shared Object Symbol
 #   ...  
 ..
 #
99.74%  99.86%  tchain_edit  tchain_edit   [.] f3
 0.09%   0.02%  tchain_edit  tchain_edit   [.] f2
 0.04%   0.00%  tchain_edit  [kernel.vmlinux]  [k] ixgbe_read_reg

Eg 2, Recording the same thing multiple times can lead to a high drop
  rate, but it is not a useful configuration.

$ perf record -e '{cycles:p,cycles:p}' -c 20003 --no-time ~/tchain
[perf record: Woken up 1 times to write data]
Warning:
Processed 600592 samples and lost 99.73% samples!
[perf record: Captured and wrote 0.121 MB perf.data (1629 samples)]
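
For reference, the 5% check can be sketched as below. This is a
hypothetical helper, not the code from tools/perf/util/session.c. It
assumes the "Processed" figure is the captured samples plus the lost
ones, which is consistent with Eg 2: 1629 captured out of 600592
processed leaves 598963 lost, i.e. 99.73%.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: fraction of samples that were dropped, assuming
 * "processed" means captured samples plus lost samples.
 */
double lost_sample_rate(uint64_t nr_samples, uint64_t total_lost)
{
	uint64_t processed = nr_samples + total_lost;

	if (processed == 0)
		return 0.0;	/* nothing recorded, nothing lost */
	return (double)total_lost / (double)processed;
}

/* Warn only when more than 5% of the processed samples were dropped. */
bool should_warn_lost_samples(uint64_t nr_samples, uint64_t total_lost)
{
	return lost_sample_rate(nr_samples, total_lost) > 0.05;
}
```

With the Eg 1 numbers (120243 samples, 24 lost) the rate is far below
the threshold, so no warning would be printed there.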

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-report.c |  1 +
 tools/perf/util/event.c |  9 +
 tools/perf/util/event.h | 17 +
 tools/perf/util/machine.c   | 10 ++
 tools/perf/util/machine.h   |  2 ++
 tools/perf/util/session.c   | 19 +++
 tools/perf/util/tool.h  |  1 +
 7 files changed, 59 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 7c73ae5..485b7e9 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -318,6 +318,7 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist *evlist,
 {
struct perf_evsel *pos;
 
+   fprintf(stdout, "#\n# Total Lost Samples: %lu\n#\n", evlist->stats.total_lost_samples);
evlist__for_each(evlist, pos) {
struct hists *hists = evsel__hists(pos);
const char *evname = perf_evsel__name(pos);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index db52609..2daadc8 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -25,6 +25,7 @@ static const char *perf_event__names[] = {
[PERF_RECORD_SAMPLE]= "SAMPLE",
[PERF_RECORD_AUX]   = "AUX",
[PERF_RECORD_ITRACE_START]  = "ITRACE_START",
+   [PERF_RECORD_LOST_SAMPLES]  = "LOST_SAMPLES",
[PERF_RECORD_HEADER_ATTR]   = "ATTR",
[PERF_RECORD_HEADER_EVENT_TYPE] = "EVENT_TYPE",
[PERF_RECORD_HEADER_TRACING_DATA]   = "TRACING_DATA",
@@ -713,6 +714,14 @@ int perf_event__process_itrace_start(struct perf_tool *tool __maybe_unused,
return machine__process_itrace_start_event(machine, event);
 }
 
+int perf_event__process_lost_samples(struct perf_tool *tool __maybe_unused,
+union perf_event *event,
+struct perf_sample *sample,
+struct machine *machine)
+{
+   return machine__process_lost_samples_event(machine, event, sample);
+}
+
 size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp)
 {
 return fprintf(fp, " %d/%d: [%#" PRIx64 "(%#" PRIx64 ") @ %#" PRIx64 "]: %c %s\n",
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 7eecd5e..e02996a 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -52,6 +52,11 @@ struct lost_event {
u64 lost;
 };
 
+struct lost_samples_event {
+   struct perf_event_header header;
+   u64 lost;
+};
+
 /*
  * PERF_FORMAT_ENABLED | PERF_FORMAT_RUNNING | PERF_FORMAT_ID
  */
@@ -235,6 +240,12 @@ enum auxtrace_error_type {
  * total_lost tells exactly how many events the kernel in fact lost, i.e. it is
  * the sum of all struct lost_event.lost fields reported.
  *
+ * The kernel discards mixed up samples and sends the number in a
+ * PERF_RECORD_LOST_SAMPLES event. The number of

Re: [PATCH kernel v9 31/32] vfio: powerpc/spapr: Support multiple groups in one container if possible

2015-05-10 Thread Alexey Kardashevskiy


On 05/05/2015 09:50 PM, David Gibson wrote:

On Fri, May 01, 2015 at 04:05:24PM +1000, Alexey Kardashevskiy wrote:

On 05/01/2015 02:33 PM, David Gibson wrote:

On Thu, Apr 30, 2015 at 07:33:09PM +1000, Alexey Kardashevskiy wrote:

On 04/30/2015 05:22 PM, David Gibson wrote:

On Sat, Apr 25, 2015 at 10:14:55PM +1000, Alexey Kardashevskiy wrote:

At the moment only one group per container is supported.
POWER8 CPUs have a more flexible design and allow having 2 TCE tables per
IOMMU group so we can relax this limitation and support multiple groups
per container.


It's not obvious why allowing multiple TCE tables per PE has any
bearing on allowing multiple groups per container.



This patchset is a global TCE tables rework (patches 1..30, roughly) with 2
outcomes:
1. reusing the same IOMMU table for multiple groups - patch 31;
2. allowing dynamic create/remove of IOMMU tables - patch 32.

I can remove this one from the patchset and post it separately later but
since 1..30 aim to support both 1) and 2), I'd think I better keep them all
together (might explain some of changes I do in 1..30).


The combined patchset is fine.  My comment is because your commit
message says that multiple groups are possible *because* 2 TCE tables
per group are allowed, and it's not at all clear why one follows from
the other.



Ah. That's wrong indeed, I'll fix it.



This adds TCE table descriptors to a container and uses iommu_table_group_ops
to create/set DMA windows on IOMMU groups so the same TCE tables will be
shared between several IOMMU groups.

Signed-off-by: Alexey Kardashevskiy 
[aw: for the vfio related changes]
Acked-by: Alex Williamson 
---
Changes:
v7:
* updated doc
---
  Documentation/vfio.txt  |   8 +-
  drivers/vfio/vfio_iommu_spapr_tce.c | 268 ++--
  2 files changed, 199 insertions(+), 77 deletions(-)

diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt
index 94328c8..7dcf2b5 100644
--- a/Documentation/vfio.txt
+++ b/Documentation/vfio.txt
@@ -289,10 +289,12 @@ PPC64 sPAPR implementation note

  This implementation has some specifics:

-1) Only one IOMMU group per container is supported as an IOMMU group
-represents the minimal entity which isolation can be guaranteed for and
-groups are allocated statically, one per a Partitionable Endpoint (PE)
+1) On older systems (POWER7 with P5IOC2/IODA1) only one IOMMU group per
+container is supported as an IOMMU table is allocated at the boot time,
+one table per a IOMMU group which is a Partitionable Endpoint (PE)
  (PE is often a PCI domain but not always).


I thought the more fundamental problem was that different PEs tended
to use disjoint bus address ranges, so even by duplicating put_tce
across PEs you couldn't have a common address space.



Sorry, I am not following you here.

By duplicating put_tce, I can have multiple IOMMU groups on the same virtual
PHB in QEMU, "[PATCH qemu v7 04/14] spapr_pci_vfio: Enable multiple groups
per container" does this, the address ranges will be the same.


Oh, ok.  For some reason I thought that (at least on the older
machines) the different PEs used different and not easily changeable
DMA windows in bus address space.



They do use different tables (which VFIO does not get to remove/create and
uses these old helpers - iommu_take/release_ownership), correct. But all
these windows are mapped at zero on a PE's PCI bus and nothing prevents me
from updating all these tables with the same TCE values when handling
H_PUT_TCE. Yes it is slow but it works (bit more details below).
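
As an illustration only, the "update every table with the same TCE"
approach could be modelled like this. struct tce_table and put_tce_all()
are made-up names for this sketch, not the kernel's struct iommu_table
or any real hypercall handler. Because every window is assumed to start
at zero on its PE's bus, the same index refers to the same bus address
in each table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one group's TCE table. */
struct tce_table {
	uint64_t *entries;
	size_t nr_entries;
};

/*
 * Write the same TCE value at the same index in every table.
 * Returns 0 on success, or -1 (writing nothing) if the index is out of
 * range for any of the tables.
 */
int put_tce_all(struct tce_table *tables, size_t nr_tables,
		size_t index, uint64_t tce)
{
	for (size_t i = 0; i < nr_tables; i++)
		if (index >= tables[i].nr_entries)
			return -1;

	for (size_t i = 0; i < nr_tables; i++)
		tables[i].entries[index] = tce;

	return 0;
}
```

The cost grows with the number of groups, which matches the "slow but
it works" remark above.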


Um.. I'm pretty sure that contradicts what Ben was saying on the
thread.



True, it does contradict, I do not know why he said what he said :)



--
Alexey

[PATCH V9 4/8] perf, x86: large PEBS interrupt threshold

2015-05-10 Thread Kan Liang

From: Yan, Zheng 

PEBS always had the capability to log samples to its buffers without
an interrupt. Traditionally perf has not used this but always set the
PEBS threshold to one.

For frequently occurring events (like cycles or branches or load/store)
this in turn requires using a relatively high sampling period to avoid
overloading the system, by only processing PMIs. This in turn increases
sampling error.

For the common cases we still need to use the PMI because the PEBS
hardware has various limitations. The biggest one is that it can not
supply a callgraph. It also requires setting a fixed period, as the
hardware does not support adaptive period. Another issue is that it
cannot supply a time stamp and some other options. To supply a TID it
requires flushing on context switch. It can however supply the IP, the
load/store address, TSX information, registers, and some other things.

So we can make PEBS work for some specific cases, basically as long as
you can do without a callgraph and can set the period you can use this
new PEBS mode.
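
A rough stand-alone sketch of that eligibility test follows. It mirrors
the PEBS_FREERUNNING_FLAGS check added to intel_pmu_hw_config() in this
patch, but it is only a model: the SK_* constants are local copies whose
values are believed to match include/uapi/linux/perf_event.h, and
can_use_large_pebs() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of sample_type bits so the sketch builds on its own. */
enum {
	SK_SAMPLE_IP          = 1U << 0,
	SK_SAMPLE_TIME        = 1U << 2,
	SK_SAMPLE_ADDR        = 1U << 3,
	SK_SAMPLE_CALLCHAIN   = 1U << 5,
	SK_SAMPLE_ID          = 1U << 6,
	SK_SAMPLE_CPU         = 1U << 7,
	SK_SAMPLE_STREAM_ID   = 1U << 9,
	SK_SAMPLE_DATA_SRC    = 1U << 15,
	SK_SAMPLE_IDENTIFIER  = 1U << 16,
	SK_SAMPLE_TRANSACTION = 1U << 17,
};

/* Everything PEBS can record without taking a PMI (per this patch). */
#define SK_FREERUNNING_FLAGS \
	(SK_SAMPLE_IP | SK_SAMPLE_ADDR | \
	 SK_SAMPLE_ID | SK_SAMPLE_CPU | SK_SAMPLE_STREAM_ID | \
	 SK_SAMPLE_DATA_SRC | SK_SAMPLE_IDENTIFIER | \
	 SK_SAMPLE_TRANSACTION)

/*
 * Large PEBS is only possible for precise events with a fixed period
 * whose requested sample fields are all in the free-running set.
 */
bool can_use_large_pebs(bool precise_ip, bool freq, uint64_t sample_type)
{
	if (!precise_ip || freq)
		return false;
	return (sample_type & ~(uint64_t)SK_FREERUNNING_FLAGS) == 0;
}
```

Asking for a callchain or a time stamp sets a bit outside the mask, so
such an event would fall back to a threshold of one.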

The main benefit is the ability to support a much lower sampling period
(down to -c 1000) without extensive overhead.

One use case is, for example, to increase the resolution of the c2c tool.
Another is double checking when you suspect the standard sampling has
too much sampling error.

Some numbers on the overhead, using cycle soak, comparing the elapsed
time from "kernbench -M -H" between plain (threshold set to one) and
multi (large threshold).
The test command for plain:
  "perf record --time -e cycles:p -c $period -- kernbench -M -H"
The test command for multi:
  "perf record --no-time -e cycles:p -c $period -- kernbench -M -H"
(The only difference of test command between multi and plain is time
stamp options. Since time stamp is not supported by large PEBS
threshold, it can be used as a flag to indicate if large threshold is
enabled during the test.)

period    plain(Sec)  multi(Sec)  Delta
10003     32.7        16.5        16.2
20003     30.2        16.2        14.0
40003     18.6        14.1        4.5
80003     16.8        14.6        2.2
13        16.9        14.1        2.8
83        15.4        15.7        -0.3
103       15.3        15.2        0.2
203       15.3        15.1        0.1

With periods below 13, plain (threshold one) causes much more
overhead. With 10003 sampling period, the Elapsed Time for multi is
even 2X faster than plain.

Signed-off-by: Yan, Zheng 
Signed-off-by: Kan Liang 
---
 arch/x86/kernel/cpu/perf_event.h  | 11 +++
 arch/x86/kernel/cpu/perf_event_intel.c|  5 -
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 27 +++
 3 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 1cb5859..626ded3 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -75,6 +75,7 @@ struct event_constraint {
 #define PERF_X86_EVENT_DYNAMIC 0x0080 /* dynamic alloc'd constraint */
 #define PERF_X86_EVENT_RDPMC_ALLOWED   0x0100 /* grant rdpmc permission */
 #define PERF_X86_EVENT_AUTO_RELOAD 0x0200 /* use PEBS auto-reload */
+#define PERF_X86_EVENT_FREERUNNING 0x0400 /* use freerunning PEBS */
 
 
 struct amd_nb {
@@ -88,6 +89,16 @@ struct amd_nb {
 #define MAX_PEBS_EVENTS    8
 
 /*
+ * Flags PEBS can handle without an PMI.
+ *
+ */
+#define PEBS_FREERUNNING_FLAGS \
+   (PERF_SAMPLE_IP | PERF_SAMPLE_ADDR | \
+   PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID | \
+   PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER | \
+   PERF_SAMPLE_TRANSACTION)
+
+/*
  * A debug store configuration.
  *
  * We only support architectures that use 64bit fields.
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 3119071..fdf818a 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2306,8 +2306,11 @@ static int intel_pmu_hw_config(struct perf_event *event)
return ret;
 
if (event->attr.precise_ip) {
-   if (!event->attr.freq)
+   if (!event->attr.freq) {
event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
+   if (!(event->attr.sample_type & ~PEBS_FREERUNNING_FLAGS))
+   event->hw.flags |= PERF_X86_EVENT_FREERUNNING;
+   }
if (x86_pmu.pebs_aliases)
x86_pmu.pebs_aliases(event);
}
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 971c77e..0b727d1 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -250,7 +250,7 @@ static int alloc_pebs_buffer(int cpu)
 {
struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;
int node = cpu_to_node(cpu);
-   int max, thresh = 1; /* always use a

[PATCH V9 0/8] large PEBS interrupt threshold

2015-05-10 Thread Kan Liang

This patch series implements a large PEBS interrupt threshold.
Currently, the PEBS threshold is always set to one. A larger PEBS
interrupt threshold can significantly reduce the sampling overhead,
especially for frequently occurring events
(like cycles or branches or load/stores) with a small sampling period.
For example, when perf records the cycles event while running kernbench
with a 10003 sampling period, the Elapsed Time drops from 32.7 seconds
to 16.5 seconds, which is 2X faster.
For more details, please refer to patch 4's description.

Limitations:
 - It can not supply a callgraph.
 - It requires setting a fixed period.
 - It cannot supply a time stamp.
 - To supply a TID it requires flushing on context switch.
If these requirements cannot be met, the threshold is set to one.

Discard samples:
When PEBS events happen close to each other, the records for the events
could be mixed up. Demuxing the records is hard because of hardware
deficiency. As a result, we have to drop some PEBS records.
A new RECORD type, PERF_RECORD_LOST_SAMPLES, is introduced to record
the number of possible discards, and make sure the user is not left
in the dark about such discards.
For details about sample discards, please refer to patch 3's description.

changes since v1:
  - drop patch 'perf, core: Add all PMUs to pmu_idr'
  - add comments for case that multiple counters overflow simultaneously
changes since v2:
  - rename perf_sched_cb_{enable,disable} to perf_sched_cb_user_{inc,dec}
  - use flag to indicate auto reload mechanism
  - move codes that setup PEBS sample data to separate function
  - output the PEBS records in batch
  - enable this for All (PEBS capable) hardware
  - more description for the multiplex
changes since v3:
  - ignore conflicting PEBS record
changes since v4:
  - Do more tests for collision and update comments
changes since v5:
  - Move autoreload and large PEBS available check to intel_pmu_hw_config
  - make AUTO_RELOAD conditional on large PEBS
  - !PEBS bug fix
  - coherent story about what is collision and how we handle it
  - Remove extra state pebs_sched_cb_enabled
changes since v6:
  - new flag PERF_X86_EVENT_FREERUNNING to indicate large PEBS available
  - patch reorder and changelog changes for patch 1 and 3
  - An easy way to clear !PEBS bit
  - Log collision to PERF_RECORD_SAMPLES_LOST
changes since v7:
  - Introduce PERF_RECORD_LOST_SAMPLES to record the number of discards
  - Remove entire for_each_set_bit() loop
  - Minor changes on comments and changelog
changes since v8:
  - Record 'lost' events to all set bits
  - dropped the @id field from the lost samples record
  - Print lost samples event nr in perf report --stdio output

Yan, Zheng (6):
  perf, x86: use the PEBS auto reload mechanism when possible
  perf, x86: introduce setup_pebs_sample_data()
  perf, x86: handle multiple records in PEBS buffer
  perf, x86: large PEBS interrupt threshold
  perf, x86: drain PEBS buffer during context switch
  perf, x86: enlarge PEBS buffer

Kan Liang (2):
  perf, x86: introduce PERF_RECORD_LOST_SAMPLES
  perf tools: handle PERF_RECORD_LOST_SAMPLES

 arch/x86/kernel/cpu/perf_event.c   |  15 +-
 arch/x86/kernel/cpu/perf_event.h   |  16 ++
 arch/x86/kernel/cpu/perf_event_intel.c |  22 +-
 arch/x86/kernel/cpu/perf_event_intel_ds.c  | 311 +
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |   3 -
 include/linux/perf_event.h |  16 ++
 include/uapi/linux/perf_event.h|  12 ++
 kernel/events/core.c   |  39 +++-
 kernel/events/internal.h   |   9 -
 tools/perf/builtin-report.c|   1 +
 tools/perf/util/event.c|   9 +
 tools/perf/util/event.h|  17 ++
 tools/perf/util/machine.c  |  10 +
 tools/perf/util/machine.h  |   2 +
 tools/perf/util/session.c  |  19 ++
 tools/perf/util/tool.h |   1 +
 16 files changed, 392 insertions(+), 110 deletions(-)

-- 
1.8.3.1


[PATCH V9 3/8] perf, x86: handle multiple records in PEBS buffer

2015-05-10 Thread Kan Liang

From: Yan, Zheng 

When the PEBS interrupt threshold is larger than one record and the
machine supports multiple PEBS events, the records of these events are
mixed up and we need to demultiplex them.

Demuxing the records is hard because the hardware is deficient. The
hardware has two issues that, when combined, create impossible scenarios
to demux. (The deficiency has been fixed in Skylake. The two issues here
are for pre-SKL platforms.)

The first issue is that the 'status' field of the PEBS record is a copy
of the GLOBAL_STATUS MSR at PEBS assist time. To see why this is a
problem let us first describe the regular PEBS cycle:

A) the CTRn value reaches 0:
  - the corresponding bit in GLOBAL_STATUS gets set
  - we start arming the hardware assist
  < some unspecified amount of time later -- this could cover multiple
events of interest >

B) the hardware assist is armed, any next event will trigger it

C) a matching event happens:
  - the hardware assist triggers and generates a PEBS record
this includes a copy of GLOBAL_STATUS at this moment
  - if we auto-reload we (re)set CTRn
  - we clear the relevant bit in GLOBAL_STATUS

Now consider the following chain of events:

A0, B0, A1, C0

The event generated for counter 0 will include a status with counter 1
set, even though it's not at all related to the record. A similar thing
can happen with a !PEBS event if it just happens to overflow at the
right moment.

The second issue is that the hardware will only emit one record for two
or more counters if the event that triggers the assist is 'close'. The
'close' can be several cycles. In some cases even the complete assist,
if the event is something that doesn't need retirement.

For instance, consider this chain of events:
A0, B0, A1, B1, C01

Where C01 is an event that triggers both hardware assists, we will
generate but a single record, but again with both counters listed in the
status field.

This time the record pertains to both events.

Note that these two cases are different but indistinguishable with the
data as generated. Therefore demuxing records with multiple PEBS bits
(we can safely ignore status bits for !PEBS counters) is impossible.

Furthermore we cannot emit the record to both events because that might
cause a data leak -- the events might not have the same privileges -- so
what this patch does is discard such events.
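
The rule described above can be written down as a small user-space
model. This is only a sketch under stated assumptions:
pebs_record_owner() is a hypothetical name, and 'pebs_enabled' stands in
for the per-CPU mask of PEBS-enabled counters. Status bits of !PEBS
counters are masked off first; a record is attributed only if exactly
one PEBS bit remains, otherwise it has to be discarded.

```c
#include <stdint.h>

#define MAX_PEBS_EVENTS 8	/* same value the x86 perf code uses */

/*
 * Hypothetical model of the demux decision.
 * Returns the counter index that owns the record, or -1 if the record
 * cannot be attributed to a single PEBS counter and must be dropped.
 */
int pebs_record_owner(uint64_t status, uint64_t pebs_enabled)
{
	/* ignore !PEBS counters and anything above the PEBS range */
	uint64_t s = status & pebs_enabled;
	int bit = 0;

	s &= (1ULL << MAX_PEBS_EVENTS) - 1;

	/* zero bits, or more than one bit: not attributable */
	if (s == 0 || (s & (s - 1)) != 0)
		return -1;

	while (!(s & 1)) {
		s >>= 1;
		bit++;
	}
	return bit;
}
```

Both the "A0, B0, A1, C0" and the "A0, B0, A1, B1, C01" chains produce a
status with bits 0 and 1 set, so in this model both return -1, which is
exactly why they cannot be told apart.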

The assumption/hope is that such discards will be rare.

Here are some ways you may get a high discard rate:
  - when you count the same thing multiple times. But it is not a useful
configuration.
  - you can be unfortunate if you measure with a userspace only PEBS
event along with either a kernel or unrestricted PEBS event. Imagine
the event triggering and setting the overflow flag right before
entering the kernel. Then all kernel side events will end up with
multiple bits set.

Signed-off-by: Yan, Zheng 
Signed-off-by: Kan Liang 
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 144 +-
 include/linux/perf_event.h|  13 +++
 kernel/events/core.c  |   6 +-
 kernel/events/internal.h  |   9 --
 4 files changed, 118 insertions(+), 54 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index f26c8b4..971c77e 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -872,6 +872,9 @@ static void setup_pebs_sample_data(struct perf_event *event,
int fll, fst, dsrc;
int fl = event->hw.flags;
 
+   if (pebs == NULL)
+   return;
+
sample_type = event->attr.sample_type;
dsrc = sample_type & PERF_SAMPLE_DATA_SRC;
 
@@ -966,19 +969,68 @@ static void setup_pebs_sample_data(struct perf_event *event,
 data->br_stack = &cpuc->lbr_stack;
 }
 
+static inline void *
+get_next_pebs_record_by_bit(void *base, void *top, int bit)
+{
+   struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+   void *at;
+   u64 pebs_status;
+
+   if (base == NULL)
+   return NULL;
+
+   for (at = base; at < top; at += x86_pmu.pebs_record_size) {
+   struct pebs_record_nhm *p = at;
+
+   if (test_bit(bit, (unsigned long *)&p->status)) {
+
+   if (p->status == (1 << bit))
+   return at;
+
+   /* clear non-PEBS bit and re-check */
+   pebs_status = p->status & cpuc->pebs_enabled;
+   pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
+   if (pebs_status == (1 << bit))
+   return at;
+   }
+   }
+   return NULL;
+}
+
 static void __intel_pmu_pebs_event(struct perf_event *event,
-  struct pt_regs *iregs, void *__pebs)
+  struct pt_regs *iregs,
+

Re: [PATCH kernel v9 27/32] powerpc/iommu/ioda2: Add get_table_size() to calculate the size of future table

2015-05-10 Thread Alexey Kardashevskiy


On 05/05/2015 09:58 PM, David Gibson wrote:

On Fri, May 01, 2015 at 04:53:08PM +1000, Alexey Kardashevskiy wrote:

On 05/01/2015 03:12 PM, David Gibson wrote:

On Fri, May 01, 2015 at 02:10:58PM +1000, Alexey Kardashevskiy wrote:

On 04/29/2015 04:40 PM, David Gibson wrote:

On Sat, Apr 25, 2015 at 10:14:51PM +1000, Alexey Kardashevskiy wrote:

This adds a way for the IOMMU user to know how much a new table will
use so it can be accounted in the locked_vm limit before allocation
happens.

This stores the allocated table size in pnv_pci_create_table()
so the locked_vm counter can be updated correctly when a table is
being disposed.

This defines an iommu_table_group_ops callback to let VFIO know
how much memory will be locked if a table is created.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v9:
* reimplemented the whole patch
---
  arch/powerpc/include/asm/iommu.h  |  5 +
  arch/powerpc/platforms/powernv/pci-ioda.c | 14 
  arch/powerpc/platforms/powernv/pci.c  | 36 +++
  arch/powerpc/platforms/powernv/pci.h  |  2 ++
  4 files changed, 57 insertions(+)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 1472de3..9844c106 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -99,6 +99,7 @@ struct iommu_table {
unsigned long  it_size;  /* Size of iommu table in entries */
unsigned long  it_indirect_levels;
unsigned long  it_level_size;
+   unsigned long  it_allocated_size;
unsigned long  it_offset;/* Offset into global table */
unsigned long  it_base;  /* mapped address of tce table */
unsigned long  it_index; /* which iommu table this is */
@@ -155,6 +156,10 @@ extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
  struct iommu_table_group;

  struct iommu_table_group_ops {
+   unsigned long (*get_table_size)(
+   __u32 page_shift,
+   __u64 window_size,
+   __u32 levels);
long (*create_table)(struct iommu_table_group *table_group,
int num,
__u32 page_shift,
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index e0be556..7f548b4 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2062,6 +2062,18 @@ static void pnv_pci_ioda2_setup_bypass_pe(struct pnv_phb *phb,
  }

  #ifdef CONFIG_IOMMU_API
+static unsigned long pnv_pci_ioda2_get_table_size(__u32 page_shift,
+   __u64 window_size, __u32 levels)
+{
+   unsigned long ret = pnv_get_table_size(page_shift, window_size, levels);
+
+   if (!ret)
+   return ret;
+
+   /* Add size of it_userspace */
+   return ret + (window_size >> page_shift) * sizeof(unsigned long);


This doesn't make much sense.  The userspace view can't possibly be a
property of the specific low-level IOMMU model.



This it_userspace thing is all about memory preregistration.

I need some way to track how many actual mappings the
mm_iommu_table_group_mem_t has in order to decide whether to allow
unregistering or not.

When I clear TCE, I can read the old value which is host physical address
which I cannot use to find the preregistered region and adjust the mappings
counter; I can only use userspace addresses for this (not even guest
physical addresses as it is VFIO and probably no KVM).

So I have to keep userspace addresses somewhere, one per IOMMU page, and the
iommu_table seems a natural place for this.
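
To put a number on "one per IOMMU page": the extra memory is one
unsigned long per TCE entry, as in the "Add size of it_userspace" line
of the patch quoted above. A minimal sketch, with userspace_array_bytes()
being a hypothetical name rather than anything in the patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper: bytes needed to keep one userspace address per
 * IOMMU page for a DMA window of window_size bytes with pages of
 * (1 << page_shift) bytes.
 */
uint64_t userspace_array_bytes(uint32_t page_shift, uint64_t window_size)
{
	uint64_t nr_entries = window_size >> page_shift;

	return nr_entries * sizeof(unsigned long);
}
```

For a 1GB window with 4K IOMMU pages that is 262144 entries; with 64K
pages the entry count is 16 times smaller.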


Well.. sort of.  But as noted elsewhere this pulls VFIO specific
constraints into a platform code structure.  And whether you get this
table depends on the platform IOMMU type rather than on what VFIO
wants to do with it, which doesn't make sense.

What might make more sense is an opaque pointer in iommu_table for use
by the table "owner" (in the take_ownership sense).  The pointer would
be stored in iommu_table, but VFIO is responsible for populating and
managing its contents.

Or you could just put the userspace mappings in the container.
Although you might want a different data structure in that case.


Nope. I need this table in in-kernel acceleration to update the mappings
counter per mm_iommu_table_group_mem_t. In KVM's real mode handlers, I only
have IOMMU tables, not containers or groups. QEMU creates a guest view of
the table (KVM_CREATE_SPAPR_TCE) specifying a LIOBN, and then attaches TCE
tables to it via set of ioctls (one per IOMMU group) to VFIO KVM device.

So if I call it it_opaque (instead of it_userspace), I will still need a
common place (visible to VFIO and PowerKVM) for this to put:
#define IOMMU_TABLE_USERSPACE_ENTRY(tbl, entry)


I think it should be in a VFIO header.  If I'm understanding right
this part of the PowerKVM code is explicitly VFIO aware - that's kind
of the point.


Well.

Re: [PATCH v8 3/3] leds: Add ktd2692 flash LED driver

2015-05-10 Thread Ingi Kim

Hi Jacek,
Thanks for the review.

I'll improve the driver based on your comments.
I'm rebasing the source code so it can be merged, and I'll resend it soon.

Thank you

On 2015-05-08 17:33, Jacek Anaszewski wrote:
> Hi Ingi,
> 
> Thanks for the update. It looks like we're almost there.
> I can see only two minor issues.
> 
> On 05/08/2015 05:03 AM, Ingi Kim wrote:
>> This patch adds a driver to support the ktd2692 flash LEDs.
>> ktd2692 can control flash current by ExpressWire interface.
>>
>> Signed-off-by: Ingi Kim 
>> Acked-by: Seung-Woo Kim 
>> Reviewed-by: Varka Bhadram 
>> ---
>>   drivers/leds/Kconfig|   9 +
>>   drivers/leds/Makefile   |   1 +
>>   drivers/leds/leds-ktd2692.c | 443 
>> 
>>   3 files changed, 453 insertions(+)
>>   create mode 100644 drivers/leds/leds-ktd2692.c
>>
>> diff --git a/drivers/leds/Kconfig b/drivers/leds/Kconfig
>> index 51059bb..bfbdbd1 100644
>> --- a/drivers/leds/Kconfig
>> +++ b/drivers/leds/Kconfig
>> @@ -505,6 +505,15 @@ config LEDS_MENF21BMC
>> This driver can also be built as a module. If so the module
>> will be called leds-menf21bmc.
>>
>> +config LEDS_KTD2692
>> +tristate "KTD2692 LED flash support"
> 
> Please keep the style of description in line with the existing
> drivers. How about:
> 
> "LED support for KTD2692 flash LED controller"
> 

Sure, I'll change it

>> +depends on LEDS_CLASS_FLASH && GPIOLIB && OF
>> +help
>> +  This option enables support for KTD2692 LED flash connected
>> +  through ExpressWire interface.
>> +
>> +  Say Y to enable this driver.
>> +
>>   comment "LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)"
>>
>>   config LEDS_BLINKM
>> diff --git a/drivers/leds/Makefile b/drivers/leds/Makefile
>> index a739ae2..ed5ed79 100644
>> --- a/drivers/leds/Makefile
>> +++ b/drivers/leds/Makefile
>> @@ -59,6 +59,7 @@ obj-$(CONFIG_LEDS_BLINKM)+= leds-blinkm.o
>>   obj-$(CONFIG_LEDS_SYSCON)+= leds-syscon.o
>>   obj-$(CONFIG_LEDS_VERSATILE)+= leds-versatile.o
>>   obj-$(CONFIG_LEDS_MENF21BMC)+= leds-menf21bmc.o
>> +obj-$(CONFIG_LEDS_KTD2692)+= leds-ktd2692.o
>>
>>   # LED SPI Drivers
>>   obj-$(CONFIG_LEDS_DAC124S085)+= leds-dac124s085.o
>> diff --git a/drivers/leds/leds-ktd2692.c b/drivers/leds/leds-ktd2692.c
>> new file mode 100644
>> index 000..9d878a4
>> --- /dev/null
>> +++ b/drivers/leds/leds-ktd2692.c
>> @@ -0,0 +1,443 @@
>> +/*
>> + * LED driver : leds-ktd2692.c
>> + *
>> + * Copyright (C) 2015 Samsung Electronics
>> + * Ingi Kim 
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +/* Value related the movie mode */
>> +#define KTD2692_MOVIE_MODE_CURRENT_LEVELS      16
>> +#define KTD2692_MM_TO_FL_RATIO(x)              ((x) / 3)
>> +#define KTD2962_MM_MIN_CURR_THRESHOLD_SCALE    8
>> +
>> +/* Value related the flash mode */
>> +#define KTD2692_FLASH_MODE_TIMEOUT_LEVELS      8
>> +#define KTD2692_FLASH_MODE_TIMEOUT_DISABLE     0
>> +#define KTD2692_FLASH_MODE_CURR_PERCENT(x)     (((x) * 16) / 100)
>> +
>> +/* Macro for getting offset of flash timeout */
>> +#define GET_TIMEOUT_OFFSET(timeout, step)      ((timeout) / (step))
>> +
>> +/* Base register address */
>> +#define KTD2692_REG_LVP_BASE                   0x00
>> +#define KTD2692_REG_FLASH_TIMEOUT_BASE         0x20
>> +#define KTD2692_REG_MM_MIN_CURR_THRESHOLD_BASE 0x40
>> +#define KTD2692_REG_MOVIE_CURRENT_BASE         0x60
>> +#define KTD2692_REG_FLASH_CURRENT_BASE         0x80
>> +#define KTD2692_REG_MODE_BASE                  0xA0
>> +
>> +/* Set bit coding time for expresswire interface */
>> +#define KTD2692_TIME_RESET_US                  700
>> +#define KTD2692_TIME_DATA_START_TIME_US        10
>> +#define KTD2692_TIME_HIGH_END_OF_DATA_US       350
>> +#define KTD2692_TIME_LOW_END_OF_DATA_US        10
>> +#define KTD2692_TIME_SHORT_BITSET_US           4
>> +#define KTD2692_TIME_LONG_BITSET_US            12
>> +
>> +/* KTD2692 default length of name */
>> +#define KTD2692_NAME_LENGTH                    20
>> +
>> +enum ktd2692_bitset {
>> +KTD2692_LOW = 0,
>> +KTD2692_HIGH,
>> +};
>> +
>> +/* Movie / Flash Mode Control */
>> +enum ktd2692_led_mode {
>> +KTD2692_MODE_DISABLE = 0,/* default */
>> +KTD2692_MODE_MOVIE,
>> +KTD2692_MODE_FLASH,
>> +};
>> +
>> +struct ktd2692_led_config_data {
>> +/* maximum LED current in movie mode */
>> +u32 movie_max_microamp;
>> +/* maximum LED current in flash mode */
>> +u32 flash_max_microamp;
>> +/* maximum flash timeout */
>> +u32 flash_max_timeout;
>> +/* max LED brightness level */
>> +enum led_brightness max_brightness;
>> +};
>> +
>> +struct ktd2692_context

Re: [PATCH v2 2/3] perf probe: Add --range option to show variable location range

2015-05-10 Thread He Kuang


Hi, Masami

On 2015/5/10 11:21, Masami Hiramatsu wrote:

On 2015/05/09 18:55, He Kuang wrote:

It is not easy for users to get the accurate byte offset or the line
number where a local variable can be probed. With the '--range' option,
local variables in scope of the probe point are shown with their byte
offset ranges, and can be added according to this range information.

For example, there are some variables in function
generic_perform_write():

   
   0  ssize_t generic_perform_write(struct file *file,
   1 struct iov_iter *i, loff_t pos)
   2  {
   3  struct address_space *mapping = file->f_mapping;
   4  const struct address_space_operations *a_ops = mapping->a_ops;
   ...
   42 status = a_ops->write_begin(file, mapping, pos, bytes, 
flags,
, );
   44 if (unlikely(status < 0))

But probing the variable 'a_ops' at line 42 or 44 fails.

   $ perf probe --add 'generic_perform_write:42 a_ops'
   Failed to find the location of a_ops at this address.
 Perhaps, it has been optimized out.

This is because the source code does not match the assembly, so a variable
may not be available at the source line where it appears. After this patch,
we can look up the accurate byte offset range of a variable. 'INV'
indicates that the variable is not valid at the given point, but is
available in scope:

   $ perf probe --vars 'generic_perform_write:42' --range
   Available variables at generic_perform_write:42
 @
 [INV]   ssize_t                             written  @
 [INV]   struct address_space_operations*    a_ops    @
 [VAL]   (unknown_type)                      fsdata   @
 [VAL]   loff_t                              pos      @
 [VAL]   long int                            status   @
 [VAL]   long unsigned int                   bytes    @
 [VAL]   struct address_space*               mapping  @
 [VAL]   struct iov_iter*                    i        @
 [VAL]   struct page*                        page     @


Thanks, this looks easier to understand :)

[...]

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index dcca551..30a1a1b 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -43,6 +43,9 @@
  /* Kprobe tracer basic type is up to u64 */
  #define MAX_BASIC_TYPE_BITS   64
  
+/* Variable location invalid at addr but valid in scope */

+#define VARIABLE_LOCATION_INVALID_AT_ADDR  -1

Hmm, could you use -ERANGE instead of this?
Other part is OK for me.

Thank you!


I've checked libdw, it never returns -ERANGE, but there is an
errno conflict in the function convert_variable_location itself:

	regs = get_arch_regstr(regn);
	if (!regs) {
		/* This should be a bug in DWARF or this tool */
		pr_warning("Mapping for the register number %u "
			   "missing on this architecture.\n", regn);
		return -ERANGE;
	}

So shall we change the above errno to -ENOENT, or choose another
errno for the current 'VARIABLE_LOCATION_INVALID_AT_ADDR'? What's
your opinion?

Thanks!





+
  /* Dwarf FL wrappers */
  static char *debuginfo_path;  /* Currently dummy */
  
@@ -177,7 +180,7 @@ static int convert_variable_location(Dwarf_Die *vr_die, Dwarf_Addr addr,

Dwarf_Word offs = 0;
bool ref = false;
const char *regs;
-   int ret;
+   int ret, ret2 = 0;
  
  	if (dwarf_attr(vr_die, DW_AT_external, ) != NULL)

goto static_var;
@@ -187,9 +190,19 @@ static int convert_variable_location(Dwarf_Die *vr_die, 
Dwarf_Addr addr,
return -EINVAL; /* Broken DIE ? */
if (dwarf_getlocation_addr(, addr, , , 1) <= 0) {
ret = dwarf_entrypc(sp_die, );
-   if (ret || addr != tmp ||
-   dwarf_tag(vr_die) != DW_TAG_formal_parameter ||
-   dwarf_highpc(sp_die, ))
+   if (ret)
+   return -ENOENT;
+
+   if (probe_conf.show_location_range &&
+   (dwarf_tag(vr_die) == DW_TAG_variable)) {
+   ret2 = VARIABLE_LOCATION_INVALID_AT_ADDR;
+   } else if (addr != tmp ||
+   dwarf_tag(vr_die) != DW_TAG_formal_parameter) {
+   return -ENOENT;
+   }
+
+   ret = dwarf_highpc(sp_die, );
+   if (ret)
return -ENOENT;
/*
 * This is fuzzed by fentry mcount. We try to find the
@@ -210,7 +223,7 @@ found:
if (op->atom == DW_OP_addr) {
  static_var:
if (!tvar)
-   return 0;
+   return ret2;
/* Static variables on memory (not stack), make @varname */
ret = strlen(dwarf_diename(vr_die));
tvar->value = zalloc(ret + 2);
@@

Re: [PATCH 2/4] perf tools: Add functions which can get or set perf config variables.

2015-05-10 Thread taeung


Hi, Jiri Olsa

Thanks for your feedback on my patches.

There is one thing I don't understand very well.
I marked it with "1)" in the middle of your feedback below.

I don't follow it; could you elaborate a little more?


On 05/05/2015 04:16 AM, Jiri Olsa wrote:

On Mon, Apr 27, 2015 at 03:34:24PM +0900, Taeung Song wrote:

This patch consists of functions
which can get, set specific config variables.
For the syntax examples,

perf config [options] [section.subkey[=value] ...]

display key-value pairs of specific config variables
# perf config report.queue-size report.children

[jolsa@krava perf]$ ./perf config krava.krava
krava.krava=true

?

some comments below


set specific config variables
# perf config report.queue-size=100M report.children=true

Signed-off-by: Taeung Song 
---
  tools/perf/Documentation/perf-config.txt |   2 +
  tools/perf/builtin-config.c  | 276 ++-
  tools/perf/util/cache.h  |  17 ++
  tools/perf/util/config.c |  30 +++-
  4 files changed, 320 insertions(+), 5 deletions(-)

SNIP


+static int set_spec_config(const char *section_name, const char *subkey,
+  const char *value)
  {
int ret = 0;
+   ret += set_config(section_name, subkey, value);
+   ret += perf_configset_write_in_full();
+
+   return ret;
+}
+
+static void parse_key(const char *var, const char **section_name, const char **subkey)
+{
+   char *key = strdup(var);
+
+   if (!key)
+   die("%s: strdup failed\n", __func__);
+
+   *section_name = strsep(, ".");
+   *subkey = strsep(, ".");

should this check the config syntax? could be used for command line check as well

+}
+
+static int collect_config(const char *var, const char *value,
+ void *cb __maybe_unused)
+{
+   struct config_section *section_node;
+   const char *section_name, *subkey;

SNIP


+   }
+   for (i = 0; key[i]; i++) {
+   if (i == 0 && !isalpha(key[i++]))
+   goto out_err;
+
+   switch (key[i]) {
+   case '.':
+   num_dot += 1;
+   if (!isalpha(key[++i]))
+   goto out_err;
+   break;
+   case '=':
+   num_equals += 1;
+   break;
+   default:
+   if (!isalpha(key[i]) && !isalnum(key[i]))
+   goto out_err;

you don't allow '-' in the key report.queue-size; I think we should also support '_'

also please put the name checks into a separate function
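
A separate name check along the lines suggested above might look like the
following. This is only a sketch: config_name_is_valid() is a hypothetical
helper, not a function from the patch, and the rule it applies (leading
letter, then letters, digits, '-' or '_') is an assumption drawn from the
review comments.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: returns true when `name`
 * (a section or a key, without the '.' or '=') starts with a letter and
 * otherwise contains only letters, digits, '-' or '_'.
 */
bool config_name_is_valid(const char *name)
{
	size_t i;

	assert(name != NULL);

	/* An empty name or a non-letter first character is rejected. */
	if (!isalpha((unsigned char)name[0]))
		return false;

	for (i = 1; name[i]; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '-' && c != '_')
			return false;
	}
	return true;
}
```

With such a helper the same check could serve both the config file parser
and the command line, since it only looks at one already-split name.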


+   }
+   }
+
+   if (num_equals > 1 || num_dot > 1)
+   goto out_err;
+
+   given_value = strchr(key, '=');
+   if (given_value == NULL || given_value == key)
+   given_value = NULL;

SNIP


argc = parse_options(argc, argv, config_options, config_usage,
 PARSE_OPT_STOP_AT_NON_OPTION);
+   if (origin_argc > argc)
+   is_option = true;
+   else
+   is_option = false;
+
+   if (!is_option && argc >= 0) {
+   switch (argc) {
+   case 0:
+   break;
+   default:
+   for (i = 0; argv[i]; i++) {
+   value = strrchr(argv[i], '=');
+   if (value == NULL || value == argv[i])

hum, so you let args like '=krava' through?

why don't you completely check the name (assignment string) first
and decide about the callback later


1) I understand that the name must be completely checked first.
   But I don't understand the part about the callback. What does it mean?
   Could you elaborate a little more?
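
One possible reading of "check first, decide about the callback later" is
sketched below. Everything in it is an assumption for illustration: the
config_cb type, parse_assignment(), handle_arg() and the two stub callbacks
are hypothetical names, not code from the patch, and the accepted shape
("section.key" or "section.key=value") is taken from the syntax line in the
commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: one handler shows a variable, another sets it. */
typedef int (*config_cb)(const char *var, const char *value);

/*
 * Hypothetical helper: accept "section.key" or "section.key=value" only,
 * with exactly one '.' before any '=' and non-empty section, key and value.
 * On success *value points at the text after '=', or is NULL without '='.
 */
bool parse_assignment(const char *arg, const char **value)
{
	const char *eq, *dot;
	size_t var_len;

	assert(arg != NULL && value != NULL);

	eq = strchr(arg, '=');
	var_len = eq ? (size_t)(eq - arg) : strlen(arg);
	dot = memchr(arg, '.', var_len);

	if (!dot || dot == arg || dot + 1 == arg + var_len)
		return false;	/* missing or empty section / key */
	if (memchr(dot + 1, '.', var_len - (size_t)(dot + 1 - arg)))
		return false;	/* more than one '.' in the name */
	if (eq && eq[1] == '\0')
		return false;	/* nothing after '=' */

	*value = eq ? eq + 1 : NULL;
	return true;
}

/* Validate the whole argument first; only then pick the callback. */
int handle_arg(const char *arg, config_cb show, config_cb set)
{
	const char *value;

	if (!parse_assignment(arg, &value))
		return -1;
	return value ? set(arg, value) : show(arg, NULL);
}

/* Stand-in callbacks for illustration; real ones would show or set values. */
int show_stub(const char *var, const char *value)
{
	(void)var;
	(void)value;
	return 1;
}

int set_stub(const char *var, const char *value)
{
	(void)var;
	(void)value;
	return 2;
}
```

In this arrangement an argument such as '=krava' is rejected before either
callback is chosen, which is the case the review comment points at.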

Thanks,
Taeung


+   ret = perf_configset_with_option(show_spec_config, argv[i]);
+   else
+   ret = perf_configset_with_option(set_spec_config, argv[i]);
+   if (ret < 0)
+   break;
+   }
+   goto out;
+   }
+   }

SNIP


@@ -502,6 +501,31 @@ out:
return ret;
  }
  
+int perf_configset_write_in_full(void)

+{
+   struct config_section *section_node;
+   struct config_element *element_node;
+   const char *first_line = "# this file is auto-generated.";

so you parse the whole config, change it and write it back..
hum, I don't see a better way.. and I like the first line ;-)


+   FILE *fp = fopen(config_file_name, "w");
+
+   if (!fp)
+   return -1;
+
+   fprintf(fp, "%s\n", first_line);
+   /* overwrite configvariables */
+   list_for_each_entry(section_node, sections, list) {
+   fprintf(fp, "[%s]\n", section_node->name);
+

[PATCH 2/2] Define syscall number for fchmodat4

2015-05-10 Thread William Orr

---
 arch/x86/syscalls/syscall_32.tbl  | 1 +
 arch/x86/syscalls/syscall_64.tbl  | 1 +
 include/uapi/asm-generic/unistd.h | 4 +++-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
index ef8187f..cc8ada8 100644
--- a/arch/x86/syscalls/syscall_32.tbl
+++ b/arch/x86/syscalls/syscall_32.tbl
@@ -365,3 +365,4 @@
 356	i386	memfd_create		sys_memfd_create
 357	i386	bpf			sys_bpf
 358	i386	execveat		sys_execveat			stub32_execveat
+359	i386	fchmodat4		sys_fchmodat4
diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
index 9ef32d5..bbf8c6b 100644
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -329,6 +329,7 @@
 320	common	kexec_file_load		sys_kexec_file_load
 321	common	bpf			sys_bpf
 322	64	execveat		stub_execveat
+323	common	fchmodat4		sys_fchmodat4
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index e016bd9..6e362a2 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -709,9 +709,11 @@ __SYSCALL(__NR_memfd_create, sys_memfd_create)
 __SYSCALL(__NR_bpf, sys_bpf)
 #define __NR_execveat 281
 __SC_COMP(__NR_execveat, sys_execveat, compat_sys_execveat)
+#define __NR_fchmodat4 282
+__SYSCALL(__NR_fchmodat4, sys_fchmodat4);
 
 #undef __NR_syscalls
-#define __NR_syscalls 282
+#define __NR_syscalls 283
 
 /*
  * All syscalls below here should go away really,
-- 
2.1.0


[PATCH 1/2] Implement fchmodat4 syscall

2015-05-10 Thread William Orr

Adds fchmodat4, which more closely matches POSIX by taking 4 arguments,
including the flags argument. The flags are the same as for fchownat(2),
implementing both AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH.

Based heavily on Andrew Ayer's patch from 2012.
---
 fs/open.c| 19 +--
 include/linux/syscalls.h |  2 ++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 98e5a52..00dd0f7 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -504,6 +504,9 @@ static int chmod_common(struct path *path, umode_t mode)
struct iattr newattrs;
int error;
 
+   if (S_ISLNK(inode->i_mode))
+   return -EOPNOTSUPP;
+
error = mnt_want_write(path->mnt);
if (error)
return error;
@@ -541,9 +544,21 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
 
 SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename, umode_t, mode)
 {
+   return sys_fchmodat4(dfd, filename, mode, 0);
+}
+
+SYSCALL_DEFINE4(fchmodat4, int, dfd, const char __user *, filename, umode_t, mode, int, flags)
+{
struct path path;
int error;
-   unsigned int lookup_flags = LOOKUP_FOLLOW;
+   unsigned int lookup_flags;
+
+   if (flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH))
+   return -EINVAL;
+
+   lookup_flags = (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW;
+   if (flags & AT_EMPTY_PATH)
+   lookup_flags |= LOOKUP_EMPTY;
 retry:
error = user_path_at(dfd, filename, lookup_flags, );
if (!error) {
@@ -559,7 +574,7 @@ retry:
 
 SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
 {
-   return sys_fchmodat(AT_FDCWD, filename, mode);
+   return sys_fchmodat4(AT_FDCWD, filename, mode, 0);
 }
 
 static int chown_common(struct path *path, uid_t user, gid_t group)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 76d1e38..d6e9602 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -769,6 +769,8 @@ asmlinkage long sys_futimesat(int dfd, const char __user *filename,
 asmlinkage long sys_faccessat(int dfd, const char __user *filename, int mode);
 asmlinkage long sys_fchmodat(int dfd, const char __user * filename,
 umode_t mode);
+asmlinkage long sys_fchmodat4(int dfd, const char __user * filename,
+umode_t mode, int flags);
 asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
 gid_t group, int flag);
 asmlinkage long sys_openat(int dfd, const char __user *filename, int flags,
-- 
2.1.0


[PATCH] Implement fchmodat4 system call

2015-05-10 Thread William Orr

Hey,

Currently, Linux's fchmodat(2) doesn't honor the flags argument. To bring it
more in line with POSIX and other implementations, this patch adds fchmodat4,
which honors the flags argument, and implements the same flags as fchownat(2).

This makes it possible to chmod a file without following symlinks, without
having to call open(2) on a file with O_NOFOLLOW. This is heavily based on
Andrew Ayer's work in 2012, and is sent with his permission. Let me know
if this can be applied.

Please CC me, since I'm not subscribed to this list.

Thanks,
William Orr


Re: [PATCH kernel v9 26/32] powerpc/iommu: Add userspace view of TCE table

2015-05-10 Thread Alexey Kardashevskiy


On 05/05/2015 10:02 PM, David Gibson wrote:

On Fri, May 01, 2015 at 05:12:45PM +1000, Alexey Kardashevskiy wrote:

On 05/01/2015 02:23 PM, David Gibson wrote:

On Fri, May 01, 2015 at 02:01:17PM +1000, Alexey Kardashevskiy wrote:

On 04/29/2015 04:31 PM, David Gibson wrote:

On Sat, Apr 25, 2015 at 10:14:50PM +1000, Alexey Kardashevskiy wrote:

In order to support memory pre-registration, we need a way to track
the use of every registered memory region and only allow unregistration
if a region is not in use anymore. So we need a way to tell which
region the just-cleared TCE came from.

This adds a userspace view of the TCE table to the iommu_table struct.
It contains userspace addresses, one per TCE entry. The table is only
allocated when ownership of an IOMMU group is taken, which means
it is only used from outside of the powernv code (such as VFIO).

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v9:
* fixed code flow in error cases added in v8

v8:
* added ENOMEM on failed vzalloc()
---
  arch/powerpc/include/asm/iommu.h  |  6 ++
  arch/powerpc/kernel/iommu.c   | 18 ++
  arch/powerpc/platforms/powernv/pci-ioda.c | 22 --
  3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 7694546..1472de3 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -111,9 +111,15 @@ struct iommu_table {
unsigned long *it_map;   /* A simple allocation bitmap for now */
unsigned long  it_page_shift;/* table iommu page size */
struct iommu_table_group *it_table_group;
+   unsigned long *it_userspace; /* userspace view of the table */


A single unsigned long doesn't seem like enough.


Why single? This is an array.


As in single per page.



Sorry, I am not following you here.
It is per IOMMU page. MAP/UNMAP work with IOMMU pages, which are fully backed
by either a system page or a huge page.





How do you know
which process's address space this address refers to?


It is the current task. Multiple userspaces cannot use the same container/tables.


Where is that enforced?



It is accessed from VFIO DMA map/unmap, which are ioctl()s on a container's
fd, which is per-process.


Usually, but what enforces that?  If you open a container fd, then
fork(), and attempt to map from both parent and child, what happens?



vfio_group_fops::open() checks if the group is already opened, and I want
to believe open() is called from fork() for the new fd, so no mapping can
happen later.



--
Alexey

[PATCH] ipconfig : put root_server_path into __initdata section

2015-05-10 Thread tyeon


root_server_path is used only by __init functions, so there is no need
to keep this variable in memory after kernel init.

Signed-off-by: Tom(JeHyeon) Yeon 
---
 net/ipv4/ipconfig.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
index b26376e..0f1b159 100644
--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -141,7 +141,7 @@ __be32 ic_addrservaddr = NONE;	/* IP Address of the IP addresses'server */

 __be32 ic_servaddr = NONE; /* Boot server IP address */

 __be32 root_server_addr = NONE;/* Address of NFS server */
-u8 root_server_path[256] = { 0, }; /* Path to mount as root */
+u8 root_server_path[256] __initdata = { 0, };  /* Path to mount as root */

 /* vendor class identifier */
 static char vendor_class_identifier[253] __initdata;
--
1.7.9.5

Re: [PATCH] Hotplug: fix the bug that the system is down,when memory is not in node0 and cpu is logically hotadded.

2015-05-10 Thread Gu Zheng

Hi TJ, Song,

Sorry for the late reply.

On 05/08/2015 11:23 PM, Tejun Heo wrote:

> Cc'ing Lai, Gu and Kamezawa as they've been working in the area for a
> while now.  Gu, is this related to what you've been working on?


Yes, they are the same. And we are still working on it, please refer to the
following for detail:
https://lkml.org/lkml/2015/4/24/143
https://lkml.org/lkml/2015/2/27/145
https://lkml.org/lkml/2015/3/25/989

Regards,
Gu

> 
> Thanks.
> 
> On Fri, May 08, 2015 at 07:16:40PM +0800, Song Xiumiao wrote:
>> From: songxiumiao 
>>
>> By analysing the function call trace of the bug, we find that the
>> create_worker function allocates memory from node0. Because node0 is
>> offline, the allocation fails. So we add a condition to ensure the node
>> is online, so that the system allocates memory from an online node.
>>
>> The bug information follows:
>> [root@localhost ~]# echo 1 > /sys/devices/system/cpu/cpu90/online
>> [  225.611209] smpboot: Booting Node 2 Processor 90 APIC 0x40
>> [18446744029.482996] kvm: enabling virtualization on CPU90
>> [  225.725503] TSC synchronization [CPU#43 -> CPU#90]:
>> [  225.730952] Measured 672516581900 cycles TSC warp between CPUs, turning 
>> off TSC clock.
>> [  225.739800] tsc: Marking TSC unstable due to check_tsc_sync_source failed
>> [  225.755126] BUG: unable to handle kernel paging request at 
>> 1b08
>> [  225.762931] IP: [] __alloc_pages_nodemask+0xb7/0x940
>> [  225.770247] PGD 449bb0067 PUD 46110e067 PMD 0
>> [  225.775248] Oops:  [#1] SMP
>> [  225.778875] Modules linked in: xt_CHECKSUM ip6t_rpfilter ip6t_REJECT 
>> nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 
>> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntracd
>> [  225.868198] CPU: 43 PID: 5400 Comm: bash Not tainted 
>> 4.0.0-rc4-bug-fixed-remove #16
>> [  225.876754] Hardware name: Insyde Brickland/Type2 - Board Product Name1, 
>> BIOS Brickland.05.04.15.0024 02/28/2015
>> [  225.888122] task: 88045a3d8da0 ti: 88044612 task.ti: 
>> 88044612
>> [  225.896484] RIP: 0010:[]  [] 
>> __alloc_pages_nodemask+0xb7/0x940
>> [  225.906509] RSP: 0018:880446123918  EFLAGS: 00010246
>> [  225.912443] RAX: 1b00 RBX: 0010 RCX: 
>> 
>> [  225.920416] RDX:  RSI:  RDI: 
>> 002052d0
>> [  225.928388] RBP: 880446123a08 R08: 880460eca0c0 R09: 
>> 60eca101
>> [  225.936361] R10: 88046d007300 R11: 8108dd31 R12: 
>> 0001002a
>> [  225.944334] R13: 002052d0 R14: 0001 R15: 
>> 40d0
>> [  225.952306] FS:  7f9386450740() GS:88046db6() 
>> knlGS:
>> [  225.961346] CS:  0010 DS:  ES:  CR0: 80050033
>> [  225.967765] CR2: 1b08 CR3: 0004612a3000 CR4: 
>> 001407e0
>> [  225.975735] Stack:
>> [  225.977981]  002052d0  0003 
>> 88045a3d8da0
>> [  225.986291]  880446123988 811c7f81 88045a3d8da0 
>> 
>> [  225.994597]  80d2 88046d005500 0003000f 
>> 002052d0002052d0
>> [  226.002904] Call Trace:
>> [  226.005645]  [] ? alloc_pages_current+0x91/0x100
>> [  226.012557]  [] ? deactivate_slab+0x383/0x400
>> [  226.019173]  [] new_slab+0xa7/0x460
>> [  226.024826]  [] __slab_alloc+0x310/0x470
>> [  226.030960]  [] ? get_from_free_list+0x46/0x60
>> [  226.037679]  [] ? alloc_worker+0x21/0x50
>> [  226.043812]  [] kmem_cache_alloc_node_trace+0x91/0x250
>> [  226.051299]  [] alloc_worker+0x21/0x50
>> [  226.057236]  [] create_worker+0x53/0x1e0
>> [  226.063357]  [] alloc_unbound_pwq+0x2a2/0x510
>> [  226.069974]  [] wq_update_unbound_numa+0x1b4/0x220
>> [  226.077076]  [] workqueue_cpu_up_callback+0x308/0x3d0
>> [  226.084468]  [] notifier_call_chain+0x4e/0x80
>> [  226.091084]  [] __raw_notifier_call_chain+0xe/0x10
>> [  226.098189]  [] cpu_notify+0x23/0x50
>> [  226.103929]  [] _cpu_up+0x188/0x1a0
>> [  226.109574]  [] cpu_up+0x89/0xb0
>> [  226.114923]  [] cpu_subsys_online+0x40/0x90
>> [  226.121350]  [] device_online+0x6d/0xa0
>> [  226.127382]  [] online_store+0x95/0xa0
>> [  226.133322]  [] dev_attr_store+0x18/0x30
>> [  226.139457]  [] sysfs_kf_write+0x3d/0x50
>> [  226.145586]  [] kernfs_fop_write+0x12a/0x180
>> [  226.152109]  [] vfs_write+0xb7/0x1f0
>> [  226.157853]  [] ? do_audit_syscall_entry+0x6c/0x70
>> [  226.164954]  [] SyS_write+0x55/0xd0
>> [  226.170595]  [] system_call_fastpath+0x12/0x17
>> [  226.177306] Code: 30 97 00 89 45 bc 83 e1 0f b8 22 01 32 01 01 c9 d3 f8 
>> 83 e0 03 89 9d 6c ff ff ff 83 e3 10 89 45 c0 0f 85 6d 01 00 00 48 8b 45 88 
>> <48> 83 78 08 00 0f 84 51 01 00 00 b8 01
>> [  226.199175] RIP  [] __alloc_pages_nodemask+0xb7/0x940
>> [  226.206576]  RSP 
>> [  226.210471] CR2: 1b08
>> [  226.227939] ---[ end trace 30d753e1e1124696 ]---
>> [  226.412591] Kernel panic - not syncing: Fatal exception
>> [  226.430948]

[PATCH 4/4] tty/slaves: add a driver to power on/off UART attached devices.

2015-05-10 Thread NeilBrown

If a platform has a particular device permanently attached to a UART,
there may be out-of-band signaling necessary to power the device
on and off.

This driver controls that signaling for a number of different devices.
It can
 - enable/disable a regulator
 - toggle a GPIO
 - register an 'rfkill' which can force the device to be off.

When the rfkill is absent or unblocked, the device will be on when the
associated tty device is open, and off otherwise.
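
The rule in the previous paragraph can be written down as a small predicate.
This is a minimal sketch, assuming a simplified state with three booleans;
struct sketch_power_state and sketch_should_be_on() are hypothetical names
and do not appear in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the state the driver has to consider. */
struct sketch_power_state {
	bool tty_open;		/* the associated tty device is open */
	bool has_rfkill;	/* an rfkill was registered for the device */
	bool rfkill_blocked;	/* that rfkill is currently blocking */
};

/*
 * The device should be powered only while the tty is open, and a blocking
 * rfkill forces it off regardless of the tty state.
 */
bool sketch_should_be_on(const struct sketch_power_state *s)
{
	assert(s != NULL);

	if (s->has_rfkill && s->rfkill_blocked)
		return false;
	return s->tty_open;
}
```

Keeping the decision in one place like this means the regulator, GPIO and
rfkill paths would all re-evaluate the same condition whenever one of the
inputs changes.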

Signed-off-by: NeilBrown 
---
 .../bindings/uart_slave/wi2wi,w2cbw003.txt |   19 +
 .../bindings/uart_slave/wi2wi,w2sg0004.txt |   37 +
 .../devicetree/bindings/vendor-prefixes.txt|1 
 drivers/tty/serial/slave/Kconfig   |   15 +
 drivers/tty/serial/slave/Makefile  |1 
 drivers/tty/serial/slave/serial-power-manager.c|  510 
 6 files changed, 583 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/uart_slave/wi2wi,w2cbw003.txt
 create mode 100644 Documentation/devicetree/bindings/uart_slave/wi2wi,w2sg0004.txt
 create mode 100644 drivers/tty/serial/slave/serial-power-manager.c

diff --git a/Documentation/devicetree/bindings/uart_slave/wi2wi,w2cbw003.txt b/Documentation/devicetree/bindings/uart_slave/wi2wi,w2cbw003.txt
new file mode 100644
index ..cfe6ee5e01e9
--- /dev/null
+++ b/Documentation/devicetree/bindings/uart_slave/wi2wi,w2cbw003.txt
@@ -0,0 +1,19 @@
+wi2wi bluetooth module
+
+This is accessed via a serial port and is largely controlled via that
+link.  Extra configuration is needed to enable power on/off
+
+Required properties:
+- compatible: "wi2wi,w2cbw003"
+- vdd-supply: regulator used to power the device.
+
+The node for this device must be the child of a UART.
+
+Example:
+
+ {
+   bluetooth {
+   compatible = "wi2wi,w2cbw003";
+   vdd-supply = <>;
+   };
+};
diff --git a/Documentation/devicetree/bindings/uart_slave/wi2wi,w2sg0004.txt b/Documentation/devicetree/bindings/uart_slave/wi2wi,w2sg0004.txt
new file mode 100644
index ..6bcf3006c1a4
--- /dev/null
+++ b/Documentation/devicetree/bindings/uart_slave/wi2wi,w2sg0004.txt
@@ -0,0 +1,37 @@
+wi2wi GPS device
+
+This is accessed via a serial port and is largely controlled via that
+link.  Extra configuration is needed to enable power on/off
+
+Required properties:
+- compatible: "wi2wi,w2sg0004"
+- gpios: gpio used to toggle 'on/off' pin
+- interrupts: interrupt generated by RX pin when device
+  should be off
+
+Optional properties:
+- vdd-supply: regulator used to power antenna
+- pinctrl: "default", "off"
+  if "off" setting is provided it is imposed when device should
+  be off.  This can route the RX pin to a GPIO interrupt.
+
+The w2sg0004 uses a pin-toggle both to power-on and to
+power-off, so the driver needs to detect what state it is in.
+It does this by detecting characters on the RX line.
+When it should be off, these can optionally be detected by a GPIO.
+
+The node for this device must be the child of a UART.
+
+Example:
+ {
+   gps {
+   compatible = "wi2wi,w2sg0004";
+   vdd-supply = <>;
+   gpios = < 17 0>; /* GPIO_145 */
+   interrupts-extended = < 19 0>; /* GPIO_147 */
+   /* When off, switch RX to be an interrupt */
+   pinctrl-names = "default", "off";
+   pinctrl-0 = <_pins>;
+   pinctrl-1 = <_pins_rx_gpio>;
+   };
+};
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 80339192c93e..45c703a3b29d 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -202,6 +202,7 @@ variscite   Variscite Ltd.
 viaVIA Technologies, Inc.
 virtio Virtual I/O Device Specification, developed by the OASIS consortium
 voipac Voipac Technologies s.r.o.
+wi2wi  wi2wi Inc.  http://www.wi2wi.com/
 winbond Winbond Electronics corp.
 wlfWolfson Microelectronics
 wm Wondermedia Technologies, Inc.
diff --git a/drivers/tty/serial/slave/Kconfig b/drivers/tty/serial/slave/Kconfig
index 6620e78b763e..46285d8895b7 100644
--- a/drivers/tty/serial/slave/Kconfig
+++ b/drivers/tty/serial/slave/Kconfig
@@ -4,3 +4,18 @@ menuconfig UART_SLAVE
help
 Devices which attach via a uart, but need extra
 driver support for power management etc.
+
+if UART_SLAVE
+
+config SERIAL_POWER_MANAGER
+   tristate "Power Management controller for serial-attached devices"
+   default n
+   help
+Some devices permanently attached via a UART can benefit from
+being power-managed when the tty device is opened or closed.
+This driver can support several such devices with simple
+power requirements such as enabling a regulator.
+
+If in doubt, say 'N'
+
+endif
diff --git a/drivers/tty/serial/slave/Makefile

[PATCH 2/4] TTY: split tty_register_device_attr into 'initialize' and 'add' parts.

2015-05-10 Thread NeilBrown

This provides separate
  tty_device_initialize_attr()
and
  tty_device_add_attr()

similar to device_initialize() and device_add().

Similarly tty_port_initialize_device_attr() is added to match
tty_port_register_device_attr().

It will allow a client to separate initialization and adding of the
device.

Signed-off-by: NeilBrown 
---
 drivers/tty/tty_io.c   |  102 +---
 drivers/tty/tty_port.c |   24 +++
 include/linux/tty.h|9 
 3 files changed, 104 insertions(+), 31 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index ccb99b772965..83ca25b9c2da 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -3205,8 +3205,29 @@ static void tty_device_create_release(struct device *dev)
kfree(dev);
 }
 
+int tty_device_add(struct tty_driver *driver, struct device *dev)
+{
+   int retval;
+   bool cdev = false;
+   int index = dev->devt - MKDEV(driver->major,
+ driver->minor_start);
+
+   if (!(driver->flags & TTY_DRIVER_DYNAMIC_ALLOC)) {
+   retval = tty_cdev_add(driver, dev->devt, index, 1);
+   if (retval)
+   return retval;
+   cdev = true;
+   }
+   retval = device_add(dev);
+   if (retval == 0)
+   return 0;
+   if (cdev)
+   cdev_del(>cdevs[index]);
+   return retval;
+}
+EXPORT_SYMBOL(tty_device_add);
 /**
- * tty_register_device_attr - register a tty device
+ * tty_device_initialize_attr - initialize a tty device, but don't 'add'
  * @driver: the tty driver that describes the tty device
  * @index: the index in the tty driver for this tty device
  * @device: a struct device that is associated with this tty device.
@@ -3218,23 +3239,19 @@ static void tty_device_create_release(struct device 
*dev)
  * Returns a pointer to the struct device for this tty device
  * (or ERR_PTR(-EFOO) on error).
  *
- * This call is required to be made to register an individual tty device
- * if the tty driver's flags have the TTY_DRIVER_DYNAMIC_DEV bit set.  If
- * that bit is not set, this function should not be called by a tty
- * driver.
+ * tty_device_add() must be called after this call returns successfully
+ * before the device will be fully registered and available.
  *
  * Locking: ??
  */
-struct device *tty_register_device_attr(struct tty_driver *driver,
-  unsigned index, struct device *device,
-  void *drvdata,
-  const struct attribute_group **attr_grp)
+struct device *tty_device_initialize_attr(struct tty_driver *driver,
+ unsigned index, struct device *device,
+ void *drvdata,
+ const struct attribute_group 
**attr_grp)
 {
char name[64];
dev_t devt = MKDEV(driver->major, driver->minor_start) + index;
struct device *dev = NULL;
-   int retval = -ENODEV;
-   bool cdev = false;
 
if (index >= driver->num) {
printk(KERN_ERR "Attempt to register invalid tty line number "
@@ -3247,19 +3264,11 @@ struct device *tty_register_device_attr(struct 
tty_driver *driver,
else
tty_line_name(driver, index, name);
 
-   if (!(driver->flags & TTY_DRIVER_DYNAMIC_ALLOC)) {
-   retval = tty_cdev_add(driver, devt, index, 1);
-   if (retval)
-   goto error;
-   cdev = true;
-   }
-
dev = kzalloc(sizeof(*dev), GFP_KERNEL);
-   if (!dev) {
-   retval = -ENOMEM;
-   goto error;
-   }
+   if (!dev)
+   return ERR_PTR(-ENOMEM);
 
+   device_initialize(dev);
dev->devt = devt;
dev->class = tty_class;
dev->parent = device;
@@ -3268,17 +3277,48 @@ struct device *tty_register_device_attr(struct tty_driver *driver,
dev->groups = attr_grp;
dev_set_drvdata(dev, drvdata);
 
-   retval = device_register(dev);
-   if (retval)
-   goto error;
-
return dev;
+}
+EXPORT_SYMBOL_GPL(tty_device_initialize_attr);
 
-error:
-   put_device(dev);
-   if (cdev)
-   cdev_del(&driver->cdevs[index]);
-   return ERR_PTR(retval);
+/**
+ * tty_register_device_attr - register a tty device
+ * @driver: the tty driver that describes the tty device
+ * @index: the index in the tty driver for this tty device
+ * @device: a struct device that is associated with this tty device.
+ * This field is optional, if there is no known struct device
+ * for this tty device it can be set to NULL safely.
+ * @drvdata: Driver data to be set to device.
+ * @attr_grp: Attribute group to be set on device.
+ *
+ * Returns a pointer to the

[PATCH 3/4] TTY: add support for uart_slave devices.

2015-05-10 Thread NeilBrown

A "uart slave" is a device permanently connected via UART.
Such a device may need its own driver, e.g. for powering
it up on tty open and powering it down on tty release.

When a device is connected to a UART by a 'standard' bus, such as
RS-232, signaling for power control typically uses "DTR".  When the
connection is permanent, as is common on "embedded" boards, separate
signaling may be needed and this requires a separate driver.

uart-slave is a new bus type for which drivers can be written and
devices created.

A "uart slave" device is declared as a child of the uart in
device-tree:

 {
bluetooth {
compatible = "wi2wi,w2cbw003";
vdd-supply = <>;
};
};

This device will be inserted in the driver-model tree between the uart
and the tty.

The uart-slave driver can replace any of the tty_operations functions
so a call by the tty can be intercepted before being handled by the
uart.

When the tty port is initialized, the uart_slave device is created and
waits for a driver to be bound to it.  Once this happens the tty
device, which was previously initialized, will be added.  This slave
is now considered "finalized".

Any "finalized" slaves will be removed when the tty device is
unregistered, e.g. by destruct_tty_driver.
While slaves are non-finalized they hold a reference to the tty driver
to prevent destruct_tty_driver from being called, as it cannot find
and free slave devices.
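
The initialize/add hand-off described above can be pictured with a small
userspace model.  This is only a sketch under stated assumptions: every
model_* name below is hypothetical and none of it is the kernel's code.
It shows the tty device being initialized in all cases, added immediately
when no slave exists, and added later (making the slave "finalized") once
a driver binds to the slave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
struct model_tty_dev {
	bool initialized;   /* set by the "initialize" step */
	bool added;         /* set by the "add" step; device is now visible */
};

struct model_slave {
	struct model_tty_dev *tty;  /* tty device handed over by the uart core */
	bool finalized;             /* true once the tty device has been added */
};

static void model_tty_initialize(struct model_tty_dev *tty)
{
	tty->initialized = true;
	tty->added = false;
}

/* Add may only happen once, and only after initialize. */
static int model_tty_add(struct model_tty_dev *tty)
{
	if (!tty->initialized || tty->added)
		return -1;
	tty->added = true;
	return 0;
}

/*
 * Port registration: if a slave exists, hand the tty device to it and
 * defer the add; otherwise add the tty device straight away.
 */
static int model_add_port(struct model_tty_dev *tty, struct model_slave *slave)
{
	model_tty_initialize(tty);
	if (slave) {
		slave->tty = tty;
		slave->finalized = false;
		return 0;
	}
	return model_tty_add(tty);
}

/* Called when a driver binds to the slave: the deferred add happens now. */
static int model_slave_bound(struct model_slave *slave)
{
	int ret;

	assert(slave->tty != NULL);
	ret = model_tty_add(slave->tty);
	if (ret == 0)
		slave->finalized = true;
	return ret;
}
```

In this model a port without a slave behaves exactly as before the split,
which is the property the two-step registration is meant to preserve.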

Signed-off-by: NeilBrown 
---
 drivers/tty/serial/Kconfig |1 
 drivers/tty/serial/Makefile|2 
 drivers/tty/serial/serial_core.c   |9 +-
 drivers/tty/serial/slave/Kconfig   |6 +
 drivers/tty/serial/slave/Makefile  |2 
 drivers/tty/serial/slave/uart_slave_core.c |  168 
 drivers/tty/tty_io.c   |3 +
 include/linux/uart_slave.h |   29 +
 8 files changed, 219 insertions(+), 1 deletion(-)
 create mode 100644 drivers/tty/serial/slave/Kconfig
 create mode 100644 drivers/tty/serial/slave/Makefile
 create mode 100644 drivers/tty/serial/slave/uart_slave_core.c
 create mode 100644 include/linux/uart_slave.h

diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
index f8120c1bde14..2601a8fb41a3 100644
--- a/drivers/tty/serial/Kconfig
+++ b/drivers/tty/serial/Kconfig
@@ -1594,4 +1594,5 @@ endmenu
 config SERIAL_MCTRL_GPIO
tristate
 
+source drivers/tty/serial/slave/Kconfig
 endif # TTY
diff --git a/drivers/tty/serial/Makefile b/drivers/tty/serial/Makefile
index c3ac3d930b33..7a6ed85257f6 100644
--- a/drivers/tty/serial/Makefile
+++ b/drivers/tty/serial/Makefile
@@ -96,3 +96,5 @@ obj-$(CONFIG_SERIAL_SPRD) += sprd_serial.o
 
 # GPIOLIB helpers for modem control lines
 obj-$(CONFIG_SERIAL_MCTRL_GPIO)+= serial_mctrl_gpio.o
+
+obj-y += slave/
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 3ea16f524e89..fcad5b30486f 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -2710,10 +2711,16 @@ int uart_add_one_port(struct uart_driver *drv, struct uart_port *uport)
 * Register the port whether it's detected or not.  This allows
 * setserial to be used to alter this port's parameters.
 */
-   tty_dev = tty_port_register_device_attr(port, drv->tty_driver,
+   tty_dev = tty_port_initialize_device_attr(port, drv->tty_driver,
uport->line, uport->dev, port, uport->tty_groups);
if (likely(!IS_ERR(tty_dev))) {
device_set_wakeup_capable(tty_dev, 1);
+   if (uart_slave_register(uport->dev, tty_dev,
+   drv->tty_driver) < 0) {
+   ret = tty_device_add(drv->tty_driver, tty_dev);
+   if (ret)
+   put_device(tty_dev);
+   }
} else {
dev_err(uport->dev, "Cannot register tty device on line %d\n",
   uport->line);
diff --git a/drivers/tty/serial/slave/Kconfig b/drivers/tty/serial/slave/Kconfig
new file mode 100644
index ..6620e78b763e
--- /dev/null
+++ b/drivers/tty/serial/slave/Kconfig
@@ -0,0 +1,6 @@
+menuconfig UART_SLAVE
+   tristate "UART slave devices"
+   depends on OF
+   help
+Devices which attach via a uart, but need extra
+driver support for power management etc.
diff --git a/drivers/tty/serial/slave/Makefile b/drivers/tty/serial/slave/Makefile
new file mode 100644
index ..aac8697fa406
--- /dev/null
+++ b/drivers/tty/serial/slave/Makefile
@@ -0,0 +1,2 @@
+
+obj-$(CONFIG_UART_SLAVE) += uart_slave_core.o
diff --git a/drivers/tty/serial/slave/uart_slave_core.c b/drivers/tty/serial/slave/uart_slave_core.c
new file mode 100644
index ..d48d672300c2
--- /dev/null
+++

[PATCH 1/4] TTY: use class_find_device to find port in uart_suspend/resume.

2015-05-10 Thread NeilBrown

uart_{suspend,resume}_port search the children of a uart device
to find a particular tty device.
This requires all the ttys to be direct children of the uart.

A future patch will allow a 'tty_slave' to intervene between
the port and the uart, voiding this requirement.

So change to use class_find_device.  This is made possible by
exporting a "tty_find_device" from tty_io.c.

This new "tty_find_device" is very similar to the existing
tty_get_device() which has a single caller, so discard
tty_get_device() and just use tty_find_device().
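
As an illustration of why a class-wide lookup by device number does not
care about parentage, here is a minimal userspace sketch.  The model_*
names and the device table are hypothetical; only the MKDEV() bit layout
(20 minor bits) and the "MKDEV(major, minor_start) + line" computation
are taken from the kernel and from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the kernel's MKDEV(): the minor occupies the low 20 bits. */
#define MODEL_MINORBITS 20
#define MODEL_MKDEV(ma, mi) (((uint32_t)(ma) << MODEL_MINORBITS) | (uint32_t)(mi))

struct model_device {
	uint32_t devt;
	const char *name;
};

/* Hypothetical stand-in for the list of devices registered in tty_class. */
static const struct model_device model_tty_class[] = {
	{ MODEL_MKDEV(4, 64), "ttyS0" },
	{ MODEL_MKDEV(4, 65), "ttyS1" },
	{ MODEL_MKDEV(4, 66), "ttyS2" },
};

/* Class-wide lookup by device number; the device's parent is irrelevant. */
static const struct model_device *model_tty_find_device(uint32_t devt)
{
	size_t i;

	for (i = 0; i < sizeof(model_tty_class) / sizeof(model_tty_class[0]); i++)
		if (model_tty_class[i].devt == devt)
			return &model_tty_class[i];
	return NULL;
}

/* devt for a port: MKDEV(major, minor_start) + line, as in the patch. */
static uint32_t model_port_devt(unsigned int major, unsigned int minor_start,
				unsigned int line)
{
	return MODEL_MKDEV(major, minor_start) + line;
}
```

Because the search key is the device number alone, inserting another
device between the uart and the tty would not affect the result.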

Signed-off-by: NeilBrown 
---
 drivers/tty/serial/serial_core.c |   21 -
 drivers/tty/tty_io.c |6 +++---
 include/linux/tty.h  |1 +
 3 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index eb5b03be9dfd..3ea16f524e89 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -2005,26 +2005,19 @@ struct uart_match {
struct uart_driver *driver;
 };
 
-static int serial_match_port(struct device *dev, void *data)
-{
-   struct uart_match *match = data;
-   struct tty_driver *tty_drv = match->driver->tty_driver;
-   dev_t devt = MKDEV(tty_drv->major, tty_drv->minor_start) +
-   match->port->line;
-
-   return dev->devt == devt; /* Actually, only one tty per port */
-}
 
 int uart_suspend_port(struct uart_driver *drv, struct uart_port *uport)
 {
struct uart_state *state = drv->state + uport->line;
struct tty_port *port = &state->port;
struct device *tty_dev;
-   struct uart_match match = {uport, drv};
+   dev_t devt = MKDEV(drv->tty_driver->major,
+  drv->tty_driver->minor_start) +
+   uport->line;
 
mutex_lock(&port->mutex);
 
-   tty_dev = device_find_child(uport->dev, &match, serial_match_port);
+   tty_dev = tty_find_device(devt);
if (device_may_wakeup(tty_dev)) {
if (!enable_irq_wake(uport->irq))
uport->irq_wake = 1;
@@ -2084,12 +2077,14 @@ int uart_resume_port(struct uart_driver *drv, struct uart_port *uport)
struct uart_state *state = drv->state + uport->line;
struct tty_port *port = &state->port;
struct device *tty_dev;
-   struct uart_match match = {uport, drv};
struct ktermios termios;
+   dev_t devt = MKDEV(drv->tty_driver->major,
+  drv->tty_driver->minor_start) +
+   uport->line;
 
mutex_lock(&port->mutex);
 
-   tty_dev = device_find_child(uport->dev, &match, serial_match_port);
+   tty_dev = tty_find_device(devt);
if (!uport->suspended && device_may_wakeup(tty_dev)) {
if (uport->irq_wake) {
disable_irq_wake(uport->irq);
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index e5695467598f..ccb99b772965 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -3077,11 +3077,11 @@ static int dev_match_devt(struct device *dev, const void *data)
 }
 
 /* Must put_device() after it's unused! */
-static struct device *tty_get_device(struct tty_struct *tty)
+struct device *tty_find_device(dev_t devt)
 {
-   dev_t devt = tty_devnum(tty);
return class_find_device(tty_class, NULL, &devt, dev_match_devt);
 }
+EXPORT_SYMBOL(tty_find_device);
 
 
 /**
@@ -3123,7 +3123,7 @@ struct tty_struct *alloc_tty_struct(struct tty_driver *driver, int idx)
tty->ops = driver->ops;
tty->index = idx;
tty_line_name(driver, idx, tty->name);
-   tty->dev = tty_get_device(tty);
+   tty->dev = tty_find_device(tty_devnum(tty));
 
return tty;
 }
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 358a337af598..04d5f1213700 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -461,6 +461,7 @@ extern void tty_vhangup(struct tty_struct *tty);
 extern int tty_hung_up_p(struct file *filp);
 extern void do_SAK(struct tty_struct *tty);
 extern void __do_SAK(struct tty_struct *tty);
+extern struct device *tty_find_device(dev_t devt);
 extern void no_tty(void);
 extern void tty_flush_to_ldisc(struct tty_struct *tty);
 extern void tty_buffer_free_all(struct tty_port *port);



[PATCH 0/4] UART slave device support - version 4

2015-05-10 Thread NeilBrown

Hi all,
 here is version 4 of my "UART slave device" patch set, previously
 known as "tty slave devices".

 The most obvious change is the name.  I realized that this isn't
 really about "tty"s at all - it is about UARTs.

 When a device is connected to a UART via RS-232 (or similar), there
 is a DTR line that can be used for power management, and other "modem
 control" lines.

 On an embedded board, it is very likely that there is no "DTR", and
 any power management needs to be done using some completely separate
 mechanism.

 So these "slaves" are really just for devices permanently attached to
 UARTs without a full "RS-232" (or similar) connection.  The driver
 does all the extra control beyond Tx/Rx.

 The core serial code now "initializes" the "tty device" but may not
 "add" it.
 If it successfully finds a slave device, it gives the tty device to
 the slave.  If not, it calls "device_add()" itself.

 The slave device is responsible for adding the tty device when it is
 ready.

 There are two things that I'm not entirely happy with.

 Firstly there is the "uart_slave_activate()" call in tty_init_dev().
 Possibly this belongs more in tty_driver_install_tty(), but it is a
 bit out-of-place in either.
 To make it really clean I think I would need to create my own
 "tty_driver" which mirrors the one supplied by the uart but has a
 few small modifications.  But that seems like a lot of clumsy work
 for very little gain.

 Secondly, between the creation of the slave device and the binding of
 a driver to it I'm holding an extra reference to the "tty_driver".
 This is to prevent calls to destruct_tty_driver().
 destruct_tty_driver() will unregister all the tty devices.
 But it cannot get at the uart_slave devices.
 I'm not at all sure I have the device unregistering side of things
 right.  As I was writing the code I imagined that I could arrange it
 so that when the tty was unregistered, that would drop the last ref
 on the slave and it would go away... I don't think that is right
 after all.  So that bit needs more work.

 Should the slave devices only be removed when the "uart_slave_core"
 module is removed?

 And I'd very much like comments on the change to be uart-based
 rather than tty-based.

 I've tested this set and it seems to work ... except that something
 is sadly broken with bluetooth support in 4.1-rc1 so I've only really
 tested the GPS driver.  I guess it is time to rebase to -rc3.


Thanks,
NeilBrown

---

NeilBrown (4):
  TTY: use class_find_device to find port in uart_suspend/resume.
  TTY: split tty_register_device_attr into 'initialize' and 'add' parts.
  TTY: add support for uart_slave devices.
  tty/slaves: add a driver to power on/off UART attached devices.


 .../bindings/uart_slave/wi2wi,w2cbw003.txt |   19 +
 .../bindings/uart_slave/wi2wi,w2sg0004.txt |   37 +
 .../devicetree/bindings/vendor-prefixes.txt|1 
 drivers/tty/serial/Kconfig |1 
 drivers/tty/serial/Makefile|2 
 drivers/tty/serial/serial_core.c   |   30 +
 drivers/tty/serial/slave/Kconfig   |   21 +
 drivers/tty/serial/slave/Makefile  |3 
 drivers/tty/serial/slave/serial-power-manager.c|  510 
 drivers/tty/serial/slave/uart_slave_core.c |  168 +++
 drivers/tty/tty_io.c   |  111 +++-
 drivers/tty/tty_port.c |   24 +
 include/linux/tty.h|   10 
 include/linux/uart_slave.h |   29 +
 14 files changed, 918 insertions(+), 48 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/uart_slave/wi2wi,w2cbw003.txt
 create mode 100644 
Documentation/devicetree/bindings/uart_slave/wi2wi,w2sg0004.txt
 create mode 100644 drivers/tty/serial/slave/Kconfig
 create mode 100644 drivers/tty/serial/slave/Makefile
 create mode 100644 drivers/tty/serial/slave/serial-power-manager.c
 create mode 100644 drivers/tty/serial/slave/uart_slave_core.c
 create mode 100644 include/linux/uart_slave.h

--
Signature


Re: [PATCH 1/1] suspend: delete sys_sync()

2015-05-10 Thread Dave Chinner

On Fri, May 08, 2015 at 03:08:43AM -0400, Len Brown wrote:
> From: Len Brown 
> 
> Remove sys_sync() from the kernel's suspend flow.
> 
> sys_sync() is extremely expensive in some configurations,
> and so the kernel should not force users to pay this cost
> on every suspend.

Since when? Please explain what your use case is that makes this
so prohibitively expensive it needs to be removed.

> 
> The user-space utilities s2ram and s2disk choose to invoke sync() today.
> A user can invoke suspend directly via /sys/power/state to skip that cost.

So, you want to have s2disk write all the dirty pages in memory to
the suspend image, rather than to the filesystem?

Either way you have to write that dirty data to disk, but if you
write it to the suspend image, it then has to be loaded again on
resume, and then written again to the filesystem the system has
resumed. This doesn't seem very efficient to me.

And, quite frankly, machines fail to resume from suspend all the
time. e.g. run out of batteries when they are under s2ram
conditions, or s2disk fails because a kernel upgrade was done before
the s2disk and so can't be resumed. With your change, users lose all
the data that was buffered in memory before suspend, whereas right
now it is written to disk and so nothing is lost if the resume from
suspend fails for whatever reason.

IOWs, I can see several good reasons why the sys_sync() needs to
remain in the suspend code. User data safety and filesystem
integrity is far, far more important than a couple of seconds
improvement in suspend speed.
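
For reference, the "sync first, then request suspend" ordering that the
user-space tools are said to use can be sketched as below.  This is a
minimal illustration, not code from s2ram or s2disk: the function name is
made up, and the state file path is a parameter so the routine can be
tried against an ordinary file instead of /sys/power/state.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Flush dirty data, then write a state name to a power-state file.
 * state_path is a parameter so the function can be exercised against an
 * ordinary file; on a real system it would be /sys/power/state.
 * Returns 0 on success or a negative errno value.
 */
static int sync_then_request_suspend(const char *state_path, const char *state)
{
	FILE *f;
	int ret = 0;

	assert(state_path != NULL && state != NULL);

	/*
	 * POSIX only requires sync() to schedule the writes; Linux also
	 * waits for the I/O to complete before returning.
	 */
	sync();

	f = fopen(state_path, "w");
	if (!f)
		return -errno;
	if (fputs(state, f) == EOF)
		ret = -EIO;
	if (fclose(f) == EOF && ret == 0)
		ret = -EIO;
	return ret;
}
```

If the kernel stops syncing on its own, a caller that skips the sync()
step and writes to the state file directly is the case where buffered
data is at risk when resume fails.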

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-10 Thread Don Dutile


On 05/07/2015 09:21 PM, Dave Young wrote:

On 05/07/15 at 10:25am, Don Dutile wrote:

On 05/07/2015 10:00 AM, Dave Young wrote:

On 04/07/15 at 10:12am, Don Dutile wrote:

On 04/06/2015 11:46 PM, Dave Young wrote:

On 04/05/15 at 09:54am, Baoquan He wrote:

On 04/03/15 at 05:21pm, Dave Young wrote:

On 04/03/15 at 05:01pm, Li, ZhenHua wrote:

Hi Dave,

There may be some possibilities that the old iommu data is corrupted by
some other modules. Currently we do not have a better solution for the
dmar faults.

But I think when this happens, we need to fix the module that corrupted
the old iommu data. I once met a similar problem in normal kernel, the
queue used by the qi_* functions was written again by another module.
The fix was in that module, not in iommu module.


It is too late, there will be no chance to save vmcore then.

Also if it is possible to continue corrupt other area of oldmem because
of using old iommu tables then it will cause more problems.

So I think the tables at least need some verification before being used.



Yes, it's good to think about this, and verification is also an
interesting idea. kexec/kdump do a sha256 calculation on loaded kernel
and then verify this again when panic happens in purgatory. This checks
whether any code stomps into region reserved for kexec/kernel and corrupt
the loaded kernel.

If this is decided to do it should be an enhancement to current
patchset but not a approach change. Since this patchset is going very
close to point as maintainers expected maybe this can be merged firstly,
then think about enhancement. After all without this patchset vt-d often
raised error message, hung.


It does not convince me, we should do it right at the beginning instead of
introduce something wrong.

I wonder why the old dma can not be remapped to a specific page in kdump kernel
so that it will not corrupt more memory. But I may have missed something, I will
look for old threads and catch up.

Thanks
Dave


The (only) issue is not corruption, but once the iommu is re-configured, the
old, not-stopped-yet, dma engines will use iova's that will generate dmar
faults, which will be enabled when the iommu is re-configured (even to a
single/simple paging scheme) in the kexec kernel.



Don, so if iommu is not reconfigured then these faults will not happen?


Well, if iommu is not reconfigured, then if the crash isn't caused by
an IOMMU fault (some systems have firmware-first catch the IOMMU fault & convert
them into NMI_IOCK), then the DMA's will continue into the old kernel memory
space.


So NMI_IOCK is one reason to cause kernel hang, I think I'm still not clear
about what re-configured means though. DMAR faults will happen originally,
this is the old behavior, but we are removing the faults by allowing DMA
continuing into old memory space.


A flood of faults occurs when the 2nd kernel (re-)configures the IOMMU because
the second kernel effectively clears/disables all DMA except RMRRs, so any DMA
from 1st kernel will flood the system with faults.  It's the flood of dmar
faults that eventually wedges &/or crashes the system while trying to take a
kdump.




Baoquan and me has a confusion below today about iommu=off/intel_iommu=off:

intel_iommu_init()
{
...

dmar_table_init();

disable active iommu translations;

if (no_iommu || dmar_disabled)
 goto out_free_dmar;

...
}

Any reason not to move the no_iommu check to the beginning of the
intel_iommu_init function?


What does that do/help?


Just do not know why the previous handling is necessary with iommu=off,
shouldn't we do nothing and return earlier?

Also there is a guess, dmar faults appear after iommu_init, so not sure if the
code before the dmar_disabled check has some effect on enabling the fault
messages.

Thanks
Dave




Re: [PATCH v2] clocksource: exynos_mct: fix for sleeping in atomic ctx handling cpu hotplug notif.

2015-05-10 Thread Krzysztof Kozlowski

2015-03-12 18:11 GMT+09:00 Damian Eppel :
> This is to fix an issue of sleeping in atomic context when processing
> hotplug notifications in Exynos MCT(Multi-Core Timer).
> The issue was reproducible on Exynos 3250 (Rinato board) and Exynos 5420
> (Arndale Octa board).
>
> Whilst testing cpu hotplug events on kernel configured with DEBUG_PREEMPT
> and DEBUG_ATOMIC_SLEEP we get following BUG message, caused by calling
> request_irq() and free_irq() in the context of hotplug notification
> (which is in this case atomic context).
>
> root@target:~# echo 0 > /sys/devices/system/cpu/cpu1/online
>
> [   25.157867] IRQ18 no longer affine to CPU1
> ...
> [   25.158445] CPU1: shutdown
>
> root@target:~# echo 1 > /sys/devices/system/cpu/cpu1/online
>
> [   40.785859] CPU1: Software reset
> [   40.786660] BUG: sleeping function called from invalid context at mm/slub.c:1241
> [   40.786668] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/1
> [   40.786678] Preemption disabled at:[<  (null)>]   (null)
> [   40.786681]
> [   40.786692] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc4-00024-g7dca860 #36
> [   40.786698] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> [   40.786728] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> [   40.786747] [] (show_stack) from [] (dump_stack+0x70/0xbc)
> [   40.786767] [] (dump_stack) from [] (kmem_cache_alloc+0xd8/0x170)
> [   40.786785] [] (kmem_cache_alloc) from [] (request_threaded_irq+0x64/0x128)
> [   40.786804] [] (request_threaded_irq) from [] (exynos4_local_timer_setup+0xc0/0x13c)
> [   40.786820] [] (exynos4_local_timer_setup) from [] (exynos4_mct_cpu_notify+0x30/0xa8)
> [   40.786838] [] (exynos4_mct_cpu_notify) from [] (notifier_call_chain+0x44/0x84)
> [   40.786857] [] (notifier_call_chain) from [] (__cpu_notify+0x28/0x44)
> [   40.786873] [] (__cpu_notify) from [] (secondary_start_kernel+0xec/0x150)
> [   40.786886] [] (secondary_start_kernel) from [<40008764>] (0x40008764)
>
> Solution:
> Clockevent irqs cannot be requested/freed every time cpu is
> hot-plugged/unplugged as CPU_STARTING/CPU_DYING notifications
> that signal hotplug or unplug events are sent with both preemption
> and local interrupts disabled. Since request_irq() may sleep it is
> moved to the initialization stage and performed for all possible
> cpus prior to putting them online. Then, in the case of hotplug event
> the irq associated with the given cpu will simply be enabled.
> Similarly on cpu unplug event the interrupt is not freed but just
> disabled.
>
> Note that after successful request_irq() call for a clockevent device
> associated to given cpu the requested irq is disabled (via disable_irq()).
> That is to make things symmetric as we expect hotplug event as a next
> thing (which will enable irq again). This should not pose any problems
> because at the time of requesting irq the clockevent device is not
> fully initialized yet, therefore should not produce interrupts at that point.
>
> For disabling an irq at cpu unplug notification disable_irq_nosync() is
> chosen which is a non-blocking function. This again shouldn't be a problem as
> prior to sending CPU_DYING notification interrupts are migrated away
> from the cpu.
>
> Fixes: 7114cd749a12 ("clocksource: exynos_mct: use (request/free)_irq calls for local timer registration")
> Signed-off-by: Damian Eppel 
> Cc: 
> Reported-by: Krzysztof Kozlowski 
> Reviewed-by: Krzysztof Kozlowski 
> Tested-by: Krzysztof Kozlowski 
> (Tested on Arndale Octa Board)
> Tested-by: Marcin Jabrzyk 
> (Tested on Rinato B2 (Exynos 3250) board)
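
The approach in the quoted commit message can be modelled in userspace
roughly as follows.  This is a sketch only: the model_* names, the CPU
count and the "atomic" flag are assumptions made for illustration and are
not the driver's code.  The point it shows is that the sleeping request
step runs once at init, while the hotplug paths only flip an
enabled/disabled state.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical per-CPU timer interrupt state; not the driver's real data. */
struct model_irq {
	bool requested;  /* request_irq() equivalent has been done */
	bool enabled;    /* interrupt line currently enabled */
};

static struct model_irq model_irqs[MODEL_NR_CPUS];
static bool model_atomic;  /* true while a hotplug notification runs */

/* Stand-in for request_irq(): may sleep, so never valid in atomic context. */
static void model_request_irq(int cpu)
{
	assert(!model_atomic);
	model_irqs[cpu].requested = true;
	model_irqs[cpu].enabled = true;
}

/* Init stage (sleepable): request every possible CPU's irq, leave it disabled. */
static void model_init(void)
{
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		model_request_irq(cpu);
		model_irqs[cpu].enabled = false;   /* disable_irq() equivalent */
	}
}

/* CPU_STARTING equivalent (atomic): only enable the already-requested irq. */
static void model_cpu_starting(int cpu)
{
	model_atomic = true;
	assert(model_irqs[cpu].requested);
	model_irqs[cpu].enabled = true;
	model_atomic = false;
}

/* CPU_DYING equivalent (atomic): disable without waiting, never free. */
static void model_cpu_dying(int cpu)
{
	model_atomic = true;
	model_irqs[cpu].enabled = false;   /* disable_irq_nosync() equivalent */
	model_atomic = false;
}
```

Repeated unplug/plug cycles in this model never reach the request step
again, which is what removes the sleeping call from the atomic path.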

Hi Daniel and Thomas,

Do you have any comments on this patch? Could you pick it up?

Best regards,
Krzysztof

Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

2015-05-10 Thread Don Dutile


On 05/07/2015 10:00 AM, Dave Young wrote:

On 04/07/15 at 10:12am, Don Dutile wrote:

On 04/06/2015 11:46 PM, Dave Young wrote:

On 04/05/15 at 09:54am, Baoquan He wrote:

On 04/03/15 at 05:21pm, Dave Young wrote:

On 04/03/15 at 05:01pm, Li, ZhenHua wrote:

Hi Dave,

There may be some possibilities that the old iommu data is corrupted by
some other modules. Currently we do not have a better solution for the
dmar faults.

But I think when this happens, we need to fix the module that corrupted
the old iommu data. I once met a similar problem in normal kernel, the
queue used by the qi_* functions was written again by another module.
The fix was in that module, not in iommu module.


It is too late, there will be no chance to save vmcore then.

Also if it is possible to continue corrupt other area of oldmem because
of using old iommu tables then it will cause more problems.

So I think the tables at least need some verification before being used.



Yes, it's good to think about this, and verification is also an
interesting idea. kexec/kdump do a sha256 calculation on loaded kernel
and then verify this again when panic happens in purgatory. This checks
whether any code stomps into region reserved for kexec/kernel and corrupt
the loaded kernel.

If this is decided to do it should be an enhancement to current
patchset but not a approach change. Since this patchset is going very
close to point as maintainers expected maybe this can be merged firstly,
then think about enhancement. After all without this patchset vt-d often
raised error message, hung.


It does not convince me, we should do it right at the beginning instead of
introduce something wrong.

I wonder why the old dma can not be remapped to a specific page in kdump kernel
so that it will not corrupt more memory. But I may have missed something, I will
look for old threads and catch up.

Thanks
Dave


The (only) issue is not corruption, but once the iommu is re-configured, the
old, not-stopped-yet, dma engines will use iova's that will generate dmar
faults, which will be enabled when the iommu is re-configured (even to a
single/simple paging scheme) in the kexec kernel.



Don, so if iommu is not reconfigured then these faults will not happen?

Baoquan and me has a confusion below today about iommu=off/intel_iommu=off:

intel_iommu_init()
{
...

dmar_table_init();

disable active iommu translations;

if (no_iommu || dmar_disabled)
 goto out_free_dmar;

...
}

Any reason not to move the no_iommu check to the beginning of the
intel_iommu_init function?

Thanks
Dave

Looks like you could.



Re: [PATCH v3 4/4] block: loop: support DIO & AIO

2015-05-10 Thread Dave Chinner

On Fri, May 08, 2015 at 12:09:04PM +0800, Ming Lei wrote:
> On Fri, May 8, 2015 at 6:20 AM, Dave Chinner  wrote:
> > On Thu, May 07, 2015 at 08:32:39PM +0800, Ming Lei wrote:
> >> On Thu, May 7, 2015 at 3:24 PM, Christoph Hellwig wrote:
> >> >> @@ -441,6 +500,12 @@ static void do_loop_switch(struct loop_device *lo, struct switch_request *p)
> >> >>   mapping->host->i_bdev->bd_block_size : PAGE_SIZE;
> >> >>   lo->old_gfp_mask = mapping_gfp_mask(mapping);
> >> >>   mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS));
> >> >> +
> >> >> + lo->support_dio = mapping->a_ops && mapping->a_ops->direct_IO;
> >> >> + if (lo->support_dio)
> >> >> + lo->use_aio = true;
> >> >> + else
> >> >> + lo->use_aio = false;
> >> >
> >> > We need an explicit userspace opt-in for this.  For one direct I/O can't
> >>
> >> Actually this patch is one simplified version, and my old version
> >> has exported two sysfs files(use_aio, use_dio) which can control
> >> if direct IO or AIO is used but only AIO is enabled if DIO is set. Finally
> >> I think it isn't necessary because dio/aio works well from the tests,
> >> and userspace shouldn't care if it is AIO or not if the performance
> >> is good.
> >
> > Performance won't always be good.
> >
> > It looks to me that this has an unbound queue depth for AIO.  What
...
> Loop has been converted to blk-mq, and the current queue depth is
> 128, so there isn't the problem you worried about, is there?

Oh, OK, that's new. I didn't realise this conversion had already
been done.

> >> > handle sub-sector size access and people use the loop device as a
> >> > workaround for that.
> >>
> >> Yes, user can do that, could you explain a bit what the problem is?
> >
> > I have a 4k sector backing device and a 512 byte sector filesystem
> > image. I can't do 512 byte direct IO to the filesystem image, so I
> > can't run tools that handle fs images in files using direct IO on
> > that file. Create a loop device with the filesystem image, and now I
> > can do 512 byte direct IO to the filesystem image, because all that
> > direct IO to the filesystem image is now buffered by the loop
> > device.
> >
> > If the loop device does direct io in this situation, the backing
> > filesystem rejects direct IO from the loop device because it is not
> > sector (4k) sized/aligned. User now swears, shouts and curses you
> > from afar.
> 
> Yes, it is one problem, but looks it can be addressed by adding the
> following in loop_set_fd():
> 
>  if (inode->i_sb->s_bdev)
> blk_queue_logical_block_size(lo->lo_queue,
>bdev_io_min(inode->i_sb->s_bdev));

How does that address the problem of not being able to do 512 byte
IOs to a loop device that does direct IO to a file hosted on a 4k
sector filesystem?

AFAICT, in that case bdev_io_min(inode->i_sb->s_bdev) will return 4k
because it comes from the backing filesystem, and so the minimum IO
size is still going to be 4k for such configurations...
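
The 512-byte versus 4k-sector mismatch can be made concrete with a small
alignment check.  The sketch below is a hypothetical model (the function
names and parameters are assumptions, not loop driver code); it only
encodes the rule that direct IO has to be aligned to the block size of
whatever device finally services it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Direct IO must be aligned, in both offset and length, to the logical
 * block size of the device servicing it. block_size must be a power of two.
 */
static bool dio_aligned(uint64_t offset, uint32_t len, uint32_t block_size)
{
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	return len != 0 &&
	       (offset & (block_size - 1)) == 0 &&
	       (len & (block_size - 1)) == 0;
}

/*
 * Model of a loop device over a file on a backing device.
 * With buffered backing IO only the loop device's own block size matters,
 * because the page cache absorbs the sub-sector access. With direct backing
 * IO the request must also satisfy the backing device's block size.
 */
static bool model_loop_io_ok(uint64_t offset, uint32_t len,
			     uint32_t loop_block_size,
			     uint32_t backing_block_size,
			     bool backing_uses_dio)
{
	if (!dio_aligned(offset, len, loop_block_size))
		return false;
	if (backing_uses_dio)
		return dio_aligned(offset, len, backing_block_size);
	return true;
}
```

In this model, raising the loop device's block size to 4k merely moves
the rejection of the 512-byte request from the backing filesystem to the
loop device itself; the request still cannot be issued.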

> > DIO and AIO behaviour needs to be configurable through losetup, and
> > most definitely not the default behaviour.
> 
> Could you share if there are other reasons for making it configurable via
> losetup suppose the above issues can be fixed?

It's not about "fixing issues" - losetup is the tool we use for
configuring loop device behaviour. If a user wants to create a loop
device with a specific configuration, then that is the tool they
use. You add direct IO for the loop device, that needs to be
configurable because switching to direct IO will significantly
change performance for many workloads loop devices are used for.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

[PATCH v3 2/2] ARM: EXYNOS: Use of_machine_is_compatible instead of soc_is_exynos4

2015-05-10 Thread Krzysztof Kozlowski

of_machine_is_compatible() seems to be preferred over soc_is_exynos4().

Signed-off-by: Krzysztof Kozlowski 

---
Changes since v2:
1. New patch, requested by Kukjin Kim.
---
 arch/arm/mach-exynos/exynos.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-exynos/exynos.c b/arch/arm/mach-exynos/exynos.c
index c3bfbba3006d..5917a30eee33 100644
--- a/arch/arm/mach-exynos/exynos.c
+++ b/arch/arm/mach-exynos/exynos.c
@@ -179,7 +179,7 @@ static void __init exynos_init_io(void)
  */
 void exynos_set_delayed_reset_assertion(bool enable)
 {
-   if (soc_is_exynos4()) {
+   if (of_machine_is_compatible("samsung,exynos4")) {
unsigned int tmp, core_id;
 
for (core_id = 0; core_id < num_possible_cpus(); core_id++) {
-- 
1.9.1


[PATCH v3 1/2] ARM: EXYNOS: Fix failed second suspend on Exynos4

2015-05-10 Thread Krzysztof Kozlowski

On Exynos4412 boards (Trats2, Odroid U3) after enabling L2 cache in
56b60b8bce4a ("ARM: 8265/1: dts: exynos4: Add nodes for L2 cache
controller") the second suspend to RAM failed. The first suspend worked fine
but the next one hung just after powering down the secondary CPUs (the system
consumed energy as if it were running but was not responsive).

The issue was caused by enabling delayed reset assertion for CPU0 just
after issuing power down of cores. This was introduced for Exynos4 in
13cfa6c4f7fa ("ARM: EXYNOS: Fix CPU idle clock down after CPU off").

The whole behavior is not well documented but after checking with vendor
code this should be done like this (on Exynos4):
1. Enable delayed reset assertion when system is running (for all CPUs).
2. Disable delayed reset assertion before suspending the system.
   This can be done after powering off secondary CPUs.
3. Re-enable the delayed reset assertion when system is resumed.
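The three steps above can be sketched as a small user-space simulation. This is only an illustration under stated assumptions: the array core_option[] stands in for the per-core PMU registers, NR_CORES and the bit position of the flag are placeholders, and system_boot(), system_suspend() and system_resume() are hypothetical names rather than functions from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CORES 4                              /* assumption: quad-core Exynos4412 */
#define USE_DELAYED_RESET_ASSERTION (1u << 12)  /* placeholder bit position */

/* Stand-in for the per-core ARM_CORE_OPTION registers in the PMU. */
static uint32_t core_option[NR_CORES];

/* Set or clear the flag for every core, leaving all other bits untouched. */
static void set_delayed_reset_assertion(bool enable)
{
	for (unsigned int core = 0; core < NR_CORES; core++) {
		if (enable)
			core_option[core] |= USE_DELAYED_RESET_ASSERTION;
		else
			core_option[core] &= ~USE_DELAYED_RESET_ASSERTION;
	}
}

/* Number of cores that currently have the flag set. */
static unsigned int cores_with_flag(void)
{
	unsigned int count = 0;

	for (unsigned int core = 0; core < NR_CORES; core++)
		if (core_option[core] & USE_DELAYED_RESET_ASSERTION)
			count++;
	return count;
}

/* Step 1: the system is running, enable for all CPUs. */
static void system_boot(void)    { set_delayed_reset_assertion(true); }
/* Step 2: about to suspend, secondary CPUs already powered off. */
static void system_suspend(void) { set_delayed_reset_assertion(false); }
/* Step 3: the system has resumed, enable again. */
static void system_resume(void)  { set_delayed_reset_assertion(true); }
```

The point of the sketch is the ordering: the flag is a system-wide running-state setting rather than something toggled per hot-unplugged CPU.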

Signed-off-by: Krzysztof Kozlowski 
Fixes: 13cfa6c4f7fa ("ARM: EXYNOS: Fix CPU idle clock down after CPU off")
Cc: 
Tested-by: Bartlomiej Zolnierkiewicz 
Tested-by: Chanwoo Choi 

---
Changes since v2:
None.
---
 arch/arm/mach-exynos/common.h  |  2 ++
 arch/arm/mach-exynos/exynos.c  | 27 +++
 arch/arm/mach-exynos/platsmp.c | 39 ++-
 arch/arm/mach-exynos/suspend.c |  3 +++
 4 files changed, 34 insertions(+), 37 deletions(-)

diff --git a/arch/arm/mach-exynos/common.h b/arch/arm/mach-exynos/common.h
index acd5b560b728..5f5cd562c593 100644
--- a/arch/arm/mach-exynos/common.h
+++ b/arch/arm/mach-exynos/common.h
@@ -159,6 +159,8 @@ extern void exynos_enter_aftr(void);
 
 extern struct cpuidle_exynos_data cpuidle_coupled_exynos_data;
 
+extern void exynos_set_delayed_reset_assertion(bool enable);
+
 extern void s5p_init_cpu(void __iomem *cpuid_addr);
 extern unsigned int samsung_rev(void);
 extern void __iomem *cpu_boot_reg_base(void);
diff --git a/arch/arm/mach-exynos/exynos.c b/arch/arm/mach-exynos/exynos.c
index bcde0dd668df..c3bfbba3006d 100644
--- a/arch/arm/mach-exynos/exynos.c
+++ b/arch/arm/mach-exynos/exynos.c
@@ -167,6 +167,33 @@ static void __init exynos_init_io(void)
 }
 
 /*
+ * Set or clear the USE_DELAYED_RESET_ASSERTION option. Used by smp code
+ * and suspend.
+ *
+ * This is necessary only on Exynos4 SoCs. When system is running
+ * USE_DELAYED_RESET_ASSERTION should be set so the ARM CLK clock down
+ * feature could properly detect global idle state when secondary CPU is
+ * powered down.
+ *
+ * However this should not be set when such system is going into suspend.
+ */
+void exynos_set_delayed_reset_assertion(bool enable)
+{
+   if (soc_is_exynos4()) {
+   unsigned int tmp, core_id;
+
+   for (core_id = 0; core_id < num_possible_cpus(); core_id++) {
+   tmp = pmu_raw_readl(EXYNOS_ARM_CORE_OPTION(core_id));
+   if (enable)
+   tmp |= S5P_USE_DELAYED_RESET_ASSERTION;
+   else
+   tmp &= ~(S5P_USE_DELAYED_RESET_ASSERTION);
+   pmu_raw_writel(tmp, EXYNOS_ARM_CORE_OPTION(core_id));
+   }
+   }
+}
+
+/*
  * Apparently, these SoCs are not able to wake-up from suspend using
  * the PMU. Too bad. Should they suddenly become capable of such a
  * feat, the matches below should be moved to suspend.c.
diff --git a/arch/arm/mach-exynos/platsmp.c b/arch/arm/mach-exynos/platsmp.c
index ebd135bb0995..a825bca2a2b6 100644
--- a/arch/arm/mach-exynos/platsmp.c
+++ b/arch/arm/mach-exynos/platsmp.c
@@ -34,30 +34,6 @@
 
 extern void exynos4_secondary_startup(void);
 
-/*
- * Set or clear the USE_DELAYED_RESET_ASSERTION option, set on Exynos4 SoCs
- * during hot-(un)plugging CPUx.
- *
- * The feature can be cleared safely during first boot of secondary CPU.
- *
- * Exynos4 SoCs require setting USE_DELAYED_RESET_ASSERTION during powering
- * down a CPU so the CPU idle clock down feature could properly detect global
- * idle state when CPUx is off.
- */
-static void exynos_set_delayed_reset_assertion(u32 core_id, bool enable)
-{
-   if (soc_is_exynos4()) {
-   unsigned int tmp;
-
-   tmp = pmu_raw_readl(EXYNOS_ARM_CORE_OPTION(core_id));
-   if (enable)
-   tmp |= S5P_USE_DELAYED_RESET_ASSERTION;
-   else
-   tmp &= ~(S5P_USE_DELAYED_RESET_ASSERTION);
-   pmu_raw_writel(tmp, EXYNOS_ARM_CORE_OPTION(core_id));
-   }
-}
-
 #ifdef CONFIG_HOTPLUG_CPU
 static inline void cpu_leave_lowpower(u32 core_id)
 {
@@ -73,8 +49,6 @@ static inline void cpu_leave_lowpower(u32 core_id)
  : "=&r" (v)
  : "Ir" (CR_C), "Ir" (0x40)
  : "cc");
-
-exynos_set_delayed_reset_assertion(core_id, false);
 }
 
 static inline void platform_do_lowpower(unsigned int cpu, int *spurious)
@@ -87,14 +61,6 @@ static inline void platform_do_lowpower(unsigned int cpu,

linux-next: build failure after merge of the vfs tree

2015-05-10 Thread Stephen Rothwell

Hi Al,

After merging the vfs tree, today's linux-next build (x86_64
allmodconfig) failed like this:

fs/f2fs/namei.c: In function 'f2fs_encrypted_follow_link':
fs/f2fs/namei.c:336:10: warning: passing argument 2 of 'f2fs_follow_link' from incompatible pointer type
   return f2fs_follow_link(dentry, nd);
  ^
fs/f2fs/namei.c:311:20: note: expected 'void **' but argument is of type 'struct nameidata *'
 static const char *f2fs_follow_link(struct dentry *dentry, void **cookie)
^
fs/f2fs/namei.c:336:3: warning: return discards 'const' qualifier from pointer target type
   return f2fs_follow_link(dentry, nd);
   ^
fs/f2fs/namei.c:379:2: error: implicit declaration of function 'nd_set_link' [-Werror=implicit-function-declaration]
  nd_set_link(nd, paddr);
  ^
fs/f2fs/namei.c: In function 'f2fs_encrypted_put_link':
fs/f2fs/namei.c:400:3: error: implicit declaration of function 'nd_get_link' [-Werror=implicit-function-declaration]
   kfree(nd_get_link(nd));
   ^
fs/f2fs/namei.c:400:3: warning: passing argument 1 of 'kfree' makes pointer from integer without a cast
In file included from fs/f2fs/f2fs.h:17:0,
 from fs/f2fs/namei.c:19:
include/linux/slab.h:143:6: note: expected 'const void *' but argument is of type 'int'
 void kfree(const void *);
  ^
fs/f2fs/namei.c: At top level:
fs/f2fs/namei.c:960:2: warning: initialization from incompatible pointer type
  .follow_link= f2fs_encrypted_follow_link,
  ^
fs/f2fs/namei.c:960:2: warning: (near initialization for 'f2fs_symlink_inode_operations.follow_link')
fs/f2fs/namei.c:961:2: warning: initialization from incompatible pointer type
  .put_link   = f2fs_encrypted_put_link,
  ^
fs/f2fs/namei.c:961:2: warning: (near initialization for 'f2fs_symlink_inode_operations.put_link')

Caused by commits cf41cea5a829 ("new ->follow_link() and ->put_link()
calling conventions") and 0ad7e33ea980 ("don't pass nameidata to
->follow_link()") from the vfs tree interacting with commit
5270e98c341b ("f2fs crypto: add symlink encryption") from the f2fs tree.

I applied the following merge fix patch (which I suspect is not
completely correct - especially the f2fs_encrypted_put_link part):

From: Stephen Rothwell 
Date: Mon, 11 May 2015 11:22:19 +1000
Subject: [PATCH] f2fs: merge fix for follow_link and put_link changes

Signed-off-by: Stephen Rothwell 
---
 fs/f2fs/namei.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index d7ed99ebe95b..42af89cdb9a4 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -320,8 +320,8 @@ static const char *f2fs_follow_link(struct dentry *dentry, void **cookie)
 }
 
 #ifdef CONFIG_F2FS_FS_ENCRYPTION
-static void *f2fs_encrypted_follow_link(struct dentry *dentry,
-   struct nameidata *nd)
+static const char *f2fs_encrypted_follow_link(struct dentry *dentry,
+   void **cookie)
 {
struct page *cpage = NULL;
char *caddr, *paddr = NULL;
@@ -333,7 +333,7 @@ static void *f2fs_encrypted_follow_link(struct dentry *dentry,
u32 max_size = inode->i_sb->s_blocksize;
 
if (!f2fs_encrypted_inode(inode))
-   return f2fs_follow_link(dentry, nd);
+   return f2fs_follow_link(dentry, cookie);
 
res = f2fs_setup_fname_crypto(inode);
if (res)
@@ -341,7 +341,7 @@ static void *f2fs_encrypted_follow_link(struct dentry *dentry,
 
cpage = read_mapping_page(inode->i_mapping, 0, NULL);
if (IS_ERR(cpage))
-   return cpage;
+   return ERR_CAST(cpage);
caddr = kmap(cpage);
caddr[size] = 0;
 
@@ -376,12 +376,11 @@ static void *f2fs_encrypted_follow_link(struct dentry *dentry,
/* Null-terminate the name */
if (res <= cstr.len)
paddr[res] = '\0';
-   nd_set_link(nd, paddr);
if (cpage) {
kunmap(cpage);
page_cache_release(cpage);
}
-   return NULL;
+   return *cookie = paddr;
 errout:
if (cpage) {
kunmap(cpage);
@@ -391,14 +390,11 @@ errout:
return ERR_PTR(res);
 }
 
-static void f2fs_encrypted_put_link(struct dentry *dentry, struct nameidata *nd,
- void *cookie)
+static void f2fs_encrypted_put_link(struct dentry *dentry, void *cookie)
 {
struct page *page = cookie;
 
-   if (!page) {
-   kfree(nd_get_link(nd));
-   } else {
+   if (page) {
kunmap(page);
page_cache_release(page);
}
-- 
2.1.4

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
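
For readers unfamiliar with the calling convention the merge fix adapts to: the lookup function returns the link body and stores an opaque cookie, and the release function later receives only that cookie. The following is a rough user-space analogue, assuming the body is a heap buffer that the release function frees; demo_follow_link() and demo_put_link() are hypothetical names, not the f2fs or VFS functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical analogue of a follow_link callback: returns the link body
 * and stores in *cookie whatever the matching put function needs in order
 * to release it. Returns NULL when the allocation fails.
 */
static const char *demo_follow_link(const char *target, void **cookie)
{
	char *buf = malloc(strlen(target) + 1);

	if (!buf)
		return NULL;
	strcpy(buf, target);
	/* Same idiom as in the patch: store the cookie and return it. */
	return *cookie = buf;
}

/* Hypothetical analogue of a put_link callback: releases the cookie. */
static void demo_put_link(void *cookie)
{
	free(cookie);
}
```

In this sketch the cookie is the buffer itself, so whatever type the lookup side stores is the type the release side must expect.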



[PATCH] tty: rocket: fix comment of ROCKET_SPD_HI

2015-05-10 Thread Masahiro Yamada

This comment does not reflect the actual code.  It should be 57600,
not 56000.

Signed-off-by: Masahiro Yamada 
---

 drivers/tty/rocket.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/rocket.h b/drivers/tty/rocket.h
index ec863f3..c11a939 100644
--- a/drivers/tty/rocket.h
+++ b/drivers/tty/rocket.h
@@ -44,7 +44,7 @@ struct rocket_version {
 #define ROCKET_HUP_NOTIFY  0x0004
 #define ROCKET_SPLIT_TERMIOS   0x0008
 #define ROCKET_SPD_MASK 0x0070
-#define ROCKET_SPD_HI  0x0010  /* Use 56000 instead of 38400 bps */
+#define ROCKET_SPD_HI  0x0010  /* Use 57600 instead of 38400 bps */
 #define ROCKET_SPD_VHI 0x0020  /* Use 115200 instead of 38400 bps */
 #define ROCKET_SPD_SHI 0x0030  /* Use 230400 instead of 38400 bps */
 #define ROCKET_SPD_WARP 0x0040  /* Use 460800 instead of 38400 bps */
-- 
1.9.1
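
As an illustration of what the corrected comment describes, here is a minimal sketch of how the speed flags could select a replacement rate when 38400 bps is requested. The flag values and rates are taken from the header context in the patch above; the helper rocket_baud_for_38400() is hypothetical and is not the driver's actual function.

```c
#include <assert.h>

/* Flag values as listed in drivers/tty/rocket.h in the patch above. */
#define ROCKET_SPD_MASK 0x0070
#define ROCKET_SPD_HI   0x0010
#define ROCKET_SPD_VHI  0x0020
#define ROCKET_SPD_SHI  0x0030
#define ROCKET_SPD_WARP 0x0040

/*
 * Hypothetical helper: given the port flags, return the rate actually
 * used when the caller asks for 38400 bps.
 */
static unsigned int rocket_baud_for_38400(unsigned int flags)
{
	switch (flags & ROCKET_SPD_MASK) {
	case ROCKET_SPD_HI:
		return 57600;
	case ROCKET_SPD_VHI:
		return 115200;
	case ROCKET_SPD_SHI:
		return 230400;
	case ROCKET_SPD_WARP:
		return 460800;
	default:
		return 38400;	/* no speed flag set */
	}
}
```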


[PATCH] tty: fix comment of ASYNCB_SPD_HI

2015-05-10 Thread Masahiro Yamada

This comment does not reflect the actual code.  It should be 57600,
not 56000.

Signed-off-by: Masahiro Yamada 
---

 include/uapi/linux/tty_flags.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/tty_flags.h b/include/uapi/linux/tty_flags.h
index fae4864..072e41e 100644
--- a/include/uapi/linux/tty_flags.h
+++ b/include/uapi/linux/tty_flags.h
@@ -15,7 +15,7 @@
 #define ASYNCB_FOURPORT 1 /* Set OU1, OUT2 per AST Fourport settings */
 #define ASYNCB_SAK  2 /* Secure Attention Key (Orange book) */
 #define ASYNCB_SPLIT_TERMIOS 3 /* [x] Separate termios for dialin/callout */
-#define ASYNCB_SPD_HI   4 /* Use 56000 instead of 38400 bps */
+#define ASYNCB_SPD_HI   4 /* Use 57600 instead of 38400 bps */
 #define ASYNCB_SPD_VHI  5 /* Use 115200 instead of 38400 bps */
 #define ASYNCB_SKIP_TEST 6 /* Skip UART test during autoconfiguration */
 #define ASYNCB_AUTO_IRQ 7 /* Do automatic IRQ during
-- 
1.9.1


[PATCH -tip] locking/pvqspinlock: replace xchg() by the more descriptive set_mb()

2015-05-10 Thread Waiman Long

The xchg() function was used in pv_wait_node() to set a certain value
and provide a memory barrier which is what the set_mb() function
is for.  This patch replaces the xchg() call by set_mb().

Suggested-by: Linus Torvalds 
Signed-off-by: Waiman Long 
---
 kernel/locking/qspinlock_paravirt.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index b5758a9..27ab96d 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -175,7 +175,7 @@ static void pv_wait_node(struct mcs_spinlock *node)
 *
 * Matches the xchg() from pv_kick_node().
 */
-   (void)xchg(&pn->state, vcpu_halted);
+   set_mb(pn->state, vcpu_halted);
 
if (!READ_ONCE(node->locked))
pv_wait(&pn->state, vcpu_halted);
-- 
1.7.1
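
The difference between the two forms can be sketched in user space with C11 atomics. This is an analogy only, assuming set_mb() behaves as "store, then full memory barrier"; set_state_xchg() and set_state_mb() are hypothetical names and the kernel primitives are not used here.

```c
#include <assert.h>
#include <stdatomic.h>

enum { vcpu_running = 0, vcpu_halted = 1 };

/* Old form: an atomic exchange whose return value is simply discarded. */
static void set_state_xchg(atomic_int *state, int value)
{
	(void)atomic_exchange(state, value);
}

/*
 * New form, by analogy with set_mb(): a plain store followed by a full
 * fence, which states the intent (set a value, then order later accesses)
 * directly.
 */
static void set_state_mb(atomic_int *state, int value)
{
	atomic_store_explicit(state, value, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}
```

Both helpers leave the same value in *state; the second one just makes it explicit that the old value is of no interest.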

linux-next: manual merge of the vfs tree with the ext4 tree

2015-05-10 Thread Stephen Rothwell

Hi Al,

Today's linux-next merge of the vfs tree got a conflict in
fs/ext4/symlink.c between commit fd64e6fd4575 ("ext4 crypto: reorganize
how we store keys in the inode") from the ext4 tree and commits
5542f03602af ("ext4: split inode_operations for encrypted symlinks off
the rest") and cf41cea5a829 ("new ->follow_link() and ->put_link()
calling conventions") from the vfs tree.

I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc fs/ext4/symlink.c
index 32870881188e,ba5bd18a9825..
--- a/fs/ext4/symlink.c
+++ b/fs/ext4/symlink.c
@@@ -34,20 -35,19 +34,17 @@@ static const char *ext4_follow_link(str
int res;
u32 plen, max_size = inode->i_sb->s_blocksize;
  
-   if (!ext4_encrypted_inode(inode))
-   return page_follow_link_light(dentry, nd);
- 
 -  ctx = ext4_get_fname_crypto_ctx(inode, inode->i_sb->s_blocksize);
 -  if (IS_ERR(ctx))
 -  return ERR_CAST(ctx);
 +  res = ext4_setup_fname_crypto(inode);
 +  if (res)
 +  return ERR_PTR(res);
  
if (ext4_inode_is_fast_symlink(inode)) {
caddr = (char *) EXT4_I(inode)->i_data;
max_size = sizeof(EXT4_I(inode)->i_data);
} else {
cpage = read_mapping_page(inode->i_mapping, 0, NULL);
 -  if (IS_ERR(cpage)) {
 -  ext4_put_fname_crypto_ctx(&ctx);
 +  if (IS_ERR(cpage))
-   return cpage;
+   return ERR_CAST(cpage);
 -  }
caddr = kmap(cpage);
caddr[size] = 0;
}
@@@ -78,13 -77,14 +75,12 @@@
/* Null-terminate the name */
if (res <= plen)
paddr[res] = '\0';
-   nd_set_link(nd, paddr);
 -  ext4_put_fname_crypto_ctx(&ctx);
if (cpage) {
kunmap(cpage);
page_cache_release(cpage);
}
-   return NULL;
+   return *cookie = paddr;
  errout:
 -  ext4_put_fname_crypto_ctx(&ctx);
if (cpage) {
kunmap(cpage);
page_cache_release(cpage);



Re: vsprintf: Add support for userspace strings

2015-05-10 Thread Masami Hiramatsu

On 2015/05/11 4:42, Richard Weinberger wrote:
> While debugging issues I often add (trace_)printks to strategic positions.
> Dealing with user provided strings is complicated as an extra buffer and a
> copy_from_user() are needed.
> This adds a new format string to allow direct printing of such strings.
> 
> My initial plan was to use %pU but 'U' is already taken, therefore
> I used the next letter which comes to mind when one thinks of userspace,
> 'L'.
> The %pL format string works exactly like %s.

BTW, if you need to do this for debug, you can also use ftrace's kprobe-tracer
(and perf probe) which allows you to dump userspace strings :)

Here is an example.
-
[mhiramat@localhost perf]$ ./perf probe -L do_sys_open:0-3

char*   filename
int dfd
int flags
int lookup
struct open_flags   op
umode_t mode
[mhiramat@localhost perf]$ sudo ./perf probe do_sys_open filename:string
Added new event:
  probe:do_sys_open(on do_sys_open with filename:string)

You can now use it in all perf tools, such as:

perf record -e probe:do_sys_open -aR sleep 1


[mhiramat@localhost perf]$ sudo ./perf record -e probe:do_sys_open -a ls &> /dev/null
[mhiramat@localhost perf]$ sudo ./perf script | more
  ls  7238 [003] 1629305.250347: probe:do_sys_open: (811c5e40) filename_string="/etc/ld.so.cache"
  ls  7238 [003] 1629305.250384: probe:do_sys_open: (811c5e40) filename_string="/lib64/libselinux.so.1"
  ls  7238 [003] 1629305.250501: probe:do_sys_open: (811c5e40) filename_string="/lib64/libcap.so.2"
  ls  7238 [003] 1629305.250562: probe:do_sys_open: (811c5e40) filename_string="/lib64/libacl.so.1"
  ls  7238 [003] 1629305.250631: probe:do_sys_open: (811c5e40) filename_string="/lib64/libc.so.6"
  ls  7238 [003] 1629305.250706: probe:do_sys_open: (811c5e40) filename_string="/lib64/libpcre.so.1"
  ls  7238 [003] 1629305.250769: probe:do_sys_open: (811c5e40) filename_string="/lib64/liblzma.so.5"
  ls  7238 [003] 1629305.250838: probe:do_sys_open: (811c5e40) filename_string="/lib64/libdl.so.2"
  ls  7238 [003] 1629305.250898: probe:do_sys_open: (811c5e40) filename_string="/lib64/libattr.so.1"
  ls  7238 [003] 1629305.250959: probe:do_sys_open: (811c5e40) filename_string="/lib64/libpthread.so.0"
  ls  7238 [003] 1629305.251591: probe:do_sys_open: (811c5e40) filename_string=""
  ls  7238 [003] 1629305.251695: probe:do_sys_open: (811c5e40) filename_string="."
[mhiramat@localhost perf]$ sudo ./perf probe -d \*
Removed event: probe:do_sys_open
-

Thank you,

-- 
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: masami.hiramatsu...@hitachi.com

Re: [PATCH] ARM: dts: add core2 padconf region for am3517

2015-05-10 Thread Nishanth Menon

On 05/10/2015 04:27 PM, Andrey Skvortsov wrote:
> According to the technical reference manual for AM35xx system
> controller module (SCM) PADCONFS core registers are divided in two
> regions: 0x48002030..0x48002268 and 0x480025d8..0x480025FC.
> First region is the same for all omap3 SoC and is described in omap3.dtsi.
> The second region is the same as in omap34xx (see omap34xx.dtsi)
> and omap35xx. The patch adds missing description for the second region.
> This patch was tested on AM3517.
> 
> Signed-off-by: Andrey Skvortsov 
> ---
> 
> Commit 3d495383648a ("ARM: dts: Split omap3 pinmux core device") notes that
> Nishanth Menon  said that 3517 does not have padconf2 region.
> Unfortunately I couldn't find reference to his post on mailing list.
> This patch was tested on AM3517 SoC and original vendor code contains
> pinmuxing for this second region as well.


http://www.ti.com/lit/ug/sprugr0c/sprugr0c.pdf
CONTROL_PADCONF_ETK_CLK is indeed at 0x480025D8
Apologies on missing it :(

> 
>  arch/arm/boot/dts/am3517.dtsi | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/am3517.dtsi b/arch/arm/boot/dts/am3517.dtsi
> index c90724b..2534500 100644
> --- a/arch/arm/boot/dts/am3517.dtsi
> +++ b/arch/arm/boot/dts/am3517.dtsi
> @@ -60,5 +60,16 @@
>   dma-names = "tx", "rx";
>   clock-frequency = <4800>;
>   };
> +
> + omap3_pmx_core2: pinmux@480025D8 {
> + compatible = "ti,omap3-padconf", "pinctrl-single";
> + reg = <0x480025D8 0x24>;

b8845074cfbbd1d1b46720a1b563d7b4240dac21 ("ARM: dts: omap3: add minimal
l4 bus layout with control module support") moves omap3_pmx_core under
scm -> shouldn't we do the same here?

> + #address-cells = <1>;
> + #size-cells = <0>;
> + #interrupt-cells = <1>;
> + interrupt-controller;
> + pinctrl-single,register-width = <16>;
> + pinctrl-single,function-mask = <0xff1f>;
> + };
>   };
>  };
> 


-- 
Regards,
Nishanth Menon

[PATCH] x86: eliminate comparison between signed and unsigned integer expressions

2015-05-10 Thread Louis Langholtz

Eliminates multiple compiler warnings when the -Wno-sign-compare option is
removed from the x86 Makefile (an option that is documented as a "Workaround
for a gcc prelease that unfortunately was shipped in a suse release").

Signed-off-by: Louis Langholtz 
---
 arch/x86/include/asm/uaccess.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index ace9dec..3289bd1 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -709,7 +709,7 @@ copy_from_user(void *to, const void __user *from, unsigned long n)
 * case, and do only runtime checking for non-constant sizes.
 */
 
-   if (likely(sz < 0 || sz >= n))
+   if (likely(sz < 0 || ((unsigned int)sz) >= n))
n = _copy_from_user(to, from, n);
else if(__builtin_constant_p(n))
copy_from_user_overflow();
@@ -727,7 +727,7 @@ copy_to_user(void __user *to, const void *from, unsigned long n)
might_fault();
 
/* See the comment in copy_from_user() above. */
-   if (likely(sz < 0 || sz >= n))
+   if (likely(sz < 0 || ((unsigned int)sz) >= n))
n = _copy_to_user(to, from, n);
else if(__builtin_constant_p(n))
copy_to_user_overflow();
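
The effect of the cast can be checked in isolation. Below is a minimal user-space sketch, assuming 'sz' is an int that holds either a known object size or -1 for "unknown", and 'n' is the unsigned long copy length; copy_allowed() and naive_ge() are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Patched form of the check: the negative case is handled first, so the
 * cast only ever sees a non-negative value and cannot change the result.
 */
static bool copy_allowed(int sz, unsigned long n)
{
	return sz < 0 || (unsigned int)sz >= n;
}

/*
 * What a bare 'sz >= n' does on its own: the usual arithmetic conversions
 * turn sz into an unsigned long, written out explicitly here. A negative
 * sz therefore becomes a very large value.
 */
static bool naive_ge(int sz, unsigned long n)
{
	return (unsigned long)sz >= n;
}
```

Because 'sz < 0' short-circuits the second operand, the explicit cast keeps the behaviour identical and only removes the mixed-sign comparison that the compiler warns about.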


Re: [PATCH] net: mdio-gpio: Allow for unspecified bus id

2015-05-10 Thread David Miller

From: Bert Vermeulen 
Date: Fri,  8 May 2015 16:18:49 +0200

> When the bus id was supplied via a struct platform_device, the driver wasn't
> handling -1 to mean an unspecified id of the only instance of this driver,
> as the platform spec requires.
> 
> Signed-off-by: Bert Vermeulen 

Applied, thanks.

Linux 4.1-rc3

2015-05-10 Thread Linus Torvalds

Here's a Mother's Day Sunday release for you all, whether you're a
mother or not. Because hey, it's Sunday afternoon once again, and
that's just how my -rc releases roll.

Nothing particularly scary or worrisome going on, although there are
little fixes all over. Some of them are regressions from this merge
window, and some of them are older. And some of them are so old that I
almost thought "if it's been broken since 2011, and you only noticed
now, maybe it could have waited for the next merge window". You know
who you are (and others will too, if they read the commit messages -
your shame is out there).

The appended shortlog gives a reasonable overview, and isn't too big.
As you can tell, it's mostly drivers, with a smattering of arch
updates (mostly ARM). And a smattering of misc stuff: perf tooling,
documentation, filesystems. The infiniband update is a noticeable
chunk of the drivers, we had some delayed stuff there due to
maintainership changes.

Go out and test. By -rc3, things really should be pretty
non-threatening and this would be a good time to just make sure
everything is running smoothly if you haven't tried one of the earlier
development kernels already.

  Linus

---

Abhilash Kesavan (1):
  ARM: dts: Fix typo in trip point temperature for exynos5420/5440

Al Viro (2):
  namei: d_is_negative() should be checked before ->d_seq validation
  path_openat(): fix double fput()

Alex Bennée (1):
  tracing: Make ftrace_print_array_seq compute buf_len

Alex Deucher (1):
  drm/radeon: don't setup audio on asics that don't support it

Alex Williamson (2):
  vfio-pci: Log device requests more verbosely
  vfio: Fix runaway interruptible timeout

Alexandre Belloni (1):
  Documentation: bindings: add abracon,abx80x

Andrew Andrianov (1):
  pinctrl: mvebu: Fix mapping of pin 63 (gpo -> gpio)

Andrew Lunn (1):
  ARM: 8350/1: proc-feroceon: Fix feroceon_proc_info macro

Andrew Morton (2):
  revert "zram: move compact_store() to sysfs functions area"
  MAINTAINERS: add co-maintainer for LED subsystem

Antonio Ospite (1):
  ACPI / documentation: fix a sentence about GPIO resources

Arnaldo Carvalho de Melo (2):
  perf trace: Enable events when doing system wide tracing and
starting a workload
  perf trace: Disable events and drain events when forked workload ends

Bart Van Assche (1):
  IPoIB/CM: Fix indentation level

Baruch Siach (1):
  MAINTAINERS: add Conexant Digicolor machines entry

Ben Hutchings (1):
  xen-pciback: Add name prefix to global 'permissive' variable

Bob Liu (2):
  xen/blkback: safely unmap purge persistent grants
  xen/grant: introduce func gnttab_unmap_refs_sync()

Bobby Powers (2):
  tools lib api: Undefine _FORTIFY_SOURCE before setting it
  x86/fpu: Always restore_xinit_state() when use_eager_cpu()

Boris Ostrovsky (6):
  xen: Suspend ticks on all CPUs during suspend
  xen/events: Clear cpu_evtchn_mask before resuming
  xen/xenbus: Update xenbus event channel on resume
  xen/console: Update console event channel on resume
  xen/events: Set irq_info->evtchn before binding the channel to
CPU in __startup_pirq()
  hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests

Chao Yu (1):
  elevator: fix double release of elevator module

Chen Yu (1):
  init: fix regression by supporting devices with major:minor:offset format

Chris Bainbridge (1):
  ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook

Chris Wilson (1):
  drm/i915: Drop PIPE-A quirk for 945GSE HP Mini

Chris Zhong (2):
  ARM: rockchip: disable dapswjdp during suspend
  ARM: rockchip: fix undefined instruction of reset_ctrl_regs

Christian König (6):
  drm/radeon: disable semaphores for UVD V1 (v2)
  drm/radeon: fix userptr lockup
  drm/radeon: make VCE handle check more strict
  drm/radeon: make UVD handle checking more strict
  drm/radeon: more strictly validate the UVD codec
  drm/radeon: stop trying to suspend UVD sessions

Christophe Leroy (1):
  splice: sendfile() at once fails for big files

Chuanxiao Dong (1):
  mmc: card: Don't access RPMB partitions for normal read/write

Colin Ian King (1):
  pinctrl: mediatek: mtk-common: initialize unmask

Corey Minyard (6):
  ipmi_ssif: Fix the logic on user-supplied addresses
  ipmi:ssif: Ignore spaces when comparing I2C adapter names
  ipmi: Don't report err in the SI driver for SSIF devices
  ipmi: Report an error if ACPI _IFT doesn't exist
  ipmi: Add alert handling to SSIF
  ipmi: Fix multi-part message handling

Dan Carpenter (1):
  efi: Fix error handling in add_sysfs_runtime_map_entry()

Daniel Baluta (1):
  configfs: init configfs module earlier at boot time

Daniel Borkmann (1):
  lib: make memzero_explicit more robust against dead store elimination

David Ahern (2):
  perf kmem: Fix compiles on RHEL6/OL6
  IB/core: Fix

Re: [PATCH] ARM: net: delegate filter to kernel interpreter when imm_offset() return value can't fit into 12bits.

2015-05-10 Thread David Miller

From: Nicolas Schichan 
Date: Thu,  7 May 2015 17:14:21 +0200

> The ARM JIT code emits "ldr rX, [pc, #offset]" to access the literal
> pool. #offset maximum value is 4095 and if the generated code is too
> large, the #offset value can overflow and not point to the expected
> slot in the literal pool. Additionally, when overflow occurs, bits of
> the overflow can end up changing the destination register of the ldr
> instruction.
> 
> Fix that by detecting the overflow in imm_offset() and setting a flag
> that is checked for each BPF instruction converted in
> build_body(). As of now it can only be detected in the second pass. As
> a result the second build_body() call can now fail, so add the
> corresponding cleanup code in that case.
> 
> Using multiple literal pools in the JITed code is going to require
> lots of intrusive changes to the JIT code (which would better be done
> as a feature instead of a fix), so just delegating to the kernel BPF
> interpreter in that case is a more straightforward, minimal fix that is
> easy to backport.
> 
> Signed-off-by: Nicolas Schichan 

Applied, thanks Nicolas.
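
For reference, the failure mode described in the quoted commit message can be reproduced with plain integer arithmetic. This is a sketch under stated assumptions: 0xe59f0000 is the ARM encoding of "ldr r0, [pc, #0]", the destination register sits in bits 15:12 and the offset in bits 11:0; the helpers naive_ldr_pc() and checked_offset() are hypothetical and are not the JIT's actual functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LDR_PC_BASE 0xe59f0000u	/* ldr r0, [pc, #0] */

/* Naive encoder: ORs the offset in without checking that it fits. */
static uint32_t naive_ldr_pc(unsigned int rt, uint32_t imm)
{
	return LDR_PC_BASE | (rt << 12) | imm;
}

/* Decode the destination register (bits 15:12) of an encoded ldr. */
static unsigned int ldr_rt(uint32_t insn)
{
	return (insn >> 12) & 0xf;
}

/* Decode the 12-bit immediate offset (bits 11:0) of an encoded ldr. */
static uint32_t ldr_imm12(uint32_t insn)
{
	return insn & 0xfff;
}

/*
 * Checked variant in the spirit of the fix: refuse offsets above 4095 and
 * raise a flag so the caller can give up on JIT compilation.
 */
static uint32_t checked_offset(uint32_t imm, bool *overflow)
{
	if (imm > 0xfff) {
		*overflow = true;
		return 0;
	}
	return imm;
}
```

With an offset of 0x1004 the naive encoder produces an instruction that loads into r1 from offset 4, which is the "bits of the overflow changing the destination register" case.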

Re: [PATCH v2] ARM: net fix emit_udiv() for BPF_ALU | BPF_DIV | BPF_K intruction.

2015-05-10 Thread David Miller

From: Nicolas Schichan 
Date: Wed,  6 May 2015 18:31:56 +0200

> In that case, emit_udiv() will be called with rn == ARM_R0 (r_scratch)
> and loading rm first into ARM_R0 will result in jit_udiv() function
> being called the same dividend and divisor. Fix that by loading rn
> first into ARM_R1 and then rm into ARM_R0.
> 
> Signed-off-by: Nicolas Schichan 
> Cc:  # v3.13+
> Fixes: aee636c4809f (bpf: do not use reciprocal divide)

Applied, thank you.
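
The register clobber described in the quoted text can be simulated with a small array standing in for the register file. This is a sketch only: the register indices, the mov() helper and the two load_args_*() functions are hypothetical, and the assumption is that the callee takes the dividend in R0 and the divisor in R1.

```c
#include <assert.h>

enum { R0, R1, R2, R3, R4, NREGS };

/* Simulated "mov dst, src" on an array standing in for the registers. */
static void mov(unsigned int regs[], int dst, int src)
{
	regs[dst] = regs[src];
}

/* Old order: rm goes into R0 first, which destroys rn when rn == R0. */
static void load_args_old(unsigned int regs[], int rm, int rn)
{
	if (rm != R0)
		mov(regs, R0, rm);
	if (rn != R1)
		mov(regs, R1, rn);
}

/* Fixed order: save rn into R1 before R0 is overwritten with rm. */
static void load_args_new(unsigned int regs[], int rm, int rn)
{
	if (rn != R1)
		mov(regs, R1, rn);
	if (rm != R0)
		mov(regs, R0, rm);
}
```

With the divisor held in R0 and the dividend in R4, the old order ends with both argument registers holding the dividend, while the fixed order passes the two distinct values.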

Re: [PATCH v2] ARM: net: add JIT support for loads from struct seccomp_data.

2015-05-10 Thread David Miller

From: Russell King - ARM Linux 
Date: Sun, 10 May 2015 20:40:28 +0100

> I think you have taken previous ARM net JIT patches, so I think it
> makes sense if you continue to do so.  I'm not knowledgeable of the
> JIT interface myself, all I can say about many of these patches is
> that they look okay to me on a superficial basis.
> 
> I suspect you're doing more or less the same, but from a slightly
> different perspective (presumably through not knowing ARM assembly.)

Ok, I'll flow the ARM JIT patches through my tree then.

Thanks Russell.

Re: [PATCH RFC] vfs: add a O_NOMTIME flag

2015-05-10 Thread Trond Myklebust

On Fri, May 8, 2015 at 6:24 PM, Sage Weil  wrote:
> On Sat, 9 May 2015, Dave Chinner wrote:
>> On Thu, May 07, 2015 at 09:23:24PM -0400, Trond Myklebust wrote:
>> > On Thu, May 7, 2015 at 9:01 PM, Sage Weil  wrote:
>> > > On Thu, 7 May 2015, Zach Brown wrote:
>> > >> On Thu, May 07, 2015 at 10:26:17AM +1000, Dave Chinner wrote:
>> > >> > On Wed, May 06, 2015 at 03:00:12PM -0700, Zach Brown wrote:
>> > >> > > The criteria for using O_NOMTIME is the same as for using O_NOATIME:
>> > >> > > owning the file or having the CAP_FOWNER capability.  If we're not
>> > >> > > comfortable allowing owners to prevent mtime/ctime updates then we
>> > >> > > should add a tunable to allow O_NOMTIME.  Maybe a mount option?
>> > >> >
>> > >> > I dislike "turn off safety for performance" options because Joe
>> > >> > SpeedRacer will always select performance over safety.
>> > >>
>> > >> Well, for ceph there's no safety concern.  They never use cmtime in
>> > >> these files.
>> > >>
>> > >> So are you suggesting not implementing this and making them rework their
>> > >> IO paths to avoid the fs maintaining mtime so that we don't give Joe
>> > >> Speedracer more rope?  Or are we talking about adding some speed bumps
>> > >> that ceph can flip on that might give Joe Speedracer pause?
>> > >
>> > > I think this is the fundamental question: who do we give the ammunition
>> > > to, the user or app writer, or the sysadmin?
>> > >
>> > > One might argue that we gave the user a similar power with O_NOATIME (the
>> > > power to break applications that assume atime is accurate).  Here we give
>> > > developers/users the power to not update mtime and suffer the consequences
>> > > (like, obviously, breaking mtime-based backups).  It should be pretty
>> > > obvious to anyone using the flag what the consequences are.
>> > >
>> > > Note that we can suffer similar lapses in mtime with fdatasync followed by
>> > > a system crash.  And as Andy points out it's semi-broken for writable
>> > > mmap.  The crash case is obviously a slightly different thing, but the
>> > > idea that mtime can't always be trusted certainly isn't crazy talk.
>> > >
>> > > Or, we can be conservative and require a mount option so that the admin
>> > > has to explicitly allow behavior that might break some existing
>> > > assumptions about mtime/ctime ('-o user_noatime' I guess?).
>> > >
>> > > I'm happy either way, so long as in the end an unprivileged ceph daemon
>> > > avoids the useless work.  In our case we always own the entire mount/disk,
>> > > so a mount option is just fine.
>> > >
>> >
>> > So, what is the expectation here for filesystems that cannot support
>> > this flag? NFSv3 in particular would break pretty catastrophically if
>> > someone decided on a whim to turn off mtime: they will have turned off
>> > the client's ability to detect cache incoherencies.
>>
>> It's worse than that, now that I think about it. I think nomtime
>> will break nfsv4 as the I_VERSION check is done *after* the
>> NO[C]MTIME checks. e.g. the atomic change count used to detect file
>> changes is only updated during the mtime update on write() calls in
>> XFS. i.e. when the timestamp is changed, a transaction to change
>> mtime is run, and that transaction commit bumps the change count.
>>
>> So cutting out mtime updates at the VFS will prevent XFS and other
>> I_VERSION aware filesystems from updating the change count that
>> NFSv4 clients rely on to detect foreign data changes in a file.
>>
>> Not sure what to do here, because the current NOCMTIME
>> implementation intentionally cuts out the timestamp update because
>> its usage is fully invisible IO. i.e. it is used by utilities like
>> xfs_fsr and HSMs to move data into and out of files without the
>> application being able to detect the data movement in any way. These
>> are not data modification operations, though - the file contents as
>> read by the application do not change despite the fact we are moving
>> data in and out of the file. In this case we don't want timestamps
>> or change counters to change on the data movement, so I think we've
>> actually got a difference in behaviour here between O_NOMTIME and
>> O_NOCMTIME, right?
>>
>> i.e. for nfsv4 sanity O_NOMTIME still needs to bump I_VERSION on
>> write, just not modify the timestamp? In which case, not modifying
>> the timestamps gains us nothing, because the inode is still dirtied?
>
> Right: if we dirty the inode we've defeated the purpose of the patch.
>
>> The list of caveats on O_NOMTIME seems to be growing...
>
> ...and remain consistent with our goals.  We couldn't care less if NFS or
> backup software or anything else doesn't notice these changes.  This is
> private data that is wholly managed by the ceph daemon.  The goal is to
> derive *some* value from the file system and avoid reimplementing it in
> userspace (without the bits we don't need).

That makes it completely non-generic though. By putting this in the
VFS, you are
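
The ordering problem Dave describes in the quoted text can be illustrated
with a toy userspace model. This is only a sketch under stated assumptions:
struct toy_inode and both toy_update_time_*() helpers are hypothetical names
invented for the illustration, and none of it is kernel code. It contrasts a
"no mtime" check that returns before the change counter is touched with a
variant that skips only the timestamp.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of an inode; field names are illustrative, not the kernel's. */
struct toy_inode {
	uint64_t mtime;        /* last-modification timestamp */
	uint64_t change_count; /* stands in for the I_VERSION change counter */
	bool dirty;            /* inode needs to be written back */
};

/*
 * Variant 1: the "no mtime" check comes first and returns early, so the
 * change counter is never bumped and a client relying on it sees nothing.
 */
static void toy_update_time_early_exit(struct toy_inode *inode,
				       uint64_t now, bool nomtime)
{
	if (nomtime)
		return;
	inode->mtime = now;
	inode->change_count++;
	inode->dirty = true;
}

/*
 * Variant 2: only the timestamp is skipped. The change counter still
 * moves, but the inode is dirtied anyway, so no writeback is saved.
 */
static void toy_update_time_keep_version(struct toy_inode *inode,
					 uint64_t now, bool nomtime)
{
	if (!nomtime)
		inode->mtime = now;
	inode->change_count++;
	inode->dirty = true;
}
```

In the first variant a write with the flag set leaves change_count at its old
value; in the second the counter advances but the inode is dirty again, which
is the trade-off raised in the quoted discussion.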

[PATCH v3] net: fec: add support of ethtool get_regs

2015-05-10 Thread Philippe Reynes

This enables the ethtool's "-d" and "--register-dump"
options for fec devices.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/freescale/fec_main.c |   78 +
 1 files changed, 78 insertions(+), 0 deletions(-)

Changelog:
v2: (thanks Russell King and David Miller for the feedback)
- don't use memcpy_fromio to copy registers
v3: (thanks David Miller for the feedback)
- only copy defined registers
- fix warning reported by checkpatch.pl

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 66d47e4..bf4cf3f 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -2118,6 +2118,82 @@ static void fec_enet_get_drvinfo(struct net_device *ndev,
strlcpy(info->bus_info, dev_name(&ndev->dev), sizeof(info->bus_info));
 }
 
+static int fec_enet_get_regs_len(struct net_device *ndev)
+{
+   struct fec_enet_private *fep = netdev_priv(ndev);
+   struct resource *r;
+   int s = 0;
+
+   r = platform_get_resource(fep->pdev, IORESOURCE_MEM, 0);
+   if (r)
+   s = resource_size(r);
+
+   return s;
+}
+
+/* List of registers that can safely be read to dump them with ethtool */
+#if defined(CONFIG_M523x) || defined(CONFIG_M527x) || defined(CONFIG_M528x) || \
+   defined(CONFIG_M520x) || defined(CONFIG_M532x) ||   \
+   defined(CONFIG_ARCH_MXC) || defined(CONFIG_SOC_IMX28)
+static u32 fec_enet_register_offset[] = {
+   FEC_IEVENT, FEC_IMASK, FEC_R_DES_ACTIVE_0, FEC_X_DES_ACTIVE_0,
+   FEC_ECNTRL, FEC_MII_DATA, FEC_MII_SPEED, FEC_MIB_CTRLSTAT, FEC_R_CNTRL,
+   FEC_X_CNTRL, FEC_ADDR_LOW, FEC_ADDR_HIGH, FEC_OPD, FEC_TXIC0, FEC_TXIC1,
+   FEC_TXIC2, FEC_RXIC0, FEC_RXIC1, FEC_RXIC2, FEC_HASH_TABLE_HIGH,
+   FEC_HASH_TABLE_LOW, FEC_GRP_HASH_TABLE_HIGH, FEC_GRP_HASH_TABLE_LOW,
+   FEC_X_WMRK, FEC_R_BOUND, FEC_R_FSTART, FEC_R_DES_START_1,
+   FEC_X_DES_START_1, FEC_R_BUFF_SIZE_1, FEC_R_DES_START_2,
+   FEC_X_DES_START_2, FEC_R_BUFF_SIZE_2, FEC_R_DES_START_0,
+   FEC_X_DES_START_0, FEC_R_BUFF_SIZE_0, FEC_R_FIFO_RSFL, FEC_R_FIFO_RSEM,
+   FEC_R_FIFO_RAEM, FEC_R_FIFO_RAFL, FEC_RACC, FEC_RCMR_1, FEC_RCMR_2,
+   FEC_DMA_CFG_1, FEC_DMA_CFG_2, FEC_R_DES_ACTIVE_1, FEC_X_DES_ACTIVE_1,
+   FEC_R_DES_ACTIVE_2, FEC_X_DES_ACTIVE_2, FEC_QOS_SCHEME,
+   RMON_T_DROP, RMON_T_PACKETS, RMON_T_BC_PKT, RMON_T_MC_PKT,
+   RMON_T_CRC_ALIGN, RMON_T_UNDERSIZE, RMON_T_OVERSIZE, RMON_T_FRAG,
+   RMON_T_JAB, RMON_T_COL, RMON_T_P64, RMON_T_P65TO127, RMON_T_P128TO255,
+   RMON_T_P256TO511, RMON_T_P512TO1023, RMON_T_P1024TO2047,
+   RMON_T_P_GTE2048, RMON_T_OCTETS,
+   IEEE_T_DROP, IEEE_T_FRAME_OK, IEEE_T_1COL, IEEE_T_MCOL, IEEE_T_DEF,
+   IEEE_T_LCOL, IEEE_T_EXCOL, IEEE_T_MACERR, IEEE_T_CSERR, IEEE_T_SQE,
+   IEEE_T_FDXFC, IEEE_T_OCTETS_OK,
+   RMON_R_PACKETS, RMON_R_BC_PKT, RMON_R_MC_PKT, RMON_R_CRC_ALIGN,
+   RMON_R_UNDERSIZE, RMON_R_OVERSIZE, RMON_R_FRAG, RMON_R_JAB,
+   RMON_R_RESVD_O, RMON_R_P64, RMON_R_P65TO127, RMON_R_P128TO255,
+   RMON_R_P256TO511, RMON_R_P512TO1023, RMON_R_P1024TO2047,
+   RMON_R_P_GTE2048, RMON_R_OCTETS,
+   IEEE_R_DROP, IEEE_R_FRAME_OK, IEEE_R_CRC, IEEE_R_ALIGN, IEEE_R_MACERR,
+   IEEE_R_FDXFC, IEEE_R_OCTETS_OK
+};
+#else
+static u32 fec_enet_register_offset[] = {
+   FEC_ECNTRL, FEC_IEVENT, FEC_IMASK, FEC_IVEC, FEC_R_DES_ACTIVE_0,
+   FEC_R_DES_ACTIVE_1, FEC_R_DES_ACTIVE_2, FEC_X_DES_ACTIVE_0,
+   FEC_X_DES_ACTIVE_1, FEC_X_DES_ACTIVE_2, FEC_MII_DATA, FEC_MII_SPEED,
+   FEC_R_BOUND, FEC_R_FSTART, FEC_X_WMRK, FEC_X_FSTART, FEC_R_CNTRL,
+   FEC_MAX_FRM_LEN, FEC_X_CNTRL, FEC_ADDR_LOW, FEC_ADDR_HIGH,
+   FEC_GRP_HASH_TABLE_HIGH, FEC_GRP_HASH_TABLE_LOW, FEC_R_DES_START_0,
+   FEC_R_DES_START_1, FEC_R_DES_START_2, FEC_X_DES_START_0,
+   FEC_X_DES_START_1, FEC_X_DES_START_2, FEC_R_BUFF_SIZE_0,
+   FEC_R_BUFF_SIZE_1, FEC_R_BUFF_SIZE_2
+};
+#endif
+
+static void fec_enet_get_regs(struct net_device *ndev,
+ struct ethtool_regs *regs, void *regbuf)
+{
+   struct fec_enet_private *fep = netdev_priv(ndev);
+   u32 __iomem *theregs = (u32 __iomem *)fep->hwp;
+   u32 *buf = (u32 *)regbuf;
+   u32 i, off;
+
+   memset(buf, 0, regs->len);
+
+   for (i = 0; i < ARRAY_SIZE(fec_enet_register_offset); i++) {
+   off = fec_enet_register_offset[i] / 4;
+   buf[off] = readl(&theregs[off]);
+   }
+}
+
 static int fec_enet_get_ts_info(struct net_device *ndev,
struct ethtool_ts_info *info)
 {
@@ -2515,6 +2591,8 @@ static const struct ethtool_ops fec_enet_ethtool_ops = {
.get_settings   = fec_enet_get_settings,
.set_settings   = fec_enet_set_settings,
.get_drvinfo= fec_enet_get_drvinfo,
+   .get_regs_len   = fec_enet_get_regs_len,
+   .get_regs

Re: ext2/3/4 performance issue

2015-05-10 Thread Theodore Ts'o

On Sun, May 10, 2015 at 10:01:09AM +0200, Bernhard Kraft wrote:
> 
> I work on implementing the ext2 filesystem for a PIC microcontroller and
> while reading the sources of it in the linux kernel I stumbled upon the
> following performance issue.

Your observations are correct as far as ext2 is concerned; by the time
you get to ext4, it's a bit more complicated.  Also, the
extN_bg_has_super() has in general not been a bottleneck, at least on
modern CPUs.  Perhaps it would be more of an issue on a PIC
microcontroller, of course.

Finally, note that the overhead is cached so it is only calculated the
first time the extN_statfs() is called.

Cheers,

- Ted
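
The caching Ted mentions can be sketched as a compute-once pattern. This is a
minimal userspace illustration, not the kernel's code: struct toy_sb and the
toy_*() helpers are hypothetical names, and the sketch assumes the sparse
superblock layout, where groups 0 and 1 and groups that are powers of 3, 5 or
7 carry a superblock copy.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if 'group' equals base^k for some k >= 0. */
static bool toy_is_power_of(uint32_t group, uint32_t base)
{
	uint64_t n = 1; /* 64-bit so the multiplication cannot overflow */

	while (n < group)
		n *= base;
	return n == group;
}

/* Sparse-superblock rule assumed for this sketch. */
static bool toy_bg_has_super(uint32_t group)
{
	if (group <= 1)
		return true;
	return toy_is_power_of(group, 3) || toy_is_power_of(group, 5) ||
	       toy_is_power_of(group, 7);
}

struct toy_sb {
	uint32_t groups_count;    /* number of block groups */
	bool overhead_valid;      /* has the scan been done yet? */
	uint32_t overhead_groups; /* cached: groups holding a superblock copy */
	uint32_t scans;           /* how many full scans ran (demo only) */
};

/* The per-group scan runs on the first call; later calls reuse the result. */
static uint32_t toy_statfs_overhead(struct toy_sb *sb)
{
	if (!sb->overhead_valid) {
		uint32_t n = 0;

		for (uint32_t g = 0; g < sb->groups_count; g++)
			if (toy_bg_has_super(g))
				n++;
		sb->overhead_groups = n;
		sb->overhead_valid = true;
		sb->scans++;
	}
	return sb->overhead_groups;
}
```

With 30 groups the scan finds copies in groups 0, 1, 3, 5, 7, 9, 25 and 27, and
a second call returns the cached count without scanning again.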

Re: [PATCH V4 net-next 1/1] hv_netvsc: Use the xmit_more skb flag to optimize signaling the host

2015-05-10 Thread David Miller

From: "K. Y. Srinivasan" 
Date: Wed,  6 May 2015 15:29:05 -0700

> - ret = vmbus_sendpacket_pagebuffer(out_channel,
> -   pgbuf,
> -   packet->page_buf_cnt,
> -   ,
> -   sizeof(struct nvsp_message),
> -   req_id);
> + ret = vmbus_sendpacket_pagebuffer_ctl(out_channel,
> +   pgbuf,
> +   packet->page_buf_cnt,
> +   ,
> +   sizeof(struct
> +  nvsp_message),
> +   req_id,
> +   vmbus_flags,
> +   !packet->xmit_more);
>   } else {
> - ret = vmbus_sendpacket(
> + ret = vmbus_sendpacket_ctl(
>   out_channel, ,
>   sizeof(struct nvsp_message),
>   req_id,
>   VM_PKT_DATA_INBAND,
> - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
> + vmbus_flags, !packet->xmit_more);

Just as you did for the vmbus_sendpacket_pagebuffer_ctl() call above,
you'll need to reindent the arguments for the vmbus_sendpacket_ctl()
call since the opening parenthesis is now at a different column.

Thanks.
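
For readers following the thread, the effect of passing !packet->xmit_more
as the last argument can be sketched with a toy model. This is an
illustration only: struct toy_channel, toy_send() and toy_send_burst() are
hypothetical names, not the vmbus API, and the model assumes the final
argument simply decides whether the host is signaled after the packet is
queued.

```c
#include <stdbool.h>

/* Toy stand-in for a channel; not the real vmbus structures. */
struct toy_channel {
	unsigned int queued;    /* packets in the ring, host not yet signaled */
	unsigned int signals;   /* number of times the host was signaled */
	unsigned int delivered; /* packets the host has been told about */
};

/* Queue one packet and signal the host only when 'kick' is true. */
static void toy_send(struct toy_channel *ch, bool kick)
{
	ch->queued++;
	if (kick) {
		ch->signals++;
		ch->delivered += ch->queued;
		ch->queued = 0;
	}
}

/* Send n packets; every packet except the last has xmit_more set. */
static void toy_send_burst(struct toy_channel *ch, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++) {
		bool xmit_more = (i + 1 < n);

		toy_send(ch, !xmit_more);
	}
}
```

A burst of four packets produces a single signal covering all four, which is
the batching the patch aims for.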

Re: [PATCH] net: fec: add support of ethtool get_regs

2015-05-10 Thread Philippe Reynes

Hi David,

On 10/05/15 03:01, David Miller wrote:
> From: Philippe Reynes
> Date: Sun, 10 May 2015 00:16:21 +0200
>
>> Hi Fabio,
>>
>> On 09/05/15 23:59, Fabio Estevam wrote:
>>> Philippe,
>>>
>>> On Sat, May 9, 2015 at 6:17 PM, Russell King - ARM Linux wrote:
>>>> Using memcpy_fromio() to copy device registers is not a good idea -
>>>> it can use a variable access size which can cause bus faults.
>>>
>>> An example on how memcpy_fromio() can be avoided in get_regs:
>>> drivers/net/ethernet/samsung/sxgbe/sxgbe_ethtool.c
>>
>> Thanks for pointing me to this example. I've already sent a patch,
>> and I've used drivers/net/ethernet/freescale/gianfar_ethtool.c
>> as an example. I hope it's a good example too.
>
> I think you need to be much more careful and conservative in your
> implementation.
>
> You should skip I/O addresses that don't have defined registers
> at those offsets for the chip in question.

Ok, I've added an array with all registers, so I only read defined registers.

> Also, you should _very_ carefully evaluate each and every register you
> dump and potentially skip certain registers which have strong negative
> side effects if read arbitrarily.
>
> For example, dumping the interrupt status register could cause pending
> interrupt status to be cleared, and thus cause the driver to lose
> interrupts and subsequently packet processing will hang.

Thanks for the feedback. I've read all the registers, and all of them
can be read without negative side effects. Even for the interrupt status
register, interrupts are cleared when the EIR register is written (not read).

Regards,
Philippe
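
The two points discussed above (dump only listed offsets with fixed-width
reads, and an event register that is cleared by writes rather than reads)
can be sketched with a simulated device. Everything here is hypothetical:
struct toy_dev, the TOY_* offsets and the toy_*() helpers are invented for
the illustration, and the write-1-to-clear behaviour is an assumption of the
toy model rather than a statement about the FEC hardware.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_REG_WORDS 8
#define TOY_EVENT 0x04 /* event register, write-1-to-clear in this model */
#define TOY_MASK  0x08
#define TOY_CTRL  0x10

/* Simulated register file; a real driver would use __iomem accessors. */
struct toy_dev {
	uint32_t regs[TOY_REG_WORDS];
};

/* Fixed 32-bit read with no side effects in this model. */
static uint32_t toy_readl(const struct toy_dev *dev, size_t byte_off)
{
	return dev->regs[byte_off / 4];
}

static void toy_writel(struct toy_dev *dev, size_t byte_off, uint32_t val)
{
	if (byte_off == TOY_EVENT)
		dev->regs[byte_off / 4] &= ~val; /* set bits clear events */
	else
		dev->regs[byte_off / 4] = val;
}

/* Only offsets listed here are ever read by the dump. */
static const size_t toy_dump_offsets[] = { TOY_EVENT, TOY_MASK, TOY_CTRL };

static void toy_get_regs(const struct toy_dev *dev, uint32_t *buf,
			 size_t len_bytes)
{
	size_t count = sizeof(toy_dump_offsets) / sizeof(toy_dump_offsets[0]);

	memset(buf, 0, len_bytes); /* unlisted offsets read back as zero */
	for (size_t i = 0; i < count; i++) {
		size_t off = toy_dump_offsets[i];

		if (off + sizeof(uint32_t) <= len_bytes)
			buf[off / 4] = toy_readl(dev, off);
	}
}
```

Dumping leaves the pending bits in the event register untouched; only a
later toy_writel() to TOY_EVENT clears them.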